Substring Diff


Problem Statement :


In this problem, we'll use the term "longest common substring" loosely. Here it refers to a pair of equal-length substrings that differ in at most a given number of characters when compared index by index. For example, 'abc' and 'adc' differ in one position, while 'aab' and 'aba' differ in two.
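The index-by-index comparison can be written as a short helper. This is only an illustrative sketch; the name mismatches is not part of the problem statement.

```python
def mismatches(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have the same length")
    return sum(x != y for x, y in zip(a, b))


print(mismatches("abc", "adc"))  # 1
print(mismatches("aab", "aba"))  # 2
```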

Given two strings and an integer k, determine the length of the longest common substrings of the two strings that differ in no more than k positions.

For example, let k = 1, s1 = abcd and s2 = bbca. First check whether the whole strings (the longest substrings) match. Neither the first nor the last characters match, so the strings differ in 2 > k positions and we need to try shorter substrings. The next longest substrings are s1' = [abc, bcd] and s2' = [bbc, bca]. Two pairs of these substrings differ in only 1 position: [abc, bbc] and [bcd, bca]. They are of length 3, so the answer is 3.
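The walk-through above (try the longest length first, then shorter ones) can be turned directly into a brute-force check. This is a sketch for verifying small cases only: it is roughly O(n^4) and far too slow for the stated limits. The name substring_diff_brute is illustrative.

```python
def substring_diff_brute(k: int, s1: str, s2: str) -> int:
    """Try every length from longest to shortest and every pair of start positions."""
    n = len(s1)  # the constraints guarantee len(s1) == len(s2)
    for length in range(n, 0, -1):
        for i in range(n - length + 1):
            for j in range(n - length + 1):
                diff = sum(s1[i + t] != s2[j + t] for t in range(length))
                if diff <= k:
                    return length
    return 0


print(substring_diff_brute(1, "abcd", "bbca"))  # 3
```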

Function Description

Complete the substringDiff function in the editor below. It should return an integer that represents the length of the longest common substring as defined.

substringDiff has the following parameter(s):

k: an integer that represents the maximum number of differing characters in a matching pair
s1: the first string
s2: the second string
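One possible way to fill in substringDiff is a sliding window along each diagonal, that is, each fixed alignment of s1 against s2. The sketch below is an illustration under that assumption, not the reference solution; the local names shift, left and bad are made up for the example. It runs in O(|s1| * |s2|) time with O(1) extra memory.

```python
def substringDiff(k: int, s1: str, s2: str) -> int:
    n, m = len(s1), len(s2)
    best = 0
    # Each diagonal pairs s1[i] with s2[i + shift] for one fixed shift.
    for shift in range(-(n - 1), m):
        i = max(0, -shift)           # first index in s1 on this diagonal
        j = i + shift                # matching first index in s2
        length = min(n - i, m - j)   # number of pairs on this diagonal
        left = 0                     # pairs dropped from the window's left end
        bad = 0                      # mismatches inside the current window
        for right in range(length):
            if s1[i + right] != s2[j + right]:
                bad += 1
            # Shrink from the left until at most k mismatches remain.
            while bad > k:
                if s1[i + left] != s2[j + left]:
                    bad -= 1
                left += 1
            best = max(best, right - left + 1)
    return best


print(substringDiff(1, "abcd", "bbca"))  # 3
```

Because the window only ever moves forward, each diagonal is scanned once, which keeps the total work quadratic even for strings of length 1500.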

Input Format

The first line of input contains a single integer, t, the number of test cases that follow.
Each of the next t lines contains three space-separated values: an integer k and two strings, s1 and s2.

Constraints

1 <= t <= 10
0 <= k <= |s1|
|s1|  = |s2|
1 <= |s1|, |s2| <= 1500
All characters in s1 and s2 ∈ ascii[a-z].

Output Format

For each test case, output a single integer: the length of the longest common substring differing in k or fewer positions.



Solution :






In C++ :






#include <algorithm>
#include <cstdio>
#include <cstring>
using namespace std;

int n, k;
// D[i][j]: length of the longest window starting at A[i], B[j] with at most k mismatches.
// K[i][j]: number of mismatches inside that window.
int D[1505][1505], K[1505][1505];
char A[1505], B[1505];

int main()
{
    int t;
    scanf("%d", &t);
    while (t--) {
        scanf("%d", &k);
        scanf("%s%s", A, B);
        n = strlen(A);
        int r = 0;
        memset(D, 0, sizeof(D));
        memset(K, 0, sizeof(K));
        for (int i = n - 1; i >= 0; i--)
            for (int j = n - 1; j >= 0; j--) {
                // Extend the window of (i+1, j+1) by one pair at the front.
                D[i][j] = D[i + 1][j + 1] + 1;
                K[i][j] = K[i + 1][j + 1] + ((A[i] == B[j]) ? 0 : 1);
                // Drop pairs from the back until at most k mismatches remain.
                while (K[i][j] > k) {
                    if (A[i + D[i][j] - 1] != B[j + D[i][j] - 1])
                        K[i][j]--;
                    D[i][j]--;
                }
                r = max(r, D[i][j]);
            }
        printf("%d\n", r);
    }
    return 0;
}








In Java :






import java.io.*;
import java.util.*;

public class Solution {

    BufferedReader in;
    PrintWriter out;
    StringTokenizer tok = new StringTokenizer("");

    public static void main(String[] args) {
        new Solution().run();
    }

    public void run() {
        try {
            long t1 = System.currentTimeMillis();
            in = new BufferedReader(new InputStreamReader(System.in));
            out = new PrintWriter(System.out);
           
            Locale.setDefault(Locale.US);
            solve();
            in.close();
            out.close();
            long t2 = System.currentTimeMillis();
            System.err.println("Time = " + (t2 - t1));
        } catch (Throwable t) {
            t.printStackTrace(System.err);
            System.exit(-1);
        }
    }

    String readString() throws IOException {
        while (!tok.hasMoreTokens()) {
            tok = new StringTokenizer(in.readLine());
        }
        return tok.nextToken();
    }

    int readInt() throws IOException {
        return Integer.parseInt(readString());
    }

    long readLong() throws IOException {
        return Long.parseLong(readString());
    }

    double readDouble() throws IOException {
        return Double.parseDouble(readString());
    }

    // solution
    void solve() throws IOException {
        int n = readInt();
        for (int i = 0; i < n; i++)
        {
            int k = readInt();
            String s1 = readString();
            String s2 = readString();
            out.println(Math.max(find(k, s1, s2), find(k, s2, s1)));
        }
    }
    
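    // Slides s1[startFrom..] against s2[0..] and keeps a window [l, r]
    // containing at most k mismatches. Only offsets into s1 are tried,
    // so solve() calls this twice with the arguments swapped.
    int find(int k, String s1, String s2)
    {
        int max = 0;
        for (int startFrom = 0; startFrom < s1.length(); startFrom++)
        {
            int l = 0;        // left edge of the current window
            int penalty = 0;  // mismatches inside the window
            for (int r = 0; (r < s2.length()) && (startFrom + r < s1.length()); r++)
            {
                if (s1.charAt(startFrom + r) != s2.charAt(r))
                    penalty++;
                // shrink from the left until the window is valid again
                while (penalty > k)
                {
                    if (s1.charAt(startFrom + l) != s2.charAt(l))
                        penalty--;
                    l++;
                }
                max = Math.max(max, r - l + 1);
            }
        }
        return max;
    }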
}








In C :





#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAXN 1500

char buf[4096];
char diff[MAXN][MAXN];
int n, k;

/* diff[i][j] is 1 when s[i] and t[j] differ, 0 when they match */
void mkdiff(char *s, char *t) {
  int i, j;

  for (i = 0; i < n; i++)
    for (j = 0; j < n; j++)
      diff[i][j] = (s[i] == t[j]) ? 0 : 1;
}

/* returns 1 if some diagonal of diff has a run of L cells whose sum is <= k */
int isgood(int L) {
  int d, i, j, sum;

  if (L <= k)
    return 1;

  for (d = -n+1; d <= n-1; d++) {
    if (d <= 0) {
      if (n + d < L)
	continue;
      sum = 0;
      for (i = -d, j = 0; i < n; i++, j++) {
	sum += diff[i][j];
	if (j >= L)
	  sum -= diff[i-L][j-L];
	if (j >= L-1 && sum <= k)
	  return 1;
      }
    } else {
      if (n - d < L)
	continue;
      sum = 0;
      for (i = 0, j = d; j < n; i++, j++) {
	sum += diff[i][j];
	if (i >= L)
	  sum -= diff[i-L][j-L];
	if (i >= L-1 && sum <= k)
	  return 1;
      }
    }
  }
  return 0;
}

/* binary search for the largest L for which isgood(L) holds, i.e. some pair
   of length-L substrings differs in at most k positions */
int search() {
  int i, j, m;

  // Invariant: f(i-1)=true, f(j)=false
  i = 0;
  j = n+1;
  while (i < j) {
    m = (i + j) / 2;
    if (isgood(m))
      i = m+1;
    else
      j = m;
  }
  return i-1;
}

int main() {
  int ncases;
  char *s, *t;

  fgets(buf, sizeof buf, stdin);
  ncases = atoi(buf);

  while (ncases-- > 0) {
    fgets(buf, sizeof buf, stdin);
    s = strchr(buf, ' ');
    *s++ = '\0';
    t = strchr(s, ' ');
    *t++ = '\0';
    k = atoi(buf);
    n = strlen(s);
    if (t[n] == '\n')
      t[n] = '\0';

    mkdiff(s, t);

    printf("%d\n", search());
  }

  return 0;
}
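The C solution above binary-searches on the answer length: if some pair of length-L substrings differs in at most k positions, then dropping the last character of each gives a valid pair of length L-1, so the predicate is monotone. The same idea as a minimal Python sketch, assuming equal-length inputs as in the constraints (has_window and substring_diff_bsearch are illustrative names):

```python
def has_window(k: int, s1: str, s2: str, length: int) -> bool:
    """True if some aligned pair of substrings of this length has <= k mismatches."""
    n = len(s1)
    if length <= k:
        return True
    for shift in range(-(n - 1), n):
        i0 = max(0, -shift)      # start of the diagonal in s1
        j0 = i0 + shift          # start of the diagonal in s2
        span = n - abs(shift)    # number of pairs on this diagonal
        if span < length:
            continue
        bad = 0                  # mismatches in the last `length` pairs
        for t in range(span):
            bad += s1[i0 + t] != s2[j0 + t]
            if t >= length:
                bad -= s1[i0 + t - length] != s2[j0 + t - length]
            if t >= length - 1 and bad <= k:
                return True
    return False


def substring_diff_bsearch(k: int, s1: str, s2: str) -> int:
    lo, hi = 0, len(s1)  # length 0 always satisfies the predicate
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if has_window(k, s1, s2, mid):
            lo = mid
        else:
            hi = mid - 1
    return lo


print(substring_diff_bsearch(1, "abcd", "bbca"))  # 3
```

Each call to has_window costs O(n^2), so the whole search is O(n^2 log n). Unlike the C version, this sketch compares characters on the fly and does not build the n-by-n diff table.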








In Python3 :





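def l_func(p,q,max_s):
    # Longest window of p[:len(q)] against q with at most max_s mismatches.
    n = len(q)
    res_ar = [0]   # 1-based mismatch positions, with a sentinel 0 in front
    count = 0
    ans = 0
    for i in range(n):
        if(p[i]!=q[i]):
            res_ar.append(i+1)
            count += 1
            if(count>max_s):
                # window strictly between this mismatch and the one max_s+1 back
                l = res_ar[-1]-res_ar[-2-max_s]-1
                if(l>ans):ans = l
    if(count<=max_s):
        return n
    # also count the window that runs to the end of the strings,
    # e.g. "xaaa" vs "yaaa" with max_s = 0 must give 3
    return max(ans, n-res_ar[-1-max_s])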

def check_func(p,q,s):
    n = len(q)
    ans = 0
    for i in range(n):
        if(n-i<=ans):break
        l = l_func(p,q[i:],s)
        if(l>ans):
            ans = l
    for i in range(n):
        if(n-i<=ans):break
        l = l_func(q,p[i:],s)
        if(l>ans):
            ans = l
    return ans
for case_t in range(int(input())):
    str_s,p,q = input().strip().split()
    s = int(str_s)
    print(check_func(p,q,s))
                        







