Monday, April 13, 2015

Longest Common Substring


Problem: Given two strings, find the length of the longest common substring.

Solution:
  1. Find the length of the longest common suffix for all sub-strings of both strings and store these lengths in a table.
  2. The maximum length Longest Common Suffix is the longest common substring.

Implementation:

int LongestCommonSubstring(std::string str1, std::string str2)
{
int len1 = str1.size(), len2 = str2.size(), max = 0;
int **maxLenSubStr = new int* [len1];
for(int i = 0; i < len1; ++i)
maxLenSubStr[i] = new int[len2];

for(int i = 0; i < len1; ++i)
{
for(int j = 0; j < len2; ++j)
{
if(str1[i] != str2[j])
{
maxLenSubStr[i][j] = 0;
}
else
{
if(i == 0 || j == 0)
maxLenSubStr[i][j] = 1;
else
maxLenSubStr[i][j] = 1 + maxLenSubStr[i-1][j-1];
max = max < maxLenSubStr[i][j] ? maxLenSubStr[i][j] : max;
}
}
}
return max;
}

Complexity: O(n2)


Space efficient Solution:

As we can see that only the previous column of the grid storing the dynamic state is ever actually used in computing the next column. Thus, this algorithm can be altered to have only an O(n) storage instead of O(n2). By reassigning array references between two 1D arrays, this can be done without copying the state data from one array to another. Please note that using this method, we will not be able to get the longest common substring. We can only get its length.

Implementation:

int LongestCommonSubstring(std::string str1, std::string str2)
{
int len1 = str1.size(), len2 = str2.size(), max = 0;
int *curr = new int [len2], *prev = new int[len2]; 
for(int i = 0; i < len1; ++i)
{
for(int j = 0; j < len2; ++j)
{
if(str1[i] != str2[j])
{
curr[j] = 0;
}
else
{
if(i == 0 || j == 0)
curr[j] = 1;
else
curr[j] = 1 + prev[j-1];
}
max = max < curr[j] ? curr[j] : max;
}
int *temp = curr;
curr = prev;
prev = temp;
}
delete [] curr;
delete [] prev;
return max;
}

Complexity: O(n2)

No comments:

Post a Comment