Monday, September 12, 2016

Revisit the LCS

Taken from codechief.

The standard LCS problem


Given two strings, when trying to find out LCS,  we split things into three cases:
Case 1: We don't use the first character of the first string.
Case 2: We don't use the first character of the second string.
Case 3: We match up the first character of each string.
The third case is obviously only possible if the first characters match.
Once we have decided which of these options we are going to use, we have reduced the problem to one of a smaller size: we want to find the LCS of the remaining strings, which are suffixes of the original strings.
The terminating condition for the DP occurs when one of the strings is empty - in that case, the length of the LCS is 0.

This results in the following code:
[code]
// lcs is an array of dimensions [M+1][N+1]
// s is first string of length M
// t is second string of length N
//lcs[i][j] means LCS is comparing the two strings from position i and j

for (int i=M; i>=0; i--)  // using the ith character onwards of first string
for (int j=N; j>=0; j--) {   // using the jth character onwards of second string
    if (i==M || j==N) lcs[i][j]=0;   // empty string
    else {
        lcs[i][j] = max(lcs[i+1][j],lcs[i][j+1]);   // first two cases
        
        if (s[i]==t[j]) lcs[i][j] = max(lcs[i][j], lcs[i+1][j+1] + 1);   // third case
    }
}
[/code]

This code runs in O(MN) time.  lcs[0][0] will hold the length of the longest common subsequence of both strings. 

No comments: