Tuesday, September 20, 2016

Longest Common Subsequence LCS -- Again

When I see this again, I admired people coming up this solution again. Posting it here to learn it again.

Given two strings, when trying to find out LCS,  we split things into three cases, first two cases are when first characters do not match:
Case 1: We don't use the first character of the first string.
Case 2: We don't use the first character of the second string.
Case 3: We match up the first character of each string.
The third case is obviously only possible if the first characters match.
Once we have decided which of these options we are going to use, we have reduced the problem to one of a smaller size: we want to find the LCS of the remaining strings, which are suffixes of the original strings.
The terminating condition for the DP occurs when one of the strings is empty - in that case, the length of the LCS is 0.

This results in the following code:

// lcs is an array of dimensions [M+1][N+1]
// s is first string of length M
// t is second string of length N
//lcs[i][j] means LCS is comparing the two strings with s starts from position i and t starts from j

for (int i=M; i>=0; i--){  // using the ith character onwards of first string
    for (int j=N; j>=0; j--) {   // using the jth character onwards of second string
        if (i==M || j==N) lcs[i][j]=0;   // empty string
        else {
            lcs[i][j] = max(lcs[i+1][j],lcs[i][j+1]);   // first two cases
            if (s[i]==t[j]) lcs[i][j] = max(lcs[i][j], lcs[i+1][j+1] + 1);   // third case
        }
    }
}


This code runs in O(MN) time.  lcs[0][0] will hold the length of the longest common subsequence of both strings.

No comments: