Longest Common Subsequence

Problem Statement:

Given two strings find the longest common subsequence to them. Subsequence of a string \(a\) is a sequence \(b\) formed by removing one or more characters from string \(a\).

Algorithm

This is similar to edit distance but here we find subquences of two strings and find the maximum length subsequence which is common to both. In fact we can use this lcs algorithm to find lcs of string a and b then we can find edit distance by using \(edit\_dist = a.size() + b.size() - 2*lcs(a, b) \)

Pseudo code :

lcs(a,b,m,n):
    if m==0 or n==0:
        return 0
    if a[m-1] == b[n-1]:
        return 1+lcs(a,b,m-1,n-1)
    else:
        return max of lcs(a,b,m,n-1) and lcs(a,b,m-1,n)
    

Code :

    #include <bits/stdc++.h>
    using namespace std;
    
    typedef long long ll;
    typedef unsigned long long ull;
    typedef unsigned long ul;
    
    #define pii pair<int, int>
    #define vii vector<int, int>
    #define vi vector<int>
    #define rep(T, x, a, b) for (T x = a; x < b; x++)
    
    int lcs(string &a, string &b, int m, int n)
    {
        if (m == 0 || n == 0)
        {
            return 0;
        }
        if (a[m - 1] == b[n - 1])
        {
            return 1 + lcs(a, b, m - 1, n - 1);
        }
        else
        {
            return max(lcs(a, b, m, n - 1), lcs(a, b, m - 1, n));
        }
    }
    
    int main()
    {
        string a, b;
        int m, n;
        cin >> a >> b;
        m = a.size();
        n = b.size();
        cout << "LCS length = " << lcs(a, b, m, n) << "\n";
    }
                
                        
    

Complexity Analysis :

best case : \(O(mn)\) when array is already sorted

video resources

web resources

LCS explaination.

book resources