In the edit distance problem we have to find the minimum number of operations required to convert a string X of length n to another string Y of length m.
Operations allowed:
- Insertion
- Deletion
- Substitution
Table of Contents
Example
Input:
String1 = “abcd”
String2 = “abe”
Output:
Minimum operations required is 2 ( one deletion and one substitution )
We can solve this question by solving it’s sub-problem i.e., find the minimum number of operations required to convert a string X[0,…,i] to Y[0,…j].
Steps
- If the last character of X and Y are the same, then find minimum operations required to convert string X[0,…,n-1] to Y[0,…,m-1].
- If the last character of X and Y are not the same, then we can do one of the three operations:
- Insertion at last index of X.
- Deletion of the last index of X.
- Substitution of the last index of X.
Using the above steps, let’s try to draw the recursion tree for this example:
string X= “ab” and Y = “xy” Therefore n=2 and m=2 ed(i,j) denotes minimum operations required to obtained a string Y[1,…,j] from X[1,…,i]. ( 1-based index).
As none of the characters in X is same as Y, hence at every iteration, we will call recursion three times by applying one of the three allowed operations.
As we can see there are many overlapping sub-problems in the above recursion due to which the worst case complexity of this procedure is O(3^n).
This can be easily reduced using Dynamic Programming which will resolve the issue of re-computation of overlapping sub-problems.
Dynamic Programming Approach
Base conditions
- To convert a string of length n to the given string of length 0, we require n deletion operations.
- To convert a string of length 0 to the given string of length n, we require n insertion operations.
At every position we may come across any of these two situations
- If the last character of both strings is the same. In this case, we don’t need to do any operation, we can simply say that the answer to this problem will be the same as for the sub-problem editDistance(X[0,…,i-1] , Y[0,….,j-1]).
- If the last character is not the same, we have to choose one of these operations:
- Insertion: In this case, we will insert Y[j] character at ith position and add 1 to the answer of editDistance(X[0,….i], Y[0,….,j-1]). By doing this we ensure that now the last character of both strings is the same.
- Deletion: In this case, we will delete the ith character of X and add 1 to the answer of editDistance(X[0,….,i-1] , Y[0,…,j]). By doing this we are leaving the last character of X string i.e., deletion is performed.
- Substitution: In this case we will substitute X[i] by Y[j] and add 1 to the answer of editDistance(X[0,…,i-1] , Y[0,…,j-1]). This becomes the same case as having equal last character in both strings.
As we require the minimum number of operations hence, we will take a minimum of the three possible operations.
C++ Program
#include<bits/stdc++.h> using namespace std; int min(int a,int b,int c){ return min(a,min(b,c)); } int editDistance(string X,string Y){ int n=X.length(); int m=Y.length(); int ed[n+1][m+1]; // to convert a string into null string we need to perform // deletion operation n number of times where n is length of the string for(int i=0;i<=n;i++){ ed[i][0]=i; } // to convert a null string into the given string we need to perform // insertion operation n number of times where n is length of the string for(int i=0;i<=m;i++){ ed[0][i]=i; } for(int i=1;i<=n;i++){ for(int j=1;j<=m;j++){ if(X[i-1]==Y[j-1]){ // no operation required ed[i][j] = ed[i-1][j-1]; } else{ // one of the three operation required ed[i][j] = 1+ min( ed[i-1][j], //deletion ed[i][j-1], //insertion ed[i-1][j-1] //substitution ); } } } return ed[n][m]; } int main() { string X,Y; cin>>X>>Y; int result = editDistance(X,Y); cout<<"Minimum operations required to convert string "<<X<<" into string "<<Y<<" is: "; cout<<result<<"\n"; return 0; }
programming problem
Minimum operations required to convert string programming into string problem is: 7
JAVA Program
import java.util.Scanner; class Main { static int min(int x, int y, int z) { return Math.min(x,Math.min(y,z)); } static int editDistance(String X, String Y) { int n=X.length(); int m=Y.length(); int ed[][] = new int[n + 1][m + 1]; // to convert a string into null string we need to perform // deletion operation n number of times where n is length of the string for(int i=0;i<=n;i++){ ed[i][0]=i; } // to convert a null string into the given string we need to perform // insertion operation n number of times where n is length of the string for(int i=0;i<=m;i++){ ed[0][i]=i; } for(int i=1;i<=n;i++){ for(int j=1;j<=m;j++){ if(X.charAt(i - 1)==Y.charAt(j - 1)){ // no operation required ed[i][j] = ed[i-1][j-1]; } else{ // one of the three operation required ed[i][j] = 1+ min( ed[i-1][j], //deletion ed[i][j-1], //insertion ed[i-1][j-1] //substitution ); } } } return ed[n][m]; } public static void main(String args[]) { String X ,Y; Scanner sc = new Scanner(System.in); X = sc.nextLine(); Y = sc.nextLine(); int result = editDistance(X,Y); System.out.println("Minimum operations required to convert string "+X+" into string "+Y+" is: "+result); } }
codingAndChill codeAndGrow
Minimum operations required to convert string codingAndChill into string codeAndGrow is: 8
Complexity Analysis
Time complexity: O(n*m)
where n = length of 1st string
m = length of 2nd string
as we are filling a 2D matrix( ed ) using two nested loops.