1,585 reads

Solving the Edit Distance Problem Using The Dynamic Programming Approach

by Shakhawat HossainMay 9th, 2020

Too Long; Didn't Read

A problem that can be broken into several chunks and then combined them to get the optimum solution is known as an optimal substructure. The problem we are going to solve has the same characteristic as recursion. The main task is to find the minimum number of operations needed to convert word1 to word2. We need to ask, have we incurred any operation, no. To convert a character into another, first, we need to figure out the number of actions that were needed. And somewhere we have to store the previous number of. actions to calculate the current one.

People Mentioned

featured image - Solving the Edit Distance Problem Using The Dynamic Programming Approach

The first question that arises when solving a problem using dynamic programming(DP) is how to figure out that DP is a way to solve it?

So I will solve a problem using dynamic programming and will explain how I have figured it out for this one.

“A problem well stated is a problem half solved.” -- John Dewey

Let's discuss the problem to make it half solved.

Given two words word1 and word2. Find the minimum number of operations required to convert word1 to word2.
You have the following 3 operations permitted on a word:

Insert a character
Delete a character
Replace a character

Understand the problem:

We will get two words and three permitted actions. We have to convert word1 to word2 by performing those permitted actions. The main task is to find the minimum number of actions needed to convert word1 to word2.

Lets take an example to make it more clear.

Input: word1 = "intention" word2 = "execution"
Output: 5
Explanation:
intention -> inention (remove 't')
inention -> enention (replace 'i' with 'e')
enention -> exention (replace 'n' with 'x')
exention -> exection (replace 'n' with 'c')
exection -> execution (insert 'u')

Indications of Dynamic Programming(DP):

I have already talked about the DP in this article. You can take a look at it to get a refreshment about basic idea of DP. If we look at the definition of DP, there are two major points.

1. Optimal Substructure
2. Overlapping sub-problems

Now the most important question what do those points mean? Let's get to them one by one.

Optimal Substructure: A problem that can be broken into several chunks and then combined them to get the optimum solution is known as an optimal substructure. That is closely related to recursion. So, in summary, we can say, if we can solve a problem recursively, that problem has an optimal substructure.

Edit distance problem has optimal substructure, cause we have to do a similar task again and again based on some predefined condition.

Overlapping sub-problems: A problem is said to have an overlapping sub-problems if the problem can be split into sub-problems where it involves solving the same sub-problem multiple times.

The problem we are going to solve has the same characteristic. How!... Remember the primary task is to convert one word into another. If we split it into another level, we can say, we have to convert a character into another one. Now to convert a character of a sudden position into another character, first, we need to figure out the number of actions that were needed to convert the previous characters. Let's get an example that fits this scenario. Suppose we have word1 = 'ab', word2 = 'cd'. To convert a to c, first, we need to ask, have we incurred any operation, no. To convert a -> c, we need to perform 1 operation. In total 0+1 = 1 operation. Now we have to convert b to d. So we have to take one action that replaces b with d and the total is (previous + current) 1 + 1 = 2. And somewhere we have to store the previous number of actions to calculate the current one.

So we can say whatever the current situation is, the major thing that we have to do is calculate the number of operations that have been incurred previously. This indication confirms that this problem contains overlapping sub-problems.

As we have an overview of the problem and how it is related to the DP, lets dive into the coding part by taking another example

This time we are going to count the number of operations also by using a chart. Each character of the first word will take place on the x-axis. And the second-word characters will be placed on the y-axis. There is an exceptional case that we have to keep in mind. We may have to convert a string to an empty string or vice versa. So the first column and row will be denoted with the Phi symbol. Assume the word1 = 'TEA' and word2 = 'COFFEE'. So the chart will look something like this.

Let's fill the first column. The first column denotes, if we want to cover our word 'TEA' into an empty string, then what will be the number of required actions. We will count it one by one. If we want to convert an empty word to an empty word, the number of actions will be 0. If we want to convert 'T' to an empty word, we have to remove 'T'. So the number of actions becomes 1. Similarly, to convert the TE to an empty string, we need have to take 2 actions. Remove 'T', then remove 'E'. As described above, to convert 'TEA' to an empty word, we have to make 3 remove actions. The process is similar for the first row too. This time we have to covert an empty word into 'COFFEE'. After calculating all the steps, the chart will look something like,

Now we will take it into the next step. Here we will fill the second row of our chart. As usual, the first task is to convert 'T' to 'C'. It should take 1 action to do so. All we have to do is, replace 'T' with 'C'. One question is pocking into my mind at this moment. Can we relate this value with any of our previous value? It looks difficult at this moment. Let me help you to find some possible paths. All I can see, we can take any of its immediate top or left value or we can find the minimum value from its immediate top, left and diagonal left values to add 1 to it(Seems more valid, cause we have to find the minimum number of actions.). Lets complete the row by following the process that we have discussed above. We will see the following chart so far.

We can fill the third row before the cell denoted by row 'E' and column 'E'.

The cell denoted by row 'E' and column 'E' has a special case. Here we can see that we have to convert 'TE' into 'COFFE'.To reach 'COFF', we need to take 4 actions. Now, 'E' matches with 'E'. As it matched, we don't need to incur any operation. So the number of operation remains at 4. Let's make another guess for this scenario. We can see only two preferred options. Take any value from the immediate left or the diagonal left cell.

Which one we should choose from the above conditions seems indistinct. To make it more clear, let's move to another step. At this moment we aim to convert 'TE' to 'COFFEE'. We had to take 4 actions to match 'COFEE'. Now we have to add another 'E' to the word. As 'E' matches with the 'E'. We stay with the previous guess we did, we can find only one option left. Take the value of the immediate diagonal left cell.

If we continue the steps mentioned above, the final chart will looks similar to,

The answer we are looking for exists at the bottom right corner cell of our chart. Amazing.... We have the solution in our hands now. Lets summarize the process to point out the steps that we need to perform for solving this problem.

Steps to solve the problem:

1.  Define a 2d array(n+1 * m+1). Where n and m respectively holds the the length of word1 and word2.
2.  Assign first row with 0 to m, and first column with 0 to n.
3.  Iterate through each of the cell except row 2 and column 2 to calculate the number of actions.
    1. If both of the characters of a given cell matches, take the immediate diagonal left value to fill that cell.
    2. If the characters do not match, find the minimum from immediate left, top or diagonal left values and add 1 to it.
4.  Return the bottom right most cell value.

We have covered a long path. I love to work with Ruby. So I will solve it in Ruby.

def min_distance(word1, word2)
    m = word1.length
    n = word2.length
    dp = Array.new(m+1) { Array.new(n+1, 0) }
    for i in 0..m
        dp[i][0] = i
    end
    for i in 1..n
        dp[0][i] = i
    end
    for i in  0...m
        for j in  0...n
            if word1[i] == word2[j]
                dp[i + 1][j + 1] = dp[i][j]
            else 
                dp[i + 1][j + 1] = [dp[i][j], dp[i][j + 1], dp[i + 1][j]].min + 1
            end
        end
    end
    dp[m][n]
end

Time Complexity:
The time complexity of this algorithm is O(m*n). As we are iterating through each of the cell.

Space Complexity:
We stored all the values in a 2D array. And initially we have added extra one column and row to store value for the empty state. So the space complexity for this algorithm becomes O(n+1 * m+1).

For further information:

1. https://www.geeksforgeeks.org/dynamic-programming /
2. https://leetcode.com/problems/edit-distance/
3. https://en.wikipedia.org/wiki/Dynamic_programming