Dynamic Programming
What is Dynamic Programming?
Dynamic programming (usually referred to as DP) is a very powerful technique for solving a particular class of problems. It demands an elegant formulation of the approach; once the thinking is done, the coding part is very easy. The idea is simple: if you have solved a problem for a given input, save the result for future reference, so that you avoid solving the same problem again. In short: 'Remember your Past' :)
Now you should ask: how is it different from the Divide & Conquer method?
- The answer is very simple. There is only one key difference: in Divide and Conquer, once you solve a particular sub-problem you never encounter it again; the sub-problems are non-overlapping. In D&C, we divide the problem into non-overlapping sub-problems and solve them independently, as in MergeSort and QuickSort.
- Whereas, if you find that there are many overlapping sub-problems, that is a big hint that it's a DP problem. If you still don't get the point, don't worry; just continue till the end.
Understanding DP more precisely:
If the given problem can be broken up into smaller sub-problems, and these smaller sub-problems can in turn be divided into still smaller ones, and in this process you observe some overlapping sub-problems, then it's a big hint for DP. In addition, the optimal solutions to the sub-problems must contribute to the optimal solution of the given problem (referred to as the Optimal Substructure Property).
There are two ways of doing this:
1. Top-Down (aka Memoization)
2. Bottom-Up (aka Tabulation)
Let's look at both of them:
1) Top-Down:
Start solving the given problem by breaking it down. If you see that the problem has been solved already, then just return the saved answer. If it has not been solved, solve it and save the answer. This is usually easy to think of and very intuitive. This is referred to as Memoization.
2) Bottom-Up:
Analyze the problem and see the order in which the sub-problems are solved and start solving from the trivial sub-problem, up towards the given problem. In this process, it is guaranteed that the sub-problems are solved before solving the problem. This is referred to as Tabulation.
Hence we could say the following are the characteristics of DP:
1) Overlapping sub-problems.
2) Optimal Substructure Property.
Recursion vs Dynamic Programming:
Recursion uses the top-down approach to solve the problem, i.e., it begins with the core problem, breaks it into sub-problems, and solves those sub-problems in the same manner. In this approach the same sub-problem can occur multiple times and consume extra CPU cycles, increasing the time complexity. In dynamic programming, the same sub-problem is not solved multiple times; the previously computed result is reused to optimize the solution.
Let's understand this better with the Fibonacci series example below:
Fib(4) = Fib(3) + Fib(2)
= (Fib(2) + Fib(1)) + Fib(2)
= ((Fib(1) + Fib(0)) + Fib(1)) + Fib(2)
= ((Fib(1) + Fib(0)) + Fib(1)) + (Fib(1) + Fib(0))
Here, we can clearly see that calls to Fib(1) and Fib(0) are made multiple times. Similarly, in the case of Fib(100), these calls would number in the millions. Hence there is a lot of wastage of resources (CPU cycles, and memory for storing information on the stack).
In (bottom-up) dynamic programming all the sub-problems are solved, even those which are not needed, whereas in recursion only the required sub-problems are solved. So a dynamic programming solution should be properly framed to remove this ill-effect.
NOTE: Dynamic programming and recursion work in an almost similar way when the core problem has non-overlapping sub-problems. In such scenarios, other approaches such as "divide and conquer" could be used instead.
NOTE: A Dynamic Programming solution is based on the principle of Mathematical Induction, whereas greedy algorithms require other kinds of proof.
Let's start your DP journey with a very simple problem:
Minimum Steps To One
Problem Statement:
On a positive integer, you can perform any one of the following 3 steps:
1) Subtract 1 from it. ( n = n - 1 )
2) If it's divisible by 2, divide it by 2. ( if n % 2 == 0 , then n = n / 2 )
3) If it's divisible by 3, divide it by 3. ( if n % 3 == 0 , then n = n / 3 )
Now the question is: given a positive integer n, find the minimum number of steps that take n to 1.
Example Input, Output:
1) For n = 1 , output: 0
2) For n = 4 , output: 2 ( 4 /2 = 2 /2 = 1 )
3) For n = 7 , output: 3 ( 7 -1 = 6 /3 = 2 /2 = 1 )
Approach / Idea:
One can think of greedily choosing the step which makes n as low as possible, and continuing the same till it reaches 1. If you observe carefully, the greedy strategy doesn't work here. E.g.: given n = 10 , greedy gives 10 /2 = 5 , 5 -1 = 4 , 4 /2 = 2 , 2 /2 = 1 ( 4 steps ). But the optimal way is 10 -1 = 9 , 9 /3 = 3 , 3 /3 = 1 ( 3 steps ). So we need to try out all possible steps for each value of n we encounter and choose the minimum of these possibilities.
It all starts with recursion :). F(n) = 1 + min{ F(n-1) , F(n/2) , F(n/3) } if n > 1 , else 0 ( i.e., F(1) = 0 ). Now that we have our recurrence equation, we can right away start coding the recursion. Wait... does it have overlapping sub-problems? YES. Does the optimal solution for a given input depend on the optimal solutions of its sub-problems? Yes... Bingo! It's DP :) So, we either store the solutions to the sub-problems we solve and use them later on, as in memoization, or we start from the bottom and move up till the given n, as in tabulation. As this is the very first problem we are looking at here, let's see both the codes.
Memoization
int memo[n+1]; // we will initialize the elements to -1 ( -1 means "not solved yet" )

int getMinSteps ( int n )
{
    if ( n == 1 ) return 0;              // base case
    if ( memo[n] != -1 ) return memo[n]; // we have solved it already :)
    int r = 1 + getMinSteps( n - 1 );    // '-1' step; 'r' will contain the optimal answer finally
    if ( n % 2 == 0 ) r = min( r , 1 + getMinSteps( n / 2 ) ); // '/2' step
    if ( n % 3 == 0 ) r = min( r , 1 + getMinSteps( n / 3 ) ); // '/3' step
    memo[n] = r; // save the result. If you forget this step, it's the same as plain recursion.
    return r;
}
Bottom-Up DP
int getMinSteps ( int n )
{
    int dp[n+1], i;
    dp[1] = 0; // trivial case
    for ( i = 2 ; i <= n ; i++ )
    {
        dp[i] = 1 + dp[i-1];                                  // '-1' step
        if ( i % 2 == 0 ) dp[i] = min( dp[i] , 1 + dp[i/2] ); // '/2' step
        if ( i % 3 == 0 ) dp[i] = min( dp[i] , 1 + dp[i/3] ); // '/3' step
    }
    return dp[n];
}
Both the approaches are fine. But one should also take care of the considerable overhead involved in the function calls in memoization, which may occasionally cause a stack-overflow error or a Time-Limit-Exceeded verdict.
Problem : Longest Common Sub-sequence (LCS)
Given a sequence of elements, a sub-sequence of it can be obtained by removing zero or more elements from the sequence, preserving the relative order of the elements. Note that for a sub-string the elements need to be contiguous in the given string; for a sub-sequence they need not be. E.g.: S1="ABCDEFG" is the given string. "ACEG" and "CDF" are sub-sequences, whereas "AEC" is not. For a string of length n the total number of sub-sequences is 2^n ( each character can either be taken or not taken ). Now the question is: what is the length of the longest sub-sequence that is common to the given two strings S1 and S2? Let's denote the length of S1 by N and the length of S2 by M.
Brute Force
Consider each of the 2^N sub-sequences of S1, check if it is also a sub-sequence of S2, and take the longest of all such common sub-sequences. Clearly, very time consuming.
Recursion
Can we break the problem of finding the LCS of S1[1...N] and S2[1...M] into smaller sub-problems?
Some famous Dynamic Programming algorithms are:
- Unix diff for comparing two files
- Bellman-Ford for shortest path routing in networks
- TeX the ancestor of LaTeX
- WASP - Winning and Score Predictor
Now it's time to get our hands dirty by solving some basic / trivial problems:
1) Fibonacci Number
2) Counting Bits
3) Best Time to Buy and Sell Stock II
4) Pascal's Triangle
5) Divisor Game
Some classic DP problems to solve:
1) LIS ( Longest increasing sub-sequence )
2) LCS ( Longest common sub-sequence )
3) Knapsack Problem
4) Matrix Chain Multiplication
5) Subset sum
6) Coin change
7) All to all shortest path in graph
8) Assembly line scheduling
Some Advance DP problems:
1) DP k-Th lexicographical string
2) DP tree
3) DP+ BIT/segment tree
4) DP + convex hull
5) DP pre-processing
6) DP bit-mask ( UniqueCap )
7) DP matrix multiplication/ DP using recurrence
8) DP + TRIE ( Morse )
9) DP + Binary Search
10) DP + Knuth optimization (Breaking Strings)
Key Links / Resources to refer:
- MIT OpenCourseWare - Introduction to Algorithms
- Illinois education.
- The TopCoder article on DP.
- Indian Computing Olympiad article.
Signing Off with a quote:
Those who do not remember the past are condemned to repeat it.
- Dynamic Programming.