Sunday, February 22, 2015

Knapsack

During a robbery, a burglar finds much more loot than he had expected and has to decide what to take. His bag (or “knapsack”) will hold a total weight of at most W pounds. There are n items to pick from, of weight w1...wn and dollar value v1...vn. What's the most valuable combination of items he can t into his bag?

There are two versions of this problem: 
1: Unlimited quantities of each item available
2: Only one of each item

1: Knapsack with repetition:

K(w) = maximum value achievable with a knapsack of capacity w

Can we express this in terms of smaller sub-problems? Well, if the optimal solution to K(w) includes item i, then removing this item from the knapsack leaves an optimal solution to K(w - wi). In other words, K(w) is simply K(w - wi) + vi, for some i. We don't know which i, so we need to try all possibilities:


Implementation:

int maxWeight(int *weights, int *values, int *K, int n, int w)
{
    int max = 0;
    for(int i = 0; i < n; ++i)
    {
            if(weights[i] <= w)
            {
                          int val = K[w-weights[i]] + values[i];
                          if(max < val)
                                 max = val;
            }
    }
    return max;
}

int knapsackWithRepetition(int *weights, int *values, int n, int W)
{
    int *K = new int[W + 1];
    K[0] = 0;
    for(int w = 1; w <= W; ++w)
    {
            K[w] = maxWeight(weights, values, K, n, w);
    }
    return K[W];
}

Time Complexity: O(nW)


2: Knapsack without Repetition:

K(w, j) = maximum value achievable using a knapsack of capacity w and items 1...j


Implementation:

int knapsackWithoutRepetition(int *weights, int *values, int n, int W)
{
    int **K = new int*[W + 1];
    for(int w = 0; w <= W; ++w)
    {
            K[w] = new int[n + 1];
            K[w][0] = 0;
    }
    for(int i = 0; i <= n; ++i)
            K[0][i] = 0;
    
    for(int i = 1; i <= n; ++i)
    {
            for(int w = 1; w <= W; ++w)
            {
                    if(weights[i-1] > w)
                                  K[w][i] = K[w][i - 1];
                    else
                        K[w][i] = MAX(K[w][i-1], K[w-weights[i-1]][i-1] + values[i-1]);
            }
    }
    return K[W][n];
}

Time Complexity: O(nW)

Edit Distance

When a spell checker encounters a possible misspelling, it looks in its dictionary for other words that are close by. What is the appropriate notion of closeness in this case?
A natural measure of the distance between two strings is the extent to which they can be aligned, or matched up. Technically, an alignment is simply a way of writing the strings one above the other. For instance, here are two possible alignments of SNOWY and SUNNY:


The - indicates a “gap”, any number of these can be placed in either string. The cost of an alignment is the number of columns in which the letters differ. And the edit distance between two strings is the cost of their best possible alignment.

Edit distance is so named because it can also be thought of as the minimum number of edits—insertions, deletions, and substitutions of characters—needed to transform the first string into the second.

Dynamic Programming Solution:

Our goal is to find the edit distance between two strings x[1...m] and y[1...n]. So we need to find out the sub-problem, it should go part of the way toward solving the whole problem, so how about looking at the edit distance between some prefix of the first string, x[1...i], and some prefix of the second, y[1...j]? Call this sub-problem E(i, j). Our objective is to compute E(m, n).


where diff(i, j) = 0 if x[i] = y[j] and 1 otherwise.

Example:


Implementation:

int min(int i, int j, int k)
{
    return i < j ? (i < k ? i : k) : (j < k ? j : k);
}

int editDistance(string str1, string str2)
{
    int m = str1.length();
    int n = str2.length();
    int **E = new int*[m + 1];
    for(int i = 0; i <= m; ++i)
    {
            E[i] = new int[n + 1];
            E[i][0] = i;
    }
    for(int i = 0; i <= n; ++i)
            E[0][i] = i;
    
    for(int i = 1; i <= m; ++i)
    {
            for(int j = 1; j <= n; ++j)
                    E[i][j] = min(E[i - 1][j] + 1, E[i][j - 1] + 1, E[i - 1][j - 1] + (int)(str1[i - 1] != str2[j - 1]));
    }
    return E[m][n];
}

Time Complexity: O(mn)

Saturday, February 21, 2015

Median of two sorted array.

Problem: There are two sorted arrays of size n each. Write an algorithm to find the median of the array obtained after merging the given two arrays.

Algorithm:
1) Calculate the medians med1 and med2 of the input arrays arr1[]
   and arr2[].
2) If med1 and med2 both are equal then return med1.
3) If med1 is greater than med2, then median is present in one
    of the below two subarrays.
      a)  From first element of arr1 to med1.
      b)  From med2 to last element of arr2.
4) If med2 is greater than med1, then median is present in one  
    of the below two subarrays.
      a)  From med1 to last element of arr1.
      b)  From first element of arr2 to med2.
5) Repeat the above process until size of both the subarrays becomes 2.
6) If size of the two arrays is 2 then use following formula to get the median:
       Median = (MAX(arr1[0], arr2[0]) + MIN(arr1[1], arr2[1]))/2

Implementation:
inline int MAX(int i, int j)
{
       return i > j ? i : j;
}

inline int MIN(int i, int j)
{
       return i < j ? i : j;
}

int median(int *arr, int n)
{
    if(n % 2 == 0)
         return (arr[n/2] + arr[n/2 - 1]) / 2;
    else
        return arr[n/2];
}

int getMedian(int *arr1, int *arr2, int n)
{
    if(n <= 0)
         return -1;
    if(n == 1)
         return (arr1[0] + arr2[0]) / 2;
    if(n == 2)
         return (MAX(arr1[0], arr2[0]) + MIN(arr1[1], arr2[1])) / 2;
    int med1 = median(arr1, n);
    int med2 = median(arr2, n);
   
    if(med1 == med2)
          return med1;
    if(med1 < med2)
    {
          if(n % 2 == 0)
               return getMedian(arr1 + n/2 - 1, arr2, n - n/2 + 1);
          else
               return getMedian(arr1 + n/2, arr2, n - n/2);
    }
    else
    {
        if(n % 2 == 0)
             return getMedian(arr1, arr2 + n/2 - 1, n - n/2 + 1);
        else
             return getMedian(arr1, arr2 + n/2, n - n/2);
    }
}

Time Complexity: O(logn)

Friday, February 20, 2015

Design a stack with additional operations on middle element.

Problem: Implement a stack which supports following operations in O(1) -
1) push() which adds an element to the top of stack.
2) pop() which removes an element from top of stack.
3) findMiddle() which returns middle element of the stack.
4) delMiddle() which deletes the middle element.

Solution: Use Doubly Linked List to implement this stack.

Implementation:

struct DLNode
{
       int data;
       DLNode *next;
       DLNode *prev;
       DLNode(int i):data(i), next(0), prev(0){}
};

class Stack
{
public:
       Stack():head(0), mid(0), size(0){}
       void push(int i);
       int pop();
       int findMiddle();
       void delMiddle();
private:
      DLNode *head;
      DLNode *mid;
      int size;
};

void Stack::push(int i)
{
     DLNode *node = new DLNode(i);
     node->next = head;
     node->prev = NULL;
     size += 1;
     if(size == 1)
         mid = node;
     else
     {
         head->prev = node;
         if(size & 1)
                 mid = mid->prev;
     }
     head = node;  
}

int Stack::pop()
{
    if(size == 0)
             return -1;
    DLNode *node = head;
    int retVal = head->data;
    head = head->next;
    if(head)
            head->prev = 0;
    size -= 1;
    if(!(size & 1))
            mid =  mid->next;
    delete node;
    return retVal;
}

int Stack::findMiddle()
{
    int retVal = -1;
    if(size != 0)
           retVal = mid->data;
    return retVal;
}

void Stack::delMiddle()
{
     if(size == 0)
             return;
     DLNode *node = mid;
     if(size == 1)
     {
             mid = 0;
             head = 0;
     }
     --size;
     mid->prev->next = mid->next;
     mid->next->prev = mid->prev;
     if(!(size & 1))
               mid = node->next;
     else
         mid = node->prev;
     delete node;
}

Tuesday, February 17, 2015

Google Question: Construct BST from given preorder traversal

Problem: Construct BST from given preorder traversal.

Implementation:

BSTree::BSTree(int *pre, int size)
{
       int index = 0;
       root = constructTree(pre, index, pre[index], INT_MIN, INT_MAX, size);
}

BNode* BSTree::constructTree(int *pre, int &index, int num, int min, int max, int size)
{
     if(index >= size)
              return NULL;
     BNode *node = 0;
     if(num > min && num < max)
     {
            node = new BNode(num);
            ++index;
            if(index < size)
            {
                     node->left =  constructTree(pre, index, pre[index], min, num, size);
                     node->right = constructTree(pre, index, pre[index], num, max, size);
            }
     }
     return node;
}

Monday, February 16, 2015

Amazon Question: Reverse a Linked List in groups of given size

Problem: Given a linked list, write a function to reverse every k nodes.

Implementation:

void List::reverseK(int k)
{
     if(head)
             head = reverseK(head, k);
}

Node* List::reverseK(Node *node, int k)
{
     Node *curr = node, *next = 0, *prev = 0;
     int count = 0;
     while(curr && (count < k))
     {
                 next = curr->next;
                 curr->next = prev;
                 prev = curr;
                 curr = next;
                 ++count;
     }
   
     if(next)
             node->next = reverseK(next, k);
     return prev;
}

Friday, February 13, 2015

Amazon Question: Convert a binary tree to a tree that holds children sum tree

Problem: Given an arbitrary binary tree, convert it to a binary tree that holds Children Sum Property. You can only increment data values in any node (Structure can not be changed and decrement operation is not allowed).

Algorithm: Traverse given tree in post order to convert it, i.e., first change left and right children to hold the children sum property then change the parent node.

Let difference between node’s data and children sum be diff.

     diff = node’s children sum - node’s data  

If diff is 0 then nothing needs to be done.

If diff > 0 then increment the node’s data by diff.

If diff < 0 then increment one child’s data. We can choose to increment either left or right child if they both are not NULL. Let us always first increment the left child. Incrementing a child changes the subtree’s children sum property so we need to change left subtree also. So we recursively increment the left child. If left child is empty then we recursively call increment() for right child.

Implementation:

void increment(struct node* node, int diff)
{
  if(node->left != NULL)
  {
    node->left->data = node->left->data + diff;
    increment(node->left, diff);
  }
  else if (node->right != NULL)
  {
    node->right->data = node->right->data + diff;
    increment(node->right, diff);
  }
}

void convertTree(struct node* node)
{
  int left_data = 0,  right_data = 0, diff;
  if (node == NULL ||(node->left == NULL && node->right == NULL))
    return;
  else
  {
    convertTree(node->left);
    convertTree(node->right);

    if (node->left != NULL)
      left_data = node->left->data;

    if (node->right != NULL)
      right_data = node->right->data;

    diff = left_data + right_data - node->data;

    if (diff > 0)
       node->data = node->data + diff;

    if (diff < 0)
      increment(node, -diff);
  }
}

Check for children sum property in a binary tree

Problem: Given a binary tree, write a function that returns true if the tree satisfies below property:
For every node, data value must be equal to sum of data values in left and right children. Consider data value as 0 for NULL children.

Algorithm: Traverse the given binary tree. For each node check (recursively) if the node and both its children satisfy the Children Sum Property, if so then return true else return false.

Implementation:

bool isSumProperty(Node* node)
{
  int left_data = 0,  right_data = 0;
  if(node == NULL || (node->left == NULL && node->right == NULL))
    return true;
  else
  {
    if(node->left != NULL)
      left_data = node->left->data;
    if(node->right != NULL)
      right_data = node->right->data;
    if((node->data == left_data + right_data)&&
        isSumProperty(node->left) &&
        isSumProperty(node->right))
      return true;
    else
      return false;
  }
}

Range Minimum Query(RMQ)

Range Minimum Query (RMQ) is used on arrays to find the position of an element with the minimum value between two specified indices i.e. Given an array A[0,n-1] of n objects, Range Minimum Query from i to j asks for the position of a minimum element in the sub-array A[i, j].             
                


 
                                    


Preprocess RMQ:

             
The best approach  is to preprocess RMQ for sub arrays of length 2k using dynamic programming. Keep an array M[ 0, N-1 ] [ 0, logN ] where M[ i ] [ j ] is the index of the minimum value in the sub array starting at i having length 2j. Here is an example:





     

For computing M [ i ] [ j ] we must search for the minimum value in the first and second half of the interval. It's obvious that the small pieces have 2j - 1 length, so the recurrence is:




Implementation of the preprocessing function is as follows:


int** preprocess(int *A, int size)
{
    int **M = new int*[size];
    int logN = logbase2(size);

    for(int i=0i<10M[i++] = new int[logN]);
   
    // Generating Table of M[n][log(n)]

    for(int i=0i<10; ++i)
        M[i][0] = i;
   
    for(int j=1; (1<<j) <= 10; ++j)
    {
        for(i=0i+(1<<j)-1 < 10; ++i)
        {
            if(A[M[i][j-1]] <= A[M[i+(1<<(j-1))][j-1]])
                M[i][j] = M[i][j-1];
            else
                M[i][j] = M[i+(1<<(j-1))][j-1];
        }
      }
    return M;
}



RMQ Function:

Once we have these values preprocessed, We can use them to calculate RMQA(i, j). The idea is to select two blocks that entirely cover the interval [i..j] and  find the minimum between them. Let k = [log(j - i + 1)]. For computing RMQA(i, j) we can use the following formula:




Implementation of RMQ function is as follows:

intRMQ(int *A, int size, int i, int j)
{

    int **M = preprocess(A, size);
    int k=0;
    while((1<<k++) < (j-i));
    --k;

    if(A[M[i][k]] <= A[M[j-(1<<k)+1][k]])
        return A[M[i][k]];
    else
        return A[M[j-(1<<k)+1][k]];
}

You can see the complexity of preprocess( ) function is O(nlogn) and complexity of RMQ function is O(1).

Knuth-Morris-Pratt(KMP)

String Matching Problem:
             
            Suppose a Text is an array T[1..n] of length n and pattern is also an array P[1..m] (m <= n). We say pattern P occurs with shift s in Text T if 0 <= s <= n-m and T[s+1..s+m] = P[1..m] i.e. T[s + j] = P[j] for all 1 <= j <=m.
             
                 If P occurs with shift s in T the s is a valid shift otherwise it is an invalid shift.The string matching problem is the problem of finding all valid shifts with which a given Pattern P occurs in a given Text T.

 
                                   
Notation and Terminology:

Prefix : W is a prefix of string x if x = wy for some string y.
Suffix : W is suffix of string x if x = yw for some string y.
xy – string concatenation   | x | - length of string x

Knuth-Morris-Pratt Algorithm:

Knuth, Morris and Pratt proposed a linear time algorithm for the string matching problem.
A matching time of O(n) is achieved by avoiding comparisons with elements of T that have previously been involved in comparison with some elements of the pattern P to be matched. i.e., backtracking on the string T never occurs.

Components of KMP Algorithm:

Prefix Function ΠPrefix function Π for a pattern encapsulate knowledge about how the pattern matches against shifts of itself.

         
Knowing these q text characters allows us to determine immediately that certain shifts are invalid since the first character a would be aligned with a text character that is known to match with second pattern character b. So next shift should be s = s + 2.
           
 Given P[1..q] matches T[s+1..s+q] what is the least shift s1 > s such that
 P[1..k] = T[s1+1 . . s1+k] where s1 + k = s + q

Prefix function Π [q] = max ( k : k < q and Pk is suffix of Pq  where Pk  is P[1..k] i.e.  Π[q] is the length of longest prefix of P that is a proper suffix P."

My implementation of Prefix function is as follows:


intPrefix(char *pattern)
{
    int len = strlen(pattern);
    int *pre = new int[len];

    pre[0] = 0;
    int k = 0;

    for(int i=1i<len; ++i)
    {
        while(k>0 && pattern[k] != pattern[i])
            k = pre[k];
        if(pattern[k] == pattern[i])
            ++k;
        pre[i] = k;
    }
    return pre;
}

KMP Matcher:  With text T, pattern P and prefix function Π as inputs, it finds all the occurrence of P  in T and returns the number of shifts of  P after which occurrence is found.
Implementation of KMP Match function is as follows:

 bool Match(char *textchar *patternvector<int>& shifts)
{
    int textLength = strlen(text);
    int patrnLength = strlen(pattern);
    intprefixArray = Prefix(pattern); // Π function
    bool found = false;
    int q = 0;

    for(int i=0i<textLength; ++i)
    {
        whileq>0 && pattern[q] != text[i])
            q = prefixArray[q-1]; //Next character does not macth
        if(pattern[q] == text[i])
            ++q;  //Next character Matched
        if(q == patrnLength//All matched
        {
            shifts.push_back(i-patrnLength);
            found = true;
            q = prefixArray[q-1];  //Look for next match
        }
       
    }
    return found;
}

You can see both prefix( ) function and Match function takes linear time to execute.

Count Sort

This sorting algorithm takes advantage of the range of the numbers in the array to be sorted.
Say the range is 0..k then it can sort the array in O(k). If k is less than or equal to n, we are in great advantage because we can sort this array in linear time i.e. O(n).

The algorithm first determine for each element x the number of elements less than x. One of the advantage of this algorithm is its stability.

Following is my implementation:

void countSort(int *a, int size)
{
    int max = Max(a);
    int *c = new int[max+1];
    int *sorted = new int[size];
    for( int i=0; i<size; ++c[a[i++]] );   // Step 1
    for( i=1; i<max; c[i++] += c[i-1] );     // Step 2:Get the number of elements less than each number
    for(i=size-1; i >= 0; --i)    // Step 3
    {
        sorted[c[a[i]]-1] = a[i];
        --c[a[i]];
    }
    for(i=0; i<size; ++i)
        a[i] = sorted[i];
}

Example --

Input Array   

25302303  

Step 1: Array C

202301

Step 2: Array C

2
2
4
7
7
8

Step 3:

 a[7] = 3  c[3] = 7 sorted [6] = 3 => c[3] = 6

3  

a[6] = 0  c[0] = 2  sorted[1] = 0  => c[0] = 1

03  

a[5] = 3  c[3] = 6  sorted[5] = 3  => c[3] = 5

033  

.
.
.
.

In the End Sorted array

0022333 5