Thursday, July 23, 2015

Some Basic Ideas

000

the nature of the data structure has a direct impact on the efficiency of the
algorithms

data structure conversion

Very often, in order to achieve better access time, one data structure need to be converted to another data structure that naturally support better access time. E.g. converting string to array or hashtable to convert its elements access time from O(n) to O(1).  Conversion on data structure implies more space is required in memory.

Pointers or moving cursors

They are the key to in-place operation. often, one is to traversal the input, another is to the newly built result. for example, for in-place character removal in a string, destination cursor points to position that will be eventually returned back in new string, while source cursor is to traversal the original string character by character to make sure every characters are processed.

Other situations includes head pointer and tail pointer and they move towards middle, as in example to reverse a ward or sorted array, or to determine is a string is palindrome.


void reverseString( char str[], int start, int end ){
char temp;
while( end > start ){
/* Exchange characters */
temp = str[start];
str[start] = str[end];
str[end] = temp;
/* Move indices towards middle */
start++; end--;
}
}
Another obvious use of two pointers is merging two sorted arrays.

Binary search

it's used widely when finding an element in a sorted sequence.
middle point is just a concept of a point in between start and end, it does not has to be in the middle of start and end. A problem to find string in sorted string array with interspersed empty strings ss a good demo to this.

Hashtable vs Array

Hashtable has extra overhead which array does not, but array need to be predefined on its capacity, while hashtable only for content actually in the input. For vrey long string, Array can be a better choice since total number of distinct characters are predefined and likely the very long string will contain every character. Choice between them is situation by situation.

Array is a contiguous block of memory, while linked list is sparsely distributed. inserting or deleting an element in array requires every elements after it either be shifted left or right or even totally relocate to different memory area. while linked list just need to change a node's linkage. But traversal of linkedlist is linear and worse can be N if the element is at tail of singly linked list.

Recursion


All recursion can be converted to iteration.
it is most useful for tasks that can be defined in terms of similar subtasks. the key is to identify out relationship between whole and subtasks. Tree problems can typically be resolved by recursion because tree is composed of subtrees.

Base case: when to stop, when the subroutine can finish without calling itself
Recursive cases: the sub problems have not been resolved.

When converting recursive to iterative routine, usually start from base case, and build up the whole solution from bottom up. Recursive calls are generally used to preserve the current values of local variables and restore them when the subtask performed by the recursive call is completed.Since recursively calls are using stack to reserve the local variables, you can also use stack in iterative calls to mimic the recursive call behavior, but then very often, it is about more trouble than it’s worth.

with help of memorization, many recursive solution become more efficient by avoiding duplicated calculations, and can be called DP: dynamite programming.

It's common to prepare some parameter in another routine, and call the recursive routine from there.

Iteration


When converting recursion to iteration, think it either bottom up or top down. and also thinking about using a queue like structure to hold the intermediate results for further process until result is good.
Typical example would be using queue in tree's traversal, and string permutation example I gave in another article.

Stack and Queue

using stack to reverse order or access.(LIFO)
using queue to keep order or access.(FIFO)

String & Array

1. find first non-repeated character

--using a array indexed by character(128 for ASCII)
--using hash table/map if potential characters are in huge amount, do first loop to build up hashmap, do second loop to find first character that repeats only once
 --to find last non-repeated, looping back or resort to a data structure like LinkedHashmap that keeps the insertion order(you can make onekey-value pair with equals and hashcode same as it's key's behavior)

 Tree


BFT


BFT is used for level by level process, usually resorting to a queue to add a node, remove it, process it, and add its children to queue from left to right. keep processing it until queue is empty.

The queue here is changing constantly by removing a node adding its children to tail of queue.

Levels


usually you do not need to deal with level, if you do need to figure out level you are in, Another queue for serving queue of all children would be helpful. this case the normal queue mentioned above would serve for current level. while processing current level, when the current queue turns empty, it will add children back in, and clear out children queue. level will increase when this happens.

LinkedList is a queue in java.



 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
 public static int getLevels(TreeNode aTree){
  if (aTree == null)return 0;
  int levels = 0;
  Queue<TreeNode> curLevel = new LinkedList<TreeNode>();
  Queue<TreeNode> nextLevel = new LinkedList<TreeNode>();
  //starting from root, level 0 will become to 1 after root processing
  curLevel.add(aTree);
  while(!curLevel.isEmpty()){
   TreeNode curNode = curLevel.remove();
   if (curNode.left!=null)
    nextLevel.add(curNode.left);
   if (curNode.right!=null)
    nextLevel.add(curNode.right);
   if(curLevel.isEmpty()){
    if (nextLevel.isEmpty())break;
    else{
     levels++;
     curLevel.addAll(nextLevel);
     nextLevel.clear();
    }
   }
  }
  return levels;
 } 
 
 
Lowest Common Ancestor

If n1 is in left tree, and n2 is in right tree, then root is the lowest common node.
if n1 and n2 are both in left tree, then recursively start from left node; if n1 and n2 are both in right tree, then recursively start from right child node.



Complete Binary Tree


is a binary tree in which every level, except possibly the last, is completely filled, and all nodes are as far left as possible.

A Binary Heap is a Complete Binary Tree where items are stored in a special order such that value in a parent node is greater(or smaller) than the values in its two children nodes. The former is called as max heap and the latter is called min heap

Since a Binary Heap is a Complete Binary Tree, it can be easily represented as array and array based representation is space efficient. If the parent node is stored at index I, the left child can be calculated by 2 * I + 1 and right child by 2 * I + 2.

Number to String


negative can be kept in flag variable. convert number to positive the first.

build it up from right to left.
n%10+'0' is character of a number.
next number to process is n/10
loop it until num=0

String to number


From left to right
charAt(i)-'0' is the real number n
n=n*10+(charAt(i)-'o')


Binary Heap


It is a complete binary tree(so that you can use math to determine the relationships between nodes).
all levels are full except bottom level. bottom level should be filled from left to right.

it is an implementation of priority queue.

(Binary)Heap is a structure a parent is no less or no greater than children.
It's conveniently implemented by array. if making no use of a[0](starting from a[1]), then for a node i, its left child is located at i*2 and right child is at i*2+1.It's parent is (i-1)/2. if making use of a[0], then its left child is i*2+1, right child is i*2+2, its parent is (i-1)/2.

Priority Queue


minimum and maximum priority queue. basic operation is constant time to identify and delete  min or max, based on type of queue, element in the queue.

While adding element to priority queue, it sorts the elements to make sure minimum element or maximum element can be constantly accessed from get or delete operation. it does not care if rest of elements are sorted, it only makes sure get highest prioritized element in constant time.

Priority Queue implemented by Binary Heap  based on Array
1. delete. delete the top, move end to top and sink down to correct position(restore heap)
2. add.add to the end and then swim/float up to correct position (restore heap)

Element is actually often (k,v)
Typical problem to solve: scheduling, find most.. least..
An impressive use is to merge K sorted linked list.




Fibonacci Sequence Like Problems



Third number is sum of first two numbers. ladder climbing, staircase climbing, fabonacci, tree hights etc. they can be expressed as f(n)=f(n-1)+f(n-2), but with different base cases. recursion is naturally way to deal with them, but due to efficiency concern, most of them need to be solved in iterative ways. DP is specifically good at resolving this kind of problem by using extra data structure to remember the calculated result so that it won't repeated/duplicated calculated. for DP, you can do from top-down or bottom-up, one of the way should be easier, cleaner and more efficient.




Ladder/staircase climbing: climb one or two steps a time.

I changed the problem to be from a bag of n items, either taking 1 or 2 items out a time, how many
different ways you can use to take items out of bag?








public static int climbStairs(int n) {
 if(n==1)return 1;
 if(n==2)return 2;
 return climbStairs(n-1)+climbStairs(n-2);
}
public static int climbStairsIterative(int n) {
 if(n==1)return 1;
 if(n==2)return 2;
 int m1=1,m2=2,tmp;
 
 for(int i=3;i<=n;i++){
  tmp =m1 + m2;
  m1=m2;
  m2=tmp;
 }
 return m2;
}

Binomial theorem

(x+y)^n = {n \choose 0}x^n y^0 + {n \choose 1}x^{n-1}y^1 + {n \choose 2}x^{n-2}y^2 + \cdots + {n \choose n-1}x^1 y^{n-1} + {n \choose n}x^0 y^n,

No comments: