Twenty Percent Knowledge: Misc Notes

hashCode and equals

hashCode is used to find a bucket, equals is used to further compare it with objects already in the bucket.

MapReduce

it is a program model that composed of a map() procedure that performs filtering and sorting, a reduce() procedure that performs summary operation.

'MapReduce' is a framework for processing parallelizable problems across huge datasets using a large number of computers (nodes), collectively referred to as a cluster (if all nodes are on the same local network and use similar hardware) or a grid (if the nodes are shared across geographically and administratively distributed systems, and use more heterogenous hardware)

"Map" step: The master node takes the input, divides it into smaller sub-problems, and distributes them to worker nodes. A worker node may do this again in turn, leading to a multi-level tree structure. The worker node processes the smaller problem, and passes the answer back to its master node.

"Reduce" step: The master node then collects the answers from all the sub-problems and combines them in some way to form the output – the answer to the problem it was originally trying to solve.

The Map and Reduce functions of MapReduce are both defined with data structured as (key, value) pairs. Map takes one pair of data with a type in one data domain, and returns a list of pairs in a different domain:

Map(k1,v1) → list(k2,v2)

The Map function is applied in parallel to every pair in the input dataset. This produces a list of pairs for each call. After that, the MapReduce framework collects all pairs with the same key from all lists and groups them together, creating one group for each key.

The Reduce function is then applied in parallel to each group, which in turn produces a collection of values in the same domain:

Reduce(k2, list (v2)) → list(v3)

Each Reduce call typically produces either one value v3 or an empty return, though one call is allowed to return more than one value. The returns of all calls are collected as the desired result list.

Thus the MapReduce framework transforms a list of (key, value) pairs into a list of values

Depth First Traversal

DFS only examines vertices connected to the source. less momery

usually uses recursion

void preorderTraversal( Node curNode){
    if( curNode== null ) return;
    curNode.printValue();
    preorderTraversal( curNode.getLeft() );
    preorderTraversal( curNode.getRight() );
}

What’s the running time on this algorithm? Every node is examined once, so it’s O(n).

Note that inorder and postorder traversals are almost identical; all you vary is the order in which the node and subtrees are visited:

void inorderTraversal( Node curNode){
    if( curNode== null ) return;
    inorderTraversal( curNode.getLeft() );
    curNode.printValue();
    inorderTraversal( curNode.getRight() );
}

void postorderTraversal( Node curNode){
    if( curNode== null ) return;
    postorderTraversal( curNode.getLeft() );
    postorderTraversal( curNode.getRight() );
    curNode.printValue();
}

Breadth first traversal

we usually use this to find shorted path. use more memory since you have to track the children of searched modes.

Traversal of tree visits nodes in order of their depth. it visits node at depth 0, then depth 1 and 2 and so on, from left to right.

Heap is a binary search tree with only left tree, plus adjacent nodes can be equal or less than.

Twenty Percent Knowledge

Sunday, September 21, 2014

Misc Notes

Depth First Traversal

Breadth first traversal

No comments: