Algorithm – Tree Concept

A binary tree is a tree data structure in which each node has at most two children, referred to as the left child and the right child.
Complete Binary tree
A complete binary tree is a binary tree in which every level, except possibly the last, is completely filled, and all nodes are as far left as possible.
A complete binary tree is commonly implemented using an array.
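The array layout works because a complete tree has no gaps: for a node stored at 0-based index i, its parent sits at (i - 1) / 2 and its children at 2i + 1 and 2i + 2. A minimal sketch of the index arithmetic (the class and method names here are illustrative):

```java
public class CompleteTreeIndex {
    // 0-based array layout of a complete binary tree:
    // the tree structure is pure index arithmetic, no pointers needed.
    static int parent(int i) { return (i - 1) / 2; }
    static int left(int i)   { return 2 * i + 1; }
    static int right(int i)  { return 2 * i + 2; }
}
```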
A binary search tree (BST) is a node based binary tree data structure which has the following properties:
* The left subtree of a node contains only nodes with keys less than the node's key.
* The right subtree of a node contains only nodes with keys greater than the node's key.
* Both the left and right subtrees must also be binary search trees.
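The ordering property above is all that search and insert need. A minimal, non-balancing sketch (class and method names are illustrative, not from a library):

```java
public class SimpleBst {
    static class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    Node root;

    // Walk down: smaller keys go left, larger keys go right.
    void insert(int key) { root = insert(root, key); }

    private Node insert(Node t, int key) {
        if (t == null) return new Node(key);
        if (key < t.key) t.left = insert(t.left, key);
        else if (key > t.key) t.right = insert(t.right, key);
        return t; // duplicates are ignored
    }

    // Search follows the same ordering property.
    boolean contains(int key) {
        Node t = root;
        while (t != null) {
            if (key < t.key) t = t.left;
            else if (key > t.key) t = t.right;
            else return true;
        }
        return false;
    }
}
```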
Self-balancing binary search tree
A self-balancing (or height-balanced) binary search tree is any binary search tree data structure that automatically keeps its height (number of levels below the root) small in the face of arbitrary item insertions and deletions.
Most operations on a binary search tree (BST) take time directly proportional to the height of the tree, so it is desirable to keep the height small.
An AVL tree is a self-balancing binary search tree.
In an AVL tree, the heights of the two child subtrees of any node differ by at most one; therefore, it is also said to be height-balanced. Lookup, insertion, and deletion all take O(log n) time in both the average and worst cases. Insertions and deletions may require the tree to be rebalanced by one or more tree rotations.
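The balance condition can be checked directly: for every node, the left and right subtree heights differ by at most one. A naive checker sketch (real AVL implementations cache heights or balance factors in the nodes rather than recomputing them; names here are illustrative):

```java
public class AvlCheck {
    static class Node { Node left, right; }

    static int height(Node t) {
        if (t == null) return 0;
        return 1 + Math.max(height(t.left), height(t.right));
    }

    // AVL invariant: at every node the subtree heights differ by at most one.
    static boolean isBalanced(Node t) {
        if (t == null) return true;
        int diff = Math.abs(height(t.left) - height(t.right));
        return diff <= 1 && isBalanced(t.left) && isBalanced(t.right);
    }
}
```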
AVL trees perform better than red-black trees for lookup-intensive applications.
A red-black tree is a binary search tree where each node has a color attribute, the value of which is either red or black. It is more complex than a plain binary search tree, but has good worst-case running time for its operations and is efficient in practice: it can search, insert, and delete in O(log n) time. In addition to the ordinary requirements imposed on binary search trees, the following additional requirements apply to red-black trees:
1. A node is either red or black.
2. The root is black. (This rule is used in some definitions and not others. Since the root can always be changed from red to black but not necessarily vice-versa this rule has little effect on analysis.)
3. All leaves are black.
4. Both children of every red node are black.
5. Every simple path from a given node to any of its descendant leaves contains the same number of black nodes.
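Property 5 can be verified bottom-up by computing the "black-height" of each subtree. A sketch that returns the common black count per path, or -1 on a violation (null children count as black leaves; the class and method names are illustrative):

```java
public class RedBlackCheck {
    static final boolean RED = true, BLACK = false;

    static class Node {
        boolean color;
        Node left, right;
        Node(boolean color) { this.color = color; }
    }

    // Returns the number of black nodes on every path from t down to a leaf,
    // or -1 if two paths disagree (violating property 5).
    // A null child counts as a black leaf (property 3).
    static int blackHeight(Node t) {
        if (t == null) return 1;
        int lh = blackHeight(t.left);
        int rh = blackHeight(t.right);
        if (lh == -1 || rh == -1 || lh != rh) return -1;
        return lh + (t.color == BLACK ? 1 : 0);
    }
}
```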
A splay tree is a self-balancing binary search tree with the additional property that recently accessed elements are quick to access again. It performs basic operations such as insertion, look-up and removal in O(log(n)) amortized time.
All normal operations on a binary search tree are combined with one basic operation, called splaying. Splaying the tree for a certain element rearranges the tree so that the element is placed at the root of the tree. One way to do this is to first perform a standard binary tree search for the element in question, and then use tree rotations in a specific fashion to bring the element to the top. Alternatively, a top-down algorithm can combine the search and the tree reorganization into a single phase.
Advantages and disadvantages
Good performance for a splay tree depends on the fact that it is self-balancing, and indeed self-optimizing, in that frequently accessed nodes will move nearer to the root, where they can be accessed more quickly. This is an advantage for nearly all practical applications, and is particularly useful for implementing caches and garbage collection algorithms.
Splay trees also have the advantage of being considerably simpler to implement than other self-balancing binary search trees, such as red-black trees or AVL trees, while their average-case performance is just as efficient. Also, splay trees do not need to store any bookkeeping data, thus minimizing memory requirements. However, these other data structures provide worst-case time guarantees, and can be more efficient in practice for uniform access.
The B-tree is a generalization of a binary search tree in which a node can have more than two children.
Unlike self-balancing binary search trees, the B-tree is optimized for systems that read and write large blocks of data. It is most commonly used in databases and filesystems.
In B-trees, internal (non-leaf) nodes can have a variable number of child nodes within some pre-defined range. When data is inserted or removed from a node, its number of child nodes changes. In order to maintain the pre-defined range, internal nodes may be joined or split. Because a range of child nodes is permitted, B-trees do not need re-balancing as frequently as other self-balancing search trees, but may waste some space, since nodes are not entirely full. The lower and upper bounds on the number of child nodes are typically fixed for a particular implementation. For example, in a 2-3 B-tree (often simply referred to as a 2-3 tree), each internal node may have only 2 or 3 child nodes.
In the narrow sense, a B-tree stores keys in its internal nodes but need not store those keys in the records at the leaves.
B+ tree is a dynamic, multilevel index, with maximum and minimum bounds on the number of keys in each index segment (usually called a "block" or "node"). In a B+ tree, in contrast to a B-tree, all records are stored at the leaf level of the tree; only keys are stored in interior nodes.
A binary heap is a heap data structure created using a binary tree. It can be seen as a binary tree with two additional constraints:
* The shape property: the tree is an almost complete binary tree; that is, all levels of the tree, except possibly the last one (deepest) are fully filled, and, if the last level of the tree is not complete, the nodes of that level are filled from left to right.
* The heap property: according to a comparison predicate fixed for the entire data structure, every node is greater than or equal to each of its children (max-heap) or less than or equal to each of its children (min-heap).
There are two kinds of binary heaps: max-heaps and min-heaps.
Adding to the heap
If we have a heap and we add an element, we can perform an operation known as up-heap, bubble-up, or percolate-up in order to restore the heap property. In a binary heap this takes O(log n) time, following this algorithm:
1. Add the element on the bottom level of the heap.
2. Compare the added element with its parent; if they are in the correct order, stop.
3. If not, swap the element with its parent and return to the previous step.
A binary heap is implemented using an array.
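The add/percolate-up steps above can be sketched on an array-backed min-heap (the class is illustrative, not java.util.PriorityQueue):

```java
import java.util.ArrayList;
import java.util.List;

public class MinHeap {
    private final List<Integer> heap = new ArrayList<>();

    // Step 1: append at the bottom level (end of the array).
    // Steps 2-3: compare with the parent and swap upward until in order.
    public void add(int value) {
        heap.add(value);
        int i = heap.size() - 1;
        while (i > 0) {
            int parent = (i - 1) / 2;
            if (heap.get(i) >= heap.get(parent))
                break; // heap property restored
            int tmp = heap.get(i);
            heap.set(i, heap.get(parent));
            heap.set(parent, tmp);
            i = parent;
        }
    }

    // The minimum always sits at index 0.
    public int peek() { return heap.get(0); }
}
```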
Algorithm - Sorting

Insertion sort
Every iteration of insertion sort removes an element from the input data, inserting it into the correct position in the already-sorted list, until no input elements remain.
Insertion sort is much less efficient on large lists than more advanced algorithms such as quicksort, heapsort, or merge sort, but it is about twice as fast as bubble sort and somewhat faster than selection sort in normal situations. However, insertion sort provides several advantages:
* efficient for (quite) small data sets
* adaptive, i.e. efficient for data sets that are already substantially sorted: the time complexity is O(n + d), where d is the number of inversions
* more efficient in practice than most other simple quadratic (i.e. O(n²)) algorithms such as selection sort or bubble sort: the average number of comparisons is about n²/4, and the running time is linear in the best case
* stable, i.e. does not change the relative order of elements with equal keys
* in-place, i.e. only requires a constant amount O(1) of additional memory space
* online, i.e. can sort a list as it receives it.
Online Algorithm
In computer science, an online algorithm is one that can process its input piece-by-piece in a serial fashion, i.e., in the order that the input is fed to the algorithm, without having the entire input available from the start. In contrast, an offline algorithm is given the whole problem data from the beginning and is required to output an answer which solves the problem at hand. (For example, selection sort requires that the entire list be given before it can sort it, while insertion sort doesn't.)
// Simple insertion sort.
public static <AnyType extends Comparable<? super AnyType>> void insertionSort( AnyType [ ] a )
{
    for( int i = 1; i < a.length; i++ )
    {
        AnyType tmp = a[ i ];
        int j = i;
        // Insert tmp (the original a[i]) into the sorted sublist a[0..i-1]
        for( ; j > 0 && tmp.compareTo( a[ j - 1 ] ) < 0; j-- )
            a[ j ] = a[ j - 1 ];
        a[ j ] = tmp;
    }
}
Shell sort
Shellsort is based on insertion sort; it improves on insertion sort by comparing elements separated by a gap of several positions. This lets an element take "bigger steps" toward its expected position. Multiple passes over the data are taken with smaller and smaller gap sizes. The last step of Shellsort is a plain insertion sort, but by then, the array of data is guaranteed to be almost sorted.
Gap sequence
The gap sequence is an integral part of the Shellsort algorithm. Any increment sequence will work, as long as the last increment is 1. The algorithm begins by performing a gap insertion sort, with the gap being the first number in the gap sequence. It continues to perform a gap insertion sort for each number in the sequence, until it finishes with a gap of 1.
// Shellsort, using a sequence suggested by Gonnet.
public static <AnyType extends Comparable<? super AnyType>> void shellsort( AnyType [ ] a )
{
    for( int gap = a.length / 2; gap > 0;
                 gap = gap == 2 ? 1 : (int) ( gap / 2.2 ) )
        for( int i = gap; i < a.length; i++ )
        {
            AnyType tmp = a[ i ];
            int j = i;

            for( ; j >= gap && tmp.compareTo( a[ j - gap ] ) < 0; j -= gap )
                a[ j ] = a[ j - gap ];
            a[ j ] = tmp;
        }
}
Mergesort
http://www.vogella.de/articles/JavaAlgorithmsMergesort/article.html
Merge sort is an O(n log n) comparison-based sorting algorithm. In most implementations it is stable, meaning that it preserves the input order of equal elements in the sorted output. It is an example of the divide-and-conquer algorithmic paradigm.
Conceptually, merge sort works as follows:
1. If the list is of length 0 or 1, then it is already sorted. Otherwise:
2. Divide the unsorted list into two sublists of about half the size.
3. Sort each sublist recursively by re-applying merge sort.
4. Merge the two sublists back into one sorted list.
// Mergesort algorithm.
@SuppressWarnings("unchecked")
public static <AnyType extends Comparable<? super AnyType>> void mergeSort( AnyType [ ] a )
{
    AnyType [ ] tmpArray = (AnyType []) new Comparable[ a.length ];
    mergeSort( a, tmpArray, 0, a.length - 1 );
}
/**
 * Internal method that makes recursive calls.
 * @param a an array of Comparable items.
 * @param tmpArray an array to place the merged result.
 * @param left the left-most index of the subarray.
 * @param right the right-most index of the subarray.
 */
private static <AnyType extends Comparable<? super AnyType>> void mergeSort( AnyType [ ] a, AnyType [ ] tmpArray,int left, int right )
{
    if( left < right )
    {
        int center = ( left + right ) / 2;
        mergeSort( a, tmpArray, left, center );
        mergeSort( a, tmpArray, center + 1, right );
        merge( a, tmpArray, left, center + 1, right );
    }
}
/**
 * Internal method that merges two sorted halves of a subarray.
 * @param a an array of Comparable items.
 * @param tmpArray an array to place the merged result.
 * @param leftPos the left-most index of the subarray.
 * @param rightPos the index of the start of the second half.
 * @param rightEnd the right-most index of the subarray.
 */
private static <AnyType extends Comparable<? super AnyType>> void merge( AnyType [ ] a, AnyType [ ] tmpArray,
        int leftPos, int rightPos, int rightEnd )
{
    int leftEnd = rightPos - 1;
    int tmpPos = leftPos;
    int numElements = rightEnd - leftPos + 1;

    // Main loop
    while( leftPos <= leftEnd && rightPos <= rightEnd )
        if( a[ leftPos ].compareTo( a[ rightPos ] ) <= 0 )
            tmpArray[ tmpPos++ ] = a[ leftPos++ ];
        else
            tmpArray[ tmpPos++ ] = a[ rightPos++ ];

    while( leftPos <= leftEnd )    // Copy rest of first half
        tmpArray[ tmpPos++ ] = a[ leftPos++ ];

    while( rightPos <= rightEnd )  // Copy rest of right half
        tmpArray[ tmpPos++ ] = a[ rightPos++ ];

    // Copy tmpArray back
    for( int i = 0; i < numElements; i++, rightEnd-- )
        a[ rightEnd ] = tmpArray[ rightEnd ];
}
How is sorting implemented in the JDK?
For arrays of objects, Java's standard sorting methods (Arrays.sort() and its derivative Collections.sort()) use a version of the mergesort algorithm (at least in Sun's implementation), which is stable.
For arrays of primitive types, Arrays.sort() uses insertion sort for the smallest arrays and a quicksort variant otherwise.
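For illustration, a small demo of the standard API (Arrays.sort is documented to be stable for object arrays; stability is irrelevant for primitives):

```java
import java.util.Arrays;

public class JdkSortDemo {
    public static void main(String[] args) {
        // Object arrays: sorted with a stable mergesort-family algorithm.
        String[] words = { "pear", "apple", "fig" };
        Arrays.sort(words);
        System.out.println(Arrays.toString(words));

        // Primitive arrays: sorted with a quicksort variant.
        int[] nums = { 4, 1, 3 };
        Arrays.sort(nums);
        System.out.println(Arrays.toString(nums));
    }
}
```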
Quicksort - O(n log n)
http://www.vogella.de/articles/JavaAlgorithmsQuicksort/article.html
Quicksort (also known as "partition-exchange sort") is a comparison sort that, in efficient implementations, makes Θ(n log n) comparisons on average to sort n items. In the worst case it makes Θ(n²) comparisons, though this behavior is rare if the algorithm is implemented correctly. Quicksort is typically significantly faster in practice than other Θ(n log n) algorithms because its inner loop can be implemented efficiently on most architectures. It is not a stable sort.
Quicksort sorts by employing a divide and conquer strategy to divide a list into two sub-lists.
The steps are:
1. Pick an element, called a pivot, from the list.
2. Reorder the list so that all elements which are less than the pivot come before the pivot and so that all elements greater than the pivot come after it (equal values can go either way). After this partitioning, the pivot is in its final position. This is called the partition operation.
3. Recursively sort the sub-list of lesser elements and the sub-list of greater elements.
The base case of the recursion is lists of size zero or one, which are always sorted.
// Quicksort algorithm.
public static <AnyType extends Comparable<? super AnyType>> void quicksort( AnyType [ ] a )
{
    quicksort( a, 0, a.length - 1 );
}
private static final int CUTOFF = 10;
/**
 * Method to swap two elements in an array.
 * @param a an array of objects.
 * @param index1 the index of the first object.
 * @param index2 the index of the second object.
 */
public static final <AnyType> void swapReferences( AnyType [ ] a, int index1, int index2 )
{
    AnyType tmp = a[ index1 ];
    a[ index1 ] = a[ index2 ];
    a[ index2 ] = tmp;
}
/**
 * Internal quicksort method that makes recursive calls.
 * Uses median-of-three partitioning and a cutoff of 10.
 * @param a an array of Comparable items.
 * @param low the left-most index of the subarray.
 * @param high the right-most index of the subarray.
 */
private static <AnyType extends Comparable<? super AnyType>> void quicksort( AnyType [ ] a, int low, int high )
{
    if( low + CUTOFF > high )
        insertionSort( a, low, high );
    else
    {
        // Sort low, middle, high
        int middle = ( low + high ) / 2;
        if( a[ middle ].compareTo( a[ low ] ) < 0 )
            swapReferences( a, low, middle );
        if( a[ high ].compareTo( a[ low ] ) < 0 )
            swapReferences( a, low, high );
        if( a[ high ].compareTo( a[ middle ] ) < 0 )
            swapReferences( a, middle, high );

        // Place pivot at position high - 1
        swapReferences( a, middle, high - 1 );
        AnyType pivot = a[ high - 1 ];

        // Begin partitioning
        int i, j;
        for( i = low, j = high - 1; ; )
        {
            while( a[ ++i ].compareTo( pivot ) < 0 )
                ;
            while( pivot.compareTo( a[ --j ] ) < 0 )
                ;
            if( i >= j )
                break;
            swapReferences( a, i, j );
        }

        // Restore pivot
        swapReferences( a, i, high - 1 );

        quicksort( a, low, i - 1 );    // Sort small elements
        quicksort( a, i + 1, high );   // Sort large elements
    }
}
Quick select - find kth
To implement quickselect we just modify the quicksort algorithm.
1. If the area you're searching has only one element in it, you have found the element you are looking for, so stop searching.
2. Choose a pivot element (see Selecting a pivot).
3. Swap the pivot element with the last element.
4. Move pointer L right, checking each element in turn; stop as soon as it finds an element whose value is greater than or equal to the pivot, or it passes pointer R.
5. Move pointer R left until it finds an element whose value is less than or equal to the pivot, or it has passed pointer L.
6. If pointers L and R have not met, swap the two elements they point to and continue from step 4. Otherwise, swap the last element (which is the pivot because of step 3) with the element that pointer L is at.
7. If the pivot is the element you're looking for, stop searching. If the pivot is to the right of (larger than) the element you're searching for, repeat the quickselect on the left group only; otherwise repeat it on the right group only.
/**
 * Quick selection algorithm.
 * Places the kth smallest item in a[k-1].
 * @param a an array of Comparable items.
 * @param k the desired rank (1 is minimum) in the entire array.
 */   
public static <AnyType extends Comparable<? super AnyType>> void quickSelect( AnyType [ ] a, int k )
{
    quickSelect( a, 0, a.length - 1, k );
}
/**
 * Internal selection method that makes recursive calls.
 * Uses median-of-three partitioning and a cutoff of 10.
 * Places the kth smallest item in a[k-1].
 * @param a an array of Comparable items.
 * @param low the left-most index of the subarray.
 * @param high the right-most index of the subarray.
 * @param k the desired rank (1 is minimum) in the entire array.
 */
private static <AnyType extends Comparable<? super AnyType>>void quickSelect( AnyType [ ] a, int low, int high, int k )
{
    if( low + CUTOFF > high )
        insertionSort( a, low, high );
    else
    {
        // Sort low, middle, high
        int middle = ( low + high ) / 2;
        if( a[ middle ].compareTo( a[ low ] ) < 0 )
            swapReferences( a, low, middle );
        if( a[ high ].compareTo( a[ low ] ) < 0 )
            swapReferences( a, low, high );
        if( a[ high ].compareTo( a[ middle ] ) < 0 )
            swapReferences( a, middle, high );

        // Place pivot at position high - 1
        swapReferences( a, middle, high - 1 );
        AnyType pivot = a[ high - 1 ];

        // Begin partitioning
        int i, j;
        for( i = low, j = high - 1; ; )
        {
            while( a[ ++i ].compareTo( pivot ) < 0 )
                ;
            while( pivot.compareTo( a[ --j ] ) < 0 )
                ;
            if( i >= j )
                break;
            swapReferences( a, i, j );
        }

        // Restore pivot
        swapReferences( a, i, high - 1 );

        // Recurse; only this part changes
        if( k <= i )
            quickSelect( a, low, i - 1, k );
        else if( k > i + 1 )
            quickSelect( a, i + 1, high, k );
    }
}
Other Simple Sort Algorithms
Bubble sort
Bubble sort works by repeatedly stepping through the list to be sorted, comparing each pair of adjacent items and swapping them if they are in the wrong order. The pass through the list is repeated until no swaps are needed, which indicates that the list is sorted. The algorithm gets its name from the way smaller elements "bubble" to the top of the list.
private static <AnyType> void swap(AnyType[ ] a, int one, int two)
{
    AnyType temp = a[one];
    a[one] = a[two];
    a[two] = temp;
}
public static <AnyType extends Comparable<? super AnyType>> void bubbleSort(AnyType [ ] a )
{
    int out, in;
    int nElems = a.length;
    for(out=nElems-1; out>1; out--)  // outer loop (backward)
    {
        for(in=0; in<out; in++)    // inner loop (forward)
        {
            if( a[in].compareTo(a[in+1]) > 0 )    // out of order?
            {
                swap(a, in, in+1);
            }
        }
    }
}
Selection Sort
Selection sort is an in-place comparison sort. It has O(n²) complexity, making it inefficient on large lists, and it generally performs worse than the similar insertion sort.
The algorithm works as follows:
1. Find the minimum value in the list
2. Swap it with the value in the first position
3. Repeat the steps above for the remainder of the list (starting at the second position and advancing each time)
public static <AnyType extends Comparable<? super AnyType>> void selectionSort(AnyType[ ] a )
{
    int out, in, min;
    int nElems = a.length;
    for(out=0; out<nElems-1; out++)  // outer loop
    {
        min = out;           // minimum
        for(in=out+1; in<nElems; in++) // inner loop
            if(a[in].compareTo(a[min]) < 0 )     // smaller than current min?
                min = in;        // we have a new min
        swap(a, out, min);
    }
}


Resources:
Data Structures and Problem Solving Using Java (3rd Edition)
http://users.cis.fiu.edu/~weiss/dsj3/code/code.html
http://en.literateprograms.org/Category:Programming_language:Java
