a binary search or half-interval search algorithm finds the position of a specified value (the input "key") within a sorted array. In each step, the algorithm compares the input key value with the key value of the middle element of the array. If the keys match, then a matching element has been found so its index, or position, is returned. Otherwise, if the sought key is less than the middle element's key, then the algorithm repeats its action on the sub-array to the left of the middle element or, if the input key is greater, on the sub-array to the right. If the remaining array to be searched is reduced to zero, then the key cannot be found in the array and a special "Not found" indication is returned.
A binary search halves the number of items to check with each iteration, so locating an item (or determining its absence) takes logarithmic time. A binary search is a dichotomic divide and conquer search algorithm.
Algorithm
Recursive
A straightforward implementation of binary search is recursive. The initial call uses the indices of the entire array to be searched. The procedure then calculates an index midway between the two indices, determines which of the two subarrays to search, and then does a recursive call to search that subarray. Each of the calls is tail recursive, so a compiler need not make a new stack frame for each call. The variables
imin
and imax
are the lowest and highest inclusive indices that are searched.int binary_search(int A[], int key, int imin, int imax) { // test if array is empty if (imax < imin): // set is empty, so return value showing not found return KEY_NOT_FOUND; else { // calculate midpoint to cut set in half int imid = midpoint(imin, imax); // three-way comparison if (A[imid] > key) // key is in lower subset return binary_search(A, key, imin, imid-1); else if (A[imid] < key) // key is in upper subset return binary_search(A, key, imid+1, imax); else // key has been found return imid; } }
It is invoked with initial
imin
and imax
values of 0
and N-1
for a zero based array.
The number type "int" shown in the code has an influence on how the midpoint calculation can be implemented correctly. With unlimited numbers, the midpoint can be calculated as
"(imin + imax) / 2"
. In practical programming, however, the calculation is often performed with numbers of a limited range, and then the intermediate result "(imin + imax)"
might overflow. With limited numbers, the midpoint can be calculated correctly as "imin + ((imax - imin) / 2)"
.