Rotate Merge Sort



Rotate Merge Sort is a simple in-place Merge Sort that runs in $$O(n\log^2 n)$$ time, far better than naive in-place methods that run in $$O(n^2)$$. Although there are multiple variants of Rotate Merge Sort, they all work the same way: recursively halving the merge. Like Merge Sort, Rotate Merge Sort and its variants are stable.

In general, a rotate merge splits a merge into two subproblems of half the size in $$O(n)$$ time (a rotation is $$O(n)$$). It keeps dividing until the partitions reach constant length, which takes $$O(\log n)$$ levels of splitting, so the merge in a rotate merge costs $$O(n\log n)$$. A Merge Sort performs $$O(\log n)$$ layers of merges, so the total runtime of a Rotate Merge variant is $$O(n\log^2 n)$$:

$$\large T\left(n\right) = n\log n + 2T\left(\frac{n}{2}\right)= O(n\log^2 n)$$
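The $$O(n)$$ rotation itself is usually implemented with the classic triple-reversal trick. A minimal Python sketch (the function name is illustrative; the slice assignments allocate temporaries for brevity, whereas a strictly in-place version reverses each range by swapping elements pairwise):

```python
def rotate(a, lo, mid, hi):
    # Swap the blocks a[lo:mid] and a[mid:hi] in place using the
    # triple-reversal trick: O(hi - lo) moves regardless of block sizes.
    # (Slice reversal allocates temporaries for brevity; a strictly
    # in-place version reverses each range by pairwise element swaps.)
    a[lo:mid] = a[lo:mid][::-1]
    a[mid:hi] = a[mid:hi][::-1]
    a[lo:hi] = a[lo:hi][::-1]

xs = [3, 4, 5, 1, 2]
rotate(xs, 0, 3, 5)  # xs is now [1, 2, 3, 4, 5]
```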

Although Rotate Merge is efficient in practice and has low overhead, its running time is still asymptotically suboptimal since its merge is $$O(n\log n)$$ in the worst case (where $$O(n)$$ is optimal). Block Merge Sort provides an optimal $$O(n)$$ in-place merge, beating Rotate Merge in terms of complexity. However, Block Merge is much more complicated, so Rotate Merge remains competitive with Block Merge as a viable option for in-place merging due to its simplicity and low overhead.

Rotate Merge
The original Rotate Merge binary searches for the middle element $$m$$ of one half within the other half and rotates elements over based on the result of the search. After the rotation, $$m$$ is in its final position and can be excluded completely from the recursive calls.

In summary, to merge halves $$A$$ and $$B$$: let $$m$$ be the middle element of $$A$$ for simplicity, so that $$A$$ is divided into $$A'mA''$$.


 * 1) Divide $$B \rightarrow B'B''$$ such that each element of $$B'$$ is $$< m$$ and each element of $$B''$$ is $$\geq m$$
 * 2) Rotate $$mA''$$ and $$B'$$: $$A'mA''B'B'' \rightarrow A'B'mA''B''$$
 * 3) Recurse on $$A'B'$$ and $$A''B''$$, ignoring $$m$$. If one of the halves is length 0, the merge terminates.

    procedure rotate_merge(subarray A, subarray B) do
        if length(A) == 0 or length(B) == 0 return

        if length(A) >= length(B) do
            # lower bound: equal elements of B stay after A[A.middle]
            int i = binary search A[A.middle] in B
            # A[a : b] is the slice function (end inclusive)
            rotate A[A.middle : A.end] and B[0 : i-1]
            rotate_merge(A[0 : A.middle-1], B[0 : i-1])
            rotate_merge(A[A.middle+1 : A.end], B[i : B.end]) # exclude middle element of A
        else do
            # upper bound: equal elements of A stay before B[B.middle]
            int i = binary search B[B.middle] in A
            rotate A[i : A.end] and B[0 : B.middle]
            rotate_merge(A[0 : i-1], B[0 : B.middle-1])
            rotate_merge(A[i : A.end], B[B.middle+1 : B.end]) # exclude middle element of B
        end if
    end

The worst case depth of a Rotate Merge is $$O(\log n)$$ since each step halves one of the subarrays. However, the halves being recursed on can be unbalanced, which increases the overhead of the algorithm. To prevent this, Rotate Merge takes the middle element of the larger half and binary searches for it in the smaller half. This guarantees that the larger half is the one being halved at every step.
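The pseudocode above can be sketched as runnable Python (an illustrative translation, not a reference implementation; `bisect_left`/`bisect_right` supply the lower- and upper-bound searches that keep the merge stable, and `rotate_merge_sort` is a plain top-down driver added for context):

```python
import bisect

def rotate(a, lo, mid, hi):
    # Triple-reversal rotation: swap blocks a[lo:mid] and a[mid:hi] in place.
    a[lo:mid] = a[lo:mid][::-1]
    a[mid:hi] = a[mid:hi][::-1]
    a[lo:hi] = a[lo:hi][::-1]

def rotate_merge(a, lo, mid, hi):
    # Stably merge the sorted runs a[lo:mid] and a[mid:hi] in place.
    if lo == mid or mid == hi:
        return
    if mid - lo >= hi - mid:
        # Pivot = middle of the larger (left) run; lower bound in the
        # right run keeps equal right-run elements after the pivot.
        m = (lo + mid) // 2
        i = bisect.bisect_left(a, a[m], mid, hi)
        rotate(a, m, mid, i)           # A' mA'' B'B'' -> A' B'mA'' B''
        p = m + (i - mid)              # pivot's final position
        rotate_merge(a, lo, m, p)      # merge A' with B'
        rotate_merge(a, p + 1, i, hi)  # merge A'' with B''
    else:
        # Pivot = middle of the larger (right) run; upper bound in the
        # left run keeps equal left-run elements before the pivot.
        m = mid + (hi - mid) // 2
        i = bisect.bisect_right(a, a[m], lo, mid)
        rotate(a, i, mid, m + 1)       # A'A'' B'm B'' -> A' B'm A''B''
        p = i + (m - mid)
        rotate_merge(a, lo, i, p)
        rotate_merge(a, p + 1, m + 1, hi)

def rotate_merge_sort(a, lo=0, hi=None):
    # Standard top-down merge sort driver using the in-place merge.
    if hi is None:
        hi = len(a)
    if hi - lo <= 1:
        return
    mid = (lo + hi) // 2
    rotate_merge_sort(a, lo, mid)
    rotate_merge_sort(a, mid, hi)
    rotate_merge(a, lo, mid, hi)
```

Note that the pivot is always excluded from both recursive calls once the rotation has put it in its final position, which is what guarantees progress.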

Although the number of moves is $$O(n\log n)$$ per merge, the number of comparisons is only $$O(n)$$. This follows from the binary search cost and the recurrence:

$$\large S\left(n\right) = \log n + 2S\left(\frac{n}{2}\right)= O(n)$$

Rotate Partition Merge


Rotate Partition Merge is very similar to a normal Merge Sort: it selects the smallest $$\small\frac{m+n}{2}$$ elements from the two halves combined (as if they were already merged, where $$m$$ and $$n$$ are the lengths of the halves). It then rotates the selected elements into place, which partitions the merge in half stably. The size of the partitions after the rotation is easy to predict (the first partition is guaranteed to have size $$\small\frac{m+n}{2}$$), so this method can be made stackless.

In summary:


 * 1) To merge halves $$A$$ and $$B$$, select the smallest $$\small\frac{|A|+|B|}{2}$$ elements of $$AB$$ from each half
 * 2) Let $$A'$$ be the selected elements from $$A$$ and $$B'$$ those from $$B$$, so the array is divided up like this: $$AB \rightarrow A'A''B'B''$$
 * 3) Rotate $$A''$$ and $$B'$$ so that: $$A'A''B'B'' \rightarrow A'B'A''B''$$ (each element of $$A'B'$$ should be $$\leq$$ every element of $$A''B''$$)
 * 4) Recurse on $$A'B'$$ and $$A''B''$$. If one of the halves is length 0, the merge terminates since that half is already fully partitioned.

    procedure rotate_partition_merge(subarray A, subarray B) do
        if length(A) == 0 or length(B) == 0 return

        # ties are broken so that i + j = (length(A) + length(B)) / 2
        int i = number of elements in A smaller than the median of concat(A, B)
        int j = number of elements in B smaller than the median of concat(A, B)

        rotate A[i : A.end] and B[0 : j-1]
        rotate_partition_merge(A[0 : i-1], B[0 : j-1])
        rotate_partition_merge(A[i : A.end], B[j : B.end])
    end

Since Rotate Partition Merge partitions into perfect halves each time, the worst case depth is $$O(\log n)$$. Although the elements can be selected using a linear Merge-Sort-style pass, the selection can also be done with a binary search, since each half is always sorted, reducing the comparisons from $$O(n)$$ to $$O(\log n)$$ per partition (and from $$O(n\log n)$$ to $$O(n)$$ for the total merge).
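The partition point can be found with the same binary co-search used to find the median of two sorted arrays. A Python sketch under that assumption (function names are illustrative; ties resolve toward the left half to keep the merge stable):

```python
def rotate(a, lo, mid, hi):
    # Triple-reversal rotation: swap blocks a[lo:mid] and a[mid:hi] in place.
    a[lo:mid] = a[lo:mid][::-1]
    a[mid:hi] = a[mid:hi][::-1]
    a[lo:hi] = a[lo:hi][::-1]

def rotate_partition_merge(a, lo, mid, hi):
    # Stably merge the sorted runs A = a[lo:mid] and B = a[mid:hi] in place.
    m, n = mid - lo, hi - mid
    if m == 0 or n == 0:
        return
    h = (m + n) // 2                  # size of the first partition
    # Binary co-search for i = |A'| and j = |B'| with i + j = h such that
    # every element of A'B' is <= every element of A''B''.  Equal keys
    # are sent to the A side first, which keeps the merge stable.
    i_min, i_max = max(0, h - n), min(h, m)
    while True:
        i = (i_min + i_max) // 2
        j = h - i
        if i < m and j > 0 and not a[mid + j - 1] < a[lo + i]:
            i_min = i + 1             # B'[-1] >= A''[0]: grow A'
        elif i > 0 and j < n and a[mid + j] < a[lo + i - 1]:
            i_max = i - 1             # A'[-1] > B''[0]: shrink A'
        else:
            break
    rotate(a, lo + i, mid, mid + j)   # A'A''B'B'' -> A'B'A''B''
    rotate_partition_merge(a, lo, lo + i, lo + i + j)
    rotate_partition_merge(a, lo + i + j, mid + j, hi)
```

The co-search costs $$O(\log n)$$ comparisons per partition, matching the binary-search optimization mentioned above.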

Block-Swap Merge
Block-Swap Merge (a.k.a. Swap Merge) is similar to Rotate Partition Merge except that it partitions by swapping equal-length ranges. Since the ranges are always the same length, Block-Swap Merge needs no rotation algorithm, as the name implies, which simplifies the algorithm and removes any overhead a rotation algorithm would add.

The algorithm is very simple:


 * 1) To merge halves $$A$$ and $$B$$, search for the largest possible value $$d$$ such that the $$d$$-th last element of $$A$$ is greater than the $$d$$-th first element of $$B$$ (since both halves are sorted, this condition is monotone in $$d$$, which is what allows a binary search)
 * 2) Let $$A'$$ be the last $$d$$ elements of $$A$$ and $$B'$$ be the first $$d$$ elements of $$B$$ so $$AB \rightarrow AA'B'B$$
 * 3) Block swap $$A'$$ and $$B'$$: $$AA'B'B \rightarrow AB'A'B$$
 * 4) Recurse on $$AB'$$ and $$A'B$$. If one of the halves is length 0, or $$d = 0$$ in step 1, the merge terminates.

    procedure block_swap_merge(subarray A, subarray B) do
        if length(A) == 0 or length(B) == 0 return

        # normally a binary search is used here
        int d = 0
        # A.end is inclusive of the last element
        while d < min(length(A), length(B)) and A[A.end - d] > B[d] do
            d += 1
        end while
        if d == 0 return

        block swap A[A.end-d+1 : A.end] and B[0 : d-1]
        block_swap_merge(A[0 : A.end-d], B[0 : d-1])
        block_swap_merge(A[A.end-d+1 : A.end], B[d : B.end])
    end

However, Block-Swap Merge has a worst case recursion depth of $$O(n)$$, as opposed to the $$O(\log n)$$ depth of Rotate Merge and Rotate Partition Merge, so Block-Swap Merge is not as parallelizable. To avoid a stack overflow, one must rely on tail call optimization (recursing into the smaller part and looping on the larger) to guarantee $$O(\log n)$$ worst case stack memory usage (the depth itself doesn't change).
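The steps above can be sketched in Python as follows (an illustrative translation using the linear search for $$d$$; swapping terminates with $$d = 0$$ exactly when the two runs are already in order):

```python
def block_swap_merge(a, lo, mid, hi):
    # Merge the sorted runs a[lo:mid] and a[mid:hi] in place.
    if lo == mid or mid == hi:
        return
    # Linear search for d (a binary search also works, since the
    # comparison is monotone in d).
    d = 0
    while d < min(mid - lo, hi - mid) and a[mid - 1 - d] > a[mid + d]:
        d += 1
    if d == 0:
        return  # a[mid-1] <= a[mid]: the runs are already merged
    # Block swap the last d elements of A with the first d elements of B.
    for k in range(d):
        a[mid - d + k], a[mid + k] = a[mid + k], a[mid - d + k]
    block_swap_merge(a, lo, mid - d, mid)  # merge A-prefix with B'
    block_swap_merge(a, mid, mid + d, hi)  # merge A' with B-suffix
```

Both recursive calls keep the original run sizes (one of length $$|A|$$, one of length $$|B|$$), which is why the depth can degrade to $$O(n)$$ on adversarial inputs.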

As in Rotate Partition Merge, the linear search for the middle range can be replaced with a binary search, which reduces the comparisons from $$O(n\log n)$$ to $$O(n)$$ per merge.