Shell sorting

format_list_bulleted Contenido keyboard_arrow_down

ImprimirCitar

Process step by step according to Shell's algorithm.

The shell sort is a sorting algorithm. The method is named Shell after its inventor Donald Shell. Its original implementation requires O(n²) comparisons and swaps in the worst case. A minor change introduced in V. Pratt's book produces an implementation with performance of O(n log² n) in the worst case. This is better than the O(n²) comparisons required by simple algorithms but worse than the optimal O(n log n). Although it is easy to develop an intuitive sense of how this algorithm works, it is very difficult to analyze its execution time.

Shell sort is a generalization of insertion sort, keeping two observations in mind:

The insertion order is efficient if the entry is "almost ordered".
The insertion system is inefficient, in general, because it moves the values only one position each time.

The Shell sort Algorithm improves insertion sort by comparing elements separated by a space of several positions. This allows an element to make "bigger steps" to its expected position. Multiple passes over the data are done with smaller and smaller space sizes. The last step of the shell sort is a simple insertion sort, but by then, the data in the vector is guaranteed to be almost sorted.

Example

Consider a small value that is initially stored at the end of the vector that you want to sort in ascending order. Using an O(n²) sort like the bubble sort or insertion sort, it will take approximately n comparisons and swaps to move this value to the other end of the vector. The shell sort first moves values using gigantic space sizes, so a small value will move quite a few positions toward its final position, with only a few comparisons and swaps.

One can visualize the shell sort algorithm as follows: put the list into a table and sort the columns (using insertion sort). Repeat this process, each time with a smaller number of longer columns. In the end, the table has only one column. While transforming the list into a table makes it easier to visualize, the algorithm itself does its sorting in context (incrementing the index by the step size, that is, using i += step_size instead of i++).

For example, consider a list of numbers like [ 13 14 94 33 82 25 59 94 65 23 45 27 73 25 39 10 ]. If we start with a step size of 5, we could visualize this by splitting the list of numbers into a table with 5 columns. This would look like this:

13 14 94 33
25 59 94 65 23
45 27 73 25 39
10

Then we order each column, which gives us

10 14 73 25 23
13 27 94 33 39
25 59 94 65
45

When we read it back as a single list of numbers, we get [ 10 14 73 25 23 13 27 94 33 39 25 59 94 65 82 45 ]. Here, the 10 that was at the trailing end has been moved to the leading end. This list is then again sorted using a 3-position gap sort, and then a 1-position gap sort (single insertion sort).

The Shell sort is named after its inventor, Donald Shell, who published it in 1959. Some textbooks and older references call it the "Shell-Metzner" by Marlene Metzner Norton, but according to Metzner, "I have nothing to do with the sorting algorithm, and my name should never be attached to it." [1]

Sequence of spaces

The sequence of spaces is an integral part of the shell sort algorithm. Any incremental sequence would work as long as the last element is 1. The algorithm begins by performing an insertion sort with space, with space being the first number in the space sequence. It continues to perform an insertion sort with space for each number in the sequence, until it ends with a space of 1. When space is 1, the insertion sort with space is just an ordinary insertion sort, guaranteeing that the final list it will be ordered.

The sequence of spaces that was originally suggested by Donald Shell should begin with ${displaystyle N/2}$ and divide by half the number to reach 1. Although this sequence provides significant performance improvements on quadratic algorithms such as the insertion system, it can be slightly changed to decrease the average time needed and the worst case. Weiss' textbook shows that this sequence allows a system $O(n^2)$ of the worst case, if the data are initially in the vector as (small_1, large_1, small_2, large_2,...) - that is, the high half of the numbers are placed, in orderly form, in the positions with index pair and the lower half of the numbers are located in the same way in the positions with index

Perhaps the most crucial property of the sort shell is that elements remain k-sorted even as space decreases. A vector divided into k subvectors is said to be k-ordered if each of these subvectors is ordered when considered in isolation. For example, if a list was 5-ordered and then 3-ordered, the list is now not just 3-ordered, but both 5-ordered and 3-ordered. If this were not true, the algorithm would undo the work it had done in previous iterations, and would not achieve such a low execution time.

Depending on the choice of the sequence of spaces, Shell draw has a run time in the worst case $O(n^2)$ (using Shell increments that begin with n/2, n the vector size and are divided by 2 each time), ${displaystyle O(n^{3/2})}$ (using Hibbard's increases ${displaystyle 2^{k}-1}$ ), ${displaystyle O(n^{4/3})}$ (using Sedgewick increases ${displaystyle O(nlog ^{2}n)}$ and possibly better unchecked execution times. The existence of an implementation ${displaystyle O(nlog n)}$ in the worst case of the Shell draw remains a question to be resolved.

NOTE: Sedgewick increments are calculated by interleaving the values of the following functions:

${displaystyle f(i)=9(4^{i})-9(2^{i})+1;f(0)=1,f(1)=19,f(2)=109,f(3)=505,...)}$

${displaystyle g(i)=(2^{i+2})*(2^{i+2}-3)+1;g(0)=5,g(1)=41,g(2)=209,g(3)=929,...)}$

Más resultados...