{menu} {mirror}



SEPA: A Simple, Efficient Permutation Algorithm




Jeffrey A. Johnson, Brigham Young University-Hawaii Campus
(johnsonj@cs.byuh.edu)


Abstract

Research in progress developing a fast algorithm which will find all possible permutations of a data set without the use of "helping" data structures or recursion.



Introduction

As Robert Sedgewick has said, "It is an interesting computational puzzle to write a program that generates all possible ways of rearranging N distinct items."[1] Applications include "traveling-salesman" problem, where it is necessary to find the shortest path through all points in a graph. Permutations are also used in the analysis of sorting algorithms.[2]

Usual approaches to finding the permutations of a data set include the use of graphs as suggested by Robert Sedgewick[1] or using recursion as suggested by Jacques Arsac[3]. Both of these approaches have a common drawback they require helping data structures to perform the permutation. To create permutations using graphs, it is necessary to have data space for the data set to be permuted, as well as data space for the graph structures to perform the permutation. The recursive algorithm requires the overhead of recursive calls, or the use of a stack if implemented without recursion. These helping data structures can require as much space as the original data set. This paper discusses research in progress which will offer a new approach to the permutation problem, that retains the speed of the above algorithms, without requiring helping data structures.



Research

Upon inspecting character data sets as they permute, it was found that through a set of logical steps a set of character data can be permuted into its next permutation. It then becomes possible to find all permutations in sorted order by repeatedly calling the algorithm on the last output obtained. These same steps can be used on any data set in which the data have magnitude relationships. It was also discovered that each permutation could be created "in place", so no additional helping structure is required. Preliminary analysis reveal that the creation of each permutation has a worst case running time of O(n) and a much faster average running time.



The SEPA Algorithm

Three steps are needed to convert a data set to its next permutation. The first step is to find the key field for this permutation. The first field which changes between the two permutations is the key field. The key field in "abdca" as it becomes "acabd" (its next sorted order permutation) is the field containing the value 'b'.

The key field can be determined by checking pairs of digits (or fields) starting from the rear (or right side) of the set. For each pair, if the left value of the pair is smaller than the right, then the field on the left is the key field. At this point all values to the right of the key field must be in sorted order from right to left (if not a field to the right would be the key field). If the key does not exist, the last permutation has been created. At this time, the data set is in reverse sorted order (from left to right).

For clarification, those values to the right of the key field will be termed the "tail" of the set, while those to the left will be the "head". Note that the head, key, and tail will be different for each permutation.

To find the next permutation in order, the key field must be exchanged with the smallest value in the tail which is larger than the key value. For example in the data set "abdca" the field with the value 'b' is the key, and the field containing the value 'c' has the smallest value in the tail which is larger than the key value. After swapping values, the set becomes "acdba". It is possible to find this value by comparing each value with the key value, starting at the end of the data set. The first value larger than the key value is the value to be exchanged with the key value. Notice that the tail (the 'c' is the key now) is still in reverse sorted order. This is important, because the last step is putting the tail into sorted order.

Since the tail is in reverse sorted order, to sort it requires only a single loop, which works from both ends of the tail at the same time, swapping each pair of items while moving to the center of the tail. In this example, a character data set was used, but as stated before the logic behind this algorithm will work with other data sets, as long as there is a less than/greater than relationship among the data items. If all permutations are required, it is necessary that the original data set be the first permutation (exist in sorted order).

A quick review of the algorithm shows it is compose of three loops (an example in C is given as Listing 1). The first finds the key. The second finds the value to swap with the key. The third reorders the tail from reverse order to sorted order. A short analysis shows all three loops are bounded by the size of the data set, therefore the running time is bounded by O(n).



Future Research

The author with proceed to create in-depth proof showing the algorithm produces all permutations, as well as giving formal worst case, and average run time bound analysis. Representative implementations of both graph and recursive based algorithms will be giving with similar analysis as a comparison to this new approach. A memory constraint analysis will also be given for each of the algorithms.



Contributions

This algorithm's lack of helping data structures makes it a contribution to the field of Computer Science.



Acknowledgments

The author thanks Dr. Christopher Jones (Utah Valley State College) and Dr. Cory Barker (Brigham Young University-Hawaii Campus) for their support for this project.



References

[1]Robert Sedgewick. Algorithms (Second Edition). Addison-Wesley Publishing Company, August 1989.
[2]Donald E. Knuth. The Art of Computer Programming (Volume 3 /Sorting and Searching). Addison-Wesley Publishing Company, 1973.
[3]Jacques Arsac. Foundations of Programming. Academic Press Inc. (London) LTD. 1985.


Listing 1


#include<STDIO.H>
#include<STDLIB.H>


void swap(char *s, int a, int b)
{
   char temp=s[a];
   s[a] = s[b];
   s[b] = temp;
}


int permute(char *str, int len)
{
   int key=len-1;
   int newkey=len-1;

   /* The key value is the first value from the end which
      is smaller than the value to its immediate right        */

   while( (key > 0) && (str[key] <= str[key-1]) )
      { key--; }

   key--;

   /* If key < 0 the data is in reverse sorted order, 
      which is the last permutation.                          */

   if( key < 0 )
      return 0;

   /* str[key+1] is greater than str[key] because of how key 
      was found. If no other is greater, str[key+1] is used   */

   newkey=len-1;
   while( (newkey > key) && (str[newkey] <= str[key]) )
   {
      newkey--;
   }

   swap(str, key, newkey);

   /* variables len and key are used to walk through the tail,
      exchanging pairs from both ends of the tail.  len and 
      key are reused to save memory                           */

   len--;
   key++;

   /* The tail must end in sorted order to produce the
      next permutation.                                       */

   while(len>key)
   {
      swap(str,len,key);
      key++;
      len--;
   }

   return 1;
}


void main()
{
   /* test data */

   char test_string[] = "aabcd";

   /* A short test loop to print each permutation, which are
      created in sorted order.                                */

   do  {
      printf("%s\n",test_string);
   } while( permute(test_string, strlen(test_string)) );
}

Web page saved on 10/19/2001 from URL:  http://www.cs.byuh.edu/~johnsonj/permute/soda_submit.html
Email all questions and concerns to johnsonj@cs.byuh.edu