Joshua Unger, joshua_unger@yahoo.com
JHU - CS 605.451.32 - Spring '98 - Sheppard
Homework 2, Problem 1

  1. Choose Hill Climbing from the list of Algorithms
  2. Choose Sparse or Complete Graph from the list of Graphs
  3. Click on Color! or Step, whatever suits your fancy!
  4. Sit back, relax, and remember: "Every Planar Graph is 4-Colorable", or so they say
If you can read this, you are missing out on a world-class Java applet...

Graph Coloring Problem

Abstract:

The graph coloring problem is solved using two algorithms and their relative performance is measured. The two algorithms used are hill climbing with multiple random restarts and simulated annealing. The problem is implemented in a Java applet using the United States of America as a graph basis.

1.0 Introduction

The graph coloring problem is an extension of the map coloring problem. According to R&N, "In map coloring, the aim is to color countries on a map using a given set of colors, such that no two adjacent countries are the same color. (p.91)" The graph coloring problem is similar to the map coloring problem, but in graph coloring countries are replaced by vertices of a graph. A map can be represented in graph theory by a planar bi-directional graph. The planar modifier in the previous sentence indicates a graph with no vertices intersecting another when the graph is drawn on a planar surface. The bi-directional modifier means that if one vertex is connected to a second vertex, the second is necessarily connected to the first. However, in the graph coloring problem the graph is not necessarily planar nor bi-directional.

While considering the analysis of the solution outlined in this paper, it is important to note that there has been extensive research done on the topic of coloring planar graphs. There exists a theorem that states that all planar graphs are four colorable.

The hill climbing algorithm and simulated annealing are described in Chapter 4.4 of R&N. They are both iterative improvement algorithms. An iterative improvement algorithm maintains no memory of previous states in its search space, and makes incremental changes to the state that will theoretically approach the solution.

2.0 Implementation

2.1 Inputs

Three different graphs are used as inputs. The first graph is a planar bi-directional one, and thus the graph coloring problem becomes the map coloring problem. In keeping with this theme, the graph has 49 vertices and are connected so the graph is a map of the continental United States of America (48 States and the District of Columbia). Each vertex has a x-coordinate and a y-coordinate in a Cartesian reference frame, so the graph can be represented visually. The second graph has the same vertices as the first but the second is complete and bi-directional. A complete graph is defined as one in which each vertex is connected to all the other vertices in the graph. The third graph used also has the same vertices as the first. The third graph is neither planar nor complete, but rather is somewhere in between. The third graph is also not bi-directional like the first two. In the implementation, these graphs are named "Sparse", "Complete", and "In Between," respectively.

With these input graphs, the solutions can be predicted. From the theorem mentioned above, the Sparse graph should have at the most four colors as its solution. By common sense, the Complete graph should have 49 colors as its solution since it has 49 vertices. The solution to the In Between graph is not known, but it should have an upper limit of 49 and a lower limit of 4. Since this problem is discrete, the expected solution should be between 5 and 48.

2.2 Algorithms

The hill climbing algorithm uses multiple random restarts as a method for escaping local maxima and exploring the search space more fully. The version of simulated annealing that is implemented in this problem with a slight flavor of hill climbing. In graph coloring, a large palette of colors creates a vast search space, so the simulated annealing algorithm cannot afford to behave in a pure random manner. Instead, it first checks to see if there are any obvious hill climbing choices in the immediate surrounding space. If not, it then applies the classical simulated annealing strategy of choosing a random successor.

To solve the general graph coloring problem with either algorithm, the applet starts with a color palette of n, where n is the number of vertices in the graph. For each palette, the applet tries to solve the problem. If the applet is successful, it reduces the number of colors in the available palette. If at any time the applet cannot find a solution, it reverts to the last solution found and displays statistical information about the solution path.

2.3 Evaluation Function

Both algorithms use the same evaluation function for determining the worth of a particular state. A state can be characterized the number of errors that it possesses. An error is defined as when two neighboring vertices have the same color. Intuitively, an algorithm can improve its evaluation by changing a vertex so that fewer errors exist in the graph. The evaluation function counts the number of errors in the state and multiplies the total by –1. Therefore the best that a state can rate with this function is 0, and the worst would be -n2 in the case of a complete graph with n vertices.

2.4 Hill Climbing

A general hill climbing algorithm examines the state and generates an improved state if one exists. The state generation strategy of the hill climbing algorithm is specific to the graph coloring problem. First, hill climbing first examines the surrounding vertices of the vertex that was last changed. From this set of vertices, it chooses the vertex with the most amount of errors, include vertices with no errors. The algorithm then examines every possible color that it might change the vertex to, and applies the evaluation function to each possible state. If any possible states are an improvement on the current, then hill climbing picks the state with the greatest improvement and marks the changed vertex as the last vertex changed. If no improvements exist among the possible generated spaces, the algorithm then cycles through every other vertex in the state applying the previous method of state generation. If still no improvements exist after cycling through the entire vertex set, the algorithm fails and the current state is therefore a local maximum.

2.5 Simulated Annealing

It was stated above that this implementation uses a modified version of simulated annealing, specifically that it has some elements of hill climbing built in. The method of determining successor nodes first mirrors that of hill climbing. The algorithm examines the vertex that was last changed. It chooses the adjacent vertex with the fewest errors, and attempts to apply a color that would reduce errors and hence increase evaluation. If it is unable to do so, the algorithm applies simulated annealing. It picks a vertex at random, picks a color at random, and applies classic simulated annealing to the new state.

3.0 Results

3.1 Run Parameters

For the hill climbing algorithm, 25 random restarts are allowed per color palette. If no solution is found after 25 restarts, the applet reverts to the last solved palette. For simulated annealing, the Schedule function is defined as a linear backward countdown from 7.5 to 0. The total number of increments between these two values is 1,000. The two numbers 25 and 1,000 were determined empirically to try to match the total processor time taken for each algorithm to find the solution to the Sparse graph. The linear backward countdown from 7.5 was also determined empirically but this number affects the effectiveness of the simulated annealing algorithm, not the processor time. The following graph illustrates how this choice was made. For larger values of D E relative to the temperature in the Maxwell-Boltzman equation, the probability function slopes more rapidly than the same equation for smaller values of D E relative to the same temperature. For smaller values, the probability plateaus for a while and then sharply dipped. For the modified simulated annealing algorithm, it was determined that a larger ratio of D E to temperature was desired, yielding a more sharply sloping probability function. This is because of the nature of the algorithm. Since the probability corresponds to a random negative change, the less of these changes made the algorithm perform better. Regardless, a countdown from 7.5 to 0 was determined empirically to be the most effective.

3.2 Performance Measures

30 tests were run on a Pentium 200 MHz PC with 64 MB of memory. The applet was run in a Microsoft Internet Explorer 3.0 browser window. The applet was written in Java 1.0.2 even though 1.0.2 is not the current release, due to the desire to eventually have this applet displayed at the Gamelan site. Interestingly enough, the Java Appletviewer that comes with release 1.0.2 performed at about 25% of Internet Explorer’s capability for displaying the applet. Perhaps this is due to the difference in graphics capability between the two applications. In any case, each graph was solved 5 times by each algorithm. The table below summarizes the best solution described by the number of colors found and the total processor time spent, where HC is hill climbing and SA is simulated annealing.

HC Sparse

SA Sparse

HC In Between

SA In Between

HC Complete

SA Compete

4 / 16.12 s

4 / 23.13 s

20 / 21.36 s

18 / 36.13 s

49 / 27.93 s

49 / 24.19 s

4 / 14.06 s

5 / 16.70 s

19 / 37.38 s

18 / 30.50 s

49 / 33.67 s

49 / 91.66 s

5 / 8.13 s

4 / 5.78 s

20 / 22.08 s

17 / 65.92 s

49 / 49.45 s

49 / 80.60 s

5 / 2.20 s

4 / 45.20 s

20 / 23.80 s

17 / 26.04 s

49 / 62.05 s

49 / 65.58 s

4 / 6.00 s

4 / 40.76 s

20 / 17.88 s

17 / 100.46 s

49 / 80.47 s

49 / 55.30 s

Average:

4.4 / 8.90 s

4.2 / 26.31

19.8 / 24.5 s

17.6 / 51.81

49 / 50.71 s

49 / 63.47 s

Percentage of solved solutions can be compared by taking the official solution as the smallest number of colors within a graph solution set. For example, for the Sparse graph, the solution is 4. It is apparent that simulated annealing does a better job of solving the problem, but again, at what cost? It takes simulated annealing a longer time to arrive at the solution. Operator cost is directly measured by the total time spent on the solution.

Within each set of five measured times there is a wide range of times. A few factors may have contributed to this. The first is the nature of the random behavior of each of the algorithms. Sometimes they find a solution rapidly and sometimes not. The second is the computer language issue. It is not certain how Java manages memory as the applet is alive for a long time. It may become more or less efficient with the different colors and graphics that are constantly being updated. Other processes being run on the computer may have also affected the times.

4.0 Conclusion

Both the hill climbing algorithm and the simulated annealing algorithm are effective for solving the graph coloring problem. In the case of the planar graph with 49 vertices with a palette of 4 colors, trying every possible combination would be 449 combinations! For each algorithm to be able to solve the graph problem in 20 seconds graphically is a great feat.

The limited time available restricted the more extensive analysis of performance. If this experiment were to be continued in the future, more time would be spent to automate the testing procedure so that more data could become available. However, for the purposes of the assignment, the graphical approach was more novel and appealing from an implementation standpoint.