Sunday, September 14, 2008

Mathematical description of simulated annealing algorithm

The parameters used in the algorithm are: Tstart – the start temperature, Tend – the temperature at which the algorithm should end, α – the temperature fall coefficient, α ∈ (0, 1), and k – the repeat count.
In the algorithm, the temperature changes according to the following equation:

T_{(N+1)k} = \alpha \cdot T_{Nk}, \qquad T_0 = T_{\text{start}}

In the above equation, N denotes the current step of the algorithm and Nk is N multiplied by k; k is the number of times the algorithm has to be repeated before the current temperature changes.
From this we can clearly see that the parameter k is basically redundant: its only function is to slow down the temperature fall, which can also be achieved by bringing α closer to 1. Dropping it, we get a slightly simplified equation:

T_{N+1} = \alpha \cdot T_N

The solution naturally is:

T_N = T_{\text{start}} \cdot \alpha^N

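Just to illustrate (this snippet is mine, not part of the original derivation), both schedules can be written in a few lines of Python; the names t_start, t_end, alpha and k mirror the parameters above:

import itertools

def temperature_schedule(t_start, t_end, alpha, k):
    """Yield the temperature for each step: the temperature is multiplied
    by alpha after every k steps, and the schedule stops below t_end."""
    t = t_start
    while t > t_end:
        for _ in range(k):          # k repetitions at the same temperature
            yield t
        t *= alpha                  # T_{(N+1)k} = alpha * T_{Nk}

def simplified_schedule(t_start, n, alpha):
    """Closed form of the simplified schedule: T_N = t_start * alpha**N."""
    return t_start * alpha ** n

# Example: the first few temperatures for t_start = 10, alpha = 0.9, k = 3
print(list(itertools.islice(temperature_schedule(10.0, 1.0, 0.9, 3), 9)))

Setting k = 1 here and moving alpha closer to 1 slows the fall in the same way, which is exactly the redundancy noted above.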
The rate at which solutions are accepted is:

P(\Delta E) = \begin{cases} 1, & \Delta E \le 0 \\ e^{-\Delta E / T}, & \Delta E > 0 \end{cases}

where ∆E is the difference between the energy of the new solution and that of the previous one.
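In code, this acceptance rule is the usual Metropolis test; here is a minimal Python sketch (the function name is mine):

import math
import random

def accept(delta_e, temperature):
    """Always take an improvement; take a worse solution
    with probability exp(-delta_e / T)."""
    if delta_e <= 0:
        return True
    return random.random() < math.exp(-delta_e / temperature)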
The equation above actually describes the movement of a system influenced by an attractor: it is easier for the system to move towards the attractor than in the opposite direction, and as the temperature falls, the attractor becomes more powerful.
Since the system is never forced to move in one particular direction, it is free to explore the many extremes of a function with multiple local extrema, gradually finding the global one. This also suggests that if we write down and solve the equation of the system's movement, we can determine the optimal parameters for particular data. Writing down this equation is not as easy as it seems, because the process involves "decisions" such as whether or not to accept the solution with the next energy.
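Putting the cooling schedule and the acceptance rule together, here is a compact, self-contained sketch of the whole algorithm on a one-dimensional function with several local minima; the test function and all parameter values are illustrative choices of mine:

import math
import random

def anneal(energy, x0, t_start=10.0, t_end=1e-3, alpha=0.95, k=50, step=0.5):
    """Minimize `energy` starting from x0, using the schedule and the
    acceptance rule described above."""
    x, t = x0, t_start
    best_x, best_e = x0, energy(x0)
    while t > t_end:
        for _ in range(k):                        # k tries per temperature
            candidate = x + random.uniform(-step, step)
            delta_e = energy(candidate) - energy(x)
            if delta_e <= 0 or random.random() < math.exp(-delta_e / t):
                x = candidate
                if energy(x) < best_e:
                    best_x, best_e = x, energy(x)
        t *= alpha                                # temperature fall
    return best_x, best_e

# A function with many local minima; its global minimum is near x ≈ -0.5.
print(anneal(lambda x: x * x + 10.0 * math.sin(3.0 * x), x0=5.0))

Because worse moves are still accepted while the temperature is high, the search can escape the shallow minima near the starting point before the attractor becomes strong.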
Basically, if we look at the rate at which solutions are accepted, we can see diffusion. This is the continuous version of the Markov-chain model, which looks like this: let us consider the simplified model, where from the current energy only one better and one worse solution can be produced, and p is the probability that the better solution is the one produced by the change algorithm. The chain then would look like:

P(E_i \to E_{i-1}) = p, \qquad P(E_i \to E_{i+1}) = (1 - p)\, e^{-\Delta E / T}, \qquad P(E_i \to E_i) = (1 - p)\left(1 - e^{-\Delta E / T}\right)

with E_{i-1} < E_i < E_{i+1}.
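A single step of this simplified chain can be simulated directly from the transition probabilities above; this is a sketch of mine that assumes a fixed energy spacing delta_e between the neighbouring solutions:

import math
import random

def chain_step(e, p, delta_e, temperature):
    """Move from energy e to one of its two neighbours, or stay put."""
    if random.random() < p:                        # the better solution was proposed
        return e - delta_e                         # improvements are always accepted
    if random.random() < math.exp(-delta_e / temperature):
        return e + delta_e                         # the worse solution was accepted
    return e                                       # the worse solution was rejected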
For the diffusive process, let us assume the following: f(E, -∆E) – the probability that the solution produced by the change algorithm in a state with energy E will decrease the energy by ∆E, and f(E, ∆E) – the probability that the proposed solution will increase the energy by ∆E. In annealing algorithms it is usually the case that:

f(E, \Delta E) = f(E, -\Delta E)
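For instance (an illustrative density chosen by me, not given in the post), any f that depends only on |∆E| satisfies this symmetry, e.g. a Laplace-shaped proposal:

f(E, \Delta E) = \frac{1}{2\lambda}\, e^{-|\Delta E| / \lambda}

where λ sets the typical size of the proposed energy change; it is reused in the worked example below.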
Then, since a proposed improvement is always accepted, the probability of transferring to a better position will be:

P_{\downarrow}(E) = \int_0^{\infty} f(E, -\Delta E)\, d\Delta E
The probability of transferring to a worse energy:

P_{\uparrow}(E) = \int_0^{\infty} f(E, \Delta E)\, e^{-\Delta E / T}\, d\Delta E

The probability that the system's state won't change:

P_{0}(E) = 1 - P_{\downarrow}(E) - P_{\uparrow}(E) = \int_0^{\infty} f(E, \Delta E)\left(1 - e^{-\Delta E / T}\right) d\Delta E

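With the illustrative Laplace density from above, these three probabilities can be computed in closed form:

P_{\downarrow} = \int_0^{\infty} \frac{1}{2\lambda} e^{-\Delta E/\lambda}\, d\Delta E = \frac{1}{2},
\qquad
P_{\uparrow} = \int_0^{\infty} \frac{1}{2\lambda} e^{-\Delta E/\lambda}\, e^{-\Delta E/T}\, d\Delta E = \frac{T}{2(T+\lambda)},
\qquad
P_{0} = \frac{\lambda}{2(T+\lambda)}

so as T → 0 the upward moves die out and the system can only improve or stay put, which is exactly the strengthening of the attractor mentioned earlier.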
The probability that the system will arrive at the state with energy E0 from a state with higher energy:

P(E_0 + \Delta E \to E_0) = f(E_0 + \Delta E,\, -\Delta E)

The probability of transferring to state E0 from a state with lower energy:

P(E_0 - \Delta E \to E_0) = f(E_0 - \Delta E,\, \Delta E)\, e^{-\Delta E / T}

Now, the probability of being in the current state at the next step would be:

P_{N+1}(E_0) = P_N(E_0)\, P_0(E_0) + \int_0^{\infty} P_N(E_0 + \Delta E)\, f(E_0 + \Delta E, -\Delta E)\, d\Delta E + \int_0^{\infty} P_N(E_0 - \Delta E)\, f(E_0 - \Delta E, \Delta E)\, e^{-\Delta E / T}\, d\Delta E

where P_N(E) denotes the probability of being in a state with energy E at step N.
Also:

\int P_N(E)\, dE = 1

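To make this concrete, here is a small Python sketch of mine (a discretized energy grid together with the Laplace proposal density from above) that iterates the balance equation and checks that the distribution stays normalized while the probability mass drifts towards low energies:

import numpy as np

energies = np.linspace(0.0, 10.0, 201)      # discretized energy axis (assumed range)
bin_width = energies[1] - energies[0]
lam = 1.0                                    # scale of the Laplace proposal density

def master_equation_step(p, temperature):
    """Apply one step of the balance equation to the distribution p over the grid."""
    delta = energies[None, :] - energies[:, None]             # delta[i, j] = E_j - E_i
    proposal = np.exp(-np.abs(delta) / lam) / (2.0 * lam) * bin_width
    np.fill_diagonal(proposal, 0.0)                           # staying is handled below
    accept = np.where(delta <= 0.0, 1.0, np.exp(-delta / temperature))
    transition = proposal * accept                            # P(E_i -> E_j) for i != j
    stay = 1.0 - transition.sum(axis=1)                       # P(E_i -> E_i)
    return p @ transition + p * stay

# Start from a uniform distribution and cool slowly: the mass concentrates
# around the lowest energies, while the normalization is preserved.
p = np.full(len(energies), 1.0 / len(energies))
t = 5.0
while t > 0.05:
    p = master_equation_step(p, t)
    t *= 0.99
assert abs(p.sum() - 1.0) < 1e-9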
Basically, the system represented by the equations above fluctuates around the state with the highest probability at the current step (the best solution for the current temperature), which is the attractor of the system at step N. It can be represented as follows:

Figure: the change of the attractor of the system as the algorithm proceeds, N1 > N2. The overall shape of the curves is determined by the properties of the solution-modification part of the algorithm.

An attractor is a set to which a dynamical system evolves after a long enough time. That is, points that get close enough to the attractor remain close even if slightly disturbed. Geometrically, an attractor can be a point, a curve, a manifold, or even a complicated set with a fractal structure known as a strange attractor. Describing the attractors of chaotic dynamical systems has been one of the achievements of chaos theory.
A trajectory of the dynamical system in the attractor does not have to satisfy any special constraints except for remaining on the attractor. The trajectory may be periodic or chaotic or of any other type.