Simulated annealing

Author: Gwen Zhang (xz929), Yingjie Wang (yw2749), Junchi Xiao (jx422), Yichen Li (yl3938), Xiaoxiao Ge (xg353) (ChemE 6800 Fall 2024)

Stewards: Nathan Preuss, Wei-Han Chen, Tianqi Xiao, Guoqing Hu

Introduction

Simulated annealing (SA) is a probabilistic optimization algorithm inspired by the metallurgical annealing process, which reduces defects in a material by controlling the cooling rate to achieve a stable state.[1] The core idea of SA is to escape local optima by occasionally accepting suboptimal solutions, which enables it to find near-global optima in large and complex search spaces.[2] As the search converges, the probability of accepting a suboptimal solution diminishes.

SA is widely applied in diverse fields such as scheduling,[3] machine learning, and engineering design, and is particularly effective for combinatorial optimization problems that are challenging for deterministic methods.[4] First proposed in the 1980s by Kirkpatrick, Gelatt, and Vecchi, SA demonstrated its efficacy in solving various complex optimization problems through its analogy to thermodynamics.[5] Today, it remains a powerful heuristic often combined with other optimization techniques to enhance performance in challenging problem spaces.

Algorithm Discussion

Formal Description of the Algorithm

SA is a probabilistic technique for finding approximate solutions to optimization problems, particularly those with large search spaces. Inspired by the annealing process in metallurgy, the algorithm explores the solution space by occasionally accepting worse solutions with a probability that diminishes over time. This reduces the risk of getting stuck in local optima and increases the likelihood of discovering the global optimum.

The basic steps of the SA algorithm are as follows:

  1. Initialization: Begin with an initial solution <math>s_0</math> and an initial temperature <math>T_0</math>.
  2. Iterative Improvement: While the stopping criteria are not met:
    • Generate a new solution <math>s'</math> in the neighborhood of the current solution <math>s</math>.
    • Calculate the difference in objective values <math>\Delta E = f(s') - f(s)</math>, where <math>f</math> is the objective function.
    • If <math>\Delta E < 0</math> (i.e., <math>s'</math> is a better solution), accept <math>s'</math> as the current solution.
    • If <math>\Delta E \geq 0</math>, accept <math>s'</math> with probability <math>P = \exp\left(-\frac{\Delta E}{T}\right)</math>.
    • Update the temperature according to a cooling schedule, typically <math>T \leftarrow \alpha T</math>, where <math>\alpha</math> is a constant between 0 and 1.
  3. Termination: Stop the process when the temperature falls below a minimum threshold <math>T_{min}</math>, or after a predefined number of iterations. The best solution encountered is returned as the approximate optimum.

Pseudocode for Simulated Annealing

import math
import random

def simulated_annealing(f, generate_neighbor, s_0, T_0, alpha, T_min):
    s_current = s_0
    s_best = s_0                        # track the best solution encountered
    T = T_0
    while T > T_min:
        s_new = generate_neighbor(s_current)   # propose a nearby solution
        delta_E = f(s_new) - f(s_current)
        # Metropolis criterion: always accept improvements; accept worse
        # solutions with probability exp(-delta_E / T).
        if delta_E < 0 or random.random() < math.exp(-delta_E / T):
            s_current = s_new
            if f(s_current) < f(s_best):
                s_best = s_current
        T = alpha * T                   # geometric cooling schedule
    return s_best
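
As a quick usage illustration (not part of the original article), the routine can be applied to minimize the convex toy function f(x) = x^2, using a Gaussian perturbation as a hypothetical neighborhood move; all parameter values below are assumptions chosen for demonstration:

import random

f = lambda x: x * x                          # objective: global minimum at x = 0
neighbor = lambda x: x + random.gauss(0, 1)  # hypothetical neighborhood move

x_opt = simulated_annealing(f, neighbor, s_0=10.0, T_0=100.0,
                            alpha=0.95, T_min=1e-3)
print(x_opt)  # typically close to 0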

Assumptions and Conditions

Objective Function Continuity

The objective function <math>f</math> is assumed to be defined and continuous in the search space.

Cooling Schedule

The cooling schedule, typically geometric, determines how <math>T</math> decreases over iterations. The cooling rate <math>\alpha</math> affects the convergence speed and solution quality.
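
For a geometric schedule, the temperature after <math>k</math> iterations is <math>T_k = \alpha^k T_0</math>, so the run length is predictable: with illustrative values (not from the original) <math>T_0 = 100</math>, <math>\alpha = 0.95</math>, and <math>T_{min} = 0.01</math>, the algorithm performs <math>\lceil \ln(T_{min}/T_0) / \ln \alpha \rceil = 180</math> iterations before terminating.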

Acceptance Probability

Based on the Metropolis criterion, the probability <math>P = \exp\left(-\frac{\Delta E}{T}\right)</math> allows the algorithm to accept worse solutions initially, encouraging broader exploration.
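
To make the transition from exploration to exploitation concrete, the short sketch below (illustrative, not from the original article) evaluates the Metropolis probability for a fixed uphill move delta_E = 1 at several temperatures:

import math

delta_E = 1.0  # a fixed uphill (worse) move
for T in (10.0, 1.0, 0.1):
    p = math.exp(-delta_E / T)
    print(f"T = {T:5.1f}  ->  acceptance probability = {p:.4f}")

# Prints 0.9048 at T = 10 and 0.3679 at T = 1; exp(-10) is about 5e-5,
# so uphill moves become increasingly rare as the system cools.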

Termination Condition

SA terminates when <math>T</math> falls below <math>T_{min}</math> or when a predefined number of iterations is reached.

Numerical Examples

The Ising model is a classic system from statistical physics, and Simulated Annealing (SA) can be used to drive it toward a minimal-energy state.[6] In the SA approach, we start with a high "temperature" that allows random changes in the system’s configuration, which helps explore different arrangements of spins. As the temperature lowers, the algorithm gradually favors configurations with lower energy, eventually "freezing" into a low-energy state.

Problem Setup

[Figure 1: Initial spin configuration L (image omitted)]

Consider a small <math>2\times 2</math> grid of spins, where each spin can be either <math>+1</math> or <math>-1</math>. The energy of the system is determined by the Hamiltonian <math>E = -J \sum_{\langle i,j \rangle} S_i S_j</math>, where <math>J</math> is a coupling constant (set to 1 for simplicity), <math>S_i</math> and <math>S_j</math> are neighboring spins, and the summation is over nearest neighbors. The goal is to minimize <math>E</math>, which is achieved when all spins are aligned.

Algorithm Steps

  1. Initialize the spins: start with a random configuration. For example: <math>L = \begin{bmatrix} +1 & -1 \\ -1 & +1 \end{bmatrix}</math>
  2. Calculate Initial Energy: For this configuration, each pair of neighboring spins contributes <math>-J S_i S_j</math> to the energy. Calculate <math>E</math> by summing these contributions: the four nearest-neighbor pairs are <math>(+1, -1), (-1, +1), (-1, +1), (+1, -1)</math>, all of which contribute <math>+1</math>. So the initial energy is <math>E = 4</math>.
  3. Neighboring Configurations: Select a random spin to flip. Suppose the spin in the top right corner is chosen, changing <math>-1</math> to <math>+1</math>: <math>\text{New } L = \begin{bmatrix} +1 & +1 \\ -1 & +1 \end{bmatrix}</math>. Recalculate <math>E</math> for the new configuration: two pairs are now aligned and contribute <math>-1</math> each, while the two remaining pairs are anti-aligned and contribute <math>+1</math> each, so the total energy is <math>E = 0</math>.
  4. Acceptance Decision: Calculate <math>\Delta E = E_{\text{new}} - E_{\text{old}} = 0 - 4 = -4</math>. Since <math>\Delta E < 0</math>, accept the new configuration. Repeating this propose-and-accept process as the temperature decreases eventually freezes the system in a fully aligned ground state with <math>E = -4</math>; the short sketch after these steps reproduces this calculation.
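
The following minimal Python sketch (written for this example, with cooling parameters chosen purely for demonstration) encodes the 2x2 lattice energy and runs the flip-accept-cool loop described above:

import math
import random

J = 1  # coupling constant

def energy(L):
    """Energy of a 2x2 spin grid: sum over the 4 nearest-neighbor bonds."""
    return -J * (L[0][0]*L[0][1] + L[1][0]*L[1][1]    # horizontal bonds
                 + L[0][0]*L[1][0] + L[0][1]*L[1][1]) # vertical bonds

L = [[+1, -1], [-1, +1]]            # initial checkerboard, E = 4
T, alpha, T_min = 10.0, 0.9, 0.01   # illustrative cooling parameters

while T > T_min:
    i, j = random.randrange(2), random.randrange(2)
    E_old = energy(L)
    L[i][j] = -L[i][j]              # flip one randomly chosen spin
    delta_E = energy(L) - E_old
    # Metropolis rule: keep the flip if it lowers E; otherwise keep it
    # only with probability exp(-delta_E / T), else undo it.
    if delta_E > 0 and random.random() >= math.exp(-delta_E / T):
        L[i][j] = -L[i][j]          # reject: flip back
    T *= alpha

print(L, energy(L))                 # typically all aligned with E = -4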

References

  1. Metropolis, N., Rosenbluth, A. W., Rosenbluth, M. N., Teller, A. H., & Teller, E. (1953). "Equation of state calculations by fast computing machines." Journal of Chemical Physics, 21(6), 1087-1092.
  2. Kirkpatrick, S., Gelatt, C. D., & Vecchi, M. P. (1983). "Optimization by simulated annealing." Science, 220(4598), 671-680.
  3. Aarts, E. H., & Korst, J. H. (1988). Simulated Annealing and Boltzmann Machines: A Stochastic Approach to Combinatorial Optimization and Neural Computing. Wiley.
  4. Cerny, V. (1985). "Thermodynamical approach to the traveling salesman problem: An efficient simulation algorithm." Journal of Optimization Theory and Applications, 45(1), 41-51.
  5. Kirkpatrick, S. (1984). "Optimization by simulated annealing: Quantitative studies." Journal of Statistical Physics, 34(5-6), 975-986.
  6. Peierls, R. (1936). "On Ising’s model of ferromagnetism." Mathematical Proceedings of the Cambridge Philosophical Society, 32(3), 477-481.