Thinking about simulated annealing

October 31st, 2009
machinelearning, programming, tech
I've been reading Duda, Hart, and Stork's "Pattern Classification". Much more detail than we got to in swat's AI class or the following seminar. Some parts are tough going, and I'm definitely not working out every detail, but I think it's helpful.

I just finished reading about simulated annealing, and I think I understand now why it's much more generally applicable than simple random restart hill climbing. In the latter you pick a random point in the solution space, consider all the neighboring points, move to the best one, repeat until you're better than your neighbors. Then repeat the whole process, remembering your best places so far, until you decide to stop. People always talk about the problem with this as "getting stuck in local minima" (with the "down is good" analogy of "gradient descent" instead of the "local maxima" that the "up is good' analogy of "hill climbing" would give) but I didn't really understand. I think now I do.

Consider trying to find the lowest point in a parking lot. The lot is pretty smooth, and pretty consistent, so you could put a soccer ball in the middle and see where it stopped. But what if you chose a ball bearing? The parking lot is smooth on a soccer ball scale, but not on a ball bearing scale. So the bearing gets caught in a low bit (local minimum) while the soccer ball rolls all the way down (global minimum).

In simulated annealing, we use a ball bearing but shake the parking lot so it has a decent chance of getting out of these little pockets and making it's way all the bottom of the lot. So that we don't keep bouncing around for ever, we start off shaking the lot a large amount, and over time decrease our shaking exponentially until we're not shaking at all.

Because simulated annealing decreases the amount of shaking over a very wide range (lots to almost none) it's suitable for all sorts of terrain, from ditches to dents. Hill climbing only works if the terrain is close to smooth.

What would happen if we tried a search system closer to the one with the soccer ball? The soccer ball represents a large portion of the solution space, and at each time step it moves from one set of adjacent points to a new set, under the condition that the highest of the new set of points be lower than the highest of the current set of points. Once could imagine "simulated soccer ball rolling" where one starts off with a giant ball and then makes it smaller over time until one has a ball bearing making tiny adjustments. The biggest problem with this that I see is that rolling a soccer ball requires testing a prohibitively large number of points. Imagine how many points would need to be tested to roll a ball 1/16 the size of the earth over the earth's surface. Worse, there are lots of solutions this search misses because it is strongly biased towards solutions that lie in large flattish regions. Our giant ball rolling over arizona would likely miss the grand canyon.

This diversion with ball rolling aside, I think the part about simulated annealing that really made sense to me is: exponentially decreasing willingness to make things worse is a good fit for a huge range of possible solution spaces.

Comment via: facebook

Recent posts on blogs I like:

Impact, agency, and taste

understand + work backwards from the root goal • don’t rely too much on permission or encouragement • make success inevitable • find your angle • think real hard • reflect on your thinking

via benkuhn.net April 19, 2025

Book Review: Daygame Mastery

the dark, forbidden, secret truth that women like men with boundaries

via Thing of Things April 18, 2025

Which Came First, the Chicken or the Egg?

When I thought about this question it was really hard to figure out because the way it's phrased it's essentially either a chicken just pops into existence, or an egg just pops into existence, without any parent animals involved. I thought about t…

via Lily Wise's Blog Posts April 13, 2025

more     (via openring)