Taking the Fun Out of Puzzles & Games

Thursday, October 19, 2017

The high cost of "safe" estimates

A summary of an idea from Goldratt's Critical Chain

When talking about task duration estimates, I'll define a "true" estimate as an estimate in which we have 50% confidence, that is, an estimate that has a 50% probability of being greater than or equal to the actual duration. Similarly, a "safe" estimate is an estimate in which we have 90% confidence.

People want to be seen as reliable, so they tend to give safe estimates. But their stakeholders tend to try to whittle down the estimate, so the person often has to fight for their estimate. Now they might look dumb if they finish much sooner than the estimate for which they just fought so hard. Even if the stakeholder is initially pleased when this happens, the person knows that, in the future, their estimates may be seen as inflated. 

If you were to draw a graph of the likelihood of a task finishing at various points in time, it would commonly look something like this: 

Here, the X axis represents time (say, in weeks), and the area under the curve from x1 to x2 is the probability that the task will take between x1 and x2 weeks to complete. Since there a chance that the task will take a long time, the curve has a long tail on the right-hand side (not shown completely in this graph) that drifts down to 0 very slowly. The area under the entire curve is 1, reflecting an unrealistic assumption that the task will definitely be completed at some point in time. 

The area under the curve from 0 to 1 is about 0.5 (as you can see noting that the green area of the graph is slightly less than a 1 x 0.6 rectangle), so the probability that this task will take less than 1 week is about 50%. In other words, the 50th percentile time for this task is 1 week. The probability that it will take between 1 and 2 weeks is about 25%. The probability that it will take between 2 and 4 weeks is about 15%. And the probability that it will take longer than 4 weeks is about 10%. Adding these numbers up, we see that this task has a 50% chance of finishing in less than a week, a 75% chance of taking less than 2 weeks, and a 90% chance of taking less than 4 weeks. 

So, an estimate with 90% confidencea safe estimatewill have to be many times longer than a true estimate. 

For more on this topic, see Cutting Edge's excellent white paper on Managing Duration Uncertainty. 

Thursday, December 1, 2016

Wednesday, November 2, 2016

How big is 10^100^100?

Elliot Temple posed the following math puzzle: How many square roots does it take to get 10100100 down to a small number? (And what number is it?)
Note that abc = a(bc).
Answer: 665 square roots takes you down from 10100100 to around 4.5.
Here, lg means the base 10 logarithm, and dot (·) means multiplication. I will use the following facts:
  • (ab)c = abc
  • xaxb = xa+b
  • x = 10lg x
First, note that 10100100= 1010200 . We next prove the following lemma:
abc = 1010c lg b + lg lg a
Example: 223 = 28 = 256 = 10103 lg 2 + lg lg 2

=(10lg a)(10lg b)c
=(10lg a)10c lg b
=10lg a · 10c lg b
=1010lg lg a · 10c lg b
=1010c lg b + lg lg a

We can now use the lemma to show that 4.52665 = 1010665 lg 2 + lg lg 4.5. And 665 lg 2 + lg lg 4.5 ≈ 200.▢
You can also check the answer with Robert Munafo's Hypercalc, an online calculator that works with very large integers. To do this, enter 4.5^(2^665) in Hypercalc. It evaluates it to something very close to 10^10^200.

Monday, January 11, 2016

2011 Sprouts Clash of the Titans - Game 2 Analysis

Clash of the Titans, Game 1: Analysis by Josh Jordan

Clash of the Titans, Game 2

Analysis by Josh Jordan

For background, see my analysis of game 1.

22- (Khorkov - Jordan+Aunt Beast*) 1(23)1[2-10] 1(24)23[11-14] 11(25)24 2(26)2 2(27)26[3] 15(28)16 15(29)16[17-19] 15(30)17 16(31)20


Aunt Beast doesn't know who wins this position, but she can see that Roman loses if he moves in the 04 component.

20(33)31[21,22] N

Roman moves in the 04 component, changing it to a 15. In a way, Roman's hand is forced: he wants S(17) to have the nim-values 1x, and this is the only move that does that.

29(34)30[28] P

Move 34 and every subsequent even-numbered move is P. Based on its nim-values, this seems to be another one of those moves which magically win, seemingly at random. As John Conway wrote in Winning Ways, "You mustn't expect any magic formula for dealing with such positions."

Situations like this demonstrate just how inadequate nim values can be in the analysis of misère sprouts. Understanding these positions will require a fundamental breakthrough in the mathematical theory of misère sprouts games.

17(35)17[18,19] 21(36)22 18(37)19 18(38)19 19(39@35)38 20(40)36

After move 40, S(3) still has nim-values 11 (the sphere hasn't changed since move 27), but S(40) has evolved into a component with nim-values 55 — and the result is still a misère P-position! We've truly crossed over into the Twilight Zone.

18(41)37[39] 4(42)4[5-8] 4(43)42[5] 3(44)3 3(45)27 9(46)9 9(47@10)46 21(48)22 21(49@33)48 40(50)49[22] II

After a long endgame, Roman concedes.

2011 Sprouts Clash of the Titans - Game 1 Analysis

Clash of the Titans, Game 1: Analysis by Josh Jordan

Clash of the Titans, Game 1

Analysis by Josh Jordan

Game 1 of the World Game of Sprouts Association-sponsored “2011 Clash of the Titans” match between WGOSA World Champion Roman Khorkov and me took place over email in early February, 2011. I was assisted by my computer program Aunt Beast, while Roman used only his brain. After I won the game, Roman wrote:

“This was an amazing game. Very intriguing... It added my white hair. It will be good if you make a detailed analysis. We simply have to bring it to the public!”

As requested, here is my analysis. First, though, let's have a look at these two titans.

Roman Khorkov posing on a boat Josh Jordan, leaning against a bridge in Paris
Roman Khorkov Josh Jordan

27- (Khorkov*)

Sprouts is a two-player pencil-and-paper game invented in the 1960s by mathematicians John Conway and Michael Patterson. The game starts with a field of dots on an otherwise-empty piece of paper. The move is to add a new dot along with two lines joining it to existing dots subject to the conditions that (a) lines may not cross and (b) no dot may have more than 3 lines coming out of it. (Mathematically speaking, each move extends a plane multigraph of maximum degree 3 by adding a new vertex of degree 2.) The players alternate turns until it is no longer possible to move, at which point the game is over. Like any impartial combinatorial game, sprouts can be played with two different win conditions: (a) last player wins or (b) last player loses. The former is known as normal play, the latter as misère (pronounced something like "MEE-zair") play.

The top human players seem to be able to play normal sprouts quite accurately. For this reason, the emphasis in modern tournament sprouts has shifted to misère play. (According to the Wikipedia entry on misère games, "Perhaps the only misère combinatorial game played competitively in an organized forum is Sprouts.") However, skill in normal play does not go to waste: the main principle of human misère sprouts strategy is to to create a large component with firm nim-values (see below), in the presence of which the game's misère outcome is usually the same as its normal outcome. (A component is a minimal contiguous region of the game with the property that a move in that region does not affect the moves available in the rest of the game.)

In sprouts, one player serves, that is, determines the number of initial spots and the win condition (normal or misère), and the other player then decides who plays first.. The player who plays first is known as Left; the other is known as Right. Here Roman has served 27 spots, misère. Had he intended normal play, the serve would have been written as "27+".

27- (Jordan+Aunt Beast - Khorkov*) 1(28)2

The shape that results from connects two initial spots like this is known as an anago.

Even though 27- is thought to be a win for Right, I opt to move first, in part to show how a computer can help win a theoretically losing position. We're a long way from knowing for sure whether 27- is a win for Left or a win for Right, so in the opening phase I merely try to make the position more complex, in accordance with the Enough Rope Principle.

The diagrams here follow Winning Ways by coloring Left's move bLue and Right's move Red.


Roman responds with the analogue of the correct reply in 15-. According to theory, adding a multiple of 6 spots to a region with a large number of spots doesn't change the outcome of the game, so this move should also work for 27-.


Continuing the standard opening for 15-, I make a *3 component in S(2). (The notation S(x) means the entire component containing spot x.) I will try to make more of these *2 and *3 components as the game continues. An identical pair of nim-heaps of size at least 2 make a component with firm nim-values 00. In misère play, when the position contains a component with firm nim-values, further identical pairs of these nim-heaps can can usually be ignored as in normal play, but not always…

It may be useful to recall some definitions and notation. A nim-heap of size x is denoted by *a. The nim-value of a position x is the size of the nim-heap that must be adjoined to x to result in a P-position, that is, a loss for the player to move. A position that is a win for the player to move is known as an N-position.

The nim-values of a position are its normal nim-value a and its misère nim-value b, written ab. A component with firm nim-values is a component with identical normal and misère nim-values. This is not the same as a firm component, which is a tame component with firm nim-values. A component x is tame whenever there exist nim-heaps a, b, …, c such that, for any impartial combinatorial game y, a+b+…+c+y has the same misère outcome as x+y. (Note that y doesn't necessarily have to follow the rules of sprouts or nim.) Most components with firm nim-values are not tame. A partially-correct table of nim-values for various common sprouts components can be found at Playing Sprouts with Misère Grundy Numbers (Peltier, 2008).


Roman makes a group of 16 spots and a group of 7. If spots 30 and 31 were removed, both of these groups would be components with firm nim-values of 11.


With Aunt Beast, I try to disrupt the human strategy by steering the game into positions in which the normal outcome differs from the misère outcome, even in the presence of components with firm nim-values. Playing those types of positions correctly seems to be very difficult for today's human sprouts players.

I have noticed that many of these tricky positions contain groups of four spots, so I try to create such a group here. This was almost certainly a mistake, since it allows Roman to effectively remove those four spots from the game on his next move.


Roman simplifies. S(6) is now effectively dead (equivalent to *0).

Dead spots and components which I know to be effectively dead (equivalent to *0, either by themselves or in combination with other components) are colored gray in the diagrams. Further play in these dead components is pointless.


Trying again to add complexity, I fall back on the usual approach of playing an anago.


I expect Roman to close off the anago with 10(35)10[11] or 10(35)10[11,12], but he surprises me by playing this move instead, splitting the rest of the position into two components with nim-values 30 and 00.

I think it's bad practice for a human player in a theoretically winning position to leave an anago exposed like this, since the opponent can use it to create complex positions that could otherwise be prevented from arising.

In the diagram, the nim-values ab of various components have been noted in boxes. When a box contains *n, the component is equivalent to a nim-heap of size n.


Trying to make things even more complex, I play a third anago.


Roman is now facing two exposed anagos, a situation in which he does not usually find himself.


This move looks to have been a mistake on my part. Even if the 01 component were equivalent to *0, the remaining components do not make a misère P-position. If instead the 30 component is replaced by *3, the resulting position is also not a misère P-position.

My move does have a possible virtue: there are only three replies that make the normal-play nim-values sum to *0 by changing the 30 component into a 01, namely 23(39)23, 25(39)36, and 23(39)27. Each of these are subtly different from a misère perspective, and each yields a different unusual configuration in S(23). These unusual configurations are unlikely to have been well-studied by human sprouts theorists.

Instead, though, I should have played 12(38)34 or 12(38)13, both of which better accord with the Enough Rope Principle.


Roman returns to a position in which the normal-play nim-values sum to *0. Interestingly, though, he does so by raising the nim-values of S(11) from 01 to 34.


I have set a trap. Aunt Beast still can't figure out who wins here, but she can see that Roman loses if he moves in the 51 component. It's likely that he will, since human players can't abide positions with normal-play nim-values greater than 3.

Update (March 2011): In subsequent analysis, Aunt Beast determined that this move is not just a trap — it actually wins! How strange that *2 + *2 + 51 + 34 would yield a misere P-position. If God doesn't play dice with the universe, he must not play sprouts either, because both seem equally random.

26(41)26 N

Roman moves in the 51 component, reducing it to *3. Now Aunt Beast can see that Roman has lost. Adopting a notation from combinatorial game theory, I have appended "N" to losing moves, which stands for "next player wins".

14(42)15 P

There are six winning options here: 34(42)38, 12(42)13, 14(42)14[15-16,38], 14(42)14[15-18], 14(42)14[15-17], and 14(42)15. I like to play an anago whenever possible. Aunt Beast didn't compute the exact normal-play nim-value of the anago component during the game, but in the course of writing this article, I used Glop to determine that it is at least 5.

Adopting another notation from combinatorial game theory, I have appended "P" to winning moves, which stands for "previous player wins". In this case, the previous player is the person who just played the move, that is, me.

14(43)14[16-20] N

Looping on a tail of the anago, Roman encloses a group of 5 spots.

According to the "component with firm nim-values" heuristic, this position's nim-values 33 + *3 + *3 + *3 would seem to yield a clear 00, that is, a P-position in normal and misère play. Actually, though, it's a misère N-position — a perfect example of a case in which the heuristic goes horribly wrong.

Even though it loses, this is still an excellent move: it leaves me with only one winning reply, a reply which would be impossible to find today (early 2011) without the aid of a computer.

16(44)16[43] P

This was the only winning move. The fact that this position is P cannot be predicted by a theory that looks only at the nim-values of its components.

17(45)17[18,19] N

Roman adds an additional level of nesting to the loops and pivots in S(11). This is another excellent move that leaves me only one winning reply.

In 2006, Dan Hoey pointed out that a related class of positions containing nested pivot spots “tend to have reasonably large game complexity, since Dawson's Kayles produces nim-values up to *9.” For more on this, see these messages in the “Dawson's Sprouts” thread on sprouts-theory.

18(46)19 P

Again, this was the only winning move. Now the game's outcome and its nim-values begin to correspond: the 13 component has restive nim-values, so we can guess that the position is P if and only if *3 is equal to *3 + *2 + *2, which indeed it is.

26(47)41 N

Ordinarily, when you are winning, the answer to an anago in an otherwise-empty region is to convert it to a big loop, but doing that here via 18(47)19 allows four different winning replies, while Roman's move allows only one.

21(48)21[22] P

For the third time in a row, this was the only winning move. Game trees in which two moves in a row have unique winning replies tend to be difficult for humans to analyze, so I can only imagine what a tree with three in a row is like.

This position is clearly P on the basis of its nim-values alone.

2(49)2[3] N

This is another good move; it allows me only two winning replies.

12(50@13)39 P

This was one of only two winning moves; the other was 17(50)46, which leaves S(11) with nim-values 61.

Roman usually resigns in losing positions with such few remaining live spots. Either the normal-play nim-value of 5 is clouding his vision, or he is doing an excellent job of defending a losing position, by forcing his opponent to play very accurately.

18(51)19 N

Roman plays yet another excellent move which allows me only one winning reply.

17(52)18 P

This was the only winning move.

20(53)20[45] N

Another strong move that leaves me only two winning replies.

16(54@20)44 I P

After this move, S(17) is effectively dead and Roman concedes. This was the simpler of only two winning moves; the other was 19(54@45)46, with nim-values 50.

Josh Jordan, February, 2011