from Guide to Hacking on Jan 14, 2024

How to identify a fake 2048 score

There are a large number of absurdly high scores for 2048, even in subreddits today — a decade after the game's debut. The one bastion of legitimate 2048 scores, 2048 Masters's leaderboard, unfortunately shut down in 2023, so new claims are difficult to validate.

A redditor u/s18541 even suggests these claims are made on variants of 2048 with the undo button — and possibly with the aid of algorithms. However, I did some digging and it's a bit worse in reality though: Some of these numbers are just made up.

How 2048 works

Just in case you're unfamiliar, let's very quickly go over how 2048 works.

Step #1. Start with a grid of 4x4 numbers, like the following.

|  |  |  |  |
|  |  |  |4 |
|4 |2 |4 |  |
|16|16|8 |4 |

Step #2. You have four options: Shift "left", "up", "right", or "down". This controls all of the tiles on the screen. For example, let's say you hit "right".

|  |  |  |  |
|  |  |  |4 |
|  |4 |2 |4 |
|  |32|8 |4 |

Step #3. A new number spawns in a random empty tile. A 2 is spawned with 90% probability, and a 4 is spawned with 10% probability.

|2 |  |  |  |
|  |  |  |4 |
|  |4 |2 |4 |
|  |32|8 |4 |

The goal is to merge tiles until you reach 2048.

Use the highest possible score to discredit fakes.

There are a variety of claims on Reddit for really really high scores reaching the tens of millions: u/dth98bs's 12,668,292, u/No_Guidance_4910's 4,272,088, and u/Dependent-Walrus-826's 19,970,748. Turns out this is like claiming you scored 110% percent on the test — or in the last player's case, claiming you scored 500%. Their claim that these scores are on "unofficial versions" don't make their scores any more possible either.

A redditor u/Asusfood worked out the math a decade ago, finding that the largest possible tile is 131,972 and the highest score is 3,932,156. I'll show below that this is slightly off — just by 0.001% — but the redditor's calculations hold true for any 4x4 board, whether or not there are additions like undo buttons. This 3.93 million threshold immediately rules all of the above high scores as fake.

Let's calculate this upper bound ourselves. We'll improve on the thresholds calculated by u/Asusfood and others, by incorporating the spawn distribution over 2s and 4s.

The maximum tile value is 131,072.

Let's for now say that our 2048 game only spawns 2s. We'll now build an intuition for the maximum value the board can attain, as we increase the size of the board from a single 1x1 to 2x1 and so on and so forth.

  1. Take a 1x1 board. Clearly, the highest possible value is 2.
  2. Take a 2x1 board. I claim the highest possible value is 4.

    1. According to our previous bullet point, the maximum value of either empty tile is 2.
    2. Those two 2s can merge to form a 4. 4 now occupies one tile, and there is one empty tile left.
    3. The remaining empty tile again has only a maximum value of 2, which is less than the 4 we already have. Thus, the maximum value is 4.
  3. Take a 3x1 board. I claim the highest possible value is 8.

    1. According to our previous bullet point, any two tiles will give us 4 at most. Once that 4 is formed, we have two empty tiles again, and the maximum value of those two remaining empty tiles is again 4.
    2. Those two 4s can merge to form an 8. 8 now occupies one tile, so there are two empty tiles left.
    3. The remaining two empty tiles again has only a maximum value of 4, which is less than the 8 we already have. Thus, the maximum value is 8.

Above, we see a clear pattern: The maximum value for 1 tile is 2. The maximum value for 2 tiles is 4. The maximum value for 3 tiles is 8. More generally, the maximum value for n tiles appears to be $2^n$. Using this intuition, we can write up a more formal proof, using strong induction.

  1. Base case: We've proved that the maximum value for 1-tile is 2, 2-tile is 4, and 3-tile is 8.
  2. Induction hypothesis: Assume that for $i=1$ up to $i = n$, the maximum value for $i$ tiles is $2^i$.
  3. Induction step: Take a board with $n+1$ tiles.

    1. We take the first $n$ tiles. According to our hypothesis, these $n$ tiles form at most a $2^n$ tile.
    2. Now one tile is full, and there are $n$ tiles remaining. According to our hypothesis, these $n$ tiles form at a most a $2^n$ tile.
    3. We can now merge both $2^n$ tiles to form an $2^{n+1}$ tile. Now, one tile is full, and there are again $n$ tiles remaining.
    4. We know that the remaining $n$ tiles can at most only produce a $2^n$ tile, which is less than the $2^{n+1}$ tile we have. So, the maximum value we can produce with $2^{n+1}$ tiles is a tile with value $2^{n+1}$.

The above proof thus shows that an $n$ tile produces tiles with values at most $2^n$. This means that the standard 4x4 board for 2048 would have a maximum tile value of $2^{16} = 65,536$ if only 2s were spawned.

Since 4s are also possible to spawn in 2048, we can redo this intuition assuming all spawns are 4s. The maximum value for 1 tile is 4. The maximum value for 2 tiles is 8. The maximum value for 3 tiles is 16. More generally, the maximum value for n tiles is $2^{n + 1}$. Now, we know that 2048 would have a maximum tile value of $2^{17} = 131,072$ if we spawn all 4s.

More generally, let's say we spawn a large number of 2s, and we've now filled up an N-1 tiles in an N-tile board with $2^N, 2^{N-1}, 2^{N-2},..., 2^2$. If the remaining Nth tile spawns a 4, we can merge adjacent values together, repeatedly, reaching $2^{N+1}$. In other words, you need just one final 4-spawn to reach the $2^{N+1}$ value. This is now our upper bound on tile value.

The maximum score is 3,884,503 in expectation.

In 2048, we earn points equal to the value of the newly-merged tile, when we complete a merge. Knowing this, we can calculate how many merges it would take to get to the maximum tile value.

  1. Say that we reached the maximum tile value of $2^{17}$. We get $2^{17}$ for this last merge.
  2. At some point, we reached two tiles of value $2^{16}$. We got $2 \cdot 2^{16}$ for those two merges.
  3. At some point, we reached four tiles of value $2^{15}$. We got $4 \cdot 2^{15}$ for those four merges.
  4. This continues until we reach $2^{15}$ tiles of value 4. This is $2^{15} \cdot 4$ for those merges. However, remember that our last tile spawn must be a 4. If we spawn a 4, we don't earn points for that 4. This means that we need to subtract points earned from one of the merges. This is $(2^{15} - 1)4 = 2^{15} \cdot 2^2 - 4$.

In other words, that's

$$(\sum_{i=2}^{17} 2^{17 - i} \cdot 2^i) - 4 = (\sum_{i=2}^{17} 2^{17}) - 4 = 16 \cdot 2^{17} - 4$$

To upper bound the number of points earned, this assumes that all of our spawns were 2, until the very last spawn, which was a 4. This ensures that we earned from all of the 4s, except the last. More generally, to make a tile of value $2^N$, you would generate at most $(N-1)2^N - 4$ total points.

We've now attained the maximum tile value in one of the 16 tiles. Now, we need to fill up the rest of the 15 empty tiles.

We can continue this for all remaining tiles, then write a general expression for the total number of points it takes to fill up the board.

$$\sum_{i=2}^{17} ((i-1) \cdot 2^i - 4) = 3,932,100$$

You'll notice this is a slightly tighter bound than redditor u/Asusfood's bound of 3,932,156. A commenter u/XBKMT corroborates my calculation above though, pointing out that u/Asusfood forgot to account for the final 4, when generating every tile.

There's still one final bit that the calculations above miss out on: Namely, 2s spawn with 90% probability and 4s spawn with 10% probability. Even though the above upper bound is indeed a valid upper bound, we can make the bound tighter by factoring in the spawn distribution. Let's see how it works.

  1. Say we spawn 100 numbers.
  2. 10 tiles are 4s, and the remaining 90 tiles are 2s.
  3. The 2s then all merge, given us 45 total 4s.
  4. As a result, we have 45 merged 4s to 10 spawned 4s.

All in all, about 45/55 = 82% of 4s are merged and thus grant points. The remaining 10/55 = 18% of 4s are spawned and thus don't grant points. Now, to make a tile of value $2^N$, we expect to earn $\frac{45}{55} 2^N$ from the 4s instead of $2^N$. This means that to make a tile of value $2^N$, you would generate at most $(N-2)2^N + 2^N\frac{45}{55} = (N-\frac{65}{55})2^N$ points. This calculation really only makes sense so long as $\frac{10}{55}2^N > 1$ or equivalently when $N > 2$.

We can now update Lemma #3 with this new expression as well. This way, the maximum number of attainable points for creating a $2^N$ tile is now the following. We split out the $i=2$ term according to the edge case above, which is $(2-1)2^2 = 4$.

$$4 + \sum_{i=3}^N (i-\frac{65}{55}) \cdot 2^i$$

Plugging in $N=17$, we get 3,884,503 points in expectation, for our new bound. Taking all of this information together, we know now the following.

This gives us a sense of what's possible, in the limit. There are definitely no high scores beyond 3.93 million and unlikely to have high scores far beyond 3.88 million.

Use expected minimum and maximum scores.

What about the other high score claims? The claims that fall below the 3.93 million point maximum but still seem suspicious? Fortunately, using our expression above, we can compute the highest score you can possibly attain, per tile value. This means that once a redditor reports a score and a tile value, we can check to see if those values fall within range of each other.

Let's say that you reach the $2^{16}$ instead of the $2^{17}$ tile. At best, you achieve two $2^{16}$ tiles and fill the board with the highest possible tiles you can. We can repeat this calculation for $2^{15}$ tiles, $2^{14}$ tiles, and more. In this way, we can compute the maximum possible scores for each tile value.

Let's use $\tau(i)$ from Lemma #2, which denotes the maximum number of points you can obtain for creating a tile with value $2^i$. More generally, the formula for high score looks like this, if we assume the player reaches the $2^N$ tile at best and fills the board with the highest value tiles they possible can.

$$\sum_{i=2}^{17} \tau(\text{min}(i, N)) = \sum_{i=2}^{17} ((\text{min}(i,N)-1) \cdot 2^{\text{min}(i, N)} - 4)$$

You can write a similar expression using $\tau'(i)$ from Lemma #4, which represents the expected high score from creating the $2^i$ tile.

$$\sum_{i=2}^{17}\tau'(\text{min}(i,N)) = 4 + \sum_{i=3}^{17}((\text{min}(i,N)-\frac{65}{55})\cdot 2^{\text{min}(i,N)})$$

We can also compute the minimum score, which is the number of points you'd expect a tile of $2^N$ to garner. To obtain this threshold, we'll simply use $\tau(N)$, except assume that the player got no points from 4s. This means the minimum score is

$$(N-2)2^N$$

On the other hand, the expected minimum score is simply $\tau'(N)$ — the expected score for achieving just a single $2^N$ tile.

N Value Min score Max score Expected min score Expected max score
17 131,072 1,966,080 3,932,100 2,073,320 3,884,503
16 65,536 917,504 2,817,988 971,124 2,782,307
15 32,768 425,984 1,769,412 452,794 1,745,646
14 16,384 196,608 1,032,132 210,013 1,017,303
13 8,192 90,112 573,380 96,814 564,508
12 4,096 40,960 307,140 44,311 301,992
11 2,048 18,432 159,684 20,107 156,771
10 1,024 8,192 80,836 9,029 79,225
9 512 3,584 39,876 4,002 39,010
8 256 1,536 19,140 1,745 18,693
7 128 640 8,900 744 8,686
6 64 256 3,972 308 3,886
5 32 96 1,668 122 1,652
4 16 32 628 45 650
3 8 8 180 14 222

Perfect. We now have a series of different intervals to check any claim to a high score. Although the ranges grow fairly wide, we can roughly understand if scores and their corresponding tiles fall in the right ballpark.

We can use the guidelines above to judge several high scores. It turns out that many many of the reported scores are actually near the minimum expected score for that tile. This makes sense: Players probably post to Reddit as soon as they reach the next highest tile.

So all this to say is, a good chunk of those Reddit claims may actually be true. Just a few absolutely bonkers claims that make no sense.

Takeaways

In summary, we've computed a few quantities that can help us verify the authenticity of various high score claims on Reddit. Turns out that a number of claims are actually legitimate, all clustering around the minimum expected score for reaching a certain tile value.

Using the above, we calculated that

  1. The highest value tile that can be obtained is 131,072.
  2. The highest possible score is 3,932,100. However, this score occurs with effectively 0% probability.
  3. In expectation, the optimal board configuration would yield a score of 3,884,503.

You can use these upper bounds to determine if a high-score claim is legitimate or not. Then, using those expressions, we computed the minimum and maximum scores for each tile value. For any claimed high scores that fall below 3.8m or 3.9m, you can then use that table above to check any high score you run across.


back to Guide to Hacking