from Guide to Hacking on Jan 14, 2024
How to identify a fake 2048 score
There are a large number of absurdly high scores for 2048, even in subreddits today — a decade after the game's debut. The one bastion of legitimate 2048 scores, 2048 Masters's leaderboard, unfortunately shut down in 2023, so new claims are difficult to validate.
A redditor u/s18541 even suggests these claims are made on variants of 2048 with the undo button — and possibly with the aid of algorithms. However, I did some digging and it's a bit worse in reality though: Some of these numbers are just made up.
How 2048 works
Just in case you're unfamiliar, let's very quickly go over how 2048 works.
- You can try the original game for yourself at play2048.co or even refer to the official game's source code.
- For the tests in this article, I created a Python version of the game, which you can see on Github.
Step #1. Start with a grid of 4x4 numbers, like the following.
| | | | |
| | | |4 |
|4 |2 |4 | |
|16|16|8 |4 |
Step #2. You have four options: Shift "left", "up", "right", or "down". This controls all of the tiles on the screen. For example, let's say you hit "right".
- All the tiles will shift to the farthest right they can.
- Any adjacent tiles in that direction, with the same value, will merge together to become the sum of the pair. For example, the two 16s merge to become a 32.
- Your total score increases by 32 points.
| | | | |
| | | |4 |
| |4 |2 |4 |
| |32|8 |4 |
Step #3. A new number spawns in a random empty tile. A 2 is spawned with 90% probability, and a 4 is spawned with 10% probability.
- If the board fills and there are no more options left, the game ends.
|2 | | | |
| | | |4 |
| |4 |2 |4 |
| |32|8 |4 |
The goal is to merge tiles until you reach 2048.
Use the highest possible score to discredit fakes.
There are a variety of claims on Reddit for really really high scores reaching the tens of millions: u/dth98bs's 12,668,292, u/No_Guidance_4910's 4,272,088, and u/Dependent-Walrus-826's 19,970,748. Turns out this is like claiming you scored 110% percent on the test — or in the last player's case, claiming you scored 500%. Their claim that these scores are on "unofficial versions" don't make their scores any more possible either.
A redditor u/Asusfood worked out the math a decade ago, finding that the largest possible tile is 131,972 and the highest score is 3,932,156. I'll show below that this is slightly off — just by 0.001% — but the redditor's calculations hold true for any 4x4 board, whether or not there are additions like undo buttons. This 3.93 million threshold immediately rules all of the above high scores as fake.
Let's calculate this upper bound ourselves. We'll improve on the thresholds calculated by u/Asusfood and others, by incorporating the spawn distribution over 2s and 4s.
The maximum tile value is 131,072.
Let's for now say that our 2048 game only spawns 2s. We'll now build an intuition for the maximum value the board can attain, as we increase the size of the board from a single 1x1 to 2x1 and so on and so forth.
- Take a 1x1 board. Clearly, the highest possible value is 2.
-
Take a 2x1 board. I claim the highest possible value is 4.
- According to our previous bullet point, the maximum value of either empty tile is 2.
- Those two 2s can merge to form a 4. 4 now occupies one tile, and there is one empty tile left.
- The remaining empty tile again has only a maximum value of 2, which is less than the 4 we already have. Thus, the maximum value is 4.
-
Take a 3x1 board. I claim the highest possible value is 8.
- According to our previous bullet point, any two tiles will give us 4 at most. Once that 4 is formed, we have two empty tiles again, and the maximum value of those two remaining empty tiles is again 4.
- Those two 4s can merge to form an 8. 8 now occupies one tile, so there are two empty tiles left.
- The remaining two empty tiles again has only a maximum value of 4, which is less than the 8 we already have. Thus, the maximum value is 8.
Above, we see a clear pattern: The maximum value for 1 tile is 2. The maximum value for 2 tiles is 4. The maximum value for 3 tiles is 8. More generally, the maximum value for n tiles appears to be $2^n$. Using this intuition, we can write up a more formal proof, using strong induction.
- Base case: We've proved that the maximum value for 1-tile is 2, 2-tile is 4, and 3-tile is 8.
- Induction hypothesis: Assume that for $i=1$ up to $i = n$, the maximum value for $i$ tiles is $2^i$.
-
Induction step: Take a board with $n+1$ tiles.
- We take the first $n$ tiles. According to our hypothesis, these $n$ tiles form at most a $2^n$ tile.
- Now one tile is full, and there are $n$ tiles remaining. According to our hypothesis, these $n$ tiles form at a most a $2^n$ tile.
- We can now merge both $2^n$ tiles to form an $2^{n+1}$ tile. Now, one tile is full, and there are again $n$ tiles remaining.
- We know that the remaining $n$ tiles can at most only produce a $2^n$ tile, which is less than the $2^{n+1}$ tile we have. So, the maximum value we can produce with $2^{n+1}$ tiles is a tile with value $2^{n+1}$.
The above proof thus shows that an $n$ tile produces tiles with values at most $2^n$. This means that the standard 4x4 board for 2048 would have a maximum tile value of $2^{16} = 65,536$ if only 2s were spawned.
Since 4s are also possible to spawn in 2048, we can redo this intuition assuming all spawns are 4s. The maximum value for 1 tile is 4. The maximum value for 2 tiles is 8. The maximum value for 3 tiles is 16. More generally, the maximum value for n tiles is $2^{n + 1}$. Now, we know that 2048 would have a maximum tile value of $2^{17} = 131,072$ if we spawn all 4s.
More generally, let's say we spawn a large number of 2s, and we've now filled up an N-1 tiles in an N-tile board with $2^N, 2^{N-1}, 2^{N-2},..., 2^2$. If the remaining Nth tile spawns a 4, we can merge adjacent values together, repeatedly, reaching $2^{N+1}$. In other words, you need just one final 4-spawn to reach the $2^{N+1}$ value. This is now our upper bound on tile value.
The maximum score is 3,884,503 in expectation.
In 2048, we earn points equal to the value of the newly-merged tile, when we complete a merge. Knowing this, we can calculate how many merges it would take to get to the maximum tile value.
- Say that we reached the maximum tile value of $2^{17}$. We get $2^{17}$ for this last merge.
- At some point, we reached two tiles of value $2^{16}$. We got $2 \cdot 2^{16}$ for those two merges.
- At some point, we reached four tiles of value $2^{15}$. We got $4 \cdot 2^{15}$ for those four merges.
- This continues until we reach $2^{15}$ tiles of value 4. This is $2^{15} \cdot 4$ for those merges. However, remember that our last tile spawn must be a 4. If we spawn a 4, we don't earn points for that 4. This means that we need to subtract points earned from one of the merges. This is $(2^{15} - 1)4 = 2^{15} \cdot 2^2 - 4$.
In other words, that's
$$(\sum_{i=2}^{17} 2^{17 - i} \cdot 2^i) - 4 = (\sum_{i=2}^{17} 2^{17}) - 4 = 16 \cdot 2^{17} - 4$$
To upper bound the number of points earned, this assumes that all of our spawns were 2, until the very last spawn, which was a 4. This ensures that we earned from all of the 4s, except the last. More generally, to make a tile of value $2^N$, you would generate at most $(N-1)2^N - 4$ total points.
We've now attained the maximum tile value in one of the 16 tiles. Now, we need to fill up the rest of the 15 empty tiles.
- According to Lemma #1 above, we know that the maximum tile value for the remaining 15 tiles is $2^{15+1} = 65,536$. Using Lemma #2, we know attaining this tile nets us $(16 - 1) \cdot 2^{16} - 4 = 983,400$ points. We've now filled one more tile, and there are 14 empty tiles left.
- According to Lemma #1, we know the maximum tile value for the remaining 14 tiles is $2^{14 + 1} = 32,768$. Using Lemma #2, we know this nets $(15 - 1) \cdot 2^{15} - 4 = 458,748$ points.
We can continue this for all remaining tiles, then write a general expression for the total number of points it takes to fill up the board.
$$\sum_{i=2}^{17} ((i-1) \cdot 2^i - 4) = 3,932,100$$
You'll notice this is a slightly tighter bound than redditor u/Asusfood's bound of 3,932,156. A commenter u/XBKMT corroborates my calculation above though, pointing out that u/Asusfood forgot to account for the final 4, when generating every tile.
There's still one final bit that the calculations above miss out on: Namely, 2s spawn with 90% probability and 4s spawn with 10% probability. Even though the above upper bound is indeed a valid upper bound, we can make the bound tighter by factoring in the spawn distribution. Let's see how it works.
- Say we spawn 100 numbers.
- 10 tiles are 4s, and the remaining 90 tiles are 2s.
- The 2s then all merge, given us 45 total 4s.
- As a result, we have 45 merged 4s to 10 spawned 4s.
All in all, about 45/55 = 82% of 4s are merged and thus grant points. The remaining 10/55 = 18% of 4s are spawned and thus don't grant points. Now, to make a tile of value $2^N$, we expect to earn $\frac{45}{55} 2^N$ from the 4s instead of $2^N$. This means that to make a tile of value $2^N$, you would generate at most $(N-2)2^N + 2^N\frac{45}{55} = (N-\frac{65}{55})2^N$ points. This calculation really only makes sense so long as $\frac{10}{55}2^N > 1$ or equivalently when $N > 2$.
We can now update Lemma #3 with this new expression as well. This way, the maximum number of attainable points for creating a $2^N$ tile is now the following. We split out the $i=2$ term according to the edge case above, which is $(2-1)2^2 = 4$.
$$4 + \sum_{i=3}^N (i-\frac{65}{55}) \cdot 2^i$$
Plugging in $N=17$, we get 3,884,503 points in expectation, for our new bound. Taking all of this information together, we know now the following.
- The highest score in expectation is 3,884,503, accounting for the 90% spawn rate for 2s and 10% spawn rate for 4s. It's technically possible to exceed this score.
- The highest score possible is 3,932,100. This assumes that the game spawns all 2s until the very last spawn of 4. For even the last $2^{17}$ tile, this is extremely unlikely to happen, as it'd require $2^{15}-1$ 2-spawns and just one final 4-spawn. In other words, an event that has probability 0.000000…000046% chance of happening — I didn't list them all out but there should actually be 1408 zeros after the decimal point according to Wolfram Alpha. In other words, it's very very unlikely to reach this theoretical maximum.
This gives us a sense of what's possible, in the limit. There are definitely no high scores beyond 3.93 million and unlikely to have high scores far beyond 3.88 million.
Use expected minimum and maximum scores.
What about the other high score claims? The claims that fall below the 3.93 million point maximum but still seem suspicious? Fortunately, using our expression above, we can compute the highest score you can possibly attain, per tile value. This means that once a redditor reports a score and a tile value, we can check to see if those values fall within range of each other.
Let's say that you reach the $2^{16}$ instead of the $2^{17}$ tile. At best, you achieve two $2^{16}$ tiles and fill the board with the highest possible tiles you can. We can repeat this calculation for $2^{15}$ tiles, $2^{14}$ tiles, and more. In this way, we can compute the maximum possible scores for each tile value.
Let's use $\tau(i)$ from Lemma #2, which denotes the maximum number of points you can obtain for creating a tile with value $2^i$. More generally, the formula for high score looks like this, if we assume the player reaches the $2^N$ tile at best and fills the board with the highest value tiles they possible can.
$$\sum_{i=2}^{17} \tau(\text{min}(i, N)) = \sum_{i=2}^{17} ((\text{min}(i,N)-1) \cdot 2^{\text{min}(i, N)} - 4)$$
You can write a similar expression using $\tau'(i)$ from Lemma #4, which represents the expected high score from creating the $2^i$ tile.
$$\sum_{i=2}^{17}\tau'(\text{min}(i,N)) = 4 + \sum_{i=3}^{17}((\text{min}(i,N)-\frac{65}{55})\cdot 2^{\text{min}(i,N)})$$
We can also compute the minimum score, which is the number of points you'd expect a tile of $2^N$ to garner. To obtain this threshold, we'll simply use $\tau(N)$, except assume that the player got no points from 4s. This means the minimum score is
$$(N-2)2^N$$
On the other hand, the expected minimum score is simply $\tau'(N)$ — the expected score for achieving just a single $2^N$ tile.
N | Value | Min score | Max score | Expected min score | Expected max score |
---|---|---|---|---|---|
17 | 131,072 | 1,966,080 | 3,932,100 | 2,073,320 | 3,884,503 |
16 | 65,536 | 917,504 | 2,817,988 | 971,124 | 2,782,307 |
15 | 32,768 | 425,984 | 1,769,412 | 452,794 | 1,745,646 |
14 | 16,384 | 196,608 | 1,032,132 | 210,013 | 1,017,303 |
13 | 8,192 | 90,112 | 573,380 | 96,814 | 564,508 |
12 | 4,096 | 40,960 | 307,140 | 44,311 | 301,992 |
11 | 2,048 | 18,432 | 159,684 | 20,107 | 156,771 |
10 | 1,024 | 8,192 | 80,836 | 9,029 | 79,225 |
9 | 512 | 3,584 | 39,876 | 4,002 | 39,010 |
8 | 256 | 1,536 | 19,140 | 1,745 | 18,693 |
7 | 128 | 640 | 8,900 | 744 | 8,686 |
6 | 64 | 256 | 3,972 | 308 | 3,886 |
5 | 32 | 96 | 1,668 | 122 | 1,652 |
4 | 16 | 32 | 628 | 45 | 650 |
3 | 8 | 8 | 180 | 14 | 222 |
Perfect. We now have a series of different intervals to check any claim to a high score. Although the ranges grow fairly wide, we can roughly understand if scores and their corresponding tiles fall in the right ballpark.
We can use the guidelines above to judge several high scores. It turns out that many many of the reported scores are actually near the minimum expected score for that tile. This makes sense: Players probably post to Reddit as soon as they reach the next highest tile.
- For the 8192 tile, players report scores of 114,420, 103,000, 123,964, and 108,520 — all just barely higher than the expected minimum of 96,814 that we calculated above. For comparison, the expected maximum is 564,508.
- For the 16,484 tile, players report scores of 250,000, 257,360, 281,176 — again just higher than the expected minimum of 210,013 we calculated above.
So all this to say is, a good chunk of those Reddit claims may actually be true. Just a few absolutely bonkers claims that make no sense.
Takeaways
In summary, we've computed a few quantities that can help us verify the authenticity of various high score claims on Reddit. Turns out that a number of claims are actually legitimate, all clustering around the minimum expected score for reaching a certain tile value.
- For generating a single tile of value $2^N$, the points earned for creating that tile falls somewhere between $(N-2)2^N$ at minimum and $(N-1)2^N - 4$ at maximum. In expectation, the points earned should be $(N-\frac{65}{55})2^N$ for $N > 2$ and would be $(N-1)2^N - 4$ otherwise.
- If the player fills the entire board optimally, after reaching a tile of value $2^N$, the total points earned should be $\sum_{i=2}^N ((i-1) \cdot 2^i) - 4$ at most. In expectation, that optimally filled board would earn $4 + \sum_{i=3}^N (i-\frac{65}{55}) \cdot 2^i$ points.
Using the above, we calculated that
- The highest value tile that can be obtained is 131,072.
- The highest possible score is 3,932,100. However, this score occurs with effectively 0% probability.
- In expectation, the optimal board configuration would yield a score of 3,884,503.
You can use these upper bounds to determine if a high-score claim is legitimate or not. Then, using those expressions, we computed the minimum and maximum scores for each tile value. For any claimed high scores that fall below 3.8m or 3.9m, you can then use that table above to check any high score you run across.
Want more tips? Drop your email, and I'll keep you in the loop.