posted on Mar 23, 2024
Compare dishes, instead of scoring restaurants
A food app called Beli recently caught my attention. The app takes an interesting approach: rather than ask you to rate a restaurant directly, it asks you to compare restaurants with each other. From these comparisons, Beli then computes a numerical score for each restaurant.
Absolute scores are difficult to assign
Beli's comparison mechanism is interesting, because it solves a problem shared by every 5-star rating system: it's difficult for us as consumers to rate a restaurant with an absolute score. If I felt the restaurant was average, is that 3 or 4 stars? Maybe I should assign 2 stars, so I have "room" to differentiate between better restaurants.
However, Beli's insight is that it's easy to compare restaurants, even if it's hard to score them. Interestingly, Beli then uses these comparisons to build absolute scores, which can now be calibrated however Beli wants. For example, so long as the ordering of restaurants agrees with your comparisons, Beli can choose 9.0 or 5.0 as the median score. In this way, the platform helps you assign absolute scores. However, these absolute scores are still fallible.
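I don't know how Beli actually computes its scores, so the following is only a minimal sketch of the idea. It ranks restaurants by win rate over a handful of made-up comparisons, then spreads scores linearly so that the middle of the ranking lands on whatever median you pick. The function names, the win-rate ranking, and the linear spread are all my own assumptions, not Beli's method.

```python
from collections import defaultdict


def rank_from_comparisons(comparisons):
    """Order restaurants best-first by win rate.

    `comparisons` is a list of (winner, loser) pairs. Win rate is a
    deliberately simple stand-in for whatever a real app would use.
    """
    wins = defaultdict(int)
    games = defaultdict(int)
    for winner, loser in comparisons:
        wins[winner] += 1
        games[winner] += 1
        games[loser] += 1
    # Highest win rate first; ties broken alphabetically for determinism.
    return sorted(games, key=lambda r: (-wins[r] / games[r], r))


def calibrated_scores(ranking, median_score=5.0, top_score=10.0):
    """Map ranks to scores so the middle rank receives `median_score`."""
    n = len(ranking)
    if n == 1:
        return {ranking[0]: median_score}
    middle = (n - 1) / 2  # rank position of the median
    step = (top_score - median_score) / middle
    return {r: round(top_score - i * step, 2) for i, r in enumerate(ranking)}


# Hypothetical comparisons, written as (preferred, less preferred).
my_comparisons = [
    ("Daeho", "Maneki"),
    ("Maneki", "Wild Ginger"),
    ("Daeho", "Wild Ginger"),
]
ranking = rank_from_comparisons(my_comparisons)
print(calibrated_scores(ranking, median_score=5.0))
print(calibrated_scores(ranking, median_score=9.0))
```

Both calls keep the same ordering of restaurants. Only the numbers attached to that ordering change, which is exactly why the resulting scores still need interpreting.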
Absolute scores are difficult to understand
Beli only solves half the problem. The other half remains: understanding absolute scores is difficult.
Issue #1. Absolute scores need calibration. The issue with absolute scores is we don't know what they mean. In particular, we don't know what score corresponds to "good" and what corresponds to "not good". After some calibrating, by just seeing scores and evaluating them yourself, you'll end up with a calibration like the following:
- On Yelp, 4 stars is average, and 4.5 stars is "good", depending on your tastes.
- On Tabelog, Japan's version of Yelp, 3 stars is average, and 4 stars is exceptional.
- On Google movie reviews, 90%+ audience score is a good movie. 95%+ is exceptional.
  - Any movie between 80% and 90% is okay: it may have some redeeming quality, such as visuals or cool action shots (e.g., Pacific Rim @ 80%) but is lacking in storyline.
  - Anything less than 80% is not worth watching (e.g., Little Mermaid @ 54%)
There are of course workarounds to this issue of calibration. To encourage a particular distribution of ratings, some interfaces will include a visual interpretation of the score. For example, Rotten Tomatoes includes a tomato of varying ripeness, which you've no doubt seen.
Issue #2. Calibrations differ between people. Above, I listed calibrations in general. However, these generalizations miss a critical part of ratings: namely, different people will naturally provide different ratings. To some degree, this is "personalization" at varying levels of granularity:
- Personalization can be at the national level. According to the above, Japanese culture dictates 3 out of 5 is average, but American culture dictates 4 out of 5 is average.
- Personalization can also be per culture. For example, I find that the quality of Chinese restaurants on Yelp is generally inversely correlated with their ratings. The dominant audience on Yelp may just (a) prefer food from other cultures or (b) have a completely different rating calibration.
- The most common manifestation of personalization is of course at the individual level. For example, Google recently introduced a recommendation percentage, based on your particular food preferences. Funnily enough, this percentage now also needs calibration. As far as I can tell, this recommendation operates at the culture level (i.e., it recommends every Chinese restaurant across the nation) and does not correspond to my taste at all.
Amazon does a better job in this regard, including products similar to ones you've purchased before — or products similar to ones you've browsed through before. In fact, they even include a large comparison table across similar products; granted, the tables contain just product information instead of crowd-sourced assessments, but it's headed in the right direction.
Idea. Use comparisons for both rating and displaying. One idea is to extend Beli, so that comparisons are used for both rating and displaying scores for a restaurant. For example, you could see restaurant ratings relative to Daeho, using the comparisons that other users submitted. By sticking with comparisons, you jointly address both the need to calibrate (i.e., comparisons are already against baselines you understand) and the need to personalize (i.e., the baselines are restaurants you already know).
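As a rough sketch of what "ratings relative to Daeho" could look like, the snippet below tallies how often other users preferred each restaurant over a baseline you pick. The data is made up, and the function name and the "preferred by X of Y" display are my own assumptions rather than anything Beli offers.

```python
def relative_to_baseline(comparisons, baseline):
    """Tally how each restaurant fared in direct comparisons with `baseline`.

    `comparisons` is a list of (winner, loser) pairs from other users.
    Returns {restaurant: (times preferred over baseline, total comparisons)}.
    Pairs that do not involve the baseline are ignored.
    """
    tallies = {}
    for winner, loser in comparisons:
        if winner == loser or baseline not in (winner, loser):
            continue
        other = loser if winner == baseline else winner
        wins, total = tallies.get(other, (0, 0))
        tallies[other] = (wins + int(winner == other), total + 1)
    return tallies


# Hypothetical crowd-sourced comparisons, written as (preferred, less preferred).
crowd = [
    ("Maneki", "Daeho"),
    ("Daeho", "Maneki"),
    ("Maneki", "Daeho"),
    ("Daeho", "Wild Ginger"),
    ("Maneki", "Wild Ginger"),  # ignored: does not involve the baseline
]
for name, (wins, total) in relative_to_baseline(crowd, "Daeho").items():
    print(f"{name}: preferred over Daeho by {wins} of {total} raters")
```

A display like this needs no calibration: if you already know how much you like Daeho, "preferred by 2 of 3" tells you something concrete.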
Rate dishes, not restaurants
There's also an issue with granularity:
- Most of the time, I enjoy specific dishes at each restaurant. Not all of the dishes at Restaurant A are better or worse than all of the dishes at Restaurant B.
- Additionally, I don't pick where to eat based on the restaurant name — despite that being the most prominent feature of any Yelp search page. I pick based on pictures of dishes.
Granted, there are good reasons to introduce comparisons at a restaurant level: No one has time to rank and compare individual dishes, and there are restaurant-level aspects to rate, such as service and ambiance.
However, provided with a base set of favorites, it may become much easier to quickly compare your favorite dishes across restaurants — instead of exhaustively comparing every dish. In this spirit, here are my absolute favorite dishes across locations and restaurants.
- Below, I list the location as "Multiple Locations" only if I've eaten at two or more of the locations and found them to be highly consistent.
- I've omitted several of my favorite dishes, simply because you almost can't go wrong anywhere you go: Vietnamese banh mi, any Thai dish.
- I've also omitted a few other dishes that could go wrong. However, beyond "good" and "not good", I haven't found exceptionally different variants: chicken fried rice.
Country | City | Dish | Restaurant | Comments | Cost |
---|---|---|---|---|---|
US | Seattle | Lasagna Pink Door | The Pink Door | creamy, no marinara | $25 |
US | Seattle | Fragrant Duck | Wild Ginger | includes steamed buns, $26 for half duck | $48 |
US | Seattle | Unagi Kama Meshi | Maneki | steamed eel over rice | $30 |
US | Seattle | Beef Short Ribs Soup | Seoul Tofu & Jjim | kalbitang, massive portion | $28 |
US | Seattle | Pork Ribs Soup | Biang Biang Noodles | any hand-pulled noodles dish | $19 |
US | Seattle | Sliced Fish with Tofu Pudding in Hot Sauce | Chengdu Taste | chili oil fish, fairly spicy | $18 |
US | San Francisco | Grilled Pork Belly Skewers | Taniku Izakaya | melts in your mouth | $8 |
US | San Francisco | Juicy Pork Bao | Dumpling Home | best Americanized pan-fried pork buns | $14 |
US | San Francisco | Hakata Tonkotsu DX | Marufuku Ramen | tried all the ramen in Japantown, pork belly melts | $21 |
US | Cupertino | The Best Char Siu | Koi Palace Contempo | very tender, soft pork | $22 |
US | Multiple Locations | Xiao Long Bao | Din Tai Fung | most authentic soup dumplings in the US | $16 |
US | Multiple Locations | Kalbijjim | Daeho | braised short rib, must add cheese, prices increase regularly | $81 |
US | Austin | Sliced Brisket | Terry Black's | very sauce-y, heavy, but delicious; pork ribs also good | $35 / lb |
Taiwan | Taipei | Fried Chicken | Haodada Jipai | massive, medium batter | $3 |
Taiwan | Taipei | Minced Pork | Wang's Broth | best with braised tofu and soup | $2 |
Taiwan | Taipei | Gua Bao | (near NTU) | "burger" (open-faced bun), pork belly melts, peanut powder | $2 |
If you've tried any of these and have related or better recommendations, reach out to me on Twitter!
Takeaways
In short, as a user, I find several issues with contributing to and using food recommendation apps, and these are the takeaways.
- Compare restaurants, instead of assigning absolute scores, as Beli does.
- Provide some way to calibrate absolute scores — an intuitive interpretation of what "good" or "bad" is, like Rotten Tomatoes does with tomato icons.
- "Personalize" comparisons by using baselines you have compared before, as Amazon does.
- Compare dishes instead of restaurants, since (a) your favorites are served by different restaurants and (b) different dishes at one restaurant may not all be your favorite.
- To lessen effort, prioritize dish comparisons for your favorite dishes.
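To make that last point concrete, here is a small sketch comparing the number of pairs you'd face if you compared every dish against every other, versus only pairs that involve one of your favorites. The dish list, the choice of favorites, and the function names are made up for illustration.

```python
from itertools import combinations


def exhaustive_pairs(dishes):
    """Every possible pair of dishes."""
    return list(combinations(dishes, 2))


def favorite_pairs(dishes, favorites):
    """Only the pairs in which at least one dish is a favorite."""
    favs = set(favorites)
    return [(a, b) for a, b in combinations(dishes, 2) if a in favs or b in favs]


# Hypothetical example: five dishes, two of which are favorites.
dishes = ["Kalbijjim", "Xiao Long Bao", "Gua Bao", "Fried Chicken", "Minced Pork"]
favorites = ["Kalbijjim", "Xiao Long Bao"]

print(len(exhaustive_pairs(dishes)))           # 10
print(len(favorite_pairs(dishes, favorites)))  # 7
```

The gap widens quickly: exhaustive comparisons grow quadratically with the number of dishes, while favorite-anchored comparisons grow roughly linearly for a fixed set of favorites.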
Even without a food recommendation app that does all of the above, you can start to curate your own list of favorites, akin to my table above. Then, crowd-source recommendations, using your favorites table as a reference, among friends and family.