The Leopard’s Palate: Hunting the World’s Best Cuisines

Author

Proud Leopard
Neha Arora (na458), Emmanual Dodoo (end25), Caroline Chan (cic37)

Introduction

Our analysis draws on a dataset of recipes scraped from https://allrecipes.com/ and released as part of the TidyTuesday project for the week of 2025-09-16. The release contains two complementary tables: all_recipes.csv with 14,426 general recipes, and cuisines.csv with 2,218 recipes categorized by country of origin. Both tables include comprehensive recipe information such as the recipe title, author, and URL; a free‑text ingredients list; basic nutritional facts (calories, fat, carbohydrates, protein per serving); preparation and cooking times; total time; and number of servings. They also record user‑generated data such as average rating, number of ratings, and number of written reviews.

For this project, we focus on the cuisines.csv table. Because each recipe in this table is linked to a country or region through the country variable, we treat this field as a proxy for “cuisine” and use it to compare patterns across cuisines. This structure allows us to examine our two main research questions about (1) how macronutrient profiles differ between higher‑ and lower‑rated cuisines and (2) how the presence and average ratings of four focal cuisines (Chinese, Italian, Japanese, and Indian) change over time.

Question 1: How do macronutrients (protein, carbs, fats) compare across the highest and lowest rated cuisines?

Introduction

Understanding how macronutrient composition varies across cuisines could provide broader insights into food preferences and consumer ratings. In this analysis, we investigate how the proportions of calories derived from fat, protein, and carbohydrates differ between the five highest-rated and five lowest-rated cuisines in our dataset. To answer this question, we used the following variables from our dataset:

  • “calories” (integer): “Calories per serving”
  • “fat” (integer): “Fat per serving”
  • “protein” (integer): “Protein per serving”
  • “carbs” (integer): “Carbohydrates per serving”
  • “country” (character): “The country/region the cuisine is from.”

Such information could also be helpful in making more conscious eating decisions and gaining an understanding of how macronutrients influence taste preferences.

Approach

We select a subset of recipes from the highest- and lowest-rated cuisines and compare their macronutrient composition. First, we group the dataset by country and compute the average rating for each cuisine using the avg_rating column. This value is stored in a new column called cuisine_avg_rating. The cuisines are then sorted by this value, and we select the five highest- and five lowest-rated cuisines. Next, we return to the recipe-level dataset and create a new column, top_vs_bottom, to indicate whether each recipe belongs to a top- or bottom-rated cuisine. Recipes from other cuisines are removed, so the analysis focuses only on these groups.

For each remaining recipe, we calculate macronutrient proportions based on calories rather than grams, since fat, protein, and carbohydrates contribute different amounts of energy per gram. Fat provides 9 calories per gram, while protein and carbohydrates provide 4 calories per gram. We therefore convert grams to caloric contribution and divide by the total calories to compute the proportion of calories from each macronutrient. This allows for consistent comparisons across recipes with different calorie totals. We then create a summarized dataset by grouping recipes by cuisine and rating group and calculating the average caloric share of fat, protein, and carbohydrates across all recipes within each cuisine. These averages represent the typical macronutrient composition of each cuisine.
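As a small illustration of the grams-to-calorie-share conversion described above, the sketch below works through one made-up recipe in base R. The helper name macro_shares and the recipe values are hypothetical and are not part of our analysis code; only the 9 and 4 calories-per-gram factors come from the approach itself.

```r
# Calories per gram for each macronutrient (values stated in the approach).
kcal_per_gram <- c(fat = 9, protein = 4, carbs = 4)

# Hypothetical helper: share of a recipe's calories from each macronutrient.
macro_shares <- function(fat_g, protein_g, carbs_g, calories) {
  grams <- c(fat = fat_g, protein = protein_g, carbs = carbs_g)
  grams * kcal_per_gram / calories
}

# Made-up recipe: 20 g fat, 30 g protein, 50 g carbs, 500 calories per serving.
shares <- macro_shares(fat_g = 20, protein_g = 30, carbs_g = 50, calories = 500)
print(shares)
```

In this example the three shares add up to 1 because the listed calories match the gram values exactly. When a recipe's listed calorie total differs from what its gram values imply, the shares need not sum to 1, and an individual share can exceed 1.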

Visualization 1: Stacked Bar Chart

The stacked bar chart shows the average macronutrient composition of each cuisine. The bars are normalized so that each cuisine sums to 100% of total calories, allowing us to clearly compare how calories are distributed among fat, protein, and carbohydrates across cuisines with different total calorie levels. Each segment of the bar represents the average share of calories contributed by a specific macronutrient (mapped by color), which makes it easy to see which macronutrients dominate within a cuisine’s recipes. Faceting the chart by whether cuisines belong to the top- or bottom-rated group helps separate the two sets of cuisines and allows for a clearer visual comparison between them. This makes it easier to identify patterns, such as whether higher-rated cuisines tend to have a different macronutrient balance than lower-rated cuisines.

Visualization 2: Box Plot

The box plot shows the distribution of macronutrient proportions across individual recipes within each cuisine. While the stacked bar chart shows the average composition, the box plot reveals the variation within cuisines by showing the median, interquartile range, and potential outliers for each macronutrient. This allows us to see whether certain cuisines consistently have higher or lower proportions of fat, protein, or carbohydrates, or whether there is substantial variability across recipes. The box plot is useful because averages can hide differences in the data, but the distribution helps reveal how consistent or varied the macronutrient composition is within each cuisine. This plot is faceted by macronutrient, and color is mapped to the cuisine rating category.

Analysis

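# Average rating per cuisine, sorted from lowest to highest
cusine_avg_rating <- cuisines_df |>
  group_by(country) |>
  summarise(
    cuisine_avg_rating = mean(avg_rating, na.rm = TRUE)
  ) |>
  arrange(cuisine_avg_rating)

# Five lowest-rated cuisines (first rows after the ascending sort)
cusine_avg_rating_bottom <- cusine_avg_rating |>
  slice_head(n = 5)

# Five highest-rated cuisines (last rows after the ascending sort)
cusine_avg_rating_top <- cusine_avg_rating |>
  slice_tail(n = 5)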

macro <- cuisines_df |>
  mutate(
    top_vs_bottom = case_when(
      country %in% cusine_avg_rating_top$country ~ "Top Rated",
      country %in% cusine_avg_rating_bottom$country ~ "Bottom Rated",
      .default = NA
    )
  ) |>
  drop_na(top_vs_bottom) |>
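  # Share of calories from each macronutrient (9 kcal/g for fat, 4 kcal/g for protein and carbs)
  mutate(
    fat_pct = fat * 9 / calories,
    protein_pct = protein * 4 / calories,
    carbs_pct = carbs * 4 / calories,
  )

# Average caloric share per cuisine, reshaped to long format for plotting
macro_plot_stacked <- macro |>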
  group_by(country, top_vs_bottom) |>
  summarise(
    fat = mean(fat_pct, na.rm = TRUE),
    protein = mean(protein_pct, na.rm = TRUE),
    carbs = mean(carbs_pct, na.rm = TRUE),
    .groups = "drop"
  ) |>
  pivot_longer(
    cols = c(fat, protein, carbs),
    names_to = "macro",
    values_to = "avg_proportion"
  ) |>
  mutate(
    top_vs_bottom = factor(
      top_vs_bottom,
      levels = c("Top Rated", "Bottom Rated")
    ),
    country = factor(
      country,
      levels = c(
        cusine_avg_rating_top$country,
        cusine_avg_rating_bottom$country
      )
    ),
    macro = factor(macro, levels = c("carbs", "fat", "protein"))
  )


ggplot(
  macro_plot_stacked,
  aes(y = country, x = avg_proportion, fill = macro)
) +
  geom_col(position = "fill") +
  facet_wrap(~top_vs_bottom, ncol = 1, scales = "free_y") +
  scale_x_continuous(labels = scales::percent_format(accuracy = 1)) +
  scale_y_discrete(labels = label_wrap) +
  theme_leopard() +
  theme_sub_legend(
    key.size = unit(14, "pt"),
  ) +
  labs(
    title = "Macronutrient proportions across cuisine recipes",
    subtitle = "Focus on the top 5 and bottom 5 user-rated cuisines",
    x = NULL,
    y = NULL,
    fill = NULL,
    caption = "Source: Allrecipes"
  ) +
  guides(
    fill = guide_legend(reverse = TRUE)
  )

# Recipe-level shares in long format for the box plot
macro_plot2 <- macro |>
  pivot_longer(
    cols = c(fat_pct, protein_pct, carbs_pct),
    names_to = "macro",
    values_to = "proportion"
  ) |>
  mutate(
    top_vs_bottom = factor(
      top_vs_bottom,
      levels = c("Top Rated", "Bottom Rated")
    ),
    country = factor(
      country,
      levels = c(
        cusine_avg_rating_top$country,
        cusine_avg_rating_bottom$country
      )
    )
  ) |>
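  # Keep only shares of at most 100%; larger values would mean a macronutrient
  # supplies more calories than the recipe's listed total
  filter(
    proportion <= 1.0
  )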

ggplot(
  macro_plot2,
  aes(x = country, y = proportion, fill = top_vs_bottom)
) +
  geom_boxplot() +
  facet_wrap(
    ~macro,
    ncol = 1,
    labeller = labeller(
      macro = c(
        protein_pct = "Protein proportions",
        carbs_pct = "Carbs proportions",
        fat_pct = "Fat proportions"
      )
    )
  ) +
  scale_x_discrete(labels = \(x) str_wrap(x, 8)) +
  scale_y_continuous(labels = scales::percent_format(accuracy = 1)) +
  theme_leopard() +
  labs(
    title = "Macronutrient proportions across cuisine recipes",
    subtitle = "Distribution of macronutrients in recipes from the 5 highest- and lowest-rated cuisines",
    x = NULL,
    y = NULL,
    fill = NULL,
    caption = "Source: Allrecipes"
  )

Discussion

The analysis points to a positive relationship between fat content and ratings. On average, the higher-rated cuisines had higher fat proportions than the lower-rated ones. This finding aligns with research from Purdue University and a review hosted by the National Library of Medicine, both of which identify fat as a primary taste trigger in food. Beyond elevating perceived taste intensity, fats play a multifaceted role in the culinary experience. They carry fat-soluble flavor compounds, contribute to mouthfeel and texture, and promote satiety by slowing gastric emptying. The convergence of these sensory and physiological effects likely explains why users responded more favorably to fat-rich recipes, creating a meaningful distinction between higher- and lower-rated cuisines.

At the other end of the macronutrient spectrum, the data show consistently low protein proportions across recipes relative to both carbohydrates and fats. This pattern is not surprising given the nature of recipe datasets: traditional and home-style recipes are typically built around carbohydrate staples like grains, legumes, and root vegetables, with fats serving as the primary flavor vehicle. Protein sources, where present, often appear as secondary or supporting ingredients rather than the nutritional centerpiece. This structural bias in how recipes are composed likely accounts for the low protein figures observed across the board.

It is worth noting, however, that the trends identified here should be interpreted with some caution. The comparisons are drawn from the five highest- and five lowest-rated cuisines, which represent the extremes of the distribution. The apparent differences in macronutrient composition between high- and low-rated cuisines may therefore appear more pronounced than they would across the full dataset, where the gradient between rating tiers is likely more gradual. A complete analysis spanning all rating levels would offer a more nuanced and accurate picture of these relationships.

Question 2: How has the representation of Chinese, Italian, Japanese, and Indian food changed over time, and how have their ratings evolved during this period?

Introduction

Food trends on social media and recipe platforms change quickly, and a small set of globally familiar cuisines often sits at the center of that attention. In this analysis, we examine how the presence of Chinese, Italian, Japanese, and Indian cuisines in our dataset evolves over time and how their average ratings change during this period. We focus on these cuisines because they are widely recognized in global food culture and are consistently represented in our data.

We are interested in this question because tracking both recipe counts and ratings can reveal patterns in shifting tastes and preferences, showing not only which cuisines are more visible over time but also how users respond to them. In turn, these patterns reflect broader cultural and social influences, such as identity, community, and changing food habits, as well as the way trends in social media and global food culture shape what people cook, share, and enjoy.

To do this, we use the following three variables:

  • “date_published” (date): “When the recipe was published/updated”
  • “country” (character): “The country/region the cuisine is from.”
  • “avg_rating” (double): “Average rating out of 5 stars”

Approach

To prepare the data for analysis, we began by filtering the dataset to include only Chinese, Indian, Italian, and Japanese cuisines. We then converted the date_published variable into a proper date format and extracted the year from it, allowing us to analyze trends on a yearly basis. Next, we grouped the data by year and cuisine to calculate two key values: the number of recipes for each cuisine in each year (n) and the average rating (mean_rating). To measure how visible each cuisine is relative to the others in a given year, we also calculated the proportion of recipes (prop) that each cuisine represents out of the total number of recipes that year. 
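The yearly count, mean rating, and share described above can be sketched on a few made-up rows. This base R version is only an illustration under assumed data: the toy data frame and the name summary_df are hypothetical, and our actual analysis uses dplyr on the full cuisines table.

```r
# Made-up recipes: one row per recipe, already reduced to year and cuisine.
toy <- data.frame(
  year = c(2022, 2022, 2022, 2022, 2023, 2023),
  country = c("Indian", "Indian", "Italian", "Chinese", "Italian", "Japanese"),
  avg_rating = c(4.5, 4.7, 4.2, 4.0, 4.6, 4.4)
)

# Mean rating per year and cuisine.
summary_df <- aggregate(avg_rating ~ year + country, data = toy, FUN = mean)
names(summary_df)[3] <- "mean_rating"

# Number of recipes per year and cuisine (same grouping, so rows align).
counts <- aggregate(avg_rating ~ year + country, data = toy, FUN = length)
summary_df$n <- counts$avg_rating

# Share of that year's recipes (among the selected cuisines) for each cuisine.
summary_df$prop <- summary_df$n / ave(summary_df$n, summary_df$year, FUN = sum)

print(summary_df)
```

Because prop is computed within each year, the shares of the selected cuisines sum to 1 in every year, which is what lets a year with many recipes be compared against a year with few.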

Visualization 1: Bubble chart layered on top of a line graph

The line graph shows how the average rating of each cuisine changes over time, with the x-axis representing the year and the y-axis representing the mean rating. Each cuisine is mapped to a different color, making it easy to compare trends. On top of the lines, we add a bubble chart using geom_point(). The size of each bubble represents the proportion of recipes (prop) for that cuisine in that year. This allows the plot to communicate two things at once: how ratings evolve over time and how prominent each cuisine is in the dataset during that period. Larger bubbles indicate years where that cuisine makes up a larger share of the recipes.

Visualization 2: Faceted radial (polar) chart

In this plot, each panel represents a different year, allowing us to compare cuisines across time. Each slice corresponds to a different cuisine and is differentiated by color. The angle of each slice represents the proportion of recipes (prop), showing how much of the dataset each cuisine occupies in that year. The radial distance from the center represents the average rating, which is shifted so that a rating of 4 acts as the baseline. This makes differences in ratings easier to see, since most recipe ratings fall within a narrow range between 4 and 5 stars. Using a radial chart combines representation and ratings into a single visual, and faceting by year makes it easier to observe how the balance between cuisines changes over time.
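To make the slice geometry concrete, the sketch below computes the angular position and shifted rating for a single made-up year in base R. The values in one_year are hypothetical; the column names mirror those used in our analysis code.

```r
# Made-up shares and mean ratings for three cuisines in one year.
one_year <- data.frame(
  country = c("Chinese", "Indian", "Italian"),
  prop = c(0.2, 0.3, 0.5),
  mean_rating = c(4.4, 4.6, 4.8)
)

# Each slice ends at the running total of shares and starts where the
# previous slice ended (the first slice starts at 0).
one_year$prop_cumulative <- cumsum(one_year$prop)
one_year$prop_prev <- c(0, head(one_year$prop_cumulative, -1))

# Bars are centred on the midpoint of their slice.
one_year$angle_mid <- (one_year$prop_prev + one_year$prop_cumulative) / 2

# Shift ratings so that 4 stars sits at the centre of the chart.
one_year$shift_rating <- one_year$mean_rating - 4

print(one_year)
```

With these numbers the slices are centred at 0.1, 0.35, and 0.75 of the way around the circle. A cuisine whose mean rating fell below 4 in some year would get a negative shifted rating, so the baseline of 4 relies on ratings staying in the 4 to 5 range.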

Analysis

selected_cuisines <- c("Chinese", "Indian", "Italian", "Japanese")

cuisines_filtered <- cuisines_df |>
  filter(country %in% selected_cuisines) |>
  mutate(
    year = year(date_published)
  )

annual_counts <- cuisines_filtered |>
  count(country, year)

annual_summary <- cuisines_filtered |>
  group_by(year, country) |>
  summarise(
    n = n(),
    mean_rating = mean(avg_rating, na.rm = TRUE),
    .groups = "drop"
  ) |>
  group_by(year) |>
  mutate(
    year_total = sum(n),
    prop = n / year_total,
  )

ggplot(
  annual_summary |>
    ungroup() |>
    mutate(
      country = fct_reorder2(
        .f = country,
        .x = year,
        .y = mean_rating
      )
    ),
  aes(x = year, y = mean_rating, color = country)
) +
  geom_point(alpha = 0.7, aes(size = prop)) +
  geom_line() +
  scale_size(range = c(4, 13)) +
  theme_leopard() +
  theme_sub_legend(
    position = "right",
  ) +
  guides(
    size = "none",
  ) +
  labs(
    title = "Average Recipe Ratings by Cuisine Over Time",
    subtitle = "Point size reflects each cuisine’s relative share of recipes",
    x = "Year",
    y = "Average Rating",
    color = "Cuisine",
    caption = "Source: Allrecipes"
  )

radial <- annual_summary |>
  arrange(year, country) |>
  group_by(year) |>
  mutate(
    prop_cumulative = cumsum(prop),
    prop_prev = lag(prop_cumulative, default = 0),
    angle_mid = (prop_prev + prop_cumulative) / 2,
    # shift ratings so 4 becomes the center baseline
    shift_rating = mean_rating - 4
  ) |>
  ungroup()

ggplot(
  radial,
  aes(
    # position around circle
    x = angle_mid,
    # radial length (rating)
    y = shift_rating,
    fill = country
  )
) +
  geom_col(
    aes(width = prop),
    color = "white"
  ) +
  coord_polar(theta = "x", start = 0) +
  facet_wrap(~year) +
  scale_y_continuous(
    limits = c(0, 1),
    breaks = c(0, 0.25, 0.5, 0.75, 1),
    labels = c(4, 4.25, 4.5, 4.75, 5)
  ) +
  scale_fill_manual(
    values = c(
      Indian = "#E38D3E",
      Italian = "#6E8F6A",
      Japanese = "#C96A78",
      Chinese = "#8A6A57"
    )
  ) +
  theme_leopard() +
  theme_sub_axis_y(
    text = element_text(colour = "#333333"),
    ticks = element_line(colour = "#333333"),
  ) +
  theme_sub_axis_x(
    text = element_text(colour = "#333333"),
    ticks = element_line(colour = "#333333"),
  ) +
  theme_sub_panel(
    grid.major = element_line(colour = "#d7d5d5ff"),
    grid.minor = element_line(colour = "#d7d5d5ff"),
  ) +
  labs(
    title = "Cuisine Representation and Average Ratings Over Time",
    x = NULL,
    y = NULL,
    fill = NULL,
    caption = "Source: Allrecipes"
  )

Discussion

Across the observed period, the distribution of recipes among the four selected cuisines shifted considerably. Indian cuisine saw the most dramatic change, shrinking from approximately 35% of the group’s recipes in 2022 to just 11% by 2025. This does not necessarily mean fewer Indian recipes were published overall, but rather that its share within this cuisine group declined relative to the others, particularly Italian, which expanded its proportion the most and emerged as the dominant cuisine by 2025. The growth of the other cuisines should similarly be understood in relative terms: they gained ground within this selection, not necessarily on the platform as a whole.

A more nuanced pattern emerges when representation is read alongside ratings. Despite Indian cuisine’s shrinking share of published recipes per year, its mean ratings trended upward, a trajectory shared by all four cuisines. This suggests that user appreciation improved broadly regardless of how much each cuisine was being published.

Italian cuisine stands out as the clearest winner across both dimensions, growing its share of yearly publications while also achieving among the highest mean ratings by 2025, suggesting both strong contributor activity and consistent user satisfaction.

Presentation

Our presentation can be found here.

Data

TidyTuesday. (2025). AllRecipes recipe dataset (Week of September 16, 2025) [Data set]. Prepared from recipes scraped from AllRecipes.com and distributed via the tastyR package. Curated by Brian Mubia. https://github.com/rfordatascience/tidytuesday/tree/main/data/2025/2025-09-16

References

About Allrecipes. (n.d.). Allrecipes. https://www.allrecipes.com/about-us-6648102

Drewnowski, A., & Almiron-Roig, E. (2010). Human Perceptions and Preferences for Fat-Rich Foods. Nih.gov; CRC Press/Taylor & Francis. https://www.ncbi.nlm.nih.gov/books/NBK53528/

Fatty food triggers taste buds, new research finds. (2019). Purdue.edu. https://www.purdue.edu/uns/html4ever/011203.Mattes.taste.html

TidyTuesday. (2025). AllRecipes recipe dataset (Week of September 16, 2025) [Data set]. Prepared from recipes scraped from AllRecipes.com and distributed via the tastyR package. https://github.com/rfordatascience/tidytuesday/tree/main/data/2025/2025-09-16

Williams, C. (2020, February 25). Most Popular Ethnic Cuisines in America According to Google. Chef’s Pencil. https://www.chefspencil.com/most-popular-ethnic-cuisines-in-america/