2019 ADA Project
Secrets Behind Recipes
Tianyang Dong, Wei Jiang, Huajian Qiu, Jiahua Wu
Abstract

Cooking is an important skill for everyone, no matter where they live. Over centuries, what people eat every day has developed into distinct culinary systems. Every country has its own style of eating, and styles vary from region to region.

Through recipes, we can learn what the people of a country frequently eat and how they usually cook. We can go further and look for relationships between what people eat and their health conditions, digging out the secrets behind the recipes.
The two recipe datasets we use come from Kaggle. One of them contains 230185 different recipes scraped from Food.com and includes information about the cooking steps, ingredients, time needed, tags, etc. From the tags of a recipe, we can figure out where the dish originates. By matching countries with recipes, we get 96286 recipes from 51 different countries.
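
The tag-matching step can be pictured with a minimal sketch. The tag-to-country mapping, the sample recipes and the rule of dropping recipes that match zero or several countries are all assumptions made for illustration; the text does not specify the actual matching rules.

```python
# Hypothetical mapping from recipe tags to countries; the real tag
# vocabulary and mapping used in the project are not given in the text.
TAG_TO_COUNTRY = {
    "italian": "Italy",
    "french": "France",
    "japanese": "Japan",
}

def assign_country(tags):
    """Return the country for a recipe, or None if zero or several countries match."""
    matches = {TAG_TO_COUNTRY[t] for t in tags if t in TAG_TO_COUNTRY}
    # Assumption: ambiguous recipes (several countries) are dropped.
    return next(iter(matches)) if len(matches) == 1 else None

# Made-up sample recipes, for illustration only.
recipes = [
    {"name": "lasagna", "tags": ["italian", "main-dish"]},
    {"name": "miso soup", "tags": ["japanese", "soups-stews"]},
    {"name": "brownies", "tags": ["desserts", "easy"]},
]

# Keep only the recipes that can be attributed to exactly one country.
by_country = {}
for recipe in recipes:
    country = assign_country(recipe["tags"])
    if country is not None:
        by_country.setdefault(country, []).append(recipe["name"])
```

Recipes without a recognizable country tag are simply left out, which is consistent with only part of the 230185 recipes being matched to a country.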

To study the relationship between recipes and health, we use WHO datasets from 2016 on noncommunicable diseases and body mass index (BMI). The GNI (gross national income) dataset from The World Bank is also used to account for the influence of the economy on citizens' health conditions.
How do eating habits vary in different countries?
- the most frequently used seasonings, cooking methods and ingredients

How similar are the ways of cooking in different countries?

How are the different eating habits related to health?
- analyze the correlation between health indices (life span, overweight rate, high blood pressure, etc.) and common seasonings and nutrition content

What do they prefer?

Not only can your accent tell us where you are from; what you eat can do the same! There may be stereotypes that all French people love cheese or all Chinese people have rice for lunch. Stereotypes aside, people living in the same area do tend to eat alike. These habits are inherited from our ancestors and quietly label who we are.

We compute the frequency of the ingredients and cooking methods for all the countries. Seasonings are separated from the other ingredients, since they decide whether a dish is sweet or salty and better characterize people's eating habits. Someone in Europe and someone in Asia may both choose to have chicken for lunch, yet how they adjust the flavor can be totally different.
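
A minimal sketch of this counting step follows. The seasoning list and the sample recipes are hypothetical, and counting each item once per recipe (so that a frequency reads as a share of recipes) is our assumption about how the frequencies are defined.

```python
from collections import Counter, defaultdict

# Hypothetical seasoning list; the project's actual list is not given in the text.
SEASONINGS = {"salt", "pepper", "sugar", "soy sauce", "wasabi"}

# Made-up (country, ingredients) pairs, for illustration only.
recipes = [
    ("Japan", ["sushi rice", "wasabi", "soy sauce"]),
    ("Japan", ["chicken", "soy sauce", "sugar"]),
    ("Italy", ["spaghetti", "salt", "parmesan cheese"]),
]

seasoning_counts = defaultdict(Counter)   # country -> seasoning -> number of recipes
ingredient_counts = defaultdict(Counter)  # country -> other ingredient -> number of recipes
recipes_per_country = Counter()

for country, ingredients in recipes:
    recipes_per_country[country] += 1
    for item in set(ingredients):  # count each item at most once per recipe
        target = seasoning_counts if item in SEASONINGS else ingredient_counts
        target[country][item] += 1

def top_items(counts, country, n):
    """Return (item, share of the country's recipes) for the n most frequent items."""
    total = recipes_per_country[country]
    return [(item, c / total) for item, c in counts[country].most_common(n)]
```

For example, `top_items(seasoning_counts, "Japan", 30)` would give a top-30 seasoning list for Japan with each seasoning's share of Japanese recipes.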

Top 30 Seasonings

Recall that we identify 51 countries in total. Universal flavorings such as salt, pepper, onion, oil, sugar and butter appear in the recipes of almost all countries studied. In contrast, at the bottom of the sorted frequency list, we can recognize some regional specialties. We choose several representative countries: Italy, France, Vietnam, Malaysia, Korea, China and Japan.


Spaghetti is a traditional Italian dish, so it is not surprising that spaghetti sauce is frequently used in Italy. One would not expect the same in Asian countries.

Used as a pungent condiment for sushi, wasabi features prominently in Japanese recipes.

Tarragon is one of the four "fines herbes" of French cooking, and is particularly suitable for chicken, fish, and egg dishes.

Soy sauce and rice wine vinegar are very popular in East Asia.

Star anise is widely used in Asia and it is also a major ingredient in the making of phở, a Vietnamese noodle soup.

Top 30 Main Ingredients

Besides seasonings, the other ingredients are also interesting. We choose eight countries as representatives: Italy, Switzerland, France, the Netherlands, Greece, China, Japan and Korea.


Cheese is very popular in European countries, and each country has its own favorites: ricotta and mozzarella for Italy, gouda for the Netherlands, gruyere and emmenthaler for Switzerland. Sushi rice and shiitake mushroom appear frequently in Japan's recipes, yet they do not appear among the top 30 main ingredients of any of the other 50 countries.

Each of these characteristic ingredients appears in more than 10% of its country's recipes. Specifically, parmesan cheese is used in more than a third of Italian recipes, and feta cheese is also very popular in Greek recipes.

Top 15 Cooking Methods

The most frequently used cooking methods of China, Germany, France, Japan, Italy, Korea and Palestine are displayed. Basic methods like boiling, simmering, draining and baking are widely present in the recipes of many countries.


Broiling is popular in Korean recipes, mainly because of "Korean BBQ".

Italian and Japanese recipes reflect typical Western and Asian cooking habits, respectively. Both mostly involve draining, boiling and simmering, as is the case for most countries.

However, they still differ slightly from each other: marinating is more common as a preparation step in Japanese recipes, while baking appears more frequently in Italian recipes. This difference is reasonable, since baked food (for instance bread and cake) is essential in the Western diet, while marinating plays an important role in East Asian cuisine for refining the taste of ingredients.

Are they similar?





Recipes themselves may differ a lot, but it is natural to expect that recipes from the same country are similar. One likely reason is that the ingredients they use should be easy to obtain locally and be preferred by the people living there.

To give a clearer idea of what recipes from the same country look like, we use PCA and t-SNE to visualize the relationships among recipes. The steps in the recipes that teach people how to cook are processed using NLP techniques and then converted to bag-of-words vectors in a space with thousands of dimensions. Finally, using the dimension reduction methods mentioned above, the vectors representing the recipes are shown in a three-dimensional space.
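
The conversion from steps to vectors can be sketched as follows. The tokenizer and the two sample steps are simplified assumptions; the actual NLP preprocessing (for example stop-word removal or lemmatization) is not described in the text and is omitted here.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and keep runs of letters as words (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())

# Made-up cooking steps, for illustration only.
steps = [
    "Boil the pasta, then drain.",
    "Marinate the fish and boil the rice.",
]

tokenized = [tokenize(s) for s in steps]

# The vocabulary fixes one dimension per distinct word across all recipes.
vocabulary = sorted({word for tokens in tokenized for word in tokens})

def to_vector(tokens):
    """Bag-of-words vector: how often each vocabulary word occurs in the tokens."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]

vectors = [to_vector(tokens) for tokens in tokenized]
```

With real data the vocabulary has thousands of entries, and the resulting vectors would then be passed to a dimension reduction implementation, for instance `sklearn.decomposition.PCA` and `sklearn.manifold.TSNE` from scikit-learn, to obtain three-dimensional coordinates. Whether the project used that library is not stated.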

How much do they intake?

Eating well means eating healthily, not just deliciously. Nowadays, the flavor of a dish is no longer the first thing to consider, and people pay more and more attention to health. Here we draw a choropleth map based on the average nutrition facts of the recipes from different countries.
Clearly, recipes originating from Western countries (European countries, the US, Canada and Australia) tend to be relatively high in calories, fat, sugar and carbs, and low in sodium. By contrast, recipes from East Asian countries tend to be high in sodium and protein. These characteristics matter for the health statistics that we examine later.

What is their relationship?

Nutrition

It goes without saying that food is related to health. After this overview of average nutrition intake, we would like to dig deeper into what people eat and how healthy they are.

To define health concretely, we use WHO datasets on disease incidence and overweight rate. The correlation between each nutrient and each health index is computed and shown in a flow chart (because of the data size, only a few examples are shown). A wider edge indicates a higher correlation. Focusing on high cholesterol, we see that sugar, fat and carbohydrates seem to be highly correlated with it.
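
As an illustration of what one edge of the chart encodes, here is a sketch that computes a Pearson correlation coefficient across countries. The text does not say which correlation measure is used, so Pearson is an assumption, and the numbers are made up.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Made-up values, one per country: average sugar per recipe and
# the share of the population with high cholesterol.
avg_sugar = [10.0, 20.0, 30.0, 40.0]
high_cholesterol_rate = [12.0, 15.0, 21.0, 24.0]

r = pearson(avg_sugar, high_cholesterol_rate)
```

A value of `r` close to 1 would be drawn as a wide edge between the nutrient and the health index; a value near 0 would be a thin one.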

Seasoning

It is common knowledge that too much sugar or salt is not a good choice when you want to stay healthy. Many celebrities and models eat boiled broccoli with no seasoning to keep in shape. So what is the relationship between seasonings and health?

The flow chart shows several countries' intake of seasonings and the countries' health indices. These characteristics are represented as nodes. The wider the edge, the more of the seasoning the country consumes on average or the higher the health index. The seasoning intake and the health indices differ a lot from country to country, and there is more to explore.


We hope to find relationships between seasonings and the health indices. However, even if we find a high correlation between a specific seasoning and a disease, we cannot conclude that eating more of that seasoning in daily life leads to the disease. Many other factors influence people's health conditions.

Here we go just one step further by taking income per capita into consideration. Given the same income level, how does the amount of seasoning consumed influence the health indices?
We first hypothesize that only seasoning intake and income level influence the health indices. The other attributes of the countries are assumed to be either similar or irrelevant to the health indices, so they exert no influence on them. Under this hypothesis, when seasoning intake and income level are controlled, the values of the health indices should be the same. A sensitivity analysis is carried out, and when gamma equals 2, the hypothesis is verified for only six health indices (shown below).

Take the high-income countries (as defined by The World Bank) as an example. For these countries, the correlations between seasoning intake and the six health indices are computed. The value of each correlation is represented by the width of the edge. Salt seems to have a high positive correlation with high cholesterol and high blood pressure, which coincides with common knowledge. Vinegar might also be good for health, judging by its high correlation with these indices.
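
The idea of comparing countries only within the same income level can be sketched as a simple stratified correlation. This is not the sensitivity analysis itself, whose details are not given in the text; the income labels, the numbers and the use of Pearson correlation are all assumptions for illustration.

```python
from collections import defaultdict
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Made-up rows: (country, income group, share of recipes using salt,
# share of the population with high blood pressure).
rows = [
    ("A", "high", 0.2, 18.0),
    ("B", "high", 0.4, 22.0),
    ("C", "high", 0.6, 26.0),
    ("D", "low", 0.3, 30.0),
    ("E", "low", 0.5, 28.0),
    ("F", "low", 0.7, 26.0),
]

# Group the countries by income level so that income is held roughly fixed.
groups = defaultdict(list)
for _, income, salt, blood_pressure in rows:
    groups[income].append((salt, blood_pressure))

# Correlation between salt use and blood pressure within each income group.
within_group = {
    income: pearson([s for s, _ in pairs], [b for _, b in pairs])
    for income, pairs in groups.items()
}
```

In this toy data the correlation is positive within the high-income group and negative within the low-income group, which shows why pooling all countries together could hide or distort the relationship.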

Conclusions

Looking at recipes from different angles, we can summarize the work done so far:
- find the popular seasonings, other ingredients and cooking methods of different countries
- find the similarities among recipes from the same country by transforming the steps from strings to vectors, then reducing the dimension using PCA and t-SNE
- show the nutrition intake of all 51 countries and its correlation with the health indices
- through sensitivity analysis, make the correlation between seasonings and the six health indices (high cholesterol, high blood pressure, NCD death probability, high blood glucose, overweight and life span) more convincing

Besides the work already done, more could be done in the future:
- The dataset we use is collected from Food.com, a website most popular in North America. There may not be enough data about other regions, which could introduce bias into the analysis.
- The quantities of seasonings and ingredients are not available in the dataset, but the relevant information can be found on the website. The analysis would be more reliable and convincing if quantities were taken into consideration.
- The recipes are treated equally, which means that whether a dish is actually preferred in real life is not taken into consideration. The reviews on Food.com could be used to better estimate the intake of seasonings and nutrients.
