Eating Healthy Assistant
There are tons of online diet programs to choose from, and they all promise remarkably healthy recipes. We have all searched Google for weight-loss, weight-gain or just plain healthy recipes, since we all trust Google as if it were a nutritionist, doctor, pastry chef, pizza delivery guy, etc. Do these online recipes make our cooking easier… or cheesier? Google didn’t answer this, so we decided to find out for ourselves.
Where did we start?
We scraped almost 400k recipes from the top 10 food websites with the help of Structured Data. Every recipe has its list of ingredients in this or a similar format:
- 2 cups of milk
- 3 chopped onions
- salt and pepper to taste
accompanied by cooking instructions and its total nutritional information, which was what interested us most. It was too late for us to join the Academy of Nutrition and Dietetics, so we took another route and automated a nutritionist 🙂
After importing the entire food database from the US Department of Agriculture, and with all scraped recipes stored in a document database (MongoDB), we started by identifying the ingredients in each recipe. Once a food was identified (milk), we had to extract the unit of measure, if there was one (cup), and the amount, again if there was one (2, 1/2). These were the simplest cases.
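To make the "simplest cases" concrete, here is a minimal sketch of splitting an ingredient line into amount, unit and food. It is illustrative only: the `UNITS` set, the regular expression and the function names are assumptions made for this example, not the analyzer's actual code, and real recipes need far more rules than this.

```python
import re
from fractions import Fraction

# Hypothetical unit list for illustration; a real analyzer needs many more.
UNITS = {"cup", "cups", "tablespoon", "tablespoons", "teaspoon", "teaspoons"}

# Leading amount: mixed number ("1 1/2"), fraction ("1/2"), or number ("2", "0.5").
AMOUNT = re.compile(r"^\s*(\d+\s+\d+/\d+|\d+/\d+|\d+(?:\.\d+)?)\s*")


def parse_amount(text):
    """Convert '2', '1/2', '1 1/2' or '0.5' into an exact Fraction."""
    return sum(Fraction(part) for part in text.split())


def parse_ingredient(line):
    """Return (amount, unit, food); amount and unit are None when absent."""
    amount, unit = None, None
    rest = line.strip()
    match = AMOUNT.match(line)
    if match:
        amount = parse_amount(match.group(1))
        rest = line[match.end():]
    words = rest.split()
    if words and words[0].lower() in UNITS:
        unit = words[0].lower().rstrip("s")  # "cups" -> "cup"
        words = words[1:]
    if words and words[0].lower() == "of":   # "2 cups of milk"
        words = words[1:]
    return amount, unit, " ".join(words)


for line in ["2 cups of milk", "3 chopped onions", "salt and pepper to taste"]:
    print(parse_ingredient(line))
```

With this sketch, "2 cups of milk" yields an amount of 2, the unit "cup" and the food "milk", while "salt and pepper to taste" comes back with no amount and no unit, which is exactly the kind of case that needs extra handling.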
To accomplish that, we used text processing techniques in Python, relying on the Natural Language Processing and Machine Learning libraries it offers.
We had our ups and downs: at one point our nutritionist would say that "1/2 cup chopped onion" is "1/2 cup Ham, chopped, canned". Still, we kept upgrading our nutritionist analyzer by improving our text processing algorithms until it reached 95% accuracy.
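One way to picture how such a mix-up can happen is a naive word-overlap matcher. The sketch below is an assumption-laden illustration, not the algorithm the analyzer actually used: the `FOODS` list, the `PREP_WORDS` set and the Jaccard-style scoring are all made up for this example.

```python
import re

# Hypothetical descriptions written in a "Food, detail, detail" style.
FOODS = ["Ham, chopped, canned", "Onions, raw", "Milk, whole"]

# Hypothetical preparation words that say nothing about the food itself.
PREP_WORDS = {"chopped", "canned", "raw", "sliced", "diced"}


def tokens(text, normalize):
    """Lowercase word set; optionally singularize and drop preparation words."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    if normalize:
        # Crude singularization so that "onions" matches "onion".
        words = {w[:-1] if w.endswith("s") and len(w) > 3 else w for w in words}
        words -= PREP_WORDS
    return words


def best_match(ingredient, normalize=False):
    """Pick the food description with the highest word overlap (Jaccard)."""
    query = tokens(ingredient, normalize)

    def score(food):
        candidate = tokens(food, normalize)
        union = query | candidate
        return len(query & candidate) / len(union) if union else 0.0

    return max(FOODS, key=score)


print(best_match("chopped onion"))                  # naive matching
print(best_match("chopped onion", normalize=True))  # with normalization
```

Without normalization the only shared word is "chopped", so the ham entry wins. Once preparation words are ignored and plurals are folded, the onion entry is the only one left with any overlap.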
Math and calculations
Once our intelligent nutritionist knew that 1/2 cup chopped onion really was 1/2 cup of chopped onion, he was ready to calculate the total nutrition in a recipe. Simple math… or maybe not so simple.
- salt and pepper to taste
Is it just a teaspoon of salt and pepper? What if the recipe is for 20 servings? There were many factors we couldn’t skip, because a wrong nutrition estimate for just one ingredient would throw off the total nutrition calculation for the whole recipe.
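The arithmetic itself can be sketched as follows. Everything here is an assumption made for illustration: the per-100 g numbers are placeholders rather than USDA values, ingredients are assumed to be already converted to grams, and the "to taste" rule (a fixed amount per serving) is just one possible convention.

```python
# Placeholder values per 100 g, NOT real USDA data.
NUTRIENTS_PER_100G = {
    "butter": {"calories": 700.0, "fat": 80.0, "sodium_mg": 0.0},
    "egg": {"calories": 150.0, "fat": 10.0, "sodium_mg": 0.0},
    "salt": {"calories": 0.0, "fat": 0.0, "sodium_mg": 40000.0},
}

# Assumed amount for a "to taste" ingredient, in grams per serving.
TO_TASTE_GRAMS_PER_SERVING = 0.5


def recipe_totals(ingredients, servings):
    """Sum nutrients over (food, grams) pairs; grams=None means 'to taste'."""
    totals = {}
    for food, grams in ingredients:
        if grams is None:
            # Scale the guess with the number of servings.
            grams = TO_TASTE_GRAMS_PER_SERVING * servings
        for nutrient, per_100g in NUTRIENTS_PER_100G[food].items():
            totals[nutrient] = totals.get(nutrient, 0.0) + per_100g * grams / 100.0
    return totals


def per_serving(totals, servings):
    """Divide whole-recipe totals by the number of servings."""
    return {nutrient: value / servings for nutrient, value in totals.items()}


recipe = [("butter", 10.0), ("egg", 100.0), ("salt", None)]
totals = recipe_totals(recipe, servings=2)
print(totals)
print(per_serving(totals, servings=2))
```

The point of the sketch is the `None` branch: the same "salt to taste" line contributes ten times more sodium in a 20-serving recipe than in a 2-serving one, so a single fixed guess would skew the totals.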
Nothing yet about healthy recipes online?
Our nutritionist analyzer was now ready to process the 400k stored recipes. For each recipe, the calculated result was stored alongside the original nutritional information.
The last step toward the final goal was comparing the results and calculating the DELTA: how much our estimate differed from what the websites stated. The answer:
Nutrient | Delta |
Fat | 34.6% |
Calories | 27.5% |
Sodium | 41.6% |
Carbohydrates | 26.6% |
Cholesterol | 49.5% |
Protein | 37.6% |
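For readers who want to reproduce this kind of comparison, here is a minimal sketch of one way to compute such a delta. The post does not spell out the exact formula, so the definition below (absolute difference relative to the stated value, averaged over recipes) is an assumption, and the function names are hypothetical.

```python
def delta_percent(stated, estimated):
    """Relative difference in percent, or None when the stated value is 0."""
    if stated == 0:
        return None  # relative difference is undefined
    return abs(estimated - stated) / abs(stated) * 100.0


def mean_delta(pairs):
    """Average delta over (stated, estimated) pairs, skipping undefined ones."""
    deltas = [d for d in (delta_percent(s, e) for s, e in pairs) if d is not None]
    return sum(deltas) / len(deltas) if deltas else None


# Fat row of the example recipe further down: stated 16.18, estimated 21.2.
print(round(delta_percent(16.18, 21.2), 1))
```

Under this assumed definition, the fat values of the example recipe differ by roughly 31%.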
Why would you trust us? We too had a shred of doubt about our estimates, so we picked some random recipes covering different cases and did the calculations by hand. An example from the final results:
Recipe:
- 1 tablespoon butter
- 3 eggs
- 1 teaspoon water
- 1/2 cup crumbled feta cheese
- salt and pepper to taste
Nutrient | Original | Manual | Estimated |
Fat | 16.18 | 20 | 21.2 |
Calories | 257 | 302.8 | 304.8 |
Sodium | 710 | 545 | 547.3 |
Carbohydrates | 2.1 | 14.2 | 14.2 |
Cholesterol | 328 | 426.3 | 427.2 |
Protein | 14.8 | 25.8 | 25.8 |
Lastly, never forget that any ‘healthy recipe’ online is just ‘yet another record’ in a database, and ensuring its accuracy is a real challenge.