Foundation Foods
- ingredient_parser.en.foundationfoods.match_foundation_foods(tokens: list[str], pos_tags: list[str], name_idx: int) → FoundationFood | None
Match an ingredient name to a foundation food from the FDC ingredient data.
This is done in three stages. The first stage prepares and normalises the tokens.
The second stage uses an Unsupervised Smooth Inverse Frequency calculation to narrow down the candidate FDC ingredients that could match.
The third stage selects the best of these candidates using a fuzzy embedding document metric.
Two matching stages are needed because the ingredient embeddings do not seem to be as accurate as off-the-shelf pre-trained general embeddings are for general tasks. Improving the quality of the embeddings might remove the need for the second stage.
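The three stages described above can be illustrated with a minimal, self-contained sketch. It is not the library's implementation. The embeddings, word probabilities, candidate names and all helper names (`normalise`, `sif_vector`, `shortlist`, `fuzzy_score`, `match`) are hypothetical. The weighting is a simplified smooth-inverse-frequency average that omits the common-component removal step of the full method, and the final score is a simple best-match similarity standing in for the fuzzy embedding document metric.

```python
import math

# Toy 2-d embeddings and unigram probabilities, made up for illustration only.
EMBEDDINGS = {
    "olive": (0.9, 0.1),
    "oil": (0.7, 0.3),
    "butter": (0.2, 0.9),
    "salted": (0.1, 0.8),
    "vegetable": (0.8, 0.3),
}
WORD_PROB = {"olive": 0.01, "oil": 0.05, "butter": 0.03,
             "salted": 0.01, "vegetable": 0.02}
# Hypothetical candidate names; each contains at least one known token.
CANDIDATES = ["olive oil", "salted butter", "vegetable oil"]


def normalise(tokens):
    """Stage 1 (simplified): lowercase and drop tokens without an embedding."""
    return [t.lower() for t in tokens if t.lower() in EMBEDDINGS]


def cosine(u, v):
    """Cosine similarity of two vectors, 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def sif_vector(tokens, a=1e-3):
    """Mean of token vectors weighted by a / (a + p(token))."""
    dim = len(next(iter(EMBEDDINGS.values())))
    total = [0.0] * dim
    if not tokens:
        return total
    for t in tokens:
        weight = a / (a + WORD_PROB[t])  # rarer tokens get larger weights
        for i, x in enumerate(EMBEDDINGS[t]):
            total[i] += weight * x
    return [x / len(tokens) for x in total]


def shortlist(query, candidates, k=2):
    """Stage 2: keep the k candidates whose weighted vector is closest to the query's."""
    q = sif_vector(query)
    ranked = sorted(
        candidates,
        key=lambda c: cosine(q, sif_vector(normalise(c.split()))),
        reverse=True,
    )
    return ranked[:k]


def fuzzy_score(query, candidate):
    """Stage 3 stand-in: mean best-match token similarity, averaged over both directions."""
    def one_way(src, dst):
        return sum(
            max(cosine(EMBEDDINGS[s], EMBEDDINGS[d]) for d in dst) for s in src
        ) / len(src)
    return 0.5 * (one_way(query, candidate) + one_way(candidate, query))


def match(tokens, candidates=CANDIDATES):
    """Run the three stages and return the best candidate name, or None."""
    query = normalise(tokens)
    if not query:
        return None
    best = shortlist(query, candidates)
    return max(best, key=lambda c: fuzzy_score(query, normalise(c.split())))


print(match(["Olive", "Oil"]))
```

The cheap sentence-level comparison in `shortlist` limits how many candidates the more expensive token-by-token comparison in `fuzzy_score` has to consider, which is the usual motivation for this kind of two-step ranking.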