Parsers
- ingredient_parser.parsers.inspect_parser(sentence: str, lang: str = 'en', separate_names: bool = True, discard_isolated_stop_words: bool = True, expect_name_in_output: bool = True, string_units: bool = False, imperial_units: bool = False, volumetric_units_system: str = 'us_customary', foundation_foods: bool = False, custom_units: dict[str, str] | None = None) -> ParserDebugInfo
Return intermediate objects generated during parsing for inspection.
- Parameters:
- sentence
str Ingredient sentence to parse.
- lang
str Language of sentence. Currently supported options are: en.
- separate_names
bool, optional If True and the sentence contains multiple alternative ingredients, return an IngredientText object for each ingredient name, otherwise return a single IngredientText object. Default is True.
- discard_isolated_stop_words
bool, optional If True, any isolated stop words in the name, preparation, or comment fields are discarded. Default is True.
- expect_name_in_output
bool, optional If True and the model doesn’t label any words in the sentence as the name, fall back to selecting the most likely name from all tokens, even though the model gives them a different label. Note that this does guarantee the output contains a name. Default is True.
- string_units
bool, optional If True, return all IngredientAmount units as strings. If False, convert IngredientAmount units to pint.Unit objects where possible. Default is False.
- imperial_units
bool, optional If True, use imperial units instead of US customary units for pint.Unit objects for the following units: fluid ounce, cup, pint, quart, gallon. Default is False, which results in US customary units being used. This has no effect if string_units=True.
Deprecated since version v2.5.0: Use volumetric_units_system="imperial" for the same functionality.
- volumetric_units_system
str, optional Sets the units system for volumetric measurements, like “cup” or “tablespoon”. Available options are “us_customary” (default), “imperial”, “metric”, “australian”, “japanese”. This has no effect if string_units=True.
Added in version v2.5.0.
- foundation_foods
bool, optional If True, extract foundation foods from ingredient name. Foundation foods are the fundamental foods without any descriptive terms, e.g. ‘cucumber’ instead of ‘organic cucumber’. Default is False.
- custom_units
dict[str, str] | None, optional Provide custom units to aid the parser in identifying units. The custom units should be provided as a dict of plural: singular pairs. If a unit does not have a plural form, provide the singular form as the key for the pair. The units should not start with a capital letter, but may contain capital letters at other positions.
Added in version v2.6.0.
- Returns:
ParserDebugInfo ParserDebugInfo object containing the PreProcessor object, PostProcessor object and Tagger.
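The attribute names of ParserDebugInfo are not listed on this page, so one way to explore the returned object is to enumerate its public fields. The following is a minimal sketch under that assumption: `public_fields` and `describe_debug_info` are hypothetical helper names, not part of ingredient_parser, and only `inspect_parser` itself comes from the library.

```python
import inspect
from typing import Any


def public_fields(obj: Any) -> dict[str, str]:
    """Map each public, non-method attribute name to the type name of its value."""
    fields = {}
    for name in dir(obj):
        if name.startswith("_"):
            continue  # skip private and dunder attributes
        value = getattr(obj, name)
        if not inspect.isroutine(value):
            fields[name] = type(value).__name__
    return fields


def describe_debug_info(sentence: str) -> dict[str, str]:
    """Hypothetical helper: run inspect_parser and summarise what the result exposes."""
    # Deferred import so public_fields can be used without the library installed.
    from ingredient_parser.parsers import inspect_parser

    return public_fields(inspect_parser(sentence))
```

With the library installed, calling `describe_debug_info("2 cups plain flour, sifted")` should list the intermediate objects by attribute name, which can then be inspected individually.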
- ingredient_parser.parsers.parse_ingredient(sentence: str, lang: str = 'en', separate_names: bool = True, discard_isolated_stop_words: bool = True, expect_name_in_output: bool = True, string_units: bool = False, imperial_units: bool = False, volumetric_units_system: str = 'us_customary', foundation_foods: bool = False, custom_units: dict[str, str] | None = None) -> ParsedIngredient
Parse an ingredient sentence to return structured data.
- Parameters:
- sentence
str Ingredient sentence to parse.
- lang
str Language of sentence. Currently supported options are: en.
- separate_names
bool, optional If True and the sentence contains multiple alternative ingredients, return an IngredientText object for each ingredient name, otherwise return a single IngredientText object. Default is True.
- discard_isolated_stop_words
bool, optional If True, any isolated stop words in the name, preparation, or comment fields are discarded. Default is True.
- expect_name_in_output
bool, optional If True and the model doesn’t label any words in the sentence as the name, fall back to selecting the most likely name from all tokens, even though the model gives them a different label. Note that this does guarantee the output contains a name. Default is True.
- string_units
bool, optional If True, return all IngredientAmount units as strings. If False, convert IngredientAmount units to pint.Unit objects where possible. Default is False.
- imperial_units
bool, optional If True, use imperial units instead of US customary units for pint.Unit objects for the following units: fluid ounce, cup, pint, quart, gallon. Default is False, which results in US customary units being used. This has no effect if string_units=True.
Deprecated since version v2.5.0: Use volumetric_units_system="imperial" for the same functionality.
- volumetric_units_system
str, optional Sets the units system for volumetric measurements, like “cup” or “tablespoon”. Available options are “us_customary” (default), “imperial”, “metric”, “australian”, “japanese”. This has no effect if string_units=True.
Added in version v2.5.0.
- foundation_foods
bool, optional If True, extract foundation foods from ingredient name. Foundation foods are the fundamental foods without any descriptive terms, e.g. ‘cucumber’ instead of ‘organic cucumber’. Default is False.
- custom_units
dict[str, str] | None, optional Provide custom units to aid the parser in identifying units. The custom units should be provided as a dict of plural: singular pairs. If a unit does not have a plural form, provide the singular form as the key for the pair. The units should not start with a capital letter, but may contain capital letters at other positions.
Added in version v2.6.0.
- Returns:
ParsedIngredient ParsedIngredient object of structured data parsed from input string.
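The keyword arguments above interact in a few ways: imperial_units is deprecated in favour of volumetric_units_system, and custom_units has its own formatting rules. The sketch below collects and checks those options before calling parse_ingredient. It is illustrative only: `build_options`, `parse_with_options` and `VOLUMETRIC_SYSTEMS` are hypothetical names, and the validation is this example's own reading of the parameter descriptions, not the library's behaviour. It assumes a library version of at least v2.6.0, since it passes custom_units.

```python
from typing import Any, Optional

# Options listed for volumetric_units_system in the parameter description above.
VOLUMETRIC_SYSTEMS = {"us_customary", "imperial", "metric", "australian", "japanese"}


def build_options(
    imperial_units: bool = False,
    volumetric_units_system: str = "us_customary",
    custom_units: Optional[dict[str, str]] = None,
    **other: Any,
) -> dict[str, Any]:
    """Hypothetical helper: validate and collect keyword arguments for the parser."""
    if imperial_units:
        # imperial_units is deprecated since v2.5.0; request the same behaviour
        # through volumetric_units_system. Letting the flag win is this sketch's choice.
        volumetric_units_system = "imperial"
    if volumetric_units_system not in VOLUMETRIC_SYSTEMS:
        raise ValueError(f"unsupported units system: {volumetric_units_system!r}")
    for plural, singular in (custom_units or {}).items():
        # Custom units are plural: singular pairs and should not start with a capital.
        if plural[:1].isupper() or singular[:1].isupper():
            raise ValueError(
                f"custom unit starts with a capital letter: {plural!r}: {singular!r}"
            )
    return {
        "volumetric_units_system": volumetric_units_system,
        "custom_units": custom_units,
        **other,
    }


def parse_with_options(sentence: str, **options: Any) -> Any:
    """Hypothetical wrapper; the import is deferred so build_options works without the library."""
    from ingredient_parser.parsers import parse_ingredient

    return parse_ingredient(sentence, **build_options(**options))
```

For example, `parse_with_options("2 sachets dried yeast", custom_units={"sachets": "sachet"})` would pass the custom unit through to the parser, while a call with `imperial_units=True` is rewritten to `volumetric_units_system="imperial"` before the library sees it.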
- ingredient_parser.parsers.parse_multiple_ingredients(sentences: Iterable[str], lang: str = 'en', separate_names: bool = True, discard_isolated_stop_words: bool = True, expect_name_in_output: bool = True, string_units: bool = False, imperial_units: bool = False, volumetric_units_system: str = 'us', foundation_foods: bool = False, custom_units: dict[str, str] | None = None) -> list[ParsedIngredient]
Parse multiple ingredient sentences.
This function accepts an iterable of sentences, such as a list, and returns a list of ParsedIngredient objects.
- Parameters:
- sentences
Iterable[str] Iterable of sentences to parse.
- lang
str Language of sentence. Currently supported options are: en.
- separate_names
bool, optional If True and the sentence contains multiple alternative ingredients, return an IngredientText object for each ingredient name, otherwise return a single IngredientText object. Default is True.
- discard_isolated_stop_words
bool, optional If True, any isolated stop words in the name, preparation, or comment fields are discarded. Default is True.
- expect_name_in_output
bool, optional If True and the model doesn’t label any words in the sentence as the name, fall back to selecting the most likely name from all tokens, even though the model gives them a different label. Note that this does guarantee the output contains a name. Default is True.
- string_units
bool, optional If True, return all IngredientAmount units as strings. If False, convert IngredientAmount units to pint.Unit objects where possible. Default is False.
- imperial_units
bool, optional If True, use imperial units instead of US customary units for pint.Unit objects for the following units: fluid ounce, cup, pint, quart, gallon. Default is False, which results in US customary units being used. This has no effect if string_units=True.
Deprecated since version v2.5.0: Use volumetric_units_system="imperial" for the same functionality.
- volumetric_units_system
str, optional Sets the units system for volumetric measurements, like “cup” or “tablespoon”. Available options are “us_customary” (default), “imperial”, “metric”, “australian”, “japanese”. This has no effect if string_units=True.
Added in version v2.5.0.
- foundation_foods
bool, optional If True, extract foundation foods from ingredient name. Foundation foods are the fundamental foods without any descriptive terms, e.g. ‘cucumber’ instead of ‘organic cucumber’. Default is False.
- custom_units
dict[str, str] | None, optional Provide custom units to aid the parser in identifying units. The custom units should be provided as a dict of plural: singular pairs. If a unit does not have a plural form, provide the singular form as the key for the pair. The units should not start with a capital letter, but may contain capital letters at other positions.
Added in version v2.6.0.
- Returns:
list[ParsedIngredient] List of ParsedIngredient objects of structured data parsed from input sentences.
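Because sentences is typed as Iterable[str], a generator can be passed as well as a list. The sketch below shows one way to feed cleaned-up lines into a batch call. It is an approximation for illustration, not the library's implementation: `parse_all` and `clean_lines` are hypothetical names, and `parse_all` takes the single-sentence parser as an argument so the pattern can be tried without the library installed.

```python
from typing import Any, Callable, Iterable, Iterator


def clean_lines(text: str) -> Iterator[str]:
    """Yield non-empty, stripped lines, e.g. from a pasted ingredient list."""
    for line in text.splitlines():
        line = line.strip()
        if line:
            yield line


def parse_all(
    sentences: Iterable[str], parse_one: Callable[..., Any], **options: Any
) -> list[Any]:
    """Hypothetical stand-in: one parser call per sentence, same options, input order kept."""
    return [parse_one(sentence, **options) for sentence in sentences]
```

With the library installed, the equivalent call would be `parse_multiple_ingredients(clean_lines(recipe_text), string_units=True)`, where `recipe_text` is a placeholder for your own input. Every sentence in the batch is parsed with the same keyword arguments.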