Parsers

ingredient_parser.parsers.inspect_parser(sentence: str, lang: str = 'en', separate_names: bool = True, discard_isolated_stop_words: bool = True, expect_name_in_output: bool = True, string_units: bool = False, imperial_units: bool = False, volumetric_units_system: str = 'us_customary', foundation_foods: bool = False, custom_units: dict[str, str] | None = None) -> ParserDebugInfo

Return intermediate objects generated during parsing for inspection.

Parameters:
sentence : str

Ingredient sentence to parse.

lang : str

Language of sentence. Currently supported options are: en.

separate_names : bool, optional

If True and the sentence contains multiple alternative ingredients, return an IngredientText object for each ingredient name, otherwise return a single IngredientText object. Default is True.

discard_isolated_stop_words : bool, optional

If True, any isolated stop words in the name, preparation, or comment fields are discarded. Default is True.

expect_name_in_output : bool, optional

If True and the model does not label any words in the sentence as the name, fall back to selecting the most likely name from all tokens, even though the model gives those tokens a different label. Note that this does guarantee the output contains a name. Default is True.

string_units : bool, optional

If True, return all IngredientAmount units as strings. If False, convert IngredientAmount units to pint.Unit objects where possible. Default is False.

imperial_units : bool, optional

If True, use imperial units instead of US customary units for pint.Unit objects for the following units: fluid ounce, cup, pint, quart, gallon. Default is False, which results in US customary units being used. This has no effect if string_units=True.

Deprecated since version v2.5.0: Use volumetric_units_system="imperial" for the same functionality.

volumetric_units_system : str, optional

Sets the units system for volumetric measurements, like “cup” or “tablespoon”. Available options are “us_customary” (default), “imperial”, “metric”, “australian”, “japanese”. This has no effect if string_units=True.

Added in version v2.5.0.

foundation_foods : bool, optional

If True, extract foundation foods from ingredient name. Foundation foods are the fundamental foods without any descriptive terms, e.g. ‘cucumber’ instead of ‘organic cucumber’. Default is False.

custom_units : dict[str, str] | None, optional

Provide custom units to aid the parser in identifying units. The custom units should be provided as a dict of plural: singular pairs. If a unit does not have a plural form, provide the singular form as the key for the pair. The units should not start with a capital letter, but may contain capital letters at other positions.

Added in version v2.6.0.

Returns:
ParserDebugInfo

ParserDebugInfo object containing the PreProcessor object, PostProcessor object and Tagger.
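As a rough illustration of how the returned object might be explored, the sketch below calls inspect_parser and lists the public attributes of the result instead of assuming particular field names. The helpers public_attributes and debug_sentence are hypothetical names introduced here, not part of the library, and the import is deferred so the module loads even when the package is not installed.

```python
from typing import Any


def public_attributes(obj: Any) -> list[str]:
    """Return the sorted names of an object's public, non-callable attributes."""
    return sorted(
        name
        for name in dir(obj)
        if not name.startswith("_") and not callable(getattr(obj, name))
    )


def debug_sentence(sentence: str) -> Any:
    """Hypothetical wrapper: run inspect_parser on a single sentence."""
    # Deferred import (an assumption of this sketch) so the helper above
    # can be used without ingredient_parser being installed.
    from ingredient_parser.parsers import inspect_parser

    # string_units=True keeps units as plain strings, as documented above.
    return inspect_parser(sentence, string_units=True)
```

Calling public_attributes(debug_sentence("2 cups flour")) should then show which intermediate objects are exposed in the installed version, which is safer than guessing attribute names.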

ingredient_parser.parsers.parse_ingredient(sentence: str, lang: str = 'en', separate_names: bool = True, discard_isolated_stop_words: bool = True, expect_name_in_output: bool = True, string_units: bool = False, imperial_units: bool = False, volumetric_units_system: str = 'us_customary', foundation_foods: bool = False, custom_units: dict[str, str] | None = None) -> ParsedIngredient

Parse an ingredient sentence to return structured data.

Parameters:
sentence : str

Ingredient sentence to parse.

lang : str

Language of sentence. Currently supported options are: en.

separate_names : bool, optional

If True and the sentence contains multiple alternative ingredients, return an IngredientText object for each ingredient name, otherwise return a single IngredientText object. Default is True.

discard_isolated_stop_words : bool, optional

If True, any isolated stop words in the name, preparation, or comment fields are discarded. Default is True.

expect_name_in_output : bool, optional

If True and the model does not label any words in the sentence as the name, fall back to selecting the most likely name from all tokens, even though the model gives those tokens a different label. Note that this does guarantee the output contains a name. Default is True.

string_units : bool, optional

If True, return all IngredientAmount units as strings. If False, convert IngredientAmount units to pint.Unit objects where possible. Default is False.

imperial_units : bool, optional

If True, use imperial units instead of US customary units for pint.Unit objects for the following units: fluid ounce, cup, pint, quart, gallon. Default is False, which results in US customary units being used. This has no effect if string_units=True.

Deprecated since version v2.5.0: Use volumetric_units_system="imperial" for the same functionality.

volumetric_units_system : str, optional

Sets the units system for volumetric measurements, like “cup” or “tablespoon”. Available options are “us_customary” (default), “imperial”, “metric”, “australian”, “japanese”. This has no effect if string_units=True.

Added in version v2.5.0.

foundation_foods : bool, optional

If True, extract foundation foods from ingredient name. Foundation foods are the fundamental foods without any descriptive terms, e.g. ‘cucumber’ instead of ‘organic cucumber’. Default is False.

custom_units : dict[str, str] | None, optional

Provide custom units to aid the parser in identifying units. The custom units should be provided as a dict of plural: singular pairs. If a unit does not have a plural form, provide the singular form as the key for the pair. The units should not start with a capital letter, but may contain capital letters at other positions.

Added in version v2.6.0.

Returns:
ParsedIngredient

ParsedIngredient object of structured data parsed from input string.
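The custom_units rules described above (plural: singular pairs, no leading capital letter) can be checked before calling the parser. The following is a minimal sketch under those stated rules only; custom_unit_problems and parse_with_custom_units are hypothetical helper names, and the example units are invented for illustration.

```python
def custom_unit_problems(custom_units: dict[str, str]) -> list[str]:
    """List violations of the documented custom_units naming rule."""
    problems = []
    for plural, singular in custom_units.items():
        for unit in (plural, singular):
            if not unit:
                problems.append("empty unit name")
            elif unit[0].isupper():
                # Capitals are allowed elsewhere in the unit, e.g. "mL".
                problems.append(f"{unit!r} starts with a capital letter")
    return problems


def parse_with_custom_units(sentence: str, custom_units: dict[str, str]):
    """Hypothetical wrapper: validate custom_units, then parse the sentence."""
    problems = custom_unit_problems(custom_units)
    if problems:
        raise ValueError("; ".join(problems))
    # Deferred import (an assumption of this sketch) so validation can run
    # even when ingredient_parser is not installed.
    from ingredient_parser.parsers import parse_ingredient

    return parse_ingredient(sentence, string_units=True, custom_units=custom_units)
```

For example, {"sachets": "sachet", "mL": "mL"} passes the check: the first pair maps a plural to its singular, and the second uses the singular form as the key because the unit has no plural.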

ingredient_parser.parsers.parse_multiple_ingredients(sentences: Iterable[str], lang: str = 'en', separate_names: bool = True, discard_isolated_stop_words: bool = True, expect_name_in_output: bool = True, string_units: bool = False, imperial_units: bool = False, volumetric_units_system: str = 'us_customary', foundation_foods: bool = False, custom_units: dict[str, str] | None = None) -> list[ParsedIngredient]

Parse multiple ingredient sentences.

This function accepts an iterable of sentences and returns a list of ParsedIngredient objects.

Parameters:
sentences : Iterable[str]

Iterable of sentences to parse.

lang : str

Language of sentence. Currently supported options are: en.

separate_names : bool, optional

If True and the sentence contains multiple alternative ingredients, return an IngredientText object for each ingredient name, otherwise return a single IngredientText object. Default is True.

discard_isolated_stop_words : bool, optional

If True, any isolated stop words in the name, preparation, or comment fields are discarded. Default is True.

expect_name_in_output : bool, optional

If True and the model does not label any words in the sentence as the name, fall back to selecting the most likely name from all tokens, even though the model gives those tokens a different label. Note that this does guarantee the output contains a name. Default is True.

string_units : bool, optional

If True, return all IngredientAmount units as strings. If False, convert IngredientAmount units to pint.Unit objects where possible. Default is False.

imperial_units : bool, optional

If True, use imperial units instead of US customary units for pint.Unit objects for the following units: fluid ounce, cup, pint, quart, gallon. Default is False, which results in US customary units being used. This has no effect if string_units=True.

Deprecated since version v2.5.0: Use volumetric_units_system="imperial" for the same functionality.

volumetric_units_system : str, optional

Sets the units system for volumetric measurements, like “cup” or “tablespoon”. Available options are “us_customary” (default), “imperial”, “metric”, “australian”, “japanese”. This has no effect if string_units=True.

Added in version v2.5.0.

foundation_foods : bool, optional

If True, extract foundation foods from ingredient name. Foundation foods are the fundamental foods without any descriptive terms, e.g. ‘cucumber’ instead of ‘organic cucumber’. Default is False.

custom_units : dict[str, str] | None, optional

Provide custom units to aid the parser in identifying units. The custom units should be provided as a dict of plural: singular pairs. If a unit does not have a plural form, provide the singular form as the key for the pair. The units should not start with a capital letter, but may contain capital letters at other positions.

Added in version v2.6.0.

Returns:
list[ParsedIngredient]

List of ParsedIngredient objects of structured data parsed from input sentences.
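Because sentences is typed as Iterable[str], a generator of cleaned lines can be passed directly. The sketch below is one possible way to feed a block of recipe text to parse_multiple_ingredients; ingredient_lines, parse_recipe_text and VOLUMETRIC_SYSTEMS are hypothetical names, and the up-front option check is an addition of this sketch, not behaviour of the library.

```python
from collections.abc import Iterator

# Options listed in the volumetric_units_system description above.
VOLUMETRIC_SYSTEMS = {"us_customary", "imperial", "metric", "australian", "japanese"}


def ingredient_lines(text: str) -> Iterator[str]:
    """Yield stripped, non-empty lines from a block of recipe text."""
    for line in text.splitlines():
        stripped = line.strip()
        if stripped:
            yield stripped


def parse_recipe_text(text: str, volumetric_units_system: str = "us_customary"):
    """Hypothetical wrapper: parse every non-empty line of text."""
    if volumetric_units_system not in VOLUMETRIC_SYSTEMS:
        raise ValueError(f"unknown units system: {volumetric_units_system!r}")
    # Deferred import (an assumption of this sketch) so the helpers above
    # work even when ingredient_parser is not installed.
    from ingredient_parser.parsers import parse_multiple_ingredients

    return parse_multiple_ingredients(
        ingredient_lines(text),
        volumetric_units_system=volumetric_units_system,
    )
```

Note that volumetric_units_system only matters while string_units is left at its default of False, since the documentation states it has no effect when units are returned as strings.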