Parsers

ingredient_parser.parsers.inspect_parser(sentence: str, lang: str = 'en', separate_names: bool = True, discard_isolated_stop_words: bool = True, expect_name_in_output: bool = True, string_units: bool = False, imperial_units: bool = False, volumetric_units_system: str = 'us_customary', foundation_foods: bool = False, custom_units: dict[str, str] | None = None) → ParserDebugInfo

Return intermediate objects generated during parsing for inspection.

Parameters:

sentence (str): Ingredient sentence to parse.
lang (str): Language of sentence. Currently supported options are: en.
separate_names (bool, optional): If True and the sentence contains multiple alternative ingredients, return an IngredientText object for each ingredient name; otherwise return a single IngredientText object. Default is True.
discard_isolated_stop_words (bool, optional): If True, any isolated stop words in the name, preparation, or comment fields are discarded. Default is True.
expect_name_in_output (bool, optional): If True and the model doesn’t label any words in the sentence as the name, fall back to selecting the most likely name from all tokens, even though the model gives it a different label. Note that this does guarantee the output contains a name. Default is True.
string_units (bool, optional): If True, return all IngredientAmount units as strings. If False, convert IngredientAmount units to pint.Unit objects where possible. Default is False.
imperial_units (bool, optional): If True, use imperial units instead of US customary units for pint.Unit objects for the following units: fluid ounce, cup, pint, quart, gallon. Default is False, which results in US customary units being used. This has no effect if string_units=True.

Deprecated since version v2.5.0: Use volumetric_units_system="imperial" for the same functionality.
volumetric_units_system (str, optional): Sets the units system for volumetric measurements, like “cup” or “tablespoon”. Available options are “us_customary” (default), “imperial”, “metric”, “australian”, “japanese”. This has no effect if string_units=True.

Added in version v2.5.0.
foundation_foods (bool, optional): If True, extract foundation foods from ingredient name. Foundation foods are the fundamental foods without any descriptive terms, e.g. ‘cucumber’ instead of ‘organic cucumber’. Default is False.
custom_units (dict[str, str] | None, optional): Provide custom units to aid the parser in identifying units. The custom units should be provided as a dict of plural: singular pairs. If a unit does not have a plural form, provide the singular form as the key for the pair. The units should not start with a capital letter, but may contain capital letters at other positions.

Added in version v2.6.0.

Returns:

ParserDebugInfo: ParserDebugInfo object containing the PreProcessor object, PostProcessor object and Tagger.
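
As a rough illustration of the call documented above, the sketch below passes a sentence to inspect_parser and prints whatever comes back. It assumes the ingredient_parser package is installed and makes no assumption about the attribute names on ParserDebugInfo. The helper volumetric_kwargs is hypothetical (it is not part of the library); it only shows how the deprecated imperial_units flag could be mapped onto volumetric_units_system, following the deprecation note.

```python
def volumetric_kwargs(imperial_units: bool = False) -> dict[str, str]:
    """Hypothetical helper: translate the deprecated imperial_units flag.

    Per the deprecation note, volumetric_units_system="imperial" replaces
    imperial_units=True; "us_customary" is the documented default.
    """
    system = "imperial" if imperial_units else "us_customary"
    return {"volumetric_units_system": system}


if __name__ == "__main__":
    try:
        from ingredient_parser.parsers import inspect_parser
    except ImportError:
        print("ingredient_parser is not installed; skipping the call")
    else:
        # The documented return value is a ParserDebugInfo object;
        # it is printed as-is without assuming any attribute names.
        debug_info = inspect_parser("2 cups milk", **volumetric_kwargs(True))
        print(debug_info)
```

Building the keyword arguments in one place keeps call sites free of the deprecated flag while preserving the old behaviour.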

ingredient_parser.parsers.parse_ingredient(sentence: str, lang: str = 'en', separate_names: bool = True, discard_isolated_stop_words: bool = True, expect_name_in_output: bool = True, string_units: bool = False, imperial_units: bool = False, volumetric_units_system: str = 'us_customary', foundation_foods: bool = False, custom_units: dict[str, str] | None = None) → ParsedIngredient

Parse an ingredient sentence to return structured data.

Parameters:

sentence (str): Ingredient sentence to parse.
lang (str): Language of sentence. Currently supported options are: en.
separate_names (bool, optional): If True and the sentence contains multiple alternative ingredients, return an IngredientText object for each ingredient name; otherwise return a single IngredientText object. Default is True.
discard_isolated_stop_words (bool, optional): If True, any isolated stop words in the name, preparation, or comment fields are discarded. Default is True.
expect_name_in_output (bool, optional): If True and the model doesn’t label any words in the sentence as the name, fall back to selecting the most likely name from all tokens, even though the model gives it a different label. Note that this does guarantee the output contains a name. Default is True.
string_units (bool, optional): If True, return all IngredientAmount units as strings. If False, convert IngredientAmount units to pint.Unit objects where possible. Default is False.
imperial_units (bool, optional): If True, use imperial units instead of US customary units for pint.Unit objects for the following units: fluid ounce, cup, pint, quart, gallon. Default is False, which results in US customary units being used. This has no effect if string_units=True.

Deprecated since version v2.5.0: Use volumetric_units_system="imperial" for the same functionality.
volumetric_units_system (str, optional): Sets the units system for volumetric measurements, like “cup” or “tablespoon”. Available options are “us_customary” (default), “imperial”, “metric”, “australian”, “japanese”. This has no effect if string_units=True.

Added in version v2.5.0.
foundation_foods (bool, optional): If True, extract foundation foods from ingredient name. Foundation foods are the fundamental foods without any descriptive terms, e.g. ‘cucumber’ instead of ‘organic cucumber’. Default is False.
custom_units (dict[str, str] | None, optional): Provide custom units to aid the parser in identifying units. The custom units should be provided as a dict of plural: singular pairs. If a unit does not have a plural form, provide the singular form as the key for the pair. The units should not start with a capital letter, but may contain capital letters at other positions.

Added in version v2.6.0.

Returns:

ParsedIngredient: ParsedIngredient object of structured data parsed from input string.
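
The custom_units rules described above (plural: singular pairs, no leading capital letter) can be checked before calling the parser. The sketch below is one possible way to do that; custom_unit_problems and the example units are hypothetical and not part of the library. "pkt" is treated here as a unit with no plural form, which is an assumption made only for illustration. The parser call assumes ingredient_parser v2.6.0 or later is installed, since custom_units was added in that version.

```python
def custom_unit_problems(custom_units: dict[str, str]) -> list[str]:
    """Hypothetical pre-check for the documented custom_units rules.

    Keys are plural forms and values are singular forms. Neither may start
    with a capital letter (capitals elsewhere in the unit are allowed).
    """
    problems = []
    for plural, singular in custom_units.items():
        # dict.fromkeys de-duplicates when plural and singular are identical.
        for unit in dict.fromkeys((plural, singular)):
            if unit[:1].isupper():
                problems.append(f"{unit!r} starts with a capital letter")
    return problems


# "sachets": "sachet" is a plural: singular pair. "pkt" is assumed to have
# no plural form, so the singular form is used as the key as well.
CUSTOM_UNITS = {"sachets": "sachet", "pkt": "pkt"}

if __name__ == "__main__":
    print(custom_unit_problems(CUSTOM_UNITS))
    try:
        from ingredient_parser.parsers import parse_ingredient
    except ImportError:
        print("ingredient_parser is not installed; skipping the call")
    else:
        # string_units=True keeps every unit as a plain string.
        parsed = parse_ingredient(
            "2 sachets dried yeast",
            string_units=True,
            custom_units=CUSTOM_UNITS,
        )
        print(parsed)
```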

ingredient_parser.parsers.parse_multiple_ingredients(sentences: Iterable[str], lang: str = 'en', separate_names: bool = True, discard_isolated_stop_words: bool = True, expect_name_in_output: bool = True, string_units: bool = False, imperial_units: bool = False, volumetric_units_system: str = 'us', foundation_foods: bool = False, custom_units: dict[str, str] | None = None) → list[ParsedIngredient]

Parse multiple ingredient sentences.

This function accepts an iterable of sentences and returns a list of ParsedIngredient objects.

Parameters:

sentences (Iterable[str]): Iterable of sentences to parse.
lang (str): Language of sentence. Currently supported options are: en.
separate_names (bool, optional): If True and the sentence contains multiple alternative ingredients, return an IngredientText object for each ingredient name; otherwise return a single IngredientText object. Default is True.
discard_isolated_stop_words (bool, optional): If True, any isolated stop words in the name, preparation, or comment fields are discarded. Default is True.
expect_name_in_output (bool, optional): If True and the model doesn’t label any words in the sentence as the name, fall back to selecting the most likely name from all tokens, even though the model gives it a different label. Note that this does guarantee the output contains a name. Default is True.
string_units (bool, optional): If True, return all IngredientAmount units as strings. If False, convert IngredientAmount units to pint.Unit objects where possible. Default is False.
imperial_units (bool, optional): If True, use imperial units instead of US customary units for pint.Unit objects for the following units: fluid ounce, cup, pint, quart, gallon. Default is False, which results in US customary units being used. This has no effect if string_units=True.

Deprecated since version v2.5.0: Use volumetric_units_system="imperial" for the same functionality.
volumetric_units_system (str, optional): Sets the units system for volumetric measurements, like “cup” or “tablespoon”. Available options are “us_customary” (default), “imperial”, “metric”, “australian”, “japanese”. This has no effect if string_units=True.

Added in version v2.5.0.
foundation_foods (bool, optional): If True, extract foundation foods from ingredient name. Foundation foods are the fundamental foods without any descriptive terms, e.g. ‘cucumber’ instead of ‘organic cucumber’. Default is False.
custom_units (dict[str, str] | None, optional): Provide custom units to aid the parser in identifying units. The custom units should be provided as a dict of plural: singular pairs. If a unit does not have a plural form, provide the singular form as the key for the pair. The units should not start with a capital letter, but may contain capital letters at other positions.

Added in version v2.6.0.

Returns:

list[ParsedIngredient]: List of ParsedIngredient objects of structured data parsed from input sentences.
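
Because sentences is typed as Iterable[str], any iterable of strings should be accepted. The sketch below prepares such an iterable from a block of recipe text and passes it to parse_multiple_ingredients. The non_blank helper and the sample text are hypothetical and only illustrate one way to clean the input; the parser call assumes the ingredient_parser package is installed.

```python
from collections.abc import Iterable


def non_blank(lines: Iterable[str]) -> list[str]:
    """Hypothetical helper: strip whitespace and drop empty lines."""
    return [line.strip() for line in lines if line.strip()]


RECIPE_TEXT = """
2 cups plain flour
1 tsp baking powder

3 large eggs, beaten
"""

sentences = non_blank(RECIPE_TEXT.splitlines())

if __name__ == "__main__":
    try:
        from ingredient_parser.parsers import parse_multiple_ingredients
    except ImportError:
        print("ingredient_parser is not installed; skipping the call")
    else:
        # The documented return value is a list of ParsedIngredient objects.
        for parsed in parse_multiple_ingredients(sentences):
            print(parsed)
```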
