Dataclasses
- class ingredient_parser.dataclasses.UnitSystem(*values)
Enum defining unit systems.
- METRIC = 'metric'
- US_CUSTOMARY = 'us_customary'
- IMPERIAL = 'imperial'
- AUSTRALIAN = 'australian'
- JAPANESE = 'japanese'
- OTHER = 'other'
- NONE = 'none'
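As a rough illustration of how the string values listed above map to members, the sketch below defines a stand-in enum with the same members. `UnitSystemSketch` is a hypothetical name used only for this example; it is not the library class.

```python
from enum import Enum


class UnitSystemSketch(Enum):
    """Hypothetical stand-in that mirrors the documented members."""

    METRIC = "metric"
    US_CUSTOMARY = "us_customary"
    IMPERIAL = "imperial"
    AUSTRALIAN = "australian"
    JAPANESE = "japanese"
    OTHER = "other"
    NONE = "none"


# Look up a member from its string value, then read the value back.
system = UnitSystemSketch("us_customary")
print(system.name, system.value)
```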
- class ingredient_parser.dataclasses.CompositeIngredientAmount(amounts: list[IngredientAmount], join: str, subtractive: bool)
Dataclass for a composite ingredient amount.
This is an amount comprising more than one IngredientAmount object e.g. “1 lb 2 oz” or “1 cup plus 1 tablespoon”.
- Attributes:
- amounts
list[IngredientAmount] List of IngredientAmount objects that make up the composite amount. The order in this list is the order they appear in the sentence.
- join
str String of text that joins the amounts, e.g. “plus”.
- subtractive
bool If True, the amounts combine subtractively. If False, the amounts combine additively.
- text
str Composite amount as a string, automatically generated from the amounts and join attributes.
- confidence
float Confidence of parsed ingredient amount, between 0 and 1. This is the average confidence of all tokens that contribute to this object.
- starting_index
int Index of token in sentence that starts this amount.
- unit_system
UnitSystem Unit system (e.g. metric) that the unit of the amount belongs to.
Methods
- combined()
Return the combined amount in a single unit for the composite amount.
- convert_to(unit, density)
Convert units of the combined CompositeIngredientAmount object to given unit.
- combined() -> Quantity
Return the combined amount in a single unit for the composite amount.
The amounts that comprise the composite amount are combined according to whether the composite amount is subtractive or not. The combined amount is returned as a pint.Quantity object.
- Returns:
pint.Quantity Combined amount.
- Raises:
TypeError Raised if any of the amounts in the object does not comprise a float quantity and a pint.Unit unit. In these cases, the amounts cannot be combined.
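The additive and subtractive cases described for combined() can be sketched with plain Fraction values that are already expressed in one unit. The `combine` helper is hypothetical and does not use pint; it also assumes that "subtractive" means each later amount is subtracted from the first, which the text above does not spell out.

```python
import operator
from fractions import Fraction
from functools import reduce


def combine(quantities, subtractive):
    """Combine quantities that are already in the same unit.

    Assumption: in the subtractive case, every later quantity is
    subtracted from the first one.
    """
    op = operator.sub if subtractive else operator.add
    return reduce(op, quantities)


# "1 lb 2 oz" expressed in ounces: 16 oz + 2 oz
additive_total = combine([Fraction(16), Fraction(2)], subtractive=False)

# A made-up "1 cup minus 1 tablespoon" in US tablespoons: 16 tbsp - 1 tbsp
subtractive_total = combine([Fraction(16), Fraction(1)], subtractive=True)

print(additive_total, subtractive_total)
```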
- convert_to(unit: str, density: Quantity = <Quantity(1000.0, 'kilogram / meter ** 3')>) -> Quantity
Convert units of the combined CompositeIngredientAmount object to given unit.
Conversion is only possible if none of the quantity, quantity_max and unit are strings.
Conversion between mass and volume is supported using the density parameter, but otherwise a DimensionalityError is raised if attempting to convert units of different dimensionality.
Warning
When a conversion between mass <-> volume is performed, the quantities will be converted to floats.
- Parameters:
- unit
str Unit to convert to.
- density
pint.Quantity, optional Density used for conversion between volume and mass. Default is the density of water.
- Returns:
pint.Quantity Combined amount converted to given units.
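The role of the density parameter in a mass <-> volume conversion can be illustrated with plain floats. This is a minimal sketch of the arithmetic only: the helper names are hypothetical, it does not call the library or pint, and the 920 kg/m³ figure for oil is an illustrative assumption rather than a value from this documentation.

```python
WATER_DENSITY_KG_PER_M3 = 1000.0  # the documented default density


def volume_ml_to_mass_g(volume_ml, density_kg_per_m3=WATER_DENSITY_KG_PER_M3):
    """Convert millilitres to grams (1 kg/m^3 equals 0.001 g/ml)."""
    density_g_per_ml = density_kg_per_m3 / 1000.0
    return volume_ml * density_g_per_ml


def mass_g_to_volume_ml(mass_g, density_kg_per_m3=WATER_DENSITY_KG_PER_M3):
    """Convert grams to millilitres using the same density."""
    density_g_per_ml = density_kg_per_m3 / 1000.0
    return mass_g / density_g_per_ml


water_g = volume_ml_to_mass_g(250.0)
# Illustrative density for a cooking oil (assumed value).
oil_g = volume_ml_to_mass_g(250.0, density_kg_per_m3=920.0)
print(water_g, oil_g)
```

The results are floats, which matches the warning above that quantities are converted to floats when a mass <-> volume conversion is performed.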
- class ingredient_parser.dataclasses.FoundationFood(text: str, confidence: float, fdc_id: int, category: str, data_type: str, name_index: int)
Dataclass for the attributes of an entry in the Food Data Central database.
- Attributes:
- text
str Description of the FDC database entry.
- confidence
float Confidence of the match, between 0 and 1.
- fdc_id
int ID of the FDC database entry.
- category
str Category of FDC database entry.
- data_type
str Food Data Central data set the entry belongs to.
- url
str URL for FDC database entry.
- name_index
int Index of associated name in ParsedIngredient.name list.
- class ingredient_parser.dataclasses.IngredientAmount(quantity: Fraction | str, quantity_max: Fraction | str, unit: str | Unit, text: str, confidence: float, starting_index: int, APPROXIMATE: bool = False, SINGULAR: bool = False, RANGE: bool = False, MULTIPLIER: bool = False, PREPARED_INGREDIENT: bool = False)
Dataclass for holding a parsed ingredient amount.
On instantiation, the unit is made plural if necessary.
- Attributes:
- quantity
Fraction|str Parsed ingredient quantity, as a Fraction where possible, otherwise a string. If the amount is a range, this is the lower limit of the range.
- quantity_max
Fraction|str If the amount is a range, this is the upper limit of the range. Otherwise, this is the same as the quantity field. This is set automatically depending on the type of quantity.
- unit
str|pint.Unit Unit of parsed ingredient quantity. If the unit is recognised in the pint unit registry, a pint.Unit object is used.
- text
str String describing the amount e.g. “1 cup”, “8 oz”
- confidence
float Confidence of parsed ingredient amount, between 0 and 1. This is the average confidence of all tokens that contribute to this object.
- starting_index
int Index of token in sentence that starts this amount
- unit_system
UnitSystem Unit system (e.g. metric) that the unit of the amount belongs to.
- APPROXIMATE
bool, optional When True, indicates that the amount is approximate. Default is False.
- SINGULAR
bool, optional When True, indicates that the amount refers to a singular item of the ingredient. Default is False.
- RANGE
bool, optional When True, indicates the amount is a range e.g. 1-2. Default is False.
- MULTIPLIER
bool, optional When True, indicates the amount is a multiplier e.g. 1x, 2x. Default is False.
- PREPARED_INGREDIENT
bool, optional When True, indicates the amount applies to the prepared ingredient. When False, indicates the amount applies to the ingredient before preparation. Default is False.
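The relationship between quantity, quantity_max and the RANGE flag described above can be sketched from the consumer's side. The `describe_amount` helper is hypothetical and takes the three values directly rather than an IngredientAmount object, so it runs without the library.

```python
from fractions import Fraction


def describe_amount(quantity, quantity_max, is_range):
    """Summarise an amount from its quantity fields (illustrative only)."""
    # Quantities that could not be parsed to a Fraction stay as strings.
    if isinstance(quantity, str) or isinstance(quantity_max, str):
        return f"unparsed quantity: {quantity}"
    if is_range:
        return f"between {float(quantity)} and {float(quantity_max)}"
    # For a non-range amount, quantity_max equals quantity.
    return f"exactly {float(quantity)}"


print(describe_amount(Fraction(1), Fraction(2), is_range=True))
print(describe_amount(Fraction(1, 2), Fraction(1, 2), is_range=False))
print(describe_amount("a few", "a few", is_range=False))
```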
Methods
- convert_to(unit, density)
Convert units of IngredientAmount object to given unit.
- convert_to(unit: str, density: Quantity = <Quantity(1000.0, 'kilogram / meter ** 3')>)
Convert units of IngredientAmount object to given unit.
Conversion is only possible if none of the quantity, quantity_max and unit are strings.
Conversion between mass and volume is supported using the density parameter, but otherwise a DimensionalityError is raised if attempting to convert units of different dimensionality.
Warning
When a conversion between mass <-> volume is performed, the quantities will be converted to floats.
- Parameters:
- unit
str Unit to convert to.
- density
pint.Quantity, optional Density used for conversion between volume and mass. Default is the density of water.
- Returns:
Self Copy of IngredientAmount object with units converted to given unit.
- Raises:
TypeError Raised if unit, quantity or quantity_max is a str.
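The TypeError condition above amounts to a check that none of the three fields is a plain string. The sketch below shows one way such a guard could look; `ensure_convertible` and `UnitStub` are hypothetical names for this example and are not part of the library, where the unit would be a pint.Unit.

```python
from fractions import Fraction


class UnitStub:
    """Hypothetical placeholder for a non-string unit object."""

    def __init__(self, name):
        self.name = name


def ensure_convertible(quantity, quantity_max, unit):
    """Raise TypeError if any field is a str, mirroring the documented rule."""
    fields = (("quantity", quantity), ("quantity_max", quantity_max), ("unit", unit))
    for field_name, value in fields:
        if isinstance(value, str):
            raise TypeError(f"cannot convert: {field_name} is a str ({value!r})")


# Passes silently: no field is a string.
ensure_convertible(Fraction(1), Fraction(1), UnitStub("cup"))

# A string unit (one not recognised as a unit object) is rejected.
try:
    ensure_convertible(Fraction(1), Fraction(1), "handful")
    rejected = False
except TypeError:
    rejected = True
print(rejected)
```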
- class ingredient_parser.dataclasses.IngredientText(text: str, confidence: float, starting_index: int)
Dataclass for holding a parsed ingredient string.
- Attributes:
- text
str Parsed text from ingredient. This comprises all tokens with the same label.
- confidence
float Confidence of parsed ingredient text, between 0 and 1. This is the average confidence of all tokens that contribute to this object.
- starting_index
int Index of token in sentence that starts this text
- class ingredient_parser.dataclasses.LabelledToken(index: int, text: str, pos_tag: str, label: str, score: float, plural: bool)
Dataclass representing a labelled token from an ingredient sentence.
- class ingredient_parser.dataclasses.ParsedIngredient(name: list[IngredientText], size: IngredientText | None, amount: list[IngredientAmount | CompositeIngredientAmount], preparation: IngredientText | None, comment: IngredientText | None, purpose: IngredientText | None, foundation_foods: list[FoundationFood], sentence: str)
Dataclass for holding the parsed values for an input sentence.
- Attributes:
- name
list[IngredientText] List of IngredientText objects, each representing an ingredient name parsed from input sentence. If no ingredient names are found, this is an empty list.
- size
IngredientText|None Size modifier of ingredients, such as small or large. If no size modifier, this is None.
- amount
list[IngredientAmount|CompositeIngredientAmount] List of IngredientAmount or CompositeIngredientAmount objects, each representing a matching quantity and unit pair parsed from the sentence. If no ingredient amounts are found, this is an empty list.
- preparation
IngredientText|None Ingredient preparation instructions parsed from sentence. If no ingredient preparation instruction was found, this is None.
- comment
IngredientText|None Ingredient comment parsed from input sentence. If no ingredient comment was found, this is None.
- purpose
IngredientText|None The purpose of the ingredient parsed from the sentence. If no purpose was found, this is None.
- foundation_foods
list[FoundationFood] List of foundation foods from the parsed sentence.
- sentence
str Normalised input sentence
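Because name and amount are lists that may be empty and several fields may be None, code that reads a parsed result usually needs to handle both cases. The sketch below uses `types.SimpleNamespace` objects as hypothetical stand-ins for ParsedIngredient and its fields, so it runs without the library; only the attribute names are taken from the list above.

```python
from types import SimpleNamespace


def summarise(parsed):
    """Collect the text of selected fields, tolerating empty lists and None."""
    # name and amount are lists; either may be empty.
    names = [name.text for name in parsed.name]
    # Both amount types are documented as having a text attribute.
    amounts = [amount.text for amount in parsed.amount]
    # preparation is either an object with .text or None.
    preparation = parsed.preparation.text if parsed.preparation is not None else None
    return {"names": names, "amounts": amounts, "preparation": preparation}


# Stand-in object built by hand for illustration.
parsed = SimpleNamespace(
    name=[SimpleNamespace(text="flour")],
    amount=[SimpleNamespace(text="2 cups")],
    preparation=None,
)
summary = summarise(parsed)
print(summary)
```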
- class ingredient_parser.dataclasses.ParserDebugInfo(sentence: str, PreProcessor: Any, PostProcessor: Any, tagger: NumpyCRFInference)
Dataclass for holding intermediate objects generated during parsing.
- Attributes:
- sentence
str Input ingredient sentence.
- PreProcessor
PreProcessor PreProcessor object created using input sentence.
- PostProcessor
PostProcessor PostProcessor object created using tokens, labels and scores from input sentence.
- tagger
NumpyCRFInference CRF model tagger object.
- class ingredient_parser.dataclasses.Token(index: int, text: str, feat_text: str, pos_tag: str, features: TokenFeatures)
Dataclass representing a token from an ingredient sentence.
- Attributes:
- index
int Index of the token in the sentence.
- text
str Token text.
- feat_text
str Token text used for feature generation.
- pos_tag
str Part of speech tag for token.
- features
TokenFeatures Common features for token.
- class ingredient_parser.dataclasses.TokenFeatures(stem: str, shape: str, is_capitalised: bool, is_unit: bool, is_punc: bool, is_ambiguous_unit: bool)
Dataclass for common token features.
- Attributes:
- stem
str Stem of the token.
- shape
str Shape of the token, represented by X, x, d characters.
- is_capitalised
bool True if the token starts with a capital letter, else False.
- is_unit
bool True if the token is in the list of units, else False.
- is_punc
bool True if the token is a punctuation character, else False.
- is_ambiguous_unit
bool True if the token is in the list of ambiguous units, else False.
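The shape and is_capitalised features can be illustrated with a small sketch. The mapping below (uppercase to X, lowercase to x, digit to d, everything else unchanged) is an assumption based on the description above; the library's actual rules may differ, and both helper names are hypothetical.

```python
def token_shape(token):
    """Map each character to X, x or d; other characters are kept as-is."""
    shape = []
    for char in token:
        if char.isdigit():
            shape.append("d")
        elif char.isupper():
            shape.append("X")
        elif char.islower():
            shape.append("x")
        else:
            shape.append(char)
    return "".join(shape)


def is_capitalised(token):
    """True if the token starts with a capital letter."""
    return token[:1].isupper()


print(token_shape("Cup"), token_shape("250g"), token_shape("1-2"))
print(is_capitalised("Cup"), is_capitalised("cup"))
```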