Data Sources
The LCA Food Glossary integrates 168,626 terms from 10 authoritative sources covering food classification, Life Cycle Assessment, packaging, and agricultural domains.
Overview Table
| Source | Terms | Domain | Type | Status |
|---|---|---|---|---|
| AGROvoc | 41,447 | Agriculture | Thesaurus | Live |
| Hestia | 36,044 | Food LCA | API | Live |
| Ecoinvent | 33,784 | LCA Inventory | Database | Static |
| FoodEx2 | 31,601 | Food Classification | Standard | Static |
| LanguaL | 12,836 | Food Characteristics | Vocabulary | Static |
| Sentier | 7,731 | LCA Framework | RDF/Linked Data | Static |
| CPC | 4,583 | Commodity Codes | Classification | Static |
| UNECE Rec 21 | 406 | Packaging Codes | Standard | Static |
| GS1 Packaging | 154 | Packaging Vocabulary | Standard | Static |
| Eaternity | 40 | EOS Schema | Schema | Static |
| Total | 168,626 | Multiple | Mixed | - |
Source Details
AGROvoc
Terms: 41,447
Provider: Food and Agriculture Organization (FAO)
Domain: Agriculture, forestry, fisheries, food, environment
Type: Multilingual thesaurus
Coverage: Broadest agricultural vocabulary
AGROvoc is a comprehensive agricultural thesaurus covering all areas of interest to FAO, including food, nutrition, agriculture, fisheries, forestry, and the environment. It provides standardized terminology in multiple languages.
Key Features:
- Multilingual support (20+ languages)
- Hierarchical relationships between concepts
- Broad coverage of agricultural domains
- Used by agricultural libraries and information systems worldwide
Use Cases:
- Agricultural research and documentation
- Food system terminology standardization
- Cross-language information retrieval
- Agricultural policy and planning
Data Format: RDF/SKOS thesaurus
Update Frequency: Regular updates by FAO
License: Open data (CC-BY-IGO 3.0)
Hestia
Terms: 36,044
Provider: Hestia Project (api.hestia.earth)
Domain: Food Life Cycle Assessment
Type: Live API integration
Coverage: Specialized food LCA terms across 6 main categories
Hestia is the largest food LCA database with real-time API integration, providing comprehensive terminology for environmental impact assessment of food systems.
Main Categories:
- Practices - Agricultural and production practices
- Inputs & Products - Ingredients, products, and materials
- Measurements - Quantitative measurements and metrics
- Methods & Models - LCA methodologies and calculation models
- Emissions & Resource Use - Environmental impacts and resource consumption
- Infrastructure & Equipment - Production facilities and machinery
Key Features:
- Live API integration (36,044 terms)
- Hierarchical category structure
- Detailed descriptions and units
- Regular updates from global food LCA studies
Use Cases:
- Environmental impact calculations
- Food sustainability assessments
- Supply chain carbon footprinting
- Agricultural practice modeling
Data Format: JSON-LD via REST API
Update Frequency: Continuous (live API)
API Endpoint: https://api.hestia.earth
License: Open for research use
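As a rough illustration of how a JSON-LD term could be turned into a glossary record, the sketch below parses a hand-written sample offline. The sample payload, its keys (`@id`, `name`, `termType`, `units`), the `hestia:` ID prefix, and the `to_glossary_record` helper are all assumptions made for this example; the actual API response may be shaped differently.

```python
import json

# Hand-written sample for illustration only; the real payload may use other keys.
SAMPLE = """
{
  "@type": "Term",
  "@id": "exampleTerm",
  "name": "Example term",
  "termType": "emission",
  "units": "kg"
}
"""


def to_glossary_record(payload: str) -> dict:
    """Flatten one JSON-LD-style term into a glossary record (hypothetical shape)."""
    node = json.loads(payload)
    return {
        "id": f"hestia:{node['@id']}",      # assumed ID prefix convention
        "name": node["name"],
        "source": "Hestia",
        "category": node.get("termType", "unknown"),
        "units": node.get("units"),          # optional: not every term has a unit
    }


record = to_glossary_record(SAMPLE)
print(record["id"], record["category"])
```

Keeping the parsing step separate from the network fetch makes it easy to test against saved responses.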
Ecoinvent
Terms: 33,784
Provider: ecoinvent Association
Domain: Life Cycle Inventory
Type: LCA database
Coverage: Industrial processes, materials, energy, transportation, waste
Ecoinvent is the world's leading Life Cycle Inventory database, providing consistent and transparent data for environmental assessments.
Coverage Areas:
- Energy supply
- Materials production
- Agriculture and food
- Chemicals and plastics
- Transportation and logistics
- Waste treatment
- Construction and infrastructure
Key Features:
- Comprehensive LCI datasets
- Impact category classification
- Process-level granularity
- Global geographical coverage
- Uncertainty quantification
Use Cases:
- Product life cycle assessments
- Environmental footprinting
- Supply chain impact analysis
- Comparative product assessments
Data Format: Activity names and process identifiers
Update Frequency: Annual major releases
License: Commercial license required for full database
FoodEx2
Terms: 31,601
Provider: European Food Safety Authority (EFSA)
Domain: Food classification
Type: Hierarchical classification standard
Coverage: Complete food catalog with faceted classification
FoodEx2 is EFSA's standardized food classification and description system, designed for data exchange in food safety and nutrition.
Structure:
- Master Hierarchy - Core food groups and categories
- Report Hierarchy - Aggregated categories for reporting
- Facets - Additional descriptors (processing, production method, packaging)
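The combination of a base term with facet descriptors can be sketched as a small parser. The sketch assumes the commonly used FoodEx2 string layout, `BASE#GROUP.DESCRIPTOR$GROUP.DESCRIPTOR`; the codes in the example are made up for illustration, and `FoodExCode` and `parse_foodex2` are hypothetical names, not part of any EFSA tooling.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class FoodExCode:
    base_term: str
    # Each facet is a (facet group, descriptor code) pair.
    facets: List[Tuple[str, str]] = field(default_factory=list)


def parse_foodex2(code: str) -> FoodExCode:
    """Split 'BASE#F01.XXXXX$F02.YYYYY' into a base term and its facets."""
    base, _, facet_part = code.partition("#")
    facets = []
    if facet_part:
        for chunk in facet_part.split("$"):
            group, _, descriptor = chunk.partition(".")
            facets.append((group, descriptor))
    return FoodExCode(base, facets)


# Illustrative codes only; they are not taken from the FoodEx2 catalogue.
parsed = parse_foodex2("A0B9Z#F28.A07JS$F10.A0F2Q")
print(parsed.base_term, parsed.facets)
```

A code without a `#` part parses to a base term with an empty facet list.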
Key Features:
- Comprehensive European food catalog
- Hierarchical code system
- Faceted classification (multi-dimensional)
- Used by EU member states for food safety reporting
Main Food Groups:
- Grains and grain-based products
- Vegetables and vegetable products
- Fruits and fruit products
- Meat and meat products
- Fish and seafood
- Dairy products
- Beverages
- Composite foods
Use Cases:
- Food safety data exchange
- Nutritional databases
- Food consumption surveys
- Dietary exposure assessments
Data Format: Excel with hierarchical codes
Update Frequency: Periodic revisions by EFSA
License: Public domain
LanguaL
Terms: 12,836
Provider: LanguaL Consortium
Domain: Food characteristics
Type: Thesaurus
Coverage: Systematic food description using 14 facets
LanguaL (Langua aLimentaria, or "language of food") is a systematic method for describing foods with standardized terminology organized into facets.
Facet Categories (14 facets):
- Product Type
- Food Source
- Part of Plant or Animal
- Physical State
- Extent of Heat Treatment
- Cooking Method
- Treatment Applied
- Preservation Method
- Packing Medium
- Container or Wrapping
- Food Contact Surface
- Consumer Group/Dietary Use
- Geographic Places and Regions
- Additional Product Information
Key Features:
- Multi-faceted food description
- International food terminology
- Supports detailed food characterization
- Used in food composition databases
Use Cases:
- Food composition databases
- Nutritional analysis
- Food labeling and regulation
- Recipe and menu analysis
Data Format: Structured vocabulary
Update Frequency: Periodic updates
License: Open use
Sentier
Terms: 7,731
Provider: Sentier.dev (Open Source)
Domain: Life Cycle Assessment
Type: RDF/Linked Data framework
Coverage: Open source LCA terminology and relationships
Sentier is an open source framework for LCA data management using linked data and semantic web technologies.
Key Features:
- RDF/Turtle format native support
- Semantic web integration
- Open source and transparent
- Modern LCA data architecture
Use Cases:
- Research and academic LCA studies
- Open source LCA tools
- Linked data applications
- LCA methodology development
Data Format: RDF/Turtle (TTL)
Update Frequency: Community-driven updates
License: Open source
CPC
Terms: 4,583
Provider: United Nations Statistics Division
Domain: Commodity classification
Type: Classification standard
Coverage: Goods and services across all economic sectors
CPC (Central Product Classification) is the UN's comprehensive classification of goods and services used for economic statistics.
Coverage Areas:
- Agricultural products
- Food products
- Industrial goods
- Energy products
- Services
Key Features:
- Internationally standardized codes
- Hierarchical structure
- Used in trade statistics
- Aligned with other UN classifications
Use Cases:
- International trade analysis
- Economic statistics
- Product categorization
- Supply chain classification
Data Format: Hierarchical codes and descriptions
Update Frequency: Periodic revisions by UNSD
License: Public domain
UNECE Rec 21
Terms: 406
Provider: United Nations Economic Commission for Europe
Domain: Packaging codes
Type: Recommendation standard
Coverage: Packaging materials and container types
UNECE Recommendation 21 provides standardized codes for package types and packaging materials, recommended by GS1 for global supply chains.
Coverage:
- Packaging materials (paper, plastic, metal, glass, wood)
- Container types (boxes, bottles, cans, pallets)
- Package configurations
Key Features:
- Internationally recognized codes
- Endorsed by GS1
- Used in logistics and supply chain
- Material-based hierarchies
Use Cases:
- Supply chain documentation
- Packaging sustainability analysis
- Logistics and shipping
- Waste management and recycling
Data Format: Code lists with descriptions
Update Frequency: Stable standard with occasional updates
License: Public domain
GS1 Packaging
Terms: 154
Provider: GS1 Organization
Domain: Packaging vocabulary
Type: Global standard
Coverage: Packaging materials, features, functions, shapes, recycling
GS1 Packaging provides a global vocabulary for describing packaging characteristics in supply chains.
Categories:
- Materials - Packaging material types
- Features - Functional features (resealable, tamper-evident)
- Functions - Packaging purposes (containment, protection)
- Shapes - Physical shapes and forms
- Recycling - Recyclability and sustainability attributes
Key Features:
- Global supply chain standard
- Integration with GS1 barcodes
- Sustainability focus
- E-commerce compatible
Use Cases:
- Product packaging descriptions
- E-commerce listings
- Sustainability reporting
- Circular economy initiatives
Data Format: Structured vocabulary
Update Frequency: Regular updates by GS1
License: GS1 terms of use
Eaternity
Terms: 40 (24 schema classes + 16 properties)
Provider: Eaternity
Domain: Environmental Operating System schema
Type: Data schema
Coverage: EOS API data model for food sustainability
Eaternity schema terms represent the data model of the Environmental Operating System (EOS), Eaternity's platform for food sustainability assessment.
Schema Classes (24 terms):
- FlowNode - Material flows in the system
- ActivityNode - Production and processing activities
- FoodProductFlowNode - Specific food product flows
- ImpactAssessment - Environmental impact calculations
- And 20 more specialized classes
Property Terms (16 terms):
- Product Name - Product identification
- Quantity - Amount measurements
- Origin Country - Geographic location
- Processing Method - Production processes
- Nutritional Values - Nutrient data
- And 11 more properties
Key Features:
- Direct mapping to EOS API
- Semantic matching for CSV import
- Property-level granularity
- Python field name mappings
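The CSV import matching and Python field name mappings listed above could look roughly like the sketch below. It uses plain string similarity from the standard library as a stand-in for semantic matching, and the snake_case field names, the `to_field_name` and `match_header` helpers, and the 0.6 cutoff are assumptions for illustration rather than the EOS implementation.

```python
import difflib
import re

# Property labels named in this section; the snake_case field names are assumed.
PROPERTIES = [
    "Product Name",
    "Quantity",
    "Origin Country",
    "Processing Method",
    "Nutritional Values",
]


def to_field_name(label: str) -> str:
    """Lowercase a label and collapse non-alphanumeric runs into underscores."""
    return re.sub(r"[^a-z0-9]+", "_", label.lower()).strip("_")


FIELD_NAMES = [to_field_name(p) for p in PROPERTIES]


def match_header(header: str, cutoff: float = 0.6):
    """Return the closest known field name for a CSV header, or None."""
    key = to_field_name(header)
    hits = difflib.get_close_matches(key, FIELD_NAMES, n=1, cutoff=cutoff)
    return hits[0] if hits else None


headers = ["product name", "Origin-Country", "Quantitiy", "notes"]
mapping = {h: match_header(h) for h in headers}
print(mapping)
```

String similarity tolerates typos and punctuation differences but not reordered or abbreviated headers, which is where an embedding-based matcher would be needed.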
Use Cases:
- EOS API integration
- User data import mapping
- Food sustainability calculations
- Schema validation
Data Format: LinkML YAML with JSON-LD context
Update Frequency: Aligned with EOS releases
License: Proprietary
Learn more about Eaternity Schema →
Integration Strategy
Hierarchical Organization
Each source maintains its native hierarchical structure:
- FoodEx2: masterHierarchyCode, reportHierarchyCode, facet groups
- Hestia: 6 main categories with subcategories
- AGROvoc: SKOS concept schemes and broader/narrower relationships
- Ecoinvent: Impact category classifications
- LanguaL: 14 facet groups
- CPC: Hierarchical section/division/group/class structure
- GS1/UNECE: Material and type-based hierarchies
- Sentier: RDF semantic relationships
- Eaternity: Schema class inheritance
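Whatever the native structure, most of these hierarchies can be walked through parent (broader) references. The sketch below assumes each term stores at most one parent ID in a plain dictionary; the `PARENT_OF` data and the `ancestors` helper are hypothetical and only illustrate the traversal.

```python
from typing import Dict, List, Optional

# Hypothetical parent references: term ID -> parent term ID (None at the root).
PARENT_OF: Dict[str, Optional[str]] = {
    "wheat_flour": "flour",
    "flour": "cereal_products",
    "cereal_products": None,
}


def ancestors(term_id: str, parent_of: Dict[str, Optional[str]]) -> List[str]:
    """Return the chain of broader terms, nearest first, stopping on cycles."""
    chain: List[str] = []
    seen = {term_id}
    current = parent_of.get(term_id)
    while current is not None and current not in seen:
        chain.append(current)
        seen.add(current)
        current = parent_of.get(current)
    return chain


print(ancestors("wheat_flour", PARENT_OF))
```

The `seen` set guards against accidental cycles in imported data; sources that allow several broader terms per concept would need a list of parents instead of a single ID.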
Cross-Source Mapping
The glossary includes semantic relationships between sources:
- Exact Mappings - Terms with identical meaning across sources
- Broader/Narrower - Hierarchical relationships between sources
- Related - Conceptually related terms
- AI-Generated - Semantic similarity from embeddings
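A minimal sketch of how these relationship types might be represented, and how embedding similarity could propose AI-generated mappings, follows. The `Mapping` record, the relation labels, the 0.8 threshold, and the toy three-dimensional vectors are all assumptions; the glossary's actual embedding model and storage format are not described here.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass(frozen=True)
class Mapping:
    source_id: str
    target_id: str
    relation: str       # "exact", "broader", "narrower", "related" or "ai_generated"
    score: float = 1.0  # similarity score; 1.0 for curated mappings


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def suggest(source_id: str, source_vec: Sequence[float],
            candidates: Dict[str, Sequence[float]],
            threshold: float = 0.8) -> List[Mapping]:
    """Propose ai_generated mappings for candidates above the threshold, best first."""
    found = []
    for target_id, vec in candidates.items():
        score = cosine(source_vec, vec)
        if score >= threshold:
            found.append(Mapping(source_id, target_id, "ai_generated", round(score, 3)))
    return sorted(found, key=lambda m: m.score, reverse=True)


# Toy embeddings for illustration only.
CANDIDATES = {"b:1": [1.0, 0.0, 1.0], "b:2": [0.0, 1.0, 0.0], "b:3": [1.0, 0.2, 0.9]}
suggestions = suggest("a:1", [1.0, 0.0, 1.0], CANDIDATES)
print([(m.target_id, m.score) for m in suggestions])
```

Keeping the score on the record lets downstream users filter AI-generated suggestions more strictly than curated mappings.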
Update Strategy
| Source Type | Update Method | Frequency |
|---|---|---|
| Live API (Hestia) | Automated fetch | Daily/weekly |
| Static Files | Manual update | Quarterly |
| RDF/Linked Data | Git submodule | On release |
| Schema (Eaternity) | Synchronized | With EOS releases |
Data Quality
Validation
All sources are validated against the LinkML schema:
- Required fields - ID, name, source, category
- Data types - String, numeric, array consistency
- Relationships - Valid parent term references
- Uniqueness - No duplicate term IDs
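The four checks above can be approximated with a small hand-rolled validator. This is not LinkML validation itself, only a sketch of the same rules; the `parent` key and the `validate` helper are assumptions about how a term record might look.

```python
from typing import Dict, List

REQUIRED_FIELDS = ("id", "name", "source", "category")


def validate(terms: List[Dict]) -> List[str]:
    """Return human-readable errors for missing fields, duplicates and bad parents."""
    errors: List[str] = []
    known_ids = {t.get("id") for t in terms}
    seen = set()
    for index, term in enumerate(terms):
        # Required fields must be present and be non-empty strings.
        for name in REQUIRED_FIELDS:
            value = term.get(name)
            if not isinstance(value, str) or not value:
                errors.append(f"term {index}: missing or invalid '{name}'")
        # Term IDs must be unique.
        term_id = term.get("id")
        if term_id in seen:
            errors.append(f"term {index}: duplicate id '{term_id}'")
        seen.add(term_id)
        # Parent references must point at an existing term.
        parent = term.get("parent")
        if parent is not None and parent not in known_ids:
            errors.append(f"term {index}: unknown parent '{parent}'")
    return errors


SAMPLE_TERMS = [
    {"id": "a:1", "name": "Wheat", "source": "Demo", "category": "crop"},
    {"id": "a:1", "name": "Wheat flour", "source": "Demo", "category": "crop", "parent": "a:9"},
    {"id": "a:2", "name": "", "source": "Demo", "category": "crop"},
]
errors = validate(SAMPLE_TERMS)
print("\n".join(errors))
```

Collecting every error instead of stopping at the first makes it practical to clean a whole source file in one pass.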
Enrichment
Terms are enriched during processing:
- Descriptions - Standardized format and length
- Categories - Automatic extraction from hierarchies
- Metadata - Source URLs, update timestamps
- Search optimization - Searchable flags and priority
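The enrichment steps could be sketched as a single function applied to each term. The 200-character description limit, the field names (`source_url`, `updated_at`, `searchable`), and the choice of the top hierarchy level as the category are assumptions made for this example.

```python
from datetime import datetime, timezone
from typing import Dict, List

MAX_DESCRIPTION = 200  # assumed length limit


def enrich(term: Dict, hierarchy_path: List[str], source_url: str) -> Dict:
    """Return a copy of the term with normalized description, category and metadata."""
    # Collapse whitespace, then truncate overly long descriptions.
    description = " ".join(term.get("description", "").split())
    if len(description) > MAX_DESCRIPTION:
        description = description[: MAX_DESCRIPTION - 3].rstrip() + "..."
    return {
        **term,
        "description": description,
        # Assumed rule: the top level of the hierarchy becomes the category.
        "category": hierarchy_path[0] if hierarchy_path else "uncategorized",
        "source_url": source_url,
        "updated_at": datetime.now(timezone.utc).isoformat(),
        "searchable": bool(description or term.get("name")),
    }


enriched = enrich(
    {"id": "x:1", "name": "Oat drink", "description": "  Plant-based   drink\nmade from oats. "},
    ["Beverages", "Plant-based drinks"],
    "https://example.org/x/1",
)
print(enriched["description"], "|", enriched["category"])
```

Returning a new dictionary rather than mutating the input keeps the raw source record available for later re-processing.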
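Coverage Analysis
Each term count is the sum of the listed primary sources' terms. A source can serve several domains (Hestia appears in three), so the rows overlap and do not add up to the glossary total.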
| Domain | Primary Sources | Term Count | Coverage |
|---|---|---|---|
| Food Products | FoodEx2, Hestia, AGROvoc | 109,092 | Excellent |
| LCA Processes | Ecoinvent, Hestia, Sentier | 77,559 | Excellent |
| Agriculture | AGROvoc, LanguaL | 54,283 | Excellent |
| Packaging | GS1, UNECE, CPC | 5,143 | Good |
| Sustainability | Hestia, Eaternity | 36,084 | Excellent |
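The term counts in this table can be reproduced from the per-source counts in the overview table. The short script below does that arithmetic; the dictionary layout is just one convenient way to write it down.

```python
# Per-source term counts from the overview table.
SOURCE_TERMS = {
    "AGROvoc": 41_447,
    "Hestia": 36_044,
    "Ecoinvent": 33_784,
    "FoodEx2": 31_601,
    "LanguaL": 12_836,
    "Sentier": 7_731,
    "CPC": 4_583,
    "UNECE Rec 21": 406,
    "GS1 Packaging": 154,
    "Eaternity": 40,
}

# Primary sources per domain, as listed in the coverage table.
DOMAIN_SOURCES = {
    "Food Products": ["FoodEx2", "Hestia", "AGROvoc"],
    "LCA Processes": ["Ecoinvent", "Hestia", "Sentier"],
    "Agriculture": ["AGROvoc", "LanguaL"],
    "Packaging": ["GS1 Packaging", "UNECE Rec 21", "CPC"],
    "Sustainability": ["Hestia", "Eaternity"],
}

domain_totals = {
    domain: sum(SOURCE_TERMS[name] for name in names)
    for domain, names in DOMAIN_SOURCES.items()
}
grand_total = sum(SOURCE_TERMS.values())

for domain, total in domain_totals.items():
    print(f"{domain}: {total:,}")
print(f"All sources: {grand_total:,}")
```

Because sources are shared between domains, the domain totals sum to more than the 168,626 terms in the glossary.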
License Summary
| License Type | Sources | Commercial Use |
|---|---|---|
| Open Data | AGROvoc, FoodEx2, UNECE, CPC | Yes |
| Research Use | Hestia, Sentier | Academic only |
| Commercial | Ecoinvent | License required |
| Standard Terms | GS1, LanguaL | With attribution |
| Proprietary | Eaternity | Restricted |
Note: Always verify individual source licenses before commercial use.
Future Expansions
Potential additional sources under consideration:
- USDA FoodData Central - US food composition data (~15,000 terms)
- UK Food Standards - British food classification (~10,000 terms)
- ISO 14040/14044 - LCA methodology terms (~500 terms)
- GHG Protocol - Greenhouse gas accounting (~1,000 terms)
- WFLDB - World Food LCA Database (~5,000 terms)
Related Documentation
- Semantic Mapping - How sources are mapped together
- Data Formats - Export formats and integration
- FoodEx2 Reference - Detailed FoodEx2 documentation
- Hestia Reference - Detailed Hestia documentation
- Ecoinvent Reference - Detailed Ecoinvent documentation