Data Formats

The ESFC Glossary is available in more than 8 formats to support different use cases, from web applications and semantic web integration to database queries and type-safe programming.

Overview

All formats are generated from a single LinkML schema, which keeps the outputs consistent while each format is optimized for its specific use case.

Available Formats:

SQLite Database (133 MB)
JSON (189 MB)
LinkML YAML (157 MB)
JSON-LD (semantic web)
TypeScript Types
RDF/OWL Ontologies
SQL DDL Schemas
CSV/Excel

Primary Formats

SQLite Database

File: glossary.db Size: 133 MB Use Case: Queries, relationships, application integration

The SQLite database provides optimized storage and fast queries over the complete glossary.

Database Schema

-- Main terms table
CREATE TABLE terms (
  id TEXT PRIMARY KEY,
  name TEXT NOT NULL,
  description TEXT,
  source TEXT NOT NULL,
  category TEXT,
  properties JSON,
  external_mappings JSON,
  parent_terms JSON,
  metadata JSON,
  status TEXT DEFAULT 'active',
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Indexes for performance
CREATE INDEX idx_source ON terms(source);
CREATE INDEX idx_category ON terms(category);
CREATE INDEX idx_name ON terms(name);
CREATE INDEX idx_status ON terms(status);

-- Full-text search (external-content index over terms; populate it after
-- loading data with: INSERT INTO terms_fts(terms_fts) VALUES('rebuild');)
CREATE VIRTUAL TABLE terms_fts USING fts5(
  id, name, description, category,
  content=terms
);

-- Relationships table
CREATE TABLE relationships (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  source_term_id TEXT NOT NULL,
  target_term_id TEXT NOT NULL,
  relationship_type TEXT NOT NULL,
  confidence REAL,
  method TEXT,
  FOREIGN KEY (source_term_id) REFERENCES terms(id),
  FOREIGN KEY (target_term_id) REFERENCES terms(id)
);

CREATE INDEX idx_relationships_source ON relationships(source_term_id);
CREATE INDEX idx_relationships_target ON relationships(target_term_id);

Query Examples

Basic Queries:

-- Count terms by source
SELECT source, COUNT(*) as term_count
FROM terms
GROUP BY source
ORDER BY term_count DESC;

-- Find all wheat-related terms
SELECT id, name, source, category
FROM terms
WHERE name LIKE '%wheat%'
ORDER BY source, name;

-- Full-text search
SELECT t.id, t.name, t.source, t.category
FROM terms_fts fts
JOIN terms t ON fts.id = t.id
WHERE terms_fts MATCH 'carbon emission'
LIMIT 20;

Advanced Queries:

-- Find the 10 highest-confidence relationships from FoodEx2 terms to Hestia terms
SELECT
  t1.id as source_id,
  t1.name as source_name,
  r.relationship_type,
  t2.name as target_name,
  t2.source as target_source,
  r.confidence
FROM terms t1
JOIN relationships r ON t1.id = r.source_term_id
JOIN terms t2 ON t2.id = r.target_term_id
WHERE t1.source = 'foodex2'
  AND t2.source = 'hestia'
ORDER BY r.confidence DESC
LIMIT 10;

-- Hierarchical query: walk from a root term down to its descendants.
-- Only the first entry of parent_terms is followed.
WITH RECURSIVE hierarchy AS (
  SELECT id, name, parent_terms, 1 as level
  FROM terms
  WHERE id = 'foodex2-A0101'

  UNION ALL

  SELECT t.id, t.name, t.parent_terms, h.level + 1
  FROM terms t
  JOIN hierarchy h
    ON json_extract(t.parent_terms, '$[0]') = h.id
)
SELECT * FROM hierarchy;
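
The hierarchical query matches on '$[0]', so a term whose root is listed as a second or later parent is missed. A minimal Python sketch of a variant that follows every entry of parent_terms with json_each is shown here. It runs against an in-memory table with hypothetical sample rows (only foodex2-A010101 and its parent ID come from this page), assumes the hierarchy is acyclic, and assumes the SQLite build includes the JSON functions.

```python
import sqlite3

# Hypothetical sample rows; only foodex2-A010101 and its parent ID appear on this page.
SAMPLE_TERMS = [
    ("foodex2-A0101", "Wheat group (sample name)", "[]"),
    ("foodex2-A010101", "Common wheat", '["foodex2-A0101"]'),
    ("demo-x", "Term with two parents (sample)", '["demo-root", "foodex2-A0101"]'),
    ("demo-y", "Unrelated term (sample)", '["demo-root"]'),
]

# json_each expands parent_terms into one row per parent, so every parent is checked.
DESCENDANTS_SQL = """
WITH RECURSIVE hierarchy(id, name, level) AS (
  SELECT id, name, 1 FROM terms WHERE id = ?
  UNION ALL
  SELECT t.id, t.name, h.level + 1
  FROM hierarchy h, terms t, json_each(t.parent_terms) p
  WHERE p.value = h.id
)
SELECT id, name, level FROM hierarchy ORDER BY level, id
"""


def descendants(conn, root_id):
    """Return (id, name, level) for root_id and every term below it."""
    return conn.execute(DESCENDANTS_SQL, (root_id,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE terms (id TEXT PRIMARY KEY, name TEXT NOT NULL, parent_terms JSON)"
)
conn.executemany("INSERT INTO terms VALUES (?, ?, ?)", SAMPLE_TERMS)

rows = descendants(conn, "foodex2-A0101")
for term_id, name, level in rows:
    print(level, term_id, name)
conn.close()
```

Against glossary.db, the same function can be called with a connection opened by sqlite3.connect('glossary.db').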

Integration Examples

Python:

import sqlite3

# Connect to the database
conn = sqlite3.connect('glossary.db')
cursor = conn.cursor()

# Query the terms
cursor.execute('''
  SELECT id, name, source, category
  FROM terms
  WHERE source = ?
  LIMIT 10
''', ('hestia',))

for row in cursor.fetchall():
    print(f"{row[0]}: {row[1]} ({row[2]})")

conn.close()

Node.js:

import Database from 'better-sqlite3'

const db = new Database('glossary.db')

// Prepared statement
const stmt = db.prepare(`
  SELECT id, name, source, category
  FROM terms
  WHERE source = ? AND category LIKE ?
`)

const results = stmt.all('hestia', '%Emission%')
console.log(`Found ${results.length} emission terms`)

JSON Format

File: glossary.json Size: 189 MB Use Case: Web applications, JavaScript/TypeScript integration

Complete glossary data in JSON format, with full term details and metadata.

Structure

{
  "metadata": {
    "version": "0.1.2",
    "build": 6,
    "lastUpdated": "2025-12-08T02:54:36.996Z",
    "totalTerms": 168626,
    "sources": {
      "foodex2": 31601,
      "hestia": 36044,
      "ecoinvent": 33784,
      "agrovoc": 41447,
      "langual": 12836,
      "cpc": 4583,
      "sentier": 7731,
      "unece": 406,
      "gs1": 154,
      "eaternity": 40
    }
  },
  "terms": [
    {
      "@type": "Term",
      "id": "foodex2-A010101",
      "name": "Common wheat",
      "description": "Triticum aestivum, bread wheat",
      "source": "foodex2",
      "category": "Grains",
      "properties": {
        "hierarchyCode": "A010101",
        "scientificName": "Triticum aestivum",
        "level": 4
      },
      "external_mappings": [
        {
          "externalId": "hestia-crop-wheat",
          "externalSource": "hestia",
          "mappingType": "related"
        }
      ],
      "parent_terms": ["foodex2-A0101"],
      "metadata": {
        "searchable": true,
        "verified": true
      },
      "status": "active"
    }
  ]
}
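
As a quick sanity check on this structure, the per-source counts in metadata.sources should add up to metadata.totalTerms, and every term should name a source listed there. The sketch below is one way to check that in Python; the function name check_metadata is illustrative and not part of the glossary tooling, and the sample reuses the counts shown above with an abbreviated terms list.

```python
def check_metadata(glossary: dict) -> list[str]:
    """Return a list of problems found; an empty list means the metadata is consistent."""
    problems = []
    meta = glossary["metadata"]

    # The per-source counts should sum to the declared total.
    per_source_sum = sum(meta["sources"].values())
    if per_source_sum != meta["totalTerms"]:
        problems.append(
            f"sources sum to {per_source_sum}, but totalTerms is {meta['totalTerms']}"
        )

    # Every term should reference a source listed in the metadata.
    used_sources = {term["source"] for term in glossary.get("terms", [])}
    for source in sorted(used_sources - set(meta["sources"])):
        problems.append(f"term source not listed in metadata: {source}")

    return problems


# Counts copied from the structure above; the terms list is abbreviated.
sample = {
    "metadata": {
        "totalTerms": 168626,
        "sources": {
            "foodex2": 31601, "hestia": 36044, "ecoinvent": 33784,
            "agrovoc": 41447, "langual": 12836, "cpc": 4583,
            "sentier": 7731, "unece": 406, "gs1": 154, "eaternity": 40,
        },
    },
    "terms": [{"id": "foodex2-A010101", "source": "foodex2"}],
}

problems = check_metadata(sample)
print(problems or "metadata is consistent")
```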

Usage Examples

JavaScript/TypeScript:

// Load the glossary
const glossary = await fetch('/glossary.json')
  .then(r => r.json())

// Filter by source
const hestiaTerms = glossary.terms.filter(t =>
  t.source === 'hestia'
)

// Search by name
const searchResults = glossary.terms.filter(t =>
  t.name.toLowerCase().includes('wheat')
)

// Group by category
const byCategory = glossary.terms.reduce((acc, term) => {
  const cat = term.category || 'Uncategorized'
  if (!acc[cat]) acc[cat] = []
  acc[cat].push(term)
  return acc
}, {})

Python:

import json

with open('glossary.json', encoding='utf-8') as f:
    glossary = json.load(f)

# Access the metadata
print(f"Version: {glossary['metadata']['version']}")
print(f"Total terms: {glossary['metadata']['totalTerms']}")

# Filter the terms
hestia_terms = [
    t for t in glossary['terms']
    if t['source'] == 'hestia'
]

# Search
wheat_terms = [
    t for t in glossary['terms']
    if 'wheat' in t['name'].lower()
]

LinkML YAML

File: glossary.yaml Size: 157 MB Use Case: Semantic web, research, data validation

Native LinkML format with complete semantic annotations and relationships.

Structure

terms:
  - '@type': Term
    id: foodex2-A010101
    name: Common wheat
    description: Triticum aestivum, bread wheat
    source: foodex2
    category: Grains
    properties:
      hierarchyCode: A010101
      scientificName: Triticum aestivum
      level: 4
    external_mappings:
      - externalId: hestia-crop-wheat
        externalSource: hestia
        mappingType: related
    parent_terms:
      - foodex2-A0101
    metadata:
      searchable: true
      verified: true
    status: active

Usage with LinkML

Python with LinkML Runtime:

import json

from linkml.validator import validate
from linkml_runtime.dumpers import json_dumper
from linkml_runtime.loaders import yaml_loader
from glossary_model import Glossary, Term

# Load the glossary
glossary = yaml_loader.load('glossary.yaml', target_class=Glossary)

# Access the terms
print(f"Loaded {len(glossary.terms)} terms")

# Filter by source
hestia_terms = [t for t in glossary.terms if t.source == 'hestia']

# Validate the first 10 terms against the schema
# (validation needs the full `linkml` package, not only linkml-runtime)
for term in glossary.terms[:10]:
    instance = json.loads(json_dumper.dumps(term, inject_type=False))
    report = validate(instance, 'schema/glossary.linkml.yaml', 'Term')
    for result in report.results:
        print(f"{term.id}: {result.message}")

JSON-LD (Semantic Web)

File: glossary.jsonld Size: ~200 MB Use Case: Semantic web, RDF integration, linked data

JSON-LD format with a semantic web context for RDF/SPARQL integration.

Structure

{
  "@context": {
    "@vocab": "http://esfc-glossary.org/vocab/",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "dc": "http://purl.org/dc/terms/",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "Term": "skos:Concept",
    "name": "skos:prefLabel",
    "description": "skos:definition",
    "source": "dc:source",
    "category": "skos:inScheme",
    "parent_terms": "skos:broader",
    "external_mappings": {
      "@id": "skos:relatedMatch",
      "@container": "@set"
    }
  },
  "@graph": [
    {
      "@type": "Term",
      "@id": "foodex2:A010101",
      "name": "Common wheat",
      "description": "Triticum aestivum, bread wheat",
      "source": "foodex2",
      "category": "Grains",
      "parent_terms": ["foodex2:A0101"],
      "external_mappings": [
        {
          "@id": "hestia:crop-wheat",
          "mappingType": "related"
        }
      ]
    }
  ]
}
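
Comparing this node with the plain JSON term shows the main differences: identifiers use a colon after the source prefix instead of a hyphen, and each external mapping is reduced to an @id plus its mappingType. The Python sketch below reproduces that mapping for a single term. The helper names to_curie and to_jsonld_node are illustrative, and the rule "replace the first hyphen with a colon" is an assumption inferred from the two examples on this page, not a documented part of the pipeline.

```python
def to_curie(term_id: str) -> str:
    """Turn 'foodex2-A010101' into 'foodex2:A010101' (splits on the first hyphen only)."""
    prefix, _, local = term_id.partition("-")
    return f"{prefix}:{local}"


def to_jsonld_node(term: dict) -> dict:
    """Build a JSON-LD node shaped like the @graph entry shown above."""
    node = {"@type": "Term", "@id": to_curie(term["id"])}
    # Scalar fields are copied unchanged when present.
    for key in ("name", "description", "source", "category"):
        if term.get(key) is not None:
            node[key] = term[key]
    if term.get("parent_terms"):
        node["parent_terms"] = [to_curie(p) for p in term["parent_terms"]]
    if term.get("external_mappings"):
        node["external_mappings"] = [
            {"@id": to_curie(m["externalId"]), "mappingType": m["mappingType"]}
            for m in term["external_mappings"]
        ]
    return node


# The example term from the JSON format section.
term = {
    "id": "foodex2-A010101",
    "name": "Common wheat",
    "description": "Triticum aestivum, bread wheat",
    "source": "foodex2",
    "category": "Grains",
    "external_mappings": [
        {"externalId": "hestia-crop-wheat", "externalSource": "hestia", "mappingType": "related"}
    ],
    "parent_terms": ["foodex2-A0101"],
}

node = to_jsonld_node(term)
print(node["@id"], node["external_mappings"][0]["@id"])
```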

SPARQL Queries

PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX dc: <http://purl.org/dc/terms/>

# Find all wheat-related terms
SELECT ?term ?label ?source WHERE {
  ?term skos:prefLabel ?label ;
        dc:source ?source .
  FILTER(CONTAINS(LCASE(?label), "wheat"))
}
LIMIT 10

# Find related terms (run as a separate query with the same PREFIX declarations)
SELECT ?source ?target ?type WHERE {
  ?source skos:relatedMatch ?target .
  ?source dc:source "foodex2" .
  ?target dc:source "hestia" .
}

Generated Formats

TypeScript Types

File: glossary.types.ts Size: ~500 KB Use Case: Type-safe TypeScript/JavaScript integration

TypeScript type definitions generated from the glossary schema.

Generated Types

/**
 * Main glossary interface
 */
export interface Glossary {
  metadata: GlossaryMetadata
  terms: Term[]
}

/**
 * Glossary metadata
 */
export interface GlossaryMetadata {
  version: string
  build: number
  lastUpdated: string
  totalTerms: number
  sources: Record<string, number>
}

/**
 * A single glossary term
 */
export interface Term {
  '@type': 'Term'
  id: string
  name: string
  description?: string
  source: GlossarySource
  category?: string
  properties?: Record<string, any>
  external_mappings?: ExternalMapping[]
  parent_terms?: string[]
  metadata?: Record<string, any>
  status: TermStatus
}

/**
 * Glossary sources
 */
export type GlossarySource =
  | 'foodex2'
  | 'hestia'
  | 'ecoinvent'
  | 'agrovoc'
  | 'langual'
  | 'cpc'
  | 'sentier'
  | 'unece'
  | 'gs1'
  | 'eaternity'

/**
 * External mapping to other vocabularies
 */
export interface ExternalMapping {
  externalId: string
  externalSource: string
  mappingType: 'exact' | 'related' | 'broader' | 'narrower'
  confidence?: number
}

/**
 * Term status
 */
export type TermStatus = 'active' | 'deprecated' | 'obsolete'

Usage

import { Glossary, Term, GlossarySource } from './glossary.types'

async function loadGlossary(): Promise<Glossary> {
  const response = await fetch('/glossary.json')
  return response.json()
}

function filterBySource(
  terms: Term[],
  source: GlossarySource
): Term[] {
  return terms.filter(t => t.source === source)
}

// Type-safe usage
const glossary = await loadGlossary()
const hestiaTerms = filterBySource(glossary.terms, 'hestia')

// The compiler rejects sources outside GlossarySource; the JSON itself is not checked at runtime
console.log(`Found ${hestiaTerms.length} Hestia terms`)
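
The TypeScript types are erased at compile time, so consumers in other languages (or code that receives untrusted JSON) may want a runtime check of the same constraints. The following Python sketch mirrors the required fields and the allowed values of GlossarySource, TermStatus and mappingType from the types above. The function name validate_term and the error message wording are illustrative only.

```python
ALLOWED_SOURCES = {
    "foodex2", "hestia", "ecoinvent", "agrovoc", "langual",
    "cpc", "sentier", "unece", "gs1", "eaternity",
}
ALLOWED_STATUSES = {"active", "deprecated", "obsolete"}
ALLOWED_MAPPING_TYPES = {"exact", "related", "broader", "narrower"}


def validate_term(term: dict) -> list[str]:
    """Return a list of violations of the Term interface; empty means valid."""
    errors = []
    if term.get("@type") != "Term":
        errors.append("@type must be 'Term'")

    # Required string fields of the Term interface.
    for field in ("id", "name", "source", "status"):
        if not isinstance(term.get(field), str):
            errors.append(f"missing or non-string field: {field}")

    source = term.get("source")
    if isinstance(source, str) and source not in ALLOWED_SOURCES:
        errors.append(f"unknown source: {source}")

    status = term.get("status")
    if isinstance(status, str) and status not in ALLOWED_STATUSES:
        errors.append(f"unknown status: {status}")

    # Optional list of external mappings, each with a constrained mappingType.
    for mapping in term.get("external_mappings") or []:
        if mapping.get("mappingType") not in ALLOWED_MAPPING_TYPES:
            errors.append(f"unknown mappingType: {mapping.get('mappingType')}")

    return errors


good = {"@type": "Term", "id": "foodex2-A010101", "name": "Common wheat",
        "source": "foodex2", "status": "active"}
bad = dict(good, source="usda")

print(validate_term(good))
print(validate_term(bad))
```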

SQL DDL Schema

File: glossary.sql Size: ~50 KB Use Case: Database schema creation, PostgreSQL/MySQL setup

SQL schema definition for creating the database tables. The excerpt below uses PostgreSQL syntax (JSONB columns, GIN indexes, a materialized view); these parts need adapting for MySQL.

Generated Schema

-- Terms table
CREATE TABLE terms (
  id VARCHAR(255) PRIMARY KEY,
  name TEXT NOT NULL,
  description TEXT,
  source VARCHAR(50) NOT NULL,
  category VARCHAR(255),
  properties JSONB,
  external_mappings JSONB,
  parent_terms JSONB,
  metadata JSONB,
  status VARCHAR(50) DEFAULT 'active',
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Indexes
CREATE INDEX idx_terms_source ON terms(source);
CREATE INDEX idx_terms_category ON terms(category);
CREATE INDEX idx_terms_name ON terms USING gin(to_tsvector('english', name));
CREATE INDEX idx_terms_properties ON terms USING gin(properties);

-- Full-text search (PostgreSQL)
CREATE INDEX idx_terms_fts ON terms
USING gin(to_tsvector('english', coalesce(name, '') || ' ' || coalesce(description, '')));

-- Materialized view for source statistics
CREATE MATERIALIZED VIEW source_statistics AS
SELECT
  source,
  COUNT(*) as term_count,
  COUNT(DISTINCT category) as category_count,
  MIN(created_at) as first_added,
  MAX(updated_at) as last_updated
FROM terms
GROUP BY source;

RDF/OWL Ontology

File: glossary.owl Size: ~250 MB Use Case: Semantic web applications, ontological reasoning

OWL ontology for reasoning and inference on the semantic web.

Ontology Structure

<?xml version="1.0"?>
<rdf:RDF xmlns="http://esfc-glossary.org/ontology#"
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     xmlns:owl="http://www.w3.org/2002/07/owl#"
     xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
     xmlns:skos="http://www.w3.org/2004/02/skos/core#">

  <owl:Ontology rdf:about="http://esfc-glossary.org/ontology">
    <rdfs:label>ESFC Glossary Ontology</rdfs:label>
    <rdfs:comment>
      Unified food and LCA glossary ontology
    </rdfs:comment>
  </owl:Ontology>

  <!-- Classes -->
  <owl:Class rdf:about="http://esfc-glossary.org/ontology#Term">
    <rdfs:label>Term</rdfs:label>
    <rdfs:subClassOf rdf:resource="http://www.w3.org/2004/02/skos/core#Concept"/>
  </owl:Class>

  <!-- Properties -->
  <owl:DatatypeProperty rdf:about="http://esfc-glossary.org/ontology#source">
    <rdfs:domain rdf:resource="http://esfc-glossary.org/ontology#Term"/>
    <rdfs:range rdf:resource="http://www.w3.org/2001/XMLSchema#string"/>
  </owl:DatatypeProperty>

  <!-- Individuals (Terms) -->
  <owl:NamedIndividual rdf:about="http://esfc-glossary.org/terms/foodex2-A010101">
    <rdf:type rdf:resource="http://esfc-glossary.org/ontology#Term"/>
    <skos:prefLabel>Common wheat</skos:prefLabel>
    <skos:definition>Triticum aestivum, bread wheat</skos:definition>
  </owl:NamedIndividual>

</rdf:RDF>

Export Formats

CSV Export

Generate CSV files for specific subsets of terms.

Export Structure

id,name,description,source,category,properties,status
foodex2-A010101,"Common wheat","Triticum aestivum",foodex2,Grains,"{""hierarchyCode"":""A010101""}",active
hestia-crop-wheat,"Wheat crop","Agricultural wheat production",hestia,"Inputs & Products","{}",active
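
Note that the properties column holds a JSON object serialized as a quoted CSV field, so it needs a second decoding step after the CSV parser has unescaped the doubled quotes. A minimal Python sketch, using the two rows above as inline sample data (the function name read_terms is illustrative):

```python
import csv
import io
import json

# The header and two rows from the export structure above.
SAMPLE_CSV = '''id,name,description,source,category,properties,status
foodex2-A010101,"Common wheat","Triticum aestivum",foodex2,Grains,"{""hierarchyCode"":""A010101""}",active
hestia-crop-wheat,"Wheat crop","Agricultural wheat production",hestia,"Inputs & Products","{}",active
'''


def read_terms(fileobj):
    """Parse exported rows and decode the JSON-encoded properties column."""
    rows = []
    for row in csv.DictReader(fileobj):
        # An empty cell is treated as an empty object.
        row["properties"] = json.loads(row["properties"] or "{}")
        rows.append(row)
    return rows


terms = read_terms(io.StringIO(SAMPLE_CSV))
for term in terms:
    print(term["id"], term["category"], term["properties"])
```

For a real export, pass a file opened with open('glossary.csv', newline='', encoding='utf-8') instead of the StringIO object.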

Export Scripts

# Export all terms to CSV
npm run export:csv

# Export a specific source
npm run export:csv -- --source hestia

# Export with filters
npm run export:csv -- --source foodex2 --category Grains

Excel Export

Multi-sheet Excel workbook with organized data.

Workbook Structure

Sheet 1: Overview

Metadata and statistics
Source summary
Breakdown by category

Sheet 2: All Terms

Complete list of terms
Filterable columns
Color coding by source

Sheet 3: FoodEx2

FoodEx2 terms with hierarchy
Facet information

Sheet 4: Hestia

Hestia LCA terms
Organized by category

Sheet 5: Relationships

Mappings between different sources
Confidence scores
Mapping methods

Generation

# Generate the Excel workbook
npm run export:excel

# Custom export
node scripts/export-excel.js \
  --output glossary.xlsx \
  --include-relationships

Download Locations

All formats are available for download:

https://esfc-glossary-ec2bc9.gitlab.io/downloads/
├── glossary.db          # SQLite database (133 MB)
├── glossary.json        # JSON format (189 MB)
├── glossary.yaml        # LinkML YAML (157 MB)
├── glossary.jsonld      # JSON-LD (200 MB)
├── glossary.types.ts    # TypeScript types (500 KB)
├── glossary.owl         # OWL ontology (250 MB)
├── glossary.sql         # SQL DDL (50 KB)
├── glossary.csv         # CSV export (size varies)
└── glossary.xlsx        # Excel workbook (size varies)

Format Selection Guide

Choose the right format for your use case:

Use Case	Recommended Format	Why
Web application	JSON or SQLite	Fast loading, easy integration
Type-safe development	TypeScript Types + JSON	Type safety and autocompletion
Database application	SQLite or SQL DDL	Optimized queries
Semantic web	JSON-LD or RDF/OWL	RDF/SPARQL compatibility
Research	LinkML YAML	Complete semantic annotations
Data analysis	CSV or Excel	Spreadsheet tools
Python integration	SQLite or LinkML YAML	Native support
Node.js integration	JSON or SQLite	Easy parsing

Generation Pipeline

All formats are generated from the LinkML schema:

LinkML Schema (glossary.linkml.yaml)
    ↓
Data Parsing and Validation
    ↓
LinkML YAML (native format)
    ↓
Multi-Format Generation
    ├── JSON (linkml-convert)
    ├── JSON-LD (linkml-convert)
    ├── TypeScript (linkml-generate-typescript)
    ├── OWL (linkml-convert)
    ├── SQL DDL (linkml-generate-sql)
    └── SQLite (custom script)
    ↓
Optimization and Compression
    ↓
CDN Deployment

Related Documentation

Data Sources - Overview of all 10 sources
Semantic Mapping - Relationships between different sources
Glossary Overview - Main documentation
FoodEx2 Reference - Food classification
Hestia Reference - LCA data
