Data Reference#
This section documents the data sources and data flow in PyPSA-GB.
Data Architecture#
PyPSA-GB uses a layered data architecture:
flowchart TB
subgraph External["External Sources"]
FES["NESO FES API"]
DUKES["DUKES Statistics"]
REPD["REPD Database"]
ETYS["ETYS Network Data"]
ERA5["ERA5 Weather"]
ESPENI["ESPENI Demand"]
ELEXON["ELEXON BMRS"]
NESO_VAL["NESO Constraint Data"]
end
subgraph Raw["data/ (Raw)"]
FES_RAW["FES Excel/CSV"]
DUKES_RAW["DUKES Excel"]
REPD_RAW["REPD CSV"]
ETYS_RAW["Network files"]
end
subgraph Processed["resources/ (Processed)"]
FES_PROC["FES processed"]
GEN_PROC["Generators"]
RENEW_PROC["Renewables"]
NETWORK["Network files"]
MARKET["Market outputs"]
end
FES --> FES_RAW --> FES_PROC
DUKES --> DUKES_RAW --> GEN_PROC
REPD --> REPD_RAW --> RENEW_PROC
ETYS --> ETYS_RAW --> NETWORK
ELEXON --> MARKET
NESO_VAL --> MARKET
Directory Structure#
data/ # Raw input (versioned, rarely changes)
├── FES/ # FES capacity projections
├── generators/ # DUKES thermal data
├── renewables/ # REPD site data
├── network/ # Network topology
├── demand/ # Demand profiles
├── storage/ # Storage parameters
└── interconnectors/ # Cross-border connections
resources/ # Generated outputs (recreated by workflow)
├── FES/ # Processed FES data
├── generators/ # Processed generator files
├── renewable/ # Site data + generation profiles
├── network/ # Built PyPSA networks
├── marginal_costs/ # Fuel price time series
└── analysis/ # Output reports
Market-enabled scenarios also use data/market/ for optional ELEXON caches and
data/validation/ for NESO/ELEXON validation caches. Per-scenario wholesale,
balancing, and validation outputs are written to resources/market/.
Key Data Files#
File |
Description |
Update Frequency |
|---|---|---|
|
UK power station register |
Annual (March) |
|
Renewable sites |
Quarterly |
|
Transmission entry capacity |
Monthly |
|
NESO API endpoints |
Per FES release |
|
Historical demand |
Annual |
|
Optional ELEXON BMRS caches |
Generated/cached as needed |
|
Optional NESO constraint validation data |
Generated/cached as needed |
Scenario Data Routing#
Data sources depend on the modelled year:
Data Type |
Historical (≤2024) |
Future (>2024) |
|---|---|---|
Thermal |
DUKES statistics |
FES projections |
Renewables |
REPD site data |
FES + REPD distribution |
Storage |
REPD site data |
FES GSP-level |
Demand |
ESPENI profiles |
FES totals + profile shape |
Network |
ETYS current |
ETYS + planned upgrades |