Architecture#
Overview of PyPSA-GB’s software architecture and design decisions.
High-Level Architecture#
flowchart TB
subgraph Config["Configuration Layer"]
YAML["YAML Config Files"]
SCENARIOS["Scenario Definitions"]
end
subgraph Workflow["Workflow Layer"]
SNAKE["Snakemake"]
RULES["Rule Definitions"]
end
subgraph Core["Core Layer"]
SCRIPTS["Python Scripts"]
PYPSA["PyPSA"]
end
subgraph Data["Data Layer"]
RAW["Raw Data"]
RESOURCES["Generated Resources"]
end
YAML --> SNAKE
SCENARIOS --> SNAKE
SNAKE --> RULES
RULES --> SCRIPTS
SCRIPTS --> PYPSA
RAW --> SCRIPTS
SCRIPTS --> RESOURCES
Design Principles#
1. Declarative Configuration#
Users declare what they want, not how to build it:
# User specifies desired outcome
HT35:
  modelled_year: 2035
  network_model: "ETYS"
  FES_scenario: "Holistic Transition"
Snakemake determines the execution path automatically.
2. Reproducibility#
All inputs are versioned or documented
Configuration is explicit
Random seeds are fixed where needed
Logs capture execution details
3. Modularity#
Each script does one thing well:
| Script | Single Responsibility |
|---|---|
|  | Assemble ETYS network from preprocessed data |
|  | Parse raw ETYS Excel to CSVs |
|  | Apply planned network reinforcements |
|  | Map ETYS publication years to files |
|  | Create Reduced/Zonal networks |
|  | Add thermal capacity |
|  | Run optimization |
|  | Run copperplate wholesale dispatch |
|  | Run anchored balancing redispatch |
4. Data Source Abstraction#
The same workflow handles historical and future scenarios:
flowchart LR
SCENARIO["Scenario Config"] --> DETECT["Scenario Detection"]
DETECT -->|"≤2024"| HIST["Historical Sources"]
DETECT -->|">2024"| FUTURE["FES Sources"]
HIST --> INTEGRATE["Integration"]
FUTURE --> INTEGRATE
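The routing in the diagram can be sketched as a single function. This is a minimal illustration only and not the contents of `scenario_detection.py`: the function name `detect_data_source`, the constant `LAST_HISTORICAL_YEAR`, and the returned labels are all assumptions made for the example.

```python
# Hypothetical sketch: names below are illustrative, not PyPSA-GB's actual API.
LAST_HISTORICAL_YEAR = 2024  # cut-off year taken from the diagram


def detect_data_source(modelled_year: int) -> str:
    """Return which family of data sources a modelled year should use."""
    if modelled_year <= LAST_HISTORICAL_YEAR:
        return "historical"
    return "fes"


for year in (2020, 2024, 2035):
    print(year, detect_data_source(year))
```

Keeping the cut-off in one place means every downstream step can ask the same question and get the same answer, which is what lets one workflow serve both kinds of scenario.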
Component Architecture#
Configuration System#
config/
├── config.yaml # What to run
├── scenarios.yaml # Scenario definitions
├── defaults.yaml # Default values
└── config_loader.py # Python interface
Loading flow:
Load defaults.yaml
Override with scenarios.yaml values
Override with config.yaml values
Override with command-line arguments
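The layering can be pictured as a sequence of dictionary merges in which later sources win. The sketch below is an assumption about how such a merge might look, not the code in `config_loader.py`; `deep_merge`, `resolve_config`, and every sample key and value are hypothetical, and plain dicts stand in for the parsed YAML files.

```python
from copy import deepcopy


def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict in which values from `override` win over `base`."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge nested sections key by key instead of replacing them whole.
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def resolve_config(*layers: dict) -> dict:
    """Merge layers in order; later layers take precedence."""
    resolved: dict = {}
    for layer in layers:
        resolved = deep_merge(resolved, layer)
    return resolved


# Stand-ins for each source, lowest priority first (all values are invented).
defaults = {"solver": {"name": "example_solver", "threads": 1}, "network_model": "Reduced"}
scenario = {"modelled_year": 2035, "network_model": "ETYS"}
run_config = {"solver": {"threads": 4}}
cli_args = {"modelled_year": 2030}

config = resolve_config(defaults, scenario, run_config, cli_args)
print(config)
```

A recursive merge is used so that overriding one nested key (here `solver.threads`) does not discard its siblings from a lower-priority layer.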
Workflow System#
Snakefile # Main entry point
├── rules/
│ ├── network_build.smk
│ ├── generators.smk
│ ├── renewables.smk
│ ├── storage.smk
│ ├── solve.smk
│ └── analysis.smk
Rules define:
Input/output file relationships
Script to execute
Parameters from config
Log file locations
Script Organization#
scripts/
├── Network Build Package
│ ├── network_build/
│ │ ├── ETYS_network.py # ETYS network assembly (stage 2)
│ │ ├── process_ETYS_data.py # Raw Excel → CSV preprocessing (stage 1)
│ │ ├── ETYS_upgrades.py # Network upgrade application
│ │ ├── etys_file_registry.py # File/sheet name registry and constants
│ │ └── build_network.py # Reduced/Zonal network builders
├── Core Modules
│ ├── solve_network.py
│ └── scenario_detection.py
├── Integration Modules
│ ├── integrate_thermal_generators.py
│ ├── integrate_renewable_generators.py
│ └── add_storage.py
├── Utility Modules
│ ├── spatial_utils.py
│ ├── logging_config.py
│ └── carrier_definitions.py
└── Analysis Modules
  ├── analyze_results.py
  └── plotting.py
Market Workflow Modules#
Market dispatch is an optional branch from the finalized network. The main
entry points are rules/market.smk and scripts/market/:
| Module | Role |
|---|---|
|  | Copperplate Stage 1 wholesale solve |
|  | Constrained Stage 2 balancing redispatch |
|  | Bid/offer pricing, ELEXON loading, redispatch metrics |
|  | Dashboard and summary outputs |
|  | Historical ELEXON BM validation |
|  | NESO constraint validation |
Data Flow#
Network Building Pipeline#
flowchart LR
subgraph Build
BASE["Base Network\n(buses, lines)"]
DEMAND["+ Demand\n(loads)"]
RENEW["+ Renewables\n(generators)"]
THERMAL["+ Thermal\n(generators)"]
STORAGE["+ Storage\n(storage_units)"]
HYDRO["+ Hydrogen\n(electrolysis, H2)"]
INTER["+ Interconnectors\n(links)"]
end
BASE --> DEMAND --> RENEW --> THERMAL --> STORAGE --> HYDRO --> INTER
Each step:
Loads the previous network state
Adds new components
Saves updated network
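The load, add, save pattern can be sketched without PyPSA. In this illustration a JSON file stands in for the network file and a plain dict for the network object; `run_stage`, the `demo_` file names, and the `load_BEAU41` component name are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def run_stage(input_path: Path, output_path: Path, component: str, names: list[str]) -> None:
    """Load the previous state, add one group of components, save the result."""
    network = json.loads(input_path.read_text())     # 1. load previous state
    network.setdefault(component, []).extend(names)  # 2. add new components
    output_path.write_text(json.dumps(network))      # 3. save updated state


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp) / "demo_network.json"
    with_demand = Path(tmp) / "demo_network_demand.json"
    with_renewables = Path(tmp) / "demo_network_demand_renewables.json"

    base.write_text(json.dumps({"buses": ["BEAU41", "DOUN41"]}))
    run_stage(base, with_demand, "loads", ["load_BEAU41"])
    run_stage(with_demand, with_renewables, "generators", ["wind_offshore_BEAU41_0"])

    final_state = json.loads(with_renewables.read_text())

print(final_state)
```

Because each stage reads one file and writes another, a workflow manager can rerun only the stages whose inputs have changed.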
File Naming Convention#
{scenario}_network.nc # Base
{scenario}_network_demand.pkl # + demand
{scenario}_network_demand_renewables.pkl # + renewables
{scenario}_..._thermal_generators.pkl # + thermal
{scenario}_..._storage.pkl # + storage
{scenario}_..._hydrogen.pkl # + hydrogen
{scenario}_..._interconnectors.nc # + interconnectors
{scenario}_solved.nc # Optimized
Market-enabled scenarios branch from the finalized network rather than from
{scenario}_solved.nc:
resources/network/{scenario}.nc # Finalized physical network
resources/market/{scenario}_wholesale.nc # Copperplate wholesale solve
resources/market/{scenario}_balancing.nc # Constrained BM solve
resources/analysis/{scenario}_market_dashboard.html
PyPSA Integration#
Network Structure#
PyPSA-GB uses standard PyPSA components:
| Component | Usage |
|---|---|
|  | Substations/nodes |
|  | Transmission circuits |
|  | Voltage transformation |
|  | Power plants |
|  | Batteries, pumped hydro |
|  | Demand |
|  | HVDC, interconnectors |
Component Naming#
# Generators: {carrier}_{bus}_{index}
"wind_offshore_BEAU41_0"
"CCGT_PADI41_1"
# Storage: {carrier}_{bus}_{index}
"battery_LOND41_0"
# Lines: {bus0}_{bus1}_{circuit}
"BEAU41_DOUN41_1"
Error Handling Strategy#
Levels of Handling#
Validation: Catch errors before processing
validate_scenario_complete(scenario)
Graceful Degradation: Handle missing optional data
try:
    extra_data = load_optional_data()
except FileNotFoundError:
    logger.warning("Optional data not found, using defaults")
    extra_data = defaults
Fail Fast: Stop on critical errors
if network.buses.empty:
    raise ValueError("Network has no buses!")
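The three levels can be combined in one runnable sketch. The body of `validate_scenario_complete` here is a guess at what such a check might do, not the project's implementation; `REQUIRED_KEYS`, `load_optional`, `require_buses`, and `missing_file` are hypothetical, and a plain list stands in for the bus table.

```python
import logging

logger = logging.getLogger("error_handling_sketch")

REQUIRED_KEYS = ("modelled_year", "network_model")  # assumed for illustration


def validate_scenario_complete(scenario: dict) -> None:
    """Validation: reject incomplete scenarios before any processing starts."""
    missing = [key for key in REQUIRED_KEYS if key not in scenario]
    if missing:
        raise ValueError(f"Scenario is missing required keys: {missing}")


def load_optional(loader, default):
    """Graceful degradation: fall back to a default when optional data is absent."""
    try:
        return loader()
    except FileNotFoundError:
        logger.warning("Optional data not found, using defaults")
        return default


def require_buses(buses: list) -> None:
    """Fail fast: an empty network cannot be solved, so stop immediately."""
    if not buses:
        raise ValueError("Network has no buses!")


def missing_file():
    """Simulate an optional input that does not exist on disk."""
    raise FileNotFoundError("no such file")


validate_scenario_complete({"modelled_year": 2035, "network_model": "ETYS"})
extra = load_optional(missing_file, default={})
require_buses(["BEAU41"])
```

Only the optional path swallows its exception; the other two levels raise so that the failing rule stops with a clear message.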
Logging Strategy#
# DEBUG: Detailed diagnostic info
logger.debug(f"Processing bus {bus_name} with {n_gens} generators")
# INFO: Progress milestones
logger.info(f"Added {n_generators} generators to network")
# WARNING: Recoverable issues
logger.warning(f"Missing coordinates for {n_missing} sites, skipping")
# ERROR: Problems requiring attention
logger.error("Solver returned infeasible status")
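For these calls to produce output, each script needs a configured logger. The following is a minimal sketch using only the standard `logging` module; `setup_logging` and its format string are assumptions and may differ from what `logging_config.py` provides.

```python
import logging


def setup_logging(name: str, level: int = logging.INFO) -> logging.Logger:
    """Return a named logger that writes timestamped messages to stderr."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    if not logger.handlers:  # avoid duplicate handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)-8s %(name)s: %(message)s")
        )
        logger.addHandler(handler)
    return logger


logger = setup_logging("integrate_thermal_generators")
logger.debug("Suppressed while the level is INFO")
logger.info("Added %d generators to network", 12)
```

Setting the level to `logging.DEBUG` for a single script turns on the detailed diagnostics without changing any call sites.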
Extension Points#
Adding New Technologies#
Define carrier in carrier_definitions.py
Add data processing in integration module
Update profile generation if needed
Add configuration options
Adding New Data Sources#
Place raw data in data/
Create processing script
Add Snakemake rule
Update scenario detection if needed
Adding New Analysis#
Create analysis script
Add rule in analysis.smk
Define output format (HTML, CSV, etc.)
Performance Considerations#
Memory Management#
Large networks use ~8GB RAM
Time series stored efficiently (NetCDF)
Profile caching to avoid recomputation
Computation Scaling#
| Factor | Impact |
|---|---|
| More buses | O(n²) for LOPF |
| More timesteps | Linear |
| More generators | Sublinear (grouped) |
Optimization#
Network clustering for faster solving
Parallel rule execution
Profile pre-generation