Architecture#

Overview of PyPSA-GB’s software architecture and design decisions.

High-Level Architecture#

flowchart TB
    subgraph Config["Configuration Layer"]
        YAML["YAML Config Files"]
        SCENARIOS["Scenario Definitions"]
    end
    
    subgraph Workflow["Workflow Layer"]
        SNAKE["Snakemake"]
        RULES["Rule Definitions"]
    end
    
    subgraph Core["Core Layer"]
        SCRIPTS["Python Scripts"]
        PYPSA["PyPSA"]
    end
    
    subgraph Data["Data Layer"]
        RAW["Raw Data"]
        RESOURCES["Generated Resources"]
    end
    
    YAML --> SNAKE
    SCENARIOS --> SNAKE
    SNAKE --> RULES
    RULES --> SCRIPTS
    SCRIPTS --> PYPSA
    RAW --> SCRIPTS
    SCRIPTS --> RESOURCES

Design Principles#

1. Declarative Configuration#

Users declare what they want, not how to build it:

# User specifies desired outcome
HT35:
  modelled_year: 2035
  network_model: "ETYS"
  FES_scenario: "Holistic Transition"

Snakemake determines the execution path automatically.

2. Reproducibility#

All inputs are versioned or documented
Configuration is explicit
Random seeds are fixed where needed
Logs capture execution details

3. Modularity#

Each script does one thing well:

Script	Single Responsibility
`network_build/ETYS_network.py`	Assemble ETYS network from preprocessed data
`network_build/process_ETYS_data.py`	Parse raw ETYS Excel to CSVs
`network_build/ETYS_upgrades.py`	Apply planned network reinforcements
`network_build/etys_file_registry.py`	Map ETYS publication years to files
`network_build/build_network.py`	Create Reduced/Zonal networks
`integrate_thermal_generators.py`	Add thermal capacity
`solve_network.py`	Run optimization
`market/solve_wholesale.py`	Run copperplate wholesale dispatch
`market/solve_balancing.py`	Run anchored balancing redispatch

4. Data Source Abstraction#

The same workflow handles historical and future scenarios:

flowchart LR
    SCENARIO["Scenario Config"] --> DETECT["Scenario Detection"]
    DETECT -->|"≤2024"| HIST["Historical Sources"]
    DETECT -->|">2024"| FUTURE["FES Sources"]
    HIST --> INTEGRATE["Integration"]
    FUTURE --> INTEGRATE
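
The branching in the diagram can be sketched as a single function. This is a minimal illustration, not the contents of `scenario_detection.py`: the function name `detect_data_source`, the constant name, and the returned labels are assumptions, and only the 2024 cut-off comes from the diagram.

```python
# Assumption: the cut-off year is taken from the "≤2024" / ">2024" labels
# in the diagram; the constant and function names are hypothetical.
LAST_HISTORICAL_YEAR = 2024


def detect_data_source(modelled_year: int) -> str:
    """Return which family of data sources a scenario should draw on."""
    if modelled_year <= LAST_HISTORICAL_YEAR:
        return "historical"
    return "FES"


# The HT35 scenario shown earlier models 2035, so it would use FES sources.
print(detect_data_source(2035))
```

Keeping the decision in one place means the integration steps downstream only need to ask which source family applies, rather than comparing years themselves.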

Component Architecture#

Configuration System#

config/
├── config.yaml       # What to run
├── scenarios.yaml    # Scenario definitions
├── defaults.yaml     # Default values
└── config_loader.py  # Python interface

Loading flow, from lowest to highest precedence:

Load defaults.yaml
Override with scenarios.yaml values
Override with config.yaml values
Override with command-line arguments
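
The layered override can be sketched as a recursive dictionary merge in which later layers win. This is an illustration under stated assumptions, not the code in `config_loader.py`: the helper names `deep_merge` and `resolve_config` and the `solver` keys in the sample data are hypothetical.

```python
from copy import deepcopy


def deep_merge(base: dict, override: dict) -> dict:
    """Return a copy of `base` with `override` applied recursively."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge nested sections key by key instead of replacing them.
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def resolve_config(*layers: dict) -> dict:
    """Merge layers from left to right; later layers take precedence."""
    resolved: dict = {}
    for layer in layers:
        resolved = deep_merge(resolved, layer)
    return resolved


# Hypothetical sample data standing in for the four layers.
defaults = {"solver": {"name": "highs", "threads": 1}, "network_model": "ETYS"}
scenario = {"modelled_year": 2035, "network_model": "ETYS"}
run_config = {"solver": {"threads": 4}}
cli_args = {"solver": {"name": "gurobi"}}

config = resolve_config(defaults, scenario, run_config, cli_args)
print(config)
```

Merging recursively lets a higher layer change one nested value (here `threads`) without having to restate the whole section.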

Workflow System#

Snakefile              # Main entry point
├── rules/
│   ├── network_build.smk
│   ├── generators.smk
│   ├── renewables.smk
│   ├── storage.smk
│   ├── solve.smk
│   └── analysis.smk

Rules define:

Input/output file relationships
Script to execute
Parameters from config
Log file locations
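
To make the role of input/output relationships concrete, the toy model below represents rules as plain data and walks backwards from a requested target to find which rules must run. It is a simplified sketch and not how Snakemake is implemented (it ignores timestamps, wildcards, and cycles). The `Rule` class, the `plan` function, the raw data path, and `add_demand.py` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """The four things a rule declares, as listed above (toy model)."""
    name: str
    inputs: tuple[str, ...]
    outputs: tuple[str, ...]
    script: str
    log: str


def plan(target: str, rules: list[Rule], existing: set[str]) -> list[str]:
    """Return rule names in the order needed to produce `target`."""
    producers = {out: rule for rule in rules for out in rule.outputs}
    order: list[str] = []

    def visit(path: str) -> None:
        if path in existing:
            return  # Already on disk, nothing to run.
        rule = producers.get(path)
        if rule is None:
            raise FileNotFoundError(f"No rule produces {path!r}")
        for dependency in rule.inputs:
            visit(dependency)  # Schedule upstream rules first.
        if rule.name not in order:
            order.append(rule.name)

    visit(target)
    return order


rules = [
    Rule("build_network", ("data/etys_raw.xlsx",), ("HT35_network.nc",),
         "network_build/ETYS_network.py", "logs/build_network.log"),
    Rule("add_demand", ("HT35_network.nc",), ("HT35_network_demand.pkl",),
         "add_demand.py", "logs/add_demand.log"),
]

print(plan("HT35_network_demand.pkl", rules, existing={"data/etys_raw.xlsx"}))
```

Because each rule only names its own inputs and outputs, the execution order falls out of the file relationships and never has to be written down explicitly.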

Script Organization#

scripts/
├── Network Build Package
│   ├── network_build/
│   │   ├── ETYS_network.py          # ETYS network assembly (stage 2)
│   │   ├── process_ETYS_data.py     # Raw Excel → CSV preprocessing (stage 1)
│   │   ├── ETYS_upgrades.py         # Network upgrade application
│   │   ├── etys_file_registry.py    # File/sheet name registry and constants
│   │   └── build_network.py         # Reduced/Zonal network builders
├── Core Modules
│   ├── solve_network.py
│   └── scenario_detection.py
├── Integration Modules
│   ├── integrate_thermal_generators.py
│   ├── integrate_renewable_generators.py
│   └── add_storage.py
├── Utility Modules
│   ├── spatial_utils.py
│   ├── logging_config.py
│   └── carrier_definitions.py
└── Analysis Modules
    ├── analyze_results.py
    └── plotting.py

Market Workflow Modules#

Market dispatch is an optional branch from the finalized network. The main entry points are rules/market.smk and scripts/market/:

Module	Role
`solve_wholesale.py`	Copperplate Stage 1 wholesale solve
`solve_balancing.py`	Constrained Stage 2 balancing redispatch
`market_utils.py`	Bid/offer pricing, ELEXON loading, redispatch metrics
`analyze_market.py`	Dashboard and summary outputs
`validate_bm.py`	Historical ELEXON BM validation
`validate_neso_constraints.py`	NESO constraint validation

Data Flow#

Network Building Pipeline#

flowchart LR
    subgraph Build
        BASE["Base Network\n(buses, lines)"]
        DEMAND["+ Demand\n(loads)"]
        RENEW["+ Renewables\n(generators)"]
        THERMAL["+ Thermal\n(generators)"]
        STORAGE["+ Storage\n(storage_units)"]
        HYDRO["+ Hydrogen\n(electrolysis, H2)"]
        INTER["+ Interconnectors\n(links)"]
    end
    
    BASE --> DEMAND --> RENEW --> THERMAL --> STORAGE --> HYDRO --> INTER

Each step:

Loads the previous network state
Adds new components
Saves updated network
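
The load, add, save pattern can be sketched with a plain dictionary standing in for the network. This is only an illustration of the pattern: the real stages operate on PyPSA network objects and the `.nc`/`.pkl` files described in this document, whereas the JSON files, the `run_stage` helper, and the `load_BEAU41` entry below are assumptions made so the example runs with the standard library alone.

```python
import json
import tempfile
from pathlib import Path


def run_stage(in_path: Path, out_path: Path, component: str, items: list) -> None:
    """One pipeline stage: load the previous state, extend it, save a new file."""
    network = json.loads(in_path.read_text())        # 1. load previous state
    network.setdefault(component, []).extend(items)  # 2. add new components
    out_path.write_text(json.dumps(network))         # 3. save updated state


def demo() -> dict:
    """Run a single 'add demand' stage in a temporary directory."""
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp) / "HT35_network.json"
        with_demand = Path(tmp) / "HT35_network_demand.json"
        base.write_text(json.dumps({"buses": ["BEAU41", "DOUN41"]}))
        run_stage(base, with_demand, "loads", ["load_BEAU41"])
        return json.loads(with_demand.read_text())


print(demo())
```

Writing each stage to its own file, rather than updating one file in place, is what lets the workflow resume from any intermediate state.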

File Naming Convention#

{scenario}_network.nc                           # Base
{scenario}_network_demand.pkl                   # + demand
{scenario}_network_demand_renewables.pkl        # + renewables
{scenario}_..._thermal_generators.pkl           # + thermal
{scenario}_..._storage.pkl                      # + storage
{scenario}_..._hydrogen.pkl                     # + hydrogen
{scenario}_..._interconnectors.nc               # + interconnectors
{scenario}_solved.nc                            # Optimized

Market-enabled scenarios branch from the finalized network rather than from {scenario}_solved.nc:

resources/network/{scenario}.nc                 # Finalized physical network
resources/market/{scenario}_wholesale.nc        # Copperplate wholesale solve
resources/market/{scenario}_balancing.nc        # Constrained BM solve
resources/analysis/{scenario}_market_dashboard.html

PyPSA Integration#

Network Structure#

PyPSA-GB uses standard PyPSA components:

Component	Usage
`Bus`	Substations/nodes
`Line`	Transmission circuits
`Transformer`	Voltage transformation
`Generator`	Power plants
`StorageUnit`	Batteries, pumped hydro
`Load`	Demand
`Link`	HVDC, interconnectors

Component Naming#

# Generators: {carrier}_{bus}_{index}
"wind_offshore_BEAU41_0"
"CCGT_PADI41_1"

# Storage: {carrier}_{bus}_{index}
"battery_LOND41_0"

# Lines: {bus0}_{bus1}_{circuit}
"BEAU41_DOUN41_1"

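The naming patterns above can be captured in small helper functions. This is a sketch only: the function names are hypothetical, and the parser assumes that bus names and indices never contain underscores, which holds for the examples shown but is not stated as a rule.

```python
def unit_name(carrier: str, bus: str, index: int) -> str:
    """Build a generator or storage name: {carrier}_{bus}_{index}."""
    return f"{carrier}_{bus}_{index}"


def line_name(bus0: str, bus1: str, circuit: int) -> str:
    """Build a line name: {bus0}_{bus1}_{circuit}."""
    return f"{bus0}_{bus1}_{circuit}"


def split_unit_name(name: str) -> tuple[str, str, int]:
    """Recover (carrier, bus, index) from a generator or storage name.

    Splitting from the right keeps carriers such as "wind_offshore",
    which contain underscores themselves, in one piece.
    """
    carrier, bus, index = name.rsplit("_", 2)
    return carrier, bus, int(index)


print(unit_name("wind_offshore", "BEAU41", 0))
print(split_unit_name("wind_offshore_BEAU41_0"))
print(line_name("BEAU41", "DOUN41", 1))
```
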
Error Handling Strategy#

Levels of Handling#

Validation: Catch errors before processing
```
validate_scenario_complete(scenario)
```

Graceful Degradation: Handle missing optional data

try:
    extra_data = load_optional_data()
except FileNotFoundError:
    logger.warning("Optional data not found, using defaults")
    extra_data = defaults

Fail Fast: Stop on critical errors

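# network.buses is a pandas DataFrame, so test emptiness explicitly;
# `not network.buses.any()` raises an ambiguous-truth-value error.
if network.buses.empty:
    raise ValueError("Network has no buses!")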
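
As an illustration of the validation level, a check like `validate_scenario_complete` could verify that required keys are present before any processing starts. The body below is a hypothetical sketch rather than the project's implementation: it assumes the scenario is passed as a dictionary, and the list of required keys is borrowed from the HT35 example earlier in this document.

```python
# Assumption: these keys are taken from the HT35 example; the real
# function may require a different set.
REQUIRED_KEYS = ("modelled_year", "network_model", "FES_scenario")


def validate_scenario_complete(scenario: dict) -> None:
    """Raise ValueError naming every required key the scenario lacks."""
    missing = [key for key in REQUIRED_KEYS if key not in scenario]
    if missing:
        raise ValueError(
            f"Scenario is missing required keys: {', '.join(missing)}"
        )


# A complete scenario passes silently.
validate_scenario_complete(
    {"modelled_year": 2035, "network_model": "ETYS",
     "FES_scenario": "Holistic Transition"}
)

# An incomplete one is rejected with all missing keys listed at once.
try:
    validate_scenario_complete({"modelled_year": 2035})
except ValueError as error:
    print(error)
```

Reporting every missing key in a single error saves the user from fixing the configuration one key at a time.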

Logging Strategy#

# DEBUG: Detailed diagnostic info
logger.debug(f"Processing bus {bus_name} with {n_gens} generators")

# INFO: Progress milestones
logger.info(f"Added {n_generators} generators to network")

# WARNING: Recoverable issues
logger.warning(f"Missing coordinates for {n_missing} sites, skipping")

# ERROR: Problems requiring attention
logger.error("Solver returned infeasible status")
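
For these calls to produce output, a logger has to be configured somewhere. The sketch below shows one way to do that with the standard `logging` module. It is not the contents of `logging_config.py`: the function name `setup_logging`, the logger name `pypsa_gb`, and the message format are assumptions.

```python
import logging
from typing import Optional


def setup_logging(log_file: Optional[str] = None,
                  level: int = logging.INFO) -> logging.Logger:
    """Configure and return a logger that writes to the console and,
    optionally, to a log file."""
    logger = logging.getLogger("pypsa_gb")  # hypothetical logger name
    logger.setLevel(level)
    logger.handlers.clear()  # avoid duplicate output on repeated calls

    formatter = logging.Formatter(
        "%(asctime)s %(levelname)-8s %(name)s: %(message)s"
    )
    handlers: list[logging.Handler] = [logging.StreamHandler()]
    if log_file is not None:
        handlers.append(logging.FileHandler(log_file))
    for handler in handlers:
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger


logger = setup_logging(level=logging.DEBUG)
logger.info("Added %d generators to network", 42)
```

Passing the per-rule log path as `log_file` would tie each script's output to the log file locations declared in the workflow rules.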

Extension Points#

Adding New Technologies#

Define carrier in carrier_definitions.py
Add data processing in integration module
Update profile generation if needed
Add configuration options

Adding New Data Sources#

Place raw data in data/
Create processing script
Add Snakemake rule
Update scenario detection if needed

Adding New Analysis#

Create analysis script
Add rule in analysis.smk
Define output format (HTML, CSV, etc.)

Performance Considerations#

Memory Management#

Large networks need roughly 8 GB of RAM
Time series are stored efficiently in NetCDF
Profiles are cached to avoid recomputation

Computation Scaling#

Factor	Impact
More buses	O(n²) for LOPF
More timesteps	Linear
More generators	Sublinear (grouped)

Optimization#

Network clustering for faster solving
Parallel rule execution
Profile pre-generation