Commit ffd43c2

Author: Jonas Danke
feat: refactor database engine management with dual-backend support (OEP + local egon-data)
This commit introduces a comprehensive refactoring of how eDisGo manages database connections, enabling seamless switching between the OpenEnergyPlatform (OEP) and local egon-data PostgreSQL databases.

## Engine as EDisGo property (edisgo/edisgo.py)

- Add lazy-initialized `engine` property to the EDisGo class, replacing the previous pattern of passing engine explicitly to every function call
- Support multiple initialization modes via new constructor parameters: `engine`, `db_config_path`, `db_url`, `db_ssh`, `db_token`
- Default to OEP engine when no configuration is provided
- Add `__deepcopy__` method that excludes the unpicklable SQLAlchemy engine (contains _thread._local objects) and lets the copy lazily recreate its own connection

## Optional engine parameter in all DB functions

Make the `engine` parameter optional (default=None) in all 33 functions across 8 modules that access the database. When engine is None, each function falls back to `edisgo_object.engine`. This maintains full backward compatibility: callers can still pass an explicit engine.

Modified modules:

- edisgo/io/timeseries_import.py (10 functions)
- edisgo/io/electromobility_import.py (4 functions)
- edisgo/io/dsm_import.py (3 functions)
- edisgo/io/heat_pump_import.py (2 functions)
- edisgo/io/generators_import.py (1 function)
- edisgo/io/storage_import.py (1 function)
- edisgo/network/heat.py (2 engine fallbacks)
- edisgo/network/timeseries.py (1 engine fallback)

## Local DB schema resolution (edisgo/tools/config.py)

Extend `import_tables_from_oep()` to handle local egon-data databases:

- Auto-resolve schema mismatches: tables may reside in different schemas locally vs on OEP (e.g. egon_etrago_bus is in "grid" locally but imported via "supply" on OEP). The method now searches all schemas when a table is not found in the expected schema.
- Handle tables without primary keys (e.g. egon_map_zensus_grid_districts, egon_daily_heat_demand_per_climate_zone) by passing all columns as synthetic PK via `__mapper_args__["primary_key"]`.

## SSH tunnel lifecycle management (edisgo/io/db.py)

- `ssh_tunnel()` now returns `tuple[str, SSHTunnelForwarder]` instead of just the port string, so the server object is no longer lost
- `engine()` stores the SSH server as `engine._ssh_server` for later cleanup via `server.stop()`
- This fixes the "I/O operation on closed file" logging errors caused by orphaned paramiko keepalive threads writing to closed log handlers

## Test infrastructure (tests/conftest.py)

- Add `--runlocal` flag to run all DB tests against both OEP and local egon-data database
- Add `--egon-data-config` flag for custom YAML config path
- Parametrize tests via `db_engine` fixture: each DB test runs as `test_name[oep]` and `test_name[local]` when --runlocal is active
- Add `pytest_sessionfinish` hook that disposes engines and stops SSH tunnels cleanly after all tests complete
- Suppress paramiko DEBUG keepalive logging (defense-in-depth)
- Migrate all 29 test methods across 10 test files from hardcoded `pytest.engine` to parametrized `db_engine` fixture

## New files

- tests/io/test_db.py: 4 tests for SSH tunnel lifecycle (tunnel returns server, engine stores server, cleanup stops tunnel, OEP has no tunnel)
- egon-data.configuration.yaml: Template config file with placeholder credentials for local egon-data database connections
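
The SSH tunnel lifecycle described in the commit message can be sketched without a real network connection. This is only an illustration, not the code in edisgo/io/db.py: `FakeTunnel`, `FakeEngine`, `open_tunnel`, `make_engine` and `close_engine` are hypothetical stand-ins. The only parts taken from the message are the idea of returning the server object together with the port and of keeping it on the engine as `_ssh_server` so it can be stopped later.

```python
from typing import Optional, Tuple


class FakeTunnel:
    """Hypothetical stand-in for an SSH tunnel server object."""

    def __init__(self, local_port: int) -> None:
        self.local_bind_port = local_port
        self.is_active = False

    def start(self) -> None:
        self.is_active = True

    def stop(self) -> None:
        self.is_active = False


class FakeEngine:
    """Hypothetical stand-in for a database engine."""

    def __init__(self, url: str) -> None:
        self.url = url
        self.disposed = False
        self._ssh_server: Optional[FakeTunnel] = None

    def dispose(self) -> None:
        self.disposed = True


def open_tunnel(local_port: int = 59510) -> Tuple[str, FakeTunnel]:
    # Return the server together with the port so the caller can stop it later.
    server = FakeTunnel(local_port)
    server.start()
    return str(server.local_bind_port), server


def make_engine(ssh: bool = False) -> FakeEngine:
    if ssh:
        port, server = open_tunnel()
        engine = FakeEngine(f"placeholder://localhost:{port}/db")
        engine._ssh_server = server  # keep a handle for cleanup
        return engine
    return FakeEngine("placeholder://no-tunnel/db")


def close_engine(engine: FakeEngine) -> None:
    # Release the engine first, then stop the tunnel if there is one.
    engine.dispose()
    server = getattr(engine, "_ssh_server", None)
    if server is not None:
        server.stop()
```

If `open_tunnel` returned only the port, nothing would hold a reference to the server, and there would be no object on which to call `stop()` during cleanup.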
1 parent e5c9f0c commit ffd43c2

28 files changed

Lines changed: 621 additions & 221 deletions

.gitignore

Lines changed: 2 additions & 0 deletions
@@ -28,3 +28,5 @@ eDisGo.egg-info/
 .vscode/settings.json

 *OEP_TOKEN.*
+egon-data.configuration.yaml
+.coverage

edisgo/edisgo.py

Lines changed: 148 additions & 14 deletions
@@ -31,7 +31,6 @@
     pypsa_io,
     timeseries_import,
 )
-from edisgo.io.db import engine as egon_engine
 from edisgo.io.ding0_import import import_ding0_grid
 from edisgo.io.electromobility_import import (
     distribute_charging_demand,
@@ -131,6 +130,26 @@ class EDisGo:
         Default: "default".
     legacy_ding0_grids : bool
         Allow import of old ding0 grids. Default: True.
+    engine : :sqlalchemy:`sqlalchemy.Engine<sqlalchemy.engine.Engine>`, optional
+        SQLAlchemy database engine. If provided, this engine is used for all
+        database queries. If not provided, an engine is lazily created based
+        on other database parameters or defaults to an OEP connection.
+        Default: None.
+    db_config_path : str or pathlib.Path, optional
+        Path to an egon-data YAML configuration file (e.g.
+        ``~/.ssh/egon-data.configuration.yaml``). Used together with `db_ssh`
+        to create a local database engine via SSH tunnel.
+        Default: None.
+    db_url : str, optional
+        SQLAlchemy database URL for direct connection (e.g.
+        ``postgresql+psycopg2://egon:data@localhost:59510/egon-data``).
+        Default: None.
+    db_ssh : bool
+        If True, use SSH tunnel when connecting via `db_config_path`.
+        Default: False.
+    db_token : str or pathlib.Path, optional
+        OEP token or path to token file for OEP connection.
+        Default: None.

     Attributes
     ----------
@@ -157,13 +176,50 @@ class EDisGo:
         requirements or power plant dispatch.
     dsm : :class:`~.network.dsm.DSM`
         This is a container holding data on demand side management potential.
+    engine : :sqlalchemy:`sqlalchemy.Engine<sqlalchemy.engine.Engine>`
+        Database engine for OEDB access. Lazily initialized on first access.
+        Defaults to OEP connection. Can be overridden via setter or constructor
+        parameters.
+
+    Examples
+    --------
+    Default OEP connection (engine created lazily on first DB access):
+
+    >>> edisgo = EDisGo(ding0_grid="path/to/grid")
+    >>> edisgo.import_generators(generator_scenario="eGon2035")
+
+    Local database via YAML config with SSH tunnel:
+
+    >>> edisgo = EDisGo(
+    ...     ding0_grid="path/to/grid",
+    ...     db_config_path="~/.ssh/egon-data.configuration.yaml",
+    ...     db_ssh=True,
+    ... )
+
+    Local database via connection string:
+
+    >>> edisgo = EDisGo(
+    ...     ding0_grid="path/to/grid",
+    ...     db_url="postgresql+psycopg2://egon:data@localhost:59510/egon-data",
+    ... )
+
+    Override engine after construction:
+
+    >>> edisgo.engine = my_custom_engine

     """

     def __init__(self, **kwargs):
         # load configuration
         self._config = Config(**kwargs)

+        # database engine configuration (lazy initialization)
+        self._engine = kwargs.get("engine", None)
+        self._db_config_path = kwargs.get("db_config_path", None)
+        self._db_url = kwargs.get("db_url", None)
+        self._db_ssh = kwargs.get("db_ssh", False)
+        self._db_token = kwargs.get("db_token", None)
+
         # instantiate topology object and load grid data
         self.topology = Topology(config=self.config)
         self.import_ding0_grid(
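
The lazy initialization that these stored parameters feed can be sketched in isolation. This is a minimal illustration under stated assumptions: `LazyEngineHolder` and its `_build` helper are hypothetical, and `_build` returns a descriptive string where the real code would create a SQLAlchemy engine. The precedence shown (explicit engine, then `db_url`, then `db_config_path`, then the OEP default) follows the commit's `engine` property.

```python
class LazyEngineHolder:
    """Hypothetical stand-in for the engine handling in EDisGo."""

    def __init__(self, **kwargs):
        self._engine = kwargs.get("engine")
        self._db_url = kwargs.get("db_url")
        self._db_config_path = kwargs.get("db_config_path")
        self._db_ssh = kwargs.get("db_ssh", False)
        self._db_token = kwargs.get("db_token")
        self.created = 0  # counts how often an engine was built

    def _build(self, description):
        # Placeholder for the real engine factory; returns a label instead.
        self.created += 1
        return description

    @property
    def engine(self):
        # Build the engine only on first access, then reuse it.
        if self._engine is None:
            if self._db_url is not None:
                self._engine = self._build(f"url:{self._db_url}")
            elif self._db_config_path is not None:
                self._engine = self._build(
                    f"config:{self._db_config_path}:ssh={self._db_ssh}"
                )
            else:
                self._engine = self._build(f"oep:token={self._db_token}")
        return self._engine

    @engine.setter
    def engine(self, value):
        self._engine = value


holder = LazyEngineHolder(db_url="placeholder://localhost/db")
nothing_built_yet = holder.created == 0  # construction does not connect
first = holder.engine
second = holder.engine  # same object, no second build
```

Because construction never touches the database, objects that only work with grid data do not pay for a connection they never use.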
@@ -232,6 +288,66 @@ def config(self):
     def config(self, kwargs):
         self._config = Config(**kwargs)

+    @property
+    def engine(self):
+        """
+        Database engine for accessing the OEDB or local egon-data database.
+
+        Lazy initialization: the engine is only created when first accessed.
+        By default, creates an OEP engine using oedialect. Can be overridden
+        by providing a custom engine via the setter, or by passing `db_config_path`,
+        `db_url`, or `engine` to the EDisGo constructor.
+
+        Parameters
+        ----------
+        engine : :sqlalchemy:`sqlalchemy.Engine<sqlalchemy.engine.Engine>`
+            SQLAlchemy engine to use for database connections.
+
+        Returns
+        -------
+        :sqlalchemy:`sqlalchemy.Engine<sqlalchemy.engine.Engine>`
+            Database engine.
+
+        """
+        if self._engine is None:
+            if self._db_url is not None:
+                from sqlalchemy import create_engine
+
+                self._engine = create_engine(self._db_url, echo=False)
+            elif self._db_config_path is not None:
+                from edisgo.io.db import engine as create_db_engine
+
+                self._engine = create_db_engine(
+                    path=self._db_config_path, ssh=self._db_ssh
+                )
+            else:
+                from edisgo.io.db import engine as create_db_engine
+
+                self._engine = create_db_engine(token=self._db_token)
+        return self._engine
+
+    @engine.setter
+    def engine(self, value):
+        self._engine = value
+
+    def __deepcopy__(self, memo):
+        import copy as _copy
+
+        # Engine (SQLAlchemy) contains _thread._local objects that cannot be
+        # pickled. Exclude it from the deep copy and let the copy lazily
+        # recreate its own engine from the stored configuration parameters
+        # (_db_config_path, _db_url, _db_token etc. are all plain strings/bools
+        # and will be deep-copied normally).
+        cls = self.__class__
+        result = cls.__new__(cls)
+        memo[id(self)] = result
+        for k, v in self.__dict__.items():
+            if k == "_engine":
+                setattr(result, k, None)
+            else:
+                setattr(result, k, _copy.deepcopy(v, memo))
+        return result
+
     def import_ding0_grid(self, path, legacy_ding0_grids=True):
         """
         Import ding0 topology data from csv files in the format as
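
The `__deepcopy__` approach can be tried out with standard-library objects only. In this sketch, `FakeEngine` and `Holder` are hypothetical; `FakeEngine` holds a `threading.local` and a lock to mimic an object that `copy.deepcopy` cannot handle, which is an assumption about why a real engine fails to copy rather than a statement about SQLAlchemy internals.

```python
import copy
import threading


class FakeEngine:
    """Hypothetical stand-in holding state that cannot be deep-copied."""

    def __init__(self):
        self._local = threading.local()
        self._lock = threading.Lock()


class Holder:
    def __init__(self, db_url):
        self._db_url = db_url
        self._engine = None
        self.data = {"loads": [1, 2, 3]}

    @property
    def engine(self):
        if self._engine is None:
            self._engine = FakeEngine()
        return self._engine

    def __deepcopy__(self, memo):
        cls = self.__class__
        result = cls.__new__(cls)
        memo[id(self)] = result  # register early so cyclic references resolve
        for key, value in self.__dict__.items():
            if key == "_engine":
                setattr(result, key, None)  # the copy rebuilds its engine lazily
            else:
                setattr(result, key, copy.deepcopy(value, memo))
        return result


original = Holder("placeholder://localhost/db")
_ = original.engine  # force creation so there is an engine to skip
clone = copy.deepcopy(original)
```

The clone keeps the connection settings but starts without an engine, so its first access to `clone.engine` creates a fresh one instead of sharing the original's.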
@@ -559,7 +675,7 @@ def set_time_series_active_power_predefined(
             is indexed using a default year and set for the whole year.

         """
-        engine = kwargs["engine"] if "engine" in kwargs else egon_engine()
+        engine = kwargs.pop("engine", None) or self.engine
         if self.timeseries.timeindex.empty:
             logger.warning(
                 "When setting time series using predefined profiles it is better to "
@@ -982,7 +1098,7 @@ def import_generators(self, generator_scenario=None, **kwargs):
             keyword arguments.

         """
-        engine = kwargs["engine"] if "engine" in kwargs else egon_engine()
+        engine = kwargs.pop("engine", None) or self.engine
         if self.legacy_grids is True:
             generators_import.oedb_legacy(
                 edisgo_object=self, generator_scenario=generator_scenario, **kwargs
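
One practical difference between the old lookup and `kwargs.pop("engine", None)` shows up when `**kwargs` is forwarded. The sketch below uses hypothetical names (`downstream`, `Grid`, `import_with_pop`, `import_with_lookup`); whether the original code ever hit this error is not visible in the diff, so treat it as an illustration of the pattern only.

```python
def downstream(engine, **kwargs):
    """Hypothetical importer that needs an engine plus optional settings."""
    return engine, kwargs


class Grid:
    engine = "default-engine"  # stands in for the EDisGo.engine property

    def import_with_pop(self, **kwargs):
        # pop() removes "engine" from kwargs, so forwarding **kwargs is safe.
        engine = kwargs.pop("engine", None) or self.engine
        return downstream(engine=engine, **kwargs)

    def import_with_lookup(self, **kwargs):
        # Reading without removing leaves "engine" inside kwargs as well.
        engine = kwargs["engine"] if "engine" in kwargs else self.engine
        return downstream(engine=engine, **kwargs)


grid = Grid()
explicit = grid.import_with_pop(engine="custom", scenario="eGon2035")
fallback = grid.import_with_pop(scenario="eGon2035")

try:
    grid.import_with_lookup(engine="custom")
    duplicate_error = False
except TypeError:
    # "engine" arrives twice: once explicitly and once via **kwargs.
    duplicate_error = True
```

Note that `or` also falls back when the passed value is falsy, not only when it is `None`; an explicit `if engine is None` check is stricter.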
@@ -1130,12 +1246,14 @@ def _check_convergence():
             if raise_not_converged and len(timesteps_not_converged) > 0:
                 raise ValueError(
                     "Power flow analysis did not converge for the "
-                    f"following {len(timesteps_not_converged)} time steps: {timesteps_not_converged}."
+                    f"following {len(timesteps_not_converged)} time "
+                    f"steps: {timesteps_not_converged}."
                 )
             elif len(timesteps_not_converged) > 0:
                 logger.warning(
                     "Power flow analysis did not converge for the "
-                    f"following {len(timesteps_not_converged)} time steps: {timesteps_not_converged}."
+                    f"following {len(timesteps_not_converged)} time "
+                    f"steps: {timesteps_not_converged}."
                 )
             return timesteps_converged, timesteps_not_converged

@@ -2032,6 +2150,9 @@ def import_electromobility(
         if import_electromobility_data_kwds is None:
             import_electromobility_data_kwds = {}

+        if engine is None:
+            engine = self.engine
+
         if data_source == "oedb":
             import_electromobility_from_oedb(
                 self,
@@ -2136,7 +2257,9 @@ def apply_charging_strategy(
             self, strategy=strategy, charging_park_ids=charging_park_ids, **kwargs
         )

-    def import_heat_pumps(self, scenario, engine, timeindex=None, import_types=None):
+    def import_heat_pumps(
+        self, scenario, engine=None, timeindex=None, import_types=None
+    ):
         """
         Gets heat pump data for specified scenario from oedb and integrates the heat
         pumps into the grid.
@@ -2212,6 +2335,9 @@ def import_heat_pumps(self, scenario, engine, timeindex=None, import_types=None)
             "central_resistive_heaters". If None, all are imported.

         """
+        if engine is None:
+            engine = self.engine
+
         # set up year to index data by
         # first try to get index from time index
         if timeindex is None:
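
Making `engine` optional while keeping it in the same position is what preserves backward compatibility for existing callers. A minimal sketch, assuming a simplified signature; the `Grid` class and the returned tuple are hypothetical and only make the resolved values visible:

```python
class Grid:
    def __init__(self):
        self.engine = "object-level-engine"  # stands in for EDisGo.engine

    def import_heat_pumps(self, scenario, engine=None, timeindex=None):
        # Fall back to the object's own engine only when none was passed.
        if engine is None:
            engine = self.engine
        return scenario, engine, timeindex


grid = Grid()

# Old call style: engine passed positionally, still accepted unchanged.
old_style = grid.import_heat_pumps("eGon2035", "explicit-engine")

# New call style: engine omitted, resolved from the object.
new_style = grid.import_heat_pumps("eGon2035")

# Keyword arguments after engine work without supplying an engine.
with_index = grid.import_heat_pumps("eGon2035", timeindex="2035")
```

Had the parameter been moved to the end of the signature instead, positional calls written against the old API would silently bind their engine to a different parameter.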
@@ -2308,7 +2434,7 @@ def apply_heat_pump_operating_strategy(
         """
         hp_operating_strategy(self, strategy=strategy, heat_pump_names=heat_pump_names)

-    def import_dsm(self, scenario: str, engine: Engine, timeindex=None):
+    def import_dsm(self, scenario: str, engine: Engine = None, timeindex=None):
         """
         Gets industrial and CTS DSM profiles from the
         `OpenEnergy DataBase <https://openenergyplatform.org/database/>`_.
@@ -2327,8 +2453,8 @@ def import_dsm(self, scenario: str, engine: Engine, timeindex=None):
         scenario : str
             Scenario for which to retrieve DSM data. Possible options
             are 'eGon2035' and 'eGon100RE'.
-        engine : :sqlalchemy:`sqlalchemy.Engine<sqlalchemy.engine.Engine>`
-            Database engine.
+        engine : :sqlalchemy:`sqlalchemy.Engine<sqlalchemy.engine.Engine>`, optional
+            Database engine. If not provided, uses :attr:`~.EDisGo.engine`.
         timeindex : :pandas:`pandas.DatetimeIndex<DatetimeIndex>` or None
             Specifies time steps for which to get data. Leap years can currently not be
             handled. In case the given timeindex contains a leap year, the data will be
@@ -2340,6 +2466,9 @@ def import_dsm(self, scenario: str, engine: Engine, timeindex=None):
             is indexed using the default year and returned for the whole year.

         """
+        if engine is None:
+            engine = self.engine
+
         dsm_profiles = dsm_import.oedb(
             edisgo_obj=self, scenario=scenario, engine=engine, timeindex=timeindex
         )
@@ -2351,7 +2480,7 @@ def import_dsm(self, scenario: str, engine: Engine, timeindex=None):
     def import_home_batteries(
         self,
         scenario: str,
-        engine: Engine,
+        engine: Engine = None,
     ):
         """
         Gets home battery data for specified scenario and integrates the batteries into
@@ -2379,10 +2508,13 @@ def import_home_batteries(
         scenario : str
             Scenario for which to retrieve home battery data. Possible options
             are 'eGon2035' and 'eGon100RE'.
-        engine : :sqlalchemy:`sqlalchemy.Engine<sqlalchemy.engine.Engine>`
-            Database engine.
+        engine : :sqlalchemy:`sqlalchemy.Engine<sqlalchemy.engine.Engine>`, optional
+            Database engine. If not provided, uses :attr:`~.EDisGo.engine`.

         """
+        if engine is None:
+            engine = self.engine
+
         home_batteries_oedb(
             edisgo_obj=self,
             scenario=scenario,
@@ -3640,8 +3772,10 @@ def _check_timeindex(check_df, param_name):
             ).any()
             if comparison.any():
                 logger.warning(
-                    "Heat demand is higher than rated heatpump power"
-                    f" of heatpumps: {comparison.index[comparison.values].values}. Demand can not be covered if no sufficient"
+                    "Heat demand is higher than rated heatpump "
+                    "power of heatpumps: "
+                    f"{comparison.index[comparison.values].values}"
+                    ". Demand can not be covered if no sufficient"
                     " heat storage capacities are available."
                 )
