filetools

This module provides features for handling the folder structure of HydPy projects as well as loading data from and storing data to files.

Module filetools implements the following members:

Folder2Path Map folder names to their pathnames.

FileManager Base class for NetworkManager, ControlManager, ConditionManager, and SequenceManager.

NetworkManager Manager for network files.

ControlManager Manager for control parameter files.

ConditionManager Manager for condition files.

SequenceManager Manager for sequence files.

check_projectstructure() Raise a warning if the given project root directory does not exist or does not contain all relevant base directories.

create_projectstructure() Make the given project root directory and its base directories.

class hydpy.core.filetools.Folder2Path(*args: str, **kwargs: str)

Bases: object

Map folder names to their pathnames.

You can pass positional or keyword arguments when initialising Folder2Path. For positional arguments, the folder and its path are assumed to be identical. For keyword arguments, the keyword corresponds to the folder name, and its value is the pathname:

>>> from hydpy.core.filetools import Folder2Path
>>> Folder2Path()
Folder2Path()
>>> f2p = Folder2Path(
...     "folder1", "folder2", folder3="folder3", folder4="path4")
>>> f2p
Folder2Path(folder1,
            folder2,
            folder3,
            folder4=path4)
>>> print(f2p)
Folder2Path(folder1, folder2, folder3, folder4=path4)

Adding folders after initialisation is also supported:

>>> f2p.add("folder5")
>>> f2p.add("folder6", "path6")
>>> f2p
Folder2Path(folder1,
            folder2,
            folder3,
            folder5,
            folder4=path4,
            folder6=path6)

Folder names are required to be valid Python identifiers:

>>> f2p.add("folder 7")
Traceback (most recent call last):
...
ValueError: The given name string `folder 7` does not define a valid variable identifier.  Valid identifiers do not contain characters like `-` or empty spaces, do not start with numbers, cannot be mistaken with Python built-ins like `for`...)
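
The identifier rule can be approximated with the standard library. The following is a minimal sketch and not HydPy's actual check: the helper name is_valid_foldername is hypothetical, and HydPy may reject further names (for example, names of built-ins) that this sketch accepts.

```python
import keyword


def is_valid_foldername(name: str) -> bool:
    """Return True if `name` could serve as a Python attribute name.

    Hypothetical helper: combines `str.isidentifier` with a keyword check.
    """
    return name.isidentifier() and not keyword.iskeyword(name)


# Names with spaces, hyphens, leading digits, or keywords are rejected.
for candidate in ("folder1", "folder 7", "7folder", "folder-7", "for"):
    print(f"{candidate!r}: {is_valid_foldername(candidate)}")
```

Restricting folder names to identifiers is what makes attribute access such as f2p.folder1 possible.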

You can query the folder and path names:

>>> f2p.folders
['folder1', 'folder2', 'folder3', 'folder4', 'folder5', 'folder6']
>>> f2p.paths
['folder1', 'folder2', 'folder3', 'path4', 'folder5', 'path6']

Attribute access and iteration are also supported:

>>> "folder1" in dir(f2p)
True
>>> f2p.folder1
'folder1'
>>> f2p.folder4
'path4'

>>> for folder, path in f2p:
...     print(folder, path)
folder1 folder1
folder2 folder2
folder3 folder3
folder4 path4
folder5 folder5
folder6 path6

>>> len(f2p)
6
>>> bool(f2p)
True
>>> bool(Folder2Path())
False

add(directory: str, path: str | None = None) → None: Add a directory and, optionally, its path.

property folders: list[str]: The currently handled folder names.

property paths: list[str]: The currently handled path names.

class hydpy.core.filetools.FileManager

Bases: object

Base class for NetworkManager, ControlManager, ConditionManager, and SequenceManager.

BASEDIR: str

DEFAULTDIR: str | None

property projectdir: str

The folder name of a project’s root directory.

For the HydPy-H-Lahn example project, projectdir is (not surprisingly) HydPy-H-Lahn and is queried from the pub module. However, you can define or change projectdir interactively, which can be useful for more complex tasks like copying (parts of) projects:

>>> from hydpy.core.filetools import FileManager
>>> from hydpy import pub
>>> pub.projectname = "project_A"
>>> filemanager = FileManager()
>>> filemanager.projectdir
'project_A'

>>> filemanager.projectdir = "project_B"
>>> filemanager.projectdir
'project_B'

>>> pub.projectname = "project_C"
>>> filemanager.projectdir
'project_B'

>>> del filemanager.projectdir
>>> filemanager.projectdir
'project_C'

>>> del pub.projectname
>>> filemanager.projectdir
Traceback (most recent call last):
...
hydpy.core.exceptiontools.AttributeNotReady: While trying to automatically determine the file manager's project root directory, the following error occurred: Attribute projectname of module `pub` is not defined at the moment.
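
The override-and-fallback behaviour shown above follows a common property pattern. The sketch below illustrates that pattern only; the classes Defaults and Manager are hypothetical stand-ins and do not reproduce HydPy's implementation or its error handling.

```python
class Defaults:
    """Hypothetical stand-in for a module-level configuration object."""

    projectname = "project_A"


class Manager:
    """Hypothetical manager that prefers a local override over the default."""

    def __init__(self) -> None:
        self._projectdir = None

    @property
    def projectdir(self) -> str:
        # Fall back to the shared default while no override is set.
        if self._projectdir is None:
            return Defaults.projectname
        return self._projectdir

    @projectdir.setter
    def projectdir(self, name: str) -> None:
        self._projectdir = name

    @projectdir.deleter
    def projectdir(self) -> None:
        # Deleting only forgets the override; the default stays untouched.
        self._projectdir = None


manager = Manager()
first = manager.projectdir  # taken from the default
manager.projectdir = "project_B"
Defaults.projectname = "project_C"
second = manager.projectdir  # the override wins over the changed default
del manager.projectdir
third = manager.projectdir  # back to the (changed) default
print(first, second, third)
```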

property basepath: str

The absolute path pointing to the available working directories.

>>> from hydpy.core.filetools import FileManager
>>> filemanager = FileManager()
>>> filemanager.BASEDIR = "basename"
>>> filemanager.projectdir = "projectname"
>>> from hydpy import repr_, TestIO
>>> with TestIO():
...     repr_(filemanager.basepath)   
'...hydpy/tests/iotesting/projectname/basename'

property availabledirs: Folder2Path

The names and paths of the available working directories.

All possible working directories must be available in the base directory of the respective FileManager subclass. Folders with names starting with an underscore do not count (use this for directories handling additional data files, if you like), while zipped directories do count as available directories:

>>> from hydpy.core.filetools import FileManager
>>> filemanager = FileManager()
>>> filemanager.BASEDIR = "basename"
>>> filemanager.projectdir = "projectname"
>>> import os
>>> from hydpy import repr_, TestIO
>>> TestIO.clear()
>>> with TestIO():
...     os.makedirs("projectname/basename/folder1")
...     os.makedirs("projectname/basename/folder2")
...     open("projectname/basename/folder3.zip", "w").close()
...     os.makedirs("projectname/basename/_folder4")
...     open("projectname/basename/folder5.tar", "w").close()
...     filemanager.availabledirs   
Folder2Path(folder1=.../projectname/basename/folder1,
            folder2=.../projectname/basename/folder2,
            folder3=.../projectname/basename/folder3.zip)
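
The selection rules can be mimicked with the standard library. This is a rough sketch under stated assumptions, not HydPy's code: the helper available_dirs is hypothetical, returns a plain dict instead of a Folder2Path object, and treats only files ending in .zip as packed directories.

```python
import os
import tempfile


def available_dirs(basepath: str) -> dict[str, str]:
    """Map folder names to paths (hypothetical helper).

    Counts subdirectories and `.zip` files; skips names starting with `_`.
    """
    found = {}
    for entry in sorted(os.listdir(basepath)):
        if entry.startswith("_"):
            continue
        path = os.path.join(basepath, entry)
        if os.path.isdir(path):
            found[entry] = path
        elif entry.endswith(".zip"):
            # A zipped directory is listed under its name without suffix.
            found[entry[: -len(".zip")]] = path
    return found


with tempfile.TemporaryDirectory() as base:
    for name in ("folder1", "folder2", "_folder4"):
        os.makedirs(os.path.join(base, name))
    for name in ("folder3.zip", "folder5.tar"):
        open(os.path.join(base, name), "w").close()
    names = list(available_dirs(base))
    print(names)
```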

property currentdir: str

The name of the current working directory containing the relevant files.

To show most of the functionality of property currentdir (we explain unpacking zipped files on the fly in the documentation on method zip_currentdir()), we first prepare a FileManager object with the default basepath projectname/basename and no DEFAULTDIR defined:

>>> from hydpy.core.filetools import FileManager
>>> filemanager = FileManager()
>>> filemanager.BASEDIR = "basename"
>>> filemanager.DEFAULTDIR = None
>>> filemanager.projectdir = "projectname"
>>> import os
>>> from hydpy import pub, repr_, TestIO
>>> TestIO.clear()
>>> with TestIO():
...     os.makedirs("projectname/basename")
...     repr_(filemanager.basepath)  
'...hydpy/tests/iotesting/projectname/basename'

At first, the base directory is empty and asking for the current working directory results in the following error:

>>> with TestIO():
...     filemanager.currentdir  
Traceback (most recent call last):
...
RuntimeError: The current working directory of the file manager has not been defined manually and cannot be determined automatically: `.../projectname/basename` does not contain any available directories.

If only one directory exists, it is considered the current working directory automatically:

>>> with TestIO(), pub.options.printprogress(True):
...     os.mkdir("projectname/basename/dir1")
...     assert filemanager.currentdir == "dir1"
The name of the file manager's current working directory has not been previously defined and is hence set to `dir1`.

property currentdir memorises the name of the current working directory, even if another directory is added later to the base path:

>>> with TestIO():
...     os.mkdir("projectname/basename/dir2")
...     assert filemanager.currentdir == "dir1"

Set the value of currentdir to None to let it forget the memorised directory. After that, trying to query the current working directory results in another error, as it is unclear which directory to select:

>>> with TestIO():
...     filemanager.currentdir = None
...     filemanager.currentdir  
Traceback (most recent call last):
...
RuntimeError: The current working directory of the file manager has not been defined manually and cannot be determined automatically: `.../projectname/basename` does contain multiple available directories (dir1 and dir2).

Setting currentdir manually solves the problem:

>>> with TestIO():
...     filemanager.currentdir = "dir1"
...     assert filemanager.currentdir == "dir1"

Remove the current working directory dir1 with the del statement:

>>> with TestIO(), pub.options.printprogress(True):  
...     del filemanager.currentdir
...     assert not os.path.exists("projectname/basename/dir1")
Directory ...dir1 has been removed.

FileManager subclasses can define a default directory name. When multiple directories exist and none is selected manually, the default directory is chosen automatically. The following example shows the error raised when multiple directories exist but none of them carries the default name:

>>> with TestIO():
...     os.mkdir("projectname/basename/dir1")
...     filemanager.DEFAULTDIR = "dir3"
...     del filemanager.currentdir
...     filemanager.currentdir  
Traceback (most recent call last):
...
RuntimeError: The current working directory of the file manager has not been defined manually and cannot be determined automatically: The default directory (dir3) is not among the available directories (dir1 and dir2).

We can fix this by manually adding the required default directory:

>>> with TestIO(), pub.options.printprogress(True):
...     os.mkdir("projectname/basename/dir3")
...     assert filemanager.currentdir == "dir3"
The name of the file manager's current working directory has not been previously defined and is hence set to `dir3`.

Setting the currentdir to dir4 not only overwrites the default name but also creates the required folder:

>>> with TestIO(), pub.options.printprogress(True):
...     filemanager.currentdir = "dir4"
...     assert filemanager.currentdir == "dir4"  
Directory ...dir4 has been created.
>>> with TestIO():
...     dirs = os.listdir("projectname/basename")
...     assert sorted(dirs) == ["dir1", "dir2", "dir3", "dir4"]

Failed attempts to remove directories result in error messages like the following one:

>>> import shutil
>>> from unittest.mock import patch
>>> with patch.object(shutil, "rmtree", side_effect=AttributeError):
...     with TestIO():
...         del filemanager.currentdir  
Traceback (most recent call last):
...
AttributeError: While trying to delete the current working directory `.../projectname/basename/dir4` of the file manager, the following error occurred: ...

Then, the current working directory still exists and is remembered by property currentdir:

>>> with TestIO():
...     assert filemanager.currentdir == "dir4"
>>> with TestIO():
...     dirs = os.listdir("projectname/basename")
...     assert sorted(dirs) == ["dir1", "dir2", "dir3", "dir4"]

Assign the folder’s absolute path if you need to work outside the current project directory (for example, to archive simulated data):

>>> with TestIO():  
...     os.mkdir("differentproject")
...     filemanager.currentdir = os.path.abspath("differentproject/dir1")
...     path = repr_(filemanager.currentpath)
...     assert path.endswith("hydpy/tests/iotesting/differentproject/dir1")
...     assert os.listdir("differentproject") == ["dir1"]

If a FileManager subclass defines its DEFAULTDIR class attribute, the above behaviour differs in the case of an initially empty base directory. Then, currentdir activates and creates an accordingly named directory automatically:

>>> filemanager.currentdir = None
>>> filemanager.DEFAULTDIR = "default"
>>> TestIO.clear()
>>> with TestIO(), pub.options.printprogress(True):  
...     os.makedirs("projectname/basename")
...     assert filemanager.currentdir == "default"
...     assert os.path.exists("projectname/basename/default")
The name of the file manager's current working directory has not been previously defined and is hence set to `default`.
Directory ...default has been created.
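
The automatic selection rules demonstrated in this section can be summarised in a few lines. The function below is a hypothetical sketch that covers only the cases shown above; in particular, letting a single existing directory take precedence over a differently named default is an assumption of this sketch, and creating or memorising directories is left out.

```python
from typing import Optional, Sequence


def select_currentdir(available: Sequence[str], default: Optional[str]) -> str:
    """Pick a working directory name following the documented cases."""
    if not available:
        if default is None:
            raise RuntimeError("no available directories")
        return default  # the caller would have to create this directory
    if len(available) == 1:
        return available[0]
    if default in available:
        return default
    raise RuntimeError(
        f"cannot choose between {', '.join(available)} (default: {default})"
    )


cases = (
    ([], None),
    (["dir1"], None),
    (["dir1", "dir2"], "dir3"),
    (["dir1", "dir2", "dir3"], "dir3"),
    ([], "default"),
)
outcomes = []
for available, default in cases:
    try:
        outcomes.append(select_currentdir(available, default))
    except RuntimeError:
        outcomes.append("error")
print(outcomes)
```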

property currentpath: str

The absolute path of the current working directory.

>>> from hydpy.core.filetools import FileManager
>>> filemanager = FileManager()
>>> filemanager.BASEDIR = "basename"
>>> filemanager.projectdir = "projectname"
>>> from hydpy import repr_, TestIO
>>> with TestIO():
...     filemanager.currentdir = "testdir"
...     repr_(filemanager.currentpath)  
'...hydpy/tests/iotesting/projectname/basename/testdir'

property filenames: list[str]

The names of the files in the current working directory, except those starting with an underscore.

>>> from hydpy.core.filetools import FileManager
>>> filemanager = FileManager()
>>> filemanager.BASEDIR = "basename"
>>> filemanager.projectdir = "projectname"
>>> from hydpy import TestIO
>>> with TestIO():
...     filemanager.currentdir = "testdir"
...     open("projectname/basename/testdir/file1.txt", "w").close()
...     open("projectname/basename/testdir/file2.npy", "w").close()
...     open("projectname/basename/testdir/_file1.nc", "w").close()
...     filemanager.filenames
['file1.txt', 'file2.npy']

property filepaths: list[str]

The absolute path names of the files returned by property filenames.

>>> from hydpy.core.filetools import FileManager
>>> filemanager = FileManager()
>>> filemanager.BASEDIR = "basename"
>>> filemanager.projectdir = "projectname"
>>> from hydpy import repr_, TestIO
>>> with TestIO():
...     filemanager.currentdir = "testdir"
...     open("projectname/basename/testdir/file1.txt", "w").close()
...     open("projectname/basename/testdir/file2.npy", "w").close()
...     open("projectname/basename/testdir/_file1.nc", "w").close()
...     for filepath in filemanager.filepaths:
...         repr_(filepath)  
'...hydpy/tests/iotesting/projectname/basename/testdir/file1.txt'
'...hydpy/tests/iotesting/projectname/basename/testdir/file2.npy'

zip_currentdir() → None

Pack the current working directory in a zip file.

FileManager subclasses allow for manual packing and automatic unpacking of working directories. The only supported format is “zip”. The original directories and zip files are removed after packing or unpacking to avoid possible inconsistencies.

As an example scenario, we prepare a FileManager object with the current working directory folder containing the files file1.txt and file2.txt:

>>> from hydpy.core.filetools import FileManager
>>> filemanager = FileManager()
>>> filemanager.BASEDIR = "basename"
>>> filemanager.DEFAULTDIR = None
>>> filemanager.projectdir = "projectname"
>>> import os
>>> from hydpy import pub, repr_, TestIO
>>> TestIO.clear()
>>> basepath = "projectname/basename"
>>> with TestIO():
...     os.makedirs(basepath)
...     filemanager.currentdir = "folder"
...     open(f"{basepath}/folder/file1.txt", "w").close()
...     open(f"{basepath}/folder/file2.txt", "w").close()
...     filemanager.filenames
['file1.txt', 'file2.txt']

The directories under the base path are identical to the ones returned by property availabledirs:

>>> with TestIO():
...     assert os.listdir(basepath) == ["folder"]
...     filemanager.availabledirs  
Folder2Path(folder=.../projectname/basename/folder)

After manually packing the current working directory, it still counts as an available directory:

>>> with TestIO(), pub.options.printprogress(True):
...     filemanager.zip_currentdir()
...     assert os.listdir(basepath) == ["folder.zip"]
...     filemanager.availabledirs  
Directory ...folder has been removed.
Folder2Path(folder=.../projectname/basename/folder.zip)

Instead of the complete directory, only its files are packed:

>>> from zipfile import ZipFile
>>> with TestIO():
...     with ZipFile("projectname/basename/folder.zip", "r") as zp:
...         assert sorted(zp.namelist()) == ["file1.txt", "file2.txt"]

The zip file is unpacked again when folder becomes the current working directory:

>>> with TestIO(), pub.options.printprogress(True):  
...     filemanager.currentdir = "folder"
...     assert os.listdir(basepath) == ["folder"]
...     assert sorted(filemanager.filenames) == ["file1.txt", "file2.txt"]
...     filemanager.availabledirs
The zip file ...folder.zip has been extracted to directory ...folder and removed.
Folder2Path(folder=.../projectname/basename/folder)
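
The pack and unpack cycle can be reproduced with the standard zipfile module. The helpers pack_directory and unpack_directory are hypothetical and only sketch the behaviour described above (flat archive of the files, removal of the original after each step); they are not HydPy's implementation.

```python
import os
import shutil
import tempfile
from zipfile import ZipFile


def pack_directory(dirpath: str) -> str:
    """Zip the files of `dirpath` (without the folder itself), then remove it."""
    zippath = f"{dirpath}.zip"
    with ZipFile(zippath, "w") as zipfile_:
        for filename in sorted(os.listdir(dirpath)):
            # `arcname` stores only the file name, not the directory part.
            zipfile_.write(os.path.join(dirpath, filename), arcname=filename)
    shutil.rmtree(dirpath)
    return zippath


def unpack_directory(zippath: str) -> str:
    """Extract `zippath` into a sibling directory, then remove the zip file."""
    dirpath = zippath[: -len(".zip")]
    with ZipFile(zippath, "r") as zipfile_:
        zipfile_.extractall(dirpath)
    os.remove(zippath)
    return dirpath


with tempfile.TemporaryDirectory() as base:
    folder = os.path.join(base, "folder")
    os.makedirs(folder)
    for filename in ("file1.txt", "file2.txt"):
        open(os.path.join(folder, filename), "w").close()
    zippath = pack_directory(folder)
    after_packing = os.listdir(base)
    with ZipFile(zippath, "r") as zipfile_:
        members = sorted(zipfile_.namelist())
    unpack_directory(zippath)
    after_unpacking = os.listdir(base)
    restored = sorted(os.listdir(folder))
```

Keeping either the directory or the zip file, but never both, avoids the inconsistencies mentioned at the start of this section.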

class hydpy.core.filetools.NetworkManager

Bases: FileManager

Manager for network files.

The base and default folder names of class NetworkManager are:

>>> from hydpy.core.filetools import NetworkManager
>>> NetworkManager.BASEDIR
'network'
>>> NetworkManager.DEFAULTDIR
'default'

The documentation of base class FileManager explains most aspects of using NetworkManager objects. The following examples deal with the extended features of class NetworkManager: reading, writing, and removing network files. For this purpose, we prepare the example project HydPy-H-Lahn in the iotesting directory by calling function prepare_full_example_1():

>>> from hydpy.core.testtools import prepare_full_example_1
>>> prepare_full_example_1()

You can define the complete network structure of a HydPy project by an arbitrary number of “network files”. These valid Python files define Node and Element objects and their connections. Network files are allowed to overlap, meaning two or more files can define the same objects (in a consistent manner only, of course). The primary purpose of class NetworkManager is to execute each network file individually and pass its content to a Selection object, which is done by method load_files():

>>> networkmanager = NetworkManager()
>>> from hydpy import TestIO
>>> with TestIO():
...     networkmanager.projectdir = "HydPy-H-Lahn"
...     selections = networkmanager.load_files()

Method load_files() takes file names as selection names (without file endings):

>>> selections
Selections("headwaters", "nonheadwaters", "streams")
>>> selections.headwaters
Selection("headwaters",
          nodes=("dill_assl", "lahn_marb"),
          elements=("land_dill_assl", "land_lahn_marb"))
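
The idea of executing each file separately and naming the result after the file can be illustrated with the standard library. This is only a sketch: the helper load_namespaces is hypothetical, it collects plain namespaces instead of Selection objects, and the two stand-in files contain simple tuples rather than real Node and Element definitions.

```python
import runpy
import tempfile
from pathlib import Path


def load_namespaces(directory: Path) -> dict[str, dict]:
    """Execute every `*.py` file on its own and key the resulting
    namespaces by the file name without its suffix."""
    return {
        path.stem: runpy.run_path(str(path))
        for path in sorted(directory.glob("*.py"))
    }


with tempfile.TemporaryDirectory() as tmp:
    directory = Path(tmp)
    # Stand-in files; real network files define Node and Element objects.
    (directory / "headwaters.py").write_text('names = ("dill_assl", "lahn_marb")\n')
    (directory / "streams.py").write_text('names = ("lahn_leun",)\n')
    namespaces = load_namespaces(directory)
    selection_names = list(namespaces)
    print(selection_names)
```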

The whole set of Node and Element objects is accessible via the property complete:

>>> selections.complete
Selection("complete",
          nodes=("dill_assl", "lahn_kalk", "lahn_leun", "lahn_marb"),
          elements=("land_dill_assl", "land_lahn_kalk",
                    "land_lahn_leun", "land_lahn_marb",
                    "stream_dill_assl_lahn_leun",
                    "stream_lahn_leun_lahn_kalk",
                    "stream_lahn_marb_lahn_leun"))

Method save_files() writes all user-defined selections into separate files. First, we change the current working directory to ensure we do not overwrite already existing files:

>>> import os
>>> with TestIO():
...     networkmanager.currentdir = "testdir"
...     networkmanager.save_files(selections)
...     sorted(os.listdir("HydPy-H-Lahn/network/testdir"))
['headwaters.py', 'nonheadwaters.py', 'streams.py']

Reloading and comparing with the still available Selection objects proves that the contents of the original and the new network files are equivalent:

>>> with TestIO():
...     selections == networkmanager.load_files()
True

Method delete_files() removes the network files of the given Selection objects:

>>> selections -= selections.streams
>>> with TestIO():
...     networkmanager.delete_files(selections)
...     sorted(os.listdir("HydPy-H-Lahn/network/testdir"))
['streams.py']

When defining network files, many things can go wrong. In the following, we list all specialised error messages, which we hope are concrete enough to help find the relevant problems:

>>> with TestIO():
...     networkmanager.delete_files(["headwaters"])   
Traceback (most recent call last):
...
FileNotFoundError: While trying to remove the network files of the selection(s) `headwaters`, the following error occurred: ...

>>> with TestIO():
...     with open("HydPy-H-Lahn/network/testdir/streams.py", "w") as wrongfile:
...         _ = wrongfile.write("x = y")
...     networkmanager.load_files()   
Traceback (most recent call last):
...
NameError: While trying to load the network file `...streams.py`, the following error occurred: name 'y' is not defined

>>> with TestIO():
...     with open("HydPy-H-Lahn/network/testdir/streams.py", "w") as wrongfile:
...         _ = wrongfile.write("from hydpy import Node")
...     networkmanager.load_files()   
Traceback (most recent call last):
...
RuntimeError: The class Element cannot be loaded from the network file `...streams.py`.

>>> with TestIO():
...     with open("HydPy-H-Lahn/network/testdir/streams.py", "w") as wrongfile:
...         _ = wrongfile.write("from hydpy import Element")
...     networkmanager.load_files()   
Traceback (most recent call last):
...
RuntimeError: The class Node cannot be loaded from the network file `...streams.py`.

>>> import shutil
>>> with TestIO():
...     shutil.rmtree("HydPy-H-Lahn/network/testdir")
...     networkmanager.save_files(selections)   
Traceback (most recent call last):
...
FileNotFoundError: While trying to save the selection(s) `headwaters and nonheadwaters` into network files, the following error occurred: ...

BASEDIR: str = 'network'

DEFAULTDIR: str | None = 'default'

load_files() → Selections

Read all network files of the current working directory, structure their contents in a Selections object, and return it.

See the main documentation of class NetworkManager for further information.

save_files(selections: Iterable[Selection]) → None

Save the Selection objects contained in the given Selections instance to separate network files.

See the main documentation on class NetworkManager for further information.

delete_files(selections: Iterable[Selection]) → None

Delete the network files corresponding to the given selections (e.g. a list of str objects or a Selections object).

See the main documentation on class NetworkManager for further information.

class hydpy.core.filetools.ControlManager

Bases: FileManager

Manager for control parameter files.

The base and default folder names of class ControlManager are:

>>> from hydpy.core.filetools import ControlManager
>>> ControlManager.BASEDIR
'control'
>>> ControlManager.DEFAULTDIR
'default'

Class ControlManager extends the functionalities of class FileManager only slightly, which is why the documentation on class FileManager should serve as a good starting point for understanding class ControlManager. Also, see the documentation on method prepare_models() of class HydPy, which relies on the functionalities of class ControlManager.

BASEDIR: str = 'control'

DEFAULTDIR: str | None = 'default'

load_file(element: Element | None = None, filename: str | None = None, clear_registry: bool = True) → dict[str, Any]

Return the namespace of the given file (and possibly of its corresponding auxiliary subfiles).

By default, ControlManager clears the internal registry after loading a control file and all its corresponding auxiliary files. You can change this behaviour by passing False to the clear_registry argument, which might decrease model initialisation times significantly. However, then it is your own responsibility to call the method clear_registry() when necessary (usually before reloading a changed control file).
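
The trade-off behind the clear_registry argument is that of any cache: keeping entries saves repeated work but can return outdated content. The class CachedLoader below is a hypothetical sketch of this general pattern; how HydPy's registry is organised internally is not described here, so none of the names or details should be read as its actual design.

```python
import runpy
import tempfile
from pathlib import Path
from typing import Any


class CachedLoader:
    """Hypothetical loader that caches executed files in a registry."""

    def __init__(self) -> None:
        self.registry: dict[str, dict[str, Any]] = {}

    def load(self, path: str, clear_registry: bool = True) -> dict[str, Any]:
        # Execute the file only if it is not registered yet.
        if path not in self.registry:
            self.registry[path] = runpy.run_path(path)
        namespace = self.registry[path]
        if clear_registry:
            self.clear_registry()
        return namespace

    def clear_registry(self) -> None:
        self.registry.clear()


with tempfile.TemporaryDirectory() as tmp:
    path = str(Path(tmp) / "control.py")
    Path(path).write_text("value = 1\n")
    loader = CachedLoader()
    first = loader.load(path, clear_registry=False)["value"]
    Path(path).write_text("value = 2\n")
    stale = loader.load(path, clear_registry=False)["value"]  # served from cache
    loader.clear_registry()
    fresh = loader.load(path)["value"]  # file is executed again
    print(first, stale, fresh)
```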

One advantage of using method load_file() directly is that it supports reading control files that are not yet correctly integrated into a complete HydPy project by passing their names:

>>> from hydpy.core.testtools import prepare_full_example_1
>>> prepare_full_example_1()

>>> from hydpy.core.filetools import ControlManager
>>> controlmanager = ControlManager()
>>> from hydpy import pub, round_, TestIO
>>> pub.timegrids = "2000-01-01", "2001-01-01", "12h"
>>> with TestIO():
...     controlmanager.projectdir = "HydPy-H-Lahn"
...     results = controlmanager.load_file(filename="land_dill_assl")

>>> results["control"]
area(692.3)
nmbzones(12)
sclass(1)
zonetype(FIELD, FOREST, FIELD, FOREST, FIELD, FOREST, FIELD, FOREST,
         FIELD, FOREST, FIELD, FOREST)
zonearea(14.41, 7.06, 70.83, 84.36, 70.97, 198.0, 27.75, 130.0, 27.28,
         56.94, 1.09, 3.61)
psi(1.0)
zonez(2.0, 2.0, 3.0, 3.0, 4.0, 4.0, 5.0, 5.0, 6.0, 6.0, 7.0, 7.0)
pcorr(1.0)
pcalt(0.1)
rfcf(1.04283)
sfcf(1.1)
tcorr(0.0)
tcalt(0.6)
icmax(field=1.0, forest=1.5)
sfdist(1.0)
smax(inf)
sred(0.0)
tt(0.55824)
ttint(2.0)
dttm(0.0)
cfmax(field=4.55853, forest=2.735118)
cfvar(0.0)
gmelt(nan)
gvar(nan)
cfr(0.05)
whc(0.1)
fc(278.0)
beta(2.54011)
percmax(1.39636)
cflux(0.0)
resparea(True)
recstep(1200.0)
alpha(1.0)
k(0.005618)
k4(0.05646)
gamma(0.0)

>>> results["percmax"].values
0.69818
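
The value 0.69818 is half of the percmax(1.39636) entry shown in the listing. This is consistent with a parameter value defined per day being converted to the 12-hour simulation step set via pub.timegrids. That the control file uses a daily parameter step is an assumption of the following arithmetic sketch, not something the example states:

```python
from datetime import timedelta

parameterstep = timedelta(days=1)  # assumed reference step of the control file
simulationstep = timedelta(hours=12)  # step size passed to pub.timegrids above
percmax_per_parameterstep = 1.39636  # value shown in the parameter listing

# Scale the value by the ratio of the two step sizes (here 0.5).
percmax_per_simulationstep = percmax_per_parameterstep * (
    simulationstep / parameterstep
)
print(round(percmax_per_simulationstep, 5))
```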

Passing neither a filename nor an Element object raises the following error:

>>> controlmanager.load_file()
Traceback (most recent call last):
...
RuntimeError: When trying to load a control file you must either pass its name or the responsible Element object.

classmethod read2dict(filename: str, info: dict[str, Any]) → None

Read the control parameters from the given path (and its auxiliary paths, where appropriate) and store them in the given dict object info.

Note that info can be used to feed information into the execution of control files. Use this method only if you are entirely sure of how the control parameter import of HydPy works. Otherwise, you should most probably prefer to use the method load_file().

classmethod clear_registry() → None: Clear the internal registry from control file information.

save_file(filename: str, text: str) → None: Save the given text under the given control filename and the current path.

class hydpy.core.filetools.ConditionManager

Bases: FileManager

Manager for condition files.

The base folder name of class ConditionManager is:

>>> from hydpy.core.filetools import ConditionManager
>>> ConditionManager.BASEDIR
'conditions'

Class ConditionManager generally works like class FileManager. The following examples, based on the HydPy-H-Lahn example project, explain the additional functionalities of the ConditionManager-specific properties inputpath and outputpath:

>>> from hydpy.core.testtools import prepare_full_example_2
>>> hp, pub, TestIO = prepare_full_example_2()

If the name of the current directory is not defined explicitly, both properties construct it from the actual simulation start or end date, respectively:

>>> from hydpy import repr_
>>> with TestIO(), pub.options.printprogress(True):  
...     repr_(pub.conditionmanager.inputpath)
...     repr_(pub.conditionmanager.outputpath)
The condition manager's current working directory is not defined explicitly.  Hence, the condition manager reads its data from a directory named `init_1996_01_01_00_00_00`.
'.../hydpy/tests/iotesting/HydPy-H-Lahn/conditions/init_1996_01_01_00_00_00'
The condition manager's current working directory is not defined explicitly.  Hence, the condition manager writes its data to a directory named `init_1996_01_05_00_00_00`.
Directory ...init_1996_01_05_00_00_00 has been created.
'.../hydpy/tests/iotesting/HydPy-H-Lahn/conditions/init_1996_01_05_00_00_00'

>>> pub.timegrids.sim.firstdate += "1d"
>>> pub.timegrids.sim.lastdate -= "1d"
>>> pub.timegrids
Timegrids(init=Timegrid("1996-01-01 00:00:00",
                        "1996-01-05 00:00:00",
                        "1d"),
          sim=Timegrid("1996-01-02 00:00:00",
                       "1996-01-04 00:00:00",
                       "1d"),
          eval_=Timegrid("1996-01-01 00:00:00",
                         "1996-01-05 00:00:00",
                         "1d"))

>>> with TestIO():  
...     repr_(pub.conditionmanager.inputpath)
...     repr_(pub.conditionmanager.outputpath)
'.../hydpy/tests/iotesting/HydPy-H-Lahn/conditions/init_1996_01_02_00_00_00'
'.../hydpy/tests/iotesting/HydPy-H-Lahn/conditions/init_1996_01_04_00_00_00'

Use the property currentdir to change the values of both properties:

>>> with TestIO():  
...     pub.conditionmanager.currentdir = "test"
...     repr_(pub.conditionmanager.inputpath)
...     repr_(pub.conditionmanager.outputpath)
'.../hydpy/tests/iotesting/HydPy-H-Lahn/conditions/test'
'.../hydpy/tests/iotesting/HydPy-H-Lahn/conditions/test'

After deleting the custom value of property currentdir, the properties inputpath and outputpath work as before:

>>> with TestIO():  
...     del pub.conditionmanager.currentdir
...     repr_(pub.conditionmanager.inputpath)
...     repr_(pub.conditionmanager.outputpath)
'.../hydpy/tests/iotesting/HydPy-H-Lahn/conditions/init_1996_01_02_00_00_00'
'.../hydpy/tests/iotesting/HydPy-H-Lahn/conditions/init_1996_01_04_00_00_00'

Use the prefix option to configure the automatically determined folder names:

>>> with TestIO(), pub.conditionmanager.prefix("condi"):  
...     repr_(pub.conditionmanager.inputpath)
...     repr_(pub.conditionmanager.outputpath)
'.../hydpy/tests/iotesting/HydPy-H-Lahn/conditions/condi_1996_01_02_00_00_00'
'.../hydpy/tests/iotesting/HydPy-H-Lahn/conditions/condi_1996_01_04_00_00_00'
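
The directory names in the outputs above follow the pattern prefix_YYYY_MM_DD_hh_mm_ss. A minimal sketch of that naming scheme, using a hypothetical helper condition_dirname and plain datetime objects instead of HydPy's own date handling, could look like this:

```python
from datetime import datetime


def condition_dirname(date: datetime, prefix: str = "init") -> str:
    """Build a date-based directory name such as `init_1996_01_02_00_00_00`."""
    return f"{prefix}_{date:%Y_%m_%d_%H_%M_%S}"


# Simulation start and end dates taken from the example above.
input_name = condition_dirname(datetime(1996, 1, 2))
output_name = condition_dirname(datetime(1996, 1, 4), prefix="condi")
print(input_name, output_name)
```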

The date-based construction of directory names requires a Timegrids object available in module pub:

>>> del pub.timegrids
>>> with TestIO():  
...     repr_(pub.conditionmanager.inputpath)
Traceback (most recent call last):
...
hydpy.core.exceptiontools.AttributeNotReady: While trying to determine the currently relevant input path for loading conditions file, the following error occurred: Attribute timegrids of module `pub` is not defined at the moment.

>>> del pub.timegrids
>>> with TestIO():  
...     repr_(pub.conditionmanager.outputpath)
Traceback (most recent call last):
...
hydpy.core.exceptiontools.AttributeNotReady: While trying to determine the currently relevant output path for saving conditions file, the following error occurred: Attribute timegrids of module `pub` is not defined at the moment.

BASEDIR: str = 'conditions'

DEFAULTDIR: str | None = None

prefix

The prefix of the automatically determined, time-dependent condition directory names.

The default prefix is init:

>>> from hydpy.core.testtools import prepare_full_example_2
>>> hp, pub, TestIO = prepare_full_example_2()
>>> cm = pub.conditionmanager
>>> with TestIO():
...     assert cm.inputpath.endswith("init_1996_01_01_00_00_00")
...     assert cm.outputpath.endswith("init_1996_01_05_00_00_00")

For example, you can vary the prefix to store the conditions of different ensemble members in separate directories:

>>> with TestIO(), cm.prefix("member_01"):
...     assert cm.inputpath.endswith("member_01_1996_01_01_00_00_00")
...     assert cm.outputpath.endswith("member_01_1996_01_05_00_00_00")         

property inputpath: str

The directory path for loading initial conditions.

See the main documentation on class ConditionManager and its option prefix for further information.

property outputpath: str

The directory path for saving (final) conditions.

See the main documentation on class ConditionManager and its option prefix for further information.

class hydpy.core.filetools.SequenceManager

Bases: FileManager

Manager for sequence files.

Usually, there is only one SequenceManager used within each HydPy project, stored in module pub. This object is responsible for the actual I/O tasks related to IOSequence objects.

When working with a complete HydPy project, one often does not use the SequenceManager directly, unless one wishes to load or save time series data in a way different from the default settings. The following examples show the essential features of class SequenceManager based on the example project configuration defined by function prepare_io_example_1().

We prepare the project and select one 0-dimensional sequence of type Sim and one 1-dimensional sequence of type NKor for the following examples:

>>> from hydpy.core.testtools import prepare_io_example_1
>>> nodes, elements = prepare_io_example_1()
>>> sim = nodes.node2.sequences.sim
>>> nkor = elements.element2.model.sequences.fluxes.nkor

We store the time series data of both sequences in ASCII files (the methods save_file() and save_series() are interchangeable here; the latter is only a convenience function for the former):

>>> from hydpy import pub
>>> pub.sequencemanager.filetype = "asc"
>>> from hydpy import TestIO
>>> with TestIO():
...     pub.sequencemanager.save_file(sim)
...     nkor.save_series()

We can load the file content from the output directory defined by prepare_io_example_1() and print it to check this was successful:

>>> import os
>>> from hydpy import round_
>>> def print_file(filename):
...     path = os.path.join("project", "series", "default", filename)
...     with TestIO():
...         with open(path) as file_:
...             lines = file_.readlines()
...     print("".join(lines[:3]), end="")
...     for line in lines[3:]:
...         round_([float(x) for x in line.split()])

>>> print_file("node2_sim_t.asc")
Timegrid("2000-01-01 00:00:00+01:00",
         "2000-01-05 00:00:00+01:00",
         "1d")
64.0
65.0
66.0
67.0
>>> print_file("element2_lland_dd_flux_nkor.asc")
Timegrid("2000-01-01 00:00:00+01:00",
         "2000-01-05 00:00:00+01:00",
         "1d")
16.0, 17.0
18.0, 19.0
20.0, 21.0
22.0, 23.0

To show that reloading the data works, we set the values of the time series of both objects to zero and recover the original values afterwards:

>>> sim.series = 0.0
>>> sim.series
InfoArray([0., 0., 0., 0.])
>>> nkor.series = 0.0
>>> nkor.series
InfoArray([[0., 0.],
           [0., 0.],
           [0., 0.],
           [0., 0.]])
>>> with TestIO():
...     pub.sequencemanager.load_file(sim)
...     nkor.load_series()
>>> sim.series
InfoArray([64., 65., 66., 67.])
>>> nkor.series
InfoArray([[16., 17.],
           [18., 19.],
           [20., 21.],
           [22., 23.]])

We now write two files that do not span the initialisation period:

>>> with TestIO():
...     for filename in ("incomplete_1.asc", "incomplete_2.asc"):
...         path = os.path.join("project", "series", "default", filename)
...         with open(path, "w") as file_:
...             _ = file_.write('Timegrid("2000-01-02 00:00:00+01:00",\n'
...                             '         "2000-01-04 00:00:00+01:00",\n'
...                             '         "1d")\n')
...             for value in (1.0, 2.0):
...                 if filename == "incomplete_1.asc":
...                     _ = file_.write(f"{value}\n")
...                 else:
...                     _ = file_.write(f"{value} {value + 1.0}\n")

>>> print_file("incomplete_1.asc")
Timegrid("2000-01-02 00:00:00+01:00",
         "2000-01-04 00:00:00+01:00",
         "1d")
1.0
2.0

>>> print_file("incomplete_2.asc")
Timegrid("2000-01-02 00:00:00+01:00",
         "2000-01-04 00:00:00+01:00",
         "1d")
1.0, 2.0
2.0, 3.0

By default, trying to read such incomplete files results in an error:

>>> sim.filename = "incomplete_1.asc"
>>> nkor.filename = "incomplete_2.asc"
>>> with TestIO():  
...     pub.sequencemanager.load_file(sim)
Traceback (most recent call last):
...
RuntimeError: While trying to load the time series data of sequence `sim` of node `node2`, the following error occurred: For sequence `sim` of node `node2` the initialisation time grid (Timegrid("2000-01-01 00:00:00", "2000-01-05 00:00:00", "1d")) does not define a subset of the time grid of the data file `...incomplete_1.asc` (Timegrid("2000-01-02 00:00:00", "2000-01-04 00:00:00", "1d")).
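
One way to picture the check behind this error is a simple interval comparison: the period of the data file must cover the complete initialisation period. The following sketch illustrates that idea only. The names `Grid` and `covers` are hypothetical stand-ins and not HydPy's Timegrid API, and the sketch assumes that both grids must share the same step size:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Grid:
    """A minimal stand-in for a time grid (hypothetical, not HydPy's Timegrid)."""

    first: datetime
    last: datetime
    step: timedelta


def covers(file_grid: Grid, init_grid: Grid) -> bool:
    """Return True if the file grid spans the complete initialisation grid."""
    return (
        file_grid.step == init_grid.step
        and file_grid.first <= init_grid.first
        and file_grid.last >= init_grid.last
    )


day = timedelta(days=1)
init = Grid(datetime(2000, 1, 1), datetime(2000, 1, 5), day)
incomplete = Grid(datetime(2000, 1, 2), datetime(2000, 1, 4), day)
print(covers(init, init))        # True
print(covers(incomplete, init))  # False
```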

Setting option checkseries to False turns this safety mechanism off:

>>> with TestIO(), pub.options.checkseries(False):
...     pub.sequencemanager.load_file(sim)
...     nkor.load_series()
>>> sim.series
InfoArray([nan,  1.,  2., nan])
>>> nkor.series
InfoArray([[nan, nan],
           [ 1.,  2.],
           [ 2.,  3.],
           [nan, nan]])

Note that all previously available data outside the period covered by the read files has been set to nan, another safety mechanism to avoid accidentally mixing data. If you instead want to mix data from different sources, set option reset to False:

>>> sim.series = 5.0, 6.0, 7.0, 8.0
>>> nkor.series = [[5.0, 6.0], [6.0, 7.0], [7.0, 8.0], [8.0, 9.0]]
>>> with TestIO(), pub.options.checkseries(False), pub.sequencemanager.reset(False):
...     pub.sequencemanager.load_file(sim)
...     nkor.load_series()
>>> sim.series
InfoArray([5., 1., 2., 8.])
>>> nkor.series
InfoArray([[5., 6.],
           [1., 2.],
           [2., 3.],
           [8., 9.]])
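
The effect of option reset on incomplete files can be reproduced with plain Python lists. This is a sketch of the observable behaviour only, not of HydPy's implementation. The function `merge_series` and its `offset` argument are hypothetical:

```python
import math


def merge_series(old, new, offset, reset=True):
    """Insert `new` into a copy of `old`, starting at index `offset`.

    With `reset=True`, values outside the period covered by `new` become nan.
    With `reset=False`, they keep their previous values.
    """
    merged = [math.nan] * len(old) if reset else list(old)
    merged[offset:offset + len(new)] = new
    return merged


old = [5.0, 6.0, 7.0, 8.0]
print(merge_series(old, [1.0, 2.0], offset=1))               # [nan, 1.0, 2.0, nan]
print(merge_series(old, [1.0, 2.0], offset=1, reset=False))  # [5.0, 1.0, 2.0, 8.0]
```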

We reset the file names and data for the remaining tests:

>>> del sim.filename
>>> del nkor.filename
>>> with TestIO():
...     pub.sequencemanager.load_file(sim)
...     nkor.load_series()

Wrongly formatted ASCII files and incomplete data should result in understandable error messages:

>>> path = os.path.join("project", "series", "default", "node2_sim_t.asc")
>>> with TestIO():
...     with open(path) as file_:
...         right = file_.read()
...     wrong = right.replace("Timegrid", "timegrid")
...     with open(path, "w") as file_:
...         _ = file_.write(wrong)
>>> with TestIO():
...     pub.sequencemanager.load_file(sim)
Traceback (most recent call last):
...
NameError: While trying to load the time series data of sequence `sim` of node `node2`, the following error occurred: name 'timegrid' is not defined

>>> sim_series = sim.series.copy()
>>> with TestIO():
...     lines = right.split("\n")
...     lines[5] = "nan"
...     wrong = "\n".join(lines)
...     with open(path, "w") as file_:
...         _ = file_.write(wrong)
>>> with TestIO():
...     pub.sequencemanager.load_file(sim)
Traceback (most recent call last):
...
RuntimeError: While trying to load the time series data of sequence `sim` of node `node2`, the following error occurred: The series array of sequence `sim` of node `node2` contains 1 nan value.
>>> sim.series = sim_series

By default, overwriting existing time series files is disabled:

>>> with TestIO():
...     sim.save_series()   
Traceback (most recent call last):
...
OSError: While trying to save the time series data of sequence `sim` of node `node2`, the following error occurred: Sequence `sim` of node `node2` is not allowed to overwrite the existing file `...`.
>>> pub.sequencemanager.overwrite = True
>>> with TestIO():
...     sim.save_series()

When a sequence comes with a weighting parameter referenced by property refweights, one can save the averaged time series by using the method save_mean():

>>> with TestIO():
...     nkor.save_mean()
>>> print_file("element2_lland_dd_flux_nkor_mean.asc")
Timegrid("2000-01-01 00:00:00+01:00",
         "2000-01-05 00:00:00+01:00",
         "1d")
16.5
18.5
20.5
22.5

Method save_mean() is strongly related to method average_series(), meaning one can pass the same arguments. We show this by changing the land use classes of element2 (parameter Lnk) to field (ACKER) and water (WASSER) and averaging the values of sequence NKor for the single field area only:

>>> from hydpy.models.lland_dd import ACKER, WASSER
>>> nkor.subseqs.seqs.model.parameters.control.lnk = ACKER, WASSER
>>> with TestIO():
...     nkor.save_mean("acker")
>>> print_file("element2_lland_dd_flux_nkor_mean.asc")
Timegrid("2000-01-01 00:00:00+01:00",
         "2000-01-05 00:00:00+01:00",
         "1d")
16.0
18.0
20.0
22.0
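
The two results can be reproduced with a small weighted average that optionally restricts the calculation to selected units. This is only a sketch: the function `weighted_mean` is hypothetical, and the equal weights are an assumption that happens to match the averaged values shown above:

```python
def weighted_mean(values, weights, mask=None):
    """Average `values` using `weights`, optionally restricted by a boolean mask."""
    if mask is None:
        mask = [True] * len(values)
    selected = [(v, w) for v, w, m in zip(values, weights, mask) if m]
    total = sum(w for _, w in selected)
    return sum(v * w for v, w in selected) / total


rows = [[16.0, 17.0], [18.0, 19.0], [20.0, 21.0], [22.0, 23.0]]
weights = [1.0, 1.0]  # assumption: both units have equal weights
all_units = [weighted_mean(row, weights) for row in rows]
first_only = [weighted_mean(row, weights, mask=[True, False]) for row in rows]
print(all_units)   # [16.5, 18.5, 20.5, 22.5]
print(first_only)  # [16.0, 18.0, 20.0, 22.0]
```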

Under the default setting of option reprdigits (-1), the value 0.12345678 is written with six significant digits:

>>> nodes.node1.sequences.sim.series = 0.12345678
>>> with TestIO(), pub.options.reprdigits(-1):
...     nodes.node1.sequences.sim.save_series()
>>> print_file("node1_sim_q.asc")
Timegrid("2000-01-01 00:00:00+01:00",
         "2000-01-05 00:00:00+01:00",
         "1d")
0.123457
0.123457
0.123457
0.123457

If you set this option to two, for example, all numbers are written in decimal form with at most two decimal places:

>>> with TestIO(), pub.options.reprdigits(2):
...     nodes.node1.sequences.sim.save_series()
>>> print_file("node1_sim_q.asc")
Timegrid("2000-01-01 00:00:00+01:00",
         "2000-01-05 00:00:00+01:00",
         "1d")
0.12
0.12
0.12
0.12

Another option is to store data in numpy binary files, which saves computation time but can be problematic when sharing data with colleagues:

>>> pub.sequencemanager.filetype = "npy"
>>> with TestIO():
...     sim.save_series()
...     nkor.save_series()

The time information (without time zone information) is available within the first thirteen entries:

>>> path = os.path.join("project", "series", "default", "node2_sim_t.npy")
>>> import numpy
>>> from hydpy import print_vector, print_matrix
>>> with TestIO():
...     print_vector(numpy.load(path))
2000.0, 1.0, 1.0, 0.0, 0.0, 0.0, 2000.0, 1.0, 5.0, 0.0, 0.0, 0.0,
86400.0, 64.0, 65.0, 66.0, 67.0
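
Based on the printed vector, the thirteen leading entries can be read as two dates with six entries each, followed by the step size in seconds. The following sketch decodes such a flat vector with the standard library only. The function `split_array` is hypothetical, and the assumed layout is inferred from the 0-dimensional example above rather than taken from a format specification:

```python
from datetime import datetime, timedelta


def split_array(entries):
    """Split a flat vector into first date, last date, step size, and data.

    Assumes six entries for the first date, six for the last date, one for
    the step size in seconds, and the series values afterwards.
    """
    first = datetime(*(int(x) for x in entries[0:6]))
    last = datetime(*(int(x) for x in entries[6:12]))
    step = timedelta(seconds=entries[12])
    return first, last, step, list(entries[13:])


raw = [2000.0, 1.0, 1.0, 0.0, 0.0, 0.0, 2000.0, 1.0, 5.0, 0.0, 0.0, 0.0,
       86400.0, 64.0, 65.0, 66.0, 67.0]
first, last, step, values = split_array(raw)
print(first, last, step, values)
```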

Reloading the data works as expected:

>>> sim.series = 0.0
>>> nkor.series = 0.0
>>> with TestIO():
...     sim.load_series()
...     nkor.load_series()
>>> print_vector(sim.series)
64.0, 65.0, 66.0, 67.0
>>> print_matrix(nkor.series)
| 16.0, 17.0 |
| 18.0, 19.0 |
| 20.0, 21.0 |
| 22.0, 23.0 |

Writing mean values into numpy binary files is also supported:

>>> import numpy
>>> from hydpy import print_vector
>>> path = os.path.join(
...     "project", "series", "default", "element2_lland_dd_flux_nkor_mean.npy")
>>> with TestIO():
...     nkor.save_mean("wasser")
...     print_vector(numpy.load(path)[-4:])
17.0, 19.0, 21.0, 23.0

Generally, trying to load or save data for “deactivated” sequences results in an error like the following, shown here for saving:

>>> nkor.prepare_series(allocate_ram=False)
>>> with TestIO(clear_all=True):
...     pub.sequencemanager.save_file(nkor)
Traceback (most recent call last):
...
hydpy.core.exceptiontools.AttributeNotReady: Sequence `nkor` of element `element2` is not requested to make any time series data available.

The third option is to store data in NetCDF files, which is explained separately in the documentation on module netcdftools.

SUPPORTED_MODES = ('npy', 'asc', 'nc')¶

BASEDIR: str = 'series'¶

DEFAULTDIR: str | None = 'default'¶

filetype¶

Currently active time series file type.

filetype is an option based on OptionPropertySeriesFileType. See its documentation for further information.

reset¶

A flag that indicates whether to reset already available time series data before reading incomplete time series files.

reset is an option based on OptionPropertyBool. See its documentation for further information.

overwrite¶

Currently active overwrite flag for time series files.

overwrite is an option based on OptionPropertyBool. See its documentation for further information.

aggregation¶

Currently active aggregation mode for writing time series files.

aggregation is an option based on OptionPropertySeriesAggregation. See its documentation for further information.

convention¶

Currently selected naming convention for reading and writing input time series files.

convention is an option based on OptionPropertySeriesConvention. See its documentation for further information.

load_file(sequence: IOSequence) → None[source]¶

Load data from a data file and pass it to the given IOSequence.

save_file(sequence: IOSequence, array: InfoArray | None = None) → None[source]¶

Write the data stored in the series property of the given IOSequence into a data file.

property netcdfreader: NetCDFInterfaceReader¶

A NetCDFInterfaceReader object prepared by method open_netcdfreader() and to be finalised by method close_netcdfreader().

>>> from hydpy.core.filetools import SequenceManager
>>> sm = SequenceManager()
>>> sm.netcdfreader
Traceback (most recent call last):
...
hydpy.core.exceptiontools.AttributeNotReady: The sequence file manager currently handles no NetCDF reader object. Consider applying the `pub.sequencemanager.netcdfreading` context manager first (search in the documentation for help).

>>> sm.open_netcdfreader()
>>> from hydpy import classname
>>> classname(sm.netcdfreader)
'NetCDFInterfaceReader'

>>> sm.close_netcdfreader()
>>> sm.netcdfreader
Traceback (most recent call last):
...
hydpy.core.exceptiontools.AttributeNotReady: The sequence file manager currently handles no NetCDF reader object. Consider applying the `pub.sequencemanager.netcdfreading` context manager first (search in the documentation for help).

open_netcdfreader() → None[source]¶

Prepare a new NetCDFInterfaceReader object for reading data.

close_netcdfreader() → None[source]¶

Read data with a prepared NetCDFInterfaceReader object and delete it afterwards.

netcdfreading() → Iterator[None][source]¶

Prepare a new NetCDFInterfaceReader object for collecting data at the beginning of a with-block, then read the data and delete the object at the end of the same with-block.

property netcdfwriter: NetCDFInterfaceWriter¶

A NetCDFInterfaceWriter object prepared by method open_netcdfwriter() and to be finalised by method close_netcdfwriter().

>>> from hydpy.core.filetools import SequenceManager
>>> sm = SequenceManager()
>>> sm.netcdfwriter
Traceback (most recent call last):
...
hydpy.core.exceptiontools.AttributeNotReady: The sequence file manager currently handles no NetCDF writer object. Consider applying the `pub.sequencemanager.netcdfwriting` context manager first (search in the documentation for help).

>>> sm.open_netcdfwriter()
>>> from hydpy import classname
>>> classname(sm.netcdfwriter)
'NetCDFInterfaceWriter'

>>> sm.close_netcdfwriter()
>>> sm.netcdfwriter
Traceback (most recent call last):
...
hydpy.core.exceptiontools.AttributeNotReady: The sequence file manager currently handles no NetCDF writer object. Consider applying the `pub.sequencemanager.netcdfwriting` context manager first (search in the documentation for help).

open_netcdfwriter() → None[source]¶

Prepare a new NetCDFInterfaceWriter object for writing data.

close_netcdfwriter() → None[source]¶

Write data with a prepared NetCDFInterfaceWriter object and delete it afterwards.

netcdfwriting() → Iterator[None][source]¶

Prepare a new NetCDFInterfaceWriter object for collecting data at the beginning of a with-block, then write the data and delete the object at the end of the same with-block.
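
The open/collect/close pattern described for the NetCDF reader and writer can be illustrated with a generic context manager. The class `Manager` and all of its members are hypothetical stand-ins, not HydPy's SequenceManager. Whether the real methods finalise the object when an exception occurs inside the with-block is an assumption of this sketch:

```python
from contextlib import contextmanager


class Manager:
    """Hypothetical stand-in for a file manager; not HydPy's SequenceManager."""

    def __init__(self):
        self._writer = None
        self.written = []

    def open_writer(self):
        # Prepare a fresh buffer that collects the data to be written.
        self._writer = []

    def close_writer(self):
        # Write the collected data in one go and delete the buffer afterwards.
        self.written.extend(self._writer)
        self._writer = None

    def log(self, item):
        if self._writer is None:
            raise RuntimeError("No writer object available.")
        self._writer.append(item)

    @contextmanager
    def writing(self):
        self.open_writer()
        try:
            yield
        finally:
            self.close_writer()


manager = Manager()
with manager.writing():
    manager.log("sim")
    manager.log("nkor")
print(manager.written)  # ['sim', 'nkor']
```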

provide_netcdfjitaccess(deviceorder: Iterable[Node | Element]) → Iterator[None][source]¶

Open all required internal NetCDF time series files.

This method is only relevant for reading data from or writing data to NetCDF files “just in time” during simulation runs. See the main documentation on class HydPy for further information.

read_netcdfslices(idx: int) → None[source]¶

Read the time slice relevant to the current simulation step.

This method is only relevant for reading data from or writing data to NetCDF files “just in time” during simulation runs. See the main documentation on class HydPy for further information.

write_netcdfslices(idx: int) → None[source]¶

Write the time slice relevant to the current simulation step.

This method is only relevant for reading data from or writing data to NetCDF files “just in time” during simulation runs. See the main documentation on class HydPy for further information.

hydpy.core.filetools.check_projectstructure(projectpath: str) → None[source]¶

Raise a warning if the given project root directory does not exist or does not contain all relevant base directories.

First, check_projectstructure() checks if the root directory exists:

>>> from hydpy import check_projectstructure, HydPy, pub, TestIO
>>> TestIO.clear()
>>> with TestIO():
...     check_projectstructure("my_project")  
Traceback (most recent call last):
...
UserWarning: The project root directory `...my_project` does not exists.

Second, it lists all missing base directories:

>>> import os
>>> from hydpy.core.testtools import warn_later
>>> with TestIO(), warn_later(), pub.options.checkprojectstructure(True):
...     os.makedirs(os.path.join("my_project", "control"))
...     hp = HydPy("my_project")  
UserWarning: The project root directory ...my_project has no base directory named `network` as required by the network manager.
UserWarning: The project root directory ...my_project has no base directory named `conditions` as required by the condition manager.
UserWarning: The project root directory ...my_project has no base directory named `series` as required by the sequence manager.

Note that class HydPy calls function check_projectstructure() automatically if option checkprojectstructure is enabled:

>>> TestIO.clear()
>>> with TestIO(), pub.options.checkprojectstructure(False):
...     hp = HydPy("my_project")

>>> with TestIO(), pub.options.checkprojectstructure(True):
...     hp = HydPy("my_project")  
Traceback (most recent call last):
...
UserWarning: The project root directory `...my_project` does not exists.
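
The two-stage check described above can be sketched with the standard library. The function `check_structure`, the constant `BASEDIRS`, and the shortened warning texts are hypothetical and only approximate the behaviour shown in the examples:

```python
import os
import tempfile
import warnings

BASEDIRS = ("network", "control", "conditions", "series")


def check_structure(projectpath):
    """Warn if the root directory or any base directory is missing (sketch)."""
    if not os.path.exists(projectpath):
        warnings.warn(f"The project root directory `{projectpath}` does not exist.")
        return
    for basedir in BASEDIRS:
        if not os.path.exists(os.path.join(projectpath, basedir)):
            warnings.warn(
                f"The project root directory `{projectpath}` has no base "
                f"directory named `{basedir}`."
            )


with tempfile.TemporaryDirectory() as tmp:
    root = os.path.join(tmp, "my_project")
    os.makedirs(os.path.join(root, "control"))
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        check_structure(root)
messages = [str(warning.message) for warning in caught]
for message in messages:
    print(message)
```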

hydpy.core.filetools.create_projectstructure(projectpath: str, overwrite: bool = False) → None[source]¶

Make the given project root directory and its base directories.

If everything works well, function create_projectstructure() creates the required directories silently:

>>> from hydpy import create_projectstructure, TestIO
>>> from hydpy.core.testtools import print_filestructure
>>> TestIO.clear()
>>> with TestIO():
...     create_projectstructure("my_project")
...     print_filestructure("my_project")  
* ...my_project
    - conditions
    - control
    - network
    - series

If the root directory already exists, it does not make any changes and instead raises the following error by default:

>>> with TestIO():
...     os.makedirs(os.path.join("my_project", "zap"))
...     create_projectstructure("my_project")  
Traceback (most recent call last):
...
FileExistsError: While trying to create the basic directory structure for project `my_project`the directory ...iotesting, the following error occurred: The root directory already exists and overwriting is not allowed.
>>> with TestIO():
...     print_filestructure("my_project")  
* ...my_project
    - conditions
    - control
    - network
    - series
    - zap

Use the overwrite flag to let function create_projectstructure() remove the existing directory and make a new one:

>>> with TestIO():
...     create_projectstructure("my_project", overwrite=True)
...     print_filestructure("my_project")  
* ...my_project
    - conditions
    - control
    - network
    - series
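
The behaviour of the overwrite flag can be approximated with `os.makedirs` and `shutil.rmtree`. This is a sketch under stated assumptions: the function `create_structure` and the constant `BASEDIRS` are hypothetical, and the real function may differ in its details:

```python
import os
import shutil
import tempfile

BASEDIRS = ("conditions", "control", "network", "series")


def create_structure(projectpath, overwrite=False):
    """Create the root directory and its base directories (sketch)."""
    if os.path.exists(projectpath):
        if not overwrite:
            raise FileExistsError(
                "The root directory already exists and overwriting is not allowed."
            )
        shutil.rmtree(projectpath)  # remove the old directory completely
    for basedir in BASEDIRS:
        os.makedirs(os.path.join(projectpath, basedir))


with tempfile.TemporaryDirectory() as tmp:
    root = os.path.join(tmp, "my_project")
    create_structure(root)
    os.makedirs(os.path.join(root, "zap"))
    try:
        create_structure(root)
        refused = False
    except FileExistsError:
        refused = True
    create_structure(root, overwrite=True)
    remaining = sorted(os.listdir(root))
print(refused, remaining)
```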
