radis.io.cache_files module

Tools to deal with HDF5 cache files. HDF5 cache files are used to cache Energy Database files and Line Database files, and yield much faster access times.

See also

hit2df(), cdsd2df()

cache_file_name(fname)[source]

Return the corresponding cache file name for fname.

check_cache_file(fcache, use_cached=True, expected_metadata={}, verbose=True, engine='auto')[source]

Quick function that checks the status of a cache file generated by RADIS.

The function first checks the existence of fcache. What it does depends on the value of use_cached:

  • if True, check that it exists, and remove the file if it is not valid.

  • if 'regen', delete the cache file even if it is valid, to regenerate it later.

  • if 'force', raise an error if the file doesn't exist.

Then check whether the file is deprecated (only the attributes are looked at; the file is never fully read). Deprecation is checked by check_not_deprecated(), which compares the file metadata against the expected values.

  • if deprecated, delete it to regenerate it later, unless 'force' was used.
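The decision logic described above can be sketched in a few lines. This is a minimal illustration, not the RADIS implementation: `resolve_cache` and the `is_deprecated` callable are hypothetical names, `DeprecatedFileWarning` is a local stand-in for the RADIS class, and both the handling of `use_cached=False` and the use of `FileNotFoundError` for a missing file in 'force' mode are assumptions.

```python
import os


class DeprecatedFileWarning(Warning):
    """Local stand-in for the RADIS warning of the same name."""


def resolve_cache(fcache, use_cached, is_deprecated):
    """Return True if the cache file can be used, False otherwise.

    is_deprecated is a hypothetical callable that takes the file path and
    returns True when the stored metadata no longer matches.
    """
    if use_cached is False:
        return False  # assumption: the cache is simply bypassed

    if not os.path.exists(fcache):
        if use_cached == "force":
            # assumption: the exact exception type is not given in the text
            raise FileNotFoundError(f"cache file {fcache} does not exist")
        return False

    if use_cached == "regen":
        os.remove(fcache)  # deleted even if valid, to be regenerated later
        return False

    if is_deprecated(fcache):
        if use_cached == "force":
            raise DeprecatedFileWarning(f"{fcache} is deprecated")
        os.remove(fcache)  # deleted so the caller regenerates it
        return False

    return True
```

The caller is expected to regenerate the cache whenever the function returns False, and to catch the exception itself in 'force' mode.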

Parameters
  • fcache (str) – cache file name

  • use_cached (True, False, 'force', 'regen') – see notes above. Default True.

  • expected_metadata (dict) – attributes to check

  • verbose (boolean) – if True, print status messages

Returns

whether the file was valid or not (and was removed). Raises a DeprecatedFileWarning for an invalid file in mode 'force'. The error can be caught by the parent function.

Return type

None

check_not_deprecated(file, metadata_is={}, metadata_keys_contain=[], current_version=None, last_compatible_version='0.9.1', engine='auto')[source]

Make sure cache file is not deprecated: checks that metadata is the same, and that the version under which the file was generated is valid.

Parameters
  • file (str) – a .h5 cache file for Energy Levels

  • metadata_is (dict) – expected values for these variables in the file metadata. If the values don't match, a DeprecatedFileWarning() error is raised. If the file metadata contains additional keys/values, no error is raised.

  • metadata_keys_contain (list) – expected list of variables in the file metadata. If the keys are not there, a DeprecatedFileWarning() error is raised.

Other Parameters
  • current_version (str, or None) – current version number. If the file was generated in a previous version, a warning is raised. If None, the current version is read from radis.__version__.

  • last_compatible_version (str) – if the file was generated in a non-compatible version, an error is raised (useful to force regeneration of certain cache files after a breaking change in a new version).

  • engine ('h5py', 'pytables', 'vaex', 'auto') – which HDF5 library to use. If 'auto', try to guess. Note: 'vaex' uses 'h5py' compatible HDF5.
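The metadata and version checks described above can be sketched on a plain dictionary. This is a minimal illustration under stated assumptions, not the RADIS code: `assert_not_deprecated` and `parse_version` are hypothetical helpers, `DeprecatedFileWarning` is a local stand-in, the file metadata is assumed to be already loaded into a dict, the version is assumed to be stored under a `"version"` key, and version strings are assumed to be purely numeric (such as `'0.9.1'`). The softer warning for files older than the current version is left out.

```python
class DeprecatedFileWarning(Warning):
    """Local stand-in for the RADIS warning of the same name."""


def parse_version(version):
    """Turn '0.9.26' into (0, 9, 26); assumes purely numeric components."""
    return tuple(int(part) for part in version.split("."))


def assert_not_deprecated(file_metadata, metadata_is=None,
                          metadata_keys_contain=(),
                          last_compatible_version="0.9.1"):
    """Raise DeprecatedFileWarning if the stored metadata is stale."""
    metadata_is = metadata_is or {}

    # Every expected key must be present in the file metadata.
    for key in list(metadata_keys_contain) + list(metadata_is):
        if key not in file_metadata:
            raise DeprecatedFileWarning(f"missing metadata key: {key!r}")

    # Expected values must match; extra keys in the file are ignored.
    for key, expected in metadata_is.items():
        if file_metadata[key] != expected:
            raise DeprecatedFileWarning(f"metadata mismatch for {key!r}")

    # Files written before the last compatible version are rejected.
    # The "version" key name is an assumption of this sketch.
    stored_version = file_metadata.get("version")
    if stored_version is None:
        raise DeprecatedFileWarning("no version stored in file metadata")
    if parse_version(stored_version) < parse_version(last_compatible_version):
        raise DeprecatedFileWarning("file generated by an incompatible version")
```

Comparing versions as integer tuples avoids the pitfall of string comparison, where `'0.9.10' < '0.9.2'` would be true.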

check_relevancy(file, relevant_if_metadata_above, relevant_if_metadata_below, verbose=True, key='df', engine='auto')[source]

Make sure cache file is relevant.

Use case: checks that wavenumber min and wavenumber max in metadata are relevant for the specified spectral range.

Parameters
  • file (str) – a .h5 line database cache file

  • relevant_if_metadata_above, relevant_if_metadata_below (dict) – the file is relevant, and the cached file is loaded, only if the file metadata value for each key of the dictionary is above/below the value given in the dictionary

Other Parameters

key (str) – dataset key in storer.

Examples

You want to compute a spectrum in between 2300 and 2500 cm-1. A line database file is relevant only if its metadata says that 'wavenum_max' > 2300 and 'wavenum_min' < 2500 cm-1.

check_relevancy('path/to/file', relevant_if_metadata_above={'wavenum_max': 2300},
                relevant_if_metadata_below={'wavenum_min': 2500})
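The comparison behind this example can be sketched on plain dictionaries. This is only an illustration of the rule stated above, not the RADIS implementation: `is_relevant` is a hypothetical helper, the metadata is assumed to be already read into a dict, and a missing key simply raises a `KeyError` here.

```python
def is_relevant(file_metadata, relevant_if_metadata_above,
                relevant_if_metadata_below):
    """Return True if every metadata value clears its threshold."""
    for key, threshold in relevant_if_metadata_above.items():
        if not file_metadata[key] > threshold:
            return False
    for key, threshold in relevant_if_metadata_below.items():
        if not file_metadata[key] < threshold:
            return False
    return True


# A file covering 2000-2400 cm-1 overlaps the 2300-2500 cm-1 target range.
overlapping = {"wavenum_min": 2000, "wavenum_max": 2400}
# A file covering 1000-2200 cm-1 ends before the target range starts.
too_low = {"wavenum_min": 1000, "wavenum_max": 2200}

print(is_relevant(overlapping, {"wavenum_max": 2300}, {"wavenum_min": 2500}))  # True
print(is_relevant(too_low, {"wavenum_max": 2300}, {"wavenum_min": 2500}))      # False
```

Note that the "above" test is applied to the file's maximum wavenumber and the "below" test to its minimum, which is what makes this an overlap check between the file range and the requested range.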

filter_metadata(arguments, discard_variables=['self', 'verbose'])[source]

Filter arguments (created with locals() at the beginning of the script) to extract metadata.

Metadata is stored as attributes in the cached file:

  • remove variables in discard_variables

  • remove variables that start with '_'

  • remove variables whose value is None

Parameters
  • arguments (dict) –

    list of local variables. For instance:

    arguments = locals()
    
  • discard_variables (list of str) – variable names to discard

Returns

metadata – a (new) dictionary built from arguments by removing discard_variables, variables starting with '_', and variables whose value is None

Return type

dict
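The three filtering rules listed above fit in a single dictionary comprehension. The following is a minimal sketch rather than the RADIS source: `filter_metadata_sketch` and `parse_database` are hypothetical names introduced for illustration.

```python
def filter_metadata_sketch(arguments, discard_variables=("self", "verbose")):
    """Return a new dict without discarded, private, or None-valued entries."""
    return {
        name: value
        for name, value in arguments.items()
        if name not in discard_variables
        and not name.startswith("_")
        and value is not None
    }


def parse_database(fname, isotope=None, verbose=True, _debug=False):
    # First statement: locals() contains only the call arguments at this point.
    arguments = locals()
    return filter_metadata_sketch(arguments)


print(parse_database("CO2.par"))  # {'fname': 'CO2.par'}
```

Here `isotope` is dropped because it is None, `verbose` because it is in the discard list, and `_debug` because of its leading underscore.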

Examples

How to get only function arguments:

def some_function(*args):
    metadata = locals()     # stores only function arguments because it's the first line

    ...

    metadata = filter_metadata(metadata)
    save_to_hdf(df, fname, metadata=metadata)

    ...

get_cache_file(fcache, engine='pytables', verbose=True)[source]

Load HDF5 cache file.

Parameters

fcache (str) – file name

Other Parameters

verbose (bool) – If >=2, also warns if non numeric values are present (it would make calculations slower)

Notes

We could start using the Feather format instead. See notes in cache_files.py

load_h5_cache_file(cachefile, use_cached, valid_if_metadata_is={}, relevant_if_metadata_above={}, relevant_if_metadata_below={}, current_version='', last_compatible_version='0.9.1', verbose=True, engine='pytables')[source]

Function to load an h5 cache file.

Parameters
  • cachefile (str) – cache file path

  • use_cached (bool or str) – use the cache file if the value is not False:

    • if True, use the cache file (and generate it if it doesn't exist).

    • if 'regen', delete the cache file (if it exists) so it is regenerated.

    • if 'force', use the cache file and raise an error if it doesn't exist.

    If using the cache file, check whether the file is deprecated. If it is deprecated, regenerate the file unless 'force' was used (in that case, raise an error).

  • valid_if_metadata_is (dict) – values are compared to the cache file attributes. If they don't match, the file is considered deprecated. See use_cached to know how deprecated files are handled.

    Note

    If the file has extra attributes, they are not compared.

  • current_version (str) – version is compared to the cache file version (part of the attributes). If the current version is more recent, a simple warning is triggered.

  • last_compatible_version (str) – if the file version is older than this, the file is considered deprecated. See use_cached to know how deprecated files are handled.

  • relevant_if_metadata_above, relevant_if_metadata_below (dict) – values are compared to the cache file attributes. If they don't match, the function returns an IrrelevantFileWarning. For instance, load a line database file only if it contains wavenumbers between 2300 and 2500 cm-1:

    load_h5_cache_file(..., relevant_if_metadata_above={'wav':2300},
    relevant_if_metadata_below={'wav':2500})
    

    Note that in such an example, the file data is not read. Only the file metadata is. If the metadata does not contain the key (e.g.: 'wav') a DeprecatedFileWarning is raised.

Returns

df – None if no cache file was found, or if it was deleted

Return type

pandas DataFrame, or None
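Because the function returns None when no usable cache exists, callers typically follow a "load, else parse and save" pattern. The sketch below illustrates that pattern only; it deliberately uses a JSON file and lists instead of HDF5 and a pandas DataFrame so that it runs without extra dependencies. `save_cache`, `load_cache`, `slow_parse` and `read_with_cache` are hypothetical names, not RADIS functions.

```python
import json
import os
import tempfile


def save_cache(rows, fcache, metadata):
    """Write the data together with its metadata (None values are dropped)."""
    stored = {k: v for k, v in metadata.items() if v is not None}
    with open(fcache, "w") as f:
        json.dump({"metadata": stored, "rows": rows}, f)


def load_cache(fcache, valid_if_metadata_is):
    """Return the cached rows, or None if the file is missing or stale."""
    if not os.path.exists(fcache):
        return None
    with open(fcache) as f:
        content = json.load(f)
    stored = content["metadata"]
    for key, expected in valid_if_metadata_is.items():
        if stored.get(key) != expected:
            os.remove(fcache)  # stale: delete so it is regenerated
            return None
    return content["rows"]


def slow_parse(source):
    """Hypothetical stand-in for parsing a raw line database file."""
    return [[2349.1, 1.0e-19], [2350.3, 2.0e-19]]


def read_with_cache(source, fcache, metadata):
    rows = load_cache(fcache, valid_if_metadata_is=metadata)
    if rows is None:  # no usable cache: parse, then cache for next time
        rows = slow_parse(source)
        save_cache(rows, fcache, metadata)
    return rows


cache_path = os.path.join(tempfile.mkdtemp(), "lines_cache.json")
first = read_with_cache("CO2.par", cache_path, {"molecule": "CO2"})   # parses, writes cache
second = read_with_cache("CO2.par", cache_path, {"molecule": "CO2"})  # served from cache
print(first == second)
```

Storing the generating parameters next to the data is what lets the loader detect, without reading the data itself, that a cache was built with different settings.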

save_to_hdf(df, fname, metadata, version=None, key='df', overwrite=True, verbose=True)[source]

Save energy levels or lines to HDF5 file. Add metadata and version.

Parameters
  • df (a pandas DataFrame) – data will be stored in the key 'df'

  • fname (str) – .h5 file where to store.

  • metadata (dict) – dictionary of values that were used to generate the DataFrame. Metadata will be asked for again on file load to ensure it hasn't changed. None values are not stored.

  • version (str, or None) – file version. If None, the current radis.__version__ is used. On file loading, a warning will be raised if the current version is more recent, or an error if the file version is set to be incompatible.

  • key (str) – dataset name. Default 'df'

  • overwrite (boolean) – if True, overwrites file. Else, raise an error if it exists.

Other Parameters

verbose (bool) – If >=2, also warns if non numeric values are present (it would make calculations slower)

Notes

None values are not stored