radis.io.geisa module

Summary

GEISA database parser


Defines fetch_geisa() based on GEISADatabaseManager

fetch_geisa(molecule, local_databases=None, databank_name='GEISA-{molecule}', isotope=None, load_wavenum_min=None, load_wavenum_max=None, columns=None, cache=True, verbose=True, chunksize=100000, clean_cache_files=True, return_local_path=False, engine='default', output='pandas', parallel=True)[source]

Stream GEISA file from GEISA website. Unzip and build a HDF5 file directly.

Returns a Pandas DataFrame containing all lines.

Parameters:
  • molecule (all 58 GEISA 2020 molecules. See here https://geisa.aeris-data.fr/interactive-access/?db=2020&info=ftp)

  • local_databases (str) – where to create the RADIS HDF5 files. Default "~/.radisdb/geisa". Can be changed in radis.config["DEFAULT_DOWNLOAD_PATH"] or in ~/radis.json config file

  • databank_name (str) – name of the databank in RADIS Configuration file Default "GEISA-{molecule}"

  • isotope (str, int or None) – load only certain isotopes : '2', '1,2', etc. If None, loads everything. Default None.

  • load_wavenum_min, load_wavenum_max (float (cm-1)) – load only specific wavenumbers.

  • columns (list of str) – list of columns to load. If None, returns all columns in the file.

Other Parameters:
  • cache (True, False, 'regen' or 'force') – if True, use existing HDF5 file. If False or 'regen', rebuild it. If 'force', raise an error if cache file cannot be used (useful for debugging). Default True.

  • verbose (bool)

  • chunksize (int) – number of lines to process at a same time. Higher is usually faster but can create Memory problems and keep the user uninformed of the progress.

  • clean_cache_files (bool) – if True clean downloaded cache files after HDF5 are created.

  • return_local_path (bool) – if True, also returns the path of the local database file.

  • engine (‘pytables’, ‘vaex’, ‘default’) – which HDF5 library to use to parse local files. If ‘default’ use the value from ~/radis.json

  • output (‘pandas’, ‘vaex’, ‘jax’) – format of the output DataFrame. If 'jax', returns a dictionary of jax arrays.

  • parallel (bool) – if True, uses joblib.parallel to load database with multiple processes

Returns:

  • df (pd.DataFrame or vaex.dataframe.DataFrameLocal) – Line list A HDF5 file is also created in local_databases and referenced in the RADIS config file with name databank_name

  • local_path (str) – path of local database file if return_local_path

Examples

from radis import fetch_geisa
df = fetch_geisa("CO")
print(df.columns)
>>> Index(['wav', 'int', 'airbrd', 'El', 'globu', 'globl', 'locu', 'locl',
    'Tdpgair', 'isoG', 'mol', 'idG', 'id', 'iso', 'A', 'selbrd', 'Pshft',
    'Tdpair', 'ierrA', 'ierrB', 'ierrC', 'ierrF', 'ierrO', 'ierrR', 'ierrN',
    'Tdpgself', 'ierrS', 'Pshfts', 'ierrT', 'Tdppself', 'ierrU'],
    dtype='object')

Compare CO spectrum from the GEISA and HITRAN database

Compare CO spectrum from the GEISA and HITRAN database

Notes

if using load_only_wavenum_above/below or isotope, the whole database is anyway downloaded and uncompressed to local_databases fast access .HDF5 files (which will take a long time on first call). Only the expected wavenumber range & isotopes are returned. The .HFD5 parsing uses hdf2df()