radis.io.geisa module


GEISA database parser

class GEISADatabaseManager(name, molecule, local_databases, engine='default', verbose=True, chunksize=100000, parallel=True)[source]

Bases: DatabaseManager


Requires an internet connection.

parse_to_local_file(opener, urlname, local_file, pbar_active=True, pbar_t0=0, pbar_Ntot_estimate_factor=None, pbar_Nlines_already=0, pbar_last=True)[source]

Uncompress urlname into local_file. Also add metadata

  • opener (an opener with an .open() command)

  • gfile (file handler. Filename: for info)


register in ~/radis.json

columns_GEISA = {'A': ('a10', <class 'float'>, 'Einstein A coefficient', 's-1'), 'El': ('a10', <class 'float'>, 'lower-state energy', 'cm-1'), 'Pshft': ('a9', <class 'float'>, 'air pressure-induced line shift at 296K', 'cm-1.atm-1'), 'Pshfts': ('a8', <class 'float'>, 'self pressure-induced line shift at 296K', 'cm-1.atm-1'), 'Tdpair': ('a4', <class 'float'>, 'temperature-dependance exponent for Gamma air', ''), 'Tdpnself': ('a4', <class 'float'>, 'temperature-dependance exponent for self pressure-induced line shift', ''), 'Tdppair': ('a6', <class 'float'>, 'temperature-dependance exponent for air pressure-induced line shift', ''), 'Tdpself': ('a4', <class 'float'>, 'temperature-dependance exponent for self-broadening halfwidth', ''), 'airbrd': ('a6', <class 'float'>, 'air-broadened half-width at 296K', 'cm-1.atm-1'), 'globl': ('a25', <class 'str'>, 'electronic and vibrational global lower quanta', ''), 'globu': ('a25', <class 'str'>, 'electronic and vibrational global upper quanta', ''), 'id': ('a2', <class 'int'>, 'Hitran molecular number', ''), 'idG': ('a3', <class 'str'>, 'Internal GEISA code for the data identification', ''), 'ierrA': ('a10', <class 'float'>, 'estimated accuracy on the line position', 'cm-1'), 'ierrB': ('a11', <class 'str'>, 'estimated accuracy on the intensity of the line', 'cm-1/(molecule/cm-2)'), 'ierrC': ('a6', <class 'float'>, 'estimated accuracy on the air collision halfwidth', 'cm-1.atm-1'), 'ierrF': ('a4', <class 'float'>, 'estimated accuracy on the temperature dependence coefficient of the air-broadening halfwidth', ''), 'ierrN': ('a7', <class 'float'>, 'estimated accuracy on the self-broadened at 296K', 'cm-1.atm-1'), 'ierrO': ('a9', <class 'float'>, 'estimated accuracy on the air pressure shift of the line transition at 296K', 'cm-1.atm-1'), 'ierrR': ('a6', <class 'float'>, 'estimated accuracy on the temperature dependence coefficient of the air pressure shift', ''), 'ierrS': ('a4', <class 'float'>, 'estimated accuracy on the temperature dependence coefficient of the self-broadening halfwidth', ''), 'ierrT': ('a8', <class 'float'>, 'estimated accuracy on the self-pressure shift of the line transition at 296K', 'cm-1.atm-1'), 'ierrU': ('a4', <class 'float'>, 'estimated accuracy on the temperature dependence coefficient of the self pressure shift', ''), 'int': ('a11', <class 'str'>, 'intensity at 296K', 'cm-1/(molecule/cm-2)'), 'iso': ('a1', <class 'int'>, 'Hitran isotope number', ''), 'isoG': ('a3', <class 'int'>, 'GEISA isotope number', ''), 'locl': ('a15', <class 'str'>, 'electronic and vibrational local lower quanta', ''), 'locu': ('a15', <class 'str'>, 'electronic and vibrational local upper quanta', ''), 'mol': ('a3', <class 'int'>, 'GEISA molecular number', ''), 'selbrd': ('a7', <class 'float'>, 'self-broadened half-width at 296K', 'cm-1.atm-1'), 'wav': ('a12', <class 'float'>, 'vacuum wavenumber', 'cm-1')}[source]

parsing order of GEISA2020 format
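As an illustration of how such a column table can drive fixed-width parsing, here is a minimal sketch using only the standard library. It is not the RADIS parser. It assumes that a code such as 'a12' means a 12-character field, that the first four fields come in the order wav, int, airbrd, El (taken from the example output of fetch_geisa("CO")), and that the intensity uses a Fortran-style "D" exponent, which would explain why 'int' is typed as str. The helper names and the sample line are hypothetical.

```python
# Widths and types copied from columns_GEISA for four columns.
# ASSUMPTION: field order follows the example output of fetch_geisa("CO").
FIELDS = [
    ("wav", 12, float),    # vacuum wavenumber, cm-1
    ("int", 11, str),      # intensity at 296K, kept as text at first
    ("airbrd", 6, float),  # air-broadened half-width at 296K
    ("El", 10, float),     # lower-state energy, cm-1
]


def parse_fixed_width(line, fields=FIELDS):
    """Slice `line` into consecutive fixed-width fields and cast each one."""
    record, pos = {}, 0
    for name, width, cast in fields:
        record[name] = cast(line[pos:pos + width].strip())
        pos += width
    return record


def fortran_float(text):
    """Convert a Fortran-style number such as '4.450D-19' to a float."""
    return float(text.replace("D", "E"))


# Synthetic line built for this sketch, not real GEISA data.
line = f"{2143.2711:12.6f}{'4.450D-19':>11}{0.0597:6.4f}{3.845:10.4f}"
rec = parse_fixed_width(line)
rec["int"] = fortran_float(rec["int"])
print(rec)
```

Keeping the widths in a single table means the slicing loop never needs to change when more columns are added to the list.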



fetch_geisa(molecule, local_databases=None, databank_name='GEISA-{molecule}', isotope=None, load_wavenum_min=None, load_wavenum_max=None, columns=None, cache=True, verbose=True, chunksize=100000, clean_cache_files=True, return_local_path=False, engine='default', output='pandas', parallel=True)[source]

Stream GEISA file from GEISA website. Unzip and build a HDF5 file directly.

Returns a Pandas DataFrame containing all lines.

  • molecule (all 58 GEISA 2020 molecules. See here https://geisa.aeris-data.fr/interactive-access/?db=2020&info=ftp)

  • local_databases (str) – where to create the RADIS HDF5 files. Default "~/.radisdb/geisa". Can be changed in radis.config["DEFAULT_DOWNLOAD_PATH"] or in ~/radis.json config file

  • databank_name (str) – name of the databank in the RADIS configuration file. Default "GEISA-{molecule}"

  • isotope (str, int or None) – load only certain isotopes : '2', '1,2', etc. If None, loads everything. Default None.

  • load_wavenum_min, load_wavenum_max (float (cm-1)) – load only specific wavenumbers.

  • columns (list of str) – list of columns to load. If None, returns all columns in the file.

Other Parameters
  • cache (True, False, 'regen' or 'force') – if True, use existing HDF5 file. If False or 'regen', rebuild it. If 'force', raise an error if cache file cannot be used (useful for debugging). Default True.

  • verbose (bool)

  • chunksize (int) – number of lines to process at the same time. Higher is usually faster but can cause memory problems and leaves the user uninformed of the progress.

  • clean_cache_files (bool) – if True, clean downloaded cache files after the HDF5 files are created.

  • return_local_path (bool) – if True, also returns the path of the local database file.

  • engine (‘pytables’, ‘vaex’, ‘default’) – which HDF5 library to use to parse local files. If ‘default’ use the value from ~/radis.json

  • output (‘pandas’, ‘vaex’, ‘jax’) – format of the output DataFrame. If 'jax', returns a dictionary of jax arrays. If 'vaex', output is a vaex.dataframe.DataFrameLocal


    Vaex DataFrames are memory-mapped. They do not take any space in RAM and are extremely useful for dealing with the largest databases.

  • parallel (bool) – if True, uses joblib.Parallel to load the database with multiple processes


  • df (pd.DataFrame) – Line list. An HDF5 file is also created in local_databases and referenced in the RADIS config file under the name databank_name

  • local_path (str) – path of local database file if return_local_path
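The four documented values of cache can be summarised as a small decision function. This is only an illustrative sketch of the behaviour described above, not code from RADIS: the function name, the "load"/"rebuild" labels, and the assumption that cache=True falls back to building the file when no usable cache exists are all hypothetical.

```python
def plan_cache_action(cache, cache_is_usable):
    """Return 'load' or 'rebuild' for a given `cache` mode (illustrative only)."""
    if cache not in (True, False, "regen", "force"):
        raise ValueError(f"unexpected cache mode: {cache!r}")
    if cache == "force":
        # 'force': refuse to continue if the cache file cannot be used.
        if not cache_is_usable:
            raise FileNotFoundError("cache='force' but no usable cache file")
        return "load"
    if cache is True and cache_is_usable:
        return "load"
    # False or 'regen' always rebuild.
    # ASSUMPTION: True with no usable cache also builds the file.
    return "rebuild"


print(plan_cache_action(True, cache_is_usable=True))
print(plan_cache_action("regen", cache_is_usable=True))
```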


from radis import fetch_geisa
df = fetch_geisa("CO")
print(df.columns)
>>> Index(['wav', 'int', 'airbrd', 'El', 'globu', 'globl', 'locu', 'locl',
    'Tdpgair', 'isoG', 'mol', 'idG', 'id', 'iso', 'A', 'selbrd', 'Pshft',
    'Tdpair', 'ierrA', 'ierrB', 'ierrC', 'ierrF', 'ierrO', 'ierrR', 'ierrN',
    'Tdpgself', 'ierrS', 'Pshfts', 'ierrT', 'Tdppself', 'ierrU'],
    dtype='object')
Compare CO spectrum from the GEISA and HITRAN database



If using load_wavenum_min/load_wavenum_max or isotope, the whole database is still downloaded and uncompressed to fast-access HDF5 files in local_databases (which will take a long time on the first call). Only the requested wavenumber range and isotopes are returned. The HDF5 parsing uses hdf2df().
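A call that combines these options might look like the sketch below. It uses only parameters listed in the signature above. The wrapper name, the wavenumber range, the isotope value and the column selection are illustrative choices, and keeping 'iso' in columns is a precaution rather than a documented requirement. The import sits inside the function because running it needs RADIS installed and a network connection on the first call.

```python
def fetch_co_band(wmin=2000.0, wmax=2300.0):
    """Fetch CO lines between wmin and wmax (cm-1) plus the local file path.

    Hypothetical wrapper written for this example; not part of RADIS.
    """
    # Imported lazily: needs RADIS and, on first call, a network connection.
    from radis import fetch_geisa

    df, local_path = fetch_geisa(
        "CO",
        isotope="1",                # load only isotope '1'
        load_wavenum_min=wmin,      # cm-1
        load_wavenum_max=wmax,      # cm-1
        columns=["wav", "int", "El", "airbrd", "iso"],
        return_local_path=True,     # also return the path of the local file
    )
    return df, local_path
```

Calling df, path = fetch_co_band() triggers the full download on first use. Later calls with cache=True (the default) read the existing HDF5 file instead.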

gei2df(fname, cache=True, load_columns=None, verbose=True, drop_non_numeric=True, load_wavenum_min=None, load_wavenum_max=None, engine='pytables')[source]

Convert a GEISA [1] file to a Pandas dataframe.

  • fname (str) – GEISA file name.

  • cache (boolean, or ‘regen’) – if True, a pandas-readable HDF5 file is generated on first access, and used on later calls. This saves on the datatype cast and conversion and improves performance a lot (but changes in the database are not taken into account). If False, no cache file is used. If ‘regen’, temp files are reconstructed. Default True.

  • load_columns (list) – columns to load. If None, loads everything.

    Note: this is only relevant when loading from a cache file. To generate the cache file, all columns are loaded anyway.

Other Parameters
  • drop_non_numeric (boolean) – if True, non-numeric columns are dropped. This improves performance, but make sure all the columns you need are converted to numeric formats beforehand. Default True. Note that if a cache file is loaded it will be left untouched.

  • load_wavenum_min, load_wavenum_max (float) – if not None, only load the cached file if it contains data for wavenumbers above/below the specified value. See radis.io.cache_files.load_h5_cache_file(). Default None.

  • engine (‘pytables’, ‘vaex’) – format for the HDF5 cache file, pytables by default.


df – dataframe containing all lines and parameters.

Return type

pandas Dataframe


GEISA Database 2020 release can be downloaded from [2].

[1] The 2020 edition of the GEISA spectroscopic database, Thibault Delahaye et al., 2021

[2] GEISA Database 2020 release

See also

hit2df(), cdsd2df()


Get non-empty lines of a chunk b, parsing the bytes.
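A minimal sketch of this kind of helper, assuming the chunk is UTF-8 encoded text. The function name and the encoding default are hypothetical and not taken from the RADIS source.

```python
def nonempty_lines(b, encoding="utf-8"):
    """Decode a bytes chunk and return its lines, skipping blank ones."""
    return [line for line in b.decode(encoding).splitlines() if line.strip()]


chunk = b"first line\n\nsecond line\r\n   \nthird line"
print(nonempty_lines(chunk))
```

Note that str.splitlines() handles both "\n" and "\r\n" line endings, so the same helper works for files written on different platforms.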