radis.io.geisa module¶
Summary¶
GEISA database parser
Defines fetch_geisa() based on GEISADatabaseManager
- fetch_geisa(molecule, local_databases=None, databank_name='GEISA-{molecule}', isotope=None, load_wavenum_min=None, load_wavenum_max=None, columns=None, cache=True, verbose=True, chunksize=100000, clean_cache_files=True, return_local_path=False, engine='default', output='pandas', parallel=True)[source]¶
Stream GEISA file from GEISA website. Unzip and build a HDF5 file directly.
Returns a Pandas DataFrame containing all lines.
- Parameters:
molecule (all 58 GEISA 2020 molecules. See here https://geisa.aeris-data.fr/interactive-access/?db=2020&info=ftp)
local_databases (str) – where to create the RADIS HDF5 files. Default
"~/.radisdb/geisa". Can be changed inradis.config["DEFAULT_DOWNLOAD_PATH"]or in ~/radis.json config filedatabank_name (str) – name of the databank in RADIS Configuration file Default
"GEISA-{molecule}"isotope (str, int or None) – load only certain isotopes :
'2','1,2', etc. IfNone, loads everything. DefaultNone.load_wavenum_min, load_wavenum_max (float (cm-1)) – load only specific wavenumbers.
columns (list of str) – list of columns to load. If
None, returns all columns in the file.
- Other Parameters:
cache (
True,False,'regen'or'force') – ifTrue, use existing HDF5 file. IfFalseor'regen', rebuild it. If'force', raise an error if cache file cannot be used (useful for debugging). DefaultTrue.verbose (bool)
chunksize (int) – number of lines to process at a same time. Higher is usually faster but can create Memory problems and keep the user uninformed of the progress.
clean_cache_files (bool) – if
Trueclean downloaded cache files after HDF5 are created.return_local_path (bool) – if
True, also returns the path of the local database file.engine (‘pytables’, ‘vaex’, ‘default’) – which HDF5 library to use to parse local files. If ‘default’ use the value from ~/radis.json
output (‘pandas’, ‘vaex’, ‘jax’) – format of the output DataFrame. If
'jax', returns a dictionary of jax arrays.parallel (bool) – if
True, uses joblib.parallel to load database with multiple processes
- Returns:
df (pd.DataFrame or vaex.dataframe.DataFrameLocal) – Line list A HDF5 file is also created in
local_databasesand referenced in the RADIS config file with namedatabank_namelocal_path (str) – path of local database file if
return_local_path
Examples
from radis import fetch_geisa df = fetch_geisa("CO") print(df.columns) >>> Index(['wav', 'int', 'airbrd', 'El', 'globu', 'globl', 'locu', 'locl', 'Tdpgair', 'isoG', 'mol', 'idG', 'id', 'iso', 'A', 'selbrd', 'Pshft', 'Tdpair', 'ierrA', 'ierrB', 'ierrC', 'ierrF', 'ierrO', 'ierrR', 'ierrN', 'Tdpgself', 'ierrS', 'Pshfts', 'ierrT', 'Tdppself', 'ierrU'], dtype='object')
Notes
if using
load_only_wavenum_above/beloworisotope, the whole database is anyway downloaded and uncompressed tolocal_databasesfast access .HDF5 files (which will take a long time on first call). Only the expected wavenumber range & isotopes are returned. The .HFD5 parsing useshdf2df()if a registered entry already exists and
radis.config["ALLOW_OVERWRITE"]isTrue: - if any situation arises where the databank needs to be re-downloaded, the possible urls are attempted in their usual order of preference, as if the databank hadn’t been registered, rather than directly re-downloading from the same url that was previously registered, in case e.g. a new linelist has been uploaded since the databank was previously registered - If no partition function file is registered, e.g because one wasn’t available server-side when the databank was last registered, an attempt is still made again to download it, to account for e.g. the case where one has since been uploaded