radis.lbl.loader module

Summary

Module hosting the databank loading and database initialisation parts of SpectrumFactory, to unload the factory.py file. It holds all of the non-physical machinery, while the actual population calculations and line broadening are still done in factory.py.

This is achieved by having SpectrumFactory inherit from the DatabankLoader class defined here.

Routine Listings

PUBLIC METHODS

PRIVATE METHODS - DATABASE LOADING

  • radis.lbl.loader.DatabankLoader._load_databank()

  • radis.lbl.loader.DatabankLoader._reload_databank()

  • radis.lbl.loader.DatabankLoader._check_line_databank()

  • radis.lbl.loader.DatabankLoader._retrieve_from_database()

  • radis.lbl.loader.DatabankLoader._build_partition_function_interpolator()

  • radis.lbl.loader.DatabankLoader._build_partition_function_calculator()

  • radis.lbl.loader.DatabankLoader._fetch_molecular_parameters()

Most methods are written in inherited classes, with the following inheritance scheme:

DatabankLoader > BaseFactory > BroadenFactory > BandFactory > SpectrumFactory

Inheritance diagram of radis.lbl.factory.SpectrumFactory

Notes

RADIS automatically rebuilds deprecated cache files, and a global variable forces regenerating them after a given version. See radis.OLDEST_COMPATIBLE_VERSION


class ConditionDict[source]

Bases: dict

A class to hold Spectrum calculation input conditions (Input), computation parameters (Parameters), or miscellaneous parameters (MiscParams).

Works like a dict, except that you can also access items as attributes:

v = a.key   # equivalent to v = a["key"]

It can also be copied, deepcopied, and parallelized in multiprocessing.

Notes

For developers:

Parameters and Input could also have simply derived from the (object) class, but they would then have missed some convenient functions implemented for dict. For instance, how to be pickled / unpickled.
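As an illustration of the behaviour described above, here is a minimal sketch of a dict subclass with attribute access. The class name AttrDict is hypothetical and this is not the actual ConditionDict implementation; it only shows why deriving from dict gives copying and pickling for free.

```python
import copy
import pickle


class AttrDict(dict):
    """Hypothetical stand-in for ConditionDict: a dict with attribute access."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        # Store attributes as dictionary items
        self[name] = value


conditions = AttrDict(Tgas=3000, isotope="1,2")
conditions.path_length = 1.0  # same as conditions["path_length"] = 1.0

shallow = copy.copy(conditions)
deep = copy.deepcopy(conditions)
restored = pickle.loads(pickle.dumps(conditions))  # what multiprocessing relies on
```

Because the values live in the dict itself, the standard copy and pickle machinery for dict subclasses handles them without extra code.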

See also

Input, Parameter, MiscParams

copy() → a shallow copy of D[source]
get_params()[source]

Returns the variables (and their values) contained in the dictionary, minus some based on their type. Numpy array, dictionaries and pandas DataFrame are removed.

Tuples are converted to string
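The filtering described above can be sketched as follows. The function name filter_params and its excluded_types argument are hypothetical; to stay within the standard library, only dict values are excluded here, whereas the real method also removes Numpy arrays and pandas DataFrames.

```python
def filter_params(conditions, excluded_types=(dict,)):
    """Return a copy without values of excluded types; tuples become strings."""
    out = {}
    for key, value in conditions.items():
        if isinstance(value, excluded_types):
            continue  # drop containers that do not serialize to a simple value
        out[key] = str(value) if isinstance(value, tuple) else value
    return out


raw = {"Tgas": 3000, "overpopulation": {"(0,0,0,1)": 3.0}, "isotope": (1, 2)}
params = filter_params(raw)
```

With numpy and pandas installed, one could pass excluded_types=(dict, numpy.ndarray, pandas.DataFrame) to get closer to the documented behaviour.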

class DatabankLoader[source]

Bases: object


See also

SpectrumFactory

df0[source]

initial line database after loading.

If, for any reason, you want to manipulate the line database manually (for instance, keeping only lines emitted by a particular level), you need to access the df0 attribute of SpectrumFactory.

Warning

Never overwrite the df0 attribute, or some metadata may be lost in the process. Only use in-place operations. If reducing the number of lines, add a df0.reset_index()

For instance:

from radis import SpectrumFactory

sf = SpectrumFactory(
    wavenum_min=2150.4,
    wavenum_max=2151.4,
    pressure=1,
    isotope=1)
sf.load_databank('HITRAN-CO-TEST')
sf.df0.drop(sf.df0[sf.df0.vu!=1].index, inplace=True)   # keep lines emitted by v'=1 only
sf.eq_spectrum(Tgas=3000, name='vu=1').plot()

df0 contains the lines as they are loaded from the database. df1 is generated during the spectrum calculation, after the line database reduction steps, population calculation, and scaling of intensity and broadening parameters with the calculated conditions.

See also

df1

Type

pandas DataFrame

df1[source]

line database, scaled with populations and reduced by the linestrength cutoff. Never edit manually. See all comments about df0.

See also

df0

Type

DataFrame

fetch_databank(source='hitran', parfunc=None, parfuncfmt='hapi', levels=None, levelsfmt='radis', load_energies=False, include_neighbouring_lines=True, parse_local_global_quanta=True, drop_non_numeric=True, db_use_cached=True, lvl_use_cached=True)[source]

Fetch the latest databank files from HITRAN or HITEMP with the https://hitran.org/ API.

Parameters

source ('hitran', 'hitemp') – download database lines from the latest HITRAN version (see [HITRAN-2016]) or HITEMP version (see [HITEMP-2010]).

Other Parameters
  • parfuncfmt ('cdsd', 'hapi', or any of KNOWN_PARFUNCFORMAT) – format to read tabulated partition function file. If 'hapi', then HAPI (HITRAN Python interface) [2]_ is used to retrieve them (valid if your database is HITRAN data). HAPI is embedded into RADIS. Check the version. If parfuncfmt is None then 'hapi' is used. Default 'hapi'.

  • parfunc (filename or None) – path to tabulated partition function to use. If parfuncfmt is hapi then parfunc should be the link to the hapi.py file. If not given, then the hapi.py embedded in RADIS is used (check version)

  • levels (dict of str or None) – path to energy levels (needed for non-eq calculations). Format: {1:path_to_levels_iso_1, 3:path_to_levels_iso3}. Default None

  • levelsfmt (‘cdsd-pc’, ‘radis’ (or any of KNOWN_LVLFORMAT) or None) – how to read the previous file. Known formats: (see KNOWN_LVLFORMAT). If radis, energies are calculated using the diatomic constants in radis.db database if available for given molecule. Look up references there. If None, non equilibrium calculations are not possible. Default 'radis'.

  • load_energies (boolean) – if False, don't load energy levels. This means that nonequilibrium spectra cannot be calculated, but it saves some memory. Default False

  • include_neighbouring_lines (bool) – if True, includes off-range, neighbouring lines that contribute because of lineshape broadening. The broadening_max_width parameter is used to determine the limit. Default True.

  • parse_local_global_quanta (bool, or 'auto') – if True, parses the HITRAN/HITEMP ‘glob’ and ‘loc’ columns to extract quanta identifying the lines. Required for nonequilibrium calculations, or to use line_survey(), but takes up more space.

  • drop_non_numeric (boolean) – if True, non-numeric columns are dropped. This improves performance, but make sure all the columns you need are converted to numeric formats beforehand. Default True. Note that if a cache file is loaded it will be left untouched.

  • db_use_cached (bool, or 'regen') – use cached

Notes

HITRAN is fetched with Astroquery [1]_ and HITEMP with fetch_hitemp()

HITEMP files are generated in a ~/.radisdb database.


References

1

Astroquery

2

HAPI: The HITRAN Application Programming Interface

get_conditions(ignore_misc=False)[source]

Get all parameters defined in the SpectrumFactory.

ignore_misc: boolean

if True, then all attributes considered as Factory ‘descriptive’ parameters, as defined in get_conditions() are ignored when comparing the database to current factory conditions. It should obviously only be attributes that have no impact on the Spectrum produced by the factory. Default False
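A minimal sketch of how such a merge might work, assuming the factory conditions are simply the union of the input, params and misc dictionaries. The function name collect_conditions is hypothetical and the real method may gather more than these three sources.

```python
def collect_conditions(inputs, params, misc, ignore_misc=False):
    """Merge the three condition dictionaries into one (hypothetical helper)."""
    conditions = dict(inputs)
    conditions.update(params)
    if not ignore_misc:
        # Descriptive parameters only: they do not affect the computed spectrum
        conditions.update(misc)
    return conditions


inputs = {"Tgas": 3000, "molecule": "CO"}
params = {"wstep": 0.01, "cutoff": 1e-27}
misc = {"chunksize": None, "total_lines": 120}

everything = collect_conditions(inputs, params, misc)
physical_only = collect_conditions(inputs, params, misc, ignore_misc=True)
```

Leaving the misc entries out is what makes two spectra comparable when they differ only in, say, the number of CPUs used.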

get_partition_function_calculator(molecule, isotope, elec_state)[source]

Retrieve Partition Function Calculator.

Parameters
  • molecule (str)

  • isotope (int)

  • elec_state (str)

get_partition_function_interpolator(molecule, isotope, elec_state)[source]

Retrieve Partition Function Interpolator.

Parameters
  • molecule (str)

  • isotope (int)

  • elec_state (str)

get_partition_function_molecule(molecule)[source]

Retrieve Partition Function for Molecule.

Parameters

molecule (str)

init_databank(*args, **kwargs)[source]

Method to init databank parameters but only load them when needed. Databank is reloaded by _check_line_databank()

Same input parameters as load_databank():

Parameters

name (a section name specified in your ~/radis.json) – .radis has to be created in your HOME (Unix) / User (Windows). If not None, all other arguments are discarded. Note that all files in the database will be loaded and it may take some time. It is better to limit the database size if you already know what range you need. See Configuration file and DBFORMAT for the expected ~/radis.json format

Other Parameters
  • path (str, list of str, None) – list of database files, or name of a predefined database in the Configuration file (json) Accepts wildcards * to select multiple files

  • format ('hitran', 'cdsd-hitemp', 'cdsd-4000', or any of KNOWN_DBFORMAT) – database type. 'hitran' for HITRAN/HITEMP, 'cdsd-hitemp' and 'cdsd-4000' for the different CDSD versions. Default 'hitran'

  • parfuncfmt ('hapi', 'cdsd', or any of KNOWN_PARFUNCFORMAT) – format to read tabulated partition function file. If 'hapi', then HAPI (HITRAN Python interface) [1]_ is used to retrieve them (valid if your database is HITRAN data). HAPI is embedded into RADIS. Check the version. If parfuncfmt is None then 'hapi' is used. Default 'hapi'.

  • parfunc (filename or None) – path to tabulated partition function to use. If parfuncfmt is hapi then parfunc should be the link to the hapi.py file. If not given, then the hapi.py embedded in RADIS is used (check version)

  • levels (dict of str or None) – path to energy levels (needed for non-eq calculations). Format: {1:path_to_levels_iso_1, 3:path_to_levels_iso3}. Default None

  • levelsfmt (‘cdsd-pc’, ‘radis’ (or any of KNOWN_LVLFORMAT) or None) – how to read the previous file. Known formats: (see KNOWN_LVLFORMAT). If radis, energies are calculated using the diatomic constants in radis.db database if available for given molecule. Look up references there. If None, non equilibrium calculations are not possible. Default 'radis'.

  • db_use_cached (boolean, or None) – if True, a pandas-readable csv file is generated on first access, and later used. This saves on the datatype cast and conversion and improves performance a lot. However, be sure to delete these files to regenerate them if you happen to change the database. If 'regen', existing cached files are removed and regenerated. It is also used to load energy levels from the .h5 cache file if it exists. If None, the value given on Factory creation is used. Default None

  • load_energies (boolean) – if False, don't load energy levels. This means that nonequilibrium spectra cannot be calculated, but it saves some memory. Default True

  • include_neighbouring_lines (bool) – if True, includes off-range, neighbouring lines that contribute because of lineshape broadening. The broadening_max_width parameter is used to determine the limit. Default True.

  • Other arguments are related to how to open the files:

  • drop_columns (list) – column names to drop from the Line DataFrame after loading the file. Not recommended, unless you explicitly want to drop information (for instance if dealing with too large databases). If [], nothing is dropped. If 'auto', parameters considered unnecessary are dropped. See drop_auto_columns_for_dbformat and drop_auto_columns_for_levelsfmt. Default 'auto'.

Notes

Useful in conjunction with init_database() when dealing with large line databanks, when some of the spectra may have been precomputed in a spectrum database (SpecDatabase)

Note that any previously loaded databank is discarded on the method call
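The deferred loading described for init_databank() can be sketched as below. The class LazyDatabank and its attributes are hypothetical stand-ins, not the DatabankLoader internals; the point is only that parameters are stored first and files are read on the first check.

```python
class LazyDatabank:
    """Hypothetical sketch: store databank parameters, load only on demand."""

    def __init__(self):
        self._databank_args = None
        self.lines = None
        self.load_count = 0

    def init_databank(self, **kwargs):
        self._databank_args = kwargs
        self.lines = None  # any previously loaded databank is discarded

    def _load(self):
        # Placeholder for the expensive step of reading the line files
        self.load_count += 1
        self.lines = [f"line read from {self._databank_args['path']}"]

    def check_line_databank(self):
        if self.lines is None:
            if self._databank_args is None:
                raise RuntimeError("no databank was initialised")
            self._load()
        return self.lines


loader = LazyDatabank()
loader.init_databank(path="co_lines.par", format="hitran")
nothing_loaded_yet = loader.load_count == 0
lines = loader.check_line_databank()  # first access triggers the load
lines_again = loader.check_line_databank()  # second access reuses it
```

If every requested spectrum is found in a precomputed spectrum database, the check is never reached and the line files are never read.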


init_database(path, autoretrieve=True, autoupdate=True, add_info=['Tvib', 'Trot'], add_date='%Y%m%d', compress=True)[source]

Init a SpecDatabase folder in path to later store our spectra. Spectra can also be automatically retrieved from the database instead of being calculated.

Parameters
  • path (str) – path to database folder. If it doesn't exist, it is created. Accepts wildcards * to select multiple files

  • autoretrieve (boolean, or 'force') – if True, a database lookup is performed whenever a new spectrum is calculated. If the spectrum already exists then it is retrieved from the database instead of being calculated. Spectra are considered the same if all the stored conditions fit. If set to 'force', an error is raised if the spectrum is not found in the database (use it for debugging). Default True

  • autoupdate (boolean) – if True, all spectra calculated by this Factory are automatically exported to the database. Default True (but only if init_database is explicitly called by the user)

  • add_info (list, or None/False) – append these parameters and their values if they are in conditions. Default ['Tvib', 'Trot']

  • add_date (str, or None/False) – adds date in strftime format to the beginning of the filename. Default ‘%Y%m%d’

  • compress (boolean) – if True, spectra are read and written in binary format. This is faster, and takes less memory space. Default True

Returns

db – the database where spectra will be stored or retrieved

Return type

SpecDatabase
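The interplay of autoretrieve and autoupdate can be sketched with a plain dictionary standing in for the SpecDatabase. The function get_spectrum and the use of a dict as storage are assumptions made for illustration; the real database stores Spectrum files on disk.

```python
def get_spectrum(conditions, database, compute, autoretrieve=True, autoupdate=True):
    """Look a spectrum up by its conditions before computing it (sketch)."""
    key = tuple(sorted(conditions.items()))  # same conditions give the same key
    if autoretrieve and key in database:
        return database[key]
    if autoretrieve == "force":
        # Debugging mode: refuse to compute when nothing is stored
        raise KeyError(f"no stored spectrum for conditions {conditions}")
    spectrum = compute(conditions)
    if autoupdate:
        database[key] = spectrum
    return spectrum


computed = []


def fake_compute(conditions):
    # Stand-in for the actual spectrum calculation
    computed.append(conditions)
    return f"spectrum at Tgas={conditions['Tgas']}"


db = {}
first = get_spectrum({"Tgas": 3000, "path_length": 1.0}, db, fake_compute)
second = get_spectrum({"path_length": 1.0, "Tgas": 3000}, db, fake_compute)
```

The second call finds the stored entry, so fake_compute runs only once.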

load_databank()[source]

Loads databank from shortname in the Configuration file (json), or by manually setting all attributes.

Databank includes:

  • lines

  • partition function & format (tabulated or calculated)

  • (optional) energy levels, format

It also fetches molecular parameters (molar mass, abundance) for all molecules in database

Parameters

name (a section name specified in your ~/radis.json) – .radis has to be created in your HOME (Unix) / User (Windows). If not None, all other arguments are discarded. Note that all files in the database will be loaded and it may take some time. It is better to limit the database size if you already know what range you need. See Configuration file and DBFORMAT for the expected ~/radis.json format

Other Parameters
  • path (str, list of str, None) – list of database files, or name of a predefined database in the Configuration file (json) Accepts wildcards * to select multiple files

  • format ('hitran', 'cdsd-hitemp', 'cdsd-4000', or any of KNOWN_DBFORMAT) – database type. 'hitran' for HITRAN/HITEMP, 'cdsd-hitemp' and 'cdsd-4000' for the different CDSD versions. Default 'hitran'

  • parfuncfmt ('hapi', 'cdsd', or any of KNOWN_PARFUNCFORMAT) – format to read tabulated partition function file. If 'hapi', then HAPI (HITRAN Python interface) [1]_ is used to retrieve them (valid if your database is HITRAN data). HAPI is embedded into RADIS. Check the version. If parfuncfmt is None then 'hapi' is used. Default 'hapi'.

  • parfunc (filename or None) – path to tabulated partition function to use. If parfuncfmt is hapi then parfunc should be the link to the hapi.py file. If not given, then the hapi.py embedded in RADIS is used (check version)

  • levels (dict of str or None) – path to energy levels (needed for non-eq calculations). Format: {1:path_to_levels_iso_1, 3:path_to_levels_iso3}. Default None

  • levelsfmt (‘cdsd-pc’, ‘radis’ (or any of KNOWN_LVLFORMAT) or None) – how to read the previous file. Known formats: (see KNOWN_LVLFORMAT). If radis, energies are calculated using the diatomic constants in radis.db database if available for given molecule. Look up references there. If None, non equilibrium calculations are not possible. Default 'radis'.

  • db_use_cached (boolean, or None) – if True, a pandas-readable csv file is generated on first access, and later used. This saves on the datatype cast and conversion and improves performance a lot. However, be sure to delete these files to regenerate them if you happen to change the database. If 'regen', existing cached files are removed and regenerated. It is also used to load energy levels from the .h5 cache file if it exists. If None, the value given on Factory creation is used. Default True

  • load_energies (boolean) – if False, don't load energy levels. This means that nonequilibrium spectra cannot be calculated, but it saves some memory. Default True

  • include_neighbouring_lines (bool) – if True, includes off-range, neighbouring lines that contribute because of lineshape broadening. The broadening_max_width parameter is used to determine the limit. Default True.

  • Other arguments are related to how to open the files:

  • drop_columns (list) – column names to drop from the Line DataFrame after loading the file. Not recommended, unless you explicitly want to drop information (for instance if dealing with too large databases). If [], nothing is dropped. If 'auto', parameters considered useless are dropped. See drop_auto_columns_for_dbformat and drop_auto_columns_for_levelsfmt. If 'all', parameters considered unnecessary for equilibrium calculations are dropped, including all information about lines that could be otherwise available in Spectrum() method. Warning: nonequilibrium calculations are not possible in this mode. Default 'auto'.
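The three db_use_cached modes (True, False, 'regen') can be sketched as follows. The function load_with_cache, the plain-text cache format and the file names are assumptions for illustration; RADIS itself writes its own cache file formats.

```python
import os
import tempfile


def load_with_cache(source_path, cache_path, parse, db_use_cached=True):
    """Parse a line file, reusing or regenerating a cache file (sketch)."""
    if db_use_cached == "regen" and os.path.exists(cache_path):
        os.remove(cache_path)  # force regeneration below
    if db_use_cached and os.path.exists(cache_path):
        with open(cache_path) as f:
            return f.read().split()
    rows = parse(source_path)
    if db_use_cached:
        with open(cache_path, "w") as f:
            f.write("\n".join(rows))
    return rows


calls = []


def parse(path):
    # Stand-in for the slow datatype cast and conversion step
    calls.append(path)
    with open(path) as f:
        return f.read().split()


with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "lines.par")
    cache = os.path.join(tmp, "lines.cache")
    with open(source, "w") as f:
        f.write("line_a\nline_b\n")
    first = load_with_cache(source, cache, parse)  # parses and writes the cache
    second = load_with_cache(source, cache, parse)  # reads the cache
    third = load_with_cache(source, cache, parse, db_use_cached="regen")  # parses again
```

The sketch also shows why a stale cache is a risk: the second call never looks at the source file, so changes to it go unnoticed until the cache is deleted or 'regen' is used.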

See also

Configuration file with:

  • all line database formats: DBFORMAT

  • all energy levels database formats: LVLFORMAT

References

1

HAPI: The HITRAN Application Programming Interface

misc[source]

Miscellaneous parameters (MiscParams) that cannot change the output of calculations (ex: number of CPU, etc.)

params[source]

Computational parameters (Parameters) that may change the output of calculations (ex: threshold, cutoff, broadening methods, etc.)

Type

Parameters

parsum_calc[source]

store all partition function calculators, per isotope

Type

dict

parsum_tab[source]

store all partition function tabulators, per isotope

Type

dict

save_memory[source]

if True, tries to save RAM (but may take a little more time, for instance by saving data to files instead of keeping it in RAM)

Type

bool

verbose[source]

increase verbose level. 0, 1, 2 supported at the moment

Type

bool, or int

warn(message, category='default', level=0)[source]

Trigger a warning, an error or just ignore based on the value defined in the warnings dictionary.

The warnings can thus be deactivated selectively by setting the SpectrumFactory warnings attribute

Parameters
  • message (str) – what to print

  • category (str) – one of the keys of self.warnings. See warnings

  • level (int) – warning level. Only print warnings when the verbose level is at least the warning level, i.e., warnings of level 1 appear only if verbose==True, warnings of level 2 appear only for verbose>=2, etc. Warnings of level 0 appear all the time. Default 0

Examples

if not ((df.Erotu > tol).all() and (df.Erotl > tol).all()):
    self.warn(
        "There are negative rotational energies in the database",
        "NegativeEnergiesWarning",
    )

Notes

All warnings in the SpectrumFactory should call this method rather than the default warnings.warn() method, because it allows easier runtime modification of how to deal with warnings
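A minimal sketch of such a dispatch, assuming the warnings dictionary maps each category name to one of "warn", "error" or "ignore". The class WarningDispatcher, these three action names and the "MissingDataError" category are assumptions for illustration; see default_warning_status for the actual values.

```python
import warnings


class WarningDispatcher:
    """Hypothetical stand-in for the warn() logic of SpectrumFactory."""

    def __init__(self, status, verbose=True):
        self.warnings = status  # category name -> "warn", "error" or "ignore"
        self.verbose = verbose

    def warn(self, message, category="default", level=0):
        # Level 0 always passes; level 1 needs verbose >= 1 (True), and so on
        if level > int(self.verbose):
            return
        action = self.warnings.get(category, self.warnings.get("default", "warn"))
        if action == "ignore":
            return
        if action == "error":
            raise RuntimeError(f"{category}: {message}")
        warnings.warn(f"{category}: {message}")


factory = WarningDispatcher(
    {"default": "warn", "NegativeEnergiesWarning": "ignore", "MissingDataError": "error"}
)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    factory.warn("linestrength cutoff discarded some lines")  # emitted
    factory.warn("negative rotational energies", "NegativeEnergiesWarning")  # silenced
    factory.warn("debug detail", level=2)  # hidden: verbose is only True (1)
```

Changing one entry of the dictionary at runtime is enough to turn a warning into an error or to silence it, which is the benefit over calling warnings.warn() directly.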

See also

warnings

warnings[source]

Default warnings for SpectrumFactory. See default_warning_status

Type

dict

class Input[source]

Bases: radis.lbl.loader.ConditionDict

Holds Spectrum calculation input conditions, under the attribute input of SpectrumFactory.

Works like a dict, except that you can also access items as attributes:

v = sf.input.key   # equivalent to v = sf.input[key]

See also

params, misc

Tgas[source]

gas (translational) temperature. Overwritten by SpectrumFactory.eq/noneq_spectrum

Type

float

Tref[source]

reference temperature for line database.

Type

float

Trot[source]

rotational temperature. Overwritten by SpectrumFactory.eq/noneq_spectrum

Type

float

Tvib[source]

vibrational temperature. Overwritten by SpectrumFactory.eq/noneq_spectrum

Type

float

isotope[source]

isotope list. Can be ‘1,2,3’, etc. or ‘all’

Type

str

mole_fraction[source]

mole fraction

Type

float

molecule[source]

molecule

Type

str

overpopulation[source]

overpopulation

Type

dict

path_length[source]

path length (cm)

Type

float

pressure_mbar[source]

pressure (mbar)

Type

float

rot_distribution[source]

rotational levels distribution

Type

str

state[source]

electronic state

Type

str

wavenum_max[source]

wavenumber max (cm-1)

Type

float

wavenum_min[source]

wavenumber min (cm-1)

Type

float

KNOWN_DBFORMAT = ['hitran', 'hitemp', 'cdsd-hitemp', 'cdsd-4000', 'hitemp-radisdb', 'hdf5-radisdb'][source]

Known formats for Line Databases:

  • 'hitran' : [HITRAN-2016] original .par format

  • 'hitemp' : [HITEMP-2010] original format (same format as ‘hitran’)

  • 'cdsd-hitemp' : CDSD-HITEMP original format (CO2 only, same lines as HITEMP-2010)

  • 'cdsd-4000' : [CDSD-4000] original format (CO2 only)

  • 'hitemp-radisdb' : HITEMP under RADISDB format (pytables-HDF5 with RADIS column names).

  • 'hdf5-radisdb' : arbitrary HDF5 file with RADIS column names.

To install all databases manually, see the Configuration file and the list of databases.

Type

list

KNOWN_LVLFORMAT = ['radis', 'cdsd-pc', 'cdsd-pcN', 'cdsd-hamil', None][source]

Known formats for Energy Level Databases (used in non-equilibrium calculations):

  • 'radis': energies calculated with Dunham expansions by PartFunc_Dunham

  • 'cdsd-pc': energies read from precomputed CDSD energies for CO2, with viblvl=(p,c) convention. See PartFuncCO2_CDSDcalc

  • 'cdsd-pcN': energies read from precomputed CDSD energies for CO2, with viblvl=(p,c,N) convention. See PartFuncCO2_CDSDcalc

  • 'cdsd-hamil': energies read from precomputed CDSD energies for CO2, with viblvl=(p,c,J,N) convention, i.e., each rovibrational level can have a unique vibrational energy (this is needed when taking coupling terms into account). See PartFuncCO2_CDSDcalc

  • None: means you can only do Equilibrium calculations.

Type

list

KNOWN_PARFUNCFORMAT = ['cdsd', 'hapi'][source]

Known formats for partition function (tabulated files to read), or ‘hapi’ to fetch Partition Functions using HITRAN Python interface instead of reading a tabulated file.

Type

list
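The three lists above lend themselves to a simple membership check before loading anything. The sketch below copies the list values from this page; the function check_formats is hypothetical, and the fallback of parfuncfmt=None to 'hapi' follows the parameter description of load_databank().

```python
KNOWN_DBFORMAT = [
    "hitran", "hitemp", "cdsd-hitemp", "cdsd-4000", "hitemp-radisdb", "hdf5-radisdb",
]
KNOWN_LVLFORMAT = ["radis", "cdsd-pc", "cdsd-pcN", "cdsd-hamil", None]
KNOWN_PARFUNCFORMAT = ["cdsd", "hapi"]


def check_formats(dbformat, levelsfmt, parfuncfmt):
    """Validate the format names and resolve defaults (hypothetical helper)."""
    if parfuncfmt is None:
        parfuncfmt = "hapi"  # documented fallback
    checks = (
        ("format", dbformat, KNOWN_DBFORMAT),
        ("levelsfmt", levelsfmt, KNOWN_LVLFORMAT),
        ("parfuncfmt", parfuncfmt, KNOWN_PARFUNCFORMAT),
    )
    for label, value, known in checks:
        if value not in known:
            raise ValueError(f"{label}={value!r} is not one of {known}")
    return dbformat, levelsfmt, parfuncfmt


resolved = check_formats("hitran", None, None)  # equilibrium-only setup
```

Note that None is a valid entry of KNOWN_LVLFORMAT: it means that only equilibrium calculations are possible.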

class MiscParams[source]

Bases: radis.lbl.loader.ConditionDict

A class to hold Spectrum calculation descriptive parameters, under the attribute misc of SpectrumFactory.

Unlike Parameters, these parameters cannot influence the Spectrum output and will not be used when comparing Spectrum with existing, precomputed spectra in SpecDatabase

Works like a dict, except that you can also access items as attributes:

v = a.key

See also

input, params

chunksize[source]

divide line database in chunks of lines

Type

int

total_lines[source]

number of lines in database.

Type

int

warning_linestrength_cutoff[source]

raise a warning if the fraction of the total linestrength discarded by the cutoff is above this value

Type

float [0-1]

class Parameters[source]

Bases: radis.lbl.loader.ConditionDict

Holds Spectrum calculation computation parameters, under the attribute params of SpectrumFactory.

Works like a dict, except that you can also access items as attributes:

v = sf.params.key    # equivalent to v = sf.params["key"]

It can also be copied, deepcopied, and parallelized in multiprocessing.

See also

input, misc

broadening_max_width[source]

cutoff for lineshape calculation (cm-1). Overwritten by SpectrumFactory

Type

float

broadening_method[source]

"voigt", "convolve", "fft"

Type

str

cutoff[source]

linestrength cutoff (molecule/cm)

Type

float

dbformat[source]

format of Line Database. See KNOWN_DBFORMAT

Type

str

dbpath[source]

list of filepaths to Line Database

Type

list

dlm_log_pG[source]

Gaussian step for DLM lineshape database. Default _gaussian_step(0.01)

Type

float

dlm_log_pL[source]

Lorentzian step for DLM lineshape database. Default _lorentzian_step(0.01)

Type

float

include_neighbouring_lines[source]

if True, includes the contribution of off-range, neighbouring lines because of lineshape broadening. Default True.

Type

bool

levelsfmt[source]

format of Energy Database. See KNOWN_LVLFORMAT

Type

str

optimization[source]

"simple", "min-RMS", None

Type

str

parfuncfmt[source]

format of tabulated Partition Functions. See KNOWN_PARFUNCFORMAT

Type

str

parfuncpath[source]

filepath to tabulated Partition Functions

Type

str

pseudo_continuum_threshold[source]

threshold to assign lines in pseudo continuum. Overwritten in SpectrumFactory

Type

float

wavenum_max_calc[source]

maximum calculated wavenumber (cm-1) initialized by SpectrumFactory

Type

float

wavenum_min_calc[source]

minimum calculated wavenumber (cm-1) initialized by SpectrumFactory

Type

float

waveunit[source]

waverange unit: should be cm-1.

Type

str

wstep[source]

spectral resolution (cm-1)

Type

float

df_metadata = ['Ia', 'molar_mass', 'Qref', 'Qvib', 'Q'][source]

metadata of line DataFrames df0, df1. @dev: when having only 1 molecule, 1 isotope, these parameters are constant for all rovibrational lines. Thus, it’s faster and much more memory efficient to transport them as attributes of the DataFrame rather than columns. The syntax is the same, thus the operations do not change, i.e:

k_b / df.molar_mass

will work whether molar_mass is a float or a column.

Warning

However, in the current Pandas implementation of DataFrame, attributes are lost whenever the DataFrame is recreated, transposed, or pickled.

Thus, we use transfer_metadata() to keep the attributes after an operation, and expand_metadata() to make them columns before a serializing operation (ex: multiprocessing). @dev: all of that is a high-end optimization. Users should not deal with internal DataFrames.
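The attribute loss and its repair can be illustrated without pandas. In the sketch below, Table is a list subclass standing in for a DataFrame and transfer_attributes is a hypothetical helper; it does not reproduce the actual signature or behaviour of transfer_metadata().

```python
class Table(list):
    """Stand-in for a DataFrame: a container that can carry extra attributes."""


def transfer_attributes(source, target, names):
    """Copy the listed attributes from source to target when they exist."""
    for name in names:
        if hasattr(source, name):
            setattr(target, name, getattr(source, name))


df_metadata = ["Ia", "molar_mass", "Qref", "Qvib", "Q"]

df0 = Table([1.0, 2.0, 3.0])
df0.molar_mass = 28.0  # constant for all lines, so stored once as an attribute

# Building a new object from the old one does not carry the attribute over
df1 = Table(x for x in df0 if x > 1.0)
lost = not hasattr(df1, "molar_mass")

transfer_attributes(df0, df1, df_metadata)
```

This is also why in-place operations are safer on such objects: they keep the same object, and therefore its attributes.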

References

https://stackoverflow.com/q/13250499/5622825

Type

list

drop_all_but_these = ['id', 'iso', 'wav', 'int', 'airbrd', 'selbrd', 'Tdpair', 'Tdpsel', 'Pshft', 'El'][source]

drop all columns but these if using drop_columns='all' in load_databank

Note: nonequilibrium calculations won't be possible anymore, and it won't be possible to identify lines with line_survey()


Type

list

drop_auto_columns_for_dbformat = {'cdsd-4000': ['wang2'], 'cdsd-hitemp': ['wang2', 'lsrc'], 'hdf5-radisdb': [], 'hitemp': ['ierr', 'iref', 'lmix', 'gp', 'gpp'], 'hitemp-radisdb': [], 'hitran': ['ierr', 'iref', 'lmix', 'gp', 'gpp']}[source]

drop these columns if using drop_columns='auto' in load_databank. Based on the value of dbformat=, some of these columns won’t be used.


Type

dict

drop_auto_columns_for_levelsfmt = {'radis': [], 'cdsd-pc': ['v1u', 'v2u', 'l2u', 'v3u', 'ru', 'v1l', 'v2l', 'l2l', 'v3l', 'rl'], 'cdsd-pcN': ['v1u', 'v2u', 'l2u', 'v3u', 'ru', 'v1l', 'v2l', 'l2l', 'v3l', 'rl'], 'cdsd-hamil': ['v1u', 'v2u', 'l2u', 'v3u', 'ru', 'v1l', 'v2l', 'l2l', 'v3l', 'rl'], None: []}[source]

drop these columns if using drop_columns='auto' in load_databank. Based on the value of levelsfmt=, some of these columns won’t be used.


Type

dict
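How the three variables above could combine with the drop_columns argument is sketched below. The function columns_to_drop and the shortened dictionaries are assumptions for illustration (only a few entries are copied from this page); the actual selection logic in load_databank may differ.

```python
# Excerpts of the module-level variables documented above
DROP_AUTO_DBFORMAT = {
    "hitran": ["ierr", "iref", "lmix", "gp", "gpp"],
    "cdsd-4000": ["wang2"],
}
DROP_AUTO_LEVELSFMT = {"radis": [], None: []}
KEEP_IF_ALL = ["id", "iso", "wav", "int", "airbrd", "selbrd", "Tdpair", "Tdpsel", "Pshft", "El"]


def columns_to_drop(columns, drop_columns, dbformat, levelsfmt):
    """Return the columns to remove, in their original order (sketch)."""
    if drop_columns == "auto":
        unwanted = set(DROP_AUTO_DBFORMAT[dbformat]) | set(DROP_AUTO_LEVELSFMT[levelsfmt])
    elif drop_columns == "all":
        unwanted = set(columns) - set(KEEP_IF_ALL)
    else:
        unwanted = set(drop_columns)  # explicit list; [] drops nothing
    return [c for c in columns if c in unwanted]


columns = ["id", "iso", "wav", "int", "El", "ierr", "iref", "vu", "vl"]
auto = columns_to_drop(columns, "auto", "hitran", "radis")
everything = columns_to_drop(columns, "all", "hitran", "radis")
nothing = columns_to_drop(columns, [], "hitran", "radis")
```

In 'all' mode the quantum number columns vu and vl are dropped as well, which illustrates why nonequilibrium calculations and line identification are no longer possible in that mode.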

format_paths(s)[source]

escape all special characters.