Module to host the databank loading / database initialisation parts of
SpectrumFactory. This is done through SpectrumFactory
inheritance of the DatabankLoader class defined here.
RADIS includes automatic rebuilding of deprecated cache files, plus a global variable
to force regenerating them after a given version. See the "OLDEST_COMPATIBLE_VERSION"
key in radis.config.
A class to hold Spectrum calculation input conditions
(Input), computation parameters
(Parameters), or miscellaneous parameters
(MiscParams).
Works like a dict, except that items can also be accessed as attributes:
    v = a.key  # equivalent to v = a[key]
It can also be copied, deep-copied, and used in parallel with multiprocessing.
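The behaviour described above can be illustrated with a minimal sketch. The class name `AttrDict` is hypothetical and this is not the RADIS implementation; it only shows how a dict subclass can expose its items as attributes while remaining copyable and picklable.

```python
import copy
import pickle


class AttrDict(dict):
    """Hypothetical stand-in: a dict whose items are also attributes."""

    def __getattr__(self, key):
        # Only called when normal attribute lookup fails
        try:
            return self[key]
        except KeyError:
            raise AttributeError(key) from None

    def __setattr__(self, key, value):
        # Attribute assignment is stored as a dict item
        self[key] = value


a = AttrDict(Tgas=3000, pressure=1.0)
a.wstep = 0.01                     # same as a["wstep"] = 0.01
b = copy.deepcopy(a)               # deep copies behave as for a plain dict
c = pickle.loads(pickle.dumps(a))  # pickling is what multiprocessing relies on
print(a.Tgas, b["wstep"], type(c).__name__)
```

Raising `AttributeError` for unknown keys matters here: `copy` and `pickle` probe for optional special methods, and they rely on that exception to fall back to the default behaviour.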
Notes
For developers:
Parameters and Input could also have simply derived from the (object) class,
but they would have lacked some convenient functions implemented for dict,
for instance how to be pickled / unpickled.
Returns the variables (and their values) contained in the
dictionary, minus some based on their type. Numpy arrays, dictionaries
and pandas DataFrames are removed. None is removed in general, except
for some keys ('cutoff', 'truncation').
Tuples are converted to strings.
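The filtering rules above can be sketched as follows. The function name `summarize_conditions` and the `keep_none` argument are hypothetical, and array and DataFrame values are detected by type name only so that the sketch needs no third-party import.

```python
def summarize_conditions(conditions, keep_none=("cutoff", "truncation")):
    """Hypothetical filter following the rules described above."""
    bulky = {"ndarray", "DataFrame"}  # matched by type name to avoid imports
    out = {}
    for key, value in conditions.items():
        if isinstance(value, dict) or type(value).__name__ in bulky:
            continue  # dictionaries, arrays and DataFrames are dropped
        if value is None and key not in keep_none:
            continue  # None is dropped, except for a few meaningful keys
        if isinstance(value, tuple):
            value = str(value)  # tuples are stored as strings
        out[key] = value
    return out


summary = summarize_conditions(
    {"Tgas": 300, "cutoff": None, "medium": None, "levels": {1: "x"}, "isotope": (1, 2)}
)
print(summary)
```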
Initial line database after loading.
If for any reason you want to manipulate the line database manually (for instance, keeping only lines emitted
by a particular level), you need to access the df0 attribute of
SpectrumFactory.
Warning
Never overwrite the df0 attribute, otherwise some metadata may be lost in the process.
Only use in-place operations. If reducing the number of lines, add
a df0.reset_index()
For instance:
    from radis import SpectrumFactory
    sf = SpectrumFactory(wavenum_min=2150.4, wavenum_max=2151.4, pressure=1, isotope=1)
    sf.load_databank('HITRAN-CO-TEST')
    sf.df0.drop(sf.df0[sf.df0.vu != 1].index, inplace=True)  # keep lines emitted by v'=1 only
    sf.eq_spectrum(Tgas=3000, name='vu=1').plot()
df0 contains the lines as they are loaded from the database.
df1 is generated during the spectrum calculation, after the
line database reduction steps, population calculation, and scaling of intensity and broadening parameters
with the calculated conditions.
Fetch the latest files from [HITRAN-2020], [HITEMP-2010] (or newer),
[ExoMol-2020] or [GEISA-2020] or [Kurucz-2017], and store them locally in memory-mapping
formats for extremely fast access.
Parameters:
source (str) – Which database to use. Options are 'hitran', 'hitemp', 'exomol', 'geisa', 'kurucz', 'nist'.
database (str) – If fetching from HITRAN, 'full' downloads the full database and registers it, 'range' downloads only the lines in the range of the molecule.
If fetching from HITEMP, Kurucz, or NIST, only 'full' is available.
If fetching from ExoMol, use this parameter to choose which database to use. Keep 'default' to use the recommended one.
Default is 'full'.
parfunc: str or None
Path to a tabulated partition function file to use. This argument only affects molecules.
levels: dict or None
Path to energy levels (needed for non-eq calculations). Format: {1:path_to_levels_iso_1, 3:path_to_levels_iso3}. This argument only affects molecules.
Default is None.
levelsfmt: str or None
How to read the previous file. Known formats: 'cdsd-pc', 'radis' or any of KNOWN_LVLFORMAT. This argument only affects molecules.
Default is 'radis'.
load_energies: bool
If False, don't load energy levels. This means that nonequilibrium spectra cannot be calculated, but it saves some memory. This argument only affects molecules.
Default is False.
include_neighbouring_lines: bool
If True, includes off-range, neighbouring lines that contribute because of lineshape broadening.
Default is True.
parse_local_global_quanta: bool or 'auto'
If True, parses the HITRAN/HITEMP 'glob' and 'loc' columns to extract quanta identifying the lines.
Default is True.
drop_non_numeric: bool
If True, non-numeric columns are dropped. This improves performance.
Default is True.
db_use_cached: bool or 'regen'
Use cached database if available.
memory_mapping_engine: str
Which library to use to read HDF5 files. Options are 'pytables', 'vaex', 'feather'.
Default is 'default'.
parallel: bool
If True, uses joblib.parallel to load database with multiple processes.
Default is True.
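The effect of include_neighbouring_lines can be pictured as widening the loaded range on each side, so that lines centred just outside the requested range can still contribute through their broadened lineshape. The sketch below assumes a margin in cm-1 (this documentation mentions a neighbour_lines parameter for that limit); the function names and the line positions are hypothetical, and this is not the RADIS code.

```python
def loading_range(wavenum_min, wavenum_max, margin=0.0, include_neighbouring_lines=True):
    """Return the (min, max) range of line positions to load, in cm-1."""
    if not include_neighbouring_lines:
        margin = 0.0  # strictly the requested range
    return wavenum_min - margin, wavenum_max + margin


def select_lines(positions, wmin, wmax):
    """Keep the line positions that fall inside [wmin, wmax]."""
    return [w for w in positions if wmin <= w <= wmax]


positions = [2149.0, 2150.5, 2151.0, 2152.0, 2160.0]  # made-up line centres, cm-1

wmin, wmax = loading_range(2150.4, 2151.4, margin=1.0)
with_neighbours = select_lines(positions, wmin, wmax)

lo, hi = loading_range(2150.4, 2151.4, margin=1.0, include_neighbouring_lines=False)
without_neighbours = select_lines(positions, lo, hi)

print(with_neighbours, without_neighbours)
```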
Get all parameters defined in the SpectrumFactory.
Other Parameters:
ignore_misc (boolean) – if True, all attributes considered as Factory 'descriptive'
parameters, as defined in get_conditions(), are ignored when
comparing the database to the current factory conditions. These should
only be attributes that have no impact on the Spectrum
produced by the factory. Default False
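A minimal sketch of such a comparison is given below. The helper `conditions_match`, its `descriptive_keys` argument, and the condition names are hypothetical; the point is only that descriptive keys are left out of the comparison when ignore_misc is set.

```python
_MISSING = object()  # sentinel so that a missing key never equals a stored None


def conditions_match(stored, current, descriptive_keys=(), ignore_misc=False):
    """Compare two condition dicts, optionally skipping descriptive keys."""
    skipped = set(descriptive_keys) if ignore_misc else set()
    keys = (set(stored) | set(current)) - skipped
    return all(stored.get(k, _MISSING) == current.get(k, _MISSING) for k in keys)


stored = {"Tgas": 300, "wstep": 0.01, "chunksize": None}
current = {"Tgas": 300, "wstep": 0.01, "chunksize": 1000}

strict = conditions_match(stored, current, descriptive_keys=("chunksize",))
relaxed = conditions_match(stored, current, descriptive_keys=("chunksize",), ignore_misc=True)
print(strict, relaxed)
```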
Initialise the databank parameters, but only load the databank when needed.
The databank is reloaded by _check_line_databank().
Takes the same input parameters as load_databank():
Parameters:
name (a section name specified in your ~/radis.json) – the radis.json file has to be created in your HOME (Unix) / User (Windows) directory. If
not None, all other arguments are discarded.
Note that all files in the database will be loaded and it may take some
time. It is better to limit the database size if you already know what
range you need. See Configuration file and
DBFORMAT for the expected
~/radis.json format
Other Parameters:
path (str, list of str, None) – list of database files, or name of a predefined database in the
Configuration file (json)
Accepts wildcards * to select multiple files
format ('hitran', 'cdsd-hitemp', 'cdsd-4000', or any of KNOWN_DBFORMAT) – database type. 'hitran' for HITRAN/HITEMP, 'cdsd-hitemp'
and 'cdsd-4000' for the different CDSD versions. Default 'hitran'
parfuncfmt – format used to read the tabulated partition function file. If hapi, then
HAPI (HITRAN Python interface) [1] is used to retrieve them (valid if
your database is HITRAN data). HAPI is embedded into RADIS. Check the
version. If parfuncfmt is None, it is inferred from format (e.g. hapi for hitran, exomol for exomol). Default None.
parfunc (filename or None) – path to the tabulated partition function to use, or to a
hapi.py file. If not given, the hapi.py embedded in RADIS is used (check version)
levels (dict of str or None) – path to energy levels (needed for non-eq calculations). Format:
{1:path_to_levels_iso_1, 3:path_to_levels_iso3}. Default None
levelsfmt ('cdsd-pc', 'radis' (or any of KNOWN_LVLFORMAT) or None) – how to read the previous file. Known formats: (see KNOWN_LVLFORMAT).
If radis, energies are calculated using the diatomic constants in the radis.db database,
if available for the given molecule. Look up references there.
If None, non-equilibrium calculations are not possible. Default 'radis'.
db_use_cached (boolean, or None) – if True, a pandas-readable csv file is generated on first access,
and later used. This saves on the datatype cast and conversion and
improves performance considerably. Be sure to delete these files
to regenerate them if you happen to change the database. If 'regen',
existing cached files are removed and regenerated.
It is also used to load energy levels from the .h5 cache file if it exists.
If None, the value given on Factory creation is used. Default None
load_energies (boolean) – if False, don't load energy levels. This means that nonequilibrium
spectra cannot be calculated, but it saves some memory. Default True
include_neighbouring_lines (bool) – if True, includes off-range, neighbouring lines that contribute
because of lineshape broadening. The neighbour_lines
parameter is used to determine the limit. Default True.
drop_columns (list) – column names to drop from the Line DataFrame after loading the file.
Not recommended, unless you explicitly want to drop information
(for instance if dealing with too large databases). If [], nothing
is dropped. If 'auto', parameters considered unnecessary
are dropped. See drop_auto_columns_for_dbformat
and drop_auto_columns_for_levelsfmt.
Default 'auto'.
load_columns (list, 'all', 'equilibrium', 'noneq') – column names to load.
If 'equilibrium', only load the columns required for equilibrium
calculations. If 'noneq', also load the columns required for
non-LTE calculations. See drop_all_but_these.
If 'all', load everything. Note that for performance, it is
better to load only certain columns rather than loading them all
and dropping them with drop_columns.
Default 'equilibrium'.
Warning
if using 'equilibrium', not all parameters will be available
for a Spectrum line_survey().
Other arguments are related to how the files are opened.
Notes
Useful in conjunction with init_database()
when dealing with large line databanks, when some of the spectra may have
been precomputed in a spectrum database (SpecDatabase).
Note that any previously loaded databank is discarded on the method call.
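The "initialise now, load when needed" behaviour described for this method can be sketched as below. The class `LazyDatabank` and its method names are hypothetical and do not reproduce the RADIS internals; the sketch only shows that initialising stores the loading arguments, discards anything loaded before, and defers the actual read until the lines are first requested.

```python
class LazyDatabank:
    """Hypothetical holder that defers loading until the lines are needed."""

    def __init__(self, loader):
        self._loader = loader  # callable that actually reads the files
        self._params = None    # arguments remembered by init()
        self._lines = None     # loaded lines, or None if not loaded yet

    def init(self, **params):
        # Remember how to load, and discard anything loaded before
        self._params = params
        self._lines = None

    def lines(self):
        # Load on first access only
        if self._lines is None:
            if self._params is None:
                raise RuntimeError("init() must be called before lines()")
            self._lines = self._loader(**self._params)
        return self._lines


calls = []


def fake_loader(path):
    calls.append(path)  # record each real load
    return [2150.5, 2151.0]


db = LazyDatabank(fake_loader)
db.init(path="my_lines.par")
calls_after_init = list(calls)  # nothing has been read yet
db.lines()
db.lines()                      # second access reuses the loaded lines
print(calls_after_init, calls)
```

With this pattern, a spectrum that is retrieved from a precomputed database never triggers the (possibly slow) line databank load.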
Init a SpecDatabase folder in
path to later store our spectra. Spectra can also be automatically
retrieved from the database instead of being calculated.
Parameters:
path (str) – path to the database folder. If it doesn't exist, it is created
Accepts wildcards * to select multiple files
autoretrieve (boolean, or 'force') – if True, a database lookup is performed whenever a new spectrum
is calculated. If the spectrum already exists then it is retrieved
from the database instead of being calculated. Spectra are considered
the same if all the stored conditions fit. If set to 'force', an error
is raised if the spectrum is not found in the database (use it for
debugging). Default True
autoupdate (boolean) – if True, all spectra calculated by this Factory are automatically
exported to the database. Default True (but only if init_database is
explicitly called by the user)
add_info (list, or None/False) – append these parameters and their values if they are in conditions.
Default ['Tvib','Trot']
add_date (str, or None/False) – adds the date in strftime format to the beginning of the filename.
Default '%Y%m%d'
compress (boolean, or 2) – if True, Spectra are read and written in binary format. This is faster,
and takes less memory space. Default True.
If 2, additionally remove all redundant quantities.
Other Parameters:
**kwargs (**dict) – arguments sent to SpecDatabase initialization.
Returns:
db – the database where spectra will be stored or retrieved
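The interplay of autoretrieve and autoupdate described above can be sketched with an in-memory store. `TinySpecStore` and `get_spectrum` are hypothetical names and the "spectrum" is just a string; the real SpecDatabase stores files on disk and is not reproduced here.

```python
class TinySpecStore:
    """Hypothetical in-memory stand-in for a spectrum database."""

    def __init__(self):
        self._items = []  # list of (conditions, spectrum) pairs

    def find(self, conditions):
        for stored, spectrum in self._items:
            if stored == conditions:  # same spectrum if all conditions fit
                return spectrum
        return None

    def add(self, conditions, spectrum):
        self._items.append((dict(conditions), spectrum))


def get_spectrum(store, conditions, compute, autoretrieve=True, autoupdate=True):
    if autoretrieve:  # True or 'force'
        found = store.find(conditions)
        if found is not None:
            return found  # retrieved instead of calculated
        if autoretrieve == "force":
            raise LookupError("spectrum not found in database")
    spectrum = compute(conditions)
    if autoupdate:
        store.add(conditions, spectrum)  # export for later retrieval
    return spectrum


computed = []


def compute(conditions):
    computed.append(conditions["Tgas"])
    return "spectrum at {} K".format(conditions["Tgas"])


store = TinySpecStore()
first = get_spectrum(store, {"Tgas": 300}, compute)
second = get_spectrum(store, {"Tgas": 300}, compute)  # served from the store
print(first, second, computed)
```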
Loads a databank from its short name in the Configuration file (json), or by manually setting all
attributes.
Databank includes:
- lines
- partition function & format (tabulated or calculated)
- (optional) energy levels, format
Parameters:
name (a section name specified in your ~/radis.json) – the radis.json file has to be created in your HOME (Unix) / User (Windows) directory. If
not None, all other arguments are discarded.
Note that all files in the database will be loaded and it may take some
time. It is better to limit the database size if you already know what
range you need. See Configuration file and
DBFORMAT for the expected
~/radis.json format
Other Parameters:
path (str, list of str, None) – list of database files, or name of a predefined database in the
Configuration file (json)
Accepts wildcards * to select multiple files
format ('hitran', 'cdsd-hitemp', 'cdsd-4000', or any of KNOWN_DBFORMAT) – database type. 'hitran' for HITRAN/HITEMP, 'cdsd-hitemp'
and 'cdsd-4000' for the different CDSD versions. Default 'hitran'
parfunc (filename or None) – path to the tabulated partition function to use.
If not given, then the hapi.py embedded in RADIS is used (check version). This argument only affects molecules.
levels (dict of str or None) – path to energy levels (needed for non-eq calculations). Format:
{1:path_to_levels_iso_1, 3:path_to_levels_iso3}. Default None.
This argument only affects molecules.
levelsfmt ('cdsd-pc', 'radis' (or any of KNOWN_LVLFORMAT) or None) – how to read the previous file. Known formats: (see KNOWN_LVLFORMAT).
If radis, energies are calculated using the diatomic constants in the radis.db database,
if available for the given molecule. Look up references there.
If None, non-equilibrium calculations are not possible. Default 'radis'.
This argument only affects molecules.
db_use_cached (boolean, or None) – if True, a pandas-readable csv file is generated on first access,
and later used. This saves on the datatype cast and conversion and
improves performance considerably. Be sure to delete these files
to regenerate them if you happen to change the database. If 'regen',
existing cached files are removed and regenerated.
It is also used to load energy levels from the .h5 cache file if it exists.
If None, the value given on Factory creation is used. Default True
load_energies (boolean) – if False, don't load energy levels. This means that nonequilibrium
spectra cannot be calculated, but it saves some memory. Default True
This argument only affects molecules.
include_neighbouring_lines (bool) – if True, includes off-range, neighbouring lines that contribute
because of lineshape broadening. The neighbour_lines
parameter is used to determine the limit. Default True.
Other arguments are related to how the files are opened:
drop_columns (list) – column names to drop from the Line DataFrame after loading the file.
Not recommended, unless you explicitly want to drop information
(for instance if dealing with too large databases). If [], nothing
is dropped. If 'auto', parameters considered useless
are dropped. See drop_auto_columns_for_dbformat
and drop_auto_columns_for_levelsfmt.
If 'all', parameters considered unnecessary for equilibrium calculations
are dropped, including all information about lines that could be otherwise
available in Spectrum() method.
Warning: nonequilibrium calculations are not possible in this mode.
Default 'auto'.
load_columns (list, 'all', 'equilibrium', 'noneq') – column names to load.
If 'equilibrium', only load the columns required for equilibrium
calculations. If 'noneq', also load the columns required for
non-LTE calculations. See drop_all_but_these.
If 'all', load everything. Note that for performance, it is
better to load only certain columns rather than loading them all
and dropping them with drop_columns.
Default 'equilibrium'.
Warning
if using 'equilibrium', not all parameters will be available
for a Spectrum line_survey().
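The path argument listed above accepts a single string or a list of strings, possibly containing the * wildcard. A minimal sketch of how such an argument could be expanded with the standard glob module is shown below; the helper name `expand_paths` and the file names are hypothetical, and RADIS may resolve paths differently.

```python
import glob
import os
import tempfile


def expand_paths(path):
    """Expand a str or list of str (possibly with '*') into sorted file paths."""
    patterns = [path] if isinstance(path, str) else list(path)
    files = []
    for pattern in patterns:
        matches = sorted(glob.glob(os.path.expanduser(pattern)))
        if not matches:
            raise FileNotFoundError("no file matches {!r}".format(pattern))
        files.extend(matches)
    return files


# Demonstration on a temporary folder with made-up file names
with tempfile.TemporaryDirectory() as folder:
    for name in ("co2_a.par", "co2_b.par", "h2o.par"):
        open(os.path.join(folder, name), "w").close()
    found = expand_paths(os.path.join(folder, "co2_*.par"))
    names = [os.path.basename(f) for f in found]

print(names)
```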
isotope (int, or list) – isotope number, sorted in terrestrial abundance
abundance (float, or list)
Examples
    from radis import SpectrumFactory
    sf = SpectrumFactory(
        2284.2,
        2284.6,
        wstep=0.001,  # cm-1
        pressure=20 * 1e-3,  # bar
        mole_fraction=400e-6,
        molecule="CO2",
        isotope="1,2",
        verbose=False,
    )
    sf.load_databank("HITEMP-CO2-TEST")
    print("Abundance of CO2[1,2]", sf.get_abundance("CO2", [1, 2]))
    sf.eq_spectrum(2000).plot("abscoeff")
    #%% Set the abundance of CO2(626) to 0.8; and the abundance of CO2(636) to 0.2 (arbitrary):
    sf.set_abundance("CO2", [1, 2], [0.8, 0.2])
    print("New abundance of CO2[1,2]", sf.get_abundance("CO2", [1, 2]))
    sf.eq_spectrum(2000).plot("abscoeff", nfig="same")
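As a rough picture of what changing an isotopic abundance implies, the sketch below assumes that tabulated line intensities are already weighted by a reference abundance (as is the case for HITRAN-style intensities, which include the terrestrial abundance), so that a new abundance rescales them by the ratio new/old. The helper name and the numbers are hypothetical, and this is not necessarily how set_abundance is implemented.

```python
def rescale_intensities(intensities, old_abundance, new_abundance):
    """Scale abundance-weighted line intensities from one abundance to another."""
    if old_abundance <= 0:
        raise ValueError("old_abundance must be positive")
    ratio = new_abundance / old_abundance
    return [s * ratio for s in intensities]


# Made-up intensities for one isotope, weighted by an abundance of 0.5
scaled = rescale_intensities([1.0e-20, 2.0e-21], old_abundance=0.5, new_abundance=0.25)
print(scaled)
```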
Construct an interpolator or calculator for atomic partition functions and store it in the parsum attribute
Parameters:
pfsource (string) – The source for the partition function tables for an interpolator or energy level tables for a calculator. Sources implemented so far are 'barklem' and 'kurucz' for the former, and 'nist' for the latter. 'default' is currently 'nist'.
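The difference between the two kinds of object can be sketched as follows: a calculator sums Boltzmann factors over an energy level table, Q(T) = sum of g * exp(-hc E / (k T)), while an interpolator reads Q from a precomputed (T, Q) table. The function names, the linear interpolation scheme, and all numerical values below are illustrative assumptions, not the RADIS implementation.

```python
import bisect
import math

HC_OVER_K = 1.438776877  # second radiation constant hc/k, in cm*K


def partition_function(levels, T):
    """Calculator: sum g * exp(-hc*E / (k*T)) over (g, E) pairs, E in cm-1."""
    return sum(g * math.exp(-HC_OVER_K * E / T) for g, E in levels)


def interpolate_partition_function(table_T, table_Q, T):
    """Interpolator: linear interpolation in a sorted (T, Q) table."""
    if not table_T[0] <= T <= table_T[-1]:
        raise ValueError("T outside the tabulated range")
    i = max(1, bisect.bisect_left(table_T, T))  # index of the upper bracket
    t0, t1 = table_T[i - 1], table_T[i]
    q0, q1 = table_Q[i - 1], table_Q[i]
    return q0 + (q1 - q0) * (T - t0) / (t1 - t0)


levels = [(2, 0.0), (4, 100.0)]  # made-up (degeneracy, energy in cm-1) pairs
Q_calc = partition_function(levels, 3000.0)
Q_interp = interpolate_partition_function([1000.0, 2000.0], [2.0, 4.0], 1500.0)
print(Q_calc, Q_interp)
```

A calculator works at any temperature but needs the full level list; an interpolator is cheap but only valid inside the tabulated temperature range, hence the range check.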
Trigger a warning, an error or just ignore based on the value
defined in the warnings
dictionary.
The warnings can thus be deactivated selectively by setting the SpectrumFactory warnings attribute.
category (str) – one of the keys of self.warnings. See warnings
level (int) – warning level. Warnings are only printed when the verbose level is at least
the warning level, i.e., warnings of level 1 appear only
if verbose==True, warnings of level 2 appear only
for verbose>=2, etc. Warnings of level 0 appear all the time.
Default 0
Examples
    if not ((df.Erotu > tol).all() and (df.Erotl > tol).all()):
        self.warn(
            "There are negative rotational energies in the database",
            "NegativeEnergiesWarning",
        )
Notes
All warnings in the SpectrumFactory should call this method rather
than the default warnings.warn() method, because it allows easier runtime
modification of how to deal with warnings.
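The dispatch logic described for this method can be sketched with the standard warnings module. The function `dispatch_warning`, the action strings 'warn' / 'error' / 'ignore', and the category names other than NegativeEnergiesWarning are assumptions made for illustration; the actual method may support other actions and raise dedicated exception types.

```python
import warnings


def dispatch_warning(message, category, actions, level=0, verbose=True):
    """Warn, raise, or ignore depending on actions[category] (hypothetical helper)."""
    action = actions.get(category, "warn")
    if action == "ignore":
        return
    if action == "error":
        raise RuntimeError("{}: {}".format(category, message))
    # 'warn': only shown when the verbosity reaches the warning level
    if int(verbose) >= level:
        warnings.warn("{}: {}".format(category, message), UserWarning, stacklevel=2)


actions = {
    "NegativeEnergiesWarning": "warn",
    "ExampleIgnoredWarning": "ignore",  # hypothetical category
    "ExampleFatalWarning": "error",     # hypothetical category
}

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    dispatch_warning("negative energies", "NegativeEnergiesWarning", actions)  # shown
    dispatch_warning("not shown", "ExampleIgnoredWarning", actions)            # ignored
    dispatch_warning("details", "NegativeEnergiesWarning", actions, level=2)   # verbose too low

shown = [str(w.message) for w in caught]
print(shown)
```

Because the action is looked up in a plain dictionary at call time, changing one entry at runtime is enough to silence a category or turn it into an error.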
Holds Spectrum calculation input conditions, under the attribute
input of
SpectrumFactory.
Works like a dict, except that items can also be accessed as attributes.
'cdsd-hamil': energies read from precomputed CDSD energies for CO2, with the
viblvl=(p,c,J,N) convention, i.e., each rovibrational level can have a
unique vibrational energy (this is needed when taking coupling terms into account).
See PartFuncCO2_CDSDcalc
None: means you can only do equilibrium calculations.
Known formats for partition function (tabulated files to read), or 'hapi'
to fetch partition functions using the HITRAN Python interface instead of reading
a tabulated file.
A class to hold Spectrum calculation descriptive parameters, under the attribute
misc of
SpectrumFactory.
Unlike Parameters, these parameters cannot influence the
Spectrum output and will not be used when comparing Spectrum with existing,
precomputed spectra in SpecDatabase
Works like
a dict, except that items can also be accessed as attributes.
Holds Spectrum calculation computation parameters, under the attribute
params of
SpectrumFactory.
Works like
a dict, except that items can also be accessed as attributes:
    v = sf.params.key  # equivalent to v = sf.params[key]
It can also be copied, deep-copied, and used in parallel with multiprocessing.
Metadata of the line DataFrames df0 and
df1.
@dev: when there is only 1 molecule and 1 isotope, these parameters are
constant for all rovibrational lines. Thus, it's faster and much more
memory efficient to transport them as attributes of the DataFrame
rather than columns. The syntax is the same, so the operations do
not change, i.e.:
    k_b / df.molar_mass
will work whether molar_mass is a float or a column.
Warning
However, in the current Pandas implementation of DataFrame,
attributes are lost whenever the DataFrame is recreated, transposed, or
pickled.
Thus, we use transfer_metadata() to keep
the attributes after an operation, and expand_metadata()
to make them columns before a serializing operation (e.g. multiprocessing).
@dev: all of that is a high-end optimization. Users should not deal
with internal DataFrames.
Drop all columns but these if using drop_columns='all' in load_databank.
Note: nonequilibrium calculations won't be possible anymore, and it won't be possible
to identify lines with line_survey().