radis.api.hitranapi module

Summary

HITRAN database parser

Routine Listing

class HITRANDatabaseManager(name, molecule, local_databases, engine='default', extra_params=None, verbose=True, parallel=True)[source]

Bases: DatabaseManager

download_and_parse(local_file, cache=True, parse_quanta=True, add_HITRAN_uncertainty_code=False)[source]

Download from HITRAN and parse into local_file. Also add metadata.

Overrides radis.api.dbmanager.DatabaseManager.download_and_parse(), which downloads from a list of URLs, because here [HAPI] is used to download the files.

Parameters:
  • opener (an opener with an .open() command)

  • gfile (file handler. Filename: for info)

get_filenames()[source]

Get the names of all files in the database (even if not downloaded yet).

Parameters:

return_reg_urls (boolean) – When the database is registered, whether to return the registered URLs (True) or None (False).

See also

get_files_to_download()

register(download_files)[source]

Register in ~/radis.json.

cast_all_to_int64(df, keys)[source]

Convert strings to int64.

cast_to_int64_with_missing_values(dg, keys, dataframe_type='pandas')[source]

Replace missing values of int64 columns with -1.
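
As an illustration of the idea only (this is not the radis implementation), the sketch below fills missing entries with -1 before casting, since a NumPy int64 column cannot hold NaN. The helper name fill_and_cast_int64 and the column name v1u are hypothetical.

```python
import pandas as pd


def fill_and_cast_int64(df, keys, missing=-1):
    """Hypothetical helper: fill missing values in `keys`, then cast to int64."""
    out = df.copy()
    for key in keys:
        # to_numeric turns empty or unparsable entries into NaN
        numeric = pd.to_numeric(out[key], errors="coerce")
        out[key] = numeric.fillna(missing).astype("int64")
    return out


# Synthetic data: one quantum-number column with a missing label
df = pd.DataFrame({"v1u": ["0", "1", None], "wav": [2000.1, 2000.2, 2000.3]})
result = fill_and_cast_int64(df, ["v1u"])
```

Using a sentinel such as -1 keeps the column in a plain integer dtype, at the cost of having to filter the sentinel out before any arithmetic on that column.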

columns_2004 = {
    'A': ('a10', <class 'float'>, 'Einstein A coefficient', 's-1'),
    'El': ('a10', <class 'float'>, 'lower-state energy', 'cm-1'),
    'Pshft': ('a8', <class 'float'>, 'air pressure-induced line shift at 296K', 'cm-1.atm-1'),
    'Tdpair': ('a4', <class 'float'>, 'temperature-dependance exponent for Gamma air', ''),
    'airbrd': ('a5', <class 'float'>, 'air-broadened half-width at 296K', 'cm-1.atm-1'),
    'globl': ('a15', <class 'str'>, 'electronic and vibrational global lower quanta', ''),
    'globu': ('a15', <class 'str'>, 'electronic and vibrational global upper quanta', ''),
    'gp': ('a7', <class 'float'>, 'upper state degeneracy', ''),
    'gpp': ('a7', <class 'float'>, 'lower state degeneracy', ''),
    'id': ('a2', <class 'int'>, 'Molecular number', ''),
    'ierr': ('a6', <class 'str'>, 'ordered list of indices corresponding to uncertainty estimates of transition parameters', ''),
    'int': ('a10', <class 'float'>, 'intensity at 296K', 'cm-1/(molecule/cm-2)'),
    'iref': ('a12', <class 'str'>, 'ordered list of reference identifiers for transition parameters', ''),
    'iso': ('a1', <class 'int'>, 'isotope number', ''),
    'lmix': ('a1', <class 'str'>, 'flag indicating the presence of additional data and code relating to line-mixing', ''),
    'locl': ('a15', <class 'str'>, 'electronic and vibrational local lower quanta', ''),
    'locu': ('a15', <class 'str'>, 'electronic and vibrational local upper quanta', ''),
    'selbrd': ('a5', <class 'float'>, 'self-broadened half-width at 296K', 'cm-1.atm-1'),
    'wav': ('a12', <class 'float'>, 'vacuum wavenumber', 'cm-1'),
}[source]

Parsing order of the HITRAN 2004 format. Each entry maps a column name to (format, type, description, unit).

Type:

OrderedDict
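
To show how such a layout can drive a fixed-width parser, here is a minimal sketch. It is not the radis parser. It assumes that a format such as 'a10' means a field 10 characters wide and that the first five fields of a record are id, iso, wav, int and A, in that order. The function parse_record and the sample record are hypothetical.

```python
from collections import OrderedDict

# Subset of the layout in assumed file order: name -> (format, type)
columns = OrderedDict([
    ("id", ("a2", int)),
    ("iso", ("a1", int)),
    ("wav", ("a12", float)),
    ("int", ("a10", float)),
    ("A", ("a10", float)),
])


def parse_record(line, columns):
    """Slice one fixed-width record into typed fields."""
    fields = {}
    start = 0
    for name, (fmt, cast) in columns.items():
        width = int(fmt[1:])  # "a12" -> 12
        fields[name] = cast(line[start:start + width].strip())
        start += width
    return fields


# Synthetic record built to match the widths above (not real HITRAN data)
line = f"{5:>2d}{1:>1d}{2000.123456:>12.6f}{1.5e-25:>10.3e}{3.2e-2:>10.3e}"
record = parse_record(line, columns)
```

Because each field starts where the previous one ends, the order of the entries matters, which is why the layout is stored in an OrderedDict.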

extract_columns(df, extracted_values, columns)[source]

Extract columns from extracted_values.

hit2df(fname, cache=True, verbose=True, drop_non_numeric=True, load_wavenum_min=None, load_wavenum_max=None, engine='pytables', output='pandas', parse_quanta=True, cache_directory_path=None, fast_parsing=True)[source]

Convert a HITRAN/HITEMP [1] file to a pandas DataFrame.

Parameters:
  • fname (str) – HITRAN-HITEMP file name

  • cache (boolean, or 'regen' or 'force') – if True, a pandas-readable HDF5 file is generated on first access and reused afterwards. This avoids repeating the datatype cast and conversion and improves performance considerably (but later changes in the database are not taken into account). If False, no cache file is used. If 'regen', cache files are regenerated. Default True.

Other Parameters:
  • drop_non_numeric (boolean) – if True, non-numeric columns are dropped. This improves performance, but make sure all the columns you need are converted to numeric formats beforehand. Default True. Note that if a cache file is loaded, it is left untouched.

  • load_wavenum_min, load_wavenum_max (float) – if not None, only load the cached file if it contains data for wavenumbers above/below the specified value. See load_h5_cache_file() in radis.api.cache_files. Default None.

  • engine ('pytables', 'vaex') – format of the HDF5 cache file. Default 'pytables'.

  • parse_quanta (bool) – if True, parse local and global quanta (required to identify lines for non-LTE calculations, although some lines are not labelled).

  • fast_parsing (bool) – if True, uses vectorized parsing instead of regex for global quanta. Default True.

  • output (str) – output format of the data: 'pandas' for a pandas DataFrame or 'vaex' for a Vaex DataFrame.

  • cache_directory_path (str or None, optional) – Directory to store/read cache files. If None, use the directory of fname.

Returns:

df – dataframe containing all lines and parameters

Return type:

pandas DataFrame or Vaex DataFrame

References

Notes

Performance: see the CDSD-HITEMP parser.

See also

cdsd2df()
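
A possible way to call hit2df() is sketched below, using only parameters from the signature above. The wrapper load_lines is hypothetical, and the filtering step assumes that the returned dataframe keeps the wav column listed in columns_2004.

```python
def load_lines(fname, wmin, wmax):
    """Load a HITRAN file with hit2df and keep lines between wmin and wmax (cm-1)."""
    # Imported lazily so that this sketch can be read without radis installed
    from radis.api.hitranapi import hit2df

    df = hit2df(fname, cache=True, drop_non_numeric=True, output="pandas")
    # 'wav' is the vacuum wavenumber column listed in columns_2004 (assumed kept)
    return df[(df["wav"] >= wmin) & (df["wav"] <= wmax)]
```

For a locally downloaded file, a call such as load_lines("my_lines.par", 2000, 2300) would then return the lines in that range; the file name is a placeholder.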

hitranxsc(hitranXSC)[source]

Parse HITRAN cross-section files manually downloaded from https://hitran.org/xsc/. Returns a dictionary.

Example

Read and plot a local acetone cross-section file:

from radis.api.hitranapi import hitranxsc

datafile = 'CH3COCH3_233.4_375.2_700.0-1780.0_13.xsc'
data = hitranxsc(datafile)

# %% Plot cross section:
import matplotlib.pyplot as plt

plt.plot(data['wavenumber'], data['spectrum'])
plt.xlabel('Wavenumber (cm-1)')
plt.ylabel('Cross section (cm2/molecule)')
# add title with molecule name, pressure and temperature:
plt.title(data['name'] + ' at ' + str(data['P']) + ' Torr and ' + str(data['T']) + ' K')
plt.show()

Notes

Code adapted from https://fr.mathworks.com/matlabcentral/fileexchange/74716-load-hitran-absorption-cross-section by Mohammadamir Ghaderi

Under licence conditions

Copyright (c) 2020, Mohammadamir Ghaderi
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

* Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.

* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution

* Neither the name of  nor the names of its
contributors may be used to endorse or promote products derived from this
software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

parse_global_quanta(df, mol, verbose=True, dataframe_type='pandas', fast_parsing=True)[source]

Parameters:
  • df (pandas Dataframe)

  • mol (str) – molecule name
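
The fast_parsing option documented for hit2df() contrasts vectorized parsing with regex parsing. The sketch below illustrates the two approaches and is not the radis implementation. It assumes a diatomic-style label in which the 15-character globu field carries a single vibrational quantum number in its last two characters. That layout and the output column name v1u are assumptions made for illustration.

```python
import pandas as pd

# Synthetic 15-character labels (assumed layout: blanks, then a 2-character v)
df = pd.DataFrame({"globu": [f"{v:>15d}" for v in (0, 1, 12)]})

# Vectorized: fixed-position slice, then numeric conversion
v_sliced = pd.to_numeric(df["globu"].str.slice(13, 15).str.strip())

# Regex: capture the trailing digits of each label
v_regex = pd.to_numeric(df["globu"].str.extract(r"(\d+)\s*$")[0])

df["v1u"] = v_sliced
```

Both routes give the same values here. Slicing at fixed positions avoids running a pattern match on every row, which is the usual motivation for a vectorized path on large line lists.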

parse_local_quanta(df, mol, verbose=True, dataframe_type='pandas', fast_parsing=True)[source]

Parameters:
  • df (pandas Dataframe)

  • mol (str) – molecule name

post_process_hitran_data(df, molecule, verbose=True, drop_non_numeric=True, parse_quanta=True, add_HITRAN_uncertainty_code=False, dataframe_type='pandas', fast_parsing=True)[source]

Parse non-equilibrium parameters in a HITRAN/HITEMP [1] file and return the final pandas DataFrame.

Parameters:
  • df (pandas Dataframe) – dataframe containing generic parameters

  • molecule (str) – molecule name

Other Parameters:
  • drop_non_numeric (boolean) – if True, non-numeric columns are dropped. This improves performance, but make sure all the columns you need are converted to numeric formats beforehand. Default True. Note that if a cache file is loaded, it is left untouched.

  • parse_quanta (bool) – if True, parse local and global quanta (required to identify lines for non-LTE calculations, although some lines are not labelled).

  • add_HITRAN_uncertainty_code (bool) – if True, the column containing the HITRAN uncertainty code is converted to integer and kept instead of being dropped.

  • dataframe_type (str) – 'pandas' or 'vaex'.

  • fast_parsing (bool) – if True, uses vectorized parsing instead of regex for global quanta. Default True.

Returns:

df – dataframe containing all lines and parameters

Return type:

pandas DataFrame or Vaex DataFrame

References

Notes

Performance: see the CDSD-HITEMP parser.

See also

cdsd2df()
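
The effect of drop_non_numeric can be pictured with plain pandas, as in the sketch below. This only illustrates the outcome on a synthetic dataframe; radis may select the columns differently.

```python
import pandas as pd

# Synthetic dataframe: two numeric columns and one string label column
df = pd.DataFrame({
    "wav": [2000.1, 2000.2],
    "int": [1.0e-25, 2.0e-25],
    "globu": ["              0", "              1"],
})

# Keep only the numeric columns; the string label column is dropped
numeric_df = df.select_dtypes(include="number")
```

Any information still needed from a string column, such as quantum numbers, has to be extracted into numeric columns before this step, which is what the warning in the drop_non_numeric description is about.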