DATABASES OF ATMOSPHERIC KINETIC DATA - NOVEMBER 2018 I VEREECKEN LUC

ATMOSPHERIC KINETIC DATA
The need for kinetic data for atmospheric chemistry
Isoprene is emitted by a tree. What happens to the isoprene?
 a) It is mostly removed by OH radicals.
 How fast? Which products are formed? Whence the OH radicals?
 b) Some dominant products are methacrolein and methylvinylketone
 How much? What happens to these... etc.
 c) Does the above depend on reaction conditions ?
 - Temperature, pressure = place, time
 - Alternative reactions: OH vs. O3 / NO3 / ...
 - photochemistry (=time, light, clouds)
[Figure: molecular structures of isoprene, MACR (methacrolein) and MVK (methylvinylketone)]
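Question c) can be made quantitative with pseudo-first-order loss rates: each oxidant X removes isoprene at a rate k_X [X], and the fraction removed by X is its share of the total. A minimal Python sketch follows. The rate coefficients are approximate room-temperature values and the oxidant concentrations are round daytime numbers, both chosen for illustration only and not taken from the slides; real values should come from an evaluated database.

```python
# Approximate 298 K rate coefficients for isoprene (cm3 molecule-1 s-1).
# Illustrative values only; look up evaluated recommendations before use.
RATE_COEFFICIENTS = {"OH": 1.0e-10, "O3": 1.3e-17, "NO3": 7.0e-13}

# Assumed daytime oxidant concentrations (molecule cm-3), order of magnitude only.
DAYTIME = {"OH": 2.0e6, "O3": 7.0e11, "NO3": 1.0e7}


def loss_fractions(k, conc):
    """Return the fraction of the total loss rate carried by each oxidant."""
    rates = {x: k[x] * conc[x] for x in k}  # pseudo-first-order rates, s-1
    total = sum(rates.values())
    return {x: rate / total for x, rate in rates.items()}


fractions = loss_fractions(RATE_COEFFICIENTS, DAYTIME)
total_loss_rate = sum(RATE_COEFFICIENTS[x] * DAYTIME[x] for x in DAYTIME)
lifetime_hours = 1.0 / total_loss_rate / 3600.0
```

With these assumed inputs OH dominates the loss; swapping in night-time concentrations (OH near zero, higher NO3) shifts the balance, which is the point of question c).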

ATMOSPHERIC KINETIC DATA
Information needed for kinetic data of elementary processes
a) Chemical reaction / physical processes / pseudo-process
 - CH4 + OH → CH3 + H2O
 - HCOOH (gas) → HCOOH (aq)
 - isoprene emission
b) Rate of change : differential equations with rate coefficient
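 $-\frac{d[\mathrm{CH_4}]}{dt} = k \cdot [\mathrm{CH_4}] \cdot [\mathrm{OH}] = -\frac{d[\mathrm{OH}]}{dt} = \frac{d[\mathrm{H_2O}]}{dt} = \frac{d[\mathrm{CH_3}]}{dt}$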
 
c) Value of rate coefficient k : specific to the given reaction
 Can depend on temperature, pressure, light intensity and spectrum, ...
 Pseudo processes: time- and place-dependence. E.g emission E(x,y,z,t)
d) Lumped reactions : multiple steps in a single process description
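Items b) and c) can be sketched in a few lines of Python: a temperature-dependent rate coefficient in Arrhenius form, and the rate of change it implies for every species in CH4 + OH → CH3 + H2O. The Arrhenius parameters are illustrative values of the magnitude found in evaluations for this reaction, and the concentrations are assumptions; neither comes from the slides.

```python
import math


def arrhenius(a_factor, e_over_r, temperature):
    """k(T) = A * exp(-(E/R) / T), with E/R given in kelvin."""
    return a_factor * math.exp(-e_over_r / temperature)


def rates_of_change(k, ch4, oh):
    """Time derivatives (molecule cm-3 s-1) for CH4 + OH -> CH3 + H2O."""
    rate = k * ch4 * oh
    return {"CH4": -rate, "OH": -rate, "CH3": rate, "H2O": rate}


# Illustrative parameters (cm3 molecule-1 s-1 and K); check an evaluated source.
k_298 = arrhenius(2.45e-12, 1775.0, 298.0)

# Assumed concentrations: roughly 1.9 ppm CH4 at 1 atm, 1e6 molecule cm-3 OH.
derivatives = rates_of_change(k_298, ch4=4.7e13, oh=1.0e6)

# Pseudo-first-order lifetime of CH4 against OH for the assumed OH level.
ch4_lifetime_years = 1.0 / (k_298 * 1.0e6) / (365.25 * 24 * 3600)
```

The same function evaluated at a lower temperature gives a smaller k, which is why the value of k has to be stored together with its temperature dependence rather than as a single number.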

ATMOSPHERIC KINETIC DATA
Information needed/useful for kinetic model of atmosphere
a) Chemical mechanism as a set of coupled, non-linear differential equations
 Usually written as a large list of reactions (mathematics implied)
b) Sources, sinks, boundary conditions as differential equations, field strengths, etc...
 Emission fields, spectral intensities, etc. : see other modules
c) Data for comparison
 Measurements of concentrations, column values, ...
d) Earth system model (ESM)
 Includes e.g. transport, links to other models
Prediction of time-dependent concentrations for all species is then "merely" a matter of
solving the differential equations.
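How a list of reactions implies the differential equations can be sketched as follows. This is a toy example with hypothetical species (A, B, C, X) and made-up rate coefficients, integrated with a plain explicit Euler step; real atmospheric models use stiff solvers and far larger mechanisms.

```python
# Toy mechanism with hypothetical species and rate coefficients
# (s-1 or cm3 molecule-1 s-1); each entry is (reactants, products, k).
MECHANISM = [
    (("A", "X"), ("B",), 1.0e-11),  # A + X -> B
    (("B",), ("C", "X"), 1.0e-3),   # B -> C + X
]


def derivatives(conc, mechanism):
    """Build d[species]/dt from the reaction list (mass-action kinetics)."""
    ddt = {species: 0.0 for species in conc}
    for reactants, products, k in mechanism:
        rate = k
        for species in reactants:
            rate *= conc[species]
        for species in reactants:
            ddt[species] -= rate
        for species in products:
            ddt[species] += rate
    return ddt


def integrate(conc, mechanism, dt, steps):
    """Explicit Euler integration; adequate only for this small, mild system."""
    conc = dict(conc)
    for _ in range(steps):
        ddt = derivatives(conc, mechanism)
        for species in conc:
            conc[species] += ddt[species] * dt
    return conc


start = {"A": 1.0e10, "X": 1.0e8, "B": 0.0, "C": 0.0}
end = integrate(start, MECHANISM, dt=1.0, steps=3600)
```

Because every reaction only moves material between species, A + B + C stays constant during the integration, which is a quick sanity check on the generated equations.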

WHY DATABASES WITH KINETIC INFORMATION
 Size of chemical mechanisms
 Master Chemical Mechanism (MCM) :
  semi-explicit chemical kinetic mechanism
  143 emitted organic species
  6700 model species
  17000 reactions
 Gecko-A : Automatic mechanism generator for atmospheric chemistry
  [Figure: number of functional groups (0 to 10) versus number of compounds (0 to 2.5x10^4); labelled series: pinene, octane]
Image source right: Aumont, B., Szopa, S. and Madronich, S.: Atmos. Chem. Phys., 5, 2497–2517, doi:10.5194/acp-5-2497-2005, 2005.
Image source left: Vereecken, L., Aumont, B., Barnes, I., Bozzelli, J. W., Goldman, M. J., Green, W. H., Madronich, S., McGillen, M. R., Mellouki, A., Orlando, J. J., Picquet-Varrault, B., Rickard, A. R., Stockwell, W. R., Wallington, T. J. and Carter, W. P. L.: Int. J. Chem. Kinet., 50(6), 435–469, doi:10.1002/kin.21172, 2018.

DATATYPES IN A KINETICS-ORIENTED DATABASE
THE MOLECULE AS DATA
Properties of a molecule
 Atoms in the molecule (stoichiometry, mass)
 Connectivity of the atoms (molecular graph, substructures)
 Spatial shape of the molecule
 Physical properties (color/spectrum, solubility, boiling/freezing point, density,...)
 Chemical properties (reactivity, acidity, decomposition stability, energy/enthalpy, ...)
 Biological properties (metabolism, toxicity, skin/eye irritation, bioaccumulation,...)
 Technological properties (flash point, combustion, production, critical temperatures,...)
 Etc...

THE MOLECULE AS DATA
Example of a database of chemical compounds : PubChem
 Website : https://pubchem.ncbi.nlm.nih.gov/
PubChem is an open chemistry database at the National Institutes of Health (NIH)
 Compounds : ~100 million compounds, 250 million substances
Multiple search options
 Web interfaces with text search, structure search
 Programmatic interfaces (PUG, SOAP, REST)
Hands-on examples :
 methacrolein (https://pubchem.ncbi.nlm.nih.gov/compound/methacrolein)
 Try searching for
  a) isoprene (C5H8)
  b) the methyl radical (CH3)
  c)
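The programmatic route can be sketched with the Python standard library. The sketch assumes the PUG REST URL pattern `compound/name/<name>/property/<list>/JSON` and the property names `MolecularFormula`, `InChIKey` and `IUPACName`; the function names are hypothetical, and the current PUG REST documentation should be checked before relying on this.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def property_url(name, properties=("MolecularFormula", "InChIKey", "IUPACName")):
    """Build a PUG REST URL that looks a compound up by name."""
    return f"{PUG_REST}/compound/name/{quote(name)}/property/{','.join(properties)}/JSON"


def fetch_properties(name):
    """Query PubChem (needs network access) and return the first property record."""
    with urlopen(property_url(name), timeout=30) as response:
        data = json.load(response)
    return data["PropertyTable"]["Properties"][0]


# Building the URL needs no network; fetch_properties("methacrolein") does.
print(property_url("methacrolein"))
```

Names with spaces, such as "methyl radical", are percent-encoded by `quote` before being placed in the URL.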

THE MOLECULE AS DATA
Molecular identifiers (or "how to name a molecule")
Very complex problem, which is only partially solved
Systematic names: allow the molecular graph to be reconstructed from the name
 IUPAC (human readable), SMILES (pseudo-readable), formula, InChI
 Systematic vs. canonical vs. understandable
Code names
 Trivial names (human readable, memorizable)
 InChI key, CAS ID, PubChem ID, MCM ID, SAPRC, ...
Graph representation
 SMILES (connectivity), InChI, little schemes drawn in a figure (most popular)

THE MOLECULE AS DATA
Example : methacrolein
 IUPAC: 2-methylprop-2-enal
 Trivial names : methacrolein, 2-methylacrolein, methacrylaldehyde
 InChI=1S/C4H6O/c1-4(2)3-5/h3H,1H2,2H3
 InChI key : STNJBCKSHOAVAJ-UHFFFAOYSA-N
 "Canonical" SMILES : CC(=C)C=O
 Other SMILES : O=CC(C)=C ; C=C(C)C=O ; C(C)(=C)C=O ; CC1=C.O=C1
 CAS : 78-85-3
 MCM : MACR
 etc...
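Some identifiers carry recoverable information. As a small illustration, the InChI above can be split into its layers and the formula layer turned into a stoichiometry and a molar mass. This is a sketch only: the helper names are hypothetical, the atomic masses are rounded, and the formula parser handles simple formulas without brackets or charges.

```python
import re

ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999}  # g/mol, rounded


def inchi_layers(inchi):
    """Split an InChI string into version, formula and letter-prefixed layers."""
    prefix, formula, *rest = inchi.split("/")
    layers = {"version": prefix.split("=", 1)[1], "formula": formula}
    for layer in rest:
        layers[layer[0]] = layer[1:]  # 'c' = connectivity, 'h' = hydrogen layer
    return layers


def parse_formula(formula):
    """Count atoms in a simple formula such as C4H6O."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + int(number or 1)
    return counts


def molar_mass(formula):
    """Molar mass in g/mol from the rounded atomic masses above."""
    return sum(ATOMIC_MASS[el] * n for el, n in parse_formula(formula).items())


layers = inchi_layers("InChI=1S/C4H6O/c1-4(2)3-5/h3H,1H2,2H3")
mass = molar_mass(layers["formula"])
```

An InChI key, by contrast, is a hash: it works as a lookup key but nothing can be reconstructed from it.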

THE MOLECULE AS DATA
How to search for a molecule
 Graph matching : most reliable, but computationally expensive (and... graph entry?)
 Canonicalization of identifiers (e.g. "canonical" SMILES, InChI, "canonical" IUPAC name)
 Search in list of known names (can never be exhaustive)
How to search for similar molecules
 What is similarity ? Part of structure / (reactive) substituents / properties / ...
 Subgraph isomorphism matching is computationally expensive
 Can be improved by pre-calculation (e.g. fingerprinting)
How to search for spatial information
 3D information is critical e.g. in catalysts, enzymes, ...
 Most molecules have multiple possible 3D forms (conformers, internal rotation)
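Why graph matching is reliable but expensive can be seen in a brute-force sketch. The heavy-atom graphs below were written by hand from two SMILES strings for methacrolein and one for methylvinylketone (CC(=O)C=C); the function tries every atom permutation, so the cost grows factorially with molecule size. Production codes use canonicalization or far better matching algorithms; this is only an illustration.

```python
from itertools import permutations


def is_isomorphic(atoms_a, bonds_a, atoms_b, bonds_b):
    """Brute-force check that two small molecular graphs are identical."""
    if sorted(atoms_a) != sorted(atoms_b) or len(bonds_a) != len(bonds_b):
        return False
    norm_a = {frozenset(pair): order for pair, order in bonds_a.items()}
    norm_b = {frozenset(pair): order for pair, order in bonds_b.items()}
    n = len(atoms_a)
    for perm in permutations(range(n)):
        # Skip mappings that would pair atoms of different elements.
        if any(atoms_a[i] != atoms_b[perm[i]] for i in range(n)):
            continue
        mapped = {frozenset(perm[i] for i in pair): order
                  for pair, order in norm_a.items()}
        if mapped == norm_b:
            return True
    return False


# Heavy atoms and bond orders, indexed in SMILES reading order.
MACR_1 = (["C", "C", "C", "C", "O"], {(0, 1): 1, (1, 2): 2, (1, 3): 1, (3, 4): 2})  # CC(=C)C=O
MACR_2 = (["O", "C", "C", "C", "C"], {(0, 1): 2, (1, 2): 1, (2, 3): 1, (2, 4): 2})  # O=CC(C)=C
MVK = (["C", "C", "O", "C", "C"], {(0, 1): 1, (1, 2): 2, (1, 3): 1, (3, 4): 2})     # CC(=O)C=C

same = is_isomorphic(*MACR_1, *MACR_2)
different = is_isomorphic(*MACR_1, *MVK)
```

Methacrolein and methylvinylketone share the formula C4H6O, so a formula search cannot separate them; the graph comparison does.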

THE CHEMICAL REACTION AS DATA
Reaction identifiers (or "how to find a reaction")
Once the reactants and products are identified, the reaction identifier problem is trivial.
Except for: Elementary vs. lumped reactions
 Catalysts ; third-body ; phase/lattice structure/activities
 Unidentified reactants/products (especially experimental data)
Reaction properties (all as a function of reaction conditions and environment)
 Transformation mapping
 Rate coefficient / λ-specific quantum yields
 Equilibrium constant
 Energetic properties (reaction barrier, reaction energy,...)
 Product yields (multiple channels are not independent)
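The "trivial" case can be sketched as an order-independent key built from the species identifiers. The class and method names are hypothetical, plain formulas stand in for proper canonical identifiers, and the exceptions listed above (lumped reactions, third bodies, unidentified products) are deliberately not handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reaction:
    """A reaction identified only by its reactant and product identifiers."""
    reactants: tuple
    products: tuple

    def identifier(self):
        """Key that does not depend on the order in which species are written."""
        left = " + ".join(sorted(self.reactants))
        right = " + ".join(sorted(self.products))
        return f"{left} -> {right}"


first = Reaction(("CH4", "OH"), ("CH3", "H2O"))
second = Reaction(("OH", "CH4"), ("H2O", "CH3"))
same_reaction = first.identifier() == second.identifier()
```

In a real database the strings would be canonical molecular identifiers, so that the reaction key inherits the uniqueness of the molecule keys.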

ADDITIONAL INFORMATION NEEDED FOR DATABASES
Data provenance
 Methodology : Experimental / theoretical / evaluation / estimate / simplification...
 Source : literature identifier (DOI, ISBN,...)
 Data entry : Manual / automated ; operator ; software/form version
 Database transactions : timestamp addition, corrections
 Release : Snapshot identification

Uncertainty information
 Often not available
 As provided by source, from meta analysis, or from evaluation
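One way to keep provenance and uncertainty next to the value is to give them their own columns. The sketch below uses an in-memory SQLite table; the table layout, column names, DOI, operator and release tag are all hypothetical placeholders, and the rate coefficient is an illustrative number rather than an evaluated one.

```python
import sqlite3
from datetime import datetime, timezone

SCHEMA = """
CREATE TABLE rate_coefficient (
    id INTEGER PRIMARY KEY,
    reaction TEXT NOT NULL,
    value REAL NOT NULL,
    uncertainty_factor REAL,      -- NULL when the source gives none
    methodology TEXT NOT NULL,    -- experimental / theoretical / evaluation / ...
    source TEXT NOT NULL,         -- literature identifier such as a DOI
    entered_by TEXT NOT NULL,
    entered_at TEXT NOT NULL,     -- timestamp of the database transaction
    release_tag TEXT NOT NULL     -- snapshot identification
)
"""


def add_record(db, reaction, value, uncertainty_factor, methodology,
               source, entered_by, release_tag):
    """Insert one value together with its provenance fields."""
    entered_at = datetime.now(timezone.utc).isoformat()
    db.execute(
        "INSERT INTO rate_coefficient (reaction, value, uncertainty_factor, "
        "methodology, source, entered_by, entered_at, release_tag) "
        "VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
        (reaction, value, uncertainty_factor, methodology,
         source, entered_by, entered_at, release_tag),
    )


db = sqlite3.connect(":memory:")
db.execute(SCHEMA)
add_record(db, "CH4 + OH -> CH3 + H2O", 6.3e-15, None, "evaluation",
           "doi:10.0000/placeholder", "operator-1", "snapshot-2018-11")

# Find entries that still lack uncertainty information.
missing = db.execute(
    "SELECT reaction, source FROM rate_coefficient WHERE uncertainty_factor IS NULL"
).fetchall()
```

Storing a missing uncertainty as NULL rather than as zero keeps "not available" distinguishable from "exact".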

ADDITIONAL INFORMATION NEEDED FOR DATABASES
Meta data
 Typically categorical data tagging the data quantity of interest
 Most often omitted due to cost of entry
 Meta data is often the key to leveraging the data to the fullest (data mining)

 Example : TheoKinDB entry forms
 TheoKinDB is a database of theoretically obtained kinetic data
 Core data is energetic and kinetic data of reactions
 Theoretical methodology is added as meta data
 Provenance : source / data entry / versioning

ADDITIONAL INFORMATION NEEDED FOR DATABASES
Example meta-data: Benchmarking of theoretical methodologies (uncertainty analysis)
 Analysis without meta-data gives a statistically correct, but simplistic result
 Analysis with meta-data reveals underlying structure, and allows better use of data

 [Figure: histogram of the difference M06-2X vs. CCSD(T)//M06-2X (kcal/mol, x-axis from -6 to 6) against count (y-axis from 0 to 40); annotations: difference = f(reaction class); overall difference = -0.17 ± 2.39 kcal/mol]
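The effect of meta-data on such a benchmark can be illustrated with a few lines of Python. The numbers below are invented so that two hypothetical reaction classes deviate in opposite directions; they are not TheoKinDB data.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical (reaction class, energy difference in kcal/mol) pairs.
RECORDS = [
    ("H-abstraction", 1.8), ("H-abstraction", 2.2),
    ("H-abstraction", 2.0), ("H-abstraction", 2.4),
    ("addition", -2.1), ("addition", -1.9),
    ("addition", -2.3), ("addition", -2.1),
]


def summarize(values):
    """Return (mean, sample standard deviation) of a list of differences."""
    return mean(values), stdev(values)


# Without meta-data: one pooled distribution.
overall = summarize([difference for _, difference in RECORDS])

# With meta-data: one distribution per reaction class.
grouped = defaultdict(list)
for reaction_class, difference in RECORDS:
    grouped[reaction_class].append(difference)
per_class = {name: summarize(values) for name, values in grouped.items()}
```

The pooled result is a mean near zero with a wide spread, while each class on its own shows a clear systematic offset with a narrow spread, which is the structure a class-specific correction could exploit.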

ADDITIONAL INFORMATION NEEDED FOR DATABASES
Example provenance: Active Thermochemical Tables (~1400 species)
 https://atct.anl.gov/ → version 1.122d → click on species (e.g. H2O, Cl)
Old style thermodynamics : determine enthalpies relative to another compound
New style thermodynamics:
 - make database with all relative enthalpies (experimental and theoretical)
 - construct "thermochemical network" linking all species
 - optimize entire network to minimize errors → best results + uncertainty
 - analyze provenance :
 - which are the strongest determinants for a given value
 - identify strongest missing links for improvement.
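The idea of optimizing a whole network at once can be illustrated with a weighted least-squares fit on three hypothetical species. This is a generic textbook approach with invented numbers, not the actual ATcT algorithm: A is the reference, and three slightly inconsistent measurements of relative enthalpies determine B and C.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def optimize_network(species, reference, measurements):
    """Fit enthalpies relative to `reference` from H(upper) - H(lower) data."""
    unknowns = [s for s in species if s != reference]
    index = {s: i for i, s in enumerate(unknowns)}
    n = len(unknowns)
    normal = [[0.0] * n for _ in range(n)]
    rhs = [0.0] * n
    for upper, lower, value, sigma in measurements:
        weight = 1.0 / sigma ** 2
        coeffs = {}  # design-matrix row: +1 for upper, -1 for lower
        if upper != reference:
            coeffs[index[upper]] = 1.0
        if lower != reference:
            coeffs[index[lower]] = coeffs.get(index[lower], 0.0) - 1.0
        for i, ci in coeffs.items():
            rhs[i] += weight * ci * value
            for j, cj in coeffs.items():
                normal[i][j] += weight * ci * cj
    return dict(zip(unknowns, solve_linear(normal, rhs)))


# Hypothetical measurements: (upper, lower, difference, uncertainty).
MEASUREMENTS = [("B", "A", 10.0, 1.0), ("C", "B", 5.0, 1.0), ("C", "A", 16.0, 1.0)]
fitted = optimize_network(["A", "B", "C"], "A", MEASUREMENTS)
```

The direct path A to C (16) disagrees with the path via B (15); the fit spreads this discrepancy over all three links instead of trusting a single chain of relative values.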

DATABASES WITH KINETIC INFORMATION
IUPAC EVALUATED KINETIC DATA
 http://iupac.pole-ether.fr/
Kinetic and photochemical data
Evaluated by the IUPAC Task Group on Atmospheric Chemical Kinetic Data Evaluation
Datasheets for individual reactions available as PDF/Word files
Main advantages :
 Evaluation by specialists, with recommendations and error analysis
 Good provenance
Disadvantages :
 Limited selection of reactions for large mechanisms
 Not searchable
 Only uses experimental data

NIST CHEMICAL KINETICS DATABASE
 https://kinetics.nist.gov/kinetics/
Kinetic and photochemical data
Literature overview, no additional evaluation
Has mostly experimental data, some theoretical data, reviewed data
Only limited meta-data
Better searchability (see "Getting started")
 Example : search for
 CH4 + OH → CH3 + H2O
 C2H5OO + NO2

STRUCTURE-ACTIVITY RELATIONSHIPS
What to do if no data is available for a given reaction?
Structure-activity relationships (SARs) may help: they summarize reactivity trends
Example : alkoxy radical decomposition rate by estimating the barrier height

 Source : Vereecken, L. and Peeters, J.: Phys. Chem. Chem. Phys., 11(40), 9062–9074, doi:10.1039/b909712k, 2009.
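The general shape of such a SAR can be sketched as follows, assuming an additive form in which each substituent shifts a base barrier and the rate follows from an Arrhenius-type expression. The base barrier, substituent effects and pre-exponential factor are hypothetical placeholders, not the published parameters; the cited paper gives the real ones.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol-1 K-1

# Hypothetical placeholder values, NOT the published SAR parameters.
BASE_BARRIER = 18.0  # kcal/mol
SUBSTITUENT_EFFECT = {"alkyl_alpha": -2.0, "alkyl_beta": -3.0, "hydroxy_beta": -7.0}


def barrier(substituents):
    """Estimate a barrier height from substituent counts (additive model)."""
    shift = sum(SUBSTITUENT_EFFECT[name] * count
                for name, count in substituents.items())
    return BASE_BARRIER + shift


def rate_coefficient(barrier_height, temperature=298.0, a_factor=1.0e13):
    """k = A * exp(-Eb / RT) in s-1, with an assumed pre-exponential factor."""
    return a_factor * math.exp(-barrier_height / (R_KCAL * temperature))


unsubstituted = barrier({})
substituted = barrier({"alkyl_alpha": 1, "alkyl_beta": 2})
speed_up = rate_coefficient(substituted) / rate_coefficient(unsubstituted)
```

A SAR of this kind replaces one table entry per reaction with a handful of group parameters, so a rate can be estimated for reactions that were never measured or calculated.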
STRUCTURE-ACTIVITY RELATIONSHIPS
Example : Carbonyl oxide reactivity (unimolecular or reaction with water)
Simple interactive SAR:
 http://iek8810-gw.iek.kfa-juelich.de/~luc/TdW_Vereecken/SAR.html
Reactivity depends on substitution on either side of the carbonyl oxide
Fewer than 10 of these reactions have been measured, about 40 calculated explicitly
Reactivity of about 150 Criegee intermediate (CI) reactions was summarized in the SAR

 Source : Vereecken, L., Novelli, A. and Taraborrelli, D.: Phys. Chem. Chem. Phys.,
 19, 31599–31612, doi:10.1039/C7CP05541B, 2017.
DATABASES WITH KINETIC MODELS
KINETIC MODELS
Mechanistic information is essentially a "set of reactions"
Applying the mechanisms to all compounds in a set gives the kinetic model
 → Few people distinguish between "mechanism" and "kinetic model"
Kinetic models in publications:
 - static, not easily searchable, fixed-format
Example : β-pinene+OH model
 Vereecken, L. and Peeters, J.: Phys. Chem. Chem. Phys., 14(11), 3802–3815, doi:10.1039/c2cp23711c, 2012.

KINETIC MODELS
Browseable model - Exploring the chemistry
 Visualisation of the mechanism
 Kinetic model interactively browseable
 → tool for exploring / learning the model
Example : BOREAM model for α-pinene chemistry, with aerosol formation
 http://tropo.aeronomie.be/boream/

KINETIC MODELS
Extractable model - Querying a database for a submechanism
 - Database of chemical reactions
 - Select relevant species, query list of all related chemistry (1st, 2nd,.. nth generation)
Example : Master Chemical Mechanism (MCM)
 http://mcm.leeds.ac.uk/MCM/
 - Browse/search the mechanism
 - Extract a subset in different formats
 - Kinetic model easily useable in your own research
 - Some provenance information with extensive mechanistic "protocols"
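Extracting a submechanism by generation can be sketched as a breadth-first walk over a reaction list. The reactions below are a toy list built around the isoprene example (P1 to P3 are hypothetical products); this is not the MCM's own extraction code or data.

```python
# Toy reaction list: (reactants, products). P1, P2, P3 are hypothetical.
REACTIONS = [
    (("C5H8", "OH"), ("MACR", "MVK")),
    (("MACR", "OH"), ("P1",)),
    (("MVK", "OH"), ("P2",)),
    (("P1",), ("P3",)),
    (("CH4", "OH"), ("CH3", "H2O")),
]


def extract_submechanism(reactions, seeds, generations, ignore=frozenset()):
    """Collect reactions reachable from the seed species in n generations."""
    known = set(seeds)
    frontier = set(seeds)
    selected = set()
    for _ in range(generations):
        new_species = set()
        for index, (reactants, products) in enumerate(reactions):
            if index in selected:
                continue
            # A reaction belongs to this generation if it consumes a frontier species.
            if frontier & set(reactants):
                selected.add(index)
                new_species.update(p for p in products
                                   if p not in known and p not in ignore)
        known |= new_species
        frontier = new_species
        if not frontier:
            break
    return [reactions[i] for i in sorted(selected)]


first_generation = extract_submechanism(REACTIONS, {"C5H8"}, 1)
second_generation = extract_submechanism(REACTIONS, {"C5H8"}, 2)
full = extract_submechanism(REACTIONS, {"C5H8"}, 10)
```

Co-reactants such as OH are never added to the frontier unless they appear as new products, so unrelated chemistry like CH4 + OH is not pulled in; the `ignore` set can exclude ubiquitous products for the same reason.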

PUBLICLY AVAILABLE DATABASES
PUBLICLY AVAILABLE DATABASES (SELECTION)
Properties / names / graphs / identifiers
 Chemspider : http://www.chemspider.com/
 PubChem : https://pubchem.ncbi.nlm.nih.gov/
 NIST webbook : https://webbook.nist.gov/chemistry/
 CHEBI : Chemical entities of biological interest : https://www.ebi.ac.uk/chebi
 ADME/T : Water solubility database : http://modem.ucsd.edu/adme/databases/databases_logS.htm
Energy / geometry:
 BEGDB: Benchmark energy and geometry database : www.begdb.com
 CCCBDB: Computational Chemistry Comparison and Benchmark DataBase : https://cccbdb.nist.gov/
 ATcT : Active Thermochemical Tables : https://atct.anl.gov/
 Prof. Burcat Thermodynamic Data : Ideal Gas Thermodynamic Data in Polynomial form for Combustion and Air Pollution Use
 http://garfield.chem.elte.hu/Burcat/burcat.html

PUBLICLY AVAILABLE DATABASES (SELECTION)
Kinetics:
 ERADB: Chemical Kinetics Database on oxygenated VOCs gas phase reactions : http://era-orleans.org/eradb/index.php
 IUPAC Task Group on Atmospheric Chemical Kinetic Data Evaluation: http://iupac.pole-ether.fr/
 NIST Chemical kinetics database: https://kinetics.nist.gov/kinetics/
 KIDA: Kinetic database for astrochemistry : http://kida.obs.u-bordeaux1.fr/
 JPL : Chemical kinetics and photochemical data for use in atmospheric studies : https://jpldataeval.jpl.nasa.gov/
 MPI-Mainz UV/VIS : Spectral atlas of gaseous molecules of atmospheric interest : http://satellite.mpic.de/spectral_atlas
 Gecko-A : Generator for explicit chemistry and kinetics of organics in the atmosphere : http://geckoa.lisa.u-pec.fr/
Chemical mechanisms
 Master Chemical Mechanism (MCM) : http://mcm.leeds.ac.uk/MCM/
 BOREAM Model : atmospheric reactions of alpha-Pinene : http://tropo.aeronomie.be/boream
Structure-activity relationships
 Gecko-A : Generator for explicit chemistry and kinetics of organics in the atmosphere : http://geckoa.lisa.u-pec.fr/
