Development of Korean Air Quality Prediction System version 1 (KAQPS v1) with focuses on practical issues - Geoscientific Model Development

Page created by Benjamin Barber
 
CONTINUE READING
Development of Korean Air Quality Prediction System version 1 (KAQPS v1) with focuses on practical issues - Geoscientific Model Development
Geosci. Model Dev., 13, 1055–1073, 2020
https://doi.org/10.5194/gmd-13-1055-2020
© Author(s) 2020. This work is distributed under
the Creative Commons Attribution 4.0 License.

Development of Korean Air Quality Prediction System version 1
(KAQPS v1) with focuses on practical issues
Kyunghwa Lee1,2 , Jinhyeok Yu2 , Sojin Lee3 , Mieun Park4,5 , Hun Hong2 , Soon Young Park2 , Myungje Choi6 ,
Jhoon Kim6 , Younha Kim7 , Jung-Hun Woo7 , Sang-Woo Kim8 , and Chul H. Song2
1 Environmental  Satellite Center, Climate and Air Quality Research Department, National Institute of Environmental Research
(NIER), Incheon, Republic of Korea
2 School of Earth Sciences and Environmental Engineering, Gwangju Institute of Science and Technology (GIST),

Gwangju, Republic of Korea
3 Department of Earth and Atmospheric Sciences, University of Houston, Texas, USA
4 Air Quality Forecasting Center, Climate and Air Quality Research Department, National Institute of Environmental Research

(NIER), Incheon, Republic of Korea
5 Environmental Meteorology Research Division, National Institute of Meteorological Sciences (NIMS),

Jeju, Republic of Korea
6 Department of Atmospheric Sciences, Yonsei University, Seoul, Republic of Korea
7 Department of Advanced Technology Fusion, Konkuk University, Seoul, Republic of Korea
8 School of Earth and Environmental Sciences, Seoul National University, Seoul, Republic of Korea

Correspondence: Chul H. Song (chsong@gist.ac.kr)

Received: 12 June 2019 – Discussion started: 18 July 2019
Revised: 30 January 2020 – Accepted: 5 February 2020 – Published: 10 March 2020

Abstract. For the purpose of providing reliable and robust        compared with those of the CMAQ simulations without ICs
air quality predictions, an air quality prediction system was     (BASE RUN). For almost all of the species, the application
developed for the main air quality criteria species in South      of ICs led to improved performance in terms of correlation,
Korea (PM10 , PM2.5 , CO, O3 and SO2 ). The main caveat of        errors and biases over the entire campaign period. The DA
the system is to prepare the initial conditions (ICs) of the      RUN agreed reasonably well with the observations for PM10
Community Multiscale Air Quality (CMAQ) model simula-             (index of agreement IOA = 0.60; mean bias MB = −13.54)
tions using observations from the Geostationary Ocean Color       and PM2.5 (IOA = 0.71; MB = −2.43) as compared to the
Imager (GOCI) and ground-based monitoring networks in             BASE RUN for PM10 (IOA = 0.51; MB = −27.18) and
northeast Asia. The performance of the air quality predic-        PM2.5 (IOA = 0.67; MB = −9.9). A significant improve-
tion system was evaluated during the Korea-United States          ment was also found with the DA RUN in terms of bias.
Air Quality Study (KORUS-AQ) campaign period (1 May–              For example, for CO, the MB of −0.27 (BASE RUN) was
12 June 2016). Data assimilation (DA) of optimal interpola-       greatly enhanced to −0.036 (DA RUN). In the cases of O3
tion (OI) with Kalman filter was used in this study. One major    and SO2 , the DA RUN also showed better performance than
advantage of the system is that it can predict not only partic-   the BASE RUN. Further, several more practical issues fre-
ulate matter (PM) concentrations but also PM chemical com-        quently encountered in the air quality prediction system were
position including five main constituents: sulfate (SO2−
                                                       4 ), ni-   also discussed. In order to attain more accurate ozone predic-
trate (NO−3 ), ammonium   (NH  +
                               4 ), organic aerosols (OAs) and    tions, the DA of NO2 mixing ratios should be implemented
elemental carbon (EC). In addition, it is also capable of pre-    with careful consideration of the measurement artifacts (i.e.,
dicting the concentrations of gaseous pollutants (CO, O3 and      inclusion of alkyl nitrates, HNO3 and peroxyacetyl nitrates –
SO2 ). In this sense, this new air quality prediction system      PANs – in the ground-observed NO2 mixing ratios). It was
is comprehensive. The results with the ICs (DA RUN) were          also discussed that, in order to ensure accurate nocturnal pre-

Published by Copernicus Publications on behalf of the European Geosciences Union.
Development of Korean Air Quality Prediction System version 1 (KAQPS v1) with focuses on practical issues - Geoscientific Model Development
1056                                                                    K. Lee et al.: Air quality prediction system in Korea

dictions of the concentrations of the ambient species, accu-          Although there are various datasets representing air qual-
rate predictions of the mixing layer heights (MLHs) should         ity, limitations remain in the observations and model outputs.
be achieved from the meteorological modeling. Several ad-          Specifically, observation data are, in general, known to be
vantages of the current air quality prediction system, such as     more accurate than model outputs, but they have spatial and
its non-static free-parameter scheme, dust episode prediction      temporal limitations. These limitations will be overcome by
and possible multiple implementations of DA prior to actual        improving spatial and temporal coverage via future geosta-
predictions, were also discussed. These configurations are all     tionary satellite instruments such as the Geostationary En-
possible because the current DA system is not computation-         vironment Monitoring Spectrometer (GEMS) over Asia, the
ally expensive. In the ongoing and future works, more ad-          Tropospheric Emissions: Monitoring of Pollution (TEMPO)
vanced DA techniques such as the 3D variational (3DVAR)            over North America and the Sentinel-4 over Europe. In addi-
method and ensemble Kalman filter (EnK) are being tested           tion, the TROPOspheric Monitoring Instrument (TROPOMI)
and will be introduced to the Korean air quality prediction        on board the Copernicus Sentinel-5 Precursor satellite was
system (KAQPS).                                                    successfully launched into low earth orbit (LEO) on 13 Octo-
                                                                   ber 2017 and is providing information on the chemical com-
                                                                   position in the atmosphere with a higher spatial resolution of
                                                                   3.5 km × 7 km.
1   Introduction                                                      Unlike observational data, models can provide meteoro-
                                                                   logical and chemical information without any spatial and
Air quality has long been considered an important issue in         temporal data discontinuity, but they do have an issue of inac-
climate change, visibility and public health, and it is strongly   curacy. The major causes of uncertainty in the results of CTM
dependent upon meteorological conditions, emissions and            simulations are introduced from imperfect emissions, mete-
the transport of air pollutants. Air pollutants typically con-     orological fields, initial conditions (ICs), and physical and
sist of atmospheric particles and gases such as particulate        chemical parameterizations in the models (Carmichael et al.,
matter (PM), carbon monoxide (CO), ozone (O3 ), nitrogen           2008). In order to minimize the limitations and maximize the
dioxide (NO2 ) and sulfur dioxide (SO2 ). These aerosols and       advantages of observation data and model outputs, there have
gases play important roles in anthropogenic climate forcing,       been numerous attempts to provide accurate and spatially as
both directly (Bellouin et al., 2005; Carmichael et al., 2009;     well as temporally continuous information on chemical com-
IPCC, 2013; Scott et al., 2014) and indirectly (Breon et al.,      position in the atmosphere by integrating observation data
2002; IPCC, 2013; Penner et al., 2004; Scott et al., 2014)         with model outputs via data assimilation (DA) techniques.
influencing the global radiation budget. Among the various            Although the Korean numerical weather prediction (NWP)
air pollutants, PM and surface O3 are the most notorious           carried out by the Korea Meteorological Administration
health threats, as has been stated by several previous studies     (KMA) employs various DA techniques, almost no previous
(Carmichael et al., 2009; Dehghani et al., 2017; Khaniabadi        efforts have been made to develop an air quality prediction
et al., 2017).                                                     system with DA in South Korea. Therefore, in the present
   With the stated importance of atmospheric aerosols and          study, the air quality prediction system named as Korean Air
gases, considerable research efforts have been made to mon-        Quality Prediction System version 1 (KAQPS v1) was devel-
itor and quantify their amounts in the atmosphere through          oped by preparing ICs via DA for the Community Multiscale
satellite-, airborne- and ground-based observations as well        Air Quality (CMAQ) model (Byun and Schere, 2006; Byun
as chemistry-transport model (CTM) simulations. In South           and Ching, 1999) using satellite- and ground-based observa-
Korea, the Korean Ministry of the Environment (KMoE)               tions for particulate matter (PM) and atmospheric gases such
provides real-time chemical concentrations as measured by          as CO, O3 and SO2 . The performances of the system were
ground-based observations for six air quality criteria air         then demonstrated during the period of the Korea-United
pollutants (PM10 , PM2.5 , O3 , CO, SO2 and NO2 ) at the           States Air Quality Study (KORUS-AQ) campaign (1 May–
Air Korea website (https://www.airkorea.or.kr, last access:        12 June 2016) in South Korea.
8 March 2020). PM10 or (PM2.5 ) refers to the atmospheric             In this study, the optimal interpolation (OI) method with
particulate matter that has an aerodynamic diameter less than      the Kalman filter was applied in order to develop the air qual-
10 (or 2.5) µm. In addition, the National Institute of Environ-    ity prediction system, since this method is still useful and
mental Research (NIER) of South Korea provides air qual-           viable in terms of computational cost and performance. The
ity predictions using multiple CTM simulations. Air quality        performance of the method is almost comparable to that of
predictions are another crucial element for protecting public      the 3D variational (3DVAR) method, as shown in Tang et
health through the forecasting of high air pollution episodes      al. (2017). More complex and advanced DA techniques are
in advance and alerting citizens about these high episodes.        currently being and will continue to be applied to current air
In this context, reliable and robust air quality forecasts are     quality prediction systems. These works are now in progress.
necessary to avoid any confusion caused by poor predictions           In addition, this paper also discusses several practical is-
given by CTM simulations.                                          sues frequently encountered in the air quality predictions

Geosci. Model Dev., 13, 1055–1073, 2020                                             www.geosci-model-dev.net/13/1055/2020/
Development of Korean Air Quality Prediction System version 1 (KAQPS v1) with focuses on practical issues - Geoscientific Model Development
K. Lee et al.: Air quality prediction system in Korea                                                                      1057

such as (i) DA of NO2 mixing ratios for accurate ozone pre-      (1959) successive correction method. The adjusted meteoro-
diction with a careful consideration of measurement artifacts;   logical variables were temperature, geopotential height, rela-
(ii) the issue of the nocturnal mixing layer height (MLH)        tive humidity and zonal/meridional winds.
for nocturnal predictions; (iii) predictions of dust episodes;      The model domain for the WRF simulations covers north-
(iv) the use of non-static free parameters; and (v) the influ-   east Asia with a horizontal resolution of 15 km × 15 km, hav-
ences of multiple implementations of the DA before the ac-       ing a total of 223 latitudinal and 292 longitudinal grid cells.
tual predictions.                                                The size of the WRF domain is slightly larger than that of the
   The details of the datasets and methodology used in this      CMAQ domain, as shown in Fig. 1. The meteorological data
study are described in Sect. 2. The results of the developed     have 27 vertical layers from the surface (1000 hPa) to 50 hPa.
air quality prediction system are discussed in Sect. 3, and      The WRF meteorological fields (e.g., temperature, pressure,
then a summary and conclusions are given in Sect. 4.             wind, humidity, and clouds) were then transformed into the
                                                                 CMAQ-ready format via the Meteorology-Chemistry Inter-
                                                                 face Processor (MCIP; Otte and Pleim, 2010) v4.3, which
2     Methodology                                                is a software to serve for transforming horizontal and verti-
                                                                 cal coordinates while trying to maintain dynamic consistency
The air quality prediction system was developed using the
                                                                 between WRF and CMAQ model simulations.
CMAQ model along with meteorological inputs provided by
the Weather Research and Forecasting (WRF) model (Ska-
                                                                 2.1.2   CMAQ model simulations
marock et al., 2008). The ICs for the CMAQ model sim-
ulations were prepared via the DA method using satellite-
                                                                 The CMAQ v5.1 model was used to estimate the concen-
retrieved and ground-based observations. The performances
                                                                 trations of the atmospheric chemical species over the do-
of the developed prediction system were evaluated using
                                                                 main, as shown in Fig. 1. The CMAQ domain has 204 lat-
ground in situ data. The models, data and DA technique are
                                                                 itudinal and 273 longitudinal grid cells in total and also has
described in detail in the following sections.
                                                                 a 15 km × 15 km horizontal resolution and 27 vertical lay-
2.1     Meteorological and chemistry-transport modeling          ers. The CMAQv5.1 model was configured to use. Chemi-
                                                                 cal and physical configurations for the CMAQ model sim-
2.1.1    WRF model simulations                                   ulations were selected as follows: SAPRC07tc for the gas-
                                                                 phase chemical mechanism (Hutzell et al., 2012); AERO6
The WRF model has been developed for providing mesoscale         for aerosol thermodynamics (Appel et al., 2013); Euler Back-
numerical weather prediction (NWP). It has also been used        ward Iterative (EBI) chemistry solver (Hertel et al., 1993),
to provide meteorological input fields for CTM simulations       which is a numerically optimized photochemistry mecha-
(Appel et al., 2010; Chemel et al., 2010; Foley et al., 2010;    nism solver; M3DRY for dry deposition velocity (Pleim and
Lee et al., 2016; Park et al., 2014). In this study, WRF         Xiu, 2003; Xiu and Pleim, 2001); global mass-conserving
v3.8.1 with the Advanced Research WRF (ARW) dynami-              scheme (Yamartino and WRF) for horizontal and vertical
cal core was applied to prepare the meteorological inputs for    advection (Colella and Woodward, 1984); MULTISCALE
the CMAQ model simulations. Dynamical and physical con-          (Louis, 1979), which is a simple first-order eddy diffusion
figurations for the WRF model simulations were selected as       scheme for horizontal diffusion; and the Asymmetric Con-
follows: the Yonsei University (YSU) scheme for planetary        vective Model, version 2 (ACM2; Pleim, 2007a, b), for ver-
boundary layer (Hong et al., 2006); the WRF single-moment        tical diffusion.
6-class (WSM6) scheme for the microphysics (Hong and                For anthropogenic emissions, KORUS v1.0 emissions
Lim, 2006); the Grell–Freitas ensemble scheme for cumu-          (Woo et al., 2012) were used. The emissions cover almost
lus parameterization (Grell and Freitas, 2014); the Noah-MP      all of Asia and are based on three emission inventories:
land surface model (Niu et al., 2011; Yang et al., 2011); the    the Comprehensive Regional Emissions inventory for Atmo-
Rapid Radiative Transfer Model for Global Circulation Mod-       spheric Transport Experiment (CREATE) for East Asia ex-
els (RRTMG) for shortwave/longwave options (Iacono et al.,       cluding Japan; the Model Inter-Comparison Study for Asia
2008); and the revised MM5 scheme for surface layer options      (MICS-Asia) for Japan; and the Studies of Emissions and
(Jimenez et al., 2012). The National Centers for Environmen-     Atmospheric Composition, Clouds and Climate Coupling by
tal Prediction (NCEP) Final (FNL) Operational Global Anal-       Regional Surveys (SEAC4RS) for South and Southeast Asia.
ysis data on 1◦ ×1◦ grids were chosen for the ICs and bound-        Biogenic emissions were prepared by running the Model
ary conditions (BCs) for the WRF simulations. In order to        of Emissions of Gases and Aerosols from Nature (MEGAN
minimize meteorological field errors for the applications of     v2.1; Guenther et al., 2006, 2012) with a grid size identical to
ICs and BCs to the WRF simulations, the objective analysis       that of the CMAQ model simulations. For the MEGAN sim-
(OBSGRID) nudging was conducted using the NCEP Auto-             ulations, the MODIS land cover data (Friedl et al., 2010) and
mated Data Processing (ADP) Global Upper Air Observa-            improved leaf area index (LAI) based on MODIS datasets
tions/Surface Observational weather data via the Cressman’s      (Yuan et al., 2011) were utilized. Pyrogenic emissions were

www.geosci-model-dev.net/13/1055/2020/                                             Geosci. Model Dev., 13, 1055–1073, 2020
Development of Korean Air Quality Prediction System version 1 (KAQPS v1) with focuses on practical issues - Geoscientific Model Development
1058                                                                           K. Lee et al.: Air quality prediction system in Korea

Figure 1. Domains of GOCI sensor (dark green line) and CMAQ model simulations (blue line). Red-colored dots denote the locations of Air
Korea sites in South Korea. Orange-colored dots represent the locations of ground-based observation stations in China. Blue stars show the
locations of seven supersites in South Korea. During the KORUS-AQ campaign, observation data were obtained from 1514 stations in China
as well as 264 Air Korea and seven supersite stations in South Korea. NCL (2019) was used to draw this figure.

obtained from the Fire Inventory from NCAR (FINN; Wied-                the retrieved GOCI AODs with other satellite-retrieved and
inmyer et al., 2006, 2011). The lateral BCs for the CMAQ               ground-based observations and found several errors in the
model simulations were prepared using the global model re-             cloud masking and surface reflectances. These errors were
sults of the Model for Ozone and Related chemical Tracers,             corrected in the recently updated second version of the GOCI
version 4 (MOZART-4; Emmons et al., 2010) at every 6 h.                YAER algorithm (Choi et al., 2018), which used the updated
The mapping and regridding of the MOZART-4 data were                   cloud masking and more accurate surface reflectances. In this
conducted by matching the CMAQ grid information.                       study, the most recent GOCI AOD products from the GOCI
                                                                       YAER version 2 algorithm were used.
2.2     Observation data
                                                                       2.2.2     Ground-based observations
2.2.1    Satellite-based observations
                                                                       In addition to the satellite data, ground-based observations
A Korean geostationary satellite of Communication, Ocean               in South Korea and China were also collected for use in the
and Meteorological Satellite (COMS) was launched on                    air quality prediction system for PM and gas-phase pollu-
26 June in 2010 over the Korean Peninsula. The COMS is a               tants. The orange, red and blue dots in Fig. 1 represent the
geostationary orbit satellite, and it is stationed at an altitude      ground-based observation sites in China, Air Korea and su-
of approximately 36 000 km at a latitude of 36◦ N and a lon-           persite stations in South Korea, respectively. These observa-
gitude of 128.2◦ E, with a horizontal coverage of 2500 km ×            tions provide real-time concentrations of air quality criteria
2500 km (refer to Fig. 1). Among the three payloads of the             species such as PM10 , PM2.5 , CO, O3 , SO2 and NO2 .
COMS, Geostationary Ocean Color Image (GOCI) is the                       Throughout the period of the KORUS-AQ campaign,
first multichannel ocean color sensor with visible and near-           ground-based observation data were obtained from 1514 sta-
infrared channels. The GOCI instrument provides hourly                 tions in China, 264 Air Korea stations and seven supersite
spectral images with a spatial resolution of 500 m × 500 m             stations in South Korea. In this study, 80 % of the ground-
from 00:30 to 07:30 coordinated universal time (UTC) for               based observations in China and Air Korea stations in South
eight spectral (six visible and two near-infrared) channels at         Korea were randomly selected for use in the prediction sys-
412, 443, 490, 555, 660, 680, 745 and 865 nm.                          tem. The other 20 % of the data and supersite observations
   The Yonsei aerosol retrieval (YAER) algorithm for the               were used to evaluate the performances of the developed air
GOCI sensor was initially developed by Lee et al. (2010)               quality prediction system.
to retrieve the aerosol optical properties (AOPs) over ocean              In addition, AErosol RObotic NETwork (AERONET)
areas and was then improved by expanding to consider non-              AODs were used to conduct an independent evaluation of
spherical aerosol optical properties (Lee et al., 2012). Choi          the air quality prediction system. AERONET is a federated
et al. (2016) further extended the algorithm for application           global ground-based sun photometer network (Holben et al.,
to land surfaces, and the algorithm was referred to as the             1998). Cloud-screened and quality-assured level 2.0 AODs
GOCI YAER version 1 algorithm. With the GOCI YAER                      for the AERONET were used in this study.
algorithm, hourly aerosol optical depths (AODs) at 550 nm
were produced over East Asia. Choi et al. (2016) compared

Geosci. Model Dev., 13, 1055–1073, 2020                                                   www.geosci-model-dev.net/13/1055/2020/
Development of Korean Air Quality Prediction System version 1 (KAQPS v1) with focuses on practical issues - Geoscientific Model Development
K. Lee et al.: Air quality prediction system in Korea                                                                         1059

2.3     Air quality prediction system                               best results in estimating AODs at 550 nm over East Asia. On
                                                                    the basis of Lee et al.’s (2016) work, σext was estimated with
In the present study, the air quality prediction system was         the β and f (RH) values suggested by Chin et al. (2002). Af-
developed by adjusting the ICs for the CMAQ model sim-              ter that, σext was integrated with respect to altitude, in order
ulations based on DA with satellite-retrieved and ground-           to calculate the AODs. The calculated AODs were used in
measured observations. Two parallel WRF-CMAQ model                  the air quality prediction system in order to prepare the ICs
runs were conducted. The first experiment that involved ad-         for the PM predictions.
justing ICs via DA is referred to as DA RUN (see Fig. 2).
In order to evaluate the prediction system, a second exper-         2.3.2   Data assimilation (DA)
iment, in which the ICs were originated from the previous
CMAQ model simulations without assimilations, was also              The ground-based observations, together with GOCI-derived
conducted. This CMAQ run is referred to as BASE RUN.                AODs, were used to prepare the ICs for the air quality predic-
                                                                    tions with the CMAQ model simulations. In order to achieve
2.3.1    AOD calculations                                           this, the following steps were taken: (i) the CMAQ-calculated
                                                                    concentrations of CO, O3 and SO2 were combined with the
CMAQ AODs are calculated by integrating the aerosol ex-             concentrations of CO, O3 and SO2 obtained from ground-
tinction coefficient (σext ) using the following equation:          based observations in South Korea (Air Korea) and China;
                                                                    (ii) the CMAQ-calculated AODs were assimilated with the
             Zz
                                                                    GOCI AODs; (iii) the assimilated AODs were converted into
AOD (λ) =         σext (λ) dz,                               (1)
                                                                    PM10 ; (iv) the converted PM10 was again assimilated at the
             0                                                      surface in South Korea and China; and (v) after the DA at
where z represents the vertical height; σext is defined as the      the surface, the ratios of the assimilated species concentra-
sum of the absorption coefficient (σabs ) and the scattering co-    tions to the original CMAQ-simulated concentrations were
efficient (σsca ); and σabs and σsca can be estimated by Eqs. (3)   applied so as to the adjust vertical profiles of the chemical
and (4), respectively, as shown below.                              species above the surface. In the air quality prediction sys-
                                                                    tem, the DA cycle is 24 h, and the assimilation takes place
σext (λ) = σabs (λ) + σsca (λ)                               (2)    every day at 00:00 UTC (refer to Fig. 3).
         h       i X  n X  m                                            The optimal interpolation (OI) method with the Kalman
σabs (λ) Mm−1 =               {(1 − ωij (λ)) · βij (λ)              filter was chosen in the air quality prediction system. The OI
                           i   j                                    method was originally used for meteorological applications
   · fij (RH) · [C]ij }                                      (3)    (Lorenc, 1986) and has also been used in the assimilations
                        n X
                          m
                                                                    for trace gases (Khattatov et al., 1999, 2000; Lamarque et
         h       i X
σsca (λ) Mm−1 =             {ωij (λ) · βij (λ)                      al., 1999; Levelt et al., 1998). Recently, the OI technique has
                           i   j                                    also been applied to aerosol fields (Collins et al., 2001; Yu
                                                                    et al., 2003; Generoso et al., 2007; Adhikary et al., 2008;
   · fij (RH) · [C]ij },                                     (4)    Carmichael et al., 2009; Chung et al., 2010; Park et al., 2011,
where i and j denote the particulate species and size bin           2014; Tang et al., 2015, 2017).
(or particle mode), respectively; ωij (λ) is the single scatter-        Aerosol assimilation using the OI method was first applied
ing albedo; βij (λ) is the mass extinction efficiency (MEE)         by Collins et al. (2001) as follows:
of particulate species i for the size bin or particle mode j ;      τ 0 m = τm + K(τo − Hτm )                                   (5)
[C]ij is the concentration of particulate species including                  T       T          −1
(NH4 )2 SO4 , NH4 NO3 , black carbon, organic aerosols (OA),        K = BH (HBH + O)                                            (6)
mineral dust and sea-salt aerosols; RH is the relative humid-                    2         2
                                                                    O = [(fo τo ) + (εo ) ]I                                    (7)
ity; fij (RH) is the hygroscopic factor; and the single scatter-                            2        2
                                                                    B (dx , dz ) = [(fm τm ) + (εm ) ]
ing albedo (ω) implies to the fraction (portion) of scattering                              "       #
in the total extinction.                                                          dx2           dz2
                                                                                     
   Using Eqs. (2)–(4), AODs were calculated from the                   exp − 2 exp − 2 ,                                        (8)
                                                                                 2lmx         2lmz
aerosol composition and RH. There have been intensive tests
using different β and f (RH) values in the following three          where τ 0 m , τm and τo represent the assimilated products by
previous studies: (1) Chin et al.’s (2002) study with the God-      the OI method, the modeled values and the observed values,
dard Chemistry Aerosol Radiation and Transport (GOCART)             respectively; K is the Kalman gain matrix; H is the obser-
model; (2) Martin et al.’s (2003) study with the GEOS-Chem          vation operator (or forward operator), which is an interpola-
model; and (3) Malm and Hand’s (2007) study with the                tor from model to observation space; B and O are the back-
CMAQ model. Lee et al. (2016) tested these methods and              ground and observation error covariance matrices, respec-
then found that Chin et al.’s (2002) method reproduced the          tively; (·)T denotes the transpose of a matrix; fo is the frac-

www.geosci-model-dev.net/13/1055/2020/                                                   Geosci. Model Dev., 13, 1055–1073, 2020
Development of Korean Air Quality Prediction System version 1 (KAQPS v1) with focuses on practical issues - Geoscientific Model Development
1060                                                                           K. Lee et al.: Air quality prediction system in Korea

Figure 2. Schematic diagram of the Korean air quality prediction system developed in this study. The initial conditions (ICs) of the CMAQ
model simulations are prepared by assimilating CMAQ outputs with satellite-retrieved and ground-measured observations. The data process
for preparing the ICs is shown in the box with dashed gray lines.

                                                                       and model, the mathematical formalisms of which are de-
                                                                       scribed in Eqs. (7) and (8), respectively. Several previous
                                                                       studies have used fixed values for free parameters (Collins
                                                                       et al., 2001; Yu et al., 2003; Adhikary et al., 2008; Chung et
                                                                       al., 2010). These runs are called “static” runs. In contrast to
                                                                       those previous studies, “non-static” free parameters were ap-
                                                                       plied in this study by minimizing the differences between the
                                                                       assimilated values and observations via an iterative process
                                                                       at each assimilation time step. This non-static free-parameter
Figure 3. Schematic diagram of the Korean air quality prediction       scheme is possible due to the fact that the OI technique with
system for particulate matter (PM) and gas-phase pollutants. The       the Kalman filter is much less costly in terms of computation
data assimilation (DA) cycle is 24 h for both PM and gas-phase pol-    time than other DA techniques, such as the 3D or 4D vari-
lutants such as CO, O3 and SO2 . The DA of NO2 is excluded in the      ational methods. This is another advantage of using the OI
current study, the reason for which is discussed in the text.
                                                                       technique in this system. It typically takes less than 20 min
                                                                       with a workstation environment (dual Intel Xeon 2.40 GHz
                                                                       processor).
tional error in the observation-retrieved value; εo is the min-
imum root-mean-square error in the observation-retrieved               2.3.3     Allocation of the assimilated PM10 & PM2.5 to
values; I denotes the unit matrix; fm is the fractional error                    particulate composition
in the model estimates; εm is the minimum root-mean-square
error in the model estimates; dx is the horizontal distance            In the procedure of DA, PM10 was assimilated in this study,
between two model grid points; lmx is the horizontal correla-          because the PM10 data were more plentiful than PM2.5 . The
tion length scale for the errors in the model; dz is the vertical      assimilated PM10 then needs to be allocated to the PM com-
distance between two model grid points; and lmz is the ver-            position for the CMAQ-model prediction runs. In order to
tical correlation length scale for the errors in the model. In         achieve this, the differences between the assimilated PM10
this work, the OI technique was applied for the DA of atmo-            and background PM10 (1PM10 ) were first calculated. Then,
spheric gaseous species as well as particulate species.                1PM2.5 was estimated using the ratios of PM2.5 to PM10
   Six free parameters (fm , fo , εm , εo , lmx and lmz ) were used    from the background CMAQ model runs (i.e., 1PM2.5 =
to calculate the error covariance matrices of the observations         1PM10 × PM2.5 /PM10 ). 1PM2.5 was then allocated to the

Geosci. Model Dev., 13, 1055–1073, 2020                                                   www.geosci-model-dev.net/13/1055/2020/
Development of Korean Air Quality Prediction System version 1 (KAQPS v1) with focuses on practical issues - Geoscientific Model Development
K. Lee et al.: Air quality prediction system in Korea                                                                             1061

Figure 4. Average PM2.5 composition (a) observed at the supersite
stations and (b) simulated by the CMAQ model during the KORUS-
AQ campaign. The averaged PM2.5 measured from the supersites
and calculated from the CMAQ model simulations over the period
of the KORUS-AQ campaign are 28 and 19.9 µg m−3 , respectively.
The mass of organic aerosols (OAs) was calculated by multiplying
organic carbon mass by 1.6.

                                                                    Figure 5. Time-series plots of hourly (a) PM10 , (b) PM2.5 , (c) CO,
PM2.5 composition according to the comparison between two           (d) SO2 and (e) O3 concentrations at 264 Air Korea stations. Open
PM2.5 compositions observed at the seven supersites and             black circles (OBS) represent the observed concentrations. Blue and
simulated from the CMAQ model runs over South Korea.                red lines show the results simulated from the BASE RUN and DA
Both of the compositions are shown in Fig. 4. In Fig. 4, the        RUN over South Korea, respectively.
PM “OTHERS” indicates the remaining particulate matter
species after excluding sulfate, nitrate, ammonium, organic
aerosol (OA) and elementary carbon (EC). The PM OTH-
                                                                    the DA at 00:00 UTC. For the forecast hours from 01:00
ERS occupies 25 % of the total PM2.5 observed at supersites.
                                                                    to 23:00 UTC, all of the ground observations (254 Air Ko-
The other fraction, 1PM10 ×(1−PM2.5 /PM10 ), was also dis-
                                                                    rea and seven supersite stations) were used to evaluate the
tributed to the coarse-mode particles (PM2.5–10 ) as crustal el-
                                                                    performances of the developed air quality prediction system.
ements.
                                                                    As shown in Fig. 5, we achieved some improvements in the
                                                                    prediction performances by applying the ICs to the CMAQ
3     Results and discussions                                       model simulations. The BASE RUN significantly underpre-
                                                                    dicted PM10 , PM2.5 and CO, while the DA RUN produced
The performances of the air quality prediction system were          concentrations that were more consistent with the observa-
evaluated by comparing them with ground-based observa-              tions than those of the BASE RUN.
tions from the Air Korea network and supersite stations in             In the case of CO, the observed CO mixing ratios were
South Korea. Several sensitivity analyses were also con-            about 3 times higher than those from the BASE RUN. These
ducted in order to assess the influences of the DA time in-         large differences are well known and have been attributed
tervals on the accuracy of the air quality prediction.              to the underestimated emissions of CO (Heald et al., 2004).
                                                                    However, when the DA was applied, the predictions of the
3.1     Evaluation of the air quality prediction system             CO mixing ratios improved. Similarly, the performances of
                                                                    the PM10 and PM2.5 predictions improved with the applica-
3.1.1    Time-series analysis                                       tion of the DA. Unlike PM10 , PM2.5 and CO, the O3 mixing
                                                                    ratios and its diurnal trends from both the BASE RUN and
Figure 5 shows the time-series plots of PM10 , PM2.5 , CO,          DA RUN tend to be well matched with the observations. By
O3 and SO2 concentrations from the BASE RUN and the DA              contrast, the poorest performances of the BASE RUN and the
RUN. Here, the observation data (OBS) obtained from the             DA RUN were shown for SO2 .
Air Korea network were compared with the results of the                In addition, a dust event took place between 6 and 8 May.
two sets of the CMAQ model simulations, i.e., (1) BASE              This event is captured by the DA RUN (check red peaks in
RUN and (2) DA RUN. As mentioned previously, 20 % of                Fig. 5a and b), while the BASE RUN cannot capture this dust
the Air Korea observations used in the evaluation were ran-         event. This demonstrates the capability of the current system
domly selected during the period of the KORUS-AQ cam-               to possibly predict dust events in South Korea. In the DA
paign. The other 80 % of the Air Korea data were used in            RUN, dust information is provided to the CMAQ model runs

www.geosci-model-dev.net/13/1055/2020/                                                 Geosci. Model Dev., 13, 1055–1073, 2020
Development of Korean Air Quality Prediction System version 1 (KAQPS v1) with focuses on practical issues - Geoscientific Model Development
1062                                                                      K. Lee et al.: Air quality prediction system in Korea

                                                                     Figure 8. Comparison of WRF-simulated mixing layer height
Figure 6. Aggregated average concentrations of (a) PM10 ,            (MLH) (denoted by dashed blue line) with lidar-measured MLH
(b) PM2.5 , (c) CO and (d) SO2 at 264 Air Korea stations over the    (denoted by open black circles) at Seoul National University (SNU)
KORUS-AQ campaign period. Open black circles denote the ob-          in Seoul. KST stands for Korean standard time.
servations obtained from 264 Air Korea stations in South Korea.
Blue and red lines represent the predicted concentrations from the
BASE RUN and DA RUN, respectively. The DA was conducted at           and China, the predicted ozone mixing ratios became sub-
00:00 UTC every day throughout the KORUS-AQ campaign period.         stantially closer to the observations, as shown in Fig. 7b. This
                                                                     is clearly due to the fact that NOx is an important precursor
                                                                     of ozone. In the prediction of the ozone mixing ratios, both
                                                                     1 h peak ozone (around 15:00 KST) and 8 h averaged ozone
                                                                     mixing ratios (between 09:00 and 17:00 KST) are important.
                                                                     Figure 7 clearly shows that the prediction accuracies of both
                                                                     the ozone mixing ratios were improved after the DA of NO2
                                                                     mixing ratios.
                                                                        Although the DA for NO2 provided better ozone predic-
Figure 7. Comparison of CMAQ-simulated O3 mixing ratios              tions, one should take caution in using the NO2 observa-
(BASE RUN with blue lines and DA RUN with red lines) with O3         tions. The NO2 mixing ratios measured at Air Korea sites are
mixing ratios from Air Korea stations (open black circles). DA RUN   known to be contaminated by other nitrogen gases such as
was carried out by assimilating CMAQ outputs with Air Korea ob-      nitric acid (HNO3 ), peroxyacetyl nitrates (PANs) and alkyl
servations using (a) only O3 mixing ratios and (b) both O3 and NO2   nitrates (ANs), since the Air Korea NO2 mixing ratios are
mixing ratios.                                                       measured through a chemiluminescent method with catalysts
                                                                     of gold or molybdenum oxide at high temperatures. These
                                                                     are known to be “NO2 measurement artifacts” (Jung et al.,
through both/either GOCI AOD and/or ground PM observa-               2017), which is one of the reasons that the DA of NO2 was
tions measured along the dust plume tracks.                          not shown in Fig. 6. The NO2 mixing ratios are corrected
   The effectiveness of the DA with respect to prediction time       from the Air Korea NO2 data and are then used to prepare the
was also analyzed by calculating the aggregated average con-         ICs via the DA for more accurate ozone and NO2 predictions.
centrations of atmospheric species (see Figs. 6, 7 and 9). Fig-      Currently, such corrections of the observed NO2 mixing ra-
ure 6 depicts the CMAQ-calculated average concentrations             tios are being standardized with more sophisticated year-long
of PM10 , PM2.5 , CO and SO2 against the Air Korea observa-          NO2 measurements. After the corrections of the NO2 mea-
tions. Our air quality prediction system regenerated relatively      surement artifacts, more evolved schemes of ozone and NO2
well-matched concentrations for PM10 , PM2.5 and CO from             predictions will be possible in the future. As shown in Fig. 7,
the DA RUN.                                                          about a 20 % reduction (average fraction of non-NO2 mixing
   Figure 7 shows the case of ozone from the DA RUN by               ratios in the observed NO2 mixing ratios) was made for these
assimilating CMAQ outputs with Air Korea-observed (a) O3             demonstration runs (Jung et al., 2017).
mixing ratios and (b) both O3 and NO2 mixing ratios for a               Another practical issue is now discussed. Although the as-
preliminary test run. The ozone mixing ratios from the DA            similation with the observed NO2 mixing ratios can enhance
RUN in Fig. 7a were reasonably consistent with the observa-          the accuracy of the predictions of the daytime ozone mixing
tions at 00:00 UTC but disagreed with those between 04:00            ratios, the nighttime ozone mixing ratios tend to be consis-
and 09:00 UTC (13:00 and 18:00 KST), when solar insola-              tently overpredicted in the aggregated plot of the ozone mix-
tion is the most intense. This may be attributed to the chemi-       ing ratios at the observation sites (see Fig. 7). This can be
cal imbalances between ozone production and ozone destruc-           caused by underestimated NO2 mixing ratios and thus not
tion (or titration). However, if CMAQ NO2 was assimilated            enough nighttime ozone titration. As mentioned before, re-
with ground-based observations in South Korea (Air Korea)            liable NO2 prediction via the correction of the NO2 mea-

Geosci. Model Dev., 13, 1055–1073, 2020                                                www.geosci-model-dev.net/13/1055/2020/
K. Lee et al.: Air quality prediction system in Korea                                                                           1063

                                                                        3.1.2   Spatial distribution

                                                                        Figure 10 shows the spatial distributions and bias of PM and
                                                                        chemical species throughout the entire period of the KORUS-
                                                                        AQ campaign over the Seoul metropolitan area (SMA). No-
                                                                        ticeable improvements are observed to have been achieved
                                                                        in the spatial distributions by applying the ICs to the CMAQ
                                                                        model simulations, particularly for PM10 (Fig. 10a), PM2.5
                                                                        (Fig. 10b) and CO (Fig. 10c). As shown in Fig. 10, the un-
                                                                        derpredicted concentrations of PM10 , PM2.5 and CO were
                                                                        adjusted to concentrations closer to the observations. In the
                                                                        case of SO2 (see Fig. 10d), the DA RUN produced better
                                                                        agreement with the observations than the BASE RUN, but
                                                                        there were still underpredicted SO2 concentrations over the
                                                                        northeastern part of the SMA.
                                                                           By contrast, relatively lower ozone mixing ratios from the
                                                                        DA RUN against the BASE RUN were found in the south-
                                                                        western part of the SMA (see Fig. 10e). Due to the non-
                                                                        linear relationship between NOx and O3 , high mixing ra-
                                                                        tios (or emissions) of NOx in the SMA can lead to deple-
                                                                        tion of ozone. In these runs, the precursors of ozone such
                                                                        as NOx and volatile organic compounds (VOCs) were ex-
Figure 9. Aggregated average concentrations of (a) PM10 ,               cluded in the preparation of the ICs for CMAQ model sim-
(b) PM2.5 , (c) OC, (d) EC, (e) sulfate, (f) nitrate and (g) ammonium   ulations. Again, this is because the Air Korea NO2 mixing
as predicted by CMAQ model during the period of the KORUS-AQ            ratios are contaminated by several reactive nitrogen species,
campaign. The others are the same as those shown in Fig. 7, except
                                                                        so the data cannot be directly used in the assimilation pro-
for the fact that the observation data used here were obtained from
the seven supersite stations in South Korea.
                                                                        cedures. In the case of VOCs, a limited number of datasets
                                                                        are available in South Korea for the DA. Improvements in
                                                                        the prediction of ozone mixing ratios can be achieved when
surement artifacts will be made in the future. Another pos-             the NO2 mixing ratios are corrected and a sufficient amount
sible reason of the overpredicted ozone mixing ratios dur-              of VOC data (possibly from satellite data in the future) are
ing the nighttime can be underestimation of the mixing layer            available.
height (MLH). Figure 8 shows a comparison between lidar-
measured MLH (black dashed line) and WRF-calculated                     3.1.3   Statistical analysis
MLH (with the option of the Yonsei University planetary
boundary layer scheme by Hong et al., 2006; see red line). As           In order to achieve a better understanding of the perfor-
shown in Fig. 8, the nocturnal lidar-measured MLH is about              mances of the DA RUN, analyses of statistical variables such
2 times higher than the nocturnal WRF-calculated MLH as                 as index of agreement (IOA), Pearson’s correlation coeffi-
measured at a lidar site inside the campus of Seoul National            cient (R), root-mean-square error (RMSE) and mean bias
University (SNU) in Seoul. Such underestimated MLH in                   (MB) were conducted using observations from the Air Korea
the model tends to compress the ozone molecules within the              stations for PM10 , PM2.5 , CO, SO2 and O3 (see Fig. 11). Def-
mixing layer during the nighttime, which leads to consis-               initions of the statistical variables are given in Appendix A.
tently overpredicted nocturnal ozone mixing ratios. Based on               After the applications of the ICs, both RMSE and MB be-
this discrepancy shown in Fig. 8, more intensive comparison             came lower, while the correlation coefficient became higher
study is being carried out by comparing lidar-measured MLH              for the entire set of predictions. In addition, it was found
with model-calculated MLH at multiple sites in South Korea.             that the differences between the BASE RUN and the DA
   In this work, the aerosol composition (including EC, OA,             RUN tended to diminish as the prediction time progressed.
sulfate, nitrate and ammonium) was further compared with                The results of the statistical analysis are listed in Table 1.
the composition observed at the supersites shown in Fig. 9.             The results of the DA RUN were reasonably consistent with
As shown in Fig. 9, agreement was observed between the DA               the observations for PM10 (IOA = 0.60; R = 0.40; RMSE
RUN and observations for all of the major PM constituents.              = 34.87; MB = −13.54) and PM2.5 (IOA = 0.71; R = 0.53;
Again, a strong capability of our DA system is to improve               RMSE = 17.83; MB = −2.43), as compared to the BASE
the predictions of the aerosol composition.                             RUN for PM10 (IOA = 0.51; R = 0.34; RMSE = 40.84;
                                                                        MB = −27.18) and PM2.5 (IOA = 0.67; R = 0.51; RMSE
                                                                        = 19.24; MB = −9.9). In terms of bias, an improvement

www.geosci-model-dev.net/13/1055/2020/                                                    Geosci. Model Dev., 13, 1055–1073, 2020
1064                                                                          K. Lee et al.: Air quality prediction system in Korea

Figure 10. Spatial distributions (first and second columns) and bias (third and fourth columns) of (a) PM10 , (b) PM2.5 , (c) CO, (d) O3 and
(e) SO2 over Seoul metropolitan area (SMA) for the entire period of the KORUS-AQ campaign. Colored circles of first and second columns
represent the concentrations of the air pollutants observed at the Air Korea stations in the SMA.

was found for C, with MB = -0.036 for the DA RUN and                    trations. Collectively, the DA improved model accuracy by
MB = −0.27 for the BASE RUN. Regarding O3 and SO2 ,                     a large degree in terms of R, particularly for PM10 (R:
the DA RUN showed slightly better performances than the                 0.3 → 0.75; slope: 0.17 → 0.66) and O3 (R: 0.09 → 0.61;
BASE RUN.                                                               slope: 0.07 → 0.42). In addition, for all species, MB and
   Table 2 presents the results of the statistical analysis at          RMSE decreased significantly with the DA RUN as com-
00:00 UTC when the DA was conducted, with the results                   pared with the BASE RUN.
clearly showing how much closer the DA makes the CMAQ-
calculated chemical concentrations to the observed concen-

Geosci. Model Dev., 13, 1055–1073, 2020                                                    www.geosci-model-dev.net/13/1055/2020/
K. Lee et al.: Air quality prediction system in Korea                                                                          1065

Figure 11. Time-series plots of four performance metrics (IOA, R, RMSE and MB) for (a) PM10 , (b) PM2.5 , (c) CO, (d) SO2 and (e) O3
forecasts. The DA was conducted at 00:00 UTC. The units of RMSE and MB are micrograms per cubic meter and parts per million by volume
for PM concentrations and for gaseous species, respectively. The definitions of the four performance metrics are shown in Appendix A.

3.2     Sensitivity test of DA time interval                         frequent implementation of the DA is expected to make the
                                                                     predicted results closer to the observations. Although the DA
3.2.1    AOD                                                         RUN with a shorter assimilation time interval tends to pro-
                                                                     duce a better prediction, it is not always the most appropri-
In this section, a sensitivity analysis was conducted with dif-      ate choice, since the shorter assimilation time interval results
ferent implementation time intervals of the DA (i.e., 24, 6 and      in increased computational cost. Therefore, an optimized as-
3 h) for AOD (refer to Fig. 12). As shown in Fig. 12, more           similation time interval should be found to achieve the best

www.geosci-model-dev.net/13/1055/2020/                                                 Geosci. Model Dev., 13, 1055–1073, 2020
1066                                                                         K. Lee et al.: Air quality prediction system in Korea

Table 1. Statistical metrics from BASE RUN and DA RUN with Air Korea observations over the entire period of the KORUS-AQ campaign.

                   PM10                    PM2.5                    CO                       SO2                     O3
           BASE RUN       DA RUN     BASE RUN     DA RUN    BASE RUN       DA RUN     BASE RUN      DA RUN    BASE RUN      DA RUN
  N               101 852                 65 383                   101 764                  101 764                101 836
  IOA            0.51         0.60       0.67        0.71         0.41         0.51        0.34        0.35        0.69        0.70
  R              0.34         0.40       0.51        0.53         0.28         0.21        0.14        0.15        0.50        0.52
  RMSE           40.8        34.87       19.2       17.83         0.31         0.19      0.0068      0.0066       0.020        0.02
  MB            −27.2       −13.54       −9.9       −2.43        −0.27        −0.04     −0.0009     −0.0004       0.003     −0.0024
  ME             30.1        24.20       15.3       13.48         0.27         0.15       0.004      0.0034       0.015       0.015
  MNB           −50.0       −18.17      −30.1        5.32        −62.0         3.14         3.1       17.77        48.0       30.22
  MNE            60.7        52.35       62.6       62.77         62.9        40.67        93.1       93.56        70.2       61.34
  MFB           −84.3       −41.61      −63.6      −24.41        −94.1       −10.00      −56.4       −40.20        11.1      −0.82
  MFE            91.1        62.32       81.6       60.01         94.9        39.49        91.4       82.91        40.7       40.64

Table 2. Statistical metrics from BASE RUN and DA RUN with Air Korea observations at 00:00 UTC when the DA was conducted during
the KORUS-AQ campaign.

                   PM10                    PM2.5                    CO                       SO2                     O3
           BASE RUN       DA RUN     BASE RUN     DA RUN    BASE RUN       DA RUN     BASE RUN      DA RUN    BASE RUN      DA RUN
  N                1057                     695                     1024                     1007                    1043
  IOA            0.48         0.86       0.63        0.74         0.41         0.62        0.36        0.44        0.45        0.75
  R              0.30         0.75       0.46        0.59         0.28         0.43       0.097        0.27        0.09        0.61
  RMSE           47.2        23.92       21.5       18.21         0.35         0.16      0.0061      0.0039       0.023       0.012
  MB            −32.2        −5.46      −11.5        2.80        −0.31        −0.01     −0.0019     −0.0009       0.015       0.002
  ME             34.5        16.03       17.2       13.25         0.31         0.12      0.0039      0.0023       0.018       0.009
  MNB           −54.9        −0.53      −33.2       26.17        −64.3         9.69      −20.1         7.35       100.4       27.45
  MNE            64.0        36.07       63.1       59.77         64.8        30.69        86.7       55.27       107.8       43.81
  MFB           −92.8       −13.38      −67.3        0.56        −98.7         1.81      −75.9       −17.39        43.7       12.16
  MFE            98.8        38.41       84.3       48.30         99.1        27.14        99.9       56.23        52.9       31.53

performances from the given DA system with the considera-            the best performance was found for the air quality prediction
tion of its own computational ability.                               system with the DA time interval of 24 h.
                                                                        Unsurprisingly, more frequent DAs prior to the actual pre-
                                                                     diction mode (i.e., before 00:00 UTC in our system) with a
3.2.2   PM and gases
                                                                     longer time interval (such as 2 h) will be computationally
                                                                     costly. There will certainly be a tradeoff between the preci-
In addition, sensitivity analyses of the developed air quality       sion of air quality prediction and the computational cost. The
prediction system applied to multiple implementations of the         system should be designed under the consideration of these
DA with different time intervals were also investigated for          two factors.
(a) PM10 , (b) PM2.5 , (c) CO, (d) SO2 and (e) O3 , shown in
Fig. 13. Figure 13 shows a soccer plot analysis for BASE
RUN (blue crosses) and DA RUNs with different DA time                4     Summary and conclusions
intervals of 24 h (OI; red circles), 2 h (2 h OI; black dia-
monds) and 1 h (1 h OI; dark-green triangles). This set of           In this study, the air quality prediction system was developed
testing was designed based on the fact that the performances         by preparing the ICs for CMAQ model simulations using
are expected to improve if the DAs are implemented multiple          GOCI AODs and ground-based observations of PM10 , CO,
times prior to the actual predictions at 00:00 UTC. Here, for        ozone and SO2 during the period of the KORUS-AQ cam-
the 2 h OI run, the DA was implemented three times a day             paign (1 May–12 June 2016) in South Korea. The major ad-
at 20:00, 22:00 and 00:00 UTC, while for the 1 h OI run, the         vantages of the developed air quality prediction system are
DA was implemented at 22:00, 23:00 and 00:00 UTC. The                its comprehensiveness in predicting the ambient concentra-
performances of all of the chemical species excluding ozone          tions of both gaseous and particulate species (including PM
improved, as expected, with DA RUNs with more frequent               composition) and its powerfulness in terms of computational
and longer DA time intervals (i.e., three-time implementation        cost.
with a 2 h time interval in our cases). In the case of ozone,

Geosci. Model Dev., 13, 1055–1073, 2020                                                 www.geosci-model-dev.net/13/1055/2020/
K. Lee et al.: Air quality prediction system in Korea                                                                                 1067

Figure 12. Variations in three performance metrics (R, RMSE and
MB) with time intervals of data assimilations. For these tests, the
GOCI AODs were used in the DA to update the initial condi-
tions of the CMAQ model simulations. The results from the three
CMAQ model simulations were compared with AERONET AODs
(“ground truth”). The blue squares represent the performances from
the BASE RUNs and the red squares indicate the performances from         Figure 13. Soccer plot analyses for (a) PM10 , (b) PM2.5 , (c) CO,
the DA RUNs. The three experiments were carried out with the as-         (d) SO2 and (e) O3 . The CMAQ-predicted concentrations were
similation time intervals of 24, 6 and 3 h, respectively. Here, the DA   compared with the Air Korea observations. Blue crosses, red cir-
RUN with the 24 h time interval is referred to as “air quality predic-   cles, dark-green triangles and black diamonds represent the perfor-
tion”, and the DA RUNs with the 6 and 3 h time interval are referred     mances calculated from the BASE RUN, the DA RUNs with the OI
to as “air quality reanalysis”.                                          system, the 1 h OI system and the 2 h OI system, respectively.

   The performances of the developed prediction system                      Moreover, the developed air quality prediction system will
were evaluated using near-surface in situ observation data.              be upgraded by using the new observation data that will
The CMAQ model runs with the ICs (DA RUN) showed                         be retrieved after 2020 from the Geostationary Environment
higher consistency with the observations of almost all of                Monitoring Spectrometer (GEMS) with a high spatial reso-
the chemical species, including PM composition (sulfate, ni-             lution of 7 km×8 km as well as a high temporal resolution of
trate, ammonium, OA and EC) and atmospheric gases (CO,                   1 h over a large part of Asia. In addition, the current DA tech-
ozone and SO2 ), than the CMAQ model runs without the                    nique of the OI with the Kalman filter can also be upgraded
ICs (BASE RUN). Particularly for CO, the DA was able                     with the use of more advanced DA methods such as varia-
to remarkably improve the model performances, while the                  tional techniques of 3DVAR and 4DVAR methods, as well
BASE RUN significantly underpredicted the CO concentra-                  as with the ensemble Kalman filter (EnK) method. These re-
tions (predicting about one-third of the observed values). In            search endeavors are currently underway.
the case of ozone, both the BASE RUN and DA RUN were                        In conjunction with improving the air quality modeling
in close agreement with observations. More reliable predic-              system, artificial intelligence (AI)-based air quality predic-
tions of ozone mixing ratios will be achieved via the DA                 tion systems are also currently being developed in several
of the observed NO2 mixing ratios and the corrections of                 ways (e.g., Kim et al., 2019). Actually, Kim et al. (2019) de-
model-simulated mixing layer height (MLH). For SO2 , the                 veloped an AI-based PM prediction system based on a deep
performances of both the BASE RUN and the DA RUN were                    recurrent neural network (RNN) in South Korea. The AI-
somewhat poor. Regarding this issue, more accurate SO2                   based prediction system was optimized by iterative model
emissions are required to achieve better SO2 predictions, and            trainings with the inputs of ground-observed PM10 , PM2.5 ,
these can be estimated through inverse modeling using satel-             and meteorological fields including wind speed, wind direc-
lite data (e.g., Lee et al., 2011). The adjustments of both ICs          tion, relative humidity, and precipitation. The AI-based pre-
and emissions may be able to improve the performances of                 diction system showed better performances at several sites
the air quality prediction system, and this will be examined             than the CMAQ model simulations. However, it works only
in future studies.                                                       for the observation sites in South Korea where ground-based

www.geosci-model-dev.net/13/1055/2020/                                                      Geosci. Model Dev., 13, 1055–1073, 2020
1068                                                              K. Lee et al.: Air quality prediction system in Korea

observations are available. By taking advantage of both the
CTM-based air quality prediction and the AI-based predic-
tion systems, both systems will be eventually combined so
as to create a more accurate hybrid air quality prediction sys-
tem over South Korea. This will be the ultimate goal of the
series of our research work.

Geosci. Model Dev., 13, 1055–1073, 2020                                      www.geosci-model-dev.net/13/1055/2020/
K. Lee et al.: Air quality prediction system in Korea                                              1069

Appendix A: Formulas for statistical evaluation indices

The formulas used to evaluate the performances of the air
quality prediction system are defined as follows:

Index of agreement (IOA) =
             n
                  (M − O)2
             P
              1
  1−   n                                                 (A1)
       P                        2
           M − Ō + O − Ō
       1
Correlation coefficient (R) =
            n                    
     1 X          O − Ō      M − M̄
                                                         (A2)
  (n − 1) 1          σO        σm
                                     v
                                     u n
                                     uP
                                     u (M − O)2
                                     t 1
Root-mean-square error (RMSE) =                          (A3)
                                         n
                         n
                      1X
Mean bias (MB) =           (M − O)                       (A4)
                      n 1
Mean normalized bias (MNB) =
    n          
  1X    M −O
                  × 100 %                                (A5)
  n 1      O
Mean normalized error (MNE) =
    n           
  1X    |M − O|
                    × 100 %                              (A6)
  n 1       O
Mean fractional bias (MFB) =
    n
  1X  (M − O)
        M+O
             × 100 %                                    (A7)
  n 1    2
Mean fractional error (MFE) =
    n
  1X  |M − O|
        M+O
             × 100 %.                                   (A8)
  n 1    2

In Eqs. (A1)–(A8), M and O represent the model and obser-
vation data, respectively. N is the number of data points, and
σ means the standard deviation. The overbars in the equa-
tions indicate the arithmetic mean of the data. The units of
RMSE and MB are the same as the unit of data, while IOA
and R are dimensionless statistical parameters.

www.geosci-model-dev.net/13/1055/2020/                           Geosci. Model Dev., 13, 1055–1073, 2020
1070                                                                       K. Lee et al.: Air quality prediction system in Korea

Code and data availability. WRF                          v3.8.1       References
(https://doi.org/10.5065/D6MK6B4K, Skamarock et al., 2008)
and CMAQ v5.1 (https://doi.org/10.5281/zenodo.1079909,                Adhikary, B., Kulkarni, S., Dallura, A., Tang, Y., Chai, T., Leung,
US EPA Office of Research and Development, 2015)                        L. R., Qian, Y., Chung, C. E., Ramanathan, V., and Carmichael,
models are both open source and publicly available.                     G. R.: A regional scale chemical transport modeling of Asian
Source codes for WRF and CMAQ can be downloaded at                      aerosols with data assimilation of AOD observations using op-
http://www2.mmm.ucar.edu/wrf/users/downloads.html         (Ska-         timal interpolation technique, Atmos. Environ., 42, 8600–8615,
marock et al., 2008) and https://github.com/USEPA/CMAQ                  https://doi.org/10.1016/j.atmosenv.2008.08.031, 2008.
(US EPA, 2020), respectively. Data from the KORUS-AQ                  Appel, K. W., Roselle, S. J., Gilliam, R. C., and Pleim, J. E.:
field campaign can be downloaded from the KORUS-AQ                      Sensitivity of the Community Multiscale Air Quality (CMAQ)
data archive (http://www-air.larc.nasa.gov/missions/korus-aq,           model v4.7 results for the eastern United States to MM5 and
NASA, 2020a). Other data were acquired as follows.                      WRF meteorological drivers, Geosci. Model Dev., 3, 169–188,
Ground-based observation data were downloaded from the                  https://doi.org/10.5194/gmd-3-169-2010, 2010.
Air Korea website (http://www.airkorea.or.kr, Korea En-               Appel, K. W., Pouliot, G. A., Simon, H., Sarwar, G., Pye, H. O.
vironment Corporation of the Ministry of Environment,                   T., Napelenok, S. L., Akhtar, F., and Roselle, S. J.: Evaluation of
2020) for South Korea and from https://pm25.in (CNEMC,                  dust and trace metal estimates from the Community Multiscale
2020) for China. AERONET data were downloaded from                      Air Quality (CMAQ) model version 5.0, Geosci. Model Dev., 6,
https://aeronet.gsfc.nasa.gov (NASA, 2020b). The KAQPS v1               883–899, https://doi.org/10.5194/gmd-6-883-2013, 2013.
(https://doi.org/10.5281/zenodo.3659551, Lee, 2020) code can be       Bellouin, N., Boucher, O., Haywood, J., and Reddy, M.
obtained by contacting Kyunghwa Lee (lkh1515@gmail.com) or              S.: Global estimate of aerosol direct radiative forcing
from https://github.com/AIR-Codes/KAQPSv1 (Lee, 2020). NCL              from satellite measurements, Nature, 438, 1138–1141,
(2019; https://doi.org/10.5065/D6WD3XH5) was used to draw the           https://doi.org/10.1038/nature04348, 2005.
figures.                                                              Breon, F.-M., Tanre, D., and Generoso, S.: Aerosol Effect on Cloud
                                                                        Droplet Size Monitored from Satellite, Science, 295, 834–838,
                                                                        https://doi.org/10.1126/science.1066434, 2002.
Author contributions. KL developed the model code, performed          Byun, D. and Schere, K. L.: Review of the Governing
the simulations and analyzed the results. CHS directed the experi-      Equations, Computational Algorithms, and Other Compo-
ments. JY contributed to shaping the research and analysis. SL, MP,     nents of the Models-3 Community Multiscale Air Quality
HH and SYP helped analyze the results. MC, JK, YK, JH and SK            (CMAQ) Modeling System, Appl. Mech. Rev, 59, 51–77,
provided and analyzed data applied in the experiments. KL prepared      https://doi.org/10.1115/1.2128636, 2006.
the paper with contributions from all coauthors.                      Byun, D. W. and Ching, J. K. S.: Science algorithms of the EPA
                                                                        models-3 community multiscale air quality (CMAQ) model-
                                                                        ing system, U.S. Environmental Protection Agency, EPA/600/R-
                                                                        99/030 (NTIS PB2000-100561), 1999.
Competing interests. The authors declare that they have no conflict
                                                                      Carmichael, G. R., Sakurai, T., Streets, D., Hozumi, Y., Ueda, H.,
of interest.
                                                                        Park, S. U., Fung, C., Han, Z., Kajino, M., Engardt, M., Bennet,
                                                                        C., Hayami, H., Sartelet, K., Holloway, T., Wang, Z., Kannari,
                                                                        A., Fu, J., Matsuda, K., Thongboonchoo, N., and Amann, M.:
Acknowledgements. Special thanks go to the entire KORUS-AQ              MICS-Asia II: The model intercomparison study for Asia Phase
science team for their considerable efforts in conducting the cam-      II methodology and overview of findings, Atmos. Environ.,
paign. Our thanks also go to the editor and anonymous reviewers         42, 3468–3490, https://doi.org/10.1016/j.atmosenv.2007.04.007,
for their constructive comments.                                        2008.
                                                                      Carmichael, G. R., Adhikary, B., Kulkarni, S., D’Allura, A., Tang,
                                                                        Y., Streets, D., Zhang, Q., Bond, T. C., Ramanathan, V., Jam-
Financial support. This research was supported by the National          roensan, A., and Marrapu, P.: Asian Aerosols: Current and Year
Strategic Project-Fine Particle of the National Research Founda-        2030 Distributions and Implications to Human Health and Re-
tion of Korea (NRF) of the Ministry of Science and ICT (MSIT),          gional Climate Change, Environ. Sci. Technol., 43, 5811–5817,
the Ministry of Environment (MOE), and the Ministry of Health           https://doi.org/10.1021/es8036803, 2009.
and Welfare (MOHW) (grant no. NRF-2017M3D8A1092022). This             Chemel, C., Sokhi, R. S., Yu, Y., Hayman, G. D., Vincent, K. J.,
work was also funded by the GEMS program of the MOE of the              Dore, A. J., Tang, Y. S., Prain, H. D., and Fisher, B. E. A.: Eval-
Republic of Korea as part of the Eco-Innovation Program of KEITI        uation of a CMAQ simulation at high resolution over the UK
(grant no. 2012000160004) and was supported by a grant from the         for the calendar year 2003, Atmos. Environ., 44, 2927–2939,
National Institute of Environment Research (NIER), funded by the        https://doi.org/10.1016/j.atmosenv.2010.03.029, 2010.
MOE of the Republic of Korea (grant no. NIER-2019-01-01-028).         Chin, M., Ginoux, P., Kinne, S., Torres, O., Holben, B.
                                                                        N., Duncan, B. N., Martin, R. V., Logan, J. A., Hig-
                                                                        urashi, A., and Nakajima, T.: Tropospheric Aerosol Op-
Review statement. This paper was edited by Havala Pye and re-           tical Thickness from the GOCART Model and Compar-
viewed by two anonymous referees.                                       isons with Satellite and Sun Photometer Measurements,
                                                                        J. Atmos. Sci., 59, 461–483, https://doi.org/10.1175/1520-
                                                                        0469(2002)0592.0.CO;2, 2002.

Geosci. Model Dev., 13, 1055–1073, 2020                                                  www.geosci-model-dev.net/13/1055/2020/
You can also read