Note No. 26/2009 Oseanografi
Oslo, December 1, 2009

The TOPAZ system at met.no under MERSEA

Pål Erik Isachsen, Harald Engedahl, Ann Kristin Sperrevik, Bruce Hackett and Arne Melsom

(This document contains hyperlinks that are active when viewed with properly enabled software.)
Contents

1 Introduction
2 Overview of the TOPAZ system
3 Implementation at NERSC
4 Modifications made
  4.1 Removal of manual menu-based inputs
  4.2 File transfer
  4.3 HPC queue scheduling
  4.4 Batch control by SMS
  4.5 OPeNDAP server
  4.6 Daily updates of deterministic forecast
  4.7 Transition from SSM/I to OSI-SAF ice fields
  4.8 SVN version control
5 Today's TOPAZ system at met.no
  5.1 The TOPAZ cycle
  5.2 HPC setup
  5.3 SMS
  5.4 OPeNDAP
6 Experience from an initial testing period
  6.1 Model fields
  6.2 HPC performance
  6.3 SMS
  6.4 THREDDS
7 Future work
  7.1 Argo in situ observations
  7.2 met.no uses of TOPAZ results
A The HPC directory structure
B Initialization files

List of Figures

Figure 1: The original TOPAZ cycle
Figure 2: The main menu in the NERSC implementation
Figure 3: The TOPAZ week at met.no
1 Introduction

The TOPAZ system ("Towards an Operational Prediction system for the North Atlantic and European coastal Zones") consists of a numerical ocean-sea ice model of the North Atlantic and Arctic oceans which utilizes an Ensemble Kalman Filter (EnKF) to assimilate ocean and ice observations into the model. TOPAZ has been developed at the Nansen Environmental and Remote Sensing Center (NERSC) under the EU FP6 MERSEA project (http://www.mersea.eu.org). As part of the same project, the model system is to be implemented for operational use at the Norwegian Meteorological Institute (met.no). This note describes the work that has gone into the transfer of TOPAZ from the development branch at NERSC to the operational branch at met.no.

This note presents the state of the met.no implementation as of fall 2009. However, TOPAZ is an evolving system and will be continually modified and expanded under the EU FP7 MyOcean project (http://www.myocean.eu.org). For this reason we largely leave out exhaustive descriptions of technical details as they are implemented at the time of writing.

2 Overview of the TOPAZ system

TOPAZ (Bertino and Lisæter, 2008) is an ensemble ocean-sea ice forecasting system for the North Atlantic and Arctic Oceans which includes data assimilation via an Ensemble Kalman Filter (EnKF; Evensen, 1994; Evensen, 2003). Data are presently assimilated into the model once a week, and the overall purpose of the system is two-fold: 1) to make a best possible weekly estimate of the true ocean-sea ice state, and 2) to make a forecast of the evolution of this state some days into the future.

The system first produces a best possible guess, or analysis, of the true ocean-sea ice state by merging the model estimate of that state with an estimate based on actual observations. This first step is done with the Ensemble Kalman Filter: the model estimate is taken to be the ensemble mean of one hundred individual model runs, each integrated from slightly perturbed initial conditions and each forced with slightly perturbed atmospheric fields. The observational estimate of the ocean-sea ice state comes from remote-sensed sea level anomalies (SLA), sea surface temperatures (SST), sea ice concentrations (ICEC) and sea ice drift velocities (IDRIFT). (Preparation for the assimilation of in situ hydrographic data is ongoing at NERSC.) A final estimate is then constructed as a linear combination of the model estimate and the observation estimate, where the relative weights of the two are based on estimates of the respective error covariances. The error covariances of the model estimate are taken from the ensemble spread, while those of the observations are specified a priori.
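Schematically, and only as a reminder of the standard (Ensemble) Kalman filter update rather than a transcription of the TOPAZ code, the analysis step can be written

$$
\mathbf{x}^{a} \;=\; \mathbf{x}^{f} + \mathbf{K}\left(\mathbf{y} - \mathbf{H}\mathbf{x}^{f}\right),
\qquad
\mathbf{K} \;=\; \mathbf{P}^{f}\mathbf{H}^{\mathsf{T}}\left(\mathbf{H}\mathbf{P}^{f}\mathbf{H}^{\mathsf{T}} + \mathbf{R}\right)^{-1},
$$

where x^f is the model (ensemble mean) estimate, P^f its error covariance estimated from the ensemble spread, y the vector of observations, H the operator mapping the model state to the observed quantities, R the prescribed observation error covariance, and x^a the resulting analysis.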
Figure 1: The original TOPAZ cycle

The generic "TOPAZ week" proceeds as follows (Figure 1): Every Tuesday an analysis of the ocean state on the previous Wednesday, i.e. six days back in time, is made from the end state of the previous week's ensemble model runs and from observations centered around that Wednesday. The next day, on Wednesday of the current week, the system makes a 17-day integration of one single model member, from last week's Wednesday to next week's Saturday (-7 to +10 days). At the end of this integration one is left with a 10-day ocean-sea ice forecast. Finally, the system integrates a new set of 100 ensemble members, this time 7 days from the previous Wednesday to the current Wednesday. These 100 ensemble members form the basis for the analysis to be made at the onset of next week's cycle, i.e. on the following Tuesday.
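To make the timing concrete, the following is a small illustrative shell sketch of the date windows involved. It is not part of the TOPAZ scripts and assumes GNU date; only the day-of-year labelling is taken from Appendix B.

#!/bin/bash
# Illustrate the TOPAZ week relative to "day 0", the Wednesday on which the
# deterministic (forecast14) run is started. Requires GNU date.
day0=${1:?"usage: $0 YYYY-MM-DD (a Wednesday), e.g. 2009-01-21"}

restart=$(date -d "$day0 -7 days" +%Y-%m-%d)   # day -7: the previous Wednesday
fc_end=$(date -d "$day0 +10 days" +%Y-%m-%d)   # day +10: next week's Saturday

echo "forecast14: $restart -> $fc_end  (17 days, day -7 to day +10)"
echo "forecast07: $restart -> $day0    ( 7 days, day -7 to day  0, 100 members)"

# Day-in-year label used in restart file names (days after 1 January, so
# 14 January = 013; see Appendix B for the naming convention).
ddd=$(printf "%03d" $(( 10#$(date -d "$restart" +%j) - 1 )))
echo "restart files: ENSrestart$(date -d "$restart" +%Y)_${ddd}*"

Run with the example date 2009-01-21 (a Wednesday), this reproduces the yyyy=2009, ddd=013 labels used in the example of Appendix B.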
The TOPAZ ocean model is an implementation of the HYbrid Coordinate Ocean Model (HYCOM). HYCOM has been developed based on the Miami Isopycnic Coordinate Ocean Model (MICOM) (Bleck et al., 1992, and references therein). In HYCOM, the vertical coordinate is specified as target densities. When the requested specification of layers can be met according to an algorithm embedded in HYCOM, the model layers are isopycnic. Thus, the isopycnic layers normally span the water column beneath the mixed layer in the deep, stratified ocean. There is a smooth transition to terrain-following coordinates in shallow regions, and to z-level coordinates in the mixed layer and in unstratified seas. The hybrid coordinate algorithm has been described in detail by Bleck (2002), and various specifications of the vertical coordinate have been described and tested by Chassignet et al. (2003).

Another feature of HYCOM is that the user may select one of several vertical mixing parameterizations. A detailed discussion of how HYCOM performs with five different mixed layer models is given by Halliwell (2004). The K-Profile Parameterization (KPP) closure scheme (Large et al., 1994) is used in the TOPAZ implementation. Further, TOPAZ is run with 22 layers in the vertical, all of which are allowed to become hybrid depending on the results from the algorithm. The present implementation in TOPAZ is based on HYCOM version 2.1.34.

In TOPAZ, HYCOM is coupled to a prognostic sea ice model. The thermodynamic part of this model is based on Drange and Simonsen (1996), and the dynamic part is due to Hibler III (1979), modified with the elastic-viscous-plastic (EVP) rheology of Harder et al. (1998).

After completion of the model integrations, the TOPAZ fields are split into North Atlantic and Arctic regions and distributed in NetCDF format via OPeNDAP (http://opendap.org/) as so-called Mersea class 1, 2 and 3 products. Class 1 products are 3-dimensional daily mean fields of all prognostic variables interpolated to a set of fixed z levels, class 2 products are the same fields interpolated onto a set of oceanic sections, and class 3 products are integrated transports (mass, heat, etc.) through these same sections.

3 Implementation at NERSC

TOPAZ has until recently been run on the High Performance Computing (HPC) system "njord" at NTNU (http://www.notur.no/hardware/njord); as of 2009, NERSC has migrated their TOPAZ setup from njord in Trondheim to hexagon in Bergen. In the NERSC setup the model runs under the "laurentb" user; most of the scripts and executable code were placed at /home/ntnu/laurentb/TOPAZ3, while most of the data storage took place at /work/laurentb/TOPAZ3.

The NERSC implementation of TOPAZ is operated manually via a menu-based system. From the main script (topaz_realt.sh) the operator can start and monitor the progress of all steps of the TOPAZ week (Figure 2).

The week starts on Tuesday with an initialization step where the model dates are updated (advanced by seven days). Next, the observational data are downloaded one by one from various data providers via ftp, and finally the EnKF analysis step is started.
Figure 2: The main menu in the NERSC implementation
On Wednesday, if the analysis step was successful, the operator sets off the prognostic model runs. The two steps, forecast14 for the 17-day (7+10) single-member integration and forecast07 for the 7-day 100-member integration, both start with the generation of the relevant atmospheric forcing fields. (The 17-day single-member integration is named forecast14 because the last three days of the integration are forced by climatological atmospheric fields. The ECMWF (T799) forcing fields, in GRIB format, are downloaded daily via ftp from met.no.) In the current set-up each model member utilizes one 16-cpu node on njord. The 17-day integration (including the preparation of forcing fields) requires about two and a half wall-clock hours, while a single 7-day integration requires about one hour. The 100 ensemble members are submitted to the HPC queue one by one as 100 different jobs, and since TOPAZ runs submitted by NERSC receive normal queue priority, several days may pass before all jobs have passed through the queue.

Since some of the 100 ensemble members will typically crash during integration, the operator has to manually monitor the progress of forecast07 and re-submit failed members. A resubmitted member gets its new initial conditions copied from another, successful, member. This procedure does not result in two identical integrations, however, since the two are forced by differently perturbed atmospheric fields.

When forecast14 and forecast07 have each completed successfully, the operator starts post-processing scripts that create a set of pre-defined products for dissemination. The daily mean model fields are interpolated onto two sub-regions, an Arctic domain defined on a polar stereographic grid and a North Atlantic domain defined on a regular lat-lon grid. The class 1, 2 and 3 NetCDF files are uploaded (via scp) from njord to a local OPeNDAP server at NERSC.

4 Modifications made

For the transfer of the TOPAZ system from NERSC to met.no, the run setup required two major modifications. First, the met.no operational system enjoys top queue priority on the HPC system, as opposed to the normal queue priority used during the development stages. Since top queue priority enables TOPAZ to essentially claim the entire HPC system during certain stages, and thus to exclude other work on the computer cluster, including other top-priority forecast jobs, extra care had to go into the planning of the operational TOPAZ week at met.no. Second, the overall control and execution of the various steps of TOPAZ should be automatic rather than manual (as opposed to the above-mentioned menu-based interface operated by NERSC). The following sections describe these initial modifications in some detail. Then follows a short description of some features added to the TOPAZ system: 1) a daily update of the deterministic forecast, 2) the replacement of SSM/I ice fields with OSI-SAF products, and 3) the setup of SVN version control.

4.1 Removal of manual menu-based inputs

The initial stages in porting the TOPAZ model setup consisted of changing hard-coded paths, URLs and email addresses in the code, and then changing symbolic links.
All menus and command-based interfaces were then removed and replaced with functionality for automatic execution of the code (see the description of the SMS-run system below).

4.2 File transfer

Some work also had to be done on the way data files are transferred. During the development stages of the system, all file transfers were initialized and controlled from the external HPC machine. But since met.no's computers are protected by firewalls, some of the scripting had to be modified so that all file transfer between met.no and the HPC machine is controlled from the local met.no machines. This involves uploading the daily atmospheric forcing fields from ECMWF and downloading the final MERSEA products to be displayed on an OPeNDAP server. Downloading of the observation fields (SLA, SST, ice concentration and ice drift) by ftp from external sources was left unchanged.

4.3 HPC queue scheduling

With regard to the top queue priority enjoyed by the met.no "forecast" user on the HPC cluster, the execution of the 100-member forecast07 integrations needed special care. Submitting all 100 such jobs at once from the forecast user, which enjoys top queue priority, would essentially occupy the entire HPC cluster for a minimum of two and a half to three hours. After consulting the HPC staff at NTNU, an initial approach of sending the 100 jobs in batches of five to ten members was considered too inefficient. Instead, we split the forecast07 step into three parts: part one runs the first fifty members during a relatively quiet period early Wednesday evening, and part two runs the second batch of fifty members in a second quiet period later that same night or early Thursday morning. Both part one and part two allow any crashed members to be rerun once. Finally, part three runs immediately after part two and deals with any remaining unsuccessful ensemble members from parts one and two. A crashed member is first given a final chance after its initial conditions have been copied from a randomly chosen successful member. Should the integration of the member still fail, its end state is copied from a randomly chosen successful member (and the ensemble will thus contain two identical members).

4.4 Batch control by SMS

Control of TOPAZ at met.no had to go through the Supervisor Monitor Scheduler (SMS), an application that makes it possible to run a large number of jobs which depend on each other and/or on time. All jobs in SMS are grouped in 'suites' containing one or more 'families'. Typically a 'suite' contains all 'families' which are run at one time (e.g. at 00 UTC, 12 UTC, etc.). A 'family' contains all the SMS jobs or scripts which are applied for a certain model or application. SMS is applied to 1) submit jobs, 2) control each submitted job, and 3) report back to the operator:
Submitting jobs: SMS is used to run jobs depending on given criteria, e.g. to start at a specified wall-clock time, to start and stop at specified times (i.e. the job runs for a certain period), to start when another specified job has the status 'complete', or to start a job when an 'event flag' is set.

Controlling jobs: SMS monitors the status of each job and also registers when a job sets an 'event flag'.

Interface with the operator: The operator communicates with SMS through XCdp (X Command and Display Program). With XCdp the operator can start (submit), suspend, set jobs as 'complete' or abort jobs. In XCdp the status of each job is shown with different color codes.

4.5 OPeNDAP server

Only minor modifications were made to the set-up scripts used by the THREDDS (http://www.unidata.ucar.edu/projects/THREDDS/) OPeNDAP server.

4.6 Daily updates of deterministic forecast

During the spring of 2009 the system was modified such that the deterministic forecast (forecast14, from -7 days to +10 days) is updated every day. Each such update thus enjoys one more day of analyzed atmospheric forcing fields and should therefore be a better forecast than the one made the day before.

4.7 Transition from SSM/I to OSI-SAF ice fields

Within the EU FP7 MyOcean project, the data for assimilation in the TOPAZ system will be pulled from the MyOcean in situ and satellite Thematic Assembly Centers. The first adjustment to meet this objective is the transition from SSM/I to OSI-SAF (http://www.osi-saf.org/) sea ice concentration fields, which is the product that will be available through MyOcean. This transition was made during September 2009. Due to the high resolution of the OSI-SAF data, a routine for generating 'super observations' (the mean of several observations) has been added to the preprocessing of the sea ice concentration data; a schematic illustration is given at the end of this section.

4.8 SVN version control

In March 2009 an SVN repository was established for the TOPAZ system (https://svn.met.no/topaz). This repository eases the exchange of code updates between NERSC and met.no. The HPC directories (see Appendix A) currently under version control are Jobscripts, Progs and Realtime_exp, which contain all scripts and source code of the TOPAZ system. In addition to the TOPAZ version used operationally at met.no, the repository also contains a NERSC branch.
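The 'super observation' step mentioned in Section 4.7 can be illustrated with the following schematic shell/awk sketch. It is not the operational routine: the input format ("lat lon icec" per line), the bin size and the use of a regular lat-lon binning are all illustrative assumptions.

#!/bin/bash
# Schematic 'super observation' generator: average all ice concentration
# observations falling in the same coarse bin into a single observation.
# Illustrative only; input is assumed to be ASCII lines "lat lon icec".
bin=0.5   # bin size in degrees (illustrative value)

awk -v bin="$bin" '
{
  # Assign the observation to a bin and accumulate sums and counts.
  i = int($1 / bin); j = int($2 / bin);
  key = i "," j;
  slat[key] += $1; slon[key] += $2;
  sum[key]  += $3; n[key]++;
}
END {
  # Emit one super observation per bin: mean position, mean concentration
  # and the number of original observations it represents.
  for (key in sum)
    printf "%9.4f %9.4f %6.3f %5d\n",
           slat[key]/n[key], slon[key]/n[key], sum[key]/n[key], n[key];
}' "${1:?usage: superobs.sh obsfile}"

The point of such averaging is to reduce the data volume and the influence of correlated errors between densely spaced observations before the EnKF update.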
5 Today's TOPAZ system at met.no

5.1 The TOPAZ cycle

At present, the TOPAZ cycle at met.no (Figure 3) is run along two different paths. On a daily basis, the forecast14 and genfore14prods steps are run after new atmospheric data have been transferred from ECMWF. The forecast14 step integrates one single member from day -7 to day +10, and genfore14prods generates the MERSEA class 1, 2 and 3 products in NetCDF format. The TOPAZ daily cycle ends with a post-processing stage which transfers the NetCDF files with MERSEA products to the OPeNDAP server at met.no.

In parallel, on a weekly basis the following jobs are executed: On Tuesday morning at 07:00 UTC the initialize and analysis steps are launched. Then, on Wednesday evening, after met.no's 12-UTC cycle ("termin") has completed, the forecast07 runs are started to produce the 100-member ensemble. By this procedure the forecast14 runs are updated every day, each run with an improved set of atmospheric forcing fields, i.e. more analyzed atmospheric fields. However, the ocean (and sea ice) initial fields are kept unchanged until the next weekly cycle with the forecast07 runs.

The initialize step advances the TOPAZ time stamp by 7 days to prepare for a new week. In addition, it does some cleaning up. The analysis step then downloads the observation fields and conducts the analysis, one field at a time.

The forecast07 step, which integrates 100 ensemble members from day -7 to day 0, starts with part1, which integrates the first 50 members. This first part of the job is triggered on Wednesday evening by the completion of the 12-UTC cycle that day (in fact indirectly, by an 'event trigger' which is set by the completion of the last job in the 12-UTC cycle). The next 50 ensemble members are then integrated by part2, which is triggered by the completion of the 18-UTC cycle (a similar procedure as for the 12-UTC cycle) sometime early on Thursday morning. In order to avoid serious delays for other operational forecasting jobs, part1 and part2 of forecast07 are each allowed a maximum of about three wall-clock hours to complete their runs on the HPC. Finally, to sweep up any remaining crashed members, a part3 is submitted to run immediately after the completion of part2. The timing here is not critical, since this last stage will normally not require an excessive amount of HPC resources (very few crashed ensemble members will remain after two trials in part1 and part2).

After the 100 ensemble members are ready, the genfore07prods step generates MERSEA class 1, 2 and 3 products from the ensemble integration (as for the daily cycle). The TOPAZ weekly cycle ends with a post-processing stage which transfers the NetCDF files with MERSEA products to the OPeNDAP server at met.no. This last step is normally done some time Thursday morning.
Figure 3: The TOPAZ week at met.no. 'INIT' = Initialization, 'ANA' = Analysis, 'PAF' = Prepare Atmospheric Forcing, 'F07/F14' = Forecast07/14, 'GFP' = Generate Forecast Products.
5.2 HPC setup

Currently, the HPC setup for met.no's forecast user has purposely been made as similar as possible to that of the development branch (which is associated with the user laurentb and resides under /home/laurentb/TOPAZ3 and /work/laurentb/TOPAZ3). The HPC operation for met.no's operational setup is executed via a set of top-level job scripts under the home directory of the forecast user, all triggered by SMS (according to Figure 3):

Weekly cycle

topaz3_initialize.job: Cleans up some _msg files (for communication with SMS) in Realtime_exp. Then executes the script Realtime_exp/Subscripts2/topaz_cleanup.sh (which in turn calls Realtime_exp/Subscripts2/topaz_cleanup_migrate.sh). These scripts do some cleaning up and copy model output files to the BusStop/Backup directory. Finally, the job adds 7 days to the dates stored in Realtime_exp/Startup_files. Note that the NERSC version had this script start a new set of log files (not implemented in the met.no version).

topaz3_analysis.job: Tries to download and process each of the observation types (SLA, SST, ICEC, IDRIFT) via scripts in Realtime_exp/Subscripts2. Starts the EnKF via Realtime_exp/Subscripts2/topaz_enkf.sh. Waits a maximum of 10 hours for this script to finish.

topaz3_forecast07_part1.job: Generates two HYCOM input files (as for forecast14). Generates forcing files by submitting the job Forecast07/job_forfun_nersc.sh. Then submits members 1-50 to the queue (using the template job script Realtime_exp/Infiles/forecast07_job_single.mal_test). Waits for all members to finish (or until a maximum wait time has passed). Checks which members were unsuccessful (crashed) by looking for missing ENSrestart files. Resubmits these missing members one more time and waits for all of the resubmitted jobs to finish.

topaz3_forecast07_part2.job: Identical to the previous step, except that no input files or forcing files are generated. Submits members 51-100 and gives any unsuccessful members one more chance.

topaz3_forecast07_part3.job: Checks how many jobs were unsuccessful (by the same procedure as above). For each such member, resubmits it after having replaced its initial condition (ENSrestart file) with one drawn randomly amongst the successful members. Waits for all members to finish, then again checks for any remaining unsuccessful members. This time, the end state of such a member is replaced with one drawn randomly amongst the end states of the successful members. (A schematic sketch of this crash handling is given after this list.)

topaz3_genfore07prods.job: Same procedure as in topaz3_genfore14prods.job, but for the ensemble run.
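The crash handling in the forecast07 parts can be sketched as follows. This is an illustration, not the operational job script: the per-member restart file suffix, the member range and the submit_member helper are assumptions made for the example; only the TOPAZ3/Forecast07 location and the ENSrestart naming of Appendix B are taken from this note.

#!/bin/bash
# Schematic sketch of forecast07 crash detection and resubmission.
yyyy=${yyyy:?"set analysis year, e.g. 2009"}
ddd=${ddd:?"set analysis day-of-year label, e.g. 013"}

cd "$HOME/TOPAZ3/Forecast07" || exit 1

ok=(); crashed=()
for m in $(seq 1 50); do                       # members handled by this part
  mm=$(printf "%03d" "$m")
  # A member counts as successful if its end-state restart file exists
  # (the "mem${mm}" suffix is an assumed naming, see the lead-in above).
  if ls ENSrestart${yyyy}_${ddd}*mem${mm}* >/dev/null 2>&1; then
    ok+=("$m")
  else
    crashed+=("$m")
  fi
done

for m in "${crashed[@]}"; do
  # part1/part2: simply resubmit the member once more.
  # part3: first copy the initial condition from a randomly drawn successful
  # member; if the rerun also fails, copy that member's end state instead.
  donor=${ok[RANDOM % ${#ok[@]}]}
  echo "member $m crashed; donor member is $donor"
  submit_member "$m"                           # hypothetical submission wrapper
done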
Daily cycle

topaz3_prepatmforcing.job: This job only executes the script tmp/T799_ecnc every day (independent of the other steps of the TOPAZ cycle). Here GRIB files with ECMWF forcing fields (uploaded from met.no to the HPC machine independently) are converted to NetCDF via CDO (http://www.mpimet.mpg.de/fileadmin/software/cdo/); an indicative CDO invocation is shown at the end of this list.

topaz3_forecast14.job: Starts the forecast14 job via the script Realtime_exp/Subscripts2/topaz_fore14_nest.sh. This script first generates two input files for HYCOM, then submits another job script, Forecast14/forecast14_job.sh. This job script does two things: first it generates atmospheric forcing fields (from NetCDF to HYCOM's format) via the executable Realtime_exp/bin/forfun_nersc, then it runs the HYCOM executable.

topaz3_genfore14prods.job: Generates a job file (by inserting the correct dates into a generic version) which generates the Mersea class 1, 2 and 3 products for the North Atlantic (NAT) and Arctic (ARCTIC) regions via the script Realtime_exp/Subscripts2/topaz_generate_daily.sh. Class 1 products are generated by the executable Realtime_exp/bin/hyc2proj, while class 2 and 3 products are generated by the scripts Realtime_exp/Subscripts2/merseaip_class2_sections.sh and Realtime_exp/Subscripts2/merseaip_class3_transport.sh.
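The GRIB-to-NetCDF conversion performed in topaz3_prepatmforcing.job is of the following kind (the file names and the exact options used operationally are not documented in this note, so the lines below are only indicative):

# Convert an ECMWF GRIB forcing file to NetCDF and inspect the result.
cdo -f nc copy ec_atmo_T799.grb ec_atmo_T799.nc
cdo sinfon ec_atmo_T799.nc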
5.3 SMS

All jobs in SMS are grouped in 'suites' containing one or more 'families'. Typically a 'suite' might contain all 'families' which are run during the same time slot (e.g. at 00 UTC, 12 UTC, etc.). A 'family' contains all the SMS jobs or tasks which are applied for a certain model or application. All SMS jobs are scripts, most commonly of shell or perl type. They must have a specified format at the beginning (header) and at the end to be recognized by SMS. To tell the SMS system which jobs and dependencies to use, an ASCII input file named 'metop.def' is applied. This is a dynamic file controlling SMS, and it must be changed every time an SMS job is added to or removed from the system.

In the daily cycle, in the SMS family run_everyday, the atmospheric forcing fields for the TOPAZ model system are retrieved from ECMWF. This is done after the ECMWF fields from their 12 UTC model run have arrived at met.no. After the forcing fields have been downloaded and converted to NetCDF, the forecast14 and genfore14prods steps are executed. Finally, the model results (now in NetCDF format) are transferred to the OPeNDAP server at met.no. The procedure is as follows: Under suite 'ec_atmo', in family 'topaz', the SMS job 'check_if_atmo_12_240' checks if the ECMWF fields for the 12 UTC run have arrived at met.no. If this is the case, the event flag 'ec_atmo_12_240' is set, and the SMS jobs under suite 'trigjobs', family 'run_everyday' and family 'topaz' are started. These jobs prepare and transfer the ECMWF atmospheric fields to the remote HPC host 'njord', and start the forecast14 runs. The lines below show the SMS jobs with events and triggers as they appear in family run_everyday in the file 'metop.def' (all comments start with #):

suite trigjobs
  ...
  family run_everyday
    edit JOB_HOSTS ""
    family topaz
      # - Prepare new atmospheric fields from ECMWF
      # - Perform complete Forecast14 run, including transfer of output
      #   data to thredds
      # - Requeue family
      task prep_atmos_topaz3
        trigger /ec_atmo/topaz/check_if_atmo_12_240:ec_atmo_12_240
      task put_atmos_topaz3
        trigger ./prep_atmos_topaz3 == complete
      task start_topaz3_prepatmforcing
        trigger ./put_atmos_topaz3 == complete
      task get_topaz3_prepatmforcing
        edit RUN_JOB_OPTIONS "--timeout-job=170"
        trigger ./start_topaz3_prepatmforcing == complete
      # On every day of the week, run topaz forecast14 with updated atmospheric
      task start_topaz3_forecast14
        trigger ./get_topaz3_prepatmforcing == complete
      task get_topaz3_forecast14
        edit RUN_JOB_OPTIONS "--timeout-job=190"
        trigger ./start_topaz3_forecast14 == complete
      task start_topaz3_genfore14prods
        trigger ./get_topaz3_forecast14 == complete
      task get_topaz3_genfore14prods
        edit RUN_JOB_OPTIONS "--timeout-job=130"
        trigger ./start_topaz3_genfore14prods == complete
      # Put model output tar file on thredds, unpack, and requeue family
      task post_topaz3
        trigger ./get_topaz3_genfore14prods == complete
      task requeue_run_everyday
        trigger ./post_topaz3 == complete
    endfamily
  endfamily

The weekly cycle starts with the initialize and analysis steps on Tuesdays, so these jobs are run under family run_tuesday. When it is Tuesday, the event flag 'tuesday_OK' is set. When this event flag is set, all SMS jobs under suite 'trigjobs', family 'run_tuesday' and family 'topaz' are started. In 'metop.def' this "production line" looks like:

family run_tuesday
  edit JOB_HOSTS ""
  family topaz
    # Initialization & cleanup, then perform the analysis part:
    task start_topaz3_initialize
      trigger /cronjobs/topaz/check_which_day_of_week:tuesday_OK
    # Copy last "old" results from njord to rhino
    task copy_topaz3_results
      edit RUN_JOB_OPTIONS "--timeout-job=120"
      trigger /cronjobs/topaz/check_which_day_of_week:tuesday_OK
    task get_topaz3_initialize
      trigger ./start_topaz3_initialize == complete
    task start_topaz3_analysis
      trigger ./get_topaz3_initialize == complete
    task get_topaz3_analysis
      edit RUN_JOB_OPTIONS "--timeout-job=500"
      trigger ./start_topaz3_analysis == complete
    task requeue_run_tuesday
      trigger ./get_topaz3_analysis == complete and ./copy_topaz3_results == complete
  endfamily
endfamily

The forecast07 runs are initiated on Wednesday evening, with the first part (part1) started at the end of the 12-UTC operational suite at met.no. Since 100 model runs will now be started, special care must be taken to avoid blocking other operational models on the HPC. This is done by checking whether one of the last models in the 12-UTC suite has finished (a similar procedure is used for part2 in the 18-UTC suite). If this is true, the event flag 'wednesday_OK' is set. When this event flag is set, all SMS jobs under suite 'trigjobs', family 'run_wednesday' and family 'topaz' are started. To avoid part1 and part2 runs being performed at the same time, something that could completely block the computer for other runs, part2 (and part3) is not started before part1 has completed, regardless of whether the 18-UTC suite has finished or not. Below is the portion of 'metop.def' for family 'run_wednesday':

family run_wednesday
  edit JOB_HOSTS ""
  family topaz
    # When the last model (um4exp) during the 12 UTC time is completed,
    # start forecast07 part1
    task start_topaz3_forecast07_part1
      trigger /metop/mod12/topaz/check_which_day_of_week:wednesday_OK
    task get_topaz3_forecast07_part1
      edit RUN_JOB_OPTIONS "--timeout-job=250"
      trigger ./start_topaz3_forecast07_part1 == complete
    # When the last model (hirlam4) during the 18 UTC time is completed,
    # start forecast07 part2:
    task start_topaz3_forecast07_part2
      trigger /metop/mod18/topaz/check_which_day_of_week:wednesday_OK and ./get_topaz3_forecast07_part1 == complete
    task get_topaz3_forecast07_part2
      edit RUN_JOB_OPTIONS "--timeout-job=250"
      trigger ./start_topaz3_forecast07_part2 == complete
    # If there are still some ensemble members which have crashed,
    # we give those a third and last chance in part3:
    task start_topaz3_forecast07_part3
      trigger ./get_topaz3_forecast07_part2 == complete
    task get_topaz3_forecast07_part3
      edit RUN_JOB_OPTIONS "--timeout-job=250"
      trigger ./start_topaz3_forecast07_part3 == complete
    # When forecast07 part1, part2 and part3 are all completed, make products
    task start_topaz3_genfore07prods
      trigger ./get_topaz3_forecast07_part1 == complete && \
              ./get_topaz3_forecast07_part2 == complete && \
              ./get_topaz3_forecast07_part3 == complete
    task get_topaz3_genfore07prods
      edit RUN_JOB_OPTIONS "--timeout-job=100"
      trigger ./start_topaz3_genfore07prods == complete
    # Put model output tar file on thredds, unpack, and requeue family
    task post_topaz3
      trigger ./get_topaz3_genfore07prods == complete
    task requeue_run_wednesday
      trigger ./post_topaz3 == complete
  endfamily
endfamily

5.4 OPeNDAP

For data dissemination by OPeNDAP, THREDDS Data Server (TDS; www.unidata.ucar.edu/projects/THREDDS/) software was installed on server hardware in the met.no De-Militarized Zone (DMZ). An area on this server is dedicated to serving the TOPAZ Mersea data products (thredds.met.no/thredds/public/mersea-ipv2.html). The configuration of the Mersea area of this server is essentially a copy of the configuration used by NERSC (see topaz.nersc.no/thredds/catalog.html). Catalog-generation scripts from NERSC were adapted and implemented at met.no.
The TOPAZ products are located under threddsday:/metno/eksternweb/thredds/content/mersea-ipv2 in the following structure (an example of a file name constructed from the templates below is given after the listing):

/metno/eksternweb/thredds/content/  Top of THREDDS content tree
  mersea-ipv2.xml  Catalog for the Mersea tree (static)
  mersea-ipv2/  Top of the Mersea tree
    gen_xml.sh  Script to update the catalog xml files. Called by the SMS job post_topaz3.sms. Runs perl scripts located in Agg-XML to generate updated xml files for each set of products. The xml files are built in Agg-XML/work and then copied to this subdirectory.
    mersea-ipv2-class1-arctic.xml  Catalog for Arctic Class 1 products
    mersea-ipv2-class2-arctic.xml  Catalog for Arctic Class 2 products
    mersea-ipv2-class3-arctic.xml  Catalog for Arctic Class 3 products
    mersea-ipv2-class1-nat.xml  Catalog for North Atlantic Class 1 products
    mersea-ipv2.tar  Tar ball of all NetCDF data files in the current update (currently about 2 GB)
    arctic/  Arctic grid (polar-stereographic)
      mersea-class1/  Class 1 products (gridded fields). Filename template: topaz_V3_mersea_arctic_grid1to8_da_class1_b[bulletin date YYYYMMDD]_f[field date YYYYMMDD]9999.nc
      mersea-class2/  Class 2 products (vertical section fields)
        section01/  Data files for Section 1. Filename template: topaz_V3_mersea_arctic_section01_dc_b[bulletin date YYYYMMDD]_f[field date YYYYMMDD]9999.nc
        section02/ ... section24/  Data files for Sections 2-24. Filename template: topaz_V3_mersea_arctic_sectionNN_dc_b[bulletin date YYYYMMDD]_f[field date YYYYMMDD]9999.nc
        moorings/  Data files for mooring locations. Filename template: topaz_V3_mersea_arctic_moorings_dc_b[bulletin date YYYYMMDD]_f[field date YYYYMMDD]9999.nc
      mersea-class3/  Class 3 products (transports)
        transport/  Data files for water transport. Filename template: topaz_V3_mersea_arctic_transport_b[bulletin date YYYYMMDD]_f[field date YYYYMMDD]9999.nc
        icetransport/  Data files for sea ice transport. Filename template: topaz_V3_mersea_arctic_icetransport_b[bulletin date YYYYMMDD]_f[field date YYYYMMDD]9999.nc
    nat/  North Atlantic grid (geographic)
      mersea-class1/  Class 1 products (gridded fields). Filename template: topaz_V3_mersea_nat_grid1to8_da_class1_b[bulletin date YYYYMMDD]_f[field date YYYYMMDD]9999.nc
    Agg-XML/  Contains perl code for generating aggregation catalog files in xml. The perl scripts are called by gen_xml.sh.
      work/  Work area for writing the catalog xml files. These are copied to the top directory.
      old/  Contains catalog xml files from the previous update.
      backup/  Purpose unknown, currently empty.
    old_content/  Obsolete forms of xml catalog files. (Not used?)
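As a concrete illustration of the filename templates above (with illustrative dates only; the catalog page at thredds.met.no/thredds/public/mersea-ipv2.html gives the authoritative links):

#!/bin/bash
# Build the expected relative path of an Arctic class 1 product file for a
# given bulletin date and field date, following the template listed above.
bulletin=20091201   # bulletin date (YYYYMMDD), example value
field=20091205      # field date (YYYYMMDD), example value
echo "arctic/mersea-class1/topaz_V3_mersea_arctic_grid1to8_da_class1_b${bulletin}_f${field}9999.nc"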
6 Experience from an initial testing period

The first operational runs of TOPAZ at met.no took place in early April 2008. The system has been operational since mid-May 2008, and the resulting MERSEA class 1, 2 and 3 files can be downloaded from http://thredds.met.no/thredds/public/mersea-ipv2.html.

6.1 Model fields

No thorough validation of the TOPAZ output fields has been conducted. A few random visual inspections have shown the met.no fields to be in good accordance with those produced by NERSC. Regular validation is planned within the MyOcean project.

6.2 HPC performance

Scheduling and execution on njord under the forecast user has resulted in few, if any, problems. One irregularity in njord networking caused problems with ftp connections, and TOPAZ was then unable to download any observational data for that week.

6.3 SMS

The once-weekly structure of TOPAZ has caused some problems. On two or three occasions the failure of operational personnel to reset some triggers used by TOPAZ has caused failures. New routines have been implemented which should avoid such problems in the future.

6.4 THREDDS

THREDDS has run largely without problems.

7 Future work

7.1 Argo in situ observations

The NERSC version of TOPAZ has been assimilating Argo (http://www.argo.net/) in situ hydrographic profiles since the end of 2008. This feature has yet to be implemented in the met.no version.
7.2 met.no uses of TOPAZ results

Presently the TOPAZ fields are available to the community at large via the OPeNDAP server, but they are not used for any in-house met.no applications.

A The HPC directory structure

The TOPAZ3 directory under /home/ntnu/forecast contains these major subdirectories:

Realtime_exp:  Most of the shell scripts and code are stored and executed here.
  Startup_files:  Four ASCII text files containing the current analysis date (in year, week, day and Julian day). These files are modified by the top-level script topaz3_initialize.job and subsequently read by many other scripts.
  Logfiles:  Not presently used by the met.no setup
  Subscripts:  Generic scripts
  Subscripts2:  Scripts more specific to TOPAZ
  Subprogs:  Generic programs
  Subprogs2:  Programs more specific to TOPAZ
  Infiles:  Various controlling input files (including generic run scripts for the HYCOM model)
  bin:  Various executables
  Helpdocs:  (Largely outdated) help documents

Analysis:  The analysis step is executed here
Forecast14:  The Forecast14 step is executed here
Forecast07:  The Forecast07 step is executed here

BusStop:  Temporary storage for data to be transferred
  Backup:  Backup fields (Analysis, Forecast14, Forecast07, MERSEA)
  OpenDAP:  The Mersea products to be uploaded to the THREDDS server
  Diagnostics:  Not presently used

Progs:  Various source code (ripped from the NERSC laurentb user)
  EnKF_MPI:  EnKF code for MPI

tmp:
  T799_ecnc:  ECMWF forcing fields are downloaded here (every day) and converted from GRIB to NetCDF format
  TOPAZ3_relax:  Climatological fields for the ocean model
Met.no:  Presently not used
ECMWFR:  Presently not used
HYCOM_inputs:  HYCOM data
EOdata:  Downloaded observation data (SLA, SST, ICEC, IDRIFT)
Jobscripts:  Contains all top-level job scripts described in Section 5.2

B Initialization files

The following is a list of data files from last week's run which must be present at the beginning of a new TOPAZ week. The list is not exhaustive, but it includes the files which will typically be missing if one week's runs went astray. The files must then be copied from the (hopefully successful) NERSC runs. All files belong in the (HPC) directory ~/TOPAZ3/Forecast07/.

ENSrestartyyyy_ddd*  Restart files containing the end state of the model at the end of last week's run.
ENSrestRANDyyyy_ddd_00.[ab]  Contain random component?
ENSDAILY_yyyy_d-7_ICEDRIFT.uf  Ice drift data throughout the entire previous ensemble run?

Here 'yyyy' and 'ddd' are the year and the day-in-the-year (days after 1 January) pointing to the analysis day we want to integrate from. The 'd-7' in the ENSDAILY files points to seven days before the analysis, i.e. to the start of the previous analysis (two weeks ago). An example: for the TOPAZ week starting 20 January 2009, we need to restart from the previous Wednesday, i.e. 14 January, or yyyy=2009, ddd=013. We would then need the files ENSrestart2009_013*, ENSrestRAND2009_013_00.[ab] and ENSDAILY_2009_006*.

References

Bertino, L. and K. A. Lisæter, 2008: The TOPAZ monitoring and prediction system for the Atlantic and Arctic Oceans. J. Operat. Oceanogr., 1(2), 15–19.

Bleck, R., 2002: An oceanic general circulation model framed in hybrid isopycnic-cartesian coordinates. Ocean Modelling, 4, 55–88.

Bleck, R., C. Rooth, D. Hu, and L. T. Smith, 1992: Salinity thermocline transients in a wind- and thermohaline-forced isopycnic coordinate model of the Atlantic Ocean. J. Phys. Oceanogr., 22, 1486–1505.

Chassignet, E. P., L. T. Smith, G. R. Halliwell, and R. Bleck, 2003: North Atlantic simulations with the HYbrid Coordinate Ocean Model (HYCOM): Impact of the vertical coordinate choice, reference pressure, and thermobaricity. J. Phys. Oceanogr., 33, 2504–2526.
Drange, H., and K. Simonsen, 1996: Formulation of air-sea fluxes in the ESOP2 version of MICOM. Technical Report 125, Nansen Environmental and Remote Sensing Center, Bergen, Norway, 23 pp.

Evensen, G., 1994: Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics. J. Geophys. Res., 99(C5), 10143–10162.

Evensen, G., 2003: The Ensemble Kalman Filter: theoretical formulation and practical implementation. Ocean Dynamics, 53, 343–367.

Halliwell, G. R., 2004: Evaluation of vertical coordinate and vertical mixing algorithms in the HYbrid Coordinate Ocean Model (HYCOM). Ocean Modelling, 7(3–4), 285–322.

Harder, M., P. Lemke, and M. Hilmer, 1998: Simulation of sea ice transport through Fram Strait: Natural variability and sensitivity to forcing. J. Geophys. Res., 103(C3), 5595–5606.

Hibler, W. D., III, 1979: A dynamic thermodynamic sea ice model. J. Phys. Oceanogr., 9, 815–846.

Lisæter, K. A., and G. Evensen, 2003: Assimilation of ice concentration in a coupled ice-ocean model, using the Ensemble Kalman filter. Ocean Dynamics, 53, 368–388.