Ultra Legacy Monte Carlo Production: GEN overview - Saptaparna Bhattacharya, Alexander Grohsjean, and Gurpreet Singh Chahal for the CMS GEN group ...
Outline of the talk
• Overview of GEN-specific improvements in the UL campaign
• UL Monte Carlo production
• How to find my missing samples?
• MC production model in CMS
• Toolbox for analyzers
• Toolbox for MC contacts
• Feedback?
SMP General Meeting, Saptaparna Bhattacharya
Overview of GEN-specific improvements in the UL campaign
Generators in use in the UL campaign
• A large fraction of UL events is produced at NLO (~50%) and NNLO (~14%)*
• A wide range of generators is used, tailored to facilitate the physics program in CMS
• Using POWHEG for NLO generation allows one to circumvent the deleterious effect of negatively weighted events
• Parton shower modeling is done with Pythia in 99% of cases
[Figure: pie charts of the usage of MC generators categorized by sample (left) and by events (right) for the UL16 campaign, based on a sampling of 9M events. Categories shown: madgraph, madgraphMLM, powheg, powhegMiNNLO, pythiaOnly, amcatnlo, amcatnloFXFX, sherpa, evtgen, evtgen+pythia, mcfm, unknown.]
*refers specifically to UL16
Ingredients in the UL campaign: GEN overview
• Preparation of the GEN UL campaign started in 2019 (introduced by Qiang Li and Efe Yazgan)
• MC production in UL16, 17 and 18 is documented in GitBook
• GEN campaigns (X = 16, 17, 18); in UL16 the luminosity is split 50% between APV and non-APV
  ◦ RunIISummer20UL{X}GEN: Pythia-only requests (Pythia fragment needs to be specified)
  ◦ RunIISummer20UL{X}wmLHEGEN: matrix element + parton shower requests (production with gridpacks)
  ◦ RunIISummer20UL{X}pLHE: matrix element (proceeds through privately produced LHE) + parton shower requests
• The standard positive-definite NNPDF3.1 NNLO PDF set (with the CP5 tune) is the default
  ◦ 325500 is also positive definite
• CP5 is the default tune (CPX tunes documented here; CP2 and CP3 tunes may be used for LO/NLO PDFs in the matrix element)
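The campaign naming scheme above can be expanded mechanically. The following is a minimal sketch, not an official tool: the helper names `campaign_name` and `all_campaigns` are hypothetical, and the `APV` suffix handling is an assumption inferred from names such as RunIISummer20UL16GENAPV that appear in the status plots; it may not hold for every campaign type.

```python
# Campaign types as listed on the slide, with the request type each one serves.
CAMPAIGN_TYPES = {
    "GEN": "Pythia-only requests",
    "wmLHEGEN": "matrix element + parton shower (gridpacks)",
    "pLHE": "matrix element from privately produced LHE + parton shower",
}
YEARS = ("16", "17", "18")


def campaign_name(year, campaign_type, apv=False):
    """Build a GEN campaign name such as RunIISummer20UL17wmLHEGEN.

    Assumption: the APV variant is formed by appending "APV" and exists
    only for UL16 (hypothetical rule inferred from the status plots).
    """
    if year not in YEARS:
        raise ValueError(f"unknown UL year: {year}")
    if campaign_type not in CAMPAIGN_TYPES:
        raise ValueError(f"unknown campaign type: {campaign_type}")
    if apv and year != "16":
        raise ValueError("the APV split only applies to UL16")
    return f"RunIISummer20UL{year}{campaign_type}" + ("APV" if apv else "")


def all_campaigns():
    """List the non-APV campaign names for every year and campaign type."""
    return [campaign_name(y, t) for y in YEARS for t in CAMPAIGN_TYPES]


if __name__ == "__main__":
    for name in all_campaigns():
        print(name)
```

Keeping the year and the campaign type as separate arguments makes it easy to search for the same request across UL16, UL17 and UL18.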
Version of event generators: Madgraph
• Madgraph version > 2.6.X recommended for UL production (strict inequality in version required since the GEN and SIM steps are separate in the UL campaign)
• Supports read-only gridpacks and multithreading
• Includes bias-reweighting features at LO and NLO
• Details in Olivier’s talk
• Improved cluster support
• Madgraph 2.6.5 is currently in the master branch of the genproductions repository
• Detailed validation of both Madgraph 2.6.1 and 2.6.5 performed by the GEN Validation team (S. Bhattacharya, J. Choi, A. Grohsjean, G. Kole) in 2018-2019
• The validation suite covers kinematic distributions, PDF weights and BSM weights (associated with coupling reweighting or mass reweighting) at both LO and NLO
• During the course of Run II data analysis, bugs in Madgraph were found by the GEN team and analyzers
• Examples of such bugs include incorrect computation of the cross section (MG 5.2.3.3)
• Twiki page to collect user feedback on buggy samples and inconsistent settings: https://twiki.cern.ch/twiki/bin/view/CMS/MCKnownIssues
• This page is consistently maintained and includes solutions to the reported bugs
• Gridpacks for processes with known bugs have been reproduced for the UL campaign
Version of event generators: Pythia
• Pythia version 8.240 used
• Includes the dipole recoil option, essential for Vector Boson Scattering topologies
• The UL campaign uses a version of Pythia in which the unexpectedly large particle multiplicity bug is fixed (detailed description on slide 29, "Pythia 8 bug in UL MC production")
• To specify the relevant PDF sets, include [settings not preserved in the transcription] (included in the default fragment)
• Parton shower weights incorporated as [snippet not preserved in the transcription]
Version of event generators: POWHEG, Sherpa and HERWIG
• POWHEG V2
  ◦ Often used in conjunction with JHU GEN for samples
• Sherpa 2.2.11 + OpenLoops 2.1.0
  ◦ Available in CMSSW_11_3_0_pre4, CMSSW_10_2_26 & CMSSW_10_6_21; CMSSW_9_3_19; CMSSW_7_1_47
  ◦ Comparison between Sherpa (Catani-Seymour) and other generators studied for TOP-17-002
  ◦ First look at multi-jet production in Sherpa will be discussed in the GEN meeting on July 12th
• HERWIG7
  ◦ Available since CMSSW_9_3_14
  ◦ Supersedes HERWIG++
  ◦ Tuned with CH3
  ◦ Matching at NLO for angular ordered and dipole shower
  ◦ Implementation of aMC@NLO and POWHEG-“type” matching schemes
[Figure: CMS, 35.8 fb-1 (13 TeV), e/µ+jets, particle level. Differential cross section dσ/dp_T(t_h) [pb GeV-1] versus p_T(t_h) [GeV], 0 to 800 GeV. Data with "Sys ⊕ stat" and "Stat" uncertainty bands compared with POWHEG P8, SHERPA, POWHEG H++ and MG5 P8 [FxFx]; lower panel shows the Theory/Data ratio.]
MC production in the wider context
Release and Production Plan until Mid 2022 (timeline from Feb 2021 to Jul 2022; PPD, 17 June 2021)
[Figure: timeline chart. The recoverable content is listed below; the assignment of individual items to months or releases is not preserved in the transcription.]
• Releases: CMSSW_11_3_0, CMSSW_12_0_0, CMSSW_12_1_0, CMSSW_12_2_0, CMSSW_12_3_0 (?)
• Release milestones: Geant4 10.7; DD4HEP; stable POG code; data-taking release; GPU; stable DPG code; POG samples (PAG?); POG and PAG samples; PF calib. sample; L1T: LLP; HLT: 1st menu; Trigger: final menu; L1 Trigger (w/o LLP)
• Use final COM energy (13.6?) here; sqrt(s) = 14 TeV by default
• Commissioning: Midweek GR (MWGR #3 to #6; dates shown include 14-16 Apr and 9-11 Jun), cosmics (CRUZET, CRAFT), LHC beams
• pp collisions: Feb: beam commissioning; Apr: ramp-up / first collisions; May: physics
• Samples: Run-3 DPG and trigger samples (CMSSW_11_X); Run-3 Trigger, DPG, POG, PAG samples (CMSSW_12_0); Run-3 analysis
• Data formats and reprocessing: Custom-Nano (JME); Legacy NanoAODv9; Legacy NanoAODv10; Re-MiniAOD; U-Legacy ReReco for B-parking; Re-MiniAOD on all Legacy AOD using 12_0 or 12_1
• Run-2 Legacy: Run-2 Legacy MC 16/17/18 (Summer20UL)
  ◦ Produce ~50 billion MC events in 2021
  ◦ Re-NanoAOD (target every 3 months)
  ◦ Custom-NanoAOD: establish for JME
  ◦ Re-MiniAODv2 and v3
  ◦ High-precision calibration and SF
  ◦ Re-reco of B-parking and HIN PD in 10_6 (tbc)
• Run-3 Preparation
  ◦ GPU and Scouting
  ◦ Detectors, conditions and trigger
  ◦ Skims
  ◦ Code freeze and calibrations
  ◦ Analysis preparation
• Phase-2 Preparation
  ◦ HLT-TDR
  ◦ Snowmass
XC meeting Friday 30 April [indico], follow-up XC on 9-July: https://indico.cern.ch/event/1048959/
Global CPU usage
https://dmytro.web.cern.ch/dmytro/cmsprodmon/
CPUs currently swamped with UL samples
UL event generation status by PAG
[Figure: bar chart of the number of events (axis from 0 to 10G) per PAG: B2G, BPH, BTV, EGM, EXO, HIG, JME, LUM, MUO, PPD, PPS, SMP, SUS, TAU, TOP.]
Generated: Monday, July 5th, 2021, 00:29. Last update: Sunday, July 4th, 2021, 23:17.
For input: RunIISummer20UL16GENAPV, RunIISummer20UL16wmLHEGENAPV, RunIISummer20UL16wmLHEGEN, RunIISummer20UL16pLHEGEN, RunIISummer20UL16GEN, RunIISummer20UL16pLHEGENA
Samples in production: priority
[Figure: number of events (log scale, 1 to 10G), submitted and done, as a function of request priority from low to high, with the Block 3, Block 2 and Block 1 regions marked.]
Generated: Thursday, July 1st, 2021, 08:52. Last update: Thursday, July 1st, 2021, 08:24.
For input: RunIISummer20UL18MiniAOD, RunIISummer20UL17MiniAOD, RunIISummer20UL16MiniAOD, RunIISummer20UL16MiniAODAPV
• Block 0 (130k and higher): strictly for PnR / CompOps
• Block 1: 1-2 months (saturated currently)
  ◦ Block 1 (120k, 121k, 122k): HLT-TDR, premix libraries and similar
  ◦ Block 1 (113k): classical mixing - Run3Winter21 campaigns
  ◦ Block 1 (111k): re-NanoAOD
  ◦ Block 1 (110k): high-priority requests (only via JIRA)
• Block 2: ~2 months
  ◦ Block 2 (90k): UL20 PAG requests (higher priority)
• Block 3: > 6 months (mostly for legacy production)
  ◦ Block 3 (85k): UL20 PAG requests (lower priority)
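The priority blocks quoted on this slide can be written down as a small lookup. This is only a sketch under stated assumptions: `describe_priority` is a hypothetical helper, "k" is read as a factor of 1000, and only the priority values quoted on the slide are known to it. The boundaries between blocks are not given, so unlisted values below 130k return `None` instead of a guessed block.

```python
# Priority values quoted on the slide, mapped to (block, purpose).
LISTED_PRIORITIES = {
    122_000: (1, "HLT-TDR, premix libraries and similar"),
    121_000: (1, "HLT-TDR, premix libraries and similar"),
    120_000: (1, "HLT-TDR, premix libraries and similar"),
    113_000: (1, "classical mixing - Run3Winter21 campaigns"),
    111_000: (1, "re-NanoAOD"),
    110_000: (1, "high-priority requests (only via JIRA)"),
    90_000: (2, "UL20 PAG requests (higher priority)"),
    85_000: (3, "UL20 PAG requests (lower priority)"),
}

# Turnaround per block as quoted on the slide.
TURNAROUND = {1: "1-2 months", 2: "~2 months", 3: "> 6 months"}


def describe_priority(priority):
    """Return block, purpose and turnaround for a request priority.

    Priorities of 130k and higher are Block 0. Any other value that is
    not quoted on the slide yields None (no guessing of boundaries).
    """
    if priority >= 130_000:
        return {"block": 0, "purpose": "strictly for PnR / CompOps",
                "turnaround": None}
    entry = LISTED_PRIORITIES.get(priority)
    if entry is None:
        return None
    block, purpose = entry
    return {"block": block, "purpose": purpose,
            "turnaround": TURNAROUND[block]}


if __name__ == "__main__":
    print(describe_priority(90_000))
```

For example, a UL20 PAG request submitted at priority 85k falls in Block 3, where the slide quotes a turnaround of more than six months.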
How to find my missing samples?
How to find your sample in production
• The status of MC production can be found on McM
• A detailed status is available to indicate where the sample is in the production chain
How to find your sample in production
• The EXO MC team (Sihyun Jeon, Michael Krohn, Kai Wei, Jay Vora, Ram Krishna Sharma, Young Wan Kim, and Matthew Decaro) has put together documentation: https://exo-mc-and-i.gitbook.io/exo-mc-and-interpretation/others/finding-prepids-in-mcm
Monitoring using computing tools
• The status can also be gathered from the cmsprodmon tools
• Some inconsistencies with McM are expected
• McM tracks every step of production, while cmsprodmon tracks the requests submitted (the one-to-one correspondence holds for campaigns with a single step, but not for multi-step campaigns)
UL sample status tracking using GrASP
• GrASP is a lite version of McM
• Facilitates identification of the same or similar samples requested by the PAGs
• Only samples that are “tagged” (tags added by production managers) in McM can be tracked with GrASP
UL sample status tracking using GrASP
• Only SMP specific search can be performed too
• Clicking on RunIISummer20UL*GEN
MC production model in CMS
Tools for interested parties
Model of MC production in CMS
In CMS, Monte Carlo production proceeds via a bottom-up approach driven by analyzers. The steps, in chronological order:
• Analyzers identify the MC samples needed for a particular analysis
• Analyzers reach out to MC contacts for guidance on sample production, and in some cases MC contacts assist with gridpack production
• MC contacts then inject samples into McM, make tickets and present them at the Monte Carlo coordination meeting
• Generator conveners critically examine the tickets (while looking for common pitfalls) and, if all tests pass, the requests are approved
• Production managers start the production of samples
Notes:
• While this is primarily how sample production is handled in EXO, in TOP a large fraction of samples are cloned from previous campaigns
• Currently handled by the Top Modeling Group
Toolbox for analyzers
• Regular tutorials are organized by GEN on various event generators: https://indico.cern.ch/event/962610/
• Tutorials on MG5, POWHEG, HERWIG, Sherpa, RIVET
• GEN Twiki maintained regularly
• Examples of cards for LO and NLO production are included in the Twiki
Toolbox for analyzers
• Information on MC campaigns and on monitoring submitted samples can be found on GitBook
Analysis specific monitoring tools
• In the VVV analysis group, where a huge number of MC samples are in use due to a large array of final states, Qilong Guo has developed a UL sample monitoring framework: https://test-qiguo.web.cern.ch/test-qiguo/VVV/UL_samples/0Lepton/16/NanoAOD_v7_2016_missied.png
Toolbox for MC contacts
• GEN meetings, held on Mondays at 14:00 CERN time, can generally be instructive for GEN contacts
• Useful information can also be found here: https://cms-pdmv.gitbook.io/project/mccontact
• Preferred dataset naming conventions are included in the GitBook
• Pertinent information can be found in each Google Doc that is discussed during the Monte Carlo Coordination meetings
How MC production is handled in SMP
• Sample requests in SMP are handled via GitHub issues: https://github.com/SMP-GEN-Coordination
Conclusion
• Contacting GEN: the best way to contact GEN is through the GEN hypernews: hn-cms-generators@cern.ch
• Contacting production managers: through the PREPOPS hypernews: hn-cms-prep-ops@cern.ch
• SMP MC contacts can be contacted via GitHub: https://github.com/SMP-GEN-Coordination
• Feedback?
Additional Material
Pythia 8 bug in UL MC production
• P8 bug noticed by J. Thaler in 8.240:
  ◦ unexpected swap of color tags to recoiler in some branchings
  ◦ unexpectedly large strings can be pulled between a central radiation and beam remnant
  ◦ first evaluation of impact by Steve
    ▪ no significant changes to jet shapes, kinematics, properties, ....
    ▪ particle multiplicity per jet shifted to higher values
• patch to P8.240 from Steve already merged
  ◦ thanks to Sanghyun for very quick validation
• preparing with PCGT to check impact on CPx tunes
• will provide test release and MC campaign to quantify impact and define production strategy
  ◦ ping relevant groups to prepare requests when ready
https://indico.cern.ch/event/920855/contributions/3905388/attachments/2059238/3454050/GEN_Summary_CMSWeek_20200617.pdf
How MC production is handled in PAGs: EXO
To include
• GEN improvements in the UL campaign
• How to look up information on sample production and which stage of the production a particular sample is in
• Introduction to GrASP (for identification of similar samples requested by other PAGs)
• A few words (or a slide) dedicated to the CMS bottom-up approach when it comes to MC needs
• Toolbox for analyzers: who to contact if samples are missing, etc.
• Toolbox for GEN contacts and common pitfalls to watch out for, like starting from the latest version of the cards when asking analyzers to make gridpacks; common card settings that are usually telltale signs of an old card
• A brief slide about pre-legacy