Ultra Legacy Monte Carlo Production: GEN overview - Saptaparna Bhattacharya, Alexander Grohsjean, and Gurpreet Singh Chahal for the CMS GEN group ...

Page created by Carlos Harmon
 
Ultra Legacy Monte Carlo
     Production: GEN overview

Saptaparna Bhattacharya, Alexander Grohsjean, and Gurpreet Singh Chahal
                        for the CMS GEN group
Outline of the talk

Overview of GEN specific improvements in the UL campaign

UL Monte Carlo production

How to find my missing samples?

MC production model in CMS

Toolbox for analyzers

Toolbox for MC contacts

Feedback?

       SMP General Meeting            !2           Saptaparna Bhattacharya
Overview of GEN specific improvements in the
               UL campaign

Generators in use in the UL campaign

     •     Large fraction of UL events produced at NLO (~50%) and NNLO (~14%)*

     •     A wide range of generators is used, tailored to the CMS physics program

     •     Using POWHEG for NLO generation allows one to circumvent the deleterious effect of negatively weighted
           events

     •     Parton shower modeling in 99% of cases done with Pythia
     [Figure: two pie charts of MC generator usage. Legend entries (left): madgraph, pythiaOnly, powheg,
     madgraphMLM, amcatnloFXFX, amcatnlo, evtgen, unknown, powhegMiNNLO, mcfm. Legend entries (right):
     powheg, amcatnloFXFX, pythiaOnly, powhegMiNNLO, madgraphMLM, sherpa, unknown, madgraph,
     evtgen+pythia, amcatnlo, mcfm.]

     Usage of MC generators categorized by sample (left) and events (right) for the UL16
     campaign, based on a sampling of 9M events

     *refers specifically to UL16
Ingredients in the UL campaign: GEN overview

Preparation of GEN UL campaign started in 2019 (introduced by Qiang Li and Efe Yazgan)
MC production in UL16, 17 and 18 documented in GitBook
GEN campaigns (X = 16, 17, 18): In UL16 50% luminosity split between APV and non-APV
   RunIISummer20UL{X}GEN: Pythia-only requests (pythia fragment needs to be specified)
   RunIISummer20UL{X}wmLHEGEN: Matrix element + parton shower requests (production with gridpacks)
   RunIISummer20UL{X}pLHE: Matrix element (proceeds through privately produced LHE) + parton shower
   request
Standard positive-definite NNPDF3.1 PDF NNLO set (with CP5 tune) is the default

   PDF set 325500 is also positive definite

   CP5 is the default tune (CPX tunes documented here, CP2 and CP3 tunes may be used for LO/NLO
   PDFs in the matrix element)
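
The campaign naming pattern listed above can be expanded mechanically. The following is a minimal Python sketch, not an official CMS tool: the helper name `campaign_names`, the dictionary keys, and the overall layout are hypothetical, and only the three name templates are copied from the list above.

```python
# Name templates copied from the campaign list on this slide.
# The dictionary keys are illustrative labels, not CMS terminology.
CAMPAIGN_TEMPLATES = {
    "pythia_only": "RunIISummer20UL{X}GEN",
    "gridpack": "RunIISummer20UL{X}wmLHEGEN",
    "private_lhe": "RunIISummer20UL{X}pLHE",
}

# X = 16, 17, 18 as stated on the slide.
YEARS = (16, 17, 18)


def campaign_names(request_type, years=YEARS):
    """Return one campaign name per year for a request type (hypothetical helper)."""
    template = CAMPAIGN_TEMPLATES[request_type]
    return [template.replace("{X}", str(year)) for year in years]


if __name__ == "__main__":
    for request_type in CAMPAIGN_TEMPLATES:
        print(request_type, campaign_names(request_type))
```

The sketch does not cover the APV and non-APV split of UL16, for which separate campaign names exist.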

Version of event generators: Madgraph

•   Madgraph version > 2.6.X recommended for UL production (strict inequality in version required since GEN and
    SIM steps separate in UL campaign)

     •   Supports read-only gridpacks and multithreading

     •   Includes bias-reweighting features at LO and NLO

          •   Details in Olivier’s talk

     •   Improved cluster support

•   Madgraph 2.6.5 is currently in the master branch of the genproductions repository

•   Detailed validation of both Madgraph 2.6.1 and 2.6.5 performed by the GEN Validation team (S. Bhattacharya,
    J. Choi, A. Grohsjean, G. Kole) in 2018-2019

     •   Validation suite covers kinematic distributions, PDF weights and BSM weights (associated with
         coupling reweighting or mass reweighting) at both LO and NLO

•   During the course of Run II data analysis, bugs in Madgraph were found by the GEN team and analyzers

     •   Examples of such bugs include incorrect computation of cross section (MG 5.2.3.3)

     •   Twiki page to collect user feedback on buggy samples and inconsistent settings:
         https://twiki.cern.ch/twiki/bin/view/CMS/MCKnownIssues

          •   This page is consistently maintained and includes solutions to the reported bugs

•   Gridpacks for processes with known bugs have been reproduced for the UL campaign

Version of event generators: Pythia

•   Pythia version 8.2.4.0 used

    •   Includes the dipole recoil option essential for Vector Boson Scattering topologies

    •   UL campaign uses a version of Pythia where the unexpectedly large particle multiplicity bug
        was fixed (detailed description in the Additional Material, under “Pythia 8 bug in UL MC production”)

•   To specify relevant PDF sets include (included in the default fragment)

•   Parton shower weights incorporated as

Version of event generators: POWHEG, Sherpa and HERWIG

• POWHEG V2
    • Often used in conjunction with JHU GEN for samples

• Sherpa 2.2.11 + OpenLoops 2.1.0
    • Available in CMSSW_11_3_0_pre4, CMSSW_10_2_26 & CMSSW_10_6_21; CMSSW_9_3_19; CMSSW_7_1_47
    • Comparison between Sherpa (Catani-Seymour) and other generators studied for TOP-17-002
    • First look at multijet production in Sherpa will be discussed in the GEN meeting on July 12th

• HERWIG7
    • Available since CMSSW_9_3_14
    • Supersedes HERWIG++
    • Tuned with CH3
    • Matching at NLO for angular ordered and dipole shower
    • Implementation of aMC@NLO and POWHEG-“type” matching schemes

[Figure: CMS, e/µ+jets, particle level, 35.8 fb-1 (13 TeV). dσ/dp_T(t_h) [pb GeV-1] versus p_T(t_h) [GeV] from 0 to 800 GeV. Data (with Sys ⊕ stat and Stat uncertainties) compared with POWHEG P8, SHERPA CS, POWHEG H++ and MG5 P8 [FxFx]; the lower panel shows the Theory/Data ratio.]

UL Monte Carlo production

MC production in the wider context

[Figure: “Release and Production Plan until Mid 2022”, timeline from Feb 2021 to Jul 2022. Source: PPD, 17 June 2021, https://indico.cern.ch/event/1048959/; XC meeting Friday 30 April [indico], follow-up XC on 9-July]

Releases:
     ● CMSSW_11_3_0: Geant4 10.7; GPU
     ● CMSSW_12_0_0: DD4HEP; stable DPG code; PF calib. sample; L1 Trigger (w/o LLP)
     ● CMSSW_12_1_0: stable POG code; POG samples (PAG ?); L1T: LLP; HLT: 1st menu
     ● CMSSW_12_2_0: datataking release; POG and PAG samples; Trigger: final menu
     ● CMSSW_12_3_0 ?
     ● sqrt(s)=14 TeV by default; final COM energy (13.6?) to be used for the later releases

Commissioning:
     ● Midweek GR: MWGR #3 (14-16 Apr), MWGR #4 (9-11 Jun), MWGR #5, MWGR #6
     ● Cosmics: CRUZET, CRAFT
     ● LHC beams

pp-Collisions:
     ● Feb: Beam commissioning
     ● Apr: Ramp-up / first collisions
     ● May: Physics

Sample production along the timeline:
     ● Run-3 DPG and trigger samples (CMSSW_11_X); Run-3 Trigger, DPG, POG, PAG samples (CMSSW_12_0); Run-3 analysis
     ● Custom-Nano (JME); Legacy NanoAODv9; Legacy NanoAODv10
     ● Re-MiniAOD on all Legacy AOD; U-Legacy ReReco for B-parking; Re-MiniAOD using 12_0 or 12_1
     ● Run-2 Legacy MC 16/17/18 (Summer20UL)

Run-2 Legacy:
     ● Produce ~50 Billion MC events in 2021
     ● Re-NanoAOD (target every 3 months)
     ● Custom-NanoAOD: establish for JME
     ● Re-MiniAODv2 and v3
     ● High-precision calibration and SF
     ● Re-reco of B-parking and HIN PD in 10_6 (tbc)

Run-3 Preparation:
     ● GPU and Scouting
     ● Detectors, conditions and trigger
     ● Skims
     ● Code freeze and calibrations
     ● Analysis preparation

Phase-2 Preparation:
     ● HLT-TDR
     ● Snowmass

Global CPU usage

        https://dmytro.web.cern.ch/dmytro/cmsprodmon/

            CPUs currently swamped with UL samples

UL event generation status by PAG
[Figure: bar chart of the number of events (axis from 2G to 10G) per group: B2G, BPH, BTV, EGM, EXO, HIG, JME, LUM, MUO, PPD, PPS, SMP, SUS, TAU, TOP]

Generated: Monday, July 5th, 2021, 00:29
Last update: Sunday, July 4th, 2021, 23:17
For input: RunIISummer20UL16GENAPV, RunIISummer20UL16wmLHEGENAPV, RunIISummer20UL16wmLHEGEN, RunIISummer20UL16pLHEGEN, RunIISummer20UL16GEN, RunIISummer20UL16pLHEGENA

Samples in production: priority

[Figure: number of events (log scale, 1 to 10G), submitted and done, versus request priority, running from low (Block 3) through Block 2 to high (Block 1)]

Generated: Thursday, July 1st, 2021, 08:52
Last update: Thursday, July 1st, 2021, 08:24
For input: RunIISummer20UL18MiniAOD, RunIISummer20UL17MiniAOD, RunIISummer20UL16MiniAOD, RunIISummer20UL16MiniAODAPV

     •   Block 1: 1-2 months (saturated currently)
     •   Block 2: ~2 months
     •   Block 3: > 6 months (mostly for legacy production)

Priority blocks:
     •   Block 0 (130k and higher): strictly for PnR / CompOps
     •   Block 1 (120k,121k,122k): HLT-TDR, premix libraries and similar
     •   Block 1 (113k): classical mixing - Run3Winter21 Campaigns
     •   Block 1 (111k): re-NanoAOD
     •   Block 1 (110k): High-priority requests (only via JIRA)
     •   Block 2 (90k): UL20 PAG requests (higher priority)
     •   Block 3 (85k): UL20 PAG requests (lower priority)
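
The mapping from a request priority to its block can be sketched in a few lines of Python. This is only an illustration: the function name `priority_block` is hypothetical, and the boundaries between the quoted priority values (for example, treating everything from 110k up to just below 130k as Block 1) are an assumption, since the slide quotes individual values rather than ranges.

```python
def priority_block(priority):
    """Map a numeric request priority to a block number (hypothetical helper).

    The quoted values (85k, 90k, 110k-122k, 130k and higher) come from the
    slide; the thresholds in between are assumed for illustration.
    """
    if priority >= 130_000:
        return 0  # strictly for PnR / CompOps
    if priority >= 110_000:
        return 1  # high-priority requests, re-NanoAOD, Run3Winter21, HLT-TDR
    if priority >= 90_000:
        return 2  # UL20 PAG requests (higher priority)
    return 3  # UL20 PAG requests (lower priority)


if __name__ == "__main__":
    for value in (85_000, 90_000, 111_000, 122_000, 130_000):
        print(value, "-> Block", priority_block(value))
```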
How to find my missing samples?

How to find your sample in production

Status of MC production can be found on McM

  Detailed status available to indicate where the sample is in the production chain

How to find your sample in production

The EXO MC team (Sihyun Jeon, Michael Krohn, Kai Wei, Jay Vora, Ram Krishna Sharma,
Young Wan Kim, and Matthew Decaro) have put together documentation:
https://exo-mc-and-i.gitbook.io/exo-mc-and-interpretation/others/finding-prepids-in-mcm

Monitoring using computing tools

The status can also be gathered from the cmsprodmon tools

Some inconsistencies expected with McM

   McM tracks every step of production while cmsprodmon tracks requests submitted
   (one-to-one correspondence for campaigns with a single step, does not hold true for
   multi-step campaigns)

UL sample status tracking using GrASP

GrASP is a lite version of McM

Facilitates identification of same or similar samples requested by the PAGs

Only samples that are “tagged” (tags added by production managers) in McM can be
tracked with GrASP

UL sample status tracking using GrASP

Clicking on RunIISummer20UL*GEN

A search restricted to SMP can also be performed
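
To illustrate which campaign names a wildcard such as RunIISummer20UL*GEN would cover, here is a small Python sketch. It assumes shell-style glob matching, which may not be how GrASP interprets the pattern internally; the helper `matching_campaigns` is hypothetical, and the campaign names are taken from the status plots shown earlier in the talk.

```python
from fnmatch import fnmatchcase

# Wildcard as written on the slide.
PATTERN = "RunIISummer20UL*GEN"

# Campaign names that appear elsewhere in this talk.
CAMPAIGNS = [
    "RunIISummer20UL16GEN",
    "RunIISummer20UL16wmLHEGEN",
    "RunIISummer20UL16GENAPV",
    "RunIISummer20UL18MiniAOD",
]


def matching_campaigns(pattern, names):
    """Return the names matching a glob pattern, case-sensitively (hypothetical helper)."""
    return [name for name in names if fnmatchcase(name, pattern)]


if __name__ == "__main__":
    print(matching_campaigns(PATTERN, CAMPAIGNS))
```

Under glob semantics the pattern must match the whole name, so names ending in APV or MiniAOD are not selected.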

MC production model in CMS
               Tools for interested parties

Model of MC production in CMS

In CMS, Monte Carlo production proceeds via a bottom-up approach driven by
analyzers, in the following order:

     •   Analyzers identify MC samples needed for a particular analysis
     •   Analyzers reach out to MC contacts for guidance on sample production, and in some cases
         MC contacts assist with gridpack production
     •   MC contacts then inject samples into McM, make tickets and present them at the Monte Carlo
         coordination meeting
     •   Generator conveners critically examine the tickets (while looking for common pitfalls) and,
         if all tests pass, the requests are approved
     •   Production managers start the production of samples

Notes:

     •   While this is primarily how sample production is handled in EXO, in TOP a large fraction
         of samples are cloned from previous campaigns
     •   Currently handled by the Top Modeling Group

Toolbox for analyzers

Regular tutorials are organized by GEN on various event generators:
https://indico.cern.ch/event/962610/

   Tutorials on MG5, POWHEG, HERWIG, Sherpa, RIVET

GEN Twiki maintained regularly

  Examples of cards for LO and NLO
  production included in the twiki
Toolbox for analyzers

Information on MC campaigns and on monitoring submitted samples
can be found on the GitBook

Analysis specific monitoring tools

In the VVV analysis group, where a huge number of MC samples are in use due to a large array of final states,
Qilong Guo has developed a UL sample monitoring framework:
https://test-qiguo.web.cern.ch/test-qiguo/VVV/UL_samples/0Lepton/16/NanoAOD_v7_2016_missied.png

Toolbox for MC contacts

GEN meetings, held on Mondays at 14:00 CERN time, can generally be
instructive for GEN contacts

Useful information can also be found here:
https://cms-pdmv.gitbook.io/project/mccontact

Preferred dataset naming conventions are included in the GitBook

Pertinent information can be found in each Google doc that is discussed
during Monte Carlo Coordination meetings
How MC production is handled in SMP

Sample requests in SMP are handled via GitHub issues: https://github.com/SMP-GEN-Coordination

Conclusion

Contacting GEN: the best way to contact GEN is through the GEN hypernews: hn-cms-generators@cern.ch

Contacting production managers: through the PREPOPS hypernews: hn-cms-prep-ops@cern.ch

SMP MC contacts can be contacted via GitHub: https://github.com/SMP-GEN-Coordination

Feedback?

Additional Material

Pythia 8 bug in UL MC production

•   P8 bug noticed by J. Thaler in 8.240:

     ◦   unexpected swap of color tags to recoiler in some branchings

     ◦   unexpectedly large strings can be pulled between a
         central radiation and beam remnant

     ◦   first evaluation of impact by Steve

          ▪   no significant changes to jet shapes, kinematics,
              properties, ....

          ▪   particle multiplicity per jet shifted to higher values

•   patch to P8.240 from Steve already merged

     ◦   thanks to Sanghyun for very quick validation

•   preparing with PCGT to check impact on CPx tunes

•   will provide test release and MC campaign to quantify impact
    and define production strategy

     ◦   ping relevant groups to prepare requests when ready

              https://indico.cern.ch/event/920855/contributions/3905388/attachments/2059238/3454050/GEN_Summary_CMSWeek_20200617.pdf

How MC production is handled in PAGs: EXO

To include

• GEN improvements in UL campaign
• How to look up information on sample production and at what stage of the
  production a particular sample is in
• Introduction to GrASP (for identification of similar samples requested by other
  PAGs)
• A few words (or a slide) dedicated to the CMS bottom-up approach when it comes
  to MC needs
• Toolbox for analyzers, who to contact if samples are missing, etc.
• Toolbox for GEN contacts and common pitfalls to watch out for, such as starting from the
  latest version of cards when requesting analyzers to make gridpacks. Common
  card settings that are usually telltale signs of an old card.
• A brief slide about pre-legacy.
