New Paradigms for New Physics Searches with Machine Learning - Rutgers University - CERN Indico
←
→
Page content transcription
If your browser does not render page correctly, please read the page content below
New Paradigms for New Physics Searches with Machine Learning David Shih January 13, 2021 Theory Mini-workshop HKUST Scott IAS Program on HEP Thomas Rutgers University
Where is the new physics?? Despite thousands of searches for new physics at the LHC, nothing but limits and null results so far. What if new physics is hiding in the data but we haven’t looked in the right places yet?
A Benchmark Example LHC Olympics 2020 R&D Dataset https://doi.org/10.5281/zenodo.2629072 q X Z’ q q Y q No explicit search at the LHC for this scenario! Could be hiding in the dijet resonance search at >5sigma significance!!
The most common approach Model specific searches The BDT search s search strategy is applied separately through two sets of signal regions targeting models with gluino-pair duction with direct (BDT-GGd) or one-step (BDT-GGo) 6̃ decays. In each set, events are separated Most NP searches at the LHC are heavily optimized with specific 0 o four categories, depending on the mass difference 3 , pmiss T ) min > 0.4 ⇢ Tmiss /< eff (#j ) > 0.2 < eff [GeV] > 1400 > 800 BDT score > 0.97 > 0.94 > 0.94 > 0.87 0 0.4BDTs) optimized q( 91,2, (3) , pmiss T ) min > 0.2 using simulations of signal AND q( 98>3 , pbackground. miss ) T min > 0.4 > 0.2 ⇢ Tmiss /< eff (#j ) > 0.2 < eff [GeV] > 1400 > 800 BDT score > 0.96 > 0.87 > 0.92 > 0.84
The most common approach Model specific searches The BDT search s search strategy is applied separately through two sets of signal regions targeting models with gluino-pair duction with direct (BDT-GGd) or one-step (BDT-GGo) 6̃ decays. In each set, events are separated Most NP searches at the LHC are heavily optimized with specific 0 o four categories, depending on the mass difference 3 , pmiss ) min rc h > 0.4 sea T ⇢ Tmiss /< eff (#j ) > 0.2 < eff [GeV] of > 1400 > 800 BDT score 9 9 %> 0.97 > 0.94 > 0.94 > 0.87 0 > 0.4BDTs) optimized q( 91,2, (3) , pmiss T ) min > 0.2 using simulations of signal AND q( 98>3 , pbackground. miss ) T min > 0.4 > 0.2 ⇢ Tmiss /< eff (#j ) > 0.2 < eff [GeV] > 1400 > 800 BDT score > 0.96 > 0.87 > 0.92 > 0.84
Of course, we should continue to perform these searches, because NP could always be right around the corner… But we should also recognize that perhaps new physics is on a different street, or in a different building altogether…
Existing model-independent approaches “the bump hunt” Idea: assume signal is localized in some feature (usually invariant mass) while background is smooth. Interpolate from sidebands into signal region, search for an excess.
Existing model-independent approaches “the bump hunt” Idea: assume signal is localized in some feature (usually invariant mass) while background is smooth. r i es. Interpolate from sidebands into signal region, search c ove for an excess. y di s m an e d i n d, u s e t ho si c m C l as
Existing model-independent searches “the general search” Idea: divide the phase space up into thousands of bins, compare data s are dominated by tt toFigures production. SMforsimulation additional object groupsin caneach be foundone in endix A. 0.015 0.018 0.020 0.045 0.046 0.064 0.065 0.068 0.068 0.069 0.077 0.081 0.081 0.083 0.084 0.091 0.097 0.10 0.10 0.10 Events per class 107 106 CMS -1 Data tt W + jets Higgs boson CMS ATLAS 35.9 fb (13 TeV) Drell-Yan γ + jets 105 Exclusive classes Single t Multijet Multiboson “Model independent 104 103 “MUSIC” 102 general search” 10 1 1807.07447 CMS-PAS-EXO-14-016 −1 10 EPJC 79:120 (2019) 2 e + 2 µ + 1b 2 e + 1b 4 µ + 1b + 1jet + pmiss 1e + 1µ + 1γ + pmiss pmiss 1µ + 4 b + 1jet + pmiss pmiss pmiss 1e + 1µ + 1jet + pmiss 2 e + 1b + pmiss 3 e + 1b + 2 jets 3 µ + 1b + 3 jets 1e + 2 γ + 2 jets 1µ + 1γ + 2 b + 4 jets 1µ + 4 b + 2 jets 2 e + 1µ + 1b + 3 jets 1e + 1µ + 1γ + 1b + 1jet 1e + 1µ + 1jet 1µ + 2 γ 3 e + 1γ T T T T T T T T 1e + 1µ + 2 e + 1µ + 1jet + 1e + 1γ + 1b + 5 jets + See also proposals by D’Agnolo, Wulzer et al (1806.02350, 1912.12155): re 8: Data and SM predictions for the most significant exclusive event classes, where the ficance of an event class is calculated in a single aggregated bin. Measured data are shown train DNN on full phase space to distinguish data from background ack markers, contributions from SM processes are represented by coloured histograms, he shaded region represents the uncertainty in the SM background. The values above the
Existing model-independent searches “the general search” Idea: divide the phase space up into thousands of bins, compare data toFigures SMforsimulation additional object groupsin caneach be foundone st i l l s are dominated by tt production. in , bu t t n endix A. d en t nd e e pe n d e p e d 0.015 0.018 0.020 0.045 0.046 0.064 0.065 0.068 0.068 0.069 0.077 0.081 0.081 0.083 0.084 0.091 0.097 n n 0.10 0.10 0.10 e l i l at i o Events per class d 107 CMS u ATLAS o Data W + jets m CMS m si 106 tt Higgs boson 5 35.9 fb-1 (13 TeV) gn al nd ) Drell-Yan γ + jets i 10 Exclusive classes Single t Multijet s u Multiboson ly ro “Model independent 104 Tru (backg “MUSIC” 103 102 general search” h ly hig 10 1 1807.07447 CMS-PAS-EXO-14-016 10−1 EPJC 79:120 (2019) 2 e + 2 µ + 1b 2 e + 1b 4 µ + 1b + 1jet + pmiss 1e + 1µ + 1γ + pmiss pmiss 1µ + 4 b + 1jet + pmiss pmiss pmiss 1e + 1µ + 1jet + pmiss 2 e + 1b + pmiss 3 e + 1b + 2 jets 3 µ + 1b + 3 jets 1e + 2 γ + 2 jets 1µ + 1γ + 2 b + 4 jets 1µ + 4 b + 2 jets 2 e + 1µ + 1b + 3 jets 1e + 1µ + 1γ + 1b + 1jet 1e + 1µ + 1jet 1µ + 2 γ 3 e + 1γ T T T T T T T T 1e + 1µ + 2 e + 1µ + 1jet + 1e + 1γ + 1b + 5 jets + See also proposals by D’Agnolo, Wulzer et al (1806.02350, 1912.12155): re 8: Data and SM predictions for the most significant exclusive event classes, where the ficance of an event class is calculated in a single aggregated bin. Measured data are shown train DNN on full phase space to distinguish data from background ack markers, contributions from SM processes are represented by coloured histograms, he shaded region represents the uncertainty in the SM background. The values above the
2 An Overview of Model (In)dependent Searches A viable search for new physics generally must have two essential com New paradigms for model-agnostic searches sensitive to new phenomena and it must also be able to estimate the b null hypothesis (Standard Model only). The categorization of a searc (in)dependence requires consideration of both of these components. Figu Can advances in machine learning open up new avenues for to characterize model independence for both BSM sensitivity and SM b model-independent We will nowsearches? consider each in turn. from Nachman & DS 2001.04990 background (SM) model independence background (SM) model independence autoencoders Direct De Some searches LDA estimation, S (train signal ANODE versus data) ABCD CWoLa SALAD Control region Most searches MUSiC (CMS), method (train with General Search simulations) (ATLAS) Pure MC prediction signal model independence signal model ind
2 An Overview of Model (In)dependent Searches A viable search for new physics generally must have two essential com New paradigms for model-agnostic searches sensitive to new phenomena and it must also be able to estimate the b null hypothesis (Standard Model only). The categorization of a searc (in)dependence requires consideration of both of these components. Figu Can advances in machine learning open up new avenues for to characterize model independence for both BSM sensitivity and SM b model-independent We will nowsearches? consider each in turn. from Nachman & DS 2001.04990 background (SM) model independence background (SM) model independence autoencoders Direct De Some searches estimation, S (train signal LDA Many new ANODE versus data) CWoLa ideas recently! ABCD SALAD Control region Most searches MUSiC (CMS), method (train with General Search simulations) (ATLAS) Pure MC prediction signal model independence signal model ind
Many new approaches inspired by (or at least demonstrated on) the LHC Olympics 2020 Data Challenge 1 The LHC Olympics 2020 2 A Community Challenge for Anomaly 3 Detection in High Energy Physics 4 5 Gregor Kasieczka (ed),1 Benjamin Nachman (ed),2,3 David Shih (ed),4 Oz Amram,5 6 Anders Andreassen,6 Kees Benkendorfer,2,7 Blaz Bortolato,8 Gustaaf Brooijmans,9 7 Florencia Canelli,10 Jack H. Collins,11 Biwei Dai,12 Felipe F. De Freitas,13 Barry M. 8 Dillon,8,14 Ioan-Mihail Dinu,5 Zhongtian Dong,15 Julien Donini,16 Javier Duarte,17 D. 9 A. Faroughy10 Julia Gonski,9 Philip Harris,18 Alan Kahn,9 Jernej F. Kamenik,8,19 10 Charanjit K. Khosa,20,30 Patrick Komiske,21 Luc Le Pottier,2,22 Pablo 11 Martı́n-Ramiro,2,23 Andrej Matevc,8,19 Eric Metodiev,21 Vinicius Mikuni,10 Inês 12 Ochoa,24 Sang Eon Park,18 Maurizio Pierini,25 Dylan Rankin,18 Veronica Sanz,20,26 13 Nilai Sarda,27 Uros̆ Seljak,2,3,12 Aleks Smolkovic,8 George Stein,2,12 Cristina Mantilla 14 Suarez,5 Manuel Szewc,28 Jesse Thaler,21 Steven Tsan,17 Silviu-Marian Udrescu,18 15 Louis Vaslin,16 Jean-Roch Vlimant,29 Daniel Williams,9 Mikaeel Yunus18 1 16 Institut für Experimentalphysik, Universität Hamburg, Germany 2 Physics Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA to appear on the arXiv next week… 17 3 18 Berkeley Institute for Data Science, University of California, Berkeley, CA 94720, USA 4 19 NHETC, Department of Physics & Astronomy, Rutgers University, Piscataway, NJ 08854, USA 5 20 Department of Physics & Astronomy, The Johns Hopkins University, Baltimore, MD 21211, 21 USA
66 Contents 67 1 Introduction 2 Many new approaches inspired by (or at least demonstrated 68 2 Dataset and Challenge 5 69 2.1 R&D Dataset 5 on) the LHC Olympics 2020 Data Challenge 70 71 2.2 Black Box 1 2.3 Black Box 2 6 7 72 2.4 Black Box 3 7 73 Individual Approaches 9 74 3 Unsupervised 11 75 3.1 Anomalous Jet Identification via Variational Recurrent Neural Network 11 76 3.2 Anomaly Detection with Density Estimation 16 77 3.3 BuHuLaSpa: Bump Hunting in Latent Space 19 78 3.4 GAN-AE and BumpHunter 24 79 3.5 Gaussianizing Iterative Slicing (GIS): Unsupervised In-distribution Anomaly 80 Detection through Conditional Density Estimation 29 81 3.6 Latent Dirichlet Allocation 34 82 3.7 Particle Graph Autoencoders 38 83 3.8 Regularized Likelihoods 42 84 3.9 UCluster: Unsupervised Clustering 46 85 4 Weakly Supervised 51 86 4.1 CWoLa Hunting 51 87 4.2 CWoLa and Autoencoders: Comparing Weak- and Unsupervised methods 88 for Resonant Anomaly Detection 55 89 4.3 Tag N’ Train 60 90 4.4 Simulation Assisted Likelihood-free Anomaly Detection 63 91 4.5 Simulation-Assisted Decorrelation for Resonant Anomaly Detection 68 92 5 (Semi)-Supervised 71 93 5.1 Deep Ensemble Anomaly Detection 71 94 5.2 Factorized Topic Modeling 77 95 5.3 QUAK: Quasi-Anomalous Knowledge for Anomaly Detection 81 96 5.4 Simple Supervised learning with LSTM layers 85 97 6 Discussion 88
LHC Olympics 2020: Black Boxes https://doi.org/10.5281/zenodo.3547721 In 2019, Gregor Kasieczka, Ben Nachman and I initiated the LHC Olympics 2020 Challenge. It consisted of three “black boxes” of simulated data (bg dominated!): • 1 million events each • 4-vectors of every reconstructed particle (all hadronic) in the event • Particle ID, charge, etc not included • Single R=1 jet trigger pT>1.2 TeV The goal of the challenge was for participants to analyze each box and 1. Decide whether or not it contains new physics 2. Characterize the new physics, if it’s there
LHC Olympics 2020: Black Boxes https://doi.org/10.5281/zenodo.3547721 In 2019, Gregor Kasieczka, Ben Nachman and I initiated the LHC Olympics 2020 Challenge. xes It consisted of three “black boxes” of simulated data (bg dominated!): e b o s i n t h • 1 million events each at w a e e w h n g e k t o s h al l e • 4-vectors of every reconstructed e n ’s ta l particle (all f t e c hadronic) h in the event fo r B om e o • taParticle t u n e ID, u t c dcharge, etcenotoincluded S y a n d t h • Single R=1 jet trigger pT>1.2 TeV The goal of the challenge was for participants to analyze each box and 1. Decide whether or not it contains new physics 2. Characterize the new physics, if it’s there
LHC Olympics 2020: R&D Dataset https://doi.org/10.5281/zenodo.2629072 Prior to the challenge, we also released a labeled R&D dataset consisting of 1M QCD dijet events and 100k signal events q No explicit search at the mX=500 GeV LHC for this scenario! X Z’ q mZ’=3.5 TeV q Y mY=100 GeV Unofficial theme of LHCO2020: q “enhancing the bump hunt”
LHC Olympics 2020: R&D Dataset Resonant feature m = mZ 0 = mJJ (null) Benchmark S=500, B=500,000, BSR=61,000 re 2. Histograms for the invariant mass of the leading two jets for the Standard Model background signal strength: S/BSR~6x10-3, S/√BSR~1.5 ell as the injected signal. There are 1 million background events and 1000 signal events.
LHC Olympics 2020: R&D Dataset Additional features: Figure 3. The four features used for classification: mJ1x = m(m (top left), left), and · J2 (bottom right). These histograms are inclusive in m ≠ mJ1 ,m (top right), J1 J·2 ,(bottom J2 ⌧21 , ⌧21 ) (null) J1 J2 J1 21 . There are 1 million background
Enhancing the bump hunt ⇥(null) ~x 2 Rd (null) primary resonant feature (mJJ) additional features If the signal is localized in additional features, can we find it in a model- independent way?
sideband SR sideband Enhancing the bump hunt The optimal model-agnostic discriminant would be Figure 2. Histograms for the invariant mass of the leading two jets for the Standard Model ba as well as the injected signal. There are 1 million background events and 1000 signal events epochs results in a stable result. Averaging over more epochs does not further imp P(x|data) stability. All results with ANODE present the SB density estimator with this averagin R(x) = for the last 10 epochs. Figure 4 shows a scatter plot of R(x|m) versus log pbackground (x|m) for the test s P(x|bg) (null) SR. As desired, the background is mostly concentrated around R(x|m) = 1, while there tail for signal events at higher values of R(x|m) and between ≠2 < log pbackground (x This is exactly what is expected for this signal: it is an over-density (R > 1) in a phase space that is relatively rare for the background (pbackground (x|m) π 1). The background density in Fig. 4 also shows that the R(x|m) is narrower aroun Key idea: use sidebands in mJJ to model P(x|bg) pbackground (x|m) is large and more spread out when pbackground (x|m) π 1. This is that the density estimation is more accurate when the densities are high and wo the densities are low. This is also to be expected: if there are many data points clo another, it should be easier to estimate their density than if the data points are ver Another view of the results is presented in Fig. 5, with one-dimensional info about R(x|m) in the SR. The left plot of Fig. 5 shows that the background is cent approximately symmetric around R = 1 with a standard deviation of approximat This width is due to various sources, including the accuracy of the SR density, the ac the SB density, and the quality of the interpolation from SB to SR. Each of these so contributions from the finite size of the datasets used for training, the neural network fl and the training procedure. The right plot of Fig. 5 presents the number of backgro signal events as a function of a threshold R > Rc . The starting point are the original – 12 –
sideband SR sideband Enhancing the bump hunt The optimal model-agnostic discriminant would be Figure 2. Histograms for the invariant mass of the leading two jets for the Standard Model ba as well as the injected signal. There are 1 million background events and 1000 signal events epochs results in a stable result. Averaging over more epochs does not further imp P(x|data) stability. All results with ANODE present the SB density estimator with this averagin R(x) = for the last 10 epochs. Figure 4 shows a scatter plot of R(x|m) versus log pbackground (x|m) for the test s P(x|bg) (null) SR. As desired, the background is mostly concentrated around R(x|m) = 1, while there tail for signal events at higher values of R(x|m) and between ≠2 < log pbackground (x This is exactly what is expected for this signal: it is an over-density (R > 1) in a phase space that is relatively rare for the background (pbackground (x|m) π 1). The background density in Fig. 4 also shows that the R(x|m) is narrower aroun Key idea: use sidebands in mJJ to model P(x|bg) pbackground (x|m) is large and more spread out when pbackground (x|m) π 1. This is that the density estimation is more accurate when the densities are high and wo the densities are low. This is also to be expected: if there are many data points clo another, it should be easier to estimate their density than if the data points are ver Another view of the results is presented in Fig. 5, with one-dimensional info about R(x|m) in the SR. The left plot of Fig. 5 shows that the background is cent approximately symmetric around R = 1 with a standard deviation of approximat This width is due to various sources, including the accuracy of the SR density, the ac the SB density, and the quality of the interpolation from SB to SR. Each of these so contributions from the finite size of the datasets used for training, the neural network fl and the training procedure. The right plot of Fig. 5 presents the number of backgro signal events as a function of a threshold R > Rc . The starting point are the original – 12 –
ANODE: Anomaly Detection with Density Estimation Nachman & DS 2001.04990 Example of a new approach inspired by LHCO2020. (See Ben’s talk for additional new approaches!) Use neural density estimation to directly learn the conditional probability densities from the data P(x|data; mJJ 2 SR) (null) (null) P(x|data; mJJ 2 / SR) = P(x|bg; mJJ 2 / SR) interpolate in (x,mJJ) (null) P(x|bg; mJJ 2 SR) P(x|data; mJJ 2 SR) Construct the likelihood ratio: R(x) = P(x|bg; mJJ 2 SR) (null)
ANODE: Results on LHCO R&D Figure 4. Scatter Dataset plot of R(x|m) versus log p background (x| Nachman & DS 2001.04990 events are shown (as a two-dimensional histogram) in gray in red. Figure 5. Left: Histogram of R(x|m) evaluated on the igure 4. Scatter plot of R(x|m) versus log pbackground (x|m) across the test set in the SR. Background ents are shown (as a two-dimensional histogram) in grayscale and individual signal events are shown red. The method works! ANODE is sensitive events that survive atothreshold on R(x|m). The two distr the signal! 500,000 total background events and 500 total signal even to have the same number of events as each other and i
J1 ANODE: Results on LHCO R&D Dataset (left) and mJ2 ≠ mJ1 (right) in the signal region after applying a Nachman & DS 2001.04990 Can enhance the significance of the bump hunt by a factor of up to 7! ng Characteristic (ROC) curve (left) and Significance Improvement ht). 1.5σ (dijet bump hunt) => 10σ (ANODE+dijet bump hunt)
determine the background efficiency in the SR after a requirement on R > Rc . Figure 8 presents a comparison between integration methods (direct integration and importance sampling) described in Sec. 3.2 and the true background yields. Qualitatively, both methods are able to ANODE: Results on LHCO R&D Dataset characterize the yield across several orders of magnitude in background efficiency. However, both methods Nachman & DSdiverge from the truth in the extreme tails of the R distribution. The right plot 2001.04990 of Fig. 8 offers a quantitative comparison between methods. For efficiencies down to about 10≠3 , both methods are accurate within about 25%. The direct integration method has a smaller Novelbias of about aspect 10%. This of ANODE: canis consistent with Fig. 5, for estimate backgrounds which the directly withstandard deviation is P(x|bg; m∈SR) between 10-20%. Figure 8. Left: The number of events after a threshold requirement R > Rc using the two integration methods described in Sec. 3.2, as well as the true background yield. Right: The ratio of the predicted and true background yields from the left plot, as a function of the actual number of events that survive
Via Machinae: ANODE IN SPACE DS, Matt Buckley, Lina Necib & John Tamanas, in preparation ANODE is a completely general method for identifying multivariate overdensities in data using sidebands. It could have many diverse applications. We are currently using it to find stellar streams in Gaia data. Gaia satellite: • Launched in 2013; extended to 2025 • Mission: map out the full 6d phase space of the stars in our galaxy • Angular positions, velocities, color and magnitude of over 1 billion stars in our galaxy • radial positions and velocities for a smaller subset of nearby stars (not used in this work)
Stellar streams Cold stellar streams are tidally- stripped remnants of globular clusters and dwarf galaxies, falling into and orbiting our galaxy. They are very interesting objects of study for astrophysicists and particle physicists. In particular, they could be unique probes into dark matter substructure.
Stream finding https://github.com/cmateu/galstreams Existing methods (eg STREAMFINDER, Malhan & Ibata 2018) have found many new streams in the Gaia data, but they make a number of model-dependent assumptions (form of the galactic potential, orbits, isochrones, …). We were interested in whether unsupervised ML could be used to find streams more model-independently.
Ibata & Martin Stream finding from Malhan et al 2018 position velocity photometry ties ties of of aa sample sample of of previously-discovered Figure 4. Properties of a sample previously-discovered streams, streams, ofas recovered recovered by by the previously-discovered as the STREAMFINDER. The The first, streams, as recovered STREAMFINDER. second, first, by third third and the STREAMFINDER. second, and fourth The first, second fourth perties of operties of the GD-1, therows GD-1, Jhelum, show Jhelum, Indus Indus and the properties and ofOrphan the GD-1, Orphan streams, Jhelum, streams, respectively. Indus andThe respectively. columns Orphan The reproduce, streams, columns from from left respectively. reproduce, The left to right, right, the tocolumns thereproduce, from ates of nates of the the structures, equatorialthe structures, distance distance solutions coordinates the found found by of the structures, solutions the bythe algorithm thedistance algorithm (for (for representative solutions found by the representative metallicity algorithmvalues), metallicity the the proper (for representative values), proper metallicity v on on (with (with observations observations in in red, red, model motion distribution model solutions solutions (with in in blue, observations blue,inand red,the and the full modelfull DR2 DR2 sample sample solutions in in grey), in blue, and and grey), the and full the colour-magnitude the DR2 colour-magnitude sample in grey), and the e stars stars (with (with observations observations distribution of in red in the and and template red stars template model model in (with observations in blue) blue) in red selected selected by by STREAMFINDER. and template model in blue) STREAMFINDER. The The distance distanceby selected solutions solutions found found The distanc STREAMFINDER. match match closely the closely by distance thethe distance algorithmvalues values that have have been thatclosely match the previously been previously distance derived derived values that for these forhave these streams: streams: been DD ⇠ previously 88kpc kpc for ⇠derived forGD-1 for these(Grillmair GD-1 (Grillmair streams: D ⇠ 8 kpc fo D ⇠ ,, D ⇠ 13.2 13.2 kpc and ⇠ and kpcDionatos & ⇠ 16.6 16.6 kpc kpcD 2006), for for Jhelum ⇠Jhelum and 13.2 kpcandandIndus, Indus, respectively respectively ⇠ 16.6 kpc for Jhelum(Shipp (Shipp andet et al. al. 2018) and and D D = 2018)respectively Indus, = [33 [33 38] (Shipp etkpc 38] al. for kpc for 2018) and D = g etet al. al. 2010). The The CMD 2010).Orphan CMD template template (Newberg models, 2010).shown models, et al. shown The CMDin in blue blue in in the templatethe last last column, models,column, shown have have in been blue plotted been plotted in at the lastat the the appropriate appropriate column, have been plotted a spective streams. espective streams. The Thefor distance colour-magnitude colour-magnitude diagram diagram The the respective streams. of of the the Orphan Orphan stream stream might colour-magnitude might seem diagram seem peculiar, peculiar, of the Orphan but but here here we stream we only see onlyseem might see the the peculiar, but he due due to to the the trimming trimming red-giantof the the data of branch data sample sample due to the below below GG= trimming = 19.5. 19.5. of the data sample below G = 19.5.
Ibata & Martin Stream finding from Malhan et al 2018 position velocity photometry ties ties of of aa sample sample of of previously-discovered Figure 4. Properties of a sample previously-discovered streams, streams, ofas recovered recovered by by the previously-discovered as the STREAMFINDER. The The first, streams, as recovered STREAMFINDER. second, first, by third third and the STREAMFINDER. second, and fourth The first, second fourth perties of operties of the GD-1, therows GD-1, Jhelum, show Jhelum, Indus Indus and the properties and ofOrphan the GD-1, Orphan streams, Jhelum, streams, respectively. Indus andThe respectively. columns Orphan The reproduce, streams, columns from from left respectively. reproduce, The left to right, right, the tocolumns thereproduce, from ates of nates of the Idea: streams are local overdensities in position, velocity and photometric the structures, equatorialthe structures, distance distance solutions coordinates the found found by of the structures, solutions the bythe algorithm thedistance algorithm (for (for representative solutions found by the representative metallicity algorithmvalues), metallicity values), the the proper (for representative proper metallicity v on on (with (with observations in in red, red, model solutions in in blue, inand red,the full full DR2 DR2 sample in in grey), and and the colour-magnitude e stars stars (with space. observations motion distribution (with observations observations distribution of in model red in the and red stars solutions (with observations and template template model blue, model in (with observations and in blue) blue) the in red model selected selected by sample solutions in blue, by STREAMFINDER. and template grey), model in blue) STREAMFINDER. The and full the the DR2 The distance distanceby selected colour-magnitude sample in grey), and the solutions solutions found found The distanc STREAMFINDER. match match closely the closely by distance thethe distance algorithmvalues values that have have been thatclosely match the previously been previously distance derived derived values that for these forhave these streams: streams: been DD ⇠ previously 88kpc kpc for ⇠derived forGD-1 for these(Grillmair GD-1 (Grillmair streams: D ⇠ 8 kpc fo D ⇠ ,, D ⇠ 13.2 13.2 kpc and ⇠ and kpcDionatos & ⇠ 16.6 16.6 kpc kpcD 2006), for for Jhelum ⇠Jhelum and 13.2 kpcandandIndus, Indus, respectively respectively ⇠ 16.6 kpc for Jhelum(Shipp (Shipp andet et al. al. 2018) and and D D = 2018)respectively Indus, = [33 [33 38] (Shipp etkpc 38] al. for kpc for 2018) and D = g etet al. al. 2010). The The CMD 2010).Orphan CMD template template (Newberg models, 2010).shown models, et al. shown The CMDin in blue blue in in the templatethe last last column, models,column, shown have have in been blue plotted been plotted in at the lastat the the appropriate appropriate column, have been plotted a spective streams. espective streams. The Thefor distance colour-magnitude colour-magnitude diagram diagram The the respective streams. of of the the Orphan Orphan stream stream might colour-magnitude might seem diagram seem peculiar, peculiar, of the Orphan but but here here we stream we only see onlyseem might see the the peculiar, but he due due to to the the trimming trimming red-giantof the the data of branch data sample sample due to the below below GG= trimming = 19.5. 19.5. of the data sample below G = 19.5.
Ibata & Martin Stream finding from Malhan et al 2018 position velocity photometry ties ties of of aa sample sample of of previously-discovered Figure 4. Properties of a sample previously-discovered streams, streams, ofas recovered recovered by by the previously-discovered as the STREAMFINDER. The The first, streams, as recovered STREAMFINDER. second, first, by third third and the STREAMFINDER. second, and fourth The first, second fourth perties of operties of the GD-1, therows GD-1, Jhelum, show Jhelum, Indus Indus and the properties and ofOrphan the GD-1, Orphan streams, Jhelum, streams, respectively. Indus andThe respectively. columns Orphan The reproduce, streams, columns from from left respectively. reproduce, The left to right, right, the tocolumns thereproduce, from ates of nates of the Idea: streams are local overdensities in position, velocity and photometric the structures, equatorialthe structures, distance distance solutions coordinates the found found by of the structures, solutions the bythe algorithm thedistance algorithm (for (for representative solutions found by the representative metallicity algorithmvalues), metallicity values), the the proper (for representative proper metallicity v on on (with (with observations in in red, red, model solutions in in blue, inand red,the full full DR2 DR2 sample in in grey), and and the colour-magnitude e stars stars (with space. observations motion distribution (with observations observations distribution of in model red in the and red stars solutions (with observations and template template model blue, model in (with observations and in blue) blue) the in red model selected selected by sample solutions in blue, by STREAMFINDER. and template grey), model in blue) STREAMFINDER. The and full the the DR2 The distance distanceby selected colour-magnitude sample in grey), and the solutions solutions found found The distanc STREAMFINDER. match match closely the closely by distance thethe distance algorithmvalues values that have have been thatclosely match the previously been previously distance derived derived values that for these forhave these streams: streams: been DD ⇠ previously 88kpc kpc for ⇠derived forGD-1 for these(Grillmair GD-1 (Grillmair streams: D ⇠ 8 kpc fo D ⇠ ,, D ⇠ 13.2 Since they are cold, the stars in the stream are clustered in velocity. 13.2 kpc and ⇠ and kpcDionatos & ⇠ 16.6 16.6 kpc kpcD 2006), for for Jhelum ⇠Jhelum and 13.2 kpcandandIndus, Indus, respectively respectively ⇠ 16.6 kpc for Jhelum(Shipp (Shipp andet et al. al. 2018) and and D D = 2018)respectively Indus, = [33 [33 38] (Shipp etkpc 38] al. for kpc for 2018) and D = g etet al. al. 2010). The The CMD 2010).Orphan CMD template template (Newberg models, 2010).shown models, et al. shown The CMDin in blue blue in in the templatethe last last column, models,column, shown have have in been blue plotted been plotted in at the lastat the the appropriate appropriate column, have been plotted a spective streams. espective streams. The Thefor distance colour-magnitude colour-magnitude diagram diagram The the respective streams. of of the the Orphan Orphan stream stream might colour-magnitude might seem diagram seem peculiar, peculiar, of the Orphan but but here here we stream we only see onlyseem might see the the peculiar, but he due due to to the the trimming trimming red-giantof the the data of branch data sample sample due to the below below GG= trimming = 19.5. 19.5. of the data sample below G = 19.5.
Ibata & Martin Stream finding from Malhan et al 2018 position velocity photometry ties ties of of aa sample sample of of previously-discovered Figure 4. Properties of a sample previously-discovered streams, streams, ofas recovered recovered by by the previously-discovered as the STREAMFINDER. The The first, streams, as recovered STREAMFINDER. second, first, by third third and the STREAMFINDER. second, and fourth The first, second fourth perties of operties of the GD-1, therows GD-1, Jhelum, show Jhelum, Indus Indus and the properties and ofOrphan the GD-1, Orphan streams, Jhelum, streams, respectively. Indus andThe respectively. columns Orphan The reproduce, streams, columns from from left respectively. reproduce, The left to right, right, the tocolumns thereproduce, from ates of nates of the Idea: streams are local overdensities in position, velocity and photometric the structures, equatorialthe structures, distance distance solutions coordinates the found found by of the structures, solutions the bythe algorithm thedistance algorithm (for (for representative solutions found by the representative metallicity algorithmvalues), metallicity values), the the proper (for representative proper metallicity v on on (with (with observations in in red, red, model solutions in in blue, inand red,the full full DR2 DR2 sample in in grey), and and the colour-magnitude e stars stars (with space. observations motion distribution (with observations observations distribution of in model red in the and red stars solutions (with observations and template template model blue, model in (with observations and in blue) blue) the in red model selected selected by sample solutions in blue, by STREAMFINDER. and template grey), model in blue) STREAMFINDER. The and full the the DR2 The distance distanceby selected colour-magnitude sample in grey), and the solutions solutions found found The distanc STREAMFINDER. match match closely the closely by distance thethe distance algorithmvalues values that have have been thatclosely match the previously been previously distance derived derived values that for these forhave these streams: streams: been DD ⇠ previously 88kpc kpc for ⇠derived forGD-1 for these(Grillmair GD-1 (Grillmair streams: D ⇠ 8 kpc fo D ⇠ ,, D ⇠ 13.2 Since they are cold, the stars in the stream are clustered in velocity. 13.2 kpc and ⇠ and kpcDionatos & ⇠ 16.6 16.6 kpc kpcD 2006), for for Jhelum ⇠Jhelum and 13.2 kpcandandIndus, Indus, respectively respectively ⇠ 16.6 kpc for Jhelum(Shipp (Shipp andet et al. al. 2018) and and D D = 2018)respectively Indus, = [33 [33 38] (Shipp etkpc 38] al. for kpc for 2018) and D = g etet al. al. 2010). The The CMD 2010).Orphan CMD template template (Newberg models, 2010).shown models, et al. shown The CMDin in blue blue in in the templatethe last last column, models,column, shown have have in been blue plotted been plotted in at the lastat the the appropriate appropriate column, have been plotted a spective streams. espective streams. The Thefor distance colour-magnitude colour-magnitude diagram diagram The the respective streams. of of the the Orphan Orphan stream stream might colour-magnitude might seem diagram seem peculiar, peculiar, of the Orphan but but here here we stream we only see onlyseem might see the the peculiar, but he due due to to the Can sideband in one of the velocities and use ANODE to look for local the trimming trimming red-giantof the the data of branch data sample sample due to the below below GG= trimming = 19.5. 19.5. of the data sample below G = 19.5. overdensities!
Via Machinae: preliminary results The method successfully finds known streams!
Via Machinae: preliminary results The method also finds ~600 other stream candidates We are currently investigating these further to see if they could be real. Stay tuned!
Summary and Outlook • Advances in machine learning are opening up new and exciting avenues for model independent new physics searches at the LHC. • Ideas and methods inspired by LHC problems are being applied successfully to other fields such as astronomy and astrophysics. • The LHC Olympics 2020 provided a very useful testing ground for the development and common benchmarking of new approaches. • Much work remains to be done in order to port these ideas over to ATLAS and CMS and implement them as actual analyses on real data. • We need more ideas for model-independent searches at the LHC. This is just the beginning!
Current Organization of Physics Analysis Groups at the LHC B physics B2G / Exotics/ HDBS Exotica Search Statistics Groups forum ML forum SM Supporting organizations SUSY Measurement Top Higgs Groups Q: Why is there no model independent search group???
A vision for the future… Future Organization of Physics Analysis Groups at the LHC?? Unsupervised B2G / B physics Weakly Supervised Model HDBS Agnostic? (Semi) Supervised Statistics Search forum ML Groups forum Exotics/ SM Supporting Exotica organizations Measurement Top SUSY Higgs Groups from G. Kasieczka, B. Nachman, DS (eds), et al 2101.nextweek
Thanks for your attention!
9. A comparison of the four features x between the SR and two nearby sidebands defined by .1, 3.3] TeV (lower sideband) and mjj œ [3.7, 3.9] TeV (upper sideband). ANODE: Results on LHCO R&D Dataset mate the background. To study this sensitivity in a controlled fashion, correlations Ben Nachman & DS 2001.04990 oduced artificially. In practice, adding more features to x will inevitably result in pendence with mjj ; the artificial example here illustrates the challenges already in low Can also consider performance on a feature set which is not ons. New jet independent mass observables are created, which are linearly shifted: of m. We introduced artificial correlations just as proof of concept: mJ1,2 æ mJ1,2 + c mJJ , (5.1) = 0.1 for this study. The resulting shifted lighter jet mass is presented in Fig. 10. w ANODE and CWoLa models are trained using the shifted dataset and their perfor- s quantified in Fig. 11. As expected, the fully supervised classifier is nearly the same as ANODE is still able to significantly enhance the signal, with a maximum significance ment near 4. While in principle ANODE could achieve the same classification accuracy hifted and nominal datasets, the performance on the shifted examples is not as strong g. 7. In practice the interpolation of pbackground into the SR is more challenging now he linear correlations. This could possibly be overcome with improved training, better of hyperparameters, or more sophisticated density estimation techniques. construction, there are now bigger differences between the SR and SB than between background and the SRANODEsignal. Therefore, the CWoLa is robust while hunting classifier CWoLa completely fails! is not able to Figure 11. ROC (left) and SIC (right) curves in the signal region using the shifted dataset specified
Searching for NP with deep autoencoders Heimel et al 1808.08979; Farina, Nakai & DS 1808.08992 Latent layer Figure 1: The schematic diagram of an autoencoder. The input is mapped into a low(er) dimensional An autoencoder representation, maps in this case anand 6-dim, input into a reduced “latent representation” then decoded. and then attempts to reconstruct the original input from it. threshold. CanForuse reconstruction concreteness, error we will focus as work in this an anomaly threshold! on distinguishing “fat” QCD jets from other types of heavier, boosted resonances decaying to jets. Building on previous work on top tagging [12], we will concentrate on machine learning algorithms that take jet images as inputs. For signal, we will consider all-hadronic top jets, as well as 400 GeV See also: Hajergluinos et al “Novelty Detection decaying to 3Meets jets Collider via RPV.Physics” 1807.10261 Obviously, this is not meant to be an exhaustive Cerri study et al “Variational Autoencoders of all possible for New Physics backgrounds Mining atand and signals the Large Hadron methods Collider” but is just1811.10276 meant to be a
signal (tops and 400 background (QCD) GeV RPV gluinos) Performance It works as an anomaly detector! Robust against Figure contamination 2: Distribution with of reconstruction errorsignal —with computed cana CNN use autoencoder in fully unsupervised on test samples of mode QCD background (gray) and two signals: tops (blue) and 400 GeV gluinos (orange).
You can also read