AI and Optimization Challenges in Physical Sciences: A Snapshot and a Look Forward
Andrey Ustyuzhanin, NRU Higher School of Economics
November 8, 2020, USERN
Outline
▶ Quest for discovery
– Role of simulation
▶ Optimization problem outlook
▶ Optimization methods families
– Examples
– Local Generative Surrogate Optimization (L-GSO)
▶ Outlook & conclusion
Shameless plug
▶ Development and application of machine learning methods for solving tough scientific challenges
▶ Collaboration with the LHCb, SHiP, OPERA and CRAYFIS experiments
▶ Research project examples:
– Storage/speed optimization for LHCb triggers
– Particle identification algorithms
– Optimization of detector devices
– Fast and meaningful physical process simulation
▶ Co-organization of ML challenges: Flavours of Physics, TrackML
▶ Six summer schools on Machine Learning for High-Energy Physics
▶ Open for interns, graduate students and postdoc researchers!
Forward and inverse problems
▶ Forward: from given initial (and hidden) parameters, compute the observable state of the system
▶ Inverse: from the observable state, recover the hidden parameters
– No single solution is guaranteed
– No straightforward way to compute one
– But if the evolution of the system can be approximated by a differentiable surrogate, the inverse problem can profit from machine learning methods
– Systems for probabilistic programming: Stan, PyMC3, Pyro, TensorFlow Probability (formerly Edward), pyprob
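The forward/inverse distinction can be made concrete with a toy sketch. The exponential-decay model and the brute-force grid search below are illustrative assumptions, not any of the probabilistic-programming systems listed above:

```python
import numpy as np

# Hypothetical forward model: exponential decay with a hidden rate k.
def forward(k, t):
    """Forward problem: hidden parameter k -> observable state y(t)."""
    return np.exp(-k * t)

t = np.linspace(0.0, 5.0, 50)
k_true = 0.7
y_obs = forward(k_true, t)          # noiseless observation, for illustration

# Inverse problem: recover k from observations. There is no closed-form
# inverse here, so the simplest route is a brute-force search over candidates.
candidates = np.linspace(0.01, 2.0, 1000)
losses = [np.sum((forward(k, t) - y_obs) ** 2) for k in candidates]
k_hat = candidates[int(np.argmin(losses))]
print(round(k_hat, 2))  # recovers a value close to k_true = 0.7
```

Probabilistic-programming systems replace the grid search with posterior inference, which also yields uncertainties on the recovered parameters.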
Optimization challenges – I (selection)
▶ Optimize a selection, given data and a hypothesis, to maximize sensitivity
– Triggers, particle or jet identification, etc.
▶ Optimize a selection, given data and a hypothesis, to maximize sensitivity and minimize model-induced bias
▶ Optimize a selection, given data and a null hypothesis, to maximize the yield of unexpected (unexplainable) samples
Optimization challenges – II (simulation)
▶ Optimize simulation parameters, given data and a hypothesis, to minimize the difference
– What is the difference?
▶ Optimize a simulation surrogate, given data and a hypothesis, to minimize the difference while maximizing speed
▶ Make simulation invertible: from observed data, allow reconstruction of the original (input, hidden) parameters and their uncertainties
Optimization challenges – III+
▶ Optimize the detector (and the LHC as well?), given a budget, physics laws and new-physics expectations, to maximize signal yield
– Can it be solved recursively?
– Implies solving challenges I and II
▶ Optimize the set of physics laws, given the knowledge collected so far and an agent's cognitive capabilities, to minimize the complexity of the laws for the agent
Optimization methods families
Optimization methods
▶ Gradient-based
– Stochastic gradient descent, ADAM, RMSProp, …
▶ Gradient-free
– Simulated annealing (https://bit.ly/3eAqkws)
– Evolution strategies
– Random search
– Variational optimization, …
▶ Surrogate-based
– Bayesian
– DONE
– NN-based (L-GSO)
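The first two families can be contrasted on a toy quadratic. This is an illustrative sketch (the objective, step sizes, and iteration counts are assumptions): gradient descent exploits analytic gradients, while random search needs only function evaluations.

```python
import numpy as np

rng = np.random.default_rng(0)
f = lambda x: np.sum((x - 3.0) ** 2)   # toy objective, minimum at x = [3, 3]
grad = lambda x: 2.0 * (x - 3.0)       # analytic gradient, when available

# Gradient-based: plain gradient descent.
x = np.zeros(2)
for _ in range(200):
    x -= 0.1 * grad(x)

# Gradient-free: hill-climbing random search, keeps only improving moves.
best, best_val = np.zeros(2), f(np.zeros(2))
for _ in range(2000):
    cand = best + rng.normal(scale=0.5, size=2)
    if f(cand) < best_val:
        best, best_val = cand, f(cand)

print(np.round(x, 2), np.round(best, 2))  # both land near [3, 3]
```

Surrogate-based methods (next slides) sit between these extremes: they spend evaluations building a cheap model of f and optimize that model instead.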
Gradient-based families
Bayesian optimization (BO)
Conditions:
▌ f is a black box for which no closed form is known (nor its gradients);
▌ f is expensive to evaluate;
▌ evaluations of y = f(x) may be noisy.
BO use cases:
• Active learning
• Surrogate inference
• Bayesian computations
Bayesian optimization cycle
1. Assume f(x) can be approximated by a generative model g(Θ, x), with some prior for Θ.
2. Estimate the posterior probability of g(Θ, x).
3. Introduce an acquisition function α(x) that depends on the posterior and captures the probability of finding the maximum of f(x).
4. Measure f at argmax(α(x)) and update the conditional probability for Θ with the new observation.
Illustration: https://distill.pub/2020/bayesian-optimization/
Bayesian optimization variations
▶ Generative model classes
– Gaussian process regression
– Random forest regression
– GBDT regression
– NN regression
▶ Parameters for the class
▶ Acquisition function
– Probability of improvement
– Confidence bounds
– …
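The cycle above can be sketched in a few lines of NumPy. The RBF kernel, the upper-confidence-bound acquisition, and the 1-D toy objective are illustrative choices of mine, not a specific library's API:

```python
import numpy as np

f = lambda x: -(x - 0.6) ** 2            # black box to maximize on [0, 1]

def rbf(a, b, ls=0.15):
    """Squared-exponential kernel between two 1-D point sets."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls**2)

X = np.array([0.1, 0.9])                  # initial evaluations
y = f(X)
grid = np.linspace(0.0, 1.0, 201)

for _ in range(10):
    K = rbf(X, X) + 1e-6 * np.eye(len(X))           # GP posterior on the grid
    Ks = rbf(grid, X)
    mu = Ks @ np.linalg.solve(K, y)
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks.T).T, axis=1)
    ucb = mu + 2.0 * np.sqrt(np.clip(var, 0.0, None))  # acquisition
    x_next = grid[int(np.argmax(ucb))]    # measure f at argmax of acquisition
    X = np.append(X, x_next)
    y = np.append(y, f(x_next))

print(round(X[int(np.argmax(y))], 2))  # close to the true optimum at 0.6
```

Swapping the GP for a random forest or NN regressor, or the UCB for probability of improvement, gives the variations listed above.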
Optimization with Local Generative Surrogates (L-GSO)
TL;DR: We approximate a stochastic black box with a local generative surrogate. This allows us to compute gradients of the objective w.r.t. the parameters of the black box.
From intractable gradient estimation of the black box, to gradient estimation with a learnable generative surrogate (GAN, normalizing flow, etc.), and on to successive gradient-based optimization of the parameters.
Key point: training the local generative surrogate
▶ The surrogate is trained only inside a local area around the current point on the optimization path.
✓ Gradients of the non-linear surface are well estimated inside the local area.
(Figure: optimization path, with true gradients vs. surrogate gradients.)
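The local-surrogate idea can be sketched in simplified form. Here a local linear regression stands in for the paper's generative model (GAN / normalizing flow), and the noisy quadratic stands in for the stochastic simulator; both are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)

def blackbox(psi):
    """Stochastic simulator stand-in: noisy quadratic, minimum at [1, -2]."""
    return np.sum((psi - np.array([1.0, -2.0])) ** 2) + 0.01 * rng.normal()

def surrogate_grad(psi, eps=0.2, n=64):
    """Sample the black box inside a local area around psi, fit a cheap
    surrogate y ~ w.x + b there, and differentiate the surrogate: grad ~ w."""
    X = psi + rng.uniform(-eps, eps, size=(n, 2))
    y = np.array([blackbox(x) for x in X])
    A = np.hstack([X, np.ones((n, 1))])
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w[:2]

# Successive gradient-based optimization through the surrogate.
psi = np.zeros(2)
for _ in range(100):
    psi -= 0.1 * surrogate_grad(psi)
print(np.round(psi, 1))  # approaches the true minimum [1, -2]
```

The key point from the slide shows up here too: the surrogate is only trusted inside the sampling window `eps`, and is refitted as the optimization path moves.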
Results on high-dimensional problems with a low-dimensional manifold
▌ Nonlinear Three-Hump problem, 40 dimensions; neural network weight optimization, 91 dimensions.
▌ L-GSO outperforms all algorithms in the high-dimensional setting when the parameters lie on a lower-dimensional manifold.
Example 1: SHiP detector shield optimization
Design optimization in the 42-dimensional parameter space of a physics simulator
▌ L-GSO improves on previous results obtained with BO, at the same computational budget.
▌ The new design is 25% more efficient.
NeurIPS 2020 paper: https://arxiv.org/abs/2002.04632
Example 2: Molecular dynamics, inverse simulation
Image: https://evolution.skf.com/us/bearing-research-going-to-the-atomic-scale/
The molecular dynamics (MD) method
Image source: http://atomsinmotion.com/book/chapter5/md
The big problem in MD: the interaction potential
Image source: http://atomsinmotion.com/book/chapter5/md
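The classic closed-form example of such a potential is Lennard-Jones; ML approaches aim to learn potential functions like this from data instead of fixing the functional form. A minimal sketch (reduced units, illustrative only):

```python
import numpy as np

def lj_potential(r, epsilon=1.0, sigma=1.0):
    """Lennard-Jones pair potential V(r) = 4*eps*((sigma/r)^12 - (sigma/r)^6)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6**2 - sr6)

def lj_force(r, epsilon=1.0, sigma=1.0):
    """Radial force F(r) = -dV/dr = 24*eps*(2*(sigma/r)^12 - (sigma/r)^6)/r."""
    sr6 = (sigma / r) ** 6
    return 24.0 * epsilon * (2.0 * sr6**2 - sr6) / r

# Equilibrium separation: dV/dr = 0 at r = 2^(1/6) * sigma.
r_min = 2.0 ** (1.0 / 6.0)
print(round(lj_potential(r_min), 3))  # -1.0, the well depth -epsilon
print(round(lj_force(r_min), 3))      # 0.0, zero force at the minimum
```

An MD integrator repeatedly evaluates such forces for every pair of atoms, which is why the choice (or learning) of the potential dominates simulation accuracy.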
Interaction potential: experiments & quantum simulations
Images: https://mlz-garching.de/englisch/instruments-und-labs/user-labs/materials-science-lab.html; Forbes
Molecular dynamics with machine learning
Task: develop algorithms that can infer potential functions from data and quantum simulations.
Data used: open-source simulation (https://www.sciencedirect.com/science/article/pii/S0009250915000779).
Metrics: simulation accuracy.
See also: Cheng, Bingqing, et al. "Evidence for supercritical behaviour of high-pressure liquid hydrogen." Nature 585.7824 (2020): 217–220.
Conclusion
▶ Optimization challenges
– Design optimization
– Forward simulation speed-up
– Simulation fine-tuning
– Inverse simulation, simulation-based inference
▶ Physical sciences rely heavily on simulation tools, and those tools are no longer black boxes:
– Improved control over hidden/latent parameters with respect to the output
– Generative models
– Probabilistic programming and automatic differentiation
▶ Simulation-based inference with surrogate modelling is a promising new research field for direct and inverse optimization problems
▶ Interested in playing with those problems? (see next slide)
Thank you for your attention!
austyuzhanin@hse.ru · anaderiRu · hse_lambda