The Python Software Environment in KM3NeT
Johannes Schumann (Speaker), Tamas Gal on behalf of the KM3NeT Collaboration
PyHEP Conference 2021-07-08

KM3NeT
● Water Cherenkov detector infrastructure in the Mediterranean Sea (at more than 2 km depth)
● ORCA: Oscillation Research with Cosmics in the Abyss
– dense instrumentation for few-GeV atmospheric ν
– determine the neutrino mass hierarchy
– effective target volume ~6 Mm³
● ARCA: Astroparticle Research with Cosmics in the Abyss
– sparse instrumentation for TeV-PeV cosmic ν
– discover high-energy astrophysical neutrino sources
[Figure credit: [1] KM3NeT LoI]
● Building, operating and scientifically exploiting the detector is a challenging task
– Large (uneven) datasets need to be processed (on different levels)
– Challenge: minimize obstacles to accessing data from multiple languages and platforms
PyHEP 21 - 2021-07-05 – Johannes Schumann

Detector Setup
[Figure: artist’s impression of KM3NeT/ORCA. Credit: [1] KM3NeT LoI]
[Figure: neutrino sources: atmospheric, active galactic nuclei, supernova [2] [3] [4]; [5] P2O LoI]
● Digital Optical Module: 31 x 3” PMTs (figure labels: 12 PMTs, 19 PMTs, 17”)

Computing and Python Environment
[Diagram labels: Shore Station; Computing Center with C++ / tier-based processing chain; Data Writeout (.root) and km3io; DB and km3db; Microservices and km3services; Online Monitoring and km3mon]
● Further packages: km3pipe, km3cuts, km3flux, km3astro, ...

C++ & PyROOT in KM3NeT
● The main KM3NeT codes (trigger, calibration, reconstruction) are written in C++
● ROOT6/PyROOT/cppyy allows the C++ codebase to be used from Python
● For analysis & development: the offline data format consists of ROOT classes designed with Python usage in mind from the start, e.g. printing
● Offline framework that supports both C++ and Python user code
– provides e.g. user-friendly event-file reading
– C++ out of the box, with some pythonizations
– freedom to choose where to use Python and where to use C++
● ‘Low-level’ C++ with ‘high-level’ Python scripting is commonly used, e.g. for summary file creation, astronomy searches and cascade reconstruction

km3io
● km3io provides an uproot/awkward front-end that gives access to KM3NeT data without (Py)ROOT
● Main dev: Tamás Gál
● Source: https://github.com/KM3NeT/km3io
● Standard KM3NeT file formats are ROOT-based with custom classes
– “Online”: detector DAQ write-out format
– “Offline”: format for MC simulation and event reconstruction data
● Data files could previously only be read via the PyROOT bindings
– but a ROOT installation is needed
– PyROOT is slow and requires more memory compared to uproot
● Optimised iterator behaviour to allow combined lazy reading of multiple branches
● The number of particles and photosensor readings (hits) differs from event to event, which leads to an uneven data structure → perfect match with awkward arrays

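The uneven structure mentioned in the last bullet can be pictured as one flat buffer plus per-event offsets, which is the general idea behind jagged arrays. The following is a minimal pure-Python sketch of that layout, not code from km3io, uproot or awkward; the helper names `to_jagged` and `event_slice` and the hit values are made up for illustration.

```python
from itertools import accumulate


def to_jagged(events):
    """Flatten variable-length event lists into (content, offsets)."""
    content = [value for event in events for value in event]
    # offsets[i]:offsets[i + 1] delimits the values belonging to event i
    offsets = [0] + list(accumulate(len(event) for event in events))
    return content, offsets


def event_slice(content, offsets, i):
    """Return the values of event i without touching the other events."""
    return content[offsets[i]:offsets[i + 1]]


# Illustrative channel IDs for three events with 3, 0 and 4 hits
hits_per_event = [[3, 17, 5], [], [8, 8, 21, 0]]
content, offsets = to_jagged(hits_per_event)

print(offsets)                            # [0, 3, 3, 7]
print(event_slice(content, offsets, 2))   # [8, 8, 21, 0]
```

Because all values live in one contiguous buffer, a whole column (for example every hit time in a file) can be processed at once, while the offsets still recover the per-event grouping.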
km3io
uproot perspective
    import uproot
    import km3io
    from km3net_testdata import data_path

    data_fname = data_path("offline/km3net_offline.root")
    f = uproot.open(data_fname)
    f["E"].show()

    name                 | typename                 | interpretation
    ---------------------+--------------------------+-------------------------------
    Evt                  | Evt                      | AsGroup(u4')
    Evt/AAObject/usr_…   | vector                   | AsGroup(
    Evt/det_id           | int32_t                  | AsDtype('>i4')
    Evt/mc_id            | int32_t                  | AsDtype('>i4')
    ...
    Evt/mc_trks/mc_tr…   | std::vector*             | AsObjects(AsArray(True, Fal…
    Evt/mc_trks/mc_tr…   | int32_t[]                | AsJagged(AsDtype('>i4'))
    Evt/mc_trks/mc_tr…   | int32_t[]                | AsJagged(AsDtype('>i4'))
    Evt/mc_trks/mc_tr…   | std::vector*             | AsObjects(AsArray(True, Fal…
    Evt/mc_trks/mc_tr…   | std::vector*             | AsObjects(AsArray(True, Fal…
    Evt/mc_trks/mc_tr…   | std::vector*             | AsObjects(AsArray(True, Fal…
    Evt/mc_trks/mc_tr…   | std::string*             | AsObjects(AsArray(True, Fal…
    Evt/comment          | TString                  | AsStrings()
    Evt/index            | int32_t                  | AsDtype('>i4')
    Evt/flags            | int32_t                  | AsDtype('>i4')
    (table output truncated in the source)

    f["E/Evt/hits"].keys()

    ['hits.id', 'hits.dom_id', 'hits.channel_id', 'hits.tdc', 'hits.tot',
     'hits.trig', 'hits.pmt_id', 'hits.t', 'hits.a', 'hits.pos.x',
     'hits.pos.y', 'hits.pos.z', 'hits.dir.x', 'hits.dir.y', 'hits.dir.z',
     'hits.pure_t', 'hits.pure_a', 'hits.type', ...

● Hit data is stored in a general-purpose class for DAQ and simulated hits
– amplitude, pure amplitude, pure time, etc. are MC data values

km3io
km3io perspective
    r = km3io.OfflineReader(data_fname)
    print(r.events.keys())

    {'n_mc_tracks', 'det_id', 'n_hits', 'mc_run_id', 'w2list', 'trigger_mask',
     'w', 'flags', 'mc_id', 't_sec', 'tracks', 'mc_tracks', 'trigger_counter',
     'frame_index', 'mc_hits', 'index', 'trks', 't_ns', 'mc_trks', 'w3list',
     'comment', 'run_id', 'n_trks', 'n_mc_trks', 'id', 'usr_names', 'n_tracks',
     'overlays', 'n_mc_hits', 'hits', 'mc_t'}

    r.events.hits.fields

    ['id', 'channel_id', 'dom_id', 't', 'tdc', 'pos_x', 'pos_y', 'pos_z',
     'dir_x', 'dir_y', 'dir_z', 'tot', 'trig']

● Only relevant fields are accessible → compare a, pure_a, pure_t, ...

    evts = r.events[:3]
    evts.hits.channel_id[0,:5]

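Comparing the two listings, the km3io field names look like the ROOT branch keys with the `hits.` prefix removed and dots replaced by underscores, restricted to a subset of fields. The sketch below illustrates that kind of mapping in plain Python. It is an assumption about the naming pattern only, not km3io's actual implementation; `friendly_name`, `select_fields` and `RELEVANT_HIT_FIELDS` are hypothetical names.

```python
# Field names as listed by r.events.hits.fields on the slide
RELEVANT_HIT_FIELDS = (
    "id", "channel_id", "dom_id", "t", "tdc",
    "pos_x", "pos_y", "pos_z", "dir_x", "dir_y", "dir_z", "tot", "trig",
)


def friendly_name(branch_key, prefix="hits."):
    """Turn a branch key such as 'hits.pos.x' into 'pos_x'."""
    if branch_key.startswith(prefix):
        branch_key = branch_key[len(prefix):]
    return branch_key.replace(".", "_")


def select_fields(branch_keys, relevant=RELEVANT_HIT_FIELDS):
    """Map friendly names to branch keys, dropping fields not marked relevant."""
    mapping = {}
    for key in branch_keys:
        name = friendly_name(key)
        if name in relevant:
            mapping[name] = key
    return mapping


# A few of the raw keys shown in the uproot listing
raw_keys = ["hits.id", "hits.dom_id", "hits.pos.x", "hits.pure_t", "hits.a"]
mapping = select_fields(raw_keys)
print(mapping)
```

In this toy run the MC-only entries `hits.pure_t` and `hits.a` are filtered out, mirroring the comparison made on the slide.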
km3pipe
● Multi-purpose framework based on the thepipe project
● Main devs: Tamás Gál, Johannes Schumann
● Source: https://github.com/KM3NeT/km3pipe
● Focus on pipeline workflow
● Interoperability functions for all relevant detector interfaces (also by utilising other km3py packages, e.g. km3io)
– Detector data (ROOT / ASCII / custom binary formats)
– DAQ network interface
– Database
● HDF5 output → conversion between different file formats
● Benchmark tools: timers & performance statistics
● High performance computing → create and submit scripts to TORQUE
● Provenance tracking

km3pipe
● Set up a simple pipeline:
    pipe = km3pipe.Pipeline()
    pipe.attach(km3pipe.io.online.EventPump, filename=data_fname)
    pipe.attach(km3modules.common.StatusBar, every=25)
    pipe.attach(km3pipe.calib.Calibration, filename=calib_fname)
    pipe.attach(EventHits)
    pipe.attach(EventHitsStatistic)
    pipe.attach(km3pipe.io.hdf5.HDF5Sink, filename="output.h5")

    pipe.drain()

    Pipeline and module initialisation took 0.851s (CPU 0.498s).
    Number of Hits: 96
    Number of Hits: 124
    Number of Hits: 78
    ================================[ . ]================================
    Mean number of hits: 99.33333333333333
    2021-06-21 22:07:53 ++ km3pipe.io.hdf5.HDF5Sink.HDF5Sink: HDF5 file written to: output.h5
    ============================================================
    3 cycles drained in 0.953132s (CPU 0.602786s). Memory peak: 247.59 MB
      wall  mean: 0.029388s  medi: 0.019961s  min: 0.016951s  max: 0.051252s  std: 0.015509s
      CPU   mean: 0.030253s  medi: 0.020248s  min: 0.017112s  max: 0.053400s  std: …

km3pipe
● Custom modules:
    import numpy as np
    import km3pipe

    # A plain function can serve as a module: it receives and returns the blob
    def EventHits(blob):
        hits = blob["Hits"]
        print("Number of Hits: {}".format(len(hits)))
        return blob

    # A class-based module can keep state across cycles
    class EventHitsStatistic(km3pipe.Module):
        def configure(self):
            self._hit_numbers = []

        def process(self, blob):
            hits = blob["Hits"]
            no_of_hits = len(hits)
            self._hit_numbers.append(no_of_hits)
            return blob

        def finish(self):
            mean_no_hits = np.mean(self._hit_numbers)
            print("Mean number of hits: {}".format(mean_no_hits))

km3pipe
● Run the pipeline:
    pipe = km3pipe.Pipeline()
    pipe.attach(km3pipe.io.online.EventPump, filename=data_fname)
    pipe.attach(km3modules.common.StatusBar, every=25)
    pipe.attach(km3pipe.calib.Calibration, filename=calib_fname)
    pipe.attach(EventHits)
    pipe.attach(EventHitsStatistic)
    pipe.attach(km3pipe.io.hdf5.HDF5Sink, filename="output.h5")

    ++ Detector: Parsing the DETX header
    ++ Detector: Reading PMT information…
    ++ Detector: Done.

    pipe.drain()

    Pipeline and module initialisation took 2.011s (CPU 1.987s).
    Number of Hits: 96
    Number of Hits: 124
    Number of Hits: 78
    ================================[ . ]================================
    Mean number of hits: 99.33333333333333
    2021-06-22 15:39:30 ++ km3pipe.io.hdf5.HDF5Sink.HDF5Sink: HDF5 file written to: output.h5
    ============================================================
    3 cycles drained in 2.861304s (CPU 2.829904s). Memory peak: 241.05 MB
      wall  mean: 0.278396s  medi: 0.018987s  min: 0.018449s  max: 0.797751s  std: 0.367240s
      CPU   mean: 0.275658s  medi: 0.019116s  min: 0.018727s  max: 0.789131s  std: 0.363080s

    Blob([('EventPump', None), ('StatusBar', None), ('Calibration', None), ('EventHitsStatistic', None), ('HDF5Sink', None)])

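The attach/drain pattern used on these slides can be illustrated with a toy pipeline written in plain Python. This is a sketch of the general idea only (modules with `configure`, `process` and `finish` hooks passing a blob along), not the km3pipe or thepipe implementation. The `ListPump`, `tag_event` and `MeanHits` names are hypothetical, and ending the drain on `StopIteration` is an assumption of this sketch. The toy events are sized to match the hit counts printed above.

```python
class Module:
    """Base class: configure() once, process() per cycle, finish() at the end."""

    def __init__(self, **parameters):
        self.parameters = parameters
        self.configure()

    def configure(self):
        pass

    def process(self, blob):
        return blob

    def finish(self):
        return None


class Pipeline:
    def __init__(self):
        self.modules = []

    def attach(self, module, **parameters):
        # Classes are instantiated with their parameters; functions are used as-is
        if isinstance(module, type):
            module = module(**parameters)
        self.modules.append(module)

    def drain(self):
        """Run cycles until a module raises StopIteration, then call finish()."""
        cycles = 0
        try:
            while True:  # needs a pump that eventually raises StopIteration
                blob = {}
                for module in self.modules:
                    step = module.process if isinstance(module, Module) else module
                    blob = step(blob)
                cycles += 1
        except StopIteration:
            pass  # the pump ran out of data
        results = {"cycles": cycles}
        for module in self.modules:
            if isinstance(module, Module):
                results[type(module).__name__] = module.finish()
        return results


class ListPump(Module):
    """Feeds one pre-loaded event per cycle into the blob."""

    def configure(self):
        self._events = iter(self.parameters["events"])

    def process(self, blob):
        blob["Hits"] = next(self._events)  # StopIteration ends the drain
        return blob


def tag_event(blob):
    blob["n_hits"] = len(blob["Hits"])
    return blob


class MeanHits(Module):
    def configure(self):
        self._counts = []

    def process(self, blob):
        self._counts.append(blob["n_hits"])
        return blob

    def finish(self):
        return sum(self._counts) / len(self._counts)


pipe = Pipeline()
pipe.attach(ListPump, events=[[0] * 96, [0] * 124, [0] * 78])
pipe.attach(tag_event)
pipe.attach(MeanHits)
results = pipe.drain()
print(results["cycles"], results["MeanHits"])
```

Keeping every step behind the same small interface is what makes it possible to mix ready-made modules (pumps, calibration, sinks) with a few lines of user code.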
km3pipe
● Data provenance information:
    print(km3pipe.Provenance().as_json(indent=2))

    [
      {
        "uuid": "d6b77a12-979b-4ff9-9263-9af00482e5b0",
        "name": "pipeline",
        "parent_activity": "5ef0be78-85d8-4efd-8605-38101689b0ff",
        "child_activities": [],
        "start": {
          "time_utc": "2021-06-22T13:39:27.878522+00:00",
          "peak_memory": 217.5
        },
        "stop": {
          "time_utc": "2021-06-22T13:39:30.752623+00:00",
          "peak_memory": 241.046875
        },
        "system": {
          "thepipe_version": "1.3.5",
          "executable": "/home/johannes/.pyenv/versions/3.9.5/bin/python",
          "arguments": [
            "/home/johannes/.pyenv/versions/3.9.5/lib/python3.9/site-packages/ipykernel_launcher.py",
            "-f",
            "/home/johannes/.local/share/jupyter/runtime/kernel-c70b0c7e-64ce-4718-b…

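A record of this shape can be assembled with the standard library alone. The sketch below shows one possible way to capture a UUID, UTC start and stop times and basic system information for an activity. It is not how km3pipe or thepipe collect provenance; `ProvenanceLog` and its methods are hypothetical names, and fields such as peak memory and parent or child activities are left out.

```python
import json
import sys
import uuid
from contextlib import contextmanager
from datetime import datetime, timezone


def _utc_now():
    """Current time as an ISO 8601 string in UTC."""
    return datetime.now(timezone.utc).isoformat()


class ProvenanceLog:
    """Collects one record per activity (hypothetical helper)."""

    def __init__(self):
        self.activities = []

    @contextmanager
    def activity(self, name):
        record = {
            "uuid": str(uuid.uuid4()),
            "name": name,
            "start": {"time_utc": _utc_now()},
            "stop": {},
            "system": {
                "executable": sys.executable,
                "arguments": list(sys.argv),
            },
        }
        self.activities.append(record)
        try:
            yield record
        finally:
            # The stop time is recorded even if the activity raises
            record["stop"]["time_utc"] = _utc_now()

    def as_json(self, indent=2):
        return json.dumps(self.activities, indent=indent)


log = ProvenanceLog()
with log.activity("pipeline"):
    total = sum(range(1000))  # stand-in for the actual processing

print(log.as_json())
```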
km3buu
● Python-based wrapper for the GiBUU neutrino generator [6]
● Main dev: Johannes Schumann
● GiBUU overview:
– Monolithic application in FORTRAN90
– Factorized νN interaction model:
  ● Primary interaction: Relativistic Fermi Gas with SUSA potential
  ● Final State Interactions: propagation of phase space densities using the Boltzmann-Uehling-Uhlenbeck equation
– Binary output in ROOT file format → parsed using uproot
● KM3BUU runs GiBUU inside a container distributed via the KM3NeT docker server
● Write-out to the KM3NeT data format is optional → requires PyROOT

km3buu
● Set up the simulation configuration (jobcard) and run it:
[Code screenshot not included in the transcription]

km3buu
● KM3NeT data format write-out:
[Code screenshot not included in the transcription]

km3services
● Microservices API, e.g. for calculating oscillation probabilities
● Can be run on a server or locally
[Diagram labels: Python program; km3services call with Numpy arrays; REST API; JSON; Docker container with REST API, Numpy arrays and a library, e.g. OscProb]
    import numpy as np
    import matplotlib.pyplot as plt

    from km3services.oscprob import OscProb
    oscprob = OscProb()

    n = 1000
    energies = np.logspace(-2, 1, n)
    cos_zenith = 0
    nue_pdgid = 12
    numu_pdgid = 14
    nutau_pdgid = 16

    prob_ee = oscprob.oscillationprobabilities(nue_pdgid, nue_pdgid, energies, cos_zenith)
    prob_em = oscprob.oscillationprobabilities(nue_pdgid, numu_pdgid, energies, cos_zenith)
    prob_et = oscprob.oscillationprobabilities(nue_pdgid, nutau_pdgid, energies, cos_zenith)

    plt.plot(energies, prob_ee)
    plt.plot(energies, prob_em)
    plt.plot(energies, prob_et)
    plt.grid()
    plt.xscale("log")

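For orientation on what such a service returns, the textbook two-flavour vacuum approximation can be evaluated locally in a few lines. This is a deliberately simplified stand-in, not the calculation performed by OscProb behind km3services, which is not reproduced here. The function name `two_flavour_appearance`, the 1000 km baseline and the mixing parameters are illustrative assumptions.

```python
import math


def two_flavour_appearance(energy_gev, baseline_km, sin2_2theta, dm2_ev2):
    """P(a -> b) = sin^2(2 theta) * sin^2(1.267 * dm^2 [eV^2] * L [km] / E [GeV])."""
    phase = 1.267 * dm2_ev2 * baseline_km / energy_gev
    return sin2_2theta * math.sin(phase) ** 2


# Log-spaced energies between 10^-2 and 10^1 GeV, as in the notebook above
n = 1000
energies = [10 ** (-2 + 3 * i / (n - 1)) for i in range(n)]

# Illustrative parameters (assumptions of this sketch, not fitted values)
baseline_km = 1000.0
sin2_2theta = 1.0
dm2_ev2 = 2.5e-3

appearance = [
    two_flavour_appearance(e, baseline_km, sin2_2theta, dm2_ev2) for e in energies
]
survival = [1.0 - p for p in appearance]  # two flavours: probabilities sum to 1

print(min(appearance), max(appearance))
```

The rapid oscillations at low energy and the flattening at high energy seen in such curves follow directly from the L/E dependence of the phase.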
Additional km3py packages
● km3db: Interface to the KM3NeT Oracle database, which stores information about detector hardware, calibration, monitoring data, Q&A results etc.
● km3flux: Parsing of flux tables with interpolation functionality
● km3astro: Extension for astropy for celestial coordinate transformations
● km3net-testdata: Collection of all kinds of data formats which can be utilised for unit testing
● rainbowalga: GUI for animating the events and the light distribution in the detector

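As an illustration of what "flux tables with interpolation" can mean in practice, the sketch below interpolates a tabulated flux linearly in log-log space, a common choice because fluxes tend to follow power laws. It does not use or reproduce the km3flux API; `interpolate_flux` is a hypothetical helper and the table is a toy E^-2 spectrum, not real flux data.

```python
import bisect
import math


def interpolate_flux(energies, fluxes, energy):
    """Interpolate a tabulated flux linearly in log(E) vs. log(flux).

    `energies` must be sorted in ascending order and all values positive.
    """
    if not energies[0] <= energy <= energies[-1]:
        raise ValueError("energy outside the tabulated range")
    # Index of the table bin containing `energy` (last bin includes its upper edge)
    i = min(bisect.bisect_right(energies, energy) - 1, len(energies) - 2)
    x0, x1 = math.log(energies[i]), math.log(energies[i + 1])
    y0, y1 = math.log(fluxes[i]), math.log(fluxes[i + 1])
    fraction = (math.log(energy) - x0) / (x1 - x0)
    return math.exp(y0 + fraction * (y1 - y0))


# Toy table following flux = E^-2 (arbitrary units), for illustration only
table_energies = [1.0, 10.0, 100.0]
table_fluxes = [1.0, 1e-2, 1e-4]

flux_mid = interpolate_flux(table_energies, table_fluxes, math.sqrt(10.0))
print(flux_mid)
```

For a pure power law the log-log interpolation is exact up to rounding, so the value at E = sqrt(10) comes out close to 0.1 in this toy table.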
Summary
● The km3py collection provides high compatibility with all interfaces of the detector, plus specific functionality
● Focus on data monitoring, pipelines and provenance
● Widely used tools and scripts in KM3NeT are based on the km3py environment
● Most of the packages are open source and distributed via PyPI
● Additionally, a Julia environment for KM3NeT is in the making
– Existing framework to process KM3NeT data files (ROOT, HDF5, binary, etc.) natively in Julia: NeRCA.jl
– Real-time event reconstruction for high-level detector monitoring
– Some general HEP Julia projects which originate from KM3NeT members: UnROOT.jl, Corpuscles.jl and Neurthino.jl

Thank you for your attention!
References
[1] S. Adrián-Martínez et al., ‘Letter of intent for KM3NeT 2.0’, J. Phys. G: Nucl. Part. Phys., vol. 43, no. 8, p. 084001, Jun. 2016, doi: 10.1088/0954-3899/43/8/084001.
[2] A. López-Oramas, ‘Multi-year Campaign of the Gamma-Ray Binary LS I +61° 303 and Search for VHE Emission from Gamma-Ray Binary Candidates with the MAGIC Telescopes’, 2015, doi: 10.13140/RG.2.1.4140.4969.
[3] U. F. Katz and Ch. Spiering, ‘High-energy neutrino astrophysics: Status and perspectives’, Progress in Particle and Nuclear Physics, vol. 67, no. 3, pp. 651–704, Jul. 2012, doi: 10.1016/j.ppnp.2011.12.001.
[4] https://what-if.xkcd.com/73/
[5] A. V. Akindinov et al., ‘Letter of interest for a neutrino beam from Protvino to KM3NeT/ORCA’, Eur. Phys. J. C, vol. 79, no. 9, p. 758, Sep. 2019, doi: 10.1140/epjc/s10052-019-7259-5.
[6] O. Buss et al., ‘Transport-theoretical description of nuclear reactions’, Physics Reports, vol. 512, no. 1, pp. 1–124, Mar. 2012, doi: 10.1016/j.physrep.2011.12.001.