FriDAQ Infrastructure
Benjamin Moritz Veit
8 February 2021
Selection of Server Hardware

The use of the AMD EPYC architecture in other DAQ systems (e.g. LHCb) led us to look into this architecture.
- The AMD EPYC MCM architecture is considerably different from Intel's single-die configuration → advantages in the NUMA configuration.
- Higher I/O capability (PCIe 4.0) and more lanes per CPU.
- Better price/performance figure than Intel.

AMD EPYC 7002 architecture:

Chip        Cores/Threads  Max freq.  TDP    Cache   Cost (Euro)  Cost/Core (Euro)
EPYC 7282   16/32          2.8 GHz    120 W  64 MB    650         40
EPYC 7402P  24/48          2.8 GHz    180 W  128 MB  1300         54
EPYC 7542   32/64          2.9 GHz    225 W  128 MB  2660         83
EPYC 7262   8/16           3.2 GHz    155 W  128 MB   650         81
EPYC 7232P  8/16           3.1 GHz    120 W  32 MB    460         58

→ The EPYC 7282 has the best price/performance tag!
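As a cross-check of the cost/core column, a minimal Python sketch recomputing it from the quoted prices and core counts:

```python
# Cross-check of the cost-per-core column; prices (Euro) and core
# counts are the ones quoted in the table above.
chips = {
    "EPYC 7282":  (16,  650),
    "EPYC 7402P": (24, 1300),
    "EPYC 7542":  (32, 2660),
    "EPYC 7262":  (8,   650),
    "EPYC 7232P": (8,   460),
}

for name, (cores, cost) in chips.items():
    print(f"{name:11s} {cost / cores:5.1f} Euro/core")

# Cheapest per core wins: the EPYC 7282 at ~40.6 Euro/core.
best = min(chips, key=lambda n: chips[n][1] / chips[n][0])
print("Best price/performance:", best)
```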
Supermicro 1014S-WTRT
- Single AMD EPYC 7002 processor
- 8x DIMMs (ECC DDR4-3200 MHz)
- 2x PCIe 4.0 x16 (FHFL) slots
- 1x PCIe 4.0 x16 (LP) slot
- 4x hot-swap 3.5" SATA3 drive bays
- 2x M.2 (PCIe/SATA3 NVMe)
- 2x 10GBase-T (Broadcom BCM57416)
- Integrated IPMI 2.0 + KVM (dedicated port)
AMBER Server V1
- Single AMD EPYC 7282 processor (16C/32T)
- 64 GB (ECC DDR4-3200 MHz)
- 512 GB NVMe SSD
- 4x 3.5" HDD bays
- Expansion cards:
  - Nvidia ConnectX-4 Lx 2x 25 Gbit (default)
  - LSI MegaRAID SAS 9380-8e (optional, +750 Euro)
  - Spillbuffer card (optional)

Universal system for all computation nodes in the AMBER DAQ. Allows up to 3x PCIe expansion cards and 4x 3.5" HDDs. Cost point ≈ 2200 Euro (base system).
Local Storage

[Diagram: data path of the local storage - JBOD disk chassis (384 TB each) connect via 4x MiniSAS3 (12 Gbit/s) to the ReadOut Engines; the ReadOut Engines connect via 25 Gbit/s SFP+ DAC to a QFX5120-48Y 25 Gbit switch, which in turn connects via 25 Gbit/s SFP+ DAC to the HLT nodes.]
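As a rough consistency check of this data path (link speeds from the diagram; the 650 MByte/s sustained rate per ReadOut Engine is the figure quoted on the DAQ overview slide that follows):

```python
# Rough bandwidth check of the local-storage data path.
# Link speeds are from the diagram; 650 MByte/s per ReadOut Engine is
# the sustained figure quoted on the DAQ overview slide.
sas_gbit = 4 * 12.0           # 4x MiniSAS3 towards the JBOD
eth_gbit = 25.0               # SFP+ DAC towards the switch / HLT nodes
engine_gbit = 650 * 8 / 1000  # 650 MByte/s sustained per ReadOut Engine

for name, cap in [("MiniSAS3 (4x12)", sas_gbit), ("25 GbE link", eth_gbit)]:
    print(f"{name:16s} {cap:5.1f} Gbit/s, headroom over "
          f"{engine_gbit:.1f} Gbit/s: {cap / engine_gbit:.1f}x")
```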
Supermicro SuperChassis CSE-846BE2C-R1K03JBOD
- 4U storage JBOD chassis
- 24x 3.5" hot-swap HDD bays
- 8x Mini-SAS HD ports
- 1x IPMI port for remote system power on/off and system monitoring
- Dual expander backplane boards support SAS3/2 HDDs with 12 Gb/s throughput
- 1000 W (1+1) 96% efficient Titanium-level power supplies
- 24x 16 TB = 384 TB raw capacity

System cost: 24x 320 Euro + 1500 Euro + 750 Euro ≈ 10000 Euro
DAQ General Overview

[Diagram: full read-out chain - FrontEnds → LV0 Multiplexers → Crosspoint Switches (72x72, 24 interlinks) → LV1+ Multiplexers → DAQ Switch (8x8, 51.2 Gbit/s, 25 Gbit links) → SpillBuffer PCIe / ReadOut Engines → JBOD Local Storage ↔ HLT Nodes → Tape Archive at CERN.]

Roles of the stages:
- FrontEnd: sends continuous hit information.
- LV0 Multiplexer: multiplexing on image level and buffering.
- Crosspoint Switch (72x72): load balancing of the links between LV0 and LV1+ MUX.
- LV1+ Multiplexer: multiplexing on time-slice level and buffering; one time slice goes to one ReadOut Engine.
- DAQ Switch: multiplexes the data belonging to one time slice.
- ReadOut Engine: receives data and saves it in local storage.
- HLT Node: fetches data from local storage for filtering images.
- JBOD Local Storage: local storage for O(1-2 weeks).

Possible scheme for the maximum configuration:
- Max 120 input links, max rate 8x 6.4 Gb/s.
- 96 input links on the cross-point switches.
- 8x8 DAQ switch → max 8 ReadOut Engines → 8x 650 MByte/s = 5.2 GByte/s (sustained).
- Can be doubled by adding a second DAQ switch!
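To get a feeling for what "local storage for O(1-2 weeks)" implies at the maximum sustained rate, here is a minimal back-of-the-envelope sketch in Python. The beam duty cycle and the assumption of one 384 TB JBOD chassis per ReadOut Engine are illustrative guesses, not slide values:

```python
# Back-of-the-envelope buffer estimate for the local storage.
# The sustained rate and chassis capacity are from the slides;
# DUTY_CYCLE is an ASSUMED illustrative value.
SUSTAINED_RATE_GBYTE_S = 5.2   # 8 engines x 650 MByte/s (slide value)
DUTY_CYCLE = 0.3               # assumption: fraction of time with beam
BUFFER_DAYS = 14               # upper end of the O(1-2 weeks) buffer

seconds = BUFFER_DAYS * 24 * 3600
data_tb = SUSTAINED_RATE_GBYTE_S * DUTY_CYCLE * seconds / 1000  # TB

capacity_tb = 8 * 384          # assumption: one 384 TB chassis per engine
print(f"Data over {BUFFER_DAYS} days: {data_tb:,.0f} TB")
print(f"Raw local capacity:          {capacity_tb:,} TB")
```

With these assumptions, roughly 1900 TB accumulate over two weeks, which fits within the ~3 PB of raw JBOD capacity.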
FriDAQ Rack Planning

[Diagram: layout of the three server racks - 24x MTP patch panels, router, X-switches, network master, gateways GW01/GW02, aTCA MUX crates (1x 8xMUX2, 4x 24xMUX1), TCS distribution, Ethernet switches, aTCA switch, TimeSlice builder, DB master, file server, web server, DAQ master switch, HLT01-HLT08, RE01-RE08 and Storage1-Storage8.]

Power budget:
- 4x 100 W switches = 0.4 kW
- 8x 350 W servers = 2.8 kW
- 12x 300 W HLT = 3.6 kW
- 4x 6-slot x 200 W (aTCA) = 4.8 kW
- 4x 250 W disk arrays = 1.0 kW
- 4x 250 W disk arrays = 1.0 kW
- 2x 2-slot x 200 W (aTCA) = 0.8 kW
- 1x 100 W switch = 0.1 kW
- 2x 100 W (X-switch) = 0.2 kW
- 1x 300 W router = 0.3 kW
- 1x 100 W switch = 0.1 kW
- 3x 150 W small servers = 0.5 kW

Per-rack totals: 5.2 kW / 4.9 kW / 5.5 kW

https://schroff.nvent.com/en-gb/search#q=ATCA

→ Power requirements: 2x 16 A per rack
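A quick sanity check of these numbers, assuming single-phase 230 V mains (the voltage is an assumption; the per-rack totals are the slide values):

```python
# Sanity check of the rack power budget.
# Per-rack totals are from the slide; MAINS_VOLTAGE is an assumption.
MAINS_VOLTAGE = 230.0  # V, assumed single-phase European mains

rack_kw = {"rack 1": 5.2, "rack 2": 4.9, "rack 3": 5.5}

for rack, kw in rack_kw.items():
    amps = kw * 1000 / MAINS_VOLTAGE
    print(f"{rack}: {kw:.1f} kW -> {amps:.1f} A "
          f"-> needs 2x 16 A feeds ({2 * 16} A available)")

print(f"Total: {sum(rack_kw.values()):.1f} kW")
```

At 5.5 kW a rack draws about 24 A, more than a single 16 A feed can supply, hence the 2x 16 A requirement.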
Network
- Since December 2020, CERN IT routes the subnet 172.22.0.0/18 (172.22.0.1 - 172.22.63.254, 16,384 addresses) via our new layer-3 switch.
- VLANs are used to separate the sub-nets.
- Cross-sub-net communication is realized via the central layer-3 switch.
- DHCP/DNS is provided by a new, separate network gateway.
Network Scheme

[Diagram: CERN network → CERN HP router (172.22.24.1) → 20 Gbit LAG → Juniper switch (172.22.24.23), with VLAN trunks (all) towards the COMPASS CR (Bld. 892) and the access switches in the experimental area. PCCOGW00 (172.22.24.241) acts as gateway and DHCP/DNS for the private COMPASS domain.]

VLANs:

Name              Subnet            VlanID  l3-interface
COMPASS/GPN       172.22.24.0/24    10      172.22.24.23
COMPASS IPBus     10.152.0.0/16     50      10.152.0.5
COMPASS SlowCTRL  192.168.104.0/24  70      192.168.104.5
COMPASSPriv       192.168.101.0/24  60      192.168.101.5
IPMI              192.168.100.0/24  30      192.168.100.5
AMBER             172.22.28.0/22    128     172.22.28.1
AMBER SlowCTRL    172.22.32.0/22    132     172.22.32.1
AMBER IPBus       172.22.36.0/20    136     172.22.36.1

Routing between the COMPASS and AMBER networks might be required if we want to use the COMPASS control room?!
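The address arithmetic behind these subnets can be verified with Python's standard ipaddress module; a small sketch over a few entries from the table above:

```python
# Check the address-space figures with the standard library.
import ipaddress

routed = ipaddress.ip_network("172.22.0.0/18")    # routed by CERN IT
print(routed.num_addresses, "addresses")           # 16384
print("host range:", routed[1], "-", routed[-2])   # 172.22.0.1 - 172.22.63.254

# A few of the VLAN subnets from the table above
for name, net in [("COMPASS/GPN", "172.22.24.0/24"),
                  ("AMBER", "172.22.28.0/22"),
                  ("AMBER SlowCTRL", "172.22.32.0/22")]:
    n = ipaddress.ip_network(net)
    print(f"{name:15s} {net:16s} {n.num_addresses:5d} addresses,"
          f" inside /18: {n.subnet_of(routed)}")
```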
New central layer3 switch: Juniper QFX-5120-48Y
- 48x 25 GbE (SFP28) / 10 GbE (SFP+) / 1 GbE (SFP) downlink ports
- 8x 100 GbE (QSFP28) / 40 GbE (QSFP+) uplink ports
- Up to 4 Tbps L2 and L3 performance (bidirectional)
- Latency as low as 550 nanoseconds
- 2.9 GHz quad-core Intel CPU with 16 GB memory and 100 GB SSD storage
Status of Network Installation
- Central layer-3 network switch installed and configured.
- All switches in the area are updated and configured.
- All switches are connected to the Juniper switch as star point.
- New NETGW00 deployed with OPNsense as OS - acts as DNS and DHCP server.
- Network configuration of the servers in the COMPASS network has been adapted.
Network Map

All network equipment is integrated into Zabbix monitoring!
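The slides do not show how the Zabbix instance is queried; purely as an illustration, a hypothetical sketch using the third-party pyzabbix client (server URL and credentials are placeholders, not values from the slides):

```python
# Hypothetical sketch: list the monitored network hosts via the
# Zabbix API using the third-party pyzabbix client (pip install pyzabbix).
# URL and credentials are placeholders.
from pyzabbix import ZabbixAPI

zapi = ZabbixAPI("http://zabbix.example.cern.ch/zabbix")
zapi.login("readonly-user", "secret")

for host in zapi.host.get(output=["host", "status"]):
    state = "enabled" if host["status"] == "0" else "disabled"
    print(host["host"], "-", state)
```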
Speed towards CTA
- The connection towards the new CERN Tape Archive (CTA) was tested during the 2020 Dry Run.
- Over 7 Gbit/s achieved!
- Discussions about a bigger up-link towards the CERN data centers are still ongoing (min. 20 Gbit/s).
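The slides do not say which tool produced the figure; for illustration only, a minimal Python sketch that measures memory-to-memory TCP throughput between two hosts (port, chunk size and duration are arbitrary placeholder choices):

```python
# Minimal memory-to-memory TCP throughput test (illustrative only;
# the slides do not state which tool was used for the CTA test).
# Run "server" on one host, "client <host>" on the other.
import socket
import sys
import time

PORT = 5201       # placeholder port
CHUNK = 1 << 20   # 1 MiB send buffer
DURATION = 10.0   # seconds of sending

def server():
    with socket.create_server(("", PORT)) as srv:
        conn, addr = srv.accept()
        total, start = 0, time.monotonic()
        while (data := conn.recv(CHUNK)):
            total += len(data)
        elapsed = time.monotonic() - start
        print(f"{total * 8 / elapsed / 1e9:.2f} Gbit/s from {addr[0]}")

def client(host):
    payload = b"\0" * CHUNK
    with socket.create_connection((host, PORT)) as conn:
        end = time.monotonic() + DURATION
        while time.monotonic() < end:
            conn.sendall(payload)

if sys.argv[1] == "server":
    server()
else:
    client(sys.argv[2])
```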
Fiber Distribution Status Quo
- Old-style SC connectors / low-density installation / aging
  → Multiple conversions between connector types (LC-SC / SC-MTP / MTP-LC ...) - multiple points of failure!
- Lack of fibers, especially at BMS/CEDARs, Target, Gallery, Trigger
  → Use of multiplexers in the experimental area to concentrate fibers - sensitive to radiation! (Problems on the Gallery and CEDARs in 2018)
- Change from the classical read-out scheme to trigger-less read-out
  → Additional fibers needed to avoid multiple MUXs in the area.
  → Higher speed of the serial links not compatible with the current fibers.
- Avoid radiation issues for future hadron runs!
MTP/MPO Technology
- High-density fiber connectors - also used in all new DAQ developments.
- MTP-24 with OM3/4 fiber as the new standard for the DAQ.

https://www.samm.com/en/page/109/mpo-mtp-frequently-asked-questions.html
Plans: Remove all multiplexers from the area (at least from the high-radiation locations), run direct connections to the DAQ barracks, and place the multiplexers there.

[Map: positions 0-9 in the experimental area.]
Test Area (CEDAR)

[Diagram: test area with the TPC and location (1).]

- To connect our equipment at the two test locations, ≈ 20-25 m of patch cables are needed.
- Do we want to have a direct run from the BMS to our DAQ, or go over (1) as a patch point?
Status of Fibers

Position  Name      Old setup  Estimated new fibers  Newly installed  Comment
0         BMS       14         72                    0
1         CEDAR     12         72                    0                Covered by BE
2         Target    341        432                   144              To be installed
3         SM2       401        432                   144              Installed
4         Gallery   305        360                   0
5         ECAL2     390        432                   0
6         VETO      12         72                    24               Installed
7         Trigger   24         72                    48               Installed
8         BeamDump  12         72                    0
9         RICH      180        216                   144              Installed

We have to decide about the BeamDump position if we want to use it as a test location for the PRM!
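A quick tally over the table (values copied from the rows above):

```python
# Tally of the fiber table above (values copied from the slide).
# Columns: old setup, estimated new fibers, newly installed.
positions = {
    "BMS": (14, 72, 0),        "CEDAR": (12, 72, 0),
    "Target": (341, 432, 144), "SM2": (401, 432, 144),
    "Gallery": (305, 360, 0),  "ECAL2": (390, 432, 0),
    "VETO": (12, 72, 24),      "Trigger": (24, 72, 48),
    "BeamDump": (12, 72, 0),   "RICH": (180, 216, 144),
}

old, new, installed = (sum(col) for col in zip(*positions.values()))
print(f"old: {old}, estimated new: {new}, newly installed: {installed}")
# -> old: 1691, estimated new: 2232, newly installed: 504
```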
New Fibers
New Fiber patch panels
- DAQ-Barracks: 48x MTP-24 in 2x 1U; 96x LC to 8x MTP-24 in 2x 1U
- SM2-Position: 11x MTP-24, 24x LC
- RICH-Position: 11x MTP-24, 24x LC

Material for the Target position has arrived but still has to be installed.
General Status
Order/Delivery Status

Goal for tests in 2021: general infrastructure + 1x complete read-out chain running.

- 4x AMBER Server V1 (1x file server, 1x DB server, 1x ReadOutEngine, 1x HLT) - 2x already delivered, 2x ordered
- 2x low-cost server (1x gateway, 1x web server) - 2x ordered
- 2x 24-port 1 GbE network switch - not yet ordered, waiting for final offer
- 8x 16 TB HDD - 8x ordered, arrived today at CERN
- 1x storage array + RAID controller - not yet ordered, waiting for budget (≈ 10 kEuro)