From Open-Channel SSDs to Zoned Namespaces - Matias Bjørling Director, Solid-State System Software - Usenix
From Open-Channel SSDs to Zoned Namespaces
Matias Bjørling, Director, Solid-State System Software
23rd January 2019
© 2019 Western Digital Corporation or its affiliates. All rights reserved. 06.03.19
Forward-Looking Statements
Safe Harbor | Disclaimers
This presentation contains forward-looking statements that involve risks and uncertainties, including, but not limited to, statements regarding our solid-state technologies, product development efforts, software development and potential contributions, growth opportunities, and demand and market trends. Forward-looking statements should not be read as a guarantee of future performance or results, and will not necessarily be accurate indications of the times at, or by, which such performance or results will be achieved, if at all. Forward-looking statements are subject to risks and uncertainties that could cause actual performance or results to differ materially from those expressed in or suggested by the forward-looking statements.
Key risks and uncertainties include volatility in global economic conditions, business conditions and growth in the storage ecosystem, impact of competitive products and pricing, market acceptance and cost of commodity materials and specialized product components, actions by competitors, unexpected advances in competing technologies, difficulties or delays in manufacturing, and other risks and uncertainties listed in the company’s filings with the Securities and Exchange Commission (the “SEC”) and available on the SEC’s website at www.sec.gov, including our most recently filed periodic report, to which your attention is directed. We do not undertake any obligation to publicly update or revise any forward-looking statement, whether as a result of new information, future developments or otherwise, except as required by law.
Ubiquitous Workloads
Efficiency of the cloud requires consolidating many different workloads onto a single SSD.
[Figure: Databases, Sensors, Analytics, Virtualization, and Video workloads all targeting one Solid-State Drive]
Solid State Drive Internals
• Host Interface (NVMe): Read/Write
• NAND Read/Program/Erase
• Highly Parallel Architecture
  – Tens of Dies
• NAND Access Latencies
  – Read (50-100us)
  – Write (1-10ms)
  – Erase (3-15ms)
• Translation Layer
  – Logical to Physical Translation Layer
  – Wear-leveling
  – Garbage Collection
  – Bad block management
  – Media error handling
  – Etc.
• Read/Write/Erase -> Read/Write
[Figure: SSD block diagram. Host Interface (NVMe) carrying Read/Write; Solid State Drive functions (Logical to Physical Translation Map, Wear-Leveling, Garbage Collection, Bad Block Management, Media Error Handling, …); Media Interface Controller issuing Read/Program/Erase; a grid of dies (Die0 to Die3) labelled Highly Parallel Architecture]
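The "Read/Write/Erase -> Read/Write" point on the slide above can be illustrated with a minimal sketch of a page-mapped translation layer. This is not the firmware design of any particular drive: the class name `TinyFTL`, its fields, and the append-only allocation policy are assumptions made for illustration, and wear-leveling, garbage collection, and error handling are left out.

```python
class TinyFTL:
    """Hypothetical page-mapped FTL: the host sees read/write only."""

    def __init__(self, num_pages: int) -> None:
        self.num_pages = num_pages
        self.l2p = {}          # logical page number -> physical page number
        self.invalid = set()   # physical pages that now hold stale data
        self.next_free = 0     # next unwritten physical page

    def write(self, lpn: int) -> int:
        # NAND pages cannot be overwritten in place, so every write goes
        # to a fresh physical page and the old copy is marked stale.
        if self.next_free >= self.num_pages:
            raise RuntimeError("no free pages; garbage collection needed")
        old = self.l2p.get(lpn)
        if old is not None:
            self.invalid.add(old)
        self.l2p[lpn] = self.next_free
        self.next_free += 1
        return self.l2p[lpn]

    def read(self, lpn: int) -> int:
        # Return the physical page that currently backs this logical page.
        return self.l2p[lpn]


ftl = TinyFTL(num_pages=8)
ftl.write(7)   # first copy of logical page 7
ftl.write(7)   # rewrite: new physical page, old one becomes stale
```

The stale pages collected in `invalid` are what device-side garbage collection later has to reclaim with erase operations, which is the hidden work the host never sees.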
Open-Channel Concepts
Chunks & Parallel Units
• Chunks
  – Write sequentially within an LBA range
  – Requires reset for rewrites
  – Borrows from HDD’s SMR specification (ZAC/ZBC)
  – Optimized for SSD physical constraints
    • Align writes to media layout
• Parallel Units
  – Host can direct I/Os to separate workloads
  – Stripes across single or multiple dies
  – Parallel units inherit the throughput and latency characteristics of the underlying media
  – Similar concept to I/O Determinism in NVMe
[Figure: disk LBA range divided in chunks (Chunk 0, Chunk 1, Chunk 2, Chunk 3, …, Chunk X)]
[Figure: Application-1, Application-2, and Application-3 on SSD firmware and flash, shown without and with per-application FTLs (FTL-1, FTL-2, FTL-3)]
Open-Channel SSDs
• I/O Isolation
• Predictable Latency
• Data Placement & I/O Scheduling
Open-Channel SSDs Interface Adoption
• Open-Channel SSD architectures are being adopted by major hyper-scalers
• Concepts ready to be part of the NVMe standard specification
NVMe Approach
Drive features into NVMe which address key OCSSD use cases
Zoned Namespaces (e.g., Chunks): today’s talk
Endurance Group Mgmt (e.g., Parallel Units)
• Explicit Provisioning of Device Topology
• Build on IO Determinism
Zoned Namespaces (ZNS)
• Technical Proposal in the NVMe working group
• Standardizes zone interface as an approach to:
  – Reduce device-side write amplification
  – Reduce over-provisioning
    • “Note that excessive over-provisioning is similar to early replacement -- in both cases you buy more devices.” - Mark Callaghan, Facebook
  – Reduce DRAM in SSDs
    • Highest cost after NAND itself
  – Improve latency outliers and throughput
    • Less device-side data movement
    • Tail at scale
  – Enable software eco-system
    • Everyone benefits from improvements!
Zoned Namespaces similar to ZBC/ZAC for SMR HDDs
• Storage capacity is divided into zones
• Each zone is written sequentially
• Interface optimized for SSDs
  – Align with media characteristics
    • Zone size aligned to media (e.g., NAND block sizes)
    • Zone capacity aligned to physical media sizes
  – Reduce NAND media erase cycles (write amplification)
• Write commands advance the write pointer position
• Reset write pointer commands rewind the write pointer
[Figure: disk LBA range divided in zones (Zone 0, Zone 1, Zone 2, Zone 3, …, Zone X), with the write pointer marked inside a zone]
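The write pointer behaviour described on the slide above can be sketched as a toy model. The class `Zone` and its method names are hypothetical, chosen only for illustration; they do not correspond to NVMe commands or to any library API.

```python
class Zone:
    """Toy model of one zone: a start LBA, a capacity, and a write pointer."""

    def __init__(self, start_lba: int, capacity: int) -> None:
        self.start_lba = start_lba
        self.capacity = capacity          # writable blocks in this zone
        self.write_pointer = start_lba    # next LBA that may be written

    def write(self, lba: int, nblocks: int) -> None:
        # Sequential-only: a write must land exactly at the write pointer.
        if lba != self.write_pointer:
            raise ValueError("write is not at the write pointer")
        if lba + nblocks > self.start_lba + self.capacity:
            raise ValueError("write exceeds zone capacity")
        self.write_pointer += nblocks     # write commands advance the pointer

    def reset(self) -> None:
        # A reset rewinds the pointer so the zone can be rewritten from its start.
        self.write_pointer = self.start_lba


zone = Zone(start_lba=0, capacity=8)
zone.write(0, 4)   # write pointer advances to 4
zone.write(4, 2)   # write pointer advances to 6
zone.reset()       # write pointer rewinds to 0
```

Rejecting any write that is not at the write pointer is what lets the device skip the fine-grained mapping and data movement a conventional drive needs.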
Zone Information
• Zone State
  – Empty, Implicitly Opened, Explicitly Opened, Closed, Full, Read Only, Offline
  – Empty -> Open -> Full -> Empty -> …
• Zone Reset
  – Full -> Empty
• Zone Size & Zone Capacity
  – Zone Size is fixed
  – Zone Capacity is the writeable area within a zone
[Figure: Zone X - 1, Zone X, Zone X + 1; each zone begins at its Zone Start LBA; Zone Size (e.g., 512MB) spans the whole zone, Zone Capacity (e.g., 500MB) spans its writeable part]
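The relationship between zone size and zone capacity can be worked through with the slide's example numbers (512MB size, 500MB capacity). The sketch below uses byte offsets rather than LBAs and assumes binary megabytes to keep the arithmetic simple; the function names are hypothetical.

```python
MiB = 1024 * 1024
ZONE_SIZE = 512 * MiB       # fixed distance between consecutive zone starts
ZONE_CAPACITY = 500 * MiB   # writeable bytes at the start of each zone


def zone_start(index: int) -> int:
    """Byte offset at which zone `index` begins."""
    return index * ZONE_SIZE


def writable_range(index: int) -> tuple:
    """Half-open byte range [start, end) that can be written in a zone."""
    start = zone_start(index)
    return start, start + ZONE_CAPACITY


def locate(offset: int) -> tuple:
    """Map a byte offset to (zone index, offset in zone, is_writable)."""
    index, within = divmod(offset, ZONE_SIZE)
    return index, within, within < ZONE_CAPACITY


# An offset 505 MiB into zone 1 lies in the gap between capacity and size.
print(locate(zone_start(1) + 505 * MiB))
```

Keeping the zone size fixed keeps zone start addresses easy to compute, while the capacity can follow whatever the physical media actually provides.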
Zone Append
• Low scalability on multiple writers to a zone
  – Write Queue Depth per Zone = 1
  – IOPS: 80K vs 880K on QEMU and 300K vs 1400K on metal
• ZAC/ZBC requires strict write ordering
  – Limits write performance and increases host overhead
  – Big challenge with software eco-system, HBAs, etc.
• Introducing Zone Append
  – Append data to a zone without defining offset
  – Drive returns where data was written in the zone
[Figure: two charts, "QEMU (i7-6700K, 4K, QD1, RW, libaio)" and "Metal (i7-6700K, 4K, QD1, RW, libaio)"; x-axis: Number of Jobs (1 to 4); y-axis: K IOPS; series: 1 Zone, 4 Zones, Zone Append / No Zones; annotation: Zone Lock Contention]
Zone Write Example
3x Writes (4K, 8K, 16K) – Queue Depth = 1
[Figure: Write0 (4K), Write1 (8K), and Write2 (16K) placed in a zone, with the write pointer (WP) shown at each successive position]
Zone Append Example
3x Writes (4K, 8K, 16K) – Queue Depth = 3
[Figure: Write0 (4K), Write1 (8K), and Write2 (16K) placed in a zone, with the write pointer (WP) shown at each successive position]
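The difference between the two examples above can be sketched in a few lines: with Zone Append the host does not choose an offset, and the device reports where each write landed. The class `AppendZone` and the arrival order below are assumptions for illustration only, not the NVMe command format.

```python
K = 1024


class AppendZone:
    """Toy zone that assigns the offset itself, as Zone Append does."""

    def __init__(self, start: int, capacity: int) -> None:
        self.start = start
        self.capacity = capacity
        self.write_pointer = start

    def append(self, nbytes: int) -> int:
        # The device places the data at the current write pointer and
        # returns that location to the host.
        if self.write_pointer + nbytes > self.start + self.capacity:
            raise ValueError("append exceeds zone capacity")
        assigned = self.write_pointer
        self.write_pointer += nbytes
        return assigned


zone = AppendZone(start=0, capacity=64 * K)

# Three writes are in flight at once (queue depth 3) and happen to reach
# the device in a different order than they were issued.
arrival_order = [("Write2", 16 * K), ("Write0", 4 * K), ("Write1", 8 * K)]
placement = {name: zone.append(size) for name, size in arrival_order}
# placement now records the offset the device chose for each write
```

Because any arrival order yields a valid sequential layout, the host no longer has to serialize writers to a zone, which is the lock contention the benchmark slide points at.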
ZNS: Synergy w/ ZAC/ZBC software ecosystem
• Existing ZAC/ZBC-aware file systems & device mappers “just work”
  – Few changes needed to support ZNS
• Reuse existing work already applied for ZAC/ZBC hard-drives (SMR) in Linux
• Integrate directly with file-systems
  – No host-side FTL
  – No 1GB DRAM per 1TB Media requirement
  – Better utilization of SSD
• Code is already in production at hyper-scalers and available in the Linux eco-system.
[Figure: storage stack. User Space: Applications, util-linux, fio, libzbc, blktests. Kernel: File-System with SMR Support (f2fs, btrfs), Regular File-Systems (xfs), Logical Block Device (dm-zoned), Block Layer, SCSI/ATA, NVMe. Devices: SMR HDD (ZBC/ZAC), ZNS SSD. Legend: * = Enhanced data paths for SMR drives]
Mature Software Stack
ZNS to use existing storage stack
• User-space libraries
  – libzbd
  – nvme-cli
  – blktests
  – util-linux (blkzone)
  – fio
  – libzns
• Kernel-space libraries
  – NVMe support for Zones
  – XFS, Btrfs, F2FS, dm-zoned, etc.
• QEMU with ZNS support
ZBC State Model
• States (Empty, Implicitly Opened, Explicitly Opened, Full, Offline)
• Each zone
  – Size
  – Capacity
  – Write Pointer
  – And other fields
• Zone Management command
  – ZM Open, ZM Close, ZM Reset, ZM Finish
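The zone management commands above can be read as transitions in a small state machine. The sketch below is a simplification and not the transition table from the ZBC or NVMe specifications: it adds the Closed state from the Zone Information slide, leaves out Read Only and Offline, and ignores the move to Full when writes fill a zone. The names `ZoneState`, `TRANSITIONS`, and `apply` are hypothetical.

```python
from enum import Enum


class ZoneState(Enum):
    EMPTY = "Empty"
    IMPLICITLY_OPENED = "Implicitly Opened"
    EXPLICITLY_OPENED = "Explicitly Opened"
    CLOSED = "Closed"
    FULL = "Full"


S = ZoneState

# (current state, command) -> next state, for state-dependent commands.
TRANSITIONS = {
    (S.EMPTY, "write"): S.IMPLICITLY_OPENED,
    (S.CLOSED, "write"): S.IMPLICITLY_OPENED,
    (S.IMPLICITLY_OPENED, "write"): S.IMPLICITLY_OPENED,
    (S.EXPLICITLY_OPENED, "write"): S.EXPLICITLY_OPENED,
    (S.EMPTY, "open"): S.EXPLICITLY_OPENED,
    (S.CLOSED, "open"): S.EXPLICITLY_OPENED,
    (S.IMPLICITLY_OPENED, "open"): S.EXPLICITLY_OPENED,
    (S.IMPLICITLY_OPENED, "close"): S.CLOSED,
    (S.EXPLICITLY_OPENED, "close"): S.CLOSED,
}


def apply(state: ZoneState, command: str) -> ZoneState:
    """Return the next state, or raise if the command is not allowed."""
    if command == "reset":     # simplified: reset always returns to Empty
        return S.EMPTY
    if command == "finish":    # simplified: finish always moves to Full
        return S.FULL
    try:
        return TRANSITIONS[(state, command)]
    except KeyError:
        raise ValueError(
            f"{command!r} is not allowed in state {state.value}"
        ) from None


state = S.EMPTY
for command in ["write", "close", "open", "finish", "reset"]:
    state = apply(state, command)   # ends back in Empty
```

A table keyed on (state, command) keeps the allowed transitions visible in one place, which makes it easy to compare against the specification's own state diagram.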