Microsoft SQL Server Acceleration Flash Buyer's Guide
Allon Cohen, PhD
Scott Harlin
OCZ Storage Solutions, Inc. – A Toshiba Group Company
Contents

1 Introduction
2 Microsoft SQL Server Overview
    2.1 What are common Microsoft SQL Server DBMS data types?
    2.2 What are the common Microsoft SQL Server databases?
    2.3 What are the common Microsoft SQL Server applications?
3 When should I use flash to accelerate my SQL Server application?
4 Which flash form factors are best for my SQL Server application?
5 Which performance metrics should I use to best compare flash solutions?
    5.1 Testing flash performance specific to SQL Server workloads
6 What is the difference between flash volumes and flash caching, and when to use each?
7 What kinds of SQL Server data should I place on flash?
8 What capabilities verify data will be available on flash for SQL Server?
9 What parameters should I consider if my SQL Server instances are virtualized?
10 What native flash functionality does SQL Server 2014 provide?
11 How can flash improve SQL Server ETL processes?
12 How can implementation wizards affect the success of my SQL Server acceleration project?
13 Flash Buyer’s Checklist
14 Introducing OCZ’s ZD-XL SQL Accelerator

1 Introduction

Flash memory technology provides a perfect match to the performance requirements of enterprise and cloud database applications. In comparison to traditional hard disk drive (HDD) storage, flash-based storage uses significantly less electricity, has no moving mechanical heads or spinning disks, and reads and writes data significantly faster, handling random data access with ease and outperforming the slower electromagnetic media.

The purpose of this Flash Buyer’s Guide, developed by OCZ Storage Solutions – a Toshiba Group Company, is to provide database administrators (DBAs) and IT managers responsible for the success of their enterprise database applications with key guidelines on when and why to deploy flash in a Microsoft SQL Server environment and how to accelerate this application effectively using solid-state storage.

The guide details common questions asked by DBAs and IT managers when considering flash usage in their environments and provides answers to those questions with solutions and capabilities that accelerate Microsoft SQL Server applications.

Buyer’s Guide | SQL Server Acceleration Flash Buyer’s Guide | V1.0 | © 2014 OCZ Storage Solutions
Included in the Flash Buyer’s Guide is a checklist that can be used when comparing the offerings available from various flash storage vendors, as well as a short introduction to how OCZ accelerates SQL Server applications through its award-winning ZD-XL SQL Accelerator PCIe card.

2 Microsoft SQL Server Overview

SQL Server is a database management system (DBMS) developed by Microsoft and designed for enterprise environments. It is a software product whose primary function is to store and retrieve data as requested by other applications (either on the same computer or on another computer across a network). There are different editions of Microsoft SQL Server aimed at various audiences, with workloads that include OnLine Transaction Processing (OLTP), data warehousing, data mining, and OnLine Analytical Processing (OLAP), to name a few.

As the most pervasive DBMS solution in the market, Microsoft SQL Server provides enterprises of all sizes with a wide range of transactional and analytical capabilities to solve critical business needs. Because SQL Server is data-access intensive, one of the major factors affecting its performance is the strength of the underlying storage resources. The speed of the storage implementation, measured by the time it takes to scan and analyze large portions of data and by the number of concurrent commands that the storage device can process, ultimately determines whether users receive the business insight they need, when they need it.

To enable a large number of users to be serviced without contention, and to maximize each user’s application experience, the underlying storage metrics particular to SQL Server workloads must deliver optimal performance. Providing immediate access to data becomes especially critical during peak usage so that productivity is not adversely affected. Transactional access rates and database read bandwidth can significantly impact the time it takes to complete queries in enterprise and cloud environments.
New technologies introduced in Microsoft SQL Server 2014, such as in-memory columnar processing and buffer pool extensions, further enhance SQL Server performance, far surpassing the capabilities of traditional HDD storage. With the parallel growth in server CPU power and memory, a matching storage paradigm is required to achieve the full benefits and new capabilities presented by SQL Server 2014.

2.1 What are common Microsoft SQL Server DBMS data types?

Microsoft SQL Server is a DBMS that lets users create and access data in a database and manages these user requests, freeing users and other application programs from having to know where data is located on storage media. As a DBMS, it ensures that data continues to be accessible, is consistently organized as intended, and is secured so only those with access privileges can access the data.
The most common type of DBMS is the relational database management system (RDBMS), for which the standard user and program interface is the Structured Query Language (SQL). A DBMS can be thought of as a file manager that manages data in databases rather than files in file systems. Microsoft SQL Server is a DBMS that serves database requests from multiple users and includes three distinct data types:

1. Data records and indexes (referred to as database data)
2. Transaction logs (referred to as the write log files), which log user transactions and are also used for recovery and data replication
3. TempDB files, which use a temporary database to store transient, non-persistent data

2.2 What are the common Microsoft SQL Server databases?

Microsoft SQL Server databases typically fall into two primary categories, either transactional or analytical, with records presented as either row or columnar based.

Transactional Databases

These databases facilitate and manage all online transactions, capturing information surrounding a business transaction (e.g. a sale) while enabling the data to be segmented, grouped, stored or retrieved for a specific use case. This type of data commonly includes items such as products ordered, sales prices, shipping and routing information, method of payment, applied warranties/rebates/discounts, sales location and any number of other variables available through the recording of transactions.

Analytical Databases

These databases selectively extract data for analysis and provide varying points of view of the data captured. As an example, this type of data enables users to analyze products sold in a geographic area during a specific month, compare revenue against another month, or compare other products sold during the same time period.
To facilitate this kind of analysis, the data in some cases is stored in a multi-dimensional database (versus a relational database) that considers each data attribute (such as product, geographic sales region, and time period) as a separate dimension for analysis.

Row-Oriented Databases

A row-oriented DBMS stores data in rows. Each row, sometimes referred to as a record, holds the set of database fields within a table that are relevant to a specific entry. For example, in a table called customer contact information, rows may contain the following fields:

ID     LastName   FirstName   Bonus
132    Doe        John        8000
133    Smith      Will        4000
134    Jones      Mike        2000
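As an illustration only, the customer table above can be held in memory either record by record (the row-oriented layout) or field by field (the column-oriented layout that this section describes next). The Python sketch below is not SQL Server's storage format; `rows`, `to_columns`, and the two `sum_bonus_*` helpers are hypothetical names chosen for this example. It shows why a query that touches a single field reads fewer values from a column-oriented layout.

```python
# Row-oriented layout: each record is stored as one unit.
FIELDS = ("ID", "LastName", "FirstName", "Bonus")
rows = [
    (132, "Doe", "John", 8000),
    (133, "Smith", "Will", 4000),
    (134, "Jones", "Mike", 2000),
]


def to_columns(records, fields=FIELDS):
    """Pivot row-oriented records into one list per column."""
    return {name: [rec[i] for rec in records] for i, name in enumerate(fields)}


# Column-oriented layout: each column is stored as one unit.
columns = to_columns(rows)


def sum_bonus_row_store(records):
    """Sum the Bonus field from rows; returns (total, values read)."""
    total = 0
    values_read = 0
    for rec in records:
        values_read += len(rec)  # the whole record is fetched to reach one field
        total += rec[FIELDS.index("Bonus")]
    return total, values_read


def sum_bonus_column_store(cols):
    """Sum the Bonus column directly; returns (total, values read)."""
    bonus = cols["Bonus"]
    return sum(bonus), len(bonus)
```

In this toy model, summing the bonuses touches all twelve stored values in the row layout but only the three values of the Bonus column in the column layout.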
In a row-oriented DBMS, each record would be stored as a unit. In the above example, three units would be stored: {132, Doe, John, 8000}, {133, Smith, Will, 4000} and {134, Jones, Mike, 2000}.

Column-Oriented Databases

A column-oriented DBMS stores data in columns (instead of rows) so that data can be highly compressed. Operations can be performed very quickly because the number of data elements that the database engine must read while processing queries is greatly reduced. Also, since columnar databases are self-indexing, they require less storage space. Using the example from the Row-Oriented records, in a column-oriented DBMS each column would be stored as a unit as follows: {132, 133, 134}, {Doe, Smith, Jones}, {John, Will, Mike} and {8000, 4000, 2000}.

2.3 What are the common Microsoft SQL Server applications?

Microsoft SQL Server applications typically fall into three primary categories: OnLine Transaction Processing (OLTP), OnLine Analytical Processing (OLAP) and data mining. A brief description of each follows.

OLTP

A database application that facilitates, manages, and processes transactions. A primary example is data entry and sales order retrieval in support of a number of industries. The key to supporting OLTP applications in the enterprise is to enable a large number of users to be serviced without contention and to deliver optimal storage latency and transactional IOPS. SSD flash memory fits the data access requirements of OLTP applications through its support of ultra-low latencies and its ability to efficiently handle randomized data access requests.

OLAP

A database application used to selectively extract data for analysis and to provide varying points of view of the data captured, helping companies and organizations make better business decisions.
As it relates to marketing applications, OLAP is commonly used to verify or prove the effectiveness of existing marketing campaigns. In many cases, the data to be used for OLAP is collected into a central repository commonly called a data warehouse. Data from various OLTP applications and other sources is selectively extracted and organized into the data warehouse database for use by analytical applications and user queries. A Microsoft SQL Server data warehouse can include hundreds of thousands of products and a few million records (typical in today’s enterprise), so the accuracy of the information and the ability to deliver data in real time to customers can be the difference in securing customer orders and providing a heightened experience.
Data Mining

In data mining applications, miners sort through huge data sets using sophisticated software to identify undiscovered patterns and establish hidden relationships, whereas data analytics focuses on the process of deriving a conclusion based solely on what is already known. Typical data mining parameters include:

• Association: searches for patterns that connect one event to another
• Sequence (Path) Analysis: searches for patterns where one event leads to another later event
• Classification: classifies new facts into previously discovered patterns
• Clustering: finds and visually groups facts into distinct classes
• Forecasting: discovers patterns in data that can lead to reasonable future predictions, also known as predictive analysis

3 When should I use flash to accelerate my SQL Server application?

A DBA or IT manager will know they have a performance issue to address when either end users complain about performance or the CFO complains about SAN and server expenditures. Microsoft SQL Server performance can be affected by multiple external factors such as the underlying networking, slow CPUs, and limited memory allocation, but the collective experience of many DBAs has shown that in a majority of cases, the culprit is slow storage. The following are key signs that an enterprise environment can benefit from using flash to accelerate SQL Server applications.

Low CPU Utilization during Peak Usage Times

During high workload events when memory fills up, SQL Server often requires frequent access to data pages from underlying storage. Low CPU utilization coupled with high memory usage is an indicator that the processing cores are sitting idle, waiting for data to arrive. In these cases, deploying flash speeds up data access and dramatically improves CPU utilization.
SAN Controller I/O at Maximum Performance

If SQL Server is saturating the underlying SAN, all other applications attempting to access the SAN will also be affected. The performance bottleneck will spread from SQL Server to other applications. On-host flash resources, such as PCIe cards, enable the data-hungry SQL Server CPUs to get the data they need locally, alleviating pressure on the SAN while enabling it to handle other applications.

Excessive TempDB Usage

TempDB files may contain interim results (such as transient calculation tables) that the server writes during a query and must read back before it can complete that query; therefore, both write and read performance are critical for the timely execution of such queries. In multi-core systems, tempDB files may be written to and read from simultaneously, creating random access patterns that are detrimental to HDD performance as its mechanical head moves from location
to location. Each movement takes time, so read/write IOPS drop and latency rises considerably until data is found and accessed. In contrast, tempDB usage patterns are an excellent fit for SSDs, which seamlessly handle random read and write loads. Low memory creates further load on tempDB files, as SQL Server uses them to store interim results when main memory runs out. With flash storage, tempDB performance can be dramatically improved.

Frequent Storage and Server Upgrades

If an IT department is constantly upgrading the SAN due to growing usage, or if more servers are being purchased to improve performance while CPU utilization for each server remains low, introducing flash into the environment can raise CPU utilization on existing servers and deliver the required performance benefits while reducing licensing, OPEX and CAPEX costs in the enterprise.

4 Which flash form factors are best for my SQL Server application?

The main flash-based SSD formats fall into three categories:

PCIe Flash Cards – a high-speed expansion card format that connects enterprise servers to peripherals in a serial interface format. Every device connected to a motherboard using PCIe can use multiple point-to-point connections called ‘lanes.’ As a result, PCIe-connected devices can aggregate bandwidth, which in turn enables more scalable performance, lower latency and higher data transfer rates, making these devices ideal for SQL Server applications. As a server-side flash solution, PCIe cards are typically compact and power-efficient, and fit directly into the server’s PCIe bus to increase server application performance while delivering fast and reliable access to data without burdening host CPU and memory resources. The interface supports different size formats of PCIe cards such as Full-Height/Full-Length (FH/FL), Full-Height/Half-Length (FH/HL) and Half-Height/Half-Length (HH/HL).
SAS or SATA SSDs – These flash-based drives are packaged in the same form factor as traditional HDDs, so they can be deployed in practically any server or storage slot initially designed to hold HDDs. SAS (Serial Attached SCSI) and SATA (Serial Advanced Technology Attachment) are both industry-standard connections, based on serial signaling technology, used to connect HDDs, and now flash-based SSDs, into a computer system. One drawback of both SAS and SATA is that these form factors were originally designed for HDDs, so the power that can be drawn and used by SSDs is limited, affecting each SSD’s peak performance. Storage capacity is also capped by the amount of flash that can actually fit into these smaller HDD enclosures. Finally, bandwidth is also limited by SAS and SATA
interfaces to the maximal rates originally designed with mechanical HDDs in mind. To a certain extent, this limitation can be overcome by aggregating the bandwidth from multiple SSDs in an enclosure.

External Flash Arrays – External flash arrays offer a simple way to deploy large amounts of flash centrally to multiple servers. They connect to servers using traditional SAN and NAS interfaces and present flash to those servers through networking protocols. Flash arrays enable relatively simple deployment because they do not require physical access to the host running SQL Server to deploy the flash. These external arrays therefore offer an excellent solution when the amount of flash that can be placed locally on the server is limited, as is the case in blade arrays. Flash arrays may be susceptible to networking and storage controller bottlenecks as they do not sit directly on the host PCIe bus. Latency will also be increased due to the extra network hops.

5 Which performance metrics should I use to best compare flash solutions?

Flash storage vendors present performance data that covers various users and use cases, so one must use caution when applying this data to Microsoft SQL Server environments, as this application has very particular data access patterns. As a result, some of the general performance data published by flash vendors may be irrelevant to the performance one will actually obtain in a SQL Server environment. Typical parameters used to evaluate the total effectiveness of a storage system include input/output operations per second (IOPS), bandwidth and latency.

Latency

The time it takes to complete an operation. Typically, the farther away the flash storage device is from the CPU accessing it, the more latency exists and the slower it is to complete an operation.
As an example, a PCIe flash card that resides on the server’s PCIe bus, in close proximity to the CPU, has significantly less latency than a SAS-based or SATA-based SSD residing on a shared storage network. HDDs have significantly higher latency than SSDs given the time it takes to move the read head to new locations.

It is important to note that while latency measures the total time it takes to complete an I/O command, not all commands are identical. Commands are differentiated by their command size, which reflects the amount of data that needs to be passed to or from an SSD. Small command sizes require the processing of small amounts of data, and hence the latency to process them will typically be lower. Larger command sizes require the processing of larger amounts of data and hence take more time to process. For this reason, low latency numbers published by flash storage vendors usually reflect very small command sizes.
For most transactional operations, SQL Server uses large command sizes, typically ranging from 4KB to as high as 64KB, and therefore the lowest latency numbers published usually are not relevant to SQL Server performance. As most SQL Server applications run on multicore systems and issue multiple storage commands in parallel, the latency of a single I/O command will rarely be the determinant of overall performance. In most systems the ability to handle multiple concurrent commands, as indicated by IOPS performance, is much more critical.

IOPS (Input/Output Operations Per Second)

A measurement representing the total number of I/O operations that an SSD can perform in one second for a given type of storage operation (read, write, or mixed) and a given command size. While lowering latency does enable a higher number of IOPS (as more commands can be completed each second), latency is not the only factor determining IOPS. The second factor affecting IOPS performance is parallelism, which increases IOPS by enabling multiple commands to be processed in parallel. A PCIe flash card that employs parallelism will be able to process more commands per second and more effectively handle the concurrent requests of many CPU cores processing data in parallel for SQL Server queries. For this reason, a PCIe card with multiple controllers (or a controller with multiple cores) will usually fare better in a SQL Server environment than a PCIe card with a single controller, even if the latter is capable of very low latencies for individual commands.

While IOPS is usually a better metric than latency for measuring SQL Server performance, remember that even when comparing IOPS it is important to consider command sizes relevant to SQL Server loads, which range from 4KB to 64KB. IOPS are particularly important as a performance indicator for transactional workloads (OLTP applications), which involve a large number of concurrent transactions.
However, the performance of analytics and reporting workloads is sometimes governed by the ability to quickly read very large amounts of data for processing, which is measured by the bandwidth indicator.

Bandwidth

A synonym for the data transfer rate, bandwidth represents the amount of data that can be transferred in a given time period. SSD bandwidth is usually expressed in megabytes per second or gigabytes per second (MB/s or GB/s), while interface line rates are often quoted in gigabits per second (Gbps). When SQL Server needs to access large amounts of data (such as entire tables) for analytics or report generation, it will attempt to read using the largest sequential commands possible. In this scenario, the storage device will receive a series of very large sequential commands that can reach up to 256KB per command. At this stage, the time it will take to complete such queries is no longer governed by a maximal IOPS count (as each command is in itself very large) but rather by the maximal bandwidth the SSD can produce. This will vary between manufacturers and between models, as the maximum bandwidth can be limited by the interface (SAS/SATA/PCIe), by the power available to the SSD, by the controller(s) and by the internal handling of large data sets by the SSD (such as striping large commands between NAND chips).
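The relationships among latency, parallelism, IOPS and bandwidth described in this section can be sketched with a small, idealized model. It assumes the device always has a fixed number of commands in flight and that nothing else (interface, controller, host) is the bottleneck, so real devices will deliver less. The function names and all figures below are illustrative assumptions, not vendor measurements, and MB and GB are treated as 1024-based units.

```python
def iops(latency_us: float, commands_in_flight: int = 1) -> float:
    """Idealized IOPS: commands kept in flight divided by per-command latency."""
    if latency_us <= 0 or commands_in_flight < 1:
        raise ValueError("latency must be positive and at least one command in flight")
    return commands_in_flight * 1_000_000 / latency_us


def bandwidth_mb_per_s(iops_value: float, command_size_kb: float) -> float:
    """Bandwidth implied by an IOPS figure at a given command size."""
    return iops_value * command_size_kb / 1024


def scan_time_s(data_gb: float, bandwidth: float) -> float:
    """Seconds needed to read `data_gb` of data at `bandwidth` MB/s."""
    return data_gb * 1024 / bandwidth


# A low-latency device that handles one command at a time...
single = iops(latency_us=50)
# ...versus a slower device that processes eight commands in parallel.
parallel = iops(latency_us=100, commands_in_flight=8)

# Transactional load: many small commands, so IOPS is the figure to compare.
oltp_bandwidth = bandwidth_mb_per_s(parallel, command_size_kb=8)
# Analytical load: few large sequential commands, so bandwidth governs.
scan_bandwidth = bandwidth_mb_per_s(4000, command_size_kb=256)
```

Under these assumptions the device with twice the latency but eight-way parallelism completes four times as many commands per second, which is the point made above about multi-controller cards. The same model shows why a modest IOPS figure at 256KB commands can still represent a high bandwidth.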
5.1 Testing flash performance specific to SQL Server workloads

While IOPS can help determine performance for transactional loads and bandwidth can help determine flash performance for analytical loads, in day-to-day activities the performance boost received through the use of flash also depends on other environment variables such as the CPU, the amount of memory, and the design of the actual queries being performed by SQL Server. Therefore, the best way to evaluate flash performance is to test with the specific SQL Server data sets and queries of your environment.

If testing production data is not possible, standard benchmarks are available and can be helpful in indicating flash performance under typical database loads. Commonly used standard benchmark tests include TPC-C, which simulates transactional loads, and TPC-H, which simulates analytical loads. Many manufacturers can provide you with the results of their testing based on these benchmarks and other tests. Tools such as HammerDB can help you generate emulated benchmark loads against SQL Server instances.

6 What is the difference between flash volumes and flash caching, and when to use each?

A flash volume is a portion of a PCIe flash card or SSD that is exposed as a volume to the operating system so that data is written and read directly to flash. Flash volumes offer the highest possible speed for both read and write operations as they use flash directly for every command. As a result, flash volumes are particularly useful when both high write speeds and high read speeds are required by the application.

A typical example is the tempDB file. When SQL Server does not have enough DRAM available, interim query data is automatically written to a tempDB file. As the memory clears up, data in the tempDB file is immediately read back. This write operation, followed shortly thereafter by a read operation on the tempDB file, makes its performance highly dependent on both read and write speeds.
If the tempDB file resides on a remote SAN, these write and read operations can cause a considerable drop in SQL Server performance. Placing tempDB files on local flash volumes provides an immediate boost in application performance.

A flash cache is a repository of flash memory storage that holds copies of certain portions of a database stored on other media such as HDDs. The cache enables requests for data to be fulfilled at greater speed and is used in tandem with slower hard disk drives (HDDs) to improve data access times. A cache can be located in a server, storage device or network, and requires time to warm up and populate. As more data enters the cache, more requests can be accelerated.

A good example for flash caching occurs with large databases that will not fit into flash volumes. As database capacities grow beyond a certain point, it becomes impractical or not cost-effective to place the entire database on SSD flash. Even though database files can reach terabytes in size, only the data within certain ‘hot zones’ needs to be frequently accessed. Storing only data from these hot zones on flash cache allows for a major performance increase at a considerable cost savings.
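The warm-up behavior and the hot-zone effect described above can be illustrated with a toy simulation. The sketch below assumes a plain least-recently-used (LRU) eviction policy as a stand-in; it is not the caching algorithm of any particular product, and `FlashCache`, `make_trace` and `hit_ratio_per_window` are hypothetical names. The synthetic trace repeatedly reads a small set of hot pages and mixes in cold pages that are never read again.

```python
from collections import OrderedDict


class FlashCache:
    """Tiny LRU read cache standing in for a flash cache in front of HDDs."""

    def __init__(self, capacity_pages):
        self.capacity = capacity_pages
        self.pages = OrderedDict()  # page_id -> True, oldest entry first

    def read(self, page_id):
        """Return True on a cache hit; on a miss, admit the page and evict LRU."""
        if page_id in self.pages:
            self.pages.move_to_end(page_id)  # mark as most recently used
            return True
        self.pages[page_id] = True
        if len(self.pages) > self.capacity:
            self.pages.popitem(last=False)  # evict the least recently used page
        return False


def make_trace(n_hot=10, rounds=30):
    """Each round reads every hot page once, then one never-repeated cold page."""
    trace = []
    for r in range(rounds):
        trace.extend(range(n_hot))
        trace.append(1_000_000 + r)
    return trace


def hit_ratio_per_window(trace, capacity_pages, window):
    """Fraction of reads served from the cache in each consecutive window."""
    cache = FlashCache(capacity_pages)
    ratios = []
    for start in range(0, len(trace), window):
        chunk = trace[start:start + window]
        hits = sum(cache.read(page) for page in chunk)
        ratios.append(hits / len(chunk))
    return ratios


# Cache larger than the hot set: cold at first, then steadily high hit ratios.
warm = hit_ratio_per_window(make_trace(), capacity_pages=16, window=11)
# Cache smaller than the hot set: pages are evicted before they are reused.
too_small = hit_ratio_per_window(make_trace(), capacity_pages=5, window=11)
```

With this trace, the first window of the larger cache has no hits because nothing has been cached yet, and every later window serves all ten hot-page reads from the cache. The undersized cache never gets a hit, which is one reason cache size should be chosen relative to the size of the hot zones rather than the size of the whole database.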
There are two major use cases that require a SQL Server environment to use flash caching:

When user databases are larger than the available flash resources

In this use case, flash caching can accelerate performance with smaller amounts of flash, provided the caching algorithm can select the right data to place on flash from the larger databases on the HDD volumes. Caching selection algorithms that are tailored to provide the highest benefit to a particular data access pattern are called optimized caching policies. It is useful to differentiate between transactional and analytical loads when caching data, as their access patterns are very different and each requires its own optimized caching policy.

When enterprise environments use legacy SAN backup and High Availability functions

In this use case, it is important to use write-through caching, as it ensures that the SAN always contains the latest data and is safe for backup and High Availability (HA) purposes. When an enterprise environment depends on its legacy SAN for backup and HA, caching provides a way to benefit from flash while retaining all current procedures.

While all-flash volumes provide the fastest performance, flash caching lowers CAPEX and OPEX by combining smaller amounts of flash with less costly HDDs. Selecting a flash solution that can easily shift between flash volumes and flash caching, or that uses both modes concurrently for different databases, will future-proof the SQL Server environment for data growth.

7 What kinds of SQL Server data should I place on flash?

SQL Server stores its databases in files, of which the two most common are the main database file and the log file. TempDB files represent a third type of database file and are discussed in Sections 3 and 6.

Log Files

These files are write-intensive, with heavy reading occurring only when transactions need to be reconstructed due to an external event such as a power failure.
For this reason, write-through caching (which accelerates reads but not writes) is not effective for use with log files. To accelerate log files, place them on a flash volume, but make sure to maintain High Availability (HA) on this volume (e.g. with mirroring) so that the log file will not be lost if a flash failure occurs.

Main Database Files

The main database files are usually more balanced in terms of reads and writes, with actual read and write ratios dependent on the workload. When SQL Server requires a page from the database, it can load it into memory much faster if that page is on flash rather than on an HDD. As a performance guideline, the higher the probability that a page will be in flash when it is needed by SQL Server, the better SQL Server performance will be. This is typically measured by a metric called the hit ratio, or the percentage of time a request from SQL Server can be
answered by data in flash rather than SQL Server having to fetch the data from HDD storage.

To improve hit ratios, the caching software needs to statistically process the data in real time and intelligently select whether specific data elements are worth caching. However, the more analysis the software performs in real time, the higher the interference incurred on the data path, resulting in the classic data path design dilemma:

• If too much time is spent on deciding whether to cache a data element as it flows through the data path, data access to SSD flash can be slowed down.
• If too little time is spent on deciding whether to cache a data element as it flows through the data path, it’s possible that the data cached will be useless to the application, or even worse, critical data can be flushed out of the cache.

Caching performance can be improved when a system uses out-of-band processing to select what data to cache.

8 What capabilities verify data will be available on flash for SQL Server?

Enterprise applications such as SQL Server are extremely sensitive to cache policy optimization, as they dynamically handle large amounts of data of constantly shifting importance. Data that is critical to cache at one point in time may be useless at another, and the selection of the best data to cache at each point in time is highly dependent on current access statistics. To assure high hit ratios, there are four basic methods to select data to be cached on flash.

Hot Zone Detection

A mechanism that pinpoints frequently accessed data locations in SSD flash volumes to help determine what data to cache and its relevance. The key to accelerating SQL Server is figuring out what data is important and worth caching so that the data on SSD flash is quickly accessible and relevant to the needs of the application. This method relies on optimized caching policies to determine the data to cache based on how ‘hot’ the data is.
A smart caching mechanism can detect where indexes are located within a file by monitoring the number of accesses to a zone. The located hot zones can then be used to determine what data is worth storing on the flash.

Sequentiality Detection

This data cache selection mechanism differentiates between relevant and irrelevant data access patterns and can filter out background processing tasks (such as error checking) to prevent irrelevant data from entering the cache. On the other hand, important sequential patterns (such as index creation) may be marked for special handling in the cache.
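A minimal sketch of how hot zone detection and sequentiality detection could work together is shown below. It is an illustration under stated assumptions rather than any product's algorithm: the zone size, the run length that marks a sequential scan, and the hot threshold are arbitrary values, and `split_runs` and `hot_zones` are hypothetical names. The sketch only filters long sequential runs out of the access counts; it does not model the special handling of important sequential patterns mentioned above.

```python
from collections import Counter

ZONE_SIZE = 1024     # blocks per zone (assumed value)
SEQ_RUN_LIMIT = 8    # runs at least this long count as sequential scans (assumed)
HOT_THRESHOLD = 3    # accesses needed before a zone is considered hot (assumed)


def split_runs(blocks):
    """Group a block-access trace into runs of consecutive block numbers."""
    runs = []
    for block in blocks:
        if runs and block == runs[-1][-1] + 1:
            runs[-1].append(block)
        else:
            runs.append([block])
    return runs


def hot_zones(blocks, zone_size=ZONE_SIZE, seq_run_limit=SEQ_RUN_LIMIT,
              hot_threshold=HOT_THRESHOLD):
    """Return the zones whose non-sequential access count reaches the threshold."""
    counts = Counter()
    for run in split_runs(blocks):
        if len(run) >= seq_run_limit:
            continue  # long sequential scan: do not let it heat up a zone
        for block in run:
            counts[block // zone_size] += 1
    return {zone for zone, n in counts.items() if n >= hot_threshold}


# Scattered lookups concentrated in zone 0, followed by a 100-block scan in zone 9.
trace = [5, 900, 17, 300, 5, 2051, 640] + list(range(10000, 10100))

with_filter = hot_zones(trace)
without_filter = hot_zones(trace, seq_run_limit=10**9)  # filter effectively disabled
```

With the filter in place only zone 0 is reported as hot. With the filter disabled, the background scan makes zone 9 look hot as well, which is the kind of cache pollution sequentiality detection is meant to prevent.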
Command-size Inspection

Commands are differentiated by their size, which reflects the amount of data that needs to be passed to or from an SSD. Inspecting the command sizes generated by SQL Server makes it possible to differentiate between different types of data usage.

Cache Analysis and Warm-up

While hot zone detection is effective for transactional processing throughout the day, analytical processing requires a different method for selecting data to cache. Analytical processing often involves a periodic batch process that requires accessing specific types of data. For example, a SQL Server instance may be collecting sales data throughout the day, but at night, a report generation process may be collecting data from different sources to help develop a complete sales report comparing the day’s sales activities to previous periods. In this scenario, hot zone detection will not help because the data being accessed is not the same data that was hot during the day.

Achieving high hit ratios for analytics requires data access analysis and a cache pre-warming mechanism. Data access analysis monitors the data being accessed during a certain period of the day (in the example above, it would monitor what data is accessed for report generation). After the required data has been analyzed, the pre-warming mechanism fetches this data just before the report is generated, ensuring that the data is on flash at the precise time SQL Server needs it.

9 What parameters should I consider if my SQL Server instances are virtualized?

Whether or not your SQL Server implementation is virtualized today, at some point in the future, some or all of the SQL Server instances may need to move to a virtualized environment. Selecting a flash acceleration solution that supports both virtualized and non-virtualized environments provides the flexibility to deploy SQL Server in either mode.
Accelerating a virtualized SQL Server instance is similar in many respects to accelerating a physical SQL Server instance; however, there are a few additional considerations that must be taken into account when deploying flash into a virtualized environment.

Flash Resource Sharing

If the virtual environment includes multiple SQL Server virtual machines (VMs) running concurrently, as well as other application VMs that can also benefit from flash resources, look for a solution that enables intelligent and efficient distribution of flash resources between all connected VMs. Additionally, when the solution allows you to virtualize flash as a highly available network-exposed resource, the flash can be shared among all VMs in the cluster regardless of location, ensuring that no VM inefficiently occupies flash that could be better used elsewhere in the environment.
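
One simple way to picture the distribution of flash between VMs is a proportional split. The Python sketch below is an assumption made for illustration, not a description of how any specific product allocates flash: it takes a hypothetical per-VM demand figure (for example, an observed working set in GB) and divides the available capacity in proportion to that demand, never giving a VM more than it asked for. The function name `share_flash` and the VM names are invented for the example.

```python
def share_flash(total_gb, demand_gb):
    """Split total_gb of flash across VMs in proportion to their demand.

    demand_gb maps a VM name to its (hypothetical) flash demand in GB.
    If everything fits, each VM simply receives its full demand.
    Otherwise each VM receives a proportional share, rounded down to whole GB.
    """
    total_demand = sum(demand_gb.values())
    if total_demand <= total_gb:
        return dict(demand_gb)
    return {vm: total_gb * demand // total_demand for vm, demand in demand_gb.items()}


# Illustrative cluster: 800 GB of shared flash, 1200 GB of combined demand.
allocation = share_flash(800, {"sql-oltp": 600, "sql-reporting": 300, "web-app": 300})
print(allocation)  # prints {'sql-oltp': 400, 'sql-reporting': 200, 'web-app': 200}
```

Re-running the split whenever demand changes is what keeps an idle VM from holding on to flash that a busier VM could use; a production policy would likely add priorities and minimum guarantees.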
High Availability Support

One of the main reasons for deploying virtualization is the ability to provide High Availability (HA) should a server running the SQL Server virtual machine (VM) go offline for any reason. Therefore, look for flash acceleration solutions that either support SQL Server’s native AlwaysOn high availability technology when deployed virtually or support a virtualization platform’s HA functionality, such as VMware High Availability.

Virtual Machine Migration

While virtual machines (VMs) can migrate between physical servers in a cluster, flash is a physical resource that resides in a particular location in the enterprise. If the chosen solution does not allow exposing flash over the network, a VM that is migrated to a new server will lose connectivity to its flash cache, and the application will experience a sharp drop in performance until data is reloaded over time into the new cache. Therefore, look for a flash solution that enables the dynamic migration of a VM from one storage system to another, with no VM downtime, service disruption or loss of cache to end-users.

Remote Flash Access

In some cases, SQL Server will run on one server while the flash resources are located remotely on either a different commodity server or a storage appliance. Therefore, look for a flash solution that provides a remote connection from SQL Server to its flash resources and can relocate SQL Server files to remote flash volumes, remote flash caching or a combination of both, enabling SQL Server virtual machines (VMs) to continue enjoying the benefits of flash acceleration even when migrated between servers.

10 What native flash functionality does SQL Server 2014 provide?

Recognizing the importance of flash for performance acceleration of SQL Server environments, Microsoft has added Buffer Pool Extension (BPE) support for flash in the SQL Server 2014 release.
Buffer Pool Extensions allow SQL Server users to define a file residing on flash so that the SQL Server instance can extend its memory buffer to flash. SQL Server uses the flash file to store clean buffer pages that it has no room for in memory. If these pages are needed by SQL Server again, they can be fetched faster by loading them directly from the flash buffer pool extension rather than from the database files.

While it is important to make sure that the flash acceleration solution supports SQL Server 2014 Buffer Pool Extensions, it is also important to select a solution that extends and enhances the capabilities provided by BPE. For example, native BPE support can be activated at the SQL Server instance level but cannot differentiate between the databases running in the environment. Therefore, to make sure that the flash is used to accelerate a key database critical for performance (without the flash being hogged by a less important but larger database), look for a solution that can differentiate between databases and accelerate at a per-database level.
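
For readers who want to see what defining the flash file looks like in practice, the Python sketch below builds the T-SQL statement that enables the native buffer pool extension, `ALTER SERVER CONFIGURATION SET BUFFER POOL EXTENSION ON`. The helper name `bpe_enable_statement`, the file path and the 50 GB size are placeholders chosen for illustration; appropriate values depend on the server's memory and flash layout. Note that the statement takes no database argument, which reflects the instance-level scope described above.

```python
def bpe_enable_statement(file_path, size_gb):
    """Build the T-SQL statement that enables the buffer pool extension.

    file_path is the extension file on the flash device and size_gb its size in GB.
    Both values are supplied by the caller; nothing here is a recommended setting.
    """
    if size_gb <= 0:
        raise ValueError("size_gb must be positive")
    escaped_path = file_path.replace("'", "''")  # escape quotes for a T-SQL string literal
    return (
        "ALTER SERVER CONFIGURATION SET BUFFER POOL EXTENSION ON "
        f"(FILENAME = '{escaped_path}', SIZE = {size_gb} GB);"
    )


# Statement that turns the extension off again.
BPE_DISABLE_STATEMENT = "ALTER SERVER CONFIGURATION SET BUFFER POOL EXTENSION OFF;"

# Placeholder path and size, for illustration only.
statement = bpe_enable_statement(r"F:\SSDCACHE\Example.BPE", 50)
print(statement)
```

The generated statement would be run by an administrator through a SQL Server client; the current configuration can then be inspected with the `sys.dm_os_buffer_pool_extension_configuration` dynamic management view.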
Analytical loads are another key area in which the Microsoft BPE functionality can be extended by acceleration software. The BPE mechanism relies on data being accessed before it is loaded into the flash buffer extension. Therefore, if the data is only accessed once each night (e.g. for nightly report generation), it will most likely not be in the flash when SQL Server requires it. If analytics are used with such recurring processes, look for a solution that provides analysis and warm-up scheduling to assure the data is loaded into the flash just before it is needed by the application.

11 How can flash improve SQL Server ETL processes?

In database management, Extract/Transform/Load (ETL) refers to three separate functions that are combined into a single programming tool. ETL can be used to acquire a temporary subset of data for reports or other purposes, to build a more permanent data set that populates a data mart or data warehouse, to convert from one database type to another, or to migrate data from one database or platform to another. Each function has a distinct role:

• Extract: reads data from a specified source database and extracts a desired subset of data
• Transform: works with the acquired data, using rules or lookup tables, to convert it to the desired state
• Load: writes the resulting data (either all of the subset or just the changes) to a target database, which may or may not previously exist

Depending on the size of the data and database files, the ETL process can be long and time consuming. Look for a flash solution that has the ability to partition its flash for use as a flash volume, flash cache or a hybrid of both. The ability to partition flash provides an optimized solution where the write log and tempDB data files benefit from high flash performance while hot areas of the database are flash cached for immediate use by SQL Server.
The result enables all SQL Server data types to be optimized and accelerated. A flash solution that enables write-through caching functionality will allow you to load that data into flash as it is being loaded into an underlying database. When used as a mirrored volume for the ETL process, the most current data will already be cached on flash in the secondary replica, significantly reducing ETL run times.
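
The write-through behavior described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions: plain dictionaries stand in for the underlying database on HDD and for the flash cache, cache capacity is unlimited, and the class name `WriteThroughCache` and the row keys are hypothetical.

```python
class WriteThroughCache:
    """Toy write-through cache: every write lands in the backing store and in flash."""

    def __init__(self, backing_store):
        self.backing = backing_store  # stands in for the HDD-resident database
        self.flash = {}               # stands in for the flash cache

    def write(self, key, value):
        """Write to the backing store and populate the flash cache in the same step."""
        self.backing[key] = value
        self.flash[key] = value

    def read(self, key):
        """Return (value, source), where source is 'flash' on a hit and 'hdd' on a miss."""
        if key in self.flash:
            return self.flash[key], "flash"
        value = self.backing[key]
        self.flash[key] = value  # cache the value for later reads
        return value, "hdd"


# A row that existed before the cache was attached is served from HDD the first time.
cache = WriteThroughCache({"row-0": "legacy"})
first_read = cache.read("row-0")

# Rows written during the load phase are already on flash when they are read back.
cache.write("row-1", "loaded by ETL")
second_read = cache.read("row-1")
print(first_read, second_read)
```

The point of the sketch is the `write` method: because the load step populates flash as a side effect, a job that reads the freshly loaded data afterwards does not have to wait for the cache to warm up.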
12 How can implementation wizards affect the success of my SQL Server acceleration project?

Many DBAs are experts in database administration but are less experienced in deploying flash into their enterprise environments, so simplifying the DBA’s life through best practice guidelines for flash implementation is a benefit. Built on a simple and intuitive graphical user interface (GUI), implementation wizards are very helpful in guiding DBAs through a flash-based storage deployment. Adding best practice models of the flash-based resources can also provide efficient acceleration and quick plug-and-play set-up in existing SQL Server deployments. Look for flash solutions with intuitive GUI management wizards that can:

• Divide the flash resource into a volume section and a cache section
• Advise on what data and workloads to place on the flash volume or on flash cache
• Provide a list of the database volumes so that optimized policies can be used whether the workload is transactional or analytical
• Provide instructions on how to pre-warm the cache in advance of demanding and critical jobs
13 Flash Buyer’s Checklist

| Category | Functionality | Solution A | Solution B |
|---|---|---|---|
| Supported Form Factors | In-Server Installation (PCIe, SAS/SATA) | | |
| Supported Form Factors | External Appliance Support | | |
| Performance | Random Read IOPS at 8K | | |
| Performance | Random Write IOPS at 8K | | |
| Performance | Max Read Bandwidth | | |
| Performance | Max Write Bandwidth | | |
| Performance | Benchmark Performance (TPC-C, TPC-H, etc.) | | |
| Performance | Performance under my test load | | |
| Usage Mode | Flash Volume Support | | |
| Usage Mode | Flash Cache Support | | |
| Usage Mode | Flexible concurrent usage | | |
| Caching Functionality | Caching Policies for OLTP and analytics | | |
| Caching Functionality | Data Access Analysis | | |
| Caching Functionality | Cache Warm-up Scheduling | | |
| Caching Functionality | Database Level Acceleration | | |
| Virtualization Support | Flash Resource Sharing | | |
| Virtualization Support | High Availability Support | | |
| Virtualization Support | VM Migration and Remote Caching | | |
| SQL Server 2014 | Support BPE | | |
| SQL Server 2014 | Flash BPE Enhancements | | |
| Deployment and Management | Implementation Wizards | | |
| Deployment and Management | SQL Server Management Integration | | |
| Deployment and Management | Central Management for multiple servers | | |
14 Introducing OCZ’s ZD-XL SQL Accelerator

ZD-XL SQL Accelerator is a tightly integrated hardware/software, plug-and-play acceleration solution optimized for Microsoft SQL Server applications. It leverages OCZ’s industry-proven PCIe SSD hardware and application-tuned software to deliver low latency flash that resolves potential SQL Server bottleneck issues, enabling the flash to be deployed as a local flash volume, a flash cache for HDD volumes, or a combination of both. The solution combines fast flash performance, a unique cache mechanism that makes advanced and statistically-optimized decisions on what data to cache, and a dynamic cache warm-up scheduler that enables workloads to be placed on flash cache in advance of demanding and critical jobs.

This advanced PCIe accelerator card uses implementation wizards and step-by-step instructions to guide database administrators (DBAs) through the deployment process, enabling best practice models of its flash-based resources to be set up simply and quickly for efficient SQL Server acceleration. The intuitive GUI also divides the ZD-XL SQL Accelerator SSD resource into a volume section and a cache section, advising DBAs on what data and workloads to place on flash. It also provides a list of database volumes, enabling DBAs to simply select the optimized policy to use on each workload, and instructs them on how to pre-warm the cache in advance.

Additionally, ZD-XL SQL Accelerator provides complete High Availability (HA) via Microsoft AlwaysOn Technology, so not only can SQL Server environments function at the speed of flash but, in the event of planned or unplanned downtime, they can continue operations from the stopping point, retaining all data as if no downtime had occurred. With this level of functionality and performance, OCZ’s initial ZD-XL SQL Accelerator release earned prominent reviews as well as the 2013 Best of Interop® award in the Data Center & Storage category.
In support of SQL Server 2014, ZD-XL SQL Accelerator enables its flash resources and associated management capabilities to connect directly to the application, providing the utmost in application management integration. This tight integration of flash and application management enables ZD-XL SQL Accelerator to accelerate the application at a per-database level rather than having to accelerate all of the databases in the SQL Server instance.

SQL Server can run on blade servers or on certain rack-mounted servers in which the PCIe form factor does not fit, making flash acceleration more difficult. OCZ enables flash acceleration for these instances by running ZD-XL SQL Accelerator software next to the database application on the blade or rack-mounted server while the SSD flash is located remotely on either a commodity server or a storage appliance. As a result, ZD-XL SQL Accelerator software provides a direct remote connection from SQL Server to its flash resources.

Many enterprises benefit from the capability of running SQL Server in a virtualized environment. For these applications, ZD-XL SQL Accelerator supports both VMware ESXi and Microsoft Hyper-V hypervisors, enabling its flash resources to be deployed exactly to the needs of VMs while retaining the application connection in the virtualized environment.
Contact us for more information

OCZ Storage Solutions
6373 San Ignacio Avenue
San Jose, CA 95119 USA
P 408.733.8400
E sales@oczenterprise.com
W ocz.com/enterprise

Disclaimer

OCZ may make changes to specifications and product descriptions at any time, without notice. The information presented in this document is for informational purposes only and may contain technical inaccuracies, omissions and typographical errors. Any performance tests and ratings are measured using systems that reflect the approximate performance of OCZ products as measured by those tests. Any differences in software or hardware configuration may affect actual performance, and OCZ does not control the design or implementation of third party benchmarks or websites referenced in this document. The information contained herein is subject to change and may be rendered inaccurate for many reasons, including but not limited to any changes in product and/or roadmap, component and hardware revision changes, new model and/or product releases, software changes, firmware changes, or the like. OCZ assumes no obligation to update or otherwise correct or revise this information.

OCZ MAKES NO REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE CONTENTS HEREOF AND ASSUMES NO RESPONSIBILITY FOR ANY INACCURACIES, ERRORS OR OMISSIONS THAT MAY APPEAR IN THIS INFORMATION. OCZ SPECIFICALLY DISCLAIMS ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE. IN NO EVENT WILL OCZ BE LIABLE TO ANY PERSON FOR ANY DIRECT, INDIRECT, SPECIAL OR OTHER CONSEQUENTIAL DAMAGES ARISING FROM THE USE OF ANY INFORMATION CONTAINED HEREIN, EVEN IF OCZ IS EXPRESSLY ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.

ATTRIBUTION

© 2014 OCZ Storage Solutions, Inc. – A Toshiba Group Company. All rights reserved. OCZ, the OCZ logo, OCZ XXXX, OCZ XXXXX, [Product name] and combinations thereof, are trademarks of OCZ Storage Solutions, Inc. – A Toshiba Group Company.
All other product names and logos are for reference only and may be trademarks of their respective owners.