Migrating to Snowflake? Here's What You Need to Test - Tricentis
Migrating legacy data (e.g., from IBM Netezza, Oracle, MSSQL, or PostgreSQL) to Snowflake is not a simple “lift and shift.” It can’t happen all at once, and it must be tested to ensure that the tens of thousands of data reports continue to operate properly on the new model. Organizations must ensure the data is moved efficiently by performing extensive validation and reconciliation across the old and new worlds.

The current approach to migrating and then continuously testing Snowflake relies on manual tests that create a dependency matrix. This process is error-prone, so many iterations are usually required. As a result, timelines can easily slip from days or weeks to months, adding considerable delays and costs to the migration project.

This paper outlines the top challenges that enterprise organizations typically encounter during a Snowflake migration. For each challenge, we briefly explain how Tricentis Data Integrity has been used to address it. We conclude with a proactive approach to eliminating data integrity issues before, during, and after Snowflake migration.

Challenges of Snowflake Migration

No database migration is simple. Gartner reported that 83% of data migration projects either fail to meet budget and schedule expectations or fail altogether. Migrating workloads from on-prem solutions to cloud databases is even more complex. Forrester estimates that an average Snowflake migration requires full-time involvement from 3 DBAs/IT staff for 6 months, plus considerable consulting time.

Here are the top Snowflake migration challenges we’ve encountered at customer and prospect sites, along with ways to address them.

Data must be migrated incrementally

Snowflake is so popular that it limits the transactions to its system. You can be limited to transferring only 10 GB a day unless you get special permission. For a 5-terabyte system, that could mean 500 days, and that is only if you do it perfectly.
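The 500-day figure above can be reproduced with a few lines of Python. This is only a back-of-the-envelope sketch: the 10 GB daily limit and 5 TB size are the numbers quoted in this paper, 1 TB is taken as 1,000 GB, and the function name and the `rework_percent` parameter (the share of data that has to be re-sent after failed or incorrect transfers) are hypothetical additions for illustration.

```python
import math


def estimated_transfer_days(total_gb, daily_limit_gb, rework_percent=0):
    """Estimate how many days a transfer takes under a daily volume cap.

    rework_percent is a hypothetical allowance for data that must be
    re-sent because a transfer failed validation (0 = a perfect run).
    """
    if total_gb < 0 or daily_limit_gb <= 0 or rework_percent < 0:
        raise ValueError("sizes and percentages must be non-negative, limit positive")
    effective_gb = total_gb * (100 + rework_percent) / 100
    return math.ceil(effective_gb / daily_limit_gb)


# 5 TB (taken as 5,000 GB) at 10 GB per day, with no rework
print(estimated_transfer_days(5 * 1000, 10))  # prints 500

# The same system if 20% of the data has to be transferred again
print(estimated_transfer_days(5 * 1000, 10, rework_percent=20))  # prints 600
```

Even a modest rework rate adds months at this pace, which is why catching errors on each transfer matters.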
To avoid drawing out an already lengthy process, automatically reconcile and validate each transfer as it happens. Tricentis Data Integrity verifies that the data moves efficiently and accurately. Automated reconciliation tests provide instant insight into which transformation requirements have been tested and whether those tests succeeded or failed.

Organizations don’t want to move and store bad data

Since Snowflake charges per terabyte, it’s in your best interest to clean out “garbage data” and duplicate records before moving data over. However, few organizations have the time or resources to do this at any point, much less when they’re preparing for a massive migration project. Automatically generated tests that expose data errors will not only save time, but also enable a much more thorough and accurate inspection than manual efforts ever could. Tricentis Data Integrity’s “pre-screening” tests expose data that isn’t fit for migration. For instance, they find missing values, duplicates, data format issues, data beyond the acceptable range, etc.

Migrating workloads is tedious and error-prone

Moving workloads from legacy environments to Snowflake is an error-prone activity with a high risk of disrupting business as usual. Migrating code, business logic, and analytics jobs each comes with its own set of challenges. For example, each workload needs an exact equivalent on the target that meets the production performance SLAs. To achieve this, enterprises must perform all the following steps before putting new workloads into production:

1. Thoroughly assess the existing inventory of workloads to identify the chain of workloads to be moved.
2. Match the source and target data.
3. Convert scripts, business logic, reporting logic, etc.
4. Validate the migrated logic.

Tricentis Data Integrity is used to validate the migrated logic. Typically, we find that 60% of the legacy data workloads can be migrated as-is, 20% of workloads might require some additional optimization, and 20% of workloads require total re-engineering. In all cases, testing can be automated with our end-to-end suite of data integrity tests. These tests span from pre-screening, to vital checks for consistency and correctness, through any data transformations, and finally to the analytics and report checks that verify the process was completed correctly.

Data processes are deeply embedded

With RDBMS, existing ETL pipelines push data to legacy warehouses, customized visualization tools pull data out of those warehouses, and custom applications also depend closely on data from the warehouse.
When you move to Snowflake, all these processes must be re-engineered and tested. Tricentis Data Integrity can effectively handle the reconciliation and validation required to make the migration risk-free.
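The pre-screening and reconciliation checks described in this section can be sketched in plain Python. This is a minimal illustration under stated assumptions, not how Tricentis Data Integrity is implemented: the column names (`id`, `order_date`, `amount`), the ISO date rule, the accepted amount range, and all function names are hypothetical, and rows are modeled as simple dictionaries rather than read from a real source or from Snowflake.

```python
import hashlib
import re
from collections import Counter

# Hypothetical format rule, chosen only for illustration.
ISO_DATE = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def prescreen(rows, key="id", amount_range=(0, 1_000_000)):
    """Return (row_index, issue) pairs for rows that look unfit for migration."""
    low, high = amount_range
    key_counts = Counter(row.get(key) for row in rows)
    issues = []
    for index, row in enumerate(rows):
        # Missing values: any empty or null field
        if any(value is None or value == "" for value in row.values()):
            issues.append((index, "missing value"))
        # Duplicates: the same key appears on more than one row
        if key_counts[row.get(key)] > 1:
            issues.append((index, "duplicate key"))
        # Format issues: dates that are not YYYY-MM-DD
        order_date = row.get("order_date")
        if isinstance(order_date, str) and order_date and not ISO_DATE.match(order_date):
            issues.append((index, "bad date format"))
        # Range issues: numeric amounts outside the accepted range
        amount = row.get("amount")
        if isinstance(amount, (int, float)) and not low <= amount <= high:
            issues.append((index, "amount out of range"))
    return issues


def fingerprint(rows, key="id"):
    """Row count plus a SHA-256 digest that ignores row and column order.

    Assumes keys are unique and non-null, i.e. the rows passed pre-screening.
    """
    digest = hashlib.sha256()
    for row in sorted(rows, key=lambda r: r[key]):
        digest.update(repr(sorted(row.items())).encode("utf-8"))
    return len(rows), digest.hexdigest()


def reconcile(source_rows, target_rows, key="id"):
    """True when source and target hold the same rows after a transfer."""
    return fingerprint(source_rows, key) == fingerprint(target_rows, key)


legacy = [
    {"id": 1, "order_date": "2021-03-01", "amount": 120.0},
    {"id": 2, "order_date": "03/02/2021", "amount": 75.5},
]
print(prescreen(legacy))                 # issues found before the move
print(reconcile(legacy[:1], legacy[:1]))  # compare one transferred batch
```

Running the pre-screen before each batch and the reconciliation after it keeps errors local to a single transfer. Note that the digest is type-sensitive (`120` and `120.0` hash differently), so a real comparison would first normalize values to the target's data types.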
Rethink Data Testing: Before, During, and After Your Snowflake Migration

Rather than tackle these challenges with a “whack-a-mole” approach, use Snowflake migration as an opportunity to modernize and transform your overall approach to data integrity, just as you’re modernizing and transforming your approach to data management. Tricentis Data Integrity’s end-to-end data reconciliation and validation has helped top organizations unleash the full power and speed of Snowflake.

• Before the migration, take the opportunity to assess the data, identify issues, and fix them so your Snowflake data is streamlined and accurate from the start.
• During the migration, automatically detect unintentional changes from the old data to the new Snowflake stores and processes. This automated regression testing can run throughout the migration period to expose change impacts the moment they are introduced, which is when they are 10X faster to find and fix.
• Once you’re up and running on Snowflake, reuse the same tests to identify when ongoing system modifications compromise your processes and data. These tests are extensible, reusable, and resilient, and you can embed them into the DevOps toolchain of your choice. With this baseline, you can expose unintentional data impacts as soon as they occur.

For a deeper dive into what’s involved in this strategy, including a look at how we approach each step, watch our webinar.

About Tricentis Data Integrity

Tricentis Data Integrity is the industry’s top end-to-end data testing solution for enterprise organizations. Our end-to-end automation covers everything from the integrity of the data fed into your system, to the accuracy of integrations, transformations, and migrations, to verification of report logic and presentation.

Tricentis Data Integrity takes advantage of the unique capabilities of Snowflake.
For example, with Time Travel, we create tests that profile and monitor changes in the data as it enters the process, not at the end when bad data impacts end users in the business units.

What sets Tricentis Data Integrity apart?
• End-to-end: Automates end-to-end data testing covering all reconciliation and validation tasks from sources to stores to reporting and visualizations.
• Any technology: Sits on top of any data landscape, covering structured, unstructured, and message data from any source or technology as well as reports in any analytics tool via UI, API, and PDF.
• Snowflake enrichment: Allows you to create tests utilizing Snowflake’s unique capabilities (such as Time Travel) to pinpoint data regression issues as they happen at the source.
• Accessible automation: Enables Business Analysts, Data Stewards, Data Engineers, etc. to automate testing, replacing spotty “stare and compare” checking as well as complex, unscalable SQL scripting.
• CI/CD integration: Integrates into CI/CD pipelines to ensure frequent application changes don’t inadvertently alter ETL processes and compromise data quality.
• Enterprise grade: Delivers a mature enterprise-grade solution with highly scalable performance and enterprise-grade global support to help you achieve your goals, fast.
• Risk-based: Guides teams to focus limited testing resources on top business risks; reveals whether a release candidate is sufficiently tested and fit for release.

Next Steps

Learn more about how Tricentis can help your organization simplify your migration to Snowflake and ensure that ongoing system modifications in Snowflake don’t compromise data integrity. Contact your organization’s Tricentis representative to schedule a briefing with our data integrity specialists.

Tricentis www.tricentis.com