INTRODUCTION TO K2VIEW FABRIC - A WHITE PAPER BY K2VIEW.
ABSTRACT

Across industries, the amount of data to be managed is growing exponentially, and with it the need for modern, fully distributed and scalable data management systems, often referred to as big data architectures.

As this new market opens, many solutions are emerging that provide fully distributed and scalable systems to manage Big Data. These solutions do answer the volume and administration problems, but they still present some caveats:

- They often require a lot of effort to integrate into an existing mature environment.
- They still use the same type of outdated data representation, namely the relational database model (see details next page).

K2View Fabric is designed to solve today's big data problem while alleviating these caveats, as this white paper will introduce.

K2VIEW FABRIC

K2View Fabric is an innovative and revolutionary database management system, residing on top of Apache Cassandra. It provides an easy, secure and reliable way to consolidate your data and distribute it over your network for high availability.
AT THE HEART OF K2VIEW FABRIC: THE LOGICAL UNIT

K2View Fabric uses a game-changing data model to retrieve and store data: the Logical Unit.

Most database management systems store data based on the type of data being stored (e.g. customer data, financial data, address data, device data). This model translates into very large tables that must be queried using complex joins every time one wants to access business-relevant data (e.g. how many payments has this customer made within the past three months?).

K2View's solutions look at data a different way: storing and retrieving it based on business logic, hence the name Logical Unit. This allows the business to easily design K2View Fabric's base schema based on its needs, as opposed to trying to fit those needs into a pre-defined structure.

Indeed, in K2View Fabric, every business-related object (e.g. Customer, Merchant) is represented by a Logical Unit Type. Each Logical Unit Type is then associated with a schema. This schema defines the relevant input objects associated with one Logical Unit Type. This process is either automated using the K2View Fabric Auto-Discovery module or performed manually using K2View's drag-and-drop style graphical configuration dashboard, K2View Fabric Studio. The result is a business-oriented structure containing tables and objects from as many systems as needed (e.g. for a Customer Logical Unit Type, 3 tables from the CRM system running on MySQL and 5 tables from the billing system residing on Oracle).

This schema is used every time data is accessed in K2View Fabric: using embedded migration (ETL) capabilities, the data is processed, stored and distributed as Logical Unit Instances.

Managing data as these logical, compressed and encrypted mini-databases enables incredible performance, enhanced security, high availability and customizable data synchronization.

As such, the Logical Unit concept is a bridge between scattered, hard-to-maintain data and highly available, business-oriented data.
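To make the Logical Unit idea more concrete, here is a minimal Python sketch of a Logical Unit Type schema and of how one Logical Unit Instance could be assembled from several source systems. It is an illustration only: the class names, the `build_instance` helper and the sample data are hypothetical and are not part of K2View Fabric's actual API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class SourceTable:
    """One input table in a Logical Unit Type schema (hypothetical structure)."""
    system: str      # source system, e.g. "CRM" or "Billing"
    table: str       # table name inside that system
    key_column: str  # column identifying the business object


@dataclass
class LogicalUnitType:
    """Schema listing the source tables that make up one business object."""
    name: str
    sources: List[SourceTable] = field(default_factory=list)


def build_instance(lu_type: LogicalUnitType, instance_id: int,
                   source_data: Dict[Tuple[str, str], List[dict]]) -> Dict[str, List[dict]]:
    """Gather every row that belongs to one business object into one instance."""
    instance = {}
    for src in lu_type.sources:
        rows = source_data.get((src.system, src.table), [])
        instance[src.table] = [r for r in rows if r[src.key_column] == instance_id]
    return instance


# Assumed example: a Customer type fed by a CRM table and a billing table.
customer_type = LogicalUnitType(
    name="Customer",
    sources=[
        SourceTable("CRM", "customers", "customer_id"),
        SourceTable("Billing", "payments", "customer_id"),
    ],
)

source_data = {
    ("CRM", "customers"): [
        {"customer_id": 1, "name": "Alice"},
        {"customer_id": 2, "name": "Bob"},
    ],
    ("Billing", "payments"): [
        {"customer_id": 1, "amount": 30.0},
        {"customer_id": 2, "amount": 12.5},
        {"customer_id": 1, "amount": 45.0},
    ],
}

# One Logical Unit Instance: everything known about customer 1, and nothing else.
customer_1 = build_instance(customer_type, 1, source_data)
```

In this sketch, a question such as "how many payments has this customer made?" becomes a lookup inside `customer_1["payments"]` rather than a join across large shared tables.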
ARCHITECTURE

The diagram below illustrates an overview of K2View Fabric's architecture:

INSIDE K2VIEW FABRIC

- CONFIGURATION: This layer contains the versioned configuration of every Logical Unit Type. This layer is accessed through our administration tools (K2 Admin Manager, K2View Fabric Studio and Web Admin interfaces).
- WEB/DATABASE SERVICES: This layer is used to communicate with user applications, either via direct queries (database services) or via web services.
- AUTHENTICATION ENGINE: This layer manages user access control and restrictions.
- MASKING LAYER: This is an optional layer that allows real-time masking of sensitive data.
- PROCESSING ENGINE: This layer is where every data computation is managed. It uses the principles of massive parallel processing and map-reduce in order to execute operations.
- SMART DATA CONTROLLER: This layer drives the real-time synchronization of data to K2View Fabric.
- ETL LAYER: This layer is K2View Fabric's embedded migration layer, allowing for automated ETL on retrieval.
- ENCRYPTION ENGINE: This layer manages the granular encryption of each data set.
- LU STORAGE MANAGER: This layer compresses data and sends it to the distributed database for storage. K2View Fabric leverages Cassandra as the distributed database. The communication with the distributed database is very straightforward, making K2View Fabric a flexible solution that can be adapted to any other distributed database.
BIG DATA FEATURES

As presented above, K2View Fabric's architecture is built to address the challenges of Big Data. Therefore, it features state-of-the-art capabilities such as:

- In-memory distributed performance
- Linear scalability on commodity hardware
- Consistency, durability and high availability
- Full SQL support and DB standard connectors

This section gives a brief overview of how K2View Fabric provides these features. For more details about K2View Fabric features, please refer to our Technical White Paper.

PERFORMANCE

K2View Fabric's principal performance feature is its inherent Logical Unit representation, which runs every query on a small amount of data: this feature makes K2View Fabric the fastest database on the market. On top of this inherent design, K2View Fabric ensures performance using the following two major principles:

- Every query is executed in-memory.
- For analytics queries running across several Logical Unit Instances, K2View Fabric implements a proprietary map-reduce algorithm that breaks the analytic query down into small jobs distributed across K2View Fabric's nodes.

Every computation is driven by the K2View Fabric processing engine, which allows it to be executed and distributed across any node, thus offering Massive Parallel Processing (MPP).

LINEAR SCALABILITY/LOW TCO

As opposed to many big data solutions offering high-end in-memory performance, K2View Fabric does not require storage of all data in memory or expensive hardware for scaling up performance. Thus K2View Fabric offers a very low Total Cost of Ownership (TCO). It relies on three very simple cornerstones:

- In-memory performance on commodity hardware: only the computations are done in memory; the data is compressed and stored on disk.
- Complete linear scalability: driven by the distributed database.
- Risk-free integration: see details in the next section.

CONSISTENCY, DURABILITY, AVAILABILITY

K2View Fabric ensures full consistency, guaranteed durability and high availability of the data it contains. Consistency is ensured by the Processing Engine of K2View Fabric, using an internal and distributed transaction table to determine if a concurrent transaction is occurring and if the write should be put on hold. Durability and high availability are inherent features of the distributed database layer (Cassandra).

FULL SQL/STANDARD CONNECTORS

The K2View Fabric Processing Engine uses two query methods depending on the type of data on which the query is executed:

- Query on a single Logical Unit Instance (around 95% of overall queries): simple ANSI SQL query.
- Query across Logical Unit Instances (analytics): Map-Reduce engine reproducing the SQL protocol.

Both methods support everything that is supported in ANSI SQL. K2View Fabric also provides proprietary indexing functionality that not only allows indexing for faster performance but also regulates user access.

Finally, K2View Fabric provides full JDBC support, and features connectors to all the most common databases on the market (e.g. Oracle, MySQL, PostgreSQL, Netezza, SQLServer, etc.).
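The two query methods described above can be sketched as follows. This is a simplified Python illustration, not K2View Fabric's implementation: in-memory SQLite databases stand in for Logical Unit Instances, a thread pool stands in for the cluster nodes, and all table, function and customer names are assumptions made for the example.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def make_instance(payments):
    """Create one in-memory mini-database holding a single customer's payments."""
    # check_same_thread=False lets a worker thread query the connection later.
    db = sqlite3.connect(":memory:", check_same_thread=False)
    db.execute("CREATE TABLE payments (amount REAL, paid_on TEXT)")
    db.executemany("INSERT INTO payments VALUES (?, ?)", payments)
    return db


# Assumed sample data: one mini-database per customer.
instances = {
    "customer-1": make_instance([(30.0, "2015-01-10"), (45.0, "2015-02-03")]),
    "customer-2": make_instance([(12.5, "2015-01-22")]),
    "customer-3": make_instance([(80.0, "2015-03-01"), (20.0, "2015-03-15")]),
}


def payment_count(instance_id):
    """Method 1: plain SQL against a single instance, touching only its data."""
    row = instances[instance_id].execute("SELECT COUNT(*) FROM payments").fetchone()
    return row[0]


def map_total(db):
    """Map step: compute a partial result inside one instance."""
    return db.execute("SELECT COALESCE(SUM(amount), 0) FROM payments").fetchone()[0]


def total_across_instances():
    """Method 2: run the map step on every instance in parallel, then reduce."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        partials = list(pool.map(map_total, instances.values()))
    return sum(partials)  # reduce step
```

Calling `payment_count("customer-1")` reads only that customer's mini-database, while `total_across_instances()` splits the analytic query into one small job per instance and combines the partial sums.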
KEY DIFFERENTIATORS

While K2View Fabric offers the best features of big data architectures, it also provides unique functionalities that differentiate it from any other solution on the market, including:

- Embedded ETL/Data Masking
- Embedded Web Services
- Flexible Synchronization
- Row-level security

EMBEDDED ETL/DATA MASKING

K2View's industry-proven ETL capabilities are embedded into K2View Fabric. The principles of the ETL are based on the Logical Unit data representation: by simply defining its schema, K2View Fabric automatically creates a migration path from all sources into a Logical Unit. Any type of enrichment (adding fields, masking fields, etc.) can be applied during this definition. The ETL layer is triggered automatically, if needed, by the Smart Data Controller, alleviating any need for external ETL tools or costly migration projects.

EMBEDDED WEB SERVICES

K2View Fabric offers an out-of-the-box graphical configuration interface to define web services: any function (which can be as simple as a query) can be created and registered as a web service. Once the function is defined, K2View Fabric automatically handles user access, distribution, updates due to schema changes, etc. The gain in time and effort is tremendous compared to traditional database management systems, which require developing, distributing and maintaining a communication layer between them and your applications.

The figure above illustrates the conceptual difference between an integration of a traditional solution (regardless of its architecture) and K2View Fabric. In a traditional solution, multiple complex custom elements must be developed in order to retrieve data from pre-existing systems. K2View Fabric, on the other hand, gets rid of any need for custom upstream or downstream development.
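As an illustration of the enrichment step mentioned under Embedded ETL/Data Masking (adding fields, masking fields), the Python sketch below masks a sensitive field and adds a new one while a row is being loaded. The function names, the masking rule (keep the last four characters) and the sample row are assumptions made for this example; the paper does not specify how K2View Fabric masks data.

```python
def mask_keep_last(value: str, visible: int = 4, fill: str = "*") -> str:
    """Replace all but the last `visible` characters with `fill`."""
    hidden = max(len(value) - visible, 0)
    return fill * hidden + value[hidden:]


def enrich(row: dict, masked_fields: list, extra_fields: dict) -> dict:
    """Return a copy of `row` with the given fields masked and extra fields added."""
    out = dict(row)  # the source row itself is left untouched
    for name in masked_fields:
        if name in out:
            out[name] = mask_keep_last(str(out[name]))
    out.update(extra_fields)
    return out


# Assumed sample row coming from a billing source system.
row = {"customer_id": 1, "card_number": "4111111111111111"}
loaded = enrich(row, masked_fields=["card_number"], extra_fields={"source": "billing"})
```

Applying the rule at load time means the unmasked value never has to be stored in the target instance; only `row` in the source system still holds it.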
FLEXIBLE SYNCHRONIZATION

K2View Fabric's flexible data synchronization features are driven by its Smart Data Controller: any time data is accessed in K2View Fabric, the Smart Data Controller compares the current state of the data in K2View Fabric against the synchronization parameters and updates the data if needed, whether the update is due to a change in the K2View Fabric schema or triggered by one of the synchronization modes described below.

ON-DEMAND SYNC

K2View Fabric allows data synchronization to be triggered by on-demand calls. These calls can be triggered by web services, batch scripts or by directly querying K2View Fabric (administrative mode).

EVENT-BASED SYNC

Alternatively, synchronization can be triggered using the principles of Change Data Capture (CDC). Using this mode, K2View Fabric automatically captures changes in the source systems that are part of its schema.

ALWAYSYNC

K2View Fabric features an intelligent and flexible way to synchronize data: AlwaySync. This mode allows complete granularity over the data that needs to be synchronized with source systems. Using AlwaySync, K2View Fabric allows you to configure what data needs to be refreshed automatically, and how frequently. For each element of the K2View Fabric schema, an AlwaySync timer is set that drives the K2View Fabric synchronization (e.g. if the usage information from the Customer table needs to be updated every 5 minutes, a timer of 5 minutes is set).

ROW-LEVEL SECURITY

K2View Fabric features a proprietary algorithm, the Hierarchical Encryption-Key Schema (HEKS), that allows complete control over your data encryption. It relies on three sets of keys:

- Master Key: Generated during K2View Fabric installation, this is the main key allowing access to every resource of K2View Fabric.
- Type Keys: These keys restrict access at the Logical Unit Type level and are a hash of the Master Key.
- Instance Keys: These keys restrict access at the Logical Unit Instance level and are a hash of their corresponding Type Key.

In the figure above, you can see how HEKS is implemented for two LU Types. Indeed, you can see the following keys:

- 1 Master Key allowing full access
- 2 Type Keys restricting access to 2 different LU Types
- 6 Instance Keys, 3 for each LU Type, restricting access at the LU Instance level

Using this hierarchical encryption, K2View Fabric allows complete control over the stored data and significantly reduces the risk of data leaks: even if one Instance Key were to be compromised, only the data of one instance would be leaked; the data of all other instances would still be safely encrypted.

Therefore, this design makes K2View Fabric the most secure database on the market, essentially rendering massive data breaches impossible.
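The key hierarchy described for HEKS (one Master Key, Type Keys hashed from it, Instance Keys hashed from their Type Key) can be sketched in Python as below. This is not the proprietary HEKS algorithm: the use of HMAC-SHA256, the `derive_key` helper and the way type names and instance identifiers are mixed into the hash are all assumptions chosen to illustrate a one-way, hierarchical derivation.

```python
import hashlib
import hmac
import secrets


def derive_key(parent_key: bytes, label: str) -> bytes:
    """Derive a child key from a parent key and a label (assumed: HMAC-SHA256)."""
    return hmac.new(parent_key, label.encode("utf-8"), hashlib.sha256).digest()


# 1 Master Key, generated once (here: 32 random bytes).
master_key = secrets.token_bytes(32)

# 2 Type Keys, one per Logical Unit Type, each derived from the Master Key.
type_keys = {lu_type: derive_key(master_key, lu_type)
             for lu_type in ("Customer", "Merchant")}

# 6 Instance Keys, 3 per type, each derived from its own Type Key.
instance_keys = {
    (lu_type, instance_id): derive_key(type_keys[lu_type], instance_id)
    for lu_type in type_keys
    for instance_id in ("1", "2", "3")
}
```

Because the derivation is one-way, a holder of the Master Key can recompute any Type Key or Instance Key, whereas someone who obtains a single Instance Key cannot work back to its Type Key or across to sibling instances. This mirrors the containment property the paper claims for a leaked Instance Key.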
SUPPORTED FEATURES

Features compared for Traditional, Big-Data and Fabric solutions:

- No single point of failure (SPoF)
- Consistency, Durability and High Availability
- Low TCO and in-memory performance
- Embedded ETL
- Embedded Data Masking
- Embedded Web-Service layer
- Row-level security

FREQUENTLY ASKED QUESTIONS

What is the main difference between security in K2View Fabric versus a traditional RDBMS?

A traditional RDBMS can't restrict and encrypt access at an instance level. You either have access to the full table containing customer information or you don't. Using K2View Fabric, you can define row-level security.

With such rich synchronization features, how do you ensure performance?

K2View Fabric provides high-end performance by first processing only the data related to one Logical Unit Instance, hence reducing the amount of data. Moreover, the processing layer only executes actions in memory, and maintains a data cache for frequent use. Finally, for processing across Logical Unit Instances, K2View Fabric uses map-reduce to implement fast queries.

What is the difference between fully migrating to K2View Fabric or a traditional RDBMS?

Migration is a feature of K2View Fabric. Migrations to a traditional RDBMS require the development, testing and deployment of a specific migration tool.

How many processing/sync/data storage layers are there in K2View Fabric?

There are as many layers as there are Cassandra nodes in your deployment. This allows for full parallel execution between nodes.
CONFIDENTIALITY

This document contains copyrighted work and proprietary information belonging to K2View. This document and information contained herein are delivered to you as is, and K2View makes no warranty whatsoever as to its accuracy, completeness, fitness for a particular purpose, or use. Any use of the documentation and/or the information contained herein, is at the user's risk, and K2View is not responsible for any direct, indirect, special, incidental, or consequential damages arising out of such use of the documentation. Technical or other inaccuracies, as well as typographical errors, may occur in this Guide.

This document and the information contained herein and any part thereof are confidential and proprietary to K2View. All intellectual property rights (including, without limitation, copyrights, trade secrets, trademarks, etc.) evidenced by or embodied in and/or attached, connected, or related to this Guide, as well as any information contained herein, are and shall be owned solely by K2View. K2View does not convey to you an interest in or to this Guide, to information contained herein, or to its intellectual property rights, but only a personal, limited, fully revocable right to use the Guide solely for reviewing purposes. Unless explicitly set forth otherwise, you may not reproduce by any means any document and/or copyright contained herein.

Information in this Guide is subject to change without notice. Corporate and individual names and data used in examples herein are fictitious unless otherwise noted.

Copyright © 2015 K2View Ltd./K2VIEW LLC. All rights reserved. The following are trademark of K2View: K2View logo, K2View's platform. K2View reserves the right to update this list from time to time. Other company and brand products and service names in this Guide are trademarks or registered trademarks of their respective holders.

CONTACT INFORMATION

www.k2view.com
info@k2view.com
+1-844-438-2443