GEONIS Data Quality Program Overview for Data Quality Optimization - Guidelines - Geocom
Geocom Guidelines
We reserve the right to make technical changes.
© Copyright 2019 by Geocom Informatik AG, Burgdorf / Switzerland
Conception and Design: Geocom Informatik AG, Burgdorf, Switzerland
All rights reserved. This document may not be reproduced or copied in any form, in full or in part, either electronically, photomechanically or mechanically, without the explicit consent of Geocom Informatik AG.
GEONIS is a registered trademark of Geocom Informatik AG.
Table of contents
1. Introduction
2. Mandatory quality characteristics for GEONIS
2.1. GEONIS conventions
2.2. GEONIS system tables / Object relationships (ORM)
2.3. XML definitions incl. overload concept
2.4. Mandatory attributes / Correct network topology
3. Tools for quality assurance
3.1. GDN Studio
3.2. GEONIS DB Modeler
3.3. GDN Database Compare
3.4. GEONIS DB Validation Tool
3.5. GeoDbDiff
3.6. GEONIS DB Update
3.7. GEONIS XML Restruct
3.8. GEONIS Validation
3.9. GEONIS DB Statistic
3.10. Graphical topology check
3.11. Geometric Network Editing (Esri network)
3.12. Esri Topology
3.13. Python
3.14. GEONIS Data Converter / FME
4. Overview of data quality tools
5. Added value of data quality
6. How to enhance data quality
7. Support
8. Internet Sources
GEONIS_Data_Quality_EN.docx
1. Introduction
“Geodata is in all cases simply an incomplete image of a section of the Earth’s surface. It reproduces a particular topic at a particular point in time and for a particular area (What? - When? - Where?)”.
Within a GIS, quality is driven by the application and always targets a certain use. Unlike pure master databases, which among other things should be redundancy-free and consistent, a GIS also depends heavily on topological correctness. The fundamental purpose of topology is to describe the spatial relationships of objects to each other. Besides correct snapping, the placement of vertices is another quality characteristic: using enough but not too many vertices keeps the data volume small.
If the GIS data is based on a complex data model, the consistency of the data model has to be ensured with the help of rules. Missing or falsely linked elements create obsolete data that is difficult to maintain or to clean up.
Complete attribute acquisition provides the GIS with the information needed for in-depth analyses and evaluations. The full potential of a GIS is only exploited when correct topology is combined with high-quality master data.
The quality of a data pool is based on 5 quality characteristics.
[Figure: DATA QUALITY and its five characteristics: Consistency (logical accuracy), Accuracy (spatial and topical), Rightness (accurate data), Timeliness (temporal validity of the data), Completeness (complete record)]
However, it is not only the data itself that matters when operating a GIS; its availability and reliability are important as well.
2. Mandatory quality characteristics for GEONIS
For a successful use of GEONIS, the following quality characteristics are important:
• Compliance with GEONIS conventions
• Correct and complete GEONIS system tables, including object relationships (ORM) and lookup / secondary tables
• Valid XML definitions (incl. overloads) that match the data model exactly
• Correct network topology
• Complete acquisition of mandatory attributes
2.1. GEONIS conventions
Compliance with the GEONIS conventions is mandatory for model expansions and new solutions. This is the only way to ensure that the tools implemented will work properly and that an update can be completed without losing data. Some of the important conventions include:
• Names of feature datasets, tables, and columns
▪ have no more than 30 characters
▪ are written in capital letters
▪ must carry the prefix “U_...” for all user-specific extensions
• Expansion of values in secondary tables → CODE ≥ 1000
• As of GEONIS 5.4, GlobalID is mandatory as the code field (previously OBJECTID)
• The CODE / subtype attribute must be of data type Integer → CODE 0 = unknown
▪ IMPORTANT: Esri subtypes must be consistent with the GEONIS lookup values!
• All updates must be executed in the chronologically correct sequence (GN_VERSION)
All of the GEONIS conventions can be found in the GEONIS Online Help.
2.2. GEONIS system tables / Object relationships (ORM)
The contents of the GEONIS system tables influence the functionality of GEONIS directly. Defective system tables can result in performance losses, erroneous data, system crashes, and in the worst case data loss.
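The naming conventions from section 2.1 lend themselves to a mechanical check. The following is a minimal sketch in plain Python, not a Geocom tool: it assumes the length rule is an upper limit of 30 characters, and the function names `check_name` and `check_user_code` are hypothetical.

```python
MAX_NAME_LENGTH = 30   # assumption: the convention is an upper limit
USER_PREFIX = "U_"     # prefix for user-specific extensions
MIN_USER_CODE = 1000   # user-added secondary-table values start here


def check_name(name, is_user_extension=False):
    """Return a list of convention violations for one table or column name."""
    problems = []
    if len(name) > MAX_NAME_LENGTH:
        problems.append("longer than %d characters" % MAX_NAME_LENGTH)
    if name != name.upper():
        problems.append("not written in capital letters")
    if is_user_extension and not name.startswith(USER_PREFIX):
        problems.append("user extension without U_ prefix")
    return problems


def check_user_code(code):
    """User-added values in secondary tables must use an Integer CODE >= 1000."""
    return isinstance(code, int) and code >= MIN_USER_CODE
```

An empty list from `check_name` means that no violation of the three rules covered here was found; the remaining conventions (GlobalID, update sequence) are not covered by this sketch.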
The following must be noted (see also System tables of the GEONIS conventions, GDN login required):
• GN_LOOKUP complete and updated
▪ All values are fully entered
▪ No redundancies
▪ Each lookup reference has entries in each language
• Secondary tables complete and updated
▪ Code values are defined
▪ Code values are unambiguous
▪ Code values are not NULL
• Allocate the relationship type in GNREL_DEFINITION consistently
▪ No "Undefined" relationships (Overview of relationship types, GDN login required)
▪ All relationships reference existing field and table names
• Formulas in GNREL_RULE and GNREL_FORMULA must be server-compliant
▪ List of deprecated parser functions
• GN_SPLITMERGE_DEF must be complete and filled in correctly for all line objects with relationships
• Module-specific system tables are available and contain all necessary parameters (GNELE_DEFINITION, GNSEW_DEFINITION, GNDH_DEFINITION, etc.)
2.3. XML definitions incl. overload concept
Many functions and configurations in GEONIS are defined using XML files. The advantage of XML configurations is that the solution can be adjusted without programming know-how. This makes users highly flexible and lets them adapt GEONIS to their requirements. This freedom comes with rules that must be followed for the system to run smoothly. Some of the XML conventions associated with GEONIS are provided below:
• Each XML file begins with an XML declaration and is valid against the XSD file in the XSD folder provided
• The version in the XML ideally corresponds with the solution version → version=«2017.0» & ..\xsd_2017\xyz.xsd (usually in the second row of the XML)
• Enhancements to attribute screens, legends, etc. follow the GEONIS overload concept
▪ media: GEONIS standard configurations (ATTENTION: No changes! → They are lost after an update!)
▪ customization: adjustment for each media (customer-specific for each media)
▪ datasources: adjustment for each database (customer-specific for each database)
▪ projects: adjustment for each project (customer-specific for each project)
IMPORTANT: If the same file is overloaded at several overload levels, the last level in the following sequence always prevails:
1. media
2. customization
3. datasources
4. projects
If an XML file is overloaded at the projects level, it is therefore immaterial what the first 3 levels contain for this file.
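The precedence rule of the overload concept can be sketched as a small lookup. This is only an illustration: it assumes a hypothetical layout in which each overload level is a folder directly under a common root and the overloaded file has the same relative path in every level. The real GEONIS directory structure may differ, and `resolve_overload` is not a GEONIS function.

```python
import os

# Overload levels in ascending priority, as listed in the guideline.
OVERLOAD_LEVELS = ["media", "customization", "datasources", "projects"]


def resolve_overload(root, relative_path):
    """Return the path of the file that prevails, or None if it exists nowhere."""
    winner = None
    for level in OVERLOAD_LEVELS:
        candidate = os.path.join(root, level, relative_path)
        if os.path.isfile(candidate):
            winner = candidate  # a later level overrides every earlier one
    return winner
```

Walking the levels in order and keeping the last hit mirrors the rule "the last one always prevails": a file found under projects wins regardless of what the other three levels contain.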
2.4. Mandatory attributes / Correct network topology
GIS data that is as complete and accurate as possible is crucial for wider use and for in-depth analyses. The requirements for the data and its quality differ depending on the queries involved. The objectives pursued with the GIS must be defined in advance in order to know the data requirements. The following questions may help in defining the objectives:
• Which products are being generated?
▪ Plans / Legends: Work, plot, pressure area, year of construction, condition, etc.
▪ WebServices: What data is required for this?
▪ Reports / Statistics: What data is required for this?
▪ Subsequent requests: GEONIS Network Calculation (requires both complete attributes and correct topology)
▪ …
• Which interfaces are being operated?
▪ Interlis: SIA, VSA, LKMap
▪ NEPLAN
▪ Youtility, CableScout
▪ Mike Urban, Dataver
▪ RESEAU
▪ IS-E, SAP
▪ …
In the next step, the acquisition guidelines can define which attributes are mandatory for each object. Use standard values here whenever possible (“unknown” is better than “EMPTY”).
Graphical rules can also be recorded for certain objects / complex structures if useful. This ensures a consistent appearance on the map and helps to avoid distorting assessments unnecessarily (e.g. conduits to the edge or center of the shaft?).
Experience shows that the network topology needs attention in most cases. However, this only matters when network assessments are required, e.g. for
• Fault simulations: Who is affected by a defect?
• Automatic serial letters for maintenance work on the network
• Simulations of load flows, e.g. with GEONIS Network Calculation
• Hydraulic calculations
• NEPLAN calculations
• …
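Why fault simulations depend on correct topology can be illustrated with a small connectivity trace. The sketch below is plain Python and does not reproduce the GEONIS network analysis; the network is reduced to a hypothetical list of node pairs, and `supplied_nodes` and `affected_by_defect` are illustrative names.

```python
from collections import deque


def supplied_nodes(edges, source, broken_edge=None):
    """Breadth-first trace: which nodes are connected to the source?"""
    graph = {}
    for a, b in edges:
        if broken_edge is not None and {a, b} == set(broken_edge):
            continue  # leave out the defective pipe or cable
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


def affected_by_defect(edges, source, broken_edge):
    """Nodes that lose their connection to the source when one edge fails."""
    return supplied_nodes(edges, source) - supplied_nodes(edges, source, broken_edge)
```

If a line is not snapped to its node, the two ends appear as different nodes in `edges`, and everything behind the gap is reported as unsupplied even without a defect. This is the kind of distorted assessment that correct topology prevents.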
3. Tools for quality assurance
Various tools from Geocom, Esri, and third-party service providers help to raise the data quality to a desired level, thereby enabling GEONIS to be used to its full extent. The individual tools are presented briefly in the following sections, with the potential usage area highlighted with respect to data quality.
3.1. GDN Studio
→ Tool overview
GDN Studio allows the GEONIS administrator to adjust existing configurations simply and efficiently and to develop individual specialist applications. The specialist applications configured or enhanced with GDN Studio remain release-proof, i.e. they can continue to be used in full following an update. The configurations can be used on a desktop, on a server, and on mobile devices (GEONIS gear).
GDN Studio is suited to the following tasks with respect to data quality:
• Language-dependent database, lookup and relationship validation
• Adjustment of XML overloads (FRM, VIEW)
3.2. GEONIS DB Modeler
→ Tool overview
GDN DB Modeler is installed with GDN Studio. It provides numerous options for creating, visualizing, and analyzing databases.
Regarding data quality, GDN DB Modeler provides the following functions:
• Detailed view of the data model
• Comparison of two database structures (e.g. user-specific vs. standard)
• Generation of an update script based on the differences between two databases
• Validation of:
▪ Unique table names
▪ Unique column names
▪ Correct table and column names
▪ Column types
3.3. GDN Database Compare
→ Tool overview
GDN Database Compare is installed with GDN Studio. It compares the contents of two databases. Besides the content, the geometry can be compared as well.
3.4. GEONIS DB Validation Tool
→ Tool overview
Compared to GDN Studio, the GEONIS DB Validation Tool provides only the database, lookup and relationship validation. This tool is explicitly available for an update from GEONIS 5.2 to GEONIS 5.4 (ArcGIS 10.2-10.5) and can be requested from your account manager for a fixed time frame only.
The GEONIS DB Validation Tool can be downloaded here (GDN license required).
An advantage of the GEONIS DB Validation Tool in comparison to GDN Studio: Enterprise Geodatabases (SDE connections) can be validated as well.
3.5. GeoDbDiff
→ Tool overview
As of GEONIS 2017, GeoDbDiff is delivered as standard and is available for database comparisons in addition to GDN DB Modeler. Although its functional scope is not comparable with that of GDN DB Modeler, GeoDbDiff does provide an initial overview of the differences between two databases.
GeoDbDiff.exe can be found under C:\Program Files (x86)\Geocom\GEONIS\expert.
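What an "initial overview of the differences between two databases" can look like is sketched below. This is not how GeoDbDiff or GDN DB Modeler work internally. It assumes both schemas have already been read into plain dictionaries of the form {table: {column: type}}, and the function name and the table names in the usage example are hypothetical.

```python
def diff_schemas(standard, custom):
    """Compare two {table: {column: type}} mappings and report differences."""
    report = {
        "missing_tables": sorted(set(standard) - set(custom)),
        "extra_tables": sorted(set(custom) - set(standard)),
        "column_differences": {},
    }
    for table in sorted(set(standard) & set(custom)):
        std_cols, cus_cols = standard[table], custom[table]
        changes = {
            "missing": sorted(set(std_cols) - set(cus_cols)),
            "extra": sorted(set(cus_cols) - set(std_cols)),
            "type_changed": sorted(
                col for col in set(std_cols) & set(cus_cols)
                if std_cols[col] != cus_cols[col]
            ),
        }
        if any(changes.values()):
            report["column_differences"][table] = changes
    return report
```

A possible call, with made-up table names:

```python
standard = {"WAT_PIPE": {"CODE": "Integer", "NAME": "String"}}
custom = {"WAT_PIPE": {"CODE": "String", "NAME": "String", "U_NOTE": "String"}}
report = diff_schemas(standard, custom)
# report["column_differences"]["WAT_PIPE"]["type_changed"] lists "CODE"
```

Columns reported under "extra" that do not start with "U_" would be candidates for a closer look against the GEONIS conventions.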
3.6. GEONIS DB Update
→ Tool overview
GEONIS DB Update is part of the GEONIS Administrator and is provided as standard with GEONIS. In addition to the database updates it is intended for, GEONIS DB Update also enables checks on the data and configuration quality. A few usage options are listed below:
• Executing the update script for the next highest GEONIS version → the error / warning messages in the log provide an indication of quality defects.
• gns_parserfunctions_check.xml → this script checks the DB for deprecated parser functions (download at GEONIS DB Validation)
• Custom data check functions → see GDN Documentation (with GDN login)
3.7. GEONIS XML Restruct
→ Tool overview
GEONIS XML Restruct is a component of GEONIS expert and can be accessed from the Windows start menu as “XML Restruct”. GEONIS XML Restruct primarily provides functions for updating the GEONIS Media directory or individual configuration files. For data quality purposes, it can also review the validity of all XML files used for GEONIS.
Alternatively, the XML files can be validated against the XSD with an XML editor. Notepad++ also offers basic validation, provided the "XML Tools" plug-in is installed.
The current XSD folder is installed with GEONIS expert and can be found under C:\Program Files (x86)\Geocom\GEONIS\xsd_2017.
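A pre-check before full XSD validation can be scripted with the Python standard library. The sketch below only tests whether a file is well-formed and whether its version matches the solution version; it does not validate against the XSD, for which XML Restruct or an XSD-capable editor is still needed. It assumes that the version is stored in a `version` attribute on the root element, which the guideline does not state explicitly, and `check_xml_file` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def check_xml_file(path, expected_version):
    """Return a list of problems found in one XML configuration file."""
    try:
        root = ET.parse(path).getroot()
    except ET.ParseError as exc:
        return ["not well-formed: %s" % exc]
    version = root.get("version")  # assumption: version sits on the root element
    if version is None:
        return ["no version attribute on root element"]
    if version != expected_version:
        return [
            "version %s does not match solution version %s"
            % (version, expected_version)
        ]
    return []
```

Run over all XML files of a configuration folder, such a check quickly flags files that were left behind on an older version after an update.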
3.8. GEONIS Validation
→ Tool overview
GEONIS Validation is designed for reviewing the quality of the database in terms of completeness, plausibility, consistency, and topology. In addition to reviewing the data, validations can also clean up or modify pools of data in accordance with defined rules.
Comprehensive validations and functions are available to users as standard. These can be defined and parameterized in XML, and thus adjusted to individual needs. Further information on the topic of Reviews can be found in the GDN Documentation (with GDN login).
The benefits of GEONIS Validations are as follows:
• Review directly in GEONIS → Tool → Validation
• Checks of versions (sde), selection, visible elements or all elements possible
• Verification on write back of work packages (no write back without “clean” data)
• GEONIS log window with identification and zoom function
• Both attributive and geographical or topological reviews are possible (particularly in the GEONIS ELE media)
The downsides of GEONIS Validations are as follows:
• Parameterization of the validation tasks is limited to the parameters available for the task.
• Ongoing validations block work on the desktop.
• Not batch-enabled.
3.9. GEONIS DB Statistic
→ Tool overview
GEONIS DB Statistic is a component of GEONIS expert and can be accessed from the GEONIS Administrator or directly from the Windows start menu. GEONIS DB Statistic delivers a very fast overview of the
• Number of tables in the database
• Number of objects in the tables
• Number of columns per table
• Number of attributes per column → Degree of acquisition can be deduced as a percentage [%]
• Values used per attribute (Value range → Plausible?)
• Data types used per attribute (String, Integer, Double, etc.)
• Sizes of selection lists → remove expanded and unused values
The degree of acquisition of the attributes per column is of particular interest for data quality purposes. Meaningful statistics, reports, data exports, etc. are only possible following complete attribute acquisition (see also Chapter "Mandatory attributes / Correct network topology").
3.10. Graphical topology check
→ Tool overview
Because topological errors are not obvious, a graphical check is a very practical alternative. The topology can be checked visually using these tools:
• GEONIS network analysis (scope dependent on the medium)
• Esri Utility Network Analyst
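The degree of acquisition mentioned for GEONIS DB Statistic in section 3.9 is simply the share of filled-in values per attribute. A minimal sketch of that calculation, assuming the table rows are available as plain dictionaries and that NULL and empty strings count as "not acquired" (both are assumptions of this example, not a description of the tool):

```python
def degree_of_acquisition(rows, empty_values=(None, "")):
    """Return the percentage of filled-in values per attribute."""
    counts = {}
    for row in rows:
        for attribute, value in row.items():
            filled, total = counts.get(attribute, (0, 0))
            if isinstance(value, str):
                value = value.strip()  # whitespace-only strings count as empty
            counts[attribute] = (filled + (value not in empty_values), total + 1)
    return {a: 100.0 * filled / total for a, (filled, total) in counts.items()}
```

Whether a default such as "unknown" should count as acquired depends on the acquisition guidelines; it can be excluded by passing it in `empty_values`.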
3.11. Geometric Network Editing (Esri network)
→ Tool overview
The tools in the “Geometric Network Editing” toolbar can be used to review, repair, or restore objects in the geometric network. The review looks for geometries that are impermissible or invalid for use in the geometric network.
The ArcMap Help describes what is checked and which errors are most common in a geometric network. More information about the toolbar "Geometric Network Editing" can be found here.
Procedure:
• If it exists, delete the [media]_[detail]_NET_BUILDERR table
• Create new network
When creating the geometric network in ArcGIS, a business table is generally created ([media]_[detail]_NET_BUILDERR), which lists the errors that occurred when building the network. The "problem" geometries can be found using the Class ID, Object ID, and the Error Type.
• Load the project (with standard Work legend)
• Load the toolbar
• Start edit session
• Select network object in Table of Contents
• Display errors with network creation
• Display defective features in the Attribute Editor → zoom in on these and rectify them
NOTE: The AV (cadastral survey) and SEW media have no Esri network
• AV (cadastral survey): Own testing mechanisms by the AV-TOPOMODULE
• SEW: Graphical analysis with GEONIS network analysis
3.12. Esri Topology
→ Tool overview
The Esri Topology offers possibilities to run location-based analyses (e.g. to find coincident features). A topology is an arrangement that constrains how point, line, and polygon features share geometries. Besides the creation and checking of predefined topology rules, ArcGIS offers the possibility of automatic corrections.
An overview of the topology rules provided by Esri can be found under C:\Program Files (x86)\ArcGIS\Desktop10.6\Documentation\topology_rules_poster.pdf.
For GEONIS Sewer System, a step-by-step guide for cleaning up the catchment area topology is provided, which can be adjusted for other purposes.
3.13. Python
→ Tool overview
Esri has fully integrated Python into ArcGIS and considers Python the best language for meeting the requirements of GIS users. Its “ArcPy” site package provides easy access to ArcGIS data and functions. Python works across platforms and can be executed within or outside of ArcMap (e.g. per batch).
Examples of Python scripts for GEONIS can be found on Geocom GitHub
• Geocom Database Management Tools
• MPK export
• TPK export
or in the Geocom Portal (Tools)
• ClearFCnetwork
• GEONIS_net_create
With the help of the ArcPy library, Python is also well suited to data quality reviews.
3.14. GEONIS Data Converter / FME
→ Tool overview
GEONIS Data Converter or FME is also well suited to reviewing data quality. Thanks to FME’s large functional scope, a wide range of data tests can be configured and carried out on the individual data records. Through the GEONIS Data Converter, which establishes the interface between GEONIS and FME, the test results can be displayed in the GEONIS Protocol and then processed as usual.
Some examples in practice are provided below:
• Check of the "from-to-node" in the SEW for the SIA export
• Network logic validation in GEONIS Sewer System
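The idea behind a "from-to-node" check can be sketched in a few lines of plain Python, independent of ArcPy or FME. The field names `id`, `from_node`, and `to_node` are placeholders chosen for this example and do not reflect the GEONIS Sewer System data model; a real check would read the lines and nodes from the database first.

```python
def check_from_to_nodes(lines, node_ids):
    """Return (line_id, problem) tuples for lines with invalid end nodes."""
    known = set(node_ids)
    problems = []
    for line in lines:
        for field in ("from_node", "to_node"):
            if line.get(field) is None:
                problems.append((line["id"], "%s is empty" % field))
            elif line[field] not in known:
                problems.append((line["id"], "%s references unknown node" % field))
        start, end = line.get("from_node"), line.get("to_node")
        if start is not None and start == end:
            problems.append((line["id"], "from_node equals to_node"))
    return problems
```

Returning the line identifier together with the problem text keeps the result easy to feed into a log or protocol from which the affected objects can be located.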
4. Overview of data quality tools
The data quality tools and their usage areas can be compared in this table:
Columns: Tool / Usage | DB comparison | Attributes check | DB, Lookup and relation validation | Obsolete / duplicate entries (Lookup) | XML check | Topology check | Download
*GDN Studio ⚫ ⚫ ⚫ | Geocom Portal – My Software
*GDN DB Modeler ⚫ | Part of GDN Studio
*GDN Database Compare ⚫ ⚫ ⚫ | Part of GDN Studio
*GEONIS DB Validation ⚫ ⚫ | Geocom Portal - Tools
GeoDbDiff ⚫ | Part of GEONIS 2017
GEONIS DB Update ⚫ | Part of GEONIS
GEONIS XML Restruct ⚫ | Part of GEONIS
GEONIS Validation ⚫ ⚫ ⚫ ⚫ | Part of GEONIS
GEONIS DB Statistic ⚫ ⚫ | Part of GEONIS
Graph. Topology Check ⚫ | Part of ArcMap
Geometric Network Editing ⚫ | Part of ArcMap
Esri Topology ⚫ | Part of ArcMap
Python ⚫ ⚫ ⚫ ⚫ ⚫ | Part of ArcMap / Scripts 1)
*GEONIS Data Converter ⚫ ⚫ ⚫ ⚫ | Geocom Portal – My Software / FME files 1)
* Additional license for GEONIS expert required
1) Complete scripts / workbenches or support with creating these can be purchased as a service from Geocom.
5. Added value of data quality
It is very difficult to express the value of a good quality data record in figures. However, in relation to a GIS it can be stated that data quality provides the foundation for information systems. At Geocom we see the following added value; the list is by no means exhaustive:
• Saved time and more effective working
• Competitive advantage through fulfilling quality standards (e.g. SIA standards)
• Simplified collaboration between offices
• Unrestricted GEONIS functionality across all platforms (desktop, server and mobile)
• Less effort with updates (pre / post processing)
• Less effort when exporting data (Interlis, NEPLAN, Youtility…)
• Correct results with assessments (reports) and simulations (network tracking)
• Complex assessments possible across multiple object tables
• Faster implementation of new (GIS) functions and products in a system
• …
6. How to enhance data quality
As mentioned at the beginning, (data) quality is always relative to its usage purpose. This is why users should first of all consider the objectives pursued with the GIS. It is then much easier to deduce the required data quality and the measures to achieve and maintain it.
In addition to the data quality, the system quality required by GEONIS also needs to be considered from a technical point of view so that all functions work flawlessly.
This gives rise to the following general sequence aimed at improving data quality in GEONIS:
1. Ensure system quality
1.1 Rectify incorrect relationships, e.g. unequal data types
1.2 Remove duplicate code values in secondary tables (ATTENTION: obsolete entries!)
1.3 Rectify obsolete entries: re-attribute / delete
1.4 Correct extensions in accordance with GEONIS conventions → comparison with standard
1.5 Correct module-specific system tables and GN_SPLITMERGE_DEF table
1.6 Valid XML definitions
2. Ensure technical data quality
2.1 List of (desired) products in the GIS
2.2 List of the interfaces to be used
2.3 Deduction of required objective attributes based on points 2.1 and 2.2
2.4 Deduction of required topology based on points 2.1 and 2.2
2.5 Creation of acquisition guidelines based on the quality required with 2.3 and 2.4
2.6 Provision of tools aimed at complying with and improving the desired data quality
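Steps 1.2 and 1.3 of the sequence can be illustrated with a small check on one secondary table. The sketch assumes the table has been read into a list of dictionaries with a `CODE` key. It interprets objects whose code has no matching entry in the secondary table as one kind of entry that needs to be re-attributed or deleted; this is an interpretation made for this example, not a definition taken from GEONIS.

```python
from collections import Counter


def check_code_values(rows):
    """Count NULL codes and list duplicate codes in one secondary table."""
    codes = [row.get("CODE") for row in rows]
    counts = Counter(code for code in codes if code is not None)
    return {
        "null_codes": sum(1 for code in codes if code is None),
        "duplicate_codes": sorted(code for code, n in counts.items() if n > 1),
    }


def unresolved_codes(rows, used_codes):
    """Codes used by objects that have no entry in the secondary table."""
    defined = {row["CODE"] for row in rows if row.get("CODE") is not None}
    return sorted(set(used_codes) - defined)
```

Before a duplicate code is removed, it is worth checking which objects still reference it, so that the clean-up itself does not create new unresolved references.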
7. Support
Geocom will be pleased to support its customers in improving the data quality in GEONIS. Each customer can select the level of support individually based on their requirements.
Geocom provides the following services within the scope of data quality improvements in GEONIS:
• Workshops on how to enhance data quality
• Workshops on how to use the above-mentioned tools
• Add-on service "GIS Data Analysis" for data analysis, reporting, and tips on the next steps → ideal preparation for a GEONIS update
• Data clean-ups in close collaboration with the customer → Consulting
Please get in touch with your Geocom contact with any questions or for a no-obligation discussion.
8. Internet Sources
[Status Tuesday, July 2, 2019]
• GIS is it – A blog about GeoKrams (German): Data quality and quality. URL: https://gisisit.wordpress.com
• Round table GIS (German): Geoinformationssysteme - Leitfaden zur Datenqualität für Planungsbüros und Behörden. URL: https://rundertischgis.de
• BINEX – Business Information Excellence (German): Datenqualität messen: Mit 11 Kriterien Datenqualität quantifizieren. URL: https://www.business-information-excellence.de
• Wikipedia (German only): Datenqualität GIS. URL: https://de.wikipedia.org/wiki/Datenqualitaet_(GIS)
• Produktion – Technik und Wirtschaft für die deutsche Industrie (German): In sieben Schritten die Datenqualität verbessern. URL: https://www.produktion.de