Understanding the European Data Portal - High level presentation of the architecture
What we do

More and more data is published every day. The amount of data across the world is increasing exponentially. A substantial amount of this data is collected by the public sector. But for the data to be re-used, it needs to be accessible.

The first official version of the European Data Portal has been available since February 2016. The Portal harvests the metadata available on public data and geospatial portals across European countries. Portals can be national, regional, local or domain specific. They cover the EU Member States, EFTA countries and countries involved in the EU’s neighbourhood policy.

But how does the Portal function? This factsheet provides a summary of the architecture of the European Data Portal.

The architecture

For a better understanding of how the components integrate into the overall architecture, each component’s functionality, as well as its interactions from different perspectives (user/system), is described and published on GitLab.

The figure at the bottom of this page provides a high-level overview diagram of the European Data Portal architecture.

Access to the Portal

Access to the Portal is provided in two ways: a machine-readable API and a human-readable web site (GUI). The API enables its users to search, create, modify and delete metadata on the portal.

The GUI is built on two components: CKAN and DRUPAL. CKAN manages and provides metadata content (datasets) in a central repository. DRUPAL provides the Portal’s Home Page with editorial content (e.g. the Portal’s objectives, articles, news, events, tweets, etc.) and links to a training platform based on the Adapt Framework. In addition, it offers extended functionalities to registered users via user login by ECAS.

[Figure: high-level overview diagram of the European Data Portal architecture. Labels in the diagram include: third-party portals, users/experts, API, Proxy, Portal GUI, graphical visualisation tool, pre-processor, usage statistics and caching, Licensing Assistant, MQA, Helpdesk (JIRA), map.apps, FME (Transformer), Drupal, SOLR, CKAN, Gazetteer, Harvester, backend sync, SPARQL Manager, DCAT-AP, Virtuoso.]
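The four API operations mentioned above (search, create, modify, delete) can be illustrated with a minimal sketch. It assumes that the machine-readable API follows the standard CKAN Action API (`package_search`, `package_create`, `package_patch`, `package_delete`); the factsheet does not confirm this, and the base URL and API key below are hypothetical placeholders. The functions only build the HTTP requests and do not send them.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical base URL: the factsheet does not state the real API endpoint.
BASE_URL = "https://portal.example/data/api/3/action"

# Write actions of the standard CKAN Action API (assumed, not confirmed by the factsheet).
WRITE_ACTIONS = {"package_create", "package_patch", "package_delete"}


def build_search_request(query, rows=10):
    """Build (but do not send) a GET request for CKAN's package_search action."""
    params = urlencode({"q": query, "rows": rows})
    return Request(f"{BASE_URL}/package_search?{params}", method="GET")


def build_write_request(action, payload, api_key):
    """Build (but do not send) a POST request for a create/modify/delete action.

    CKAN expects a JSON body and the caller's API key in the
    Authorization header.
    """
    if action not in WRITE_ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    body = json.dumps(payload).encode("utf-8")
    headers = {"Content-Type": "application/json", "Authorization": api_key}
    return Request(f"{BASE_URL}/{action}", data=body, headers=headers, method="POST")


if __name__ == "__main__":
    search = build_search_request("air quality", rows=5)
    print(search.get_method(), search.full_url)

    delete = build_write_request("package_delete", {"id": "example-dataset"}, "my-api-key")
    print(delete.get_method(), delete.full_url)
```

Passing a request object to `urllib.request.urlopen` would send it; that step is left out here because the endpoint is a placeholder.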
Both systems (CKAN and DRUPAL) are used in a side-by-side architecture. A proxy is responsible for delivering the web pages requested by the user. Both systems are themed with the same Look & Feel, so that users are not aware of which system they are currently browsing.

Multilingualism

The Portal GUI will support all 24 official EU languages for main editorial and main metadata content (using MT@EC). Training content will be available in English and French only. Additional material will be made available in English or in the source language.

Geo-spatial data

Using the map.apps application, geo-spatial data is visualised on geographical maps. The application is a proprietary solution that comes with different tooling and thematic focus, a graphical configuration interface, support for responsive web design and internationalisation files. The application also implements the OSGI specification on the client side (in JavaScript), allowing sharing and re-use of the bundled application logic as well as straightforward maintenance.

Statistical data that is linked to datasets can be visualised in tabular (tables) and graphical (charts) form using a D3.js library.

Searching the Portal

The portal uses the SOLR search engine to search separately for editorial content in DRUPAL and for datasets in the CKAN repository. The GUI includes a Licensing Assistant component that supports the user by providing legal information on the usage of a specific dataset in terms of the licences that apply to the dataset.

The SPARQL Manager component allows the user to enter and run SPARQL queries on the Virtuoso linked data repository. It also allows the logged-in user to store and re-run SPARQL queries, and notifies the user when a query has finished running.

Harvesting Data

On the harvesting side, the portal follows a two-fold architecture too. CKAN is used as the central metadata repository for storing, browsing and searching datasets in a POSTGRES backend relational database.

In order to also support linked data functionality, the CKAN metadata is replicated into a Virtuoso quad store repository via a CKAN synchronisation extension, which ensures that both repositories hold the same set of metadata.

The Harvester is a separate component that is able to harvest data from multiple data sources with different formats and APIs. The Harvester acts as a single point of entry for all metadata that gets harvested, transformed into the CKAN JSON schema and pushed into the CKAN repository.

The Gazetteer component is used by the Harvester to enhance the metadata with geo-spatial data and information (geo-coordinates, names, places, etc.). The Gazetteer is mainly used to improve the search functionality. It uses the FME component as a universal spatial ETL (Extract-Transform-Load) tool that supports accessing, processing and outputting all spatial file/database formats and that is used for harvesting the sources for geographical names.

Enhancing quality

The Portal architecture includes three additional components to enhance the quality of the metadata and the portal. A Helpdesk handles user support requests and feedback.

The Metadata Quality Assistant (MQA) periodically generates reports on the quality of the harvested metadata and creates JIRA tickets for the Helpdesk in case of harvesting issues.

The third component is the monitoring component, based on PIWIK and located at the Proxy in the architecture. In full respect of data privacy, it records requests and user interactions on the portal in order to generate anonymised user traffic statistics that will help enhance the usage of the Portal.
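The harvesting flow described above (harvest a source record, transform it into the CKAN JSON schema, enrich it with Gazetteer information, then push it to CKAN) can be sketched roughly as follows. This is an illustration only, not the Harvester's actual code: the source record layout, the in-memory `GAZETTEER` table and the `spatial_centroid` extra key are hypothetical. The output keys (`name`, `title`, `notes`, `resources`, `extras`) follow the standard CKAN dataset schema.

```python
import re

# Hypothetical stand-in for the Gazetteer: place name -> (latitude, longitude).
GAZETTEER = {"lisbon": (38.72, -9.14)}


def slugify(title):
    """Derive a CKAN-style dataset name: lowercase letters, digits and hyphens."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return slug[:100]  # CKAN limits dataset names to 100 characters


def to_ckan_dataset(record, gazetteer=GAZETTEER):
    """Transform a harvested source record into a CKAN-style dataset dict.

    The input layout (title, description, place, distributions) is an
    assumption made for this sketch.
    """
    dataset = {
        "name": slugify(record["title"]),
        "title": record["title"],
        "notes": record.get("description", ""),
        "resources": [
            {"url": url, "format": fmt}
            for url, fmt in record.get("distributions", [])
        ],
        "extras": [],
    }
    # Enrichment step: attach coordinates when the place is known to the gazetteer.
    place = record.get("place", "").lower()
    if place in gazetteer:
        lat, lon = gazetteer[place]
        dataset["extras"].append({"key": "spatial_centroid", "value": f"{lat},{lon}"})
    return dataset


if __name__ == "__main__":
    harvested = {
        "title": "Air Quality 2015 (Lisbon)",
        "description": "Hourly measurements.",
        "place": "Lisbon",
        "distributions": [("https://example.org/air.csv", "CSV")],
    }
    print(to_ckan_dataset(harvested))
```

In a real pipeline the resulting dictionary would be sent to the repository, and a failed transformation would be the kind of harvesting issue that the MQA reports to the Helpdesk.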
Component: Description

API Access: Machine-readable (SOAP / REST) API
GUI Access: Portal website graphical user interface
CKAN: Portal’s central metadata (dataset) repository
DRUPAL: Portal’s Home Page managing editorial content
ECAS: European Commission Authentication System used for user registration and login in order to provide extended functionalities of the Portal
Adapt Framework: Online platform used for Portal Training Modules (available in EN + FR)
Proxy: Routing of (HTTP(S)) user requests to corresponding components
SOLR Portal Search (editorial data): Search engine used for searching portal editorial content
SOLR Dataset Search (metadata): Search engine used for searching and filtering datasets in the CKAN metadata repository
Licensing Assistant: Component to provide legal information on (re-)usage of specific datasets
SPARQL Manager: SPARQL query editor allowing users to run SPARQL queries on linked data in the Virtuoso repository
Virtuoso: Linked data quadruple store that is synchronized with the CKAN repository
map.apps (Geo-spatial Data Visualisation): Proprietary application to visualize geo-spatial data and information using geo-maps. It comes with different tooling and thematic focus, a graphical configuration interface, support for responsive web design, i18n internationalization files and a client-side implementation of the OSGI specification (JavaScript)
Graphical Data Visualisation: Recline.js/D3.js JavaScript libraries to visualize (statistical) data in tables and graphical charts
Pre-processor: RESTful web API, running on a Node.js server, which analyzes and transforms XLS/XLSX files into CSV format so that they can be used by the Visualisation tool
Harvester: Single entry point component for harvesting data from multiple data sources in different formats and from different APIs
Gazetteer: Component providing geo-spatial data and information
FME: Component used by the Gazetteer as a universal spatial ETL (Extract-Transform-Load) tool that supports accessing, processing and outputting all spatial file / database formats and that is used for harvesting the sources for geographical names
smart.finder: Component used by the Gazetteer that simplifies searches for spatial data, services and documents. It enables fast and structured access to extensive, distributed and heterogeneous data stores
Helpdesk: The Portal offers a user request/feedback form via the JIRA API and generates JIRA tickets for follow-up by the helpdesk
Metadata Quality Assistant (MQA): Component to report on the quality of the harvested metadata and to alert the helpdesk in case of issues
Monitoring: PIWIK component that provides traffic analytics of portal usage
Multilingual Support: Web pages + core editorial content + dataset descriptions available in all 24 official EU languages
MT@EC: Machine Translation Services of the European Commission, used for translating the metadata into all of the languages supported by the portal
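To illustrate the SPARQL Manager and Virtuoso entries in the component list, the sketch below builds the kind of query a user might run against the linked data repository. It assumes that the replicated metadata uses the DCAT and Dublin Core Terms vocabularies and that the repository accepts queries over the standard SPARQL protocol (HTTP GET with a `query` parameter); the endpoint URL is a hypothetical placeholder. The request is built but not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint: the factsheet does not give the real SPARQL URL.
SPARQL_ENDPOINT = "https://portal.example/sparql"

# List ten datasets with their titles, assuming DCAT / DCT vocabularies.
QUERY = """
PREFIX dcat: <http://www.w3.org/ns/dcat#>
PREFIX dct:  <http://purl.org/dc/terms/>
SELECT ?dataset ?title WHERE {
  ?dataset a dcat:Dataset ;
           dct:title ?title .
} LIMIT 10
"""


def build_sparql_request(query, endpoint=SPARQL_ENDPOINT):
    """Build (but do not send) a SPARQL protocol GET request asking for JSON results."""
    url = f"{endpoint}?{urlencode({'query': query.strip()})}"
    headers = {"Accept": "application/sparql-results+json"}
    return Request(url, headers=headers, method="GET")


if __name__ == "__main__":
    request = build_sparql_request(QUERY)
    print(request.get_method(), request.full_url)
```

The same query text could equally be pasted into the SPARQL Manager editor; building the request in code simply shows what travels to the repository.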
Last update: August 2016

For more information, please visit the European Data Portal or contact us via email.
data.europa.eu/europeandataportal | info@europeandataportal.eu

The European Data Portal initial content has been collected by harvesting national public data and geospatial portals. Progressively, the portal will harvest additional metadata collected from regional, local and domain specific portals. Do you want your portal or website to be harvested by the European Data Portal? Read the requirements on the website.

Share your story about how you make use of Open Data. Are you an entrepreneur? A non-governmental organisation? A civil servant responsible for publishing data? A local authority? Tell us your story! The purpose of the collection of use cases is to assemble interesting European stories about the benefits and efficiency gains that result from the use of Open Data. Contact us via the website and fill out the form.

Consortium:
www.capgemini-consulting.com
www.opendatainstitute.org
www.intrasoft-intl.com
www.timelex.eu
www.sogeti.com
www.southampton.ac.uk
www.conterra.de
www.fraunhofer.de