CDI DATAONE AND SCIENCEBASE ACCESS POINT EXPANSION: PYTHON APPLICATIONS PROGRAMMING INTERFACE AND ARCGIS TOOLKIT DEVELOPMENT
CDI SSF Category 2: Computational Tools and Services

Applicants/Principal Investigator(s):
Mike Mulligan, USGS Core Science Analytics and Synthesis, DFC 6th&Kipling, Denver CO 80225. Ph. (303) 202-4242 Email mmulligan@usgs.gov
Tim Mancuso, USGS Core Science Analytics and Synthesis, DFC 6th&Kipling, Denver CO 80225. Ph. (303) 202-4238 Email tmancuso@usgs.gov

Abstract: In the last several years the USGS has either sponsored or partnered with other groups to develop several enterprise data management support systems. Two of these projects, DataONE and ScienceBase, provide robust web service endpoints to help scientists capture, catalog, manage, and share data resources. The USGS has a unique opportunity to grow the list of potential users to the vast group of ArcGIS analysts within the agency. This project seeks to build a set of support tools that will allow this GIS community to quickly and easily access and work with geospatial data held by DataONE and ScienceBase, as well as develop repeatable workflows around these data. The project will develop new access options for ArcMap users, an easily installed toolbar to take advantage of these new access options, a set of software documentation and user training materials, and an Open File Report documenting the effort.

Total funding amount requested: $30,500
Total in-kind funding: $18,150
Specific Datasets Exposed: The datasets exposed will be a function of the projects that use these tools
Geographic/geologic/ecosystem/habitat/taxonomic/other context: All geographic and mission areas; other contexts (keywords and categories) are software-related
Type of Product(s) Generated: Open File Report, Python Applications Programming Interface, ArcGIS Toolbar, Best Practice Documentation, Software Documentation, User Training Materials

Summary

Introduction and Background: In the last several years the USGS has either sponsored or partnered with other groups to develop several enterprise data management support systems. Two of these projects, DataONE and ScienceBase, provide robust web service endpoints to help scientists capture, catalog, manage, and share data resources. As these efforts gain acceptance among scientists, the USGS has a unique opportunity to expand the pool of potential users to include the roughly 2,000 ArcGIS analysts within the agency. This project seeks to build a set of support tools that will allow this GIS community to quickly and easily access and work with geospatial data held by DataONE and ScienceBase, as well as develop repeatable workflows around these data. The deliverables from this work will include: a Python API (Applications Programming Interface) for DataONE, based on existing service endpoints available through the project; a Python API for ScienceBase, based on the existing ScienceBase REST API; a Python-based ArcGIS Toolbar that can be plugged into an ArcGIS/ArcMap version 10 client and tied to a VisTrails workflow management instance; an Open File Report documenting the effort, including API documentation; the posting of all source code and README files on the USGS GitHub space; and on-line training in the installation and use of the toolbar and the use of VisTrails.

CDI SSF Category: Computational Tools and Support (SSF Category 2)

Project Title: CDI DataONE and ScienceBase Access Point Expansion: Python Applications Programming Interface and ArcGIS Toolkit Development

Contacts:
Mike Mulligan, USGS Core Science Analytics and Synthesis, DFC 6th&Kipling, Denver CO 80225. Ph. (303) 202-4242 Email mmulligan@usgs.gov
Tim Mancuso, USGS Core Science Analytics and Synthesis, DFC 6th&Kipling, Denver CO 80225. Ph. (303) 202-4238 Email tmancuso@usgs.gov

Developer Resources:
Brad Williams, USGS Core Science Analytics and Synthesis, DFC 6th&Kipling, Denver CO 80225. Ph. (303) 202-4234 Email bradwilliams@usgs.gov
Bruce Powell, USGS Core Science Analytics and Synthesis, DFC 6th&Kipling, Denver CO 80225. Ph. (303) 202-4089 Email bpowell@usgs.gov
Travis Lawall, USGS Fort Collins Science Center, 2150 Centre Ave, Fort Collins, CO 80526. Ph. (970) 226-9341 Email lawallt@usgs.gov
Sebastien Nicoud, USGS Fort Collins Science Center, 2150 Centre Ave, Fort Collins, CO 80526. Ph. (970) 226-9145 Email snicoud@usgs.gov

Collaborating Organizations:
DataONE
    Dave Vieglais, Director for Development and Operations. Email dave.vieglais@gmail.com
ScienceBase/CSAS
    Natalie Latysh, USGS Core Science Analytics and Synthesis, DFC 6th&Kipling, Denver CO 80225. Ph. (303) 202-4637 Email nlatysh@usgs.gov
Fort Collins Information Science Branch (Web Apps and GIS/RS)
    Gail Montgomery, USGS Fort Collins Science Center, 2150 Centre Ave, Fort Collins, CO 80526. Ph. (970) 226-9253 Email montgomeryg@usgs.gov
    Colin Talbert, USGS Fort Collins Science Center, 2150 Centre Ave, Fort Collins, CO 80526. Ph. (970) 226-9425 Email talbertc@usgs.gov
    Laura Smyrl, USGS Fort Collins Science Center, 2150 Centre Ave, Fort Collins, CO 80526. Ph. (970) 226-4369 Email lsmyrl@usgs.gov

Detailed description of geographic/geologic/ecosystem/habitat/taxonomic/other context of the project and its importance or value if applicable: This project applies to all ArcGIS efforts, regardless of geographic reach.

Scope

In the last two years the USGS has either sponsored or partnered with other groups to develop several enterprise data management support systems. Two of these projects, DataONE and ScienceBase, provide robust web service endpoints to help scientists capture, catalog, manage, and share data resources. As these efforts gain acceptance among scientists, the USGS has a unique opportunity to grow the list of potential users to the vast group of ArcGIS analysts within the agency. This project seeks to build a set of support tools that will allow this GIS community to quickly and easily access and work with geospatial data held by DataONE and ScienceBase, as well as develop repeatable workflows around these data.

Being able to work with data is only one of the analysts' concerns. Capturing the steps involved in an analysis, for workflow reconstruction and metadata support, is also an essential part of the equation. To ensure analysts can work with a ready-to-use tool, this project seeks both to develop an applications programming interface compatible with ArcMap and to encapsulate that API in a toolbar that can be quickly and easily added to the ArcGIS analyst's client installation.
As a part of the effort, the toolbar will allow seamless capture of the analyst's workflow by the VisTrails workflow management tool. The deliverables from this work will include: a Python API (Applications Programming Interface) for DataONE, based on existing service endpoints available through the project; a Python API for ScienceBase, based on the existing ScienceBase REST API; a Python-based ArcGIS Toolbar that can be plugged into an ArcGIS/ArcMap version 10 client and tied to a VisTrails workflow management instance; an Open File Report documenting the effort, including API documentation; the posting of all source code and README files on the USGS GitHub space; and on-line training in the installation and use of the toolbar and the use of VisTrails.

The staff involved in this project have had considerable success in developing the original APIs, developing workflows with VisTrails, and building enterprise connections to ArcGIS. This is the correct group to address this critical issue.

Technical Approach

The technical deliverables from this project (the Python application programming interface, the ArcGIS toolbar, and the VisTrails integration) would be based on existing projects. API development would build off the Geo Data Portal API currently available via the USGS GitHub space. The ArcMap toolbox/Add-In and the VisTrails package would likewise build on existing work. This would open ScienceBase and DataONE to seamless solutions for data and workflow management. VisTrails is an open source project that already has connections to ScienceBase and DataONE.

The Python functions would include, but not be limited to:
    SeachWCS(searchTerm, Repo) to return a list of results;
    getMD(DataID, Repo) to return an FGDC metadata record for a single result;
    updateItemMD(DataID, replacementMD as xml file, Repo) to update a metadata record;
    getWMS(DataID, Repo) to return the URL of the service generated from a single result;
    downloadLocal(DatasetID, outputFName, Repo) to save a local copy of a dataset;
    uploadLocal(localFName, RepoFName, optional Username, optional UserPassword, Repo) to upload a local file to the target server.

These capabilities are already supported, in other forms, through system service endpoints. ScienceBase and DataONE both have published production REST services that are being used by a variety of groups. For the most part, these uses are focused on delivering content to web portals and desktop modeling/data processing systems. The use of these services by ArcGIS users has generally been focused on search and display of data resources. This new approach will focus on data and metadata submission, as well as building derivative products from delivered data.

The API would be exposed to users through an ArcGIS Toolbar. This approach has been successful in other projects, including work by this project group on delivering and modeling phenology data housed by ScienceBase. Toolbar development would focus on the ArcGIS 10.x architecture.
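
The function list above can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the class InMemoryRepo and the snake_case function names are hypothetical, the repository is an in-memory stand-in that makes no network calls, and none of it is the ScienceBase or DataONE client API. The Repo argument is moved ahead of the optional credentials only because Python requires optional parameters to come last.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class InMemoryRepo:
    """Hypothetical stand-in for a repository endpoint; makes no network calls."""
    name: str
    metadata: Dict[str, str] = field(default_factory=dict)  # DataID -> metadata XML
    wms_urls: Dict[str, str] = field(default_factory=dict)  # DataID -> WMS URL
    files: Dict[str, bytes] = field(default_factory=dict)   # file name -> contents


def search(search_term: str, repo: InMemoryRepo) -> List[str]:
    """Counterpart of the proposed search function: IDs whose metadata mentions the term."""
    term = search_term.lower()
    return sorted(i for i, md in repo.metadata.items() if term in md.lower())


def get_md(data_id: str, repo: InMemoryRepo) -> str:
    """Counterpart of getMD: return the metadata record for one result."""
    return repo.metadata[data_id]


def update_item_md(data_id: str, replacement_md: str, repo: InMemoryRepo) -> None:
    """Counterpart of updateItemMD: replace an existing metadata record."""
    if data_id not in repo.metadata:
        raise KeyError(data_id)
    repo.metadata[data_id] = replacement_md


def get_wms(data_id: str, repo: InMemoryRepo) -> str:
    """Counterpart of getWMS: return the service URL recorded for one result."""
    return repo.wms_urls[data_id]


def download_local(dataset_id: str, output_fname: str, repo: InMemoryRepo) -> str:
    """Counterpart of downloadLocal: write a local copy and return its path."""
    with open(output_fname, "wb") as fh:
        fh.write(repo.files[dataset_id])
    return output_fname


def upload_local(local_fname: str, repo_fname: str, repo: InMemoryRepo,
                 username: Optional[str] = None,
                 password: Optional[str] = None) -> None:
    """Counterpart of uploadLocal; credentials are accepted but unused in this stand-in."""
    with open(local_fname, "rb") as fh:
        repo.files[repo_fname] = fh.read()


if __name__ == "__main__":
    demo = InMemoryRepo(name="demo")
    demo.metadata["item-1"] = "<metadata><title>Phenology grid</title></metadata>"
    print(search("phenology", demo))
```

Passing the repository as an explicit argument, as the proposed signatures do, would let the same calls target either ScienceBase or DataONE; a real implementation would replace the dictionaries with requests to the published REST services.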
The source code would be exposed via GitHub, providing a way for future efforts to build on the initial development work.

The use of VisTrails as a part of the technical stack is based on the increased use of this open source project for workflow capture and recreation. VisTrails provides an extensible way for ArcMap/ArcGIS users to build process metadata as part of product construction. This would be a recommended but optional component in the full package; ScienceBase and DataONE would be adapted to accept VisTrails inputs for those users who want to use the full technical stack.

The vast majority of funded work would be centered on the Python API, ArcGIS Toolbar, and VisTrails integration. As the development team exercises the ScienceBase and DataONE REST services, we expect that these projects will have to make nominal changes to their APIs. These changes will be contributed as in-kind work. Also, CSAS and Fort Collins will contribute their full development environment as in-kind equipment/supplies.

Project Experience

The principal investigators are well versed in all aspects of this project (API development, Python, ArcGIS Toolbar implementation, VisTrails workflow management). CSAS staff have been involved in the DataONE project since its inception, are the product owners of the ScienceBase project, and are key players in GIS tool development. The proposed CSAS contingent of cooperators and contract staff have successfully supported core CSAS data systems (e.g., ITIS, GAP, VCP, NFHAP, MARIS, OBIS). Fort Collins Science Center staff have provided development support for ScienceBase and a number of GIS tool products, have contributed to the VisTrails open source effort, and have developed APIs for a number of projects.

Commitment to Effort

Core Science Systems has invested heavily in the support of ScienceBase and DataONE. It is reasonable to expect CSS will continue to support both projects, either as a sponsor or a project collaborator. The lessons learned through this exercise will be applicable to similar efforts (API development and a reference implementation of the API in a client system). This proposal includes contract staff for support development. The COR for each organization has been contacted and has approved use of the contract vehicle to support this effort.

Budget

Budget Category                                             Federal Funding     Matching Funds
                                                            "Requested"         "Proposed"
1. SALARIES (inc. number of hours and hourly rate):
   Federal Personnel
     Colin Talbert, 20 hrs                                                      $1,200
     Gail Montgomery, 40 hrs                                                    $2,200
     Laura Smyrl, 40 hrs                                                        $2,500
     Mike Mulligan, 20 hrs                                                      $1,000
     Tim Mancuso, 25 hrs                                                        $1,250
     Bruce Powell, 40 hrs                                                       $2,000
   Contract Personnel
     Contract Staff (300 hours @$70/hr)                     $21,000
     Contract Staff (150 hours @$60/hr), CSAS               $7,500
2. FRINGE BENEFITS: N/A
   Personnel
   Contract Personnel
   Total Fringe Benefits:
3. TRAVEL EXPENSES*: N/A
   Per Diem
   Airfare
   Lodging Cost
   Vehicle Cost
   Mileage
   Other travel expense(s)
   Total Travel Expenses:
4. OTHER DIRECT COSTS: (itemize)
   Equipment (inc. software, hardware, etc.) – Development Environment          $8,000
   Supplies
   Training
   Publications
   Office supplies
   Communications Cost (OFR Publication through SPN)        $2,000
   Total Other Direct Costs                                 $2,000
Total Direct Costs:                                         $30,500             $18,150
Indirect Cost (%)
GRAND TOTAL:                                                $30,500             $18,150

Timeline

Deliverable                                                   Estimated Delivery Date
Development of Python API for ScienceBase                     8 weeks from time of award
Addition of DataONE Member Node to Python API                 12 weeks from time of award
Development of ArcGIS toolbar that uses Python API            16 weeks from time of award
Development documentation and user manual                     20 weeks from time of award
Establish Training Schedule for Toolbar installation and      20 weeks from time of award
  use; test training with initial sets of users
Open-File Report                                              24 weeks from time of award
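
Because the timeline is stated in weeks from the time of award, the calendar dates depend on an award date that this proposal does not give. The sketch below shows one way to turn the week offsets into dates; the award date in the demo is hypothetical and chosen only for illustration, and the milestone labels are shortened from the timeline entries.

```python
from datetime import date, timedelta

# Week offsets copied from the timeline; labels are abbreviated.
MILESTONES = [
    ("Python API for ScienceBase", 8),
    ("DataONE Member Node added to Python API", 12),
    ("ArcGIS toolbar that uses Python API", 16),
    ("Development documentation and user manual", 20),
    ("Training schedule established and tested", 20),
    ("Open-File Report", 24),
]


def delivery_dates(award_date):
    """Return (deliverable, due date) pairs for a given award date."""
    return [(name, award_date + timedelta(weeks=weeks)) for name, weeks in MILESTONES]


if __name__ == "__main__":
    # Hypothetical award date; the proposal does not specify one.
    for name, due in delivery_dates(date(2013, 1, 7)):
        print(f"{due.isoformat()}  {name}")
```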

Appendices

No attachments or appendices.