BigFoot: Big Data Analytics of Digital Footprints
Project name: BigFoot
Project ID: FP7-ICT-ICT-2011.1.2 Call 8 Project No. 317858
Working Package Number: WP6
Deliverable Number: D.6.1
Document title: BigFoot web site and dissemination material prepared
Document version: 1.0
Author: EUR, GRIDP
Date: 31/12/2012
Status: Public
Deliverable D.6.1 BIGFOOT Version 1.0

Revision History

Date        Version  Description                               Author
19/12/’12   0.1      Initial Deliverable Setup                 Pietro Michiardi
19/12/’12   0.2      Update doc adding reference pictures      Pierre Leray
19/12/’12   0.3      Updates on collaboration                  Pietro Michiardi
19/12/’12   0.4      Updates on open source                    Pietro Michiardi
28/12/’12   0.5      Date format fix and delivery date update  Pietro Michiardi
28/12/’12   1.0      Review                                    Pierre Leray
Contents

1 Summary
2 Scientific Publications
3 Communication
    3.1 BigFoot WebSite
        3.1.1 Planning
    3.2 Project logo
    3.3 Leaflets and Posters
    3.4 Press Releases
    3.5 Labelization Activities
4 Collaboration
    4.1 Workshops
    4.2 Training Events
    4.3 Coding Community
    4.4 Relation with other EU-funded, National projects
5 Open Source Software
1 Summary

The purpose of this report is to summarize the progress made on dissemination activities since the start of the BigFoot project. The document is organized as follows:

• Section 2 presents the scientific dissemination strategy adopted in BigFoot and some preliminary material that has been presented or submitted for peer review.
• Section 3 is dedicated to communication activities carried out in the context of the BigFoot project.
• Section 4 presents the collaboration strategy and summarizes the activities undertaken in the first three months of the project.
• Section 5 presents the methodology used in BigFoot to document the project, prepare research articles, and develop software deliverables. Essentially, a large fraction of the activities in BigFoot are treated as open-source projects, and this Section illustrates the tools that BigFoot partners set up to achieve this goal.
2 Scientific Publications

This Section is dedicated to the scientific dissemination strategy adopted in BigFoot. In addition, we summarize the scientific production carried out since the project start, including technical reports and submitted research articles.

In BigFoot, a number of publications are expected in prestigious international conferences, workshops and journals, based on the concept, vision, design and implementation results of BigFoot. The list of plausible venues that may be of interest for the dissemination of BigFoot results is included in the DoW; however, because the scientific community covering the data management, systems and networking domains is constantly evolving, the consortium will diligently update and revise this list. As stated in the DoW, the research carried out in BigFoot has a strong emphasis on systems aspects and involves a great deal of engineering and development effort, so it is realistic to expect few, but important, contributions in terms of research articles.

In addition to the submission and eventual publication of workshop, conference and journal articles, in BigFoot we seek to disseminate results by setting up an appropriate number of exhibition stands, workshops and seminars/schools. While the latter aspect is covered in more detail in Section 4, in what follows we present the activities carried out in the first three months of the project and those foreseen for the near future (that is, events that have already been organized but have not yet taken place). In addition, we report seminars, talks, tutorials and demonstrations that were delivered before the project start: in this case, partners anticipated some work relevant to the context of BigFoot because the project was in an advanced state of negotiation.

Submitted research articles

• M. Pastorelli, A. Barbuzzi, D. Carra and P.
Michiardi, “HFSP: The Hadoop Fair Sojourn Protocol”, submitted to ACM SIGMOD 2013
• I. Psaroudakis, M. Athanassoulis and A. Ailamaki, “Sharing Data and Work Across Concurrent Analytical Queries”, submitted to ACM EuroSys 2013
Poster presentation

• P. Michiardi, “The BigFoot Project: Vision, Objectives and Perspectives”, EURECOM Scientific Council Annual Meeting

Seminars, talks, tutorials, demos

• Demo: Alagiannis, I., Borovica, R., Branco, M., Idreos, S. and Ailamaki, A. (2012) NoDB in Action: Adaptive Query Processing on Raw Data. Proceedings of the VLDB Endowment, 5(12), 2012 (demo presented at the VLDB 2012 conference)
• Seminar: “Network Reliability in the Software Era – Finding Bugs in OpenFlow-based Software Defined Networks”, Dr. Marco Canini, Ph.D. (TLabs - Berlin), BigFoot Seminar, Eurecom, December 2012
• Tutorial: “Big Data, Big Value?”, Telecom ParisTech, Paris, December 2012
• Seminar: “Implementing a Distributed Clustering on Apache Hadoop”, Thibaut Debatty (Royal Military Academy, Belgium), BigFoot Seminar, Eurecom, November 2012
• Talk: “BigFoot Project Presentation”, Internet of Services Days, October 2012
• Talk: “Introducing Scaling”, Riviera Scala-Clojure, Sophia-Antipolis, September 2012
• Talk: “Big Data Analytics in Practice”, Networking Lecture Series, TLabs - Berlin, July 2012
• Talk: “Big data et smart grids in BigFoot”, SophiaConf, July 2012
• Talk: “Big data et smart grids in BigFoot”, BigDataParis, Paris, March 2012

[figure 1]
Figure 1: BigDataParis 2012 event publication

3 Communication

This Section is dedicated to communication activities carried out in relation to the BigFoot project.

3.1 BigFoot WebSite

As stated in the DoW:

    A WebSite will be dedicated to the BigFoot project, with a clear description of BigFoot objectives, partners, and with a public section for Deliverables. Additionally the web site will consolidate the publication list of project partners, and link to relevant open-source projects. Finally, a collaborative portal, accessible only to authorized members, will assist the communication between project partners.

The BigFoot project WebSite is up and running, reachable at: http://www.bigfootproject.eu. The domain name has been bought for 5 years, with the standard renewal policy.

Currently, the WebSite is organized in the following sections:

• Who We Are: Organizations and people behind BigFoot.
• Applications: Use cases driving the BigFoot effort.
• Research: Academic research behind BigFoot.
• Software: Free and Open Source software released by the BigFoot partners.
• Documents: Official documents released by the BigFoot consortium.
• Contact: Contact information.

In its initial release, the BigFoot WebSite appears quite static (although the backend technology allows dynamic content), and it is based on the recently released (and open-source) Twitter Bootstrap framework. The main reason for the simple layout is that there is not yet enough content to enable a decent visualization of events and related dynamics.

The BigFoot WebSite is currently managed by EUR, and the publication process follows the same procedure that is used to produce software, documents, reports and deliverables. Every edit is centralized through the BigFoot BitBucket repository: this allows version control and concurrent edits, offers backup, and automatically generates statistics (in terms of who did what). A simple script can be invoked to push new content to the WebSite. Finally, a Google Analytics key allows us to track visitors and drill down into access data to determine popular pages.

We emphasize here that project coordination and collaboration are done using a complex system that goes beyond a private space on the BigFoot WebSite. Everything in BigFoot is treated as an open-source project: as such, it is committed to a common (private) repository that is also accessible through the built-in Web interface of the BitBucket service.

3.1.1 Planning

We plan to add several features to the BigFoot WebSite, listed below:

• Link to a Twitter account or HashTag to disseminate information related to BigFoot
• News section, which will aggregate all events related to BigFoot
• Industrial Advisory Board section, with information on what the IAB is for BigFoot, and how to join the IAB
• A new section on dissemination material
• New links in the Documents section, indicating where and how to download public deliverables
• When ready, the Software section will contain links to software deliverables, which are hosted in BitBucket

3.2 Project logo

As stated in the DoW:

    A project logo will be designed and used in all documents and publications of BigFoot. The design will be done in a way that the logo will be representative of BigFoot's concept and vision.

The project logo was designed by Symantec during the project negotiation phase. The idea of the logo (which can be seen on the front page of this and all other deliverables) is to mimic the approach taken in the open-source community to designate projects related to “Big Data”: generally, such logos are “cartoon-ized” animals. We believe the BigFoot logo to be representative of this philosophy.

Our plan is to produce variants of the logo as a function of the specific communication type: as an illustrative example, given the project's integration with (and contribution to) OpenStack, we will produce a variation of the original logo by placing related logos (e.g. that of OpenStack) in the cloud shape above the BigFoot.

3.3 Leaflets and Posters

As stated in the DoW:

    Two sets of leaflets and posters will be designed and produced. The first set early in the project will disseminate the objectives, concepts and vision of BigFoot. The second during the third year of the project will additionally disseminate public results, outcomes and findings from BigFoot research. This material will be used in all public events (conferences, workshops, exhibitions, etc.), where BigFoot partners will participate.

A first version of the BigFoot leaflet, inclusive of objectives, concepts and vision, is available in the BigFoot BitBucket repository. This leaflet has been
used during the IOS meeting that took place in Brussels in October 2012. We expect to put this document and new releases in a dedicated section of the WebSite.

3.4 Press Releases

All partners are involved in actively communicating to the press. Press releases will mostly cover the description of the BigFoot project as a whole and the application scenarios considered in BigFoot, namely ICT Security and SmartGrid applications. BigFoot has been featured (Figure 2) in a major “business-oriented” journal, the French “Les Echos”; the article is available here: http://goo.gl/mSFY6.

As stated in the DoW, the main press releases will be focused on application scenarios of BigFoot, and on a non-scientific version of the main research results achieved in the project.

3.5 Labelization Activities

The BigFoot project has been presented in a competition, limited to French initiatives, wherein the innovative aspects (from the research and development point of view) and exploitation possibilities (from the business point of view) underlying each project were thoroughly evaluated by a panel of experts. BigFoot ranked first in the competition and received a “labelization”, indicating support from the French Group Cap Energies. The link to the labelization event is available here: http://goo.gl/UQ4XV.
Figure 2: Article in Les Echos dated March 20th, 2012.

4 Collaboration

4.1 Workshops

The BigFoot consortium will organize at least one project workshop event, in which highly reputed international groups will be invited to submit their most recent and relevant work; in this workshop, the consortium will present their joint and individual work related to BigFoot.

Clearly, it is too early to organize such an event at the current stage of the project. Instead, the BigFoot partners have established an Industrial Advisory Board, with a basis of key industrial players in the domain of “Big Data”, including those who develop tools for handling and processing data, and those who use such tools for their particular business needs. Although at month T6 of the
project we will provide detailed documentation of the workshop proposal, we anticipate here that our goal is to organize an event in which BigFoot will collect feedback on additional use cases that are not covered by the industrial partners, and discuss with developers of “Big Data” solutions to understand which relevant industrial problems they are currently working on.

4.2 Training Events

PhD-level schools and workshops organized by the BigFoot consortium will also serve as important dissemination activities. A preliminary contact has been established with the Telecommunication Group of Politecnico di Torino, with the goal of organizing a winter school (to be planned for the fall/winter period of 2013) for Ph.D. students on the broad topics of:

• Design of scalable algorithms for time-series analysis
• Hadoop MapReduce
• Hadoop Pig and Scalding
• Applications to the telecommunication industry

EUR will co-ordinate the Ph.D. winter school with Politecnico di Torino. The premises for the event, precise dates, invited speakers and a detailed agenda will be presented in D6.3 (at T6).

4.3 Coding Community

Since BigFoot is an open platform for data processing applications, particular attention will be devoted to European and international summits that gather the “coding community”. In addition, coding competitions will be organized on a regular (yearly) basis for students at the academic institutions of the Consortium. Specifically, EUR will organize a “coding contest” for Master-level and PostGraduate-level students in the context of the CLOUDS course, which is jointly coordinated by Prof. P. Michiardi and Prof. M. Vukolic.
The organization of coding competitions will also take place in conjunction with the collaboration activities with other funded projects: EUR has been contacted (thanks to the BigFoot project) to take part in a French initiative to establish a collaborative platform for Data-Intensive computing and Data Science applications. In particular, the governmental entities behind the French platform have agreed to co-organize coding contests with commoncrawl.org, in
which EUR will play a crucial role by bridging the research and (a selection of) tools developed in BigFoot with the common platform on top of which the contest will take place.

4.4 Relation with other EU-funded, National projects

The BigFoot project has recently been in contact with the coordinators and members of two EU-funded projects to establish different kinds of cooperation.

The VISSENSE project (http://www.vis-sense.eu/) focuses on the development of visual analytics technologies for the representation of large datasets. These tools were originally aimed solely at security applications; nevertheless, the software developed in the project (in which SYM is a partner) may be a useful addition to the BigFoot project.

The MPLANE project (http://www.ict-mplane.eu/) focuses on an intelligent measurement plane for future network and application management. EUR is involved in the project, in which large-scale data storage and analysis plays a crucial role. Prof. Pietro Michiardi presented some preliminary research results (see submitted papers in Section 2) that have been conceived within the BigFoot project, and established a preferential channel with the project coordinator (Politecnico di Torino) to consider the problems addressed in MPLANE as additional use cases of interest to the BigFoot project, namely Internet traffic analysis (e.g. classification, anomaly detection, fraud forensics). The collaboration with MPLANE goes beyond research activities, as illustrated in the points above, and targets the organization of joint Ph.D. schools on the subject of parallel data processing and its applications in the telecommunication industry.
Figure 3: BitBucket overview of the BigFoot project.

5 Open Source Software

In this section we provide a set of screenshots to illustrate the internal BigFoot operation, which is largely based on BitBucket.org. Bitbucket is a web-based hosting service for projects that use either the Mercurial or Git revision control systems. Bitbucket was previously an independent startup; on September 29th, 2010, Bitbucket was acquired by VC-funded Atlassian, which is the main company behind the JIRA ticketing system, which is used by the Apache Software Foundation.

Figure 3 illustrates the main BigFoot project site, including members, repositories, news feed and configuration.

Figure 4 illustrates a particular repository, Documents, which includes all project documentation (with edit history) from its early conception, through the negotiation, to its execution. As can be seen, meeting minutes, deliverables and the BigFoot WebSite are all coordinated through BitBucket, using Git. Every member of the consortium has a local copy, which is then
synchronized with the central hosting service.

Finally, Figure 5 illustrates a fraction of the history related to the preparation of the current deliverable, D.6.1. Clearly, all activity is tracked, with contribution frequency, time-stamps, versions and so on.

In summary, BigFoot is organized in such a way that, from the technical point of view, everything is operated as an open-source project, and can be easily made public, or even exported as is to a different platform (e.g. github.com).
Figure 4: BitBucket repository for the BigFoot documentation. (a) Documents Overview; (b) Documents Organization.
Figure 5: BitBucket commit history example for D.6.1.