BigFoot: Big Data Analytics of Digital Footprints


Project name            BigFoot
Project ID              FP7-ICT-ICT-2011.1.2 Call 8 Project No. 317858
Work Package Number     WP6
Deliverable Number      D.6.1
Document title          BigFoot web site and dissemination material prepared
Document version        1.0
Author                  EUR, GRIDP
Date                    31/12/2012
Status                  Public
Deliverable D.6.1            BIGFOOT                         Version 1.0

                       Revision History
 Date       Version  Description                               Author
 19/12/’12  0.1      Initial Deliverable Setup                 Pietro Michiardi
 19/12/’12  0.2      Update doc adding reference pictures      Pierre Leray
 19/12/’12  0.3      Updates on collaboration                  Pietro Michiardi
 19/12/’12  0.4      Updates on open source                    Pietro Michiardi
 28/12/’12  0.5      Date format fix and delivery date update  Pietro Michiardi
 28/12/’12  1.0      Review                                    Pierre Leray


Contents
1 Summary

2 Scientific Publications

3 Communication
  3.1 BigFoot WebSite
      3.1.1 Planning
  3.2 Project logo
  3.3 Leaflets and Posters
  3.4 Press Releases
  3.5 Labelization Activities

4 Collaboration
  4.1 Workshops
  4.2 Training Events
  4.3 Coding Community
  4.4 Relation with other EU-funded, National projects

5 Open Source Software


1    Summary
The purpose of this report is to summarize the progress made on dissemination
activities since the start of the BigFoot project. The document is organized
as follows:

    • Section 2 presents the scientific dissemination strategy adopted in Big-
      Foot and some preliminary material that has been presented or sub-
      mitted for peer-review.

    • Section 3 is dedicated to communication activities carried out in the
      context of the BigFoot project.

    • Section 4 presents the collaboration strategy and summarizes the ac-
      tivities undertaken in the first three months of the project.

    • Section 5 presents the methodology used in BigFoot to: document the
      project, prepare research articles, and develop software deliverables.
      Essentially, a large fraction of the activities in BigFoot are regarded
      as open-source projects, and this Section illustrates the tools BigFoot
      partners set up to achieve this goal.


2      Scientific Publications
This Section is dedicated to the scientific dissemination strategy adopted in
BigFoot. In addition, we summarize the scientific production carried out
since the project start, including technical reports and submitted research
articles.
     In BigFoot, a number of publications are expected in prestigious
international conferences, workshops and journals, based on the concept,
vision, design and implementation results of BigFoot. The list of plausible
venues that may be of interest for the dissemination of BigFoot results is
included in the DoW; however, because the scientific community covering the
data management, systems and networking domains is constantly evolving, the
consortium will keep this list updated and revised.
     As stated in the DoW, the research carried out in BigFoot has a strong
emphasis on systems aspects and involves a substantial engineering and
development effort; it is therefore realistic to expect few, but important,
contributions in terms of research articles.
     In addition to the submission and eventual publication of workshop,
conference and journal articles, in BigFoot we seek to disseminate results by
setting up an appropriate number of exhibition stands, workshops and
seminars/schools. While the latter aspect is covered in more detail in
Section 4, in what follows we present the activities carried out in the first
three months of the project and those foreseen for the near future (that is,
events that have already been organized but have not yet taken place). In
addition, we report seminars, talks, tutorials and demonstrations that were
delivered before the project start: in these cases, partners anticipated some
work relevant to BigFoot because the project was in an advanced state of
negotiation.

    Submitted research articles

     • M. Pastorelli, A. Barbuzzi, D. Carra and P. Michiardi, “HFSP: The
       Hadoop Fair Sojourn Protocol”, submitted to ACM SIGMOD 2013

     • I. Psaroudakis, M. Athanassoulis and A. Ailamaki, “Sharing Data and
       Work Across Concurrent Analytical Queries”, submitted to ACM
       EuroSys 2013


 Poster presentation

   • P. Michiardi, “The BigFoot Project: Vision, Objectives and Perspec-
     tives”, EURECOM Scientific Council Annual Meeting

 Seminars, talks, tutorials, demos

   • Demo: Alagiannis, I., Borovica, R., Branco, M., Idreos, S. and
     Ailamaki, A. (2012) NoDB in Action: Adaptive Query Processing on
     Raw Data. Proceedings of the VLDB Endowment, 5(12) (demo
     presented at the VLDB 2012 conference)

   • Seminar: “Network Reliability in the Software Era – Finding Bugs
     in OpenFlow-based Software Defined Networks”, Dr. Marco Canini,
     Ph.D. (TLabs - Berlin), BigFoot Seminar, Eurecom, December, 2012

   • Tutorial: “Big Data, Big Value?”, Telecom ParisTech, Paris,
     December 2012

   • Seminar: “Implementing a Distributed Clustering on Apache Hadoop”,
     Thibaut Debatty (Royal Military Academy, Belgium), BigFoot Semi-
     nar, Eurecom, November, 2012

   • Talk: “BigFoot Project Presentation”, Internet of Services Days,
     October 2012

   • Talk: “Introducing Scaling”, Riviera Scala-Clojure, Sophia-Antipolis,
     September 2012

   • Talk: “Big Data Analytics in Practice” Networking Lecture Series,
     TLabs - Berlin, July 2012

   • Talk: “Big data et smart grids in BigFoot”, SophiaConf, July 2012

   • Talk: “Big data et smart grids in BigFoot”, BigDataParis, Paris,
     March 2012 [figure 1]


                Figure 1: BigDataParis 2012 event publication

3      Communication
This Section is dedicated to communication activities carried out in relation
to the BigFoot project.

3.1     BigFoot WebSite
As stated in the DoW:

    A WebSite will be dedicated to the BigFoot project, with a clear descrip-
    tion of BigFoot objectives, partners, and with a public section for De-
    liverables. Additionally the web site will consolidate the publication list
    of project partners, and link to relevant open-source projects. Finally,
    a collaborative portal, accessible only to authorized members, will assist
    the communication between project partners.

   The BigFoot project WebSite is up and running, reachable at:
http://www.bigfootproject.eu. The domain name has been registered for 5
years, with the usual renewal policy. Currently, the WebSite is organized
into the following sections:

     • Who We Are: Organizations and people behind BigFoot.

     • Applications: Use cases driving the BigFoot effort.


   • Research: Academic research behind BigFoot.

   • Software: Free and Open Source software released by the BigFoot
     partners.

   • Documents: Official documents released by the BigFoot consortium.

   • Contact: Contact information.

     In its initial release, the BigFoot WebSite is fairly static (although
the backend technology allows dynamic content), and it is based on the
recently released (and open-source) Twitter Bootstrap framework. The main
reason for the simple layout is that there is not yet enough content to
justify a richer presentation of events and related dynamics.
     The BigFoot WebSite is currently managed by EUR, and the publication
process follows the same procedure that is used to produce software,
documents, reports and deliverables. Every edit is centralized through the
BigFoot BitBucket repository: this allows version control and concurrent
edits, offers backup, and automatically generates statistics (in terms of who
did what). A simple script can be invoked to push new content to the WebSite.
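
     The "who did what" statistics mentioned above can also be approximated
locally from the repository history. The following is a minimal sketch,
assuming the commit author names have been exported one per line (for
example with `git log --format=%an` inside a local clone). The function name
`commits_per_author` and the sample input are hypothetical and are not part
of the BigFoot tooling.

```python
from collections import Counter


def commits_per_author(log_text):
    """Count commits per author from text with one author name per line."""
    # Ignore blank lines and strip surrounding whitespace from each name.
    authors = [line.strip() for line in log_text.splitlines() if line.strip()]
    return Counter(authors)


if __name__ == "__main__":
    # Hypothetical sample; in practice the text could come from
    # `git log --format=%an` run inside a local clone of the repository.
    sample = "Alice\nBob\nAlice\n"
    for author, count in commits_per_author(sample).most_common():
        print(f"{author}: {count}")
```

     Counting by author name is the simplest choice; grouping by e-mail
address instead would be more robust when one person commits under several
names.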
     Finally, a Google Analytics key makes it possible to track visitors and
to drill down into access data to determine popular pages.
     We emphasize here that project coordination and collaboration are
carried out using a system that goes beyond a private space on the BigFoot
WebSite. Everything in BigFoot is treated as an open source project: as such,
it is committed to a common (private) repository that is also accessible
through the built-in Web interface of the BitBucket service.

3.1.1   Planning
We plan to add the following features to the BigFoot WebSite:

   • Link to a Twitter account or HashTag to disseminate information
     related to BigFoot

   • News section, which will aggregate all events related to BigFoot

   • Industrial Advisory Board section, with information on what the IAB
     is for BigFoot, and how to join the IAB

   • A new section on dissemination material


   • New links in the Documents section, indicating where and how to
     download public deliverables

   • When ready, the Software section will contain links to software deliv-
     erables, which are hosted in BitBucket

3.2   Project logo
As stated in the DoW:

  A project logo will be designed and used in all documents and
  publications of BigFoot. The design will be done in a way that the logo
  will be representative of BigFoot’s concept and vision.

    The project logo was designed by Symantec during the project negotiation
phase. The idea of the logo (which can be seen on the front page of this and
all other deliverables) is to mimic the approach taken in the open-source
community to designate projects related to “Big Data”: generally, such logos
are “cartoon-ized” animals. We believe the BigFoot logo to be representative
of this philosophy.
    Our plan is to produce variants of the logo as a function of the specific
communication type: as an illustrative example, given the project's
integration with (and contribution to) OpenStack, we will produce a variation
of the original logo by placing related logos (e.g. that of OpenStack) in the
cloud shape above the BigFoot.

3.3   Leaflets and Posters
As stated in the DoW:

  Two sets of leaflets and posters will be designed and produced. The first
  set early in the project will disseminate the objectives, concepts and vi-
  sion of BigFoot. The second during the third year of the project will addi-
  tionally disseminate public results, outcomes and findings from BigFoot
  research. This material will be used in all public events (conferences,
  workshops, exhibitions, etc.), where BigFoot partners will participate

    A first version of the BigFoot leaflet, covering objectives, concepts and
vision, is available in the BigFoot BitBucket repository. This leaflet was
used during the IOS meeting that took place in Brussels in October 2012.
We expect to publish this document and new releases in a dedicated section of
the WebSite.

3.4   Press Releases
All partners are actively involved in communicating with the press. Press
releases will mostly cover the description of the BigFoot project as a whole
and the application scenarios considered in BigFoot, namely ICT Security
and SmartGrid applications. BigFoot has been featured (Figure 2) in a very
important business-oriented French newspaper, “Les Echos”; the article is
available here: http://goo.gl/mSFY6.
    As stated in the DoW, the main press releases will focus on the
application scenarios of BigFoot, and on a non-scientific version of the main
research results achieved in the project.

3.5   Labelization Activities
The BigFoot project was entered in a competition, limited to French
initiatives, in which the innovative aspects (from the research and
development point of view) and exploitation possibilities (from the business
point of view) underlying each project were thoroughly evaluated by a panel
of experts. BigFoot ranked first in the competition and received a
“labelization”, indicating support from the French Group Cap Energies. More
information on the labelization event is available here: http://goo.gl/UQ4XV.


         Figure 2: Article in Les Echos dated March 20th, 2012.

4     Collaboration
4.1   Workshops
The BigFoot consortium will organize at least one project workshop event,
to which highly reputed international groups will be invited to submit their
most recent and relevant work; in this workshop, the consortium will present
its joint and individual work related to BigFoot. Clearly, it is too early
to organize such an event at the current stage of the project. Instead, the
BigFoot partners have established an Industrial Advisory Board, built around
key industrial players in the domain of “Big Data”, including those who
develop tools for handling and processing data, and those who use such tools
for their particular business needs. Although we will provide detailed
documentation of the workshop proposal at month T6 of the project, we
anticipate here that our goal is to organize an event in which BigFoot will
collect feedback on additional use cases that are not covered by the
industrial partners, and will discuss with developers of “Big Data” solutions
to understand which relevant industrial problems they are currently working
on.

4.2   Training Events
PhD-level schools and workshops organized by the BigFoot consortium will also
serve as important dissemination activities. A preliminary contact has been
established with the Telecommunication Group of Politecnico di Torino, with
the goal of organizing a winter school (planned for the fall/winter period of
2013) for Ph.D. students on the broad topics of:
   • Design of scalable algorithms for time-series analysis
   • Hadoop MapReduce
   • Hadoop Pig and Scalding
   • Applications to the telecommunication industry
   EUR will co-ordinate the Ph.D. winter school with Politecnico di Torino.
The premises for the event, precise dates, invited speakers and a detailed
agenda will be presented in D6.3 (at T6).

4.3   Coding Community
Since BigFoot is an open platform for data processing applications,
particular attention will be devoted to European and international summits
that gather the “coding community”. In addition, coding competitions will be
organized on a regular (yearly) basis for students at the academic
institutions of the Consortium.
    Specifically, EUR will organize a “coding contest” for Master-level and
PostGraduate-level students in the context of the CLOUDS course, which is
jointly coordinated by Prof. P. Michiardi and Prof. M. Vukolic. Coding
competitions will also be organized in conjunction with the collaboration
activities with other funded projects: EUR has been contacted (thanks to the
BigFoot project) to take part in a French initiative to establish a
collaborative platform for Data-Intensive computing and Data Science
applications. In particular, the governmental entities behind the French
platform have agreed to co-organize coding contests with commoncrawl.org, in
which EUR will play a crucial role by bridging the research and (a selection
of) tools developed in BigFoot with the common platform on top of which the
contest will take place.

4.4   Relation with other EU-funded, National projects
The BigFoot project has recently been in contact with the coordinators and
members of two EU-funded projects to establish different kinds of
cooperation.
    The VISSENSE project (http://www.vis-sense.eu/) focuses on the
development of visual analytics technologies for the representation of large
datasets. Originally, these tools target security applications only;
nevertheless, the software developed in the project (in which SYM is a
partner) may be a useful addition to the BigFoot project.
    The MPLANE project (http://www.ict-mplane.eu/) focuses on an intelligent
measurement plane for future network and application management. EUR is
involved in the project, in which large-scale data storage and analysis play
a crucial role. Prof. Pietro Michiardi presented some preliminary research
results (see submitted papers in Section 2) that have been conceived within
the BigFoot project, and established a preferential channel with the project
coordinator (Politecnico di Torino) to consider the problems addressed in
MPLANE, namely Internet traffic analysis (e.g. classification, anomaly
detection, fraud forensics), as additional use cases of interest to the
BigFoot project. The collaboration with MPLANE goes beyond research
activities (as illustrated in the points above) and targets the organization
of joint Ph.D. schools on the subject of parallel data processing and its
applications in the telecommunication industry.


           Figure 3: BitBucket overview of the BigFoot project.

5    Open Source Software
In this section we provide a set of screenshots to illustrate the internal
BigFoot operation, which is largely based on BitBucket.org. Bitbucket is
a web-based hosting service for projects that use either the Mercurial or Git
revision control systems. Bitbucket was previously an independent startup;
on September 29th, 2010, it was acquired by VC-funded Atlassian, the company
behind the JIRA ticketing system, which is used by the Apache Software
Foundation.
    Figure 3 illustrates the main BigFoot project site, including members,
repositories, news-feed and configuration.
    Figure 4 illustrates a particular repository, Documents, which includes all
project documentation (with edit history) from its early conception, through
the negotiation, to its execution. As can be seen, meeting minutes,
deliverables and the BigFoot web-site are all coordinated through BitBucket,
using git. Every member of the consortium has a local copy, which is then
synchronized with the central hosting service.
    Finally, Figure 5 illustrates a fraction of the history related to the
preparation of the current deliverable, D.6.1. All activity is tracked,
including contribution frequency, time-stamps, versions and so on.
    In summary, BigFoot is organized in such a way that, from the technical
point of view, everything is operated as an open-source project, and can
be easily made public, or even exported as is to a different platform (e.g.
github.com).


                          (a) Documents Overview

                         (b) Documents Organization

     Figure 4: BitBucket repository for the BigFoot documentation.


         Figure 5: BitBucket commit history example for D.6.1.
