D32 - Open Science Data Analytics Operation Report

Abstract

Introduction

As its title suggests, this deliverable was initially conceived to report the activities performed to guarantee the operation of the AGINFRA+ Data Analytics facilities and to present their usage statistics in detail.
During the project lifetime its scope has been extended to capture the activities related to the operation of the AGINFRA+ Virtual Research Environments, i.e. the web-based applications devised to support the project communities and their use cases.

This deliverable reports the activities performed up to December 2019.

A total of 16 VREs have been deployed and operated since the beginning of the project, overall serving more than 570 users. These users performed:

  • a total of 59,720 sessions, with an average of circa 1658 sessions per month;
  • a total of 5,098 social interactions, with an average of circa 141 interactions per month;
  • a total of 12,720 analytics tasks, with an average of circa 353 tasks per month.
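
As a rough check on the averages above, the following minimal sketch recomputes them from the totals. It assumes that the reporting period runs from January 2017 to December 2019 inclusive and that the "circa" values are truncated to whole numbers; both assumptions are inferred from the figures and are not stated in the report. The function and variable names are illustrative only.

```python
from datetime import date


def months_inclusive(start: date, end: date) -> int:
    """Count calendar months from start's month to end's month, inclusive."""
    return (end.year - start.year) * 12 + (end.month - start.month) + 1


def monthly_average(total: int, months: int) -> int:
    """Average per month, truncated to a whole number (assumed convention)."""
    return total // months


# Assumed reporting period: January 2017 to December 2019.
PERIOD_MONTHS = months_inclusive(date(2017, 1, 1), date(2019, 12, 1))

# Totals as reported in the list above.
TOTALS = {
    "sessions": 59_720,
    "social interactions": 5_098,
    "analytics tasks": 12_720,
}

AVERAGES = {
    name: monthly_average(total, PERIOD_MONTHS) for name, total in TOTALS.items()
}

if __name__ == "__main__":
    for name, avg in AVERAGES.items():
        print(f"{name}: circa {avg} per month")
```

The same two helpers can be applied to a single VRE by passing its own start month instead of January 2017.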

These VREs are largely in a state of constant development: the facilities they offer to their users are continuously updated as a consequence of VRE usage, developments within the communities (e.g. new models are integrated, new datasets are shared), and developments in the enabling technology.

The deliverable is organised as follows:

  • VREs Operation reports some figures on the currently existing VREs including their number of users and some indicators on activities performed to develop the VRE and guarantee their operation;
  • Certification reports the analysis performed for assessing the Cloud Certification requirements from the e-infrastructures point of view as this is a potential requirement for use cases.

VREs Operation

The table below reports the list of VREs (16) created and operated by the project in the period from January 2017 up to December 2019, with their public URL, the creation date, the membership policy ("Open" means that everyone can become a member, "Private" means that membership is by invitation only, "Restricted" means that people can apply for membership and their request should be approved by the VRE managers), and the number of users (Dec. '19).

| VRE Name | Web page | Start date | Membership | #Users |
|---|---|---|---|---|
| AGINFRAplus | https://aginfra.d4science.org/web/aginfraplus | Jan. '17 | Private | 60 |
| AGINFRAplus4BioCos | https://aginfra.d4science.org/web/aginfraplus4biocos | Dec. '19 | Private | 8 |
| AGINFRAplusDev | https://aginfra.d4science.org/web/aginfraplusdev | Sep. '18 | Private | 42 |
| AGINFRAplusShowcase | https://aginfra.d4science.org/web/aginfraplusshowcase | Jul. '18 | Restricted | 32 |
| AgroClimaticModeling_trial | https://aginfra.d4science.org/web/agroclimaticmodeling_trial | Jun. '19 | Private | 31 |
| AgroClimaticModelling | https://aginfra.d4science.org/web/agroclimaticmodelling | Nov. '17 | Restricted | 54 |
| DEMETER | https://aginfra.d4science.org/web/demeter | Mar. '17 | Private | 35 |
| DEMETER_trial | https://aginfra.d4science.org/web/demeter_trial | Mar. '19 | Restricted | 31 |
| EMPHASIS | https://aginfra.d4science.org/web/emphasis | Dec. '18 | Restricted | 22 |
| FMJ_Lab | https://aginfra.d4science.org/web/fmj_lab | Dec. '19 | Open | 9 |
| FoodborneOutbreak | https://aginfra.d4science.org/web/foodborneoutbreak | Dec. '17 | Restricted | 12 |
| FoodSecurity | https://aginfra.d4science.org/web/foodsecurity | Nov. '17 | Restricted | 56 |
| NitrogenScrum | https://aginfra.d4science.org/web/nitrogenscrumlab | Dec. '19 | Restricted | 17 |
| ORIONKnowledgeHub | https://aginfra.d4science.org/web/orionknowledgehub | Jan. '18 | Restricted | 50 |
| RAKIP_Portal | https://aginfra.d4science.org/web/rakip_portal | Jul. '17 | Restricted | 70 |
| RAKIP_trial | https://aginfra.d4science.org/web/rakip_trial | Mar. '19 | Restricted | 62 |
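
The table can also be summarised by membership policy. The sketch below is illustrative only: the names, policies and user counts are copied from the table above, while the variable names and the grouping itself are assumptions made for this example and are not part of the project tooling.

```python
from collections import Counter

# (VRE name, membership policy, users in Dec. '19), copied from the table above.
VRES = [
    ("AGINFRAplus", "Private", 60),
    ("AGINFRAplus4BioCos", "Private", 8),
    ("AGINFRAplusDev", "Private", 42),
    ("AGINFRAplusShowcase", "Restricted", 32),
    ("AgroClimaticModeling_trial", "Private", 31),
    ("AgroClimaticModelling", "Restricted", 54),
    ("DEMETER", "Private", 35),
    ("DEMETER_trial", "Restricted", 31),
    ("EMPHASIS", "Restricted", 22),
    ("FMJ_Lab", "Open", 9),
    ("FoodborneOutbreak", "Restricted", 12),
    ("FoodSecurity", "Restricted", 56),
    ("NitrogenScrum", "Restricted", 17),
    ("ORIONKnowledgeHub", "Restricted", 50),
    ("RAKIP_Portal", "Restricted", 70),
    ("RAKIP_trial", "Restricted", 62),
]

# Number of VREs under each membership policy.
vres_per_policy = Counter(policy for _, policy, _ in VRES)

# Sum of per-VRE user counts under each policy.
users_per_policy = Counter()
for _, policy, users in VRES:
    users_per_policy[policy] += users

# Sum over all VREs; this counts memberships, so a person who belongs
# to several VREs may be counted more than once.
total_memberships = sum(users for _, _, users in VRES)

if __name__ == "__main__":
    for policy, count in vres_per_policy.most_common():
        print(f"{policy}: {count} VREs, {users_per_policy[policy]} users")
    print(f"Total memberships: {total_memberships}")
```

The per-VRE counts sum to 591, which is consistent with the "more than 570 users" reported in the introduction.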

VREs Operation Indicators

VREs Users

The graph below reports the sum of AGINFRA+ VREs users per month.

This number is growing partly because of the growing number of available VREs and partly because VREs still under development have been opened to selected users who, while not primarily belonging to the consortium, are part of the communities behind the use cases.

It is expected that the number of users will continue to grow in the future because of the development of the VREs and their release to a larger set of users.

VREs Accesses

The graph below reports the number of working sessions per month performed by AGINFRA+ VREs users.

A total of 59,720 sessions have been performed during the period with an average of circa 1658 sessions per month.

Since November '17 the number of working sessions has grown steadily, mainly due to the release of new VREs supporting the use cases. The spikes correspond to specific events.

VREs Users Interactions

The graph below reports the social networking activity per month (posts, replies, and likes) performed by AGINFRA+ VREs users.

A total of 5,098 social interactions have been performed during the reporting period with an average of circa 141 interactions per month.

In line with the figures on working sessions, social interactions start growing from November '17 onwards.

VREs Analytics Tasks

The graph below reports the overall number of data analytics tasks performed by AGINFRA+ VREs users.

A total of 12,720 analytics tasks have been performed during the reporting period with an average of circa 353 tasks per month.

In this case there are peaks of activity (e.g. Dec. '17 or Mar. '19), mainly related to the exploitation of the data analytics facilities for experimentation while developing community-specific methods.

Core VRE: AGINFRAplus Virtual Research Environment

The AGINFRAplus VRE has been created to offer project members a collaborative working environment. It is used to support communication among project members and teams, to share documents, and to showcase certain facilities.

The VRE has been operational since Jan. '17.

The facilities offered by this VRE are summarised by the VRE menu included below.

The graph below reports the number of working sessions.

A total of 17,739 sessions have been performed during the period with an average of circa 492 sessions per month.

During these working sessions:

  • a total of 3,727 social interactions have been performed with an average of circa 103 interactions per month;
  • a total of 1,276 analytics tasks have been performed with an average of circa 35 tasks per month;

Core VRE: AGINFRAplus4BioCos Virtual Research Environment

The AGINFRAplus4BioCos VRE has been created to give BioCos members access to the services and resources stemming from the AGINFRAplus project.

The VRE has been operational since Dec. '19.

The facilities offered by this VRE are summarised by the VRE menu included below.

No usage statistics are available for this VRE because of the limited lifespan.

Core VRE: AGINFRAplusDev Virtual Research Environment

The AGINFRAplusDev VRE has been created to provide AGINFRAplus consortium members with a working environment in which to experiment with AGINFRAplus services and develop new ones.

The VRE has been operational since Sep. '18.

The facilities offered by this VRE are summarised by the VRE menu included below.

The graph below reports the number of working sessions.

A total of 6625 sessions have been performed during the period with an average of circa 389 sessions per month.

During these working sessions:

  • a total of 507 social interactions have been performed with an average of circa 31 interactions per month;
  • a total of 5,883 analytics tasks have been performed with an average of circa 367 tasks per month;

Core VRE: AGINFRAplusShowcase Virtual Research Environment

The AGINFRAplusShowcase VRE has been created to showcase the results of the AGINFRAplus project.

The VRE has been operational since Jul. '18.

The facilities offered by this VRE are summarised by the VRE menu included below.

The graph below reports the number of working sessions.

A total of 1288 sessions have been performed during the period with an average of circa 71 sessions per month.

During these working sessions:

  • a total of 2 social interactions have been performed with an average of circa 0.5 interactions per month;
  • a total of 359 analytics tasks have been performed with an average of circa 19 tasks per month;

Core VRE: AgroClimaticModeling_trial Virtual Research Environment

The AgroClimaticModeling_trial VRE has been created to support evaluation events for the agro-climatic modelling community.

The VRE has been operational since Jun. '19.

The facilities offered by this VRE are summarised by the VRE menu included below.

The graph below reports the number of working sessions.

A total of 2132 sessions have been performed during the period with an average of circa 304 sessions per month.

During these working sessions:

  • a total of 38 social interactions have been performed with an average of circa 5 interactions per month;
  • a total of 600 analytics tasks have been performed with an average of circa 85 tasks per month;

Core VRE: AgroClimaticModelling Virtual Research Environment

The AgroClimaticModelling VRE has been deployed to support the development of use cases stemming from WP5.

The VRE has been operational since Nov. '17.

The facilities offered by this VRE are summarised by the VRE menu included below.

The graph below reports the number of working sessions.

A total of 6412 sessions have been performed during the period with an average of circa 246 sessions per month.

During these working sessions:

  • a total of 135 social interactions have been performed with an average of circa 5 interactions per month;
  • a total of 960 analytics tasks have been performed with an average of circa 36 tasks per month;

Core VRE: DEMETER Virtual Research Environment

The DEMETER VRE has been deployed to support the development of use cases stemming from WP6.

The VRE has been operational since Mar. '17.

The facilities offered by this VRE are summarised by the VRE menu included below.

The graph below reports the number of working sessions.

A total of 2505 sessions have been performed during the period with an average of circa 75 sessions per month.

During these working sessions:

  • a total of 107 social interactions have been performed with an average of circa 3 interactions per month;
  • a total of 61 analytics tasks have been performed with an average of circa 1 task per month;

Core VRE: DEMETER_trial Virtual Research Environment

The DEMETER_trial VRE has been deployed to support the evaluation events of the use cases stemming from WP6.

The VRE has been operational since Mar. '19.

The facilities offered by this VRE are summarised by the VRE menu included below.

The graph below reports the number of working sessions.

A total of 197 sessions have been performed during the period with an average of circa 19 sessions per month.

During these working sessions:

  • a total of 45 social interactions have been performed with an average of circa 4.5 interactions per month;
  • a total of 48 analytics tasks have been performed with an average of circa 4.8 tasks per month;

Supported VRE: EMPHASIS Virtual Research Environment

The EMPHASIS VRE has been deployed to support the EMPHASIS-prep European project and to enable the Operative Team to share applications and scripts (mainly R and Python) without every partner having to install the necessary software.

The VRE has been operational since Jul. '18.

The facilities offered by this VRE are summarised by the VRE menu included below.

The graph below reports the number of working sessions.

A total of 1069 sessions have been performed during the period with an average of circa 59 sessions per month.

During these working sessions:

  • a total of 18 social interactions have been performed with an average of circa 1 interaction per month;
  • no analytics tasks have been performed;

Core VRE: FMJ_Lab Virtual Research Environment

The FMJ_Lab VRE is paired with the Food Modeling Journal to facilitate the publication of mathematical models, datasets and software solutions in the area of food science.

The VRE has been operational since Dec. '19.

The facilities offered by this VRE are summarised by the VRE menu included below.

No usage statistics are available for this VRE because of the limited lifespan.

Supported VRE: FoodborneOutbreak Virtual Research Environment

The FoodborneOutbreak VRE has been deployed to support a task of the COMPARE project.

The VRE has been operational since Dec. '17.

The facilities offered by this VRE are summarised by the VRE menu included below.

The graph below reports the number of working sessions.

A total of 2581 sessions have been performed during the period with an average of circa 103 sessions per month.

During these working sessions:

  • a total of 34 social interactions have been performed with an average of circa 1 interaction per month;
  • a total of 101 analytics tasks have been performed with an average of circa 4 tasks per month;

Core VRE: FoodSecurity Virtual Research Environment

The FoodSecurity VRE has been deployed to support the development of use cases stemming from WP7.

The VRE has been operational since Nov. '17.

The facilities offered by this VRE are summarised by the VRE menu included below.

The graph below reports the number of working sessions.

A total of 4500 sessions have been performed during the period with an average of circa 173 sessions per month.

During these working sessions:

  • a total of 59 social interactions have been performed with an average of circa 2 interactions per month;
  • a total of 1,376 analytics tasks have been performed with an average of circa 52 tasks per month;

Core VRE: NitrogenScrumLab Virtual Research Environment

The NitrogenScrumLab VRE has been developed to provide its users with collaborative data science, data engineering, storage, and compute facilities for the Nitrogen Scrum event.

The VRE has been operational since Dec. '19.

The facilities offered by this VRE are summarised by the VRE menu included below.

No usage statistics are available for this VRE because of the limited lifespan.

Supported VRE: ORIONKnowledgeHub Virtual Research Environment

The ORIONKnowledgeHub VRE has been deployed to support the ORION project aiming at establishing and strengthening inter-institutional collaboration and transdisciplinary knowledge transfer in the area of surveillance data integration and interpretation, along the One Health (OH) objective of improving health and well-being.

The VRE has been operational since Jan. '18.

The facilities offered by this VRE are summarised by the VRE menu included below.

The graph below reports the number of working sessions.

A total of 4169 sessions have been performed during the period with an average of circa 173 sessions per month.

During these working sessions:

  • a total of 320 social interactions have been performed with an average of circa 13 interactions per month;
  • no analytics tasks have been performed;

Core VRE: RAKIP_Portal Virtual Research Environment

The RAKIP_Portal VRE has been deployed to support the development of use cases stemming from WP6.

The VRE has been operational since Jul. '17.

The facilities offered by this VRE are summarised by the VRE menu included below.

The graph below reports the number of working sessions.

A total of 7870 sessions have been performed during the period with an average of circa 262 sessions per month.

During these working sessions:

  • a total of 79 social interactions have been performed with an average of circa 2 interactions per month;
  • a total of 1,945 analytics tasks have been performed with an average of circa 64 tasks per month;

Core VRE: RAKIP_trial Virtual Research Environment

The RAKIP_trial VRE has been deployed to support the evaluation events of use cases stemming from WP6.

The VRE has been operational since Mar. '19.

The facilities offered by this VRE are summarised by the VRE menu included below.

The graph below reports the number of working sessions.

A total of 2536 sessions have been performed during the period with an average of circa 253 sessions per month.

During these working sessions:

  • a total of 24 social interactions have been performed with an average of circa 2.4 interactions per month;
  • a total of 108 analytics tasks have been performed with an average of circa 10 tasks per month;

Certification

The use of cloud-based services brings high flexibility and advanced functionality that cannot easily be achieved on a single workstation. Despite these benefits, organisations need to perform a risk assessment and implement associated mitigations before using cloud services. Risks vary depending on factors such as the sensitivity and criticality of the data to be stored or processed, how the cloud service is implemented and managed, how the organisation intends to use the cloud service, and the challenges the organisation faces in performing timely incident detection and response. Organisations need to compare these risks against an objective risk assessment of using in-house computer systems, as the physical resources hosting users’ data and computation are no longer under the direct control of the user. Using cloud services therefore requires a high degree of trust in the cloud service provider, and this trust has to be developed first.

In order to develop such trust, certification of cloud services can help organisations to better evaluate the security implications of relying on those services. There are several information security standards in the context of cloud computing. Examples include the ISO/IEC 27001 and ISO/IEC 27017 standards, the rules of the CSA Cloud Controls Matrix, and BSI products such as the IT-Grundschutz Catalogues and the security profiles for software as a service (SaaS). The BSI certification in particular is a requirement for the usage of cloud resources by German federal organisations and was raised by BfR as part of the activities of WP6.

Most of these certifications require that the cloud provider has an information security management system (ISMS) in place. The ISO/IEC 27001 information security standard specifies such a management system, which should bring information security under management control, and gives specific requirements. ISO/IEC 27001 has become the most popular information security standard worldwide and many organisations have been certified against it. To become certified, an organisation must first implement the standard and then go through the certification audit performed by a certification body. The audit first reviews all the documentation regarding the ISMS and then includes an on-site audit that checks whether the activities of the organisation are actually compliant with the standard and the described ISMS. After the certification is issued, auditors perform surveillance visits to ensure the ISMS is still applied at the organisation.

Achieving this kind of certification requires significant effort from the organisation and the involvement of upper management. Most commercial cloud providers (e.g. Amazon AWS, Open Telekom Cloud or Google Cloud Platform) have an extensive list of compliance certifications, and some even have specific data centres located in Europe for further compliance with European-level regulation. For research cloud providers, however, certifications are scarce. We surveyed the EGI network of research cloud providers in Europe: only two of them are currently certified against ISO/IEC 27001 and one is in the process of obtaining the certification. Currently, EGI also federates a commercial cloud provider in the UK which is certified against ISO/IEC 27001.

Over the last years, EGI has pushed for the adoption of FitSM, a lightweight standard for IT service management. It brings order and traceability with simple, practical support and provides a common conceptual and process model setting out realistic requirements. While this standard does not cover the specifics of an Information Security Management System, it already raises the quality of the providers' service management and prepares them to achieve further certifications. Triggered by the AGINFRA+ request for certified resources and by other potential use cases, EGI is re-designing its cloud federation to facilitate the integration of new providers, especially commercial ones, into the EOSC landscape, which will result in a broader set of certified services for users. This lighter federation will be marketplace-oriented and will focus on federated identity, allowing users to easily access providers with their own credentials.
EGI is ready to provide consultancy, training and audit for ISO 27001 and related certifications and is closely following EU initiatives.