D9.1 BlueBRIDGE VRE Commons Facilities

Abstract

Deliverable D9.1 – “BlueBRIDGE VRE Commons Facilities” reports the release of the BlueBRIDGE facilities for Data Access, Data Discovery, Data Storage, Data Analytics and Data Publishing. The deliverable covers the first ten months of the BlueBRIDGE project. An update to this deliverable is expected at project month M29 with Deliverable D9.2.

The deliverable is divided into two main sections. The first follows the structure of Work Package 9 (T9.1, T9.2 and T9.3). Specifically, it includes dedicated parts documenting the various releases and versions of the facilities available for Accessing, Discovering and Storing Data (T9.1), for Analysing Data (T9.2) and for Publishing Data (T9.3).
The second section comprises facilities for managing and using VREs that span all three WP9 tasks. Specifically, it includes dedicated parts documenting the various releases and versions of services for defining, creating and deploying VREs (VRE Management facilities), together with a set of applications that allow the VRE to be used through a thin client (VRE Enabling facilities).

The deliverable provides, for each facility, a description, the documentation for developers and system administrators, how-to guides and usage instructions for different use cases, and links to open source code and binaries. The pages that make up the deliverable (the facilities documentation) are all hosted on the gCube wiki.

The intended readers of this deliverable are (a) the community at large, wishing to be informed about the solutions BlueBRIDGE offers for VRE management, Data Access, Discovery, Storage, Analytics and Publishing, and (b) the gCube developer community, needing to know how to integrate, use and build on top of the facilities described in this document.

Introduction

This deliverable reports on a comprehensive set of facilities, developed and released within the first 10 months of the project, that enable the creation of VREs and the implementation of the services envisaged by WP5-8. These facilities are briefly introduced below.

Section “Facilities for Data Access, Data Discovery, Data Storage, Data Analytics and Data Publishing” includes the following facilities:

  • Storage Manager: a service providing functions for standards-based and structured access and storage of files of arbitrary size;
  • Home Library: a service providing functions for managing and persisting end-users’ files in the infrastructure, supporting file and folder sharing;
  • Social Networking Data Access, Discovery, Storage: a set of services providing functions for accessing, storing, indexing and retrieving Social Networking Data available in the Infrastructure;
  • Statistical Manager: a service aiming to provide users and other infrastructure services with algorithms to perform Data Mining operations;
  • Statistical Manager Algorithms Importer: a service providing a tool to import end-user-defined algorithms into the Infrastructure, supporting the integration of R scripts;
  • GIS Publisher: a service offering a common standard interface for publishing geospatial data and metadata to the other services of the infrastructure.

Section “Facilities for VRE Management and Usage” includes the following facilities:

  • VRE Management: a set of services and applications providing functions for defining, creating and deploying Virtual Research Environments;
  • VRE Enabling Applications: a set of interaction-oriented services and front-end applications providing functions to manage VRE Users and to support content/information exchange by using social media-like tools.

The Social Networking Data Discovery Service and the Statistical Manager Algorithms Importer Service are completely new services entirely designed and developed in the context of BlueBRIDGE. The rest of the Facilities and Portlets listed above are components that pre-exist BlueBRIDGE. In the context of BlueBRIDGE these components have been re-designed to adapt to the evolving technologies and to meet user expectations: (i) personalized, interactive and collaborative content, (ii) distributed, loosely coupled, multi-platform architectures supporting huge volumes of data as well as (iii) hardware design supporting scaling out rather than scaling up.

AREA 1 “Facilities for Data Access, Data Discovery, Data Storage, Data Analytics and Data Publishing”

Storage Manager Service

A service providing functions for standards-based and structured access and storage of files of arbitrary size is a fundamental requirement for a wide range of system processes, including indexing, transfer, transformation, and presentation. Equally, it is a main driver for clients that interface with the resources managed by the system or accessible through facilities available within the system.

https://gcube.wiki.gcube-system.org/gcube/Storage_Manager
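
Supporting files of arbitrary size usually means never loading a whole file into memory. The following is a minimal sketch of that general idea in Python, assuming a chunked copy between two byte streams. It is not the Storage Manager API: `store_stream` and `CHUNK_SIZE` are hypothetical names introduced only for this illustration.

```python
import io
from typing import BinaryIO

# Hypothetical chunk size; a real service would tune this value.
CHUNK_SIZE = 64 * 1024


def store_stream(source: BinaryIO, destination: BinaryIO,
                 chunk_size: int = CHUNK_SIZE) -> int:
    """Copy source to destination one chunk at a time.

    Memory use is bounded by chunk_size, whatever the file size.
    Returns the number of bytes written.
    """
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # empty read means end of stream
            break
        destination.write(chunk)
        total += len(chunk)
    return total


# Usage with in-memory streams standing in for a client file and a storage backend.
payload = b"x" * (3 * CHUNK_SIZE + 17)
destination = io.BytesIO()
written = store_stream(io.BytesIO(payload), destination)
```

The same loop works unchanged with files opened in binary mode, since it relies only on `read` and `write`.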

Home Library Service

The Home Library service provides functions for managing and persisting end-users’ files in the infrastructure, supporting file and folder sharing. Any Home Library user is presented with a personal Workspace, where users can collaborate, share information, and access project resources using special folders. The linked documentation describes the model of Home Library 2.0 and how to use the new API.

https://wiki.gcube-system.org/gcube/Home_Library_2.0_API_Framework_Specification

Social Networking Data Access, Discovery, Storage Facility

A set of services providing functions for accessing, storing, indexing and retrieving Social Networking Data (User Posts, Comments, Likes and Notifications) available in the Infrastructure:

https://wiki.gcube-system.org/gcube/Social_Networking_Library
https://wiki.gcube-system.org/index.php?title=Social_Networking_Data_Discovery

Statistical Manager Service

The goal of this service is to offer a single access point for performing data mining or statistical operations on heterogeneous data. These data can reside on the client side in the form of CSV files, be remotely hosted as SDMX documents, or be stored in a database. The list of released algorithms (divided into thematic classes, e.g. Bayesian Methods, Climate, Geo Processing) counts 82 different algorithms at M10:

https://wiki.gcube-system.org/gcube/Statistical_Manager
https://wiki.gcube-system.org/gcube/How-to_Implement_Algorithms_for_the_Statistical_Manager
https://wiki.gcube-system.org/gcube/Statistical_Manager_Algorithms
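
The idea of a single access point over heterogeneous data can be sketched as follows. This is an illustrative Python example only, not the Statistical Manager API: the loader and algorithm names are hypothetical, the sample values are made up, SQLite stands in for "a database", and SDMX input is left out because it would need a dedicated parser. Each source is normalised into the same row format so that one algorithm can run on any of them.

```python
import csv
import io
import sqlite3
import statistics
from typing import Dict, Iterable, List

# Common in-memory representation: one dict of column name -> string per row.
Row = Dict[str, str]


def rows_from_csv(text: str) -> List[Row]:
    """Load rows from CSV text whose first line is the header."""
    return list(csv.DictReader(io.StringIO(text)))


def rows_from_database(conn: sqlite3.Connection, query: str) -> List[Row]:
    """Load rows from a database query, using column names as keys."""
    cursor = conn.execute(query)
    columns = [description[0] for description in cursor.description]
    return [dict(zip(columns, (str(value) for value in record)))
            for record in cursor]


def mean_of_column(rows: Iterable[Row], column: str) -> float:
    """A stand-in 'algorithm': the arithmetic mean of one numeric column."""
    return statistics.mean(float(row[column]) for row in rows)


# The same made-up data, once as a client-side CSV and once in a database.
csv_rows = rows_from_csv("species,catch\ntuna,10\ncod,20\n")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE landings (species TEXT, catch REAL)")
conn.executemany("INSERT INTO landings VALUES (?, ?)",
                 [("tuna", 10.0), ("cod", 20.0)])
db_rows = rows_from_database(conn, "SELECT species, catch FROM landings")

csv_mean = mean_of_column(csv_rows, "catch")
db_mean = mean_of_column(db_rows, "catch")
```

Keeping the loaders separate from the algorithm means that supporting a new data source requires only a new loader, while existing algorithms stay untouched.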

Statistical Manager Algorithms Importer Service

Statistical Algorithms Importer (SAI) is a tool for importing algorithms into the D4Science e-Infrastructure. Currently, it supports the integration of R scripts. SAI separates the development of R scripts from their deployment in the infrastructure in a very flexible way. After the first deployment, made in collaboration with the Infrastructure team, script developers can modify and update their scripts by themselves, without the intervention of the Infrastructure team.

https://wiki.gcube-system.org/gcube/Statistical_Algorithms_Importer

GIS Publisher Service

A service offering a common standard interface for publishing geospatial data and metadata to the other services of the infrastructure. The service is designed to rely on GeoNetwork and GeoServer repositories registered in the infrastructure.

https://wiki.gcube-system.org/gcube/GIS_Interface

AREA 2 “Facilities for VRE Management and Usage”

VRE Management facilities

A set of services and applications providing functions for defining, creating and deploying VREs.
These services support VRE Designers and Managers through graphical user interfaces that let them instruct the infrastructure about the expected features of the desired VRE and easily update the VRE once it is defined and operational:

https://wiki.gcube-system.org/gcube/VRE_Administration

VRE Enabling Portlets

A set of interaction-oriented services and front-end applications providing functions to manage VRE Users and to help them collaborate, cooperate and exchange content/information by using social media-like tools.

https://wiki.gcube-system.org/gcube/Users%27_Management
https://wiki.gcube-system.org/gcube/Explore_available_Virtual_Research_Environments
https://wiki.gcube-system.org/gcube/Sharing_Posts_and_using_News_feed

Software Distribution

All the software produced for the gCube system follows a common integration, testing and release process and, therefore, all facilities share the same location for:

All the facilities described in this document are released under the EUPL v1.1 software license.