Project WP #654: WP6 - Supporting Blue Economy: VREs Development [Months: 1-30]
Project Task #656: T6.2 Strategic Investment analysis and Scientific Planning/Alerting VRE [Months: 1-30]
Status: Closed
Start date: May 30, 2017
Assignee: Konstantinos Giannakelos
% Done:
In order to construct a performance model for a given location, the simul fish service needs to know the sea surface temperature over the course of a year.
This library will support this scenario by returning the temperature for a given location and period of time.
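The lookup described above could take roughly the following shape. This is a minimal sketch, not the library's actual API: the class name, the record layout, and the in-memory storage are all assumptions made for illustration.

```python
import datetime as dt

class SSTSeries:
    """Hypothetical sketch of the lookup the simul fish service needs:
    temperatures for a given location and period, read from data
    packaged with the library (all names here are assumptions)."""

    def __init__(self, records):
        # records: {(lat, lon): [(date, celsius), ...]}
        self._records = records

    def temperature(self, lat, lon, start, end):
        """Return the (date, value) pairs falling inside [start, end]."""
        series = self._records.get((lat, lon), [])
        return [(d, t) for d, t in series if start <= d <= end]

# Dummy data standing in for the packaged dataset:
data = {(37.9, 23.7): [(dt.date(2017, 1, 3), 15.2),
                       (dt.date(2017, 7, 1), 26.4)]}
summer = SSTSeries(data).temperature(37.9, 23.7,
                                     dt.date(2017, 6, 1), dt.date(2017, 8, 31))
print(summer)  # -> [(datetime.date(2017, 7, 1), 26.4)]
```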
#4 Updated by Dimitris Katris almost 2 years ago
We want the data packed in a library for two reasons: first, it allowed us to provide a very fast implementation, and second, we need the data during the parallel execution of the fitness functions. To do that efficiently, our initial thought was to ship the data packed in a library instead of invoking another service from multiple Spark nodes. At a later stage we also considered adding the data to the geoanalytics platform and fetching them from there. I am not sure what a data mining process would offer us, because these data are already analysed and available; we just need to retrieve/read them from the library's resources and make them available to the simul fish growth library.
#5 Updated by Pasquale Pagano almost 2 years ago
It is unclear to me how you are imagining using this dataset. The dataset covers 2008 to the present, but it will soon become obsolete if you package it in a library and distribute it. If you register it properly, either in the geoanalytics platform or in the SDI, you can get the value you need for a specific location/time by using WCS (Web Coverage Service).
If you decide to go through the SDI, we can configure a job that periodically updates the data from Copernicus, so you will always get up-to-date information.
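For reference, fetching a single value this way boils down to a WCS 1.0.0 GetCoverage request over a tiny bounding box around the point. The sketch below only builds the request URL; the endpoint and coverage name are placeholders, not the real SDI values.

```python
from urllib.parse import urlencode

def getcoverage_url(endpoint, coverage, lat, lon, time, eps=0.05):
    """Build a WCS 1.0.0 GetCoverage KVP request for a tiny box
    around one point (one output cell)."""
    params = {
        "service": "WCS",
        "version": "1.0.0",
        "request": "GetCoverage",
        "coverage": coverage,
        "crs": "EPSG:4326",
        "bbox": "%s,%s,%s,%s" % (lon - eps, lat - eps, lon + eps, lat + eps),
        "time": time,
        "width": "1",
        "height": "1",
        "format": "GeoTIFF",
    }
    return endpoint + "?" + urlencode(params)

# Placeholder endpoint and coverage identifier:
url = getcoverage_url("https://example.org/geoserver/wcs",
                      "sst_copernicus", lat=37.9, lon=23.7, time="2017-01-03")
print(url)
```

A client would then issue an HTTP GET on this URL and read the single-cell coverage it returns.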
#6 Updated by Gianpaolo Coro almost 2 years ago
Since the dataset was taken from Copernicus, it is a gridded NetCDF and could be remotely accessed through the OPeNDAP protocol. On DataMiner, we have a process to extract geospatial information from a NetCDF file hosted in the e-Infrastructure:
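Reading a single value over OPeNDAP could look like the sketch below. It needs the netCDF4 package, and the variable/dimension names ("lat", "lon", "analysed_sst") are assumptions that must be checked against the actual Copernicus product; only the `nearest_index` helper is exercised here.

```python
import numpy as np

def nearest_index(coords, value):
    """Index of the grid coordinate closest to `value`."""
    return int(np.argmin(np.abs(np.asarray(coords, dtype=float) - value)))

def sst_at(opendap_url, lat, lon, time_index):
    """Read one SST value from a remote gridded NetCDF via OPeNDAP.

    Variable/dimension names below are assumptions to verify against
    the real Copernicus file."""
    from netCDF4 import Dataset
    ds = Dataset(opendap_url)  # with OPeNDAP only the requested slice travels
    try:
        i = nearest_index(ds["lat"][:], lat)
        j = nearest_index(ds["lon"][:], lon)
        return float(ds["analysed_sst"][time_index, i, j])
    finally:
        ds.close()

print(nearest_index([10.0, 10.25, 10.5], 10.4))  # -> 2
```

The point of OPeNDAP here is that the file stays on the server: indexing `ds["analysed_sst"][t, i, j]` transfers only that slice, not the whole dataset.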
And obviously, another to publish a NetCDF in the e-Infrastructure (indeed on the Thredds service http://thredds.d4science.org/thredds/catalog/public/netcdf/catalog.html), so that you can later reuse it in the process above:
Shipping a geospatial file inside a client is a very old practice that has fallen out of use; it is applied nowadays only when a NetCDF file is too large, is unstructured, or has too high a 4D resolution (e.g. the GEBCO bathymetry), which is not the case for SST.
#7 Updated by Dimitris Katris almost 2 years ago
Our ultimate goal is to do what Lino suggested: register the dataset in geoanalytics and fetch the data through WCS. We recently added support for raster data (#7144), but we are still missing some steps before the platform can serve the dataset. Once that is done, we will switch the library's implementation to get the data from that source and perform some caching/prefetching, rather than having them embedded. But we also needed to facilitate the implementation of the global performance model evaluation that I2S is currently developing, and to deliver this first version of the library quickly so that they can proceed with their tasks.
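The caching mentioned above could be as simple as memoizing the per-cell lookups so that repeated queries do not hit the service again. A minimal sketch, where `fetch_from_wcs` is a hypothetical stand-in for the real WCS call:

```python
from functools import lru_cache

FETCHES = {"count": 0}

def fetch_from_wcs(lat, lon, fortnight):
    """Placeholder for the real WCS request (hypothetical)."""
    FETCHES["count"] += 1
    return 15.0 + fortnight  # dummy value standing in for the coverage read

@lru_cache(maxsize=4096)
def sst(lat, lon, fortnight):
    """Memoized lookup: repeated queries for the same cell/fortnight
    are served from memory instead of re-querying the service.
    Round lat/lon to the grid resolution before calling, so that
    nearby points share a cache entry."""
    return fetch_from_wcs(lat, lon, fortnight)

sst(37.9, 23.7, 0)
sst(37.9, 23.7, 0)        # second call is a cache hit
print(FETCHES["count"])   # -> 1
```

Prefetching would then just mean calling `sst` for all 26 fortnights of a site up front, before the fitness evaluation starts.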
Finally, we do not actually intend to spend effort setting up an automatic procedure that refreshes the data. It was not very clear from the ticket description, but the simul fish growth library does not really need real-time data, nor data for a specific datetime (say, 03/01/2017). What the library needs is the temperature or oxygen level per fortnight, computed as the mean value over three years. So it is not crucial to have the latest measurements available, and automating this refresh is not in our short-term plans.
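The per-fortnight aggregation described above can be sketched as follows: samples from all three years are pooled by fortnight of the year and averaged. This is an illustration of the computation, not the library's actual code.

```python
def fortnight_of(day_of_year):
    """Map a day of year (1-366) to a fortnight index 0..25."""
    return min((day_of_year - 1) // 14, 25)

def fortnight_means(samples):
    """Pool (day_of_year, value) samples from several years and return
    the mean per fortnight: {fortnight_index: mean value}."""
    sums, counts = {}, {}
    for doy, value in samples:
        f = fortnight_of(doy)
        sums[f] = sums.get(f, 0.0) + value
        counts[f] = counts.get(f, 0) + 1
    return {f: sums[f] / counts[f] for f in sorted(sums)}

# Two readings in fortnight 0 (e.g. from different years) and one in fortnight 1:
print(fortnight_means([(1, 10.0), (2, 14.0), (15, 20.0)]))
# -> {0: 12.0, 1: 20.0}
```

Because only the fortnight-of-year matters, the result is insensitive to which calendar year a sample came from, which is exactly why the library does not need up-to-date measurements.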