D6.4 Blue Economy VRE Integrated Resources¶
Blue Economy VREs are logically grouped in two clusters: Blue Economy VRE Cluster 1 and Blue Economy VRE Cluster 2.
Blue Economy VRE Cluster 1¶
Performance analysis VREs, targeted at SMEs for performing growth analysis and techno-economic investment analysis.
Initially, three (3) VREs were planned, one per aquafarming SME:
- Blue Economy VRE: Alieia (#4838)
- Blue Economy VRE: Forky's (#4840)
- Blue Economy VRE: Ellinika Psaria (#4841)
Apart from the above VREs for the three subcontractor aquaculture companies, the following VREs have already been implemented and are in operation for the corresponding aquaculture SMEs:
- Blue Economy VRE: Kimagro Fishfarming Ltd (#9916)
- Blue Economy VRE: Ardag Aquaculture (#9762)
- Blue Economy VRE: iLKNAK Aquaculture (#9761)
- Blue Economy VRE: Galaxidi Marine Farm S.A. Group (#9651)
- Blue Economy VRE: Nireus Aquacultures S.A. (#9239)
- Blue Economy VRE: Markellos Aquaculture Leros S.A. (#9006)
- Blue Economy VRE: Stratos Aquacultures S.A. (#9005)
In general, the aforementioned VREs will contain the following functional components:
- “Setup” capabilities
- Site management application (Sites)
- Model management application (Models)
- “What-if Analysis” functionality
- “Techno-Economic Investment Analysis” functionality
Details on the full component list of each VRE can be found in the respective ticket descriptions referenced above.
Blue Economy VRE Cluster 2¶
One VRE has been implemented to showcase the geospatial analysis project management and visualization tools. The VRE also offers a number of investment opportunity algorithms that forecast production and financial indicators of aquaculture performance in various coastal areas:
- Blue Economy Strategic Investment analysis and Scientific Planning/Alerting VRE (SIASPA) (#1620)
The functional elements that are supported by the VRE are the following:
- Geoanalytics project management and decision support system
- Geoanalytics administration toolkit (use by the VRE Manager)
Details on the full component list of the VRE can be found in the respective ticket description referenced above.
The Aquaculture Training Lab VRE (#5688) is conceived to train its users on the aquafarming assessment tools, enabling them to perform growth evaluation analysis and techno-economic investment analysis.
In general, the Aquaculture Training Lab VRE contains the following functional components:
- “Setup” capabilities
- Site management application (Sites)
- Model management application (Models)
- “What-if Analysis” functionality
- “Techno-Economic Investment Analysis” functionality
Details on the full component list of the VRE can be found in the respective ticket description referenced above.
The aim of this section is to describe the datasets that are necessary to exploit the VRE services. These datasets are categorised into two main groups based on their origin.
The first category includes datasets coming from aquaculture companies or derived as the outcome of analysis in the VRE framework; henceforth they are called Internal Aquaculture Datasets.
The second category includes datasets collected from external sources, such as Eurostat; henceforth they are called External Datasets. They are used both to supplement the internal aquaculture datasets and to provide new information contributing to the VRE services. Although the term may imply otherwise, External Datasets are ingested into, hosted on and served by the D4Science infrastructure via a number of services.
Further details on the following datasets may be found here.
Services & Tools¶
In recent years, aquafarming companies have faced major challenges regarding their viability. The competitive environment leaves little room for inefficient operations and inaccurate production decisions. At the same time, significant issues concerning the environmental impact and sustainability of production must be addressed. Efficient production management and the development of best practices must be reconciled with the need to protect natural resources and the environment. Aquaculture companies can thus be assisted in improving profitability and minimizing environmental impact. To meet these challenges, Blue Economy has developed two new services:
Performance evaluation, benchmarking and decision making in aquaculture VRE: it provides capacities for companies to evaluate, benchmark and optimize their performance against best practices and the competition, and extends the capacity of scientific research communities and policy makers to quantify and comprehend the operation of the aquafarming industry, ensuring the sustainability and development of the sector.
Strategic Investment Analysis and Scientific Planning and Alerting VRE: it supports investors and scientists in the efficient identification of strategic locations of interest that meet multifactor selection criteria.
The following sections describe these services in more detail.
Resources of Blue Economy VRE Cluster 1¶
To meet the goals of the VRE Cluster, which covers performance evaluation, benchmarking and decision making in aquaculture as well as support for investment analysis, a number of new services for aquaculture companies have been designed. The What-If analysis service is one of them. Using this service, stakeholders can create and assess what-if scenarios regarding farm performance, carry out comparative analysis using the benchmarking facilities, and make accurate production decisions. Another key service, Techno-Economic analysis, allows stakeholders to comprehend the financial performance of an aquafarming investment and operation.
What-If Analysis Services¶
Creating accurate and feasible production plans is a significant process for every aquaculture company and depends on numerous physical, biological and environmental factors. The What-If analysis service gives fish farm managers the opportunity to define what-if scenarios, evaluate the vital Key Performance Indicators (KPIs) for fish growth and make efficient production plans. Furthermore, a benchmarking analysis compares the company's KPIs against the performance of other aquaculture companies that operate under “similar” circumstances.
The services are available through three portlets that complement one another, aiming to supply aquafarmers with tools for evaluating the performance of their farms, for benchmarking that performance against the competition and, finally, for making accurate decisions (Figure 1). Briefly, the “Setup Site” portlet defines the necessary information about a particular site, such as its geographical location and its thermal profile for each fortnight over 12 months. The “Setup Models” portlet provides the functionality to create models that estimate the crucial KPI tables for the production, such as FCR, SFR, SGR and Mortality Rate. These KPI tables are calculated using state-of-the-art Machine Learning methodologies based on historical production data provided by the users. Tools to manage the existing models (edit, run, delete) are also available. To let users exploit the what-if analysis service, the “What-If Analysis” portlet has been developed, which supplies management functionality (create, edit, execute, delete) for user-defined scenarios. Aquafarmers can use the “What-If Analysis” portlet to draw a hypothesis and evaluate the performance of their production under the conditions of that hypothesis. Moreover, they can compare their performance against the competition in terms of the vital production KPIs. By checking multiple what-if scenarios, they are able to make accurate decisions about their production strategy.
Note: for analysis purposes, the user must provide data for all the mandatory parameters of the periodic dataset.
One of the services that the VRE cluster provides to the sector's companies is the ability to perform various comparative analyses. Within this service, aquaculture companies can benchmark their KPIs against the indicators of other companies that operate under “similar” environmental circumstances. In other words, the service looks for “similar” sites in order to produce “global” KPIs, which describe a general trend in the performance of the “virtual” global production. The service then compares the results of the What-If analysis with the results of the “virtual” global production and supplies appropriate graphs that depict the differences.
Definition: “Similar Sites” are sites that fulfil specific environmental conditions, such as sea temperature, oxygen and currents. A site is similar to another site if and only if the following conditions hold:
- Differences in average temperatures between the sites in the same month do not exceed ±1℃
- Similar annual median thermal profiles (±1℃)
- The sites have nearly equal qualitative environmental characteristics, such as oxygen and/or currents
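The three conditions above can be expressed as a small predicate. The following is only a minimal sketch, not the service's actual implementation: the `Site` structure, the ordinal oxygen and current ratings, the one-step rating tolerance and the choice to require both ratings to match are all assumptions made for illustration.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Site:
    name: str
    monthly_temps: list  # 12 average sea temperatures in °C, January..December
    oxygen_rating: int   # hypothetical ordinal rating, e.g. 1 (poor) .. 5 (excellent)
    current_rating: int  # hypothetical ordinal rating on the same scale


def is_similar(a, b, temp_tol=1.0, rating_tol=1):
    """Return True when sites a and b satisfy the three similarity conditions."""
    # Condition 1: same-month average temperatures differ by at most temp_tol.
    monthly_ok = all(abs(x - y) <= temp_tol
                     for x, y in zip(a.monthly_temps, b.monthly_temps))
    # Condition 2: annual median temperatures differ by at most temp_tol.
    median_ok = abs(median(a.monthly_temps) - median(b.monthly_temps)) <= temp_tol
    # Condition 3: qualitative ratings are nearly equal (the tolerance is an assumption).
    quality_ok = (abs(a.oxygen_rating - b.oxygen_rating) <= rating_tol
                  and abs(a.current_rating - b.current_rating) <= rating_tol)
    return monthly_ok and median_ok and quality_ok


def similar_sites(target, candidates, **tolerances):
    """Filter the candidate list down to the sites similar to the target."""
    return [c for c in candidates
            if c is not target and is_similar(target, c, **tolerances)]
```

Calling `similar_sites(target, candidates)` would yield the kind of similar site list that the benchmarking process relies on.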
The key term of this process is the ‘Global Performance Model’, which is created by the combination of the models of similar sites for a particular species per KPI. Specifically, the steps for creating the ‘Global Performance Model’ and Benchmarking analysis are the following:
- Identify the sites with similar characteristics: search the list of candidate sites and choose those that comply with the definition of “similar” sites. The output of this process is the similar site list, which is stored in the corresponding entity in the database.
- Create Local/Global Models: collect the production data for all the sites similar to the site of interest from the database. Then the system invokes appropriate R functions in order to create the “local” KPIs model. In parallel, it calls the appropriate functions in order to create the “global” KPIs model.
- Benchmarking analysis: the local and global performance models are then passed to the same what-if analysis. The end-user defines a hypothesis (what-if scenario), and this scenario is fed to the models. The results of the comparative analysis are presented via meaningful graphs and tables.
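The flow of these steps can be sketched as follows. This is an illustration only: the real models are built by R functions that are not shown here, so `fit_kpi_table` is a deliberately trivial stand-in (mean FCR per whole-degree temperature bucket). The record layout and the choice to build the global model from the similar sites' data alone are also assumptions.

```python
from collections import defaultdict
from statistics import mean


def fit_kpi_table(records):
    """Stand-in model: mean FCR per whole-degree temperature bucket.

    records is an iterable of (temperature_celsius, observed_fcr) pairs.
    """
    buckets = defaultdict(list)
    for temp, fcr in records:
        buckets[round(temp)].append(fcr)
    return {temp: mean(values) for temp, values in buckets.items()}


def benchmark(scenario_temps, local_records, similar_site_records):
    """Feed one what-if scenario to both the local and the global model."""
    local_model = fit_kpi_table(local_records)
    # Global model: pool the production data of all similar sites (assumption).
    pooled = [rec for records in similar_site_records for rec in records]
    global_model = fit_kpi_table(pooled)
    rows = []
    for temp in scenario_temps:
        key = round(temp)
        # None marks a temperature for which a model has no data.
        rows.append((temp, local_model.get(key), global_model.get(key)))
    return rows
```

Each returned row pairs the local estimate with the global one for the same scenario condition, which is the comparison the graphs and tables present.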
The following figure (Figure 2) depicts the whole process of the provided services. The aquafarm manager (end-user) can use the three portlets (Setup Site, Setup Model and What-If Analysis), as explained above, to evaluate the performance of production by estimating vital KPIs such as FCR, SFR and Mortality Rate. Simultaneously, the system produces a list of similar sites and creates the ‘Global Performance Model’ as a back-end process. As a last step, the what-if scenario, which the end-user defines in the What-If Analysis portlet, is submitted to the local and global performance models so that the benchmarking analysis can take place.
Note that the local and global performance models are updated whenever any of their creation conditions change. Specifically, the “Local” KPIs model is updated each time the user changes its creation conditions (input data file, thermal site profile). The ‘Global Performance Model’ is updated each time one of the following conditions is met:
- a new site is registered to the infrastructure and has characteristics similar to those of the sites in the similar site list
- the “Local” KPIs model or some of its assumptions have changed
- the list of similar sites has changed
In conclusion, the performance comparison enables aquafarm managers to recognise their potential margins for improvement, so that they can make correct and valid decisions regarding their production.
D4Science / gCUBE Resources¶
The facilities, services, resources and tools operated by the gCube BlueBRIDGE infrastructure that are required to support the aforementioned services and deliver their functionality are described in the following list:
- Data Miner: the project containing the modelling algorithms is created and deployed to Data Miner using the Statistical Algorithm Importer. The modelling algorithms can be invoked from the model management portlet to calculate the KPI tables, which are then used to estimate the what-if analysis results.
- PostgreSQL database: the relational database management system provides the functionality to store, access, retrieve, secure and integrate users’ data. It contains details about the Site, Region, Oxygen and Current Rating, Species, Broodstock and Feed Quality and stores them in the corresponding entities (tables). Information about the models created from the user’s input is stored in the entity named “SimulModel”. The output of each model is stored in the entity with the corresponding name (“Fcr”, “Sfr”, “Sgr” and “Mortality”). Furthermore, the entity “Scenario” stores the details of a scenario as well as its performance results.
- Home Library (user workspace): the user’s datasets are files in either Excel or CSV format. The user uploads sets of such files to a common place, and each set is related to a specific model. To facilitate user interaction, the model management portlet allows for transparent file management: when editing a model, it allows the addition, removal or replacement of files related to that model. To accomplish this, the portlet uses the Home Library via its API.
- User Management and other fundamental enabling facilities of the gCube-powered VREs
To estimate fish growth over a user-defined time period, the service can use the “growth algorithm”. This algorithm is I2S proprietary, so it is provided in executable format (Java code, .jar file). The service invokes the algorithm via the What-If Analysis portlet. The resulting data can be presented in various chart and table formats.
Techno-Economic Analysis Tool¶
The Techno-Economic Analysis tool allows stakeholders to complement production data with economic and financial data of an aquafarming investment, and then uses those data to assess the performance of the investment in financial terms.
The tool, which consists of a front-end (portlet) and back-end components (service and libraries), uses production and cost-driven techno-economic models to provide its output in the form of financial KPIs, such as IRR and NPV.
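For readers unfamiliar with these two financial KPIs, the sketch below shows the textbook definitions of net present value and internal rate of return for a series of yearly cashflows. It does not reproduce the tool's own code; the function names, the bisection search and its bounds are illustrative assumptions.

```python
def npv(rate, cashflows):
    """Net present value; cashflows[0] occurs today, cashflows[t] after t years."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cashflows))


def irr(cashflows, low=-0.99, high=10.0, tol=1e-9):
    """Internal rate of return: the discount rate at which NPV is zero.

    Found by bisection, assuming NPV changes sign exactly once in [low, high].
    """
    f_low = npv(low, cashflows)
    if f_low * npv(high, cashflows) > 0:
        raise ValueError("NPV does not change sign in the search interval")
    while high - low > tol:
        mid = (low + high) / 2.0
        f_mid = npv(mid, cashflows)
        if f_low * f_mid <= 0:
            high = mid                 # the root lies in the lower half
        else:
            low, f_low = mid, f_mid    # the root lies in the upper half
    return (low + high) / 2.0
```

For example, an outlay of 1000 followed by a single return of 1100 one year later, `irr([-1000, 1100])`, gives a rate of approximately 0.10.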
Resources of Blue Economy VRE Cluster 2¶
The Geoanalytics Platform is the main service offered in the Cluster 2 VREs. It is a simple yet efficient GIS system that helps analysts and scientists visualize, analyze and manage geospatial information. The platform offers various features to help users collaborate and disseminate their work. Furthermore, the Geoanalytics Platform is an extensible system into which administrators can import data and analytics algorithms. Based on these features, a number of investment opportunity algorithms related to the aquaculture industry have been developed in the context of the SIASPA VRE.
Geoanalytics Project Management and Decision Support System¶
The Geoanalytics platform consists of a number of sub-systems that provide different aspects of its functionality, necessary to perform the following sub-tasks:
- Project definition, management and sharing
- Exploration of existing geospatial datasets
- Geospatial layer and attribute visualization
- Geospatial analysis method execution
The platform integrates with infrastructure security and presentation layers, and utilizes primarily open standards, allowing it to seamlessly integrate into the offered VREs, even beyond its initial scope.
Here you can find a detailed description of how these services can be consumed by the users.
Geoanalytics Administration Toolkit¶
The Geoanalytics administration toolkit adds a number of functionalities to the VRE allowing managers to configure and extend its capabilities. The most notable facilities that the administrator tools offer are the following:
- Layer management
- Style management
- Geospatial or statistics datasets import
- Geospatial analysis algorithm (function) import
Here you can find a detailed description of the facilities offered to VRE managers by the Geoanalytics Administration Toolkit.
Investment Opportunity Algorithms¶
The SIASPA VRE offers its users a number of predefined investment opportunity algorithms that forecast production and financial indicators of aquaculture performance in various coastal areas. These algorithms rely on the global performance library, which uses anonymized datasets and statistics provided by aquaculture companies to predict fish growth and development in certain areas. The algorithms can make predictions only for areas and fish species for which a sufficient amount of datasets and statistics is available. Additionally, site selection criteria can be specified to identify areas in which an aquafarm cannot be installed due to prohibitive environmental conditions or legislative factors. Finally, financial indicators are evaluated based on the cost-driven techno-economic model presented later in the Cost-driven Technoeconomic Model section.
The facilities and tools of the gCube BlueBRIDGE infrastructure that are required to support the aforementioned services and deliver the desired functionality are described in the following list:
- Geoserver Cluster
- Apache Spark Cluster
- Apache ZooKeeper
- PostgreSQL / PostGIS
- User Management and other fundamental enabling facilities of the gCube-powered VREs
Embedded in the aforementioned services and tools, Blue Economy VREs use a number of theoretical or experimental approaches. The top-level models (i.e. the ones exposed to the users directly by system tools) are presented below. Those models may nest sub-models for simulating other phenomena.
Data Fitting Models¶
Curve fitting is the process of constructing a curve, or mathematical function, that has the best fit to a series of data points, possibly subject to constraints. Curve fitting can involve either interpolation, where an exact fit to the data is required, or smoothing, in which a "smooth" function is constructed that approximately fits the data. A related topic is data fitting (regression analysis), which is the process of fitting models to data and analyzing the accuracy of the fit. Data fitting techniques, including mathematical equations and nonparametric methods, are employed to process acquired data. Data fitting using regression analysis focuses more on questions of statistical inference, such as how much uncertainty is present in a curve that is fit to data observed with random errors. Fitted curves can be used as an aid for data visualization, to infer values of a function where no data are available, and to summarize the relationships among two or more variables. Extrapolation refers to the use of a fitted curve beyond the range of the observed data, and is subject to a degree of uncertainty since it may reflect the method used to construct the curve as much as it reflects the observed data. Two commonly used types of curve fitting are:
- Interpolation: Given data for discrete values, fit a curve or a series of curves that pass directly through each of the points. We use these types of methods when the data are very precise.
- Regression: Given data for discrete values, derive a single curve that represents the general trend of the data. It includes methods like Least Squares fitting functions, for example splines or Chebyshev series, as well as Generalized Linear Models (GLM) and Generalized Additive Models (GAM). They are preferable when the given data exhibit a significant degree of error or noise.
The Cluster 1 VREs utilize curve fitting to perform the following tasks:
- Reduce noise and smooth data.
- Find the mathematical relationship or function among variables and use that function to perform further data processing, such as error compensation, velocity and acceleration calculation, and so on.
- Estimate the variable value between data samples.
- Estimate the variable value outside the data sample range.
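The difference between the two fitting types, and between estimating inside and outside the sample range, can be illustrated with a small standard-library sketch. It is not taken from the VRE code base; the function names are hypothetical, and a straight line stands in for whatever regression family is actually used.

```python
from bisect import bisect_right


def interpolate(xs, ys, x):
    """Piecewise-linear interpolation that passes through every sample.

    xs must be strictly increasing and contain at least two values.
    """
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("x lies outside the sampled range")
    i = min(bisect_right(xs, x), len(xs) - 1)
    x0, x1, y0, y1 = xs[i - 1], xs[i], ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def fit_line(xs, ys):
    """Ordinary least-squares line y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def predict(line, x):
    """Evaluate a fitted line; also works outside the sample range (extrapolation)."""
    a, b = line
    return a + b * x
```

`interpolate` reproduces the samples exactly and refuses to leave their range, whereas `predict(fit_line(xs, ys), x)` follows the general trend, tolerates noisy samples and can be evaluated beyond the last sample, with the extrapolation uncertainty described above.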
The Generalized Linear Model (GLM) is a flexible generalization of ordinary linear regression that allows for response variables that have error distribution models other than a normal distribution. The GLM generalizes linear regression by allowing the linear model to be related to the response variable via a link function and by allowing the magnitude of the variance of each measurement to be a function of its predicted value. The link function provides the relationship between the linear predictor and the mean of the distribution function. There are many commonly used link functions, and their choice can be somewhat arbitrary. It makes sense to try to match the domain of the link function to the range of the distribution function's mean.
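As a concrete instance of a link function, the sketch below fits a Poisson GLM with a log link, log(E[y]) = b0 + b1*x, by Newton-Raphson, which for this canonical link coincides with iteratively reweighted least squares. It is a self-contained illustration with a single predictor, not the modelling code deployed in the VREs, and all names are hypothetical.

```python
import math


def fit_poisson_glm(xs, ys, iterations=25):
    """Fit log(E[y]) = b0 + b1*x by Newton-Raphson; returns (b0, b1)."""
    # Start from the intercept-only model: a constant mean equal to the sample mean.
    b0, b1 = math.log(sum(ys) / len(ys)), 0.0
    for _ in range(iterations):
        # The inverse link (exp) maps the linear predictor to the mean.
        mu = [math.exp(b0 + b1 * x) for x in xs]
        # Score vector: gradient of the Poisson log-likelihood.
        g0 = sum(y - m for y, m in zip(ys, mu))
        g1 = sum((y - m) * x for x, y, m in zip(xs, ys, mu))
        # Fisher information; for Poisson data the variance equals the mean.
        h00 = sum(mu)
        h01 = sum(m * x for x, m in zip(xs, mu))
        h11 = sum(m * x * x for x, m in zip(xs, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system H * delta = g.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1
```

The log link keeps the predicted mean positive for any value of the linear predictor, which is the kind of match between the link function's domain and the range of the mean that the paragraph above recommends.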
The Generalized Additive Model (GAM) is a generalized linear model in which the linear predictor depends linearly on unknown smooth functions of some predictor variables, and interest focuses on inference about these smooth functions. GAMs were originally developed to blend properties of generalized linear models with additive models. The model relates a univariate response variable, Y, to some predictor variables, x_i. An exponential family distribution is specified for Y (for example normal, binomial or Poisson distributions) along with a link function g (for example the identity or log functions) relating the expected value of Y to the predictor variables via a structure such as
g(E(Y)) = β_0 + f_1(x_1) + f_2(x_2) + ⋯ + f_m(x_m)
The functions f_i may be functions with a specified parametric form (for example a polynomial, or a spline depending on the levels of a factor variable) or may be specified non-parametrically, or semi-parametrically, simply as “smooth functions”, to be estimated by non-parametric means. This flexibility to allow non-parametric fits with relaxed assumptions on the actual relationship between response and predictor provides the potential for better fits to data than purely parametric models, but arguably with some loss of interpretability. In other words, the purpose of Generalized Additive Models is to maximize the quality of prediction of a dependent variable Y from various distributions, by estimating unspecific (non-parametric) functions of the predictor variables which are "connected" to the dependent variable via a link function.
According to the literature, Generalized Linear Models and their powerful extension, Generalized Additive Models, are increasingly used for species modelling and stock assessments in fisheries and aquaculture. Different types of aquaculture data can be used as response variables (e.g. indicators like FCR, SGR, Mortality). The explanatory data are selected from those parameters that describe the response variables most efficiently, according to information on the species' life history. The explanatory data include environmental parameters such as temperature, oxygen and currents, as well as information related to the weight or biomass of the fish, the month of stocking or sampling, the feed quantity, etc.
The aim of the growth algorithm is to calculate the vital KPIs that characterise the growth of the fish and to estimate the performance of a production. These KPIs are the Average Weight, LTD Growth, LTD SGR, LTD Biological and Economical FCR and LTD Mortality. Briefly, the calculations are based on daily and aggregated biological data and take into consideration the KPI tables (FCR, SFR, SGR, Mortality Rate). The KPI tables have been developed by studying historical periodic data with methods such as Generalized Additive Models. The LTD KPIs are estimated on a daily basis, so the growth of the fish is assessed incrementally.
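The growth algorithm itself is proprietary and its formulas are not disclosed here. The sketch below only illustrates what a daily, incremental calculation of such KPIs can look like, using commonly used textbook definitions: SGR as percent body-weight gain per day on a logarithmic scale, FCR as feed given per unit of biomass gained, and SFR read as a specific feeding rate in percent of biomass per day. That reading of SFR, the constant daily rates (the real service looks values up in the KPI tables) and all names are assumptions.

```python
import math


def sgr(start_weight, end_weight, days):
    """Specific growth rate in % body weight per day."""
    return 100.0 * (math.log(end_weight) - math.log(start_weight)) / days


def fcr(feed, biomass_gain):
    """Feed conversion ratio: feed given per unit of biomass gained."""
    return feed / biomass_gain


def simulate(start_weight_g, fish_count, daily_sgr, daily_mortality_pct, sfr_pct, days):
    """Grow a population day by day and return cumulative KPIs for the period."""
    weight, count, feed_g = start_weight_g, float(fish_count), 0.0
    start_biomass = weight * count
    for _ in range(days):
        feed_g += weight * count * sfr_pct / 100.0   # feed as % of current biomass
        weight *= math.exp(daily_sgr / 100.0)        # inverse of the SGR definition
        count *= 1.0 - daily_mortality_pct / 100.0   # daily losses
    # Biomass gain is net of mortality in this sketch.
    biomass_gain = weight * count - start_biomass
    return {
        "average_weight_g": weight,
        "ltd_sgr": sgr(start_weight_g, weight, days),
        "ltd_fcr": fcr(feed_g, biomass_gain),
        "ltd_mortality_pct": 100.0 * (1.0 - count / fish_count),
    }
```

Because each day builds on the state left by the previous one, replacing the constant rates with per-day lookups that depend on temperature and current weight would not change the structure of the loop.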
Cost-driven Technoeconomic Model¶
The cost-driven techno-economic model aims to supply the analyst with the appropriate indicators and the project's annual economic values, such as revenue and expense cashflows, earnings before and after depreciation, and cumulative profit / loss, that are useful in the overall investment evaluation of the aqua farm project.
Due to the nature of the aqua farm project, the model is enhanced with depreciation and equipment replacement schedule techniques in order to break down and plan for the technological parts' costs over time.
Through the analysis of costs versus revenues over the project's timespan, the model supplies the indicators' formulas with the values they require to produce meaningful results for the analyst.
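A minimal sketch of how a depreciation and replacement schedule can feed the annual values mentioned above is given below. The model's actual techniques are not specified in this document, so straight-line depreciation, like-for-like replacement at the end of an item's life and all names used here are assumptions.

```python
def equipment_schedule(cost, lifetime_years, horizon_years):
    """Return (capex, depreciation) lists with one entry per project year.

    Assumes straight-line depreciation and like-for-like replacement
    whenever the equipment reaches the end of its lifetime.
    """
    capex = [0.0] * horizon_years
    depreciation = [0.0] * horizon_years
    for year in range(horizon_years):
        if year % lifetime_years == 0:
            capex[year] = float(cost)            # initial purchase or replacement
        depreciation[year] = cost / lifetime_years
    return capex, depreciation


def annual_results(revenues, operating_costs, depreciation):
    """Earnings before and after depreciation, plus cumulative profit / loss."""
    rows, cumulative = [], 0.0
    for rev, opex, dep in zip(revenues, operating_costs, depreciation):
        before = rev - opex
        after = before - dep
        cumulative += after
        rows.append({"earnings_before_depreciation": before,
                     "earnings_after_depreciation": after,
                     "cumulative_profit": cumulative})
    return rows
```

The `capex` list gives the years in which equipment cashflows occur, while `depreciation` spreads the same cost evenly over the equipment's life for the earnings figures.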
More info can be found here
In this section we present the virtualized computational, storage and networking infrastructure, required for the efficient operation of the existing VREs.
For VRE Cluster #1, the following resources are required for the continuous and smooth operation of the VRE services.
- Around 100 MB total storage per VRE
- VMs of 2 cores and 4 GB RAM per VRE
For VRE Cluster #2, the following resources are required.
- Around 300 MB total storage per VRE
- 2 VMs of 2 cores and 8 GB RAM per VRE
- A cluster of 6 VMs of 4 cores and 8 GB RAM that can be shared among VREs and be utilized to execute complex analysis algorithms
For both VRE clusters, the aforementioned specifications suffice to accommodate 3-5 concurrent end-users per VRE.
User GuideLines & Policies¶
Usage / licensing terms and guidelines¶
Blue Economy VRE Cluster 1¶
- The services of the Blue Economy VRE can be used by any user registered on the VRE. An aquaculture company determines which of its employees have access to the VRE services. This access is granted only via the VRE Manager.
- Cluster VREs will be made available only to the predefined respective aquafarming SME (Stage 1 only).
- The uploaded data are a valuable asset and crucial for the viability of each aquaculture company. Thus, the datasets are accessible only to the owner of the VRE, namely the aquaculture company and its registered employees.
- The datasets can be made accessible to another aquaculture company only with the explicit permission of the owner of the datasets.
- The enterprise datasets are confidential and protected by the BlueBRIDGE data preservation strategy.
Blue Economy VRE Cluster 2¶
- The services of the SIASPA VRE can be employed by any registered user of the VRE.
- Uploaded datasets may become available to the entire D4Science infrastructure.
In the Blue Economy VRE Tools User Guide there are detailed descriptions regarding the way VRE tools can be used.