Task #4782

Provide ansible playbook for dataminer deployment

Added by Roberto Cirillo almost 3 years ago. Updated over 2 years ago.

Status: Closed
Start date: Jul 22, 2016
Priority: Urgent
Due date:
Assignee: Roberto Cirillo
% Done: 100%
Category: Other
Sprint: D4Science Infrastructure Upgrade to gCube 4.0.0
Infrastructure: Production
Milestones:
Duration:

Description

Provide an Ansible playbook that deploys the dataminer instances in the production environment:

dataminer1-p-d4s.d4science.org
dataminer2-p-d4s.d4science.org
dataminer3-p-d4s.d4science.org
dataminer4-p-d4s.d4science.org
dataminer5-p-d4s.d4science.org
dataminer6-p-d4s.d4science.org

These instances should run on the following scopes:

/d4science.research-infrastructures.eu/gCubeApps/ScalableDataMining
/d4science.research-infrastructures.eu/gCubeApps/BiodiversityLab
/d4science.research-infrastructures.eu/gCubeApps/StockAssessment
/d4science.research-infrastructures.eu/gCubeApps/SoBigData.eu
/d4science.research-infrastructures.eu/gCubeApps/Performance Evaluation In Aquaculture
/d4science.research-infrastructures.eu/gCubeApps/FAO Tuna Atlas
/d4science.research-infrastructures.eu/gCubeApps/Protected Area Impact Maps
/d4science.research-infrastructures.eu/gCubeApps/ENVRIPlus

History

#1 Updated by Roberto Cirillo almost 3 years ago

  • Status changed from New to In Progress

#2 Updated by Roberto Cirillo almost 3 years ago

  • % Done changed from 0 to 100
  • Status changed from In Progress to Feedback

#3 Updated by Andrea Dell'Amico almost 3 years ago

A couple of notes:

dataminer1-p-d4s.d4science.org
dataminer2-p-d4s.d4science.org

will be upgraded next Monday. The other servers are running, and the haproxy configuration has been changed so that all requests to dataminer-bigdata.d4science.org are directed to dataminer[3:6]-p-d4s.d4science.org.

The haproxy stats URL has been opened to our network, so the frontend and backend status can be seen here: http://dataminer.d4science.org:8880/
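
For readers unfamiliar with the dataminer[3:6] notation used above: it is an inclusive numeric host range, as found in Ansible inventories. A minimal sketch of how such a pattern expands into individual hostnames follows; the function name `expand_hosts` is hypothetical and is not part of the playbook discussed in this ticket.

```python
import re

# Matches a single inclusive numeric range such as "[3:6]".
RANGE = re.compile(r"\[(\d+):(\d+)\]")


def expand_hosts(pattern):
    """Expand one "[start:end]" range in a host pattern (both ends inclusive).

    Hypothetical helper for illustration only; a pattern without a range
    is returned unchanged as a one-element list.
    """
    match = RANGE.search(pattern)
    if match is None:
        return [pattern]
    start, end = int(match.group(1)), int(match.group(2))
    prefix, suffix = pattern[:match.start()], pattern[match.end():]
    return [f"{prefix}{i}{suffix}" for i in range(start, end + 1)]


for host in expand_hosts("dataminer[3:6]-p-d4s.d4science.org"):
    print(host)
```

Under this reading, the backend of dataminer-bigdata.d4science.org consists of dataminer3 through dataminer6.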

#4 Updated by Gianpaolo Coro almost 3 years ago

The dataminer[3:6] services are not running because the 52North WPS 3.5.0 libraries are installed alongside the (required) 3.3.2 libraries. The 3.5.0 libraries should be removed.

#5 Updated by Andrea Dell'Amico almost 3 years ago

I've just added the following files:

52n-wps-algorithm-3.5.0.jar
52n-wps-commons-3.5.0.jar
52n-wps-database-3.5.0.jar
52n-wps-io-3.5.0.jar
52n-wps-io-impl-3.5.0.jar
52n-wps-server-3.5.0.jar

to the list of jars that need to be deleted. But this approach is unsustainable: we need a more deterministic way to produce a working dataminer webapp instance, and we need to be able to reproduce the same installation for each specific version.
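
As an illustration of a more deterministic check, the sketch below flags artifacts that appear in more than one version in a list of jar file names. It is only a sketch under stated assumptions: the function name `find_version_conflicts` and the filename pattern are hypothetical, and the 3.3.2 file names in the example are assumed from the version number mentioned in this ticket rather than taken from the servers.

```python
import re
from collections import defaultdict

# Assumed naming scheme: <artifact>-<version>.jar, where the version starts
# with dotted numbers (e.g. "3.5.0" or "2.4.0-4.0.0-129376").
JAR_PATTERN = re.compile(r"^(?P<name>.+?)-(?P<version>\d+(?:\.\d+)+.*)\.jar$")


def find_version_conflicts(jar_names):
    """Return {artifact: [versions]} for artifacts present in several versions."""
    versions = defaultdict(set)
    for jar in jar_names:
        match = JAR_PATTERN.match(jar)
        if match:  # file names that do not fit the assumed scheme are skipped
            versions[match.group("name")].add(match.group("version"))
    return {name: sorted(found) for name, found in versions.items() if len(found) > 1}


# Example input: the 3.3.2 names are assumptions made for illustration.
jars = [
    "52n-wps-io-3.5.0.jar",
    "52n-wps-io-3.3.2.jar",
    "52n-wps-io-impl-3.5.0.jar",
    "home-library-2.4.0-4.0.0-129376.jar",
]
print(find_version_conflicts(jars))
```

A check of this kind could run against the contents of tomcat/webapps/wps/WEB-INF/lib after each deployment, instead of maintaining a hand-written list of jars to delete.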

@roberto.cirillo@isti.cnr.it I also made the 1.2.0-4.0.0-129209 version the default.

I'm running the playbook on dataminer[3:6] now.

#6 Updated by Gianpaolo Coro almost 3 years ago

Yes Andrea, I will fix the 52North dependencies as soon as I am back at work on DataMiner.

However, I have checked the installation. It seems OK: I ran tests to check the capabilities and the correctness of the algorithm descriptions.
The service cannot run processes to completion, though, because there are issues in the interaction with the Workspace.

I see this error when I run any computation:

java.lang.ClassCastException: java.lang.String cannot be cast to org.gcube.common.homelibary.model.items.ItemDelegate
        at org.gcube.common.homelibrary.jcr.workspace.servlet.JCRSession.saveItem(JCRSession.java:360)
        at org.gcube.common.homelibrary.jcr.workspace.JCRWorkspaceItem.save(JCRWorkspaceItem.java:234)
        at org.gcube.common.homelibrary.jcr.workspace.JCRAbstractWorkspaceFolder.setSystemFolder(JCRAbstractWorkspaceFolder.java:848)
        at org.gcube.dataanalysis.wps.statisticalmanager.synchserver.mapping.dataspace.DataspaceManager.createFoldersNetwork(DataspaceManager.java:95)
        at org.gcube.dataanalysis.wps.statisticalmanager.synchserver.mapping.dataspace.DataspaceManager.writeProvenance(DataspaceManager.java:399)
        at org.gcube.dataanalysis.wps.statisticalmanager.synchserver.mapping.dataspace.DataspaceManager.run(DataspaceManager.java:78)
        at java.lang.Thread.run(Thread.java:745)

I guess this is due either to a wrong home-library jar or to the Workspace system not having been updated in the production environment.
These are the home-library jars on the machine:

-rw-r--r-- 1 gcube gcube 302415 Jul 22 14:40 tomcat/webapps/wps/WEB-INF/lib/home-library-jcr-2.4.0-4.0.0-130598.jar
-rw-r--r-- 1 gcube gcube 121177 Jul 22 14:40 tomcat/webapps/wps/WEB-INF/lib/home-library-model-1.3.0-4.0.0-129370.jar
-rw-r--r-- 1 gcube gcube 160385 Jul 22 14:40 tomcat/webapps/wps/WEB-INF/lib/home-library-2.4.0-4.0.0-129376.jar

@valentina.marioli@isti.cnr.it or @roberto.cirillo@isti.cnr.it could you clarify, please?

#7 Updated by Valentina Marioli almost 3 years ago

The issue regarding home-library is due to the fact that Jackrabbit is not updated, so the folder cannot be set as "system folder".

#8 Updated by Gianpaolo Coro almost 3 years ago

dataminer[3:6] have been tested on non-geospatial algorithms. I'm waiting for the geospatial infrastructure to be up and running to finish the tests.

#9 Updated by Andrea Dell'Amico almost 3 years ago

dataminer[1:2] has been upgraded (for real, this time).

#10 Updated by Gianpaolo Coro almost 3 years ago

dataminer[1:2] work. Perhaps tomorrow I will be able to test the geoprocessing part. If you create the groups of dataminers on the proxy, I can test them as well.

#11 Updated by Gianpaolo Coro almost 3 years ago

I have run complete tests, also on the geoprocessing algorithms. All the dataminers work. Now it's time to implement the proxies.

#12 Updated by Gianpaolo Coro almost 3 years ago

I have measured that dataminer4 and dataminer6 are slower than the other machines. Is it possible to move them to better hardware?

#13 Updated by Andrea Dell'Amico almost 3 years ago

Gianpaolo Coro wrote:

I have run complete tests, also on the geoprocessing algorithms. All the dataminers work. Now it's time to implement the proxies.

If you are referring to haproxy, it's already configured. The hostname is dataminer-sobigdata.d4science.org.

#14 Updated by Gianpaolo Coro almost 3 years ago

Hi Andrea, the "sobigdata" term makes me thrill... :-S

The distribution of the dataminer should be:

dataminer.d4science.org: dataminer [1:2]

dataminer-bigdata.d4science.org: dataminer [3:6]

The second proxy is "dataminer-bigdata", NOT "dataminer-sobigdata", because it will serve real big data computations.

The "soBigData" Project problems are currently not real big data problems and will be managed by the first proxy! If they become real big data problems, I will move them to the other proxy.

Could you please change the dataminer-sobigdata.d4science.org into dataminer-bigdata.d4science.org?

#15 Updated by Andrea Dell'Amico almost 3 years ago

Gianpaolo Coro wrote:

Hi Andrea, the "sobigdata" term makes me thrill... :-S

The distribution of the dataminer should be:

dataminer.d4science.org: dataminer [1:2]

dataminer-bigdata.d4science.org: dataminer [3:6]

Could you please change the dataminer-sobigdata.d4science.org into dataminer-bigdata.d4science.org?

It's dataminer-bigdata.d4science.org already. Sorry, I wrote sobigdata without checking first.

#16 Updated by Gianpaolo Coro almost 3 years ago

OK, everything works correctly. I think we can close this ticket.

#17 Updated by Roberto Cirillo over 2 years ago

  • Status changed from Feedback to Closed
