Bug #9118

Smartgears container: services run in only one scope although the container runs in multiple scopes

Added by Roberto Cirillo over 2 years ago. Updated over 2 years ago.

Status: Closed
Start date: Jun 30, 2017
Priority: Immediate
Due date:
Assignee: Lucio Lelii
% Done:

Sprint: zz - Unsprintable


I've verified the problem above on the dewn05 node.
On this host there were two services, and the container was running in 6 VREs (5 under FARM and 1 under gCubeApps).
The HostingNode resource was published correctly under all the defined scopes, but all the GCoreEndpoints (including the GEs related to the smartgears-enabled services) were running in only one scope. I've tried restarting and cleaning the container, but nothing changed.
I've resolved the problem by starting the container with only the FARM VO scope and the gCubeApps VRE scope, and then adding the FARM VRE scopes to the services manually.

Note that I've also tried restarting the container with all the VRE scopes and then adding the FARM VO scope, but the behavior did not change.

ghn.log (206 KB) Roberto Cirillo, Jul 13, 2017 06:47 PM

container.xml (1.13 KB) Roberto Cirillo, Jul 13, 2017 06:53 PM


#2 Updated by Lucio Lelii over 2 years ago

  • Status changed from New to Closed

I cannot reproduce the problem; please reopen it in case it happens again.

#3 Updated by Roberto Cirillo over 2 years ago

  • Priority changed from Normal to High
  • Status changed from Closed to In Progress
  • File ghn.log added

The same behavior has been verified again in the preprod environment. This time the host is dl20.di.uoa.gr and the scopes are the following:


The HostingNode was published in all the scopes above, but the related services were published in only one scope. The ghn.log and container.xml are attached.
I've found the following entries in ghn.log:

19:23:45.755 [pool-7-thread-3] TRACE AuthorizationProxy: invalid entry found in cache for token 6afd3340-7b89-424b-a191-6bca********, contacting auth service
19:23:45.781 [localhost-startStop-1] INFO  ApplicationManager: initilizing context TechnoEconomicAnalysisService 
19:23:45.782 [localhost-startStop-1] INFO  ApplicationManager: webApp TechnoEconomicAnalysisService initialized 
19:23:45.840 [pool-7-thread-3] TRACE AuthorizationProxy: invalid entry found in cache for token c3fc612d-e9a5-410d-a1f8-b63a********, contacting auth service
19:23:45.913 [pool-7-thread-3] TRACE AuthorizationProxy: invalid entry found in cache for token 3b8c9e87-499d-48c7-a728-ad91********, contacting auth service

Could this be the problem?

For the moment I've worked around the problem by running the HostingNode only at VO level and adding the VRE scopes via the monitor.

#4 Updated by Roberto Cirillo over 2 years ago

#5 Updated by Kostas Kakaletris over 2 years ago

We have a problem correctly registering multiple VREs running in a container to the IS.

To be more precise: the first time, we correctly register multiple VREs under a container that is running more than one service (possibly under multiple VOs too), but if we update a service and restart the container, or if we add new VREs, the services will probably not be registered correctly.

In that case the host profile shows all the VREs correctly, but the service profile shows just one VRE (or at least not all the correct VREs).

An example is on dl20.di.uoa.gr (PreProduction), which has two services in one container belonging to 2 VREs. We used tokens at the beginning, and when the problem first occurred @roberto.cirillo@isti.cnr.it added the services manually to the corresponding VREs through monitoring.
When I upgraded the services, I cleaned the container state first; as Roberto explained to me, this caused the registration to be deleted as well, so he advised me not to do that in such cases.
Meanwhile I added the tokens to container.xml again, but the VREs weren't registered. @lucio.lelii@isti.cnr.it or @roberto.cirillo@isti.cnr.it fixed it and the services were shown correctly again (with container.xml still holding the tokens).
Now I updated one of the two services following this procedure: stopped the service -> checked on the monitor that it was stopped -> deleted the war and all folders related to the service I was updating (simulefish) -> added the new war and started the service. The problem occurred again.

It is important to solve the issue, not just fix this specific installation, because we are starting to have services running on multiple VREs (and probably multiple VOs) in production. This problem causes large, unexpected downtime when we just want to add a new VRE.

Kind Regards,

#6 Updated by Pasquale Pagano over 2 years ago

  • Priority changed from High to Immediate

#7 Updated by Lucio Lelii over 2 years ago

  • % Done changed from 0 to 30

I found the problem: it is caused by a wrong order in the requests sent from the registry service to the IC. After 20 minutes the GCoreEndpoint is republished with the right scopes.
I'm trying to solve it by changing the registry-publisher.
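To illustrate the failure mode described above, here is a minimal, hypothetical sketch (not the actual registry-publisher code; the function and scope names are invented): if each publish request overwrites the endpoint's full scope list and the requests arrive at the IC out of order, a stale single-scope request applied last leaves the GCoreEndpoint registered in only one scope, even though a request with the complete list was also sent.

```python
def apply_requests(requests):
    """Apply publish requests in arrival order.

    Each request carries a full scope list and overwrites the
    previous one (last-write-wins), mimicking the reordering bug.
    """
    profile = {"scopes": []}
    for scopes in requests:
        profile["scopes"] = list(scopes)  # later request replaces earlier ones
    return profile


# Correct order: the request with the complete scope list arrives last.
ok = apply_requests([["/gCubeApps/VRE2"],
                     ["/FARM", "/FARM/VRE1", "/gCubeApps/VRE2"]])

# Reordered: a stale single-scope request arrives after the complete one,
# so the endpoint ends up published in only one scope.
bad = apply_requests([["/FARM", "/FARM/VRE1", "/gCubeApps/VRE2"],
                      ["/gCubeApps/VRE2"]])

print(ok["scopes"])   # all three scopes survive
print(bad["scopes"])  # only one scope survives
```

This also matches the observed recovery: once a later, correctly ordered republish cycle runs (here, after 20 minutes), the complete scope list wins again.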

#8 Updated by Lucio Lelii over 2 years ago

  • % Done changed from 30 to 90

I'm testing the new resource registry library in the dev environment. It will be released in 4.6.1.

#9 Updated by Lucio Lelii over 2 years ago

  • % Done changed from 90 to 100
  • Status changed from In Progress to Closed
