Smartgears container: services run in only one scope although the container runs in multiple scopes
|Status:||Closed||Start date:||Jun 30, 2017|
|Assignee:||Lucio Lelii||% Done:|
|Sprint:||zz - Unsprintable|
I've verified the problem above on the dewn05 node.
On this host there were two services, and the container was running in 6 VREs (5 under FARM and 1 under gCubeApps).
The hostingNode resource was published correctly under all the defined scopes, but all the GCoreEndpoints (including the GEs related to the smartgears enabling services) were running in only one scope. I've tried to restart and clean the container, but nothing changed.
I've resolved the problem by starting the container with only the FARM VO scope and the gCubeApps VRE scope, and then adding the FARM VRE scopes to the services manually.
Notice that I've also tried the reverse (restarting the container with all the VRE scopes and adding the FARM VO scope afterwards), but the behavior did not change.
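For reference, the workaround above amounts to limiting which scope tokens the container is given at startup. A purely hypothetical sketch of such a configuration (the element names below are assumptions for illustration, not the exact SmartGears container.xml schema; the token values are placeholders):

```xml
<!-- Hypothetical sketch, NOT the exact SmartGears container.xml schema.
     Start the container with only the FARM VO token and the gCubeApps
     VRE token; the remaining FARM VRE scopes are then added to each
     service manually (e.g. via the monitor). -->
<container>
  <tokens>
    <token><!-- FARM VO token here --></token>
    <token><!-- gCubeApps VRE token here --></token>
  </tokens>
</container>
```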
#3 Updated by Roberto Cirillo over 2 years ago
- Priority changed from Normal to High
- Status changed from Closed to In Progress
- File ghn.log added
The same behavior has been verified again in the preprod environment. This time the host is dl20.di.uoa.gr and the scopes are the following:
/gcube/preprod
/gcube/preprod/preVRE
/gcube/preprod/Dorne
The hosting node was published in all the scopes above, but the related services were published in only one scope. The ghn.log and the container.xml are attached.
I've found the following logs in ghn.log:
19:23:45.755 [pool-7-thread-3] TRACE AuthorizationProxy: invalid entry found in cache for token 6afd3340-7b89-424b-a191-6bca********, contacting auth service
19:23:45.781 [localhost-startStop-1] INFO ApplicationManager: initilizing context TechnoEconomicAnalysisService
19:23:45.782 [localhost-startStop-1] INFO ApplicationManager: webApp TechnoEconomicAnalysisService initialized
19:23:45.840 [pool-7-thread-3] TRACE AuthorizationProxy: invalid entry found in cache for token c3fc612d-e9a5-410d-a1f8-b63a********, contacting auth service
19:23:45.913 [pool-7-thread-3] TRACE AuthorizationProxy: invalid entry found in cache for token 3b8c9e87-499d-48c7-a728-ad91********, contacting auth service
Could this be the problem?
For the moment I've worked around the problem by running the HostingNode only at VO level and adding the VRE scopes via the monitor.
#5 Updated by Kostas Kakaletris over 2 years ago
We have a problem registering multiple VREs running in a container correctly to the IS.
To be more precise: the first time, multiple VREs register correctly under a container running more than one service (possibly under multiple VOs too), but if we update a service and restart the container, or if we add new VREs, then the services will likely not be registered correctly.
In that case the host profile shows all the VREs correctly, but the service profile shows just one VRE (or at least not all the correct VREs).
An example is dl20.di.uoa.gr (PreProduction), which has two services in one container belonging to 2 VREs. We used tokens at the beginning, and when the problem first occurred @firstname.lastname@example.org manually added the services to the corresponding VREs through monitoring.
When I upgraded the services, I cleaned the container state beforehand; as Roberto explained to me, this caused the registration to be deleted too, so he advised me not to do that in such cases.
Meanwhile I added the tokens to the container.xml again, but the VREs weren't registered. @email@example.com or @firstname.lastname@example.org fixed it and the services were shown correctly again (with container.xml still holding the tokens).
Now I just updated one of the two services following this procedure: stopped the service -> checked on the monitor that it was stopped -> deleted the war and all folders related to the service I was updating (simulefish) -> added the new war and started the service; the problem occurred again.
It is important to solve the issue itself, not just fix the specific installation, because we are starting to have services running on multiple VREs (and probably multiple VOs) in production. This problem causes big unexpected downtime when we just want to add a new VRE.
#7 Updated by Lucio Lelii over 2 years ago
- % Done changed from 0 to 30
I found the problem: it is due to a wrong order of the requests sent from the registry service to the IC; after 20 minutes the GCoreEndpoint is republished with the right scopes.
I'm trying to solve it by changing the registry-publisher.
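To illustrate the class of bug described above, here is a minimal, self-contained sketch (all names and semantics are hypothetical, not the actual gCube registry-publisher or IC API) of how reordered publish requests can leave an endpoint registered in only one scope when a full-profile update overwrites earlier incremental ones:

```python
# Hypothetical sketch: a toy "IC" that applies profile-update requests
# with last-write-wins semantics. Names do NOT reflect the real gCube API.
class ToyInformationCollector:
    def __init__(self):
        self.endpoint_scopes = set()

    def handle(self, op, scopes):
        if op == "set-scopes":        # full-profile update: replaces the scope set
            self.endpoint_scopes = set(scopes)
        elif op == "add-scope":       # incremental update: adds scopes
            self.endpoint_scopes.update(scopes)

ic = ToyInformationCollector()

# Intended order: publish the full profile first, then add the VRE scopes.
intended = [
    ("set-scopes", ["/gcube/preprod"]),
    ("add-scope", ["/gcube/preprod/preVRE"]),
    ("add-scope", ["/gcube/preprod/Dorne"]),
]

# If the full-profile update is processed LAST instead of first,
# it silently discards the scopes added by the earlier requests:
reordered = [intended[1], intended[2], intended[0]]
for op, scopes in reordered:
    ic.handle(op, scopes)

print(sorted(ic.endpoint_scopes))  # → ['/gcube/preprod']
```

The endpoint ends up visible in the VO scope only, matching the observed symptom; a later periodic republish in the correct order (here, re-sending `intended` as listed) would restore all three scopes, which is consistent with the reported 20-minute self-healing.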