SO BPMN scaleOut CL - Time-gap issue #appc #oom #so #casablanca


jkzcristiano
 

Dear all,

Could you help me understand the following "performance issue" in ONAP?


I am inspecting the SO BPMN logs during a scaleOut CL operation. Sometimes the BPMN logs stall for a while at this point:

2019-05-09T08:33:30.626Z|6dced4e5-507b-4d3e-b0d3-445897ab8843| o.o.s.b.i.flowspecific.tasks.GenericVnfHealthCheck - Running APP-C action: HealthCheck
2019-05-09T08:33:30.626Z|6dced4e5-507b-4d3e-b0d3-445897ab8843| o.o.s.b.i.flowspecific.tasks.GenericVnfHealthCheck - VNFID: ffebd418-feee-4cfb-918c-397b05e52db8
2019-05-09T08:33:30.627Z|6dced4e5-507b-4d3e-b0d3-445897ab8843| o.onap.so.client.appc.ApplicationControllerSupport - LCM Kit input message follows: {
"common-header" : {
"timestamp" : "2019-05-09T08:33:30.627Z",
"api-ver" : "2.00",
"originator-id" : "MSO",
"request-id" : "6dced4e5-507b-4d3e-b0d3-445897ab8843",
"sub-request-id" : "6aaf6bad-91a1-4e5e-b8cf-aec8d86f7f48",
"flags" : {
"mode" : "NORMAL",
"force" : "FALSE",
"ttl" : 65000
}
},
"action" : "HealthCheck",
"action-identifiers" : {
"vnf-id" : "ffebd418-feee-4cfb-918c-397b05e52db8"
},
"payload" : "{\"request-parameters\":{\"host-ip-address\":\"10.0.2.40\"}}"
}
2019-05-09T08:33:30.628Z|6dced4e5-507b-4d3e-b0d3-445897ab8843| o.onap.appc.client.impl.protocol.AsyncProtocolImpl - Successfully send message: {"version":"2.0","type":null,"body":{"input":{"common-header":{"timestamp":"2019-05-09T08:33:30.627Z","api-ver":"2.00","originator-id":"MSO","request-id":"6dced4e5-507b-4d3e-b0d3-445897ab8843","sub-request-id":"6aaf6bad-91a1-4e5e-b8cf-aec8d86f7f48","flags":{"mode":"NORMAL","force":"FALSE","ttl":65000}},"action":"HealthCheck","action-identifiers":{"vnf-id":"ffebd418-feee-4cfb-918c-397b05e52db8"},"payload":"{\"request-parameters\":{\"host-ip-address\":\"10.0.2.40\"}}"}},"rpc-name":"health-check","correlation-id":"6dced4e5-507b-4d3e-b0d3-445897ab8843-6aaf6bad-91a1-4e5e-b8cf-aec8d86f7f48","cambria.partition":null}
2019-05-09T08:33:30.649Z|| c.a.n.c.client.impl.CambriaSimplerBatchPublisher - sending 1 msgs to /events/APPC-LCM-READ. Oldest: 21 ms
2019-05-09T08:33:30.650Z|| com.att.nsa.apiClient.http.HttpClient - POST http://message-router.onap:3904/events/APPC-LCM-READ will send credentials over a clear channel.
2019-05-09T08:33:30.650Z|| com.att.nsa.apiClient.http.HttpClient - POST http://message-router.onap:3904/events/APPC-LCM-READ (as VIlbtVl6YLhNUrtU) ...
2019-05-09T08:33:30.655Z|| com.att.nsa.apiClient.http.HttpClient -  --> HTTP/1.1 200 OK
2019-05-09T08:33:30.655Z|| c.a.n.c.client.impl.CambriaSimplerBatchPublisher - cambria reply ok (6 ms):{"serverTimeMs":0,"count":1}

###########################
# As you can see, there is a time gap of about two minutes (08:33:30.655 to 08:35:36.229) between the line above and the line below. This happens only sometimes.
###########################

2019-05-09T08:35:36.229Z|| com.att.nsa.apiClient.http.HttpClient -  --> HTTP/1.1 200 OK
2019-05-09T08:35:36.229Z|| o.onap.appc.client.impl.protocol.AsyncProtocolImpl - Successfully fetched 0 messages
2019-05-09T08:35:36.229Z|| c.att.nsa.cambria.client.impl.CambriaConsumerImpl - UEB GET /events/APPC-LCM-WRITE/8875e79f-f1c8-4cfb-884c-98d32ae493bb/8875e79f-f1c8-4cfb-884c-98d32ae493bb?timeout=360000&limit=1000
2019-05-09T08:35:36.229Z|| com.att.nsa.apiClient.http.HttpClient - GET http://message-router.onap:3904/events/APPC-LCM-WRITE/8875e79f-f1c8-4cfb-884c-98d32ae493bb/8875e79f-f1c8-4cfb-884c-98d32ae493bb?timeout=360000&limit=1000 will send credentials over a clear channel.
2019-05-09T08:35:36.230Z|| com.att.nsa.apiClient.http.HttpClient - GET http://message-router.onap:3904/events/APPC-LCM-WRITE/8875e79f-f1c8-4cfb-884c-98d32ae493bb/8875e79f-f1c8-4cfb-884c-98d32ae493bb?timeout=360000&limit=1000 (as VIlbtVl6YLhNUrtU) ...
2019-05-09T08:35:37.356Z|| com.att.nsa.apiClient.http.HttpClient -  --> HTTP/1.1 200 OK
2019-05-09T08:35:37.357Z|| o.onap.appc.client.impl.protocol.AsyncProtocolImpl - Successfully fetched 2 messages
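
To locate such stalls in a longer log without reading it by eye, a minimal Python sketch follows. It assumes every log record starts with an ISO-8601 UTC timestamp followed by "|", as in the excerpt above; the function names (parse_ts, find_gaps) and the 10-second threshold are illustrative choices, not part of SO or ONAP.

```python
from datetime import datetime

def parse_ts(line):
    """Return the leading timestamp of a log line, or None if it has none."""
    stamp = line.split("|", 1)[0].strip()
    try:
        return datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%fZ")
    except ValueError:
        # Continuation lines (e.g. the pretty-printed JSON body) carry no timestamp.
        return None

def find_gaps(lines, threshold_s=10.0):
    """Return (previous_ts, next_ts, seconds) for consecutive records at least threshold_s apart."""
    gaps = []
    prev = None
    for line in lines:
        ts = parse_ts(line)
        if ts is None:
            continue
        if prev is not None:
            delta = (ts - prev).total_seconds()
            if delta >= threshold_s:
                gaps.append((prev, ts, delta))
        prev = ts
    return gaps

# The two records on either side of the stall, copied from the excerpt above.
sample = [
    '2019-05-09T08:33:30.655Z|| c.a.n.c.client.impl.CambriaSimplerBatchPublisher - cambria reply ok (6 ms):{"serverTimeMs":0,"count":1}',
    '2019-05-09T08:35:36.229Z|| com.att.nsa.apiClient.http.HttpClient -  --> HTTP/1.1 200 OK',
]

for start, end, seconds in find_gaps(sample):
    print(f"gap of {seconds:.3f} s between {start.time()} and {end.time()}")
```

Applied to a full log file (for example `find_gaps(open(path))`, with `path` pointing to wherever the BPMN log was saved), this lists every stall longer than the threshold, which makes it easier to line the gaps up against timestamps in other components' logs.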



Just a guess: is this time gap due to SO waiting for a message to appear in the APPC-LCM-READ MR topic?
I have a recent (~2 weeks old) OOM Casablanca install. Where can I see the related logs covering the time gap? If this issue can be inspected via the APPC logs, where are those logs located (pod + path)?


Any help would be appreciated. I think this is an interesting "performance issue" for the ONAP community.

Kind regards,
Xoan







