CD server live for pomba or logging healthcheck on the hour - Jenkins JobBuilder validation non-triggered batch #log #oom #lfn
Michael O'Brien <frank.obrien@...>
This morning we discussed the recent POMBA healthcheck failure (1 failing, 2 passing) that occurred as a regression on the 28th.
The cause was a dependent (multi-merge) review, where the patches need to be merged in sequence; the sequence was off on this one.
Fixed: Prudence Au (my co-PTL) in
helm install verified:
Basic Pomba AAI-context-builder Health Check | PASS |
Basic Pomba SDC-context-builder Health Check | PASS |
Basic Pomba Network-discovery-context-builder Health Check | PASS |
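A minimal sketch of how the result rows above could be checked automatically, assuming the healthcheck output keeps the "name | PASS |" row shape shown here. The names parse_healthchecks, failing and SAMPLE are hypothetical and not part of the robot scripts or any ONAP tooling.

```python
import re

# Matches summary rows shaped like the ones quoted in this mail, e.g.
#   Basic Pomba AAI-context-builder Health Check | PASS |
ROW = re.compile(r"^\s*(?P<name>.+?)\s*\|\s*(?P<status>PASS|FAIL)\s*\|\s*$")

# The three rows from the verified helm install (copied from above).
SAMPLE = """\
Basic Pomba AAI-context-builder Health Check | PASS |
Basic Pomba SDC-context-builder Health Check | PASS |
Basic Pomba Network-discovery-context-builder Health Check | PASS |
"""


def parse_healthchecks(report: str) -> dict[str, str]:
    """Return {check name: 'PASS' or 'FAIL'} for every summary row found."""
    results = {}
    for line in report.splitlines():
        match = ROW.match(line)
        if match:
            results[match.group("name")] = match.group("status")
    return results


def failing(results: dict[str, str]) -> list[str]:
    """Names of checks that did not pass, in report order."""
    return [name for name, status in results.items() if status != "PASS"]


if __name__ == "__main__":
    checks = parse_healthchecks(SAMPLE)
    print(len(checks), "checks,", len(failing(checks)), "failing")
```

An hourly job could feed the captured healthcheck output to parse_healthchecks and flag the run whenever failing returns a non-empty list.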
Going forward, anyone on the POMBA team or from ONAP can check the status of a helm deployment every hour, specific to the 6 healthchecks for logging and POMBA. Each build subset with 3 --set overrides takes only 20 min to run. Later, this type of helm install/deploy can be added to the helm verify already run as part of the JobBuilder.
The regression was the result of a change going into the application that depended on an OOM config merge. The issue was that the change was tested against a local OOM change, instead of the OOM change being merged just before the POMBA merge.
In the future, when each application owns its config (in the works), these sorts of multi-commit transactions can be done more easily. For now they need to be closely managed: test both patches, post results, post healthcheck/container status, and merge in dependency order.
I am also adding a 4th healthcheck on the kibana pod this week; it is by nature the last to come up on log, clamp and pomba.
In the meantime I have requested:
1: that HC and container status be posted to the review/jira
2: that a clean build be tested with the patch(es), ideally on a separate machine if possible
3: that each repo merge be linked with a described merge sequence (i.e. post in POMBA not to merge before the OOM patch is merged)
4: that master be tested immediately after both merges; ideally this is the job of CI/CD, with a triggered deployment in the queue
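The merge-sequence request in item 3 can be sketched as a small dependency check. This is only an illustration using Python's standard graphlib module (Python 3.9+); the patch names "oom-config-change" and "pomba-change" and the helper names are hypothetical, and nothing here reflects how Gerrit or the ONAP CI actually tracks dependencies.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical patch names: the POMBA change depends on the OOM config change.
DEPENDS_ON = {"pomba-change": {"oom-config-change"}}


def merge_order(depends_on: dict[str, set[str]]) -> list[str]:
    """Return one valid merge sequence, dependencies first."""
    return list(TopologicalSorter(depends_on).static_order())


def merged_out_of_order(
    merged: list[str], depends_on: dict[str, set[str]]
) -> list[tuple[str, str]]:
    """Return (patch, dependency) pairs where the patch was merged
    before its dependency, or the dependency was never merged."""
    position = {patch: index for index, patch in enumerate(merged)}
    violations = []
    for patch, deps in depends_on.items():
        if patch not in position:
            continue  # patch not merged yet, nothing to check
        for dep in sorted(deps):
            # A dependency that is absent from the merge list counts as "too late".
            if position.get(dep, len(merged)) > position[patch]:
                violations.append((patch, dep))
    return violations


if __name__ == "__main__":
    print("merge in this order:", merge_order(DEPENDS_ON))
    print("violations:", merged_out_of_order(["pomba-change", "oom-config-change"], DEPENDS_ON))
```

Posting the output of merge_order in each review would give committers the described sequence, and merged_out_of_order could be run against the actual merge history after the fact.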
We have CD in onapci.org at a 4 hour interval to check.
We have CD in kibana.onap.info:5601 at a 1 hour interval to check (a subset of the full deploy, for speed and cost reduction).
Note: I am only deploying log, robot and kibana here, so the failed HCs for other components are expected.
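Because only a subset is deployed, a reader of the dashboard has to separate expected failures from real ones. A minimal sketch of that filter, assuming results are already reduced to a component-to-status mapping; the component keys and the function name unexpected_failures are hypothetical and for illustration only.

```python
def unexpected_failures(results: dict[str, str], deployed: set[str]) -> list[str]:
    """Return deployed components whose healthcheck is missing or not PASS.

    Failures for components outside `deployed` are ignored, since those
    components were never installed in the subset deployment.
    """
    return sorted(name for name in deployed if results.get(name) != "PASS")


if __name__ == "__main__":
    # Illustrative data only: keys are hypothetical component labels.
    results = {"log": "PASS", "kibana": "FAIL", "aai": "FAIL", "sdc": "FAIL"}
    deployed = {"log", "kibana"}
    print(unexpected_failures(results, deployed))
```

Treating a missing result the same as a failure is a deliberate choice here: a deployed component that reports nothing is as suspicious as one that reports FAIL.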
Healthcheck pass/fail dashboards
server on (protected to secure ports 10249-10255 from crypto miners)