
The ping test starts at the beginning of the "update run" phase and stops after it finishes, that is, after all roles have been updated.

With this patch we stop, test and restart the ping in between each role update. This means that:

 1. we detect errors earlier;
 2. we detect errors related to new flows being created during the update run.

Point 2 turned out to be an important test, as OVN can have existing flows still working while new flows break. With this new behavior of the ping test we catch such errors.

The downside is that we are even more sensitive in any percentage-based testing, as the same number of errors gives a higher percentage when we spend less time in the test for each run. This could be seen as another improvement.

We're splitting the ping test into two stages so that if the ping fails to start (as it would for this particular issue) we detect it immediately instead of waiting for the end of the run.

When doing a batch update (all roles in parallel) we deactivate that mechanism and fall back to the previous one, as there is no in-between role step there.

We also prevent the stop ping from searching all home subdirectories, as I had an issue in local testing where one subdirectory had unreadable files (after a local podman run). This shouldn't happen in CI, but is good to have for local testing.

Change-Id: I7f30f5361773b96de13325f5038c89477b575e65
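To illustrate the percentage effect described in the commit message, here is a minimal sketch. The helper `loss_percent`, the figure of three lost pings, and the one-ping-per-second rate are all assumptions made for the example; none of them come from the actual test scripts.

```shell
# Hypothetical helper: packet loss as a percentage of pings sent.
loss_percent() {
    lost=$1
    sent=$2
    awk -v lost="$lost" -v sent="$sent" 'BEGIN { printf "%.2f\n", 100 * lost / sent }'
}

# The same 3 lost pings measured over two assumed durations (1 ping per second).
loss_percent 3 3600   # one ping run covering a whole one-hour update run
loss_percent 3 600    # one ping run covering a single ten-minute role update
```

With these assumed numbers the per-role run reports a loss six times higher than the whole-run measurement for the same three lost pings, so any fixed percentage threshold becomes stricter once the ping is restarted per role.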
15 lines
380 B
YAML
---
- name: l3 agent connectivity wait until vm is ready
  shell: |
    source {{ overcloud_rc }}
    {{ l3_agent_connectivity_check_wait_script }}
  when: l3_agent_connectivity_check

- name: start l3 agent connectivity check
  shell: |
    source {{ overcloud_rc }}
    {{ l3_agent_connectivity_check_start_script }}
  when: l3_agent_connectivity_check
  async: 21660
  poll: 0
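The contents of the wait script are not shown here. As a rough sketch of what a first "wait until ready" stage could look like, the hypothetical function below retries a readiness probe a fixed number of times and returns non-zero if it never succeeds, so that a ping that cannot start is reported before the long-running check is launched. The names `wait_until_ready`, `RETRY_DELAY` and `VM_FIP` are assumptions for illustration only.

```shell
# Hypothetical stage 1: retry a readiness probe and fail fast if it never succeeds.
# Usage: wait_until_ready <attempts> <command> [args...]
wait_until_ready() {
    attempts=$1
    shift
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # Run the probe quietly; stop retrying on the first success.
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep "${RETRY_DELAY:-1}"
    done
    return 1
}

# Example call (VM_FIP is an assumed variable holding the VM's floating IP):
#   wait_until_ready 30 ping -c 1 -W 1 "$VM_FIP" || exit 1
```

Keeping the readiness probe separate from the background ping means a failure in the first task stops the play at that point, while the second task can still be launched with `async` and `poll: 0` as above.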