USHIFT-6957: Use nodename as Prometheus instance#6659
Conversation
Label Prometheus metrics with the Kubernetes NodeName instead of host:port to match the OpenShift cluster-monitoring convention so the same tooling and queries can target MicroShift and OpenShift uniformly.
|
@sjug: This pull request references USHIFT-6957 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "5.0.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
WalkthroughThe PR updates Ansible infrastructure to resolve Kubernetes node names from MicroShift configuration files and propagate that name through Prometheus scrape job configurations, kubelet logging blocks, and network query defaults, while adding CGNAT network support. ChangesKubernetes Node Name Resolution & Prometheus Configuration
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes 🚥 Pre-merge checks | ✅ 11 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (11 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: sjug The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
There was a problem hiding this comment.
🧹 Nitpick comments (3)
ansible/roles/common/tasks/nodename.yml (2)
33-54: ⚡ Quick winAdd error handling for malformed YAML in config files.
The
from_yamlfilter will raise an error if config files contain invalid YAML, but there's no explicit handling for this case. A malformed drop-in will fail the entire playbook rather than gracefully falling back toansible_hostname.Consider wrapping the YAML parsing in a rescue block or validating files before parsing.
🛡️ Option: Add validation task
Before line 40, add:
- name: Validate YAML syntax in config files ansible.builtin.command: cmd: python3 -c "import yaml; yaml.safe_load(open('{{ item }}'))" loop: "{{ _microshift_configs.results | selectattr('content', 'defined') | map(attribute='item') | list }}" register: _yaml_validation failed_when: false changed_when: false - name: Warn about invalid YAML files ansible.builtin.debug: msg: "Warning: {{ item.item }} contains invalid YAML and will be skipped" loop: "{{ _yaml_validation.results | selectattr('rc', 'defined') | selectattr('rc', 'ne', 0) | list }}" when: _yaml_validation is definedThen filter out invalid files in the merge logic.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@ansible/roles/common/tasks/nodename.yml` around lines 33 - 54, The YAML parsing in the set_fact block for _microshift_hostname_override uses from_yaml on r.content and can raise on malformed YAML; update the logic to handle parse errors by wrapping the from_yaml call in a Jinja2 try/rescue (or pre-validate files) so that if parsing fails for a given r in _microshift_configs.results it is skipped and does not abort the playbook, and ensure the namespace value fallback remains ansible_hostname; specifically modify the template inside the Compute hostnameOverride task (referencing _microshift_configs, parsed, ns.value, and from_yaml) to catch exceptions (or check validity before parsing) and only set ns.value when parsing succeeds.
43-54: ⚖️ Poor tradeoffConsider simplifying the Jinja template logic.
The namespace-based iteration and nested conditionals are dense. This could be refactored into a custom Ansible filter or a separate Python script for better testability and maintainability.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@ansible/roles/common/tasks/nodename.yml` around lines 43 - 54, The Jinja loop is too dense—replace the inline namespace/loop with a custom Ansible filter (e.g., filter name extract_hostname_override) that accepts _microshift_configs.results, decodes each r.content, parses YAML, and returns parsed.node.hostnameOverride or '' as a default; then call extract_hostname_override from the template instead of the current block. Implement the filter in filter_plugins (function name extract_hostname_override) to handle b64decode/from_yaml, mapping checks for parsed and parsed.node, and safe defaults so the template becomes a single function call returning ns.value.ansible/setup-node.yml (1)
51-57: 💤 Low valueMinor: Inconsistent capitalization in task name.
The play name uses "Resolve microshift nodename" (line 51) but the task name uses "resolve kubernetes nodename" (line 56). Consider consistent capitalization for readability.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@ansible/setup-node.yml` around lines 51 - 57, The play uses "Resolve microshift nodename" but the task is titled "resolve kubernetes nodename" which is inconsistent; update the task name string "resolve kubernetes nodename" to match the play's capitalization and wording (e.g., "Resolve microshift nodename" or at least capitalize and standardize to "Resolve Kubernetes nodename") so both the play name and the task name are consistent; edit the task title in the include_tasks block that references roles/common/tasks/nodename.yml.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@ansible/roles/common/tasks/nodename.yml`:
- Around line 33-54: The YAML parsing in the set_fact block for
_microshift_hostname_override uses from_yaml on r.content and can raise on
malformed YAML; update the logic to handle parse errors by wrapping the
from_yaml call in a Jinja2 try/rescue (or pre-validate files) so that if parsing
fails for a given r in _microshift_configs.results it is skipped and does not
abort the playbook, and ensure the namespace value fallback remains
ansible_hostname; specifically modify the template inside the Compute
hostnameOverride task (referencing _microshift_configs, parsed, ns.value, and
from_yaml) to catch exceptions (or check validity before parsing) and only set
ns.value when parsing succeeds.
- Around line 43-54: The Jinja loop is too dense—replace the inline
namespace/loop with a custom Ansible filter (e.g., filter name
extract_hostname_override) that accepts _microshift_configs.results, decodes
each r.content, parses YAML, and returns parsed.node.hostnameOverride or '' as a
default; then call extract_hostname_override from the template instead of the
current block. Implement the filter in filter_plugins (function name
extract_hostname_override) to handle b64decode/from_yaml, mapping checks for
parsed and parsed.node, and safe defaults so the template becomes a single
function call returning ns.value.
In `@ansible/setup-node.yml`:
- Around line 51-57: The play uses "Resolve microshift nodename" but the task is
titled "resolve kubernetes nodename" which is inconsistent; update the task name
string "resolve kubernetes nodename" to match the play's capitalization and
wording (e.g., "Resolve microshift nodename" or at least capitalize and
standardize to "Resolve Kubernetes nodename") so both the play name and the task
name are consistent; edit the task title in the include_tasks block that
references roles/common/tasks/nodename.yml.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 17ac164f-4570-41c7-8dd0-e7f8bb042889
📒 Files selected for processing (6)
ansible/roles/add-kubelet-logging/tasks/main.ymlansible/roles/common/tasks/files/prom-network-query.pyansible/roles/common/tasks/nodename.ymlansible/roles/install-logging/templates/prometheus.yml.j2ansible/roles/microshift-start/tasks/main.ymlansible/setup-node.yml
|
@sjug: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Switches the Prometheus
instancelabel fromhost:portto the Kubernetes NodeName so the same queries work uniformly against MicroShift and OpenShift cluster-monitoring. The label value is now resolved by replicating MicroShift's own NodeName logic (node.hostnameOverridefrom/etc/microshift/config.yamland drop-ins underconfig.d/in lexical order, lowercased; fallback to OS hostname) instead of using the inventory hostname, so the scrape label tracks what kubelet actually advertises even when an override is configured.