USHIFT-6957: Use nodename as Prometheus instance by sjug · Pull Request #6659 · openshift/microshift

sjug · 2026-05-12T17:20:14Z

Switches the Prometheus instance label from host:port to the Kubernetes NodeName so the same queries work uniformly against MicroShift and OpenShift cluster-monitoring. The label value is now resolved by replicating MicroShift's own NodeName logic (node.hostnameOverride from /etc/microshift/config.yaml and drop-ins under config.d/ in lexical order, lowercased; fallback to OS hostname) instead of using the inventory hostname, so the scrape label tracks what kubelet actually advertises even when an override is configured.

Label Prometheus metrics with the Kubernetes NodeName instead of host:port to match the OpenShift cluster-monitoring convention so the same tooling and queries can target MicroShift and OpenShift uniformly.

openshift-ci-robot · 2026-05-12T17:20:19Z

@sjug: This pull request references USHIFT-6957 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "5.0.0" version, but no target version was set.

Details

In response to this:

Label Prometheus metrics with the Kubernetes NodeName instead of host:port to match OpenShift cluster-monitoring's convention so the same tooling and queries can target MicroShift and OpenShift uniformly.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

coderabbitai · 2026-05-12T17:20:37Z

Walkthrough

The PR updates Ansible infrastructure to resolve Kubernetes node names from MicroShift configuration files and propagate that name through Prometheus scrape job configurations, kubelet logging blocks, and network query defaults, while adding CGNAT network support.

Changes

Kubernetes Node Name Resolution & Prometheus Configuration

Layer / File(s)	Summary
Node name resolution from MicroShift config `ansible/roles/common/tasks/nodename.yml`	New task file reads `/etc/microshift/config.yaml` and drop-in files under `/etc/microshift/config.d`, parses YAML to extract `node.hostnameOverride`, computes merged override (later drop-ins take precedence), and sets `kubernetes_nodename` fact with fallback to `ansible_hostname`.
Playbook orchestration and fact usage `ansible/setup-node.yml`, `ansible/roles/microshift-start/tasks/main.yml`	Playbook adds new "Resolve microshift nodename" play to run nodename resolution before logging setup. `microshift-start` includes nodename task and updates Prometheus instance reference to use `kubernetes_nodename` instead of `inventory_hostname:9100`.
Prometheus scrape job instance labels `ansible/roles/add-kubelet-logging/tasks/main.yml`, `ansible/roles/install-logging/templates/prometheus.yml.j2`	Kubelet and cadvisor scrape jobs now add `labels.instance` derived from `kubernetes_nodename` (with fallback to host). Template-based `node`, `process`, and `crio` jobs similarly emit per-host targets with `instance` labels.
Network query defaults and CGNAT support `ansible/roles/common/tasks/files/prom-network-query.py`	Default Prometheus URL changes from `http://zen3:9091` to `http://prometheus:9091`. IP validation now treats CGNAT range (`100.64.0.0/10`) as local/private in addition to loopback, link-local, and RFC1918. Help text updated to reflect new defaults.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 11 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (11 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly summarizes the main change: using Kubernetes nodename as the Prometheus instance label instead of host:port, which is the core objective across all modified files.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Stable And Deterministic Test Names	✅ Passed	PR contains only Ansible config files (YAML, Python, Jinja2), no Go test files. The custom check for Ginkgo test names is not applicable.
Test Structure And Quality	✅ Passed	PR contains no Ginkgo test code. Changes are only to Ansible roles, tasks, templates, and Python utilities. Custom check for test quality requirements is not applicable.
Microshift Test Compatibility	✅ Passed	No new Ginkgo e2e tests added in this PR. Changes are limited to Ansible playbooks/roles for Prometheus configuration.
Single Node Openshift (Sno) Test Compatibility	✅ Passed	No Ginkgo e2e tests added. PR contains only Ansible automation and configuration changes (YAML, Python, Jinja templates). SNO test compatibility check not applicable.
Topology-Aware Scheduling Compatibility	✅ Passed	PR is Ansible automation/configuration code only. No Kubernetes manifests, operator code, or scheduling constraints introduced. Check applies only to deployment manifests and operator code.
Ote Binary Stdout Contract	✅ Passed	PR modifies only Ansible playbooks, Python utility scripts, and Jinja2 templates. Contains no OTE binaries, Go test code, or Ginkgo test framework. Check not applicable.
Ipv6 And Disconnected Network Test Compatibility	✅ Passed	No new Ginkgo e2e tests added in this PR. Changes consist only of Ansible playbooks, roles, Python scripts, and Jinja2 templates. Check is not applicable.
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

openshift-ci · 2026-05-12T17:20:51Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: sjug
Once this PR has been reviewed and has the lgtm label, please assign jerpeter1 for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

coderabbitai

🧹 Nitpick comments (3)

ansible/roles/common/tasks/nodename.yml (2)

33-54: ⚡ Quick win

Add error handling for malformed YAML in config files.

The from_yaml filter will raise an error if config files contain invalid YAML, but there's no explicit handling for this case. A malformed drop-in will fail the entire playbook rather than gracefully falling back to ansible_hostname.

Consider wrapping the YAML parsing in a rescue block or validating files before parsing.

🛡️ Option: Add validation task

Before line 40, add:

    - name: Validate YAML syntax in config files
      ansible.builtin.command:
        cmd: python3 -c "import yaml; yaml.safe_load(open('{{ item }}'))"
      loop: "{{ _microshift_configs.results | selectattr('content', 'defined') | map(attribute='item') | list }}"
      register: _yaml_validation
      failed_when: false
      changed_when: false

    - name: Warn about invalid YAML files
      ansible.builtin.debug:
        msg: "Warning: {{ item.item }} contains invalid YAML and will be skipped"
      loop: "{{ _yaml_validation.results | selectattr('rc', 'defined') | selectattr('rc', 'ne', 0) | list }}"
      when: _yaml_validation is defined

Then filter out invalid files in the merge logic.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ansible/roles/common/tasks/nodename.yml` around lines 33 - 54, The YAML
parsing in the set_fact block for _microshift_hostname_override uses from_yaml
on r.content and can raise on malformed YAML; update the logic to handle parse
errors by wrapping the from_yaml call in a Jinja2 try/rescue (or pre-validate
files) so that if parsing fails for a given r in _microshift_configs.results it
is skipped and does not abort the playbook, and ensure the namespace value
fallback remains ansible_hostname; specifically modify the template inside the
Compute hostnameOverride task (referencing _microshift_configs, parsed,
ns.value, and from_yaml) to catch exceptions (or check validity before parsing)
and only set ns.value when parsing succeeds.

43-54: ⚖️ Poor tradeoff

Consider simplifying the Jinja template logic.

The namespace-based iteration and nested conditionals are dense. This could be refactored into a custom Ansible filter or a separate Python script for better testability and maintainability.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ansible/roles/common/tasks/nodename.yml` around lines 43 - 54, The Jinja loop
is too dense—replace the inline namespace/loop with a custom Ansible filter
(e.g., filter name extract_hostname_override) that accepts
_microshift_configs.results, decodes each r.content, parses YAML, and returns
parsed.node.hostnameOverride or '' as a default; then call
extract_hostname_override from the template instead of the current block.
Implement the filter in filter_plugins (function name extract_hostname_override)
to handle b64decode/from_yaml, mapping checks for parsed and parsed.node, and
safe defaults so the template becomes a single function call returning ns.value.

ansible/setup-node.yml (1)

51-57: 💤 Low value

Minor: Inconsistent capitalization in task name.

The play name uses "Resolve microshift nodename" (line 51) but the task name uses "resolve kubernetes nodename" (line 56). Consider consistent capitalization for readability.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ansible/setup-node.yml` around lines 51 - 57, The play uses "Resolve
microshift nodename" but the task is titled "resolve kubernetes nodename" which
is inconsistent; update the task name string "resolve kubernetes nodename" to
match the play's capitalization and wording (e.g., "Resolve microshift nodename"
or at least capitalize and standardize to "Resolve Kubernetes nodename") so both
the play name and the task name are consistent; edit the task title in the
include_tasks block that references roles/common/tasks/nodename.yml.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@ansible/roles/common/tasks/nodename.yml`:
- Around line 33-54: The YAML parsing in the set_fact block for
_microshift_hostname_override uses from_yaml on r.content and can raise on
malformed YAML; update the logic to handle parse errors by wrapping the
from_yaml call in a Jinja2 try/rescue (or pre-validate files) so that if parsing
fails for a given r in _microshift_configs.results it is skipped and does not
abort the playbook, and ensure the namespace value fallback remains
ansible_hostname; specifically modify the template inside the Compute
hostnameOverride task (referencing _microshift_configs, parsed, ns.value, and
from_yaml) to catch exceptions (or check validity before parsing) and only set
ns.value when parsing succeeds.
- Around line 43-54: The Jinja loop is too dense—replace the inline
namespace/loop with a custom Ansible filter (e.g., filter name
extract_hostname_override) that accepts _microshift_configs.results, decodes
each r.content, parses YAML, and returns parsed.node.hostnameOverride or '' as a
default; then call extract_hostname_override from the template instead of the
current block. Implement the filter in filter_plugins (function name
extract_hostname_override) to handle b64decode/from_yaml, mapping checks for
parsed and parsed.node, and safe defaults so the template becomes a single
function call returning ns.value.

In `@ansible/setup-node.yml`:
- Around line 51-57: The play uses "Resolve microshift nodename" but the task is
titled "resolve kubernetes nodename" which is inconsistent; update the task name
string "resolve kubernetes nodename" to match the play's capitalization and
wording (e.g., "Resolve microshift nodename" or at least capitalize and
standardize to "Resolve Kubernetes nodename") so both the play name and the task
name are consistent; edit the task title in the include_tasks block that
references roles/common/tasks/nodename.yml.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 17ac164f-4570-41c7-8dd0-e7f8bb042889

📥 Commits

Reviewing files that changed from the base of the PR and between db194f3 and 70d39b3.

📒 Files selected for processing (6)

ansible/roles/add-kubelet-logging/tasks/main.yml
ansible/roles/common/tasks/files/prom-network-query.py
ansible/roles/common/tasks/nodename.yml
ansible/roles/install-logging/templates/prometheus.yml.j2
ansible/roles/microshift-start/tasks/main.yml
ansible/setup-node.yml

openshift-ci · 2026-05-12T17:36:00Z

@sjug: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

USHIFT-6957: Use nodename as Prometheus instance

70d39b3

Label Prometheus metrics with the Kubernetes NodeName instead of host:port to match the OpenShift cluster-monitoring convention so the same tooling and queries can target MicroShift and OpenShift uniformly.

openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label May 12, 2026

openshift-ci Bot requested review from ggiguash and pmtk May 12, 2026 17:20

coderabbitai Bot reviewed May 12, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

USHIFT-6957: Use nodename as Prometheus instance#6659

USHIFT-6957: Use nodename as Prometheus instance#6659
sjug wants to merge 1 commit into
openshift:mainfrom
sjug:prom-instance-nodename

sjug commented May 12, 2026 •

edited

Loading

Uh oh!

openshift-ci-robot commented May 12, 2026 •

edited by openshift-ci Bot

Loading

Uh oh!

coderabbitai Bot commented May 12, 2026 •

edited

Loading

❌ Failed checks (1 warning)

Uh oh!

openshift-ci Bot commented May 12, 2026

Uh oh!

coderabbitai Bot left a comment

Uh oh!

openshift-ci Bot commented May 12, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

sjug commented May 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

openshift-ci-robot commented May 12, 2026 • edited by openshift-ci Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

coderabbitai Bot commented May 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

❌ Failed checks (1 warning)

Uh oh!

openshift-ci Bot commented May 12, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

openshift-ci Bot commented May 12, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

sjug commented May 12, 2026 •

edited

Loading

openshift-ci-robot commented May 12, 2026 •

edited by openshift-ci Bot

Loading

coderabbitai Bot commented May 12, 2026 •

edited

Loading