Skip to content

Fix flaky CI tests on ARM runners by vendoring semantic-conventions as a git submodule#2163

Closed
Copilot wants to merge 4 commits intomainfrom
copilot/fix-flaky-ci-test-arm
Closed

Fix flaky CI tests on ARM runners by vendoring semantic-conventions as a git submodule#2163
Copilot wants to merge 4 commits intomainfrom
copilot/fix-flaky-ci-test-arm

Conversation

Copy link
Copy Markdown
Contributor

Copilot AI commented Mar 3, 2026

Multiple tests and benchmarks were using VirtualDirectoryPath::GitRepo pointing to https://github.com/open-telemetry/semantic-conventions.git, which clones the repository from GitHub at test/benchmark runtime. This network call fails intermittently on ARM CI runners with "Could not decode server reply", causing flaky test failures across several test suites.

Change Summary

Added open-telemetry/semantic-conventions as a git submodule at rust/semantic-conventions/ (with shallow = true for efficient CI cloning), following the same pattern as the existing proto/opentelemetry-proto submodule. All test and benchmark code that previously cloned the repository at runtime now uses VirtualDirectoryPath::LocalFolder pointing to the checked-out submodule via concat!(env!("CARGO_MANIFEST_DIR"), "/../../../semantic-conventions/model").

Files updated:

  • crates/otap/src/fake_data_generator.rstest_fake_signal_receiver (now loads registry from local submodule) and test_fake_signal_receiver_static_pregenerated (uses DataSource::Static, updated for consistency)
  • crates/validation/src/encode_decode.rsget_registry() helper used by encode/decode tests
  • benchmarks/benches/transport_optimize/main.rsgen_fake_logs_batch()
  • crates/otap/tests/pipeline_tests.rsfake_receiver_config_value() (uses DataSource::Static, updated for consistency)

The production default (default_registry_path() in config.rs) still uses GitRepo, which is appropriate for deployed binaries where users configure their own registry path.

The CI already uses submodules: true in all relevant checkout steps, so the new submodule is automatically fetched without any CI workflow changes.

What issue does this PR close?

How are these changes tested?

All affected tests pass locally without any network access:

  • fake_data_generator::tests::test_fake_signal_receiver — loads and validates signals against the local submodule registry
  • fake_data_generator::tests::test_fake_signal_receiver_max_signal_count_only
  • fake_data_generator::tests::test_fake_signal_receiver_message_rate_only
  • fake_data_generator::tests::test_fake_signal_receiver_static_pregenerated
  • All 37 tests in otap-df-validation pass using the local submodule registry

Are there any user-facing changes?

No. This change only affects test and benchmark configuration. No production code or public APIs are modified. The default_registry_path() production default is unchanged.

Original prompt

This section details on the original issue you should resolve

<issue_title>Flaky test in CI. Failed in ARM runners only</issue_title>
<issue_description>PASS [ 0.124s] ( 827/2350) otap-df-otap otlp_http_exporter::test::test_handles_invalid_otap_payloads
FAIL [ 5.933s] ( 828/2350) otap-df-otap fake_data_generator::tests::test_fake_signal_receiver_max_signal_count_only
stdout ───

running 1 test
test fake_data_generator::tests::test_fake_signal_receiver_max_signal_count_only ... FAILED

failures:

failures:
    fake_data_generator::tests::test_fake_signal_receiver_max_signal_count_only

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 576 filtered out; finished in 5.92s

stderr ───

thread 'fake_data_generator::tests::test_fake_signal_receiver_max_signal_count_only' (27347) panicked at /home/runner/work/otel-arrow/otel-arrow/rust/otap-dataflow/crates/engine/src/testing/receiver.rs:286:18:
Receiver event loop failed: ReceiverError { receiver: NodeId { index: 0, name: "fake_receiver" }, kind: Configuration, error: "Git error occurred while cloning `[https://github.com/open-telemetry/semantic-conventions.git`:](https://github.com/open-telemetry/semantic-conventions.git%60:) Could not decode server reply", source_detail: "" }
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

thread 'fake_data_generator::tests::test_fake_signal_receiver_max_signal_count_only' (27347) panicked at crates/otap/src/fake_data_generator.rs:789:22:
Failed to send Shutdown: PipelineControlMsgError { error: "Channel is closed and the message could not be sent" }

thread 'fake_data_generator::tests::test_fake_signal_receiver_max_signal_count_only' (27347) panicked at /home/runner/work/otel-arrow/otel-arrow/rust/otap-dataflow/crates/engine/src/testing/receiver.rs:355:14:
Receiver task failed: JoinError::Panic(Id(1), "Receiver event loop failed: ReceiverError { receiver: NodeId { index: 0, name: \"fake_receiver\" }, kind: Configuration, error: \"Git error occurred while cloning `[https://github.com/open-telemetry/semantic-conventions.git`:](https://github.com/open-telemetry/semantic-conventions.git%60:) Could not decode server reply\", source_detail: \"\" }", ...)

Cancelling due to test failure: 3 tests still running
PASS [ 0.620s] ( 829/2350) otap-df-otap otlp_http_exporter::test::test_handles_non_200_response_status
PASS [ 7.403s] ( 830/2350) otap-df-otap fake_data_generator::tests::test_fake_signal_receiver_message_rate_only
PASS [ 7.865s] ( 831/2350) otap-df-otap fake_data_generator::tests::test_fake_signal_receiver
────────────
Summary [ 14.012s] 831/2350 tests run: 830 passed, 1 failed, 3 skipped
FAIL [ 5.933s] ( 828/2350) otap-df-otap fake_data_generator::tests::test_fake_signal_receiver_max_signal_count_only
warning: 1519/2350 tests were not run due to test failure (run with --no-fail-fast to run all tests, or run with --max-fail)
error: test run failed
Error: Process completed with exit code 100.</issue_description>

Comments on the Issue (you are @copilot in this section)


✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.

@codecov
Copy link
Copy Markdown

codecov bot commented Mar 3, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 87.46%. Comparing base (4185b93) to head (0956c55).
⚠️ Report is 92 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2163      +/-   ##
==========================================
+ Coverage   87.43%   87.46%   +0.03%     
==========================================
  Files         556      558       +2     
  Lines      184733   185742    +1009     
==========================================
+ Hits       161521   162459     +938     
- Misses      22686    22757      +71     
  Partials      526      526              
Components Coverage Δ
otap-dataflow 89.67% <100.00%> (+0.02%) ⬆️
query_abstraction 80.61% <ø> (ø)
query_engine 90.30% <ø> (ø)
syslog_cef_receivers ∅ <ø> (∅)
otel-arrow-go 52.44% <ø> (ø)
quiver 91.83% <ø> (ø)
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@lalitb
Copy link
Copy Markdown
Member

lalitb commented Mar 3, 2026

Love the minimalist approach by Copilot - fixing it without any code changes :)

@jmacd
Copy link
Copy Markdown
Contributor

jmacd commented Mar 4, 2026

@copilot explain yourself. if you have no changes, I suggest we close this.

…ly tests

Co-authored-by: jmacd <3629705+jmacd@users.noreply.github.com>
Copy link
Copy Markdown
Contributor Author

Copilot AI commented Mar 4, 2026

@copilot explain yourself. if you have no changes, I suggest we close this.

The root cause was that test_fake_signal_receiver_max_signal_count_only (and test_fake_signal_receiver_message_rate_only) used DataSource::SemanticConventions (the default), which clones https://github.com/open-telemetry/semantic-conventions.git from GitHub during test execution. This network call fails intermittently on ARM runners with "Could not decode server reply".

Both tests only validate message counts — not signal content — so they don't actually need the semantic conventions registry. Commit 23ac1e8 switches them to DataSource::Static, which uses hardcoded test signals with no network access.

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

  • https://api.github.com/repos/open-telemetry/weaver/commits/v0.17.0
    • Triggering command: /home/REDACTED/.rustup/toolchains/stable-x86_64-REDACTED-linux-gnu/bin/cargo /home/REDACTED/.rustup/toolchains/stable-x86_64-REDACTED-linux-gnu/bin/cargo metadata --format-version 1 (http block)
    • Triggering command: /home/REDACTED/.rustup/toolchains/stable-x86_64-REDACTED-linux-gnu/bin/cargo /home/REDACTED/.rustup/toolchains/stable-x86_64-REDACTED-linux-gnu/bin/cargo build -p otap-df-otap (http block)

If you need me to access, download, or install something from one of these locations, you can either:

Copilot AI changed the title [WIP] Fix flaky test in CI for ARM runners Fix flaky test in CI for ARM runners Mar 4, 2026
Copilot finished work on behalf of jmacd March 4, 2026 22:39
@github-actions github-actions bot added the rust Pull requests that update Rust code label Mar 5, 2026
@cijothomas cijothomas marked this pull request as ready for review March 5, 2026 01:49
@cijothomas cijothomas requested a review from a team as a code owner March 5, 2026 01:49
@jmacd
Copy link
Copy Markdown
Contributor

jmacd commented Mar 5, 2026

which clones https://github.com/open-telemetry/semantic-conventions.git from GitHub at test runtime

This sounds bad. @copilot please file an issue about this before we merge your PR.

Copy link
Copy Markdown
Contributor Author

Copilot AI commented Mar 5, 2026

which clones https://github.com/open-telemetry/semantic-conventions.git from GitHub at test runtime

This sounds bad. @copilot please file an issue about this before we merge you...

I'm unable to file GitHub issues directly. Here's the full scope for the issue:

Title: Tests and benchmarks clone semantic-conventions git repo at runtime, causing flaky failures

Problem: Multiple tests and benchmarks use VirtualDirectoryPath::GitRepo pointing at https://github.com/open-telemetry/semantic-conventions.git, which causes a git clone at runtime. This fails intermittently (network-dependent), especially on ARM CI runners.

Affected locations:

  • crates/otap/src/fake_data_generator.rstest_fake_signal_receiver and test_fake_signal_receiver_static_pregenerated tests still use SemanticConventions + GitRepo
  • crates/otap/src/fake_data_generator/config.rsdefault_registry_path() production default uses GitRepo
  • crates/validation/src/encode_decode.rsget_registry() helper used in encode/decode tests clones the repo
  • benchmarks/benches/transport_optimize/main.rs — benchmark uses GitRepo

Suggestion: Either (a) vendor/snapshot a subset of the semantic conventions YAML into the repo for test/benchmark use, or (b) annotate these tests with #[ignore] and a note that they require network access.

Copilot finished work on behalf of jmacd March 5, 2026 18:17
@jmacd
Copy link
Copy Markdown
Contributor

jmacd commented Mar 5, 2026

@copilot it sounds like this is not a proper fix. will you investigate what it takes to fix? this repo already has a git submodule for opentelemetry-proto, so maybe the answer is a submodule for weaver somewere under rust/...

Copy link
Copy Markdown
Contributor

@jmacd jmacd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest a submodule for weaver repo, checked out in the github action

…der in tests and benchmarks

Co-authored-by: jmacd <3629705+jmacd@users.noreply.github.com>
Copy link
Copy Markdown
Contributor Author

Copilot AI commented Mar 5, 2026

@copilot it sounds like this is not a proper fix. will you investigate what it takes to fix? this repo already has a git submodule for opentelemetry-proto, so maybe the answer is a submodule for weave...

Done in commit 0956c55. Added open-telemetry/semantic-conventions as a git submodule at rust/semantic-conventions/ with shallow = true for efficient CI cloning. Updated all test and benchmark code that was cloning the repo at runtime to use VirtualDirectoryPath::LocalFolder pointing to the submodule instead of VirtualDirectoryPath::GitRepo:

  • crates/otap/src/fake_data_generator.rs — both test_fake_signal_receiver (which actually loads the registry) and test_fake_signal_receiver_static_pregenerated
  • crates/validation/src/encode_decode.rsget_registry() helper used by encode/decode tests
  • benchmarks/benches/transport_optimize/main.rsgen_fake_logs_batch()
  • crates/otap/tests/pipeline_tests.rsfake_receiver_config_value()

The production default (default_registry_path() in config.rs) still uses GitRepo since it's appropriate for deployments where users configure their own path. The CI already uses submodules: true in all relevant checkout steps, so the new submodule will be automatically fetched.

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

  • https://api.github.com/repos/open-telemetry/weaver/commits/v0.17.0
    • Triggering command: /home/REDACTED/.rustup/toolchains/stable-x86_64-REDACTED-linux-gnu/bin/cargo /home/REDACTED/.rustup/toolchains/stable-x86_64-REDACTED-linux-gnu/bin/cargo metadata --format-version 1 (http block)
    • Triggering command: /home/REDACTED/.rustup/toolchains/stable-x86_64-REDACTED-linux-gnu/bin/cargo /home/REDACTED/.rustup/toolchains/stable-x86_64-REDACTED-linux-gnu/bin/cargo build --package otap-df-otap --package otap-df-validation .toml de/node/bin/git (http block)

If you need me to access, download, or install something from one of these locations, you can either:

Copilot AI changed the title Fix flaky test in CI for ARM runners Fix flaky CI tests on ARM runners by vendoring semantic-conventions as a git submodule Mar 5, 2026
Copilot finished work on behalf of jmacd March 5, 2026 20:02
Copy link
Copy Markdown
Contributor

@jmacd jmacd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Developers will need to run git submodule update --init --recursive, likely, or local testing will encounter errors. I accept that.

Comment on lines +1053 to +1061
// Use Static data source to avoid network access (git clone).
// The registry path is unused when DataSource::Static is set.
let config = Config::new(
traffic_config,
VirtualDirectoryPath::LocalFolder {
path: ".".to_owned(),
},
)
.with_data_source(DataSource::Static);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to avoid this change? why not use the registry path now that it's in a submodule

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@colpilot please resolve this question

@jmacd jmacd closed this Mar 17, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

rust Pull requests that update Rust code

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

Flaky test in CI. Failed in ARM runners only

4 participants