Conversation

@mahipdeora25
Contributor

What does this PR do?

Adds DynamicD, a new ddev tool that generates realistic fake telemetry data for Datadog integrations using AI.

DynamicD uses Claude to analyze an integration's metrics, dashboards, and service checks, then generates a self-contained Python script that simulates scenario-aware telemetry including:

  • Metrics - All dashboard metrics with realistic, correlated values
  • Logs - Scenario-appropriate log messages via HTTP Logs API
  • Service Checks - Health status matching the scenario (if integration defines them)
  • Events - Significant state changes (incidents, recoveries)

All telemetry is tagged with env:dynamicd for easy filtering.

Usage:
ddev meta scripts dynamicd celery --scenario incident
ddev meta scripts dynamicd redis --scenario healthy --save

Scenarios: healthy, degraded, incident, recovery, peak_load, maintenance

Motivation

Testing integrations and dashboards requires realistic data patterns. Manually creating fake data is tedious and often produces unrealistic results. DynamicD leverages LLMs to understand the operational characteristics of each service and generate data that:

  • Shows proper metric correlations (e.g., latency increases when queue depth increases)
  • Follows realistic value ranges (not random large numbers)
  • Transitions smoothly between states (no erratic jumps)
  • Populates all required dashboard tags (no "N/A" in widgets)

This enables rapid dashboard validation, demo environments, and integration testing.
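
To make the correlation point concrete, here is a minimal illustrative sketch (the metric names, ranges, and scenario handling are made up for this example; it is not the script DynamicD emits):

# Illustrative only: names and values are assumptions for this sketch,
# not DynamicD's actual generated output.
import math
import random


def sample_point(step: int, scenario: str) -> dict[str, float]:
    # Smooth baseline load (sine wave) instead of erratic random jumps.
    load = 0.5 + 0.4 * math.sin(step / 30.0)
    if scenario == "incident":
        load = min(1.0, load + 0.4)  # push toward saturation during an incident
    queue_depth = max(0.0, 200 * load + random.gauss(0, 5))
    # Latency is derived from queue depth, so the two metrics stay correlated.
    latency_ms = 20 + 0.3 * queue_depth + random.gauss(0, 2)
    return {"fake.queue.depth": queue_depth, "fake.task.latency_ms": latency_ms}


if __name__ == "__main__":
    for step in range(5):
        print(sample_point(step, "incident"))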

Review checklist (to be filled by reviewers)

  • Feature or bugfix MUST have appropriate tests (unit, integration, e2e)
  • Add the qa/skip-qa label if the PR doesn't need to be tested during QA.
  • If you need to backport this PR to another branch, you can add the backport/ label to the PR, and it will automatically open a backport PR once this one is merged.

@github-actions

github-actions bot commented Jan 14, 2026

⚠️ Recommendation: Add qa/skip-qa label

This PR does not modify any files shipped with the agent.

To help streamline the release process, please consider adding the qa/skip-qa label if these changes do not require QA testing.

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ab73d229c9

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +472 to +476
for match in filter_tag_pattern.finditer(content):
    tag_name = match.group(1)
    # Add to all metrics as a common tag
    for metric in metric_tags:
        metric_tags[metric].add(tag_name)


P2: Capture filter tags for non-grouped metrics

The dashboard tag extraction only adds filter tags to metrics that are already in metric_tags, but metric_tags is only populated when a query has a by {} clause. Widgets that filter on tags without grouping (e.g., avg:redis.keys{db:0}) will leave metric_tags empty, so required filter tags never make it into the prompt and the generated data can omit them, causing those widgets to show no data. Consider initializing metric_tags for every metric found in queries before adding filter/template tags.
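
A minimal sketch of that suggestion, assuming metric_tags maps metric names to tag sets (the regexes below are illustrative, not the ones in the PR):

import re

# Illustrative patterns; the actual parsing in the PR may differ.
metric_pattern = re.compile(r"(?:avg|sum|min|max|count):([A-Za-z0-9_.]+)")
filter_tag_pattern = re.compile(r"\{([A-Za-z0-9_]+):")

metric_tags: dict[str, set[str]] = {}


def collect_tags(content: str) -> None:
    # Register every metric seen in the query, even without a "by {}" grouping,
    # so filter-only tags (e.g. {db:0}) are still attached to it.
    for match in metric_pattern.finditer(content):
        metric_tags.setdefault(match.group(1), set())
    for match in filter_tag_pattern.finditer(content):
        tag_name = match.group(1)
        for metric in metric_tags:
            metric_tags[metric].add(tag_name)


collect_tags("avg:redis.keys{db:0}")  # -> {"redis.keys": {"db"}}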


Contributor Author


done

@codecov

codecov bot commented Jan 14, 2026

Codecov Report

❌ Patch coverage is 42.39130% with 477 lines in your changes missing coverage. Please review.
✅ Project coverage is 90.75%. Comparing base (ae03189) to head (0d88935).
⚠️ Report is 9 commits behind head on master.


mahipdeora25 and others added 2 commits January 14, 2026 18:53
- executor.py: Add None check for process.stdout before iteration
- cli.py: Add assert to narrow scenario type after validation

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Initialize metric_tags for all metrics found in dashboard queries,
not just those with "by {}" groupings. This ensures filter tags
(e.g., {db:0}) are captured even for widgets without grouping.
@mahipdeora25
Contributor Author

Re: Timeout enforcement (Comment #1)

Valid point - the current implementation doesn't enforce timeout during stdout streaming. Since --timeout is primarily for testing and fixing this properly requires threading/asyncio, I'll address this in a follow-up PR to keep this one focused.

Re: Filter tags (Comment #2)

Fixed in 25ad5c7 - now initializing metric_tags for all metrics found in queries, not just those with by {} groupings.

@maycmlee maycmlee self-assigned this Jan 14, 2026
Contributor

@maycmlee maycmlee left a comment


Some small nits and a suggestion

@nubtron nubtron self-assigned this Jan 15, 2026
…ta generation

- Add _read_dashboard_tag_values() to parse dashboard queries and extract exact tag:value combinations
- Include required tag values per metric in prompt context
- Add summary of all required tag values to make it crystal clear to the LLM
- This ensures widgets like 'Zones' that filter by resource_type:zone will always have data
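
For illustration, extracting exact tag:value pairs from query strings might look roughly like this (the pattern, helper name, and sample query are assumptions, not the actual _read_dashboard_tag_values() implementation):

import re
from collections import defaultdict

# Matches tag:value pairs inside query scopes such as {resource_type:zone, db:0}.
TAG_VALUE_PATTERN = re.compile(r"[{,]\s*([A-Za-z0-9_.]+):([A-Za-z0-9_.\-/*]+)")


def read_tag_values(queries: list[str]) -> dict[str, set[str]]:
    tag_values: dict[str, set[str]] = defaultdict(set)
    for query in queries:
        for tag, value in TAG_VALUE_PATTERN.findall(query):
            if value != "*":  # wildcards do not pin a specific value
                tag_values[tag].add(value)
    return dict(tag_values)


# read_tag_values(["avg:example.metric{resource_type:zone} by {zone}"])
# -> {"resource_type": {"zone"}}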
"""Error during script execution."""


def execute_script(
Contributor


Could we execute AI-generated code inside a container sandbox by default, so it won't accidentally affect the local machine or access unintended environments (for example, by using credentials stored on the host)?

Here's a proposed prompt:

Implement containerized sandboxing for DynamicD script execution to isolate AI-generated code.

  • Sandboxed execution should be the default behavior, with explicit opt-out
  • Use minimal container privileges
  • Automatically clean up the container
  • Give actionable feedback if docker is not installed
  • Pass the API key to the container (ensure that the environment variable is forwarded with the correct name)
  • Give no access to the host filesystem
  • Use a python-slim image, don't create a custom dockerfile
  • Install the requests library via a shell command immediately after creating the container


return []


def _read_dashboard_metrics(integration: Integration, metric_prefix: str) -> list[str]:
Contributor


Could you add some tests that check these functions against a few real integrations? That will allow us to notice when changes in assets break this tool.

Bonus: include some unit tests as well.

Contributor Author


added unit test here: [3d064ed]

- Add --all-metrics CLI flag for full metric coverage mode
- Add all_metrics_mode to context builder and prompts
- Add log generation instructions to prompts (send_logs, generate_logs)
- Add DATADOG_LOGS_SITES and DEFAULT_LOGS_PER_BATCH constants
- Logs are generated alongside metrics with proper source/service/level
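
For context, a rough sketch of what sending scenario logs through the HTTP Logs intake could look like (the endpoint construction, field names, and defaults here are assumptions, not the exact code DynamicD generates):

import os

import requests

DD_SITE = os.environ.get("DD_SITE", "datadoghq.com")  # non-US sites differ
LOGS_URL = f"https://http-intake.logs.{DD_SITE}/api/v2/logs"


def send_logs(messages: list[str], source: str, service: str, level: str = "info") -> None:
    payload = [
        {
            "ddsource": source,   # e.g. "kuma"
            "service": service,   # e.g. "kuma-control-plane"
            "status": level,
            "ddtags": "env:dynamicd",
            "message": message,
        }
        for message in messages
    ]
    response = requests.post(
        LOGS_URL,
        json=payload,
        headers={"DD-API-KEY": os.environ["DD_API_KEY"]},
        timeout=10,
    )
    response.raise_for_status()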
…tering

- Add _read_dashboard_log_config() to extract source and service from log widget queries
- Add dashboard_log_config field to IntegrationContext
- Include log config in prompt context with explicit instructions
- For Kuma: source=kuma, service=kuma-control-plane
- Add Datadog site configuration section for non-US users
- Fix formatting: add periods to scenario descriptions
- Convert options section to table format
- Add --all-metrics option to docs
Contributor

@maycmlee maycmlee left a comment


Thanks! Just some additional small nits, otherwise looks great!

mahipdeora25 and others added 5 commits January 15, 2026 16:46
- Fix dashboard_log_config type annotation to accept list[str] values
- Rename loop variable to avoid shadowing conflict in to_prompt_context
- Add unit tests for dashboard parsing functions (metrics, tags, tag values)
- Add --sandbox/--no-sandbox flag (auto-detects Docker by default)
- Run LLM-generated scripts in isolated container with:
  - Read-only filesystem
  - Memory limit (256MB)
  - CPU limit (1 core)
  - Network access for Datadog API
- Graceful fallback if Docker unavailable
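
As a rough illustration of what such a constrained run might look like via the Docker CLI (the exact flags and image used by executor.py may differ; this sketch adds a writable /tmp so the dependency install works under a read-only root filesystem):

import subprocess


def run_in_sandbox(script_path: str, api_key: str) -> int:
    # Sketch only: constraints mirror the bullet points above (read-only FS,
    # 256MB memory, 1 CPU, auto-removal); the real implementation may differ.
    cmd = [
        "docker", "run", "--rm",
        "--read-only", "--tmpfs", "/tmp",
        "--memory", "256m", "--cpus", "1",
        "-e", f"DD_API_KEY={api_key}",
        "-v", f"{script_path}:/script.py:ro",
        "python:3.12-slim",
        "sh", "-c",
        "pip install --no-cache-dir --target /tmp/deps requests"
        " && PYTHONPATH=/tmp/deps python /script.py",
    ]
    return subprocess.run(cmd).returncode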
@mahipdeora25 mahipdeora25 requested a review from nubtron January 16, 2026 10:59
"httpx",
"jsonpointer",
"pluggy",
"pyyaml",
Contributor


Are we actually using pyyaml?

Contributor Author


Yes, it's used in context_builder.py to read spec.yaml files via yaml.safe_load().
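
For reference, the kind of call being described (the path shown is illustrative):

import yaml

# Illustrative path; context_builder.py resolves the real spec.yaml location.
with open("assets/configuration/spec.yaml") as f:
    spec = yaml.safe_load(f)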

class TestReadDashboardMetrics:
    """Tests for _read_dashboard_metrics function."""

    def test_extracts_metrics_from_redis_dashboard(self, real_repo):
Contributor


The tests in this file are very repetitive; you could parametrize them so that they take a list of integrations as input. Here's an example of a parametrized test.
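
For instance, a hedged sketch of a parametrized version (the fixture access and expected metric names are placeholders):

import pytest

# Assumes _read_dashboard_metrics is imported in this test module, as in the PR.


@pytest.mark.parametrize(
    "integration_name, expected_metric",
    [
        ("redis", "redis.keys"),           # placeholder expectations; real values
        ("celery", "celery.tasks.total"),  # would come from each dashboard
    ],
)
def test_extracts_metrics_from_dashboard(real_repo, integration_name, expected_metric):
    integration = real_repo.integrations.get(integration_name)  # assumed fixture API
    metrics = _read_dashboard_metrics(integration, integration_name)
    assert expected_metric in metrics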

Contributor Author


added
