DynamicD First PR #22328
base: master
Conversation
- Add validate_org function to check if API key belongs to Datadog internal org
- Warn and prompt for confirmation before sending data to Datadog HQ/Staging
- Add mandatory env:dynamicd tag to all generated metrics for filtering fake data
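For context, a minimal sketch of what this safety flow could look like. The function names mirror the commit message, but the endpoint, prompt wording, and tag helper are assumptions rather than the PR's actual code:

```python
import click
import requests

DYNAMICD_TAG = "env:dynamicd"  # mandatory tag appended to every generated metric


def validate_org(api_key: str, site: str = "datadoghq.com") -> bool:
    """Return True if the API key validates against the given Datadog site."""
    resp = requests.get(
        f"https://api.{site}/api/v1/validate",  # key-validation endpoint; the PR's org lookup may differ
        headers={"DD-API-KEY": api_key},
        timeout=10,
    )
    return resp.ok


def confirm_or_abort(api_key: str) -> None:
    """Warn and require confirmation before sending fake data to Datadog HQ/Staging."""
    if validate_org(api_key):
        click.confirm("This key targets a Datadog internal org. Send fake data anyway?", abort=True)


def with_mandatory_tags(tags: list[str]) -> list[str]:
    """Ensure every generated metric carries the env:dynamicd tag."""
    return tags if DYNAMICD_TAG in tags else [*tags, DYNAMICD_TAG]
```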
This PR does not modify any files shipped with the agent. To help streamline the release process, please consider adding the …
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ab73d229c9
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
```python
for match in filter_tag_pattern.finditer(content):
    tag_name = match.group(1)
    # Add to all metrics as a common tag
    for metric in metric_tags:
        metric_tags[metric].add(tag_name)
```
Capture filter tags for non-grouped metrics
The dashboard tag extraction only adds filter tags to metrics that are already in metric_tags, but metric_tags is only populated when a query has a by {} clause. Widgets that filter on tags without grouping (e.g., avg:redis.keys{db:0}) will leave metric_tags empty, so required filter tags never make it into the prompt and the generated data can omit them, causing those widgets to show no data. Consider initializing metric_tags for every metric found in queries before adding filter/template tags.
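A small sketch of the suggested fix, reusing the names from the snippet above; metric_query_pattern and its named group are hypothetical placeholders for however the PR extracts metric names from queries:

```python
metric_tags: dict[str, set[str]] = {}

# Ensure every metric referenced in a query gets an entry, even when the query
# has no "by {}" clause (e.g. avg:redis.keys{db:0}).
for match in metric_query_pattern.finditer(content):      # pattern name is assumed
    metric_tags.setdefault(match.group("metric"), set())

# Filter/template tags can now attach to every metric, not just grouped ones.
for match in filter_tag_pattern.finditer(content):
    tag_name = match.group(1)
    for metric in metric_tags:
        metric_tags[metric].add(tag_name)
```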
done
- executor.py: Add None check for process.stdout before iteration
- cli.py: Add assert to narrow scenario type after validation

Co-Authored-By: Claude Opus 4.5 <[email protected]>
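Roughly what those two defensive checks look like in isolation (the surrounding code and variable values are assumed, not taken from the PR):

```python
import subprocess

proc = subprocess.Popen(
    ["python", "generated_script.py"],  # placeholder command
    stdout=subprocess.PIPE,
    text=True,
)
# process.stdout is Optional; guard before iterating so type checkers and runtime agree
if proc.stdout is not None:
    for line in proc.stdout:
        print(line, end="")
proc.wait()

# cli.py: after validating the user-supplied value, an assert narrows
# Optional[str] to str for the type checker
scenario: str | None = "incident"  # e.g. the validated CLI argument
assert scenario is not None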
Initialize metric_tags for all metrics found in dashboard queries,
not just those with "by {}" groupings. This ensures filter tags
(e.g., {db:0}) are captured even for widgets without grouping.
Re: Timeout enforcement (Comment #1): Valid point, the current implementation doesn't enforce the timeout during stdout streaming. Since …

Re: Filter tags (Comment #2): Fixed in 25ad5c7, now initializing metric_tags for every metric found in dashboard queries, not just those with a by {} grouping.
maycmlee
left a comment
Some small nits and a suggestion
…ta generation
- Add _read_dashboard_tag_values() to parse dashboard queries and extract exact tag:value combinations
- Include required tag values per metric in prompt context
- Add summary of all required tag values to make it crystal clear to the LLM
- This ensures widgets like 'Zones' that filter by resource_type:zone will always have data
| """Error during script execution.""" | ||
|
|
||
|
|
||
| def execute_script( |
Could we execute AI-generated code inside a container sandbox by default, so it won't accidentally affect the local machine or access unintended environments (e.g., by using credentials stored on the host)?
Here is a proposed prompt (a rough sketch of what this could look like follows the list):
Implement containerized sandboxing for DynamicD script execution to isolate AI-generated code.
- Sandboxed execution should be the default behavior, with explicit opt-out
- Use minimal container privileges
- Automatically clean up the container
- Give actionable feedback if docker is not installed
- Pass the API key to the container (ensure that the environment variable is forwarded with the correct name)
- Give no access to the host filesystem
- Use a python-slim image, don't create a custom dockerfile
- Install the requests library by shell immediately after creating the container
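As a rough sketch of such a sandbox under the constraints listed above; the docker flags, environment variable name, and fallback behavior are assumptions, not necessarily what the PR ended up implementing:

```python
import os
import shutil
import subprocess


def run_in_sandbox(script_path: str, api_key: str) -> int:
    """Run a generated script inside a throwaway python-slim container."""
    if shutil.which("docker") is None:
        raise RuntimeError("Docker is required for sandboxed execution (or opt out with --no-sandbox).")
    cmd = [
        "docker", "run", "--rm",                    # remove the container when it exits
        "--read-only", "--tmpfs", "/tmp",           # no persistent writes; /tmp only for pip
        "--memory", "256m", "--cpus", "1",          # modest resource limits
        "--cap-drop", "ALL",                        # minimal privileges
        "-e", f"DD_API_KEY={api_key}",              # forwarded env var name is an assumption
        "-v", f"{os.path.abspath(script_path)}:/script.py:ro",  # expose only the script, read-only
        "python:3.12-slim",
        "sh", "-c",
        "pip install --no-cache-dir --target /tmp/deps requests "
        "&& PYTHONPATH=/tmp/deps python /script.py",
    ]
    return subprocess.run(cmd).returncode
```

Mounting only the generated script read-only keeps the host filesystem out of reach while the container can still reach the Datadog API over the network.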
```python
    return []


def _read_dashboard_metrics(integration: Integration, metric_prefix: str) -> list[str]:
```
Could you add some tests that check these functions against a few real integrations? That will allow us to notice when changes in assets break this tool.
Bonus: include some unit tests as well.
added unit test here: [3d064ed]
- Add --all-metrics CLI flag for full metric coverage mode
- Add all_metrics_mode to context builder and prompts
- Add log generation instructions to prompts (send_logs, generate_logs)
- Add DATADOG_LOGS_SITES and DEFAULT_LOGS_PER_BATCH constants
- Logs are generated alongside metrics with proper source/service/level
…tering
- Add _read_dashboard_log_config() to extract source and service from log widget queries
- Add dashboard_log_config field to IntegrationContext
- Include log config in prompt context with explicit instructions
- For Kuma: source=kuma, service=kuma-control-plane
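For illustration, a minimal version of what such a parser could do; the real _read_dashboard_log_config() in the PR may walk the dashboard JSON structurally rather than regex-scanning the raw text:

```python
import re


def read_dashboard_log_config(dashboard_json: str) -> dict[str, set[str]]:
    """Collect source:/service: facets used by log widget queries,
    e.g. 'source:kuma service:kuma-control-plane'."""
    config: dict[str, set[str]] = {"source": set(), "service": set()}
    for facet, value in re.findall(r"\b(source|service):([\w.-]+)", dashboard_json):
        config[facet].add(value)
    return config
```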
- Add Datadog site configuration section for non-US users
- Fix formatting: add periods to scenario descriptions
- Convert options section to table format
- Add --all-metrics option to docs
maycmlee
left a comment
Thanks! Just some additional small nits, otherwise looks great!
- Fix dashboard_log_config type annotation to accept list[str] values
- Rename loop variable to avoid shadowing conflict in to_prompt_context
- Add unit tests for dashboard parsing functions (metrics, tags, tag values)
Co-authored-by: May Lee <[email protected]>
- Add --sandbox/--no-sandbox flag (auto-detects Docker by default)
- Run LLM-generated scripts in isolated container with:
  - Read-only filesystem
  - Memory limit (256MB)
  - CPU limit (1 core)
  - Network access for Datadog API
- Graceful fallback if Docker unavailable
| "httpx", | ||
| "jsonpointer", | ||
| "pluggy", | ||
| "pyyaml", |
Are we actually using pyyaml?
Yes, it's used in context_builder.py to read spec.yaml files via yaml.safe_load()
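For reference, a minimal illustration of that usage; the exact path and fields read in context_builder.py are assumptions:

```python
import yaml

# Path and field names are illustrative, not the exact ones used in context_builder.py.
with open("assets/configuration/spec.yaml") as f:
    spec = yaml.safe_load(f)

print(spec.get("name"), len(spec.get("files", [])))
```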
```python
class TestReadDashboardMetrics:
    """Tests for _read_dashboard_metrics function."""

    def test_extracts_metrics_from_redis_dashboard(self, real_repo):
```
The tests in this file are very repetitive; you could parametrize them so that they take a list of integrations as input. Here's an example of a parametrized test, see the sketch below.
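Something along these lines, assuming the real_repo fixture exposes integrations by name (that fixture API, and the expected prefixes, are assumptions):

```python
import pytest


@pytest.mark.parametrize(
    "integration_name, prefix",
    [
        ("redis", "redis."),
        ("kuma", "kuma."),
        ("celery", "celery."),
    ],
)
def test_extracts_metrics_from_dashboard(real_repo, integration_name, prefix):
    integration = real_repo.integrations.get(integration_name)  # fixture API assumed
    metrics = _read_dashboard_metrics(integration, prefix)
    assert metrics, f"expected dashboard metrics for {integration_name}"
    assert all(m.startswith(prefix) for m in metrics)
```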
added
What does this PR do?
Adds DynamicD, a new ddev tool that generates realistic fake telemetry data for Datadog integrations using AI.
DynamicD uses Claude to analyze an integration's metrics, dashboards, and service checks, then generates a self-contained Python script that simulates scenario-aware telemetry, including metrics and logs.
All telemetry is tagged with env:dynamicd for easy filtering.
Usage:
ddev meta scripts dynamicd celery --scenario incident
ddev meta scripts dynamicd redis --scenario healthy --save
Scenarios: healthy, degraded, incident, recovery, peak_load, maintenance
Motivation
Testing integrations and dashboards requires realistic data patterns. Manually creating fake data is tedious and often produces unrealistic patterns. DynamicD leverages LLMs to understand the operational characteristics of each service and generate data that reflects those characteristics.
This enables rapid dashboard validation, demo environments, and integration testing.
Review checklist (to be filled by reviewers)