Update Redis dashboard #22301
This PR does not modify any files shipped with the agent. To help streamline the release process, please consider adding the |
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 370f4c8d96
"type": "canonical_unit",
"unit_name": "second"
}
},
"formula": "top(query1 / 1000, 10, 'mean', 'desc')"
Use correct units for slowlog duration
The slowlog metric redis.slowlog.micros.95percentile is in microseconds (see redisdb/metadata.csv:250), but this widget divides by 1000 and then formats the result as seconds. That converts microseconds to milliseconds while labeling them as seconds, so the displayed values are 1000× too large for any slowlog duration. Either divide by 1e6 for seconds or change the unit to milliseconds to avoid misleading latency readings.
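To make the unit mismatch concrete, here is a minimal sketch of the two conversions the comment proposes. The function names and the sample value are hypothetical and chosen only for illustration; they are not part of the dashboard or the integration.

```python
MICROS_PER_SECOND = 1_000_000
MICROS_PER_MILLISECOND = 1_000


def micros_to_seconds(micros: float) -> float:
    """Convert a duration in microseconds to seconds."""
    return micros / MICROS_PER_SECOND


def micros_to_millis(micros: float) -> float:
    """Convert a duration in microseconds to milliseconds."""
    return micros / MICROS_PER_MILLISECOND


# Hypothetical slowlog reading: 250,000 microseconds.
sample_micros = 250_000

seconds = micros_to_seconds(sample_micros)  # correct if the unit is "second"
millis = micros_to_millis(sample_micros)    # correct if the unit is "millisecond"

# Dividing by 1000 and labeling the result as seconds shows the
# millisecond figure with a "second" unit, 1000 times the true value.
mislabeled_ratio = millis / seconds
```

For this sample, `seconds` is 0.25 and `millis` is 250.0, so a widget that divides by 1000 but keeps the `second` unit would display 250 s for a quarter-second command.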
{
"name": "query1",
"data_source": "metrics",
"query": "avg:redis.cpu.sys{$host,$scope}"
},
Avoid labeling cumulative CPU time as usage
This chart plots redis.cpu.sys (and redis.cpu.user just below), which are cumulative CPU time counters (“CPU consumed” per redisdb/metadata.csv:58-61). Averaging these cumulative values yields a steadily increasing line, not actual CPU usage or percent, so the “Average CPU usage” widget will be misleading on any long-running Redis instance. Consider converting these counters to a rate/percent or renaming the widget to reflect CPU time consumed.
The raw Redis metrics are exposed as monotonically increasing counters, but the Redis integration submits them to Datadog as rates.
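As a rough illustration of why a counter submitted as a rate does not grow without bound, the sketch below derives a per-second rate from two samples of a cumulative counter. It is an assumption-laden example, not the integration's actual code: the function name, the sample values, and the 15-second interval are all hypothetical, and counter resets (for example after a Redis restart) are not handled.

```python
def counter_to_rate(prev_value: float, prev_ts: float,
                    curr_value: float, curr_ts: float) -> float:
    """Return the per-second rate of change between two counter samples."""
    elapsed = curr_ts - prev_ts
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    return (curr_value - prev_value) / elapsed


# Hypothetical samples of cumulative system CPU time, in seconds,
# taken 15 seconds apart.
rate = counter_to_rate(prev_value=120.0, prev_ts=0.0,
                       curr_value=121.5, curr_ts=15.0)
```

Here the counter rises from 120.0 to 121.5 CPU seconds over 15 seconds, giving a rate of about 0.1 CPU seconds per second. Averaging values like this yields a usage-style line, whereas averaging the raw counter would keep climbing.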
Review from rtrieu is dismissed. Related teams and files:
- documentation
- redisdb/assets/dashboards/overview.json
mahipdeora25 left a comment
lgtm
What does this PR do?
Update the OOTB Redis dashboard.
PREVIEW here
Motivation
Review checklist (to be filled by reviewers)
- Add the qa/skip-qa label if the PR doesn't need to be tested during QA.
- Add the backport/<branch-name> label to the PR and it will automatically open a backport PR once this one is merged.