RSPEED-2943: add quota monitoring metrics#1640
RSPEED-2943: add quota monitoring metrics#1640major wants to merge 1 commit intolightspeed-core:mainfrom
Conversation
|
Warning Rate limit exceeded
To continue reviewing without waiting, purchase usage credits in the billing tab. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (7)
WalkthroughThis PR adds quota-check instrumentation across endpoints by introducing dedicated quota-check helper functions that measure check latency and record bounded metrics labeled by endpoint path, quota type, and result status. Metrics infrastructure includes new Prometheus Counter and Histogram with normalized label values for quota types and result outcomes. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~22 minutes 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
✨ Simplify code
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
2221254 to
7668c4b
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/app/endpoints/rlsapi_v1.py`:
- Around line 535-562: The _check_infer_quota function currently logs raw
strings ("skipped","failure","success") and only catches HTTPException; change
it to use the exported recording constants (e.g.,
recording.QUOTA_RESULT_SKIPPED, recording.QUOTA_RESULT_FAILURE,
recording.QUOTA_RESULT_SUCCESS) in the three record_quota_check calls, and add a
broad except Exception block around check_tokens_available that records
recording.QUOTA_RESULT_ERROR with the elapsed time before re-raising the
exception; keep the existing HTTPException except as-is but use the constant
there too. Use the existing symbols _check_infer_quota, check_tokens_available,
and recording.record_quota_check to locate and modify the code.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: e9a8ec74-e971-491a-8e93-5dd57f734cd2
📒 Files selected for processing (6)
src/app/endpoints/responses.pysrc/app/endpoints/rlsapi_v1.pysrc/metrics/__init__.pysrc/metrics/recording.pytests/unit/app/endpoints/test_responses.pytests/unit/metrics/test_recording.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)
- GitHub Check: Pylinter
- GitHub Check: mypy
- GitHub Check: unit_tests (3.13)
- GitHub Check: build-pr
- GitHub Check: unit_tests (3.12)
- GitHub Check: integration_tests (3.12)
- GitHub Check: E2E: server mode / ci / group 2
- GitHub Check: E2E: library mode / ci / group 1
- GitHub Check: E2E: server mode / ci / group 3
- GitHub Check: E2E: library mode / ci / group 3
- GitHub Check: E2E: server mode / ci / group 1
- GitHub Check: E2E: library mode / ci / group 2
- GitHub Check: E2E Tests for Lightspeed Evaluation job
🧰 Additional context used
📓 Path-based instructions (6)
tests/unit/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
tests/unit/**/*.py: Use pytest for all unit and integration tests
Usepytest-mockfor AsyncMock objects in unit tests
Use markerpytest.mark.asynciofor async tests
Files:
tests/unit/app/endpoints/test_responses.pytests/unit/metrics/test_recording.py
tests/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Do not use unittest - pytest is the standard testing framework for this project
Files:
tests/unit/app/endpoints/test_responses.pytests/unit/metrics/test_recording.py
src/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
src/**/*.py: Use absolute imports for internal modules:from authentication import get_auth_dependency
All modules start with descriptive docstrings explaining purpose
Uselogger = get_logger(__name__)fromlog.pyfor module logging
Type aliases defined at module level for clarity
Use Final[type] as type hint for all constants
All functions require docstrings with brief descriptions
Complete type annotations for function parameters and return types
Use Union types with modern syntax:str | int
UseOptional[Type]for optional type hints
Use snake_case with descriptive, action-oriented function names (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns: return new data structures instead of modifying parameters
Useasync deffor I/O operations and external API calls
Uselogger.debug()for detailed diagnostic information
Uselogger.info()for general information about program execution
Uselogger.warning()for unexpected events or potential problems
Uselogger.error()for serious problems that prevented function execution
All classes require descriptive docstrings explaining purpose
Use PascalCase for class names with descriptive names and standard suffixes:Configuration,Error/Exception,Resolver,Interface
Abstract classes use ABC with@abstractmethoddecorators
Complete type annotations for all class attributes, use specific types, notAny
Follow Google Python docstring conventions for all modules, classes, and functions
Docstring Parameters section documents function parameters
Docstring Returns section documents function return values
Docstring Raises section documents exceptions that may be raised
Use black for code formatting
Use pylint for static analysis with source-roots configuration set to "src"
Use pyright for type checking
Use ruff for fast linting
Use pydocstyle for docstring style validation
Use mypy for additional type checking
Use bandit for security issue detection
Files:
src/app/endpoints/rlsapi_v1.pysrc/metrics/__init__.pysrc/app/endpoints/responses.pysrc/metrics/recording.py
src/app/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Use FastAPI dependencies:
from fastapi import APIRouter, HTTPException, Request, status, Depends
Files:
src/app/endpoints/rlsapi_v1.pysrc/app/endpoints/responses.py
src/app/endpoints/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Use FastAPI
HTTPExceptionwith appropriate status codes for API endpoint error handling
Files:
src/app/endpoints/rlsapi_v1.pysrc/app/endpoints/responses.py
src/**/__init__.py
📄 CodeRabbit inference engine (AGENTS.md)
Package
__init__.pyfiles contain brief package descriptions
Files:
src/metrics/__init__.py
🧠 Learnings (2)
📚 Learning: 2026-04-07T14:44:42.022Z
Learnt from: major
Repo: lightspeed-core/lightspeed-stack PR: 1469
File: src/models/config.py:1928-1933
Timestamp: 2026-04-07T14:44:42.022Z
Learning: In lightspeed-core/lightspeed-stack, `allow_verbose_infer` (previously `customization.allow_verbose_infer`, now `rlsapi_v1.allow_verbose_infer`) is only used internally by the `rlsapi_v1` `/infer` endpoint and has a single known consumer (the PR author). Backward compatibility for this config field relocation is intentionally not required and should not be flagged in future reviews.
Applied to files:
src/app/endpoints/rlsapi_v1.py
📚 Learning: 2026-04-06T20:18:07.852Z
Learnt from: major
Repo: lightspeed-core/lightspeed-stack PR: 1463
File: src/app/endpoints/rlsapi_v1.py:266-271
Timestamp: 2026-04-06T20:18:07.852Z
Learning: In the lightspeed-stack codebase, within `src/app/endpoints/` inference/MCP endpoints, treat `tools: Optional[list[Any]]` in MCP tool definitions as an intentional, consistent typing pattern (used across `query`, `responses`, `streaming_query`, `rlsapi_v1`). Do not raise or suggest this as a typing issue during code review; changing it in isolation could break endpoint typing consistency across the codebase.
Applied to files:
src/app/endpoints/rlsapi_v1.pysrc/app/endpoints/responses.py
🔇 Additional comments (8)
tests/unit/app/endpoints/test_responses.py (1)
223-244: LGTM!The test correctly verifies that unexpected quota backend failures (RuntimeError) are re-raised after recording metrics with the appropriate arguments (endpoint_path="/v1/responses", quota_type="user_id", result="error", and a non-negative duration).
src/app/endpoints/responses.py (2)
141-170: LGTM!The quota check wrapper correctly:
- Times the quota check using
time.monotonic()- Records metrics with bounded labels before re-raising exceptions
- Uses the exported constants for type safety (
QUOTA_TYPE_USER_ID,QUOTA_RESULT_*)- Handles both expected (
HTTPException) and unexpected (Exception) quota backend failures
311-316: LGTM!Moving
endpoint_pathassignment earlier and using it consistently for quota metrics is clean.src/metrics/__init__.py (2)
11-24: LGTM!The bucket distribution covers sub-second latencies well (1ms to 5s), which is appropriate for quota backend checks. Including
float("inf")as the final bucket follows Prometheus histogram best practices.
76-92: LGTM!The quota metrics are well-designed:
- Labels are documented as bounded for cardinality safety
- Counter and Histogram share the same label set for consistent querying
- Descriptive metric names follow the existing
ls_prefix conventiontests/unit/metrics/test_recording.py (1)
165-218: LGTM!Excellent test coverage for the new
record_quota_checkfunction:
- The fixture cleanly patches the logger for failure assertions
- Parametrized test efficiently covers both counter and histogram failure paths
- The bounds test verifies that unexpected inputs are normalized to safe fallback values ("customer-123" → "user_id", "timeout" → "error")
src/metrics/recording.py (2)
17-55: LGTM!Well-designed label bounding:
- Constants provide a single source of truth for valid label values
frozensetensures immutability of the allowed sets- Normalization functions guarantee bounded cardinality for Prometheus metrics
- Fallback values ("user_id" and "error") are sensible defaults
155-177: LGTM!The
record_quota_checkhelper follows the established pattern from other recording functions:
- Normalizes inputs before recording to prevent cardinality explosion
- Updates both counter and histogram atomically
- Catches expected metric update exceptions and logs them without disrupting the request
7668c4b to
a45bbd9
Compare
tisnik
left a comment
There was a problem hiding this comment.
Please ignore docs/demos directory completely - the sources are incorrect there on purpose.
c64fde1 to
17bdda5
Compare
Signed-off-by: Major Hayden <major@redhat.com>
Description
Adds bounded Prometheus metrics for pre-request quota checks on
/v1/responsesand/v1/infer. The labels capture endpoint, quota type, and result without exposing the actual quota subject identifier.Type of change
Tools used to create PR
Related Tickets & Documents
Checklist before requesting a review
Testing
uv run pytest tests/unit/metrics/test_recording.py tests/unit/app/endpoints/test_responses.py tests/unit/app/endpoints/test_rlsapi_v1.pyuv run radon cc -s src/app/endpoints/responses.py src/app/endpoints/rlsapi_v1.py src/metrics/recording.pyuv run make verifySummary by CodeRabbit
Chores
Tests