-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Improve EventSubscription database connection usage #25250
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
...e/src/main/java/org/openmetadata/service/apps/bundles/changeEvent/AbstractEventConsumer.java
Show resolved
Hide resolved
🔍 CI failure analysis for c040a23: Comprehensive analysis: All CI failures (Python Bitbucket API, 20+ Playwright UI tests, SonarCloud timeouts, Collate builds) are unrelated to PR's Java backend changesCI Failure Summary - All Unrelated to PRThis PR continues to face CI failures across multiple categories, all unrelated to the event subscription batch processing changes: 1. Python Tests - Bitbucket API (Earlier runs)
2. Playwright UI Tests - Extensive Flakiness (Multiple runs)
3. Maven SonarCloud CI (Earlier runs)
4. Maven Collate CI (Persistent)
PR Changes (All Approved)This PR modifies only Java backend code:
Does NOT modify: Python code, UI/frontend, Collate code, workflow definitions, or any code related to the failing tests. ConclusionAll CI failures stem from:
None are caused by this PR's event subscription optimizations. Code Review ✅ Approved 5 resolved / 5 findingsSolid performance improvement for EventSubscription batching with comprehensive tests. The implementation correctly addresses connection pool contention. Resolved ✅ 5 resolvedEdge Case: Redundant empty check in batchRecordSuccessfulEvents()
Edge Case: Exception in batchRecordSuccessfulEvents prevents offset update
Bug: Missing error handling in batchRecordSuccessfulEvents
Edge Case: Timeout capping silently changes caller-specified values
Edge Case: Thread safety issue with shared successfulEvents list
What Works WellThe batch processing approach effectively reduces database connections from ~500 to ~2 per batch. The N+1 query fix for FilteringRules is a valuable optimization. Thread-safety is properly documented with the reliance on @DisallowConcurrentExecution. Error handling in commit() is well-designed - it catches exceptions but still advances the offset to prevent duplicate HTTP calls. Tip Comment OptionsAuto-apply is off Gitar will not commit updates to this branch. Comment with these commands to change:
Was this helpful? React with 👍 / 👎 | This comment will update automatically (Docs) |
|
|
Failed to cherry-pick changes to the 1.11.6 branch. |



High connection churn - not connection leaks. With JDBI's onDemand() proxy pattern, connections are released after each DAO method call. But when processing 100 events × 5 destinations = 500 individual INSERT statements, each acquiring a connection, this creates massive connection pool contention under load.
Changes Made
1. Batch Processing in AbstractEventConsumer
File: openmetadata-service/.../apps/bundles/changeEvent/AbstractEventConsumer.java
Key changes:
Impact
Impact
┌───────────────────────────────────────┬────────┬───────┐
│ Metric │ Before │ After │
├───────────────────────────────────────┼────────┼───────┤
│ DB connections per batch (100 events) │ ~500 │ ~2 │
├───────────────────────────────────────┼────────┼───────┤
│ Connection pool contention │ High │ Low │
├───────────────────────────────────────┼────────┼───────┤
│ N+1 queries in diagnostics │ Yes │ No │
└───────────────────────────────────────┴────────┴───────┘
2. Batch DAO Method
File: openmetadata-service/.../jdbi3/CollectionDAO.java
Added batchUpsertSuccessfulChangeEvents() using @ConnectionAwareSqlBatch for MySQL and PostgreSQL support.
3. N+1 Query Fix
File: openmetadata-service/.../events/scheduled/EventSubscriptionScheduler.java
4. HTTP Timeout Caps
File: openmetadata-service/.../util/SubscriptionUtil.java
Added timeout caps to prevent runaway webhooks:
Checklist:
Fixes <issue-number>: <short explanation>