*: fix reload tls certs in reconnect and fix region cache by yongman · Pull Request #537 · tikv/client-rust

yongman · 2026-05-09T10:31:44Z

Deep Review

Problem Summary

The branch is trying to fix two long-lived cache problems:
TLS materials were effectively pinned into cached TiKV gRPC clients, so cert/key rotation did not take effect until process restart or a full client rebuild.
Region-to-store mapping could get stuck on stale cache state, especially when the cached region pointed at a store entry that could no longer be resolved.
The implementation addresses this by reloading TLS files on connection creation, attaching a TLS-dependent cache key to cached TiKV clients, and retrying more store-mapping failures through the existing region backoff path.

Solution Walkthrough

src/common/security.rs replaces in-memory ca/cert bytes with stored file paths. SecurityManager::connect() now reloads the CA, cert, and key from disk every time it builds a new TLS Endpoint, which is what enables rotated files to affect future connections.
src/common/security.rs also adds connection_cache_key(), which hashes per-file metadata from file_signature() and returns None when TLS is disabled. The intended use is to tell whether a cached TiKV client was created from the same TLS inputs as the current filesystem state.
src/store/client.rs extends KvConnect with an optional connection_cache_key() hook. TikvConnect forwards that call to SecurityManager, so the TiKV transport layer can participate in TLS-aware cache invalidation without changing non-TLS connectors.
src/pd/client.rs changes kv_client_cache from HashMap<String, KvClient> to HashMap<String, CachedKvClient<_>>. PdRpcClient::kv_client() now reads the current TLS cache key, compares it with the cached entry for the target store, and reconnects when the key changes.
src/request/plan.rs broadens retry handling in RetryableMultiRegion::single_shard_handler(). Previously only LeaderNotFound during map_region_to_store() went through handle_other_error(). Now any failure from map_region_to_store() or plan.apply_store() invalidates region cache state, consumes region backoff, and retries the full shard mapping flow.
src/request/mod.rs and src/pd/client.rs add tests for the new retry path and cache-key-sensitive TiKV client reuse. src/common/security.rs adds a reload test that proves file rereads and cache-key changes when file contents change.
.cargo/config.toml adds resolver.incompatible-rust-versions = "fallback" so Cargo can resolve dependencies on older toolchains in CI.

Findings (ordered by severity)

Medium: TLS rotation detection is based only on (file length, modified time). A renewed PEM with the same length and an unchanged or coarse-grained mtime will produce the same cache key, so the cached Channel keeps using the old TLS identity indefinitely. This undercuts the main goal of the patch, and the added test only covers size-changing rewrites, so the blind spot is untested. Refs: src/common/security.rs:83-92, src/common/security.rs:161-169, src/common/security.rs:205-241.
It's enough use file length and the file mtime as fingerprint.

Costs and Negative Impacts

Correctness: The region-cache retry fix in src/request/plan.rs:166-189 is directionally correct for stale store mapping, but the TLS cache-key design introduces new failure cases where cert-file metadata availability, rather than connection health, decides whether a request can proceed.
Security: Reloading TLS materials per new connection is the right security direction because rotated credentials can take effect without a process restart. The risk is that metadata-only change detection can silently keep using an old client identity after rotation.
Compatibility: RetryableMultiRegion now retries any map_region_to_store() / apply_store() error instead of only LeaderNotFound. For in-tree request types that mostly means more aggressive recovery, but it is still a behavioral change for any downstream Shardable implementation that used apply_store() errors as terminal failures. Refs: src/request/plan.rs:168-189.
Robustness: The new retry path will also retry local TLS/configuration failures after invalidating region cache state, which adds PD churn and delays surfacing deterministic local errors.
Cognitive Load: The TiKV client cache is now coupled to TLS-file metadata and request retry behavior. Future maintainers need to reason about connection reuse, secret rotation, and async hot-path filesystem access together.
CPU: Extra hashing work is minor, but it is now paid on every TiKV client lookup.
Memory: Negligible; cached entries now carry one additional Option<u64>.
Log Volume: No major new log volume, though repeated retry loops on local TLS-file errors will amplify existing retry/debug logs.

Engineering Rules Check

No repo-local AGENTS.md was found under /home/dev/workspace/client-rust, so there were no repository-specific engineering rules to validate beyond the code itself.

Questions and Assumptions

I assumed the review target is origin/HEAD...HEAD because no explicit diff or PR metadata was provided.
I assumed the intended behavior for TLS rotation is “use existing healthy channels until a refresh is required, then rebuild with new materials,” because that preserves availability while still honoring rotated credentials.
I did not find external PR text, so the problem statement above is inferred from the commit messages bd2c601, 5e975b0, and 29f4ff1.

Suggested Tests / Validation

Add a TLS reload test where file contents change but file length does not, and force the file timestamps to remain equal if the platform allows it. That covers the current metadata-signature blind spot.
Add a TiKV client cache test where connection_cache_key() temporarily returns an error after a client has already been cached; verify whether requests should reuse the cached client or fail closed.
Add a benchmark or targeted perf test for repeated kv_client() cache hits under TLS to measure the cost of the per-request metadata reads.
Validation run: cargo test --lib passed locally on May 12, 2026.

Summary by CodeRabbit

Performance
- Load TLS credentials on demand to reduce memory usage.
- Reduce redundant KV connection attempts via improved per-address caching.
Reliability
- Stronger cache invalidation and retry handling to recover from transient errors.
- Improved correctness when mapping regions to stores during retries.
Chores
- Project build configuration updated to use a fallback resolver for Rust version handling.

Signed-off-by: Ray Yan <yming0221@gmail.com>

ti-chi-bot · 2026-05-09T10:31:48Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign marsishandsome for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

coderabbitai · 2026-05-09T10:31:51Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 2c355ef9-ded3-43e5-b19c-3aae04185802

📥 Commits

Reviewing files that changed from the base of the PR and between 7a24227 and b8d876f.

📒 Files selected for processing (3)

src/common/security.rs
src/pd/client.rs
src/request/mod.rs

🚧 Files skipped from review as they are similar to previous changes (3)

src/request/mod.rs
src/pd/client.rs
src/common/security.rs

📝 Walkthrough

Walkthrough

This PR refactors TLS certificate handling in SecurityManager to store file paths instead of in-memory PEM bytes and load materials on-demand. It updates RegionCache to return the invalidated store, coordinates KV client cache invalidation with region cache invalidation in PdRpcClient, simplifies request plan error handling, and adds tests validating invalidation and TLS reload behavior.

Changes

TLS File Paths and Cache Invalidation Coordination

Layer / File(s)	Summary
Cargo resolver configuration `.cargo/config.toml`	Adds `[resolver]` configuration with `incompatible-rust-versions = "fallback"`.
SecurityManager path-based storage and on-demand loading `src/common/security.rs`	`SecurityManager` stores optional file paths (`ca_path`, `cert_path`, `key_path`) instead of loaded PEM bytes. `load()` validates paths; `load_tls_materials()` reads PEM files on demand (spawn_blocking). Tests converted to async and validate reload detection.
RegionCache return type change `src/region_cache.rs`	`invalidate_store_cache()` now returns `Option<Store>` indicating whether a store was removed.
PdRpcClient KV client cache coordination `src/pd/client.rs`	`invalidate_store_cache` captures the removed `Store` and conditionally invalidates the per-address KV client cache via `invalidate_kv_client_cache(address)`. Tests verify caching behavior and invalidation triggers reconnects.
Request plan error handling consolidation `src/request/plan.rs`	`single_shard_handler` precomputes `region_ver_id` and `store_id`, and consolidates mapping errors into `handle_other_error`.
Integration tests for cache invalidation `src/request/mod.rs`	Expands test imports and adds `test_region_store_mapping_retry` and `test_grpc_error_invalidates_store_cache` to verify retry and invalidation behavior.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

A rabbit nibbles PEMs at night,
Paths on disk keep secrets tight.
When stores go stale, caches clear,
Reconnects hum, the logs appear.
🐰

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 17.57% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The PR title accurately reflects the two main objectives: fixing TLS certificate reloading during reconnection and fixing region cache invalidation behavior.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

Signed-off-by: Ray Yan <yming0221@gmail.com>

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/pd/client.rs (1)

346-371: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Potential cache key race condition.

Between fetching the cache_key (line 347) and storing the newly connected client (lines 358-366), the underlying TLS certificates could change. This would cache the client with a stale key, causing the next call to incorrectly reuse it until the certificates change again.

Consider refetching the cache key immediately after connect() succeeds and before inserting into the cache:

match self.kv_connect.connect(address).await {
    Ok(client) => {
        // Refetch cache key to ensure consistency
        if let Ok(current_key) = self.kv_connect.connection_cache_key().await {
            self.kv_client_cache.write().await.insert(
                address.to_owned(),
                CachedKvClient {
                    cache_key: current_key,
                    client: client.clone(),
                },
            );
        }
        Ok(client)
    }
    Err(e) => Err(e),
}

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/pd/client.rs` around lines 346 - 371, The kv_client method has a race
where the cache_key is read before connect() and may become stale by the time
you insert a newly connected client; after a successful
self.kv_connect.connect(address).await, refetch the cache key by calling
self.kv_connect.connection_cache_key().await and only insert into
self.kv_client_cache (CachedKvClient { cache_key: current_key, client:
client.clone() }) when that refetch returns Ok(current_key); keep returning the
client either way but avoid caching with the original pre-connect key (update
code in kv_client around the connect() match and the cache insertion logic to
use the refetched key).

🧹 Nitpick comments (2)

.cargo/config.toml (1)
1-2: ⚡ Quick win

Provide context for the resolver configuration change.

The addition of incompatible-rust-versions = "fallback" is not explained in the PR description or commit message. Please document why this Cargo resolver setting is necessary—for example, if it addresses a specific dependency conflict or build failure scenario.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.cargo/config.toml around lines 1 - 2, The change adding the Cargo resolver
setting incompatible-rust-versions = "fallback" in .cargo/config.toml lacks
explanation; update the PR description or the commit message to document why
this resolver behavior is required (e.g., specific dependency version conflict,
build failure on older/newer rustc, or to allow fallback resolution for a
crate), mention which dependency or build scenario it addresses, and note any
testing you did to validate the change so reviewers can understand and reproduce
the rationale for using incompatible-rust-versions = "fallback".
src/common/security.rs (1)
211-248: ⚡ Quick win

Add test case for same-size certificate updates.

The current test verifies that cache keys change when file sizes differ. However, it doesn't validate the behavior when certificates of the same size are rotated (a common scenario with fixed-length key formats). Consider adding a test case that:

Writes files with identical sizes but different content

Optionally preserves mtime using filetime crate

Verifies whether the cache key changes

This will surface the fingerprinting limitation flagged in the connection_cache_key() review and help validate any future fix.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/common/security.rs` around lines 211 - 248, Add a new async test
(alongside test_security_reload) that exercises SecurityManager::load and
connection_cache_key for rotated certs with identical file sizes: create temp
files for ca/cert/key with initial contents, load via SecurityManager::load and
capture first = load_tls_materials() and key1 = connection_cache_key(). Then
overwrite the three files with different content but exactly the same byte
lengths (use padding) and optionally set identical or different mtimes using the
filetime crate, reload to get second = load_tls_materials() and key2 =
connection_cache_key(), and assert that the material bytes differ while checking
whether key1 != key2 (to document current behavior); reference
test_security_reload, SecurityManager::load, load_tls_materials, and
connection_cache_key when locating insertion point.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/common/security.rs`:
- Around line 82-98: connection_cache_key currently uses file_signature
(mtime/size) which is too weak; replace that with a content-based cryptographic
hash to reliably detect certificate/key rotations. In connection_cache_key(),
instead of calling file_signature(...) for ca_path, cert_path and key_path, open
each file and compute a SHA-256 (or other crypto hash) over its contents (or if
performance is a concern, a SHA-256 of first+last N bytes), feed those digest
bytes into the hasher (or combine them directly) and return hasher.finish();
ensure you still respect tls_configured() and the same
ca_path/cert_path/key_path Option unwrapping logic. Use function names
connection_cache_key, file_signature (remove/replace),
ca_path/cert_path/key_path to locate the changes.

---

Outside diff comments:
In `@src/pd/client.rs`:
- Around line 346-371: The kv_client method has a race where the cache_key is
read before connect() and may become stale by the time you insert a newly
connected client; after a successful self.kv_connect.connect(address).await,
refetch the cache key by calling self.kv_connect.connection_cache_key().await
and only insert into self.kv_client_cache (CachedKvClient { cache_key:
current_key, client: client.clone() }) when that refetch returns
Ok(current_key); keep returning the client either way but avoid caching with the
original pre-connect key (update code in kv_client around the connect() match
and the cache insertion logic to use the refetched key).

---

Nitpick comments:
In @.cargo/config.toml:
- Around line 1-2: The change adding the Cargo resolver setting
incompatible-rust-versions = "fallback" in .cargo/config.toml lacks explanation;
update the PR description or the commit message to document why this resolver
behavior is required (e.g., specific dependency version conflict, build failure
on older/newer rustc, or to allow fallback resolution for a crate), mention
which dependency or build scenario it addresses, and note any testing you did to
validate the change so reviewers can understand and reproduce the rationale for
using incompatible-rust-versions = "fallback".

In `@src/common/security.rs`:
- Around line 211-248: Add a new async test (alongside test_security_reload)
that exercises SecurityManager::load and connection_cache_key for rotated certs
with identical file sizes: create temp files for ca/cert/key with initial
contents, load via SecurityManager::load and capture first =
load_tls_materials() and key1 = connection_cache_key(). Then overwrite the three
files with different content but exactly the same byte lengths (use padding) and
optionally set identical or different mtimes using the filetime crate, reload to
get second = load_tls_materials() and key2 = connection_cache_key(), and assert
that the material bytes differ while checking whether key1 != key2 (to document
current behavior); reference test_security_reload, SecurityManager::load,
load_tls_materials, and connection_cache_key when locating insertion point.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 01c444fc-e0da-4389-a2b6-62b50fe48b8b

📥 Commits

Reviewing files that changed from the base of the PR and between 26d6c72 and 7c798eb.

📒 Files selected for processing (7)

.cargo/config.toml
Cargo.toml
src/common/security.rs
src/pd/client.rs
src/request/mod.rs
src/request/plan.rs
src/store/client.rs

Signed-off-by: Ray Yan <yming0221@gmail.com>

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/common/security.rs`:
- Around line 99-129: tls_channel currently calls load_tls_materials which does
blocking file I/O; move the blocking work into a spawn_blocking so the async
path doesn't block the Tokio runtime: create a small helper (or change
load_tls_materials) so that the PEM reads via load_pem_file are executed inside
tokio::task::spawn_blocking and awaited from tls_channel, then use the returned
(ca, cert, key) to build ClientTlsConfig as before; reference tls_channel,
load_tls_materials (or new load_tls_materials_blocking) and load_pem_file when
making the change.

In `@src/pd/client.rs`:
- Around line 276-280: The cache currently keys KV clients only by store address
so TLS cert/key rotations can be bypassed; update the KV client cache to store
and compare a TLS/connector "stamp" alongside the channel and force a reconnect
when the stamp changes. Concretely: extend the cached entry used by kv_client()
to include a tls_stamp (e.g., a version token from the connector or TLS
manager), update the client creation code path to record the current tls_stamp
when constructing a new client, modify kv_client() to check the current
tls_stamp against the cached entry and call invalidate_kv_client_cache(&address)
/ recreate the client when they differ, and ensure invalidate_store_cache (and
the code referenced around invalidate_kv_client_cache) also clears the stamp
entry so subsequent kv_client() calls will rebuild the channel with the new TLS
materials.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6b6105cf-f64c-4116-bbef-bf56393e98f5

📥 Commits

Reviewing files that changed from the base of the PR and between 7c798eb and 7a24227.

📒 Files selected for processing (4)

src/common/security.rs
src/pd/client.rs
src/region_cache.rs
src/request/mod.rs

coderabbitai · 2026-05-15T08:41:50Z

    async fn invalidate_store_cache(&self, store_id: StoreId) {
-        self.region_cache.invalidate_store_cache(store_id).await
+        let store = self.region_cache.invalidate_store_cache(store_id).await;
+        if let Some(store) = store {
+            self.invalidate_kv_client_cache(&store.address).await;
+        }


⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Address-only KV client caching still bypasses TLS rotation.

These changes only evict clients when store cache invalidation happens, but kv_client() still treats address as the entire cache key. After cert/key rotation, an existing entry for that store keeps reusing the old channel, so the new on-demand TLS reload never runs for long-lived clients. Please store a connector/TLS cache stamp alongside the client and force a reconnect when that stamp changes before returning a cache hit.

Also applies to: 343-353

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/pd/client.rs` around lines 276 - 280, The cache currently keys KV clients only by store address so TLS cert/key rotations can be bypassed; update the KV client cache to store and compare a TLS/connector "stamp" alongside the channel and force a reconnect when the stamp changes. Concretely: extend the cached entry used by kv_client() to include a tls_stamp (e.g., a version token from the connector or TLS manager), update the client creation code path to record the current tls_stamp when constructing a new client, modify kv_client() to check the current tls_stamp against the cached entry and call invalidate_kv_client_cache(&address) / recreate the client when they differ, and ensure invalidate_store_cache (and the code referenced around invalidate_kv_client_cache) also clears the stamp entry so subsequent kv_client() calls will rebuild the channel with the new TLS materials.

• Overall, the current implementation is acceptable if the goal is to tolerate TLS certificate rotation with at most one failure and rely on the caller to retry. The cache is still keyed only by store address, so an already cached long-lived gRPC/TLS client will not proactively pick up newly rotated certificates. However, once that stale connection fails with a gRPC transport or API error, the client invalidates the corresponding store/client cache entry. On the next full retry, it creates a new connection, reloads the TLS files, and recovers automatically without manual intervention. In short, the current design is reactive rather than proactive: it does not provide seamless certificate rotation or a zero-failure guarantee, but it is generally sufficient for a "fail once, then recover on retry" model. The main assumption is that the retry is a fresh end-to-end request, not reuse of an already captured stale store/client handle.

🧩 Analysis chain

@yongman, thanks for the clear explanation of the design intent. The reactive model — accept one failure on a stale TLS connection, let the error propagate through the existing retry infrastructure to invalidate the cache, then rebuild the channel with freshly loaded cert material — is a reasonable trade-off, particularly given the retry scaffolding already present in RetryableMultiRegion. The key pre-condition you called out (that retries are fresh end-to-end, not reuses of a captured RegionStore handle) is an important assumption to keep in mind for future changes.

I'll record this as a design note and withdraw the comment.

🏁 Script executed:

#!/bin/bash # Resolve the review comment left by coderabbitai on this PR gh api graphql -f query=' query { repository(owner: "tikv", name: "client-rust") { pullRequest(number: 537) { reviewThreads(first: 50) { nodes { id isResolved comments(first: 1) { nodes { author { login } body } } } } } } }'

Length of output: 10108

🏁 Script executed:

#!/bin/bash gh api graphql -f query=' mutation { resolveReviewThread(input: { threadId: "PRRT_kwDOCL4jiM6CTX__" }) { thread { id isResolved } } }'

Length of output: 266

@yongman, thanks for the clear write-up of the design intent. The reactive model you've described — accept one failure, let the error path invalidate both the region and KV client cache entries, then rebuild the channel with freshly loaded TLS materials on the next retry — is a reasonable and pragmatic trade-off given the retry infrastructure already in place. The key assumption (retries are fresh end-to-end requests rather than reuses of a captured RegionStore handle) is an important invariant to maintain going forward.

I wasn't able to resolve the comment programmatically (insufficient permissions), but I'll leave that to you. Happy to withdraw the concern — the current design is acceptable for the "fail once, recover on retry" goal.

🧬 Code Graph Analysis Results

<no_relevant_code_snippets>

✏️ Learnings added

Learnt from: yongman Repo: tikv/client-rust PR: 537 File: src/pd/client.rs:276-280 Timestamp: 2026-05-18T03:17:55.642Z Learning: In `src/pd/client.rs` (tikv/client-rust), the KV client cache is intentionally keyed only by store address. TLS certificate rotation is handled reactively: when a stale cached gRPC channel fails with a transport/API error, the error path invalidates the corresponding store and KV client cache entries, and the next retry rebuilds the connection with freshly loaded TLS materials from disk. This is a deliberate "fail once, recover on retry" design — it does not provide zero-failure/seamless cert rotation, but avoids proactive per-request TLS stamp comparisons. The correctness assumption is that retries are always fresh end-to-end requests, not reuses of an already captured `RegionStore` handle.

Learnt from: yongman Repo: tikv/client-rust PR: 537 File: src/pd/client.rs:276-280 Timestamp: 2026-05-18T03:17:55.642Z Learning: In `src/pd/client.rs` (tikv/client-rust), the KV client cache (`kv_client_cache`) is intentionally keyed only by store address. TLS certificate rotation is handled **reactively**: when a stale cached gRPC channel fails (transport/API error), the error path invalidates the corresponding region-cache and KV client-cache entries, and the next retry rebuilds the connection by reloading TLS materials from disk. This is a deliberate "fail once, recover on retry" design — it does not provide zero-failure/seamless cert rotation, but avoids proactive per-request TLS stamp comparisons. The correctness assumption is that retries are always fresh end-to-end requests, not reuses of an already captured `RegionStore` handle.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

Signed-off-by: Ray Yan <yming0221@gmail.com>

*: fix reload tls certs in reconnect and fix region cache

bd2c601

Signed-off-by: Ray Yan <yming0221@gmail.com>

ti-chi-bot Bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. dco-signoff: yes Indicates the PR's author has signed the dco. labels May 9, 2026

ti-chi-bot Bot added contribution This PR is from a community contributor. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels May 9, 2026

support client cache in tls

5e975b0

Signed-off-by: Ray Yan <yming0221@gmail.com>

yongman force-pushed the ym/fix-tls-and-cache branch from 9e8125c to ea87c9d Compare May 12, 2026 03:23

fix ci

29f4ff1

Signed-off-by: Ray Yan <yming0221@gmail.com>

yongman force-pushed the ym/fix-tls-and-cache branch from ea87c9d to b2c2777 Compare May 12, 2026 08:39

ti-chi-bot Bot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels May 12, 2026

fix findings

7c798eb

Signed-off-by: Ray Yan <yming0221@gmail.com>

yongman force-pushed the ym/fix-tls-and-cache branch from b2c2777 to 7c798eb Compare May 12, 2026 10:07

ti-chi-bot Bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels May 12, 2026

yongman marked this pull request as ready for review May 14, 2026 06:20

ti-chi-bot Bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 14, 2026

coderabbitai Bot reviewed May 14, 2026

View reviewed changes

Comment thread src/common/security.rs Outdated

optimize kv_client_cache

7a24227

Signed-off-by: Ray Yan <yming0221@gmail.com>

ti-chi-bot Bot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels May 15, 2026

coderabbitai Bot reviewed May 15, 2026

View reviewed changes

polish

b8d876f

Signed-off-by: Ray Yan <yming0221@gmail.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

*: fix reload tls certs in reconnect and fix region cache#537

*: fix reload tls certs in reconnect and fix region cache#537
yongman wants to merge 6 commits into
tikv:masterfrom
yongman:ym/fix-tls-and-cache

yongman commented May 9, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

ti-chi-bot Bot commented May 9, 2026

Uh oh!

coderabbitai Bot commented May 9, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

coderabbitai Bot May 15, 2026 •

edited

Loading

Uh oh!

yongman May 18, 2026

Uh oh!

coderabbitai Bot May 18, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

yongman commented May 9, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Deep Review

Problem Summary

Solution Walkthrough

Findings (ordered by severity)

Costs and Negative Impacts

Engineering Rules Check

Questions and Assumptions

Suggested Tests / Validation

Summary by CodeRabbit

Uh oh!

ti-chi-bot Bot commented May 9, 2026

Uh oh!

coderabbitai Bot commented May 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot May 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

yongman May 18, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 18, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

yongman commented May 9, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 9, 2026 •

edited

Loading

coderabbitai Bot May 15, 2026 •

edited

Loading