GcpAuthProvider: iamconnectorcredentials.credentials:retrieve returns 404 NOT_FOUND with empty details[] for SPIFFE-bound agents despite correct IAM + Agent Registry binding #5753

@allentong

🔴 Required Information

Describe the Bug:

Every call to iamconnectorcredentials.credentials:retrieve (via google.adk.integrations.agent_identity.GcpAuthProvider) returns HTTP 404 NOT_FOUND with empty details: [] when made from a SPIFFE-bound Vertex Agent Engine — even though every documented prerequisite is in place and a control external call against the same URL returns a proper 403 PERMISSION_DENIED with populated details.

The 404 with empty details is anomalous: every other Google API I've seen returns the missing-resource name in details[]. Its absence here suggests the 404 is masking an internal lookup that the API surface doesn't expose. The docs at https://docs.cloud.google.com/iam/docs/auth-with-3lo (Preview / "as is") imply the first call should return RetrieveCredentialsMetadata.uri_consent_required for an unknown user, not 404.

Possibly relevant context: Auth Manager 3LO is a Preview feature. We registered our OAuth client + consent screen with the Auth Manager oauthcallback URL in Authorized Redirect URIs and the drive.readonly scope on the consent screen, but we have not received any confirmation from Google that our project / OAuth client is enabled for Auth Manager Preview access (no email, no Console status indicator, no API response we could find). If preview enrollment is a prerequisite, the docs don't say so and the API surfaces the rejection as 404 NOT_FOUND with empty details rather than something diagnosable.

Steps to Reproduce:

  1. Deploy a Vertex Agent Engine with --agent-identity (SPIFFE).
  2. Create an iamconnectors 3LO connector (google-workspace, drive.readonly scope, valid OAuth client).
  3. Grant the SPIFFE principal roles/iamconnectors.user (the only required permission for retrieveCredentials).
  4. Wire McpToolset + GcpAuthProviderScheme(name=connector, scopes=[...], continue_uri=...) in the agent (canonical pattern from the 3LO docs).
  5. Register the agent + an MCP server + a binding in Agent Registry that pairs the agent's SPIFFE URN with the connector + scopes + matching continueUri.
  6. Invoke a tool that triggers the auth lookup (any first-time user_id — Slack composite, email, arbitrary string).

Expected Behavior:

Backend returns an LRO with RetrieveCredentialsMetadata.uri_consent_required containing an authorization_uri (consent URL) and consent_nonce. ADK surfaces that as an AuthCredential with auth_uri and the bridge posts the consent link to the user.

Observed Behavior:

google.api_core.exceptions.NotFound: 404 POST
https://iamconnectorcredentials.mtls.googleapis.com/v1alpha/projects/<PROJECT_NUMBER>/locations/us-central1/connectors/google-workspace/credentials:retrieve?$alt=json;enum-encoding=int:
Requested entity was not found.

Full HTTP response body captured via a diagnostic GcpAuthProvider subclass:

{"error": {"code": 404, "message": "Requested entity was not found.", "status": "NOT_FOUND"}}
  • details: [] (empty)
  • errors: [] (empty)
  • Server: ESF
  • grpc_status_code: NOT_FOUND
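The difference between this response and the control call can be checked mechanically. Below is a minimal sketch, assuming the usual Google API JSON error envelope. `has_diagnosable_details` is a hypothetical helper name, and the 403 sample body is illustrative only: its status and the IAM_PERMISSION_DENIED reason come from our control call, the rest of its shape is an assumption.

```python
import json


def has_diagnosable_details(body: str) -> bool:
    """Return True if a Google-style JSON error body has a non-empty details list."""
    error = json.loads(body).get("error", {})
    return bool(error.get("details"))


# Body returned to the SPIFFE-bound agent (copied from the capture above).
opaque_404 = (
    '{"error": {"code": 404, "message": "Requested entity was not found.", '
    '"status": "NOT_FOUND"}}'
)

# Illustrative shape only (assumption): a 403 whose details name the reason.
control_403 = json.dumps(
    {
        "error": {
            "code": 403,
            "status": "PERMISSION_DENIED",
            "details": [{"reason": "IAM_PERMISSION_DENIED"}],
        }
    }
)

for label, body in (("404", opaque_404), ("403", control_403)):
    print(label, "diagnosable:", has_diagnosable_details(body))
```

A check like this is what separates "the API told us what was missing" from "the API told us nothing", which is the distinction this report hinges on.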

ADK wraps this as RuntimeError("Failed to retrieve credential for user '<user_id>' on connector '<name>'.") at google/adk/integrations/agent_identity/gcp_auth_provider.py:243, which is what surfaces to the user — losing the underlying NotFound until you wrap the provider.

Environment Details:

  • ADK Library Version (pip show google-adk): 1.34.0
  • Desktop OS: Linux container in Vertex Agent Runtime (us-central1)
  • Python Version: 3.12

Model Information:

  • Are you using LiteLLM: No
  • Which model is being used: gemini-flash-latest (not relevant — failure is in tool auth before any model call)

🟡 Optional Information

Regression: N/A — first attempt to stand up Auth Manager 3LO. Latest google-adk 1.34.0 + google-cloud-iamconnectorcredentials 1.0.0a2.

Logs:

The diagnostic wrapper captures the outgoing request fields and the raw exception chain. The outgoing request is well-formed (placeholders below to protect identifiers):

auth_diag.request.connector = 'projects/<PROJECT_NUMBER>/locations/us-central1/connectors/google-workspace'
auth_diag.request.user_id = '<REDACTED_SLACK_COMPOSITE>'
auth_diag.request.scopes = ['https://www.googleapis.com/auth/drive.readonly']
auth_diag.request.continue_uri = '<REDACTED_BRIDGE_URL>/oauth/continue'
auth_diag.request.force_refresh = False (ADK hardcoded)

auth_diag.exc[1] type=NotFound
auth_diag.exc[1].code = 404
auth_diag.exc[1].grpc_status_code = NOT_FOUND
auth_diag.exc[1].details = []
auth_diag.exc[1].errors = []
auth_diag.exc[1].message = 'POST https://iamconnectorcredentials.mtls.googleapis.com/v1alpha/projects/<PROJECT_NUMBER>/locations/us-central1/connectors/google-workspace/credentials:retrieve?$alt=json;enum-encoding=int: Requested entity was not found.'
auth_diag.exc[1].response.status_code = 404
auth_diag.exc[1].response.headers = {'Vary': 'Origin, X-Origin, Referer', 'Content-Type': 'application/json; charset=UTF-8', 'Content-Encoding': 'gzip', 'Server': 'ESF', ...}
auth_diag.exc[1].response.text = '{\n  "error": {\n    "code": 404,\n    "message": "Requested entity was not found.",\n    "status": "NOT_FOUND"\n  }\n}\n'
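For anyone who wants to reproduce the capture, the chain walk behind the `auth_diag.exc[i]` lines can be sketched roughly as follows. This is not our exact wrapper: `FakeNotFound` is a stand-in for `google.api_core.exceptions.NotFound`, and `walk_exception_chain` / `format_chain` are hypothetical helper names. Only standard Python exception chaining (`__cause__` / `__context__`) is assumed.

```python
def walk_exception_chain(exc):
    """Return exc followed by each chained exception, outermost first."""
    chain, seen = [], set()
    current = exc
    while current is not None and id(current) not in seen:
        seen.add(id(current))  # guard against cyclic chains
        chain.append(current)
        if current.__cause__ is not None:
            current = current.__cause__
        else:
            current = current.__context__
    return chain


def format_chain(exc):
    """Render the chain as auth_diag-style lines."""
    lines = []
    for i, err in enumerate(walk_exception_chain(exc)):
        lines.append(f"auth_diag.exc[{i}] type={type(err).__name__}")
        for attr in ("code", "grpc_status_code", "details", "errors"):
            if hasattr(err, attr):
                lines.append(f"auth_diag.exc[{i}].{attr} = {getattr(err, attr)!r}")
    return lines


class FakeNotFound(Exception):
    """Stand-in for the real NotFound exception (assumption, not the real class)."""

    code = 404
    details = []
    errors = []


captured = []
try:
    try:
        raise FakeNotFound("Requested entity was not found.")
    except FakeNotFound as inner:
        raise RuntimeError("Failed to retrieve credential.") from inner
except RuntimeError as outer:
    captured = format_chain(outer)

print("\n".join(captured))
```

The outer `RuntimeError` lands at index 0 and the underlying error at index 1, which is why the interesting fields in the log above all sit under `exc[1]`.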

Minimal Reproduction Code:

# app/agent.py (relevant lines)
import os

from google.adk.auth.credential_manager import CredentialManager
from google.adk.integrations.agent_identity import GcpAuthProvider, GcpAuthProviderScheme
from google.adk.tools.mcp_tool.mcp_toolset import McpToolset, StreamableHTTPConnectionParams

# PROJECT_ID is defined elsewhere in this module (not shown in this excerpt).

# Register the Auth Manager provider so GcpAuthProviderScheme can be resolved.
CredentialManager.register_auth_provider(GcpAuthProvider())

_drive_toolset = McpToolset(
    connection_params=StreamableHTTPConnectionParams(
        url="https://drivemcp.googleapis.com/mcp/v1",
        headers={"x-goog-user-project": PROJECT_ID},
    ),
    tool_filter=["search_files"],
    # 3LO connector, scopes, and continue URI used for credentials:retrieve.
    auth_scheme=GcpAuthProviderScheme(
        name=f"projects/{PROJECT_ID}/locations/us-central1/connectors/google-workspace",
        scopes=["https://www.googleapis.com/auth/drive.readonly"],
        continue_uri=os.environ["AUTH_MANAGER_CONTINUE_URI"],
    ),
)

We also tried the canonical AgentRegistry.get_mcp_toolset(mcp_server_name=...) pattern from the docs. That call fails at module import inside the SPIFFE-bound runtime with 401 Unauthorized on agentregistry.googleapis.com/v1alpha/.../mcpServers/<id> because the container hasn't yet established a token-bearing outbound identity at import time. (See separate concern below.)
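The general shape of a fix for the import-time failure is to defer the lookup until the first tool resolution. Below is a minimal sketch of that idea, not ADK code: `LazyResource` and `fetch_toolset_config` are hypothetical names, and the fetch function is a stand-in for the Agent Registry request.

```python
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class LazyResource(Generic[T]):
    """Defers a credential-dependent lookup until the value is first needed."""

    def __init__(self, factory: Callable[[], T]) -> None:
        self._factory = factory
        self._value: Optional[T] = None
        self._resolved = False

    def get(self) -> T:
        if not self._resolved:
            # If the factory raises (for example on a 401), nothing is cached,
            # so the next call tries again.
            self._value = self._factory()
            self._resolved = True
        return self._value


calls = []


def fetch_toolset_config() -> dict:
    """Hypothetical stand-in for the registry lookup; records each invocation."""
    calls.append("fetch")
    return {"mcp_server": "<id>"}


# Module import time: only the wrapper is built, no request is made.
lazy_toolset = LazyResource(fetch_toolset_config)
print("fetches at import:", len(calls))

# First tool resolution triggers the lookup; later calls reuse the result.
lazy_toolset.get()
lazy_toolset.get()
print("fetches after two uses:", len(calls))
```

This sketch is not thread-safe; a real implementation would need a lock or an async-aware equivalent around the first resolution.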

How often has this issue occurred?: Always (100%) — every call, every user_id we've tested (real Slack composite, fresh email, never-seen-before random strings).


What we have ruled out (to save support cycles)

| Hypothesis | Test | Result |
| --- | --- | --- |
| URL/routing wrong | External curl against same URL with my user (no role) | Real 403 PERMISSION_DENIED with IAM_PERMISSION_DENIED and populated details; URL is routable, IAM is the first gate |
| Request shape wrong | Compared client wire body against proto RetrieveCredentialsRequest (google-cloud-iamconnectorcredentials v1alpha) | Shape matches: connector, user_id, scopes, continue_uri. force_refresh=False omitted by proto3 JSON default |
| IAM missing | gcloud projects get-iam-policy on both the SPIFFE principal and the principalSet | Both have iamconnectors.user. The role's single permission matches what the API requires |
| Connector misconfigured | GET connector resource | state=ENABLED, scopes + clientId + redirectUrl all populated; OAuth client has matching Authorized Redirect URI |
| continueUri mismatch with binding | Patched Agent Registry binding's continueUri to match request exactly (dev URL), retested; reverted to prod URL, retested | Still 404 in both cases |
| Agent not registered in Agent Registry | GET agents/... confirms agent present with matching SPIFFE URN | Registered; agentId == binding source.identifier |
| Stale probe artifacts | Deleted all probe services / bindings; recreated clean | Still 404 |

What we could NOT verify

| Item | Why we couldn't verify |
| --- | --- |
| Project enrolled in Auth Manager Preview | No Console flag, no API endpoint, no email/notification received. The 3LO docs don't describe an enrollment step but the 404 with empty details could be the symptom |
| OAuth client / consent screen approved by Google | Submitted with Auth Manager oauthcallback URL + drive.readonly scope; no confirmation received |

Asks for the ADK team

  1. The user-visible error masks a structured 404. Could GcpAuthProvider.get_auth_credential surface the underlying NotFound (or at minimum the URL and the response body) instead of swallowing it into a bare RuntimeError("Failed to retrieve credential ...")? It took us several debugging cycles to wrap the provider and recover the actual response.

  2. AgentRegistry.get_mcp_toolset makes a synchronous HTTPS call at module-import time (agentregistry.googleapis.com). In a SPIFFE-bound Vertex Agent Engine, no outbound token is available at module import; the container only acquires its identity once incoming requests start flowing. Result: the canonical "Agent Registry MCP" pattern from the 3LO docs fails at startup with 401 Unauthorized on mcpServers/<id>. Consider deferring the lookup to first tool resolution (lazy) so this pattern works under --agent-identity.

  3. The 404 with empty details itself is more likely a server-side concern, but the ADK team likely has the right contacts inside the Auth Manager team to escalate. The repro is fully isolated to the wire request shown above + a SPIFFE caller; happy to give a project number off-issue. If there's a Preview enrollment step we're missing, please point us at the right form / contact.
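To make the first ask concrete, here is a minimal sketch of the behavior we are hoping for. It is not the ADK implementation: `FakeNotFound`, `retrieve_credentials`, and this `get_auth_credential` are stand-ins we made up for illustration. The only technique assumed is standard Python exception chaining, with the underlying error type and message folded into the user-visible text.

```python
class FakeNotFound(Exception):
    """Stand-in for google.api_core.exceptions.NotFound (assumption)."""


def retrieve_credentials(user_id: str, connector: str):
    """Hypothetical backend call that always fails the way our repro does."""
    raise FakeNotFound("404 POST <URL>: Requested entity was not found.")


def get_auth_credential(user_id: str, connector: str):
    try:
        return retrieve_credentials(user_id, connector)
    except Exception as exc:
        # Keep the original error reachable via __cause__ and visible in the
        # message, so callers do not need to subclass the provider to see it.
        raise RuntimeError(
            f"Failed to retrieve credential for user '{user_id}' on connector "
            f"'{connector}': {type(exc).__name__}: {exc}"
        ) from exc


try:
    get_auth_credential("<user_id>", "<name>")
except RuntimeError as err:
    print(err)
    print("cause:", repr(err.__cause__))
```

With something along these lines, the URL and status would have been visible in the first stack trace rather than after wrapping the provider.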

Related

  • Docs that imply this should work: https://docs.cloud.google.com/iam/docs/auth-with-3lo
  • ADK source where the underlying exception gets wrapped and lost: google/adk/integrations/agent_identity/gcp_auth_provider.py:243
  • ADK source where Agent Registry makes the sync import-time call: google/adk/integrations/agent_registry/agent_registry.py:223 (_make_request) called from get_mcp_toolset:326

Labels

auth: [Component] This issue is related to authorization
