- Package Name: azure-ai-projects
- Package Version: 2.0.0b3
- Operating System: Windows
- Python Version:
Description
When using an Azure AI Foundry agent with the Azure AI Search tool (remote_functions.azure_ai_search), the telemetry/tracing data captured in Application Insights does not include the tool's input parameters or retrieved content. This makes it difficult to debug, audit, and understand the agent's search behavior.
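For context, tracing is wired up roughly as below (a minimal sketch of a typical setup, not an exact copy of my configuration; the connection-string variable is a placeholder). AZURE_TRACING_GEN_AI_CONTENT_RECORDING_ENABLED is the documented opt-in for recording message content in gen_ai spans:

```python
import os

from azure.monitor.opentelemetry import configure_azure_monitor

# Documented opt-in for recording message/tool content in gen_ai spans.
os.environ["AZURE_TRACING_GEN_AI_CONTENT_RECORDING_ENABLED"] = "true"

# Export OpenTelemetry spans to Application Insights.
configure_azure_monitor(
    connection_string=os.environ["APPLICATIONINSIGHTS_CONNECTION_STRING"],
)
```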
Current Behavior
In the Application Insights End-to-End Transaction view, the remote_functions.azure_ai_search → execute_tool span shows:
| Field | Value |
|---|---|
| EVENT | remote_functions.azure_ai_search execute_tool |
| RES. (result) | 0 |
| DURATION | 220.2 ms |
| Messages (Tool) | {"type":"text","content":""} |
What's Missing
- Input Parameters: The search query, filters, or any parameters passed to the Azure AI Search tool are not visible
- Retrieved Content: The documents/results returned from the search are not captured
- Search Metadata: No visibility into which index was queried, number of results, relevance scores, etc.
Expected Behavior
The telemetry for the remote_functions.azure_ai_search → execute_tool span should include (an illustrative attribute sketch follows this list):
- Tool Input:
  - Search query text
  - Filter expressions (if any)
  - Top-K parameter
  - Index name
- Tool Output:
  - Retrieved document snippets/content
  - Number of documents returned
  - Relevance scores (if available)
- Metadata:
  - Azure AI Search endpoint
  - Index name
  - Request latency breakdown
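To make the ask concrete, here is the kind of attribute set the execute_tool span could carry. Apart from gen_ai.tool.name, which exists in the OpenTelemetry gen_ai semantic conventions, every name and value below is hypothetical:

```python
# Illustrative only: other than gen_ai.tool.name, these attribute names
# and values are hypothetical, not an existing convention.
expected_span_attributes = {
    "gen_ai.tool.name": "azure_ai_search",
    # Tool input
    "azure_ai_search.query": "which tests are in the catalog?",
    "azure_ai_search.filter": None,
    "azure_ai_search.top_k": 5,
    "azure_ai_search.index_name": "my-index",
    # Tool output
    "azure_ai_search.document_count": 5,
    "azure_ai_search.documents": '[{"content": "...", "score": 2.71}]',
    # Metadata
    "azure_ai_search.endpoint": "https://my-search.search.windows.net",
}
```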
Reproduction Steps
- Create an Azure AI Foundry agent with the Azure AI Search tool configured
- Run the agent with a query that triggers the search tool:

```python
# openai_client is the OpenAI-compatible client for the Foundry project;
# agent is the agent created in the previous step.
response = openai_client.responses.create(
    input=[{"role": "user", "content": "your query."}],
    extra_body={"agent": {"name": agent.name, "type": "agent_reference"}},
)
```
- Check Application Insights → End-to-End Transaction
- Observe that the remote_functions.azure_ai_search → execute_tool span has empty/minimal content (a sketch for inspecting these spans locally follows these steps)
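To take Application Insights ingestion and sampling out of the picture, the same spans can be dumped to stdout with a plain OpenTelemetry console exporter (a minimal sketch; set this up instead of, or before, configure_azure_monitor):

```python
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import ConsoleSpanExporter, SimpleSpanProcessor

# Print every span to stdout so the execute_tool span attributes can be
# inspected without waiting for Application Insights ingestion.
provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)
```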
Evidence
Operation ID: 4e33833fca2c0a9684233063950f9b74
Transaction Timeline:
| Event | Duration |
|---|---|
| AI unknown - Processing request - Instructions | 196.2 µs |
| AI unknown - Processing request - User message | 155.3 µs |
| execute_tool | 176.9 ms |
| remote_functions.azure_ai_search execute_tool | 220.2 ms |
| AI unknown - Working - Generating response | 4.7 s |
Tool Message Content: {"type":"text","content":""} (empty)
Indirect Evidence of Retrieved Data
Notably, the subsequent LLM call (AI unknown - Working - Generating response) shows:
| Attribute | Value |
|---|---|
| gen_ai.usage.input_tokens | 4157 |
| gen_ai.usage.output_tokens | 333 |
| gen_ai.output.messages | [{"role": "assistant", "parts": [{"type": "text", "content": "The catalog includes the following tests:\n\n1. Leakage Test\n - Ensures LED equipment limits leakage currents to safe values for human ... |
This confirms that:
- The search tool DID retrieve data — the 4157 input tokens indicate substantial content was passed to the LLM
- The LLM successfully used the retrieved content — the response references specific test catalog items
- The gap is purely one of observability: the tool execution works correctly, but its input/output is not captured in telemetry
This makes debugging even more challenging: you can see the data was retrieved (via token count) and used (via LLM response), but you cannot inspect the actual search results directly.
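The raw dependency record can also be pulled programmatically to confirm what actually landed in the Properties payload, which is where the tool input/output would have to appear (a sketch assuming a workspace-based Application Insights resource; WORKSPACE_ID is a placeholder):

```python
from datetime import timedelta

from azure.identity import DefaultAzureCredential
from azure.monitor.query import LogsQueryClient

# Fetch the execute_tool dependency for the operation ID above and dump its
# custom properties.
client = LogsQueryClient(DefaultAzureCredential())
query = """
AppDependencies
| where OperationId == "4e33833fca2c0a9684233063950f9b74"
| where Name has "execute_tool"
| project TimeGenerated, Name, DurationMs, Properties
"""
response = client.query_workspace("WORKSPACE_ID", query, timespan=timedelta(days=7))
for table in response.tables:
    for row in table.rows:
        print(dict(zip(table.columns, row)))
```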
Impact
- Debugging: Cannot troubleshoot why search results may be incorrect or incomplete
- Auditing: No record of what data was searched or retrieved
- Cost Analysis: Cannot correlate search usage with specific queries
- Quality Improvement: Cannot analyze search patterns to improve prompts or index configuration