GenAI Utils | Allow passing external Span to the manual LLMInvocation#4281
GenAI Utils | Allow passing external Span to the manual LLMInvocation#4281Fiery-Fenix wants to merge 2 commits intoopen-telemetry:mainfrom
Conversation
c63f0cb to
cc44bb2
Compare
| # Create a custom Span | ||
| external_span = self.telemetry_handler._tracer.start_span( | ||
| name="external operation", kind=trace.SpanKind.INTERNAL | ||
| ) | ||
|
|
||
| invocation = LLMInvocation( | ||
| span=external_span, | ||
| request_model="manual-model", | ||
| input_messages=[message], | ||
| provider="test-provider", | ||
| attributes={"external": True}, | ||
| ) |
There was a problem hiding this comment.
Can you explain a little more on the use case? Since the span is returned already, why not just use use the returned span?
There was a problem hiding this comment.
Sure, main purpose was to make adoption of instrumentation-genai-utils into instrumentation-botocore without big changes on botocore instrumentation side.
In botocore instrumentation Span is started here. Please take a note that this Span is started for any supported "extension" here, not only for Bedrock instrumentation, but also for S3/Lambda/DocumentDB/etc.
At the same time actual LLM Invocation instrumentation is located deeper - in bedrock extension which is actually called few lines below.
Please take a note that bedrock extension can handle both synchronous and stream invokes that bring another complexity for instrumentation-genai-utils.
My idea is to keep top-level instrumentation as is, keeping Span lifecycle management there, at the same time run TelemetryHandler.start_llm/TelemetryHandler.stop_llm exactly in bedrock extension using the provided active Span. This will keep code change more local and will not break code responsibility by introducing dependency on GenAI TelemetryHandler on the botocore instrumentation top level.
There was a problem hiding this comment.
Hmm I see. I don't love it for complicating the API, I think this will pretty much only be used for Bedrock.
It's a bit of an anti pattern to update the span name after starting the span, since the final name isn't available to samplers. But it is what it is. I'll defer to @xrmx who wrote the bedrock code though if he thinks it's worth refactoring.
There was a problem hiding this comment.
Actually, current implementation of the TelemetryHandler forcedly overwrites Span name after it's creation, I haven't touched that part.
util/opentelemetry-util-genai/src/opentelemetry/util/genai/handler.py
Outdated
Show resolved
Hide resolved
| if invocation.end_span_on_exit: | ||
| span.end() |
There was a problem hiding this comment.
Does end_span_on_exit need to be public, or is it only used internally here?
Another option that might be cleaner is to update LLMInvocation to store an ExitStack for cleanup, since you can dynamically push callbacks onto it.
There was a problem hiding this comment.
As this field is used by TelemetryHandler to determine if it's required to close the span - I believe yes, it should be public field of LLMInvocation. Usage of this field is very similar to existing LLMInvocation field context_token - it's also seems to be private, but never used by LLMInvocation class itself, only by TelemetryHandler.
cc44bb2 to
18be12f
Compare
Description
This PR allows to pass already active Span to
TelemetryHandlerviaLLMInvocationinstead of always creating a new Span duringstart_llmcall.This will allow to control Span lifecycle outside of GenAI utils which might be useful for some cases.
Primary goal for this change - to allow usage of GenAI Utills in
botocoreinstrumentation forbedrockinstrumentation. Main feature ofbotocoreinstrumentation that it consist of 2 layers:To use
instrumentation-genai-utilinbotocoreinstrumentation we'll either need to do a big refactoring of thebotocoreinstrumentation to move Span lifecycle control to "extensions" or to allow passing external Span toTelemetryHandler, keeping control on the caller side.This is also can be useful for instrumenting Streaming GenAI calls when Span lifecycle should be managed on the same level as stream lifecycle control.
Type of change
Please delete options that are not relevant.
How Has This Been Tested?
New test was added.
Does This PR Require a Core Repo Change?
Checklist:
See contributing.md for styleguide, changelog guidelines, and more.