-
Notifications
You must be signed in to change notification settings - Fork 1k
Python: ADR for simplified get response #3098
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Python: ADR for simplified get response #3098
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR proposes an ADR (Architecture Decision Record) for simplifying Python's chat client API by consolidating the get_response and get_streaming_response methods into a single method with a stream parameter. This would leverage Python's type system and overload patterns to provide better type safety while reducing code duplication.
Key Changes
- Proposes consolidating two separate methods (
get_responseandget_streaming_response) into a single method with astreamparameter - Analyzes the implications across all usage areas including agents, decorators, middleware, and provider implementations
- Provides a detailed migration path with backward compatibility considerations
f51a20d to
b16f403
Compare
c7465ec to
f59fb8c
Compare
| - Bad: Increased complexity in decorators and middleware | ||
| - Bad: Less alignment with .NET design | ||
|
|
||
| ## Option 3: Consolidate + Merge Agent Methods |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the idea of a single method and passing in stream=True|False.
| # Streaming agent - same method, different parameter | ||
| print(f"\nUser: {message}") | ||
| print(f"{agent.name}: ", end="") | ||
| async for update in agent.run(message, thread=thread, stream=True): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I like this...
| - `_trace_get_streaming_response()` wraps `get_streaming_response` | ||
| - `use_instrumentation` decorator applies both | ||
|
|
||
| **Impact**: Would need consolidation into a single tracing wrapper. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we have a single wrapper but internally split into two for streaming and non-streaming? Essentially maintaining the status quote of the internals.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same for the other decorators.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
all the decorators work on the public api's so those would have to be changed, we could decide to do some on the _inner methods and keep those split between streaming and non-streaming, but I would prefer to go allout, because then the code duplication is reduced by quite a bit and it becomes much simpler to understand the subtle differences between streaming and non-streaming behavior, or put differently it becomes much easier to have consistent behavior.
0ec0eff to
ac1692e
Compare
| ## Misc | ||
|
|
||
| Smaller questions to consider: | ||
| - Should default be `stream=False` or `stream=True`? (Current is False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think defaulting to False makes sense.
7d0489c to
b5b4b2f
Compare
Motivation and Context
ADR describing the implications of a potential simplification of the
get_responsemethod in ChatClients, this would leverage python's type system and astream=True/Falseparam to steer the users instead of using two separate methods, that already have a lot of overlap.Related to #3096 - Point 5
Description
Contribution Checklist