Fix pipeline handling in the copilot provider #547
Conversation
@@ -275,6 +275,13 @@ def __init__(
        self.secret_manager = secret_manager
        self.is_fim = is_fim
        self.context = PipelineContext()

        # we create the sensitive context here so that it is not shared between individual requests
        # TODO: could we get away with just generating the session ID for an instance?
Assuming each request has exactly one instance of the pipeline and each pipeline has one instance of PipelineSensitiveData, the session ID can be generated for the pipeline instance and used in PipelineSensitiveData.
yes, let's fix this in a follow-up, I will raise an issue
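For the follow-up, a minimal sketch of what generating the session ID on the pipeline instance could look like. The `Pipeline` and `PipelineSensitiveData` shapes here are simplified assumptions for illustration, not the actual classes:

```python
import uuid


class PipelineSensitiveData:
    """Hypothetical minimal shape of the per-session sensitive-data container."""

    def __init__(self, session_id: str):
        self.session_id = session_id
        self.secrets: dict = {}


class Pipeline:
    """Sketch: one pipeline instance per request, so the session ID
    can be generated once on the instance."""

    def __init__(self):
        # generate the session ID once for this pipeline instance...
        self.session_id = str(uuid.uuid4())
        # ...and hand it to the sensitive-data container instead of
        # letting each request mint its own
        self.sensitive = PipelineSensitiveData(session_id=self.session_id)
```

Since each request gets exactly one pipeline, the sensitive data is never shared across requests even though the ID lives on the instance.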
        self.context_tracking: Optional[PipelineContext] = None

    def _ensure_pipelines(self):
        if not self.input_pipeline or not self.fim_pipeline:
Do we need to instantiate both the pipelines here, or can we just create one conditionally based on if the request is chat or fim?
we could, I'll add that to the issue I'll open
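A sketch of the conditional variant, where only the pipeline the request actually needs gets instantiated. The factory callables and names below are hypothetical:

```python
class CopilotProvider:
    """Sketch: create the chat or FIM pipeline lazily, and only the
    variant this request needs (factory callables are assumptions)."""

    def __init__(self, create_input_pipeline, create_fim_pipeline):
        self._create_input = create_input_pipeline
        self._create_fim = create_fim_pipeline
        self.input_pipeline = None
        self.fim_pipeline = None

    def _ensure_pipeline(self, is_fim: bool):
        # instantiate on first use, and only the requested variant
        if is_fim:
            if self.fim_pipeline is None:
                self.fim_pipeline = self._create_fim()
            return self.fim_pipeline
        if self.input_pipeline is None:
            self.input_pipeline = self._create_input()
        return self.input_pipeline
```

A second call with the same flag reuses the existing instance, so the unused variant is never built.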
@@ -288,6 +295,8 @@ async def _forward_data_through_pipeline(self, data: bytes) -> Union[HttpRequest
            http_request.headers,
            http_request.body,
        )
        # TODO: it's weird that we're overwriting the context. Should we set the context once? Maybe when
        # creating the pipeline instance?
yes, that makes sense.
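A sketch of setting the context once, when the pipeline instance is created, so that repeated `process()` calls accumulate into the same context instead of overwriting it. The shapes are simplified assumptions:

```python
class PipelineContext:
    """Hypothetical minimal context carrying per-pipeline metadata."""

    def __init__(self):
        self.metadata: dict = {}


class Pipeline:
    """Sketch: bind the context at construction, not on every call."""

    def __init__(self):
        # the context is created once with the pipeline instance...
        self.context = PipelineContext()

    def process(self, data: bytes) -> bytes:
        # ...so state written here survives across multiple data buffers
        self.context.metadata.setdefault("chunks", 0)
        self.context.metadata["chunks"] += 1
        return data
```

With the context bound at construction, output spanning several buffers can flow through one pipeline without losing state between calls.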
No longer a draft and can be tested/reviewed, but only now did I notice @ptelang's comments (sorry, I was focused on getting the damn thing working first).
…ce, not on every processing

We used to create the pipeline context during pipeline processing, which means we cannot reuse the same pipeline for output that spans several data buffers.
…r not for every process

A pipeline instance is what binds the pipeline steps with the context. Create the instance sooner, not when processing the request.
Since the copilot provider class instance is created once per connection, let's create the pipelines when establishing the connection and reuse them.
…tput pipeline

Since we can reuse a single pipeline for multiple request-reply round trips, we shouldn't flush the buffer and destroy the context.
We've seen instances where a single payload (typically a FIM one) contained more than one request. Let's dispatch them individually, one by one. Also, let's pass the data into the tasks as a parameter instead of passing around self.buffer.
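A sketch of the one-by-one dispatch, assuming a hypothetical `\r\n\r\n` framing between requests (the real protocol framing may differ). Note that the buffer is passed in as a parameter rather than read from `self`:

```python
# Hypothetical request delimiter; the actual framing may differ.
SEPARATOR = b"\r\n\r\n"


def split_requests(buffer: bytes) -> tuple[list[bytes], bytes]:
    """Split a buffer that may hold several complete requests.

    Returns the complete requests and the leftover (possibly partial) tail.
    """
    parts = buffer.split(SEPARATOR)
    # everything before the last separator is a complete request;
    # the final part may be a partial request still being received
    complete = [part + SEPARATOR for part in parts[:-1]]
    return complete, parts[-1]


async def dispatch_all(buffer: bytes, handle) -> bytes:
    """Dispatch each complete request individually; return the leftover."""
    requests, rest = split_requests(buffer)
    for request in requests:
        # the data travels as a parameter, not via shared mutable state
        await handle(request)
    return rest
```

Passing the data explicitly keeps each dispatched task independent of shared mutable state, so a task cannot observe the buffer mid-mutation.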
For some reason we coded up the SecretsManager so that it held only one secret per session. Let's store a dict instead.
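A sketch of the dict-based storage, keyed by session and then by secret name. The method names here are assumptions, not the actual API:

```python
class SecretsManager:
    """Sketch: hold a dict of secrets per session instead of one value."""

    def __init__(self):
        # session_id -> {secret_name: secret_value}
        self._secrets: dict[str, dict[str, str]] = {}

    def store_secret(self, session_id: str, name: str, value: str) -> None:
        # multiple secrets per session coexist instead of overwriting
        self._secrets.setdefault(session_id, {})[name] = value

    def get_secret(self, session_id: str, name: str):
        return self._secrets.get(session_id, {}).get(name)

    def wipe_session(self, session_id: str) -> None:
        # drop all secrets for a session at once
        self._secrets.pop(session_id, None)
```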
Fixes: #528