
Fix pipeline handling in the copilot provider #547

Merged
merged 7 commits into stacklok:main on Jan 13, 2025

Conversation

jhrozek
Contributor

@jhrozek jhrozek commented Jan 10, 2025

  • Create the pipeline sensitive context when creating a pipeline instance, not on every processing
  • Create pipeline instance when creating the SequentialPipelineProcessor not for every process
  • Create the pipelines only once in the copilot provider
  • test me: is this needed

Fixes: #528
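The first two bullets can be illustrated with a minimal sketch (hypothetical class and attribute names, not the actual codegate code): per-request sensitive state is created once in the constructor of a pipeline instance, rather than on every processing call, so concurrent requests never share a session.

```python
import uuid
from dataclasses import dataclass


@dataclass
class PipelineSensitiveData:
    # Per-request sensitive state; must never be shared between requests.
    session_id: str


class PipelineInstance:
    """Sketch: per-request state is created once, in __init__."""

    def __init__(self):
        # Created when the instance is created, not on every process()
        # call, so each request gets its own session.
        self.sensitive = PipelineSensitiveData(session_id=str(uuid.uuid4()))

    def process(self, data: bytes) -> bytes:
        # All buffers belonging to this request reuse self.sensitive.
        return data
```

Two instances created for two requests then hold distinct session IDs, which is exactly the isolation the fix is after.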

@@ -275,6 +275,13 @@ def __init__(
self.secret_manager = secret_manager
self.is_fim = is_fim
self.context = PipelineContext()

# we create the sensitive context here so that it is not shared between individual requests
# TODO: could we get away with just generating the session ID for an instance?
Contributor

Assuming each request has exactly one instance of the pipeline and each pipeline has one instance of PipelineSensitiveData, the session ID can be generated for the pipeline instance and used in PipelineSensitiveData.

Contributor Author


yes, let's fix this in a follow-up, I will raise an issue

self.context_tracking: Optional[PipelineContext] = None

def _ensure_pipelines(self):
if not self.input_pipeline or not self.fim_pipeline:
Contributor


Do we need to instantiate both the pipelines here, or can we just create one conditionally based on if the request is chat or fim?

Contributor Author


we could, I'll add that to the issue I'll open
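The `_ensure_pipelines` pattern under discussion can be sketched like this (hypothetical provider shape; the stand-in `_create_pipeline` replaces the real `SequentialPipelineProcessor` construction): pipelines are built lazily on first use and then reused for every request-reply round trip on the connection.

```python
class CopilotProvider:
    """Sketch of once-per-connection pipeline creation (assumed shape)."""

    def __init__(self):
        # One provider instance per connection; pipelines start empty.
        self.input_pipeline = None
        self.fim_pipeline = None

    def _create_pipeline(self, is_fim: bool) -> dict:
        # Stand-in for the real SequentialPipelineProcessor construction.
        return {"is_fim": is_fim}

    def _ensure_pipelines(self) -> None:
        # Build both pipelines on first use, then reuse them; the review
        # suggestion is to build only the one the request actually needs.
        if self.input_pipeline is None or self.fim_pipeline is None:
            self.input_pipeline = self._create_pipeline(is_fim=False)
            self.fim_pipeline = self._create_pipeline(is_fim=True)
```

Calling `_ensure_pipelines()` a second time leaves the already-created pipelines untouched, which is what makes reuse across round trips safe.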

@@ -288,6 +295,8 @@ async def _forward_data_through_pipeline(self, data: bytes) -> Union[HttpRequest
http_request.headers,
http_request.body,
)
# TODO: it's weird that we're overwriting the context. Should we set the context once? Maybe when
# creating the pipeline instance?
Contributor


yes, that makes sense.

@jhrozek jhrozek marked this pull request as ready for review January 12, 2025 22:42
@jhrozek
Contributor Author

jhrozek commented Jan 12, 2025

No longer a draft and can be tested/reviewed, but only now did I notice @ptelang's comments (sorry, I was focused on getting the damn thing working first).
I'll fix them first thing in the morning.

@jhrozek jhrozek changed the title WIP: Fix pipeline handling in the copilot provider Fix pipeline handling in the copilot provider Jan 13, 2025
lukehinds
lukehinds previously approved these changes Jan 13, 2025
…ce, not on every processing

We used to create the pipeline context during pipeline processing which
means we cannot reuse the same pipeline for output that spans several
data buffers.
…r not for every process

A pipeline instance is what binds the pipeline steps with the context.
Create the instance sooner, not when processing the request.
Since the copilot provider class instance is created once per
connection, let's create the pipelines when establishing the connection
and reuse them.
…tput pipeline

Since we can reuse a single pipeline for multiple request-reply round
trips, we shouldn't flush the buffer and destroy the context.
We've seen instances where the incoming payload (typically a FIM one) contained
more than one request. Let's dispatch them individually. Also, let's not pass
self.buffer around into the tasks; pass the data as a parameter instead.
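A sketch of that dispatch change (the newline framing is a hypothetical stand-in for the real request boundary detection): split the buffer into individual requests and hand each one to the handler as a parameter, instead of having tasks read a shared self.buffer.

```python
import asyncio


async def dispatch_buffer(buffer: bytes, handle) -> None:
    # Hypothetical framing: assume requests in the buffer are
    # newline-delimited. Dispatch each one individually, passing the
    # data as a parameter rather than reading shared state.
    for request in buffer.split(b"\n"):
        if request:
            await handle(request)
```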
For some reason we coded up the SecretsManager so that it held only one
secret per session. Let's store a dict instead.
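The SecretsManager change can be sketched as follows (hypothetical method names, not the actual codegate API): keying a dict of secrets by session, so storing a second secret no longer overwrites the first.

```python
class SecretsManager:
    """Sketch: multiple secrets per session instead of a single one."""

    def __init__(self):
        # session_id -> {secret_name: secret_value}
        self._secrets: dict[str, dict[str, str]] = {}

    def store_secret(self, session_id: str, name: str, value: str) -> None:
        # A new name under the same session adds an entry; it does not
        # replace previously stored secrets.
        self._secrets.setdefault(session_id, {})[name] = value

    def get_secret(self, session_id: str, name: str):
        return self._secrets.get(session_id, {}).get(name)
```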
@lukehinds lukehinds merged commit c68ef71 into stacklok:main Jan 13, 2025
2 checks passed
Successfully merging this pull request may close these issues.

Secrets redaction response not working for Copilot in v0.1.5
3 participants