-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable IRIS to use FAQs #187
Conversation
Caution Review failedThe pull request is closed. WalkthroughThis pull request introduces comprehensive enhancements related to FAQ (Frequently Asked Questions) functionality across various components of the application. Key changes include the addition of data transfer objects, pipeline configurations, retrieval mechanisms, and webhook endpoints. These modifications facilitate the ingestion, retrieval, and management of FAQs within courses, improving the system's capability to provide contextual information to students through an AI-driven interface. Changes
Possibly related PRs
Suggested Reviewers
📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (2)
✨ Finishing Touches
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Nitpick comments (18)
app/pipeline/faq_ingestion_pipeline.py (3)
87-88
: Consolidate nested context managers.Consider using a single
with
statement for bothbatch_update_lock
andself.collection.batch.rate_limit()
to reduce nesting and improve readability.-with batch_update_lock: - with self.collection.batch.rate_limit(requests_per_minute=600) as batch: +with batch_update_lock, self.collection.batch.rate_limit(requests_per_minute=600) as batch: ...🧰 Tools
🪛 Ruff (0.8.2)
87-88: Use a single
with
statement with multiple contexts instead of nestedwith
statements(SIM117)
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
140-146
: Return value fromchunk_data
to avoid confusion.The method doesn’t return anything. If returning
None
is intentional, explicitly returnNone
to improve clarity.def chunk_data(self, path: str) -> List[Dict[str, str]]: ... return + # or, for explicitness: + # return None🧰 Tools
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
1-146
: Apply Black formatting.The linter warns the file needs to be reformatted. Please run Black to ensure consistent code style.
🧰 Tools
🪛 Ruff (0.8.2)
6-6:
openai.OpenAI
imported but unusedRemove unused import:
openai.OpenAI
(F401)
87-88: Use a single
with
statement with multiple contexts instead of nestedwith
statements(SIM117)
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
app/retrieval/faq_retrieval.py (3)
2-2
: Evaluate the usage oflogger
fromasyncio.log
.You already import the standard
logging
module. If it’s intentional to useasyncio.log.logger
, clarify to avoid confusion for future maintainers. Otherwise, consider switching to the conventionallogging
approach.🧰 Tools
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
191-191
: Check string matching for'YES'
.Strict equality to
"YES"
could cause edge-case failures if the response includes extra whitespace or a different capitalization. Consider normalizing the casing or trimming whitespace before comparison.-return response == "YES" +return response.strip().upper() == "YES"🧰 Tools
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
1-395
: Apply Black formatting.The linter warns the file needs to be reformatted. Please run Black to ensure consistent code style.
🧰 Tools
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
app/domain/data/faq_dto.py (1)
1-13
: Apply Black formatting.The linter warns the file needs to be reformatted. Please run Black to ensure consistent code style.
🧰 Tools
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
app/domain/ingestion/deletionPipelineExecutionDto.py (1)
18-23
: Apply Black formatting.The file also needs reformatting according to the pipeline.
🧰 Tools
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
app/domain/ingestion/ingestion_pipeline_execution_dto.py (1)
18-23
: Apply Black formatting.The file also needs reformatting according to the pipeline.
🧰 Tools
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
app/web/status/faq_ingestion_status_callback.py (1)
17-45
: Add robust error handling.Consider adding error handling around URL construction and the
super().__init__
call to anticipate potential issues in production—for example, invalid or inaccessible endpoints.🧰 Tools
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
app/vector_database/faq_schema.py (2)
1-1
: Remove unusedlogging
import.
This import is never referenced in the file. Eliminating unused imports helps reduce code clutter.- import logging from enum import Enum
🧰 Tools
🪛 Ruff (0.8.2)
1-1:
logging
imported but unusedRemove unused import:
logging
(F401)
31-31
: Remove unused variable assignment.
The local variableproperties
is assigned but never used. Eliminating unused variables helps keep the code clean.- properties = collection.config.get(simple=True).properties
🧰 Tools
🪛 Ruff (0.8.2)
31-31: Local variable
properties
is assigned to but never usedRemove assignment to unused variable
properties
(F841)
app/pipeline/shared/citation_pipeline.py (2)
1-1
: Remove unusedlogging
import.
This import is never referenced, so deleting it helps maintain code cleanliness.- import logging import os from asyncio.log import logger
🧰 Tools
🪛 Ruff (0.8.2)
1-1:
logging
imported but unusedRemove unused import:
logging
(F401)
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
80-80
: Rename unused loop variable.
The loop index variablei
is never used. Renaming it to_
or_i
clarifies that it’s intentionally unused.for i, faq in enumerate(faqs): faq = "Question: {}, Answer: {}".format( ... + # rename i -> _ for _, faq in enumerate(faqs): faq = ...
🧰 Tools
🪛 Ruff (0.8.2)
80-80: Loop control variable
i
not used within loop bodyRename unused
i
to_i
(B007)
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
app/pipeline/chat/course_chat_pipeline.py (4)
13-13
: Remove unused import.Static analysis indicates that
SystemMessagePromptTemplate
is imported but not used. Please remove it to keep the code clean.- from langchain_core.prompts import ( - ChatPromptTemplate, SystemMessagePromptTemplate, - ) + from langchain_core.prompts import ( + ChatPromptTemplate, + )🧰 Tools
🪛 Ruff (0.8.2)
13-13:
langchain_core.prompts.SystemMessagePromptTemplate
imported but unusedRemove unused import:
langchain_core.prompts.SystemMessagePromptTemplate
(F401)
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
85-85
: Consider initializingretrieved_faqs
to an empty list.Currently,
retrieved_faqs
isNone
by default. If the downstream logic doesn’t handleNone
checks or if you need to append to it before any retrieval, you may encounterNoneType
errors. Initializing it to an empty list can help.- retrieved_faqs: List[dict] = None + retrieved_faqs: List[dict] = []🧰 Tools
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
300-329
: Explore error handling for no matched FAQs.The function returns an empty string if no FAQs are found. Consider adding some feedback to clarify that no matching FAQs were retrieved, otherwise there's a risk of returning an empty answer with no context.
if not self.retrieved_faqs: + return "No matching FAQs found." result = "" for faq in self.retrieved_faqs: ...
🧰 Tools
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
510-516
: Fix docstring reference.The docstring still references “lectures” instead of “FAQs.” Update it to reflect the correct resource type being checked.
:return: True if there are indexed lectures for the course, False otherwise +:return: True if there are indexed FAQs for the course, False otherwise
🧰 Tools
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
docker/pyris-dev.yml
is excluded by!**/*.yml
📒 Files selected for processing (14)
app/common/PipelineEnum.py
(1 hunks)app/domain/data/faq_dto.py
(1 hunks)app/domain/ingestion/deletionPipelineExecutionDto.py
(2 hunks)app/domain/ingestion/ingestion_pipeline_execution_dto.py
(2 hunks)app/pipeline/chat/course_chat_pipeline.py
(10 hunks)app/pipeline/faq_ingestion_pipeline.py
(1 hunks)app/pipeline/prompts/faq_retrieval_prompts.py
(1 hunks)app/pipeline/shared/citation_pipeline.py
(4 hunks)app/retrieval/faq_retrieval.py
(1 hunks)app/vector_database/database.py
(2 hunks)app/vector_database/faq_schema.py
(1 hunks)app/web/routers/pipelines.py
(1 hunks)app/web/routers/webhooks.py
(5 hunks)app/web/status/faq_ingestion_status_callback.py
(1 hunks)
🧰 Additional context used
🪛 GitHub Actions: Run linters
app/domain/ingestion/deletionPipelineExecutionDto.py
[warning] File needs to be reformatted using Black
app/domain/data/faq_dto.py
[warning] File needs to be reformatted using Black
app/pipeline/chat/course_chat_pipeline.py
[warning] File needs to be reformatted using Black
app/pipeline/shared/citation_pipeline.py
[warning] File needs to be reformatted using Black
app/pipeline/faq_ingestion_pipeline.py
[warning] File needs to be reformatted using Black
app/web/status/faq_ingestion_status_callback.py
[warning] File needs to be reformatted using Black
app/domain/ingestion/ingestion_pipeline_execution_dto.py
[warning] File needs to be reformatted using Black
app/web/routers/webhooks.py
[warning] File needs to be reformatted using Black
app/retrieval/faq_retrieval.py
[warning] File needs to be reformatted using Black
🪛 Ruff (0.8.2)
app/pipeline/chat/course_chat_pipeline.py
13-13: langchain_core.prompts.SystemMessagePromptTemplate
imported but unused
Remove unused import: langchain_core.prompts.SystemMessagePromptTemplate
(F401)
app/pipeline/shared/citation_pipeline.py
1-1: logging
imported but unused
Remove unused import: logging
(F401)
80-80: Loop control variable i
not used within loop body
Rename unused i
to _i
(B007)
app/vector_database/faq_schema.py
1-1: logging
imported but unused
Remove unused import: logging
(F401)
31-31: Local variable properties
is assigned to but never used
Remove assignment to unused variable properties
(F841)
app/pipeline/faq_ingestion_pipeline.py
6-6: openai.OpenAI
imported but unused
Remove unused import: openai.OpenAI
(F401)
87-88: Use a single with
statement with multiple contexts instead of nested with
statements
(SIM117)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: Build and Push Docker Image
🔇 Additional comments (13)
app/pipeline/faq_ingestion_pipeline.py (1)
56-79
: Handle potentialdto
attribute issues in__call__
.
self.dto
is assumed to have non-None values forfaq
inside the method. Ifdto
ordto.faq
can beNone
, consider adding safeguards or checks before accessingdto.faq
.Would you like a script to search for other places where
dto
might be used without null checks across the codebase to confirm it's always set?🧰 Tools
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
app/retrieval/faq_retrieval.py (2)
148-149
: Confirm question assessment coverage.The condition
if not self.assess_question(chat_history, student_query): return []
might skip retrieving FAQs entirely. Ensure you have tests verifying that valid queries pass this check.🧰 Tools
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
223-225
: Record token usage more uniformly.You're appending
self.llm.tokens
toself.tokens
. Verify that any prior usage of tokens in the pipeline isn’t overwritten or missed. If concurrency is involved, consider thread-safe approaches.🧰 Tools
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
app/domain/data/faq_dto.py (1)
4-10
: DTO field defaults and alias usage look good.The structure appears straightforward, with well-defined fields and aliases. Everything looks consistent and clear.
🧰 Tools
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
app/domain/ingestion/deletionPipelineExecutionDto.py (1)
18-23
: Ensure alignment with the ingestion/deletion flow.The new
FaqDeletionExecutionDto
referencesFaqDTO
with aliaspyrisFaqWebhookDTO
. Verify that references in the deletion pipeline only rely on that alias or confirm that all references are consistent.🧰 Tools
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
app/domain/ingestion/ingestion_pipeline_execution_dto.py (1)
18-23
: DTO usage consistency.Similar to the deletion DTO, ensure calls expecting
pyrisFaqWebhookDTO
come from the correct field. This prevents mismatch errors when hooking up your ingestion pipeline.🧰 Tools
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
app/common/PipelineEnum.py (1)
17-18
: Enums added successfully.The new enum members align well with existing naming conventions. Consider adding brief docstrings or comments indicating their use cases, but otherwise, the additions look good.
app/pipeline/prompts/faq_retrieval_prompts.py (1)
1-19
: Prompts seem clearly defined.These prompt strings are well-structured and descriptive. To further improve maintainability, consider documenting any guidelines or constraints (e.g., word limit enforcement) so that future maintainers understand the context and usage.
app/vector_database/database.py (1)
32-32
: Schema initialization looks good.The new
faqs
schema initialization follows the existing pattern. Verify that repeated queries for FAQ data do not require reinitializing the schema for each call. Otherwise, this approach appears consistent with the rest of the file.app/web/routers/webhooks.py (2)
74-100
: Overall concurrency approach looks consistent.
Your usage ofSemaphore
andfinally
blocks ensures that resources are always released. This is in line with best practices.🧰 Tools
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
153-179
: Webhook naming and structure are consistent with the codebase.
The new endpoints follow the existing patterns for ingestion and deletion.🧰 Tools
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
app/web/routers/pipelines.py (1)
298-305
: Implementation of "FAQ_INGESTION" variant is correct.
The newly added switch case for"FAQ_INGESTION"
properly returns a default variant, mirroring other feature switch patterns.app/pipeline/chat/course_chat_pipeline.py (1)
Line range hint
1-9999
: Reformat code with Black.The file needs to be formatted according to Black. Please run the formatter to comply with the pipeline requirements.
🧰 Tools
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
🔭 Outside diff range comments (1)
app/pipeline/shared/citation_pipeline.py (1)
Line range hint
144-144
: Fix incorrectlogger.error
usageAt line 144, the
logger.error
method is not used correctly. Pass the exception usingexc_info
or include it in the message.Replace:
logger.error("citation pipeline failed", e)With:
logger.error(f"citation pipeline failed: {e}")Or, to include traceback information:
logger.error("citation pipeline failed", exc_info=True)🧰 Tools
🪛 Ruff (0.8.2)
1-1:
logging
imported but unusedRemove unused import:
logging
(F401)
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
🧹 Nitpick comments (14)
app/retrieval/faq_retrieval.py (4)
181-187
: Remove unnecessary exception handlingIn the
assess_question
method, exceptions are caught and immediately re-raised without additional handling or logging. This try-except block can be removed to simplify the code.🧰 Tools
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
215-222
: Remove unnecessary exception handlingIn
rewrite_student_query
, the try-except block catches exceptions and immediately re-raises them without adding context or handling. Consider removing the try-except block unless specific handling is needed.🧰 Tools
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
253-260
: Remove unnecessary exception handlingIn
rewrite_elaborated_query
, exceptions are caught and immediately re-raised. Unless you're adding specific handling or logging, the try-except block can be omitted.🧰 Tools
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
66-82
: Consider initializingself.tokens
in the superclassIf multiple pipelines use
self.tokens
, consider moving its initialization to the superclassPipeline
for consistency and to avoid duplication.🧰 Tools
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
app/pipeline/prompts/faq_citation_prompt.txt (3)
1-1
: Add missing commas for clarityIn line 1, add commas after introductory phrases to improve readability.
Change:
In the paragraphs below you are provided with an answer to a question. Underneath the answer you will find the faqs that the answer was based on.
To:
In the paragraphs below, you are provided with an answer to a question. Underneath the answer, you will find the FAQs that the answer was based on.
🧰 Tools
🪛 LanguageTool
[uncategorized] ~1-~1: Possible missing comma found.
Context: In the paragraphs below you are provided with an answer to a qu...(AI_HYDRA_LEO_MISSING_COMMA)
[uncategorized] ~1-~1: Possible missing comma found.
Context: ...an answer to a question. Underneath the answer you will find the faqs that the answer ...(AI_HYDRA_LEO_MISSING_COMMA)
7-8
: Clarify instructions regarding citationsLines 7 and 8 contain contradictory instructions about including citations. Consider rephrasing for clarity.
32-32
: Reduce excessive exclamation marksUsing multiple exclamation marks may seem unprofessional. Consider reducing them for a formal tone.
Change:
If the answer actually does not contain any information from the paragraphs, please do not include any citations and return '!NONE!'.
To:
If the answer does not contain any information from the paragraphs, please do not include any citations and return 'NONE'.
🧰 Tools
🪛 LanguageTool
[style] ~32-~32: Using many exclamation marks might seem excessive (in this case: 4 exclamation marks for a text that’s 2043 characters long)
Context: ...o not include any citations and return '!NONE!'.(EN_EXCESSIVE_EXCLAMATION)
app/pipeline/shared/citation_pipeline.py (2)
1-1
: Remove unused importlogging
The
logging
module is imported but not used in the file. Remove it to clean up the code.🧰 Tools
🪛 Ruff (0.8.2)
1-1:
logging
imported but unusedRemove unused import:
logging
(F401)
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
84-91
: Rename unused loop variableIn the
create_formatted_faq_string
method, the loop variablei
is not used. Rename it to_
to indicate it's intentionally unused.Change:
for i, faq in enumerate(faqs):To:
for _, faq in enumerate(faqs):🧰 Tools
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
app/pipeline/chat/course_chat_pipeline.py (5)
13-13
: Remove unused importThe
SystemMessagePromptTemplate
is imported but never used in the code.- ChatPromptTemplate, SystemMessagePromptTemplate, + ChatPromptTemplate,🧰 Tools
🪛 Ruff (0.8.2)
13-13:
langchain_core.prompts.SystemMessagePromptTemplate
imported but unusedRemove unused import:
langchain_core.prompts.SystemMessagePromptTemplate
(F401)
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
300-329
: Consider using string join for better readabilityThe FAQ result string concatenation could be improved using
join()
for better readability and performance.- result = "" - for faq in self.retrieved_faqs: - res = "[FAQ ID: {}, FAQ Question: {}, FAQ Answer: {}]".format( - faq.get(FaqSchema.FAQ_ID.value), - faq.get(FaqSchema.QUESTION_TITLE.value), - faq.get(FaqSchema.QUESTION_ANSWER.value), - ) - result += res - return result + return "\n".join( + "[FAQ ID: {}, FAQ Question: {}, FAQ Answer: {}]".format( + faq.get(FaqSchema.FAQ_ID.value), + faq.get(FaqSchema.QUESTION_TITLE.value), + faq.get(FaqSchema.QUESTION_ANSWER.value), + ) + for faq in self.retrieved_faqs + )🧰 Tools
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
429-433
: Remove extra blank linesThere are unnecessary blank lines after the FAQ tool addition.
if self.should_allow_faq_tool(dto.course.id): tool_list.append(faq_content_retrieval) - - -🧰 Tools
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
454-459
: Consolidate progress callbacksThe "Augmenting response..." callback is duplicated for paragraphs and FAQs. Consider consolidating these callbacks.
- self.callback.in_progress("Augmenting response ...") out = self.citation_pipeline(self.retrieved_paragraphs, out, InformationType.PARAGRAPHS) self.tokens.extend(self.citation_pipeline.tokens) if self.retrieved_faqs: - self.callback.in_progress("Augmenting response ...") out = self.citation_pipeline(self.retrieved_faqs, out, InformationType.FAQS, base_url=dto.settings.artemis_base_url)🧰 Tools
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
511-528
: Fix docstring return descriptionThe docstring incorrectly mentions "lectures" instead of "FAQs" in the return description.
""" Checks if there are indexed faqs for the given course :param course_id: The course ID - :return: True if there are indexed lectures for the course, False otherwise + :return: True if there are indexed FAQs for the course, False otherwise """🧰 Tools
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
app/pipeline/chat/course_chat_pipeline.py
(10 hunks)app/pipeline/prompts/faq_citation_prompt.txt
(1 hunks)app/pipeline/shared/citation_pipeline.py
(5 hunks)app/retrieval/faq_retrieval.py
(1 hunks)
🧰 Additional context used
🪛 LanguageTool
app/pipeline/prompts/faq_citation_prompt.txt
[uncategorized] ~1-~1: Possible missing comma found.
Context: In the paragraphs below you are provided with an answer to a qu...
(AI_HYDRA_LEO_MISSING_COMMA)
[uncategorized] ~1-~1: Possible missing comma found.
Context: ...an answer to a question. Underneath the answer you will find the faqs that the answer ...
(AI_HYDRA_LEO_MISSING_COMMA)
[style] ~32-~32: Using many exclamation marks might seem excessive (in this case: 4 exclamation marks for a text that’s 2043 characters long)
Context: ...o not include any citations and return '!NONE!'.
(EN_EXCESSIVE_EXCLAMATION)
🪛 Ruff (0.8.2)
app/pipeline/chat/course_chat_pipeline.py
13-13: langchain_core.prompts.SystemMessagePromptTemplate
imported but unused
Remove unused import: langchain_core.prompts.SystemMessagePromptTemplate
(F401)
app/pipeline/shared/citation_pipeline.py
1-1: logging
imported but unused
Remove unused import: logging
(F401)
83-83: Loop control variable i
not used within loop body
Rename unused i
to _i
(B007)
🪛 GitHub Actions: Run linters
app/pipeline/chat/course_chat_pipeline.py
[warning] File needs to be reformatted using Black
app/pipeline/shared/citation_pipeline.py
[warning] File needs to be reformatted using Black
app/retrieval/faq_retrieval.py
[warning] File needs to be reformatted using Black
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: Build and Push Docker Image
🔇 Additional comments (1)
app/pipeline/chat/course_chat_pipeline.py (1)
Line range hint
1-537
: Implementation looks good! 👍The FAQ integration is well-implemented, following existing patterns and conventions. The code successfully enables IRIS to use FAQs as intended in the PR objectives.
🧰 Tools
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (5)
app/pipeline/chat/course_chat_pipeline.py (5)
13-13
: Remove unused importThe import
SystemMessagePromptTemplate
is not used in the code.-from langchain_core.prompts import ( - ChatPromptTemplate, SystemMessagePromptTemplate, -) +from langchain_core.prompts import ChatPromptTemplate🧰 Tools
🪛 Ruff (0.8.2)
13-13:
langchain_core.prompts.SystemMessagePromptTemplate
imported but unusedRemove unused import:
langchain_core.prompts.SystemMessagePromptTemplate
(F401)
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
307-336
: Add error handling for FAQ retrievalWhile the implementation is good, it should handle potential errors from the FAQ retriever, similar to other tools in the pipeline.
def faq_content_retrieval() -> str: self.callback.in_progress("Retrieving faq content ...") - self.retrieved_faqs = self.faq_retriever( - chat_history=history, - student_query=query.contents[0].text_content, - result_limit=5, - course_name=dto.course.name, - course_id=dto.course.id, - base_url=dto.settings.artemis_base_url, - ) + try: + self.retrieved_faqs = self.faq_retriever( + chat_history=history, + student_query=query.contents[0].text_content, + result_limit=5, + course_name=dto.course.name, + course_id=dto.course.id, + base_url=dto.settings.artemis_base_url, + ) + except Exception as e: + logger.error("Error retrieving FAQs", exc_info=e) + return "Error retrieving FAQ content"🧰 Tools
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
436-440
: Fix formatting: Remove extra blank linesThere are unnecessary blank lines that should be removed to maintain consistent formatting.
if self.should_allow_faq_tool(dto.course.id): tool_list.append(faq_content_retrieval) -
🧰 Tools
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
518-535
: Fix docstring return descriptionThe docstring's return description mentions lectures instead of FAQs.
""" Checks if there are indexed faqs for the given course :param course_id: The course ID - :return: True if there are indexed lectures for the course, False otherwise + :return: True if there are indexed FAQs for the course, False otherwise """🧰 Tools
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
543-544
: Remove extra blank lines at the end of fileThere are unnecessary blank lines at the end of the file.
def datetime_to_string(dt: Optional[datetime]) -> str: if dt is None: return "No date provided" else: return dt.strftime("%Y-%m-%d %H:%M:%S") - -🧰 Tools
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
app/pipeline/chat/course_chat_pipeline.py
(10 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
app/pipeline/chat/course_chat_pipeline.py
13-13: langchain_core.prompts.SystemMessagePromptTemplate
imported but unused
Remove unused import: langchain_core.prompts.SystemMessagePromptTemplate
(F401)
🪛 GitHub Actions: Run linters
app/pipeline/chat/course_chat_pipeline.py
[warning] File needs to be reformatted using Black
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: Build and Push Docker Image
🔇 Additional comments (2)
app/pipeline/chat/course_chat_pipeline.py (2)
86-86
: LGTM! Good addition of the FAQ retrieval attributeThe
retrieved_faqs
attribute is well-placed alongside the existingretrieved_paragraphs
attribute, maintaining consistency in the class structure.🧰 Tools
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
464-466
: LGTM! Good integration with the citation pipelineThe FAQ augmentation is well-integrated with the existing citation pipeline, maintaining consistency with the paragraph augmentation pattern.
🧰 Tools
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
… feature/faq/basic-faq-pipe
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (14)
app/web/routers/webhooks.py (4)
74-99
: Improve function documentation.The docstring should be more descriptive and include parameter and return type information.
- """ - Run the exercise chat pipeline in a separate thread - """ + """ + Run the FAQ update pipeline in a separate thread. + + Args: + dto (FaqIngestionPipelineExecutionDto): The data transfer object containing FAQ information + + Note: + This function is executed with semaphore control to limit concurrent operations + """🧰 Tools
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
102-126
: Address language consistency and error message specificity.Two improvements needed:
- Translate the German comment to English
- Make the error message more specific to FAQ deletion
- # Hier würd dann die Methode zum entfernen aus der Datenbank kommen + # Method to remove FAQ from the database pipeline = FaqIngestionPipeline(client=client, dto=None, callback=callback) pipeline.delete_faq(dto.faq.faq_id, dto.faq.course_id) except Exception as e: - logger.error(f"Error Ingestion pipeline: {e}") + logger.error(f"Error in FAQ deletion pipeline: {e}")🧰 Tools
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
153-165
: Remove unnecessary return statement.FastAPI automatically returns None for void functions, making the explicit return statement redundant.
thread = Thread(target=run_faq_update_pipeline_worker, args=(dto,)) thread.start() - return
🧰 Tools
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
166-179
: Remove unnecessary return statement.Similar to the ingestion webhook, the explicit return statement is redundant.
logging.info("Starting faq deletion") thread = Thread(target=run_faq_delete_pipeline_worker, args=(dto,)) thread.start() - return
🧰 Tools
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
app/pipeline/chat/course_chat_pipeline.py (4)
13-13
: Remove unused import.The
SystemMessagePromptTemplate
import is not used in the code.- ChatPromptTemplate, SystemMessagePromptTemplate, + ChatPromptTemplate,🧰 Tools
🪛 Ruff (0.8.2)
13-13:
langchain_core.prompts.SystemMessagePromptTemplate
imported but unusedRemove unused import:
langchain_core.prompts.SystemMessagePromptTemplate
(F401)
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
307-336
: Improve FAQ result formatting for better readability.Consider using a more structured format with line breaks for better readability.
- res = "[FAQ ID: {}, FAQ Question: {}, FAQ Answer: {}]".format( - faq.get(FaqSchema.FAQ_ID.value), - faq.get(FaqSchema.QUESTION_TITLE.value), - faq.get(FaqSchema.QUESTION_ANSWER.value), - ) + res = """ +FAQ Entry: + ID: {} + Question: {} + Answer: {} + +""".format( + faq.get(FaqSchema.FAQ_ID.value), + faq.get(FaqSchema.QUESTION_TITLE.value), + faq.get(FaqSchema.QUESTION_ANSWER.value), + )🧰 Tools
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
436-440
: Remove extra newlines.Multiple consecutive empty lines affect code readability.
if self.should_allow_faq_tool(dto.course.id): tool_list.append(faq_content_retrieval) - -🧰 Tools
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
518-524
: Fix incorrect docstring return description.The return description mentions "lectures" instead of "FAQs".
""" Checks if there are indexed faqs for the given course :param course_id: The course ID - :return: True if there are indexed lectures for the course, False otherwise + :return: True if there are indexed FAQs for the course, False otherwise """🧰 Tools
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
app/retrieval/faq_retrieval.py (6)
70-74
: Consider making LLM requirements configurableThe hardcoded values for
gpt_version_equivalent
andcontext_length
should be moved to configuration to make the code more maintainable and flexible.- gpt_version_equivalent=4.25, - context_length=16385, - privacy_compliance=True, + gpt_version_equivalent=config.GPT_VERSION, + context_length=config.CONTEXT_LENGTH, + privacy_compliance=config.PRIVACY_COMPLIANCE,🧰 Tools
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
150-155
: Make hybrid_factor configurableThe hardcoded value of 0.9 for
hybrid_factor
should be configurable to allow fine-tuning of the retrieval process.response = self.search_in_db( query=rewritten_query, - hybrid_factor=0.9, + hybrid_factor=self.config.HYBRID_FACTOR, result_limit=result_limit, course_id=course_id, )🧰 Tools
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
184-190
: Extract common error handling patternThe same error handling pattern is repeated across multiple methods. Consider extracting it into a decorator.
def handle_llm_errors(func): @functools.wraps(func) def wrapper(*args, **kwargs): try: return func(*args, **kwargs) except Exception as e: logger.error(f"Error in {func.__name__}: {str(e)}") raise e return wrapper @handle_llm_errors def assess_question(self, ...): # ImplementationAlso applies to: 217-225, 255-263
🧰 Tools
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
276-286
: Add logging for filter conditionsAdd logging for the filter condition to help with debugging and monitoring.
if course_id: + logger.debug(f"Applying course_id filter: {course_id}") # Create a filter for course_id filter_weaviate = Filter.by_property(FaqSchema.COURSE_ID.value).equal( course_id )
🧰 Tools
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
316-358
: Optimize thread pool usageConsider using a single thread pool for both sets of tasks to reduce overhead.
- with concurrent.futures.ThreadPoolExecutor() as executor: + with concurrent.futures.ThreadPoolExecutor(max_workers=4) as executor: # Schedule the rewrite tasks to run in parallel rewritten_query_future = executor.submit( self.rewrite_student_query, chat_history, student_query, course_language, course_name, ) # ... other tasks ... - - # Get the results once both tasks are complete - rewritten_query = rewritten_query_future.result() - hypothetical_answer_query = hypothetical_answer_query_future.result() - - # Execute the database search tasks - with concurrent.futures.ThreadPoolExecutor() as executor: response_future = executor.submit( self.search_in_db, query=rewritten_query_future.result(), hybrid_factor=0.9, result_limit=result_limit, course_id=course_id, )🧰 Tools
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
360-385
: Make default language configurableThe hardcoded default language "english" should be moved to configuration.
def fetch_course_language(self, course_id): """ Fetch the language of the course based on the course ID. If no specific language is set, it defaults to English. """ - course_language = "english" + course_language = self.config.DEFAULT_COURSE_LANGUAGE🧰 Tools
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
app/pipeline/chat/course_chat_pipeline.py
(10 hunks)app/retrieval/faq_retrieval.py
(1 hunks)app/web/routers/webhooks.py
(5 hunks)
🧰 Additional context used
🪛 GitHub Actions: Run linters
app/web/routers/webhooks.py
[warning] File needs to be reformatted using Black
app/pipeline/chat/course_chat_pipeline.py
[warning] File needs to be reformatted using Black
app/retrieval/faq_retrieval.py
[warning] File needs to be reformatted using Black
🪛 Ruff (0.8.2)
app/pipeline/chat/course_chat_pipeline.py
13-13: langchain_core.prompts.SystemMessagePromptTemplate
imported but unused
Remove unused import: langchain_core.prompts.SystemMessagePromptTemplate
(F401)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: Build and Push Docker Image
🔇 Additional comments (3)
app/web/routers/webhooks.py (1)
11-19
: LGTM! Clean import organization.The new imports and DTOs are well-organized and properly support the FAQ functionality.
🧰 Tools
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
app/retrieval/faq_retrieval.py (2)
42-58
: Well-implemented merge function!The
merge_retrieved_chunks
function efficiently handles deduplication using dictionary lookups, and the implementation is clear and well-documented.🧰 Tools
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
113-113
: Use logger.info instead of logging.infoFor consistency with the module-level logger initialization, use
logger.info
instead oflogging.info
.🧰 Tools
🪛 GitHub Actions: Run linters
[warning] File needs to be reformatted using Black
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 7
🧹 Nitpick comments (9)
app/pipeline/shared/citation_pipeline.py (1)
81-96
: Improve FAQ string formatting implementation.Several improvements can be made to the FAQ string formatting:
- The loop control variable
i
is unused- Consider using f-strings for better readability
- Consider using string joining instead of concatenation for better performance
def create_formatted_faq_string(self, faqs, base_url): formatted_string = "" - for i, faq in enumerate(faqs): - faq = "FAQ ID {}, CourseId {} , FAQ Question title {} and FAQ Question Answer {} and FAQ link {}".format( - faq.get(FaqSchema.FAQ_ID.value), - faq.get(FaqSchema.COURSE_ID.value), - faq.get(FaqSchema.QUESTION_TITLE.value), - faq.get(FaqSchema.QUESTION_ANSWER.value), - f"{base_url}/courses/{faq.get(FaqSchema.COURSE_ID.value)}/faq/?faqId={faq.get(FaqSchema.FAQ_ID.value)}", - ) - formatted_string += faq + formatted_faqs = [] + for _, faq in enumerate(faqs): + faq_link = f"{base_url}/courses/{faq.get(FaqSchema.COURSE_ID.value)}/faq/?faqId={faq.get(FaqSchema.FAQ_ID.value)}" + formatted_faq = ( + f"FAQ ID {faq.get(FaqSchema.FAQ_ID.value)}, " + f"CourseId {faq.get(FaqSchema.COURSE_ID.value)}, " + f"FAQ Question title {faq.get(FaqSchema.QUESTION_TITLE.value)} and " + f"FAQ Question Answer {faq.get(FaqSchema.QUESTION_ANSWER.value)} and " + f"FAQ link {faq_link}" + ) + formatted_faqs.append(formatted_faq) + formatted_string = "\n".join(formatted_faqs)🧰 Tools
🪛 Ruff (0.8.2)
86-86: Loop control variable
i
not used within loop bodyRename unused
i
to_i
(B007)
app/pipeline/faq_ingestion_pipeline.py (2)
88-89
: Combine nested context managers for better readability.The nested
with
statements can be combined using a singlewith
statement.Apply this diff:
- with batch_update_lock: - with self.collection.batch.rate_limit(requests_per_minute=600) as batch: + with batch_update_lock, self.collection.batch.rate_limit(requests_per_minute=600) as batch:🧰 Tools
🪛 Ruff (0.8.2)
88-89: Use a single
with
statement with multiple contexts instead of nestedwith
statements(SIM117)
137-141
: Make the return value explicit in chunk_data method.The method implicitly returns None. For better clarity, make the return value explicit.
Apply this diff:
def chunk_data(self, path: str) -> List[Dict[str, str]]: """ Faqs are so small, they do not need to be chunked into smaller parts """ - return + return [] # Return empty list as FAQs don't need chunkingapp/web/routers/webhooks.py (3)
115-116
: Translate German comment to English.For consistency, translate the German comment "Hier würd dann die Methode zum entfernen aus der Datenbank kommen" to English.
Apply this diff:
- # Hier würd dann die Methode zum entfernen aus der Datenbank kommen + # Here comes the method to remove from the database
76-124
: Reduce code duplication in worker functions.The error handling pattern is duplicated across worker functions. Consider extracting it into a decorator or utility function.
Create a new utility function:
def handle_pipeline_errors(func): def wrapper(*args, **kwargs): with semaphore: try: return func(*args, **kwargs) except Exception as e: logger.error(f"Error in pipeline: {e}") logger.error(traceback.format_exc()) capture_exception(e) finally: semaphore.release() return wrapperThen apply it to the worker functions:
+@handle_pipeline_errors def run_faq_update_pipeline_worker(dto: FaqIngestionPipelineExecutionDto): """ Run the exercise chat pipeline in a separate thread """ - with semaphore: - try: - callback = FaqIngestionStatus(...) - pipeline() - except Exception as e: - logger.error(f"Error Faq Ingestion pipeline: {e}") - logger.error(traceback.format_exc()) - capture_exception(e) - finally: - semaphore.release() + callback = FaqIngestionStatus(...) + pipeline()
164-165
: Remove unnecessary return statements.The empty return statements in webhook endpoints are unnecessary as FastAPI automatically returns None.
Apply this diff:
def faq_ingestion_webhook(dto: FaqIngestionPipelineExecutionDto): thread = Thread(target=run_faq_update_pipeline_worker, args=(dto,)) thread.start() - return def faq_deletion_webhook(dto: FaqDeletionExecutionDto): logging.info("Starting faq deletion") thread = Thread(target=run_faq_delete_pipeline_worker, args=(dto,)) thread.start() - returnAlso applies to: 178-179
app/retrieval/faq_retrieval.py (3)
45-60
: Consider preserving chunk IDs in the returned list.The function discards the IDs when returning the merged chunks. Consider preserving the IDs for traceability and debugging purposes.
Apply this diff to preserve the IDs:
- return [properties for uuid, properties in merged_chunks.items()] + return [{"id": uuid, "properties": properties} for uuid, properties in merged_chunks.items()]
185-190
: Simplify error handling.The try-except block that just re-raises the exception is unnecessary and can be removed.
Apply this diff:
- try: - response = (prompt | self.pipeline).invoke({}) - logger.info(f"Response from assessment pipeline: {response}") - return response == "YES" - except Exception as e: - raise e + response = (prompt | self.pipeline).invoke({}) + logger.info(f"Response from assessment pipeline: {response}") + return response == "YES"
364-364
: Capitalize the default language.For consistency with language codes, capitalize the default language.
Apply this diff:
- course_language = "english" + course_language = "English"
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
app/domain/data/faq_dto.py
(1 hunks)app/domain/ingestion/deletionPipelineExecutionDto.py
(2 hunks)app/domain/ingestion/ingestion_pipeline_execution_dto.py
(2 hunks)app/pipeline/chat/course_chat_pipeline.py
(10 hunks)app/pipeline/chat/exercise_chat_agent_pipeline.py
(1 hunks)app/pipeline/faq_ingestion_pipeline.py
(1 hunks)app/pipeline/shared/citation_pipeline.py
(5 hunks)app/retrieval/faq_retrieval.py
(1 hunks)app/web/routers/webhooks.py
(5 hunks)app/web/status/faq_ingestion_status_callback.py
(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- app/pipeline/chat/exercise_chat_agent_pipeline.py
🚧 Files skipped from review as they are similar to previous changes (4)
- app/domain/ingestion/deletionPipelineExecutionDto.py
- app/domain/data/faq_dto.py
- app/web/status/faq_ingestion_status_callback.py
- app/domain/ingestion/ingestion_pipeline_execution_dto.py
🧰 Additional context used
🪛 Ruff (0.8.2)
app/pipeline/chat/course_chat_pipeline.py
14-14: langchain_core.prompts.SystemMessagePromptTemplate
imported but unused
Remove unused import: langchain_core.prompts.SystemMessagePromptTemplate
(F401)
app/pipeline/faq_ingestion_pipeline.py
6-6: openai.OpenAI
imported but unused
Remove unused import: openai.OpenAI
(F401)
88-89: Use a single with
statement with multiple contexts instead of nested with
statements
(SIM117)
app/pipeline/shared/citation_pipeline.py
1-1: logging
imported but unused
Remove unused import: logging
(F401)
86-86: Loop control variable i
not used within loop body
Rename unused i
to _i
(B007)
🪛 GitHub Actions: Run linters
app/pipeline/chat/course_chat_pipeline.py
[error] 12-12: Unused import: 'langchain_core.prompts.SystemMessagePromptTemplate' is imported but never used
[warning] 313-316: Multiple lines exceed maximum length of 120 characters
[warning] 318-318: Line too long (140 > 120 characters)
app/pipeline/faq_ingestion_pipeline.py
[error] 6-6: Unused import: 'openai.OpenAI' is imported but never used
app/pipeline/shared/citation_pipeline.py
[error] 1-1: Unused import: 'logging' is imported but never used
[warning] 107-107: Line too long (127 > 120 characters)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: Build and Push Docker Image
🔇 Additional comments (9)
app/pipeline/shared/citation_pipeline.py (2)
20-22
: LGTM! Clean enum implementation.The
InformationType
enum is well-defined and follows Python best practices by inheriting from bothstr
andEnum
.
1-3
:⚠️ Potential issueFix duplicate logger imports.
The code has redundant imports for logging functionality:
- Line 1:
import logging
- Line 3:
from asyncio.log import logger
-import logging -from asyncio.log import logger +import logging +logger = logging.getLogger(__name__)Likely invalid or redundant comment.
🧰 Tools
🪛 Ruff (0.8.2)
1-1:
logging
imported but unusedRemove unused import:
logging
(F401)
🪛 GitHub Actions: Run linters
[error] 1-1: Unused import: 'logging' is imported but never used
app/pipeline/faq_ingestion_pipeline.py (1)
1-6
: 🛠️ Refactor suggestionConsolidate logging imports and remove unused import.
The code has redundant logging imports and an unused OpenAI import:
- Remove the unused
openai.OpenAI
import- Consolidate logging by using a single logger instance
Apply this diff:
import logging -from asyncio.log import logger from typing import Optional, List, Dict from langchain_core.output_parsers import StrOutputParser -from openai import OpenAI from weaviate import WeaviateClient +logger = logging.getLogger(__name__)Likely invalid or redundant comment.
🧰 Tools
🪛 Ruff (0.8.2)
6-6:
openai.OpenAI
imported but unusedRemove unused import:
openai.OpenAI
(F401)
🪛 GitHub Actions: Run linters
[error] 6-6: Unused import: 'openai.OpenAI' is imported but never used
app/retrieval/faq_retrieval.py (5)
70-87
: LGTM!The initialization is well-structured with appropriate requirements and configurations for the LLM model, embedding model, and reranker pipeline.
132-162
: LGTM!The function implements a robust basic retrieval logic with appropriate question assessment and query rewriting.
218-225
: Simplify error handling.The try-except block that just re-raises the exception is unnecessary and can be removed.
Apply this diff:
- try: - response = (prompt | self.pipeline).invoke({}) - token_usage = self.llm.tokens - token_usage.pipeline = PipelineEnum.IRIS_FAQ_RETRIEVAL_PIPELINE - self.tokens.append(self.llm.tokens) - return response - except Exception as e: - raise e + response = (prompt | self.pipeline).invoke({}) + token_usage = self.llm.tokens + token_usage.pipeline = PipelineEnum.IRIS_FAQ_RETRIEVAL_PIPELINE + self.tokens.append(self.llm.tokens) + return response
256-263
: Simplify error handling.The try-except block that just re-raises the exception is unnecessary and can be removed.
Apply this diff:
- try: - response = (prompt | self.pipeline).invoke({}) - token_usage = self.llm.tokens - token_usage.pipeline = PipelineEnum.IRIS_FAQ_RETRIEVAL_PIPELINE - self.tokens.append(self.llm.tokens) - return response - except Exception as e: - raise e + response = (prompt | self.pipeline).invoke({}) + token_usage = self.llm.tokens + token_usage.pipeline = PipelineEnum.IRIS_FAQ_RETRIEVAL_PIPELINE + self.tokens.append(self.llm.tokens) + return response
265-302
: LGTM!The function implements a robust database search with appropriate filtering and hybrid search capabilities.
app/pipeline/chat/course_chat_pipeline.py (1)
526-541
: LGTM!The function implements a robust check for FAQ availability with appropriate handling of course_id.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
app/pipeline/chat/course_chat_pipeline.py (1)
526-541
: Fix docstring return description.The docstring incorrectly states "indexed lectures" instead of "indexed FAQs" in the return description.
Apply this diff to fix the docstring:
def should_allow_faq_tool(self, course_id: int) -> bool: """ Checks if there are indexed faqs for the given course :param course_id: The course ID - :return: True if there are indexed lectures for the course, False otherwise + :return: True if there are indexed FAQs for the course, False otherwise """
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
app/pipeline/chat/course_chat_pipeline.py
(9 hunks)
🧰 Additional context used
🪛 GitHub Actions: Run linters
app/pipeline/chat/course_chat_pipeline.py
[warning] 312-312: Line too long (122 > 120 characters)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: Build and Push Docker Image
🔇 Additional comments (4)
app/pipeline/chat/course_chat_pipeline.py (4)
23-23
: LGTM! Clean integration of FAQ-related imports and attributes.The new imports and class attribute are well-organized and follow the existing patterns in the codebase.
Also applies to: 45-49, 86-86
121-121
: LGTM! Clean initialization of FAQ retriever.The FAQ retriever is properly initialized following the same pattern as the lecture retriever.
439-441
: LGTM! Clean integration of FAQ support in the pipeline.The FAQ tool is properly integrated into the pipeline, following the same patterns as the lecture tool:
- Conditional tool addition based on availability
- Response augmentation using the citation pipeline
Also applies to: 467-474
309-339
: 🛠️ Refactor suggestionFix line length in docstring and consider enhancing error handling.
- The docstring contains lines exceeding 120 characters.
- The function could benefit from error handling for edge cases.
Apply this diff to fix the line length issues and add error handling:
def faq_content_retrieval() -> str: """ Use this tool to retrieve information from indexed FAQs. - It is suitable when no other tool fits, you think it is a common question or the question is frequently asked, - or the question could be effectively answered by an FAQ. Also use this if the question is explicitly - organizational and course-related. An organizational question about the course might be - "What is the course structure?" or "How do I enroll?" or exam related content like "When is the exam". - The tool performs a RAG retrieval based on the chat history to find the most relevant FAQs. - Each FAQ follows this format: FAQ ID, FAQ Question, FAQ Answer. - Respond to the query concisely and solely using the answer from the relevant FAQs. + It is suitable when no other tool fits, you think it is a common question or the question is + frequently asked, or the question could be effectively answered by an FAQ. Also use this if the + question is explicitly organizational and course-related. + + An organizational question about the course might be "What is the course structure?" or + "How do I enroll?" or exam related content like "When is the exam". + + The tool performs a RAG retrieval based on the chat history to find the most relevant FAQs. + Each FAQ follows this format: FAQ ID, FAQ Question, FAQ Answer. + + Respond to the query concisely and solely using the answer from the relevant FAQs. This tool should only be used once per query. """ self.callback.in_progress("Retrieving faq content ...") - self.retrieved_faqs = self.faq_retriever( - chat_history=history, - student_query=query.contents[0].text_content, - result_limit=10, - course_name=dto.course.name, - course_id=dto.course.id, - base_url=dto.settings.artemis_base_url, - ) + try: + self.retrieved_faqs = self.faq_retriever( + chat_history=history, + student_query=query.contents[0].text_content, + result_limit=10, + course_name=dto.course.name, + course_id=dto.course.id, + base_url=dto.settings.artemis_base_url, + ) + except Exception as e: + logger.error("Failed to retrieve FAQs", exc_info=e) + return "Failed to retrieve FAQs. Please try again later." result = "" for faq in self.retrieved_faqs: res = "[FAQ ID: {}, FAQ Question: {}, FAQ Answer: {}]".format( faq.get(FaqSchema.FAQ_ID.value), faq.get(FaqSchema.QUESTION_TITLE.value), faq.get(FaqSchema.QUESTION_ANSWER.value), ) result += res return resultLikely invalid or redundant comment.
🧰 Tools
🪛 GitHub Actions: Run linters
[warning] 312-312: Line too long (122 > 120 characters)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (9)
app/pipeline/faq_ingestion_pipeline.py (4)
25-26
: Simplify lock variable assignment.The reassignment of
batch_update_lock
is redundant since it's already imported from lecture_ingestion_pipeline.- # we use the same lock as the lecture ingestion pipeline - batch_update_lock = batch_update_lock
72-79
: Improve error handling and fix grammatical error.Two suggestions for improvement:
- The error message "Failed to faq into the database" is grammatically incorrect
- Consider catching specific exceptions instead of generic Exception
- except Exception as e: + except (ValueError, IOError) as e: logger.error(f"Error updating faq: {e}") self.callback.error( - f"Failed to faq into the database: {e}", + f"Failed to ingest FAQ into the database: {e}", exception=e, tokens=self.tokens, )
88-89
: Combine nestedwith
statements.Combine the nested
with
statements for better readability.- with batch_update_lock: - with self.collection.batch.rate_limit(requests_per_minute=600) as batch: + with batch_update_lock, self.collection.batch.rate_limit(requests_per_minute=600) as batch:🧰 Tools
🪛 Ruff (0.8.2)
88-89: Use a single
with
statement with multiple contexts instead of nestedwith
statements(SIM117)
137-141
: Fix return value to match type hint.The method signature indicates it should return
List[Dict[str, str]]
, but it returnsNone
. Consider returning an empty list to match the type hint.def chunk_data(self, path: str) -> List[Dict[str, str]]: """ Faqs are so small, they do not need to be chunked into smaller parts """ - return + return []app/retrieval/basic_retrieval.py (5)
44-63
: Correct the docstring to reflect actual parameters.
The docstring mentions auser_question
parameter, but this function signature only acceptsprompt
andchat_history
. Update the docstring for clarity and consistency.- Adds the chat history and user question to the prompt - :param chat_history: The chat history - :param user_question: The user question + Adds the last four messages from the chat history to the prompt + :param chat_history: The chat history
98-128
: Handle possible variations of "YES" in assessment logic.
Relying on an exact string match may miss valid affirmative answers (like "Yes " or "yes"). A safer check could be case-insensitive and trimmed.- return response == "YES" + return response.strip().lower() == "yes"
166-197
: Gracefully handle embedding or query failures.
Ifself.llm_embedding.embed(query)
or thequery.hybrid(...)
call fails, the method will raise an exception. Consider adding localized error handling or retries without overcomplicating the flow.
198-267
: Reuse thread pools for performance.
Creating two separateThreadPoolExecutor
s in quick succession may add unnecessary overhead. Consider reusing a single executor in the same scope (or a shared executor at class level) to reduce resource churn.
[performance]
275-298
: Normalize or validate retrieved languages.
Users may store inconsistent language values. Consider normalizing or at least validating fetched results to ensure recognized language codes (e.g., "english", "EN", or locale tags).
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
app/pipeline/faq_ingestion_pipeline.py
(1 hunks)app/retrieval/basic_retrieval.py
(1 hunks)app/retrieval/faq_retrieval.py
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- app/retrieval/faq_retrieval.py
🧰 Additional context used
🪛 Ruff (0.8.2)
app/pipeline/faq_ingestion_pipeline.py
1-1: threading
imported but unused
Remove unused import: threading
(F401)
88-89: Use a single with
statement with multiple contexts instead of nested with
statements
(SIM117)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: Mend Security Check
🔇 Additional comments (7)
app/pipeline/faq_ingestion_pipeline.py (4)
1-1
: Remove unused import.The
threading
module is imported but never used since the lock is imported from lecture_ingestion_pipeline.- import threading
🧰 Tools
🪛 Ruff (0.8.2)
1-1:
threading
imported but unusedRemove unused import:
threading
(F401)
44-48
: Consider using a standard GPT version number.The GPT version requirement of 4.25 seems unusually specific. Consider using a standard version number (e.g., 4.0) unless there's a specific feature requirement from version 4.25.
Could you clarify if there are specific features from GPT-4.25 that are required for this implementation?
98-104
: Improve error handling in batch update.
106-120
: Fix inconsistent return values in delete_old_faqs method.app/retrieval/basic_retrieval.py (3)
26-41
: Validate data structure of retrieved chunks.
Currently, the code assumes that each chunk contains an"id"
and"properties"
. Consider adding safeguards or checks to handle potentially malformed chunks, especially if other pipeline stages might generate incomplete data.
130-164
: Confirm concurrency safety for token management.
This method appends token usage toself.tokens
in a try-except block. Consider concurrency patterns that might arise from parallel calls. Centralize and handle token usage consistently across all pipeline methods.
268-274
: Interface method looks appropriate.
The abstractget_schema_properties
method is well defined for subclasses to implement. No concerns here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (3)
app/retrieval/faq_retrieval_utils.py (1)
25-40
: Translate docstring to English for consistency.The docstring is in German while the rest of the codebase is in English. This could make it harder for non-German speaking developers to understand the function's purpose.
Apply this diff to translate the docstring:
- """ - Formatiert die abgerufenen FAQs in einen String. - - :param retrieved_faqs: Liste der abgerufenen FAQs - :return: Formatierter String mit den FAQ-Daten - """ + """ + Formats the retrieved FAQs into a string. + + :param retrieved_faqs: List of retrieved FAQs + :return: Formatted string containing the FAQ data + """app/pipeline/chat/exercise_chat_agent_pipeline.py (1)
42-43
: Remove unused import.The
FaqSchema
import is not used in the code.Apply this diff:
-from ...vector_database.faq_schema import FaqSchema
Also applies to: 46-46
app/pipeline/chat/course_chat_pipeline.py (1)
45-46
: Remove unused import.The
FaqSchema
import is not used in the code.Apply this diff:
-from ...vector_database.faq_schema import FaqSchema
Also applies to: 49-49
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
app/pipeline/chat/course_chat_pipeline.py
(8 hunks)app/pipeline/chat/exercise_chat_agent_pipeline.py
(7 hunks)app/retrieval/faq_retrieval_utils.py
(1 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
app/pipeline/chat/course_chat_pipeline.py
49-49: ...vector_database.faq_schema.FaqSchema
imported but unused
Remove unused import: ...vector_database.faq_schema.FaqSchema
(F401)
app/pipeline/chat/exercise_chat_agent_pipeline.py
46-46: ...vector_database.faq_schema.FaqSchema
imported but unused
Remove unused import: ...vector_database.faq_schema.FaqSchema
(F401)
🔇 Additional comments (11)
app/retrieval/faq_retrieval_utils.py (1)
6-22
: LGTM!The function correctly checks for the existence of indexed FAQs for a given course. The implementation is efficient as it limits the query to 1 result since we only need to check existence.
app/pipeline/chat/exercise_chat_agent_pipeline.py (5)
109-109
: LGTM!The attribute is correctly typed and initialized.
143-144
: LGTM!Good job on improving code clarity by using more descriptive variable names and maintaining consistency with other retrievers.
401-424
: Fix line length issues in docstring.The docstring contains lines that exceed the maximum length of 120 characters.
Apply this diff to fix the line length issues:
- It is suitable when no other tool fits, it is a common question or the question is frequently asked, - or the question could be effectively answered by an FAQ. Also use this if the question is explicitly - organizational and course-related. An organizational question about the course might be - "What is the course structure?" or "How do I enroll?" or exam related content like "When is the exam". - The tool performs a RAG retrieval based on the chat history to find the most relevant FAQs. - Each FAQ follows this format: FAQ ID, FAQ Question, FAQ Answer. - Respond to the query concisely and solely using the answer from the relevant FAQs. - This tool should only be used once per query. + It is suitable when no other tool fits, it is a common question or the question is frequently asked, or + the question could be effectively answered by an FAQ. Also use this if the question is explicitly organizational + and course-related. + An organizational question about the course might be "What is the course structure?" or "How do I enroll?" or + exam related content like "When is the exam". + The tool performs a RAG retrieval based on the chat history to find the most relevant FAQs. Each FAQ follows + this format: FAQ ID, FAQ Question, FAQ Answer. + Respond to the query concisely and solely using the answer from the relevant FAQs. This tool should only be used + once per query.
544-547
: LGTM!The FAQ tool is correctly added to the tool list with appropriate conditional logic.
570-572
: LGTM!The code has been reformatted for better readability without any functional changes.
app/pipeline/chat/course_chat_pipeline.py (5)
87-87
: LGTM!The attribute is correctly typed and initialized.
121-122
: LGTM!Good job on improving code clarity by using more descriptive variable names and maintaining consistency with other retrievers.
310-334
: Fix line length issues in docstring.The docstring contains lines that exceed the maximum length of 120 characters.
Apply this diff to fix the line length issues:
- It is suitable when no other tool fits, it is a common question or the question is frequently asked, - or the question could be effectively answered by an FAQ. Also use this if the question is explicitly - organizational and course-related. An organizational question about the course might be - "What is the course structure?" or "How do I enroll?" or exam related content like "When is the exam". - The tool performs a RAG retrieval based on the chat history to find the most relevant FAQs. - Each FAQ follows this format: FAQ ID, FAQ Question, FAQ Answer. - Respond to the query concisely and solely using the answer from the relevant FAQs. - This tool should only be used once per query. + It is suitable when no other tool fits, it is a common question or the question is frequently asked, or + the question could be effectively answered by an FAQ. Also use this if the question is explicitly organizational + and course-related. + An organizational question about the course might be "What is the course structure?" or "How do I enroll?" or + exam related content like "When is the exam". + The tool performs a RAG retrieval based on the chat history to find the most relevant FAQs. Each FAQ follows + this format: FAQ ID, FAQ Question, FAQ Answer. + Respond to the query concisely and solely using the answer from the relevant FAQs. This tool should only be used + once per query.
433-435
: LGTM!The FAQ tool is correctly added to the tool list with appropriate conditional logic.
461-468
: LGTM!The FAQ citation is correctly implemented with appropriate information type and base URL.
# Conflicts: # app/common/PipelineEnum.py # app/web/routers/pipelines.py # docker/pyris-dev.yml
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
app/common/PipelineEnum.py
(1 hunks)app/pipeline/chat/exercise_chat_agent_pipeline.py
(6 hunks)app/web/routers/pipelines.py
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- app/common/PipelineEnum.py
- app/web/routers/pipelines.py
🧰 Additional context used
🪛 Ruff (0.8.2)
app/pipeline/chat/exercise_chat_agent_pipeline.py
46-46: ...vector_database.faq_schema.FaqSchema
imported but unused
Remove unused import: ...vector_database.faq_schema.FaqSchema
(F401)
🪛 GitHub Actions: Run linters
app/pipeline/chat/exercise_chat_agent_pipeline.py
[warning] 46-46: Unused import: '...vector_database.faq_schema.FaqSchema' is imported but never used
⏰ Context from checks skipped due to timeout of 90000ms (2)
- GitHub Check: Build and Push Docker Image
- GitHub Check: Analyze (python)
🔇 Additional comments (4)
app/pipeline/chat/exercise_chat_agent_pipeline.py (4)
109-109
: LGTM! Well-structured attribute addition.The
retrieved_faqs
attribute is properly typed asList[dict]
and initialized toNone
.
143-144
: LGTM! Clear and consistent retriever initialization.Good refactoring to rename
retriever
tolecture_retriever
for better clarity, and consistent initialization offaq_retriever
.
544-547
: LGTM! Clean and consistent tool integration.The FAQ tool is properly integrated following the same pattern as the lecture tool, with appropriate conditional checking using
should_allow_faq_tool
.
401-424
: Verify the FAQ result limit.The FAQ retrieval is limited to 10 results while lecture retrieval is limited to 5. Please verify if this difference is intentional and if 10 results might be too verbose for the chat response.
Otherwise, the implementation is well-structured and follows the established patterns:
- Clear docstring explaining when to use the tool
- Proper progress callback
- Consistent error handling
✅ Verification successful
The FAQ result limit of 10 is consistent with the codebase patterns.
The higher limit for FAQs (10) compared to lecture content (5) appears to be an intentional design choice, consistently implemented across different pipelines. This makes sense as FAQs are typically more concise than lecture content, and having more FAQ results allows for better coverage of related questions.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Check the typical length of FAQ responses # Look for other result limits in the codebase for consistency # Check for other result_limit usage rg "result_limit\s*=" -A 2 -B 2 # Check FAQ formatting to understand the impact of 10 results ast-grep --pattern 'def format_faqs($$$)'Length of output: 5398
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
app/pipeline/chat/exercise_chat_agent_pipeline.py
(6 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: Build and Push Docker Image
🔇 Additional comments (5)
app/pipeline/chat/exercise_chat_agent_pipeline.py (5)
42-43
: LGTM! The new imports enhance FAQ functionality.The imports are correctly placed and follow the project's import style.
The
FaqSchema
import is still unused and should be removed as mentioned in the previous review.
108-108
: LGTM! The retrieved_faqs attribute is properly typed.The attribute is correctly initialized as
List[dict] = None
to store FAQ retrieval results.
142-143
: LGTM! Clear and consistent retriever initialization.Good renaming of
retriever
tolecture_retriever
for better clarity, and consistent initialization offaq_retriever
.
544-545
: LGTM! Consistent conditional tool addition.The FAQ tool is correctly added to the tool list using the same pattern as the lecture tool.
400-424
: LGTM! Well-documented and properly implemented FAQ retrieval.The implementation follows the established pattern and includes clear usage guidelines in the docstring.
Verify if retrieving 10 FAQs (line 416) is the optimal limit for this use case. Let's check other retrieval implementations:
✅ Verification successful
The result limit of 10 FAQs is appropriate and consistent with the codebase
The implementation aligns with other FAQ retrievers in the codebase, which also use a limit of 10. This higher limit (compared to 3-5 used for other content types) is suitable for FAQs as they are typically shorter and more focused, allowing for better coverage of common questions.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Search for result_limit parameter in retrieval implementations rg -A 1 "result_limit\s*=" app/retrieval/Length of output: 1149
Script:
#!/bin/bash # Search for instances where result_limit is specified in pipeline files rg -B 2 -A 2 "result_limit\s*=\s*\d+" app/pipeline/Length of output: 2743
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
app/pipeline/chat/exercise_chat_agent_pipeline.py (1)
400-423
: Add error handling and empty results validation.While the implementation follows the pattern of
lecture_content_retrieval
, consider these improvements:
- Add validation for empty results
- Include error handling for retriever failures
- Consider a more concise docstring
def faq_content_retrieval() -> str: """ - Use this tool to retrieve information from indexed FAQs. - It is suitable when no other tool fits, it is a common question or the question is frequently asked, - or the question could be effectively answered by an FAQ. Also use this if the question is explicitly - organizational and course-related. An organizational question about the course might be - "What is the course structure?" or "How do I enroll?" or exam related content like "When is the exam". - The tool performs a RAG retrieval based on the chat history to find the most relevant FAQs. - Each FAQ follows this format: FAQ ID, FAQ Question, FAQ Answer. - Respond to the query concisely and solely using the answer from the relevant FAQs. - This tool should only be used once per query. + Retrieve and provide answers from indexed FAQs for common, organizational, + or course-related questions. Uses RAG retrieval based on chat history. + Format: FAQ ID, Question, Answer. Use once per query. """ self.callback.in_progress("Retrieving faq content ...") - self.retrieved_faqs = self.faq_retriever( - chat_history=chat_history, - student_query=query.contents[0].text_content, - result_limit=10, - course_name=dto.course.name, - course_id=dto.course.id, - base_url=dto.settings.artemis_base_url, - ) + try: + self.retrieved_faqs = self.faq_retriever( + chat_history=chat_history, + student_query=query.contents[0].text_content, + result_limit=10, + course_name=dto.course.name, + course_id=dto.course.id, + base_url=dto.settings.artemis_base_url, + ) + if not self.retrieved_faqs: + return "No relevant FAQs found for your question." - result = format_faqs(self.retrieved_faqs) - return result + return format_faqs(self.retrieved_faqs) + except Exception as e: + logger.error("Error retrieving FAQs", exc_info=e) + return "Sorry, I encountered an error while retrieving FAQs."
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
app/pipeline/chat/exercise_chat_agent_pipeline.py
(6 hunks)app/web/routers/pipelines.py
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- app/web/routers/pipelines.py
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: Build and Push Docker Image
🔇 Additional comments (4)
app/pipeline/chat/exercise_chat_agent_pipeline.py (4)
42-43
: LGTM! Well-structured imports and class attribute.The new imports and class attribute for FAQ functionality are properly organized and typed.
Also applies to: 108-108
142-143
: Good refactoring of retriever variables!The renaming of
retriever
tolecture_retriever
improves clarity, and the newfaq_retriever
follows the same initialization pattern.
543-545
: LGTM! Consistent tool list modification.The FAQ tool is properly added to the tool list with appropriate conditional checking, following the same pattern as the lecture tool.
544-545
: Verify FAQ retrieval functionality in production.Let's verify that the FAQ retrieval is working as expected:
✅ Verification successful
FAQ tool implementation is properly verified and safe
The FAQ retrieval functionality is correctly implemented with proper safeguards:
- Verifies FAQ existence before enabling the tool
- Uses proper database querying and schema validation
- Implements consistent error handling
- Maintains identical behavior across course and exercise chat contexts
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify FAQ retrieval functionality # 1. Check if any courses have FAQs indexed # 2. Verify the structure of indexed FAQs # Check for indexed FAQs across all courses echo "Checking for indexed FAQs..." ast-grep --pattern 'class FaqRetrieval { $$$ __call__($$$) { $$$ } }' # Check FAQ data structure echo "Verifying FAQ data structure..." rg -A 5 "FAQ" --type pythonLength of output: 280
Script:
#!/bin/bash # Add Python file type and search for FAQ-related code echo "Searching for FAQ-related functionality..." rg -i "faq" --type-add 'py:*.py' --type py -A 5 echo "Looking for should_allow_faq_tool implementation..." ast-grep --pattern 'def should_allow_faq_tool($$$): $$$' echo "Searching for faq_content_retrieval definition..." rg "faq_content_retrieval" --type-add 'py:*.py' --type py -A 3Length of output: 59052
… feature/faq/basic-faq-pipe
go away with your nitpicks
New Features
Summary by CodeRabbit
New Features
get_pipeline
function to support FAQ ingestion feature.Improvements
Technical Updates
CitationPipeline
to support formatting of FAQ data.