Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(prompts): add util for variable name extraction #1046

Merged
merged 3 commits into from
Dec 16, 2024

Conversation

hassiebp
Copy link
Contributor

@hassiebp hassiebp commented Dec 16, 2024

Important

Introduces TemplateParser for template variable extraction and compilation, replacing existing methods in BasePromptClient, and adds tests for new functionality.

  • Behavior:
    • Introduces TemplateParser class in model.py for extracting and compiling template variables.
    • Replaces BasePromptClient._compile_template_string with TemplateParser.compile_template in TextPromptClient and ChatPromptClient.
    • Adds variable_names() method to TextPromptClient and ChatPromptClient to extract variable names from templates.
  • Tests:
    • Adds tests in test_prompt.py and test_prompt_compilation.py for TemplateParser methods compile_template and find_variable_names.
    • Tests cover various scenarios including missing variables, unmatched tags, and special characters.

This description was created by Ellipsis for 81e9006. It will automatically update as commits are pushed.

Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Disclaimer: Experimental PR review

PR Summary

Introduced a new TemplateParser class to centralize template parsing logic and added variable name extraction functionality for both text and chat prompts.

  • Added TemplateParser class in langfuse/model.py with find_variable_names() and compile_template() methods for robust template handling
  • Added variable_names() method to TextPromptClient returning List[str] of template variables
  • Added variable_names() method to ChatPromptClient returning List[ChatMessageVariables] with role-specific variables
  • Added comprehensive test suite in tests/test_prompt_compilation.py covering edge cases like whitespace, special characters and JSON content

3 file(s) reviewed, 3 comment(s)
Edit PR Review Bot Settings | Greptile

langfuse/model.py Outdated Show resolved Hide resolved
tests/test_prompt_compilation.py Show resolved Hide resolved
langfuse/model.py Show resolved Hide resolved
langfuse/model.py Outdated Show resolved Hide resolved
Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Disclaimer: Experimental PR review

PR Summary

(updates since last review)

Refactored variable name extraction in prompt clients by converting variable_names() method to a variables property and simplified chat message variable handling to return a flat list.

  • Changed variable_names() to variables property in TextPromptClient and ChatPromptClient for better API design
  • Simplified chat message variable extraction to return List[str] instead of role-based grouping
  • Fixed mutable default argument issue in TemplateParser.compile_template()
  • Added handling for None data parameter in compile_template

2 file(s) reviewed, 1 comment(s)
Edit PR Review Bot Settings | Greptile

tests/test_prompt.py Show resolved Hide resolved
@hassiebp
Copy link
Contributor Author

@rhighs - I have adjusted the variables getter on the ChatPromptClient to return a list of variables to be consistent with the TextPromptClient. I do not see the value of splitting the variables per chat message.

Do you have a strong opinion on this? If so, could you explain exactly why you'd benefit from that?

Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Disclaimer: Experimental PR review

PR Summary

(updates since last review)

Based on the recent changes, I'll provide a concise summary focusing only on the latest updates not covered in previous reviews:

Simplified template parsing implementation and improved test coverage for the TemplateParser class.

  • Added extensive test cases in test_prompt_compilation.py for unescaped JSON handling and complex template scenarios
  • Improved TemplateParser.compile_template() to handle unescaped JSON content without character escaping
  • Added test coverage for edge cases like content as variable name and empty string replacements
  • Added test cases for variable case sensitivity and JSON template handling

The changes focus on improving robustness and test coverage of the template parsing functionality, particularly around JSON content handling.

2 file(s) reviewed, 1 comment(s)
Edit PR Review Bot Settings | Greptile

Comment on lines 156 to 157
def _get_langchain_prompt_string(content: str):
return re.sub(r"{{\s*(\w+)\s*}}", r"{\g<1>}", content)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logic: regex pattern \w+ only matches word chars - may need to support other valid variable name chars

@rhighs
Copy link
Contributor

rhighs commented Dec 16, 2024

@rhighs - I have adjusted the variables getter on the ChatPromptClient to return a list of variables to be consistent with the TextPromptClient. I do not see the value of splitting the variables per chat message.

Do you have a strong opinion on this? If so, could you explain exactly why you'd benefit from that?

LGTM, no strong opinions tbh

@hassiebp hassiebp merged commit 4945318 into main Dec 16, 2024
9 of 10 checks passed
@hassiebp hassiebp deleted the feat-prompts-add-variable-names-util branch December 16, 2024 16:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants