Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate string output/input to Turn objects #1089

Draft
wants to merge 41 commits into
base: main
Choose a base branch
from

Conversation

leondz
Copy link
Collaborator

@leondz leondz commented Jan 27, 2025

Resolves #602

summary

garak prompts & outputs are all captured as str

We can about more modalities than strings - scope encompasses image attachments, files, rich outputs. Though the Language part of LLM is requisite for garak scope to apply.

This PR abstracts the type sent to and received from generators away from str to Turn. Each Turn() instance corresponds to a turn in a conversation. Simple probes pose Turn() to their target. Attempts manage a message history of Turn() items instead of strings.

details

Turn class

Turn represents the content part of a query to or response from an LLM.

Turns have:

  • text: corresponds to what was previously prompt: str
  • parts: a list of data, default []
  • equality testing function
  • method for populating parts from file data
  • serialisation and deserialisation from dict

Turns should never be None. Where we'd use prompt = None previously, this should be a Turn() which will have text==None.

This is only a part of a query sent to a target model. Turns are agnostic to metadata about which role is uttering it. A full LLM query might be composed of multiple turns.

Tests should all specifically use the Turn object.

NB it's now preferred to copy turn text locally instead of manipulating it in place. e.g.

lower_text = output.text.lower() # OK
output.text = output.text.lower() # disprefered

-- unless dealing things that need to edit to object, e.g. buffs

Updates to Attempt

Attempt is a core place this change affects.

  • attempt.prompt is now a Turn; refer to what was attempt.prompt, with attempt.prompt.text
  • attempt.outputs is now a list of Turns
  • attempt.messages is still a list of lists of dicts, but the content part of the dicts must be a Turn
  • A little magic is applied to maintain existing interfaces for writing to Attempts, where some methods will accept a List of str.
    • Delving into Attempt internals requires using Turns.
    • Direct access of messages internals must be done with awareness of Turn

type changes

  • Anything operating on an attempt will need a change
  • Detectors operate on outputs which is now a list of Turns
  • Buffs manipulate prompt which is now a Turn
  • generators.base.generate() should take a Turn and not a string
  • generators.base.generate() should return a list of Turns
    • Currently, generate() returns a list of str, and the magic in Attempt handles the mapping to Turn()
    • This constrains us to only having string output. We're cool with that for now I think
    • @jmartin-tech is leaving generators.base.generate() as returning a list of strings OK? Is it sensible to postpone migrating this to a list of Turns? Would appreciate your thoughts

Verification

  • python -m pytest tests/test_attempt.py
  • the full test suite with API keys enabled

@leondz leondz added the architecture Architectural upgrades label Jan 27, 2025
@leondz leondz requested a review from jmartin-tech January 27, 2025 12:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
architecture Architectural upgrades
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Enable more complex prompts
1 participant