#### Rulesets

[Rulesets](../structures/rulesets.md) provided to image generation engines are combined with prompts, providing further instruction to image generation models. In addition to typical Rulesets, image generation engines support Negative Rulesets. Negative Rulesets are used by [image generation drivers](../structures/image-generation-drivers.md) with support for prompt wieghting and used to influence the image generation model to avoid undesireable features described by negative prompts.
wieghting -> weighting
undesireable -> undesirable
#### Bedrock Stable Diffusion Model Driver

The Bedrock Stable Diffusion model driver provides support for Stable Diffusion models hosted by Amazon Bedrock. This model driver supports configurations specific to Stable Diffusion, like style presets, clip guidance presets, sampler, and more.
nit: `, and more` may be unnecessary after already qualifying the list as incomplete with `like ...`
To generate an image, use one of the following Image Generation Tasks. All Image Generation Tasks accept an Image Generation Engine configured to use an [Image Generation Driver](./image-generation-drivers.md).

All successful Image Generation Tasks will always output an [Image Artifact](). Each task can be configured to additionally write the generated image to disk by providing either the `output_file` or `output_dir` field. The `output_file` field supports file names in the current directory (`my_image.png`), relative directory prefixes (`images/my_image.png`), or absolute paths (`/usr/var/my_image.png`). By setting `output_dir`, the task will generate a file name and place the image in the requested directory.
Intentionally blank URL for `Image Artifact`?
No! Good catch, updated.
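For reference, the `output_file` / `output_dir` behavior described in the paragraph under review could be sketched roughly as below. This is only an illustration: `resolve_output_path` is a hypothetical helper, not part of Griptape, and both the timestamp-based file name and the error raised when both fields are set are assumptions.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def resolve_output_path(
    output_file: Optional[str] = None,
    output_dir: Optional[str] = None,
) -> Optional[Path]:
    """Hypothetical helper (not Griptape's implementation)."""
    if output_file and output_dir:
        # Assumption: the two fields are treated as mutually exclusive.
        raise ValueError("Provide either output_file or output_dir, not both.")
    if output_file:
        # Bare names and relative prefixes are anchored at the current
        # directory; absolute paths come back unchanged.
        return Path(output_file).absolute()
    if output_dir:
        # Assumed naming scheme; the real generated name may differ.
        name = f"image_{datetime.now():%Y%m%d_%H%M%S}.png"
        return Path(output_dir).absolute() / name
    # Neither field set: nothing is written to disk.
    return None
```

Calling `resolve_output_path(output_file="images/my_image.png")` yields a path under the current working directory, while `resolve_output_path(output_dir="images")` yields a generated file name inside `images/`.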
@@ -0,0 +1,147 @@
## Overview

Image generation engines facilitate the use of [image generation drivers](../structures/image-generation-drivers.md) by image generation tasks and tools. Each image generation engine defines a `run` method that accepts the inputs necessary for each image generation mode, combines these inputs with any available rulesets, and provides the request to the configured image generation driver.
Capitalize Griptape things like Engines, Drivers, Tasks, Tools, Rulesets throughout docs.
Image generation engines facilitate the use of [image generation drivers](../structures/image-generation-drivers.md) by image generation tasks and tools. Each image generation engine defines a `run` method that accepts the inputs necessary for each image generation mode, combines these inputs with any available rulesets, and provides the request to the configured image generation driver.

#### Rulesets
Should this be an H3?
@@ -0,0 +1,147 @@
## Overview

Image generation engines facilitate the use of [image generation drivers](../structures/image-generation-drivers.md) by image generation tasks and tools. Each image generation engine defines a `run` method that accepts the inputs necessary for each image generation mode, combines these inputs with any available rulesets, and provides the request to the configured image generation driver.
Link to reference docs for Image Generation Engines
engine = PromptImageGenerationEngine(
    rulesets=[positive_ruleset],
    negative_rulesets=[negative_ruleset],
    image_generation_driver=driver,
)
Show running the Engine.
# Create a tool configured to use the engine.
tool = PromptImageGenerationClient(
    image_generation_engine=engine,
)
I don't think we need to show Tool creation here since we have a dedicated section for Tools.
## Image Generation Tasks

To generate an image, use one of the following Image Generation Tasks. All Image Generation Tasks accept an Image Generation Engine configured to use an [Image Generation Driver](./image-generation-drivers.md).
Link to reference docs for Image Generation Task
### Prompt Image Generation Task

The Prompt Image Generation Task generates an image from a text prompt.
Reference doc
# Create an agent and provide the tool to it.
agent = Agent(tools=[tool])

agent.run("Inpaint a lake to the image at mountain.png using the mask at mask.png.")
External dependency
# Create an agent and provide the tool to it.
agent = Agent(tools=[tool])

agent.run("Outpaint a forest to the image at mountain.png using the mask at mask.png.")
External dependency
# Create an agent and provide the tool to it.
agent = Agent(tools=[tool])

agent.run("Generate a variation of the image located at mountain.png.")
External dependency
There is a LOT of code here we'd have to maintain moving forward. Is there a way to minimize that?
@@ -0,0 +1,147 @@
## Overview

Image generation engines facilitate the use of [image generation drivers](../structures/image-generation-drivers.md) by image generation tasks and tools. Each image generation engine defines a `run` method that accepts the inputs necessary for each image generation mode, combines these inputs with any available rulesets, and provides the request to the configured image generation driver.
This sentence is monotonous: it uses the phrase "image generation" three times. Suggest splitting it up, leading with the customer benefit and following with how the engine achieves it (maybe two sentences).
#### Rulesets

[Rulesets](../structures/rulesets.md) provided to image generation engines are combined with prompts, providing further instruction to image generation models. In addition to typical Rulesets, image generation engines support Negative Rulesets. Negative Rulesets are used by [image generation drivers](../structures/image-generation-drivers.md) with support for prompt wieghting and used to influence the image generation model to avoid undesireable features described by negative prompts.
Again, lead with customer benefit/usage to anchor the value for the reader. e.g., "Customers use Negative Rulesets to influence the model to avoid undesirable results, for example by specifying X Y Z.".
Good call, updated.
#### Rulesets

[Rulesets](../structures/rulesets.md) provided to image generation engines are combined with prompts, providing further instruction to image generation models. In addition to typical Rulesets, image generation engines support Negative Rulesets. Negative Rulesets are used by [image generation drivers](../structures/image-generation-drivers.md) with support for prompt wieghting and used to influence the image generation model to avoid undesireable features described by negative prompts.
Also may want to run this through a spell check. I discovered that I am unable to spell "undesirable" without a lot of help.
This is what I get for trying VSCode. Back to PyCharm!
```python
from griptape.structures import Agent
from griptape.engines import PromptImageGenerationEngine
from griptape.drivers import AmazonBedrockImageGenerationDriver, \
    BedrockStableDiffusionImageGenerationModelDriver
from griptape.rules import Rule, Ruleset
from griptape.tools import PromptImageGenerationClient


# Define positive and negative rulesets.
positive_ruleset = Ruleset(rules=[Rule("realistic"), Rule("high quality")])
negative_ruleset = Ruleset(rules=[Rule("distorted")])

# Create a driver configured to use Stable Diffusion via Bedrock.
driver = AmazonBedrockImageGenerationDriver(
    image_generation_model_driver=BedrockStableDiffusionImageGenerationModelDriver(),
    model="stability.stable-diffusion-xl-v0",
)

# Create an engine configured to use the driver.
engine = PromptImageGenerationEngine(
    rulesets=[positive_ruleset],
    negative_rulesets=[negative_ruleset],
    image_generation_driver=driver,
)
```
this is a lot of code, which means a lot to maintain if we make refactors or upstream changes. Are we able to automate testing it? Should we pare it down to only a handful of lines?
We currently do automate testing this, see `tests/integration/test_code_snippets.py`. Unfortunately that means we need the boilerplate dependency instantiation because this is real code that gets executed.
@andrewfrench can you try creating a `tests/assets/` directory to see if the code snippets can pull resources from there?
done! The LLM looks happy to pull from there.
### Outpainting Image Generation Engine

This image generation engine facilitates image outpainting, or modifying an input image according to a text prompt outside the bounds of a mask defined by a mask image.
ditto
# Image data in Image Artifact will be in JPG format
image_artifact_jpg = ImageLoader(format="JPG").load("my_image.png")
since this is the override behavior, can we include another line that loads it "normal-like"
The default example is above
from griptape.tools import PromptImageGenerationClient, FileManager

driver = OpenAiDalleImageGenerationDriver(
    model="dall-e-3",
Open Q: since Dall-E 3 requires a separate monthly subscription, would it be more accessible to start with Dall-E 2?
These examples aren't prescriptive, but I updated this to `dall-e-2` because the Azure driver using our deployment requires `dall-e-3` and the downgrade here will save us a bit when running integration tests.
This model driver supports negative prompts. When provided (for example, when used with an [image generation engine](../data/image-generation-engines.md) configured with negative rulesets), the image generation request will include negatively-weighted prompts describing features or characteristics to avoid in the resulting generation.
do we want to illustrate the negative prompts in action? Perhaps one run without, one with?
Added an example including negative rules
To generate an image, use one of the following Image Generation Tasks. All Image Generation Tasks accept an Image Generation Engine configured to use an [Image Generation Driver](./image-generation-drivers.md).

All successful Image Generation Tasks will always output an [Image Artifact](). Each task can be configured to additionally write the generated image to disk by providing either the `output_file` or `output_dir` field. The `output_file` field supports file names in the current directory (`my_image.png`), relative directory prefixes (`images/my_image.png`), or absolute paths (`/usr/var/my_image.png`). By setting `output_dir`, the task will generate a file name and place the image in the requested directory.
missing URL?
@SavagePencil I think we should encourage lots of examples in our docs as long as they are testable with the integration tests.
    model="dall-e-3",
    azure_deployment="my-azure-deployment",
    azure_endpoint="https://example-endpoint.openai.azure.com",
Load from environment variables.
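Something along these lines might work, as a rough sketch only. The helper `azure_dalle_settings` and both environment variable names are made up for illustration; the keyword names `model`, `azure_deployment`, and `azure_endpoint` are taken from the snippet above.

```python
import os
from typing import Dict, Mapping


def azure_dalle_settings(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Collect driver keyword arguments from environment variables.

    Hypothetical helper: the variable names are assumptions, not
    names that Griptape or Azure define.
    """
    return {
        "model": "dall-e-3",
        # A missing variable raises KeyError, so a misconfigured
        # environment fails fast instead of sending a bad request.
        "azure_deployment": env["AZURE_OPENAI_DALL_E_3_DEPLOYMENT_ID"],
        "azure_endpoint": env["AZURE_OPENAI_DALL_E_3_ENDPOINT"],
    }
```

The resulting dictionary could then be unpacked into the driver constructor with `**azure_dalle_settings()`, keeping deployment details out of the docs snippet.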
from griptape.drivers import LeonardoImageGenerationDriver

driver = LeonardoImageGenerationDriver(
    model="6bef9f1b-29cb-40c7-b9df-32b51c1f67d3",
Load from environment variable
driver = LeonardoImageGenerationDriver(
    model="6bef9f1b-29cb-40c7-b9df-32b51c1f67d3",
    api_key=os.getenv("LEONARDO_API_KEY"),
Add to .github/workflows/integration-tests.yml vars.
Re-reviewed on a call, good to merge.
This will be ready to review/merge once pending image generation PRs are merged.
Add docs for:
Resolves #182
📚 Documentation preview 📚: https://griptape--193.org.readthedocs.build/en/193/