This repository has been archived by the owner on Apr 16, 2024. It is now read-only.

Image generation documentation #193

Merged
merged 15 commits into from
Jan 11, 2024
147 changes: 147 additions & 0 deletions docs/griptape-framework/data/image-generation-engines.md
@@ -0,0 +1,147 @@
## Overview

Image generation Engines let Tasks and Tools generate images through a single, consistent interface. Each Engine defines a `run` method that accepts the inputs for its generation mode, combines those inputs with any available Rulesets, and passes the request to the configured [image generation Driver](../structures/image-generation-drivers.md).
**Member:** Capitalize Griptape things like Engines, Drivers, Tasks, Tools, Rulesets throughout docs.

**Member:** Link to reference docs for Image Generation Engines

**Member:** this sentence is monotonous with use of the phrase "Image generation" used three times. Suggest splitting this up into the customer benefit first, followed by how it achieves it (maybe two sentences).
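The Engine-to-Driver relationship described in the overview can be sketched with hypothetical stand-in classes. These are illustrative only, not Griptape APIs: the Engine merges the caller's prompt with rules, then delegates generation to whatever Driver it was configured with.

```python
# Illustrative stand-ins only -- NOT Griptape classes. They sketch how an
# Engine combines inputs with rules and delegates to a configured Driver.
class FakeDriver:
    def generate(self, prompts):
        # A real Driver would build and execute an API call to a hosted
        # image generation model here.
        return f"image generated from: {', '.join(prompts)}"


class SketchEngine:
    def __init__(self, driver, rules=None):
        self.driver = driver
        self.rules = rules or []

    def run(self, prompt):
        # Combine the caller's prompt with any Ruleset rules, then pass
        # the request to the configured Driver.
        return self.driver.generate([prompt, *self.rules])


engine = SketchEngine(driver=FakeDriver(), rules=["realistic"])
print(engine.run("a dog on a skateboard"))
# → image generated from: a dog on a skateboard, realistic
```

Swapping in a different Driver changes which model serves the request without touching Engine-level code, which is the point of the indirection.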


### Rulesets
**Member:** Should this be an H3?


[Rulesets](../structures/rulesets.md) provided to image generation Engines are combined with prompts, providing further instruction to the image generation model. In addition to typical Rulesets, image generation Engines support Negative Rulesets. Use Negative Rulesets to influence the model to avoid undesirable features described by negative prompts; they are supported by [image generation Drivers](../structures/image-generation-drivers.md) that provide prompt weighting.
**Contributor:** wieghting -> weighting; undesireable -> undesirable

**Member:** Again, lead with customer benefit/usage to anchor the value for the reader. e.g., "Customers use Negative Rulesets to influence the model to avoid undesirable results, for example by specifying X Y Z."

**Member Author:** Good call, updated.

**Member:** Also may want to run this through a spell check. I discovered that I am unable to spell "undesirable" without a lot of help.

**Member Author:** This is what I get for trying VSCode. Back to PyCharm!
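To make the weighting concrete, here is an illustrative (non-Griptape) sketch of how a prompt plus positive and negative rules might be flattened into weighted prompt entries, similar in shape to what prompt-weighting models accept; the function name and entry shape are assumptions for illustration:

```python
def build_weighted_prompts(prompt, rules, negative_rules):
    """Flatten a prompt plus positive/negative rules into weighted
    entries; negative entries carry a negative weight so the model
    steers away from them."""
    entries = [{"text": prompt, "weight": 1.0}]
    entries += [{"text": rule, "weight": 1.0} for rule in rules]
    entries += [{"text": rule, "weight": -1.0} for rule in negative_rules]
    return entries


prompts = build_weighted_prompts(
    "a watercolor dog", ["high quality"], ["distorted"]
)
print(prompts)
# → [{'text': 'a watercolor dog', 'weight': 1.0},
#    {'text': 'high quality', 'weight': 1.0},
#    {'text': 'distorted', 'weight': -1.0}]
```

Drivers without prompt-weighting support simply have nowhere to send the negatively weighted entries, which is why Negative Rulesets only take effect with Drivers that support them.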


### Prompt Image Generation Engine

This image generation Engine facilitates generating images from text prompts.

```python
from griptape.engines import PromptImageGenerationEngine
from griptape.drivers import AmazonBedrockImageGenerationDriver, \
BedrockStableDiffusionImageGenerationModelDriver
from griptape.rules import Rule, Ruleset
from griptape.tools import PromptImageGenerationClient


# Define positive and negative rulesets.
positive_ruleset = Ruleset(rules=[Rule("realistic"), Rule("high quality")])
negative_ruleset = Ruleset(rules=[Rule("distorted")])

# Create a driver configured to use Stable Diffusion via Bedrock.
driver = AmazonBedrockImageGenerationDriver(
image_generation_model_driver=BedrockStableDiffusionImageGenerationModelDriver(),
model="stability.stable-diffusion-xl-v0",
)

# Create an engine configured to use the driver.
engine = PromptImageGenerationEngine(
rulesets=[positive_ruleset],
negative_rulesets=[negative_ruleset],
# Review (Member): this is a lot of code, which means a lot to maintain if we
# make refactors or upstream changes. Are we able to automate testing it?
# Should we pare it down to only a handful of lines?

# Review (Member Author): We currently do automate testing this, see
# tests/integration/test_code_snippets.py. Unfortunately that means we need
# the boilerplate dependency instantiation because this is real code that
# gets executed.

# Review (Member): @andrewfrench can you try creating a tests/assets/
# directory to see if the code snippets can pull resources from there?

# Review (Member Author): done! The LLM looks happy to pull from there.

image_generation_driver=driver,
)
# Review (Member): Show running the Engine.


# Create a tool configured to use the engine.
tool = PromptImageGenerationClient(
image_generation_engine=engine,
)
# Review (Member): I don't think we need to show Tool creation here since we
# have a dedicated section for Tools.

```

### Variation Image Generation Engine

This image generation Engine facilitates generating variations of an input image according to a text prompt.
**Member:** could we pare this down to just the deltas? I had to re-read it a few times to note that there were some class name changes


```python
from griptape.engines import VariationImageGenerationEngine
from griptape.drivers import AmazonBedrockImageGenerationDriver, \
BedrockStableDiffusionImageGenerationModelDriver
from griptape.rules import Rule, Ruleset
from griptape.tools import VariationImageGenerationClient


# Define positive and negative rulesets.
positive_ruleset = Ruleset(rules=[Rule("realistic"), Rule("high quality")])
negative_ruleset = Ruleset(rules=[Rule("distorted")])

# Create a driver configured to use Stable Diffusion via Bedrock.
driver = AmazonBedrockImageGenerationDriver(
image_generation_model_driver=BedrockStableDiffusionImageGenerationModelDriver(),
model="stability.stable-diffusion-xl-v0",
)

# Create an engine configured to use the driver.
engine = VariationImageGenerationEngine(
rulesets=[positive_ruleset],
negative_rulesets=[negative_ruleset],
image_generation_driver=driver,
)

# Create a tool configured to use the engine.
tool = VariationImageGenerationClient(
image_generation_engine=engine,
)
# Review (Member): Same points.

```

### Inpainting Image Generation Engine

This image generation Engine facilitates image inpainting: modifying an input image, according to a text prompt, within the bounds of a mask defined by a mask image. Inpainting can be used to replace or repair a masked region of an image, for example removing an unwanted object, while leaving the rest of the image untouched.

**Member:** can we make this a more concrete explanation? I don't know what the benefit here is.

```python
from griptape.engines import InpaintingImageGenerationEngine
from griptape.drivers import AmazonBedrockImageGenerationDriver, \
BedrockStableDiffusionImageGenerationModelDriver
from griptape.rules import Rule, Ruleset
from griptape.tools import InpaintingImageGenerationClient


# Define positive and negative rulesets.
positive_ruleset = Ruleset(rules=[Rule("realistic"), Rule("high quality")])
negative_ruleset = Ruleset(rules=[Rule("distorted")])

# Create a driver configured to use Stable Diffusion via Bedrock.
driver = AmazonBedrockImageGenerationDriver(
image_generation_model_driver=BedrockStableDiffusionImageGenerationModelDriver(),
model="stability.stable-diffusion-xl-v0",
)

# Create an engine configured to use the driver.
engine = InpaintingImageGenerationEngine(
rulesets=[positive_ruleset],
negative_rulesets=[negative_ruleset],
image_generation_driver=driver,
)

# Create a tool configured to use the engine.
tool = InpaintingImageGenerationClient(
image_generation_engine=engine,
)
# Review (Member): Same points

```

### Outpainting Image Generation Engine

This image generation Engine facilitates image outpainting: modifying an input image, according to a text prompt, outside the bounds of a mask defined by a mask image. Outpainting can be used to extend an image beyond its original borders, for example widening a scene's background.
**Member:** ditto


```python
from griptape.engines import OutpaintingImageGenerationEngine
from griptape.drivers import AmazonBedrockImageGenerationDriver, \
BedrockStableDiffusionImageGenerationModelDriver
from griptape.rules import Rule, Ruleset
from griptape.tools import OutpaintingImageGenerationClient


# Define positive and negative rulesets.
positive_ruleset = Ruleset(rules=[Rule("realistic"), Rule("high quality")])
negative_ruleset = Ruleset(rules=[Rule("distorted")])

# Create a driver configured to use Stable Diffusion via Bedrock.
driver = AmazonBedrockImageGenerationDriver(
image_generation_model_driver=BedrockStableDiffusionImageGenerationModelDriver(),
model="stability.stable-diffusion-xl-v0",
)

# Create an engine configured to use the driver.
engine = OutpaintingImageGenerationEngine(
rulesets=[positive_ruleset],
negative_rulesets=[negative_ruleset],
image_generation_driver=driver,
)

# Create a tool configured to use the engine.
tool = OutpaintingImageGenerationClient(
image_generation_engine=engine,
)
# Review (Member): Same points.

```
22 changes: 22 additions & 0 deletions docs/griptape-framework/data/loaders.md
@@ -104,3 +104,25 @@ WebLoader().load_collection(
["https://www.griptape.ai", "https://docs.griptape.ai"]
)
```

## Image Loader

The Image Loader is used to load an image from the filesystem, returning an `ImageArtifact`.

```python
from griptape.loaders import ImageLoader

image_artifact = ImageLoader().load("my_image.png")

image_artifacts = ImageLoader().load_collection(["image_1.png", "image_2.png"])
```

By default, the Image Loader will ensure all images are in `png` format. If an image in another format (for example, `jpg`) is loaded, it will be reformatted to `png`. Other formats are supported through the `format` field.

```python
from griptape.loaders import ImageLoader


# Image data in Image Artifact will be in JPG format
image_artifact_jpg = ImageLoader(format="JPG").load("my_image.png")
# Review (Member): since this is the override behavior, can we include
# another line that loads it "normal-like"

# Review (Member Author): The default example is above

```
178 changes: 178 additions & 0 deletions docs/griptape-framework/structures/image-generation-drivers.md
@@ -0,0 +1,178 @@
## Overview

Image generation drivers are used by [image generation engines](../data/image-generation-engines.md) to build and execute API calls to image generation models.
**Member:** Link to reference docs for Image Generation Drivers


Use a Driver to build an Engine, then pass it to a Tool for use by an [Agent](../structures/agents.md):
**Member:** Capitalization


```python
from griptape.structures import Agent
from griptape.engines import PromptImageGenerationEngine
from griptape.drivers import OpenAiDalleImageGenerationDriver
from griptape.tools import PromptImageGenerationClient, FileManager

driver = OpenAiDalleImageGenerationDriver(
model="dall-e-2",
# Review (Member): Open Q: since Dall-E 3 requires a separate monthly
# subscription, would it be more accessible to start with Dall-E 2?

# Review (Member Author): These examples aren't prescriptive, but I updated
# this to dall-e-2 because the Azure driver using our deployment requires
# dall-e-3 and the downgrade here will save us a bit when running
# integration tests.

)

engine = PromptImageGenerationEngine(image_generation_driver=driver)

agent = Agent(tools=[
PromptImageGenerationClient(image_generation_engine=engine),
FileManager(),
])

agent.run("Generate a watercolor painting of a dog riding a skateboard. Save the image as rad-dog.png.")
```

### Amazon Bedrock

The Amazon Bedrock image generation driver provides multi-model access to image generation models hosted by Amazon Bedrock. This driver manages the API calls to the Bedrock API, while the specific model drivers below format the API requests and parse the responses.
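That split can be sketched with hypothetical stand-in classes (illustrative only, not the Griptape implementations): the model driver owns request formatting and response parsing, while the Bedrock driver owns the API call itself. The response shape mirrors the Stable Diffusion `artifacts` list, but the call is faked here.

```python
import json


class SketchStableDiffusionModelDriver:
    # Formats a Stable Diffusion-shaped request body and parses the
    # model-specific response (illustrative shapes only).
    def build_request(self, prompt):
        return json.dumps({"text_prompts": [{"text": prompt}]})

    def parse_response(self, body):
        return body["artifacts"][0]["base64"]


class SketchBedrockDriver:
    def __init__(self, model_driver):
        self.model_driver = model_driver

    def generate(self, prompt):
        request = self.model_driver.build_request(prompt)
        # A real driver would send `request` to the Bedrock runtime API;
        # here the response is faked so the sketch is self-contained.
        response = {"artifacts": [{"base64": "aW1hZ2U="}]}
        return self.model_driver.parse_response(response)


bedrock_driver = SketchBedrockDriver(SketchStableDiffusionModelDriver())
print(bedrock_driver.generate("a pixel-art dog"))  # → aW1hZ2U=
```

Because only the model driver knows each model's request and response format, adding support for a new Bedrock-hosted model means writing a new model driver, not a new API client.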

#### Bedrock Stable Diffusion Model Driver

The Bedrock Stable Diffusion model driver provides support for Stable Diffusion models hosted by Amazon Bedrock. This model driver supports configurations specific to Stable Diffusion, like style presets, clip guidance presets, and sampler.
**Contributor (@cjkindel, Jan 10, 2024):** nit: `, and more` may be unnecessary after already qualifying list as incomplete with `like ...`


This model driver supports negative prompts. When provided (for example, when used with an [image generation engine](../data/image-generation-engines.md) configured with negative rulesets), the image generation request will include negatively weighted prompts describing features or characteristics to avoid in the resulting generation.

**Member:** do we want to illustrate the negative prompts in action? Perhaps one run without, one with?

**Member Author:** Added an example including negative rules

```python
from griptape.structures import Agent
from griptape.tools import PromptImageGenerationClient, FileManager
from griptape.engines import PromptImageGenerationEngine
from griptape.drivers import AmazonBedrockImageGenerationDriver, \
BedrockStableDiffusionImageGenerationModelDriver

model_driver = BedrockStableDiffusionImageGenerationModelDriver(
style_preset="pixel-art",
steps=50,
)

driver = AmazonBedrockImageGenerationDriver(
image_generation_model_driver=model_driver,
)

engine = PromptImageGenerationEngine(image_generation_driver=driver)

agent = Agent(tools=[
PromptImageGenerationClient(image_generation_engine=engine),
FileManager(),
])

agent.run("Generate a watercolor painting of a dog riding a skateboard. Save the image as rad-dog.png.")
```

#### Amazon Bedrock Titan Image Generator Model Driver

The Amazon Bedrock Titan Image Generator model driver provides support for Titan Image Generator models hosted by Amazon Bedrock. This model driver supports configurations specific to Titan Image Generator, like quality, seed, and cfg_scale.

This model driver supports negative prompts. When provided (for example, when used with an [image generation engine](../data/image-generation-engines.md) configured with negative rulesets), the image generation request will include negatively weighted prompts describing features or characteristics to avoid in the resulting generation.

```python
from griptape.structures import Agent
from griptape.tools import PromptImageGenerationClient, FileManager
from griptape.engines import PromptImageGenerationEngine
from griptape.drivers import AmazonBedrockImageGenerationDriver, \
BedrockTitanImageGeneratorImageGenerationModelDriver

model_driver = BedrockTitanImageGeneratorImageGenerationModelDriver(
quality="hd",
)

driver = AmazonBedrockImageGenerationDriver(
image_generation_model_driver=model_driver,
)

engine = PromptImageGenerationEngine(image_generation_driver=driver)

agent = Agent(tools=[
PromptImageGenerationClient(image_generation_engine=engine),
FileManager(),
])

agent.run("Generate a watercolor painting of a dog riding a skateboard. Save the image as rad-dog.png.")
```

### Azure OpenAI DALL-E

The Azure OpenAI DALL-E image generation driver provides access to OpenAI DALL-E models hosted by Azure. In addition to the configurations provided by the underlying OpenAI DALL-E driver, the Azure OpenAI DALL-E driver allows configuration of Azure-specific deployment values.

```python
import os

from griptape.structures import Agent
from griptape.tools import PromptImageGenerationClient, FileManager
from griptape.engines import PromptImageGenerationEngine
from griptape.drivers import AzureOpenAiDalleImageGenerationDriver

# Environment variable names are examples; use the names defined for
# your own deployment.
driver = AzureOpenAiDalleImageGenerationDriver(
model="dall-e-3",
azure_deployment=os.environ["AZURE_OPENAI_DEPLOYMENT"],
azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
# Review (Member): Load from environment variables.

)

engine = PromptImageGenerationEngine(image_generation_driver=driver)

agent = Agent(tools=[
PromptImageGenerationClient(image_generation_engine=engine),
FileManager(),
])

agent.run("Generate a watercolor painting of a dog riding a skateboard. Save the image as rad-dog.png.")
```

### Leonardo.Ai

The Leonardo image generation driver enables image generation using models hosted by [Leonardo.ai](https://leonardo.ai/).

The Leonardo image generation driver supports configurations like model selection, image size, specifying a generation seed, and generation steps. For details on supported configuration parameters, see [Leonardo.Ai's image generation documentation](https://docs.leonardo.ai/reference/creategeneration).

This driver supports negative prompts. When provided (for example, when used with an [image generation engine](../data/image-generation-engines.md) configured with negative rulesets), the image generation request will include negatively weighted prompts describing features or characteristics to avoid in the resulting generation.

```python
import os

from griptape.structures import Agent
from griptape.tools import PromptImageGenerationClient, FileManager
from griptape.engines import PromptImageGenerationEngine
from griptape.drivers import LeonardoImageGenerationDriver

driver = LeonardoImageGenerationDriver(
model=os.environ["LEONARDO_MODEL_ID"],
# Review (Member): Load from environment variable

api_key=os.getenv("LEONARDO_API_KEY"),
# Review (Member): Add to .github/workflows/integration-tests.yml vars.

image_width=512,
image_height=1024,
)

engine = PromptImageGenerationEngine(image_generation_driver=driver)

agent = Agent(tools=[
PromptImageGenerationClient(image_generation_engine=engine),
FileManager(),
])

agent.run("Generate a watercolor painting of a dog riding a skateboard. Save the image as rad-dog.png.")
```

### OpenAI DALL-E

The OpenAI DALL-E image generation driver enables image generation using OpenAI DALL-E models. Like other OpenAI drivers, the image generation driver will implicitly load an API key from the `OPENAI_API_KEY` environment variable if one is not explicitly provided.

The OpenAI DALL-E driver supports image generation configurations like style presets, image quality preference, and image size. For details on supported configuration values, see the [OpenAI documentation](https://platform.openai.com/docs/guides/images/introduction).
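The implicit key loading follows a common pattern: an explicit argument wins, otherwise the environment variable is consulted. A simplified, framework-free sketch of that pattern (the function name is hypothetical, not the Griptape implementation):

```python
import os


def resolve_api_key(explicit_key=None):
    # Prefer an explicitly provided key; otherwise fall back to the
    # OPENAI_API_KEY environment variable.
    key = explicit_key or os.environ.get("OPENAI_API_KEY")
    if key is None:
        raise ValueError("No API key provided and OPENAI_API_KEY is unset")
    return key


os.environ["OPENAI_API_KEY"] = "sk-example"
print(resolve_api_key())           # → sk-example
print(resolve_api_key("sk-mine"))  # → sk-mine
```

Keeping keys in the environment rather than in source avoids committing secrets and lets the same snippet run unchanged across machines.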

```python
from griptape.structures import Agent
from griptape.tools import PromptImageGenerationClient, FileManager
from griptape.engines import PromptImageGenerationEngine
from griptape.drivers import OpenAiDalleImageGenerationDriver

driver = OpenAiDalleImageGenerationDriver(
model="dall-e-2",
image_size="512x512",
)

engine = PromptImageGenerationEngine(image_generation_driver=driver)

agent = Agent(tools=[
PromptImageGenerationClient(image_generation_engine=engine),
FileManager(),
])

agent.run("Generate a watercolor painting of a dog riding a skateboard. Save the image as rad-dog.png.")
```