Merge branch 'master' into feat/smolagents-integrations-2
soumik12345 authored Feb 21, 2025
2 parents 137f661 + 5d997c4 commit 8e56fb8
Showing 114 changed files with 7,776 additions and 4,126 deletions.
3 changes: 2 additions & 1 deletion .gitignore
@@ -17,4 +17,5 @@ gha-creds-*.json
.coverage
.nox
*.log
*/file::memory:?cache=shared
*/file::memory:?cache=shared
tests/weave_models/
1 change: 1 addition & 0 deletions docs/docs/guides/core-types/env-vars.md
@@ -20,6 +20,7 @@ os.environ["WEAVE_PRINT_CALL_LINK"] = "false"

| Variable | Type | Default | Description |
|----------|------|---------|-------------|
| `WANDB_API_KEY` | `string` | `None` | If set, automatically log into W&B Weave without being prompted for your API key. To generate an API key, log in to your W&B account and go to [https://wandb.ai/authorize](https://wandb.ai/authorize). |
| `WEAVE_DISABLED` | `bool` | `false` | When set to `true`, disables all Weave tracing. Weave ops will behave like regular functions. |
| `WEAVE_PRINT_CALL_LINK` | `bool` | `true` | Controls whether to print a link to the Weave UI when calling a Weave op. |
| `WEAVE_CAPTURE_CODE` | `bool` | `true` | Controls whether to save code for ops so they can be reloaded for later use. |
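
A minimal sketch of setting these variables from Python before initializing Weave; the values and project name below are illustrative assumptions:

```python
import os

import weave

# Illustrative values: keep tracing enabled, but suppress UI call links.
os.environ["WEAVE_DISABLED"] = "false"
os.environ["WEAVE_PRINT_CALL_LINK"] = "false"

weave.init("<YOUR-WANDB-PROJECT-NAME>")
```
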
2 changes: 1 addition & 1 deletion docs/docs/guides/core-types/evaluations.md
@@ -1,4 +1,4 @@
# Evaluations
# Offline Batch Evaluation

Evaluation-driven development helps you reliably iterate on an application. The `Evaluation` class is designed to assess the performance of a `Model` on a given `Dataset` or set of examples using scoring functions.
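
As a minimal sketch of this pattern (the toy dataset, scorer, and model below are illustrative assumptions, not part of this guide):

```python
import asyncio

import weave

weave.init("<YOUR-WANDB-PROJECT-NAME>")

# Hypothetical toy dataset for illustration only.
examples = [
    {"question": "What is 2 + 2?", "expected": "4"},
    {"question": "What is the capital of France?", "expected": "Paris"},
]

@weave.op()
def exact_match(expected: str, output: str) -> dict:
    # Scorer arguments are matched to dataset columns; `output` is the model output.
    return {"correct": expected == output}

@weave.op()
def model(question: str) -> str:
    # Stand-in for a real LLM call.
    return "4" if "2 + 2" in question else "Paris"

evaluation = weave.Evaluation(dataset=examples, scorers=[exact_match])
print(asyncio.run(evaluation.evaluate(model)))
```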

6 changes: 3 additions & 3 deletions docs/docs/guides/evaluation/guardrails_and_monitors.md
@@ -1,7 +1,7 @@
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

# Guardrails and Monitors
# Online Evaluation: Guardrails and Monitors

![Feedback](./../../../static/img/guardrails_scorers.png)

@@ -100,7 +100,7 @@ result, call = generate_text.call("Say hello")
await call.apply_scorer(LengthScorer())
```

## Using Scorers as Guardrails
## Using Scorers as Guardrails {#using-scorers-as-guardrails}

Guardrails act as safety checks that run before allowing LLM output to reach users. Here's a practical example:

@@ -146,7 +146,7 @@ When applying scorers:
- You can view scorer results in the UI or query them via the API
:::

## Using Scorers as Monitors
## Using Scorers as Monitors {#using-scorers-as-monitors}

Monitors help track quality metrics over time without blocking operations. This is useful for:
- Identifying quality trends
9 changes: 4 additions & 5 deletions docs/docs/guides/evaluation/scorers.md
@@ -1,9 +1,7 @@
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

# Evaluation Metrics

## Evaluations in Weave
# Scoring Overview

In Weave, Scorers are used to evaluate AI outputs and return evaluation metrics. They take the AI's output, analyze it, and return a dictionary of results. Scorers can use your input data as reference if needed and can also output extra information, such as explanations or reasoning from the evaluation.
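
For example, here is a minimal function-style scorer sketch (the names and logic are illustrative assumptions); it returns a metric plus an extra explanation field:

```python
import weave

@weave.op()
def contains_answer(expected: str, output: str) -> dict:
    # Illustrative scorer: arguments other than `output` are matched to
    # dataset columns by name. Return a dictionary of metrics plus any
    # extra information, such as an explanation.
    hit = expected.lower() in output.lower()
    return {
        "contains_answer": hit,
        "explanation": f"Expected substring {expected!r} was {'found' if hit else 'not found'}.",
    }
```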

@@ -25,11 +23,12 @@ In Weave, Scorers are used to evaluate AI outputs and return evaluation metrics.
## Create your own Scorers

:::tip[Ready-to-Use Scorers]
While this guide shows you how to create custom scorers, Weave comes with a variety of [predefined scorers](./builtin_scorers.mdx) that you can use right away, including:
While this guide shows you how to create custom scorers, Weave comes with a variety of [predefined scorers](./builtin_scorers.mdx) and [local SLM scorers](./weave_local_scorers.md) that you can use right away, including:
- [Hallucination detection](./builtin_scorers.mdx#hallucinationfreescorer)
- [Summarization quality](./builtin_scorers.mdx#summarizationscorer)
- [Embedding similarity](./builtin_scorers.mdx#embeddingsimilarityscorer)
- [Relevancy evaluation](./builtin_scorers.mdx#ragas---contextrelevancyscorer)
- [Toxicity detection (local)](./weave_local_scorers.md#weavetoxicityscorerv1)
- [Context Relevance scoring (local)](./weave_local_scorers.md#weavecontextrelevancescorerv1)
- And more!
:::

332 changes: 332 additions & 0 deletions docs/docs/guides/evaluation/weave_local_scorers.md

Large diffs are not rendered by default.

40 changes: 23 additions & 17 deletions docs/docs/guides/integrations/azure.md
@@ -1,30 +1,36 @@
# Microsoft Azure

Weights & Biases integrates with Microsoft Azure OpenAI services, helping teams to manage, debug, and optimize their Azure AI workflows at scale. This guide introduces the W&B integration, what it means for Weave users, its key features, and how to get started.
Weights & Biases (W&B) Weave integrates with Microsoft Azure OpenAI services, helping teams to optimize their Azure AI applications. Using W&B Weave, you can trace, debug, and evaluate your Azure OpenAI calls.

:::tip
For the latest tutorials, visit [Weights & Biases on Microsoft Azure](https://wandb.ai/site/partners/azure).
:::

## Key features

- **LLM evaluations**: Evaluate and monitor LLM-powered applications using Weave, optimized for Azure infrastructure.
- **Seamless integration**: Deploy W&B Models on a dedicated Azure tenant with built-in integrations for Azure AI Studio, Azure ML, Azure OpenAI Service, and other Azure AI services.
- **Enhanced performance**: Use Azure’s infrastructure to train and deploy models faster, with auto-scaling clusters and optimized resources.
- **Scalable experiment tracking**: Automatically log hyperparameters, metrics, and artifacts for Azure AI Studio and Azure ML runs.
- **LLM fine-tuning**: Fine-tune models with W&B Models.
- **Central repository for models and datasets**: Manage and version models and datasets with W&B Registry and Azure AI Studio.
- **Collaborative workspaces**: Support teamwork with shared workspaces, experiment commenting, and Microsoft Teams integration.
- **Governance framework**: Ensure security with fine-grained access controls, audit trails, and Microsoft Entra ID integration.

## Getting started

To use W&B with Azure, add the W&B integration via the [Azure Marketplace](https://azuremarketplace.microsoft.com/en-us/marketplace/apps/weightsandbiasesinc1641502883483.weights_biases_for_azure?tab=Overview).
To get started using Azure with Weave, simply decorate the function(s) you want to track with `weave.op`.

For a detailed guide describing how to integrate Azure OpenAI fine-tuning with W&B, see [Integrating Weights & Biases with Azure AI Services](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/weights-and-biases-integration).
```python
import os

import weave
from openai import AzureOpenAI

# Assumed setup (not shown in the original snippet): initialize Weave and an
# Azure OpenAI client for your deployment.
weave.init("<YOUR-WANDB-PROJECT-NAME>")
client = AzureOpenAI(
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-01",
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
)

@weave.op()
def call_azure_chat(model_id: str, messages: list, max_tokens: int = 1000, temperature: float = 0.5):
    # Each call is traced by Weave because of the @weave.op() decorator.
    response = client.chat.completions.create(
        model=model_id,
        messages=messages,
        max_tokens=max_tokens,
        temperature=temperature,
    )
    return {"status": "success", "response": response.choices[0].message.content}
```
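
For example, a call might look like this; the deployment name is an illustrative assumption:

```python
# "gpt-4o-deployment" is a placeholder for your Azure OpenAI deployment name.
print(call_azure_chat("gpt-4o-deployment", [{"role": "user", "content": "Say hello"}]))
```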

## Learn more

- [Weights & Biases + Microsoft Azure Overview](https://wandb.ai/site/partners/azure)
- [How W&B and Microsoft Azure Are Empowering Enterprises](https://techcommunity.microsoft.com/blog/azure-ai-services-blog/how-weights--biases-and-microsoft-azure-are-empowering-enterprises-to-fine-tune-/4303716)
- [Microsoft Azure OpenAI Service Documentation](https://learn.microsoft.com/en-us/azure/ai-services/openai/)
Learn more about advanced topics for using Azure with Weave in the resources below.

### Use the Azure AI Model Inference API with Weave

Learn how to use the Azure AI Model Inference API with Weave to gain insights into Azure models in [this guide](https://wandb.ai/byyoung3/ML-NEWS2/reports/A-guide-to-using-the-Azure-AI-model-inference-API--Vmlldzo4OTY1MjEy#tutorial:-implementing-azure-ai-model-inference-api-with-w&b-weave-).

### Trace Azure OpenAI models with Weave

Learn how to trace Azure OpenAI models using Weave in [this guide](https://wandb.ai/a-sh0ts/azure-weave-cookbook/reports/How-to-use-Azure-OpenAI-and-Azure-AI-Studio-with-W-B-Weave--Vmlldzo4MTI0NDgy).
18 changes: 14 additions & 4 deletions docs/docs/guides/integrations/bedrock.md
@@ -2,14 +2,12 @@

Weave automatically tracks and logs LLM calls made via Amazon Bedrock, AWS's fully managed service that offers foundation models from leading AI companies through a unified API.

There are multiple ways to log LLM calls to Weave from Amazon Bedrock. You can use `weave.op` to create reusable operations for tracking any calls to a Bedrock model. Optionally, if you're using Anthropic models, you can use Weave’s built-in integration with Anthropic.
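
For example, a minimal sketch of the `weave.op` route, assuming `boto3` is installed, AWS credentials are configured, and the chosen model ID is enabled in your region:

```python
import boto3
import weave

weave.init("<YOUR-WANDB-PROJECT-NAME>")
client = boto3.client("bedrock-runtime")

@weave.op()
def bedrock_chat(model_id: str, prompt: str) -> str:
    # Each call is traced by Weave because of the @weave.op() decorator.
    response = client.converse(
        modelId=model_id,
        messages=[{"role": "user", "content": [{"text": prompt}]}],
    )
    return response["output"]["message"]["content"][0]["text"]

# Illustrative model ID; substitute one enabled in your AWS account.
print(bedrock_chat("anthropic.claude-3-haiku-20240307-v1:0", "Say hello"))
```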

:::tip
For the latest tutorials, visit [Weights & Biases on Amazon Web Services](https://wandb.ai/site/partners/aws/).
:::

:::note
Do you want to experiment with Amazon Bedrock models on Weave without any set up? Try the [LLM Playground](../tools/playground.md).
:::

## Traces

Weave will automatically capture traces for Bedrock API calls. You can use the Bedrock client as usual after initializing Weave and patching the client:
@@ -143,3 +141,15 @@ print(result)
```

This approach allows you to version your experiments and easily track different configurations of your Bedrock-based application.

## Learn more

Learn more about using Amazon Bedrock with Weave in the resources below.

### Try Bedrock in the Weave Playground

Do you want to experiment with Amazon Bedrock models in the Weave UI without any set up? Try the [LLM Playground](../tools/playground.md).

### Report: Compare LLMs on Bedrock for text summarization with Weave

The [Compare LLMs on Bedrock for text summarization with Weave](https://wandb.ai/byyoung3/ML_NEWS3/reports/Compare-LLMs-on-Amazon-Bedrock-for-text-summarization-with-W-B-Weave--VmlldzoxMDI1MTIzNw) report explains how to use Bedrock in combination with Weave to evaluate and compare LLMs for summarization tasks, code samples included.
@@ -1,21 +1,21 @@
# Google Gemini
# Google

:::tip
For the latest tutorials, visit [Weights & Biases on Google Cloud](https://wandb.ai/site/partners/googlecloud/).
:::

:::note
Do you want to experiment with Google Gemini models on Weave without any set up? Try the [LLM Playground](../tools/playground.md).
Do you want to experiment with Google AI models on Weave without any set up? Try the [LLM Playground](../tools/playground.md).
:::

Google offers two ways of calling Gemini via API:
This page describes how to use W&B Weave with the Google Vertex AI API and the Google Gemini API.

1. Via the [Vertex APIs](https://cloud.google.com/vertex-ai/docs).
2. Via the [Gemini API SDK](https://ai.google.dev/gemini-api/docs/quickstart?lang=python).
You can use Weave to evaluate, monitor, and iterate on your Google GenAI applications. Weave automatically captures traces for the following:

## Tracing
1. [Google Vertex AI API](https://cloud.google.com/vertex-ai/docs), which provides access to Google’s Gemini models and [various partner models](https://cloud.google.com/vertex-ai/generative-ai/docs/partner-models/use-partner-models).
2. [Google Gemini API](https://ai.google.dev/gemini-api/docs/quickstart?lang=python), which is accessible via Python SDK, Node.js SDK, Go SDK, and REST.

It’s important to store traces of language model applications in a central location, both during development and in production. These traces can be useful for debugging, and as a dataset that will help you improve your application.
## Get started

Weave will automatically capture traces for the [Gemini API SDK](https://ai.google.dev/gemini-api/docs/quickstart?lang=python). To start tracking, call `weave.init(project_name="<YOUR-WANDB-PROJECT-NAME>")` and use the library as normal.
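
A minimal sketch, assuming the `google-generativeai` Python SDK is installed and `GOOGLE_API_KEY` is set (the model name is illustrative):

```python
import os

import google.generativeai as genai
import weave

weave.init("<YOUR-WANDB-PROJECT-NAME>")

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
model = genai.GenerativeModel("gemini-1.5-flash")

# The call below is traced automatically once weave.init() has run.
response = model.generate_content("Write a haiku about tracing LLM calls.")
print(response.text)
```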

@@ -120,3 +120,4 @@ Given a weave reference to any `weave.Model` object, you can spin up a fastapi server
```shell
weave serve weave:///your_entity/project-name/YourModel:<hash>
```
