Merge with master
m-rgba committed Feb 25, 2025
2 parents 9188f74 + 9fa8402 commit a1fd968
Showing 313 changed files with 24,198 additions and 2,990 deletions.
75 changes: 75 additions & 0 deletions .coderabbit.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,75 @@
language: en-US
# CodeRabbit configuration
reviews:
  # High-level configuration
  poem: true
  review_status: false
  auto_review:
    enabled: true
    ignore_title_keywords:
      - "WIP"
      - "DO NOT MERGE"
    drafts: true

  # Language-specific instructions
  path_instructions:
    - path: "**/*.{js,jsx,ts,tsx}"
      instructions: |
        Focus on architectural and logical issues rather than style (assuming ESLint is in place).
        Flag potential memory leaks and performance bottlenecks.
        Check for proper error handling and async/await usage.
        Avoid strict enforcement of try/catch blocks - accept Promise chains, early returns, and other clear error handling patterns. These are acceptable as long as they maintain clarity and predictability.
        Ensure proper type usage in TypeScript files.
        Look for security vulnerabilities in data handling.
        Don't comment on formatting if prettier is configured.
        Verify proper React hooks usage and component lifecycle.
        Check for proper state management patterns.
    - path: "**/*.py"
      instructions: |
        Focus on pythonic code patterns.
        Check for proper exception handling.
        Verify type hints usage where applicable.
        Look for potential performance improvements.
        Don't comment on formatting if black/isort is configured.
        Check for proper dependency injection patterns.
        Verify proper async handling if applicable.
    - path: "**/*.go"
      instructions: |
        Focus on idiomatic Go patterns.
        Check for proper error handling.
        Look for concurrent programming issues.
        Verify interface implementations.
        Don't comment on formatting (assuming gofmt is used).
        Check for proper resource cleanup.
        Verify proper package organization.
    - path: "**/*.{yaml,yml,json,tf}"
      instructions: |
        Check for security best practices.
        Verify environment-specific configurations.
        Look for hardcoded credentials or sensitive data.
        Ensure proper resource limits and requests.
        Verify proper versioning of dependencies.
        Check for infrastructure best practices.
    - path: "Dockerfile*"
      instructions: |
        Check for security best practices.
        Verify proper base image usage.
        Look for efficient layer caching.
        Check for proper cleanup of temporary files.
        Verify environment variables usage.
    - path: "**/*.{md,mdx}"
      instructions: |
        Focus on technical accuracy.
        Check for broken links.
        Verify code examples are up-to-date.
        Look for clarity and completeness.
        Don't focus on grammar/spelling unless significant.
# General settings
chat:
  auto_reply: true
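
To see which `path_instructions` entry would apply to a given file, the sketch below translates the glob patterns from the config above into regular expressions. It assumes conventional glob semantics (`**/` spans zero or more directories, `*` stays within one path segment, `{a,b}` is alternation); CodeRabbit's actual matcher may differ, and `glob_to_regex` and `matching_patterns` are hypothetical helper names used only for illustration.

```python
import re

# Patterns copied from the path_instructions entries in .coderabbit.yaml.
PATTERNS = [
    "**/*.{js,jsx,ts,tsx}",
    "**/*.py",
    "**/*.go",
    "**/*.{yaml,yml,json,tf}",
    "Dockerfile*",
    "**/*.{md,mdx}",
]


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a simplified glob (**/, *, ?, {a,b}) into a compiled regex.

    Assumption: this mirrors common glob behavior, not necessarily CodeRabbit's.
    """
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if pattern.startswith("**/", i):
            out.append(r"(?:.*/)?")  # zero or more leading directories
            i += 3
        elif ch == "*":
            out.append(r"[^/]*")  # any characters within one path segment
            i += 1
        elif ch == "?":
            out.append(r"[^/]")  # exactly one character within a segment
            i += 1
        elif ch == "{":
            end = pattern.index("}", i)
            options = pattern[i + 1:end].split(",")
            out.append("(?:" + "|".join(re.escape(o) for o in options) + ")")
            i = end + 1
        else:
            out.append(re.escape(ch))
            i += 1
    return re.compile("".join(out) + r"\Z")


def matching_patterns(path: str) -> list:
    """Return every configured pattern that matches a repo-relative path."""
    return [p for p in PATTERNS if glob_to_regex(p).match(path)]
```

Under these assumptions, `matching_patterns(".github/workflows/test.yaml")` selects the YAML/JSON/Terraform entry, while `Dockerfile*` only matches files at the repository root.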
9 changes: 8 additions & 1 deletion .github/workflows/test.yaml
@@ -241,6 +241,7 @@ jobs:
"cohere",
"dspy",
"groq",
"huggingface",
"google_ai_studio",
"instructor",
"langchain",
@@ -251,7 +252,7 @@ jobs:
"notdiamond",
"openai",
"vertexai",
-"scorers_tests",
+"scorers",
"pandas-test",
]
fail-fast: false
@@ -276,6 +277,11 @@ jobs:
--health-start-period=10s
weave_clickhouse:
image: clickhouse/clickhouse-server
env:
CLICKHOUSE_DEFAULT_ACCESS_MANAGEMENT: 1
CLICKHOUSE_USER: default
CLICKHOUSE_PASSWORD: ""
CLICKHOUSE_DB: default
ports:
- "8123:8123"
options: --health-cmd "wget -nv -O- 'http://localhost:8123/ping' || exit 1" --health-interval=5s --health-timeout=3s
@@ -314,6 +320,7 @@ jobs:
WEAVE_SERVER_DISABLE_ECOSYSTEM: 1
WANDB_API_KEY: ${{ secrets.WANDB_API_KEY }}
GOOGLE_API_KEY: ${{ secrets.GOOGLE_API_KEY }}
GEMINI_API_KEY: ${{ secrets.GEMINI_API_KEY }}
ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
MISTRAL_API_KEY: ${{ secrets.MISTRAL_API_KEY }}
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
3 changes: 2 additions & 1 deletion .gitignore
@@ -17,4 +17,5 @@ gha-creds-*.json
.coverage
.nox
*.log
-*/file::memory:?cache=shared
+*/file::memory:?cache=shared
+tests/weave_models/
3 changes: 2 additions & 1 deletion README.md
@@ -24,7 +24,7 @@ Our goal is to bring rigor, best-practices, and composability to the inherently

## Documentation

-Our documentation site can be found [here](https://wandb.me/weave)
+Our documentation site can be found [here](https://wandb.me/weave).

## Installation
```
@@ -104,3 +104,4 @@ We're in the process of 🧹 cleaning up 🧹. This codebase contains a large am
The Weave Tracing code is mostly in: `weave/trace` and `weave/trace_server`.

The Weave Evaluations code is mostly in `weave/flow`.

43 changes: 24 additions & 19 deletions dev_docs/BuiltinObjectClasses.md
@@ -19,7 +19,7 @@ These can then be retrieved using `weave.ref().get()`:
config = weave.ref("my_model_config").get()
```

-Sometimes users are working with standard structured classes like `dataclasses` or `pydantic.BaseModel`.
+Sometimes users are working with standard structured classes like `dataclasses` or `pydantic.BaseModel`.
In such cases, we have special serialization and deserialization logic that allows for cleaner serialization patterns.
For example, let's say the user does:

@@ -40,11 +40,11 @@ This will result in an on-disk payload that looks like:

```json
{
-"model_name": "my_model",
-"model_version": "1.0",
-"_type": "ModelConfig",
-"_class_name": "ModelConfig",
-"_bases": ["Object", "BaseModel"]
+"model_name": "my_model",
+"model_version": "1.0",
+"_type": "ModelConfig",
+"_class_name": "ModelConfig",
+"_bases": ["Object", "BaseModel"]
}
```
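
As a rough illustration of how a payload shaped like the one in this hunk could be derived, the sketch below walks a class's method resolution order to fill `_class_name` and `_bases`. `BaseModel` and `Object` are plain stand-in classes rather than the real `pydantic.BaseModel` and `weave.Object`, and `to_payload` is a hypothetical helper. This is not Weave's actual serializer.

```python
from dataclasses import dataclass, fields


class BaseModel:
    """Stand-in for pydantic.BaseModel (assumption, not the real class)."""


class Object(BaseModel):
    """Stand-in for weave.Object (assumption, not the real class)."""


@dataclass
class ModelConfig(Object):
    model_name: str
    model_version: str


def to_payload(obj) -> dict:
    """Hypothetical helper: field values plus class metadata keys."""
    cls = type(obj)
    payload = {f.name: getattr(obj, f.name) for f in fields(obj)}
    payload["_type"] = cls.__name__
    payload["_class_name"] = cls.__name__
    # Every ancestor except the class itself and the implicit `object` root.
    payload["_bases"] = [c.__name__ for c in cls.__mro__[1:] if c is not object]
    return payload


payload = to_payload(ModelConfig(model_name="my_model", model_version="1.0"))
```

With these stand-ins, `payload` carries the same keys and values as the JSON document shown in the diff.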

@@ -53,10 +53,11 @@ Effectively, this is like creating a virtual table for that class.

**Terminology**: We use the term "weave Object" (capital "O") to refer to instances of classes that subclass `weave.Object`.

-**Technical note**: the "base_object_class" is the first subtype of "Object", not the _class_name.
+**Technical note**: the "base_object_class" is the first subtype of "Object", not the \_class_name.
For example, let's say the class hierarchy is:
-* `A -> Object -> BaseModel`, then the `base_object_class` filter will be "A".
-* `B -> A -> Object -> BaseModel`, then the `base_object_class` filter will still be "A"!
+
+- `A -> Object -> BaseModel`, then the `base_object_class` filter will be "A".
+- `B -> A -> Object -> BaseModel`, then the `base_object_class` filter will still be "A"!
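
The rule in the technical note can be sketched as a walk over the class hierarchy: find the class whose immediate parent is `Object`. The snippet assumes single inheritance and uses stand-in `BaseModel` and `Object` classes; `base_object_class` here is a hypothetical function, not the lookup Weave itself performs.

```python
from typing import Optional


class BaseModel:
    """Stand-in for pydantic.BaseModel (assumption)."""


class Object(BaseModel):
    """Stand-in for weave.Object (assumption)."""


class A(Object):
    pass


class B(A):
    pass


def base_object_class(cls: type) -> Optional[str]:
    """Name of the class in cls's lineage that directly subclasses Object."""
    mro = cls.__mro__
    # Pair each class with the next entry in its lineage (single inheritance).
    for candidate, parent in zip(mro, mro[1:]):
        if parent is Object:
            return candidate.__name__
    return None  # cls is Object itself or does not derive from it
```

Both `A` and `B` resolve to `"A"` here, which matches the two hierarchy examples in the note.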

Finally, the Weave library itself utilizes this mechanism for common objects like `Model`, `Dataset`, `Evaluation`, etc...
This allows the user to subclass these objects to add additional metadata or functionality, while categorizing them in the same virtual table.
@@ -97,6 +98,7 @@ __all__ = ["MyConfig"]
```

2. **Use in Python**:

```python
# Publishing
ref = weave.publish(MyConfig(...))
@@ -107,6 +109,7 @@ assert isinstance(config, MyConfig)
```

3. **Use via HTTP API**:

```bash
# Creating
curl -X POST 'https://trace.wandb.ai/obj/create' \
@@ -131,12 +134,13 @@ curl -X POST 'https://trace.wandb.ai/objs/query' \
```

4. **Use in React**:

```typescript
// Read with type safety
const result = useBaseObjectInstances("MyConfig", ...);

// Write with validation
-const createFn = useCreateBaseObjectInstance("MyConfig");
+const createFn = useCreateBuiltinObjectInstance("MyConfig");
createFn({...}); // TypeScript enforced schema
```

@@ -157,15 +161,16 @@ Run `make synchronize-base-object-schemas` to ensure the frontend TypeScript typ
1. Define your schema in a Python file in the `weave/trace_server/interface/builtin_object_classes/` directory. See `weave/trace_server/interface/builtin_object_classes/test_only_example.py` as an example.
2. Make sure to register your schemas in `weave/trace_server/interface/builtin_object_classes/builtin_object_registry.py` by calling `register_base_object`.
3. Run `make synchronize-base-object-schemas` to generate the frontend types.
-* The first step (`make generate_base_object_schemas`) will run `weave/scripts/generate_base_object_schemas.py` to generate a JSON schema in `weave/trace_server/interface/builtin_object_classes/generated/generated_builtin_object_class_schemas.json`.
-* The second step (yarn `generate-schemas`) will read this file and use it to generate the frontend types located in `weave-js/src/components/PagePanelComponents/Home/Browse3/pages/wfReactInterface/generatedBuiltinObjectClasses.zod.ts`.
+- The first step (`make generate_base_object_schemas`) will run `weave/scripts/generate_base_object_schemas.py` to generate a JSON schema in `weave/trace_server/interface/builtin_object_classes/generated/generated_builtin_object_class_schemas.json`.
+- The second step (yarn `generate-schemas`) will read this file and use it to generate the frontend types located in `weave-js/src/components/PagePanelComponents/Home/Browse3/pages/wfReactInterface/generatedBuiltinObjectClasses.zod.ts`.
4. Now, each use case uses different parts:
-1. `Python Writing`. Users can directly import these classes and use them as normal Pydantic models, which get published with `weave.publish`. The python client correctly builds the requisite payload.
-2. `Python Reading`. Users can `weave.ref().get()` and the weave python SDK will return the instance with the correct type. Note: we do some special handling such that the returned object is not a WeaveObject, but literally the exact pydantic class.
-3. `HTTP Writing`. In cases where the client/user does not want to add the special type information, users can publish builtin objects (set of weave.Objects provided by Weave) by setting the `builtin_object_class` setting on `POST obj/create` to the name of the class. The weave server will validate the object against the schema, update the metadata fields, and store the object.
-4. `HTTP Reading`. When querying for objects, the server will return the object with the correct type if the `base_object_class` metadata field is set.
-5. `Frontend`. The frontend will read the zod schema from `weave-js/src/components/PagePanelComponents/Home/Browse3/pages/wfReactInterface/generatedBuiltinObjectClasses.zod.ts` and use that to provide compile time type safety when using `useBaseObjectInstances` and runtime type safety when using `useCreateBaseObjectInstance`.
-* Note: it is critical that all techniques produce the same digest for the same data - which is tested in the tests. This way versions are not thrashed by different clients/users.
+1. `Python Writing`. Users can directly import these classes and use them as normal Pydantic models, which get published with `weave.publish`. The python client correctly builds the requisite payload.
+2. `Python Reading`. Users can `weave.ref().get()` and the weave python SDK will return the instance with the correct type. Note: we do some special handling such that the returned object is not a WeaveObject, but literally the exact pydantic class.
+3. `HTTP Writing`. In cases where the client/user does not want to add the special type information, users can publish builtin objects (set of weave.Objects provided by Weave) by setting the `builtin_object_class` setting on `POST obj/create` to the name of the class. The weave server will validate the object against the schema, update the metadata fields, and store the object.
+4. `HTTP Reading`. When querying for objects, the server will return the object with the correct type if the `base_object_class` metadata field is set.
+5. `Frontend`. The frontend will read the zod schema from `weave-js/src/components/PagePanelComponents/Home/Browse3/pages/wfReactInterface/generatedBuiltinObjectClasses.zod.ts` and use that to provide compile time type safety when using `useBaseObjectInstances` and runtime type safety when using `useCreateBuiltinObjectInstance`.
+
+- Note: it is critical that all techniques produce the same digest for the same data - which is tested in the tests. This way versions are not thrashed by different clients/users.
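
One common way to get the digest stability described in the note is to hash a canonical encoding of the payload, so that key order and whitespace introduced by different clients do not change the result. The sketch below uses sorted-key compact JSON and SHA-256 purely as an illustration. `stable_digest` is a hypothetical helper, and Weave's actual digest scheme may be different.

```python
import hashlib
import json


def stable_digest(payload: dict) -> str:
    """Hash a canonical JSON encoding so key order and spacing don't matter.

    Assumption: sorted keys + compact separators + SHA-256; illustrative only.
    """
    canonical = json.dumps(
        payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# The same data as two different clients might assemble it.
from_python = {
    "model_name": "my_model",
    "model_version": "1.0",
    "_class_name": "ModelConfig",
}
from_http = {
    "_class_name": "ModelConfig",
    "model_version": "1.0",
    "model_name": "my_model",
}
```

Here `stable_digest(from_python)` and `stable_digest(from_http)` agree, while changing any value produces a different digest and therefore a new version.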

```mermaid
graph TD
@@ -201,7 +206,7 @@ graph TD
subgraph "Frontend"
Z --> |import| UBI["useBaseObjectInstances"]
-Z --> |import| UCI["useCreateBaseObjectInstance"]
+Z --> |import| UCI["useCreateBuiltinObjectInstance"]
UBI --> |Filters base_object_class| HR
UCI --> |object_class| HW
UI[React UI] --> UBI