[Doc][4/N] Reorganize API Reference (vllm-project#11843)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
DarkLight1337 authored Jan 8, 2025
1 parent aba8d6e commit 6cd40a5
Showing 24 changed files with 38 additions and 67 deletions.
2 changes: 1 addition & 1 deletion .buildkite/test-pipeline.yaml
@@ -38,7 +38,7 @@ steps:
- pip install -r requirements-docs.txt
- SPHINXOPTS=\"-W\" make html
# Check API reference (if it fails, you may have missing mock imports)
- - grep \"sig sig-object py\" build/html/dev/sampling_params.html
+ - grep \"sig sig-object py\" build/html/api/params.html

- label: Async Engine, Inputs, Utils, Worker Test # 24min
fast_check: true
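
The `grep` step above passes only if the built page contains the CSS classes that Sphinx's HTML builder attaches to rendered Python signatures. A minimal Python sketch of the same check follows; the function names `has_rendered_signatures` and `check_page` are hypothetical and are not part of the vLLM repository or its CI.

```python
from pathlib import Path

# Class string that Sphinx's HTML builder puts on rendered Python signatures.
SIGNATURE_MARKER = "sig sig-object py"


def has_rendered_signatures(html_text: str) -> bool:
    """Return True if the HTML contains at least one rendered Python signature."""
    return SIGNATURE_MARKER in html_text


def check_page(path: Path) -> bool:
    """Hypothetical helper: read a built page and apply the same test as grep."""
    return has_rendered_signatures(path.read_text(encoding="utf-8"))


# Example with in-memory snippets instead of a real build directory.
rendered = '<dt class="sig sig-object py" id="vllm.SamplingParams">'
unrendered = "<h1>Optional Parameters</h1>"
print(has_rendered_signatures(rendered))
print(has_rendered_signatures(unrendered))
```

If autodoc cannot import the module (for example because a mock import is missing), the page is still generated but carries no signature markup, which is the failure this check is meant to catch.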
4 changes: 2 additions & 2 deletions Dockerfile
@@ -2,8 +2,8 @@
# to run the OpenAI compatible server.

# Please update any changes made here to
- # docs/source/dev/dockerfile/dockerfile.md and
- # docs/source/assets/dev/dockerfile-stages-dependency.png
+ # docs/source/contributing/dockerfile/dockerfile.md and
+ # docs/source/assets/contributing/dockerfile-stages-dependency.png

ARG CUDA_VERSION=12.4.1
#################### BASE BUILD IMAGE ####################
File renamed without changes.
File renamed without changes.
File renamed without changes.
@@ -11,18 +11,8 @@ vLLM provides experimental support for multi-modal models through the {mod}`vllm
Multi-modal inputs can be passed alongside text and token prompts to [supported models](#supported-mm-models)
via the `multi_modal_data` field in {class}`vllm.inputs.PromptType`.

Currently, vLLM only has built-in support for image data. You can extend vLLM to process additional modalities
by following [this guide](#adding-multimodal-plugin).

Looking to add your own multi-modal model? Please follow the instructions listed [here](#enabling-multimodal-inputs).

## Guides

```{toctree}
:maxdepth: 1
adding_multimodal_plugin
```

## Module Contents

File renamed without changes.
File renamed without changes.
22 changes: 22 additions & 0 deletions docs/source/api/params.md
@@ -0,0 +1,22 @@
# Optional Parameters

Optional parameters for vLLM APIs.

(sampling-params)=

## Sampling Parameters

```{eval-rst}
.. autoclass:: vllm.SamplingParams
:members:
```

(pooling-params)=

## Pooling Parameters

```{eval-rst}
.. autoclass:: vllm.PoolingParams
:members:
```
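
For readers unfamiliar with what the sampling parameters control, the sketch below illustrates temperature scaling and nucleus (top-p) filtering, the two settings the quickstart sets to `0.8` and `0.95`. It is a plain-Python illustration under simplifying assumptions (a small list of logits, `temperature > 0`), not vLLM's implementation; the function names are hypothetical.

```python
import math
import random


def apply_temperature(logits, temperature):
    """Divide logits by the temperature (assumed > 0) and apply softmax."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the smallest set of most likely tokens whose mass reaches top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    # Renormalize over the kept tokens; all other tokens get probability 0.
    filtered = [0.0] * len(probs)
    for i in kept:
        filtered[i] = probs[i] / mass
    return filtered


def sample_token(logits, temperature=0.8, top_p=0.95, rng=random):
    """Draw one token index after temperature scaling and top-p filtering."""
    probs = top_p_filter(apply_temperature(logits, temperature), top_p)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

Lower temperatures sharpen the distribution toward the most likely token, and a smaller `top_p` discards more of the low-probability tail before sampling.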

2 changes: 1 addition & 1 deletion docs/source/contributing/dockerfile/dockerfile.md
@@ -17,7 +17,7 @@ The edges of the build graph represent:

- `RUN --mount=(.\*)from=...` dependencies (with a dotted line and an empty diamond arrow head)

- > ```{figure} ../../assets/dev/dockerfile-stages-dependency.png
+ > ```{figure} /assets/contributing/dockerfile-stages-dependency.png
> :align: center
> :alt: query
> :width: 100%
2 changes: 1 addition & 1 deletion docs/source/design/arch_overview.md
@@ -53,7 +53,7 @@ for output in outputs:
```

More API details can be found in the {doc}`Offline Inference
- </dev/offline_inference/offline_index>` section of the API docs.
+ </api/offline_inference/index>` section of the API docs.

The code for the `LLM` class can be found in <gh-file:vllm/entrypoints/llm.py>.

16 changes: 0 additions & 16 deletions docs/source/design/multimodal/adding_multimodal_plugin.md

This file was deleted.

6 changes: 0 additions & 6 deletions docs/source/dev/pooling_params.md

This file was deleted.

6 changes: 0 additions & 6 deletions docs/source/dev/sampling_params.md

This file was deleted.

2 changes: 1 addition & 1 deletion docs/source/getting_started/quickstart.md
@@ -42,7 +42,7 @@ The first line of this example imports the classes {class}`~vllm.LLM` and {class
from vllm import LLM, SamplingParams
```

- The next section defines a list of input prompts and sampling parameters for text generation. The [sampling temperature](https://arxiv.org/html/2402.05201v1) is set to `0.8` and the [nucleus sampling probability](https://en.wikipedia.org/wiki/Top-p_sampling) is set to `0.95`. You can find more information about the sampling parameters [here](https://docs.vllm.ai/en/stable/dev/sampling_params.html).
+ The next section defines a list of input prompts and sampling parameters for text generation. The [sampling temperature](https://arxiv.org/html/2402.05201v1) is set to `0.8` and the [nucleus sampling probability](https://en.wikipedia.org/wiki/Top-p_sampling) is set to `0.95`. You can find more information about the sampling parameters [here](#sampling-params).

```python
prompts = [
9 changes: 4 additions & 5 deletions docs/source/index.md
@@ -137,10 +137,10 @@ community/sponsors
:caption: API Reference
:maxdepth: 2
- dev/sampling_params
- dev/pooling_params
- dev/offline_inference/offline_index
- dev/engine/engine_index
+ api/offline_inference/index
+ api/engine/index
+ api/multimodal/index
+ api/params
```

% Design Documents: Details about vLLM internals
@@ -154,7 +154,6 @@ design/huggingface_integration
design/plugin_system
design/kernel/paged_attention
design/input_processing/model_inputs_index
- design/multimodal/multimodal_index
design/automatic_prefix_caching
design/multiprocessing
```
2 changes: 1 addition & 1 deletion docs/source/serving/offline_inference.md
@@ -23,7 +23,7 @@ The available APIs depend on the type of model that is being run:
Please refer to the above pages for more details about each API.

```{seealso}
- [API Reference](/dev/offline_inference/offline_index)
+ [API Reference](/api/offline_inference/index)
```

## Configuration Options
8 changes: 4 additions & 4 deletions docs/source/serving/openai_compatible_server.md
@@ -195,7 +195,7 @@ Code example: <gh-file:examples/online_serving/openai_completion_client.py>

#### Extra parameters

- The following [sampling parameters (click through to see documentation)](../dev/sampling_params.md) are supported.
+ The following [sampling parameters](#sampling-params) are supported.

```{literalinclude} ../../../vllm/entrypoints/openai/protocol.py
:language: python
@@ -226,7 +226,7 @@ Code example: <gh-file:examples/online_serving/openai_chat_completion_client.py>

#### Extra parameters

- The following [sampling parameters (click through to see documentation)](../dev/sampling_params.md) are supported.
+ The following [sampling parameters](#sampling-params) are supported.

```{literalinclude} ../../../vllm/entrypoints/openai/protocol.py
:language: python
@@ -259,7 +259,7 @@ Code example: <gh-file:examples/online_serving/openai_embedding_client.py>

#### Extra parameters

- The following [pooling parameters (click through to see documentation)](../dev/pooling_params.md) are supported.
+ The following [pooling parameters](#pooling-params) are supported.

```{literalinclude} ../../../vllm/entrypoints/openai/protocol.py
:language: python
@@ -447,7 +447,7 @@ Response:

#### Extra parameters

- The following [pooling parameters (click through to see documentation)](../dev/pooling_params.md) are supported.
+ The following [pooling parameters](#pooling-params) are supported.

```{literalinclude} ../../../vllm/entrypoints/openai/protocol.py
:language: python
3 changes: 0 additions & 3 deletions vllm/multimodal/base.py
@@ -49,9 +49,6 @@ class MultiModalPlugin(ABC):
process the same data differently). This registry is in turn used by
:class:`~MultiModalRegistry` which acts at a higher level
(i.e., the modality of the data).
- See also:
- :ref:`adding-multimodal-plugin`
"""

def __init__(self) -> None:
6 changes: 0 additions & 6 deletions vllm/multimodal/inputs.py
@@ -99,12 +99,6 @@ class MultiModalDataBuiltins(TypedDict, total=False):
MultiModalDataDict: TypeAlias = Mapping[str, ModalityData[Any]]
"""
A dictionary containing an entry for each modality type to input.
- Note:
- This dictionary also accepts modality keys defined outside
- :class:`MultiModalDataBuiltins` as long as a customized plugin
- is registered through the :class:`~vllm.multimodal.MULTIMODAL_REGISTRY`.
- Read more on that :ref:`here <adding-multimodal-plugin>`.
"""


3 changes: 0 additions & 3 deletions vllm/multimodal/registry.py
@@ -125,9 +125,6 @@ def __init__(
def register_plugin(self, plugin: MultiModalPlugin) -> None:
"""
Register a multi-modal plugin so it can be recognized by vLLM.
- See also:
- :ref:`adding-multimodal-plugin`
"""
data_type_key = plugin.get_data_key()

2 changes: 1 addition & 1 deletion vllm/pooling_params.py
@@ -7,7 +7,7 @@ class PoolingParams(
msgspec.Struct,
omit_defaults=True, # type: ignore[call-arg]
array_like=True): # type: ignore[call-arg]
- """Pooling parameters for embeddings API.
+ """API parameters for pooling models. This is currently a placeholder.
Attributes:
additional_data: Any additional data needed for pooling.
