Commit

Merge branch 'main' into pinbump1111
Jack-Khuu authored Dec 6, 2024
2 parents 6e8bfb1 + 29428ef commit 5a80f5f
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion docs/quantization.md
@@ -59,7 +59,7 @@ for valid `bitwidth` and `groupsize` values.
| linear with dynamic activations (symmetric) | `'{"linear:a8w4dq" : {"groupsize" : <groupsize>}}'`|
| embedding | `'{"embedding": {"bitwidth": <bitwidth>, "groupsize":<groupsize>}}'` |

-See the available quantization schemes [here](https://github.com/pytorch/torchchat/blob/main/torchchat/utils/quantize.py#L1260-L1266).
+See the available quantization schemes [here](https://github.com/pytorch/torchchat/blob/b809b69e03f8f4b75a4b27b0778f0d3695ce94c2/torchchat/utils/quantize.py#L887-L894).

In addition to quantization, the [accelerator](model_customization.md#device)
and [precision](model_customization.md#model-precision) can also be specified.
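
For reference, the config strings in the table rows above are plain JSON objects keyed by scheme name. The sketch below shows one way to assemble such a string in Python rather than writing it by hand. The helper name `build_quantize_config` and the values `32` and `4` are illustrative assumptions, not taken from this commit; consult the linked `quantize.py` for the `bitwidth` and `groupsize` values that are actually valid.

```python
import json


def build_quantize_config(linear_groupsize, embedding_bitwidth, embedding_groupsize):
    """Return a JSON string shaped like the table rows above.

    Hypothetical helper for illustration only; it does not validate the
    values against what torchchat accepts.
    """
    config = {
        # Linear layers with dynamic activations (symmetric)
        "linear:a8w4dq": {"groupsize": linear_groupsize},
        # Embedding table quantization
        "embedding": {
            "bitwidth": embedding_bitwidth,
            "groupsize": embedding_groupsize,
        },
    }
    return json.dumps(config)


# Placeholder values chosen for illustration, not confirmed by the source.
config_str = build_quantize_config(
    linear_groupsize=32,
    embedding_bitwidth=4,
    embedding_groupsize=32,
)
print(config_str)
```

Generating the string with `json.dumps` avoids the quoting mistakes that are easy to make when nesting double quotes inside a single-quoted shell argument.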