ptq

Star

Here are 16 public repositories matching this topic...

Xilinx / brevitas

Star

Brevitas: neural network quantization in PyTorch

fpga deep-learning pytorch neural-networks xilinx quantization hardware-acceleration qat brevitas ptq

Updated Apr 20, 2025
Python

Bobo-y / flexible-yolov5

Star

More readable and flexible yolov5 with more backbone(gcn, resnet, shufflenet, moblienet, efficientnet, hrnet, swin-transformer, etc) and (cbam，dcn and so on), and tensorrt

sparsity backbone pytorch resnet object-detection gcn tensorrt neck qat shufflenet yolov3 cbam hrnet dcnv2 yolov5 moblienet swin-transformer triton-server ptq

Updated Aug 19, 2024
Python

Model Compression Toolkit (MCT) is an open source project for neural network model optimization under efficient, constrained hardware. This project provides researchers, developers, and engineers advanced quantization and compression tools for deploying state-of-the-art neural networks.

machine-learning deep-neural-networks deep-learning neural-network tensorflow optimizer pytorch quantization qat network-quantization network-compression edge-ai ptq

Updated Apr 20, 2025
Python

TsingmaoAI / MI-optimize

Star

mi-optimize is a versatile tool designed for the quantization and evaluation of large language models (LLMs). The library's seamless integration of various quantization methods and evaluation techniques empowers users to customize their approaches according to specific requirements and constraints, providing a high level of flexibility.

benchmark inference quantization qat ptq llm large-language-model

Updated Nov 28, 2024
Python

MAGICS-LAB / OutEffHop

Star

[ICML 2024] Outlier-Efficient Hopfield Layers for Large Transformer-Based Models

transformer outliers attention attention-mechanism outlier-removal outlier hopfield-neural-network ptq outlier-treatment modern-hopfield-networks modern-hopfield-model icml-2024 softmax-1 quantized-friendly no-op-outlier

Updated Oct 17, 2024
Python

lix19937 / tensorrt-insight

Star

Deep insight tensorrt, including but not limited to qat, ptq, plugin, triton_inference, cuda

nvidia asp tensorrt qat ptq

Updated Apr 12, 2025
C++

yester31 / TensorRT_API

Star

Deep Learning Model Optimization Using by TensorRT API, window

cuda pytorch vgg resnet quantization unet tensorrt yolov5 detr ptq yolov6

Updated Aug 29, 2022
Python

yester31 / Quantization_EX

Star

quantization example for pqt & qat

quantization tensorrt int8 qat model-optimization quantization-aware-training post-training-quantization pytorch-quantization ptq

Updated Jul 24, 2023
Python

smpanaro / norm-tweaking

Star

Post post-training-quantization (PTQ) method for improving LLMs. Unofficial implementation of https://arxiv.org/abs/2309.02784

quantization post-training-quantization ptq llms

Updated Feb 21, 2024
Python

yester31 / TensorRT_Sparse

Star

inference with the structured sparsity and quantization

quantization tensorrt structured-sparsity sparsity-pattern ptq sparse-tensor-cores sparse-int8-model accelerate-the-inference

Updated Aug 30, 2023
Python

yester31 / TensorRT_ONNX

Star

Generating tensorrt model using onnx

pytorch quantization tensorrt onnx int8-inference onnxruntime post-training-quantization int8-quantization tensorrt-inference ptq

Updated Jun 22, 2023
C++

BlindOver / blindover_AI

Star

Build AI model to classify beverages for blind individuals

ai deep-learning mobile-app pytorch classification resnet quantization qat shufflenetv2 mobilenetv3 efficientnet ptq

Updated Aug 16, 2023
Python

OmidGhadami95 / EfficientNetV2_Quantization_CK

Star

EfficientNetV2 (Efficientnetv2-b2) and quantization int8 and fp32 (QAT and PTQ) on CK+ dataset . fine-tuning, augmentation, solving imbalanced dataset, etc.