libcll: Complementary Label Learning Benchmark

libcll is a Python library designed to simplify complementary-label learning (CLL) for researchers tackling real-world challenges. The package implements a wide range of popular CLL strategies, including CPE, the state-of-the-art algorithm as of 2023. Additionally, it includes unique datasets like CLCIFAR10, CLCIFAR20, CLMIN10, and CLMIN20, which feature complementary labels collected from human annotators. To foster extensibility, libcll provides a unified interface for integrating additional strategies, datasets, and models, making it a versatile tool for advancing CLL research. For more details, refer to the associated technical report on arXiv.
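For readers new to the setting: a complementary label names a class that an instance does not belong to. The snippet below is a minimal sketch of how such labels are often synthesized from ordinary labels, assuming uniform sampling over the remaining classes. The helper `sample_complementary_label` is hypothetical and is not part of the libcll API.

```python
import random


def sample_complementary_label(true_label, num_classes, rng):
    """Draw a class index uniformly from all classes except true_label.

    Hypothetical helper for illustration; not a libcll function.
    """
    # Pick one of the K-1 remaining slots, then skip over the true label.
    draw = rng.randrange(num_classes - 1)
    return draw if draw < true_label else draw + 1


rng = random.Random(0)
true_labels = [3, 0, 9, 5]
comp_labels = [sample_complementary_label(y, 10, rng) for y in true_labels]

# Each complementary label is a valid class that differs from the true label.
for y, c in zip(true_labels, comp_labels):
    print(f"true={y} complementary={c}")
```

Human-annotated complementary labels, such as those in the CL-prefixed datasets, are generally not uniform, which is why some strategies work with a transition matrix.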

Installation

  • Python version >= 3.8, <= 3.12
  • PyTorch version >= 1.11, <= 2.0
  • PyTorch Lightning version >= 2.0
  • To install libcll and develop locally:
git clone git@github.com:ntucllab/libcll.git
cd libcll
pip install -e .
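
The version constraints above can be checked before installing. The following is a rough sketch using only the standard library; it assumes the distributions are published under the names `torch` and `pytorch-lightning`, and it compares only the major and minor version numbers, so any 2.0.x PyTorch build counts as satisfying `<= 2.0`. The function names are illustrative.

```python
import re
import sys
from importlib import metadata


def major_minor(version):
    """Extract (major, minor) from a version string such as '2.0.1+cu117'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def in_range(version, low, high=None):
    """Return True if low <= version <= high (no upper bound when high is None)."""
    return low <= version and (high is None or version <= high)


def check_environment():
    """Map each requirement to True/False, or None if the package is missing."""
    report = {"python": in_range(tuple(sys.version_info[:2]), (3, 8), (3, 12))}
    # Assumed distribution names; adjust if your environment differs.
    requirements = {
        "torch": ((1, 11), (2, 0)),
        "pytorch-lightning": ((2, 0), None),
    }
    for name, (low, high) in requirements.items():
        try:
            installed = major_minor(metadata.version(name))
        except metadata.PackageNotFoundError:
            report[name] = None
        else:
            report[name] = in_range(installed, low, high)
    return report


print(check_environment())
```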

Running

Supported Strategies

| Strategies | Type | Description |
| --- | --- | --- |
| SCL | NL, EXP | Surrogate Complementary Loss with the negative log loss (NL) or the exponential loss (EXP) |
| URE | NN, GA, TNN, TGA | Unbiased Risk Estimator with non-negative correction (NN) or gradient ascent (GA); the T prefix indicates use of an empirical transition matrix |
| FWD | None | Forward Correction |
| DM | None | Discriminative Models with Weighted Loss |
| CPE | I, F, T | Complementary Probability Estimates with different transition matrices (I, F, T) |
| MCL | MAE, EXP, LOG | Multiple Complementary Label learning with different errors (MAE, EXP, LOG) |
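
To make the table more concrete, the sketch below writes out three of these losses for a single example in plain Python. It follows the formulations commonly given in the CLL literature: SCL-NL as -log(1 - p), SCL-EXP as exp(p), and FWD as cross-entropy on the transition-corrected prediction, where p is the predicted probability of the complementary label. It is not taken from libcll's source; the function names and the uniform transition matrix are assumptions for illustration.

```python
import math

EPS = 1e-12  # guards against log(0)


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def scl_nl_loss(logits, comp_label):
    """SCL-NL: penalize probability mass placed on the complementary label."""
    p = softmax(logits)
    return -math.log(max(1.0 - p[comp_label], EPS))


def scl_exp_loss(logits, comp_label):
    """SCL-EXP: exponential of the probability of the complementary label."""
    p = softmax(logits)
    return math.exp(p[comp_label])


def uniform_transition(num_classes):
    """T[k][j] = P(complementary = j | true = k), uniform over j != k."""
    off_diagonal = 1.0 / (num_classes - 1)
    return [[0.0 if k == j else off_diagonal for j in range(num_classes)]
            for k in range(num_classes)]


def forward_loss(logits, comp_label, transition):
    """FWD: negative log-likelihood of the complementary label under T^T p."""
    p = softmax(logits)
    q = sum(p[k] * transition[k][comp_label] for k in range(len(p)))
    return -math.log(max(q, EPS))


logits = [2.0, 0.5, -1.0]
T = uniform_transition(3)
for comp in range(3):
    print(comp, scl_nl_loss(logits, comp), scl_exp_loss(logits, comp),
          forward_loss(logits, comp, T))
```

With a uniform transition matrix over K classes, the FWD loss equals the SCL-NL loss plus the constant log(K - 1), so the two differ only when the transition matrix is non-uniform.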

Supported Datasets

| Dataset | Number of Classes | Input Size | Description |
| --- | --- | --- | --- |
| MNIST | 10 | 28 x 28 | Grayscale images of handwritten digits (0 to 9). |
| FMNIST | 10 | 28 x 28 | Grayscale images of fashion items. |
| KMNIST | 10 | 28 x 28 | Grayscale images of cursive Japanese (“Kuzushiji”) characters. |
| Yeast | 10 | 8 | Features of different localization sites of protein. |
| Texture | 11 | 40 | Features of different textures. |
| Dermatology | 6 | 130 | Clinical attributes of different diseases. |
| Control | 6 | 60 | Features of synthetically generated control charts. |
| Micro ImageNet10 | 10 | 3 x 64 x 64 | Contains images of 10 classes designed for computer vision research. |
| Micro ImageNet20 | 20 | 3 x 64 x 64 | Contains images of 20 classes designed for computer vision research. |
| CIFAR10 | 10 | 3 x 32 x 32 | Colored images of different objects. |
| CIFAR20 | 20 | 3 x 32 x 32 | Colored images of different objects. |
| CLMicro ImageNet10 | 10 | 3 x 64 x 64 | Contains images of 10 classes designed for computer vision research paired with complementary labels annotated by humans. |
| CLMicro ImageNet20 | 20 | 3 x 64 x 64 | Contains images of 20 classes designed for computer vision research paired with complementary labels annotated by humans. |
| CLCIFAR10 | 10 | 3 x 32 x 32 | Colored images of distinct objects paired with complementary labels annotated by humans. |
| CLCIFAR20 | 20 | 3 x 32 x 32 | Colored images of distinct objects paired with complementary labels annotated by humans. |

Quick Start: Complementary Label Learning on MNIST

To reproduce the training results of the SCL-NL method on MNIST, run:

python script/train.py \
  --do_train \
  --do_predict \
  --strategy SCL \
  --type NL \
  --model MLP \
  --dataset MNIST \
  --lr 1e-4 \
  --batch_size 256 \
  --valid_type Accuracy
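
To run the same command for several configurations, the argument list can be assembled in Python. This is a minimal sketch that uses only the flags shown in the command above. `build_train_command` is a hypothetical helper, and the choice of looping over both SCL types is illustrative. The commands are printed rather than executed; running them assumes the working directory is a libcll checkout.

```python
import shlex
import sys


def build_train_command(strategy, type_, lr, dataset="MNIST", model="MLP",
                        batch_size=256, valid_type="Accuracy"):
    """Assemble the argv list for script/train.py (hypothetical helper)."""
    return [
        sys.executable, "script/train.py",
        "--do_train",
        "--do_predict",
        "--strategy", strategy,
        "--type", type_,
        "--model", model,
        "--dataset", dataset,
        "--lr", str(lr),
        "--batch_size", str(batch_size),
        "--valid_type", valid_type,
    ]


# One command per SCL variant listed in the strategies table.
commands = [build_train_command("SCL", t, 1e-4) for t in ("NL", "EXP")]
for cmd in commands:
    print(shlex.join(cmd))
    # To launch for real from the repository root:
    # import subprocess; subprocess.run(cmd, check=True)
```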

Documentation

The documentation for the latest release is available on readthedocs. Feedback, questions, and suggestions are highly encouraged. Contributions to improve the documentation are warmly welcomed and greatly appreciated!

Citing

If you find this package useful, please cite both the original works associated with each strategy and the following:

@techreport{libcll2024,
  author = {Nai-Xuan Ye and Tan-Ha Mai and Hsiu-Hsuan Wang and Wei-I Lin and Hsuan-Tien Lin},
  title = {libcll: an Extendable Python Toolkit for Complementary-Label Learning},
  institution = {National Taiwan University},
  url = {https://github.com/ntucllab/libcll},
  note = {available as arXiv preprint \url{https://arxiv.org/abs/2411.12276}},
  month = nov,
  year = 2024
}

Acknowledgment

We would like to express our gratitude to the following repositories for sharing their code, which greatly facilitated the development of libcll:
