License: MIT Python

SmartNoise Synthesizers

Differentially private synthesizers for tabular data. Package includes:

  • MWEM
  • QUAIL
  • DP-CTGAN
  • PATE-CTGAN
  • PATE-GAN

Installation

pip install smartnoise-synth

Using

MWEM

import pandas as pd
import numpy as np
import snsynth

# pums_csv_path should point to the PUMS CSV file (in datasets/)
pums = pd.read_csv(pums_csv_path, index_col=None)
pums = pums.drop(['income'], axis=1)
nf = pums.to_numpy().astype(int) # convert to an integer NumPy array

synth = snsynth.MWEMSynthesizer(epsilon=1.0, split_factor=nf.shape[1])
synth.fit(nf)

sample = synth.sample(10) # synthesize 10 rows
print(sample)

DP-CTGAN

import pandas as pd
import numpy as np
from snsynth.pytorch.nn import DPCTGAN
from snsynth.pytorch import PytorchDPSynthesizer

pums = pd.read_csv(pums_csv_path, index_col=None) # in datasets/
pums = pums.drop(['income'], axis=1)

# the first argument (1.0) is the privacy budget epsilon
synth = PytorchDPSynthesizer(1.0, DPCTGAN(), None)
# treat every column as categorical
synth.fit(pums, categorical_columns=pums.columns)

sample = synth.sample(10) # synthesize 10 rows
print(sample)

PATE-CTGAN

import pandas as pd
import numpy as np
from snsynth.pytorch.nn import PATECTGAN
from snsynth.pytorch import PytorchDPSynthesizer

pums = pd.read_csv(pums_csv_path, index_col=None) # in datasets/
pums = pums.drop(['income'], axis=1)

# the first argument (1.0) is the privacy budget epsilon
synth = PytorchDPSynthesizer(1.0, PATECTGAN(regularization='dragan'), None)
# treat every column as categorical
synth.fit(pums, categorical_columns=pums.columns)

sample = synth.sample(10) # synthesize 10 rows
print(sample)

Note on Inputs

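MWEM, DP-CTGAN, and PATE-CTGAN require all columns to be categorical. If your data has columns with continuous values, discretize them before fitting. Take care to discretize in a way that does not reveal information about the distribution of the data: for example, choose bin edges from public knowledge of the domain rather than computing them from the data itself (such as quantiles or the observed minimum and maximum).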
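One way to discretize without looking at the data is to fix the bin edges in advance. The sketch below is only an illustration, not part of the package: the `to_bin` helper, the `AGE_EDGES` values, and the idea of an `age` column are all assumptions made for the example. It uses only the Python standard library.

```python
from bisect import bisect_right

# Hypothetical, data-independent bin edges for an "age" column, chosen from
# public domain knowledge rather than computed from the private data.
AGE_EDGES = [18, 30, 45, 60, 75]


def to_bin(value, edges):
    """Return the index of the bin that `value` falls into.

    Values below edges[0] map to bin 0 and values at or above edges[-1]
    map to bin len(edges), so out-of-range inputs land in the outermost
    bins and the true minimum and maximum are not exposed.
    """
    return bisect_right(edges, value)


ages = [17, 18, 29.5, 45, 74, 102]
binned = [to_bin(a, AGE_EDGES) for a in ages]
print(binned)  # [0, 1, 1, 3, 4, 5]
```

Assuming the DataFrame has a continuous column named `age`, something like `pums['age'] = pums['age'].map(lambda v: to_bin(v, AGE_EDGES))` would apply the same binning before calling `fit`.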

Communication

Releases and Contributing

Please let us know if you encounter a bug by creating an issue.

We appreciate all contributions. Please review the contributors guide. We welcome pull requests with bug fixes without prior discussion.

If you plan to contribute new features, utility functions, or extensions to this system, please first open an issue and discuss the feature with us.