Unsupervised Ancient Document Image Denoising Based on Attention Mechanism

Motivation

Ancient documents are increasingly being digitized, but the poor quality of the raw manuscripts creates problems for researchers and readers. We therefore aim to reduce noise in these images and improve their quality. Because paired clean/noisy images of ancient documents are almost non-existent, we propose a model that can be trained on unpaired images.

Dataset

We chose Fangshan Shijing as our data source and cropped 1200 positive (clean) and 1200 negative (noisy) patches of 256 × 256 pixels. The ratio of training set to test set is 5:1. Here are two samples:

Download: Noise2Denoise

| Name | Explanation |
| --- | --- |
| trainA | clean images for training |
| trainB | noisy images for training |
| testA | clean images for testing |
| testB | noisy images for testing |
| testBgt_sim.txt | character-level annotation for testB (Simplified Chinese) |
| testBgt_tra.txt | character-level annotation for testB (Traditional Chinese) |

The PDF file is the data source from which we extracted all the samples. You can use it as you like.

Network Architecture

Our improvements focus on the generator: we embed an attention module in its stacked residual blocks. The goal is to steer the feature maps toward the foreground or background of the image, so that noise is removed without altering the text. We tried two attention mechanisms, SE and CBAM. SE attends to the channel dimension only, while CBAM attends to both the channel and the spatial dimensions.
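As a rough illustration of embedding channel attention in a residual block, here is a minimal PyTorch sketch of an SE-style variant. It is not taken from this repository: the class names `SEBlock` and `SEResidualBlock`, the reduction ratio of 16, the channel count, and the reflection-padding/instance-norm layout (borrowed from the usual CycleGAN residual block) are all assumptions made for the example.

```python
import torch
import torch.nn as nn


class SEBlock(nn.Module):
    """Squeeze-and-Excitation: reweights channels with a learned gate."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)  # squeeze: one value per channel
        self.fc = nn.Sequential(             # excitation: bottleneck MLP
            nn.Linear(channels, channels // reduction, bias=False),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels, bias=False),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        weights = self.pool(x).view(b, c)           # (B, C)
        weights = self.fc(weights).view(b, c, 1, 1)  # per-channel gate in (0, 1)
        return x * weights


class SEResidualBlock(nn.Module):
    """Hypothetical residual block with SE applied to the residual branch."""

    def __init__(self, channels: int = 256, reduction: int = 16):
        super().__init__()
        self.body = nn.Sequential(
            nn.ReflectionPad2d(1),
            nn.Conv2d(channels, channels, kernel_size=3),
            nn.InstanceNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.ReflectionPad2d(1),
            nn.Conv2d(channels, channels, kernel_size=3),
            nn.InstanceNorm2d(channels),
        )
        self.attention = SEBlock(channels, reduction)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Attention rescales the residual branch before the skip connection.
        return x + self.attention(self.body(x))
```

A CBAM variant would add a spatial attention map after the channel gate. Where exactly the attention sits relative to the skip connection is a design choice that this sketch only guesses at.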

Results

Attention Map Visualization

FID metrics

| Feature Dimension | 64 | 192 | 2048 |
| --- | --- | --- | --- |
| CycleGAN | 1.34 | 8.95 | 66.47 |
| CycleGAN+CBAM | 3.16 | 11.52 | 67.18 |
| CycleGAN+SE | 1.06 | 4.85 | 59.79 |
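For reference, the standard definition of FID compares Gaussian fits of Inception features extracted from the two image sets, and lower is better. The symbols below are notation introduced here, not taken from this repository: $\mu_r, \Sigma_r$ are the feature mean and covariance of the reference (clean) images, and $\mu_g, \Sigma_g$ are those of the generated (denoised) images.

$$
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2 + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\left( \Sigma_r \Sigma_g \right)^{1/2} \right)
$$

The feature dimension in the table presumably refers to the Inception layer from which the features are taken. The values 64, 192 and 2048 match layer options offered by the `pytorch-fid` package, although this README does not say which tool was used.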

OCR engine recognition output

We chose PaddleOCR for testing because it performed best on the denoised images. Below is the result for a degraded full page after processing by our model.

## OCR metrics
| Model | CER (%) |
| --- | --- |
| CycleGAN | 49.66 |
| CycleGAN+CBAM | 36.01 |
| CycleGAN+SE | 31.09 |
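CER is commonly defined as the character-level edit distance between the OCR output and the ground truth, divided by the ground-truth length. The README does not state exactly how the numbers above were computed, so the following is only a minimal sketch of that common definition. The function names `edit_distance` and `cer` are chosen for this example.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(hyp) + 1))  # distances from the empty ref prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a character from ref
                curr[j - 1] + 1,     # insert a character into ref
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate in percent, relative to the reference length."""
    if not ref:
        raise ValueError("reference text must not be empty")
    return 100.0 * edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    # One wrong character out of four.
    print(cer("如是我聞", "如是我間"))  # 25.0
```

With this definition, annotations such as `testBgt_tra.txt` would serve as `ref` and the OCR output on the denoised testB images as `hyp`.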

Configurations

torch 1.9.1

🍒 More details will be added soon!