[Hackathon 7th No.55] Add audiotools to PaddleSpeech (#3900)

* add AudioSignal && util
* fix codestyle
* add basemodel && decorator
* add util && add quality
* add acc && data && transforms
* add utils
* fix dir
* add *.py; wo unitest
* add unitest
* fix codestyle
* fix cuda error
* add readme && __all__
* add 2 file test
* change download dir
* fix CI download path
* add tar -zxvf
* change requirements path
* add audiotools path
* fix place error
* fix paddle2.5 verion Q
* FFTConv1d -> FFTConv1D
* FFTConv1d -> FFTConv1D
* mv unfold
* add _unfold1d 2 loudness
* fix stupid device variable
* bias -> bias_attr
* () -> []
* fix .to()
* rm ✅
* fix exp
* deepcopy -> clone
* fix dim error
* fix slice && tensor.to
* fix paddle2.5 index bug
* git rm std
* rm comment && ✅
* rm some useless comment
* add __all__
* fix codestyle
* fix soundfile.info error
* fix sth
* add License
* fix cycle import
* Adapt to paddle3.0 && update readme
* fix License
* fix License
* rm duplicate requirements
* fix trasform problems
* rm disp
* Update test_transforms.py
* change path
* rm notebook && add audio path
* rm import
* add comment
* fix cycle import && rm TYPE_CHECKING
* rm IPython
* rm sth useless
* rm uesless deps
* Update requirements.txt
PaddlePaddle · Jan 13, 2025 · cb15e38
1 parent 553a9db
Showing 42 changed files with 11,176 additions and 0 deletions.
diff --git a/audio/audiotools/README.md b/audio/audiotools/README.md
@@ -0,0 +1,68 @@
+Audiotools is a toolkit for audio processing and analysis. It covers audio signal processing, dataset management, model training, and evaluation.
+
+### Directory Structure
+
+```
+.
+├── audiotools
+│   ├── README.md
+│   ├── __init__.py
+│   ├── core
+│   │   ├── __init__.py
+│   │   ├── _julius.py
+│   │   ├── audio_signal.py
+│   │   ├── display.py
+│   │   ├── dsp.py
+│   │   ├── effects.py
+│   │   ├── ffmpeg.py
+│   │   ├── loudness.py
+│   │   └── util.py
+│   ├── data
+│   │   ├── __init__.py
+│   │   ├── datasets.py
+│   │   ├── preprocess.py
+│   │   └── transforms.py
+│   ├── metrics
+│   │   ├── __init__.py
+│   │   └── quality.py
+│   ├── ml
+│   │   ├── __init__.py
+│   │   ├── accelerator.py
+│   │   ├── basemodel.py
+│   │   └── decorators.py
+│   ├── requirements.txt
+│   └── post.py
+├── tests
+│   └── audiotools
+│       ├── core
+│       │   ├── test_audio_signal.py
+│       │   ├── test_bands.py
+│       │   ├── test_display.py
+│       │   ├── test_dsp.py
+│       │   ├── test_effects.py
+│       │   ├── test_fftconv.py
+│       │   ├── test_grad.py
+│       │   ├── test_highpass.py
+│       │   ├── test_loudness.py
+│       │   ├── test_lowpass.py
+│       │   └── test_util.py
+│       ├── data
+│       │   ├── test_datasets.py
+│       │   ├── test_preprocess.py
+│       │   └── test_transforms.py
+│       ├── ml
+│       │   ├── test_decorators.py
+│       │   └── test_model.py
+│       └── test_post.py
+
+```
+
+- **core**: Contains the `AudioSignal` class, which represents and manipulates audio signals, along with DSP, effects, loudness, and filtering modules.
+
+- **data**: Dataset classes plus preprocessing and transform functions for loading and transforming audio data.
+
+- **metrics**: Functions for audio quality metrics, used to evaluate audio models and processing algorithms.
+
+- **ml**: Classes and helpers for building and training audio models (`accelerator.py`, `basemodel.py`, `decorators.py`).
+
+The project aims to give developers and researchers an efficient, flexible framework for work across audio technology.
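The package layout in the README tree can be checked mechanically against a checkout. The following is a minimal sketch using only the standard library; the helper name `missing_modules` and the demo directory are hypothetical and not part of the commit, and the file list is copied from the tree above (README.md and requirements.txt are left out because they are not modules).

```python
import tempfile
from pathlib import Path

# Module paths copied from the README tree, relative to the audiotools package directory.
EXPECTED_MODULES = [
    "__init__.py",
    "core/__init__.py",
    "core/_julius.py",
    "core/audio_signal.py",
    "core/display.py",
    "core/dsp.py",
    "core/effects.py",
    "core/ffmpeg.py",
    "core/loudness.py",
    "core/util.py",
    "data/__init__.py",
    "data/datasets.py",
    "data/preprocess.py",
    "data/transforms.py",
    "metrics/__init__.py",
    "metrics/quality.py",
    "ml/__init__.py",
    "ml/accelerator.py",
    "ml/basemodel.py",
    "ml/decorators.py",
    "post.py",
]


def missing_modules(package_dir):
    """Return the expected module paths that are not files under package_dir."""
    root = Path(package_dir)
    return [rel for rel in EXPECTED_MODULES if not (root / rel).is_file()]


# Demo on a throwaway directory: create every file except the last one (post.py).
with tempfile.TemporaryDirectory() as tmp:
    for rel in EXPECTED_MODULES[:-1]:
        path = Path(tmp) / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()
    demo_missing = missing_modules(tmp)

print(demo_missing)
```

Against a real checkout, the call would be `missing_modules("audio/audiotools")`, assuming the script runs from the repository root.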
diff --git a/audio/audiotools/__init__.py b/audio/audiotools/__init__.py
@@ -0,0 +1,25 @@
+# Copyright (c) 2025 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+from . import metrics
+from . import ml
+from . import post
+from .core import AudioSignal
+from .core import highpass_filter
+from .core import highpass_filters
+from .core import Meter
+from .core import STFTParams
+from .core import util
+from .data import datasets
+from .data import preprocess
+from .data import transforms
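The `from .core import AudioSignal` line above relies on `core/__init__.py` having already lifted the class out of its defining module. The sketch below reproduces that two-step re-export with a throwaway stub package; the package name `demo_audiotools` and the empty class are hypothetical stand-ins, and the real audiotools code is not imported.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in package name; not the real audiotools package.
PKG = "demo_audiotools"


def build_stub_package(root):
    """Write a stub package that mirrors the leaf -> core -> top-level re-export chain."""
    core = Path(root) / PKG / "core"
    core.mkdir(parents=True)
    # Leaf module: where the class is actually defined.
    (core / "audio_signal.py").write_text("class AudioSignal:\n    pass\n")
    # Subpackage __init__: lifts the class one level up.
    (core / "__init__.py").write_text("from .audio_signal import AudioSignal\n")
    # Top-level __init__: lifts it again so callers only need the package name.
    (Path(root) / PKG / "__init__.py").write_text("from .core import AudioSignal\n")


def load_stub():
    """Build the stub in a temp directory and import both ends of the chain."""
    tmp = tempfile.mkdtemp()
    build_stub_package(tmp)
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        top = importlib.import_module(PKG)
        leaf = importlib.import_module(f"{PKG}.core.audio_signal")
    finally:
        sys.path.remove(tmp)
    return top, leaf


top, leaf = load_stub()
# Both names refer to the same class object; the class still reports its defining module.
print(top.AudioSignal is leaf.AudioSignal)
print(top.AudioSignal.__module__)
```

The re-export only shortens the import path. The class object and its `__module__` attribute are unchanged, which matters for pickling and for tracebacks.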
diff --git a/audio/audiotools/core/__init__.py b/audio/audiotools/core/__init__.py
@@ -0,0 +1,28 @@
+# Copyright (c) 2025 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+from . import util
+from ._julius import fft_conv1d
+from ._julius import FFTConv1D
+from ._julius import highpass_filter
+from ._julius import highpass_filters
+from ._julius import lowpass_filter
+from ._julius import LowPassFilter
+from ._julius import LowPassFilters
+from ._julius import pure_tone
+from ._julius import resample_frac
+from ._julius import split_bands
+from ._julius import SplitBands
+from .audio_signal import AudioSignal
+from .audio_signal import STFTParams
+from .loudness import Meter
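Comparing the two `__init__.py` files shows that the top-level package re-exports only some of the names that `core` exposes. The sketch below parses the import lines from this commit with the standard `ast` module and lists the names that appear only in `core/__init__.py`; the helper `imported_names` is hypothetical and written for this illustration.

```python
import ast

# Import lines copied from audio/audiotools/__init__.py in this commit.
TOP_LEVEL_INIT = """
from . import metrics
from . import ml
from . import post
from .core import AudioSignal
from .core import highpass_filter
from .core import highpass_filters
from .core import Meter
from .core import STFTParams
from .core import util
from .data import datasets
from .data import preprocess
from .data import transforms
"""

# Import lines copied from audio/audiotools/core/__init__.py in this commit.
CORE_INIT = """
from . import util
from ._julius import fft_conv1d
from ._julius import FFTConv1D
from ._julius import highpass_filter
from ._julius import highpass_filters
from ._julius import lowpass_filter
from ._julius import LowPassFilter
from ._julius import LowPassFilters
from ._julius import pure_tone
from ._julius import resample_frac
from ._julius import split_bands
from ._julius import SplitBands
from .audio_signal import AudioSignal
from .audio_signal import STFTParams
from .loudness import Meter
"""


def imported_names(source, module=None):
    """Collect names bound by `from ... import name` statements.

    If `module` is given, keep only imports whose source module equals it.
    """
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ImportFrom):
            if module is None or node.module == module:
                names.update(alias.asname or alias.name for alias in node.names)
    return names


core_names = imported_names(CORE_INIT)
top_from_core = imported_names(TOP_LEVEL_INIT, module="core")
core_only = core_names - top_from_core

for name in sorted(core_only):
    print(name)
```

Names in `core_only`, such as `lowpass_filter` and `SplitBands`, would be reached through the `core` subpackage rather than the package root, assuming the package is importable as laid out in the diff.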