Audio de-noising using Python (Wavelets)
What is audio de-noising?
Audio de-noising is the process of removing noise (unwanted sounds or signals) from audio. You might have heard of noise-cancelling headphones, which try to remove as much of the surrounding noise as they can.
This process has various applications in media creation and learning. De-noising is not limited to audio; it is also done on images. There are various methods to do this. We will look at two methods and implement one.
Method 1: FFT (Fourier Transform)
There are several variants of the Fourier transform (FT), for example the FFT (fast FT) and the STFT (short-time FT). Each of them also has an inverse (IFFT and ISTFT respectively).
So what is it? In short, the FFT converts a signal from one representation to another; for audio, the time-series signal is converted to the frequency domain.
Put even more briefly: in the frequency domain, the dominant audio (your voice) will have more power than the non-dominant part (the noise), so the low-power frequencies can be dropped before converting back.
The following video will explain (a really good video):
Denoising with FFT (in Python)
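The FFT idea above can be sketched in a few lines of numpy. This is only a minimal illustration on a synthetic signal, not the method from the video: the function name fft_denoise, the keep_ratio parameter, and the test tones (50 Hz and 120 Hz) are all my own assumptions.

```python
import numpy as np


def fft_denoise(signal, keep_ratio=0.1):
    """Zero every frequency bin whose power is below keep_ratio * max power.

    keep_ratio is a hypothetical tuning knob chosen for this sketch.
    """
    spectrum = np.fft.rfft(signal)           # time domain -> frequency domain
    power = np.abs(spectrum) ** 2            # power of each frequency bin
    mask = power >= keep_ratio * power.max()  # keep only the dominant bins
    return np.fft.irfft(spectrum * mask, n=len(signal))  # back to time domain


# Synthetic example: two tones buried in random noise.
rate = 1000                                  # samples per second (assumed)
t = np.arange(rate) / rate                   # one second of audio
clean = np.sin(2 * np.pi * 50 * t) + 0.5 * np.sin(2 * np.pi * 120 * t)
rng = np.random.default_rng(0)
noisy = clean + rng.normal(scale=0.5, size=t.size)

denoised = fft_denoise(noisy)
print("noisy error:   ", np.mean((noisy - clean) ** 2))
print("denoised error:", np.mean((denoised - clean) ** 2))
```

A fixed power cut-off like this works when the wanted sound sits in a few strong frequencies; real speech is messier, so treat the cut-off as something to tune.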
Method 2: Wavelets (faster than FT)
These are complicated to understand but, in short, a wavelet transform decomposes the signal into levels and windows, giving a representation in which the noise can be separated out and only the important content is kept.
The following video is from the same person as above:
Wavelets and multi-resolution analysis
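To make "levels" a bit more concrete, here is a small sketch with pywt on a synthetic signal. The wavelet ("db4"), the mode ("per") and the level (3) are assumptions picked for illustration only.

```python
import numpy as np
import pywt

# Synthetic signal: a slow sine wave plus some random noise.
rng = np.random.default_rng(0)
signal = np.sin(np.linspace(0, 8 * np.pi, 1024)) + 0.2 * rng.normal(size=1024)

# Decompose into 3 levels. The result is a list:
# [approximation at level 3, detail level 3, detail level 2, detail level 1]
coeffs = pywt.wavedec(signal, "db4", mode="per", level=3)
for name, c in zip(["cA3", "cD3", "cD2", "cD1"], coeffs):
    print(name, len(c))

# Reconstructing from untouched coefficients gives the signal back.
rebuilt = pywt.waverec(coeffs, "db4", mode="per")
print("max reconstruction error:", np.max(np.abs(rebuilt - signal)))
```

The detail arrays hold the fast-changing parts of the signal, which is where most of the noise usually ends up; de-noising means shrinking those before reconstructing.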
Implementation
Most of you, like me, will jump directly into the code, so I will only explain a few things here.
Requirements:
1. pywt (installed as the PyWavelets package)
2. soundfile
3. numpy
4. tqdm (just a loading bar, you can remove it)
The function mad computes the median absolute deviation. It is available in the statsmodels package, but I didn't want to install a whole package for just one function.
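A minimal numpy-only sketch of such a function could look like this. It is not necessarily identical to the one in the linked source code, and as far as I know the statsmodels version also divides by a normalization constant by default, which this sketch does not.

```python
import numpy as np


def mad(values):
    """Median absolute deviation: the median distance from the median."""
    values = np.asarray(values, dtype=float)
    median = np.median(values)
    return np.median(np.abs(values - median))


# The outlier (100) barely affects the result, unlike a standard deviation.
print(mad([1, 2, 3, 4, 100]))
```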
- Using soundfile.info we get all the information about the file, like the sample rate (14400Hz), duration, and other metadata.
- You can also calculate the duration of the audio yourself: len(data) / rate gives the time in seconds.
- We create a numpy array to store the cleaned signal.
- Loop through the blocks of the file. Some files can be big and take up a lot of memory, so in this approach I read the file in blocks; you can change the block size (2nd parameter) to suit you.
- If the block has more than one dimension (multi-channel audio), select only a single channel (mono).
- wavedec: wavelet decomposition. This calculates the coefficients for the specified level and wavelet.
- there are many wavelets to choose from: https://pywavelets.readthedocs.io/en/latest/ref/index.html
- there are also many modes
- then calculate the median absolute deviation
- get a threshold value for noise
- apply the threshold value to all the coefficients
- finally waverec: reconstruction (make sure you use the same wavelet and mode for both decomposition and reconstruction)
- save our file and see the result.
There are many things you can do with audio, but I wanted to share this application because I couldn't find a great one.
Source code : Audio-Denoising
Comment below for any queries, mistakes, anything ;)