Previous Up Next

5.25.22  An example : static noise removal by spectral subtraction

In this section we use Xcas to inplement a simple algorithm for static noise removal based on the spectral subtraction method. For a theoretical overview see the paper "Noise Reduction Based on Modified Spectral Subtraction Method" by Ekaterina Verteletskaya and Boris Simak (2011), International Journal of Computer Science, 38:1 (PDF).

Efficiency of the spectral subtraction method is largely dependent on a good noise spectrum estimate. Below is the code for a function noiseprof that takes data and wlen as its arguments. These are, respectively, a signal chunk containing only noise and the window length for signal segmentation (the best values are powers of two, such as 256, 512 or 1024). The function returns an estimate of the noise power spectrum obtained by averaging the power spectra of a (not too large) number of distinct chunks of data of length wlen. Hamming window function is applied prior to FFT.

noiseprof(data,wlen):={
  local N,h,dx,x,v,cnt;
  N:=length(data);
  h:=wlen/2;
  dx:=min(h,max(1,(N-wlen)/100));
  v:=[0$wlen];
  cnt:=0;
  for (x:=h;x<N-h;x+=dx) {
    v+=abs(fft(hamming_window(
                 mid(data,floor(x)-h,wlen)))).^2;
    cnt++;
  };
  return 1.0/cnt*v;
}:;

The main function is noisered, which takes three arguments: the input signal data, the noise power spectrum np and the "spectral floor" parameter beta (β, the minimum power level). The function performs subtraction of the noise spectrum in chunks of length wlen (the length of list np) using the overlap-and-add approach with Hamming window function. For details see Section 3A of the paper "Speech Enhancement using Spectral Subtraction-type Algorithms: A Comparison and Simulation Study" by Navneet Upadhyay and Abhijit Karmakar (2015), Procedia Computer Science, vol. 54, pp. 574–584 (PDF).

noisered(data,np,beta):={
  local wlen,h,N,L,padded,out,j,k,s,ds,r,alpha;
  wlen:=length(np);
  N:=length(data);
  h:=wlen/2;
  L:=0;
  repeat L+=wlen; until L>=N;
  padded:=concat(data,[0$(L-N)]);
  out:=[0$L];
  for (k:=0;k<L-wlen;k+=h) {
    s:=fft(hamming_window(mid(padded,k,wlen)));
    alpha:=max(1,4-3*sum(abs(s).^2)/(20*sum(np)));
    r:=ifft(zip(max,abs(s).^2-alpha*np,beta*np).^(1/2)
            .*exp(i*arg(s)));
    for (j:=0;j<wlen;j++) {
      out[k+j]+=re(r[j]);
    };
  };
  return mid(out,0,N);
}:;

To demonstrate the efficiency of the algorithm, we test it on a small speech sample with an audible amount of static noise. Assume that the corresponding WAV file noised.wav is stored in the directory sounds. Input :

clip:=readwav("/path/to/sounds/noised.wav"):; plotwav(clip)

Output :

Speech starts after approximately 0.2 seconds of pure noise. We use that part of the clip for obtaining an estimate of the noise power spectrum with wlen set to 256. Input :

noise:=channel_data(clip,range=0.0..0.15):; np:=noiseprof(noise,256):;

Now we call the noisered function with β=0.03 :

c:=noisered(channel_data(clip),np,0.03):; cleaned:=createwav(c):; plotwav(cleaned)

Output :

It is clearly visible that the noise level is significantly lower than in the original clip. One can also use the playsnd command to compare the input with the output by hearing, which reveals that the noise is still present but in a lesser degree (the parameter β controls how much noise is "left in").

The algorithm implemented in this section is not particularly fast (removing the noise from a two and a half seconds long recording took 20 seconds of computation time), but serves as a proof of concept and demonstrates the efficiency of noise removal.


Previous Up Next