Tuesday, March 3, 2015

Reading & Plotting Audio Files in MATLAB


Problem
  
This post describes how to read an audio file into MATLAB and plot it. We will work with an audio file (calcium.wav) that contains a recording of the word "calcium". For simplicity, we will assume that the file is saved in C:\Users\Vladimir\audio_files.


Reading an Audio File and Computing Its Statistics

Let us add the folder to MATLAB's search path to reduce the amount of typing, read the file in, and play it.

addpath C:\Users\Vladimir\audio_files
[speech_vector, freq, bits_per_sample] = wavread('calcium.wav');
soundsc(speech_vector, freq);
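
In newer MATLAB releases wavread is no longer available, and audioread together with audioinfo covers the same ground. The sketch below is only an illustration under stated assumptions: it does not use calcium.wav but writes a synthetic one-second tone to a temporary file (the names fs_demo, tone, and demo_file are hypothetical) so that it can run on its own, then reads the file back.

```matlab
% Sketch: write a short synthetic tone to a temporary WAV file, then read it
% back with audioread/audioinfo (stand-ins for the wavread call above).
fs_demo = 44100;                         % assumed sampling frequency in Hz
t = (0:fs_demo-1)'/fs_demo;              % one second of time stamps
tone = 0.5*sin(2*pi*440*t);              % 440 Hz test tone, not calcium.wav
demo_file = [tempname '.wav'];           % hypothetical temporary file name
audiowrite(demo_file, tone, fs_demo, 'BitsPerSample', 16);

[speech_vector, freq] = audioread(demo_file);  % samples and sampling frequency
info = audioinfo(demo_file);                   % struct with file metadata
bits_per_sample = info.BitsPerSample;          % replaces wavread's third output
delete(demo_file);                             % clean up the temporary file
```

With the real recording, the same two calls would take 'calcium.wav' instead of demo_file, and soundsc(speech_vector, freq) would play it as before.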


The variable speech_vector holds the digitized speech samples. We can compute its length.

>> length(speech_vector)
ans =
      124928


There are 124928 samples in the vector. The second variable, freq, is the sampling frequency. Recall that freq = 1/T, where T is the duration of one sample. Let us find out what the value of freq is.

>> freq  
freq =
       44100


So the sampling frequency is 44100 Hz. The third variable, bits_per_sample, tells us the number of bits per sample. In our case, it is equal to 16.

>> bits_per_sample
bits_per_sample =
    16


Let us compute the duration of each sample T in seconds using the formula T = 1/freq.

>> 1.0/freq
ans =
   2.2676e-05



Multiplying by 1000 gives the duration of each sample in milliseconds.

>> 1.0/freq*1000
ans =
    0.0227
 


Here is how we can compute the duration of the entire file in milliseconds.

>> length(speech_vector)*1.0/freq*1000
ans =
   2.8328e+03
 


So, the duration of the entire file is about 2832.8 milliseconds, or 2.833 seconds. Figure 1 shows the file plotted in Audacity. The timeline in the figure confirms that the duration of the file is 2.83 seconds.

Figure 1. Calcium.wav plotted in Audacity

A shorter way of computing the duration of the file in seconds is to divide the length of the speech vector by its sampling frequency.

>> length(speech_vector)/freq
ans =
    2.8328
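
The arithmetic above can be collected in one place. The following is a minimal sketch that hard-codes the two values reported in this post (124928 samples and 44100 Hz) instead of reading the file; the variable names are chosen here for illustration.

```matlab
% Values reported earlier in the post (hard-coded so the sketch runs alone)
n_samples = 124928;              % length(speech_vector)
freq = 44100;                    % sampling frequency in Hz

T_s = 1/freq;                    % duration of one sample in seconds
T_ms = 1000*T_s;                 % duration of one sample in milliseconds
duration_s = n_samples/freq;     % duration of the whole file in seconds
duration_ms = 1000*duration_s;   % duration of the whole file in milliseconds

fprintf('sample: %.4f ms, file: %.4f s (%.1f ms)\n', T_ms, duration_s, duration_ms);
```

Keeping the unit in each variable name makes it harder to mix up seconds and milliseconds.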


We can play the file faster or slower. For example, this command plays the audio file twice as fast.

>> soundsc(speech_vector, 2*freq)

This command plays the audio file at half speed.


>> soundsc(speech_vector, freq/2.0)
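
Changing the playback rate changes the playback time, because the same samples are sent out at a different rate; the pitch shifts along with it. As a rough sketch, again hard-coding the values reported in this post, the expected playback times can be computed like this (about 1.42 and 5.67 seconds for the fast and slow versions):

```matlab
n_samples = 124928;                 % length(speech_vector) from the post
freq = 44100;                       % sampling frequency in Hz from the post

normal_s = n_samples/freq;          % playback time at the original rate
fast_s = n_samples/(2*freq);        % playback time at twice the rate
slow_s = n_samples/(freq/2);        % playback time at half the rate

fprintf('normal: %.3f s, fast: %.3f s, slow: %.3f s\n', normal_s, fast_s, slow_s);
```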




Plotting Audio Files

Figure 2 shows the speech vector read from the audio file, plotted with the command plot(speech_vector).


Figure 2. Output of the command plot(speech_vector)


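We can bring the plot in line with the Audacity plot in Figure 1 by changing the x-axis to seconds. Here is how.

step = 1/length(speech_vector);            % spacing of a normalized axis on [0, 1)
xval = 0:step:1-step;                      % one normalized value per sample
xval = xval*(length(speech_vector)/freq);  % scale by the file duration in seconds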

plot(xval, speech_vector);
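
A more direct way to build a time axis is to divide the sample indices by the sampling frequency. The sketch below is one possible alternative, not the method used in this post; it uses a placeholder vector of zeros with the length reported earlier, and the names N and t are introduced here for illustration.

```matlab
freq = 44100;                        % sampling frequency in Hz from the post
speech_vector = zeros(124928, 1);    % placeholder standing in for the real samples

N = length(speech_vector);           % number of samples
t = (0:N-1)/freq;                    % time stamp of each sample in seconds

% plot(t, speech_vector);            % uncomment to draw the plot
```

The first sample sits at 0 seconds and the last at (N-1)/freq seconds, which should match the scaled xval axis.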
 

Figure 3 gives the output of the above instructions.

Figure 3. Changing the x-axis to seconds

The same plot can be produced with the following one-line command, which is less readable.

plot((0:1/length(speech_vector):1-1/length(speech_vector))*(length(speech_vector)/freq), speech_vector);

Here is how we can label the x- and y-axes, add a grid and a legend, and save the plot as a PNG file.

plot(xval, speech_vector);
xlabel('Time [seconds]');
ylabel('Amplitude');
grid on;
legend('Calcium.wav');
figure(1), print -dpng 'C:\Users\Vladimir\audio_files\Calcium.png';
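
The same steps can also be written with the function form of print and an explicit figure handle, which avoids relying on figure numbers. This is only a sketch under stated assumptions: it plots a synthetic 5 Hz sine wave instead of calcium.wav and saves to a temporary file, and the names demo_signal, fig, and out_file are hypothetical.

```matlab
freq = 44100;                          % assumed sampling frequency in Hz
t = (0:freq-1)/freq;                   % one second of time stamps
demo_signal = sin(2*pi*5*t);           % synthetic signal, not calcium.wav

fig = figure('Visible', 'off');        % off-screen figure, handle kept in fig
plot(t, demo_signal);
xlabel('Time [seconds]');
ylabel('Amplitude');
grid on;
legend('demo signal');

out_file = [tempname '.png'];          % hypothetical temporary output path
print(fig, '-dpng', out_file);         % save this specific figure as PNG
close(fig);
```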