The uniform quantizer implemented here follows this pipeline: a PCM sound vector (read from a .wav file) enters the coder, which quantizes it with R bits (R-1 bits for the PCM level and 1 bit for the sign). These values are written to a .quant file, which can then be passed to a decoder to obtain a playable 16-bit .wav file (the decoder stands in for an audio player that would play the file in real time).
One of the two main functions of the Quantizer is the coder (function Quantizer in Quantizer.py). It receives the path to a wav file (originalFile), the number of bits to use for quantization (R) and the output name (codedFile). These parameters have default values to allow quick tests (originalFile is 'drumsA.wav' in the sounds directory, R is 16 and codedFile is 'yourfile.quant').
First it creates an Output directory if one does not already exist, and adjusts the output path so that the file is saved under that directory. It then informs the user through the terminal that encoding has started and reads the input wav file 'originalFile' using the wavread function from utilFunctions.
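The directory step can be sketched as follows. This is only an illustration: the helper name `output_path` is hypothetical, and stripping any directory component from the output name is an assumption about how the path is adapted.

```python
import os

def output_path(coded_file, out_dir="Output"):
    """Hypothetical helper: return the path of coded_file inside out_dir."""
    # Create the directory only if it is missing (no error if it already exists).
    os.makedirs(out_dir, exist_ok=True)
    # Assumption: keep only the file name so the result always lands in out_dir.
    return os.path.join(out_dir, os.path.basename(coded_file))
```

With the default arguments, `output_path('yourfile.quant')` would return the path of `yourfile.quant` inside the `Output` directory.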
Before quantizing the values of the loaded PCM sound, the program creates the binary file that will hold the header (the information needed to decode) and the data body. After opening the file and the stream, it writes the header with the following fields: the sampling frequency (needed to write the wav when decoding), a variable called n giving the number of bits used to write the length of the sound, the length of the sound itself (needed to iterate through the file), and the number of bits used for the quantization (needed to recover the values in floating point format from -1 to 1).
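A minimal sketch of how such a header could be written bit by bit is shown below. The `BitWriter` class and `write_header` function are hypothetical stand-ins for the stream methods the project actually uses. The sketch also assumes that bits are written most significant first, that n is the minimal number of bits needed for the length, and that the last byte is padded with zeros.

```python
import io

class BitWriter:
    """Hypothetical MSB-first bit writer over a binary stream."""

    def __init__(self, stream):
        self.stream = stream
        self.acc = 0      # bits waiting to be written
        self.nbits = 0    # how many bits are waiting

    def write(self, value, width):
        # Append the lowest `width` bits of `value`, most significant bit first.
        self.acc = (self.acc << width) | (value & ((1 << width) - 1))
        self.nbits += width
        while self.nbits >= 8:
            self.nbits -= 8
            self.stream.write(bytes([(self.acc >> self.nbits) & 0xFF]))
        self.acc &= (1 << self.nbits) - 1

    def flush(self):
        # Assumption: pad the final partial byte with zero bits.
        if self.nbits:
            self.stream.write(bytes([(self.acc << (8 - self.nbits)) & 0xFF]))
            self.acc = 0
            self.nbits = 0

def write_header(writer, fs, length, R):
    # Assumption: n is the minimal bit length of the sample count.
    n = max(1, length.bit_length())
    writer.write(fs, 16)      # sampling frequency
    writer.write(n, 16)       # bits used for the length field
    writer.write(length, n)   # length of the sound in samples
    writer.write(R, 16)       # bits used for the quantization
    return n

buf = io.BytesIO()
writer = BitWriter(buf)
n = write_header(writer, 44100, 5, 16)
writer.flush()
print(n, buf.getvalue().hex())
```

Note that a 16-bit field limits the sampling frequency to values below 65536 Hz, which covers 44100 Hz but not higher rates such as 96000 Hz.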
Then the sound vector is quantized sample by sample using a function called quantize_num, which implements a midtread quantizer (rather than a midrise one, so that 0 is quantized as 0). It returns the sign of the sample (0 = positive, 1 = negative) and the symbol of its absolute value (a number between 0 and 2^(R-1) - 1, which fits in R-1 bits). The symbol and the sign are written to the binary file using the methods described before. Every half second of encoding, a status message should appear in the terminal.
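The following is a minimal sketch of what a midtread sign-magnitude quantizer of this kind might look like. It is not the project's actual quantize_num. It assumes the input sample is a float in [-1, 1], that out-of-range values are clipped, that rounding is to the nearest level, and that R is at least 2.

```python
def quantize_num(x, R):
    """Sketch of a midtread quantizer returning (sign, symbol)."""
    levels = 2 ** (R - 1) - 1            # largest symbol, fits in R-1 bits
    sign = 1 if x < 0 else 0             # 0 = positive, 1 = negative
    magnitude = min(abs(x), 1.0)         # assumption: clip to full scale
    symbol = int(magnitude * levels + 0.5)  # round to the nearest level
    return sign, symbol

print(quantize_num(0.0, 16))   # zero maps to symbol 0 (midtread)
print(quantize_num(-1.0, 16))  # full-scale negative sample
```

Because the sign is stored separately, the value 0 has two representations (sign 0 or sign 1 with symbol 0), so the quantizer has 2^R - 1 distinct output levels rather than 2^R.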
It then flushes the stream, closes the file and, finally, returns the path to the coded file.
The bitstream format of .quant files consists of a header with the information needed to decode, followed by a body with the quantized values in binary. The header contains:
- 16 bits to represent the sampling frequency.
- 16 bits to represent the variable 'n', the number of bits used to represent the length of the sound in samples.
- n bits to represent the length of the sound in samples.
- 16 bits to represent the number of bits used for the quantization.
Total number of bits for the header: 48 + n bits.
For each sample:
- R-1 bits to represent the quantized sample.
- 1 bit to represent the sign of the quantized sample.
Total number of bits for the body: R*(length of the sound in samples).
Total number of bits for the file: 48 + n + R*(length of the sound in samples) + padding bits.
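As a worked example of these totals, the sketch below computes the size of a .quant file. The function name is hypothetical. It assumes that n is the minimal number of bits needed for the length and that the padding completes the last byte.

```python
def quant_file_size_bits(num_samples, R):
    """Return (header, body, padding, total) sizes in bits."""
    n = max(1, num_samples.bit_length())  # assumption: minimal bit length
    header = 48 + n                       # three 16-bit fields plus the length
    body = R * num_samples                # R-1 symbol bits + 1 sign bit each
    padding = -(header + body) % 8        # assumption: pad to a whole byte
    return header, body, padding, header + body + padding

# One second of audio at 44100 Hz quantized with R = 8.
print(quant_file_size_bits(44100, 8))  # (64, 352800, 0, 352864)
```

Under these assumptions, one second of 44100 Hz audio at R = 8 takes 352864 bits (44108 bytes), of which only 64 bits are header.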
The second main function of the Quantizer is the decoder (function Decoder in Quantizer.py). This function receives a path to a .quant file (codedFile).
The decoder first checks that the input file exists and, as in the coder, creates the Output directory if it does not exist. It then informs the user that decoding has started.
It opens the binary file and reads the header using the methods described. It then initialises the output sound vector and reads the symbols and signs produced by the coder, assigning the signed PCM values to the output sound vector. As this vector has values between -(2^(R-1) - 1) and 2^(R-1) - 1, it normalises the vector by a factor of 2^(R-1) - 1 to obtain floating point values from -1 to 1. Finally it writes the output sound to a wav file using wavwrite from utilFunctions.
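The reconstruction of a single sample can be sketched as below. The function name `dequantize_num` is hypothetical, and the sketch combines the sign handling and the normalisation in one step, whereas the decoder described above normalises the whole vector at the end.

```python
def dequantize_num(sign, symbol, R):
    """Map a (sign, symbol) pair back to a float in [-1, 1]."""
    levels = 2 ** (R - 1) - 1      # same scale factor the coder used
    value = symbol / levels        # normalise the magnitude to [0, 1]
    return -value if sign else value

print(dequantize_num(0, 32767, 16))  # 1.0
print(dequantize_num(1, 32767, 16))  # -1.0
```

The decoder must use the same R as the coder, which is why R is stored in the header.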