Diagram
The MDCT Band Coder is based on the following pipeline: a PCM sound vector (read from a wav file) is split into frames, and the modified discrete cosine transform (MDCT) of each frame is computed. These values enter the coder, which quantizes them with the number of bits chosen by the user. The quantized values are written to a .mdct file, which can be passed to a decoder to obtain a playable 16-bit wav.
One of the main functions of the MDCT Band coder is the coder (function MDCTcoding in MDCT.py). This function receives a path to a wav file (originalFile), the number of bands into which the MDCT is divided (numBands), the frame length (N), the number of bits per sample used for quantization (bits) and the output name (codedFile). All of these parameters have default values to allow quick tests (originalFile is drumsA.wav in the sounds directory, numBands is 5, N is 1024, bits is 16 and codedFile is yourfile.mdct).
First, it creates an Output directory if no folder with that name exists, and adapts the output file name so that it is saved under this directory. It then informs the user through the terminal that encoding has started and reads the input wav file 'originalFile' using the function wavread.
After that, the function enframes the audio signal using a triangular window, to avoid artifacts at the frame edges, and obtains the number of frames to encode.
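The enframing step can be sketched as follows. This is a minimal illustration and not the code in MDCT.py: the function name enframe, the 50% overlap (hop of N/2) and the zero padding at both ends are assumptions, since the text only states that a triangular window is used.

```python
import numpy as np

def enframe(x, N):
    """Split x into windowed frames of length N (hypothetical helper).

    Assumes a hop of N/2 samples (50% overlap) and zero padding at both ends.
    """
    hop = N // 2
    x = np.asarray(x, dtype=float)
    # Half a frame of zeros on each side so every input sample is covered twice.
    x = np.concatenate([np.zeros(hop), x, np.zeros(hop)])
    num_frames = int(np.ceil((len(x) - N) / hop)) + 1
    # Extra zeros so that the last frame is complete.
    padded = np.zeros((num_frames - 1) * hop + N)
    padded[:len(x)] = x
    # Periodic triangular window: w[n] + w[n + N/2] == 1 for 0 <= n < N/2.
    n = np.arange(N)
    window = 1.0 - np.abs(2.0 * n / N - 1.0)
    frames = np.stack([padded[i * hop:i * hop + N] * window
                       for i in range(num_frames)])
    return frames

frames = enframe(np.ones(4096), 1024)
print(frames.shape)  # (number of frames, frame length)
```

With this choice of window and hop, the window weights of the two frames covering any input sample add up to one.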
Before encoding the values of the loaded PCM sound, the function creates a binary file using the functions defined in the first section. It writes to this file a header with the information needed to decode the body data: the sampling frequency (needed to write the wav file when decoding), the frame length (needed to unframe the signal), the number of bits used for the quantization, the number of band divisions of the MDCT and the number of frames in the file.
Then, for every frame, it computes the MDCT, normalized to avoid clipping. It splits the MDCT into the defined number of bands, applies a gain to each band to make better use of the quantization range, quantizes every sample using the function midtread_quantizer_2 and finally writes the quantized MDCT data to the file.
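The per-frame processing can be sketched as below. Everything here is an assumption made for illustration: the 1/N normalization of the MDCT, the fixed per-band gains passed in as gains, and the uniform mid-tread mapping in midtread_quantize, which only stands in for midtread_quantizer_2 (whose exact behavior is not described here). Fixed gains are used because the bitstream carries no per-frame gain values.

```python
import numpy as np

def mdct(frame):
    """Direct O(N^2) MDCT of one frame, scaled by 1/N (assumed normalization)."""
    frame = np.asarray(frame, dtype=float)
    N = len(frame)
    n = np.arange(N)
    k = np.arange(N // 2)
    basis = np.cos(2.0 * np.pi / N * np.outer(k + 0.5, n + 0.5 + N / 4.0))
    # With |frame[n]| <= 1, the 1/N factor keeps every coefficient in [-1, 1].
    return basis @ frame / N

def midtread_quantize(x, bits):
    """Uniform mid-tread quantizer on [-1, 1]: zero is a reconstruction level."""
    max_code = 2 ** (bits - 1) - 1
    return np.round(np.clip(x, -1.0, 1.0) * max_code).astype(int)

def encode_frame(frame, num_bands, bits, gains):
    """MDCT, band split, per-band gain and quantization for one frame."""
    coeffs = mdct(frame)
    bands = np.array_split(coeffs, num_bands)
    scaled = [band * g for band, g in zip(bands, gains)]
    return np.concatenate([midtread_quantize(b, bits) for b in scaled])

frame = np.sin(2.0 * np.pi * 3 * np.arange(16) / 16)
codes = encode_frame(frame, num_bands=4, bits=8, gains=[1, 2, 4, 8])
print(codes)  # N/2 signed integer codes
```

The clipping inside midtread_quantize guards against a gain pushing a coefficient outside the quantizer range.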
Every 10 frames, the encoding status is displayed in the terminal. A message is also displayed when encoding has finished.
Then it flushes the stream and closes the file. Finally, the function returns the path to the coded file.
The bitstream format of the .mdct files consists of a header with the information needed to decode, followed by a body with the quantized frames. The header contains:
- 16 bits to represent the sampling frequency.
- 12 bits to represent the length of each frame (N).
- 8 bits to represent the number of bits used for the quantization (R).
- 8 bits to represent the number of band divisions of the MDCT.
- 26 bits to represent the number of frames of the file.
Total number of bits for the header: 70 bits.
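As an illustration of the header layout, the sketch below packs the five fields into a 70-character string of bits and reads them back. The field order follows the list above, but the most-significant-bit-first encoding and the names HEADER_FIELDS, pack_header and unpack_header are assumptions. The actual file is written with the binary file functions of the project.

```python
# (field name, width in bits), in the order listed above.
HEADER_FIELDS = [
    ("fs", 16),
    ("N", 12),
    ("bits", 8),
    ("num_bands", 8),
    ("num_frames", 26),
]

def pack_header(values):
    """Return the header as a string of '0'/'1' characters (MSB first, assumed)."""
    bitstring = ""
    for name, width in HEADER_FIELDS:
        value = values[name]
        if not 0 <= value < 2 ** width:
            raise ValueError("%s=%d does not fit in %d bits" % (name, value, width))
        bitstring += format(value, "0%db" % width)
    return bitstring

def unpack_header(bitstring):
    """Inverse of pack_header: slice the bit string field by field."""
    values, pos = {}, 0
    for name, width in HEADER_FIELDS:
        values[name] = int(bitstring[pos:pos + width], 2)
        pos += width
    return values

header = {"fs": 44100, "N": 1024, "bits": 16, "num_bands": 5, "num_frames": 200}
packed = pack_header(header)
print(len(packed))            # 70
print(unpack_header(packed))  # same values as header
```

The field widths set the limits of the format: a 16-bit field holds sampling frequencies up to 65535 Hz, and a 12-bit field holds frame lengths up to 4095.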
For each frame:
- R*(N/2) bits for the quantized MDCT.
Total number of bits for the body = (number of frames)*(R*N/2) bits.
Total number of bits for the file: 70 + (number of frames)*(R*N/2) + padding bits.
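The size formula can be checked with a short calculation. The sketch assumes that the padding bits complete the last byte of the file, which is not stated above. The function name mdct_file_size is hypothetical.

```python
def mdct_file_size(num_frames, N, bits, header_bits=70):
    """Return (payload bits, padding bits, total bytes) for a .mdct file.

    Assumes the padding only completes the last byte.
    """
    body_bits = num_frames * bits * (N // 2)
    payload_bits = header_bits + body_bits
    padding_bits = (-payload_bits) % 8
    total_bytes = (payload_bits + padding_bits) // 8
    return payload_bits, padding_bits, total_bytes

# 100 frames with the default N = 1024 and bits = 16.
print(mdct_file_size(100, 1024, 16))  # (819270, 2, 102409)
```

For these values the body takes 100 * 16 * 512 = 819200 bits, so the file holds 819270 bits of data plus 2 padding bits, or 102409 bytes.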
The other main function of the MDCT Band coder is the decoder (function MDCTdecoding in MDCT.py). This function receives a path to a .mdct file (filename).
The decoder first checks that the input file exists and creates the Output directory if it does not exist. Then it informs the user that decoding has started.
Then the function dequantizes the signal using the function midtrad_dequantizer, undoes the gain applied to each band and applies the inverse MDCT. As in the coder, a status message is displayed in the terminal while decoding.
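The dequantization and gain recovery for one frame can be sketched as follows. The uniform mid-tread mapping, the fixed per-band gains and the names midtread_dequantize and decode_bands are assumptions for illustration. They are not the dequantizer of the project, and the inverse MDCT that follows is left out.

```python
import numpy as np

def midtread_dequantize(codes, bits):
    """Map signed integer codes back to [-1, 1] (assumed uniform mid-tread)."""
    max_code = 2 ** (bits - 1) - 1
    return np.asarray(codes, dtype=float) / max_code

def decode_bands(codes, num_bands, bits, gains):
    """Dequantize one frame of codes and undo the per-band gains."""
    values = midtread_dequantize(codes, bits)
    bands = np.array_split(values, num_bands)
    # The decoder must use the same band split and the same gains as the coder.
    return np.concatenate([band / g for band, g in zip(bands, gains)])

coeffs = decode_bands([127, -127, 0, 64], num_bands=2, bits=8, gains=[1, 2])
print(coeffs)  # recovered MDCT coefficients, ready for the inverse MDCT
```

Dividing by the gain after dequantization also divides the quantization error, which is why boosting low-level bands before quantization makes better use of the available bits.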
After that it joins all frames into an output vector and checks for clipping; if the signal clips, it normalizes the output sound vector. Finally it writes the output sound to a wav file using the function wavwrite and returns the path of the decoded file.
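The last two steps can be sketched as below, under assumptions: the frames overlap by half their length and are joined by overlap-add, clipping means a peak above 1.0, and the samples are scaled to 16-bit integers before writing. The names overlap_add and to_int16 are hypothetical, and writing the file with wavwrite is left out.

```python
import numpy as np

def overlap_add(frames):
    """Join frames of length N with a hop of N/2 (assumed) into one vector."""
    num_frames, N = frames.shape
    hop = N // 2
    out = np.zeros((num_frames - 1) * hop + N)
    for i, frame in enumerate(frames):
        out[i * hop:i * hop + N] += frame
    return out

def to_int16(x):
    """Normalize only if the signal clips, then scale to 16-bit integers."""
    x = np.asarray(x, dtype=float)
    peak = np.max(np.abs(x)) if len(x) else 0.0
    if peak > 1.0:
        x = x / peak  # bring the loudest sample back to full scale
    return np.round(x * 32767).astype(np.int16)

signal = overlap_add(np.ones((3, 4)))
print(signal)            # [1. 1. 2. 2. 2. 2. 1. 1.]
print(to_int16(signal))  # peak of 2.0 is normalized to 32767
```

Normalizing only when the peak exceeds full scale leaves the level of files that do not clip unchanged.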