Posted 29 Sep 2014

The OGG Wrapper (An audio converter)

29 Sep 2014 · CPOL · 9 min read
A wrapper for the libvorbis library that reduces the conversion of PCM (*.wav) to Ogg Vorbis audio files (*.ogg), and vice versa, to just two lines of code. It also allows the conversion of stereo PCM to mono Vorbis and vice versa.

Introduction

The MP3 format gained popularity in the mid-1990s, but in 1998 the Fraunhofer Society announced plans to start charging licensing fees for its use. This prompted the Xiph.Org Foundation (home of Ogg Vorbis, FLAC, Theora, etc.) to intensify work on the already ongoing Vorbis project, aiming to make it a free and open-source replacement for MP3. It did not, and has not, replaced MP3, but it has become a standard and decent format in its own right.

Its free nature, and the fact that it is not encumbered with patents like MP3, has made it popular in both open and closed source works such as WebM (a video format for HTML5 video), Matroska (*.mkv), and numerous games.

The libvorbis library

The libvorbis library can be downloaded from http://www.vorbis.com/ or http://www.xiph.org/downloads/ or from this article’s attachment.

The wrapper

Working directly with the libvorbis library can be very complicated. Unlike the LAME MP3 library, which has built-in functions for some crucial tasks, libvorbis leaves many crucial tasks in the hands of the coder. These tasks include, but are not limited to, interleaving of samples, channel mixing, and setting the encode mode (VBR, CBR, ABR).

This wrapper will help by

  1. Reducing encoding/decoding to just two lines of code (if you are using the wrapper's default values)
  2. Making setting of parameters very easy
  3. Mixing of channels (Encoding a mono PCM to stereo ogg, stereo PCM to mono ogg, etc)
  4. Etc.

NOTE that the libvorbis library is written in such a way that an ogg will be decoded to a PCM having the same number of channels (a mono ogg will decode to a mono PCM).

Setting up your environment

If you already know how to set up your development environment to compile with libvorbis, please skip to the second part of this section.

I will be using Visual Studio; if you are using another IDE, please find out how to link a static library to your project.

  1. Grab the libvorbis library archive from this article and extract it.
  2. Click “Project” on the Visual Studio menu bar and select Property Pages from the drop-down menu (<Your Project Name> Property Pages), then go to the “Configuration Properties” section. Go to the “Linker” subsection, then to the “General” item. Select “Additional Library Directories” from the list of available options and add the extracted archive path to it.
  3. Go to the “Input” item and select “Additional Dependencies”. Add libogg_static.lib, libvorbis_static.lib and libvorbisfile_static.lib to it (each on a separate line).
  4. Now go back to the “Configuration Properties” section, then to the “C/C++” subsection, then to the “General” item. Select “Additional Include Directories” from the list of available options and add the extracted archive path to it.

The environment is now ready for libvorbis.

Secondly, grab the oggHelper_dd-mm-yyyy archive and extract it, then:

  1. Add all the files (AudioSettings.h, oggHelper.cpp, oggHelper.h, OggHelper_VorbisSettings.cpp, OggHelper_VorbisSettings.h, WaveFileHeader.cpp and WaveFileHeader.h) to your project.
  2. #include oggHelper.h in your source file:
C++
#include "oggHelper.h"

Using the wrapper

To use the wrapper, create an instance of it as follows:

C++
#include "oggHelper.h"
int main()
{
 oggHelper oHelper;
 return 0;
}

Encoding (Converting from PCM to ogg)

There are five overloaded member functions for the encoding

C++
BOOL Encode(char* file_in, char* file_out);
BOOL Encode(char* file_in, char* file_out, EncodeSetting es);
BOOL Encode(char* file_in, char* file_out, EncodeSetting es, VorbisComment ivc);
BOOL Encode(char* file_in, char* file_out, EncodeSetting es, VorbisComment ivc, WNDPROC callbackproc);
//The asynchronous function
void* Encode(char* file_in, char* file_out, EncodeSetting es, VorbisComment ivc, WNDPROC callbackproc, BOOL async);

Encode(char* file_in, char* file_out);

This is the Encode function with the fewest parameters: the PCM file path as file_in and the resulting ogg file path as file_out. It uses the wrapper's default settings: stereo channel, VBR encode mode with a VBR quality of 0.4 (and, just for completeness, a bitrate of 128 kbps for ABR), with all comments set to an empty string.

Parameters

  • file_in: the path to the PCM (*.wav) file (including the file name)
  • file_out: the output path for the resulting ogg (including the file name)

Return Values

The member function returns 1 on success, 0 on failure.

A usage example

C++
#include "oggHelper.h"
int main()
{
	oggHelper oHelper;
	oHelper.Encode("file.wav", "file.ogg");
	
	return 0;
}

Encode(char* file_in, char* file_out, EncodeSetting es);

This overload of Encode allows the encoding environment to be set via the EncodeSetting es parameter. The EncodeSetting struct is

C++
//Encoding setting
struct EncodeSetting
{
	Channel channel;
	Encode_Mode encode_mode;
	Bitrate min_abr_br;
	Bitrate max_abr_br;
	Bitrate abr_br;
	Bitrate cbr_br;
	VBR_Quality vbr_quality;

	//The constructor: used to set default values
	EncodeSetting();
};

  • channel is an enum Channel whose value is either Channel::Stereo or Channel::Mono
  • encode_mode is an enum Encode_Mode whose value is one of Encode_Mode::VBR, Encode_Mode::ABR, or Encode_Mode::CBR
  • min_abr_br, max_abr_br, abr_br, and cbr_br are the bitrates for ABR and CBR, all of the enum type Bitrate. The most commonly used bitrate is BR_128kbps
  • vbr_quality is the quality value used in VBR mode. It should be between -0.1 and 1
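
As a small illustration of the vbr_quality range, the sketch below clamps a requested quality into -0.1 to 1.0 before it would be assigned to EncodeSetting::vbr_quality. The helper name clampVbrQuality is hypothetical and is not part of the wrapper; it also assumes that VBR_Quality is a floating-point type. Whether the wrapper itself clamps or rejects out-of-range values is not stated in this article.

```cpp
#include <algorithm>

// Hypothetical helper (not part of oggHelper): keeps a requested VBR
// quality inside the documented range of -0.1 to 1.0.
float clampVbrQuality(float requested)
{
    const float kMinQuality = -0.1f;
    const float kMaxQuality = 1.0f;
    return std::min(kMaxQuality, std::max(kMinQuality, requested));
}
```

It could then be used as es.vbr_quality = clampVbrQuality(userValue); before calling Encode.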

Parameters

  • file_in: the path to the PCM (*.wav) file (including the file name)
  • file_out: the output path for the resulting ogg (including the file name)
  • es: object of the struct EncodeSetting which is used to specify encoding settings

Return Values

The member function returns 1 on success, 0 on failure.

A usage example

C++
#include "oggHelper.h"
int main()
{
	oggHelper oHelper;
	EncodeSetting es;
	es.channel = Stereo;
	es.cbr_br = BR_128kbps;
	es.encode_mode = CBR;

	oHelper.Encode("file.wav", "file.ogg", es);
	
	return 0;
}

Encode(char* file_in, char* file_out, EncodeSetting es, VorbisComment ivc);

This overload allows comments to be set alongside the settings described previously. Supported comments include TITLE, VERSION, ALBUM, TRACKNUMBER, ARTIST, PERFORMER, COPYRIGHT, LICENSE, ORGANISATION, DESCRIPTION, GENRE, DATE, LOCATION, CONTACT, and ISRC.

C++
struct VorbisComment
{
	char* TITLE;
	char* VERSION;
	char* ALBUM;
	char* TRACKNUMBER;
	char* ARTIST;
	char* PERFORMER;
	char* COPYRIGHT;
	char* LICENSE;
	char* ORGANISATION;
	char* DESCRIPTION;
	char* GENRE;
	char* DATE;
	char* LOCATION;
	char* CONTACT;
	char* ISRC;

	//The constructor: used to set default values
	VorbisComment();
};
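
In an Ogg Vorbis file, each comment is stored as a NAME=value string. The sketch below shows that pairing for a handful of fields, skipping empty values. It is only an illustration built on standard containers: the function name buildCommentLines and its std::map input are assumptions, not the wrapper's actual internals.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: turns field/value pairs into "NAME=value" strings,
// the form Vorbis comments take in the file. Empty values are skipped.
std::vector<std::string> buildCommentLines(
    const std::map<std::string, std::string>& fields)
{
    std::vector<std::string> lines;
    for (std::map<std::string, std::string>::const_iterator it = fields.begin();
         it != fields.end(); ++it)
    {
        if (!it->second.empty())
            lines.push_back(it->first + "=" + it->second);
    }
    return lines;
}
```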

Parameters

  • file_in: the path to the PCM (*.wav) file (including the file name)
  • file_out: the output path for the resulting ogg (including the file name)
  • es: object of the struct EncodeSetting which is used to specify encoding settings
  • ivc: object of the struct VorbisComment which is used to set comment for the audio file.

Return Values

The member function returns 1 on success, 0 on failure.

A usage example

C++
#include "oggHelper.h"
int main() 
{ 
	oggHelper oHelper; 
	//Encode setting 
	EncodeSetting es; 
	es.channel = Stereo; 
	es.cbr_br = BR_128kbps; 
	es.encode_mode = CBR; 

	//Comment setting 
	VorbisComment ivc; 
	ivc.ALBUM = "Beautiful Imperfection"; 
	ivc.ARTIST = "Asa"; 
	ivc.DATE = "2011"; 
	
	oHelper.Encode("file.wav", "file.ogg", es, ivc); 
	
	return 0; 
}

BOOL Encode(char* file_in, char* file_out, EncodeSetting es, VorbisComment ivc, WNDPROC callbackproc);

This method includes a callback procedure but it is not asynchronous (i.e. it will still halt your application until encoding is complete).

Parameters

  • file_in: the path to the PCM (*.wav) file (including the file name)
  • file_out: the output path for the resulting ogg (including the file name)
  • es: object of the struct EncodeSetting which is used to specify encoding settings
  • ivc: object of the struct VorbisComment which is used to set comment for the audio file.
  • callbackproc: the callback function

Return Values

The member function returns 1 on success, 0 on failure.

Notes

In WNDPROC callbackproc, msg receives an OH_STARTED message at the start of encoding/decoding, an OH_COMPUTED message with wParam holding the percentage of progress (as an int) during encoding/decoding, and an OH_DONE message at the end of encoding/decoding. It receives OH_ERROR if any error occurs, with wParam holding the error code.

A usage example

C++
#include <stdio.h>
#include "oggHelper.h"

//WNDPROC expects a callback that returns LRESULT
LRESULT CALLBACK proc(HWND hwnd, UINT msg, WPARAM wParam, LPARAM lParam)
{
	switch(msg)
	{
	case OH_STARTED:
		//Start of encoding / decoding
		printf("Starting encoding ");
		break;
	case OH_COMPUTED:
		//Update of percentage done
		//wParam contains the percentage as int
		//the best way to use this is to pass wParam's value into a progress bar
		printf("%i ", (int)wParam);
		break;
	case OH_DONE:
		//Notifying end of encoding / decoding
		printf("Completed successfully");
		break;
	case OH_ERROR:
		//Error occurred
		printf("Error code = %i\n", (int)wParam);
		break;
	}
	return 0;
}


int main()
{
	oggHelper oHelper;
	//Encode setting
	EncodeSetting es;
	es.channel = Stereo;
	es.cbr_br = BR_128kbps;
	es.encode_mode = CBR;

	//Comment setting
	VorbisComment ivc;
	ivc.ALBUM = "Beautiful Imperfection";
	ivc.ARTIST = "Asa";
	ivc.DATE = "2011";

	oHelper.Encode("file.wav", "file.ogg", es, ivc, proc);
	
	return 0;
}

void* Encode(char* file_in, char* file_out, EncodeSetting es, VorbisComment ivc, WNDPROC callbackproc, BOOL async);

This is the asynchronous overload of Encode (i.e. when used, it won't halt the application, and up to 5 files can be encoded at the same time).

Parameters

  • file_in: the path to the PCM (*.wav) file (including the file name)
  • file_out: the output path for the resulting ogg (including the file name)
  • es: object of the struct EncodeSetting which is used to specify encoding settings
  • ivc: object of the struct VorbisComment which is used to set comment for the audio file.
  • callbackproc: the callback function
  • async: a Boolean, if TRUE the function will be asynchronous and if FALSE the function will behave exactly like the member BOOL Encode(char* file_in, char* file_out, EncodeSetting es, VorbisComment ivc, WNDPROC callbackproc);. Default is FALSE

Return Values

If async is set to TRUE, the function returns a HANDLE if successful, or -3 if the maximum allowed number of simultaneous processes has been reached. If any other error occurs, an OH_ERROR message is sent with wParam holding the error code.

If async is set to FALSE, the member function returns 1 on success, 0 on failure.
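
Because the asynchronous overload returns a void* that may carry the value -3 instead of a real handle, it is worth checking the result before waiting on it. The sketch below shows one portable way to make that comparison. The names kTooManyJobs and isJobLimitReached are hypothetical, and the sketch assumes that -3 is returned as a pointer-sized integer cast to void*.

```cpp
#include <cstdint>

// Assumed sentinel: the article states that -3 is returned when the
// maximum number of simultaneous jobs has been reached.
const std::intptr_t kTooManyJobs = -3;

// Hypothetical helper: true when the value returned by the asynchronous
// Encode/Decode is the -3 sentinel rather than a usable handle.
bool isJobLimitReached(void* returned)
{
    return reinterpret_cast<std::intptr_t>(returned) == kTooManyJobs;
}
```

Only values for which isJobLimitReached returns false should then be passed on to WaitForMultipleObjects.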

A usage example

C++
#include <stdio.h>
#include "oggHelper.h"

//WNDPROC expects a callback that returns LRESULT
LRESULT CALLBACK proc1(HWND hwnd, UINT msg, WPARAM wParam, LPARAM lParam)
{
	switch(msg)
	{
	case OH_STARTED:
		//Start of encoding / decoding
		printf("Starting encoding ");
		break;
	case OH_COMPUTED:
		//Update of percentage done
		//wParam contains the percentage as int
		//the best way to use this is to pass wParam's value into a progress bar
		printf("%i ", (int)wParam);
		break;
	case OH_DONE:
		//Notifying end of encoding / decoding
		printf("Completed successfully");
		break;
	case OH_ERROR:
		//Error occurred
		printf("Error code = %i\n", (int)wParam);
		break;
	}
	return 0;
}

//Placeholders: write a full-blown callback like proc1 in real code
LRESULT CALLBACK proc2(HWND hwnd, UINT msg, WPARAM wParam, LPARAM lParam){return 0;}
LRESULT CALLBACK proc3(HWND hwnd, UINT msg, WPARAM wParam, LPARAM lParam){return 0;}

int main()
{
	
	//Encode setting
	EncodeSetting es;
	es.channel = Stereo;
	es.cbr_br = BR_128kbps;
	es.encode_mode = CBR;

	//Comment setting
	VorbisComment ivc;
	ivc.ALBUM = "Beautiful Imperfection";
	ivc.ARTIST = "Asa";
	ivc.DATE = "2011";

	//Handles
	HANDLE oHelperHandle[5];
	
	oggHelper oHelper;
	
	oHelperHandle[0] = oHelper.Encode("file1.wav", "file1.ogg", es, ivc, proc1, true);
	oHelperHandle[1] = oHelper.Encode("file2.wav", "file2.ogg", es, ivc, proc2, true);
	oHelperHandle[2] = oHelper.Encode("file3.wav", "file3.ogg", es, ivc, proc3, true);
	
	WaitForMultipleObjects(3, oHelperHandle, TRUE, INFINITE);
	
	return 0;
}

Decoding (Conversion of vorbis ogg to PCM)

As said earlier, libvorbis is written in such a way that a Vorbis ogg will decode to a PCM having the same channel setting and sample rate. The wrapper has three overloaded member functions.

C++
//Decode OGG to PCM (with a WAVE header)
BOOL Decode(char* file_in, char* file_out);
BOOL Decode(char* file_in, char* file_out, WNDPROC callbackproc);
//Async function
void* Decode(char* file_in, char* file_out, WNDPROC callbackproc, BOOL async);

BOOL Decode(char* file_in, char* file_out);

This takes two arguments: char* file_in and char* file_out.

Parameters

  • file_in: the path to the ogg (*.ogg) file (including the file name)
  • file_out: the output path for the resulting PCM (including the file name)

Return Values

The member function returns 1 on success, 0 on failure.

A usage example

C++
#include "oggHelper.h"
int main()
{
	oggHelper oHelper;
	oHelper.Decode("file.ogg", "file.wav"); 
	return 0;
}

BOOL Decode(char* file_in, char* file_out, WNDPROC callbackproc);

This method includes a callback procedure but it is not asynchronous (i.e. it will still halt your application until decoding is complete).

Parameters

  • file_in: the path to the ogg (*.ogg) file (including the file name)
  • file_out: the output path for the resulting PCM (including the file name)
  • callbackproc: the callback function

Return Values

The member function returns 1 on success, 0 on failure.

A usage example

C++
#include <stdio.h>
#include "oggHelper.h"

//WNDPROC expects a callback that returns LRESULT
LRESULT CALLBACK proc(HWND hwnd, UINT msg, WPARAM wParam, LPARAM lParam)
{
	switch(msg)
	{
	case OH_STARTED:
		//Start of encoding / decoding
		printf("Starting decoding ");
		break;
	case OH_COMPUTED:
		//Update of percentage done
		//wParam contains the percentage as int
		//the best way to use this is to pass wParam's value into a progress bar
		printf("%i ", (int)wParam);
		break;
	case OH_DONE:
		//Notifying end of encoding / decoding
		printf("Completed successfully");
		break;
	case OH_ERROR:
		//Error occurred
		printf("Error code = %i\n", (int)wParam);
		break;
	}
	return 0;
}


int main()
{
	oggHelper oHelper;
	oHelper.Decode("file.ogg", "file.wav", proc);

	system("pause");
	return 0;
}

void* Decode(char* file_in, char* file_out, WNDPROC callbackproc, BOOL async);

This member function is truly asynchronous when async is set to TRUE.

Parameters

  • file_in: the path to the ogg (*.ogg) file (including the file name)
  • file_out: the output path for the resulting PCM (including the file name)
  • callbackproc: the callback function
  • async: a Boolean, if TRUE the function will be asynchronous and if FALSE the function will behave exactly like the member BOOL Decode(char* file_in, char* file_out, WNDPROC callbackproc);

Return Values

If async is set to TRUE, the function returns a HANDLE if successful, or -3 if the maximum allowed number of simultaneous processes has been reached. If any other error occurs, an OH_ERROR message is sent with wParam holding the error code.

If async is set to FALSE, the member function returns 1 on success, 0 on failure.

A usage example

C++
#include <stdio.h>
#include "oggHelper.h"

//WNDPROC expects a callback that returns LRESULT
LRESULT CALLBACK proc1(HWND hwnd, UINT msg, WPARAM wParam, LPARAM lParam)
{
	switch(msg)
	{
	case OH_STARTED:
		//Start of encoding / decoding
		printf("Starting decoding ");
		break;
	case OH_COMPUTED:
		//Update of percentage done
		//wParam contains the percentage as int
		//the best way to use this is to pass wParam's value into a progress bar
		printf("%i ", (int)wParam);
		break;
	case OH_DONE:
		//Notifying end of encoding / decoding
		printf("Completed successfully");
		break;
	case OH_ERROR:
		//Error occurred
		printf("Error code = %i\n", (int)wParam);
		break;
	}
	return 0;
}


int main()
{
	oggHelper oHelper;
	HANDLE oHelperHandle[5];
	oHelperHandle[0] = oHelper.Decode("file.ogg", "file.wav", proc1, true);

	WaitForMultipleObjects(1, oHelperHandle, TRUE, INFINITE);
	system("pause");
	return 0;
}

Understanding the code

Understanding the channel mixing (Working with the 16bits per sample PCM)

The PCM data layout is based on the number of channels and the bits per sample of the PCM.

A 16bits per sample stereo PCM

A 16-bit-per-sample stereo PCM stores each sample in a section of 4 bytes, which is (16 (bits) * 2 (channels)) / 8. Since both the left and the right channel have to be represented, the left channel takes 2 bytes and the right channel takes the other 2 bytes.

[Left Channel 2 byte][Right Channel 2 byte]
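
The size calculation generalises to any channel count and sample width. A minimal sketch follows; the function name bytesPerFrame is illustrative only and does not come from the wrapper.

```cpp
// Illustrative helper: size in bytes of one PCM sample section,
// i.e. (bits per sample * number of channels) / 8.
unsigned bytesPerFrame(unsigned bitsPerSample, unsigned channels)
{
    return (bitsPerSample * channels) / 8;
}
```

For 16 bits and 2 channels this gives 4 bytes, and for 16 bits and 1 channel it gives 2 bytes.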

In a .wav file (which is what we are dealing with), the data is stored in little-endian format, which means the Least Significant Byte (LSB) is stored first while the Most Significant Byte (MSB) is stored last. So the format becomes

[Left LSB][Left MSB][Right LSB][Right MSB]
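
To make the byte order concrete, this sketch rebuilds the two signed 16-bit values from one 4-byte stereo section. The struct and function names are hypothetical. The bytes are taken as unsigned char here so that the shift and the sign handling are explicit.

```cpp
#include <cstdint>

// Illustrative container for one decoded stereo sample.
struct StereoFrame
{
    std::int16_t left;
    std::int16_t right;
};

// Combines [LSB][MSB] into a signed 16-bit value.
std::int16_t readLittleEndian16(const unsigned char* bytes)
{
    const int value = bytes[0] | (bytes[1] << 8);            // 0..65535
    return static_cast<std::int16_t>(value >= 0x8000 ? value - 0x10000 : value);
}

// frame must point at 4 bytes: [Left LSB][Left MSB][Right LSB][Right MSB].
StereoFrame decodeStereoFrame(const unsigned char* frame)
{
    StereoFrame out;
    out.left = readLittleEndian16(frame);
    out.right = readLittleEndian16(frame + 2);
    return out;
}
```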

A 16bits per sample mono PCM

A 16-bit-per-sample mono PCM stores each sample in a section of 2 bytes, which is (16 (bits) * 1 (channel)) / 8. Since the sample has only one channel, that channel takes the whole 2 bytes.

[Mono channel 2byte]

In .wav little endian format, it is stored as

[Mono LSB][Mono MSB]

Note that libvorbis expects the samples as floating-point values rather than 16-bit integers.
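
Converting a signed 16-bit sample to float amounts to dividing it by 32768, which maps it into the range [-1.0, 1.0). A minimal sketch, with a hypothetical function name:

```cpp
#include <cstdint>

// Scales a signed 16-bit PCM sample to a float in [-1.0, 1.0).
float pcm16ToFloat(std::int16_t sample)
{
    return sample / 32768.0f;
}
```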

Mixing the channels

Stereo to Mono sampling

To get a mono sample out of a stereo sample, the 4 bytes of the stereo sample have to be reduced to 2 bytes. This can be done by getting the left and right channels from the stereo sample, adding them together and dividing the sum by 2.

Say, the 4 bytes of the stereo sample are read into readbuffer, to get the left and right channel out (remember LSB, MSB).

C++
lChannel = ((readbuffer[1]<<8) | (0x00ff & (int)readbuffer[0])) / 32768.f;
rChannel = ((readbuffer[3]<<8) | (0x00ff & (int)readbuffer[2])) / 32768.f;

To get the mono channel,

C++
mChannel = (lChannel + rChannel) * 0.5f;
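
Applied to a whole buffer, the same arithmetic looks roughly like the sketch below. It is a standalone illustration using std::vector, not the code from oggHelper.cpp, and the function names are assumptions.

```cpp
#include <cstddef>
#include <vector>

// Rebuilds a signed 16-bit little-endian sample and scales it to a float.
float sampleToFloat(unsigned char lsb, unsigned char msb)
{
    const int value = lsb | (msb << 8);                      // 0..65535
    const int signedValue = value >= 0x8000 ? value - 0x10000 : value;
    return signedValue / 32768.0f;
}

// Downmixes interleaved 16-bit stereo PCM bytes to mono float samples.
// Trailing bytes that do not form a complete 4-byte section are ignored.
std::vector<float> stereoToMono(const std::vector<unsigned char>& pcm)
{
    std::vector<float> mono;
    mono.reserve(pcm.size() / 4);
    for (std::size_t i = 0; i + 3 < pcm.size(); i += 4)
    {
        const float left = sampleToFloat(pcm[i], pcm[i + 1]);
        const float right = sampleToFloat(pcm[i + 2], pcm[i + 3]);
        mono.push_back((left + right) * 0.5f);
    }
    return mono;
}
```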

Mono to Stereo sampling

To convert a mono sample to stereo, extract the mono channel and set it as both the left and the right channel.

Say, the 2 bytes of the mono sample are read into readbuffer,

C++
monoChl = ((readbuffer[1]<<8) | (0x00ff & (int)readbuffer[0])) / 32768.f;
lChannel = monoChl;
rChannel = monoChl;
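
For a whole buffer, the duplication can be sketched as follows. libvorbis takes one float array per channel, so the result here is kept as two separate vectors. The struct and function names are hypothetical and the code is an illustration, not an excerpt from oggHelper.cpp.

```cpp
#include <cstddef>
#include <vector>

// Illustrative holder for the two output channels.
struct StereoFloats
{
    std::vector<float> left;
    std::vector<float> right;
};

// Expands 16-bit little-endian mono PCM bytes into two identical channels.
// A trailing odd byte, if any, is ignored.
StereoFloats monoToStereo(const std::vector<unsigned char>& pcm)
{
    StereoFloats out;
    for (std::size_t i = 0; i + 1 < pcm.size(); i += 2)
    {
        const int raw = pcm[i] | (pcm[i + 1] << 8);          // 0..65535
        const int sample = raw >= 0x8000 ? raw - 0x10000 : raw;
        const float mono = sample / 32768.0f;
        out.left.push_back(mono);
        out.right.push_back(mono);
    }
    return out;
}
```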

History

  • 22nd October, 2014 - Included "Understanding the code" and updated a section of oggHelper.cpp file
  • 29th of September, 2014 - Initial article release

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Oso Oluwafemi Ebenezer
Software Developer
Nigeria
A graduate of Agricultural Engineering from Ladoke Akintola University of Technology, Ogbomoso, but computer and web programming is his first love. You can meet him on Facebook at Osofem Inc.
