Cheat Engine

Deltron Z · Posted: Sat Apr 10, 2010 4:26 pm Post subject: Analyzing Audio Files

I've been investigating audio files recently and I've managed to read Wave Files successfuly and draw the wave:
http://up203.siz.co.il/up2/ctihtw4njoto.jpg
The blue one is mine. Smile

I've also compared it with another program which my friend found later that day I finished coding mine (this one. this control is pretty basic, doesn't do much. source codes are almost 1:1, so I won't include mine in case it's needed) and it's a 100% match with mine.

Anyway, I'm trying to take one step at a time. before I'll handle MP3 files and more than one channel I will only handle simple .WAV files, while learning some more about sound waves and the way they are stored, so meanwhile I have only 2 questions:
1. Does anyone know any good referrence about sound waves? some information about it, properties and everything like frequency, channels and the way they are used and stored. I know the basics needed to do what I've done, but I'd like to learn more than that.
2. How can I play sound? I guess using DirectX, but the problem is the data is stored in an array, not as a file. how would I play this?

This is it for now, I will probably have some more questions later.

Thanks in advance. Razz

Polynomial · Posted: Sun Apr 11, 2010 11:13 am Post subject:

I wrote some code a year or so ago to do exactly this. Unfortunately I'm on holdiay right now and only have my laptop with me.

The wave format is pretty flexible, and supports many-channel audio, multiple sample frequencies and multiple sample sizes. I found this resource and this one very useful in understanding how wave files are stored in terms of header data and audio data.

In general, you have a header that tells you some information about how the audio data is stored, then your audio data itself. The "standard" format is 16 bits per sample, 44100 samples per second (44.1KHz sample rate), two channels. This translates to two bytes per sample (16 bits), and each channel's set of samples are interlaced with each other. So you'd have two bytes for the first sample of the left channel, then two bytes for the first sample of the right channel, and so on.

One thing you MUST remember is that fields in the format do not always have the same endianness, so you need to refer to the specification to make sure you're reading the bytes in the right order.

Regarding playback, this is a job for DirectX. If you read the audio data into a buffer you can play it back from memory. I'm pretty sure you can load raw audio data into the buffer if you specify the format in a WAVEFORMATEX structure.

Deltron Z · Posted: Sun Apr 11, 2010 11:51 am Post subject:

Thank you for the detailed answer, but I've already managed to read .WAV format files successfuly. Smile

I also found these links, that's how I've managed to read Wave Files. Rolling Eyes

As for the endianess, I've noticed this when reading about .MP3 format - why isn't it consistent? How do I know if the data is big or little endian? It seemed like the ID3 header is storing little-endian data and the rest, MP3 chunks, store data as big-endian. Confused

As for playing sounds, I've been told not to use DirectX, but to use FMOD library, but I didn't find any way to play raw audio data, so I'll try this once again. what function am I supposed to use? and does the WAVEFORMATEX support MP3 files? (after decompression, or is it stored differently)

Thanks again.

Edit:
There's one thing I forgot to ask... do channels represent the speakers, left and right? so, can there be more than 2 channels?
By the way, I've managed to play my audio with DirectSound, but why is the volume range between -10000 and -1? Laughing

Polynomial · Posted: Sun Apr 11, 2010 2:18 pm Post subject:

Yes, channels represent left and right, or more. The WAV format technically supports up to 16,384 channels on 16 bits per sample, but obviously there's no supporting software for it.

The internals of the MP3 format aren't something I'm hugely familiar with, and I've not really dealt with in-memory playback of sound. Check the specs for the endianness. WAVEFORMATEX isn't applicable for MP3, it's a completely different format.

Deltron Z · Posted: Sun Apr 11, 2010 2:49 pm Post subject:

Isn't MP3 just compressed wave files? after decompressing, it's supposed to contain the same data as wave files, isn't it? I thought everything needed to convert MP3 to WAV (only the audio, without MP3 header information) is to decompress the data...

Polynomial · Posted: Sun Apr 11, 2010 3:54 pm Post subject:

Nope, it's a completely different format. The audio is compressed with a lossy compression algorithm and stored in an MPEG container format file. When the audio data is decompressed, it is stored in-memory as PCM audio, which is the same as WAV audio data. The headers and other metadata are completely different.

If you're looking for the decompression algorithm, it's not as simple as LZW, RLE or any other standard data compression algorithm. The audio data is split into separate sections and analysed for "inaudible" sounds which are filtered out to create a less harmonically complex sample. Each block is compressed separately with a proprietary algorithm. Look up the MP3 specs online.

Deltron Z · Posted: Sun Apr 11, 2010 4:04 pm Post subject:

That's what I meant, that the data is the same. I know the format is diffrent, I've already managed to read information about MP3 files such as Artists, Song Name, Album and everything else stored in the header. the problem is knowing when it ends - and what's after that I haven't checked yet.

Polynomial · Posted: Sun Apr 11, 2010 4:10 pm Post subject:

Ususally there's something in the specs that tells you how the format ends. In extendable formats (which I expect MPEG is) there's usually a field that tells you the length of extended header data.

After the header data, there's MPEG Layer 3 compressed audio data. The compression algorithm specs will be out there on the net somewhere. The algorithm will be split into two sections - the preprocessor (not an official term) and the compressor.

The preprocessor takes raw PCM audio data and removes complex harmonics to improve compression ratio. The specs on this "lossy" part are pretty loose, but you should really have to worry about them since they result in valid PCM audio data.

The compressor does what it says. If you find the decompression algorithm and implement it, you can make your own MP3 codec.

Deltron Z · Posted: Mon Apr 12, 2010 11:48 am Post subject:

Thank you for your help, it has been very helpful to me.

I have one more question though... how would I analyze the sound and detect, for instance, guitar sound wave? dynamic algorithms, I guess? so far, I checked the avarage of X ms. in the file and accoording to this value, I decided wether the sound is loud or not. I know I'm not supposed to do it this way, obviously, but I just wanted to try something out. it wasn't even close to correct, even though my calculation was correct, I guess it was just a bad idea. Rolling Eyes

However, I could generate from this avarage numbers according to the loudness of the sound, but it would be in constant distance... what I'm looking for is something more like guitar hero, except that you choose your own music and difficulty. Smile

I will start by creating numbers or arrows according to any sound at first, but since I've done something like that now I want to try filtering the random sound and creating a number or an arrow when a meaningful sound is played.

Polynomial · Posted: Sat Apr 17, 2010 8:44 am Post subject:

You want a map of amplitudes across a frequency range over time. You can achieve this using a fourier transform. Find an appropriate FFT library for whatever language you're working with and perform FFT on short samples. Essentially you're building a 3D array of amplitudes across frequency and time. Recognition of a pre-analysed section of audio or simple monofrequncy events such as a drum kickL is possible from this, but detection of arbitrary instruments is near impossible for composite samples such as music, unless you have a second version mastered without that instrument; in which case you can use difference analysis.