WAV File Format Parsing


Useful References:

1. File Format Specifications: WAVE or RIFF WAVE sound file. Site includes pointers to and local copies of significant documents.
2. Page for the WAVE audio file format.
3. WAVE PCM soundfile format.
4. The WAVE file specifications came from Microsoft. The WAVE file format uses RIFF chunks, each chunk consisting of a chunk identifier, chunk length, and chunk data.
   WAVE specifications, Version 1.0, 1991-08: riffmci.rtf.

Q&A

  • Are 8-bit and 16-bit sample values encoded in binary the same way?
  • Which audio encodings does WAV currently support?

data format

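  • In the data chunk, PCM samples are stored in two's complement, except for samples with a resolution of 8 bits or fewer.
  • Samples of 8 bits or fewer use offset binary, regardless of the number of channels.
    PCM data is two's-complement except for resolutions of 1-8 bits, which are represented as offset binary.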

The data format and the maximum and minimum values for PCM waveform samples of various sizes are as follows:

Sample Size          Data Format         Maximum Value                  Minimum Value
One to eight bits    Unsigned integer    255 (0xFF)                     0
Nine or more bits    Signed integer i    Largest positive value of i    Most negative value of i

For example, the maximum, minimum, and midpoint values for 8-bit and 16-bit PCM waveform data are as follows:

Data Format    Maximum Value     Minimum Value       Midpoint Value
8-bit PCM      255 (0xFF)        0                   128 (0x80)
16-bit PCM     32767 (0x7FFF)    -32768 (-0x8000)    0
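
To make the two representations concrete, here is a minimal C sketch that converts raw sample bytes to signed values. The function names are my own, and it assumes the 16-bit samples are stored little-endian, as RIFF data is.

```c
#include <stdint.h>

/* Convert an 8-bit offset-binary sample (0..255, midpoint 128)
   to a signed value in the range -128..127. */
static int32_t pcm8_to_signed(uint8_t raw)
{
    return (int32_t)raw - 128;
}

/* Assemble a 16-bit two's-complement sample from its two
   little-endian bytes (low byte first in the file). */
static int32_t pcm16le_to_signed(uint8_t lo, uint8_t hi)
{
    uint32_t bits = (uint32_t)lo | ((uint32_t)hi << 8);

    /* Bit patterns 0x8000..0xFFFF represent -32768..-1. */
    return bits >= 0x8000u ? (int32_t)bits - 65536 : (int32_t)bits;
}
```

With these helpers, the 8-bit midpoint 0x80 and the 16-bit midpoint 0x0000 both map to 0, so silence looks the same after conversion.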

Format Code

WAV places no hard requirement on how the audio stream is encoded. Besides PCM, almost any codec that conforms to the ACM specification can encode the audio stream of a WAV file.
The standard format codes for waveform data are given below. The references above give many more format codes for
compressed data, a good fraction of which are now obsolete.

Format Code    Preprocessor Symbol       Data
0x0001         WAVE_FORMAT_PCM           PCM
0x0003         WAVE_FORMAT_IEEE_FLOAT    IEEE float
0x0006         WAVE_FORMAT_ALAW          8-bit ITU-T G.711 A-law
0x0007         WAVE_FORMAT_MULAW         8-bit ITU-T G.711 µ-law
0xFFFE         WAVE_FORMAT_EXTENSIBLE    Determined by SubFormat
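
As a small illustration, the format codes above can be turned into a lookup. This is only a sketch: `wave_format_name` is a hypothetical helper of mine, and the constants are defined locally rather than taken from the Windows headers, which define symbols with the same names.

```c
#include <stdint.h>

/* Format codes from the table above, defined locally for this sketch. */
enum {
    WAVE_FORMAT_PCM        = 0x0001,
    WAVE_FORMAT_IEEE_FLOAT = 0x0003,
    WAVE_FORMAT_ALAW       = 0x0006,
    WAVE_FORMAT_MULAW      = 0x0007,
    WAVE_FORMAT_EXTENSIBLE = 0xFFFE
};

/* Hypothetical helper: describe the format code read from a "fmt " chunk. */
static const char *wave_format_name(uint16_t tag)
{
    switch (tag) {
    case WAVE_FORMAT_PCM:        return "PCM";
    case WAVE_FORMAT_IEEE_FLOAT: return "IEEE float";
    case WAVE_FORMAT_ALAW:       return "A-law";
    case WAVE_FORMAT_MULAW:      return "mu-law";
    case WAVE_FORMAT_EXTENSIBLE: return "extensible";
    default:                     return "unknown or not listed here";
    }
}
```

Any code outside the five listed ones falls through to the default branch, which is where the many compressed formats mentioned in the references would end up.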

The following reference comes from Wikipedia:
Comparison of audio quality and compression bitrates for monophonic (not stereophonic) WAV audio under several encodings.

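Background on coding:

  • The digital signals we are familiar with are obtained by sampling, quantizing, and encoding a continuously varying analog signal. This is called PCM (pulse-code modulation). The resulting electrical digital signal is called a digital baseband signal and is produced by a PCM terminal.
  • Audio coding: coding of audio signals that cover a wide frequency range. It is mainly used in digital radio and digital television broadcasting, consumer electronics, and the storage and download of audio.
  • Speech coding: encoding an analog speech signal into a digital signal so that it can be transmitted digitally at a lower bit rate. The basic methods are waveform coding, parametric coding, and hybrid coding.
    • Waveform coding samples, quantizes, and encodes the time-domain waveform of the analog speech signal to form the digital speech signal.
    • Parametric coding is based on how human speech is produced: it finds the characteristic parameters that describe the speech and encodes those parameters.
    • Hybrid coding combines the strengths of both. It uses analysis-by-synthesis based on an assumed speech production model while also using the time-domain waveform of the speech. This makes the reconstructed speech more natural and noticeably improves quality, at the cost of a correspondingly higher bit rate.
  • Microsoft GSM 06.10: the low-level speech compression algorithm of the GSM suite is called GSM 06.10 RPE-LTP (Regular-Pulse Excitation Long-Term Predictor).
  • ADPCM (adaptive differential pulse-code modulation)
  • SBC (sub-band coding)
  • CELP (Code Excited Linear Prediction)
  • Truespeech (a proprietary audio codec produced by the DSP Group). It is designed for encoding voice data at low bitrates and for embedding into DSP chips. Truespeech was integrated into Windows Media Player in older versions of Windows, but it has not been supported since Windows Vista. It is also the format used by the voice chat features of Yahoo! Messenger.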
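
The quantize-and-encode step of PCM from the first bullet can be sketched in C as follows. This is an illustration under my own assumptions: the input is an already-sampled amplitude normalized to [-1.0, 1.0], the output is a 16-bit code, and scaling by 32767 is just one common convention. The name `quantize_pcm16` is hypothetical.

```c
#include <stdint.h>

/* Quantize one sampled amplitude in [-1.0, 1.0] to a 16-bit PCM code. */
static int16_t quantize_pcm16(double amplitude)
{
    /* Clamp out-of-range input so the result always fits in 16 bits. */
    if (amplitude > 1.0)
        amplitude = 1.0;
    if (amplitude < -1.0)
        amplitude = -1.0;

    /* Assumed convention: full scale maps to +/-32767. */
    double scaled = amplitude * 32767.0;

    /* Round to the nearest quantization level, halves away from zero. */
    return (int16_t)(scaled >= 0.0 ? scaled + 0.5 : scaled - 0.5);
}
```

Calling this once per sampling instant on the analog waveform gives the sequence of integer codes that a PCM WAV file stores.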

Detailed analysis of the WAV format

trouble

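My teacher recommended it, so this was my first attempt at reading English documentation. I felt I had read quite a lot, but when I tried to organize it carefully it was a jumble. I see three problems: (1) I read slowly, (2) I do not understand the sentences well enough, and (3) I forget what I have just read very quickly. So I should gradually train my ability to read the literature by reading a set number of papers every week. Accumulating knowledge matters.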

format description
The WAVE file format is a subset of Microsoft's RIFF specification for the storage of multimedia files. A RIFF file starts out with a file header followed by a sequence of data chunks. A WAVE file is often just a RIFF file with a single "WAVE" chunk which consists of two sub-chunks: a "fmt " chunk specifying the data format and a "data" chunk containing the actual sample data. Call this form the "Canonical form". Who knows how it really all works. An almost complete description, which seems totally useless unless you want to spend a week looking over it, can be found at MSDN (it mostly describes the non-PCM, or registered proprietary, data formats). I use the standard WAVE format as created by the sox program: PCM (pulse code modulation).
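The WAVE file is one of the sound waveform file formats used in Windows multimedia, and it is built on RIFF (Resource Interchange File Format).
RIFF is the file structure that most multimedia files on Windows follow. The type of data a RIFF file contains is indicated by its file extension (and by the form type in its header, such as "WAVE").
Data that can be stored as RIFF files includes:
Audio/video interleaved data (.AVI)
Waveform data (.WAV)
Bitmap data (.RDI)
MIDI data (.RMI)
Palette data (.PAL)
Multimedia movie (.RMN)
Animated cursor (.ANI)
Other RIFF files (.BND)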

WAV structure
A WAVE file is often just a RIFF file with a single “WAVE” chunk which consists of two sub-chunks – a “fmt ” chunk specifying the data format and a “data” chunk containing the actual sample data. Call this form the “Canonical form”.
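
The chunk layout described above can be walked with a short loop. The sketch below is mine, not code from any of the references: `find_chunk` is a hypothetical helper that assumes the whole file is already in memory and that every chunk is a 4-byte ID, a 4-byte little-endian size, then the data, padded to an even length.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit little-endian value byte by byte. */
static uint32_t read_u32le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: locate a sub-chunk such as "fmt " or "data".
   Returns a pointer to the chunk's data and stores its size in *size,
   or returns NULL if the chunk is missing or the buffer is malformed. */
static const uint8_t *find_chunk(const uint8_t *buf, size_t len,
                                 const char *id, uint32_t *size)
{
    /* RIFF header: "RIFF", 4-byte size, then the form type "WAVE". */
    if (len < 12 || memcmp(buf, "RIFF", 4) != 0 ||
        memcmp(buf + 8, "WAVE", 4) != 0)
        return NULL;

    size_t pos = 12;
    while (pos + 8 <= len) {
        uint32_t chunk_size = read_u32le(buf + pos + 4);

        if (chunk_size > len - pos - 8)
            return NULL;        /* chunk claims more bytes than exist */

        if (memcmp(buf + pos, id, 4) == 0) {
            *size = chunk_size;
            return buf + pos + 8;
        }

        /* Skip header and data; odd-sized data has one pad byte. */
        pos += 8 + (size_t)chunk_size + (chunk_size & 1u);
    }
    return NULL;
}
```

For a canonical file, `find_chunk(buf, len, "fmt ", &n)` followed by `find_chunk(buf, len, "data", &n)` is enough. Searching by ID instead of assuming fixed offsets also copes with files that carry extra chunks between the two.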

Here an uncompressed PCM WAV file is used as the example for a brief look at the file structure.


Chunk structure

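#include <stdint.h>

// RIFF header at the start of the file
typedef struct waveChunk
{
    uint32_t chunkID;    // "RIFF"
    uint32_t chunksize;  // size in bytes of the rest of the file (file size - 8)
    uint32_t WaveID;     // "WAVE"
} WAVE;

// Body of the "fmt " sub-chunk; in the file it follows the chunk's own
// 4-byte ID ("fmt ") and 4-byte size
typedef struct tWAVEFORMATEX
{
    uint16_t wFormatTag;       // format type
    uint16_t nChannels;        // number of channels (i.e. mono, stereo...)
    uint32_t nSamplesPerSec;   // sample rate
    uint32_t nAvgBytesPerSec;  // for buffer estimation
    uint16_t nBlockAlign;      // block size of data
    uint16_t wBitsPerSample;   // number of bits per sample of mono data
    uint16_t cbSize;           // size in bytes of the extra information after
                               // cbSize; absent when the chunk is only 16 bytes
} WAVEFORMATEX, *PWAVEFORMATEX;

// "data" sub-chunk; renamed from WAVE, which is already used above
typedef struct dataChunk
{
    uint32_t Subchunk2ID;    // "data"
    uint32_t Subchunk2size;  // number of bytes of sample data
    unsigned char *data;     // sample data (in the file the bytes follow the size directly)
} DATA;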
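
Reading the file straight into these structs can go wrong because of compiler padding and byte order, so here is a sketch of decoding the 16-byte "fmt " body byte by byte instead. `FmtFields`, `parse_fmt`, and `fmt_is_consistent_pcm` are hypothetical names of mine. The consistency check assumes plain PCM, where the block size is the channel count times the bytes per sample and the average byte rate is the sample rate times the block size.

```c
#include <stdint.h>

/* Decoded fields of the first 16 bytes of a "fmt " chunk body. */
typedef struct {
    uint16_t format_tag;
    uint16_t channels;
    uint32_t samples_per_sec;
    uint32_t avg_bytes_per_sec;
    uint16_t block_align;
    uint16_t bits_per_sample;
} FmtFields;

static uint16_t get_u16le(const uint8_t *p)
{
    return (uint16_t)((uint32_t)p[0] | ((uint32_t)p[1] << 8));
}

static uint32_t get_u32le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode the little-endian fields in the order they appear in the file. */
static FmtFields parse_fmt(const uint8_t body[16])
{
    FmtFields f;
    f.format_tag        = get_u16le(body + 0);
    f.channels          = get_u16le(body + 2);
    f.samples_per_sec   = get_u32le(body + 4);
    f.avg_bytes_per_sec = get_u32le(body + 8);
    f.block_align       = get_u16le(body + 12);
    f.bits_per_sample   = get_u16le(body + 14);
    return f;
}

/* Return 1 if the derived fields agree with the others (PCM only). */
static int fmt_is_consistent_pcm(const FmtFields *f)
{
    /* Bytes per sample, rounded up for sizes that are not a multiple of 8. */
    uint32_t block = (uint32_t)f->channels * ((f->bits_per_sample + 7u) / 8u);

    return f->format_tag == 1 &&
           f->block_align == block &&
           f->avg_bytes_per_sec == f->samples_per_sec * block;
}
```

For 16-bit stereo at 44100 Hz this gives a block size of 4 bytes and an average rate of 176400 bytes per second.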