[MPlayer-dev-eng] MS-ADPCM/Stereo Works

Michael Niedermayer michaelni at gmx.at
Fri Dec 28 14:19:29 CET 2001


Hi

On Friday 28 December 2001 05:20, Mike Melanson wrote:
> On Fri, 28 Dec 2001, Michael Niedermayer wrote:
> > sure, i am always interested ...
> > if i understand correctly then the imaadpcm decoder only stores
> > the predictor (=last output sample)
> > the index (0-88)
> > the stepsize (which only depends on the index)
> > although the predictor isn't really used except that it is added at the
> > end so it might be possible to replace the whole decoder with a look up
> > table ... assuming that i understand it correctly
> > int delta= lut[input + index*16];
> > predictor+= delta;
> > clamp predictor
> > output=predictor;
> > index += adpcm_index[input];
> > clamp index
>
> 	It's late and I'm having trouble digesting all of this. Here's
after looking at it again, my 2 table suggestions are mostly identical ...
well i guess i shouldn't try to optimize stuff at 5 o'clock ;)
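
A minimal sketch of the lookup-table idea quoted above, in C. The layout lut[index*16 + input] follows the quoted pseudocode; the names ima_state, build_lut and decode_nibble are made up for illustration, and the table is filled with the (2*n+1)*step>>3 form of the diff, which is only one possible choice. The step and index tables are the usual IMA ADPCM ones. Note that the entries do not fit in 16 bits, since the largest diff is 61438.

```c
#include <stdint.h>

/* Standard IMA ADPCM index adjustment, indexed by the 4-bit input. */
static const int8_t index_table[16] = {
    -1, -1, -1, -1, 2, 4, 6, 8,
    -1, -1, -1, -1, 2, 4, 6, 8
};

/* Standard IMA ADPCM step sizes, indexed by index (0..88). */
static const int16_t step_table[89] = {
    7, 8, 9, 10, 11, 12, 13, 14, 16, 17,
    19, 21, 23, 25, 28, 31, 34, 37, 41, 45,
    50, 55, 60, 66, 73, 80, 88, 97, 107, 118,
    130, 143, 157, 173, 190, 209, 230, 253, 279, 307,
    337, 371, 408, 449, 494, 544, 598, 658, 724, 796,
    876, 963, 1060, 1166, 1282, 1411, 1552, 1707, 1878, 2066,
    2272, 2499, 2749, 3024, 3327, 3660, 4026, 4428, 4871, 5358,
    5894, 6484, 7132, 7845, 8630, 9493, 10442, 11487, 12635, 13899,
    15289, 16818, 18500, 20350, 22385, 24623, 27086, 29794, 32767
};

/* Hypothetical decoder state: last output sample and table index. */
typedef struct {
    int predictor;
    int index;
} ima_state;

/* Signed diff for every (index, input) pair; 32-bit because the
 * largest magnitude (61438) does not fit in int16_t. */
static int32_t lut[89 * 16];

static void build_lut(void)
{
    for (int index = 0; index < 89; index++) {
        for (int input = 0; input < 16; input++) {
            int magnitude = input & 7;              /* low 3 bits */
            int diff = ((2 * magnitude + 1) * step_table[index]) >> 3;
            lut[index * 16 + input] = (input & 8) ? -diff : diff;
        }
    }
}

/* One decode step: table lookup, add, clamp, then update the index. */
static int16_t decode_nibble(ima_state *s, unsigned input)
{
    input &= 15;
    s->predictor += lut[s->index * 16 + input];
    if (s->predictor > 32767)  s->predictor = 32767;
    if (s->predictor < -32768) s->predictor = -32768;

    s->index += index_table[input];
    if (s->index < 0)  s->index = 0;
    if (s->index > 88) s->index = 88;

    return (int16_t)s->predictor;
}
```

build_lut() has to run once before the first decode_nibble() call. The table is 89*16 entries of 4 bytes, so about 5.5 KB, which may or may not beat the few shifts and adds it replaces.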

> what I have in my ADPCM document on the IMA algorithm (numbers are
> big-endian):
>
> The remaining 32 bytes, or 64 nibbles, are decoded into 64 16-bit PCM
> samples. For each byte, the lower nibble is decoded first (bits 3-0), then
> the upper nibble.
>
> initialize:
>   predictor = (first 2 bytes of the chunk) & 0xFF80
>     sign extend number
>     clamp within signed 16-bit range
>   index = (first 2 bytes of chunk) & 0x7F
>     clamp between 0 and 88
>   step = step_table[index]
>
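
The initialization quoted above could look roughly like this in C. This is only a sketch: the function name ima_init_from_header and the out-parameters are invented here, and it assumes the 2-byte header is big-endian as stated in the quoted document. The step lookup is left out because it only needs step_table[index].

```c
#include <stdint.h>

/* Split the 2-byte big-endian chunk header into the initial
 * predictor (upper 9 bits, sign extended) and index (lower 7 bits). */
static void ima_init_from_header(const uint8_t *chunk,
                                 int *predictor, int *index)
{
    unsigned header = ((unsigned)chunk[0] << 8) | chunk[1];

    int p = (int)(header & 0xFF80);
    if (p & 0x8000)
        p -= 0x10000;          /* sign extend from 16 bits */
    *predictor = p;            /* already within -32768..32767 */

    int i = (int)(header & 0x7F);
    if (i > 88)
        i = 88;                /* clamp to the table limits */
    *index = i;
}
```

Because the value is sign extended from exactly 16 bits, the "clamp within signed 16-bit range" step has nothing left to do here.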
> for each nibble:
>   index += index_table[(unsigned)nibble]
>   clamp index between 0..88 (table limits)
>   diff = ((signed)nibble + 0.5) * step / 4
this is not identical to
		o2 = step >> 3;
		if (delta & 4) o2 += step;
		if (delta & 2) o2 += step >> 1;
		if (delta & 1) o2 += step >> 2;
there is a +/- 1 difference in the output or something, but i guess no one
will notice the difference ...
(((nibble<<1) + 1)*step)>>3;
is another way to do it without the +/- 1 difference
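
To see how far the two forms actually drift apart, here is a small C sketch that brute-forces every step value and every 3-bit magnitude. The function names are invented for illustration, and "nibble" in the multiply form is taken to mean only the low 3 bits (the sign bit is handled separately). In this check the shift-and-add form is never larger than the multiply form, and the gap reaches 2 for some steps (for example step=7, delta=3 gives 4 versus 6), because each shifted term is truncated on its own.

```c
/* Shift-and-add form: every term is truncated separately. */
static int diff_shift_add(int step, int delta)
{
    int o2 = step >> 3;
    if (delta & 4) o2 += step;
    if (delta & 2) o2 += step >> 1;
    if (delta & 1) o2 += step >> 2;
    return o2;
}

/* Multiply form: (nibble + 0.5) * step / 4 with a single truncation. */
static int diff_exact(int step, int nibble)
{
    return (((nibble << 1) + 1) * step) >> 3;
}

/* Largest absolute gap between the two forms over all 16-bit steps. */
static int max_difference(void)
{
    int max = 0;
    for (int step = 1; step <= 32767; step++) {
        for (int n = 0; n < 8; n++) {
            int d = diff_exact(step, n) - diff_shift_add(step, n);
            if (d < 0)
                d = -d;
            if (d > max)
                max = d;
        }
    }
    return max;
}
```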

> [insert note about calculating diff]
>   predictor += diff
>   clamp predictor value within signed 16-bit range and output to
>     decompressed audio stream
>   step = step_table[index]
>
> Again, rough draft. Also, I recalled that stereo IMA is stored as an
> entire 0x22-byte block for the left channel followed by an entire
> 0x22-byte block for the right channel. However, MS ADPCM is stored
> interleaved.
>
> > after thinking about it again simd doesn't seem to be that good a choice
> > :( ... but who knows, i doubt that a SIMD-adpcm decoder is a good
> > beginner's exercise
>
> 	There may yet be some opportunity. Is there any efficient way to
> rip apart nibbles and interleave them using SIMD instructions? Also,
interleave them bytewise and mask and/or shift at the end
interleave 8 blocks -> n11, n12, n21, n22, n31, n32 ... (4-bit each)
after masking (pand) -> n11, n21, n31, ... (8-bit each)
or shift&mask (pand, psrlw $4) -> n12, n22, n32, ... (8-bit each)
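
The mask and shift&mask steps can be tried out in plain C by treating a uint64_t as a stand-in for one 8-byte MMX register. This is a sketch of the idea only, not MMX code: split_nibbles is a made-up name, and the bytewise interleaving of the 8 blocks is assumed to have happened already. A 64-bit shift differs from psrlw in that bits cross the 16-bit word boundaries, but those bits land in the upper nibble of each byte and are removed by the mask anyway.

```c
#include <stdint.h>

/* Split 8 packed bytes into their low and high nibbles,
 * one nibble per output byte. */
static void split_nibbles(uint64_t packed, uint64_t *low, uint64_t *high)
{
    const uint64_t mask = 0x0F0F0F0F0F0F0F0FULL;

    *low  = packed & mask;          /* pand            -> n11, n21, ... */
    *high = (packed >> 4) & mask;   /* psrlw $4, pand  -> n12, n22, ... */
}
```

With the bytes interleaved so that each byte lane belongs to a different block, each output byte is then the next input nibble for one of 8 independent decoders.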

> where are some good references for SIMD stuff? I studied up on MMX a long
> time ago, but could never find 3dnow or SSE stuff. Then, when I found
well there are the manuals from intel & amd
intel's instruction set reference (describes what each instruction does
(mmx, mmx2, sse, sse2), no 3dnow though)
"intel pentium4 and intel xeon processor optimization reference manual"
"amd athlon processor x86 code optimization guide"
try google or http://www.sandpile.org/ to find them
and for P1-P3 intel cpus there is http://www.agner.org/assem/

> MPlayer, I realized there were second versions of each. I'm a little
i am not sure if the MMX2 / SSE2 names are completely official ;)
for intel MMX2 = SSE but not for amd ...

> behind and am constrained to thinking in terms of MMX v1 only.
>
> 	Thanks...
no problem

Michael


