[Bug 14717] resampled sound is horrible

Mon Nov 7 23:42:24 CST 2011

http://bugs.winehq.org/show_bug.cgi?id=14717

--- Comment #221 from Alexander E. Patrakov <patrakov at gmail.com> 2011-11-07 23:42:24 CST ---
Thanks for nicely grouping my patches.

Some random notes on the reworked patches. I understand that they also apply to
the original patch.

-       if (dsb->device->tmp_buffer_len < len || !dsb->device->tmp_buffer)
-       {
-               /* If we just resampled in DSOUND_MixToTemporary, we shouldn't
need to resize here */
-               assert(!dsb->resampleinmixer);
-               dsb->device->tmp_buffer_len = len;
-               if (dsb->device->tmp_buffer)
-                       dsb->device->tmp_buffer = HeapReAlloc(GetProcessHeap(),
0, dsb->device->tmp_buffer, len);
-               else
-                       dsb->device->tmp_buffer = HeapAlloc(GetProcessHeap(),
0, len);
-       }
+       assert(dsb->device->tmp_buffer_len >= len && dsb->device->tmp_buffer);

This is a stray assert from the old version of the patch. It is later (in the
patch in comment 219) deleted as the only user of the otherwise-unused len
variable. Maybe it should not be added in the first place?

And please include the patch from comment 219. Without it, there are clicks in
the output. As for the note above the DSOUND_bufpos_to_secpos() function, it is
invalid for resampling in the mixer thread. Anyway, at the end of the series,
DSOUND_bufpos_to_secpos() is called from exactly one place, and that place
clearly needs freqAcc, not freqAccNext.

Also, the FIR generator includes two extra points, one at each end of the FIR.
They were needed by cubic interpolation to set the correct derivative at the
end, but are unused by the linear interpolation. Can we drop them by removing
these two lines

+    /* For cubic interpolation, one extra dummy point */
+    wing_len += 1;

and removing the "1 +" from "DWORD idx = 1 + (hires_pos >> DSOUND_FREQSHIFT);"?

As for your question about bitmasking and shifting: this is rather boring
fixed-point math, of the similar kind as we already do with freqAcc. Values
freqAdjust and freqAcc in the original Wine are to be interpreted as follows:
they represent fractional values multiplied by (1 << DSOUND_FREQSHIFT). The
same interpretation applues to invFreqAdjust, dsb->firstepdsb->firstep, the mu
parameter for linear_interp(), hires_pos and max_pos. So, with the current
value of DSOUND_FREQSHIFT equal to 20, e.g. the value of hires_pos = 3 << 19
(i.e. 1.5 * (1 << DSOUND_FREQSHIFT)) means "something right in the middle
between the first and the second (counting from zero) used points of the FIR".
Then the expression

DWORD idx = 1 + (hires_pos >> DSOUND_FREQSHIFT);

really means "take the integer part of the value represented by hires_pos and
add one, to compensate for the dummy point added for the cubic interpolator",
and

DWORD rem = hires_pos & ((1 << DSOUND_FREQSHIFT) - 1);

means "take the fractional part".

That's all about time-related variables. Also sample-related variables use the
same fixed-point scheme, but there is no explicit nice "normalization" value
like 1 << DSOUND_FREQSHIFT. The values returned by get() and accepted by put()
are, obviously, normalized in a way that 0x7fffffff represents the maximum
possible sample value.

Note that gcc with -march=i686 or higher compiles the "((LONGLONG)something *
something_else) >> 32" expression into a single "imul" CPU instruction, that's
why it is extensively used. With the "0x7fffffff means just a bit less than
1.0" normalization, this expression means "something_float *
something_else_float / 2.0". Conceptually, the math is simple: linear_interp()
gets the interpolated FIR value, this is multiplied by the input sample and
added to sample[channel], and then the sum is multiplied by gain, clipped and
stored.

So the concern is how to preserve the most of the significant bits in
sample[channel] without leading to the overflow. Because of the overflow when
accumulating sample[channel], especially in the downsampling case, I cannot
normalize the FIR to 0x7fffffff. That's why it is normalized to 0x1000000,
giving 20 significant bits. This is well-balanced with the 20 bits of precision
in time-related variables like freqAcc.

As for the gain normalization vs "((LONGLONG)sample[channel] * gain) >> 17",
the things are indeed rather clumsy. When this was written, I didn't know what
the correct value of the gain would be (i.e. supposed that I might need either
to attenuate or to amplify the samples), so wrote something that would work for
either case. Then I figured out the correct value in DSOUND_RecalcFreqAcc(). So
it turns out that the overall effect for upsampling is to left-shift the
samples by 8 bits, thus giving 24 bits of possible precision here (of course,
the FIR is worse than that).

Given that the maximum value of the gain to be applied after the FIR is now
known, it is indeed better to change this part so that firgain is normalized to
0x7fffffff. I will do that later today.

One more remark: you merged my patches related to channel copying and with
resampling. These are conceptually different things, I disagree with the merge
and thus will split the patch into two: the first patch will copy channels
using the bad resampler, and the second patch will add a resampler.

-- 
Configure bugmail: http://bugs.winehq.org/userprefs.cgi?tab=email
Do not reply to this email, post in Bugzilla using the
above URL to reply.
------- You are receiving this mail because: -------
You are watching all bug changes.