[PATCH v3 0/7] MR61: winegstreamer: Dynamic transform stream format change support.

Tue May 24 17:52:28 CDT 2022

On 5/24/22 14:14, Rémi Bernon (@rbernon) wrote:
>> I'm looking at MFT_INPUT_STREAM_PROCESSES_IN_PLACE. I haven't checked
>> whether anything interesting exposes it, though; maybe nothing does?
>>
>> (I don't think any of the GStreamer elements we care about are capable
>> of in-place transform either, but theoretically some of them could be.)
> 
> As far as the tests go, no transform is currently exposing this (and we
> start to have quite a few in the tests).
> 
> There's also the MFT_OUTPUT_STREAM_(CAN_)PROVIDES_SAMPLES flags that
> could have been useful, but there too, it doesn't seem to be used by
> native transforms, and I think games aren't expecting to see it.
> 
> Thus the implementation has all been around the worst case, which is
> when output buffer is app provided and cannot be swapped behind the
> scenes.
> 
> 
>> This is ugly to deal with. I guess you can provide a specific buffer by
>> (ab)using GstBufferPool's acquire_buffer() method to block until
>> ProcessOutput() is called and we have a buffer; I don't know if you had
>> a better plan than that.
> 
> A bit like this, but it's actually more tricky because some decoders
> aren't playing ball.
> 
> They are happy to allocate buffers from a downstream pool, but they
> won't release the buffers so easily. I understand that they keep the
> buffers for previous frame reference in their decoding process, and
> that needs to outlive the app provided output buffer.
> 
> Because they hold a ref on the buffer, it isn't writable and we cannot
> detach its memory either.
> 
> 
>> On the other hand, it's not clear to me how we're going to deal with
>> MF_E_TRANSFORM_STREAM_CHANGE or MFT_OUTPUT_STATUS_SAMPLE_READY while
>> maintaining zero-copy. What is your plan for those?
> 
> So my plan, which will also cover the case above, is to use instead a
> custom allocator (going to be generally useful imho if we want zero-copy
> in wg_parser too), which will allocate memory normally with an optional
> app provided memory overlay.
> 
> 
> When allocation needs to be done without an app provided buffer, the
> allocator will work as the default allocator, allocate unix side memory,
> and return a pointer to it on map. This is for instance going to happen
> if more than one buffer gets decoded. In that case, on the next
> ProcessOutput call, the data will have to be copied.
> 
> When we have an app provided buffer, and in the good cases, it would
> instead overlayed the unix memory with it and return the app memory on
> map so the decoder writes directly to it. When the app buffer needs to
> be released, we remove the overlay and the unix memory will later be
> used if there's any re-use.
> 
> In the bad cases, when either the output buffer is still referenced by
> the decoder, or when we actually should keep the output sample pending,
> and when we have incorrectly overlayed app buffer, we will copy back the
> data to the unix memory before detaching the overlay. In any case we
> still get only one buffer copy, so not worst than the normal copy.
> 
> 
> In practice the bad cases shouldn't happen too often: when decoder hold
> on the buffers, we cannot remove memory from them. This also means the
> pool will not discard the buffers on release, and after a few allocs the
> decoder will have enough buffers to use on them and the allocator will
> not receive any more allocation requests. We will have to copy the data
> anyway, but in the right direction this time.
> 
> In the good cases, releasing buffer memory will make sure to taint the
> buffers and the pool will free them on release. The allocator will
> receive a request every time a buffer is needed.
> 

Thank you for the context; the initial design decisions behind this 
patch make a lot more sense now.

Once you take it as a given that we have to explicitly specify the 
output buffer per sample even in the zero-copy case, it indeed doesn't 
seem plausible to try to peek without dequeuing. In that case I'm 
inclined to agree that the best solution is a monolithic 
wg_transform_read_data() that returns MF_E_FORMAT_CHANGED_OR_WHATEVER.

Based on IRC conversation I get the impression we can plausibly track it 
all on the Unix side after all, possibly without even needing an extra 
API to invalidate the format? That would feel the least awkward to me. 
We'd still return the new format from the Unix side as part of 
wg_transform_read_data(), but it'd be output-only.

(I'm starting to suspect we should reinstate the wg_stream_event 
union... I'll have to examine that possibility after this is upstream.)

Sorry for all the runaround (and the slowness of review). This is an 
ugly API mismatch we have to deal with and I have found it quite 
difficult to grasp all of the many considerations.