[PATCH 3/5] winegstreamer: Replace source pad interface with GstAppSrc.

Thu Sep 9 11:25:55 CDT 2021

On 9/9/21 9:05 AM, Derek Lesho wrote:
> 
> On 9/9/21 12:00 AM, Zebediah Figura wrote:
>> On 9/8/21 8:36 PM, Derek Lesho wrote:
>>>
>>> On 9/8/21 6:21 PM, Zebediah Figura (she/her) wrote:
>>>> On 9/8/21 2:25 PM, Derek Lesho wrote:
>>>>> On 9/6/21 4:14 PM, Zebediah Figura wrote:
>>>>>>> +static void src_need_data(GstElement *appsrc, guint length,
>>>>>>> gpointer user)
>>>>>>>    {
>>>>>>>
>>>>>>> +    struct wg_parser *parser = user;
>>>>>>>          pthread_mutex_lock(&parser->mutex);
>>>>>>>
>>>>>>> +    /* Sometimes GstAppSrc sends identical need-data requests when
>>>>>>> it is woken up,
>>>>>>> +       we can rely on push-buffer not having completed (and hence
>>>>>>> offset being -1)
>>>>>>> +       because they are blocked on the internal mutex held by the
>>>>>>> pulling thread
>>>>>>> +       calling this callback. */
>>>>>>> +    if (parser->read_request.offset != -1)
>>>>>>>        {
>>>>>>
>>>>>> Well, sure, but do we really need this if block at all?
>>>>>
>>>>> Yes, we do.  Without it, if there a spurious wakeup of appsrc's wait
>>>>> loop in gst_app_src_create (get_range) [1], and the "push-buffer"
>>>>> signal is signaled just after the loop's g_cond_wait reacquires the
>>>>> mutex on return [1], we could have "push-buffer" blocked on
>>>>> acquisition of the mutex [2] held by gst_app_src_create. The
>>>>> function would block before pushing the buffer to the internal
>>>>> queue.  Then, gst_app_src_create continues and, seeing as there are
>>>>> no buffers in the queue, unlocks the mutex and calls need_data.  Here
>>>>> we have our race:
>>>>>
>>>>> - if the need_data's code is called first, read_request's offset and
>>>>> size of overwritten with the same values. Then, the code waiting on
>>>>> the mutex pushes the buffer responding to our request, and everything
>>>>> continues as normal (as if another need_data never occurred).
>>>>>
>>>>> (Prepare for the text wall 😁)
>>>>>
>>>>> - If the push-buffer code waiting on the mutex continues first, the
>>>>> internal buffer queue has the buffer written to it, and the
>>>>> push-buffer signal returns.  Afterwards, read_request.offset is set
>>>>> to the sentinel value indicating that the request has been responded
>>>>> to, and that a new need-data is awaited, push_data returns. Then,
>>>>> need_data is run, acquiring the mutex too late to catch the previous
>>>>> buffer send, and it requests a read of the same size directly
>>>>> following the prior read request, since the offset had been updated
>>>>> by push_data. gst_app_src_create then sees the pushed buffer and
>>>>> returns it back to the getrange-related function, only for another
>>>>> getrange to come in with an offset not directly following the last
>>>>> request's.  While the client code is reading the invalid request's
>>>>> data, src_seek_data is called. Right now we have an assert to make
>>>>> sure that there is no active request when seek_data is called, but if
>>>>> we didn't, we'd just set next_pull_offset to the seek's value.  Then,
>>>>> need_data would be called again, but all it would do is update the
>>>>> size to the new request's size.  After this, push_data (in response
>>>>> to the previous request) would be called and, for simplicity's sake,
>>>>> if the size of the two requests were identical, a push-buffer would
>>>>> be called, src_seek_data would pick up and have no way of knowing
>>>>> that the data is coming from after previous request's location in the
>>>>> file, which could cause any number of errors.  If we keep around both
>>>>> next_pull_offset and read_request.offset, we could compare them in
>>>>> push_data and discard the buffer if they don't match, but this seems
>>>>> a lot more complicated than catching this case earlier on and
>>>>> preventing all this confusion.
>>>>>
>>>>> 1:
>>>>> https://github.com/GStreamer/gst-plugins-base/blob/master/gst-libs/gst/app/gstappsrc.c#L1774 
>>>>>
>>>>>
>>>>>
>>>>> 2:
>>>>> https://github.com/GStreamer/gst-plugins-base/blob/master/gst-libs/gst/app/gstappsrc.c#L2526 
>>>>>
>>>>
>>>>
>>>> Okay, so I was operating under the assumption that appsrc would drop
>>>> buffers if the offset didn't match what it wanted, since that seemed
>>>> like the only safe way to implement the behaviour the documentation
>>>> describes. Apparently it doesn't. That means that I guess we can't
>>>> send more than one buffer at a time, i.e. we really need to give it
>>>> only what it asks for.
>>> Right, to be honest I completely forgot that we were sending in the
>>> offset through the buffer's field, but it seems that neither app source
>>> nor GstPad validate it (maybe something downstream would though?).
>>> Rethinking this after some time away from the keyboard, I now realize
>>> that my conditional only triggers in the first unproblematic case, and
>>> that only the assert in seek_data would catch the problem of the 2nd
>>> case.  I also now realize that, moving to using only next_pull_offset
>>> and a boolean flag, push_data would set on the buffer the updated/seeked
>>> offset, not the offset of the data that it actually retrieved. But even
>>> if we keep both of them to verify the offset in push_data ourselves, we
>>> would still have the possibility of going out of sync in continuous
>>> pulls unless GStreamer itself verified the offset, although I haven't
>>> though about the consequences of that.
>>>>
>>>> I'll have to submit a patch to GStreamer to clarify the documentation.
>>>>
>>>> In effect I guess that means we should keep the need-data/enough-data
>>>> introduced by this patch
>>> I'm not sure what you're referring to here, I didn't add enough-data
>>> handling to this patch.
>>>> only for stream mode, i.e. push mode. For random access mode we should
>>>> treat need-data as license to send only a single buffer. I.e. upon
>>>> receiving wg_parser_push_data() I guess we should reset whatever flag
>>>> or sentinel to nonsignaled, so that a subsequent
>>>> wg_parser_get_next_read_offset() blocks.
>>> That's already being done in this patch, and doesn't actually fix the
>>> issue.  I'm not actually sure what the correct way to fix this would be,
>>> tomorrow I'll try consulting with the GStreamer devs to make sure I'm
>>> not making up a problem and see what they suggest.
>>>
>>
>> Sorry, I think I initially failed to read your description carefully 
>> enough, and assumed it was something different.
>>
>> I see the problem now, but in fact this conditional (the one in your 
>> patch) isn't enough to fix it. We can get a race this bad:
>>
>>
>> thread A
>> --------------------------------------------------
>> queue is empty
>> emit need_data
>> queue is empty
>>                     wg_parser_get_next_read_offset
>>                     wg_parser_push_data
>> emit need_data
>>                     wg_parser_get_next_read_offset
>> ...
>>                     wg_parser_push_data
>>
>>
>> and push the wrong buffer. It won't be the same buffer (we won't get a 
>> seek, so we're supposed to push the next one consecutively), but it 
>> will be the wrong one.
>>
>> Unfortunately I think this means that appsrc is unusably broken. It 
>> can and should be fixed upstream, one way or another, but we can't use 
>> it in our code, at least not for a while.
> Yeah, this definitely seems to be something that should be fixed 
> upstream, I'm going to submit an issue.  In the meantime though, we can 
> workaround it by looking at the current-level-buffers property after we 
> acquire the mutex in need_data, this will tell us if any buffers have 
> been queued, and in such case the need-data would obviously be spurious.

Okay, I've run through it in my head and I think that does work. That's 
good.

>>
>> I could also be missing something, though.
>