[PATCH 2/3] winegstreamer: Use decodebin to initialize media streams.

Wed Sep 16 08:15:24 CDT 2020

On 9/15/20 5:47 PM, Zebediah Figura wrote:
> On 9/15/20 10:44 AM, Derek Lesho wrote:
>> On 9/15/20 10:07 AM, Zebediah Figura wrote:
>>
>>> On 9/15/20 8:47 AM, Derek Lesho wrote:
>>>> On 9/14/20 4:21 PM, Zebediah Figura wrote:
>>>>> On 9/11/20 12:23 PM, Derek Lesho wrote:
>>>>>> On 9/11/20 11:24 AM, Zebediah Figura wrote:
>>>>>>> On 9/11/20 10:18 AM, Derek Lesho wrote:
>>>>>>>> On 9/10/20 6:43 PM, Zebediah Figura wrote:
>>>>>>>>
>>>>>>>>> I got the impression from some past communication that the reason for
>>>>>>>>> using appsink in the first place (instead of just a solitary sink pad)
>>>>>>>>> is so that we can buffer, or is there some other reason?
>>>>>>>> Well, buffering is necessary, as media streams operate in a sort of
>>>>>>>> pull/push mode where we can only send them samples if they have
>>>>>>>> requested one with IMFMediaStream::RequestSample.
>>>>>>> Sure, but it also seems potentially simple enough to just wait on a
>>>>>>> semaphore in the chain function.
>>>>>> It's definitely more complicated for a system like that (I used to have
>>>>>> it that way).  With appsink, we just insert the request sample call into
>>>>>> a command queue, and pull the buffer from the appsink in the async
>>>>>> command.  The command queue is completely synchronous so we don't have
>>>>>> to worry about stuff like seeking or stopping interfering.  With the
>>>>>> semaphore solution, I'm not sure how much we have to worry about those
>>>>>> cases, but I remember having all sorts of strange bugs.
>>>>> Could you please describe in more detail the problems you encountered?
>>>> To be honest, I don't remember them too clearly, but back then I had a
>>>> system where the appsink was accessed by both RequestSample and the
>>>> new-sample callback. When there were outstanding requests, the
>>>> new-sample callback would dispatch MEMediaSample, and when there were
>>>> buffered samples, RequestSample would pop one and dispatch
>>>> MEMediaSample.  The problems occurred around the state of pending
>>>> samples being invalidated upon actions performed on the gstreamer
>>>> pipeline.  Of course, if we redid it now, I'm sure we could avoid this,
>>>> but it just seems like it would be more tricky.
>>>>> To be sure you'd need an extra event in there to handle flushes, but
>>>>> that doesn't seem like a deal-breaker to me either.
>>>> Flushing is what caused me the most trouble earlier on, yeah.
>>>>>>>      If it makes sense (for latency reasons)
>>>>>>> to buffer more than that, then there's a potential argument for appsink
>>>>>>> (but also, perhaps, just an argument for appending a queue element after
>>>>>>> each stream). I don't know/remember the relevant details for Media
>>>>>>> Foundation timing.
>>>>>>>
>>>>>>>> Also, appsink is just
>>>>>>>> more convenient than manually setting up a pad, I've also considered
>>>>>>>> changing the source pad to appsrc, but if it isn't broken....
>>>>>>>>
>>>>>>> Is it, though? I don't know what all the things you need to hook up are,
>>>>>>> but it looks to me like you essentially have to set up the same
>>>>>>> callbacks
>>>>>> We don't set up any callbacks for appsink, just pull a buffer when we
>>>>>> need one.
>>>>> Well, you do, actually, to get current caps (and, eventually, for caps
>>>>> negotiation). appsink does save you a chain function, but I don't
>>>>> immediately see anything else that it saves you.
>>>> https://source.winehq.org/git/wine.git/blob/HEAD:/dlls/winegstreamer/gstdemux.c#l2200
>>>>
>>>> Seems like it saves us from needing an event and query function as well,
>>>> going off the quartz equivalent.
>>> Not really, since you need to add a probe function instead. Note that
>>> you also don't need to handle most of the events that quartz does.
>>>
>>>> And it looks to me like there's a
>>>> significant enough amount of boilerplate in them for it to be undesirable.
>>> I'm not sure what in those functions qualifies as boilerplate?
>> I just took a quick look at it, and saw the gstreamer bug workaround,
>> but that's it.  So I take back what I said, it seems like it's just a
>> case of quartz pins aligning closer to gstreamer pads.
>>>>>> And given how much boilerplate and how many naming problems
>>>>>> callbacks cause in winegstreamer, it's probably better this way.
>>>>> Wiring up GStreamer callbacks is annoying, yes, but that annoyance may
>>>>> be worthwhile. In particular adding another callback doesn't actually
>>>>> increase code complexity; it's just copying an existing pattern.
>>>>>
>>>>> winegstreamer has some bad names, as I've complained in this thread, but
>>>>> that's a preëxisting problem that wouldn't be exacerbated by adding a
>>>>> chain function.
>>>>>
>>>>>> https://github.com/Guy1524/wine/commit/366946876d7a53f80d91749f290f511294415cf5#diff-2873f4b2d2de6a23dbc67b2960e28f88
>>>>>>
>>>>>>> , just in a different form. In the case of caps it makes things
>>>>>>> more than a little uglier
>>>>>> What does, appsink or a custom pad?  With appsink all you need to do is
>>>>>> set the desired caps property.
>>>>>>> , especially if you later end up doing more
>>>>>>> complicated caps negotiation. Note also that GstPad has default handling
>>>>>>> of EOS events and GST_PAD_IS_EOS(), which I'm inclined to understand is
>>>>>>> more useful for MF than handling GST_EVENT_EOS directly.
>>>>>> I'm not sure what you're referring to here.  All we need to do in terms
>>>>>> of end of stream is read the "eos" property on the appsink during the
>>>>>> RequestSample-derived command, and dispatch the relevant events.
>>>>> Yes, my point is that it's not really any different for GstPad; quartz
>>>>> waits for GST_EVENT_EOS but that's only because it matches quartz's
>>>>> threading model much better.
>>>>>
>>>>>
>>>>> To be quite clear, I'm not trying to argue that appsink shouldn't be
>>>>> used here; I'm just trying to understand all the potential factors
>>>>> behind the decision, and essentially stress-test it. So far as I see
>>>>> appsink provides a built-in buffer that matches mfplat's model nicely,
>>>>> which is good, but it also forces the use of probes and thereby some
>>>>> tricky parameter passing, which is not good.
>>>> Well, the probe isn't strictly necessary, another option is to wait on
>>>> the paused state, then read the caps from the pads, but the pad probe
>>>> way would be useful in a case where we want to enumerate all the fixed
>>>> types to expose on the stream's media type handler.
>>> Wait for what, exactly?
>> Well, at first, my thought was that decodebin may output a wide array of
>> possible formats, and I wanted to expose them in the stream's media type
>> handler.  I can't just go off the caps exposed in pad-added, as they are
>> just templates.  As it happened though, it seemed like the only things
>> which would vary are the video formats, so instead I decided to just
>> take the first caps decodebin exposes, then create duplicate the type
>> for every output format videoconvert exposes.  I kept the pad probe
>> method for getting caps, but I'm not currently aware of any circumstance
>> where we'd still need to retain that ability to loop through every cap
>> possibility.  If you don't think this is something we'll rub up against
>> in the future, I can get rid of the probe and have 0 sink callbacks :-)
> What you are (were) looking for is an (upstream) GST_QUERY_CAPS, or
> gst_pad_query_caps() for short.
>
> You can't get rid of the probe in any case, because you need to wait for
> caps to actually be negotiated before you can query them at all. That
> is, you need to wait for GST_EVENT_CAPS in some form or another. As far
> as I can tell there's no API-level guarantee that the caps will be fixed
> by the time a ready -> paused transition completes.
In that case, I could just wait on pull-preroll instead.
>
> While exposing every format that GStreamer exposes is a sound choice,
> it's also not the only one, or clearly the best. In particular, when I
> ran into bug 47642 (or variants thereof) in quartz, it seemed clear that
> we'd need videoconvert and audioconvert plugins to convert from formats
> that DirectShow can't express. In that case it was just as simple to ask
> videoconvert to do e.g. color space conversion to a fixed list of
> formats that DirectShow sinks would likely be able to accept. In that
> case trying to expose all of the formats natively supported by a sink is
> more work for probably no benefit. mfplat looks like it has about the
> same limitations (no way to express RGB video, for example) with respect
> to format, so this may also be the course that we want to take there.
Makes sense, so should I remove the caps listener and use pull-preroll, 
or keep it as is?
>