[PATCH v2 4/5] winegstreamer: Implement IMFMediaSource::Start.

Zebediah Figura zfigura at codeweavers.com
Thu Oct 8 14:16:01 CDT 2020


On 10/8/20 12:06 PM, Derek Lesho wrote:
> 
> On 10/8/20 11:33 AM, Zebediah Figura wrote:
>> On 10/7/20 3:19 PM, Derek Lesho wrote:
>>> On 10/7/20 1:53 PM, Zebediah Figura wrote:
>>>> On 10/7/20 9:58 AM, Derek Lesho wrote:
>>>>> On 10/6/20 9:58 PM, Zebediah Figura wrote:
>>>>>> On 10/6/20 10:59 AM, Derek Lesho wrote:
>>>>>>> +        if (position->vt != VT_EMPTY)
>>>>>>> +        {
>>>>>>> +            GstEvent *seek_event = gst_event_new_seek(1.0, GST_FORMAT_TIME, GST_SEEK_FLAG_FLUSH,
>>>>>>> +                    GST_SEEK_TYPE_SET, position->u.hVal.QuadPart / 100, GST_SEEK_TYPE_NONE, 0);
>>>>>>> +            GstSample *preroll;
>>>>>>> +
>>>>>>> +            gst_pad_push_event(stream->my_sink, seek_event);
>>>>>>> +
>>>>>>> +            /* this works because the pre-seek preroll is already removed by media_source_constructor */
>>>>>>> +            g_signal_emit_by_name(stream->appsink, "pull-preroll", &preroll);
>>>>>>> +            if (preroll)
>>>>>>> +                gst_sample_unref(preroll);
>>>>>> What is the point of doing this?
>>>>> To wait for the flush to complete.  What is the point of waiting for the
>>>>> flush to complete?  To ensure we don't send any MEMediaSample events
>>>>> from before the seek after MESourceSeeked.
>>>> I don't think this is necessary; I don't think a flush can be
>>>> asynchronous, nor a seek.
>>> Interesting, so gst_pad_push_event blocks until the seek event is
>>> handled, but I'm not sure if this is before or after the flush start
>>> event has been propagated downstream, as what we're really waiting for
>>> is appsink to receive that and invalidate all buffered samples.  I'll
>>> look into this and if just waiting for gst_pad_push_event to return
>>> suffices, I'll remove the hack.
>> A flushing seek sends both FLUSH_START and FLUSH_STOP events downstream.
> Yes
>> Moreover, on further examination, why do you need to wait for this at
>> all? The flushing seek explicitly guarantees that stale samples will be
>> discarded; even if the event delivery were itself asynchronous, you're
>> not sending samples as soon as you get them, but rather only when the
>> application requests them, so as far as I can see the MEStreamSeeked
>> (&c.) events sent by this function will still be serialized with respect
>> to MEMediaSample events.
> Yes, my only concern would be for when we send the seek event, then
> proceed to a ::RequestSample call, which then calls pull-sample before
> the flush-start event reaches the appsink.  This would cause a pre-seek
> buffer from the appsink's queue to be returned.

Right, I don't think that can happen; the seek has to complete before
gst_pad_push_event() returns.

>> In fact, another thing: doesn't that mean you need to somehow flush the
>> work queue here?
> 
> This function is in step with the work queue, so all previous sample
> requests will have been fulfilled, as is documented in the MSDN:
> 
>> After *Start* is called, each stream on the media source must do one
>> of the following:
>>
>>   * Deliver media data in response to IMFMediaStream::RequestSample
>>     <https://docs.microsoft.com/en-us/windows/desktop/api/mfidl/nf-mfidl-imfmediastream-requestsample> calls.
>>   * Send MEStreamTick
>>     <https://docs.microsoft.com/en-us/windows/desktop/medfound/mestreamtick> events
>>     to indicate a gap in the stream.
>>   * Send an MEEndOfStream
>>     <https://docs.microsoft.com/en-us/windows/desktop/medfound/meendofstream> event
>>     to indicate the end of the stream.
>>
> 

Okay, that's a good reason to make Start() asynchronous, then. I don't
know if that deserves to be communicated more clearly in the code.

>>>>>>> +
>>>>>>> +            IMFMediaEventQueue_QueueEventParamVar(stream->event_queue,
>>>>>>> +                seek_message ? MEStreamSeeked : MEStreamStarted, &GUID_NULL, S_OK, position);
>>>>>>> +        }
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    IMFMediaEventQueue_QueueEventParamVar(source->event_queue,
>>>>>>> +        seek_message ? MESourceSeeked : MESourceStarted, &GUID_NULL, S_OK, position);
>>>>>>> +
>>>>>>> +    source->state = SOURCE_RUNNING;
>>>>>>> +
>>>>>>> +    gst_element_set_state(source->container, GST_STATE_PLAYING);
>>>>>>> +    gst_element_get_state(source->container, NULL, NULL, -1);
>>>>>> And if this method is asynchronous, do you need this
>>>>>> gst_element_get_state() call?
>>>>> Probably not, but not because the method is asynchronous, but rather
>>>>> because it shouldn't affect how we interact with the appsink/s.
>>>>>>> @@ -432,6 +717,8 @@ static HRESULT media_stream_connect_to_sink(struct media_stream *stream)
>>>>>>>      if (gst_pad_link(stream->their_src, stream->my_sink) != GST_PAD_LINK_OK)
>>>>>>>          return E_FAIL;
>>>>>>>  
>>>>>>> +    g_object_set(stream->appsink, "caps", source_caps, NULL);
>>>>>>> +
>>>>>> Again, what's the point of this?
>>>>> To see whether the caps have changed since the prior selection in
>>>>> start_pipeline.  If they have changed, we must send a reconfigure event.
>>>> This might need clarification in the code, then; it looks redundant at
>>>> first glance.
>>>>
>>>> It may even be better to remove it. I have my doubts that it will make a
>>>> difference in terms of performance; in particular I suspect that the
>>>> caps GStreamer generates and the caps generated by
>>>> caps_from_mf_media_type() will have subtle differences.
>>>>
>>> That's true, I guess I'll remove it.
>>>>>>> +    if (SUCCEEDED(hr = source_create_async_op(SOURCE_ASYNC_START, &command)))
>>>>>>> +    {
>>>>>>> +        command->u.start.descriptor = descriptor;
>>>>>>> +        command->u.start.format = *time_format;
>>>>>>> +        PropVariantCopy(&command->u.start.position, position);
>>>>>>> +
>>>>>>> +        hr = MFPutWorkItem(source->async_commands_queue, &source->async_commands_callback, &command->IUnknown_iface);
>>>>>>> +    }
>>>>>> There are likely mfplat pecularities I'm not aware of here, but does
>>>>>> this need to be asynchronous? I know that the documentation says "this
>>>>>> method is asynchronous", but it's not immediately obvious to me that
>>>>>> there isn't already a level of asynchronicity between this call and any
>>>>>> of its effects.
>>>>> Well, I think that making the media source calls synchronous in their
>>>>> own thread simplifies things since you don't have to worry about
>>>>> blocking or synchronization with other calls.  For example,
>>>>> ::RequestSample, if synchronous, would have to have two paths, one for
>>>>> popping signals from the appsink and one for issuing the request into
>>>>> some dedicated request queue.
>>>> I don't think I understand this; can you please explain in more detail?
>>> Yeah, in my first implementation, I set up a new-sample callback with the
>>> appsink, as well as a queue using a doubly linked list for sample
>>> requests.  When ::RequestSample was called, if the appsink had any
>>> buffered samples, I would immediately dispatch the sample with the token
>>> sent in.  Otherwise, I would insert a sample request into the queue and
>>> return.  When the new-sample callback was run, I'd pop the sample from
>>> the queue and send the event.  In IMFMediaSource::Start.  I still have
>>> the code for this lying around on my github [1], it was quite an
>>> unnecessary mess.
>> Okay, the point then is that IMFMediaSample::RequestSample() is supposed
>> to always return immediately?
> Yes, it is supposed to.  One thing I failed to mention in my previous
> email (the incomplete "In IMFMediaSource::Start." sentence) is that we
> also require code for draining the ::RequestSample queue in ::Start,
> again increasing complexity.
>>>>> This was actually the implementation I
>>>>> had at first, but then I implemented the async reader callback with the
>>>>> same system, and Nikolay instead insisted that I use a command queue,
>>>>> which reduces complexity a lot.  Then when adding seeking support, I
>>>>> switched to this approach for the media source.  If nothing else, this
>>>>> system matches other media foundation components better
>>>>> (IMFMediaSession, IMFSourceReader), which reduces overhead for the
>>>>> reader.
>>>> Even if RequestSample() needs to be asynchronous, is there a reason that
>>>> Start() needs to be asynchronous? Thread safety is obviously important,
>>>> but that can be achieved with simple locking.
>>> One thing I want to avoid here is blocking too much during a function
>>> which is documented to be asynchronous.  We have to wait for the media
>>> streams to finish up responding to requested samples in this function,
>>> which may take a non-trivial amount of time.  I don't think it's
>>> unreasonable to expect there to be applications which call
>>> IMFMediaSource methods from their interactive thread, in which case our
>>> blocking would reduce responsiveness.
>> This strikes me as premature optimization. But if you're really
>> concerned, the best way to resolve this is in fact to measure the time
>> this function takes.
> The time it takes depends on multiple factors: whether it's a seek or a
> start, which demuxer, decoders, and elements are hooked up, and mainly
> how many ::RequestSample instances are in the queue at the time of the
> call.  Plus, it's not the only reason to keep it asynchronous as
> documented.  As mentioned earlier, it aligns better with other media
> foundation classes, and reduces code complexity.
>>>>>>> +
>>>>>>> +    return hr;
>>>>>>>  }
>>>>>>>  
>>>>>>>  static HRESULT WINAPI media_source_Stop(IMFMediaSource *iface)
>>>>>>> @@ -772,6 +1073,9 @@ static HRESULT WINAPI media_source_Shutdown(IMFMediaSource *iface)
>>>>>>>      if (source->no_more_pads_event)
>>>>>>>          CloseHandle(source->no_more_pads_event);
>>>>>>>  
>>>>>>> +    if (source->async_commands_queue)
>>>>>>> +        MFUnlockWorkQueue(source->async_commands_queue);
>>>>>>> +
>>>>>>>      return S_OK;
>>>>>>>  }
>>>>>>>  
>>>>>>> @@ -852,6 +1156,7 @@ static HRESULT media_source_constructor(IMFByteStream *bytestream, struct media_
>>>>>>>          return E_OUTOFMEMORY;
>>>>>>>  
>>>>>>>      object->IMFMediaSource_iface.lpVtbl = &IMFMediaSource_vtbl;
>>>>>>> +    object->async_commands_callback.lpVtbl = &source_async_commands_callback_vtbl;
>>>>>>>      object->ref = 1;
>>>>>>>      object->byte_stream = bytestream;
>>>>>>>      IMFByteStream_AddRef(bytestream);
>>>>>>> @@ -860,6 +1165,9 @@ static HRESULT media_source_constructor(IMFByteStream *bytestream, struct media_
>>>>>>>      if (FAILED(hr = MFCreateEventQueue(&object->event_queue)))
>>>>>>>          goto fail;
>>>>>>>  
>>>>>>> +    if (FAILED(hr = MFAllocateWorkQueue(&object->async_commands_queue)))
>>>>>>> +        goto fail;
>>>>>>> +
>>>>>>>      object->container = gst_bin_new(NULL);
>>>>>>>      object->bus = gst_bus_new();
>>>>>>>      gst_bus_set_sync_handler(object->bus, mf_src_bus_watch_wrapper, object, NULL);
>>>>>>> diff --git a/dlls/winegstreamer/mfplat.c b/dlls/winegstreamer/mfplat.c
>>>>>>> index 2e8b0978648..9aa17ad00ab 100644
>>>>>>> --- a/dlls/winegstreamer/mfplat.c
>>>>>>> +++ b/dlls/winegstreamer/mfplat.c
>>>>>>> @@ -601,3 +601,152 @@ IMFMediaType *mf_media_type_from_caps(const GstCaps *caps)
>>>>>>>      return media_type;
>>>>>>>  }
>>>>>>> +
>>>>>>> +GstCaps *caps_from_mf_media_type(IMFMediaType *type)
>>>>>>> +{
>>>>>>> +    GUID major_type;
>>>>>>> +    GUID subtype;
>>>>>>> +    GstCaps *output = NULL;
>>>>>>> +
>>>>>>> +    if (FAILED(IMFMediaType_GetMajorType(type, &major_type)))
>>>>>>> +        return NULL;
>>>>>>> +    if (FAILED(IMFMediaType_GetGUID(type, &MF_MT_SUBTYPE, &subtype)))
>>>>>>> +        return NULL;
>>>>>>> +
>>>>>>> +    if (IsEqualGUID(&major_type, &MFMediaType_Video))
>>>>>>> +    {
>>>>>>> +        UINT64 frame_rate = 0, frame_size = 0;
>>>>>>> +        DWORD width, height, framerate_num, framerate_den;
>>>>>>> +        UINT32 unused;
>>>>>>> +
>>>>>>> +        if (FAILED(IMFMediaType_GetUINT64(type, &MF_MT_FRAME_SIZE, &frame_size)))
>>>>>>> +            return NULL;
>>>>>>> +        width = frame_size >> 32;
>>>>>>> +        height = frame_size;
>>>>>>> +        if (FAILED(IMFMediaType_GetUINT64(type, &MF_MT_FRAME_RATE, &frame_rate)))
>>>>>>> +        {
>>>>>>> +            frame_rate = TRUE;
>>>>>>> +            framerate_num = 0;
>>>>>>> +            framerate_den = 1;
>>>>>>> +        }
>>>>>>> +        else
>>>>>>> +        {
>>>>>>> +            framerate_num = frame_rate >> 32;
>>>>>>> +            framerate_den = frame_rate;
>>>>>>> +        }
>>>>>>> +
>>>>>>> +        /* Check if type is uncompressed */
>>>>>>> +        if (SUCCEEDED(MFCalculateImageSize(&subtype, 100, 100, &unused)))
>>>>>> Early return could save a level of indentation.
>>>>> Yes but in future patches there will be else statements for handling
>>>>> compressed types.  Even if we don't end up using this for the media
>>>>> source, it will be necessary for games which manually use decoder MFTs.
>>>>>> I also feel like there's an easier way to do this check. Maybe
>>>>>> something
>>>>>> like:
>>>>>>
>>>>>> for (i = 0; i < ARRAY_SIZE(uncompressed_video_formats); i++)
>>>>>> {
>>>>>>       ...
>>>>>> }
>>>>>>
>>>>>> if (format == GST_VIDEO_FORMAT_UNKNOWN
>>>>>>           && !memcmp(&subtype, &MFVideoFormat_Base.Data2, ...))
>>>>>>       format = gst_video_format_from_fourcc(subtype.Data1);
>>>>>>
>>>>>> if (format == GST_VIDEO_FORMAT_UNKNOWN)
>>>>>> {
>>>>>>       FIXME("Unrecognized subtype %s.\n", debugstr_guid(&subtype));
>>>>>>       return NULL;
>>>>>> }
>>>>> That's certainly a possibility, but using MFCalculateImageSize is
>>>>> probably the best way to concisely determine whether we want to take the
>>>>> uncompressed video format route.  With your solution, I guess we'd just
>>>>> add the handling for those formats in `if (format ==
>>>>> GST_VIDEO_FORMAT_UNKNOWN)`.  Either way is fine, which path should I
>>>>> take?
>>>> I don't see any reason we need to check whether it's uncompressed in
>>>> general; both "an uncompressed format we can't handle" and "a compressed
>>>> format" yield the same result.
>>> Right now they do, but my point is that in future patches the structure
>>> is more like
>>>
>>> if (uncompressed) { ... } else if (format == h264) { ... } else if
>>> (format == wmv)
>> Right, I don't see why there's any point doing that when you can instead
>> just leave the uncompressed case as the default, e.g.
>>
>> if (IsEqualGUID(&subtype, &some_compressed_format))
>>     ...
>> else if (IsEqualGUID(&subtype, &some_other_compressed_format))
>>     ...
>> else
>> {
>>     for (i = 0; i < ARRAY_SIZE(uncompressed_video_formats); ++i)
>>         ...
>>
>>     if (format == GST_VIDEO_FORMAT_UNKNOWN)
>>         ...
>>
>>     if (format == GST_VIDEO_FORMAT_UNKNOWN)
>>         return NULL;
>> }
> Sounds good
>>>>>>> +            FIXME("Unrecognized subtype %s\n", debugstr_guid(&subtype));
>>>>>>> +            return NULL;
>>>>>>> +        }
>>>>>>> +
>>>>>>> +
>>>>>>> +        if (frame_size)
>>>>>>> +        {
>>>>>>> +            gst_caps_set_simple(output, "width", G_TYPE_INT, width, NULL);
>>>>>>> +            gst_caps_set_simple(output, "height", G_TYPE_INT, height, NULL);
>>>>>>> +        }
>>>>>>> +        if (frame_rate)
>>>>>> Why do you need to check this is nonzero? Can the frame rate really
>>>>>> be 0/0?
>>>>> To be honest, I don't remember exactly which circumstance led to this
>>>>> being necessary.  I assume it was something with the IMFMediaType MK11
>>>>> generates, but it does seem very strange so I'll have to check again.
>>>>>> For that matter, do we need to give GStreamer the framerate at all? It
>>>>>> should only ever matter to sink elements (or those that respect a
>>>>>> clock).
>>>>> Why risk seeing what happens when you don't translate this cap, are
>>>>> there any benefits?  Maybe a decoder would need it to handle
>>>>> some time-based seek?
>>>>>
>>>> In fact, translating the framerate in quartz has caused problems before,
>>>> as applications didn't always provide it, or provided the wrong one, and
>>>> thus the caps would not match; as a result we always remove it. On the
>>>> other hand, I haven't seen any problems arise from clearing this field.
>>>> In practice I don't think it should ever be necessary, because the
>>>> parser provides the framerate.
>>> My feeling is that such a workaround should be put in place where it
>>> causes problems, not in a general conversion function which may be used
>>> (as in later patches) for other purposes where the cap is necessary. 
>> I don't think either translating the frame rate or not translating the
>> frame rate is really incorrect, such that one can be called a
>> workaround. And one is clearly more work than the other. [Note that
>> gstdemux explicitly clears the framerate because it uses
>> gst_video_info_to_caps() instead of building the caps structure manually.]
> In quartz, you have a static function which is only used within the
> context of your equivalent of the media source.  This definition resides
> inside mfplat.c, where it will also be used by the decoder, and in the
> future, encoders (some games use the encoders of media foundation to
> save highlights to disk, and streaming services use it for obvious
> reasons).  I find it very likely that an encoder would want to know the
> framerate of an incoming stream.

Okay, I checked, and at least some decoders (dvdec) do demand to know
the framerate, for some reason. [dvdec doesn't actually do anything with
this information except pass it on to the downstream pad. Nor can I
guess why a decoder would need to know the framerate in general].

>>> For what it's worth, the chance of a media foundation application
>>> feeding some bogus framerate is slim, as they usually just search the
>>> media type handler for a desired type anyway, and we're supposed to fail
>>> on SetCurrentMediaType if the media type they enter is unsupported.
>>> 1:
>>> https://github.com/Guy1524/wine/blob/mfplat/dlls/winegstreamer/media_source.c#L98
>>>
>>>

