[PATCH v2 4/5] winegstreamer: Implement IMFMediaSource::Start.

Derek Lesho dlesho at codeweavers.com
Thu Oct 8 12:06:00 CDT 2020


On 10/8/20 11:33 AM, Zebediah Figura wrote:
> On 10/7/20 3:19 PM, Derek Lesho wrote:
>> On 10/7/20 1:53 PM, Zebediah Figura wrote:
>>> On 10/7/20 9:58 AM, Derek Lesho wrote:
>>>> On 10/6/20 9:58 PM, Zebediah Figura wrote:
>>>>> On 10/6/20 10:59 AM, Derek Lesho wrote:
>>>>>> +        if (position->vt != VT_EMPTY)
>>>>>> +        {
>>>>>> +            GstEvent *seek_event = gst_event_new_seek(1.0, GST_FORMAT_TIME, GST_SEEK_FLAG_FLUSH,
>>>>>> +                    GST_SEEK_TYPE_SET, position->u.hVal.QuadPart / 100, GST_SEEK_TYPE_NONE, 0);
>>>>>> +            GstSample *preroll;
>>>>>> +
>>>>>> +            gst_pad_push_event(stream->my_sink, seek_event);
>>>>>> +
>>>>>> +            /* this works because the pre-seek preroll is already removed by media_source_constructor */
>>>>>> +            g_signal_emit_by_name(stream->appsink, "pull-preroll", &preroll);
>>>>>> +            if (preroll)
>>>>>> +                gst_sample_unref(preroll);
>>>>> What is the point of doing this?
>>>> To wait for the flush to complete.  What is the point of waiting for the
>>>> flush to complete?  To ensure we don't send any MEMediaSample events
>>>> from before the seek after MESourceSeeked.
>>> I don't think this is necessary; I don't think a flush can be
>>> asynchronous, nor a seek.
>> Interesting, so gst_pad_push_event blocks until the seek event is
>> handled, but I'm not sure whether that is before or after the flush
>> start event has been propagated downstream, as what we're really
>> waiting for is for appsink to receive that event and invalidate all
>> buffered samples.  I'll look into this, and if just waiting for
>> gst_pad_push_event to return suffices, I'll remove the hack.
> A flushing seek sends both FLUSH_START and FLUSH_STOP events downstream.
Yes
>
> Moreover, on further examination, why do you need to wait for this at
> all? The flushing seek explicitly guarantees that stale samples will be
> discarded; even if the event delivery were itself asynchronous, you're
> not sending samples as soon as you get them, but rather only when the
> application requests them, so as far as I can see the MEStreamSeeked
> (&c.) events sent by this function will still be serialized with respect
> to MEMediaSample events.
Yes, my only concern is the case where we send the seek event, then 
proceed to a ::RequestSample call, which calls pull-sample before the 
FLUSH_START event reaches the appsink.  That would cause a pre-seek 
buffer from the appsink's queue to be returned.
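
For reference, a minimal sketch of the sequence I mean (both calls run
on the source's serial work queue thread; the interleaving with the
event propagation is the hypothetical part):

    /* in the Start handler: */
    gst_pad_push_event(stream->my_sink, seek_event);  /* flushing seek */

    /* in a subsequent RequestSample handler: if FLUSH_START had not
     * reached the appsink yet, this could still hand back a buffer
     * that was queued before the seek. */
    g_signal_emit_by_name(stream->appsink, "pull-sample", &sample);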
>
> In fact, another thing: doesn't that mean you need to somehow flush the
> work queue here?

This function is in step with the work queue, so all previous sample 
requests will have been fulfilled by the time it runs, as documented on 
MSDN (see the sketch after the quote):

> After Start is called, each stream on the media source must do one of
> the following:
>
>   * Deliver media data in response to IMFMediaStream::RequestSample
>     <https://docs.microsoft.com/en-us/windows/desktop/api/mfidl/nf-mfidl-imfmediastream-requestsample> calls.
>   * Send MEStreamTick
>     <https://docs.microsoft.com/en-us/windows/desktop/medfound/mestreamtick> events
>     to indicate a gap in the stream.
>   * Send an MEEndOfStream
>     <https://docs.microsoft.com/en-us/windows/desktop/medfound/meendofstream> event
>     to indicate the end of the stream.
>
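
Roughly, the serialization looks like this (SOURCE_ASYNC_START is the
command from this patch; the REQUEST_SAMPLE case is paraphrased and the
surrounding plumbing is elided):

    static HRESULT WINAPI source_async_commands_Invoke(IMFAsyncCallback *iface,
            IMFAsyncResult *result)
    {
        /* One serial work queue processes every source command, so a
         * SOURCE_ASYNC_START can't run until every previously queued
         * sample request has been fulfilled. */
        switch (command->op)
        {
            case SOURCE_ASYNC_REQUEST_SAMPLE:
                /* pull from the appsink, queue MEMediaSample */
                break;
            case SOURCE_ASYNC_START:
                /* seek/start; MEStreamSeeked/MEStreamStarted are thus
                 * ordered after all earlier MEMediaSample events */
                break;
        }
        return S_OK;
    }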

>
>>>>>> +
>>>>>> +            IMFMediaEventQueue_QueueEventParamVar(stream->event_queue,
>>>>>> +                seek_message ? MEStreamSeeked : MEStreamStarted, &GUID_NULL, S_OK, position);
>>>>>> +        }
>>>>>> +    }
>>>>>> +
>>>>>> +    IMFMediaEventQueue_QueueEventParamVar(source->event_queue,
>>>>>> +        seek_message ? MESourceSeeked : MESourceStarted,
>>>>>> +        &GUID_NULL, S_OK, position);
>>>>>> +
>>>>>> +    source->state = SOURCE_RUNNING;
>>>>>> +
>>>>>> +    gst_element_set_state(source->container, GST_STATE_PLAYING);
>>>>>> +    gst_element_get_state(source->container, NULL, NULL, -1);
>>>>> And if this method is asynchronous, do you need this
>>>>> gst_element_get_state() call?
>>>> Probably not, but not because the method is asynchronous, but rather
>>>> because it shouldn't affect how we interact with the appsink/s.
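In other words, the blocking wait can probably just be dropped; a sketch
of what would remain:

    gst_element_set_state(source->container, GST_STATE_PLAYING);
    /* no blocking gst_element_get_state(source->container, NULL, NULL, -1);
     * the async state change shouldn't affect how we talk to the appsink */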
>>>>>> @@ -432,6 +717,8 @@ static HRESULT media_stream_connect_to_sink(struct media_stream *stream)
>>>>>>         if (gst_pad_link(stream->their_src, stream->my_sink) != GST_PAD_LINK_OK)
>>>>>>             return E_FAIL;
>>>>>>
>>>>>> +    g_object_set(stream->appsink, "caps", source_caps, NULL);
>>>>>> +
>>>>> Again, what's the point of this?
>>>> To see whether the caps have changed since the prior selection in
>>>> start_pipeline.  If they have changed, we must send a reconfigure event.
>>> This might need clarification in the code, then; it looks redundant at
>>> first glance.
>>>
>>> It may even be better to remove it. I have my doubts that it will make a
>>> difference in terms of performance; in particular I suspect that the
>>> caps GStreamer generates and the caps generated by
>>> caps_from_mf_media_type() will have subtle differences.
>>>
>> That's true, I guess I'll remove it.
>>>>>> +    if (SUCCEEDED(hr = source_create_async_op(SOURCE_ASYNC_START, &command)))
>>>>>> +    {
>>>>>> +        command->u.start.descriptor = descriptor;
>>>>>> +        command->u.start.format = *time_format;
>>>>>> +        PropVariantCopy(&command->u.start.position, position);
>>>>>> +
>>>>>> +        hr = MFPutWorkItem(source->async_commands_queue, &source->async_commands_callback, &command->IUnknown_iface);
>>>>>> +    }
>>>>> There are likely mfplat peculiarities I'm not aware of here, but does
>>>>> this need to be asynchronous? I know that the documentation says "this
>>>>> method is asynchronous", but it's not immediately obvious to me that
>>>>> there isn't already a level of asynchronicity between this call and any
>>>>> of its effects.
>>>> Well, I think that making the media source calls synchronous in their
>>>> own thread simplifies things since you don't have to worry about
>>>> blocking or synchronization with other calls.  For example,
>>>> ::RequestSample, if synchronous, would have to have two paths, one for
>>>> popping signals from the appsink and one for issuing the request into
>>>> some dedicated request queue.
>>> I don't think I understand this; can you please explain in more detail?
>> Yeah, in my first implementation, I set up a new-sample callback with the
>> appsink, as well as a queue using a doubly linked list for sample
>> requests.  When ::RequestSample was called, if the appsink had any
>> buffered samples, I would immediately dispatch the sample with the token
>> sent in.  Otherwise, I would insert a sample request into the queue and
>> return.  When the new-sample callback was run, I'd pop the request from
>> the queue and send the event.  In IMFMediaSource::Start.  I still have
>> the code for this lying around on my GitHub [1], it was quite an
>> unnecessary mess.
> Okay, the point then is that IMFMediaStream::RequestSample() is supposed
> to always return immediately?
Yes, it is supposed to.  One thing I failed to mention in my previous 
email (the incomplete "In IMFMediaSource::Start." sentence) is that we 
would also need code in ::Start for draining the ::RequestSample queue, 
again increasing complexity.
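
To illustrate the two paths (reconstructed from memory; helper names
like dispatch_sample, request_create, and the token_queue list are
approximate):

    static HRESULT WINAPI media_stream_RequestSample(IMFMediaStream *iface, IUnknown *token)
    {
        struct media_stream *stream = impl_from_IMFMediaStream(iface);
        GstSample *sample = NULL;

        /* Path 1: a sample is already buffered; dispatch immediately. */
        g_signal_emit_by_name(stream->appsink, "try-pull-sample", (guint64)0, &sample);
        if (sample)
            return dispatch_sample(stream, sample, token);

        /* Path 2: park the token; the new-sample callback pops it
         * later.  ::Start then also has to drain this queue. */
        EnterCriticalSection(&stream->cs);
        list_add_tail(&stream->token_queue, &request_create(token)->entry);
        LeaveCriticalSection(&stream->cs);
        return S_OK;
    }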
>
>>>> This was actually the implementation I
>>>> had at first, but then I implemented the async reader callback with the
>>>> same system, and Nikolay instead insisted that I use a command queue,
>>>> which reduces complexity a lot.  Then when adding seeking support, I
>>>> switched to this approach for the media source.  If nothing else, this
>>>> system matches other media foundation components better
>>>> (IMFMediaSession, IMFSourceReader), which reduces overhead for the
>>>> reader.
>>> Even if RequestSample() needs to be asynchronous, is there a reason that
>>> Start() needs to be asynchronous? Thread safety is obviously important,
>>> but that can be achieved with simple locking.
>> One thing I want to avoid here is blocking too much in a function
>> which is documented to be asynchronous.  We have to wait for the media
>> streams to finish responding to requested samples in this function,
>> which may take a non-trivial amount of time.  I don't think it's
>> unreasonable to expect there to be applications which call
>> IMFMediaSource methods from their interactive thread, in which case
>> our blocking would reduce responsiveness.
> This strikes me as premature optimization. But if you're really
> concerned, the best way to resolve this is in fact to measure the time
> this function takes.
The time it takes depends on multiple factors: whether it's a seek or a 
start; which demuxer, decoders, and other elements are hooked up; and 
mainly how many ::RequestSample requests are in the queue at the time of 
the call.  Plus, responsiveness isn't the only reason to keep it 
asynchronous, as documented.  As mentioned earlier, it better aligns 
with other media foundation components and reduces code complexity.
>
>>>>>> +
>>>>>> +    return hr;
>>>>>>  }
>>>>>>
>>>>>>  static HRESULT WINAPI media_source_Stop(IMFMediaSource *iface)
>>>>>> @@ -772,6 +1073,9 @@ static HRESULT WINAPI media_source_Shutdown(IMFMediaSource *iface)
>>>>>>         if (source->no_more_pads_event)
>>>>>>             CloseHandle(source->no_more_pads_event);
>>>>>>
>>>>>> +    if (source->async_commands_queue)
>>>>>> +        MFUnlockWorkQueue(source->async_commands_queue);
>>>>>> +
>>>>>>         return S_OK;
>>>>>>  }
>>>>>>
>>>>>> @@ -852,6 +1156,7 @@ static HRESULT media_source_constructor(IMFByteStream *bytestream, struct media_
>>>>>>         return E_OUTOFMEMORY;
>>>>>>
>>>>>>     object->IMFMediaSource_iface.lpVtbl = &IMFMediaSource_vtbl;
>>>>>> +    object->async_commands_callback.lpVtbl = &source_async_commands_callback_vtbl;
>>>>>>     object->ref = 1;
>>>>>>     object->byte_stream = bytestream;
>>>>>>     IMFByteStream_AddRef(bytestream);
>>>>>> @@ -860,6 +1165,9 @@ static HRESULT media_source_constructor(IMFByteStream *bytestream, struct media_
>>>>>>     if (FAILED(hr = MFCreateEventQueue(&object->event_queue)))
>>>>>>         goto fail;
>>>>>>
>>>>>> +    if (FAILED(hr = MFAllocateWorkQueue(&object->async_commands_queue)))
>>>>>> +        goto fail;
>>>>>> +
>>>>>>     object->container = gst_bin_new(NULL);
>>>>>>     object->bus = gst_bus_new();
>>>>>>     gst_bus_set_sync_handler(object->bus, mf_src_bus_watch_wrapper, object, NULL);
>>>>>> diff --git a/dlls/winegstreamer/mfplat.c b/dlls/winegstreamer/mfplat.c
>>>>>> index 2e8b0978648..9aa17ad00ab 100644
>>>>>> --- a/dlls/winegstreamer/mfplat.c
>>>>>> +++ b/dlls/winegstreamer/mfplat.c
>>>>>> @@ -601,3 +601,152 @@ IMFMediaType *mf_media_type_from_caps(const GstCaps *caps)
>>>>>>     return media_type;
>>>>>>  }
>>>>>> +
>>>>>> +GstCaps *caps_from_mf_media_type(IMFMediaType *type)
>>>>>> +{
>>>>>> +    GUID major_type;
>>>>>> +    GUID subtype;
>>>>>> +    GstCaps *output = NULL;
>>>>>> +
>>>>>> +    if (FAILED(IMFMediaType_GetMajorType(type, &major_type)))
>>>>>> +        return NULL;
>>>>>> +    if (FAILED(IMFMediaType_GetGUID(type, &MF_MT_SUBTYPE, &subtype)))
>>>>>> +        return NULL;
>>>>>> +
>>>>>> +    if (IsEqualGUID(&major_type, &MFMediaType_Video))
>>>>>> +    {
>>>>>> +        UINT64 frame_rate = 0, frame_size = 0;
>>>>>> +        DWORD width, height, framerate_num, framerate_den;
>>>>>> +        UINT32 unused;
>>>>>> +
>>>>>> +        if (FAILED(IMFMediaType_GetUINT64(type, &MF_MT_FRAME_SIZE, &frame_size)))
>>>>>> +            return NULL;
>>>>>> +        width = frame_size >> 32;
>>>>>> +        height = frame_size;
>>>>>> +        if (FAILED(IMFMediaType_GetUINT64(type, &MF_MT_FRAME_RATE, &frame_rate)))
>>>>>> +        {
>>>>>> +            frame_rate = TRUE;
>>>>>> +            framerate_num = 0;
>>>>>> +            framerate_den = 1;
>>>>>> +        }
>>>>>> +        else
>>>>>> +        {
>>>>>> +            framerate_num = frame_rate >> 32;
>>>>>> +            framerate_den = frame_rate;
>>>>>> +        }
>>>>>> +
>>>>>> +        /* Check if type is uncompressed */
>>>>>> +        if (SUCCEEDED(MFCalculateImageSize(&subtype, 100, 100, &unused)))
>>>>> Early return could save a level of indentation.
>>>> Yes but in future patches there will be else statements for handling
>>>> compressed types.  Even if we don't end up using this for the media
>>>> source, it will be necessary for games which manually use decoder MFTs.
>>>>> I also feel like there's an easier way to do this check. Maybe
>>>>> something
>>>>> like:
>>>>>
>>>>> for (i = 0; i < ARRAY_SIZE(uncompressed_video_formats); i++)
>>>>> {
>>>>>        ...
>>>>> }
>>>>>
>>>>> if (format == GST_VIDEO_FORMAT_UNKNOWN
>>>>>            && !memcmp(&subtype, &MFVideoFormat_Base.Data2, ...))
>>>>>        format = gst_video_format_from_fourcc(subtype.Data1);
>>>>>
>>>>> if (format == GST_VIDEO_FORMAT_UNKNOWN)
>>>>> {
>>>>>        FIXME("Unrecognized subtype %s.\n", debugstr_guid(&subtype));
>>>>>        return NULL;
>>>>> }
>>>> That's certainly a possibility, but using MFCalculateImageSize is
>>>> probably the best way to concisely determine whether we want to take the
>>>> uncompressed video format route.  With your solution, I guess we'd just
>>>> add the handling for those formats in `if (format ==
>>>> GST_VIDEO_FORMAT_UNKNOWN)`.  Either way is fine, which path should I
>>>> take?
>>> I don't see any reason we need to check whether it's uncompressed in
>>> general; both "an uncompressed format we can't handle" and "a compressed
>>> format" yield the same result.
>> Right now they do, but my point is that in future patches the structure
>> is more like
>>
>> if (uncompressed) { ... } else if (format == h264) { ... } else if
>> (format == wmv)
> Right, I don't see why there's any point doing that when you can instead
> just leave the uncompressed case as the default, e.g.
>
> if (IsEqualGUID(&subtype, &some_compressed_format))
>      ...
> else if (IsEqualGUID(&subtype, &some_other_compressed_format))
>      ...
> else
> {
>      for (i = 0; i < ARRAY_SIZE(uncompressed_video_formats); ++i)
>          ...
>
>      if (format == GST_VIDEO_FORMAT_UNKNOWN)
>          ...
>
>      if (format == GST_VIDEO_FORMAT_UNKNOWN)
>          return NULL;
> }
Sounds good
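So, roughly like this, then (the uncompressed_video_formats table and
the choice of H.264 as the example compressed subtype are illustrative;
the table would pair MF subtype GUIDs with GstVideoFormat values):

    if (IsEqualGUID(&subtype, &MFVideoFormat_H264))
    {
        output = gst_caps_new_empty_simple("video/x-h264");
        /* ... h264-specific fields ... */
    }
    else
    {
        GstVideoFormat format = GST_VIDEO_FORMAT_UNKNOWN;
        unsigned int i;

        for (i = 0; i < ARRAY_SIZE(uncompressed_video_formats); ++i)
        {
            if (IsEqualGUID(&subtype, uncompressed_video_formats[i].subtype))
            {
                format = uncompressed_video_formats[i].format;
                break;
            }
        }

        if (format == GST_VIDEO_FORMAT_UNKNOWN
                && !memcmp(&subtype.Data2, &MFVideoFormat_Base.Data2, sizeof(GUID) - sizeof(DWORD)))
            format = gst_video_format_from_fourcc(subtype.Data1);

        if (format == GST_VIDEO_FORMAT_UNKNOWN)
        {
            FIXME("Unrecognized subtype %s.\n", debugstr_guid(&subtype));
            return NULL;
        }

        output = gst_caps_new_simple("video/x-raw", "format", G_TYPE_STRING,
                gst_video_format_to_string(format), NULL);
    }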
>
>>>>>> +            FIXME("Unrecognized subtype %s\n", debugstr_guid(&subtype));
>>>>>> +            return NULL;
>>>>>> +        }
>>>>>> +
>>>>>> +
>>>>>> +        if (frame_size)
>>>>>> +        {
>>>>>> +            gst_caps_set_simple(output, "width", G_TYPE_INT, width, NULL);
>>>>>> +            gst_caps_set_simple(output, "height", G_TYPE_INT, height, NULL);
>>>>>> +        }
>>>>>> +        if (frame_rate)
>>>>> Why do you need to check this is nonzero? Can the frame rate really
>>>>> be 0/0?
>>>> To be honest, I don't remember exactly which circumstance led to this
>>>> being necessary.  I assume it was something with the IMFMediaType MK11
>>>> generates, but it does seem very strange so I'll have to check again.
>>>>> For that matter, do we need to give GStreamer the framerate at all? It
>>>>> should only ever matter to sink elements (or those that respect a
>>>>> clock).
>>>> Why risk seeing what happens when you don't translate this cap; are
>>>> there any benefits?  Maybe a decoder would need it to handle some
>>>> time-based seek?
>>>>
>>> In fact, translating the framerate in quartz has caused problems before,
>>> as applications didn't always provide it, or provided the wrong one, and
>>> thus the caps would not match; as a result we always remove it. On the
>>> other hand, I haven't seen any problems arise from clearing this field.
>>> In practice I don't think it should ever be necessary, because the
>>> parser provides the framerate.
>> My feeling is that such a workaround should be put in place where it
>> causes problems, not in a general conversion function which may be used
>> (as in later patches) for other purposes where the cap is necessary.
> I don't think either translating the frame rate or not translating the
> frame rate is really incorrect, such that one can be called a
> workaround. And one is clearly more work than the other. [Note that
> gstdemux explicitly clears the framerate because it uses
> gst_video_info_to_caps() instead of building the caps structure manually.]
In quartz, you have a static function which is only used within the 
context of your equivalent of the media source.  This definition resides 
inside mfplat.c, where it will also be used by the decoder and, in the 
future, by encoders (some games use media foundation's encoders to save 
highlights to disk, and streaming services use them for obvious 
reasons).  I find it very likely that an encoder would want to know the 
framerate of an incoming stream.
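
To be clear, the quartz-style workaround would amount to something like
the following, which I'd rather keep local to the media source than bake
into caps_from_mf_media_type (sketch only, mirroring what gstdemux
effectively does after gst_video_info_to_caps()):

    /* after building the caps, drop the field rather than advertise a
     * possibly-bogus rate */
    gst_structure_remove_field(gst_caps_get_structure(caps, 0), "framerate");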
>
>> For what it's worth, the chance of a media foundation application
>> feeding some bogus framerate is slim, as they usually just search the
>> media type handler for a desired type anyway, and we're supposed to fail
>> on SetCurrentMediaType if the media type they enter is unsupported.
>> 1:
>> https://github.com/Guy1524/wine/blob/mfplat/dlls/winegstreamer/media_source.c#L98
>>
>>
>