<div dir="ltr"><div>> It obviously wouldn't match native structure, but it's not clear to me that it would fail to match native in a way that would cause problems.</div><div><br></div><div>Famous last words.</div><div><br></div><div>- Josh 🐸</div>


</div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, 26 Mar 2020 at 22:09, Zebediah Figura &lt;<a href="mailto:zfigura@codeweavers.com" target="_blank">zfigura@codeweavers.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">There's another broad question I have with this approach, actually,<br>

which is fundamental enough that I have to assume it's at least had some thought<br>

put into it, but it would be nice if that discussion happened in a more<br>

public place, and was justified in the patches sent.<br>

<br>

Essentially, the question is: what if we were to use decodebin directly?<br>

<br>

As I understand (and admittedly Media Foundation is far more complex<br>

than I could hope to understand) an application which just calls<br>

IMFSourceResolver methods just needs to get back a working<br>

IMFMediaSource, and we could wrap decodebin with one of those, similar<br>

to the quartz wrapper.<br>

<br>

First of all, this is something I think we want to do anyway. Microsoft<br>

has no demuxer for, say, Vorbis (at least, there's not one registered on<br>

my Windows 10 machine), but I think that we want to be able to play back<br>

Vorbis files anyway (in, say, a Win32 media player application). Instead<br>

of writing yet another source for Vorbis, and for each other obscure<br>

format, we just write one generic decodebin wrapper.<br>

<br>

Second of all, the most obvious benefit, at least while looking at these<br>

patches, is that you now don't need to write caps &lt;-&gt; IMFMediaType<br>

conversion for every type on the planet. Another benefit is that you let<br>

all of the decoding happen within a single GStreamer pipeline, which is<br>

probably better for performance. You also can simplify your<br>

postprocessing step to adding a single videoconvert and audioconvert,<br>

instead of having to manually (or semi-manually) add e.g. an h264 parser<br>

element. These are some of the benefits I had in mind when removing the<br>

GStreamer quartz transforms.<br>

<br>

Even in the case where the application manually creates e.g. an MPEG-4<br>

source, my understanding is it's still the source's job to automatically<br>

append transforms to match the requested type. We'd just be moving that<br>

from the mfplat level to the GStreamer level, i.e. let decodebin select<br>

the 'transforms' needed to convert to raw video and audio.<br>

<br>

It obviously wouldn't match native structure, but it's not clear to me<br>

that it would fail to match native in a way that would cause problems.<br>

Judging from my experience with quartz, most applications aren't going<br>

to care how their media is decoded as long as they get raw samples out<br>

of it. Only a select few build the graph manually because they don't<br>

realize that they can autoplug, or make assumptions about which filters<br>

will be present once autoplugging is done, and some of those even fall<br>

back to autoplugging if their preferred method fails. Maybe the<br>

situation is different with mfplat, but given that there is a way to let<br>

mfplat figure out which sources and transforms to use, I'm gonna be<br>

really surprised if most applications aren't using it.<br>

<br>

If you do come across an application that requires we mimic native's<br>

specific arrangement of sources and transforms, it seems to me it<br>

wouldn't require that much effort to swap a different parser in for<br>

decodebin, and to implement the necessary bits in the media type<br>

conversion functions. Ultimately I suspect it'd be less work to have a<br>

decodebin wrapper + specific sources for applications that require them,<br>

than to manually implement every source and transform.<br>

<br>

</blockquote></div>