[3/5] WineD3D: Set WINED3D_BUFFER_CREATEBO in buffer_init

Wed Dec 23 04:41:10 CST 2009

On 22.12.2009 at 19:23, Roderick Colenbrander wrote:

> On Tue, Dec 22, 2009 at 6:44 PM, Henri Verbeet <hverbeet at gmail.com> wrote:
>> 2009/12/22 Stefan Dösinger <stefan at codeweavers.com>:
>>> +    conv = ((FVF & WINED3DFVF_POSITION_MASK) == WINED3DFVF_XYZRHW ) || (FVF & (WINED3DFVF_DIFFUSE | WINED3DFVF_SPECULAR));
>>>      hr = buffer_init(object, This, Size, Usage, WINED3DFMT_VERTEXDATA,
>>> -            Pool, GL_ARRAY_BUFFER_ARB, NULL, parent, parent_ops);
>>> +            Pool, GL_ARRAY_BUFFER_ARB, NULL, parent, parent_ops, conv);
>> This looks questionable, we use the FVF to determine that the buffer
>> is going to need conversion, but don't pass that FVF to the buffer
>> itself? Shouldn't this just use the existing code in buffer.c to
>> determine when we need conversion in the first place, and just drop
>> the VBO when the overhead becomes too large? Note that if we have
>> EXT_vertex_array_bgra we don't need conversion for the color data in
>> the first place.
This is for d3d7. d3d7's vertex buffer Lock method has no parameter to specify the locked range properly. Some apps (3DMark 2000, Max Payne) use vertex buffers in the d3d9 D3DUSAGE_DYNAMIC fashion: they put some vertices in the buffer, draw, then put new vertices in it. Since we always reconvert the whole buffer (the app gives us no range hints), this ends up slowing things down a lot.

Using code similar to the buffer drop on conversion type changes sounds tempting, but we need a heuristic for that. There will be no conversion description changes in the d3d7 case (the buffer declaration is static), but we could track full buffer conversions against buffer uses, and drop the buffer if there are e.g. fewer than 3 or 4 draws per full conversion.
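
A minimal sketch of what such a heuristic could look like, to make the idea concrete. Everything here is hypothetical: the struct, the function names, and both thresholds are made up for illustration and are not existing wined3d code. It counts draws between full conversions and reports when several conversions in a row were amortized over too few draws.

```c
#include <stdbool.h>

/* Hypothetical thresholds, picked only for illustration. */
#define MIN_DRAWS_PER_CONVERSION 3u /* fewer draws than this is a "poor" conversion */
#define MAX_POOR_CONVERSIONS     4u /* consecutive poor conversions before giving up */

/* Hypothetical per-buffer bookkeeping; not a wined3d structure. */
struct vbo_heuristic
{
    unsigned int draws_since_conversion; /* draws since the last full conversion */
    unsigned int poor_conversions;       /* consecutive poorly amortized conversions */
    bool converted_before;               /* false until the first conversion happened */
};

/* Call once for every draw that sources vertices from the buffer. */
static void heuristic_note_draw(struct vbo_heuristic *h)
{
    ++h->draws_since_conversion;
}

/* Call whenever the whole buffer gets reconverted.
 * Returns true when the caller should drop the VBO. */
static bool heuristic_note_full_conversion(struct vbo_heuristic *h)
{
    /* The initial conversion has no preceding draws, so it is not judged. */
    if (h->converted_before)
    {
        if (h->draws_since_conversion < MIN_DRAWS_PER_CONVERSION)
            ++h->poor_conversions;
        else
            h->poor_conversions = 0;
    }
    h->converted_before = true;
    h->draws_since_conversion = 0;
    return h->poor_conversions >= MAX_POOR_CONVERSIONS;
}
```

Requiring several poor conversions in a row keeps a single unlucky frame from throwing away a VBO that is otherwise static.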

This code indeed fails to take EXT_vertex_array_bgra into account.
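
As a rough sketch of how the check could account for the extension: the flag names below are illustrative stand-ins for the WINED3DFVF_* bits (values assumed from the classic D3D FVF layout, not copied from wined3d), and have_bgra is a hypothetical parameter standing for "EXT_vertex_array_bgra is available". XYZRHW positions still force conversion, while color data only does so when the extension is missing.

```c
#include <stdbool.h>

/* Illustrative stand-ins for the WINED3DFVF_* bits. The values follow the
 * classic D3D FVF layout and are assumptions, not taken from wined3d. */
#define EX_FVF_POSITION_MASK 0x00Eu
#define EX_FVF_XYZ           0x002u
#define EX_FVF_XYZRHW        0x004u
#define EX_FVF_DIFFUSE       0x040u
#define EX_FVF_SPECULAR      0x080u

/* Hypothetical helper: does a buffer with this FVF need CPU-side conversion?
 * have_bgra stands for "EXT_vertex_array_bgra is supported". */
static bool fvf_needs_conversion(unsigned int fvf, bool have_bgra)
{
    /* Pre-transformed positions need fixing up regardless of the extension. */
    bool rhw = (fvf & EX_FVF_POSITION_MASK) == EX_FVF_XYZRHW;
    /* D3D colors only need swizzling when GL cannot read BGRA directly. */
    bool colors = (fvf & (EX_FVF_DIFFUSE | EX_FVF_SPECULAR)) != 0;

    return rhw || (colors && !have_bgra);
}
```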

> The code mentions that when conversion is needed no VBO is created
> because conversion on the VBO memory in combination with uploading and
> drawStridedFast is slower than drawStridedSlow. The buffer object
> extensions discourage performing many operations on buffer memory
> because typically it is uncached. Have you tried to perform conversion
> on a normal memory buffer and compared performance to doing the same
> on VBO memory?
We convert in HeapAlloc'ed memory and upload the final data. PreLoad doesn't operate on glMap()ed memory.
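
To illustrate the pattern (this is a sketch, not the actual PreLoad code): the conversion runs in ordinary heap memory and only the finished data is handed to the upload step. malloc stands in for HeapAlloc, the upload callback stands in for the GL buffer upload, and the BGRA to RGBA byte swap is just an assumed example of a conversion. All names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the GL upload of the finished data. */
typedef void (*upload_fn)(const void *data, size_t size, void *ctx);

/* Copies the vertices into a temporary heap block, swaps the B and R bytes of
 * the 4-byte color found at color_offset in every vertex, then uploads the
 * result in one go. The source data is left untouched.
 * Returns 0 on success, -1 if the temporary allocation fails. */
static int convert_and_upload(const uint8_t *src, size_t vertex_count,
        size_t stride, size_t color_offset, upload_fn upload, void *ctx)
{
    size_t size = vertex_count * stride;
    uint8_t *tmp;
    size_t i;

    if (!size) return 0;
    if (!(tmp = malloc(size))) return -1;
    memcpy(tmp, src, size);

    for (i = 0; i < vertex_count; ++i)
    {
        uint8_t *color = tmp + i * stride + color_offset;
        uint8_t blue = color[0];

        color[0] = color[2]; /* red moves to the first byte */
        color[2] = blue;     /* blue moves to the third byte */
    }

    upload(tmp, size, ctx);
    free(tmp);
    return 0;
}

/* Example upload target that just records what it was given. */
struct capture
{
    uint8_t bytes[64];
    size_t size;
};

static void capture_upload(const void *data, size_t size, void *ctx)
{
    struct capture *cap = ctx;

    cap->size = size < sizeof(cap->bytes) ? size : sizeof(cap->bytes);
    memcpy(cap->bytes, data, cap->size);
}
```

All per-vertex reads and writes hit cached system memory; the buffer object only ever sees one sequential write.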