[PATCH 2/2] comctl32/listbox: Implement LBS_NODATA

Mon Nov 19 12:31:32 CST 2018

On Mon, Nov 19, 2018 at 8:08 PM Nikolay Sivov <nsivov at codeweavers.com> wrote:
>
> On 11/19/18 6:30 PM, Gabriel Ivăncescu wrote:
> > On Mon, Nov 19, 2018 at 4:44 PM Nikolay Sivov <nsivov at codeweavers.com>
> > wrote:
> >> I suggest to start with byte array for item data, reusing existing
> >> "items" field, and try to avoid specializing LBS_NODATA case
> >> everywhere. Heavy optimizations like that should be justified by some
> >> real life use case in a first place.
> > There's no way around having to special-case it everywhere. Even if we
> > ignore the bit array (which is used only for multi-selection
> > listboxes), it would still have to be special cased everywhere to
> > avoid NULL dereference or invalid data. Because obviously it has to
> > store less data and be much faster in many cases. That's literally the
> > whole point of LBS_NODATA, even the name says it. For single-selection
> > it should store zero data (as on Windows) and be instant (see linked
> > bug report). The alternative to this "special case everywhere" is to
> > not implement it at all and forever remain as TODO. There's no point
> > to implement it if we aren't going to make it faster than the current
> > situation.
> Forget about performance for a moment. All that matters is that it's
> none or less data per-item,
> still insert/remove/select do the same thing. Insert reallocates and
> moves, remove moves and optionally shrinks,
> select/deselect sets some flag. There is no reason it shouldn't work the
> same way for both cases,
> with necessary helpers hiding differences if need to be.
>
> For first implementation we could even use same item structure,
> disregarding excessive memory usage.
> I don't think it's a problem if it lets you verify correctness.

But don't we already have that? Right now, LBS_NODATA is implemented
using the normal item structure. It works, but it's super slow,
especially for single-selection listboxes (which should store no data
at all).

I'm not sure what exactly you want me to implement? Can you please elaborate.

> >> If the only item data is selection bit, you can use array of
> >> selection ranges as another approach, that's what ListView does. To
> >> return number of currently select items I don't think you need to
> >> iterate, you can simply keep and update a single counter field.
> > That's another approach, but it would blow up if the items are
> > selected randomly or in alternating order (we don't know what an app
> > can do here). Btw, all the operations are done to mimic normal
> > listboxes, which also iterate to get number of selections. This method
> > is more conservative and safer for any selections.
> I don't know if it will blow up without any test data. Worst case for
> alternating selection it will take index_size * number_items,
> who knows if it's a lot or or if it's slow?
>
> For multiselect case with your approach the cost of changing selection
> still depends on number of items, and with ranges it's not always true.
>

Yeah, I was talking about memory usage when I meant "blow up" here
(especially for 32-bit address space), there's apps around that have a
"random select" thing in listboxes but I've never used them on super
large data.

I'm not sure if your method is any easier, though. I think it might
lead to more code but maybe i don't know how to implement it in a
simple manner.

>
> > Also note that the code wouldn't be much simpler if it was byte array
> > instead of bits, assuming we still proceeded in chunks. To make it
> > simpler we'd have to process 1 item at a time as 1 byte for each, but
> > it would be like 32x or 64x slower (in practice at least 8x, if memory
> > bottleneck and not something else).
> Again, I don't know how you estimate any of that without trying.
>

Because right now it's processed in 4-byte chunks (8-byte for 64-bit),
where 1 bit = 1 item, so that's 32/64 items per iteration. If it's
memory bottlenecked (as it likely is, depends on CPU/RAM), then we end
up at least 8x faster since it's simply 8x less data to go through. I
can ofc benchmark it for my CPU (it depends on its cache size too),
but it just seems obvious to me. I mean 8x is the minimum increase and
32x is theoretical maximum, so it must be somewhere between those.

> > I'd say that is quite a massive difference for a style that's
> > literally made for performance on large lists.
> >> Do you have additional tests for LBS_NODATA? Mostly notifications are
> >> interesting.
> > What sort of notifications do you have in mind? I don't think
> > LBS_NODATA sends any extra notifications.
> It must send some, so parent can render any content.

That's not the job of LBS_NODATA. That's LBS_OWNERDRAWFIXED, which is
mandatory for LBS_NODATA (amongst other things) and is already tested,
though.

Other than enabling SetCount to work, LBS_NODATA is purely for
performance. If you take it out, it still works (except SetCount as I
said), but just much slower and much more memory usage.