wineport: Add support for ctz().
Marcus Meissner
marcus at jet.franken.de
Thu Mar 17 02:54:10 CDT 2011
On Wed, Mar 16, 2011 at 01:26:31PM -0500, Adam Martinson wrote:
> On 03/16/2011 08:34 AM, Alexandre Julliard wrote:
> >Adam Martinson <amartinson at codeweavers.com> writes:
> >
> >>@@ -239,6 +243,19 @@ extern int getopt_long_only (int ___argc, char *const *___argv,
> >> int ffs( int x );
> >> #endif
> >>
> >>+#if defined(__GNUC__)&& (GCC_VERSION>= 30406)
> >>+ #define ctz(x) __builtin_ctz(x)
> >>+#elif defined(__GNUC__)&& (defined(__i386__) || defined(__x86_64__))
> >>+ static inline int ctz( unsigned int x )
> >>+ {
> >>+ int ret;
> >>+ __asm__("bsfl %1, %0" : "=r" (ret) : "r" (x));
> >>+ return ret;
> >>+ }
> >>+#else
> >>+ #define ctz(x) (ffs(x)-1)
> >>+#endif
> >There's no reason to add this. Just use ffs().
> >
> If I thought ffs() was adequate, I would. I need this for iterating
> sparse bitsets.
>
> __builtin_ctz() compiles to:
> mov 0x8(%ebp),%eax
> bsf %eax,%eax
>
> (ffs()-1) compiles to:
> mov $0xffffffff,%edx
> bsf 0x8(%ebp),%eax
> cmove %edx,%eax
> add $0x1,%eax
> sub $0x1,%eax
>
> ... Fortunately -O2 catches the add/sub. So yes, there is a reason,
> ctz() is at least 50% faster.
You are optimizing in the wrong spot.
If this is not in performance-relevant code, readability is always better
than hacks.
ciao, Marcus
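
For reference, the sparse-bitset use case mentioned in the thread could look
roughly like the sketch below. This is not code from the patch: the helper
names lowest_set_bit() and collect_set_bits() are hypothetical, a 32-bit
unsigned int is assumed, and the fallback simply follows the (ffs(x)-1)
branch of the proposed macro. Note that __builtin_ctz() is undefined for a
zero argument, so the loop only calls it while bits remain.

```c
#include <assert.h>
#include <strings.h>   /* ffs() for the non-GCC fallback */

/* Hypothetical helper: index of the lowest set bit of x.
 * x must be nonzero; __builtin_ctz(0) is undefined. */
static inline int lowest_set_bit(unsigned int x)
{
    assert(x != 0);
#if defined(__GNUC__)
    return __builtin_ctz(x);
#else
    return ffs((int)x) - 1;    /* ffs() is 1-based */
#endif
}

/* Hypothetical helper: store the index of every set bit of `word` in
 * `out` (room for 32 entries assumed) and return how many were found. */
static int collect_set_bits(unsigned int word, int *out)
{
    int n = 0;

    while (word)
    {
        out[n++] = lowest_set_bit(word);
        word &= word - 1;      /* clear the lowest set bit */
    }
    return n;
}
```

The loop body runs once per set bit rather than once per bit position, which
is where the cost of the bit-scan helper would show up for sparse words.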