wineport: Add support for ctz().

Marcus Meissner marcus at jet.franken.de
Thu Mar 17 02:54:10 CDT 2011


On Wed, Mar 16, 2011 at 01:26:31PM -0500, Adam Martinson wrote:
> On 03/16/2011 08:34 AM, Alexandre Julliard wrote:
> >Adam Martinson<amartinson at codeweavers.com>  writes:
> >
> >>@@ -239,6 +243,19 @@ extern int getopt_long_only (int ___argc, char *const *___argv,
> >>  int ffs( int x );
> >>  #endif
> >>
> >>+#if defined(__GNUC__) && (GCC_VERSION >= 30406)
> >>+    #define ctz(x) __builtin_ctz(x)
> >>+#elif defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
> >>+    static inline int ctz( unsigned int x )
> >>+    {
> >>+        int ret;
> >>+        __asm__("bsfl %1, %0" : "=r" (ret) : "r" (x));
> >>+        return ret;
> >>+    }
> >>+#else
> >>+    #define ctz(x) (ffs(x)-1)
> >>+#endif
> >There's no reason to add this. Just use ffs().
> >
> If I thought ffs() was adequate, I would.  I need this for iterating
> sparse bitsets.
> 
> __builtin_ctz() compiles to:
> mov    0x8(%ebp),%eax
> bsf    %eax,%eax
> 
> (ffs()-1) compiles to:
> mov    $0xffffffff,%edx
> bsf    0x8(%ebp),%eax
> cmove  %edx,%eax
> add    $0x1,%eax
> sub    $0x1,%eax
> 
> ... Fortunately -O2 catches the add/sub.  So yes, there is a reason,
> ctz() is at least 50% faster.

You are optimizing in the wrong spot.

If this is not in performance-relevant code, readability is always better
than hacks.
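
For reference, a rough sketch of the kind of sparse-bitset loop under
discussion, written once with plain ffs() and once with the builtin. The
helper names (collect_set_bits_ffs, collect_set_bits_ctz, ctz_nonzero) are
made up for illustration and are not part of the patch. Note that the loop
never passes 0 to either function, which matters because ffs(0) returns 0
while __builtin_ctz(0) is undefined.

```c
#include <assert.h>
#include <strings.h>   /* ffs() (POSIX) */

/* Hypothetical wrapper: count trailing zeros of a nonzero word.
 * Uses the GCC/Clang builtin when available, otherwise ffs()-1. */
static int ctz_nonzero( unsigned int x )
{
    assert( x != 0 );  /* the builtin's result is undefined for 0 */
#if defined(__GNUC__)
    return __builtin_ctz( x );
#else
    return ffs( (int)x ) - 1;
#endif
}

/* Store the indices of the set bits of 'bits' in 'out', lowest first,
 * using only ffs(). Returns the number of indices written. */
static int collect_set_bits_ffs( unsigned int bits, int *out )
{
    int n = 0;
    while (bits)
    {
        out[n++] = ffs( (int)bits ) - 1;  /* ffs() is 1-based */
        bits &= bits - 1;                 /* clear the lowest set bit */
    }
    return n;
}

/* The same loop using the ctz wrapper above. */
static int collect_set_bits_ctz( unsigned int bits, int *out )
{
    int n = 0;
    while (bits)
    {
        out[n++] = ctz_nonzero( bits );
        bits &= bits - 1;                 /* clear the lowest set bit */
    }
    return n;
}
```

Both loops visit the same bits in the same order, so the choice between
them only affects the per-iteration cost, not the result.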

ciao, Marcus


