wineport: Add fast builtin & asm versions of ffs() + ctz() where supported.
david at l8s.co.uk
Fri Mar 11 11:32:50 CST 2011
On Fri, Mar 11, 2011 at 02:28:27PM +0100, Alexandre Julliard wrote:
> Adam Martinson <amartinson at codeweavers.com> writes:
> > @@ -236,7 +241,40 @@ extern int getopt_long_only (int ___argc, char *const *___argv,
> > #endif /* HAVE_GETOPT_LONG */
> > #ifndef HAVE_FFS
> > -int ffs( int x );
> > + #if defined(__i386__) || defined(__x86_64__)
> > + __asm__("bsfl %1, %0; incl %0" : "=r" (ret) : "r" (x));
> > + return ret;
> > + #elif defined(__GNUC__) && GCC_VERSION >= 29503
> > + #define HAVE_FFS
> > + #define ffs(x) __builtin_ffs(x)
> > + #else
> > + int ffs( int x );
> > + #endif
> > +#endif
> You'd have to show benchmarks to prove that this complexity is
> necessary. Given that ffs() should already be inlined on all decent
> platforms, I doubt you'd be able to demonstrate a difference (if
> anything, your version would be slower because of the extra increment).
Never mind the fact that it is possible to write code that is (probably)
faster than the bsfl instruction on most x86 processors!
You'd also gain from having the 'if (x == 0)' test statically predicted
correctly (probably as x being non-zero) - that gain is also likely to be large.
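To make both points concrete, here is a rough sketch, not taken from the patch or from any Wine source. It assumes a 32-bit int, uses the hypothetical names ffs_sketch and debruijn_pos, and swaps in the well-known de Bruijn multiply-and-lookup trick as one candidate for "code that avoids bsfl". The zero test is hinted as unlikely with gcc's __builtin_expect(). Whether any of this actually beats bsfl on a given processor is untested here and would need the benchmarks asked for above.

```c
#include <stdint.h>

/* Hypothetical table name: maps the top 5 bits of (lowest_bit * 0x077CB531)
 * to the index of that bit (standard 32-bit de Bruijn sequence lookup). */
static const unsigned char debruijn_pos[32] = {
     0,  1, 28,  2, 29, 14, 24,  3, 30, 22, 20, 15, 25, 17,  4,  8,
    31, 27, 13, 23, 21, 19, 16,  7, 26, 12, 18,  6, 11,  5, 10,  9
};

/* Hypothetical name: returns the 1-based index of the lowest set bit,
 * or 0 when x == 0, matching the ffs() contract. Assumes 32-bit int. */
static inline int ffs_sketch(int x)
{
    uint32_t v = (uint32_t)x;

#if defined(__GNUC__)
    /* Tell the compiler that x == 0 is the unlikely case. */
    if (__builtin_expect(v == 0, 0))
        return 0;
#else
    if (v == 0)
        return 0;
#endif

    v &= 0u - v;    /* isolate the lowest set bit */
    return debruijn_pos[(v * 0x077CB531u) >> 27] + 1;
}
```

Note that the zero check is not optional for an asm version either: bsfl leaves its destination undefined when the source is zero, so ffs(0) == 0 has to be handled explicitly somewhere.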
David Laight: david at l8s.co.uk