AUTHORS list and the C locale on Mac OS X

James McKenzie jjmckenzie51 at
Tue Nov 9 13:13:28 CST 2010

Charles Davis <cdavis at> wrote:
>There may be a problem with the way the authors.c file is generated on a
>Mac with GNU sed installed.
>On Mac OS X, the C locale's default encoding is MacRoman, not UTF-8.
>This has some pretty surprising consequences. For example, since the
>AUTHORS file contains UTF-8 multibyte sequences that aren't valid in the
>MacRoman encoding, GNU sed doesn't match those sequences in the .*
>regexp, and the authors.c file comes out wrong. Mac OS sed does not have
>this problem. So now I'm stuck changing the Makefile to use the system
>sed (in /usr/bin) instead of GNU sed.
I ran into this a long time ago. Fink just modifies the authors.c file to remove those sequences.

I went one better and removed GNU sed from my Mac (fink purge sed), as I found that the Mac OS X version of sed does everything GNU sed did, and I no longer had to edit the file.
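
For anyone who wants to see which sed they are actually picking up, here is a rough sketch of a check. It pipes one line containing a UTF-8 multibyte sequence through a given sed and reports whether .* swallowed the whole line. The check_sed name and the sample line are made up for illustration, and LC_ALL=C is only my assumption about what the Makefile sets, so adjust it to match.

```shell
#!/bin/sh
# check_sed SED_BINARY  (hypothetical helper, not part of any Makefile)
# Sends one line containing the UTF-8 bytes 0xC3 0xA9 ("é") through the
# given sed and reports whether .* matched the entire line.
# LC_ALL=C is an assumption about the locale the Makefile uses.
check_sed() {
    out=$(printf 'Jos\303\251 Example\n' | LC_ALL=C "$1" -e 's/.*/MATCHED/')
    if [ "$out" = "MATCHED" ]; then
        # .* consumed every byte on the line
        echo "ok"
    else
        # .* stopped early and left bytes behind after the replacement
        echo "partial"
    fi
}
```

Running check_sed /usr/bin/sed and then check_sed with the path to the GNU sed binary should show whether the two really behave differently on a given machine.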

>I could uninstall GNU sed, but there's one small problem. I have a
>Gentoo prefix set up. It is the reason I have GNU sed installed. If I
>install or upgrade practically anything in my Gentoo prefix, then GNU
>sed will just get pulled right back in.
>Reading the manual for GNU sed tells me that this is by design and that
>this behavior--not matching characters that are invalid in the current
>locale--is in fact mandated by POSIX. If that's the case, then the
>LC_ALL= statement in the Makefile needs to change. To what, I don't
>know. I'm hoping one of you has an idea.
>If you think, however, that this is a bug in GNU sed, then I will gladly
>write a report to the maintainer about it.
No, it is not a bug in GNU sed. Perhaps the characters in authors.c that are not valid in the encoding Mac OS X uses need to be changed to something acceptable?

James McKenzie

