[Templates] Plugins and Unicode confusion
Trond Michelsen
trondmm-tt@crusaders.no
Fri, 17 Feb 2006 01:32:33 +0100
On Fri, Feb 17, 2006 at 12:56:21AM +0100, Bernhard Graf wrote:
> On Friday 17 February 2006 00:28, Tatsuhiko Miyagawa wrote:
> > On 2/16/06, Bernhard Graf <tt@augensalat.de> wrote:
>>>> Yeah, POSIX.
>>> Then why does this work:
>>> perl -MPOSIX -e 'print POSIX::strftime("%B",0,0,0,1,2,106), "\n";'
>>> März
>>
>> Because that's a latin-1 string, which is okay to print out to a terminal
>> on its own. The problem occurs when you concatenate latin-1 bytes and
>> utf-8 bytes.
>
> No. My terminal uses utf-8 too. See my first posting.
>
> "März" from within the utf-8 encoded template is printed OK, while März
> from T::P::Date is displayed broken: My utf-8 terminal thinks latin
> chars "är" (two bytes) is one utf-8 char.
That's a bit odd, and not just because "är" is an illegal utf-8 sequence:
a latin-1 "ä" (byte 0xe4) should signal the start of a three-byte utf-8
character, not a two-byte one.
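To make the byte-level argument concrete, here is a minimal sketch that
classifies a single byte by the role it can play in UTF-8. The function
name utf8_byte_role is made up for this illustration; it assumes a POSIX
shell and takes the byte as two hex digits.

```shell
# Classify one byte (two hex digits) by its possible role in UTF-8.
# utf8_byte_role is a hypothetical helper, not part of any tool.
utf8_byte_role() {
    b=$((0x$1))
    if   [ "$b" -lt 128 ]; then echo "ascii"          # 0xxxxxxx
    elif [ "$b" -lt 192 ]; then echo "continuation"   # 10xxxxxx
    elif [ "$b" -lt 194 ]; then echo "invalid"        # c0, c1: overlong forms only
    elif [ "$b" -lt 224 ]; then echo "lead-2"         # 110xxxxx
    elif [ "$b" -lt 240 ]; then echo "lead-3"         # 1110xxxx
    elif [ "$b" -lt 245 ]; then echo "lead-4"         # f0..f4
    else                        echo "invalid"        # f5..ff never appear in UTF-8
    fi
}

utf8_byte_role e4   # latin-1 "ä": prints lead-3
utf8_byte_role 72   # "r": prints ascii, so it cannot continue a sequence
utf8_byte_role c3   # first byte of utf-8 "ä": prints lead-2
utf8_byte_role a4   # second byte of utf-8 "ä": prints continuation
```

So a latin-1 "ä" followed by "r" is a three-byte lead byte followed by a
plain ASCII byte, which no conforming decoder should accept as one
character.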
>> Maybe you could try de_DE.UTF-8 if your system has that locale.
> ~> echo $LANG
> de_DE.UTF-8
Would you mind piping your output through od, just to verify?
$ LC_ALL=de_DE.UTF-8 perl -MPOSIX -le 'print POSIX::strftime("%B",0,0,0,1,2,106)' | od -t c
0000000 M Ã ¤ r z \n
0000006
$ LC_ALL=de_DE.UTF-8 perl -MPOSIX -le 'print POSIX::strftime("%B",0,0,0,1,2,106)' | od -t x1
0000000 4d c3 a4 72 7a 0a
0000006
$ LC_ALL=de_DE.ISO-8859-1 perl -MPOSIX -le 'print POSIX::strftime("%B",0,0,0,1,2,106)' | od -t c
0000000 M ä r z \n
0000005
$ LC_ALL=de_DE.ISO-8859-1 perl -MPOSIX -le 'print POSIX::strftime("%B",0,0,0,1,2,106)' | od -t x1
0000000 4d e4 72 7a 0a
0000005
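If the German locales are not installed, the two byte sequences above can
still be reproduced without perl or strftime. This is only a sketch: it
assumes iconv is available, and the helper names hexbytes and
latin1_maerz are invented for the example.

```shell
# Print stdin as one run of lowercase hex digits.
# hexbytes and latin1_maerz are hypothetical helpers for this sketch.
hexbytes() {
    od -An -t x1 | tr -d ' \n'
}

# "März" plus newline as latin-1 bytes; the octal escape \344 is 0xe4,
# so the script itself stays pure ASCII.
latin1_maerz() {
    printf 'M\344rz\n'
}

# The latin-1 form: 5 bytes.
latin1_maerz | hexbytes; echo
# prints 4de4727a0a

# The same text re-encoded as utf-8: 6 bytes.
latin1_maerz | iconv -f ISO-8859-1 -t UTF-8 | hexbytes; echo
# prints 4dc3a4727a0a
```

The single byte e4 becomes the pair c3 a4, which accounts for the final od
offset going from 0000005 to 0000006.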
--
Trond Michelsen