[Templates] Unicode Line Separator support

Bill Moseley moseley@hank.org
Fri, 2 Mar 2007 09:52:15 -0800


This is more of a Perl question, but I've got utf8 templates and
someone is using BB Edit and SubEthaEdit to create the templates.  I'm
not familiar with BB Edit or SubEthaEdit.

They have been creating templates fine, but today created one and
their editor used U+2028 instead of newlines.  Perl sees those as
matching \s but they don't seem to match \n.
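
A minimal sketch of that behaviour, assuming the template has been
decoded into a Perl character string (raw UTF-8 bytes would behave
differently).  The variable names and the sample text are made up for
illustration:

```perl
use strict;
use warnings;

# U+2028 LINE SEPARATOR and U+2029 PARAGRAPH SEPARATOR as character strings.
my $ls = "\x{2028}";
my $ps = "\x{2029}";

# Report how each separator fares against \s and \n.
for my $ch ($ls, $ps) {
    printf "U+%04X  \\s: %s  \\n: %s\n",
        ord($ch),
        ($ch =~ /\s/ ? "match" : "no match"),
        ($ch =~ /\n/ ? "match" : "no match");
}

# Illustrative sample: three "lines" joined only by U+2028.
my $template = "line one${ls}line two${ls}line three";

# Splitting on \n sees one chunk; splitting on U+2028 sees three.
my @by_newline = split /\n/, $template;
my @by_ls      = split /\x{2028}/, $template;

printf "split on \\n: %d chunk(s); split on U+2028: %d chunk(s)\n",
    scalar @by_newline, scalar @by_ls;
```

Anything that decides where a line ends by looking for \n will treat
such a file as a single line.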

The problem, of course, is there's a comment in the template and since
it's all one line everything after the comment is ignored.

I'm curious if anyone else has come across this.  Google finds
discussions of this issue related to Perl, but I'm not entirely clear
whether U+2028 (and U+2029) should be considered newlines.

And with respect to TT, should it do anything different from Perl in
dealing with this character?  Such as:

    s/\x{2028}/\n/g;

Which might be a bit expensive in general.  Or maybe there's a way to make
Perl see that as a \n that I'm not aware of.
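
One possible workaround, sketched here as an assumption rather than
anything TT does itself, is to normalize the text before it reaches the
template processor.  The helper name normalize_line_separators is
hypothetical, the sample template is illustrative, and mapping U+2029 to
a single \n is a choice, not a rule.  tr/// is used instead of s///
because it does a plain character translation without the regex engine,
which may help with the cost concern:

```perl
use strict;
use warnings;

# Hypothetical helper: map U+2028 and U+2029 to "\n".
# Expects a decoded character string, not raw UTF-8 bytes.
sub normalize_line_separators {
    my ($text) = @_;
    # The replacement list is shorter than the search list, so tr///
    # repeats its last character: both separators become "\n".
    (my $copy = $text) =~ tr/\x{2028}\x{2029}/\n/;
    return $copy;
}

# Illustrative input: a comment followed by more content, separated
# only by U+2028.
my $template = "[% # a comment\x{2028}   name = 'x' %]\x{2028}Hello\x{2028}";

my $fixed = normalize_line_separators($template);
my @lines = split /\n/, $fixed;

printf "%d line(s) after normalizing\n", scalar @lines;
```

This would have to run after the file is decoded as UTF-8; on undecoded
bytes the separator is the three-byte sequence E2 80 A8 and the
translation would not match it.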

Finally, anyone know BB Edit?  The person claims that they are editing
files like always before, but now for some reason it's using the Line
Separator instead of \n.


-- 
Bill Moseley
moseley@hank.org