View Issue Details

ID: 0000869
Project: luatex
Category: luatex limitation
View Status: public
Last Update: 2013-12-20 10:01
Reporter: patrick
Assigned To: Hans Hagen
Priority: normal
Severity: minor
Reproducibility: always
Status: closed
Resolution: fixed
Platform: Apple/Intel/64bit
OS: OS X
OS Version: 10.6
Product Version: 0.70.1
Summary: 0000869: Bug? in selene unicode library
Description: The bundled selene unicode library treats the UTF-8 sequence for à as a "space".

The output of the following code (texlua --lua foo.lua)
------------------------------
match = unicode.utf8.match

if match("à","%s") then
   print("space")
else
   print("not a space")
end

if match("á","%s") then
   print("space")
else
   print("not a space")
end
------------------------------

is

space
not a space

This is probably because the sequence C3 A0 contains the byte A0 (the non-breaking space in Latin-1), while the sequence C3 A1 does not contain a "space byte".
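
The byte sequences mentioned above can be checked with plain Lua, without the unicode library. The following is only a sketch of the reporter's hypothesis, not of how the library works internally; the helper names `hex_bytes` and `has_byte` are made up for this illustration.

```lua
-- Return the bytes of a string as space-separated hex pairs.
-- Uses only the standard string library.
local function hex_bytes(s)
  local out = {}
  for i = 1, #s do
    out[#out + 1] = string.format("%02X", s:byte(i))
  end
  return table.concat(out, " ")
end

-- Return true if the string contains the given byte value.
local function has_byte(s, b)
  for i = 1, #s do
    if s:byte(i) == b then
      return true
    end
  end
  return false
end

-- Written with decimal escapes so the result does not depend on
-- the encoding of the source file.
local a_grave = "\195\160"  -- à (U+00E0), UTF-8 bytes C3 A0
local a_acute = "\195\161"  -- á (U+00E1), UTF-8 bytes C3 A1

print(hex_bytes(a_grave))        -- C3 A0
print(hex_bytes(a_acute))        -- C3 A1
print(has_byte(a_grave, 0xA0))   -- true
print(has_byte(a_acute, 0xA0))   -- false
```

Only à carries the byte A0 as its second (continuation) byte, which would match the observed difference if a matcher ever looked at that byte on its own.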

I am not sure whether this is a bug or by design.
Additional Information: See the discussion here: http://tug.org/pipermail/luatex/2013-December/004676.html
and
http://tug.org/mailman/htdig/luatex/2010-March/001242.html
Tags: No tags attached.

Activities

Hans Hagen

2013-12-19 14:04

manager   ~0001148

LS/HH: the advance over the string at no match was wrong, fixed in 0.78

Hans Hagen

2013-12-20 10:01

manager   ~0001181

closed in 0.78

Issue History

Date Modified Username Field Change
2013-12-12 11:53 patrick New Issue
2013-12-19 14:03 Hans Hagen Assigned To => Hans Hagen
2013-12-19 14:03 Hans Hagen Status new => assigned
2013-12-19 14:04 Hans Hagen Note Added: 0001148
2013-12-19 14:04 Hans Hagen Status assigned => resolved
2013-12-19 14:04 Hans Hagen Resolution open => fixed
2013-12-20 10:01 Hans Hagen Note Added: 0001181
2013-12-20 10:01 Hans Hagen Status resolved => closed