View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0001001 | luatex | texlua bug | public | 2017-12-28 06:52 | 2017-12-31 17:23 |
Reporter | wspr | Assigned To | Hans Hagen | ||
Priority | normal | Severity | minor | Reproducibility | always |
Status | closed | Resolution | no change required | ||
Summary | 0001001: String lengths with unicode.utf8.format is not unicode-aware | ||||
Description | LuaTeX's built-in string.format function uses the number of bytes when calculating string lengths, so it can't be used to format Unicode strings correctly when a desired "string length" is specified with something like "%4s". unicode.utf8.format inherits this problem, surprisingly, although unicode.utf8.len DOES calculate Unicode string lengths as expected. | ||||
Steps To Reproduce |
    s = "‡"
    print(unicode.utf8.len(s)) -- "1", as expected
    print(unicode.utf8.format("string: [%-4s]",s)) -- "[‡ ]", only 2 chars, not 4; same as string.format
Tags | No tags attached. | ||||
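The mismatch described above can be reproduced in plain Lua, without the `unicode` library. This is only a sketch: `utf8_len` is a hypothetical helper (not a LuaTeX function) that assumes valid UTF-8 input and counts every byte that is not a continuation byte.

```lua
-- Hypothetical helper, not part of LuaTeX: count code points in a valid
-- UTF-8 string by counting the bytes that are NOT continuation bytes
-- (continuation bytes lie in the range 0x80-0xBF, i.e. 128-191).
local function utf8_len(s)
  local _, count = string.gsub(s, "[^\128-\191]", "")
  return count
end

local s = "\226\128\161"  -- U+2021 DOUBLE DAGGER, three bytes in UTF-8

print(#s)                 --> 3 (bytes)
print(utf8_len(s))        --> 1 (code points)

-- string.format pads to 4 BYTES, so only one space is added.
local padded = string.format("[%-4s]", s)
print(#padded - 2)        --> 4 (bytes between the brackets)
print(utf8_len(padded) - 2) --> 2 (characters between the brackets)
```

The field width in `%-4s` is applied to the byte count, which is why the three-byte character receives a single space of padding.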
patching such a core helper will not be compatible (also, %20s can be considered to mean 20 bytes)

it's no big deal to write a helper that does the padding:

    function string.utfpadd(s,n)
        local l = 0
        local p = 1
        while p do
            local _, u = string.find(s,"[\0-\x7F\xC2-\xF4][\x80-\xBF]*",p)
            if u then
                l = l + 1
            else
                break
            end
            p = u + 1
        end
        if n > 0 then
            return string.rep(" ",n-l) .. s
        else
            return s .. string.rep(" ",-n-l)
        end
    end

    function string.utfpadd(s,n)
        if not n or n == 0 then
            return s
        end
        local l = string.utflength(s) -- luatex extension to string
        if n > 0 then
            return string.rep(" ",n-l) .. s
        else
            return s .. string.rep(" ",-n-l)
        end
    end

    print(string.format("%30s[]","xxaxx"))
    print(string.format("%30s[]","xx½xx"))
    print(string.utfpadd("xxaxx", 30) .. "[]")
    print(string.utfpadd("xx½xx", 30) .. "[]")

    print(string.format("%-30s[]","xxaxx"))
    print(string.format("%-30s[]","xx½xx"))
    print(string.utfpadd("xxaxx",-30) .. "[]")
    print(string.utfpadd("xx½xx",-30) .. "[]")
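If a drop-in replacement for the `%Ns` case is still wanted, the padding helper from the note could be wrapped around `string.format` rather than patched into it. The following is a minimal sketch under stated assumptions, not anything LuaTeX provides: `utf8_len`, `utf8_pad` and `utf8_format` are hypothetical names, the input is assumed to be valid UTF-8, and only `%s` directives with a width and no precision are rewritten. Inside LuaTeX, `string.utflength` from the note could presumably replace `utf8_len`.

```lua
-- All names below are hypothetical; none of them exist in LuaTeX.
local unpack = table.unpack or unpack  -- Lua 5.2+ / Lua 5.1

-- Count code points: every byte that is not a continuation byte (128-191).
local function utf8_len(s)
  local _, n = string.gsub(s, "[^\128-\191]", "")
  return n
end

-- Pad s with spaces to `width` characters; left_align pads on the right.
local function utf8_pad(s, width, left_align)
  local fill = string.rep(" ", width - utf8_len(s))
  if left_align then return s .. fill end
  return fill .. s
end

-- Like string.format, but "%Ns" and "%-Ns" count characters, not bytes.
local function utf8_format(fmt, ...)
  local args, nargs, i = { ... }, select("#", ...), 0
  local new_fmt = string.gsub(fmt, "%%([%-+ #0]*)(%d*)(%.?%d*)([%a%%])",
    function(flags, width, precision, conv)
      if conv == "%" then return nil end   -- literal "%%", consumes no argument
      i = i + 1
      if conv == "s" and width ~= "" and precision == "" then
        local left = string.find(flags, "-", 1, true) ~= nil
        -- Pre-pad the argument and drop the width from the directive.
        args[i] = utf8_pad(tostring(args[i]), tonumber(width), left)
        return "%s"
      end
      return nil                           -- keep every other directive as is
    end)
  return string.format(new_fmt, unpack(args, 1, nargs))
end

local dagger = "\226\128\161"      -- U+2021, three bytes
local half   = "xx\194\189xx"      -- "xx½xx", five characters, six bytes

print(utf8_format("[%-4s]", dagger))        -- one character plus three spaces
print(utf8_format("%7s|%d%%", half, 50))    -- two leading spaces, then "|50%"
```

This keeps `string.format` itself untouched, so code that relies on `%20s` meaning 20 bytes continues to behave as before.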
Date Modified | Username | Field | Change |
---|---|---|---|
2017-12-28 06:52 | wspr | New Issue | |
2017-12-31 17:23 | Hans Hagen | Note Added: 0001679 | |
2017-12-31 17:23 | Hans Hagen | Assigned To | => Hans Hagen |
2017-12-31 17:23 | Hans Hagen | Status | new => assigned |
2017-12-31 17:23 | Hans Hagen | Status | assigned => closed |
2017-12-31 17:23 | Hans Hagen | Resolution | open => no change required |