View Issue Details

ID: 0000928
Project: luatex
Category: feature request
View Status: public
Last Update: 2015-10-06 15:56
Reporter: eroux
Assigned To: Hans Hagen
Priority: normal
Severity: feature
Reproducibility: N/A
Status: closed
Resolution: fixed
Product Version: 0.79.1
Target Version:
Fixed in Version: 0.80.0
Summary: 0000928: add penalty field to disc node

Description: Users should be able to set (in Lua or TeX) the penalty of disc nodes individually. A few proposals:

 - adding a penalty field in the lua node, which is used only if the value has been changed from lua
 - same, but with a \localhyphenpenalty that overrides \hyphenpenalty for the nodes after it (unlike \hyphenpenalty which is the same for the whole paragraph)
 - fill the penalty field of disc nodes with exhyphenpenalty or hyphenpenalty, and always use the node value

The attached patch takes the first approach. See mwe.tex for a minimal example using it.
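
As a rough illustration of the first approach, the decision order that the attached patch adds to linebreak.w can be modeled in standalone C. The names `disc_model` and `effective_penalty` are hypothetical stand-ins and not LuaTeX internals; only the order of the checks follows the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a disc node; this is not the LuaTeX node layout. */
typedef struct {
    int is_automatic;   /* 1 if the node has the automatic_disc subtype */
    int penalty;        /* 0 means "never set from Lua" in this proposal */
} disc_model;

/* Same decision order as the patched hunk in linebreak.w:
   start from hyphenpenalty, switch to exhyphenpenalty for automatic
   discs, then let a non-zero per-node penalty override both. */
static int effective_penalty(const disc_model *d,
                             int hyphen_penalty,
                             int ex_hyphen_penalty)
{
    int actual_penalty = hyphen_penalty;

    assert(d != NULL);
    if (d->is_automatic)
        actual_penalty = ex_hyphen_penalty;
    if (d->penalty != 0)
        actual_penalty = d->penalty;
    return actual_penalty;
}
```

Because 0 doubles as the "unset" marker in this scheme, a node cannot explicitly ask for a penalty of 0; it always falls back to hyphenpenalty or exhyphenpenalty.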

This feature is needed for a project called Gregorio:

https://github.com/gregorio-project/gregorio

It has quite a big, convoluted (and not always well written, I admit)
plain TeX + Lua codebase (in the tex/ directory). It is used to typeset
Gregorian chant scores; it is the current standard in this area (the
competition is not very strong, I admit). Here is the part of the
algorithm that needs to be fixed:

When finishing a syllable (text associated with notes), I currently
simply put a penalty (depending on whether it's the final syllable of
the word or not, and a few other things), and a space (depending on a
convoluted algorithm).

The center of the first note is aligned with the center of the vowel of
a syllable, so in many cases, notes start after the text, see ex1.png.
This is the most common case; let's assume all syllables are like that.

So when a new line starts, the first thing to be typeset will be text,
which means that the text of each line will be aligned, see ex2.pdf.

The problem is: this is not the way Gregorian scores are meant to be
aligned. If you look at a Gregorian chant book, the lines are aligned on
the notes, not on text. See ex2-fixed.pdf for the desired output.

Currently, I don't think I can do that without hacking the parbuilder,
which seems a bit overkill and will raise many performance issues (some
books produced with Gregorio have more than 1000 pages, with around 2000
scores). So the solution I'm thinking about is the following:

at the end of a syllable, put \discretionary{\hskip x}{}{\hbox{}\kern
y}, where x is the skip in case line doesn't break, and y is the
negative distance I have to add at the beginning of a line in order to
have lines left-aligned on notes (I can compute this easily). This
solves the problem, but one thing remains: these discs need different
penalties. At first I thought \penalty x\discretionary{a}{b}{c} would
break on the disc with penalty x, but that is not the case. So it seems
TeX doesn't offer this possibility, and I cannot see a good reason not
to be able to change a disc penalty...

Thank you
Tags: No tags attached.

Activities

eroux

2015-03-15 16:41

reporter  

new_disc_penalty.patch (5,646 bytes)
Index: manual/luatexref-t.tex
===================================================================
--- manual/luatexref-t.tex	(révision 5187)
+++ manual/luatexref-t.tex	(copie de travail)
@@ -9516,10 +9516,16 @@
 \NC pre        \NC \syntax{<node>}    \NC  pointer to the pre|-|break text\NC\NR
 \NC post       \NC \syntax{<node>}    \NC  pointer to the post|-|break text\NC\NR
 \NC replace    \NC \syntax{<node>}    \NC  pointer to the no|-|break text\NC\NR
+\NC penalty    \NC number              \NC  penalty associated with the discretionary\NC\NR
 \stoptabulate
 
 The subtype numbers~4 and~5 belong to the \quote{of-f-ice} explanation given elsewhere.
 
+The penalty field is always 0 by default. When it is 0, line breaking will use
+hyphenpenalty or exhyphenpenalty (default behaviour). When it is set to a
+non-zero value, line breaking will use the specified value. You can only change
+this field on the Lua side.
+
 A warning: never assign a node list to the pre, post or replace field
 unless you are sure its internal link structure is correct, otherwise
 an error may be result.
Index: source/texk/web2c/luatexdir/lua/lnodelib.c
===================================================================
--- source/texk/web2c/luatexdir/lua/lnodelib.c	(révision 5187)
+++ source/texk/web2c/luatexdir/lua/lnodelib.c	(copie de travail)
@@ -2853,6 +2853,8 @@
             fast_metatable_or_nil(vlink(post_break(n)));
         } else if (lua_key_eq(s, replace)) {
             fast_metatable_or_nil(vlink(no_break(n)));
+        } else if (lua_key_eq(s, penalty)) {
+            lua_pushnumber(L, disc_penalty(n));
         } else {
             lua_pushnil(L);
         }
@@ -3603,6 +3605,8 @@
             nodelib_pushdirect_or_nil(vlink(post_break(n)));
         } else if (lua_key_eq(s, replace)) {
             nodelib_pushdirect_or_nil(vlink(no_break(n)));
+        } else if (lua_key_eq(s, penalty)) {
+            lua_pushnumber(L, disc_penalty(n));
         } else {
             lua_pushnil(L);
         }
@@ -4798,6 +4802,8 @@
             set_disc_field(post_break(n), nodelib_getlist(L, 3));
         } else if (lua_key_eq(s, replace)) {
             set_disc_field(no_break(n), nodelib_getlist(L, 3));
+        } else if (lua_key_eq(s, penalty)) {
+            disc_penalty(n) = (halfword) lua_tointeger(L, 3);
         } else {
             return nodelib_cantset(L, n, s);
         }
@@ -5544,6 +5550,8 @@
             set_disc_field(post_break(n), nodelib_popdirect(3));
         } else if (lua_key_eq(s, replace)) {
             set_disc_field(no_break(n), nodelib_popdirect(3));
+        } else if (lua_key_eq(s, penalty)) {
+            disc_penalty(n) = (halfword) lua_tointeger(L, 3);
         } else {
             return nodelib_cantset(L, n, s);
         }
Index: source/texk/web2c/luatexdir/tex/linebreak.w
===================================================================
--- source/texk/web2c/luatexdir/tex/linebreak.w	(révision 5187)
+++ source/texk/web2c/luatexdir/tex/linebreak.w	(copie de travail)
@@ -1950,10 +1950,12 @@
                     int actual_penalty = hyphen_penalty;
                     if (subtype(cur_p) == automatic_disc)
                         actual_penalty = ex_hyphen_penalty;
+                     if (disc_penalty(cur_p) != 0)
+                        actual_penalty = (int) disc_penalty(cur_p);
                     s = vlink_pre_break(cur_p);
                     do_one_seven_eight(reset_disc_width);
                     if (s == null) {    /* trivial pre-break */
Index: source/texk/web2c/luatexdir/tex/texnodes.h
===================================================================
--- source/texk/web2c/luatexdir/tex/texnodes.h	(révision 5187)
+++ source/texk/web2c/luatexdir/tex/texnodes.h	(copie de travail)
@@ -151,7 +151,7 @@
    pointers are not really needed (8 instead of 10).
  */
 
-#  define disc_node_size 10
+#  define disc_node_size 11
 
 typedef enum {
     discretionary_disc = 0,
@@ -162,13 +162,14 @@
     select_disc,                /* second of a duo of syllable_discs */
 } discretionary_types;
 
-#  define pre_break_head(a)   ((a)+4)
-#  define post_break_head(a)  ((a)+6)
-#  define no_break_head(a)    ((a)+8)
+#  define pre_break_head(a)   ((a)+5)
+#  define post_break_head(a)  ((a)+7)
+#  define no_break_head(a)    ((a)+9)
 
-#  define pre_break(a)     vinfo((a)+2)
-#  define post_break(a)    vlink((a)+2)
-#  define no_break(a)      vlink((a)+3)
+#  define disc_penalty(a)    vlink((a)+2)
+#  define pre_break(a)     vinfo((a)+3)
+#  define post_break(a)    vlink((a)+3)
+#  define no_break(a)      vlink((a)+4)
 #  define tlink llink
 
 #  define vlink_pre_break(a)  vlink(pre_break_head(a))
Index: source/texk/web2c/luatexdir/tex/texnodes.w
===================================================================
--- source/texk/web2c/luatexdir/tex/texnodes.w	(révision 5187)
+++ source/texk/web2c/luatexdir/tex/texnodes.w	(copie de travail)
@@ -3082,6 +3082,8 @@
                 /* The |post_break| list of a discretionary node is indicated by a prefixed
                    `\.{\char'174}' instead of the `\..' before the |pre_break| list. */
                 tprint_esc("discretionary");
+                print_int(disc_penalty(p));
+                print_char('|');
                 if (vlink(no_break(p)) != null) {
                     tprint(" replacing ");
                     node_list_display(vlink(no_break(p)));
@@ -3479,6 +3481,7 @@
 {                               /* creates an empty |disc_node| */
     halfword p;                 /* the new node */
     p = new_node(disc_node, 0);
+    disc_penalty(p) = 0;
     return p;
 }
 

eroux

2015-03-15 16:42

reporter  

mwe.tex (413 bytes)

Hans Hagen

2015-03-15 17:11

manager   ~0001330

maybe let's store the current exhyphenpenalty in the disc node ... otherwise we also need a primitive to set the (new) field

eroux

2015-03-16 14:46

reporter   ~0001331

Thinking back about it, I'm not sure this approach is easy: if you fill disc.penalty with the current value of hyphenpenalty and then use that value in line breaking, TeX's behaviour changes, as hyphenpenalty becomes local instead of being the same for the whole paragraph.

For example, in

abc\hyphenpenalty -10001\discretionary{d}{e}{f}ghi\hyphenpenalty 10001\discretionary{d}{e}{f}\bye

your approach (as I understand it) will make the first discretionary have a different penalty than the second one, while current TeX gets both the same (10001).

Or does it mean you would set disc.penalty to the latest hyphenpenalty for all discs of the paragraph before pre_linebreak_filter? This might make things a tiny bit slower I believe, as you would have to run through all discs at the end of a paragraph...

Or it's also possible to say that \hyphenpenalty now behaves like the \localhyphenpenalty you thought about... I doubt many documents set it multiple times in a paragraph and rely on the fact that only the last is applied.
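
To make the difference between the two behaviours concrete, here is a small C model of the example paragraph above. It is only a sketch: the `event` type and both function names are hypothetical, grouping is ignored, and "paragraph-wide" simply means that the value in force when the paragraph ends applies to every disc.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event stream for one paragraph: either an assignment to
   \hyphenpenalty or an explicit \discretionary. */
typedef enum { SET_HYPHENPENALTY, DISCRETIONARY } event_kind;

typedef struct {
    event_kind kind;
    int value;          /* only meaningful for SET_HYPHENPENALTY */
} event;

/* Paragraph-wide behaviour: every disc gets the value in force at the end
   of the paragraph. Writes one penalty per disc into out and returns the
   number of discs. */
static size_t penalties_paragraph_wide(const event *ev, size_t n,
                                       int initial, int *out)
{
    int final_value = initial;
    size_t count = 0, i;

    assert(ev != NULL && out != NULL);
    for (i = 0; i < n; i++)
        if (ev[i].kind == SET_HYPHENPENALTY)
            final_value = ev[i].value;
    for (i = 0; i < n; i++)
        if (ev[i].kind == DISCRETIONARY)
            out[count++] = final_value;
    return count;
}

/* Per-node behaviour: each disc records the value in force when it is
   created. */
static size_t penalties_per_node(const event *ev, size_t n,
                                 int initial, int *out)
{
    int current = initial;
    size_t count = 0, i;

    assert(ev != NULL && out != NULL);
    for (i = 0; i < n; i++) {
        if (ev[i].kind == SET_HYPHENPENALTY)
            current = ev[i].value;
        else
            out[count++] = current;
    }
    return count;
}
```

Fed with the sequence "set -10001, disc, set 10001, disc", the first function yields 10001 for both discs, while the second yields -10001 for the first disc and 10001 for the second.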

Hans Hagen

2015-03-16 18:23

manager   ~0001333

Yes, but you will do that grouped, so normally the global one will be in use. Personally I think that this is not different from what luatex already does with left/righthyphenmin values in glyphs. So consider it progress.

With luatex we try to remain compatible unless we can make things better. If needed in this case we can add a flag that disables this feature so a macro package can decide what to support.

Hans Hagen

2015-03-16 18:28

manager   ~0001334

btw, we only talk of hyphens not inserted by the hyphenator as the regular ones are added by a hyphenation pass so those listen to the normal penalties

Hans Hagen

2015-03-16 19:20

manager   ~0001338

i was wondering ... how about

  discretionary penalty 123 {} {} {}

defaulting to nothing, and thereby to hyphenpenalty

eroux

2015-03-16 19:35

reporter   ~0001339

That sounds good! It seems easier than \localhyphenpenalty and clearer.

Thank you very much for the fix of 842 also!

Hans Hagen

2015-09-12 16:40

manager   ~0001401

extending the scanner for \discretionary makes the code messy but \discpenalty can do the job as one can group:

\bgroup\discpenalty 123\discretionary{}{}{}\egroup

after all this is some kind of special use. So, adding a key at the lua end and a new primitive at the tex end probably is easiest.

i'll play with that

eroux

2015-09-12 17:02

reporter   ~0001402

I'm fine with that!

Hans Hagen

2015-10-06 15:56

manager   ~0001416

instead we now store (ex)hyphenpenalty in the penalty field of a disc node, so one can do \bgroup\hyphenpenalty 123\discretionary{}{}{}\egroup etc
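
A simplified C model of why the grouped form works, assuming the node just takes a snapshot of the parameter when it is created. The `scoped_param` type and the helper names are hypothetical, and the save stack is much cruder than TeX's real one (it saves on group entry rather than on assignment).

```c
#include <assert.h>

#define MAX_GROUP_DEPTH 16

/* Hypothetical model of one integer parameter with group scoping. */
typedef struct {
    int value;                      /* current \hyphenpenalty */
    int saved[MAX_GROUP_DEPTH];     /* values to restore at \egroup */
    int depth;                      /* number of open groups */
} scoped_param;

/* \bgroup: remember the current value. */
static void begin_group(scoped_param *p)
{
    assert(p->depth < MAX_GROUP_DEPTH);
    p->saved[p->depth++] = p->value;
}

/* \egroup: restore the value that was current when the group opened. */
static void end_group(scoped_param *p)
{
    assert(p->depth > 0);
    p->value = p->saved[--p->depth];
}

/* In this model, creating a disc node copies the parameter value that is
   current at that moment into the node's penalty field. */
static int snapshot_penalty(const scoped_param *p)
{
    return p->value;
}
```

With an outer value of 50, a disc created inside a group that sets the parameter to 123 keeps 123, while discs created before the group or after it closes get 50.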

Issue History

Date Modified Username Field Change
2015-03-15 16:41 eroux New Issue
2015-03-15 16:41 eroux File Added: new_disc_penalty.patch
2015-03-15 16:42 eroux File Added: mwe.tex
2015-03-15 17:11 Hans Hagen Note Added: 0001330
2015-03-15 17:12 Hans Hagen Assigned To => Hans Hagen
2015-03-15 17:12 Hans Hagen Status new => assigned
2015-03-16 14:46 eroux Note Added: 0001331
2015-03-16 18:23 Hans Hagen Note Added: 0001333
2015-03-16 18:28 Hans Hagen Note Added: 0001334
2015-03-16 18:58 Hans Hagen Status assigned => resolved
2015-03-16 18:58 Hans Hagen Fixed in Version => 0.80.0
2015-03-16 18:58 Hans Hagen Resolution open => fixed
2015-03-16 19:08 eroux Status resolved => feedback
2015-03-16 19:08 eroux Resolution fixed => reopened
2015-03-16 19:19 Hans Hagen Status feedback => new
2015-03-16 19:19 Hans Hagen Status new => acknowledged
2015-03-16 19:20 Hans Hagen Note Added: 0001338
2015-03-16 19:35 eroux Note Added: 0001339
2015-09-12 16:40 Hans Hagen Note Added: 0001401
2015-09-12 17:02 eroux Note Added: 0001402
2015-10-06 15:56 Hans Hagen Note Added: 0001416
2015-10-06 15:56 Hans Hagen Status acknowledged => closed
2015-10-06 15:56 Hans Hagen Resolution reopened => fixed