Grew Tutorial • Lesson 5 • Confluence
Another well-known issue with rewriting is the problem of confluence.
Two concurrent rules
As said earlier, the Sequoia tag V may be converted to AUX or VERB.
A naive way to encode this in rules is to write (file: aux_verb.grs) the package:
package v {
rule aux {
pattern { X [upos = V] }
commands { X.upos = AUX }
}
rule verb {
pattern { X [upos = V] }
commands { X.upos = VERB }
}
}
The two rules overlap: each time a POS V is found, both rules can be used and produce a different output!
We call this kind of system non-confluent.
So, what is the output of the following commands?
grew transform -config sequoia -grs aux_verb.grs -strat "Onf(v)" -i frwiki_50.1000_00907.seq.conll
Let’s try it:
# global.columns = ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC
# sent_id = frwiki_50.1000_00907
# text = Deux autres photos sont également montrées du doigt.
1 Deux deux D _ s=card 3 det _ _
2 autres autre A _ n=p|s=ind 3 mod _ _
3 photos photo N _ g=f|n=p|s=c 6 suj _ _
4 sont être VERB _ m=ind|n=p|p=3|t=pst 6 aux.pass _ _
5 également également ADV _ _ 6 mod _ _
6 montrées montrer VERB _ g=f|m=part|n=p|t=past 0 root _ _
7 du de P+D _ s=def 6 mod _ _
8 doigt doigt N _ g=m|n=s|s=c 7 obj.p _ _
Well, it produced exactly one graph output by choosing (in a way which cannot be controlled) one of the possible ways to rewrite.
What should we do with non-confluent system? In fact, there are two possible situations:
- The two rules are correct and there is a real (linguistic) ambiguity; and the different solutions must be considered.
- There is no ambiguity, the rules must be corrected.
In our example (AUX and VERB), we are clearly in the second case, but let us consider the other one anyway, just to see how to deal with really non-confluent setting.
The Iter strategy
Here, we suppose then that we are interested in all possible solutions.
Grew provides a strategy Iter to do this:
grew transform -config sequoia -grs aux_verb.grs -strat "Iter(v)" -i frwiki_50.1000_00907.seq.conll
This will produces 4 different graphs with all combinations of AUX and VERB for the two words sont and montrées.
# global.columns = ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC
# sent_id = frwiki_50.1000_00907_0
# text = Deux autres photos sont également montrées du doigt.
1 Deux deux D _ s=card 3 det _ _
2 autres autre A _ n=p|s=ind 3 mod _ _
3 photos photo N _ g=f|n=p|s=c 6 suj _ _
4 sont être AUX _ m=ind|n=p|p=3|t=pst 6 aux.pass _ _
5 également également ADV _ _ 6 mod _ _
6 montrées montrer AUX _ g=f|m=part|n=p|t=past 0 root _ _
7 du de P+D _ s=def 6 mod _ _
8 doigt doigt N _ g=m|n=s|s=c 7 obj.p _ _
# sent_id = frwiki_50.1000_00907_1
# text = Deux autres photos sont également montrées du doigt.
1 Deux deux D _ s=card 3 det _ _
2 autres autre A _ n=p|s=ind 3 mod _ _
3 photos photo N _ g=f|n=p|s=c 6 suj _ _
4 sont être AUX _ m=ind|n=p|p=3|t=pst 6 aux.pass _ _
5 également également ADV _ _ 6 mod _ _
6 montrées montrer VERB _ g=f|m=part|n=p|t=past 0 root _ _
7 du de P+D _ s=def 6 mod _ _
8 doigt doigt N _ g=m|n=s|s=c 7 obj.p _ _
# sent_id = frwiki_50.1000_00907_2
# text = Deux autres photos sont également montrées du doigt.
1 Deux deux D _ s=card 3 det _ _
2 autres autre A _ n=p|s=ind 3 mod _ _
3 photos photo N _ g=f|n=p|s=c 6 suj _ _
4 sont être VERB _ m=ind|n=p|p=3|t=pst 6 aux.pass _ _
5 également également ADV _ _ 6 mod _ _
6 montrées montrer AUX _ g=f|m=part|n=p|t=past 0 root _ _
7 du de P+D _ s=def 6 mod _ _
8 doigt doigt N _ g=m|n=s|s=c 7 obj.p _ _
# sent_id = frwiki_50.1000_00907_3
# text = Deux autres photos sont également montrées du doigt.
1 Deux deux D _ s=card 3 det _ _
2 autres autre A _ n=p|s=ind 3 mod _ _
3 photos photo N _ g=f|n=p|s=c 6 suj _ _
4 sont être VERB _ m=ind|n=p|p=3|t=pst 6 aux.pass _ _
5 également également ADV _ _ 6 mod _ _
6 montrées montrer VERB _ g=f|m=part|n=p|t=past 0 root _ _
7 du de P+D _ s=def 6 mod _ _
8 doigt doigt N _ g=m|n=s|s=c 7 obj.p _ _
Be stricter in rules design
Of course, in our POS tags conversion example, the correct solution is to design more carefully our two rules, in order to produce the correct output. For instance (file: aux_verb_confluent.grs):
package v {
rule aux {
pattern { X [upos = V]; Y -[aux.pass]-> X }
commands { X.upos = AUX }
}
rule verb {
pattern { X [upos = V] }
without { Y -[aux.pass]-> X }
commands { X.upos = VERB }
}
}
Here, the two rules are clearly separated: the same clause Y -[aux.pass]-> X is used first the pattern part for rule aux and in the without part for the rule verb.
For each occurence of V in the sentence, exactly one of the two rules can be used.
With these two new rules, the system is confluent, and there is only one possible output.
This can be tested with the Iter strategy which produces all possible graphs:
grew transform -config sequoia -grs aux_verb_confluent.grs -strat "Iter(v)" -i frwiki_50.1000_00907.seq.conll
# global.columns = ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC
# sent_id = frwiki_50.1000_00907
# text = Deux autres photos sont également montrées du doigt.
1 Deux deux D _ s=card 3 det _ _
2 autres autre A _ n=p|s=ind 3 mod _ _
3 photos photo N _ g=f|n=p|s=c 6 suj _ _
4 sont être AUX _ m=ind|n=p|p=3|t=pst 6 aux.pass _ _
5 également également ADV _ _ 6 mod _ _
6 montrées montrer VERB _ g=f|m=part|n=p|t=past 0 root _ _
7 du de P+D _ s=def 6 mod _ _
8 doigt doigt N _ g=m|n=s|s=c 7 obj.p _ _
Of course, the Onf produces the same output in this setting.
Note that there are two different ways to compute the final graph: first apply rule aux and then the rule verb or the other way round: rule verb and then rule aux. But the important consequence of the confluence property is that the same graph is produced in both cases.
⚠️ Use Iter carefully
When a package p is confluent, the two strategies Onf(p) and Iter(p) give the same result.
In practice, the strategy Onf(p) must be preferred because it is much more efficient to compute.
The Iter(p) strategy computes the set of all graphs which can be obtained with the package p from the input graph and then, returns the subset of normal forms.
If the rule applies in several places in the graph, the number of graphs can be huge.
For instance, with the rule del_xpos below, if the graph G contains n nodes with an xpos feature, the number of graphs in the produced set is 2n (each one of the n nodes can appear with or without xpos feature).
rule del_xpos{
pattern{ X[xpos] }
commands{ del_feat X.xpos }
}
We have seen in the previous lesson that Grew stops the computation when a bound in the number of rule applications is reached.
By default, the bound is 10,000; so, if we try to apply the strategy Iter(del_xpos) to the 14 nodes graph:
# sent_id = n01006011
# text = A witness told police that the victim had attacked the suspect in April.
1 A a DET DT Definite=Ind|PronType=Art 2 det 2:det _
2 witness witness NOUN NN Number=Sing 3 nsubj 3:nsubj _
3 told tell VERB VBD Mood=Ind|Tense=Past|VerbForm=Fin 0 root 0:root _
4 police police NOUN NNS Number=Plur 3 iobj 3:iobj _
5 that that SCONJ IN _ 9 mark 9:mark _
6 the the DET DT Definite=Def|PronType=Art 7 det 7:det _
7 victim victim NOUN NN Number=Sing 9 nsubj 9:nsubj _
8 had have AUX VBD Mood=Ind|Tense=Past|VerbForm=Fin 9 aux 9:aux _
9 attacked attack VERB VBN Tense=Past|VerbForm=Part 3 ccomp 3:ccomp _
10 the the DET DT Definite=Def|PronType=Art 11 det 11:det _
11 suspect suspect NOUN NN Number=Sing 9 obj 9:obj _
12 in in ADP IN _ 13 case 13:case _
13 April April PROPN NNP Number=Sing 9 obl 9:obl:in SpaceAfter=No
14 . . PUNCT . _ 3 punct 3:punct _
There are 214 = 16,384 graphs to compute and necessarily at least the same number of rule applications.
Hence, even if there is no loop in this case, there is a huge number of rule applications, this explain why Grew wrongly detects a loop and output the message:
Grewlib error:
[file: del_xpos.grs, line: 1] More than 10000 rewriting steps: check for loops or increase max_rules value. Last rules are: […del_xpos, del_xpos, del_xpos, del_xpos, del_xpos, del_xpos, del_xpos, del_xpos, del_xpos, del_xpos]
Of course, in this case, the strategy Onf(p) nicely computes the unique normal form quickly (with only 14 rule applications).




