Basic Grew rules
A rewrite rule in Grew is defined by:
- A request (see the request page for complete documentation), composed of:
  - One item (introduced by the keyword pattern) describing the part of the graph to match, on which the commands will be applied
  - A set of negative (or positive) filtering items that filter out unwanted occurrences of the request, each introduced by the keyword without (or with)
- A sequence of commands to apply (see the commands page), introduced by the keyword commands
Example
Here is an example of a rule taken from a gallery page (see this gallery example for an explanation).
rule sh2c {
pattern {
H -[fixed]-> N1;
e:H -[fixed]-> N2;
N1 < N2;
}
without {
H -[fixed]-> N3;
N2 < N3;
}
commands {
del_edge e;
add_edge N1 -[fixed]-> N2;
}
}
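To make the roles of the three parts concrete, here is a minimal Python sketch that mimics the effect of sh2c on a toy graph. It is not how Grew works internally: the representation (nodes as integer word positions, edges as (source, label, target) triples) and the function names apply_sh2c_once and rewrite_to_normal_form are assumptions made for illustration, and N1 < N2 is modelled as "N2 immediately follows N1".

```python
# Toy model: nodes are integer word positions, edges are (source, label, target).
def apply_sh2c_once(edges):
    """Apply the rewrite to the first match found.

    Returns the new edge set, or None when the rule does not match.
    """
    fixed = sorted((h, n) for (h, lab, n) in edges if lab == "fixed")
    for h, n1 in fixed:                      # H -[fixed]-> N1
        for h2, n2 in fixed:                 # e: H -[fixed]-> N2
            if h2 != h or n2 != n1 + 1:      # same head, and N1 < N2
                continue
            # without { H -[fixed]-> N3; N2 < N3 }
            if (h, n2 + 1) in fixed:
                continue
            new_edges = set(edges)
            new_edges.remove((h, "fixed", n2))   # del_edge e
            new_edges.add((n1, "fixed", n2))     # add_edge N1 -[fixed]-> N2
            return new_edges
    return None


def rewrite_to_normal_form(edges):
    """Apply the rule repeatedly until it no longer matches."""
    current = set(edges)
    while True:
        rewritten = apply_sh2c_once(current)
        if rewritten is None:
            return current
        current = rewritten


# A flat structure: word 1 governs words 2, 3 and 4.
star = {(1, "fixed", 2), (1, "fixed", 3), (1, "fixed", 4)}
chain = rewrite_to_normal_form(star)
print(sorted(chain))  # [(1, 'fixed', 2), (2, 'fixed', 3), (3, 'fixed', 4)]
```

In this sketch the negative clause restricts each step to the last dependent of the head, so the edges are moved one at a time, from right to left.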
Avoiding Looping Rules
One of the most common problems when using Grew rules is a rule that triggers an infinite loop.
For example, the rule tv below adds the feature Transitive = Yes to a verb when there is a comp:obj relation from this verb to a nominal (NOUN, PROPN or PRON).
rule tv {
pattern { X[upos=VERB]; Y[upos=NOUN|PROPN|PRON]; X-[comp:obj]->Y }
commands { X.Transitive = Yes }
}
To correctly handle a sentence with more than one transitive verb, this rule must be applied iteratively, using a strategy such as:
strat main { Onf (tv) }
In this case, you will very quickly get the error message:
"More than 10000 rewriting steps: check for loops or increase max_rules value. Last rules are: […tv, tv, tv, tv, tv, tv, tv, tv, tv, tv]"
This is because nothing prevents the rule from being applied again to the same node after a first application.
To avoid this, a without clause can be added:
rule tv {
pattern { X[upos=VERB]; Y[upos=NOUN|PROPN|PRON]; X-[comp:obj]->Y }
without { X[Transitive = Yes] }
commands { X.Transitive = Yes }
}
See also the tutorial page about termination for more details.
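The looping behaviour can be reproduced with a small Python simulation. This is only a toy model, not Grew's rewriting engine: verbs are plain dictionaries, each one is assumed to already have a comp:obj dependent, and the names rewrite, guarded and MAX_STEPS (as well as the two lemmas) are hypothetical.

```python
MAX_STEPS = 10_000  # hypothetical bound, echoing the limit in the error message


def rewrite(verbs, guarded):
    """Repeatedly apply a toy version of rule tv to a list of verb nodes.

    With guarded=False the rule matches any verb. With guarded=True it skips
    verbs that already carry Transitive=Yes, which is the role of the
    without clause. Returns the number of rule applications performed.
    """
    steps = 0
    while True:
        match = next(
            (v for v in verbs
             if not (guarded and v.get("Transitive") == "Yes")),
            None,
        )
        if match is None:             # no match left: normal form reached
            return steps
        match["Transitive"] = "Yes"   # commands { X.Transitive = Yes }
        steps += 1
        if steps > MAX_STEPS:
            raise RuntimeError("More than %d rewriting steps" % MAX_STEPS)


sentence = [{"lemma": "voir"}, {"lemma": "prendre"}]

# With the guard, each verb is rewritten exactly once.
print(rewrite([dict(v) for v in sentence], guarded=True))  # prints 2

# Without it, the first verb keeps matching until the bound is hit.
try:
    rewrite([dict(v) for v in sentence], guarded=False)
except RuntimeError as err:
    print(err)  # prints: More than 10000 rewriting steps
```

The guarded version terminates because every application removes one possible match, whereas the unguarded version leaves the set of matches unchanged.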
Using lexicons in Grew rules
Grew rules can be parametrised by one or several lexicons.
Lexicon
A lexicon is defined by:
- a list of n different field identifiers
- a list of lexicon items, each item being an n-tuple
For instance, the table below describes a tiny lexicon for French nouns where each noun is associated with its gender.
noun | Gender |
---|---|
garçon | Masc |
maison | Fem |
A lexicon is written as text where:
- Blank lines and lines starting with the % symbol are ignored
- Each line is a list of elements separated by tab characters
- The first line defines the field identifiers
- All other lines define the lexicon items, which are n-tuples of strings
- All lines must contain the same number of elements
The lexicon above can then be written in the file nouns.lex:
noun Gender
%-------------
garçon Masc
maison Fem
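The format rules above can be sketched as a small Python parser. This is an illustration only, not the parser used by Grew; the function name parse_lexicon and its error messages are assumptions.

```python
def parse_lexicon(text):
    """Parse lexicon text into (field_identifiers, items).

    Blank lines and lines starting with % are skipped, the first remaining
    line gives the field identifiers, and every other line is one item.
    """
    rows = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("%"):
            continue
        rows.append(tuple(line.split("\t")))  # elements are tab-separated
    if not rows:
        raise ValueError("empty lexicon")
    fields, items = rows[0], rows[1:]
    if len(set(fields)) != len(fields):
        raise ValueError("field identifiers must be distinct")
    for item in items:
        if len(item) != len(fields):
            raise ValueError(
                f"expected {len(fields)} elements, got {len(item)}: {item!r}"
            )
    return fields, items


fields, items = parse_lexicon(
    "noun\tGender\n%-------------\ngarçon\tMasc\nmaison\tFem\n"
)
print(fields)  # ('noun', 'Gender')
print(items)   # [('garçon', 'Masc'), ('maison', 'Fem')]
```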
Lexical rule
A rule can be parametrised by a lexicon.
The rule below adds a new feature Gender with the relevant value when the noun is found in the lexicon.
Note that the lexicon is named in the rule (lex in the example); this makes it possible to use several lexicons in the same rule.
rule set_gender (lex from "nouns.lex") {
pattern { X [upos=NOUN, !Gender, lemma=lex.noun] }
commands { X.Gender = lex.Gender }
}
Once the lexicon lex is declared, the syntax lex.ident can be used to refer to lexical items anywhere a feature value can be used in the rule definition.
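One way to picture the link between lex.noun in the pattern and lex.Gender in the command is as a lookup in the same lexicon item: the pattern constraint selects the items whose noun field equals the node's lemma, and the command reads the Gender field of a selected item. The Python sketch below follows this reading; it is a simplified model under that assumption, not Grew's implementation, and the names LEX_FIELDS, LEX_ITEMS and set_gender are hypothetical.

```python
LEX_FIELDS = ("noun", "Gender")
LEX_ITEMS = [("garçon", "Masc"), ("maison", "Fem")]


def set_gender(node):
    """Toy counterpart of rule set_gender for one node (a feature dict).

    Returns True when the rule applied, False otherwise.
    """
    # pattern { X [upos=NOUN, !Gender, lemma=lex.noun] }
    if node.get("upos") != "NOUN" or "Gender" in node:
        return False
    noun_col = LEX_FIELDS.index("noun")
    selected = [item for item in LEX_ITEMS
                if item[noun_col] == node.get("lemma")]
    if not selected:
        return False  # the lemma is not in the lexicon: no match
    # commands { X.Gender = lex.Gender }
    node["Gender"] = selected[0][LEX_FIELDS.index("Gender")]
    return True


noun = {"lemma": "garçon", "upos": "NOUN"}
verb = {"lemma": "voir", "upos": "VERB"}
print(set_gender(noun), noun["Gender"])  # True Masc
print(set_gender(verb))                  # False
```

Calling set_gender on the same noun a second time returns False, because the !Gender constraint no longer holds.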
When a lexicon is short and specific to one rule, putting it in a separate file can be cumbersome. In this case, an alternative syntax is available: the lexicon is defined directly after the rule definition. The rule above can be written:
rule set_gender {
pattern { N [upos=NOUN, !Gender, lemma=lex.noun] }
commands { N.Gender = lex.Gender }
}
#BEGIN lex
noun Gender
%-------------
garçon Masc
maison Fem
#END
Try it!
The file set_gender.py below presents a self-contained example of rewriting with the lexical rule above.
It assumes that the Grew Python library (grewpy) is installed (see Installation page).
⚠️ Tab characters are not interpreted correctly when code is copied and pasted into an interactive Python session.
That is why tabs are replaced by an explicit \t in the Python code below.
from grewpy import Graph, GRS

# A five-token French sentence in CoNLL format; \t stands for a tab.
graph = Graph("""1\tle\tle\tDET\t_\t_\t2\tdet\t_\t_
2\tgarçon\tgarçon\tNOUN\t_\t_\t3\tsubj\t_\t_
3\tvoit\tvoir\tVERB\t_\t_\t0\troot\t_\t_
4\tla\tle\tDET\t_\t_\t5\tdet\t_\t_
5\tmaison\tmaison\tNOUN\t_\t_\t3\tcomp:obj\t_\t_
""")

# The rule, followed by its inline lexicon (lex).
rule = GRS("""
rule set_gender {
pattern { N [upos=NOUN, !Gender, lemma=lex.noun] }
commands { N.Gender = lex.Gender }
}
#BEGIN lex
noun\tGender
%-------------
garçon\tMasc
maison\tFem
#END
""")

# Rewrite with the strategy Onf(set_gender), then print the result.
output = rule.apply(graph, strat = "Onf(set_gender)")
print(output.to_conll())
The code above outputs the following structure where the two nouns have their gender values.
Using several lexicons
The file obl_loc.grs below defines a rule which changes the relation obl into obl:loc when both the verb and the preposition are found in their respective lexicons.
rule obl_loc {
pattern {
e: VERB -[obl]-> OBL; OBL -[case]-> ADP;
VERB [lemma = loc_verb.lemma];
ADP [lemma = loc_prep.lemma];
}
commands { del_edge e; add_edge VERB -[obl:loc]-> OBL; }
}
#BEGIN loc_verb
lemma
%--------------
aller
venir
#END
#BEGIN loc_prep
lemma
%--------------
à
dans
sur
vers
#END
The file max.conll contains the following sentence:
With the command grew transform -grs obl_loc.grs -strat "Onf(obl_loc)" -i max.conll, the rule above is applied twice and produces the following graph:
Using the same lexicon twice
If the file transitive_verbs.lex contains a list of transitive verbs, the following rule distributes the obj relation when two transitive verbs are coordinated.
rule transitive_coord (lex_1 from "transitive_verbs.lex", lex_2 from "transitive_verbs.lex") {
pattern {
VERB1 [lemma=lex_1.lemma]; VERB1 -[conj]-> VERB2; VERB2 [lemma=lex_2.lemma];
VERB2 -[obj]-> OBJ; VERB2 << OBJ;
}
without { VERB1 -[E:obj]-> OBJ; }
commands { add_edge VERB1 -[E:obj]-> OBJ; }
}
This rule can be used to turn the left part below into the right part: