Pattern syntax

A pattern is defined by three parts, all of which are optional: a positive pattern, negative patterns and a global pattern.

The global matching process is:

Note that if there is more than one negative pattern, they are all interpreted independently.
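The independence of negative patterns can be illustrated with a small Python model. This is only a sketch under assumptions, not Grew's implementation: it assumes the positive pattern is matched first and that each negative pattern then discards, on its own, the matchings it applies to. The toy graph, the `VERBS` list and the helper names `has_out_edge` and `apply_negatives` are hypothetical.

```python
from typing import Callable, Dict, List

Matching = Dict[str, int]  # pattern node identifier -> graph node id

# Toy graph (hypothetical data): edges as (governor, label, dependent).
EDGES = [(1, "nsubj", 2), (1, "obj", 3), (4, "nsubj", 5), (6, "obj", 7)]
VERBS = [1, 4, 6, 8]


def has_out_edge(label: str) -> Callable[[Matching], bool]:
    """Build a negative test: does node N have an outgoing edge with this label?"""
    def test(m: Matching) -> bool:
        return any(src == m["N"] and lab == label for src, lab, _ in EDGES)
    return test


def apply_negatives(positive: List[Matching],
                    negatives: List[Callable[[Matching], bool]]) -> List[Matching]:
    """Keep the matchings that none of the negative tests accepts."""
    # Each negative test is evaluated on its own against the matching.
    return [m for m in positive if not any(neg(m) for neg in negatives)]


positive = [{"N": v} for v in VERBS]
kept = apply_negatives(positive, [has_out_edge("nsubj"), has_out_edge("obj")])
print(kept)  # [{'N': 8}]
```

In this model, a verb is discarded as soon as one negative test applies to it, so only the verb with neither a subject nor an object is kept.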

The basic syntax of patterns in Grew can be learned from the tutorial part of the Grew-match tool.

Positive and negative patterns

Positive and negative patterns both follow the same syntax. These patterns are described by a list of clauses: node clauses, edge clauses and additional constraints.

Node clauses

In a node clause, a node is described by an identifier and some constraints on the feature structure.

N [upos = VERB, Mood = Ind|Imp, Tense <> Fut, Number, !Person, lemma = "être" ]

The clause above illustrates the kinds of constraints that can be expressed, in turn:
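One way to read the example clause is as a series of tests on a node's feature structure. The Python sketch below models that reading with a plain dictionary. It is not Grew code, and the function name `node_matches` is hypothetical. In particular, the treatment of `Tense <> Fut` when `Tense` is absent is an assumption made here and should be checked against the Grew documentation.

```python
from typing import Dict


def node_matches(fs: Dict[str, str]) -> bool:
    """Test a feature structure against the example clause for N."""
    return (
        fs.get("upos") == "VERB"              # upos = VERB
        and fs.get("Mood") in {"Ind", "Imp"}  # Mood = Ind|Imp
        and fs.get("Tense") != "Fut"          # Tense <> Fut (assumption: an absent Tense passes)
        and "Number" in fs                    # Number: defined, whatever its value
        and "Person" not in fs                # !Person: not defined
        and fs.get("lemma") == "être"         # lemma = "être"
    )


ok = {"upos": "VERB", "Mood": "Ind", "Tense": "Pres", "Number": "Sing", "lemma": "être"}
with_person = dict(ok, Person="3")
future = dict(ok, Tense="Fut")

print(node_matches(ok), node_matches(with_person), node_matches(future))  # True False False
```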

Edge clauses

All edge clauses below require the existence of an edge between the node selected by N and the node selected by M, possibly with additional constraints:

Edges may also be named with an identifier for later use (in commands, for instance):

Note that an edge clause may refer to undeclared nodes; these nodes are then implicitly declared without any constraint. For instance, the two patterns below are equivalent:

pattern { N -[nsubj]-> M }
pattern { N[]; M[]; N -[nsubj]-> M }
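The equivalence can be pictured with a small Python model of edge matching. It is a sketch over a hypothetical edge list, not Grew's implementation: since N and M carry no constraint, the pattern yields one matching per edge labelled `nsubj`, whether or not the nodes are declared explicitly.

```python
from typing import Dict, List, Tuple

Edge = Tuple[int, str, int]  # (source node, label, target node)


def match_edge(edges: List[Edge], label: str) -> List[Dict[str, int]]:
    """Return one matching {N, M} per edge carrying the given label."""
    # N and M are unconstrained, like the implicit declarations N[] and M[].
    return [{"N": src, "M": tgt} for src, lab, tgt in edges if lab == label]


edges = [(2, "nsubj", 1), (2, "obj", 3), (5, "nsubj", 4)]
print(match_edge(edges, "nsubj"))  # [{'N': 2, 'M': 1}, {'N': 5, 'M': 4}]
```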

Since version 1.2, more complex edges can be used, see here.

Additional constraints

These constraints do not identify new elements in the graph, but every matching must satisfy them.

When two or more nodes are equivalent in a pattern, each occurrence of the pattern in a graph will be found several times (up to permutation in the sets of equivalent nodes). For instance, in the pattern below, the 3 nodes N1, N2 and N3 are equivalent.

pattern { N1 -[ARG1]-> N; N2 -[ARG1]-> N; N3 -[ARG1]-> N; }

This pattern is found 120 times in the Little Prince corpus (Grew-match), but there are only 20 different occurrences: each one is reported 6 times, once for each of the 3! = 6 permutations of N1, N2 and N3. To avoid this, a constraint such as id(N1) < id(N2) can be used. It imposes an ordering on some internal representation of the nodes and so avoids these permutations.

The pattern below returns the 20 expected occurrences (Grew-match):

pattern {
    N1 -[ARG1]-> N; N2 -[ARG1]-> N; N3 -[ARG1]-> N;
    id(N1) < id(N2); id(N2) < id(N3);
}
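The effect of the ordering constraint can be checked on a toy graph with a short Python sketch. It is a model only, not Grew code: the edge list and the function name `matches` are hypothetical, and matching is assumed to send distinct pattern nodes to distinct graph nodes, which is consistent with the factor of 6 reported above.

```python
from itertools import permutations

# Toy graph (hypothetical data): four nodes with an ARG1 edge to the same target.
edges = [(1, "ARG1", 0), (2, "ARG1", 0), (3, "ARG1", 0), (4, "ARG1", 0)]


def matches(edges, ordered=False):
    """Enumerate matchings of the three-ARG1 pattern, optionally with id ordering."""
    by_target = {}
    for src, label, tgt in edges:
        if label == "ARG1":
            by_target.setdefault(tgt, []).append(src)
    found = []
    for tgt, sources in by_target.items():
        # N1, N2 and N3 must be three distinct sources of the same target N.
        for n1, n2, n3 in permutations(sources, 3):
            if ordered and not (n1 < n2 < n3):
                continue  # models id(N1) < id(N2); id(N2) < id(N3)
            found.append({"N": tgt, "N1": n1, "N2": n2, "N3": n3})
    return found


all_matches = matches(edges)
unique_matches = matches(edges, ordered=True)
print(len(all_matches), len(unique_matches))  # 24 4
```

Without the ordering, each set of three sources is reported 3! = 6 times; with it, each set is reported once.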

Global pattern

Global patterns were introduced in version 1.2 to let the user express constraints on the whole graph. Currently, constraints may be expressed with a fixed list of keywords. We plan to add more constraints in the near future. Please drop us a feature request if you would like to suggest one. We describe below 4 of the constraints available in version 1.2. For each one, its negation is available by replacing the is_ prefix with the is_not_ prefix.