Grew-match: Online Graph Matching

Grew-match is a single-page web application for searching graph patterns in treebanks. Since March 2022, Grew-match has been split into several instances, each with its own URL. The address http://match.grew.fr now displays a portal with links to the instances.

The http://universal.grew.fr instance

This instance contains the latest version of the UD and SUD treebanks, plus a few more recent versions synchronised with GitHub data. The top navbar gives access to:

• UD 2.10: The 228 treebanks of the version 2.10 of UD
• SUD 2.10: The 227 treebanks of the version 2.10 of SUD
• UD Latest:
  • suffix @dev: corpora in their latest version available on the dev branch on GitHub (English, French, Irish and Portuguese). If you want access to the dev branch of another UD treebank, please contact us.
  • suffix @conv: the automatic conversion of the native SUD treebanks into UD.
• SUD Latest:
  • suffix @latest: latest version available on GitHub of the native SUD corpora.

Other instances

The page Local installation of Grew-match describes how you can run your own instance of Grew-match.

Basic usage

1. Select the instance and then the corpus on which you want to search.
2. Enter the search pattern in the text area (you may use some snippets on the right of the text area)
3. Click on Search

The number of items is displayed and the first 10 items can be explored. If you want to see the next 10 items, click on More results.

To limit server usage, only the first 1000 items are computed. If the searched pattern is found more than 1000 times, the proportion of the corpus used to find the first 1000 items is reported. For instance, if you search for a nsubj relation in the UD_French-GSD corpus, the message is More than 1000 results found in 5.29% of the corpus. This means that the first 1000 items were found in 5.29% of the 16,341 sentences of the UD_French-GSD corpus.
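The cut-off and the reported percentage can be pictured with the following Python sketch. It only illustrates the behaviour described above and is not the Grew-match server code; `search_with_cap`, `nsubj_positions` and the toy corpus are hypothetical names introduced for the example.

```python
def search_with_cap(sentences, matches_in, cap=1000):
    """Scan sentences in order and stop once `cap` occurrences are collected.

    Returns the occurrences kept and the number of sentences scanned.
    """
    found = []
    scanned = 0
    for sentence in sentences:
        scanned += 1
        found.extend(matches_in(sentence))
        if len(found) >= cap:
            del found[cap:]  # keep only the first `cap` occurrences
            break
    return found, scanned


# Toy corpus (hypothetical): each sentence is just the list of its dependency labels.
toy_corpus = [["nsubj", "root"], ["root"], ["nsubj", "nsubj", "root"], ["nsubj", "root"]]


def nsubj_positions(sentence):
    """Return the positions of the tokens attached with an nsubj relation."""
    return [i for i, label in enumerate(sentence) if label == "nsubj"]


found, scanned = search_with_cap(toy_corpus, nsubj_positions, cap=3)
percent = 100 * scanned / len(toy_corpus)
print(f"{len(found)} results found in {percent:.2f}% of the corpus")
```

With the figures quoted above, 5.29% of 16,341 sentences is roughly 864 sentences scanned before the limit of 1000 items was reached.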

Learning syntax

A tutorial with a progressive sequence of patterns is available. You may also explore the snippets given on the right of the text area to learn from other examples. More comprehensive documentation is available on the patterns page.

Clustering the occurrences

In addition to the main request, it is possible to cluster the set of occurrences returned by this request.

Use a clustering key

When a clustering key is used, the set of occurrences (or the first 1000 occurrences) is split into subsets depending on the key value. Each possible value is presented as a button with the size of the associated subset; the button gives access to the corresponding occurrences. The clustering key can be:

• N.f: cluster by the feature named f of the node N present in the (positive part of the) main request
  • List lemmas of auxiliaries in UD_Polish-LFG
  • List VerbForm of verbs without nsubj in UD_German-GSD
  • Find the huge number of forms associated with the lemma saada in UD_Finnish-FTB
• e.label: cluster by the full label of the edge e present in the (positive part of the) main request
  • List relations used for auxiliaries in UD_Italian-ParTUT
• e.f: cluster by the edge feature named f of a named edge e present in the (positive part of the) main request
  • List subtypes used with the acl relation in UD_Swedish-Talbanken
• e.length: cluster by the length of the edge e present in the (positive part of the) main request
  • Observe the length of the amod relation in UD_Korean-PUD
• e.delta: cluster by the relative position of the governor and the dependent of the edge e present in the (positive part of the) main request
  • Observe the relative positions of nsubj related tokens in UD_Naija-NSC
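As a rough illustration of what a clustering key does, the Python sketch below groups a few hand-written occurrences by key value and prints one "button" per value with the size of its subset. The data layout, the `cluster` helper, the ordering by size and the sign convention used for the delta are all assumptions made for the example; they are not taken from Grew-match itself.

```python
from collections import defaultdict


def cluster(occurrences, key):
    """Group occurrences by the value returned by `key` (largest subsets first)."""
    clusters = defaultdict(list)
    for occ in occurrences:
        clusters[key(occ)].append(occ)
    return sorted(clusters.items(), key=lambda item: -len(item[1]))


# Hypothetical occurrences: node N with its features, edge e with its label
# and the positions of its governor and dependent in the sentence.
occurrences = [
    {"N": {"lemma": "be"}, "e": {"label": "aux", "gov": 3, "dep": 2}},
    {"N": {"lemma": "have"}, "e": {"label": "aux", "gov": 4, "dep": 2}},
    {"N": {"lemma": "be"}, "e": {"label": "aux:pass", "gov": 2, "dep": 1}},
]

by_lemma = cluster(occurrences, lambda o: o["N"].get("lemma"))    # like N.lemma
by_label = cluster(occurrences, lambda o: o["e"]["label"])        # like e.label
by_length = cluster(occurrences, lambda o: abs(o["e"]["dep"] - o["e"]["gov"]))  # like e.length
# Assumed sign convention: dependent position minus governor position.
by_delta = cluster(occurrences, lambda o: o["e"]["dep"] - o["e"]["gov"])        # like e.delta

for value, subset in by_lemma:
    print(f"{value} [{len(subset)}]")
```

Each `(value, subset)` pair plays the role of one button: the value is the label and the subset holds the occurrences reached by clicking it.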

Use a “whether” sub pattern

A “whether” sub pattern contains a list of clauses (as in pattern or without constructions). The set of occurrences (or the first 1000 occurrences) is split in two subsets:

• one tagged No corresponds to the subset of occurrences where the “whether” sub pattern cannot be fulfilled (the “whether” is interpreted like a without)
• one tagged Yes is the complement of the No subset and so corresponds to the occurrences where the sub pattern can be found.

Note that no curly brackets are needed in the “whether” text area (see examples below).
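The Yes/No split can be sketched in Python as a simple partition. This is an illustration under assumptions: `split_whether` and the occurrence layout are hypothetical, and the test used here (the governor precedes the dependent) merely stands in for a real “whether” sub pattern.

```python
def split_whether(occurrences, whether_holds):
    """Partition occurrences into a Yes subset and its complement, the No subset."""
    result = {"Yes": [], "No": []}
    for occ in occurrences:
        result["Yes" if whether_holds(occ) else "No"].append(occ)
    return result


# Hypothetical occurrences of a relation, with governor and dependent positions.
occurrences = [
    {"gov": 1, "dep": 4},
    {"gov": 5, "dep": 2},
    {"gov": 2, "dep": 3},
]

# Stand-in for a "whether" clause asking if the relation is left-headed.
split = split_whether(occurrences, lambda o: o["gov"] < o["dep"])
print({tag: len(subset) for tag, subset in split.items()})
```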

Examples

• Is advcl left-headed in UD_Hungarian-Szeged?
• In UD_English-GUM, how often does the relation expl appear with or without nsubj?
• In UD_French-GSD, there are 627 left-headed nsubj (or subtypes):
  • How often is it in an interrogative sentence?
  • How often is it in a relative clause?
  • How often is there an expletive subject?

The fields 2, 3, 4 and 5 of CoNLL-U files are considered as features with the following feature names.

CoNLL-U field   2      3       4      5
Name            form   lemma   upos   xpos

For instance:

• searching for the word is: pattern { N [form="is"] }
• searching for the lemma be: pattern { N [lemma="be"] }
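The mapping from CoNLL-U columns to feature names can be made concrete with a short Python sketch. It assumes the standard tab-separated, ten-column CoNLL-U token line; `token_features` and the sample line are hypothetical and only show which column feeds which feature.

```python
# CoNLL-U column number (1-based) -> feature name used in patterns.
FIELD_NAMES = {2: "form", 3: "lemma", 4: "upos", 5: "xpos"}


def token_features(conllu_line):
    """Extract the form, lemma, upos and xpos features from one token line."""
    columns = conllu_line.rstrip("\n").split("\t")
    return {name: columns[index - 1] for index, name in FIELD_NAMES.items()}


# Hypothetical token line for the word "is".
line = "\t".join(["2", "is", "be", "AUX", "VBZ", "Mood=Ind|Tense=Pres", "3", "cop", "_", "_"])
features = token_features(line)
print(features)

# This token would be found by both patterns above.
print(features["form"] == "is", features["lemma"] == "be")
```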

Display options

Below the text area, a few options are available:

• lemma: if checked, the lemma (CoNLL-U column 3) is shown in output
• upos: if checked, the upos (CoNLL-U column 4) is shown in output
• xpos: if checked, the xpos (CoNLL-U column 5) is shown in output
• features: if checked, other features (CoNLL-U column 6 and column 10) are shown
• textform/wordform: if checked, additional features textform and wordform (see CoNLL-U doc) are shown
• sentence order: 3 values are available
  • initial: the sentences are scanned in the order in which they appear in the original corpus
  • by length: the shortest sentences (in terms of number of tokens) are scanned first
  • shuffle: the set of sentences is shuffled before searching the pattern (useful to search randomly for examples in a corpus)
• context: if checked, the previous and the following sentences are shown (of course, this is useful only on corpora where original sentences ordering is preserved)
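The three sentence order values can be sketched in Python as follows. This is an illustration only: `order_sentences`, its `mode` strings and the `seed` parameter are hypothetical names, and sentences are modelled as plain lists of tokens.

```python
import random


def order_sentences(sentences, mode="initial", seed=None):
    """Return the sentences in the order in which they would be scanned."""
    if mode == "initial":
        return list(sentences)  # corpus order
    if mode == "length":
        return sorted(sentences, key=len)  # shortest sentences first
    if mode == "shuffle":
        shuffled = list(sentences)
        random.Random(seed).shuffle(shuffled)  # random order
        return shuffled
    raise ValueError(f"unknown sentence order: {mode}")


# Hypothetical corpus of three sentences.
corpus = [["a", "b", "c"], ["d"], ["e", "f"]]
print(order_sentences(corpus, "length"))
print(order_sentences(corpus, "shuffle", seed=0))
```

Since sentences are scanned in the chosen order, the order presumably also decides which occurrences are reached first when a request has more than 1000 results.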

Enhanced dependencies

In the UD framework, a few corpora are also provided with another annotation layer, EUD (Enhanced dependencies). For these corpora, a switch button is available (above the text area) where the user can choose between UD and EUD.

If EUD is selected, enhanced dependencies are displayed in blue below the sentence. In the pattern, an enhanced dependency can be searched with the prefix E:. For instance, the pattern below searches for an enhanced obj relation in UD_English-EWT without a non-enhanced counterpart:

pattern { N -[E:obj]-> M }
without { N -[obj]-> M }
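To see what such a request looks for in the underlying data, the Python sketch below reads the same information directly from CoNLL-U lines, where column 7 (HEAD) and column 8 (DEPREL) hold the basic dependency and column 9 (DEPS) holds the enhanced ones as head:relation pairs separated by |. The `enhanced_only` helper and the sample sentence are hypothetical; this is not how Grew-match evaluates the pattern.

```python
def enhanced_only(conllu_lines, relation):
    """Yield (governor id, dependent id) pairs linked by an enhanced `relation`
    that has no identical basic relation between the same two tokens."""
    for line in conllu_lines:
        cols = line.split("\t")
        if len(cols) != 10 or cols[8] == "_":
            continue  # skip comments, blank lines and tokens without enhanced deps
        token_id, head, deprel, deps = cols[0], cols[6], cols[7], cols[8]
        for item in deps.split("|"):
            e_head, e_rel = item.split(":", 1)
            if e_rel == relation and (e_head, e_rel) != (head, deprel):
                yield e_head, token_id


# Hypothetical sentence "She bought and ate apples" with a shared object.
rows = [
    ("1", "She", "she", "PRON", "PRP", "_", "2", "nsubj", "2:nsubj|4:nsubj", "_"),
    ("2", "bought", "buy", "VERB", "VBD", "_", "0", "root", "0:root", "_"),
    ("3", "and", "and", "CCONJ", "CC", "_", "4", "cc", "4:cc", "_"),
    ("4", "ate", "eat", "VERB", "VBD", "_", "2", "conj", "2:conj:and", "_"),
    ("5", "apples", "apple", "NOUN", "NNS", "_", "2", "obj", "2:obj|4:obj", "_"),
]
lines = ["\t".join(row) for row in rows]

pairs = list(enhanced_only(lines, "obj"))
print(pairs)
```

In this toy sentence, the only pair reported links token 4 (ate) to token 5 (apples): the obj relation from token 2 also exists as a basic dependency and is therefore excluded, as the without clause does in the pattern.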


Contact

For any remark or request, you can either contact us or open an issue on GitHub.