CoNLL-U Plus Format

Grew partially supports the CoNLL-U Plus Format.

Columns declaration

If a CoNLL file starts with the initial line # global.columns = …, this declaration is taken into account when parsing the file.
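
To make the effect of the declaration concrete, here is a minimal Python sketch of how a reader could pick up the column names from the first line and fall back to the ten standard CoNLL-U columns otherwise. It is an illustration only, not Grew's implementation; the function names `read_columns` and `parse_token` and the sample declaration are assumptions.

```python
from typing import Dict, List

# The ten standard CoNLL-U columns, used when no declaration is present.
CONLLU_COLUMNS = ["ID", "FORM", "LEMMA", "UPOS", "XPOS",
                  "FEATS", "HEAD", "DEPREL", "DEPS", "MISC"]

PREFIX = "# global.columns ="


def read_columns(first_line: str) -> List[str]:
    """Return the declared column names, or the CoNLL-U default (hypothetical helper)."""
    line = first_line.strip()
    if line.startswith(PREFIX):
        # Column names are separated by whitespace after the '=' sign.
        return line[len(PREFIX):].split()
    return list(CONLLU_COLUMNS)


def parse_token(line: str, columns: List[str]) -> Dict[str, str]:
    """Map the tab-separated fields of a token line to the column names."""
    return dict(zip(columns, line.rstrip("\n").split("\t")))


# Illustrative declaration; the column list is an assumption for this example.
columns = read_columns("# global.columns = ID FORM UPOS HEAD DEPREL MISC PARSEME:MWE\n")
token = parse_token("1\tLe\tDET\t2\tdet\t_\t*\n", columns)
```

With this sketch, `token["PARSEME:MWE"]` holds the value of the seventh field, and a file without the declaration is read with the default column order.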

To specify the columns declaration used in output, use the command line argument -columns …. For instance, -columns "ID FORM UPOS" produces a 3-column output with token id, word form and universal POS.
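
The projection performed on output can be pictured with the following Python sketch. It only illustrates the idea of keeping the requested columns in the requested order; the function `project` and the use of `_` for a missing value are assumptions, not a description of Grew's code.

```python
from typing import Dict, List


def project(token: Dict[str, str], columns: List[str]) -> str:
    """Render one token line restricted to the given columns (hypothetical helper).

    A column absent from the token is written as '_' (assumption: this follows
    the CoNLL-U convention for unspecified values).
    """
    return "\t".join(token.get(name, "_") for name in columns)


token = {"ID": "1", "FORM": "Le", "LEMMA": "le", "UPOS": "DET", "HEAD": "2"}
line = project(token, ["ID", "FORM", "UPOS"])
```

Here `line` contains the three fields `1`, `Le` and `DET` separated by tabs.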

New columns

In addition to CoNLL-U columns, two column declarations are handled by Grew:

In both cases, Grew interprets each annotated element as a new node linked to all the tokens it contains.
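
The resulting structure can be sketched in Python as follows. The sketch assumes the annotations have already been grouped into spans (annotation identifier mapped to the ids of the tokens it covers); the function name, the node identifiers and the edge representation are hypothetical and do not reflect Grew's internal data structures.

```python
from typing import Dict, List, Tuple


def add_annotation_nodes(
    spans: Dict[str, List[str]]
) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Create one new node per annotated element, linked to each token it contains."""
    nodes: List[str] = []
    edges: List[Tuple[str, str]] = []
    for annotation_id, token_ids in spans.items():
        nodes.append(annotation_id)  # the new node for the annotated element
        for token_id in token_ids:
            edges.append((annotation_id, token_id))  # link from node to token
    return nodes, edges


# Hypothetical example: one annotated element covering tokens 3 and 4.
nodes, edges = add_annotation_nodes({"MWE_1": ["3", "4"]})
```

In this example, one node `MWE_1` is created with two edges, one to token 3 and one to token 4.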

Below is an example of these annotations on UD_French-Sequoia (blue: MWE annotation of the Parseme project, orange: NE annotation of the Parseme project, green: semantic noun annotation of the FrSemCor project).

NB: other aspects of the CoNLL-U Plus Format, such as cross-references, are not handled.