# Grew Tutorial • Lesson 6 • More commands

Let us continue converting the Sequoia POS tagging to the SUD POS tagging.

We recall the two formats:

[Figure: sentence frwiki_50.1000_00907 annotated in the Sequoia format and in the SUD format]

You can observe that, in addition to a different POS tagset, the SUD format also uses a different tokenisation. In Sequoia, the word du of the input sentence is a single token du with POS P+D, but it is in fact an amalgam of two lexical units: a preposition and a determiner (this is exactly what the tag P+D means). In SUD, such combined tags are not allowed, so the word du is annotated with two tokens, de and le.

## The command add_node

So, we have to design a rule that produces this new tokenisation. The commented rule below performs this transformation (file: amalgam1.grs):

rule amalgam {
  pattern { N [form = "du", upos = "P+D"] }
  commands {
    add_node D :> N;    % Create a new node called D and place it just after N
    N.form = "de";      % Change the form of N to de
    N.upos = ADP;       % Set the ADP tag for the preposition "de"
    D.form = "le";      % Set the form feature of D to le
    D.upos = DET;       % Set the DET tag for the determiner "le"
  }
}

This is our first rule in this tutorial with more than one command. In general, the transformation is described by a sequence of commands which are applied successively to the current graph.

The application of this rule to our input graph builds:

Good, we have the tokenisation we expected, but the new node for “le” is not linked to the rest of the graph. We could imagine connecting it later with some other rule, but this may be dangerous: for an input sentence with several occurrences of the word “du”, the application of Onf (amalgam) will build a graph with several isolated nodes “le”, and it may then be difficult to pair the “right” determiner with the “right” noun! In practice, it is safer to avoid building disconnected graphs.
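The strategy Onf (amalgam) mentioned above can be made concrete with a short sketch. It assumes that the rule of amalgam1.grs and a strategy declaration live in the same file. The strategy name main is an assumption made for illustration; the original file may declare it differently.

```grew
% Sketch only: rule copied from amalgam1.grs, followed by a strategy declaration.
rule amalgam {
  pattern { N [form = "du", upos = "P+D"] }
  commands {
    add_node D :> N;
    N.form = "de";
    N.upos = ADP;
    D.form = "le";
    D.upos = DET;
  }
}

% The name "main" is an assumption; Onf applies amalgam until it no longer matches.
strat main { Onf (amalgam) }
```

Each application changes the form of the matched node from du to de, so that node cannot be matched again and the rewriting stops once every du has been split. With this first version of the rule, each of those applications leaves one unattached node le behind.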

## The command add_edge

In our example, the rule itself should connect the new node to the relevant noun. This can be done with the command add_edge M -[det]-> D, where M is the node for the word doigt. But to be able to use the node M in the commands part, it must first be declared in the pattern part.

The new rule is then (file: amalgam2.grs):

rule amalgam {
  pattern {
    N [form = "du", upos = "P+D"];    % Match the amalgam word "du"
    N -[obj.p]-> M;                   % Match the node linked to "du" with the obj.p relation
  }
  commands {
    add_node D :> N;        % Create a new node called D and place it just after N
    N.form = "de";          % Change the form of N to de
    N.upos = ADP;           % Set the ADP tag for the preposition "de"
    D.form = "le";          % Set the form feature of D to le
    D.upos = DET;           % Set the DET tag for the determiner "le"
    add_edge M -[det]-> D;  % Add the det dependency from M to the new node D
  }
}

The application of this rule to our input graph builds:

TODO: dealing with the special encoding of Multi-Word Tokens in (S)UD with wordform and textform.