Compute the connected components of a graph

The strategy main of the GRS below (cc.grs) outputs a set of graphs, each one being one of the connected components of the input graphs.

rule select_one {                         % This rule applies exactly once on each connected component of the graph.
  pattern { N[!select] }                  % Select an unselected node N
  % Nodes are totally ordrerd though an internal "__id__" value;
  % we use this in the two without clauses below to ensure that only one node can be chosen in a given connected component:
  % the "smallest" one, according to "__id__" order
  without { N -> M; M.__id__ < N.__id__}  % such that there is no "smaller" node M linked as a successor of N
  without { M -> N; M.__id__ < N.__id__}  % add such that there is no "smaller" node M linked as a predecessor of N
  commands { N.select=yes }
}

package propagate {
  rule down {
    pattern { M[select]; N[!select]; M -> N }
    commands { N.select = yes }
  }
  rule up {
    pattern { M[!select]; N[select]; M -> N }
    commands { M.select = yes }
  }
}

rule remove_unselected {
  pattern { N[!select]; }
  commands { del_node N; }
}

rule clean {
  pattern { N[select]; }
  commands { del_feat N.select; }
}

strat main {
  Seq (
    select_one,
    Onf (propagate),
    Onf (remove_unselected),
    Onf (clean),
  )
}

Explaining the GRS

Applied on a graph G, the GRS works as follow:

  1. Apply the select_one rule which output exactly one new graph for each connected component (see comments in rule for detail); in the graph one of the node of the selected connected component contained the features select=yes
  2. Apply iteratively the package propagate in order to mark all the node of the connected component with the feature select=yes
  3. Remove all unselected nodes that are not in the current connected component
  4. Remove the feature select=yes on all nodes

Examples

Basic two components graph

With the graph two_cc.json, and the command:

grew transform -grs cc.grs -i two_cc.json -json

We obtained two output graphs, one for each connected component in the input graph.

Input Output 1 Output 2
two_cc cc1 cc1

Example with linguistic data

Starting with the sentence below:

fr-ud-train_04997

We apply to it the following GRS which removes appos relations:

rule r { pattern { e: M -[appos]-> N} commands {del_edge e }}
strat main { Onf (r) }

We obtained the following structure:

fr-ud-train_04997_split

If we applied the GRS above, we obtained 3 small structures:

cc1 cc2 cc3
cc1 cc2 cc3