Grew-count web service

The Grew-count web service is available at http://count.grew.fr. It is still in development and may change in the near future.

The Grew-count web service takes a list of Grew patterns and a list of corpora, and returns a TSV file giving the number of occurrences of each pattern in each corpus.


The count service

The URL of the main service is http://count.grew.fr/count and it must be called with two POST parameters: corpora and patterns.
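
At the HTTP level, this presumably amounts to a form-encoded POST body with two fields, since that is what the Python example later on this page sends through requests. Here is a minimal sketch using only the standard library; it builds the request without sending it. The helper name build_count_request is hypothetical, and the form encoding is an assumption drawn from that example.

```python
import json
import urllib.parse
import urllib.request

COUNT_URL = "http://count.grew.fr/count"


def build_count_request(corpora, patterns):
    """Build (but do not send) a POST request for the count service.

    Assumption: the service accepts a form-encoded body, which is what
    the requests library sends when it is given data=<dict>.
    """
    fields = {
        "corpora": json.dumps(corpora),    # JSON list of corpus identifiers
        "patterns": json.dumps(patterns),  # JSON dictionary: name -> pattern
    }
    body = urllib.parse.urlencode(fields).encode("utf-8")
    return urllib.request.Request(COUNT_URL, data=body, method="POST")


req = build_count_request(
    ["SUD_French-PUD@2.8"],
    {"sv": "pattern { V -[subj]-> S; S << V }"},
)
# Sending it would look like:
# urllib.request.urlopen(req, timeout=60).read().decode("utf-8")
```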

The corpora parameter must be a JSON string describing a list of corpora. For instance:

[
  "SUD_French-PUD@2.8",
  "SUD_English-PUD@2.8"
]

The available corpora are the same as the ones available on Grew-match, with the same identifiers.
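
Rather than writing the JSON by hand, the string can be produced with json.dumps. Here is a small sketch, assuming the identifiers follow the SUD_<Language>-PUD@<version> shape seen above. The helper pud_corpora is hypothetical and does not check that a corpus actually exists on Grew-match.

```python
import json


def pud_corpora(languages, version="2.8"):
    """Return SUD PUD corpus identifiers for the given language names."""
    return [f"SUD_{lang}-PUD@{version}" for lang in languages]


# JSON string suitable for the corpora parameter
corpora_param = json.dumps(pud_corpora(["French", "English"]))
print(corpora_param)
```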

The patterns parameter must be a JSON string describing a dictionary of patterns. For instance:

{
  "sv": "pattern { V -[subj]-> S; S << V }",
  "vs": "pattern { V -[subj]-> S; V << S }"
}

Again, the patterns are the same as the ones used on Grew-match. The pattern syntax can be learned through Grew-match’s tutorial, and some documentation is available on the pattern page.
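
The patterns parameter can likewise be generated from a Python dictionary. In the sketch below, the keys are the two names used above; they are also the column headers that appear in the TSV output shown at the end of this page. json.dumps additionally escapes any double quotes a pattern might contain, which is easy to get wrong when writing the JSON by hand.

```python
import json

# Dictionary mapping a short name to a Grew pattern
patterns = {
    "sv": "pattern { V -[subj]-> S; S << V }",
    "vs": "pattern { V -[subj]-> S; V << S }",
}

# JSON string suitable for the patterns parameter
patterns_param = json.dumps(patterns, indent=2)
print(patterns_param)
```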


Example of usage with Python

The web service can be called with Python’s requests library. The code below shows one way to call the web service with the two patterns above on the 20 PUD corpora of SUD 2.8.

import requests

url = "http://count.grew.fr/count"

data={
'corpora': '''[
  "SUD_Arabic-PUD@2.8",
  "SUD_Chinese-PUD@2.8",
  "SUD_Czech-PUD@2.8",
  "SUD_English-PUD@2.8",
  "SUD_Finnish-PUD@2.8",
  "SUD_French-PUD@2.8",
  "SUD_German-PUD@2.8",
  "SUD_Hindi-PUD@2.8",
  "SUD_Icelandic-PUD@2.8",
  "SUD_Indonesian-PUD@2.8",
  "SUD_Italian-PUD@2.8",
  "SUD_Japanese-PUD@2.8",
  "SUD_Korean-PUD@2.8",
  "SUD_Polish-PUD@2.8",
  "SUD_Portuguese-PUD@2.8",
  "SUD_Russian-PUD@2.8",
  "SUD_Spanish-PUD@2.8",
  "SUD_Swedish-PUD@2.8",
  "SUD_Thai-PUD@2.8",
  "SUD_Turkish-PUD@2.8"
]
''',
'patterns': '''{
  "sv": "pattern { V -[subj]-> S; S << V }",
  "vs": "pattern { V -[subj]-> S; V << S }"
}
'''}

response = requests.post(url, data=data)

print(response.text)

The script should produce the following TSV output:

Corpus	# sentences	sv	vs
SUD_Arabic-PUD@2.8	1000	460	966
SUD_Chinese-PUD@2.8	1000	1836	12
SUD_Czech-PUD@2.8	1000	926	376
SUD_English-PUD@2.8	1000	1344	76
SUD_Finnish-PUD@2.8	1000	1013	93
SUD_French-PUD@2.8	1000	1352	64
SUD_German-PUD@2.8	1000	1123	386
SUD_Hindi-PUD@2.8	1000	1133	4
SUD_Icelandic-PUD@2.8	1000	1408	436
SUD_Indonesian-PUD@2.8	1000	1421	137
SUD_Italian-PUD@2.8	1000	1024	136
SUD_Japanese-PUD@2.8	1000	1493	0
SUD_Korean-PUD@2.8	1000	1565	0
SUD_Polish-PUD@2.8	1000	858	223
SUD_Portuguese-PUD@2.8	1000	1214	101
SUD_Russian-PUD@2.8	1000	1155	254
SUD_Spanish-PUD@2.8	1000	1064	165
SUD_Swedish-PUD@2.8	1000	1162	383
SUD_Thai-PUD@2.8	1000	1666	5
SUD_Turkish-PUD@2.8	1000	1326	6
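
The TSV output can then be processed with Python's csv module. As an illustration, the sketch below computes, for each corpus, the share of subject-before-verb occurrences among all counted subject relations. It runs on a few rows copied from the output above; in practice the text would come from response.text. The function name sv_ratio_by_corpus is hypothetical.

```python
import csv
import io

# A few rows copied from the output above (tab-separated).
tsv_text = (
    "Corpus\t# sentences\tsv\tvs\n"
    "SUD_English-PUD@2.8\t1000\t1344\t76\n"
    "SUD_French-PUD@2.8\t1000\t1352\t64\n"
    "SUD_Japanese-PUD@2.8\t1000\t1493\t0\n"
)


def sv_ratio_by_corpus(text):
    """Map each corpus to sv / (sv + vs), or NaN when both counts are 0."""
    ratios = {}
    for row in csv.DictReader(io.StringIO(text), delimiter="\t"):
        sv, vs = int(row["sv"]), int(row["vs"])
        total = sv + vs
        ratios[row["Corpus"]] = sv / total if total else float("nan")
    return ratios


for corpus, ratio in sv_ratio_by_corpus(tsv_text).items():
    print(f"{corpus}\t{ratio:.3f}")
```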