| Abaza ATB |
only spoken |
2.11 |
https://github.com/UniversalDependencies/UD_Abaza-ATB |
spoken |
alexeykochevoy@gmail.com |
98 |
652 |
yes |
|
|
text_rus |
text_transcription;text_orth |
|
text_name |
|
|
|
|
|
|
|
|
|
|
|
| Alemannic DIVITAL |
mixed |
2.17 |
https://github.com/UniversalDependencies/UD_Alemannic-DIVITAL |
fiction nonfiction legal spoken wiki bible |
dbernhard@unistra.fr |
977 |
19334 |
|
|
|
language variety;language_glottolog |
text_orig;annotation |
author |
newdoc id;title;domain;factuality;sample_type;origin |
channel;form;discourse_type;genre;audience;translator;transcriber;source |
|
|
|
|
|
|
|
|
|
|
| Beja Autogramm |
only spoken |
2.8 |
https://github.com/UniversalDependencies/UD_Beja-Autogramm |
spoken |
martine.vanhove@cnrs.fr; sylvain@kahane.fr |
763 |
11951 |
yes |
|
sound_url |
text_en |
phonetic_text |
speaker_id |
|
|
sent_timecode |
|
|
|
|
|
|
|
|
|
| Bokota ChibErgIS |
only spoken |
2.16 |
https://github.com/UniversalDependencies/UD_Bokota-ChibErgIS |
spoken |
marie.benzerrak@laposte.net |
406 |
2713 |
yes |
|
sound_url |
text_en |
text_ortho; morphemic_text |
speaker_id |
sent_timecode |
|
|
|
|
|
|
|
|
|
|
|
| Bororo BDT |
mixed |
2.12 |
https://github.com/UniversalDependencies/UD_Bororo-BDT |
grammar-examples spoken nonfiction bible |
fabricio.gerardi@uni-tuebingen.de |
21384 |
160356 |
|
corpus metadata |
media data |
languages and translation(s) |
transcription and annotation levels available |
speaker metadata |
doc (and paragraphs) metadata |
modality metadata |
sent metadata |
varia |
|
|
|
|
|
|
|
|
| Cantonese HK |
only spoken |
2.1 |
https://github.com/UniversalDependencies/UD_Cantonese-HK |
spoken |
tswong-c@my.cityu.edu.hk; jsylee@cityu.edu.hk |
1004 |
13918 |
yes |
|
|
|
|
|
_filename |
|
parallel_id |
|
|
|
|
|
|
|
|
|
| Central_Romani Selice |
only spoken |
2.16 |
https://github.com/UniversalDependencies/UD_Central_Romani-Selice |
spoken |
zeman@ufal.mff.cuni.cz |
|
|
yes |
corpus metadata |
media data |
languages and translation(s) |
transcription and annotation levels available |
speaker metadata |
doc (and paragraphs) metadata |
modality metadata |
sent metadata |
varia |
|
|
|
|
|
|
|
|
| Chinese HK |
only spoken |
2.1 |
https://github.com/UniversalDependencies/UD_Chinese-HK |
spoken |
tswong-c@my.cityu.edu.hk; jsylee@cityu.edu.hk |
1004 |
9874 |
yes |
corpus metadata |
media data |
languages and translation(s) |
transcription and annotation levels available |
speaker metadata |
doc (and paragraphs) metadata |
modality metadata |
sent metadata |
varia |
|
|
|
|
|
|
|
|
| Chukchi HSE |
only spoken |
2.7 |
https://github.com/UniversalDependencies/UD_Chukchi-HSE |
spoken |
ftyers@iu.edu |
1004 |
5389 |
yes |
|
|
text[eng];text[eng’];text[rus] |
text[phon] |
|
|
|
timestamp |
comment;labels |
|
|
|
|
|
|
|
|
| Classical_Nahuatl FloCo |
mixed |
2.12 |
https://github.com/UniversalDependencies/UD_Classical_Nahuatl-FloCo |
spoken fiction grammar-examples nonfiction |
pughrob@iu.edu |
|
|
|
corpus metadata |
media data |
languages and translation(s) |
transcription and annotation levels available |
speaker metadata |
doc (and paragraphs) metadata |
modality metadata |
sent metadata |
varia |
|
|
|
|
|
|
|
|
| Czech PDTC |
mixed |
1.0 |
https://github.com/UniversalDependencies/UD_Czech-PDTC |
news reviews nonfiction academic spoken social |
zeman@ufal.mff.cuni.cz |
213897 |
3432078 |
|
corpus metadata |
media data |
languages and translation(s) |
transcription and annotation levels available |
speaker metadata |
doc (and paragraphs) metadata |
modality metadata |
sent metadata |
varia |
|
|
|
|
|
|
|
|
| Danish DDT |
mixed |
1.1 |
https://github.com/UniversalDependencies/UD_Danish-DDT |
news fiction spoken nonfiction |
zeman@ufal.mff.cuni.cz |
5512 |
100733 |
|
corpus metadata |
media data |
languages and translation(s) |
transcription and annotation levels available |
speaker metadata |
doc (and paragraphs) metadata |
modality metadata |
sent metadata |
varia |
|
|
|
|
|
|
|
|
| Dargwa Mehweb |
only spoken |
2.1 |
https://github.com/UniversalDependencies/UD_Dargwa-Mehweb |
spoken |
sasha.kozhukhar@gmail.com, olesar@yandex.ru |
|
|
yes |
corpus metadata |
media data |
languages and translation(s) |
transcription and annotation levels available |
speaker metadata |
doc (and paragraphs) metadata |
modality metadata |
sent metadata |
varia |
|
|
|
|
|
|
|
|
| English CHILDES |
only spoken |
2.16 |
https://github.com/UniversalDependencies/UD_English-CHILDES |
spoken |
xy236@georgetown.edu |
48183 |
289817 |
yes |
corpus_name |
|
|
gold_annotation;childes_toks |
speaker_role;child_name;child_gender;child;child_age |
|
|
type;original_sent_id;s_24_sent_id |
|
|
|
|
|
|
|
|
|
| English ESLSpok |
only spoken |
2.12 |
https://github.com/UniversalDependencies/UD_English-ESLSpok |
spoken |
kkyle2@uoregon.edu |
2320 |
21312 |
yes |
———————————— |
———————————— |
———————————— |
———————————— |
———————————— |
———————————— |
———————————— |
———————————— |
|
|
|
|
|
|
|
|
|
| English GENTLE |
mixed |
2.12 |
https://github.com/UniversalDependencies/UD_English-GENTLE |
academic grammar-examples legal medical nonfiction poetry social spoken |
amir.zeldes@georgetown.edu |
1334 |
17619 |
|
|
meta::sourceURL |
|
meta::salientEntities;transition |
speaker |
newdoc id;newpar_block;meta::speaker_count;meta::summary3;meta::summary2;meta::summary1;meta::author;meta::summary4;meta::summary5;meta::dateCollected;meta::dateModified;meta::dateCreated;meta::title |
meta::genre;addressee |
s_prominence;s_type |
global.Entity |
|
|
|
|
|
|
|
|
| English GUM |
mixed |
2.2 |
https://github.com/UniversalDependencies/UD_English-GUM |
academic blog email fiction government legal news nonfiction social spoken web wiki |
amir.zeldes@georgetown.edu |
13263 |
229851 |
|
|
meta::sourceURL |
|
meta::salientEntities;transition |
speaker |
newdoc id;newpar_block;meta::speaker_count;meta::summary3;meta::summary2;meta::summary1;meta::author;meta::summary4;meta::summary5;meta::dateCollected;meta::dateModified;meta::dateCreated;meta::title |
meta::genre;addressee |
s_prominence;s_type |
global.Entity |
|
|
|
|
|
|
|
|
| French ParisStories |
only spoken |
2.9 |
https://github.com/UniversalDependencies/UD_French-ParisStories |
spoken |
gerdes@lisn.fr |
2776 |
42257 |
yes |
|
sound_url |
|
macrosyntax |
speaker |
|
|
|
|
|
|
|
|
|
|
|
|
| French Rhapsodie |
only spoken |
2.2 |
https://github.com/UniversalDependencies/UD_French-Rhapsodie |
spoken |
kim@gerdes.fr |
3209 |
43699 |
yes |
corpus metadata |
sound_url |
languages and translation(s) |
macrosyntax;prosodic_annotation |
speaker_id;speaker;speaker_education;speaker_family_social_role;speaker_fullname;speaker_age;speaker_role;speaker_sex;genre;event_structure;subgenre;involvement;modalities;interactivity;planning_type;subject;social_context |
task |
type;channel |
sent metadata |
old_id |
|
|
|
|
|
|
|
|
| Frisian_Dutch Fame |
only spoken |
2.8 |
https://github.com/UniversalDependencies/UD_Frisian_Dutch-Fame |
spoken |
a.r.y.braggaar@student.rug.nl |
400 |
3729 |
yes |
|
|
|
text_switch |
speaker |
newdoc id |
|
|
|
|
|
|
|
|
|
|
|
| Gheg GPS |
only spoken |
2.11 |
https://github.com/UniversalDependencies/UD_Gheg-GPS |
spoken |
christiangeorg.ebert@uzh.ch, barbara.sonnenhauser@uzh.ch, paul.widmer@uzh.ch |
966 |
15990 |
yes |
———————————— |
———————————— |
———————————— |
———————————— |
———————————— |
———————————— |
———————————— |
———————————— |
|
|
|
|
|
|
|
|
|
| Greek GDT |
mixed |
1.1 |
https://github.com/UniversalDependencies/UD_Greek-GDT |
news wiki spoken |
prokopis@ilsp.gr |
2521 |
61773 |
|
|
|
|
|
|
newdoc id |
|
|
|
|
|
|
|
|
|
|
|
| Greek Lesbian |
mixed |
2.16 |
https://github.com/UniversalDependencies/UD_Greek-Lesbian |
grammar-examples spoken fiction |
s.bompolas@athenarc.gr |
540 |
5733 |
|
corpus metadata |
media data |
languages and translation(s) |
transcription and annotation levels available |
speaker metadata |
doc (and paragraphs) metadata |
modality metadata |
sent metadata |
varia |
|
|
|
|
|
|
|
|
| Hausa NorthernAutogramm |
only spoken |
2.14 |
https://github.com/UniversalDependencies/UD_Hausa-NorthernAutogramm |
spoken |
bernard.l.caron@gmail.com |
423 |
4116 |
yes |
|
sound_url |
text_en |
phonetic_text |
speaker_id |
|
|
sent_timecode |
|
|
|
|
|
|
|
|
|
| Hausa SouthernAutogramm |
only spoken |
2.14 |
https://github.com/UniversalDependencies/UD_Hausa-SouthernAutogramm |
spoken |
bernard.l.caron@gmail.com |
1927 |
14401 |
yes |
|
sound_url |
text_en |
text_ortho |
speaker_id |
|
|
sent_timecode |
|
|
|
|
|
|
|
|
|
| Hausa WesternAutogramm |
mixed |
2.17 |
https://github.com/UniversalDependencies/UD_Hausa-WesternAutogramm |
fiction nonfiction spoken |
bernard.l.caron@gmail.com |
775 |
13862 |
|
corpus metadata |
media data |
languages and translation(s) |
transcription and annotation levels available |
speaker metadata |
doc (and paragraphs) metadata |
modality metadata |
sent metadata |
varia |
|
|
|
|
|
|
|
|
| Hebrew IAHLTknesset |
mixed |
2.15 |
https://github.com/UniversalDependencies/UD_Hebrew-IAHLTknesset |
government spoken |
amir.zeldes@georgetown.edu |
2883 |
50499 |
|
|
|
|
|
speaker;is_valid_speaker;is_chairman |
newdoc id;newpar id |
|
turnnumber |
|
|
|
|
|
|
|
|
|
| Highland_Puebla_Nahuatl ITML |
mixed |
2.13 |
https://github.com/UniversalDependencies/UD_Highland_Puebla_Nahuatl-ITML |
spoken grammar-examples nonfiction |
pughrob@iu.edu |
1260 |
10018 |
|
|
|
text[spa];text[orig] |
text[gloss] |
|
|
|
|
labels;text[a139];text[a21];text[a70];text[a86];textl[a140];text[a94];text[a27];text[a113];text[a77];text[a205];text[a52];text[a127];text[a89];text[a110];text[a125];text[a82];text[a44];text[a19];text[a118];text[a59];text[a57];text[a174];text[a151];text[a14];text[a211];text[a48];text[a88];text[a105];text[a135];text[a150];text[a229];text[a5];text[a54];text[a114];text[a25];text[a26];text[a16];text[a69];text[a134];text[a230];text[a212];text[a58];text[a76];text[a23];text[a49];text[a91];text[a7];text[a10];text[a102];text[a112];text[a29];text[a6];text[a43];text[a115];text[a50];text[a24];text[a108];text[a47];text[a227];text[a111];text[a170];text[a203];text[a136];text[a101];text[a131];text[a40];text[a81];text[a128];text[a87];text[a85];text[a158];text[a80];text[a95];text[a15];text[a22];text[a53];text[a74];text[a51];text[a116];text[a96];text[a18];text[a75];text[a119];text[a157];text[a9];text[a28];text[a17];text[a45];text[a90];text[a130];text[a210];text[a46];text[a31];text[a97];text[2];text[a84];text[a30];text[a104];text[a39];text[a99];text[a4];text[rus];text[a11];text[a232];text[a138] |
|
|
|
|
|
|
|
|
| Ika ChibErgIS |
only spoken |
2.16 |
https://github.com/UniversalDependencies/UD_Ika-ChibErgIS |
spoken |
jana.bajorat@hu-berlin.de |
628 |
5307 |
yes |
|
sound_url |
text_en |
morphemic_text;text_phrase-gls-es;text_phrase-gls-tl |
speaker_id |
|
|
sent_timecode |
tags |
|
|
|
|
|
|
|
|
| Italian KIParlaForest |
only spoken |
2.17 |
https://github.com/UniversalDependencies/UD_Italian-KIParlaForest |
spoken |
ellepannitto@gmail.com |
1007 |
9135 |
yes |
|
|
|
jefferson_text |
speaker_id |
conversation_id |
|
|
|
|
|
|
|
|
|
|
|
| Japanese JDD |
only spoken |
2.15 |
https://github.com/UniversalDependencies/UD_Japanese-JDD |
spoken |
masayu-a@ninjal.ac.jp |
|
|
yes |
corpus metadata |
media data |
languages and translation(s) |
transcription and annotation levels available |
speaker metadata |
doc (and paragraphs) metadata |
modality metadata |
sent metadata |
varia |
|
|
|
|
|
|
|
|
| Khoekhoe KDT |
mixed |
2.16 |
https://github.com/UniversalDependencies/UD_Khoekhoe-KDT |
fiction grammar-examples spoken |
kira.tulchynska@mail.huji.ac.il, witzlack@gmail.com |
3589 |
27611 |
|
corpus metadata |
media data |
languages and translation(s) |
transcription and annotation levels available |
speaker metadata |
doc (and paragraphs) metadata |
modality metadata |
sent metadata |
varia |
|
|
|
|
|
|
|
|
| Khunsari AHA |
mixed |
2.7 |
https://github.com/UniversalDependencies/UD_Khunsari-AHA |
grammar-examples spoken |
amojiry@gmail.com |
10 |
74 |
|
|
|
text_en;text_fa |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| Komi_Zyrian IKDP |
only spoken |
2.2 |
https://github.com/UniversalDependencies/UD_Komi_Zyrian-IKDP |
spoken |
nikotapiopartanen@gmail.com |
214 |
2304 |
yes |
corpus_version |
|
text_en;text_ru;text_end |
|
|
|
|
|
comment;label |
|
|
|
|
|
|
|
|
| Latvian LVTB |
mixed |
1.3 |
https://github.com/UniversalDependencies/UD_Latvian-LVTB |
news fiction legal spoken academic |
lauma@ailab.lv, normunds@ailab.lv |
19580 |
330318 |
|
corpus metadata |
media data |
languages and translation(s) |
transcription and annotation levels available |
speaker metadata |
doc (and paragraphs) metadata |
modality metadata |
sent metadata |
varia |
|
|
|
|
|
|
|
|
| Ligurian GLT |
mixed |
2.9 |
https://github.com/UniversalDependencies/UD_Ligurian-GLT |
nonfiction fiction news wiki bible spoken grammar-examples |
stefano.lusito@uibk.ac.at |
316 |
6568 |
|
corpus metadata |
media data |
languages and translation(s) |
transcription and annotation levels available |
speaker metadata |
doc (and paragraphs) metadata |
modality metadata |
sent metadata |
varia |
|
|
|
|
|
|
|
|
| Naija NSC |
only spoken |
2.2 |
https://github.com/UniversalDependencies/UD_Naija-NSC |
spoken |
kim@gerdes.fr |
9241 |
140837 |
yes |
|
sound_url |
text_en |
text_ortho |
speaker_id;speaker_education;speaker_age;speaker_sex;speaker_resindence;speaker_naija_competency;speaker_birthplace;speaker_primary_other_language |
|
|
|
|
|
|
|
|
|
|
|
|
| Nayini AHA |
mixed |
2.7 |
https://github.com/UniversalDependencies/UD_Nayini-AHA |
grammar-examples spoken |
amojiry@gmail.com |
10 |
78 |
|
|
|
text_en;text_fa |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| Nenets Tundra |
only spoken |
2.16 |
https://github.com/UniversalDependencies/UD_Nenets-Tundra |
spoken |
mus.nikolett@gmail.com |
170 |
1272 |
yes |
corpus metadata |
sound_url;media |
text_en;text_ru;language_variety |
text_p;translit;p_text |
speaker metadata |
doc_title_ |
modality metadata |
sent metadata |
varia |
|
|
|
|
|
|
|
|
| Nheengatu CompLin |
mixed |
2.11 |
https://github.com/UniversalDependencies/UD_Nheengatu-CompLin |
spoken bible fiction nonfiction grammar-examples |
leonel.de.alencar@ufc.br |
2742 |
25645 |
|
corpus metadata |
media data |
languages and translation(s) |
transcription and annotation levels available |
speaker metadata |
doc (and paragraphs) metadata |
modality metadata |
sent metadata |
varia |
|
|
|
|
|
|
|
|
| Northwest_Gbaya Autogramm |
only spoken |
2.15 |
https://github.com/UniversalDependencies/UD_Northwest_Gbaya-Autogramm |
spoken |
pauletteroulon@gmail.com |
403 |
2692 |
yes |
|
sound_url |
text_fr |
phonetic_text |
speaker_id |
|
|
sent_timecode |
|
|
|
|
|
|
|
|
|
| Norwegian NynorskLIA |
only spoken |
2.1 |
https://github.com/UniversalDependencies/UD_Norwegian-NynorskLIA |
spoken |
liljao@ifi.uio.no |
|
|
yes |
corpus metadata |
media data |
languages and translation(s) |
transcription and annotation levels available |
speaker metadata |
doc (and paragraphs) metadata |
modality metadata |
sent metadata |
varia |
|
|
|
|
|
|
|
|
| Persian Seraji |
mixed |
1.1 |
https://github.com/UniversalDependencies/UD_Persian-Seraji |
news fiction medical legal social spoken nonfiction |
mojgan.seraji96@gmail.com |
5997 |
151627 |
|
corpus metadata |
media data |
languages and translation(s) |
transcription and annotation levels available |
speaker metadata |
doc (and paragraphs) metadata |
modality metadata |
sent metadata |
varia |
|
|
|
|
|
|
|
|
| Pesh ChibErgIS |
only spoken |
2.15 |
https://github.com/UniversalDependencies/UD_Pesh-ChibErgIS |
spoken |
natalia.caceres.arandia@cnrs.fr |
524 |
4275 |
yes |
corpus metadata |
sound_url |
text_en |
morphemic_text;text_phrase-gls-es;text_phrase-gls-tl;text_phrase-gls-de;text_phrase-gls-pro;text_phrase-gls-wg;text_phrase-gls-it |
speaker_id |
doc (and paragraphs) metadata |
modality metadata |
sent_timecode |
tags |
|
|
|
|
|
|
|
|
| Polish LFG |
mixed |
2.2 |
https://github.com/UniversalDependencies/UD_Polish-LFG |
fiction nonfiction news spoken social |
aep@ipipan.waw.pl, adamp@ipipan.waw.pl |
17246 |
130967 |
|
corpus metadata |
media data |
languages and translation(s) |
transcription and annotation levels available |
speaker metadata |
doc (and paragraphs) metadata |
modality metadata |
sent metadata |
varia |
|
|
|
|
|
|
|
|
| Scottish_Gaelic ARCOSG |
mixed |
2.5 |
https://github.com/UniversalDependencies/UD_Scottish_Gaelic-ARCOSG |
nonfiction fiction news spoken |
colin.r.batchelor@googlemail.com |
4748 |
86139 |
|
|
|
|
|
speaker |
newdoc id |
|
|
comment;revision |
|
|
|
|
|
|
|
|
| Skolt_Sami Giellagas |
mixed |
2.5 |
https://github.com/UniversalDependencies/UD_Skolt_Sami-Giellagas |
nonfiction news spoken |
rueter.jack@gmail.com |
250 |
2961 |
|
corpus metadata |
media data |
languages and translation(s) |
transcription and annotation levels available |
speaker metadata |
doc (and paragraphs) metadata |
modality metadata |
sent metadata |
varia |
|
|
|
|
|
|
|
|
| Slovenian SST |
only spoken |
1.3 |
https://github.com/UniversalDependencies/UD_Slovenian-SST |
spoken |
kaja.dobrovoljc@ff.uni-lj.si |
6121 |
98393 |
yes |
|
sound_url |
|
|
speaker_id |
newdoc id |
|
|
|
|
|
|
|
|
|
|
|
| Soi AHA |
mixed |
2.7 |
https://github.com/UniversalDependencies/UD_Soi-AHA |
grammar-examples spoken |
amojiry@gmail.com |
8 |
55 |
|
|
|
text_en;text_fa |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| South_Levantine_Arabic MADAR |
mixed |
2.7 |
https://github.com/UniversalDependencies/UD_South_Levantine_Arabic-MADAR |
spoken social |
shorouqjzahra@gmail.com |
100 |
789 |
|
corpus metadata |
media data |
languages and translation(s) |
transcription and annotation levels available |
speaker metadata |
doc (and paragraphs) metadata |
modality metadata |
sent metadata |
varia |
|
|
|
|
|
|
|
|
| Spanish COSER |
only spoken |
2.14 |
https://github.com/UniversalDependencies/UD_Spanish-COSER |
spoken |
johnatan.bonillahuerfano@ugent.be |
539 |
7987 |
yes |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| Swedish_Sign_Language SSLC |
only spoken |
1.4 |
https://github.com/UniversalDependencies/UD_Swedish_Sign_Language-SSLC |
spoken |
robert@ling.su.se |
203 |
1610 |
yes |
corpus metadata |
media data |
languages and translation(s) |
transcription and annotation levels available |
speaker metadata |
doc (and paragraphs) metadata |
modality metadata |
sent metadata |
varia |
|
|
|
|
|
|
|
|
| Telugu_English TECT |
only spoken |
2.14 |
https://github.com/UniversalDependencies/UD_Telugu_English-TECT |
spoken |
anishka18v@gmail.com |
97 |
456 |
yes |
corpus metadata |
media data |
languages and translation(s) |
transcription and annotation levels available |
speaker metadata |
doc (and paragraphs) metadata |
modality metadata |
sent metadata |
varia |
|
|
|
|
|
|
|
|
| Turkish_English BUTR |
only spoken |
2.16 |
https://github.com/UniversalDependencies/UD_Turkish_English-BUTR |
spoken |
furkanakkurt7242@icloud.com |
51 |
393 |
yes |
corpus metadata |
media data |
languages and translation(s) |
transcription and annotation levels available |
speaker metadata |
doc (and paragraphs) metadata |
modality metadata |
sent metadata |
varia |
|
|
|
|
|
|
|
|
| Turkish_German SAGT |
only spoken |
2.7 |
https://github.com/UniversalDependencies/UD_Turkish_German-SAGT |
spoken |
ozlem@ims.uni-stuttgart.de |
2184 |
36934 |
yes |
——————- |
|
——————- |
|
——————- |
|
——————- |
|
——————- |
|
——————- |
|
——————- |
|
——————- |
|
——————- |
| Ukrainian ParlaMint |
mixed |
2.15 |
https://github.com/UniversalDependencies/UD_Ukrainian-ParlaMint |
government legal spoken |
corpus.textiv@gmail.com |
7142 |
109166 |
|
corpus metadata |
media data |
languages and translation(s) |
transcription and annotation levels available |
speaker metadata |
doc (and paragraphs) metadata |
modality metadata |
sent metadata |
varia |
|
|
|
|
|
|
|
|
| Vietnamese TueCL |
only spoken |
2.14 |
https://github.com/UniversalDependencies/UD_Vietnamese-TueCL |
spoken |
hoa.do@student.uni-tuebingen.de, cagri.coeltekin@uni-tuebingen.de |
100 |
1888 |
yes |
corpus metadata |
media data |
languages and translation(s) |
transcription and annotation levels available |
speaker metadata |
doc (and paragraphs) metadata |
modality metadata |
sent metadata |
varia |
|
|
|
|
|
|
|
|
| Western_Armenian ArmTDP |
mixed |
2.8 |
https://github.com/UniversalDependencies/UD_Western_Armenian-ArmTDP |
blog fiction news nonfiction reviews social spoken web wiki |
marat.yavrumyan@ysu.am |
6644 |
121432 |
|
corpus metadata |
media data |
languages and translation(s) |
transcription and annotation levels available |
speaker metadata |
doc (and paragraphs) metadata |
modality metadata |
sent metadata |
varia |
|
|
|
|
|
|
|
|
| Western_Sierra_Puebla_Nahuatl MesoTree |
mixed |
2.11 |
https://github.com/UniversalDependencies/UD_Western_Sierra_Puebla_Nahuatl-MesoTree |
spoken fiction grammar-examples nonfiction |
pughrob@iu.edu |
|
|
|
|
|
text[spa];text[orig] |
|
|
|
|
|
labels |
|
|
|
|
|
|
|
|
| Yiddish YiTB |
mixed |
2.17 |
https://github.com/UniversalDependencies/UD_Yiddish-YiTB |
grammar-examples learner-essays bible wiki fiction nonfiction spoken web |
m.kirkandrews@gmail.com |
3054 |
27488 |
|
corpus metadata |
media data |
languages and translation(s) |
transcription and annotation levels available |
speaker metadata |
doc (and paragraphs) metadata |
modality metadata |
sent metadata |
varia |
|
|
|
|
|
|
|
|
| Zazaki ZSD |
only spoken |
2.17 |
https://github.com/UniversalDependencies/UD_Zazaki-ZSD |
spoken |
luigi.talamo@uni-saarland.de |
|
|
yes |
corpus metadata |
media data |
languages and translation(s) |
transcription and annotation levels available |
speaker metadata |
doc (and paragraphs) metadata |
modality metadata |
sent metadata |
varia |
|
|
|
|
|
|
|
|