LDC Top Ten Corpora

These 10 LDC corpora are the most popular. Each entry begins with the number of copies distributed, followed by the catalog ID and the corpus title.

1093  LDC93S1     TIMIT Acoustic-Phonetic Continuous Speech Corpus
 962  LDC2006T13  Web 1T 5-gram Version 1
 777  LDC96L14    CELEX2
 471  LDC93S10    TIDIGITS
 457  LDC94T5     ECI Multilingual Text
 407  LDC99T42    Treebank-3
 359  LDC93T3A    TIPSTER Complete
 335  LDC93S2     NTIMIT
 290  LDC2001T02  Message Understanding Conference (MUC) 7
 289  LDC94S16    YOHO Speaker Verification

Contact: ldc@ldc.upenn.edu

(c) 1992-2010 Linguistic Data Consortium, University of Pennsylvania. All Rights Reserved.