



LDC Top Ten Corpora

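These ten LDC corpora are the most popular, ranked by number distributed. Each entry gives the number distributed first, followed by the LDC catalog number and the corpus title.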

Distributed  Catalog No.  Title
1051         LDC93S1      TIMIT Acoustic-Phonetic Continuous Speech Corpus
917          LDC2006T13   Web 1T 5-gram Version 1
758          LDC96L14     CELEX2
456          LDC93S10     TIDIGITS
443          LDC94T5      ECI Multilingual Text
383          LDC99T42     Treebank-3
350          LDC93T3A     TIPSTER Complete
325          LDC93S2      NTIMIT
280          LDC94S16     YOHO Speaker Verification
279          LDC2001T02   Message Understanding Conference (MUC) 7


Contact: ldc@ldc.upenn.edu

(c) 1992-2010 Linguistic Data Consortium, University of Pennsylvania. All Rights Reserved.