[Main Page | Bookstore | Search | Links Page | Submit a Site | Contact | Site Map ]

Linguistics - Annotated DB, Lexical DB, Tagging, Corpora

When I was in grad school, I became very interested in the use of corpora for various purposes in language learning. At the time, I was collecting links to a corpus here, a corpus there, corpora corpora everywhere! Sadly, many of the links I found in 1996 are gone, but there are still some good references and databases online. Here are some that I've found or that people have submitted. If you have a good corpus link, or a study related to annotated databases and tagging, please feel free to submit it using the "Submit a Site" link above or drop me an email using the "Contact" link.

6 Links in the category "Linguistics - Annotated DB, Lexical DB, Tagging, Corpora"

Dictiome pronunciation database
Submitted on 2015-08-04 by Anthony Bassett
Dictiome is a fast-growing community-built bank of English pronunciations.

Davies/BYU Corpus of American English
Submitted on 2008-03-04 by Webmaster
The BYU Corpus of American English is the first large corpus of American English, and it is freely available online. It contains more than 360 million words of text, including 20 million words each year from 1990-2007, and it is equally divided among spoken, fiction, popular magazines, newspapers, and academic texts. [Description taken from site]

Bookmarks for Corpus-based Linguists
Submitted on 2001-12-28 by David Lee
A web site with annotated links for corpus-based linguists (comprehensive for the English language, with links to corpora of other languages). Key features: (1) up-to-date; (2) focuses on links for linguists and language teachers (not NLP/language engineering); (3) listings are mostly annotated; (4) brings together in one place info on corpora, software tools, bibliographies, references, electronic papers, mailing lists, on-line courses, conferences, etc. that people doing corpus work will need.

Fred: the SGML Grammar Builder
Submitted on 2001-03-02 by Webmaster

Linguistic Data Consortium
Submitted on 2001-03-02 by Webmaster
Another site with corpora in different languages.

Electronic Text Center
Submitted on 2001-03-02 by Webmaster
Electronic Text Center at the University of Virginia

Copyright 1995 - 2011, Kristina Harris