Notice on the Official Launch of the "English Corpora American English Corpus"

Posted by: Library, Zhang Xuelu | Date: 2023-03-24 | Views: 354

Access URL

https://www.english-corpora.org/

Access method

Login by IP address or by password; no limit on the number of concurrent users.

Access period

From today until 2023-12-31

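Database overview:

English Corpora is currently the most widely used collection of online corpora. The platform brings together several commonly used corpora, including: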

1) Corpus of Contemporary American English (COCA): about 450 million words in 190,000 texts, covering 1990-2012, divided into five genres: spoken, fiction, magazines, newspapers, and academic.

2) Corpus of Historical American English (COHA): about 400 million words in 115,000 texts, covering 1810-2009.

3) Corpus of Global Web-Based English (GloWbE): 1.9 billion words from more than 1.8 million web pages in over 20 English-speaking countries.

4) British National Corpus (BNC): 100 million words, covering 1980-1993. The BNC was originally created by Oxford University Press in 1980-1990. The English Corpora platform includes the complete BNC, tagged with the CLAWS 7 tagset.

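Database URL: https://www.english-corpora.org/. All of the corpora listed below can be searched on the platform, with detailed information available for each entry.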


Corpus (online access) | # words | Dialect | Time period | Genre(s)
News on the Web (NOW) | 14.2 billion+ | 20 countries | 2010-yesterday | Web: News
iWeb: The Intelligent Web-based Corpus | 14 billion | 6 countries | 2017 | Web
Global Web-Based English (GloWbE) | 1.9 billion | 20 countries | 2012-13 | Web (incl blogs)
Wikipedia Corpus | 1.9 billion | (Various) | 2014 | Wikipedia
Coronavirus Corpus | 1.3 billion+ | 20 countries | Jan 2020-yesterday | Web: News
Corpus of Contemporary American English (COCA) | 1.0 billion | American | 1990-2019 | Balanced
Corpus of Historical American English (COHA) | 475 million | American | 1820-2019 | Balanced
The TV Corpus | 325 million | 6 countries | 1950-2018 | TV shows
The Movie Corpus | 200 million | 6 countries | 1930-2018 | Movies
Corpus of American Soap Operas | 100 million | American | 2001-2012 | TV shows
Hansard Corpus | 1.6 billion | British | 1803-2005 | Parliament
Early English Books Online | 755 million | British | 1470s-1690s | (Various)
Corpus of US Supreme Court Opinions | 130 million | American | 1790s-present | Legal opinions
TIME Magazine Corpus | 100 million | American | 1923-2006 | Magazine
British National Corpus (BNC) * | 100 million | British | 1980s-1993 | Balanced
Strathy Corpus (Canada) | 50 million | Canadian | 1970s-2000s | Balanced
CORE Corpus | 50 million | 6 countries | 2014 | Web
American English | 155 billion | American | 1500s-2000s | (Various)
British English | 34 billion | British | 1500s-2000 | (Various)