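CLEC contains over one million words of learner writing from five types of students: secondary school students, College English Band 4 and Band 6 students, and lower-year and upper-year English majors. The language errors in the corpus are annotated. The aim is to observe the English features and error patterns of each type of student, to describe Chinese learner English fairly precisely through quantitative and qualitative methods, and to provide useful feedback for English teaching in China.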
Table 1. Distribution of the CLEC corpus

| Type | Word tokens |
|---|---|
| ST2 | 208088 |
| ST3 | 209043 |
| ST4 | 212855 |
| ST5 | 214510 |
| ST6 | 226106 |
| Total | 1070602 |

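As a quick arithmetic check on Table 1, the sketch below sums the five sub-corpus counts and computes each type's share of the total. The counts are copied from the table; the names `CLEC_TOKENS` and `corpus_shares` are hypothetical, introduced here only for illustration.

```python
# Word-token counts per learner type, copied from Table 1.
CLEC_TOKENS = {
    "ST2": 208088,
    "ST3": 209043,
    "ST4": 212855,
    "ST5": 214510,
    "ST6": 226106,
}


def corpus_shares(counts):
    """Return (total, {type: percentage share of the total})."""
    total = sum(counts.values())
    shares = {name: 100.0 * n / total for name, n in counts.items()}
    return total, shares


total, shares = corpus_shares(CLEC_TOKENS)
print(total)  # 1070602, matching the total row of Table 1
for name, pct in shares.items():
    print(f"{name}: {pct:.1f}%")
```

The five sub-corpora are of similar size, so each contributes roughly one fifth of the corpus.
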
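5. Register and the source of an error are not annotated for the time being, because doing so requires considerable subjective judgment from the annotator and is even harder to standardize.
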
| Code | Word form | Code | Verb phrase | Code | Noun phrase | Code | Pronoun |
|---|---|---|---|---|---|---|---|
| fm1 | spelling | vp1 | pattern | np1 | pattern | pr1 | reference |
| fm2 | word building | vp2 | set phrase | np2 | set phrase | pr2 | anticipatory it |
| fm3 | capitalization | vp3 | agreement | np3 | agreement | pr3 | agreement |
| | | vp4 | finite/non-finite | np4 | case | pr4 | case |
| | | vp5 | non-finite | np5 | countability | pr5 | wh- |
| | | vp6 | tense | np6 | number | pr6 | indefinite |
| | | vp7 | voice | np7 | article | | |
| | | vp8 | mood | np8 | quantifiers | | |
| | | vp9 | modal/auxiliary | np9 | other determiners | | |

| Code | Adjective phrase | Code | Adverb | Code | Prepositional phrase | Code | Conjunction |
|---|---|---|---|---|---|---|---|
| aj1 | pattern | ad1 | order | pp1 | pattern | cj1 | pattern |
| aj2 | set phrase | ad2 | modification | pp2 | set phrase | cj2 | set phrase |
| aj3 | degree | ad3 | degree | | | | |
| aj4 | -ed/-ing confusion | | | | | | |
| aj5 | predicative/attributive | | | | | | |

| Code | Word | Code | Collocation | Code | Sentence |
|---|---|---|---|---|---|
| wd1 | order | cc1 | noun/noun | sn1 | run-on sentence |
| wd2 | part of speech | cc2 | noun/verb | sn2 | sentence fragment |
| wd3 | substitution | cc3 | verb/noun | sn3 | dangling modifier |
| wd4 | absence | cc4 | adj/noun | sn4 | illogical comparison |
| wd5 | redundancy | cc5 | verb/adv | sn5 | topic prominence |
| wd6 | repetition | cc6 | adv/adj | sn6 | coordination |
| wd7 | ambiguity | | | sn7 | subordination |
| | | | | sn8 | structural deficiency |
| | | | | sn9 | punctuation |
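
Each error code in the tables above pairs a two-letter category prefix with a sub-type digit, so a code can be resolved to its category mechanically. The following is a minimal sketch of such a lookup, assuming the codes have already been extracted from the corpus as plain strings such as `"vp3"`; how tags are embedded in the corpus text is not described here, and the names `CATEGORIES`, `MAX_SUBTYPE`, `category_of`, and `tally_by_category` are hypothetical.

```python
import re
from collections import Counter

# Two-letter prefixes and their categories, taken from the tag tables.
CATEGORIES = {
    "fm": "word form",
    "vp": "verb phrase",
    "np": "noun phrase",
    "pr": "pronoun",
    "aj": "adjective phrase",
    "ad": "adverb",
    "pp": "prepositional phrase",
    "cj": "conjunction",
    "wd": "word",
    "cc": "collocation",
    "sn": "sentence",
}

# Highest sub-type number listed for each prefix in the tag tables.
MAX_SUBTYPE = {
    "fm": 3, "vp": 9, "np": 9, "pr": 6,
    "aj": 5, "ad": 3, "pp": 2, "cj": 2,
    "wd": 7, "cc": 6, "sn": 9,
}

CODE_RE = re.compile(r"([a-z]{2})([1-9])")


def category_of(code):
    """Return the category name for a code like 'vp3', or None if unlisted."""
    match = CODE_RE.fullmatch(code)
    if match is None:
        return None
    prefix, number = match.group(1), int(match.group(2))
    if prefix not in CATEGORIES or number > MAX_SUBTYPE[prefix]:
        return None
    return CATEGORIES[prefix]


def tally_by_category(codes):
    """Count listed codes per category, ignoring codes not in the tables."""
    names = (category_of(code) for code in codes)
    return Counter(name for name in names if name is not None)


# Hypothetical list of codes, for illustration only (not corpus data).
sample = ["vp3", "vp6", "np7", "fm1", "zz9"]
print(tally_by_category(sample))
```

Validating the sub-type number against the tables means that a mistyped code such as `fm4` is rejected rather than silently counted under word form.
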