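CLEC collects more than one million words of text from five types of learners: secondary-school students, College English Band 4 and Band 6 students, and lower-year and upper-year English majors. Errors in the texts are annotated. The aim is to observe the features of each group's English and the errors they make, to describe Chinese learner English fairly precisely by quantitative and qualitative methods, and to provide useful feedback for English teaching in China.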
Table 1 Distribution of CLEC texts

| Type | Tokens |
| --- | --- |
| ST2 | 208088 |
| ST3 | 209043 |
| ST4 | 212855 |
| ST5 | 214510 |
| ST6 | 226106 |
| Total | 1070602 |

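
As a quick check on Table 1, the sketch below recomputes the total and each sub-corpus's share of it. The counts are copied from the table; the variable names are illustrative only.

```python
# Token counts per sub-corpus, copied from Table 1.
tokens = {
    "ST2": 208088,
    "ST3": 209043,
    "ST4": 212855,
    "ST5": 214510,
    "ST6": 226106,
}

total = sum(tokens.values())

# Share of the whole corpus contributed by each sub-corpus, in percent.
shares = {name: 100 * count / total for name, count in tokens.items()}

for name, count in tokens.items():
    print(f"{name}: {count:>7} tokens ({shares[name]:.2f}%)")
print(f"Total: {total} tokens")
```

Each sub-corpus contributes roughly one fifth of the corpus, so the five learner groups are represented by samples of similar size.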
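5. Register and the causes of errors are not annotated for the time being, because doing so would demand considerable subjective judgment from annotators and would be even harder to apply consistently.
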
| Word form | | Verb phrase | | Noun phrase | | Pronoun | |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Code | Type | Code | Type | Code | Type | Code | Type |
| fm1 | spelling | vp1 | pattern | np1 | pattern | pr1 | reference |
| fm2 | word building | vp2 | set phrase | np2 | set phrase | pr2 | anticipatory it |
| fm3 | capitalization | vp3 | agreement | np3 | agreement | pr3 | agreement |
| | | vp4 | finite/non-finite | np4 | case | pr4 | case |
| | | vp5 | non-finite | np5 | countability | pr5 | wh- |
| | | vp6 | tense | np6 | number | pr6 | indefinite |
| | | vp7 | voice | np7 | article | | |
| | | vp8 | mood | np8 | quantifiers | | |
| | | vp9 | modal/auxiliary | np9 | other determiners | | |

| Adjective phrase | | Adverb | | Prepositional phrase | | Conjunction | |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Code | Type | Code | Type | Code | Type | Code | Type |
| aj1 | pattern | ad1 | order | pp1 | pattern | cj1 | pattern |
| aj2 | set phrase | ad2 | modification | pp2 | set phrase | cj2 | set phrase |
| aj3 | degree | ad3 | degree | | | | |
| aj4 | -ed/-ing confusion | | | | | | |
| aj5 | predicative/attributive | | | | | | |

| Word | | Collocation | | Sentence | |
| --- | --- | --- | --- | --- | --- |
| Code | Type | Code | Type | Code | Type |
| wd1 | order | cc1 | noun/noun | sn1 | run-on sentence |
| wd2 | part of speech | cc2 | noun/verb | sn2 | sentence fragment |
| wd3 | substitution | cc3 | verb/noun | sn3 | dangling modifier |
| wd4 | absence | cc4 | adj/noun | sn4 | illogical comparison |
| wd5 | redundancy | cc5 | verb/adv | sn5 | topic prominence |
| wd6 | repetition | cc6 | adv/adj | sn6 | coordination |
| wd7 | ambiguity | | | sn7 | subordination |
| | | | | sn8 | structural deficiency |
| | | | | sn9 | punctuation |

| Code | Category | Type | Description |
| --- | --- | --- | --- |
| fm1 | word | spelling | spelling, coinage, abbreviation, apostrophe |
| fm2 | word | word building | derivation, inflection, compounding, plurality (noun), irregularity (verb), 3rd person singular form (verb), syllabification, hyphenation, word division or fusion |
| fm3 | word | capitalization | lower-case initial letter for upper-case initial letter or vice versa |
| vp1 | vb phr | pattern (transitivity pattern) | error in transitivity (vi as vt or vice versa), transitive verb pattern/grammatical (cf. Oxford Advanced Learner's Dictionary of Current English, edited by A. S. Hornby) |
| vp2 | vb phr | set phrase | phrasal verb and verbal phrase: error in form or use |
| vp3 | vb phr | agreement (subject-verb agreement) | number agreement with its subject (noun or pronoun) |
| vp4 | vb phr | finite/non-finite | finite verb for non-finite verb or vice versa |
| vp5 | vb phr | non-finite | infinitive error: form and use/infinitive for participle or vice versa/-ed participle for -ing participle or vice versa |
| vp6 | vb phr | tense | error in tense use within a sentence/the sequence of tenses between sentences |
| vp7 | vb phr | voice | error in the use of voice: active for passive or vice versa |
| vp8 | vb phr | mood | error in the use of mood: imperative, subjunctive/improper structure of conditional sentences |
| vp9 | vb phr | modal/auxiliary | misuse of modal/auxiliary verbs/wrong form of modal verb (or auxiliary verb) and verb combination (e.g. tense form, voice form, etc.) |
| np1 | nn phr | pattern (noun pattern) | error in combination with other words/grammatical |
| np2 | nn phr | set phrase | omission or replacement of a fixed element that goes after a certain noun |
| np3 | nn phr | agreement | number agreement of a noun with its determiner or a word that refers to it |
| np4 | nn phr | case | possessive case error: form or use |
| np5 | nn phr | countability | uncountable noun used as countable noun |
| np6 | nn phr | number | countable noun used with no determiner or -s/a or -s with plural noun |
| np7 | nn phr | article | a/an confusion or definite/indefinite confusion |
| np8 | nn phr | quantifiers | misuse or confusion between many/much, (a) few/(a) little, some/any, etc. |
| np9 | nn phr | other determiners | misuse or confusion of demonstratives, wh- determiners, numerals, etc. |
| pr1 | pron | reference | incorrect/ambiguous pronoun reference/anaphoric |
| pr2 | pron | anticipatory it | improper or wrong use of anticipatory it/it replaced by a demonstrative, etc. |
| pr3 | pron | agreement | number agreement with a noun it refers to |
| pr4 | pron | case | case error of any personal pronoun |
| pr5 | pron | wh- (wh- pronouns) | misuse or confusion of interrogative, relative and conjunctive pronouns |
| pr6 | pron | indefinite (indefinite pronouns) | misuse or confusion of indefinite pronouns such as all/both, few/little, some/any, either/neither, etc. |
| aj1 | adj | pattern (adjective pattern) | error in the combination with other words/grammatical |
| aj2 | adj | set phrase | error in the idiomatic use of an adjectival phrase/omission or replacement of a fixed element that goes after a certain adjective |
| aj3 | adj | degree | adjective degree error: form and use |
| aj4 | adj | -ed/-ing confusion | -ed adjective for -ing adjective or vice versa |
| aj5 | adj | predicative/attributive | predicative adjective used as attributive adjective |
| ad1 | adv | order (word order) | improper adverb placement/wrong position |
| ad2 | adv | modification | adjective modifier used as verb modifier/other kinds of confusion |
| ad3 | adv | degree | adverb degree error: form and use |
| pp1 | prep | pattern (preposition pattern) | unacceptable combination with other words/grammatical |
| pp2 | prep | set phrase | error in the formation or use of an idiomatic prepositional phrase |
| cj1 | conj | pattern (conjunction pattern) | unacceptable combination with other words/grammatical |
| cj2 | conj | set phrase | error in the formation or use of a phrase functioning as a conjunction |
| wd1 | word | order (word order) | misplacement of any word other than an adverb |
| wd2 | word | part of speech | error in part of speech: right root but wrong word class |
| wd3 | word | substitution | error in word choice: right word class but wrong selection (any part of speech) |
| wd4 | word | absence | omission of a word (any part of speech) |
| wd5 | word | redundancy | oversupply of a word (any part of speech) |
| wd6 | word | repetition | unnecessary repetition of a word |
| wd7 | word | ambiguity | unclear word meaning/semantic |
| cc1 | notional | n/n collocation | improper noun (phrase) and noun (phrase) combination/semantic |
| cc2 | notional | n/v collocation | improper noun (phrase) and verb (phrase) combination/semantic |
| cc3 | notional | v/n collocation | improper verb and noun (phrase) combination/semantic |
| cc4 | notional | a/n collocation | improper adjective and noun (phrase) combination/semantic |
| cc5 | notional | v/ad collocation | improper verb and adverb (or ad/v) combination/semantic |
| cc6 | notional | ad/a collocation | improper adverb and adjective combination/semantic |
| sn1 | sentence | run-on sentence | improper addition of clauses/fused sentence |
| sn2 | sentence | sentence fragment | subordinate clause as a sentence/any phrase as a sentence |
| sn3 | sentence | dangling modifier | illogical adverbial modification of a clause |
| sn4 | sentence | illogical comparison | error in the comparison of words or phrases in a sentence which cannot be compared |
| sn5 | sentence | topic prominence | the co-occurrence of an initial noun phrase and its equivalent (usually a pronoun) in the same sentence |
| sn6 | sentence | coordination | faulty parallelism of clauses (or words/phrases) in a sentence |
| sn7 | sentence | subordination | faulty attachment of a subordinate clause to the main clause |
| sn8 | sentence | structural deficiency | error in the grammatical construction of a sentence: improper splitting, pattern shifting, confusing structure, etc. |
| sn9 | sentence | punctuation | overuse, absence, choice, apostrophe, comma splice, etc. |

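
Every code in the scheme consists of a two-letter category prefix followed by a subtype number. The sketch below shows one way such a code could be split and validated against the tables above. The function name `parse_code` and the two dictionaries are hypothetical helpers written for this illustration; they are not part of any CLEC tooling, and the format of the tags inside the corpus files themselves is not assumed here.

```python
import re

# Two-letter prefixes and the groups they stand for, as listed in the tag tables.
CATEGORIES = {
    "fm": "word form",
    "vp": "verb phrase",
    "np": "noun phrase",
    "pr": "pronoun",
    "aj": "adjective phrase",
    "ad": "adverb",
    "pp": "prepositional phrase",
    "cj": "conjunction",
    "wd": "word",
    "cc": "collocation",
    "sn": "sentence",
}

# Number of subtypes defined under each prefix in the tables.
SUBTYPE_COUNTS = {
    "fm": 3, "vp": 9, "np": 9, "pr": 6, "aj": 5, "ad": 3,
    "pp": 2, "cj": 2, "wd": 7, "cc": 6, "sn": 9,
}

CODE_RE = re.compile(r"([a-z]{2})([1-9])")


def parse_code(code: str) -> tuple[str, int]:
    """Split a code such as 'vp3' into (group name, subtype number)."""
    match = CODE_RE.fullmatch(code.strip().lower())
    if match is None:
        raise ValueError(f"not a tag code: {code!r}")
    prefix, number = match.group(1), int(match.group(2))
    if prefix not in CATEGORIES:
        raise ValueError(f"unknown category prefix: {prefix!r}")
    if number > SUBTYPE_COUNTS[prefix]:
        raise ValueError(f"{prefix} has only {SUBTYPE_COUNTS[prefix]} subtypes")
    return CATEGORIES[prefix], number


print(parse_code("vp3"))              # ('verb phrase', 3)
print(sum(SUBTYPE_COUNTS.values()))   # 61 error types in total
```

Lower-casing the input means that variants such as `Wd1` or `Sn9` resolve to the same entries as `wd1` and `sn9`.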
| Error type | st2 | st3 | st4 | st5 | st6 | Total | Percentage (%) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| fm1 | 1928.8 | 2877.4 | 2112.6 | 1826.7 | 1686.7 | 10432.2 | 17.47 |
| fm2 | 349.3 | 448.9 | 438.9 | 226.9 | 328.7 | 1792.7 | 3 |
| fm3 | 1474.4 | 731.8 | 405.8 | 694.1 | 174.6 | 3480.7 | 5.83 |
| vp1 | 259.4 | 325.9 | 498.4 | 103.4 | 200.8 | 1387.9 | 2.32 |
| vp2 | 179 | 139.3 | 61.2 | 104.2 | 22.1 | 505.8 | 0.85 |
| vp3 | 374 | 524.6 | 785.2 | 273.1 | 327 | 2283.9 | 3.82 |
| vp4 | 140.8 | 159.1 | 110.8 | 63.9 | 51.6 | 526.2 | 0.88 |
| vp5 | 140 | 118.7 | 107.4 | 89.9 | 46.7 | 502.7 | 0.84 |
| vp6 | 1165.7 | 356 | 311.6 | 379.8 | 215.6 | 2428.7 | 4.07 |
| vp7 | 172.7 | 104.1 | 98.4 | 63.9 | 46.7 | 485.8 | 0.81 |
| vp8 | 27.1 | 16.3 | 8.3 | 25.2 | 11.5 | 88.4 | 0.15 |
| vp9 | 111.4 | 274.3 | 278.5 | 42.9 | 86.1 | 793.2 | 1.33 |
| np1 | 46.9 | 33.5 | 28.9 | 16.8 | 10.7 | 136.8 | 0.23 |
| np2 | 24.7 | 22.4 | 17.4 | 19.3 | 2.5 | 86.3 | 0.14 |
| np3 | 202.1 | 247.7 | 249.6 | 210.9 | 186 | 1096.3 | 1.84 |
| np4 | 66.8 | 55.9 | 26.4 | 22.7 | 21.3 | 193.1 | 0.32 |
| np5 | 58.9 | 98 | 71.9 | 60.5 | 84.4 | 373.7 | 0.63 |
| np6 | 374 | 654.4 | 481 | 358.8 | 354.1 | 2222.3 | 3.72 |
| np7 | 237.9 | 107.5 | 89.3 | 174.8 | 54.9 | 664.4 | 1.11 |
| np8 | 35 | 65.4 | 47.9 | 13.4 | 7.4 | 169.1 | 0.28 |
| np9 | 6.4 | 41.3 | 12.4 | 7.6 | 5.7 | 73.4 | 0.12 |
| pr1 | 82 | 236.5 | 205 | 89.9 | 18.9 | 632.3 | 1.06 |
| pr2 | 16.7 | 78.3 | 23.1 | 4.2 | 0 | 122.3 | 0.2 |
| pr3 | 52.5 | 54.2 | 172.7 | 28.6 | 60.6 | 368.6 | 0.62 |
| pr4 | 74.8 | 37 | 20.7 | 48.7 | 10.7 | 191.9 | 0.32 |
| pr5 | 26.3 | 53.3 | 14.1 | 7.6 | 10.7 | 112 | 0.19 |
| pr6 | 9.5 | 2.6 | 5 | 3.4 | 0 | 20.5 | 0.03 |
| aj1 | 6.4 | 18.9 | 15.7 | 5 | 9 | 55 | 0.09 |
| aj2 | 9.5 | 3.4 | 9.9 | 5.9 | 7.4 | 36.1 | 0.06 |
| aj3 | 38.2 | 39.6 | 32.2 | 43.7 | 97.5 | 251.2 | 0.42 |
| aj4 | 16.7 | 2.6 | 22.3 | 12.6 | 5.7 | 59.9 | 0.1 |
| aj5 | 0.8 | 3.4 | 7.4 | 1.7 | 0 | 13.3 | 0.02 |
| ad1 | 35.8 | 96.3 | 39.7 | 27.7 | 15.6 | 215.1 | 0.36 |
| ad2 | 42.2 | 37.8 | 12.4 | 9.2 | 4.9 | 106.5 | 0.18 |
| ad3 | 7.2 | 12 | 9.9 | 1.7 | 2.5 | 33.3 | 0.06 |
| pp1 | 136.1 | 98 | 43 | 169.7 | 28.7 | 475.5 | 0.8 |
| pp2 | 25.5 | 262.3 | 143.8 | 37 | 27.9 | 496.5 | 0.83 |
| cj1 | 27.8 | 20.6 | 18.2 | 21.8 | 12.3 | 100.7 | 0.17 |
| cj2 | 4 | 7.7 | 13.2 | 5.9 | 4.9 | 35.7 | 0.06 |
| wd1 | 43.8 | 151.3 | 114.1 | 25.2 | 37.7 | 372.1 | 0.62 |
| wd2 | 324.6 | 929.6 | 772.8 | 226.9 | 242.6 | 2496.5 | 4.18 |
| wd3 | 1102 | 1634.7 | 1815 | 757.1 | 359.8 | 5668.6 | 9.49 |
| wd4 | 585.6 | 829.8 | 443.8 | 403.3 | 427 | 2689.5 | 4.5 |
| wd5 | 410.6 | 613.1 | 518.2 | 265.5 | 171.3 | 1978.7 | 3.31 |
| wd6 | 27.1 | 37 | 22.3 | 34.5 | 29.5 | 150.4 | 0.25 |
| wd7 | 261.8 | 430.8 | 261.2 | 228.6 | 209.8 | 1392.2 | 2.33 |
| cc1 | 72.4 | 65.4 | 76 | 23.5 | 36.1 | 273.4 | 0.46 |
| cc2 | 35 | 177.1 | 49.6 | 6.7 | 21.3 | 289.7 | 0.49 |
| cc3 | 168.7 | 514.2 | 417.4 | 75.6 | 112.3 | 1288.2 | 2.16 |
| cc4 | 64.5 | 94.6 | 134.7 | 42 | 39.3 | 375.1 | 0.63 |
| cc5 | 23.9 | 40.4 | 29.8 | 5 | 4.1 | 103.2 | 0.17 |
| cc6 | 17.5 | 12 | 6.6 | 2.5 | 1.6 | 40.2 | 0.07 |
| sn1 | 419.3 | 596.8 | 576.9 | 118.5 | 42.6 | 1754.1 | 2.94 |
| sn2 | 424.9 | 389.6 | 303.3 | 132.8 | 76.2 | 1326.8 | 2.22 |
| sn3 | 10.3 | 20.6 | 17.4 | 2.5 | 10.7 | 61.5 | 0.1 |
| sn4 | 17.5 | 24.9 | 6.6 | 20.2 | 4.9 | 74.1 | 0.12 |
| sn5 | 9.5 | 14.6 | 17.4 | 2.5 | 4.9 | 48.9 | 0.08 |
| sn6 | 84.3 | 41.3 | 39.7 | 41.2 | 1.6 | 208.1 | 0.35 |
| sn7 | 49.3 | 55.9 | 63.6 | 23.5 | 3.3 | 195.6 | 0.33 |
| sn8 | 1103.6 | 446.3 | 862.1 | 493.2 | 231.9 | 3137.1 | 5.25 |
| sn9 | 861.7 | 573.6 | 337.2 | 649.5 | 322.9 | 2744.9 | 4.6 |
| Total | 14105.2 | 16160.6 | 13935.9 | 8883.4 | 6633.8 | 59718.9 | 100 |

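
The percentage column can be reproduced from the row totals: each value is the row total divided by the overall total of 59718.9, expressed in percent. The sketch below checks this for a handful of rows copied from the table; the helper name `percentage` is chosen for this illustration.

```python
GRAND_TOTAL = 59718.9  # overall total from the last row of the table

# A few rows copied from the table: per-group values, row total, printed percentage.
rows = {
    "fm1": ([1928.8, 2877.4, 2112.6, 1826.7, 1686.7], 10432.2, 17.47),
    "wd3": ([1102, 1634.7, 1815, 757.1, 359.8], 5668.6, 9.49),
    "sn8": ([1103.6, 446.3, 862.1, 493.2, 231.9], 3137.1, 5.25),
    "vp8": ([27.1, 16.3, 8.3, 25.2, 11.5], 88.4, 0.15),
}


def percentage(row_total: float, grand_total: float = GRAND_TOTAL) -> float:
    """Share of all annotated errors accounted for by one error type, in percent."""
    return 100 * row_total / grand_total


for code, (values, row_total, printed) in rows.items():
    # The row total should equal the sum over the five learner groups.
    assert abs(sum(values) - row_total) < 0.05
    print(f"{code}: computed {percentage(row_total):.2f}%, table gives {printed}%")
```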
| Category | st2 | st3 | st4 | st5 | st6 | Total | Percentage | Cumulative percentage |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Word form | 3752.5 | 4058.1 | 2957.3 | 2747.7 | 2190 | 15705.6 | 26.299 | 26.299 |
| Lexis | 2755.5 | 4626.3 | 3947.4 | 1941.1 | 1477.7 | 14748 | 24.696 | 50.995 |
| Syntax | 2980.4 | 2163.6 | 2224.2 | 1483.9 | 699 | 9551.1 | 15.993 | 66.988 |
| Verb | 2570.1 | 2018.3 | 2259.8 | 1146.3 | 1008.1 | 9002.6 | 15.075 | 82.063 |
| Noun | 1052.7 | 1326.1 | 1024.8 | 884.8 | 727 | 5015.4 | 8.398 | 90.461 |
| Collocation | 382 | 903.7 | 714.1 | 155.3 | 214.7 | 2369.8 | 3.968 | 94.429 |
| Pronoun | 261.8 | 461.9 | 440.6 | 182.4 | 100.9 | 1447.6 | 2.424 | 96.853 |
| Preposition | 161.6 | 360.3 | 186.8 | 206.7 | 56.6 | 972 | 1.628 | 98.481 |
| Adjective | 71.6 | 67.9 | 87.5 | 68.9 | 119.6 | 415.5 | 0.696 | 99.177 |
| Adverb | 85.2 | 146.1 | 62 | 38.6 | 23 | 354.9 | 0.594 | 99.771 |
| Conjunction | 31.8 | 28.3 | 31.4 | 27.7 | 17.2 | 136.4 | 0.228 | 99.999 |
| Total | 14105.2 | 16160.6 | 13935.9 | 8883.4 | 6633.8 | 59718.9 | 99.999 | |
| Proportion of total | 0.24 | 0.27 | 0.23 | 0.15 | 0.11 | | | |

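
The cumulative column is a running sum of the percentage column, with categories ordered from most to least frequent. The sketch below recomputes both columns from the category totals in the table. The English category labels are translations used for this illustration.

```python
from itertools import accumulate

# Category totals copied from the table, in the table's (descending) order.
category_totals = [
    ("word form", 15705.6),
    ("lexis", 14748.0),
    ("syntax", 9551.1),
    ("verb", 9002.6),
    ("noun", 5015.4),
    ("collocation", 2369.8),
    ("pronoun", 1447.6),
    ("preposition", 972.0),
    ("adjective", 415.5),
    ("adverb", 354.9),
    ("conjunction", 136.4),
]

grand_total = sum(total for _, total in category_totals)

# Percentage of all errors per category, and the running (cumulative) percentage.
percentages = [100 * total / grand_total for _, total in category_totals]
cumulative = list(accumulate(percentages))

for (name, _), pct, cum in zip(category_totals, percentages, cumulative):
    print(f"{name:<12} {pct:7.3f} {cum:8.3f}")
```

Computed without intermediate rounding, the last cumulative value is 100; the table's 99.999 is consistent with summing percentages that had already been rounded to three decimals. The first three categories (word form, lexis, syntax) together account for about two thirds of all annotated errors.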
| Type | st2 | st3 | st4 | st5 | st6 | Total | Percentage |
| --- | --- | --- | --- | --- | --- | --- | --- |
| fm1 | 1928.8 | 2877.4 | 2112.6 | 1826.7 | 1686.7 | 10432.2 | 17.47 |
| wd3 | 1102 | 1634.7 | 1815 | 757.1 | 359.8 | 5668.6 | 9.49 |
| fm3 | 1474.4 | 731.8 | 405.8 | 694.1 | 174.6 | 3480.7 | 5.83 |
| sn8 | 1103.6 | 446.3 | 862.1 | 493.2 | 231.9 | 3137.1 | 5.25 |
| sn9 | 861.7 | 573.6 | 337.2 | 649.5 | 322.9 | 2744.9 | 4.6 |
| wd4 | 585.6 | 829.8 | 443.8 | 403.3 | 427 | 2689.5 | 4.5 |
| wd2 | 324.6 | 929.6 | 772.8 | 226.9 | 242.6 | 2496.5 | 4.18 |
| vp6 | 1165.7 | 356 | 311.6 | 379.8 | 215.6 | 2428.7 | 4.07 |
| vp3 | 374 | 524.6 | 785.2 | 273.1 | 327 | 2283.9 | 3.82 |
| np6 | 374 | 654.4 | 481 | 358.8 | 354.1 | 2222.3 | 3.72 |
| wd5 | 410.6 | 613.1 | | | | | |