I generated some statistics on the frequency of syllables in the Italian language, for use in cognitive psychology.
Here's an example of the calculations my scripts and programmes performed.
I started from a list of words with their frequency in Italian.
Starting from this example file (the actual one was much longer):
I hyphenated the words (separating the syllables with ~) and got a result file like this:
Then the scripts generated some statistical tables on syllable frequency. Below you'll find some examples.
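Here's the legend of the table headers: “% tot” is the syllable's share of the total frequency of all syllables; “total frequency” is the summed frequency of all its occurrences; “whole word” is its frequency as a word on its own; “1” to “4” are its frequencies as the first to fourth syllable of a word; “last” is its frequency as the final syllable of a word.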
Syllable | % tot | total frequency | whole word | 1 | 2 | 3 | 4 | last |
---|---|---|---|---|---|---|---|---|
a | 5.1365 | 31321 | 0 | 31321 | 0 | 0 | 0 | 0 |
ab | 22.5631 | 137584 | 0 | 137584 | 0 | 0 | 0 | 0 |
ba | 11.3396 | 69146 | 0 | 0 | 69146 | 0 | 0 | 0 |
bac | 15.1928 | 92642 | 0 | 0 | 92642 | 0 | 0 | 0 |
bat | 1.1672 | 7117 | 0 | 0 | 7117 | 0 | 0 | 0 |
chi | 3.4924 | 21296 | 0 | 0 | 0 | 21296 | 0 | 21296 |
chia | 6.1565 | 37541 | 0 | 0 | 0 | 37541 | 0 | 0 |
chiar | 3.7634 | 22948 | 0 | 0 | 0 | 22948 | 0 | 0 |
chio | 2.3874 | 14558 | 0 | 0 | 0 | 14558 | 0 | 14558 |
ci | 6.7846 | 41371 | 0 | 0 | 0 | 41371 | 0 | 0 |
co | 1.8577 | 11328 | 0 | 0 | 0 | 11328 | 0 | 11328 |
jour | 1.1672 | 7117 | 0 | 0 | 0 | 7117 | 0 | 7117 |
na | 3.7520 | 22879 | 0 | 0 | 0 | 0 | 22879 | 22879 |
nai | 3.0326 | 18492 | 0 | 0 | 0 | 0 | 18492 | 18492 |
re | 1.0520 | 6415 | 0 | 0 | 0 | 0 | 6415 | 6415 |
si | 3.7634 | 22948 | 0 | 0 | 0 | 0 | 22948 | 22948 |
te | 0.3755 | 2290 | 1200 | 0 | 0 | 1090 | 0 | 1090 |
ti | 1.9115 | 11656 | 0 | 0 | 0 | 11656 | 0 | 11656 |
to | 5.1045 | 31126 | 0 | 0 | 0 | 0 | 31126 | 31126 |
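
A minimal sketch of how a table like the one above could be computed. It is not the original script: the input format (pairs of hyphenated word and frequency), the function name `syllable_position_table` and the toy words are assumptions made for illustration. Following the “te” row, a one-syllable word is counted under “whole word” only, not under “1” or “last”.

```python
from collections import defaultdict

def syllable_position_table(entries, max_position=4):
    """Sum word frequencies per syllable and per syllable position.

    entries: iterable of (hyphenated_word, frequency) pairs, with the
    syllables separated by "~" (assumed input format).
    """
    columns = (["total", "whole_word"]
               + [str(i) for i in range(1, max_position + 1)]
               + ["last"])
    table = defaultdict(lambda: dict.fromkeys(columns, 0))
    for word, freq in entries:
        syllables = word.split("~")
        if len(syllables) == 1:
            # Monosyllabic word: counted as a whole word only.
            row = table[syllables[0]]
            row["total"] += freq
            row["whole_word"] += freq
            continue
        for position, syllable in enumerate(syllables, start=1):
            row = table[syllable]
            row["total"] += freq
            if position <= max_position:
                row[str(position)] += freq
            if position == len(syllables):
                row["last"] += freq
    # Percentage of each syllable over the sum of all syllable frequencies.
    grand_total = sum(row["total"] for row in table.values())
    for row in table.values():
        row["percent"] = round(100 * row["total"] / grand_total, 4) if grand_total else 0.0
    return dict(table)

# Made-up toy input, not the real word list.
toy = [("a~ba~co", 10), ("te", 3), ("par~te", 2)]
table = syllable_position_table(toy)
print(table["te"])
```

With this toy input, “te” gets 3 under “whole word” and 2 under both “2” and “last”, for a total of 5.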
Here the syllables are grouped by “skeleton”: c stands for a consonant, v for a vowel and S for a foreign letter (for instance “w” and “y”).
Skeleton | % tot | total frequency | whole word | 1 | 2 | 3 | 4 | last |
---|---|---|---|---|---|---|---|---|
Svvc | 1.1672 | 7117 | 0 | 0 | 0 | 7117 | 0 | 7117 |
ccv | 3.4924 | 21296 | 0 | 0 | 0 | 21296 | 0 | 21296 |
ccvv | 8.5440 | 52099 | 0 | 0 | 0 | 52099 | 0 | 14558 |
ccvvc | 3.7634 | 22948 | 0 | 0 | 0 | 22948 | 0 | 0 |
cv | 35.9410 | 219159 | 1200 | 0 | 69146 | 65445 | 83368 | 107442 |
cvc | 16.3600 | 99759 | 0 | 0 | 99759 | 0 | 0 | 0 |
cvv | 3.0326 | 18492 | 0 | 0 | 0 | 0 | 18492 | 18492 |
v | 5.1365 | 31321 | 0 | 31321 | 0 | 0 | 0 | 0 |
vc | 22.5631 | 137584 | 0 | 137584 | 0 | 0 | 0 | 0 |
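
The skeleton mapping can be sketched as follows. The function name `skeleton` and both letter sets are assumptions: the text only names “w” and “y” as foreign, and the table shows “j” treated the same way (“jour” gives Svvc); “k” and “x” are added here because they are also outside the traditional 21-letter Italian alphabet. Accented vowels are treated as vowels.

```python
# Assumed letter classes; adjust them to the conventions of the real data.
VOWELS = set("aeiouàèéìíòóù")
FOREIGN = set("jkwxy")

def skeleton(syllable):
    """Map each letter to c (consonant), v (vowel) or S (foreign letter).

    Assumes the syllable contains letters only; anything that is neither
    a vowel nor a foreign letter is counted as a consonant.
    """
    out = []
    for ch in syllable.lower():
        if ch in FOREIGN:
            out.append("S")
        elif ch in VOWELS:
            out.append("v")
        else:
            out.append("c")
    return "".join(out)

for s in ["a", "ab", "bac", "chia", "chiar", "jour", "nai"]:
    print(s, skeleton(s))
```

For the syllables listed, this reproduces the skeletons in the table: v, vc, cvc, ccvv, ccvvc, Svvc and cvv.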
Here the syllables are grouped by number of letters:
N of letters | % tot | total frequency | whole word | 1 | 2 | 3 | 4 | last |
---|---|---|---|---|---|---|---|---|
1_let | 5.1365 | 31321 | 0 | 31321 | 0 | 0 | 0 | 0 |
2_let | 58.5040 | 356743 | 1200 | 137584 | 69146 | 65445 | 83368 | 107442 |
3_let | 22.8850 | 139547 | 0 | 0 | 99759 | 21296 | 18492 | 39788 |
4_let | 9.7111 | 59216 | 0 | 0 | 0 | 59216 | 0 | 21675 |
5_let | 3.7634 | 22948 | 0 | 0 | 0 | 22948 | 0 | 0 |
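
Each grouped table is the per-syllable table with the rows of a group summed column by column. A small sketch, assuming the per-syllable rows are held in a dictionary; the helper name `regroup` is hypothetical, and the three rows are the four-letter syllables from the first table, with only some of their columns.

```python
from collections import Counter, defaultdict

def regroup(rows, key):
    """Sum the columns of all syllables that share the same key(syllable)."""
    grouped = defaultdict(Counter)
    for syllable, counts in rows.items():
        grouped[key(syllable)].update(counts)  # adds the counts column by column
    return {label: dict(counter) for label, counter in grouped.items()}

# The four-letter syllables from the first table (subset of columns).
rows = {
    "chia": {"total": 37541, "3": 37541, "last": 0},
    "chio": {"total": 14558, "3": 14558, "last": 14558},
    "jour": {"total": 7117, "3": 7117, "last": 7117},
}

by_length = regroup(rows, lambda s: f"{len(s)}_let")
print(by_length["4_let"])  # {'total': 59216, '3': 59216, 'last': 21675}
```

The result matches the 4_let row above. Passing a different `key` function would give the other groupings in the same way.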
Here the syllables are grouped by “type”. In Italian there are “open” and “closed” syllables. Open syllables end with a vowel, while closed ones end with a consonant.
Syllable type | % tot | total frequency | whole word | 1 | 2 | 3 | 4 | last |
---|---|---|---|---|---|---|---|---|
open | 56.1464 | 342367 | 1200 | 31321 | 69146 | 138840 | 101860 | 161788 |
closed | 43.8536 | 267408 | 0 | 137584 | 99759 | 30065 | 0 | 7117 |
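
The open/closed rule only looks at the last letter of the syllable. A minimal sketch, where the function name `syllable_type` and the vowel set (accented vowels included) are assumptions:

```python
VOWELS = set("aeiouàèéìíòóù")  # assumed vowel set, accented vowels included

def syllable_type(syllable):
    """Return "open" if the syllable ends with a vowel, "closed" otherwise.

    Assumes a non-empty syllable made of letters only.
    """
    return "open" if syllable[-1].lower() in VOWELS else "closed"

for s in ["a", "ba", "chia", "nai", "ab", "bac", "chiar", "jour"]:
    print(s, syllable_type(s))
```

Here “a”, “ba”, “chia” and “nai” come out as open, while “ab”, “bac”, “chiar” and “jour” come out as closed.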