I generated some statistics on syllable frequencies in the Italian language, for use in cognitive psychology.
Here's an example of the computations that my scripts and programs performed.
I started from a list of Italian words with their frequencies.
Starting from this example file (the actual one was much longer):
I hyphenated the words (separating the syllables with ~) and got:
Then the scripts generated statistical tables of syllable frequencies. Below you'll find some examples.
The columns have the following meanings:
| Syllable | % tot | absolute frequency | whole word | 1 | 2 | 3 | 4 | last |
|---|---|---|---|---|---|---|---|---|
| a | 5,1365 | 31321 | 0 | 31321 | 0 | 0 | 0 | 0 |
| ab | 22,5631 | 137584 | 0 | 137584 | 0 | 0 | 0 | 0 |
| ba | 11,3396 | 69146 | 0 | 0 | 69146 | 0 | 0 | 0 |
| bac | 15,1928 | 92642 | 0 | 0 | 92642 | 0 | 0 | 0 |
| bat | 1,1672 | 7117 | 0 | 0 | 7117 | 0 | 0 | 0 |
| chi | 3,4924 | 21296 | 0 | 0 | 0 | 21296 | 0 | 21296 |
| chia | 6,1565 | 37541 | 0 | 0 | 0 | 37541 | 0 | 0 |
| chiar | 3,7634 | 22948 | 0 | 0 | 0 | 22948 | 0 | 0 |
| chio | 2,3874 | 14558 | 0 | 0 | 0 | 14558 | 0 | 14558 |
| ci | 6,7846 | 41371 | 0 | 0 | 0 | 41371 | 0 | 0 |
| co | 1,8010 | 10982 | 0 | 0 | 0 | 10982 | 0 | 10982 |
| cò | 0,0567 | 346 | 0 | 0 | 0 | 346 | 0 | 346 |
| jour | 1,1672 | 7117 | 0 | 0 | 0 | 7117 | 0 | 7117 |
| na | 3,7520 | 22879 | 0 | 0 | 0 | 0 | 22879 | 22879 |
| nai | 3,0326 | 18492 | 0 | 0 | 0 | 0 | 18492 | 18492 |
| re | 1,0520 | 6415 | 0 | 0 | 0 | 0 | 6415 | 6415 |
| si | 3,7634 | 22948 | 0 | 0 | 0 | 0 | 22948 | 22948 |
| te | 0,3755 | 2290 | 1200 | 0 | 0 | 1090 | 0 | 1090 |
| ti | 1,9115 | 11656 | 0 | 0 | 0 | 11656 | 0 | 11656 |
| to | 5,1045 | 31126 | 0 | 0 | 0 | 0 | 31126 | 31126 |
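
A minimal sketch of how a table like the one above could be computed, not the original scripts. The function name `syllable_stats`, the input format (pairs of a `~`-hyphenated word and its frequency), and the reading of the columns are assumptions inferred from the table: a one-syllable word is counted under "whole word", while any other syllable is counted under its 1-based position and, if it ends the word, also under "last". The sample input is illustrative and was chosen to be consistent with a few rows of the table; it is not taken from the original word list.

```python
from collections import defaultdict

MAX_POS = 4  # the tables report positions 1..4 in separate columns


def syllable_stats(words):
    """Aggregate frequencies per syllable.

    words: iterable of (hyphenated_word, frequency), e.g. ("a~bat~jour", 7117).
    Returns {syllable: row}, where row maps "abs", "whole", "last", "pct"
    and the integer positions 1..MAX_POS to counts (assumed column meanings).
    """
    stats = defaultdict(lambda: defaultdict(int))
    total = 0
    for hyphenated, freq in words:
        syllables = hyphenated.split("~")
        for pos, syl in enumerate(syllables, start=1):
            row = stats[syl]
            row["abs"] += freq
            total += freq
            if len(syllables) == 1:
                # A monosyllabic word counts only as "whole word".
                row["whole"] += freq
                continue
            if pos <= MAX_POS:
                row[pos] += freq
            if pos == len(syllables):
                row["last"] += freq
    for row in stats.values():
        # Share of all syllable occurrences, in percent.
        row["pct"] = 100 * row["abs"] / total
    return stats


# Illustrative input (hypothetical, see the note above).
sample = [("a~bat~jour", 7117), ("ab~bac~chio", 14558), ("te", 1200)]
stats = syllable_stats(sample)
print(stats["jour"][3], stats["jour"]["last"], stats["te"]["whole"])
```

Under these assumptions, `jour` is counted both in position 3 and as a last syllable, which matches the pattern of the `jour` row in the table.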
Here the syllables are grouped by “skeleton”, where c stands for a consonant and v for a vowel.
| Skeleton | % tot | absolute frequency | whole word | 1 | 2 | 3 | 4 | last |
|---|---|---|---|---|---|---|---|---|
| Svvc | 1,1672 | 7117 | 0 | 0 | 0 | 7117 | 0 | 7117 |
| ccv | 3,4924 | 21296 | 0 | 0 | 0 | 21296 | 0 | 21296 |
| ccvv | 8,5440 | 52099 | 0 | 0 | 0 | 52099 | 0 | 14558 |
| ccvvc | 3,7634 | 22948 | 0 | 0 | 0 | 22948 | 0 | 0 |
| cv | 35,9410 | 219159 | 1200 | 0 | 69146 | 65445 | 83368 | 107442 |
| cvc | 16,3600 | 99759 | 0 | 0 | 99759 | 0 | 0 | 0 |
| cvv | 3,0326 | 18492 | 0 | 0 | 0 | 0 | 18492 | 18492 |
| v | 5,1365 | 31321 | 0 | 31321 | 0 | 0 | 0 | 0 |
| vc | 22,5631 | 137584 | 0 | 137584 | 0 | 0 | 0 | 0 |
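
The grouping by skeleton can be sketched as below. This is only an illustration under stated assumptions: the helper `skeleton` and the `VOWELS` set are hypothetical names, and every letter that is not a (possibly accented) vowel is mapped to `c`. The table also contains the skeleton `Svvc` for `jour`, which suggests the original scripts treat some letters specially; this sketch does not attempt that and would return `cvvc`. The frequencies are copied from the per-syllable table.

```python
from collections import Counter

# Assumed vowel inventory, including accented forms such as "ò".
VOWELS = set("aeiouàèéìíòóùú")


def skeleton(syllable):
    """Map each letter to 'v' (vowel) or 'c' (any other letter)."""
    return "".join("v" if ch in VOWELS else "c" for ch in syllable.lower())


# Absolute frequencies taken from the per-syllable table.
freqs = {"chia": 37541, "chio": 14558, "bac": 92642, "bat": 7117}

by_skeleton = Counter()
for syl, freq in freqs.items():
    by_skeleton[skeleton(syl)] += freq

print(by_skeleton["ccvv"])  # 52099, as in the ccvv row
print(by_skeleton["cvc"])   # 99759, as in the cvc row
```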
Here the syllables are grouped by number of letters.
| N of letters | % tot | absolute frequency | whole word | 1 | 2 | 3 | 4 | last |
|---|---|---|---|---|---|---|---|---|
| 1_let | 5,1365 | 31321 | 0 | 31321 | 0 | 0 | 0 | 0 |
| 2_let | 58,5040 | 356743 | 1200 | 137584 | 69146 | 65445 | 83368 | 107442 |
| 3_let | 22,8850 | 139547 | 0 | 0 | 99759 | 21296 | 18492 | 39788 |
| 4_let | 9,7111 | 59216 | 0 | 0 | 0 | 59216 | 0 | 21675 |
| 5_let | 3,7634 | 22948 | 0 | 0 | 0 | 22948 | 0 | 0 |
Here the syllables are grouped by “type”. In Italian there are “open” syllables, which end in a vowel, and “closed” syllables, which end in a consonant.
| Syllable type | % tot | absolute frequency | whole word | 1 | 2 | 3 | 4 | last |
|---|---|---|---|---|---|---|---|---|
| open | 56,1464 | 342367 | 1200 | 31321 | 69146 | 138840 | 101860 | 161788 |
| closed | 43,8536 | 267408 | 0 | 137584 | 99759 | 30065 | 0 | 7117 |
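
As a cross-check, the open/closed totals can be recomputed from the per-syllable table. This is a small sketch, assuming that a syllable is open when its last letter is a vowel and closed otherwise; `syllable_type` and `VOWELS` are hypothetical names, and the frequencies are the absolute frequencies listed in the first table.

```python
from collections import Counter

# Assumed vowel inventory, including accented forms such as "ò".
VOWELS = set("aeiouàèéìíòóùú")


def syllable_type(syllable):
    """Return 'open' if the syllable ends in a vowel, else 'closed'."""
    return "open" if syllable[-1].lower() in VOWELS else "closed"


# Absolute frequencies copied from the per-syllable table.
freqs = {
    "a": 31321, "ab": 137584, "ba": 69146, "bac": 92642, "bat": 7117,
    "chi": 21296, "chia": 37541, "chiar": 22948, "chio": 14558,
    "ci": 41371, "co": 10982, "cò": 346, "jour": 7117, "na": 22879,
    "nai": 18492, "re": 6415, "si": 22948, "te": 2290, "ti": 11656,
    "to": 31126,
}

by_type = Counter()
for syl, freq in freqs.items():
    by_type[syllable_type(syl)] += freq

total = sum(by_type.values())
for kind in ("open", "closed"):
    print(kind, by_type[kind], f"{100 * by_type[kind] / total:.4f}%")
```

The two sums come out as 342367 (open) and 267408 (closed), the same absolute frequencies as in the table.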