I generated some statistics on syllable frequencies in Italian, for use in cognitive psychology.
Here is an example of the computations that my scripts and programs performed.
I started from a list of Italian words with their frequencies.
Starting from this example file (the actual one was much longer):
I hyphenated the words (separating the syllables with ~) and got:
The scripts then generated statistical tables of syllable frequencies. Some examples follow.
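The columns give each syllable's percentage of the total, its absolute frequency, its frequency as a whole word, its frequency in syllable positions 1 to 4, and its frequency as the last syllable of a word: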
| Syllable | % tot | absolute frequency | whole word | 1 | 2 | 3 | 4 | last |
|---|---|---|---|---|---|---|---|---|
| a | 5.1365 | 31321 | 0 | 31321 | 0 | 0 | 0 | 0 |
| ab | 22.5631 | 137584 | 0 | 137584 | 0 | 0 | 0 | 0 |
| ba | 11.3396 | 69146 | 0 | 0 | 69146 | 0 | 0 | 0 |
| bac | 15.1928 | 92642 | 0 | 0 | 92642 | 0 | 0 | 0 |
| bat | 1.1672 | 7117 | 0 | 0 | 7117 | 0 | 0 | 0 |
| chi | 3.4924 | 21296 | 0 | 0 | 0 | 21296 | 0 | 21296 |
| chia | 6.1565 | 37541 | 0 | 0 | 0 | 37541 | 0 | 0 |
| chiar | 3.7634 | 22948 | 0 | 0 | 0 | 22948 | 0 | 0 |
| chio | 2.3874 | 14558 | 0 | 0 | 0 | 14558 | 0 | 14558 |
| ci | 6.7846 | 41371 | 0 | 0 | 0 | 41371 | 0 | 0 |
| co | 1.8010 | 10982 | 0 | 0 | 0 | 10982 | 0 | 10982 |
| cò | 0.0567 | 346 | 0 | 0 | 0 | 346 | 0 | 346 |
| jour | 1.1672 | 7117 | 0 | 0 | 0 | 7117 | 0 | 7117 |
| na | 3.7520 | 22879 | 0 | 0 | 0 | 0 | 22879 | 22879 |
| nai | 3.0326 | 18492 | 0 | 0 | 0 | 0 | 18492 | 18492 |
| re | 1.0520 | 6415 | 0 | 0 | 0 | 0 | 6415 | 6415 |
| si | 3.7634 | 22948 | 0 | 0 | 0 | 0 | 22948 | 22948 |
| te | 0.3755 | 2290 | 1200 | 0 | 0 | 1090 | 0 | 1090 |
| ti | 1.9115 | 11656 | 0 | 0 | 0 | 11656 | 0 | 11656 |
| to | 5.1045 | 31126 | 0 | 0 | 0 | 0 | 31126 | 31126 |
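
A table of this shape could be computed along the following lines. This is a minimal sketch, not the original scripts: the function name `syllable_table`, the `key` parameter, and the toy input with its frequencies are all invented for illustration. It assumes that a one-syllable word is counted only under "whole word", which is what the `te` row suggests (1200 + 1090 = 2290, with 1090 under "last").

```python
from collections import defaultdict

MAX_POS = 4  # the tables track syllable positions 1 to 4


def syllable_table(entries, key=lambda syl: syl):
    """Aggregate frequencies per syllable, or per key(syllable).

    entries: iterable of (hyphenated_word, frequency) pairs,
             with syllables separated by "~".
    Returns {label: row}, where row maps "abs", "whole", 1..4,
    "last" and "pct" to numbers (missing counts read as 0).
    """
    rows = defaultdict(lambda: defaultdict(int))
    total = 0
    for word, freq in entries:
        syllables = word.split("~")
        for pos, syl in enumerate(syllables, start=1):
            row = rows[key(syl)]
            row["abs"] += freq
            total += freq
            if len(syllables) == 1:
                # Assumption: monosyllabic words count only as "whole word".
                row["whole"] += freq
                continue
            if pos <= MAX_POS:
                row[pos] += freq
            if pos == len(syllables):
                row["last"] += freq
    for row in rows.values():
        row["pct"] = 100 * row["abs"] / total
    return rows


# Toy input with invented frequencies (not the real data).
entries = [("ca~sa", 10), ("ca~ne", 5), ("te", 3), ("car~te", 2)]
table = syllable_table(entries)
# table["te"]["whole"] is 3 and table["te"]["last"] is 2
```

Passing a different `key` function groups the same counts by another property of the syllable, such as its skeleton or its length.
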
Here the syllables are grouped by “skeleton”, where c stands for a consonant and v for a vowel:
| Skeleton | % tot | absolute frequency | whole word | 1 | 2 | 3 | 4 | last |
|---|---|---|---|---|---|---|---|---|
| Svvc | 1.1672 | 7117 | 0 | 0 | 0 | 7117 | 0 | 7117 |
| ccv | 3.4924 | 21296 | 0 | 0 | 0 | 21296 | 0 | 21296 |
| ccvv | 8.5440 | 52099 | 0 | 0 | 0 | 52099 | 0 | 14558 |
| ccvvc | 3.7634 | 22948 | 0 | 0 | 0 | 22948 | 0 | 0 |
| cv | 35.9410 | 219159 | 1200 | 0 | 69146 | 65445 | 83368 | 107442 |
| cvc | 16.3600 | 99759 | 0 | 0 | 99759 | 0 | 0 | 0 |
| cvv | 3.0326 | 18492 | 0 | 0 | 0 | 0 | 18492 | 18492 |
| v | 5.1365 | 31321 | 0 | 31321 | 0 | 0 | 0 | 0 |
| vc | 22.5631 | 137584 | 0 | 137584 | 0 | 0 | 0 | 0 |
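
The skeleton mapping can be sketched as below. This is an illustration under assumptions, not the original code: the vowel set and the function name `skeleton` are my own, and the mapping works letter by letter rather than by sound, which is consistent with `chi` appearing as `ccv` in the table. The table lists `jour` as `Svvc`, so the sketch maps `j` to `S`; the source does not say what `S` stands for.

```python
# Assumed vowel set: plain vowels plus the accented forms used in Italian.
VOWELS = set("aeiouàèéìíòóùú")


def skeleton(syllable):
    """Map each letter of a syllable to 'v', 'c', or 'S' (for 'j')."""
    out = []
    for ch in syllable.lower():
        if ch in VOWELS:
            out.append("v")
        elif ch == "j":
            # Assumption taken from the table row for "jour" (Svvc).
            out.append("S")
        else:
            # Every other character is treated as a consonant.
            out.append("c")
    return "".join(out)


for syl in ["a", "ab", "chia", "chiar", "cò", "jour"]:
    print(syl, skeleton(syl))
```
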
Here the syllables are grouped by number of letters:
| N of letters | % tot | absolute frequency | whole word | 1 | 2 | 3 | 4 | last |
|---|---|---|---|---|---|---|---|---|
| 1_let | 5.1365 | 31321 | 0 | 31321 | 0 | 0 | 0 | 0 |
| 2_let | 58.5040 | 356743 | 1200 | 137584 | 69146 | 65445 | 83368 | 107442 |
| 3_let | 22.8850 | 139547 | 0 | 0 | 99759 | 21296 | 18492 | 39788 |
| 4_let | 9.7111 | 59216 | 0 | 0 | 0 | 59216 | 0 | 21675 |
| 5_let | 3.7634 | 22948 | 0 | 0 | 0 | 22948 | 0 | 0 |
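
Grouping by length amounts to summing the per-syllable counts under a length label. The sketch below is a hedged illustration: `letters_label` and `regroup` are hypothetical names, and the Unicode normalization step is my own precaution so that an accented letter such as `ò` counts as one letter even if it is stored as a base letter plus a combining accent. The frequencies are copied from the per-syllable table above.

```python
import unicodedata
from collections import Counter


def letters_label(syllable):
    """Return a label such as '4_let' for a four-letter syllable."""
    # NFC merges a base letter and its combining accent into one character.
    return f"{len(unicodedata.normalize('NFC', syllable))}_let"


def regroup(counts, key):
    """Sum per-syllable frequencies under the label returned by key."""
    grouped = Counter()
    for syllable, freq in counts.items():
        grouped[key(syllable)] += freq
    return grouped


# Absolute frequencies of the three four-letter syllables in the table.
absolute = {"chia": 37541, "chio": 14558, "jour": 7117}
by_length = regroup(absolute, letters_label)
# by_length["4_let"] is 59216, matching the 4_let row
```
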
Here the syllables are grouped by “type”. In Italian there are “open” syllables, which end in a vowel, and “closed” syllables, which end in a consonant:
| Syllable type | % tot | absolute frequency | whole word | 1 | 2 | 3 | 4 | last |
|---|---|---|---|---|---|---|---|---|
| open | 56.1464 | 342367 | 1200 | 31321 | 69146 | 138840 | 101860 | 161788 |
| closed | 43.8536 | 267408 | 0 | 137584 | 99759 | 30065 | 0 | 7117 |
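
The open/closed split can be sketched by looking at the final letter only. This is a minimal illustration with an assumed vowel set and a hypothetical function name, `syllable_type`; it agrees with the tables above, where `ab`, `bac`, `bat`, `chiar` and `jour` add up to the closed row.

```python
# Assumed vowel set: plain vowels plus the accented forms used in Italian.
VOWELS = set("aeiouàèéìíòóùú")


def syllable_type(syllable):
    """Return 'open' if the syllable ends in a vowel, else 'closed'."""
    if not syllable:
        raise ValueError("empty syllable")
    return "open" if syllable[-1].lower() in VOWELS else "closed"


for syl in ["a", "ab", "chia", "chiar", "cò", "jour"]:
    print(syl, syllable_type(syl))
```
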