
Statistics on syllables in Italian

I generated some statistics on the frequency of syllables in the Italian language, for use in cognitive psychology.

Here's an example of the computations that my scripts and programs performed.

I started from a list of words with their frequency in Italian.

Take this example file (the actual one was much longer):

  • 130 abachi
  • 11328 abaco
  • 1090 abate
  • 11656 abati
  • 7117 abat-jour
  • 17595 abbacchi
  • 6415 abbacchiare
  • 22948 abbacchiarsi
  • 31126 abbacchiato
  • 14558 abbacchio
  • 3571 abbachi
  • 22879 abbacina
  • 18492 abbacinai
  • 1200 te

I hyphenated the words (that is, separated their syllables with ~) and got:

  • 130 a~ba~chi
  • 11328 a~ba~co
  • 1090 a~ba~te
  • 11656 a~ba~ti
  • 7117 a~bat-jour
  • 17595 ab~bac~chi
  • 6415 ab~bac~chia~re
  • 22948 ab~bac~chiar~si
  • 31126 ab~bac~chia~to
  • 14558 ab~bac~chio
  • 3571 ab~ba~chi
  • 22879 ab~ba~ci~na
  • 18492 ab~ba~ci~nai
  • 1200 te
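As an illustration of how a hyphenated line can be read back into a frequency and a list of syllables, here is a minimal Python sketch. It is not the original script: the function name `parse_line` is hypothetical, and it assumes each input line has the form "frequency word" with no bullet. It also assumes that a literal hyphen (as in a~bat-jour) marks a syllable boundary just like ~, which is what the tables further down suggest, since they list "bat" and "jour" separately.

```python
import re


def parse_line(line):
    """Split a line such as '7117 a~bat-jour' into (frequency, syllables)."""
    freq_text, word = line.split(maxsplit=1)
    # Assumption: both '~' and '-' separate syllables.
    syllables = [s for s in re.split(r"[~-]", word.strip()) if s]
    return int(freq_text), syllables


print(parse_line("7117 a~bat-jour"))  # (7117, ['a', 'bat', 'jour'])
print(parse_line("1200 te"))          # (1200, ['te'])
```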

Then the scripts generated some statistical tables on syllable frequency. Below you'll find some examples.

Here's the legend of the table headers:

  • Syllable: a syllable of the Italian language
  • % tot: the syllable's frequency as a percentage of the total frequency of all syllables
  • total frequency: the total frequency of the syllable across all words
  • whole word: the total frequency of the syllable when it forms an entire word by itself
  • 1, 2, 3, 4: the total frequency of the syllable when it appears in that position within a word
  • last: the total frequency of the syllable when it is the last syllable of a word
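| Syllable | % tot | total frequency | whole word | 1 | 2 | 3 | 4 | last |
|---|---|---|---|---|---|---|---|---|
| a | 5.1365 | 31321 | 0 | 31321 | 0 | 0 | 0 | 0 |
| ab | 22.5631 | 137584 | 0 | 137584 | 0 | 0 | 0 | 0 |
| ba | 11.3396 | 69146 | 0 | 0 | 69146 | 0 | 0 | 0 |
| bac | 15.1928 | 92642 | 0 | 0 | 92642 | 0 | 0 | 0 |
| bat | 1.1672 | 7117 | 0 | 0 | 7117 | 0 | 0 | 0 |
| chi | 3.4924 | 21296 | 0 | 0 | 0 | 21296 | 0 | 21296 |
| chia | 6.1565 | 37541 | 0 | 0 | 0 | 37541 | 0 | 0 |
| chiar | 3.7634 | 22948 | 0 | 0 | 0 | 22948 | 0 | 0 |
| chio | 2.3874 | 14558 | 0 | 0 | 0 | 14558 | 0 | 14558 |
| ci | 6.7846 | 41371 | 0 | 0 | 0 | 41371 | 0 | 0 |
| co | 1.8577 | 11328 | 0 | 0 | 0 | 11328 | 0 | 11328 |
| jour | 1.1672 | 7117 | 0 | 0 | 0 | 7117 | 0 | 7117 |
| na | 3.7520 | 22879 | 0 | 0 | 0 | 0 | 22879 | 22879 |
| nai | 3.0326 | 18492 | 0 | 0 | 0 | 0 | 18492 | 18492 |
| re | 1.0520 | 6415 | 0 | 0 | 0 | 0 | 6415 | 6415 |
| si | 3.7634 | 22948 | 0 | 0 | 0 | 0 | 22948 | 22948 |
| te | 0.3755 | 2290 | 1200 | 0 | 0 | 1090 | 0 | 1090 |
| ti | 1.9115 | 11656 | 0 | 0 | 0 | 11656 | 0 | 11656 |
| to | 5.1045 | 31126 | 0 | 0 | 0 | 0 | 31126 | 31126 |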
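The per-syllable counts can be recomputed from the hyphenated example list with a short Python sketch. This is not the original script, only one way to get the same numbers: the names `SAMPLE` and `syllable_table` are hypothetical. Two behaviours are inferred from the example data rather than documented: a hyphen is treated as a syllable boundary, and a one-syllable word ("te") is counted only under "whole word", not under position 1 or "last".

```python
from collections import defaultdict

# The hyphenated example list, one "frequency word" pair per line.
SAMPLE = """\
130 a~ba~chi
11328 a~ba~co
1090 a~ba~te
11656 a~ba~ti
7117 a~bat-jour
17595 ab~bac~chi
6415 ab~bac~chia~re
22948 ab~bac~chiar~si
31126 ab~bac~chia~to
14558 ab~bac~chio
3571 ab~ba~chi
22879 ab~ba~ci~na
18492 ab~ba~ci~nai
1200 te
"""


def syllable_table(text, max_pos=4):
    """Return ({syllable: counts}, grand_total) for hyphenated input lines."""
    rows = defaultdict(
        lambda: {"total": 0, "whole": 0, "pos": [0] * max_pos, "last": 0}
    )
    for line in text.splitlines():
        if not line.strip():
            continue
        freq_text, word = line.split()
        freq = int(freq_text)
        # Assumption: '-' separates syllables just like '~'.
        syllables = word.replace("-", "~").split("~")
        if len(syllables) == 1:
            # Inferred from the example: whole-word syllables are not
            # counted under position 1 or "last".
            rows[syllables[0]]["total"] += freq
            rows[syllables[0]]["whole"] += freq
            continue
        for index, syllable in enumerate(syllables):
            rows[syllable]["total"] += freq
            if index < max_pos:
                rows[syllable]["pos"][index] += freq
        rows[syllables[-1]]["last"] += freq
    grand_total = sum(row["total"] for row in rows.values())
    for row in rows.values():
        row["percent"] = 100 * row["total"] / grand_total
    return dict(rows), grand_total


rows, grand_total = syllable_table(SAMPLE)
print(grand_total)  # 609775
print(rows["te"]["whole"], rows["te"]["pos"], rows["te"]["last"])
# 1200 [0, 0, 1090, 0] 1090
```

Dividing each syllable's total by the grand total (609775 for this example) gives the "% tot" column; for "a" that is 31321 / 609775, or about 5.1365%.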

Here the syllables are grouped by “skeleton”: c stands for consonant, v for vowel and S for foreign letter (for example, the j in “jour”).

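| Skeleton | % tot | total frequency | whole word | 1 | 2 | 3 | 4 | last |
|---|---|---|---|---|---|---|---|---|
| Svvc | 1.1672 | 7117 | 0 | 0 | 0 | 7117 | 0 | 7117 |
| ccv | 3.4924 | 21296 | 0 | 0 | 0 | 21296 | 0 | 21296 |
| ccvv | 8.5440 | 52099 | 0 | 0 | 0 | 52099 | 0 | 14558 |
| ccvvc | 3.7634 | 22948 | 0 | 0 | 0 | 22948 | 0 | 0 |
| cv | 35.9410 | 219159 | 1200 | 0 | 69146 | 65445 | 83368 | 107442 |
| cvc | 16.3600 | 99759 | 0 | 0 | 99759 | 0 | 0 | 0 |
| cvv | 3.0326 | 18492 | 0 | 0 | 0 | 0 | 18492 | 18492 |
| v | 5.1365 | 31321 | 0 | 31321 | 0 | 0 | 0 | 0 |
| vc | 22.5631 | 137584 | 0 | 137584 | 0 | 0 | 0 | 0 |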
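A skeleton can be derived letter by letter. The Python sketch below is only an illustration, not the original script, and rests on two assumptions: the "foreign" letters are j, k, w, x and y (the letters outside the traditional 21-letter Italian alphabet), and accented vowels count as vowels. The function name `skeleton` is hypothetical.

```python
# Assumption: accented vowels are treated as plain vowels.
VOWELS = set("aeiouàèéìíòóùú")
# Assumption: letters outside the traditional Italian alphabet.
FOREIGN = set("jkwxy")


def skeleton(syllable):
    """Map each letter to 'v' (vowel), 'S' (foreign letter) or 'c' (consonant)."""
    codes = []
    for letter in syllable.lower():
        if letter in VOWELS:
            codes.append("v")
        elif letter in FOREIGN:
            codes.append("S")
        else:
            # Everything else, including h, is treated as a consonant.
            codes.append("c")
    return "".join(codes)


for example in ["a", "ab", "bac", "chi", "chiar", "jour"]:
    print(example, skeleton(example))
# a v
# ab vc
# bac cvc
# chi ccv
# chiar ccvvc
# jour Svvc
```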

Here the syllables are grouped by number of letters:

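| N of letters | % tot | total frequency | whole word | 1 | 2 | 3 | 4 | last |
|---|---|---|---|---|---|---|---|---|
| 1_let | 5.1365 | 31321 | 0 | 31321 | 0 | 0 | 0 | 0 |
| 2_let | 58.5040 | 356743 | 1200 | 137584 | 69146 | 65445 | 83368 | 107442 |
| 3_let | 22.8850 | 139547 | 0 | 0 | 99759 | 21296 | 18492 | 39788 |
| 4_let | 9.7111 | 59216 | 0 | 0 | 0 | 59216 | 0 | 21675 |
| 5_let | 3.7634 | 22948 | 0 | 0 | 0 | 22948 | 0 | 0 |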

Here the syllables are grouped by “type”. In Italian there are “open” and “closed” syllables. Open syllables end with a vowel, while closed ones end with a consonant.

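| Syllable type | % tot | total frequency | whole word | 1 | 2 | 3 | 4 | last |
|---|---|---|---|---|---|---|---|---|
| open | 56.1464 | 342367 | 1200 | 31321 | 69146 | 138840 | 101860 | 161788 |
| closed | 43.8536 | 267408 | 0 | 137584 | 99759 | 30065 | 0 | 7117 |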
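The open/closed rule is simple enough to check in a few lines of Python. The sketch below is an illustration rather than the original script: it classifies each syllable by its final letter and sums the per-syllable totals from the example. The names `syllable_type` and `SYLLABLE_TOTALS` are hypothetical, and treating accented vowels as vowels is an assumption.

```python
from collections import Counter

# Assumption: accented vowels are treated as plain vowels.
VOWELS = set("aeiouàèéìíòóùú")

# Total frequency of each syllable in the example.
SYLLABLE_TOTALS = {
    "a": 31321, "ab": 137584, "ba": 69146, "bac": 92642, "bat": 7117,
    "chi": 21296, "chia": 37541, "chiar": 22948, "chio": 14558,
    "ci": 41371, "co": 11328, "jour": 7117, "na": 22879, "nai": 18492,
    "re": 6415, "si": 22948, "te": 2290, "ti": 11656, "to": 31126,
}


def syllable_type(syllable):
    """Return 'open' if the syllable ends with a vowel, else 'closed'."""
    return "open" if syllable[-1].lower() in VOWELS else "closed"


by_type = Counter()
for syllable, total in SYLLABLE_TOTALS.items():
    by_type[syllable_type(syllable)] += total

print(by_type["open"], by_type["closed"])  # 342367 267408
```

The same loop gives the other groupings if the key function is swapped, for instance using the number of letters in the syllable instead of its type.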