Transitions within words

From Derek
Revision as of 15:51, 17 August 2009 by A1133050 (Talk | contribs)

The following table gives raw counts of letter transitions within single words, taken from 1984 by George Orwell. For example, in the phrase "hi there" the pairs counted are 'HI', 'TH', 'HE', 'ER' and 'RE'. The pair 'IT' is NOT counted, because it spans two words.
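
The counting rule above can be sketched in Python. This is only an illustration, not the script used to build the table: the function name `count_transitions` is hypothetical, and treating every non-letter character (spaces, punctuation, apostrophes, hyphens) as a word boundary is an assumption, since the page does not say how such characters were handled.

```python
import re
from collections import Counter


def count_transitions(text):
    """Count adjacent letter pairs that fall inside a single word."""
    counts = Counter()
    # Assumption: a "word" is a maximal run of the letters A-Z;
    # every other character acts as a word boundary.
    for word in re.findall(r"[A-Z]+", text.upper()):
        # Pair each letter with the letter that follows it in the same word.
        for first, second in zip(word, word[1:]):
            counts[first + second] += 1
    return counts


pairs = count_transitions("hi there")
print(sorted(pairs))  # ['ER', 'HE', 'HI', 'RE', 'TH']
```

Because pairs are only formed inside each word, 'IT' never appears in the result for "hi there".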

A transition from 'O' to 'D' is counted in row 'O', column 'D'. So to find the probability of a transition to the letter 'D' given the letter 'T' (i.e. the pair 'TD' within a single word), take the value in row 'T', column 'D' (2 in this case) and divide it by the sum of the elements in row 'T' (1405 + 12 + 147 + and so on).
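
As a worked example of the calculation just described, the sketch below divides one entry of row 'T' by the row total. The values in `ROW_T` are copied from row 'T' of the table on this page. The names `LETTERS`, `ROW_T` and `transition_probability` are hypothetical, chosen only for this illustration.

```python
import string

LETTERS = string.ascii_uppercase  # column order of the table: A to Z

# Row 'T' of the table: counts of 'TA', 'TB', ..., 'TZ'.
ROW_T = [1405, 12, 147, 2, 3670, 20, 1, 14399, 2808, 0, 0, 520, 53,
         19, 4163, 11, 0, 1354, 901, 737, 694, 0, 348, 0, 917, 0]


def transition_probability(row, to_letter):
    """Estimate P(next letter = to_letter | current letter) from one row of counts."""
    total = sum(row)
    if total == 0:
        raise ValueError("row has no counted transitions")
    return row[LETTERS.index(to_letter)] / total


print(sum(ROW_T))                          # 32181
print(transition_probability(ROW_T, "D"))  # 2 / 32181
print(transition_probability(ROW_T, "H"))  # 14399 / 32181
```

Dividing by the row total means the probabilities in a row sum to 1, so each row can be read as a distribution over the next letter within a word.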


A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
A 10 898 1516 2186 17 224 837 21 1538 11 551 3120 741 6006 9 867 3 3654 4416 4633 311 700 271 20 927 53
B 463 97 0 3 2606 0 0 0 288 53 0 1051 12 1 831 0 0 635 95 33 850 26 0 0 471 0
C 1349 0 190 2 2091 0 0 1733 554 0 728 381 0 0 2260 0 6 615 22 995 364 0 0 0 60 0
D 470 7 5 226 2115 13 76 17 1366 14 0 212 23 62 990 1 0 321 596 17 250 25 13 0 256 0
E 2881 66 1219 4903 2134 504 219 111 556 5 80 1955 1434 5171 221 532 77 7575 3640 1522 104 1341 529 626 896 18
F 672 3 3 0 889 463 0 0 769 0 0 298 0 1 1438 0 0 726 19 295 331 0 0 0 25 0
G 596 0 0 0 1178 0 111 1283 470 0 0 250 22 123 484 0 0 482 232 45 286 0 0 0 37 0
H 5077 7 0 0 13695 17 0 0 4065 0 0 23 17 24 1879 2 1 319 49 866 286 0 6 0 177 0
I 664 298 1830 1357 1112 714 913 8 8 0 289 1322 1664 9708 1398 242 18 995 3371 4648 22 627 0 82 0 179
J 22 0 0 0 100 0 0 0 0 0 0 0 0 0 84 0 0 0 0 0 250 0 0 0 0 0
K 40 15 2 1 1109 4 4 2 537 0 0 58 2 424 2 3 0 0 133 0 15 0 34 0 26 0
L 1317 13 24 1524 3168 347 7 0 2261 0 118 2430 166 6 1322 64 1 74 423 372 321 91 154 0 1965 0
M 1406 362 1 0 3192 14 0 0 1029 0 0 8 206 28 1237 638 0 69 283 5 363 0 1 0 155 0
N 552 19 967 4295 2957 138 4333 19 993 12 312 354 34 220 2350 14 44 16 1770 3004 236 119 25 7 450 5
O 144 430 505 587 76 3716 231 18 342 4 427 1180 1990 5162 1155 639 0 3831 965 2031 5134 626 1592 47 93 21
P 1228 0 0 0 1682 2 0 177 461 0 0 785 8 17 991 588 0 957 211 340 309 0 3 0 48 0
Q 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 409 0 0 0 0 0
R 1596 45 181 736 6480 99 225 146 2109 0 286 347 397 421 2298 93 0 476 1316 1351 537 103 74 1 1067 0
S 1131 8 469 8 2703 41 15 1332 1633 0 113 342 259 98 1380 657 38 3 1386 4134 750 0 150 0 148 0
T 1405 12 147 2 3670 20 1 14399 2808 0 0 520 53 19 4163 11 0 1354 901 737 694 0 348 0 917 0
U 415 267 455 196 390 65 815 0 261 0 0 1647 359 1290 44 600 0 1752 1389 2037 0 15 0 8 12 6
V 299 0 0 0 3138 0 0 0 534 0 0 0 0 1 291 0 0 3 0 0 4 0 0 0 35 0
W 3469 0 0 39 1650 2 0 1947 1995 0 6 57 0 451 1242 0 0 175 169 2 2 0 0 0 6 0
X 81 0 106 0 55 1 0 10 147 0 0 1 0 0 1 113 1 0 0 136 22 0 3 0 16 0
Y 28 20 8 14 536 3 1 5 165 1 0 33 72 10 1346 22 0 34 374 144 0 5 31 0 0 0
Z 36 0 0 0 177 0 0 0 40 0 0 9 0 0 20 0 0 0 0 0 1 2 0 0 5 15