Difference between revisions of "Transition probabilities from selected texts"

(See also: back section added)
(New correction factor for p=0)
Previous revision:

First Order Transition Probabilities

English (1984 - George Orwell)
Markov Probability: 1.4641414719132793E-67
Corrected Zeroes: 1

French (Les Orientales - Victor Hugo)
Markov Probability: 1.1571661202766258E-70
Corrected Zeroes: 2

Vigenere Cipher (1984 - George Orwell, Keyword LEMON)
Markov Probability: 1.646391769425068E-70
Corrected Zeroes: 0

German (Traumdeutung - Sigmund Freud)
Note: does not account for Eszett (sharp s) character
Markov Probability: 3.8662593620911806E-73
Corrected Zeroes: 1

English Initial Letters (1984 - George Orwell)
Markov Probability: 1.9187432339606176E-56
Corrected Zeroes: 0

French Initial Letters (Les Orientales - Victor Hugo)
Counting words like l'hopital as two words (le followed by hopital):
Markov Probability: 7.809561685705767E-61
Corrected Zeroes: 0
Discounting the l' (only considering the hopital):
Markov Probability: 1.1841007473332175E-60
Corrected Zeroes: 0

German Initial Letters (Traumdeutung - Sigmund Freud)
Note: does not account for Eszett (sharp s) character. Though I don't think a word can ever start with this character
Markov Probability: 4.29592233581315E-64
Corrected Zeroes: 1

Second Order Transition Probabilities

English (1984 - George Orwell)
Markov Probability: 2.115089006082431E-43
Corrected Zeroes: 14

German (Traumdeutung - Sigmund Freud)
Note: does not account for Eszett (sharp s) character
Markov Probability: 3.79644909538402E-35
Corrected Zeroes: 21

French (Les Orientales - Victor Hugo)
Markov Probability: 4.429249667204738E-34
Corrected Zeroes: 18

Vigenere (English - 1984 - Orwell)
Markov Probability: 1.6699440985106574E-60
Corrected Zeroes: 8

English Initial Letters (1984 - George Orwell)
Markov Probability: 7.009981410871232E-53
Corrected Zeroes: 2

German Initial Letters (Traumdeutung - Sigmund Freud)
Note: does not account for Eszett (sharp s) character
Markov Probability: 2.908650572588623E-32
Corrected Zeroes: 17

Les Orientales - Victor Hugo.txt
Not counting l' as a word (but counting the word contracted with it):
Markov Probability: 1.0762921500526206E-40
Corrected Zeroes: 12
Counting the l' as one word and the other contracted word as another word:
Markov Probability: 2.970787716759867E-41
Corrected Zeroes: 12

Revision as of 17:39, 15 June 2009

The Somerton Man's code (without the extra line) is 44 characters long. If the text were purely random, with each letter appearing independently with probability 1/26, the probability of obtaining this particular string of 44 letters would be (1/26)^44 = 5.51027E-63. This serves as a baseline for comparison with the Markov probabilities listed below.
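
The baseline figure can be checked with a few lines of Python. This is only a sketch of the arithmetic in the paragraph above; the names ALPHABET_SIZE and CODE_LENGTH are illustrative and do not come from the original analysis.

```python
import math

ALPHABET_SIZE = 26   # letters A-Z, each assumed equally likely
CODE_LENGTH = 44     # length of the code without the extra line

# Probability of one specific 44-letter string under a uniform random model.
baseline = (1 / ALPHABET_SIZE) ** CODE_LENGTH

# The same quantity as a base-10 logarithm, which is easier to compare
# with the very small Markov probabilities.
log10_baseline = -CODE_LENGTH * math.log10(ALPHABET_SIZE)

print(f"{baseline:.5E}")
print(f"log10 = {log10_baseline:.4f}")
```

Working with the logarithm is a design choice for readability only; it makes it easy to see how many orders of magnitude separate the baseline from each Markov probability.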

First order

All letters


(1984 - George Orwell.txt) All Letters
Markov Probability: 1.4641414719133752E-71
Corrected Zeroes: 1

(Les Orientales - Victor Hugo.txt) All Letters
Markov Probability: 1.1571661202766416E-78
Corrected Zeroes: 2

(Traumdeutung - Sigmund Freud.txt) All Letters
Markov Probability: 3.866259362091006E-77
Corrected Zeroes: 1

(Vigenere - 1984.txt) All Letters
Markov Probability: 1.6463917694250256E-70
Corrected Zeroes: 0
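
The page does not show how these figures were computed. As a rough sketch, assuming a first-order model in which each letter depends only on the letter before it, the probability of a code could be estimated from a reference text as follows. The function name first_order_probability, the zero_factor parameter and its default value are hypothetical; the correction factor actually used for p=0 is not stated here.

```python
from collections import Counter
import string

def letters_only(text):
    """Uppercase the text and keep only the letters A-Z."""
    return [c for c in text.upper() if c in string.ascii_uppercase]

def first_order_probability(reference, code, zero_factor=1e-6):
    """Return (probability, corrected_zeroes) for `code`.

    Transition probabilities P(b | a) are estimated from letter pairs in
    `reference`. A transition never seen in the reference has p = 0 and is
    replaced by `zero_factor` (a hypothetical value) so the product does
    not collapse to zero.
    """
    ref = letters_only(reference)
    seq = letters_only(code)
    pair_counts = Counter(zip(ref, ref[1:]))   # occurrences of "a then b"
    from_counts = Counter(ref[:-1])            # occurrences of "a" with a successor
    probability = 1.0
    corrected_zeroes = 0
    for a, b in zip(seq, seq[1:]):
        count = pair_counts[(a, b)]
        if count == 0:
            probability *= zero_factor
            corrected_zeroes += 1
        else:
            probability *= count / from_counts[a]
    return probability, corrected_zeroes

# Toy usage with made-up strings (not the actual code or texts).
print(first_order_probability("ABABAC", "ABAC"))
```

This sketch ignores the probability of the first letter of the code, which is also an assumption; the original calculation may have handled it differently.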

Initial letters


(1984 - George Orwell.txt) Initial Letters
Markov Probability: 1.9187432339606235E-56
Corrected Zeroes: 0

(Les Orientales - Victor Hugo.txt) Initial Letters
Markov Probability: 7.809561685705419E-61
Corrected Zeroes: 0

(Traumdeutung - Sigmund Freud.txt) Initial Letters
Markov Probability: 4.553994899282498E-68
Corrected Zeroes: 1
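
For the initial-letter results, the reference text would first have to be reduced to the first letter of each word. A minimal sketch of that step, assuming words are separated by whitespace and that only the letters A-Z are kept (the function name initial_letters is hypothetical):

```python
import string

ASCII_LETTERS = set(string.ascii_uppercase)

def initial_letters(text):
    """Return the first letter of every whitespace-separated word.

    Leading punctuation is skipped. Words whose first letter is not in
    A-Z (for example an accented letter) are dropped, which is an
    assumption of this sketch. A contraction such as l'hopital counts
    as a single word here and contributes only "L".
    """
    initials = []
    for word in text.split():
        for ch in word:
            if ch.isalpha():
                upper = ch.upper()
                if upper in ASCII_LETTERS:
                    initials.append(upper)
                break  # only the first letter of each word matters
    return "".join(initials)

print(initial_letters("It was a bright cold day in April"))
```

The resulting string of initials can then be used as the reference sequence when estimating transition probabilities.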

Second order

All letters


(1984 - George Orwell.txt) All Letters
Markov Probability: 2.1150890060822598E-99
Corrected Zeroes: 14

(Les Orientales - Victor Hugo.txt) All Letters
Markov Probability: 4.429249667204522E-106
Corrected Zeroes: 18

(Traumdeutung - Sigmund Freud.txt) All Letters
Markov Probability: 3.796449095383427E-119
Corrected Zeroes: 21

(Vigenere - 1984.txt) All Letters
Markov Probability: 1.6699440985106455E-92
Corrected Zeroes: 8

Initial letters


(1984 - George Orwell.txt) Initial Letters
Markov Probability: 7.009981410871061E-61
Corrected Zeroes: 2

(Les Orientales - Victor Hugo.txt) Initial Letters
Markov Probability: 2.9707877167600856E-89
Corrected Zeroes: 12

(Traumdeutung - Sigmund Freud.txt) Initial Letters
Markov Probability: 2.9078792518321384E-100
Corrected Zeroes: 17
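
A second-order model conditions each letter on the two letters before it. The sketch below assumes that definition and, as an illustration only, replaces unseen transitions with a hypothetical zero_factor; neither the function name second_order_probability nor the default value comes from the original analysis.

```python
from collections import Counter

def second_order_probability(reference, code, zero_factor=1e-6):
    """Return (probability, corrected_zeroes) under a second-order model.

    P(c | a, b) is estimated from letter triples in `reference`. Triples
    never seen in the reference have p = 0 and are replaced by
    `zero_factor` (a hypothetical value).
    """
    ref = [c for c in reference.upper() if "A" <= c <= "Z"]
    seq = [c for c in code.upper() if "A" <= c <= "Z"]
    triple_counts = Counter(zip(ref, ref[1:], ref[2:]))
    # Pairs that are followed by at least one more letter.
    pair_counts = Counter(zip(ref[:-2], ref[1:-1]))
    probability = 1.0
    corrected_zeroes = 0
    for a, b, c in zip(seq, seq[1:], seq[2:]):
        count = triple_counts[(a, b, c)]
        if count == 0:
            probability *= zero_factor
            corrected_zeroes += 1
        else:
            probability *= count / pair_counts[(a, b)]
        # The first two letters of the code are not scored in this sketch.
    return probability, corrected_zeroes

# Toy usage with made-up strings (not the actual code or texts).
print(second_order_probability("ABCABCABD", "ABCA"))
```

Because there are far more possible letter triples than letter pairs, a reference text leaves more of them unobserved, which fits the larger Corrected Zeroes counts reported for the second-order results.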

See also

Back