Final Report/Thesis 2015
====Evaluation and Justification====
The results of the initial validation in Figure 37 show that using the ''Project Gutenberg'' novel as the English reference text gave lower chi-squared values for all test cases, so it was deemed a more suitable base text than the modified version of the ''Universal Declaration of Human Rights''. The results of the European language comparison in Figure 38 show that English had the lowest chi-squared value of all the languages in our ''Project Gutenberg'' corpus, and was therefore the closest language to the Somerton Man code.

The chi-squared values calculated using the English translation of the Declaration of Human Rights (Figure 39) were much higher than those calculated using the English ''Project Gutenberg'' novel (Figure 40). This large difference, together with the fact that real English samples obtained such high chi-squared values, suggests that the Declaration may not accurately represent letter frequencies in the English language, which makes its validity as a base text for a chi-squared analysis questionable. In addition, although the chi-squared values for the Somerton Man code variants were much lower in all cases, ''hypothesis testing'' could not be completed with this base text, because the large chi-squared values produced ''p-values'' approaching 0 for both the code variants and the significance level.

The results of the initial significance level calculation in Figure 40 show that 2 of the 3 Somerton Man code variants, and thus the average result, achieved higher ''p-values'' than the calculated significance level. From this we deduced that, in our preliminary results, the null hypothesis was not rejected and that English is the most likely language of origin of the code, assuming that it is an ''initialism''.
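The comparison described above rests on a chi-squared goodness-of-fit statistic between the letter counts of a short sample and the letter frequencies of a base text. The following is a minimal sketch of that calculation, not the code used in this project. The function names <code>letter_counts</code> and <code>chi_squared_statistic</code> are hypothetical, and the reference frequencies are assumed to be supplied as a mapping from each letter A to Z to a relative frequency.

```python
from collections import Counter
from string import ascii_uppercase


def letter_counts(text):
    """Count occurrences of each letter A-Z, ignoring case and non-letters."""
    counts = Counter(ch for ch in text.upper() if ch in ascii_uppercase)
    return {letter: counts.get(letter, 0) for letter in ascii_uppercase}


def chi_squared_statistic(sample, reference_freqs):
    """Pearson chi-squared statistic of a sample against reference frequencies.

    reference_freqs is assumed to map every letter A-Z to a relative
    frequency, with the frequencies summing to 1.
    """
    observed = letter_counts(sample)
    n = sum(observed.values())
    statistic = 0.0
    for letter in ascii_uppercase:
        expected = n * reference_freqs[letter]
        if expected <= 0:
            # A letter that never occurs in the base text makes the
            # statistic undefined (division by zero).
            raise ValueError(f"reference frequency for {letter!r} must be > 0")
        statistic += (observed[letter] - expected) ** 2 / expected
    return statistic
```

A lower statistic indicates that the sample's letter counts sit closer to the base text's frequencies. The check for a zero expected count mirrors the requirement that the base text has a frequency greater than 0 for every letter. Under the usual assumptions, a ''p-value'' would then be read from a chi-squared distribution with 25 degrees of freedom (26 letter categories minus one).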
Upon increasing the sample size, the significance level calculation in Figure 41 shows that only 1 of the 3 code variants now achieved a ''p-value'' higher than the calculated significance level, which caused the average result to fall below the significance level. Our statement therefore had to be modified: overall, the null hypothesis was rejected and the alternative hypothesis accepted, meaning that we cannot confidently say that the language of origin of the Somerton Man code is English for all variants. The null hypothesis was, however, not rejected for the variant containing 4 M's and 2 W's, so English remains the most likely language of origin of the Somerton Man code under the assumptions that it contains 4 M's and 2 W's and is an ''initialism''.

Regardless of whether the null hypothesis is rejected, the similarity between the chi-squared values and ''p-values'' calculated for real 44-letter English samples and for all variants of the Somerton Man code, using the ''Project Gutenberg'' base text, reinforces the notion that the language of origin of the code is indeed English.

Although we were able to find a base text with a frequency greater than 0 for each letter, suitable for reanalysis and for performing ''hypothesis testing'', the chi-squared method used was still not entirely mathematically sound, since the expected number of sample observations for each letter in the code should have been at least 5<ref>Stat Trek. (2015). Chi-Square Goodness of Fit Test [Online]. Available: http://stattrek.com/chi-square-test/goodness-of-fit.aspx?Tutorial=AP.</ref>. This was unavoidable: the Somerton Man code provides only a limited number of letters, so the sample size could not be increased. Since this limitation was constant across all languages, the method could still be used as a means of comparing the 'goodness of fit' of the letters in each language.
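The expected-count limitation can be checked with simple arithmetic: the expected counts across all letters sum to the sample size, so 26 letters with an expected count of at least 5 each would need a sample of at least 26 × 5 = 130 letters, far more than the 44 available. The sketch below illustrates this check under stated assumptions. The function name <code>letters_below_minimum</code> is hypothetical, and the uniform distribution stands in for real base-text frequencies purely for illustration.

```python
from fractions import Fraction
from string import ascii_uppercase


def letters_below_minimum(sample_size, reference_freqs, minimum=5):
    """Return the letters whose expected count falls below the minimum.

    The expected count of a letter is sample_size times its reference
    frequency; reference_freqs maps each letter to that frequency.
    """
    return [
        letter
        for letter, freq in reference_freqs.items()
        if sample_size * freq < minimum
    ]


# Illustrative assumption: a uniform distribution over A-Z, held as exact
# fractions to avoid floating-point rounding at the threshold.
uniform = {letter: Fraction(1, 26) for letter in ascii_uppercase}

# With 44 letters every category has an expected count of 44/26, below 5.
too_small = letters_below_minimum(44, uniform)

# Only at 130 letters does every category reach an expected count of 5.
large_enough = letters_below_minimum(130, uniform)
```

Real letter frequencies are far from uniform, so rarer letters would need an even larger sample than 130 to reach an expected count of 5. With a 44-letter sample, some letters fall below the threshold whichever base text is used.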