Final Report/Thesis 2018
==Task 1: Code Analysis==

=== Aim ===
The aim of this task is to determine whether the mysterious code represents some collective object. The collective objects tested are horse names, Australian beaches and cities, South Australian street names, and the book The Rubaiyat of Omar Khayyam. It is assumed that each letter in the mysterious code is the initial of a word. The Somerton Man had many associations with horses, so particular attention is given to the assumption that the mysterious code represents ‘horse names’. The null hypothesis is ‘The group of letters are horse names’ and the alternative hypothesis is ‘The group of letters are not horse names’.

=== Preparation ===
Before this task began, the details of the whole case were reviewed, including who found the corpse, where it was located and when it was discovered. This gave a better understanding of which types of collective object should be researched; these are the ones listed above. As mentioned before, horse names are a major candidate because the body was found near Morphettville Racecourse, as seen in Figure 6.

[[File:ZFigure6.jpg|thumb|500px|center|Figure 6. Location of Somerton Beach and Morphettville Racecourse]]

It should also be noted that the people who found the body were racehorse jockeys. For these reasons, it was assumed that the mysterious code consists of the initials of horse names from the year 1948.

=== Method ===
Whether the mysterious code represents a collective object is determined by calculating a p-value and applying hypothesis testing. In the case of the horse names, no website directly provided horse names from 1948, so the names were gathered from relevant newspapers and articles instead. This was done using ‘Trove’, a search engine for finding resources in Australia. 
More specifically, it was used to obtain articles and newspapers from 1948. The other collective objects, as mentioned above, were found using South Australian government websites. Extensive cross-checking was carried out to make sure that each list to be used was correct.

Matlab was used as the software tool. Only the initial letter of each collective object was needed, and code was written to extract it. Where a collective object consisted of multiple words, all of its words were included. Any extra punctuation was removed. A demonstration is shown in Figure 7, which illustrates the input and output for the horse names case and also displays the frequency of each letter.

[[File:ZFigure7.jpg|thumb|500px|center|Figure 7. Matlab Results]]

Excel was then used to produce the statistical results from the Matlab output. The frequency of each letter was obtained by dividing its count by the total number of letters, after which a p-value test was performed and a comparison graph was produced.

=== Results ===
Each collective object was compared with the mysterious code by the frequency of each letter. In each graph, the x-axis represents the letters of the alphabet and the y-axis represents the frequency of each letter in the two objects being compared. A p-value test was also completed to verify the results, where a p-value of less than 0.05 indicates that the mysterious code is very unlikely to represent that collective object.

====Horse Names====
The comparison of horse names with the mysterious code is seen in Figure 8.

[[File:ZFigure8.jpg|thumb|500px|center|Figure 8. Comparison of Mysterious Code with Horse Names]]

There was a sample of 69 horse names. The graph shows that, for many letters, the frequencies in the horse names do not correlate with those in the mysterious code. This was supported by the p-value, which was lower than 0.05, meaning the null hypothesis is rejected. 
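The initial-letter extraction and frequency calculation described in the Method can be sketched as follows. This is a minimal illustration in Python rather than the Matlab and Excel workflow the project actually used. The function names and the sample names are assumptions made for this example and do not come from the project's data.

```python
import string
from collections import Counter


def initials(names):
    """Return the upper-case initial of every word in every name."""
    letters = []
    for name in names:
        # Drop punctuation so "St. Kilda" is treated as "St Kilda".
        cleaned = "".join(ch for ch in name if ch.isalpha() or ch.isspace())
        letters.extend(word[0].upper() for word in cleaned.split())
    return letters


def relative_frequency(letters):
    """Map each letter A-Z to its share of the total letter count."""
    counts = Counter(letters)
    total = sum(counts.values())
    if total == 0:
        return {ch: 0.0 for ch in string.ascii_uppercase}
    return {ch: counts[ch] / total for ch in string.ascii_uppercase}


# Hypothetical names, for illustration only (not from the project's data).
sample_names = ["Silver Gum", "Red Dust", "St. Kilda"]
sample_initials = initials(sample_names)
sample_freq = relative_frequency(sample_initials)
```

Applying the same two functions to the letters of the mysterious code would give the second series plotted in each comparison graph.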
====Australian Beaches====
The comparison of Australian beaches with the mysterious code is seen in Figure 9.

[[File:ZFigure9.jpg|thumb|500px|center|Figure 9. Comparison of Mysterious Code with Australian Beach Names]]

There was a sample of 114 beach names. The graph shows that the letter frequencies do correlate with those of the mysterious code. As the results seemed genuine, a hypothesis test was done between these values. It gave a p-value greater than 0.05, which indicates that the mysterious code could be Australian beach names.

====South Australia Street Names====
The comparison of South Australian street names with the mysterious code is seen in Figure 10.

[[File:ZFigure10.jpg|thumb|500px|center|Figure 10. Comparison of Mysterious Code with South Australian Street Names]]

There was a sample of 447 South Australian street names. The graph shows that the letter frequencies are not similar to those of the mysterious code. This was supported by the p-value, which was lower than 0.05.

====Australian Cities====
The comparison of Australian city names with the mysterious code is seen in Figure 11.

[[File:ZFigure11.jpg|thumb|500px|center|Figure 11. Comparison of Mysterious Code with Australian City Names]]

There was a sample of 90 Australian city names. The graph shows that the frequencies of some letters are similar to those of the mysterious code. A hypothesis test was then done to check the results. The p-value obtained was less than 0.05.

====The Rubaiyat of Omar Khayyam book====
The comparison of The Rubaiyat of Omar Khayyam with the mysterious code is seen in Figure 12.

[[File:ZFigure12.jpg|thumb|500px|center|Figure 12. Comparison of Mysterious Code with The Rubaiyat of Omar Khayyam book]]

There was a sample of 852 words from the book. The graph shows that the letter frequencies are not similar to those of the mysterious code. This was supported by the p-value, which was lower than 0.05. 
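The text does not state which p-value test was run in Excel. As one possible reading, the sketch below assumes a chi-square goodness-of-fit statistic and estimates the p-value by simulation, because the code is short and the usual chi-square approximation can be unreliable for small counts. The function names, the number of trials and the toy distribution are all assumptions made for illustration. This is not the project's actual procedure.

```python
import random
from collections import Counter


def chi_square(observed, expected_probs):
    """Pearson chi-square statistic of observed counts against expected shares."""
    n = sum(observed.values())
    # A letter that the reference list never produces makes the fit impossible.
    if any(expected_probs.get(letter, 0) == 0 for letter in observed):
        return float("inf")
    stat = 0.0
    for letter, p in expected_probs.items():
        if p > 0:
            expected = n * p
            stat += (observed.get(letter, 0) - expected) ** 2 / expected
    return stat


def monte_carlo_p_value(code_letters, expected_probs, trials=2000, seed=0):
    """Estimate P(statistic >= observed) if the code were drawn from expected_probs."""
    rng = random.Random(seed)
    letters = [l for l, p in expected_probs.items() if p > 0]
    weights = [expected_probs[l] for l in letters]
    n = len(code_letters)
    observed_stat = chi_square(Counter(code_letters), expected_probs)
    hits = 0
    for _ in range(trials):
        simulated = Counter(rng.choices(letters, weights=weights, k=n))
        if chi_square(simulated, expected_probs) >= observed_stat:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (trials + 1)


# Toy three-letter "alphabet", for illustration only.
toy_probs = {"A": 0.5, "B": 0.3, "C": 0.2}
p_match = monte_carlo_p_value(list("AAAAABBBCC"), toy_probs)
p_mismatch = monte_carlo_p_value(list("CCCCCCCCCC"), toy_probs)
```

Under this reading, a value below 0.05 would lead to rejecting the hypothesis that the code's letters come from the reference list. A value above 0.05 only means the hypothesis cannot be rejected.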
An extension of this task was also carried out, analysing The Rubaiyat of Omar Khayyam more carefully. Projects from previous years concluded that the mysterious code does not correlate with the book. Each paragraph in the book has four lines (see Figure 13), and the mysterious code also has four lines. Still assuming that each letter in the mysterious code is the initial of a word, the two can be compared.

[[File:ZFigure13.jpg|thumb|500px|center|Figure 13. A page from The Rubaiyat of Omar Khayyam]]

The task was to count how many words are in each line of the book and compare the counts with the mysterious code. In the first paragraph in Figure 13 (outlined with a red square), the first line has 9 words, the second line has 7 words, and each of the last two lines has 8 words. Lines 1 to 4 of the mysterious code have 9, 11, 11 and 13 letters respectively. The first paragraph alone already suggests that there may not be a correlation between the mysterious code and the book.

Counting every line by hand would be very time-consuming, so a text file of The Rubaiyat of Omar Khayyam was used in conjunction with Matlab to count the words in each line. Then, using Excel, the word counts were plotted with error bars against the number of letters in each line of the mysterious code. This can be seen in Figure 14.

[[File:ZFigure14.jpg|thumb|500px|center|Figure 14. Error Bars against the book]]

The x-axis represents the position of the line within the paragraph and the y-axis represents the number of words in that line. On line 1, the mysterious code falls within the error bars. On the remaining lines, it falls outside them. This indicates that the mysterious code is not from The Rubaiyat of Omar Khayyam, and supports the findings of previous years that the book is not the source of the mysterious code.

====P-value====
A summary of the p-values is shown in Table 1.

[[File:ZFigure15.jpg|thumb|500px|center|Table 1. 
P-value Results]]

The only collective object with a p-value above 0.05 is Australian beaches, which indicates that the mysterious code could be Australian beach names.

===Conclusion===
Overall, the results show that the mysterious code is unlikely to represent horse names, as the original hypothesis stated. They also show that it is unlikely to be South Australian street names, Australian city names or The Rubaiyat of Omar Khayyam, even after extended analysis of the book. Somewhat surprisingly, it is possible that the mysterious code represents Australian beach names, as the p-value was above 0.05. Future research will be required to obtain more meaningful results.
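The word-count comparison from the Rubaiyat extension can be sketched as follows. The project used Matlab and Excel for this step, so the Python below is only an illustration. It assumes that stanzas in the text file are separated by blank lines and that the error bars span one standard deviation either side of the mean, neither of which is stated in the text. The placeholder text is invented and is not taken from the book. The line lengths 9, 11, 11 and 13 are the ones reported above for the mysterious code.

```python
import re
from statistics import mean, pstdev


def words_per_line(text, lines_per_stanza=4):
    """Collect word counts grouped by line position within each stanza."""
    positions = [[] for _ in range(lines_per_stanza)]
    for stanza in re.split(r"\n\s*\n", text.strip()):
        lines = [ln for ln in stanza.splitlines() if ln.strip()]
        if len(lines) != lines_per_stanza:
            continue  # skip headings or incomplete stanzas
        for i, ln in enumerate(lines):
            positions[i].append(len(ln.split()))
    return positions


def within_error_bars(positions, code_lengths):
    """For each line position, check mean - sd <= code length <= mean + sd."""
    result = []
    for counts, code_len in zip(positions, code_lengths):
        m, s = mean(counts), pstdev(counts)
        result.append(m - s <= code_len <= m + s)
    return result


# Placeholder stanzas (NOT the Rubaiyat); each "w" stands for one word.
placeholder = """w w w w w w w w w
w w w w w w w
w w w w w w w w
w w w w w w w w

w w w w w w w
w w w w w w w
w w w w w w w w
w w w w w w"""

CODE_LINE_LENGTHS = [9, 11, 11, 13]  # letters per line, as reported in the text
counts_by_position = words_per_line(placeholder)
inside = within_error_bars(counts_by_position, CODE_LINE_LENGTHS)
```

Running the same functions over the full text file would give one list of word counts per line position, which is the data a chart like Figure 14 summarises.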