Final Report 2010
==Summary and Conclusions==
The main aim of the project was to create a web crawler and text parsing algorithm, both for this project and for general-purpose use. We believe we were largely successful in this respect: our code can parse a single file or several directories and store the results in a separate text file. The code could be improved by removing some of the limitations stated earlier.

The project attempted to show that the Somerton Man's code belonged to a specific type of English text by testing how frequently initialisms occur in each type; however, our testing found no significant difference in frequency across any of the text types. The text with the greatest absolute number of initialisms was the Revised Standard Version of the Bible, but this is only due to the length of the text. Relative to length, the greatest number of initialisms was found in The Rubaiyat for 3-letter initialisms and in science texts for 4-letter initialisms. Whether this has any significance is currently undetermined.

Perhaps the most promising lead is the lack of results found in The Rubaiyat; it seems very unlikely that such a low frequency would occur unless it was intentional. Although this seems highly suspect, more testing is needed to determine whether it is significant. As stated earlier, testing poetry texts truncated to the same size as The Rubaiyat would be very beneficial.

Due to the sheer number of different initialisms, the pattern initialism results could not be completely analysed, and we were unable to determine what kind of substitution cipher, if any, was used. To further this area of the project, the most frequently occurring initialisms for each pattern need to be analysed and cross-referenced with every other pattern to narrow down the possible substitutions. Due to time constraints, this could not be completed.

Our results suggest that the Somerton Man's code is most likely an initialism taken from The Rubaiyat and involves some type of cipher. Although this cannot be confirmed absolutely, we believe it is currently the best theory surrounding the mystery of the Somerton Man's code.
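
The length-normalised comparison and the truncation test suggested above can be sketched as follows. This is a minimal illustration in Python, not the project's actual code: the function names (<code>first_letters</code>, <code>count_initialism</code>, <code>relative_frequency</code>, <code>truncate_to_words</code>) are hypothetical, and the word-splitting rule (runs of letters, allowing apostrophes) is an assumption that may differ from the parser described in this report.

```python
import re


def first_letters(text):
    """Return the upper-cased first letter of every word in the text.

    Assumption: a "word" is a run of letters that may contain apostrophes.
    """
    return "".join(w[0].upper() for w in re.findall(r"[A-Za-z][A-Za-z']*", text))


def count_initialism(text, initialism):
    """Count positions where consecutive words spell out the initialism."""
    letters = first_letters(text)
    target = initialism.upper()
    if not target:
        return 0
    size = len(target)
    return sum(
        1
        for i in range(len(letters) - size + 1)
        if letters[i:i + size] == target
    )


def relative_frequency(text, initialism):
    """Matches divided by the number of word windows of the same size,
    so that texts of different lengths can be compared."""
    windows = len(first_letters(text)) - len(initialism) + 1
    if windows <= 0 or not initialism:
        return 0.0
    return count_initialism(text, initialism) / windows


def truncate_to_words(text, n_words):
    """Keep only the first n_words whitespace-separated words, e.g. to cut
    a longer text down to the size of a shorter one before comparing."""
    return " ".join(text.split()[:n_words])
```

Dividing by the number of windows rather than reporting the raw count is what stops a long text such as the Bible from dominating purely because of its size; truncating a comparison text to the length of The Rubaiyat serves the same purpose from the other direction.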