Difference between revisions of "Final Report/Thesis 2018"

Revision as of 11:50, 22 October 2018

Abstract

The body of an unidentified man was found at Somerton Beach, Adelaide, South Australia, on 1 December 1948; he has since been referred to as the Somerton Man. To this day, his identity and the cause of his death remain unknown. This project is divided into three tasks, each contributing to the investigation of this unsolved case. The first task centres on the piece of paper that was found inside his trouser pocket. This piece of paper was found to be part of a poetry book, the ‘Rubaiyat of Omar Khayyam’, and five lines of capital letters, thought to be some kind of code or cipher, were found in the back of that book. Based on the work of previous years’ project groups, the letters of the code are thought to be the first letters of words. His body was found near Morphettville Racecourse, which led to the hypothesis that the code represents horse names. However, the statistical tests performed did not support this hypothesis. The second task used a mass spectrometer to analyse the elemental and isotopic signatures of hair samples. Hair shafts obtained from various people were ablated by laser and the elements present were recorded. These results are compared with the Somerton Man’s hair to identify the specific elements present and to estimate how long he was in Adelaide before his death. In the final task, DNA samples were analysed. Using software tools, the samples were degraded until the DNA became unidentifiable. It was found that degrading 50% or more of the DNA sequences makes the DNA unidentifiable.

Acknowledgements

The project team would like to thank the project supervisor, Professor Derek Abbott, for the helpful and motivational advice and the exceptional guidance given on each of the completed tasks; the University of Adelaide Microscopy Centre, for permitting the project team to use the mass spectrometer; and Dr Sarah Gilbert, for the assistance and suggestions provided while the mass spectrometer was in use. Thank you also to all the volunteers who donated hair for the experiment. In addition, the project team would like to acknowledge the software support provided by the University of Adelaide, including Microsoft and Matlab, as well as Iolite from the Microscopy Centre.

Introduction

Motivation

During this project, a study was undertaken on an unsolved murder case. On 1 December 1948, an unknown man was found dead on Somerton Beach in Adelaide [1]. From then on, he was known as ‘the Somerton Man’. He carried no form of identification, and there was little information surrounding his death. The cause of his death is therefore still unknown to this day [2]. Figure 1 shows the deceased man.

Figure 1 The Somerton Man

A piece of paper with the words “Tamám Shud” printed on it was found rolled up inside his trouser pocket, as can be seen in Figure 2. The phrase can be translated from Persian as either “it is ended” or “it is finished”. This piece of paper was found to be part of a poetry book, the ‘Rubaiyat of Omar Khayyam’ [3]. The theme of the book is that one should live life to the full and have no regrets when it ends [4].

Figure 2 The Scrap of Paper

The book is thought to be connected to the dead man because of the scrap of paper, which is why the case is known as the Tamam Shud case. It has been considered, since the early stages of the police investigation, to be “one of Australia's most profound mysteries” [6]. Capital letters were found scribbled on the back cover of the Rubaiyat of Omar Khayyam, as seen in Figure 3. These letters appear to be important to the case, as it is speculated that they may be a form of code or cipher.

Figure 3 The Mysterious Code

The code consists of five lines of capital letters, with a total of fifty letters altogether. The second and fourth lines are very similar in appearance. It is therefore believed that the Somerton Man may have made an error in the encryption when writing the second line, which would explain why it is struck out. It is unclear whether some of the letters are an “M” or a “W” [7]. For tasks referring to the code, it is assumed that the unclear letter is an “M”. His body was found near the Morphettville Racecourse, which leads to the belief that the Somerton Man’s mysterious code is related to horse names. It was also noted that the people who found the body of the Somerton Man were racehorse jockeys [2]. An Inductively Coupled Plasma Mass Spectrometer (ICP-MS) is used in this project to identify the isotopes present in various people’s hair. The main element of interest is strontium, as Adelaide has high levels of strontium in its soil compared with the rest of Australia.

Previous Studies/Related Work

Australian Department of Defence

In 1978, journalist Stuart Littlemore sent a request to the Department of Defence cryptographers to analyse the code. The cryptographers were unable to crack the code and could not provide a satisfactory answer. It was stated that either the code had “insufficient symbols” or a “disturbed mind” had generated a meaningless code [8].

The University of Adelaide project groups

Several Honours project groups at the University of Adelaide have undertaken this project. Their previous work includes:

  • Letter frequency analysis in different languages.
  • Initial letter and sentence letter probabilities.
  • The likelihood of the code being an initialism of a poem.
  • Different cypher techniques.
  • The design and implementation of a web crawler.
  • 3D generated reconstruction bust of the Somerton Man.

The main conclusions that these project groups have reached are:

  • It is unlikely that the mysterious code was created randomly.
  • There is strong evidence that the mysterious code is most likely to be in English.
  • It is unlikely that the mysterious code is an initialism extracted from poems.
  • The Rubaiyat of Omar Khayyam was not used as a straight substitution one-time pad for encryption.
  • The Rubaiyat of Omar Khayyam was not created as a one-time pad for the mysterious code.

With these conclusions in mind, this project will investigate in further detail what the mysterious code is [9] [10] [11] [12] [13] [14] [15].

Mass Spectrometer Previous Work

Project groups in previous years have also carried out studies with the mass spectrometer. The 2013 project group obtained some of the Somerton Man’s hair and plotted the different elements in the hair against control samples. The element readings of the two sets of samples were then compared. This was done using a glass slide, which contains impurities [13]. The 2016 project group repeated the analysis using a quartz slide, which does not contain impurities. They concluded that the Somerton Man’s hair had abnormally high readings for some elements, one of which is strontium [15]. In this project, the strontium level will be examined in more detail and used to indicate how long the Somerton Man was in Adelaide before his death.

Aims and Objectives

The first task is to determine whether the mysterious code represents a collective object (horse names, Adelaide street names, Australian beaches, etc.). This will be done using hypothesis testing. An extension of this task, involving the Rubaiyat of Omar Khayyam, will also be completed. The second task uses a mass spectrometer. Control hair samples will be compared with the Somerton Man’s hair, by identifying the different elements in the hair, to see how long he was in Adelaide before his death. The third task uses DNA data. The data will be degraded using software tools until it becomes unidentifiable. The result can then be used to estimate how much DNA is needed from the Somerton Man for further research to be carried out.

Technical Background

P-value Theorem

A p-value is the probability, calculated under the assumption that the null hypothesis is true, of obtaining a result equal to or more extreme than the one actually observed. In statistics, the p-value helps to determine the significance of the sampled results with respect to a statistical hypothesis: a small p-value means the observed results would be unlikely if the null hypothesis were true, so it measures the strength of the evidence against the null hypothesis [16]. In this project, the main use of p-values is to determine whether the mysterious code represents local horse names. The null hypothesis is ‘The group of letters are horse names’ and the alternative hypothesis is ‘The group of letters are not horse names.’ If the p-value is larger than 0.05, the null hypothesis is not rejected; this indicates that the observed data point lies in the ‘most likely observation’ range, as seen in Figure 4. If the p-value is lower than 0.05, the results are statistically significant and the observed data point lies in the ‘very unlikely observations’ range. In that case the null hypothesis can be rejected, which means that the mysterious code is unlikely to be horse names [17].

Figure 4. Computation of a p-value

In Figure 4, the y-axis is the probability and the x-axis is the set of possible results.

Mass Spectrometer

Mass spectrometry is an analytical technique that can detect unknown compounds within a given sample. The mass spectrometer produces a multitude of ions from the sample, separates them by their mass-to-charge ratio and records the quantity of each ion type [19]. The components of a typical mass spectrometer are shown in Figure 5. The three major components are the ion source, the analyser and the ion detector system.

Figure 5. The Main Components of a Mass Spectrometer [19]

The ion source produces gaseous ions from the sample. The analyser then sorts the ions according to their mass-to-charge ratio. The ion detector system detects the different ions in the sample, records the quantity of each ion type and converts it into an electrical signal [20]. The Inductively Coupled Plasma Mass Spectrometer (ICP-MS) is the type of mass spectrometer that was used in this project. The ICP-MS is faster, more precise and more sensitive at finding different ions than other types of mass spectrometer [21]. In this project, the sample being used is the shaft of the hair.

Single Nucleotide Polymorphism (SNP)

Single Nucleotide Polymorphisms (SNPs) are one of the most common and most widely discussed types of genetic variation between humans [22]. Each variation occurs at a single nucleotide (a single building block of DNA), and SNPs occur roughly once in every 300 base pairs [23]. In this project, SNPs will be removed from a DNA sample.

Task 1: Code Analysis

Aim

The aim of this task is to determine whether the mysterious code represents some collective object. The collective objects that will be used are horse names, Australian beaches and cities, South Australian street names, and the book The Rubaiyat of Omar Khayyam. It is assumed that each letter in the mysterious code is the initial letter of a word. The Somerton Man had many associations with horses, so most of the research will go into the assumption that the mysterious code represents horse names. The null hypothesis is ‘The group of letters are horse names’ and the alternative hypothesis is ‘The group of letters are not horse names.’

Preparation

Before this task was started, the details of the whole case were reviewed. The review covered who found the body, where it was located and when it was discovered. This gave a better understanding of which types of collective object should be researched; these are stated above. As mentioned before, horse names are a major factor because his body was found near Morphettville Racecourse, as seen in Figure 6.

Figure 6. Location of Somerton Beach and Morphettville Racecourse

It should also be noted that the people who found the body were racehorse jockeys. For this reason, it was assumed that the mysterious code consists of the initials of horse names from the year 1948.

Method

Whether the mysterious code represents a collective object is determined by calculating a p-value and applying hypothesis testing. In the case of the horse names, no website directly provided horse names from 1948. The names were therefore gathered from relevant newspapers and articles. This was done using ‘Trove’, a search engine for Australian resources, which in this case was used to obtain articles and newspapers from 1948. The other collective objects, as mentioned above, were found using South Australian government websites. Extensive cross-checking was then carried out to make sure that the list to be used was correct. Matlab was used as the software tool. Only the initial letter of each word in a collective object was needed, and code was written to extract it. Where a collective object had multiple words, the initial of each word was included. Any extra punctuation was removed. A demonstration is shown in Figure 7, which illustrates the input and output for the horse names case and also displays the frequency of each letter.

Figure 7. Matlab Results

Excel was used to produce the statistical results from the Matlab output. The letter frequency was obtained by dividing the count of each letter by the total number of letters; a p-value test was then performed and a comparison graph was produced.
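The initial-letter extraction and frequency calculation described above can be sketched as follows. This is a minimal illustration in Python rather than the Matlab code used in the project; the function names and the example names are hypothetical placeholders, not the project's data.

```python
import string
from collections import Counter

def initial_letters(names):
    """Return the upper-case initial of every word in every name."""
    strip_punct = str.maketrans("", "", string.punctuation)
    initials = []
    for name in names:
        # Remove punctuation first, then take the first letter of each word.
        for word in name.translate(strip_punct).split():
            if word[0].isalpha():
                initials.append(word[0].upper())
    return initials

def letter_frequencies(initials):
    """Relative frequency of each letter A-Z: count divided by total."""
    counts = Counter(initials)
    total = sum(counts.values())
    if total == 0:
        return {letter: 0.0 for letter in string.ascii_uppercase}
    return {letter: counts[letter] / total for letter in string.ascii_uppercase}

# Hypothetical placeholder names, for illustration only.
names = ["Sea Breeze", "Golden Star!", "Thunder"]
initials = initial_letters(names)
frequencies = letter_frequencies(initials)
```

Note that stripping punctuation joins hyphenated words into a single word, so a name such as "Well-Known" contributes one initial in this sketch; whether the original Matlab code behaved the same way is not stated.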

Results

Each collective object was compared with the mysterious code by the frequency of each letter. In the comparison graphs, the x-axis represents the alphabet and the y-axis represents the frequency of each letter in the two objects being compared. A p-value test was also completed to verify the results, where a p-value of less than 0.05 shows that it is very unlikely that the mysterious code represents the collective object.
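The report does not state which test was used in Excel to obtain the p-values. As one possible illustration, the sketch below uses a Monte Carlo goodness-of-fit test with a chi-square statistic: under the null hypothesis the code letters are drawn from the reference letter frequencies, and the p-value is the share of simulated samples at least as extreme as the observed code. The choice of test, the function names and the example frequencies are all assumptions made for illustration.

```python
import random
from collections import Counter

def chi_square_stat(observed_counts, expected_probs, n):
    """Sum of (observed - expected)^2 / expected over letters with expected > 0."""
    stat = 0.0
    for letter, p in expected_probs.items():
        expected = n * p
        if expected > 0:
            stat += (observed_counts.get(letter, 0) - expected) ** 2 / expected
    return stat

def monte_carlo_p_value(code_letters, reference_probs, trials=2000, seed=0):
    """Estimate P(statistic >= observed) when letters follow reference_probs."""
    # A code letter that never occurs in the reference set is impossible under H0.
    if any(reference_probs.get(letter, 0) == 0 for letter in code_letters):
        return 0.0
    rng = random.Random(seed)
    letters = [l for l, p in reference_probs.items() if p > 0]
    weights = [reference_probs[l] for l in letters]
    n = len(code_letters)
    observed = chi_square_stat(Counter(code_letters), reference_probs, n)
    hits = 0
    for _ in range(trials):
        sample = rng.choices(letters, weights=weights, k=n)
        if chi_square_stat(Counter(sample), reference_probs, n) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (trials + 1)

# Hypothetical reference frequencies and code string, for illustration only.
reference = {"A": 0.5, "B": 0.3, "C": 0.2}
p = monte_carlo_p_value("ABACAB", reference)
reject_null = p < 0.05
```

A simulation-based test is used here because the code is short, so the expected count for many letters is small and the usual chi-square approximation may be unreliable.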

Horse Names

The comparison of horse names to the mysterious code is seen in Figure 8.

Figure 8: Comparison of Mysterious Code with Horse Names

There was a sample of 69 horse names. It can be seen on the graph that, for many letters, the horse names do not correlate with the mysterious code. This was also supported by the p-value, which was lower than 0.05, meaning that the null hypothesis is rejected.

Australian Beaches

The comparison of Australian beaches to the mysterious code is seen in Figure 9.

Figure 9: Comparison of Mysterious Code with Australian Beach Names

There was a sample of 114 beach names. From the graph it can be seen that the letter frequencies do correlate with the mysterious code. As the results appeared genuine, a hypothesis test was carried out on these values. The test gave a p-value greater than 0.05, which indicates that the mysterious code could be Australian beach names.

South Australia Street Names

The comparison of South Australian street names to the mysterious code is seen in Figure 10.

Figure 10: Comparison of Mysterious Code with South Australian Street Names

There was a sample of 447 South Australian street names. From the graph it can be seen that the letter frequencies are not similar to those of the mysterious code. This was also supported by the p-value, which was lower than 0.05.

Australian Cities

The comparison of Australian city names to the mysterious code is seen in Figure 11.

Figure 11: Comparison of Mysterious Code with Australian City Names

There was a sample of 90 Australian city names. From the graph it can be seen that the frequencies of some letters are similar to those of the mysterious code. A hypothesis test was then carried out to check the results. The p-value obtained was less than 0.05.

The Rubaiyat of Omar Khayyam book

The comparison of the Rubaiyat of Omar Khayyam book to the mysterious code is seen in Figure 12.

Figure 12: Comparison of Mysterious Code with The Rubaiyat of Omar Khayyam book

There was a sample of 852 words from the book. From the graph it can be seen that the letter frequencies are not similar to those of the mysterious code. This was also supported by the p-value, which was lower than 0.05. An extension of this task was also carried out, in which The Rubaiyat of Omar Khayyam was analysed more closely. Previous years' groups stated that the mysterious code does not correlate with the book. Each paragraph in the book has four lines (see Figure 13), and the mysterious code also has four lines. Still assuming that each letter in the mysterious code is the initial letter of a word, the two can be compared.

Figure 13: A page from The Rubaiyat of Omar Khayyam [25]

The task was to count how many words are in each line of the book and compare the counts with the mysterious code. In the first paragraph in Figure 13 (outlined with a red square), the first line has 9 words, the second line has 7 words and the last two lines have 8 words each. Lines 1 to 4 of the mysterious code have 9, 11, 11 and 13 letters respectively. It can already be seen from the first paragraph that there may not be a correlation between the mysterious code and the book. Counting every line by hand would be very time consuming, so a text file of The Rubaiyat of Omar Khayyam was used together with Matlab to count the words in each line. Then, using Excel, a graph with error bars was plotted against the number of letters in each line of the mysterious code. This can be seen in Figure 14.

Figure 14: Error Bars against the book

The x-axis represents the position of the line within the paragraph and the y-axis represents the number of words in that line. It can be seen that for line 1 the mysterious code lies within the error bars. For the remaining lines it lies outside the error bars. This indicates that the mysterious code is not derived from The Rubaiyat of Omar Khayyam and further supports the previous years' findings that the book is not the source of the mysterious code.
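The word count per line can be sketched as below, again in Python rather than the Matlab and Excel workflow used in the project. The sketch assumes the text file separates four-line paragraphs with blank lines and that the error bars are one standard deviation around the mean; neither detail is stated in the report. The example text is a placeholder built from word counts, not the book itself.

```python
from statistics import mean, stdev

def words_per_line_by_position(text, lines_per_stanza=4):
    """Collect word counts for line 1, line 2, ... across all four-line paragraphs."""
    per_position = [[] for _ in range(lines_per_stanza)]
    for block in text.strip().split("\n\n"):
        lines = [ln for ln in block.splitlines() if ln.strip()]
        if len(lines) != lines_per_stanza:
            continue  # skip headings or incomplete paragraphs
        for i, line in enumerate(lines):
            per_position[i].append(len(line.split()))
    return per_position

def inside_error_bars(per_position, code_counts):
    """True where the code's letter count lies within mean +/- one standard deviation."""
    result = []
    for counts, code_count in zip(per_position, code_counts):
        centre = mean(counts)
        spread = stdev(counts) if len(counts) > 1 else 0.0
        result.append(abs(code_count - centre) <= spread)
    return result

def placeholder_text(stanza_counts):
    """Build dummy text with the given number of words on each line."""
    return "\n\n".join(
        "\n".join(" ".join(["w"] * n) for n in stanza) for stanza in stanza_counts
    )

# First paragraph counts (9, 7, 8, 8) are from the report; the second is hypothetical.
text = placeholder_text([[9, 7, 8, 8], [7, 9, 8, 8]])
per_position = words_per_line_by_position(text)
flags = inside_error_bars(per_position, [9, 11, 11, 13])  # code lines 1 to 4
```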

P-value

A summary of the p-values is shown in Table 1.

Table 1: P-value Results

Collective Object                    P-value
Horse Names                          0.002
Australian Beaches                   0.121
South Australia Street Names         0.005
Australian Cities                    0.007
The Rubaiyat of Omar Khayyam book    0.008

It can be observed that the only collective object with a p-value above 0.05 is Australian beaches. This indicates that the mysterious code could be Australian beaches.
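The decision rule applied to Table 1 can be written out as a short check. The p-values are those reported in the table; the variable and function names are illustrative.

```python
ALPHA = 0.05  # significance level used throughout this task

# P-values as reported in Table 1.
p_values = {
    "Horse Names": 0.002,
    "Australian Beaches": 0.121,
    "South Australia Street Names": 0.005,
    "Australian Cities": 0.007,
    "The Rubaiyat of Omar Khayyam book": 0.008,
}

def not_rejected(p_values, alpha=ALPHA):
    """Collective objects whose null hypothesis is not rejected at level alpha."""
    return [name for name, p in p_values.items() if p >= alpha]

candidates = not_rejected(p_values)
```

Failing to reject a null hypothesis does not confirm it, so a name in this list is only a candidate that the test could not rule out.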

Conclusion

Overall, the results show that it is unlikely that the mysterious code represents horse names, as stated in the hypothesis. They also show that it is unlikely to represent South Australian street names, Australian city names or The Rubaiyat of Omar Khayyam, even after extended analysis of the book. Somewhat surprisingly, it is possible that the mysterious code represents Australian beach names, as the p-value was above 0.05. Future research will be required to obtain more meaningful results.

Task 2: Mass Spectrometer Isotope Data Analysis

Related research

1. Hair elements

Hair analysis can show the levels of heavy metals and essential elements in the human body [19]. The levels of some essential elements in hair, such as calcium (Ca), chromium (Cr), zinc (Zn), copper (Cu) and selenium (Se), may be correlated with nutritional status and disease. High levels of some elements in hair, such as lead (Pb), arsenic (As), mercury (Hg) and cadmium (Cd), can indicate poisoning [19]. This analysis method can provide useful information about the Somerton Man's health and may help to find the cause of death.

2. The relation between living environment and hair elements

The living environment can affect the levels of elements in the human body. This effect is caused not only by polluted water and food, but also by injury and radiation. Trace element levels in hair can reflect behaviour, including diet, smoking and medication. Humans absorb pollutants in different ways, such as ingestion, breathing, injection and radiation. These contaminants are stored in different organs and tissues and can affect human health. Hair is a good indicator of environmental change. Normally, it shows the variation of trace elements in the human body as the hair grows.

Glass test data analysis

Data screening

The hair data was recorded by the mass spectrometer, and the data in this section comes from the glass test. Laser ablation yielded readings for 44 elements (shown in Table 1). Three of these elements deserve particular mention. The first is sulphur (S7), which is always present in human hair. The value of sulphur can be used as a reference for the drift value of the environment. The second is lead (Pb206); a high level of lead is harmful to human health. The third is strontium (Sr88). This element is present at a higher level in Adelaide's soil than in other areas, so analysing the trend in the strontium value will provide useful information about the Somerton Man's living environment before his death.

Table 1 Elements table for glass test

Element classification

Some of the recorded hair data contains toxic and heavy metal elements, which are harmful to human health at high levels. However, most are essential elements for the human body. Table 2 shows the toxic elements and essential elements in hair. The comparison of toxic element content in hair is an important research subject [21].

Table 2 the Different Type Elements of Hair

In the previous study, the hair data analysis was based on the glass test results. However, impurities in the slide material influence the element values, so in this project the data analysis relies mainly on the quartz test rather than the glass test. Compared with the glass plate, the quartz plate has a purer composition. Table 3 shows the main composition of glass and quartz.

Table 3 The different Elements of Glass and Quartz

Element content comparison

In order to speculate about the cause of death and the living environment, the Somerton Man's hair data should be compared with control samples. In this experiment, hairs from six people were selected as control samples. The large volume of hair data was then processed with software to allow a visual comparison.

Excel

First, the Somerton Man's hair data was plotted in Excel (shown in Figure 6). The X axis shows the scan time of the mass spectrometer. The Y axis shows the relative value of element content.

Figure 6 Somerton man’s hair data

However, only 28 elements were shown on the chart. Some data was missing because of the limitations of Excel. In addition, most elements are not clearly visible, with the exception of sodium (Na23). Some traces are out of range or hard to distinguish, such as those for the elements at the bottom of the chart. These problems make comparison difficult. The control samples' hair data was plotted in Excel as well (shown in Appendix A). Unfortunately, these charts all have the same shortcomings as the Somerton Man's data chart.

Matlab

To solve these problems, Matlab was used to plot the content distribution of each element. Comparison figures for all 44 elements have been plotted and are shown in Appendix A. Some of the comparison figures display the results clearly. In others, however, it is difficult to identify a trend. The classification of the element figures is shown in Table 4.

Table 4 Classify elements by comparison results

The quality of the results is considered below. Four elements (Na23, Mg24, Si29 and Ca43) show a similar trend. These four elements were all found to be contained in the glass material in the previous study. It can therefore be assumed that the recorded values of these elements are affected by the glass slide. Elements which are not present in the glass will not be affected. In Table 4, ‘same value’ means that the element content in the Somerton Man and the control samples can be regarded as the same. ‘Different value’ means that there is a significant difference in element content and that the trends are not similar. ‘No tendency’ means that the comparison figure does not show a clear pattern of change and the results cannot be compared. Figure 7 shows the comparison of the relative value of lead (Pb206) between the Somerton Man and the control samples.

Figure 7 Lead (Pb206) Value Comparison

The X axis shows the scan time of the mass spectrometer. The Y axis shows the relative value of lead content. The mass spectrometer scans from the root of the hair, so smaller values on the X axis are closer to the time of death. In Figure 7, the lead content of the control samples is very low, with a slight upward trend over only a short period of time. However, a very high level of lead was found in the Somerton Man's hair compared with the control samples, especially towards the end of his hair. This high lead content is worthy of attention. Closer study of the lead distribution graph clearly shows that the lead value fell visibly before his death. It can therefore be argued that the high level of lead was not the main cause of his death. Instead, the high level of lead may be related to his living environment, and the rapid decline in lead content may represent a recent change of living environment. Some geographic information from 1948 therefore needs to be researched, such as soil analysis, petroleum quality, water quality and nuclear power plant locations. Figure 8 shows the comparison of the relative value of mercury (Hg202) between the Somerton Man and the control samples.

Figure 8 Mercury (Hg202) Value Comparison

The X axis shows the scan time of the mass spectrometer. The Y axis shows the relative value of mercury content. A higher level of mercury was found in the Somerton Man's hair than in the control samples. The value is essentially constant from the root to the end of the hair. Mercury is a highly toxic heavy metal pollutant. It is difficult for an organism to excrete, so it is a serious threat to human health. However, mercury is ubiquitous in nature, and trace amounts exist in plants, animals and food. Normally, the human body can eliminate toxins through excretion and metabolism, keeping the amount at a level that does not affect health. This is the reason mercury is found in human hair. The main sources of mercury pollution are chlor-alkali production, plastics, batteries, electronics and other industrial emissions. The Somerton Man's higher level of mercury may be due to the area in which he lived, so it should not be regarded as the cause of his death. The strontium values of the Somerton Man's hair and the control samples' hair are clearly different. Figure 9 shows the comparison of the relative value of strontium (Sr88) between the Somerton Man and the control samples.

Figure 9 Strontium (Sr88) Value Comparison

The X axis shows the scan time of the mass spectrometer. The Y axis shows the relative value of strontium content. Strontium exists in mineral water. It is an essential element for humans and can prevent hardening of the arteries. At the root of the Somerton Man's hair, the strontium content is significantly high. This means that the area in which the Somerton Man lived before his death contained a high level of strontium. In addition, the strontium value shows an increasing trend over his last two weeks. However, strontium is not a toxic element, and the increasing trend is not the cause of the Somerton Man's death. A reasonable inference is that the Somerton Man changed his living environment two weeks before his death, which agrees with the speculation based on the change in lead content.

Moving-average Filter in Matlab

Some comparison figures cannot be shown clearly by plotting the raw data in Matlab, so a moving-average filter was applied. First, a window size is selected for the filter. Then the arithmetic mean of the data points within the window is calculated and taken as the value at the centre point of the window. The window is then moved along the data and the averaging is repeated until the whole data set has been processed. This method suppresses periodic disturbances well and makes the figure smoother. It is suited to signals with high-frequency oscillation. All of the element content figures have been plotted using the moving-average filter and are shown in Appendix B. After filtering, the comparison figures reveal some new findings: more elements with similar trends are found, and some comparison figures become clearer. The details are shown in Table 5.

Table 5 Classify elements by comparison results
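The moving-average filter described above can be sketched as follows. This Python version uses a centred window that shrinks at the ends of the data, which is one common convention; the window size and the edge handling used in the project's Matlab code are not stated, so both are assumptions here.

```python
def moving_average(values, window=5):
    """Centred moving average; the window is truncated at the data boundaries."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        segment = values[lo:hi]
        # Mean of the points inside the window becomes the value at its centre.
        smoothed.append(sum(segment) / len(segment))
    return smoothed

# Hypothetical relative element values with a single spike, for illustration only.
raw = [0, 0, 9, 0, 0]
smooth = moving_average(raw, window=3)
```

A larger window gives a smoother curve but also flattens genuine short-lived changes, such as a rise in an element over the last few days of hair growth, so the window size should be chosen with the time scale of interest in mind.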

After filtering, As75, Ag107 and Cd111 were newly identified as toxic elements that are present at high levels in the Somerton Man's hair. Their content distributions are shown below.

Figure 10 Cadmium (Cd111) Value Comparison

Cadmium is an element present in nickel-cadmium batteries, and it is a toxic element which can produce poisonous gas when boiled. The average cadmium content in the Somerton Man's hair is much higher than in the control samples, especially at a scan time of 1400 seconds. However, the trend of this element is similar to that of lead, and the cadmium content shows a decreasing trend in his final period. Cadmium is more toxic than lead, so it can be regarded as a clue to the death of the Somerton Man. However, this kind of battery was not available at that time, which indicates that such batteries are not the source of the cadmium. It can also be observed from the figure that the hair of the control samples contains cadmium as well. It can therefore be assumed that this element may come from drinking water, food or other frequently used items.

Figure 11 Silver (Ag107) Value Comparison

Figure 11 shows that the silver content of Somerton man’s hair is higher than that of the control samples. The silver value in Somerton man’s hair does not change noticeably and remains stable overall, so this element should not be used to speculate about Somerton man’s living conditions. In addition, silver is not considered harmful to human health; it can only cause some discolouration of the skin. This element is therefore not investigated further in this project.

Figure12 Arsenic (As75) Value Comparison

The arsenic content in Somerton man’s hair increases during his last two weeks. At the tip of the hair, the arsenic content is similar to that of the control samples, which means that this level is not harmful to humans and that Somerton man’s living area at that time can be assumed to be the same as that of the control samples. However, the arsenic level is significantly high in his last few days, and arsenic can be used to make poison. A change of living area or poisoning should therefore be considered. This comparison figure can be regarded as a valuable clue.

Quartz test data analysis

Data screening

The quartz plate is purer than the glass plate. As a result, some elements recorded in the glass experiment were not present in the quartz test, and some elements were recorded at a different relative atomic mass. The 35 elements recorded in the quartz test are shown in Table 6.

Table 6 Recorded Elements on Quartz Test

Data comparison figures

All of the quartz data was recorded. For comparison, the element data has been classified into four sets:

  • Control sample hairs: The files are named Hair02-Hair08. For the control hairs, two sections of each hair were analysed, each 500 µm in length, giving 0.7 cm of control sample in total. There are seven control persons, with two samples taken from each.
  • Somerton Man: 500 µm sections were analysed, each subsequent transect beginning where the previous one finished, for a total of 96 sections (4.8 cm in total).
  • Standards: The files named NIST612-1 to NIST612-20 (20 transects) are standards. They were run to ensure that the instrument was not drifting too much during the analysis, and they can be used to compensate for the drift if need be, as they were run at intervals during the sample acquisition.
  • Background: Transects of 100 µm length across the quartz; the files are Glue -1 to Glue–4.

For the quartz test data, three comparison figures are plotted for each element. The first compares the four sets of data: control samples, Somerton man, quartz background and standards. The second is an enlarged view of the first comparison figure. The third compares the hair data of the fourteen control samples. Dots of different colours represent the four sets of data: blue dots represent the control samples, red dots represent Somerton man, yellow dots represent the standards and green dots represent the quartz background. Figures 13 to 19 show the comparison of the four sets of data. In all of the figures, the X axis is the scan time of the laser and the Y axis is the relative value of the element in the four sets of data.
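The standards are described above as usable to compensate for instrument drift. One common way to do this, sketched here as an assumption rather than as the procedure used in this project, is to interpolate the standard readings over time and rescale each sample reading by the ratio of the first standard reading to the interpolated one. The function names, the choice of the first standard as reference and the numbers are hypothetical.

```python
def interpolate(x, xs, ys):
    """Linear interpolation of ys over xs at x.

    Assumes xs is strictly increasing; values outside the range are
    clamped to the nearest end point.
    """
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    for i in range(1, len(xs)):
        if x <= xs[i]:
            t = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
            return ys[i - 1] + t * (ys[i] - ys[i - 1])


def drift_correct(sample_times, sample_values, std_times, std_values):
    """Rescale sample readings so the standard appears constant over time.

    The first standard reading is taken as the reference level
    (an assumption made for this sketch).
    """
    reference = std_values[0]
    corrected = []
    for t, value in zip(sample_times, sample_values):
        std_now = interpolate(t, std_times, std_values)
        corrected.append(value * reference / std_now)
    return corrected


# Hypothetical numbers: the standard reads 100 at t=0 and 80 at t=10,
# so a sample measured at t=5 is scaled up by 100/90.
print(drift_correct([5], [45], [0, 10], [100, 80]))
```

If the standards are essentially constant, as the figures in this section suggest, the correction factors stay close to 1 and the raw values can be used directly.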

Figure 13a Silver (Ag107)
Figure 13b Silver (Ag107)
Figure 13c Silver (Ag107)

The comparison of the silver (Ag107) values between the four sets of data is shown in Figure 13. The yellow dots represent the standard values, which are essentially stable and indicate that the instrument drift is not significant. The recorded data is therefore reliable. The green dots represent the silver content in the quartz. As can be seen from the figure, the silver content in the quartz is very small, so this set of data has little effect on the test data. In general, the silver content in Somerton man’s hair is higher than in the control hairs. In the quartz experiment, the relative silver content in the hair is stable, which is similar to the result of the glass experiment.

Figure 14a Carbon (C13)
Figure 14b Carbon (C13)
Figure 14c Carbon (C13)

Figure 15 Sulphur (S34)

Figure 15a Sulphur (S34)
Figure 15b Sulphur (S34)
Figure 15c Sulphur (S34)

As with silver, the standard values for sulphur and carbon are constant, which means the instrument was not drifting too much. The relative values of carbon and sulphur are similar in Somerton man’s hair and in 12 of the control hairs. Only two samples differ significantly, and both come from one particular person. Carbon and sulphur are essential elements in human hair, and a change in the content of these two elements can be a direct indicator of a person’s living environment. Based on the comparisons in Figures 14 and 15, the living environment conditions of Somerton man and six of the control persons are similar, but different from those of this particular person. This supports the inference about Somerton man’s living environment.

Figure 16 Calcium (Ca43)

Figure 17 Potassium (K39)

Figure 18 Strontium (Sr88)

Comparing the relative element values of Somerton man and the control samples, Figures 16, 17 and 18 show some similarities. The relative values of calcium, potassium and strontium in Somerton man’s hair are higher than in most control hairs. However, the content of these three elements in two of the samples is close to that in Somerton man’s hair, and both of these hair samples come from the same person. Based on this result, it can be speculated that the levels of these three elements in the living environments of Somerton man and this particular person are similar.

Figure 19a Lead (Pb206)
Figure 19b Lead (Pb206)
Figure 19c Lead (Pb206)

As mentioned in the previous chapter, lead is a toxic element, and the data obtained from the glass plate experiment show that the lead content in Somerton man’s hair is significantly higher than in the control hairs. This is also observed in the quartz plate test. Because the hair used in the quartz plate test was incomplete, the trend of the lead content in the period before Somerton man’s death cannot be determined. However, the lead content in Somerton man’s hair is indeed higher than in the control hairs. This difference does not indicate that Somerton man’s death was caused by a high level of lead in the body. It may instead reflect the fact that the lead content of the human living environment today is different from that of 60 years ago.

Figure 20a Comparison of the relative values of some elements in control hairs
Figure 20b Comparison of the relative values of some elements in control hairs

For the control hairs, two sections of each hair were analysed. There are fourteen control hairs, which come from seven persons. Comparing these 14 samples shows that the elemental content differs between samples, especially for the first control person’s hair. Figure 20 shows that the relative values for the first sample are significantly higher than for the other samples. The elements concerned are Mg, Ca, K, Mn, Na, Li and Sr. In addition, the analysis and comparison of the previous graphs show that the elemental contents of this sample are similar to those of Somerton man. From this result it can be inferred that this control person and Somerton man have similarities in living environment or lifestyle.

Two data comparison

Two tests data comparison table

In the glass plate test and the quartz plate test, 31 elements were recorded in both. These are shown in Table 7.

Table 7 the Same Elements in Two Data Sets

Comparison of Somerton man’s two hairs

In the last two chapters, the differences between Somerton man and the control samples were compared and analysed for each of the two experiments. Next, the results of the two experiments on Somerton man’s two hairs are compared with each other. The aim is to find data in the two experiments that follow the same trend but have different values, and then to find a constant by which one set of test data can be multiplied to obtain the other.
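The search for a constant that maps one test onto the other can be sketched as a least-squares fit. This is an illustrative assumption about how such a constant might be estimated, not the method used in the report: the scale `k` minimising the squared error between `k * a` and `b` is the sum of `a*b` divided by the sum of `a*a`, and the Pearson correlation gives a quick check of whether the two series share a trend at all. The function names and the numbers are hypothetical.

```python
import math


def best_scale(a, b):
    """Least-squares constant k such that k * a approximates b."""
    denom = sum(x * x for x in a)
    if denom == 0:
        raise ValueError("first series is all zeros")
    return sum(x * y for x, y in zip(a, b)) / denom


def pearson(a, b):
    """Pearson correlation; values near 1 mean the trends agree."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical relative values for one element in the two tests.
glass = [1.0, 2.0, 3.0]
quartz = [2.0, 4.0, 6.0]
print(best_scale(glass, quartz), pearson(glass, quartz))
```

A single constant is only meaningful when the correlation is high. When the two series trend differently, any fitted constant is an artefact of the fit.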

Figure 21 (a) Silicon (Si) value comparison in quartz test

Figure 21 (b) Comparison of the relative values of Silicon (Si) in Somerton man hair

Figure 21 (a) is a comparison of the silicon data obtained in the quartz plate experiment. The silicon content in the standards decreases, and the silicon content in the hair data decreases accordingly. This shows that the test environment influences the test results. Figure 21 (b) shows that the two hairs of Somerton man differ significantly in the relative value of silicon, and that the trends of the two sets of data are also significantly different. A constant therefore cannot be found from Figure 21. Figures 22 to 25 compare other elements in the two Somerton man hairs. It is hard to find a common trend in these comparisons either, so another suitable way of comparing the two sets of data is needed.

Figure 22 Comparison of the relative values of Arsenic (As75) in Somerton man hair

Figure 23 Comparison of the relative values of Strontium (Sr88) in Somerton man hair

Figure 24 Comparison of the relative values of Lead (Pb206) in Somerton man hair

Figure 25 Comparison of the relative values of Sulphur (S34) in Somerton man hair

Project Management - Planning and Feasibility

Timeline

The timeline Gantt chart is attached in Appendix E. It lists the key milestones of the project (shown in Table 7).

Table 7a Key Milestones of the Project(a)
Table 7b Key Milestones of the Project(b)

Work breakdown

The key task of the project is plotting and comparing the data figures, and then finding clues from the comparison. Because the two group members have different study directions, the project work was carried out individually.

Task allocation

The project has two directions, and the two group members worked on different tasks. This report is devoted to analysing the mass spectrometer data of the Somerton man hairs. The other group member worked on cracking the code and writing software.

Management Strategy

To ensure that the project was finished on time, the following strategies were applied:

  • Meeting with supervisors regularly.
  • Receiving effective feedback from supervisors.
  • Finishing tasks without delay.
  • Communicating with the other group member frequently.

Budget

No hardware was required in this project. All the software used in this project was either provided by the University of Adelaide or open source. As a result, the $500 budget was not used at all.

Risk Management

The risks of the project are listed in Table 8. The first risk to consider is misunderstanding the project tasks, so good communication with supervisors is essential. For this report, the crucial task is plotting figures in Matlab, so Matlab being unavailable would affect the progress of the project. However, Matlab is available on the university computers, so this risk is negligible.

Table 8 Project Risks

Future work

Code cracking

For the code cracking part, the following tasks are expected to be completed next:

  1. Extend the two tests to other languages that have not yet been tested.
  2. Perform more n-gram tests using the Simhash algorithm.
  3. Increase the number of tests in the 50-letter group Levenshtein test to see whether the randomness can be avoided.
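The Levenshtein test mentioned in the list above rests on the edit distance between two strings. The sketch below is the textbook dynamic-programming form of that distance, given here in Python for illustration; the project implementation was in Java and is not reproduced here, and the function name and sample strings are assumptions.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn string a into string b."""
    # prev[j] holds the distance between the processed prefix of a
    # and the first j characters of b.
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, 1):
        curr = [i]
        for j, ch_b in enumerate(b, 1):
            cost = 0 if ch_a == ch_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


print(levenshtein("kitten", "sitting"))  # prints 3
```

For letter groups of equal length, such as the 50-letter groups mentioned above, a smaller distance means the two groups are more alike, and repeating the test over more groups reduces the influence of chance matches.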

Hair data analysis

For the mass spectral analysis part, the element data of Somerton man’s hair obtained from the glass test and the quartz test has been plotted. The future work is as follows:

  1. Re-analyse the uncommon element values in Somerton man’s hair.
  2. Find clues to Somerton man’s living environment.
  3. Find the parts with the same trend in the comparison figures of the two Somerton man hairs.
  4. Obtain the constant by which one set of test data can be multiplied to give the other.

Conclusions

Code cracking

In this project, two new algorithms were researched and introduced to check the similarity between the mysterious code and texts in different languages. The algorithms were implemented in Java. Text materials for each test were gathered and arranged into suitable formats, and an individual test was designed and implemented for each algorithm. After analysing the test results, the conclusion was drawn that the code is an initialism of English text.
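As a rough illustration of how a Simhash-style similarity check over letter n-grams can work, the sketch below builds a 64-bit fingerprint from the n-grams of a string and compares fingerprints by Hamming distance. It is written in Python rather than the Java used in the project, and the use of MD5 as the feature hash, the equal weighting of n-grams and all names are assumptions made for this sketch.

```python
import hashlib

BITS = 64  # fingerprint width used in this sketch


def ngrams(text, n):
    """All contiguous substrings of length n."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def simhash(features):
    """Simhash fingerprint: each feature votes +1 or -1 on every bit."""
    counts = [0] * BITS
    for feature in features:
        digest = hashlib.md5(feature.encode("utf-8")).digest()
        value = int.from_bytes(digest[:8], "big")  # 64-bit feature hash
        for bit in range(BITS):
            if (value >> bit) & 1:
                counts[bit] += 1
            else:
                counts[bit] -= 1
    fingerprint = 0
    for bit in range(BITS):
        if counts[bit] > 0:
            fingerprint |= 1 << bit
    return fingerprint


def hamming(a, b):
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


# Hypothetical letter strings, not the Somerton Man code.
first = simhash(ngrams("ABCDEFGHIJ", 2))
second = simhash(ngrams("ABCDEFGHIK", 2))
print(hamming(first, second))
```

Strings that share most of their n-grams tend to produce fingerprints with a small Hamming distance, which is what makes the fingerprint usable as a similarity measure.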

Hair data analysis

In this project, three important comparisons of experimental data were made: the glass test, the quartz test, and the comparison between the glass and quartz test results.

In the glass plate test, the relative content of lead, mercury, arsenic and cadmium in Somerton man’s hair is significantly higher than in the control samples during certain periods of time. However, the content of these elements had fallen to a safe range before his death. The high content of these elements therefore cannot be regarded as the main cause of Somerton man’s death, only as an indication of a change in living environment.

In the quartz test, the total number of recorded elements was reduced from 44 in the glass test to 35, owing to the high purity of the quartz plate, so some elements from the glass test are excluded. The comparison shows that the carbon and sulphur content in Somerton man’s hair differs greatly from that of the fourth sample, while the potassium, strontium and calcium content is similar to that of the first sample. It can therefore be assumed that Somerton man’s living area was different from that of the fourth sample but similar to that of the first sample.

The data comparison between the glass and quartz tests did not achieve the desired results, so the experimental method may need to change in future work.

Appendices

  • Appendix A: Glass test figures

File:Glass Test.pdf

  • Appendix B: After Moving-average Filters figures

File:After Moving-average.pdf

  • Appendix C: Quartz data figures

File:Quartz Test.pdf

  • Appendix D: Two test comparison figures

File:Two test comparison.pdf

  • Appendix E: The Timeline Gantt Chart


The Timeline Gantt Chart

References

[1]. L. Griffith and P. Varsos. (2013). Semester B Final Report 2013 – Cipher Cracking . Available: http://www.adelaidenow.com.au/news/south-australia/somerton-man-mystery-new-details-revealed-of-jo-thomson-nurse-in-the-case/news-story/4c6bccbd2318584ad0cc6daaf3d8abd4

[2]. Renato Castello, “New twist in Somerton Man mystery as fresh claims emerge,” Sunday Mail SA, November 23rd, 2013. Access via Internet: http://www.adelaidenow.com.au/news/south-australia/new-twist-in-somerton-man-mystery-as-fresh-claims-emerge/story-fni6uo1m-1226766905157

[3]. Lynton Grace, “Somerton Man mystery: New details revealed of Jo Thomson, nurse in the case”, The Advertiser, 29th May 2015. Access via Internet: http://www.adelaidenow.com.au/news/south-australia/somerton-man-mystery-new-details-revealed-of-jo-thomson-nurse-in-the-case/news-story/4c6bccbd2318584ad0cc6daaf3d8abd4

[4]. From Wikipedia, the Tamam Shud Case. Access via Internet: https://en.wikipedia.org/wiki/Tamam_Shud_case

[5]. From Internet: http://ciphermysteries.com/wp-content/uploads/sites/6/2014/01/SomertonManCode-wikipedia.jpg

[6]. Inside Story, presented by Stuart Littlemore, ABC TV, screened at 8 pm, Thursday, August 24th, 1978.

[7]. A. Turnbull and D. Bihari. (2009). Final Report 2009: Who killed the Somerton man? Available: https://www.eleceng.adelaide.edu.au/personal/dabbott/wiki/index.php/Final_report_2009:_Who_killed_the_Somerton_man%3F

[8]. K. Ramirez and L-V. Michael. (2010). Final Report 2010 . Available: https://www.eleceng.adelaide.edu.au/personal/dabbott/wiki/index.php/Final_Report_2010

[9]. S. Maxwell and P. Johnson. (2011). Final Report 2011 . Available: https://www.eleceng.adelaide.edu.au/personal/dabbott/wiki/index.php/Final_Report_2011

[10]. A. Duffy and T. Stratfold. (2012). Final Report 2012 . Available: https://www.eleceng.adelaide.edu.au/personal/dabbott/wiki/index.php/Final_Report_2012

[11]. L. Griffith and P. Varsos. (2013). Semester B Final Report 2013 – Cipher Cracking . Available: https://www.eleceng.adelaide.edu.au/personal/dabbott/wiki/index.php/Semester_B_Final_Report_2013_-_Cipher_cracking

[12]. N. Gencarelli and J-K. Yang. (2015). Semester B Final Report 2015 – Cipher Cracking . Available: https://www.eleceng.adelaide.edu.au/personal/dabbott/wiki/index.php/Final_Report/Thesis_2015

[13]. MS. Charikar. (2002). Similarity estimation techniques from rounding algorithms. Available: https://www.cs.princeton.edu/courses/archive/spr04/cos598B/bib/CharikarEstim.pdf

[14]. G.S. Manku, A. Jain and A. Das Sarma. (2007). Detecting Near-Duplicates for Web Crawling. Conference on World Wide Web. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.78.7794&rep=rep1&type=pdf

[15]. From Wikipedia, the Vector Space Model. Access via Internet: https://en.wikipedia.org/wiki/Vector_space_model#/media/File:Vector_space_model.jpg

[16]. From Internet: http://static.oschina.net/uploads/img/201308/30125158_L1CI.jpg

[17] Adelaide Microscopy, ‘instrumentation’ viewed on 23 March 2016 <https://www.adelaide.edu.au/microscopy/instrumentation/icpms.html>

[18] ‘Australia can solve one of the world’s most intriguing mysteries by exhuming the body of The Somerton Man’ 2015, in NEWS.com.au, viewed on 15 March 2016. <http://www.news.com.au/national/crime/ >

[19] Batool, AI, Rehman, FU, Naveed, NH, Shaheen, A and Irfan, S 2010, ‘Hairs as biomonitors of hazardous metals present in a work environment’ in Full Length Research Paper, vol. 10, no. 18, pp. 3602-3607.

[20] ‘Dead Man Found Lying on Somerton Beach’ 1948, in The News, vol. 51, no. 7902, pp. 1.

[21] Miekeley, N, Dias Carneiro, MTW, and Porto da Silveira, CL 1998, ‘How reliable are human hair reference intervals for trace elements?’ in the Science of the Total Environment, vol. 218, no. 1998, pp. 9-17.

[22] School of Electrical and Electronic Engineering ‘Final Year Project Proposal’, 2014

[23] ‘Somerton man’ viewed on 23 March 2016 <https://en.wikipedia.org/wiki/Taman_Shud_Case>

[24] ‘Why The Somerton Man Endures As One Of Australia's Most Fascinating Cold Cases’ 2016, in GIZMODO, viewed on 15 March 2016. <http://www.gizmodo.com.au>

Glossary and Symbols

  • ICP-MS: Inductively Coupled Plasma Mass Spectrometer