Difference between revisions of "Cipher Cracking 2009"
(→Weekly progress and questions) |
(→References and useful resources) |
||
(73 intermediate revisions by 4 users not shown) | |||
Line 6: | Line 6: | ||
*[[User:Andrew Turnbull|Andrew Turnbull]] | *[[User:Andrew Turnbull|Andrew Turnbull]] | ||
*[[User:Denley Bihari|Denley Bihari]] | *[[User:Denley Bihari|Denley Bihari]] | ||
+ | |||
+ | ==Project guidelines== | ||
+ | *[http://www.eleceng.adelaide.edu.au/personal/dabbott/project_handbook_2009.pdf Project Handbook] | ||
+ | |||
+ | ==Project description== | ||
+ | In this project you will attempt to solve a murder that took place in Adelaide in 1948. This crime remains unsolved till today, but you can use engineering to bring our knowledge closer to the killer. You can read the details about the dead body and the circumstances [http://en.wikipedia.org/wiki/Taman_Shud_Case] | ||
+ | |||
+ | Associated with the dead body was this secret code: | ||
+ | |||
+ | :MRGOABABD | ||
+ | :MTBIMPANETP | ||
+ | :MLIABOAIAQC | ||
+ | :ITTMTSAMSTGAB | ||
+ | |||
+ | (See the original photograph, as there may be an extra line, and some of the M's may be W's. Some people also think that the last "I" is really a "V". Also the last G is probably really a C). To this day code crackers have been unable to decrypt it. | ||
== Weekly progress and questions == | == Weekly progress and questions == | ||
Line 19: | Line 34: | ||
*The grave site is at West Tce Cemetery: grave site number 106 on row 12 of an area of the cemetery called "Plan 3." To find it, the easiest way is to go to a road called "Road 5 East" and then walk about 40 paces along row 12. | *The grave site is at West Tce Cemetery: grave site number 106 on row 12 of an area of the cemetery called "Plan 3." To find it, the easiest way is to go to a road called "Road 5 East" and then walk about 40 paces along row 12. | ||
+ | *The ABC TV documentary show called ''Inside Story'' screened an episode of "The Somerton Beach Mystery" at 8pm, Thursday, August 24th, 1978 (Sydney time). Need to watch a DVD of it. | ||
− | == | + | ==Approach and methodology== |
You have an advantage that as engineers you know more about information theory and statistics than the average policeman or code breaking expert. You will take a structured approach to writing software code to use a process of elimination to say whether particular coding schemes were used or not. | You have an advantage that as engineers you know more about information theory and statistics than the average policeman or code breaking expert. You will take a structured approach to writing software code to use a process of elimination to say whether particular coding schemes were used or not. | ||
Start with the [http://en.wikipedia.org/wiki/Playfair_cipher Playfair cipher] and the [http://en.wikipedia.org/wiki/Vigen%C3%A8re_cipher Vigenère cipher] to begin with and you should find that you can easily test the above sequence of letters to prove the Vigenère cipher was definitely not used. Then you can go onto exploring [http://en.wikipedia.org/wiki/Category:Classical_ciphers other encryption schemes] | Start with the [http://en.wikipedia.org/wiki/Playfair_cipher Playfair cipher] and the [http://en.wikipedia.org/wiki/Vigen%C3%A8re_cipher Vigenère cipher] to begin with and you should find that you can easily test the above sequence of letters to prove the Vigenère cipher was definitely not used. Then you can go onto exploring [http://en.wikipedia.org/wiki/Category:Classical_ciphers other encryption schemes] | ||
+ | :'''Note''' from Matthew: If you include the extra line, I'm not so sure you can prove it's not the Vigenère cipher. Also, given the date of the murder, and the dates of invention of some ciphers, there are some you could reasonably rule out (e.g. I doubt it's RSA for historical and technical reasons), however you can still implement them and try them out :). If you dig into some of the historical documents on the case you may find [http://xkcd.com/538/ clues to possible decryption keys]. | ||
We would also like you to perform simple statistical tests to show if English was the most likely language or not in the original message. Also you should be able to prove if the code is the beginning letter of a sequence of words or is composed of whole words. | We would also like you to perform simple statistical tests to show if English was the most likely language or not in the original message. Also you should be able to prove if the code is the beginning letter of a sequence of words or is composed of whole words. | ||
A list of letter frequency rankings for different languages can be found [http://www.bckelk.ukfsn.org/words/etaoin.html here]. | A list of letter frequency rankings for different languages can be found [http://www.bckelk.ukfsn.org/words/etaoin.html here]. | ||
− | Then if you have time and if you are excited to take this project to a higher | + | Then if you have time and if you are excited to take this project to a higher level you can start to check out the work of the great electrical engineer Claude Shannon and apply his techniques from information theory. You can measure the information content in the message in terms of bits for starters. |
==Possible extension== | ==Possible extension== | ||
Line 51: | Line 52: | ||
== Expectations == | == Expectations == | ||
− | We don't really expect you to find the killer, though that would be cool if you do and you'll become very famous overnight. To get good marks we expect you to show a logical approach to | + | We don't really expect you to find the killer, though that would be cool if you do and you'll become very famous overnight. To get good marks we expect you to show a logical approach to decisively eliminating which coding schemes were definitely not used. In your conclusion, you need to come up with a short list of likely possibilities and a list of things you can definitely eliminate that the code is not. |
== Relationship to possible career path== | == Relationship to possible career path== | ||
− | Whilst the project is fascinating as you'll learn about a specific murder case—and we do want you to have a lot of fun with it—the project does have a hard-core serious engineering side. It will | + | Whilst the project is fascinating as you'll learn about a specific murder case—and we do want you to have a lot of fun with it—the project does have a hard-core serious engineering side. It will familiarize you with techniques in information theory, probability, statistics, encryption, decryption, and datamining. It will also improve your software skills. The project will also involve writing software code that trawls for patterns on the world wide web (exploiting it as a huge database). This will force you to learn about search engines and databases; and the new tools you develop may lead to new IP in the area of datamining and also make you rich/famous. The types of jobs out there where these skills are useful are in computer security, comms, or in digital forensics. The types of industries that will need you are: the software industry, e-finance industry, e-security, IT industry, Google, telecoms industry, [http://www.asio.gov.au/ ASIO], [http://www.asis.gov.au/ ASIS], defence industry (e.g. [http://www.dsd.gov.au/ DSD]), etc. So go ahead and have fun with this, but keep your eye on the bigger engineering picture and try to build up an appreciation of why these techniques are useful to our industry. Now go find that killer...this message will self-destruct in five seconds :-) |
+ | |||
+ | ==See also== | ||
+ | * [[Critical design review 2009: Who killed the Somerton man?]] | ||
+ | * [[Final report 2009: Who killed the Somerton man?]] | ||
+ | * [[Timeline of the Taman Shud Case]] | ||
+ | * [[List of people connected to the Taman Shud Case]] | ||
+ | * [[List of facts on the Taman Shud Case that are often misreported]] | ||
+ | * [[List of facts we do know about the Somerton Man]] | ||
+ | * [[The Taman Shud Case Coronial Inquest]] | ||
+ | * [[Letter frequency plots]] | ||
+ | * [[Structural Features of the Code]] | ||
+ | * [[Markov models]] | ||
+ | * [[Primary source material on the Taman Shud Case]] | ||
+ | * [[Secondary source material on the Taman Shud Case]] | ||
+ | * [[Transition probabilities from selected texts]] | ||
+ | * [[Listed poems from The Rubaiyat of Omar Khayyam]] | ||
+ | * [[Using the Rubaiyat of Omar Khayyam as a one-time pad]] | ||
+ | * [[Using the King James Bible as a one-time pad]] | ||
+ | * [[Using the Revised Standard Edition Bible as a one-time pad]] | ||
+ | * [[Transitions within words]] | ||
== References and useful resources== | == References and useful resources== | ||
+ | If you find any useful external links, list them here: | ||
+ | * [http://en.wikipedia.org/wiki/Taman_Shud_Case The taman shud case] | ||
* [http://www.library.cornell.edu/colldev/mideast/okhym.htm Edward Fitzgerald's translation of رباعیات عمر خیام by عمر خیام] | * [http://www.library.cornell.edu/colldev/mideast/okhym.htm Edward Fitzgerald's translation of رباعیات عمر خیام by عمر خیام] | ||
* [http://ebooks.adelaide.edu.au/ Adelaide Uni Library e-book collection] | * [http://ebooks.adelaide.edu.au/ Adelaide Uni Library e-book collection] | ||
* [http://www.gutenberg.org/wiki/Main_Page Project Gutenburg e-books] | * [http://www.gutenberg.org/wiki/Main_Page Project Gutenburg e-books] | ||
+ | * [http://onlinebooks.library.upenn.edu/archives.html#foreign Foreign language e-books] | ||
+ | * [http://www.ohchr.org/EN/UDHR/Pages/Introduction.aspx UN Declaration of Human Rights - different languages] | ||
+ | * [http://www.eleceng.adelaide.edu.au/personal/dabbott/tamanshud/SSC_mckay1999.pdf Statistical debunking of the 'Bible code'] | ||
+ | * [http://www.austlii.edu.au/cgi-bin/sinodisp/au/journals/UNSWLJ/2004/23.html#Heading26 Spys in Australia 1948] | ||
+ | * [http://www.eleceng.adelaide.edu.au/personal/dabbott/tamanshud/fitzgerald_editions.pdf The differences between the 5 FitzGerald editions] | ||
+ | * [http://books.google.com.au/books?id=Pi80__UBKmEC&pg ASIO Unofficial History by Frank Cain] | ||
+ | * [http://math.ucsd.edu/~crypto/Projects/KyleCai/spiesandcrypto.htm Operation Venona] | ||
+ | * [http://www.nsa.gov/applications/search/index.cfm?q=venona Venona documents] | ||
+ | * [https://www.cia.gov/library/center-for-the-study-of-intelligence/csi-publications/books-and-monographs/venona-soviet-espionage-and-the-american-response-1939-1957/preface.htm Venona code] | ||
+ | * [http://permanent.access.gpo.gov/lps33230/www.nsa.gov/docs/venona/monographs/monograph-1.html Venona history] | ||
+ | * [http://naa12.naa.gov.au/scripts/imagine.asp?B=30091097&I=1&SE=1 Jacqueline Templeton, ''Royal Commission on Intelligence and Security Seventh Report - Australian Intelligence/Security Services 1900-1950,'' '''Volume 2''', 1977] | ||
+ | * [http://www.theage.com.au/articles/2005/01/29/1106850156146.html The Kaiser affair] | ||
+ | * [http://enc.slider.com/Enc/OneTimePads One time pads] | ||
+ | * [http://www.fbi.gov/hq/lab/fsc/backissu/jan2000/olson.htm Analysis of criminal codes and ciphers] | ||
+ | * [http://www.fbi.gov/hq/lab/fsc/backissu/april2006/research/2006_04_research01.htm Code breaking in law enforcement: A 400-year history] | ||
+ | * [http://portal.acm.org/citation.cfm?id=1389095.1389425 Evolutionary algorithm for decryption of monoalphabetic homophonic substitution ciphers encoded as constraint satisfaction problems] | ||
==Back== | ==Back== |
Latest revision as of 15:32, 13 January 2010
Supervisors
Honours students
Project guidelines
Project description
In this project you will attempt to solve a murder that took place in Adelaide in 1948. The crime remains unsolved to this day, but you can use engineering to bring us closer to identifying the killer. You can read the details about the body and the circumstances at [1].
Associated with the dead body was this secret code:
- MRGOABABD
- MTBIMPANETP
- MLIABOAIAQC
- ITTMTSAMSTGAB
(See the original photograph: there may be an extra line, and some of the M's may be W's. Some people also think that the last "I" is really a "V", and the last G is probably really a C.) To this day, code crackers have been unable to decrypt it.
Weekly progress and questions
This is where you record your progress and ask questions. Make sure you update this every week.
Useful notes
As useful bits of information come to light, just list them here:
- It was Edward FitzGerald's loose translation of the Rubaiyat (including his own additions and recombination of poems) that was found on the body. Call number is 891.5 O5.B in the BSL.
- The grave site is at West Tce Cemetery: grave site number 106 on row 12 of an area of the cemetery called "Plan 3." To find it, the easiest way is to go to a road called "Road 5 East" and then walk about 40 paces along row 12.
- The ABC TV documentary show called Inside Story screened an episode of "The Somerton Beach Mystery" at 8pm, Thursday, August 24th, 1978 (Sydney time). Need to watch a DVD of it.
Approach and methodology
As engineers, you have an advantage: you know more about information theory and statistics than the average policeman or code-breaking expert. You will take a structured approach, writing software that uses a process of elimination to determine whether or not particular coding schemes were used.
Start with the Playfair cipher and the Vigenère cipher. You should find that you can easily test the above sequence of letters to prove the Vigenère cipher was definitely not used. Then you can go on to explore other encryption schemes.
- Note from Matthew: If you include the extra line, I'm not so sure you can prove it's not the Vigenère cipher. Also, given the date of the murder and the dates of invention of some ciphers, there are some you could reasonably rule out (e.g. I doubt it's RSA, for historical and technical reasons); however, you can still implement them and try them out :). If you dig into some of the historical documents on the case you may find clues to possible decryption keys.
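One hedged way to start on the Vigenère question is the index of coincidence (IoC): the probability that two letters drawn at random from the text are the same. A polyalphabetic cipher such as Vigenère with a long key tends to flatten the letter distribution and push the IoC towards the uniform value of 1/26 ≈ 0.038, whereas plain or monoalphabetically substituted English text sits near 0.067. The sketch below is only an illustration, not a method prescribed for this project: the function name `index_of_coincidence` and the constant `CODE` are assumptions, the four-line transcription ignores the possible extra line and the M/W ambiguities, and 44 letters is a very small sample, so the result is suggestive rather than a proof.

```python
from collections import Counter

# Four-line transcription from this page.
# Assumption: no extra line, and every ambiguous letter is read as printed above.
CODE = "MRGOABABD MTBIMPANETP MLIABOAIAQC ITTMTSAMSTGAB"


def index_of_coincidence(text):
    """Probability that two letters picked at random (without replacement) match."""
    letters = [ch for ch in text.upper() if "A" <= ch <= "Z"]
    n = len(letters)
    if n < 2:
        raise ValueError("need at least two letters")
    counts = Counter(letters)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


if __name__ == "__main__":
    print(f"IoC of the code:        {index_of_coincidence(CODE):.4f}")
    print(f"Uniform random letters: {1 / 26:.4f}")
```

A value well above 1/26 would argue against a long-key Vigenère cipher, but a short key or the extra line could change the picture, which is the caveat in Matthew's note.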
We would also like you to perform simple statistical tests to show whether or not English was the most likely language of the original message. You should also be able to show whether the code consists of the initial letters of a sequence of words or is composed of whole words. A list of letter frequency rankings for different languages can be found here.
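The two tests described above can be sketched as a chi-squared goodness-of-fit comparison: count the letters of the code, then measure how far those counts are from (a) the overall letter frequencies and (b) the word-initial letter frequencies of a reference text in the candidate language. This is a minimal sketch under stated assumptions, not a definitive implementation: the names `letter_counts`, `initial_counts` and `chi_squared` are hypothetical, add-one smoothing is one arbitrary choice among several, and `SAMPLE` is a placeholder sentence that must be replaced by a large corpus for each language before any conclusion is drawn.

```python
import re
import string
from collections import Counter

ALPHABET = string.ascii_uppercase

# Assumption: four-line transcription from this page, ambiguous letters as printed.
CODE = "MRGOABABD MTBIMPANETP MLIABOAIAQC ITTMTSAMSTGAB"

# Placeholder only: substitute a large text in the language being tested.
SAMPLE = "the quick brown fox jumps over the lazy dog"


def letter_counts(text):
    """Count every A-Z letter in the text, ignoring case."""
    return Counter(ch for ch in text.upper() if ch in ALPHABET)


def initial_counts(text):
    """Count only the first letter of each word."""
    return Counter(word[0].upper() for word in re.findall(r"[A-Za-z]+", text))


def chi_squared(observed, reference):
    """Chi-squared distance between observed counts and a reference profile.

    Both arguments are Counters. Add-one smoothing keeps the expected
    count non-zero for letters that never occur in the reference.
    """
    n_obs = sum(observed.values())
    n_ref = sum(reference.values()) + len(ALPHABET)
    stat = 0.0
    for ch in ALPHABET:
        expected = n_obs * (reference[ch] + 1) / n_ref
        stat += (observed[ch] - expected) ** 2 / expected
    return stat


if __name__ == "__main__":
    code = letter_counts(CODE)
    print("fit to all letters:    ", chi_squared(code, letter_counts(SAMPLE)))
    print("fit to initial letters:", chi_squared(code, initial_counts(SAMPLE)))
```

The smaller statistic indicates the better fit. Repeating the comparison with a corpus for each candidate language gives a rough ranking of languages, and comparing the two lines of output for one language addresses the initial-letters versus whole-words question.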
Then, if you have time and are keen to take this project to a higher level, you can look into the work of the great electrical engineer Claude Shannon and apply his techniques from information theory. For starters, you can measure the information content of the message in bits.
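As a starting point for the measurement in bits, the single-letter Shannon entropy of the message is H = -Σ p(x) log2 p(x), where p(x) is the observed relative frequency of letter x. The sketch below is an assumption about how one might begin, not a prescribed method: the function name `entropy_bits_per_letter` and the constant `CODE` are hypothetical, the estimate ignores any dependence between neighbouring letters, and with only 44 letters the plug-in estimate is biased low.

```python
import math
from collections import Counter

# Assumption: four-line transcription from this page, ambiguous letters as printed.
CODE = "MRGOABABD MTBIMPANETP MLIABOAIAQC ITTMTSAMSTGAB"


def entropy_bits_per_letter(text):
    """Empirical single-letter Shannon entropy, in bits per letter."""
    letters = [ch for ch in text.upper() if "A" <= ch <= "Z"]
    n = len(letters)
    if n == 0:
        raise ValueError("no letters in text")
    counts = Counter(letters)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


if __name__ == "__main__":
    n_letters = sum(1 for ch in CODE if "A" <= ch <= "Z")
    h = entropy_bits_per_letter(CODE)
    print(f"entropy per letter:    {h:.3f} bits")
    print(f"whole message (n * H): {n_letters * h:.1f} bits")
    print(f"maximum for 26 letters: {math.log2(26):.3f} bits per letter")
```

Comparing the per-letter figure with the maximum of log2(26) ≈ 4.70 bits, and with the same measurement on sample texts in candidate languages, gives a first rough indication of how much structure the code carries.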
Possible extension
If you knock off this project too easily and are looking for a harder code cracking problem to try your software out on, you can progress to analyzing another famous unsolved mystery: the Voynich Manuscript.
Expectations
We don't really expect you to find the killer, though it would be cool if you did, and you would become very famous overnight. To get good marks we expect you to show a logical approach to decisively eliminating coding schemes that were not used. In your conclusion, you need to come up with a short list of likely possibilities and a list of schemes that you can definitely rule out.
Relationship to possible career path
Whilst the project is fascinating as you'll learn about a specific murder case—and we do want you to have a lot of fun with it—the project does have a hard-core serious engineering side. It will familiarize you with techniques in information theory, probability, statistics, encryption, decryption, and datamining. It will also improve your software skills. The project will also involve writing software code that trawls for patterns on the world wide web (exploiting it as a huge database). This will force you to learn about search engines and databases; and the new tools you develop may lead to new IP in the area of datamining and also make you rich/famous. The types of jobs out there where these skills are useful are in computer security, comms, or in digital forensics. The types of industries that will need you are: the software industry, e-finance industry, e-security, IT industry, Google, telecoms industry, ASIO, ASIS, defence industry (e.g. DSD), etc. So go ahead and have fun with this, but keep your eye on the bigger engineering picture and try to build up an appreciation of why these techniques are useful to our industry. Now go find that killer...this message will self-destruct in five seconds :-)
See also
- Critical design review 2009: Who killed the Somerton man?
- Final report 2009: Who killed the Somerton man?
- Timeline of the Taman Shud Case
- List of people connected to the Taman Shud Case
- List of facts on the Taman Shud Case that are often misreported
- List of facts we do know about the Somerton Man
- The Taman Shud Case Coronial Inquest
- Letter frequency plots
- Structural Features of the Code
- Markov models
- Primary source material on the Taman Shud Case
- Secondary source material on the Taman Shud Case
- Transition probabilities from selected texts
- Listed poems from The Rubaiyat of Omar Khayyam
- Using the Rubaiyat of Omar Khayyam as a one-time pad
- Using the King James Bible as a one-time pad
- Using the Revised Standard Edition Bible as a one-time pad
- Transitions within words
References and useful resources
If you find any useful external links, list them here:
- The Taman Shud Case
- Edward Fitzgerald's translation of رباعیات عمر خیام by عمر خیام
- Adelaide Uni Library e-book collection
- Project Gutenberg e-books
- Foreign language e-books
- UN Declaration of Human Rights - different languages
- Statistical debunking of the 'Bible code'
- Spies in Australia 1948
- The differences between the 5 FitzGerald editions
- ASIO Unofficial History by Frank Cain
- Operation Venona
- Venona documents
- Venona code
- Venona history
- Jacqueline Templeton, Royal Commission on Intelligence and Security Seventh Report - Australian Intelligence/Security Services 1900-1950, Volume 2, 1977
- The Kaiser affair
- One time pads
- Analysis of criminal codes and ciphers
- Code breaking in law enforcement: A 400-year history
- Evolutionary algorithm for decryption of monoalphabetic homophonic substitution ciphers encoded as constraint satisfaction problems