Cracking the Voynich code

From Derek
Revision as of 20:22, 5 March 2014 by Dabbott (Talk | contribs)


Honours students

Project guidelines

Project description

The Voynich Manuscript is a mysterious 15th-century book; no one today knows what it says or who wrote it. The book is written in a strange alphabet. See details here.

Fortunately, the whole book has been converted into an electronic format, with each character mapped to a convenient ASCII character. We want you to write software that searches the text and performs statistical tests to find clues about the nature of the writing. Does the document bear the statistics of a natural language, or is it a fake?
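As a starting point for these statistical tests, here is a minimal sketch; it is not the project's existing software. It assumes the transcription has been loaded as a plain ASCII string with words separated by whitespace (the actual transcription file may use a different separator). All function names and the sample string are hypothetical placeholders.

```python
import math
from collections import Counter


def _entropy(counts):
    """Shannon entropy in bits of the distribution given by a Counter."""
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def char_entropy(text):
    """Entropy (bits per character) of the single-character distribution."""
    return _entropy(Counter(text))


def conditional_entropy(text):
    """Entropy of a character given the previous one: H(pairs) - H(first char)."""
    pairs = Counter(zip(text, text[1:]))
    firsts = Counter(text[:-1])
    return _entropy(pairs) - _entropy(firsts)


def word_rank_frequency(text):
    """Words sorted from most to least frequent, as (word, count) pairs."""
    return Counter(text.split()).most_common()


if __name__ == "__main__":
    # Placeholder string only; substitute the real ASCII transcription.
    sample = "abab cdcd abab efef abab cdcd"
    print("char entropy:", char_entropy(sample))
    print("conditional entropy:", conditional_entropy(sample))
    print("top words:", word_rank_frequency(sample)[:3])
```

These figures are only meaningful in comparison: one option is to compute the same statistics for known natural-language texts and for randomly generated strings, and see which the manuscript resembles more closely.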

We already have Support Vector Machine (SVM) and Multiple Discriminant Analysis (MDA) software that you can adapt for your purposes. This software is set up to test whether two texts were written by the same author. The great thing about our software is that it is independent of language, so you could compare the manuscript against the existing writings of Roger Bacon, who is a suspected author.
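To illustrate what a language-independent comparison between two texts might look like, here is a rough sketch. It is a simple stand-in and not the SVM or MDA software mentioned above. It assumes word-length distribution as the feature (chosen here because it does not depend on the alphabet) and cosine similarity as the comparison; both choices, and all function names, are assumptions made for illustration.

```python
import math
from collections import Counter


def word_length_profile(text):
    """Relative frequency of each word length in a whitespace-separated text."""
    lengths = Counter(len(word) for word in text.split())
    total = sum(lengths.values())
    return {length: count / total for length, count in lengths.items()}


def cosine_similarity(p, q):
    """Cosine of the angle between two sparse profiles (0.0 if either is empty)."""
    dot = sum(value * q.get(key, 0.0) for key, value in p.items())
    norm_p = math.sqrt(sum(value * value for value in p.values()))
    norm_q = math.sqrt(sum(value * value for value in q.values()))
    if norm_p == 0 or norm_q == 0:
        return 0.0
    return dot / (norm_p * norm_q)


if __name__ == "__main__":
    # Placeholder strings only; substitute the transcription and a candidate
    # author's text.
    text_a = "aaa bb cccc dd eee"
    text_b = "xx yyy zzzz ww vvv"
    similarity = cosine_similarity(word_length_profile(text_a),
                                   word_length_profile(text_b))
    print("word-length similarity:", similarity)
```

A single similarity score like this is far weaker evidence than a trained classifier; it is meant only to show the shape of the pipeline (text to feature vector to comparison) that the existing software could take the place of.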

Useful notes

Approach and methodology

Possible extension

If you knock off this project too easily and are looking for a harder code-cracking problem to try your software out on, you can progress to analyzing another famous unsolved mystery: the Voynich Manuscript.


  • We don't really expect you to find the killer, though it would be cool if you did, and you would become very famous overnight.
  • To get good marks we expect you to show a logical approach to decisively eliminating which coding schemes were definitely not used.
  • In your conclusion, you need to come up with a short list of likely possibilities and a list of possibilities that you can definitely eliminate.
  • We expect you to critically look at the conclusions of the previous project groups and highlight to what extent your conclusions agree and where you disagree.
  • We expect all the written work to be placed on this wiki. No paper reports are to be handed up. Just hand up a CD with your complete project directory at the end, one CD for each group member.
  • It is expected that you fill out a short progress report on the wiki each week, every Friday evening, to briefly state what you did that week and what the goals are for the following week.
  • It is important to see your main supervisors regularly. Don't let more than two weeks go by without them seeing your face briefly.
  • You should be holding at least one formal progress meeting with supervisors per month. It does not have to be exactly a month, but roughly each month you should be in a position to show some progress and have some problems and difficulties to discuss.
  • The onus is on you to drive the meetings, make the appointments and set them up.
  • You are expected to make a YouTube presentation of your whole project.

Relationship to possible career path

Whilst the project is fascinating, as you'll learn about a specific cold case (and we do want you to have a lot of fun with it), the project does have a hard-core, serious engineering side. It will familiarize you with techniques in information theory, probability, statistics, encryption, decryption, and data mining. It will also improve your software skills. The project will also involve writing software that trawls for patterns on the World Wide Web (exploiting it as a huge database). This will force you to learn about search engines and databases; the new tools you develop may lead to new IP in the area of data mining and may also make you rich or famous. These skills are useful in jobs in computer security, communications, and digital forensics. The industries that will need you include the software industry, e-finance, e-security, the IT industry, Google, the telecoms industry, ASIO, ASIS, and the defence industry (e.g. DSD). So go ahead and have fun with this, but keep your eye on the bigger engineering picture and try to build up an appreciation of why these techniques are useful to our industry. Now go find that killer... this message will self-destruct in five seconds :-)

See also

References and useful resources

If you find any useful external links, list them here:
