Cipher cracking 2010 weekly progress


Latest revision as of 21:51, 3 November 2010

Weekly progress and questions

This is where you record your progress and ask questions. Make sure you update this every week. The deadline is every Friday evening. However, if you sometimes slip a little into the weekend (so long as you don't do it too often) we won't be too hard on your marks.

Please remember that we make use of this progress section to give you your project mark. Your mark will suffer if you don't complete this section.

Semester 1 Week 1

Kevin

This Week
  1. Had initial group meeting.
  2. Researched background of the case using Wikipedia and looking at the work from last year's project.

Michael

This Week
  1. Had introduction meeting with project coordinators and Kevin.
  2. Researched basic case background and ciphers.

Semester 1 Week 2

Kevin

This Week
  1. Looked into different web crawlers that we could use for our project.
  2. Started working on Proposal Seminar.
  3. Bought Secrets of Codes book to help with Proposal Seminar.

Michael

This Week
  1. Developed a rough project breakdown structure.
  2. Began compiling information on some available web crawling devices.
  3. Constructed a basic outline for the proposal seminar.

Semester 1 Week 3

Kevin

This Week
  1. Had our Proposal Seminar presentation.
  2. Started to verify the past year's results. Started to carefully look at the Java code from the previous project group.
Next Week Plan
  1. Survey several people while drunk/not drunk for random letters.
  2. Finish or almost finish verifying the past year's results. Check if we come to the same conclusions.
  3. Brush up on Java writing skills.

Michael

This Week
  1. Converted the work breakdown structure into a Gantt Chart.
  2. Completed the Proposal Seminar and presented it on Wednesday.
  3. Drew up a survey sheet for recording 45 random letters to be taken from a variety of subjects.
  4. Started collecting random letter samples to explore the theory that the code is just a sequence of random letters.
Next Week Plan
  1. Complete the collection of random letter samples (intoxicated samples will most likely have to wait until the following weekend :D)
  2. Draw up an outline for the stage 1 design document.
  3. Continue researching available web crawlers and hopefully come to a conclusion on one that will be most useful for the project.

Semester 1 Week 4

Kevin

This Week
  1. Started writing up my sections of the Stage 1 Design Document.
  2. Got a few random letter surveys done. Not really sure if they were that intoxicated though ;(.
  3. Looked through the Java code from last year's project. Fairly certain the code is working correctly and we get the same outputs.
  4. Drew up an initial flow chart of what we think the web crawler code should do.
Next Week Plan
  1. Finish the Stage 1 Design Document.
  2. Finishing touches on confirming last year's results if needed.
  3. Possibly start initial coding of the Web Crawler if time permits.

Michael

This Week
  1. Gathered quite a few more random letter samples, which should be a sufficient amount. Haven't had the opportunity to get intoxicated samples yet; however, the upcoming holidays should provide them.
  2. Began compiling our Stage 1 Design Document. Acquired specific guidelines for the contents of the document that are relevant to the project.
Next Week Plan
  1. Main focus is to complete the Stage 1 Design Document. In doing so the project will also progress significantly in terms of knowledge on available web crawlers and specific hypotheses that should be tested for the project.

Semester 1 Week 5

Kevin

This Week
  1. Focused mainly on finishing Stage 1 Design Document.
Next 2 Weeks Plan
  1. Start coding basic search algorithm.
  2. Finish verifying previous year's project results.
  3. Read up on how web crawlers work. Come to a conclusion on which web crawler is best for us to modify.

Michael

This Week
  1. Worked on and completed the Stage 1 Design Document
  2. Downloaded and experimented briefly with the Arachnid and JSpider web crawlers
Next 2 Weeks Plan
  1. Finalise collection of random letter samples and compile results
  2. Continue researching and experimenting with pre-existing web crawlers
  3. Brush up on Java coding

Semester 1 Mid-Semester Break

Kevin

Last 2 Weeks
  1. Finished verifying previous year's code. From my best understanding of the commenting/Java code, I agree with the results.
  2. Looked at the user guides/manuals of several different web crawlers. Still not sure which one we're going to use though.
  3. Started to code a basic Java file to parse through text.
Next Week Plan
  1. Complete the Peer Review.
  2. Continue coding basic Java code.
  3. Hopefully decide on which web crawler we want to use.

Michael

Last 2 Weeks
  1. Completed collection of random letters
  2. Read into web crawler fundamentals to get more of an understanding of their general structure
Next Week Plan
  1. Main focus will be to get the peer review written up
  2. Compile random letter samples into graphs and compare with last year's results
  3. Spend some time continuing with crawler fundamentals

Semester 1 Week 6

Kevin

This Week
  1. Started doing the Peer Review.
  2. Spent most of this week doing other assignments and catching up on other courses.
Next Week Plan
  1. Finish the Peer Review
  2. Continue working on basic pattern finding Java code.

Michael

This Week
  1. Began and completed Peer Review
  2. Started running small tests and playing with the Arachnid crawler. Appears to be quite slow and inconsistent.
Next Week Plan
  1. Keep experimenting with Arachnid.
  2. Try running at least 1 other crawler.

Semester 1 Week 7

Kevin

This Week
  1. Finished the Peer Review Report.
  2. Continued to work on text parsing Java code. It still has quite a few errors that need to be ironed out.
Next Week Plan
  1. Continue modifying text parsing code and hopefully finish it.
  2. Play around with some of the web crawlers to get a better understanding of how they work and how to modify the code.

Michael

This Week
  1. Ran some more tests with the supplied Arachnid code
  2. Read up on the JSpider crawler
Next Week Plan
  1. Try editing Arachnid to manipulate results of tests
  2. Test and assignment for other courses next week so they will take up most of my time

Semester 1 Week 8

Kevin

This Week
  1. Was busy with tests and assignments in other subjects and wasn't really able to get much done :(.
Next Week Plan
  1. Fix and finish up basic text matching code.
  2. Otherwise, same goals as last week as I wasn't able to complete them.

Michael

This Week
  1. Spent most of my time completing assignments and studying for tests.
  2. Ran a few more tests with Arachnid.
Next Week Plan
  1. Meeting with Derek and Matt to hopefully get some more guidance
  2. Read further into JSpider to see where that may take us

Semester 1 Week 9

Kevin

This Week
  1. Finished writing code to find exact matches in a text file. Looks to work exactly as I hope. Should only take a few minor changes to implement ignoring HTML code and/or searching initialisms only.
  2. Messed around trying to get a feel of the JSpider and Arachnid web crawlers.
Next Week Plan
  1. Implement above changes to Java code.
  2. Read up on how to implement the code into the web crawler.

Michael

This Week
  1. Made progress with Arachnid, still having issues with consistency
  2. got some ideas on alternative paths from the meeting (i.e. using Wget)
Next Week Plan
  1. Begin outlining progress report
  2. Look into Wget and how to use it
  3. Make a decision on Arachnid and whether or not it is going to be useful

Semester 1 Week 10

Kevin

This Week
  1. Modified my Java program so that it can search for either exact words or initialisms in a text file. Haven't yet implemented ignoring HTML code.
  2. Started working on the progress report.
Next Week Plan
  1. Main aim is to finish the progress report.
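
The two search modes mentioned above can be sketched roughly as follows. This is a minimal illustration only, assuming whitespace-separated words and case-insensitive matching; the class and method names are hypothetical and are not taken from the project's own code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: exact-phrase and initialism search over plain text. */
final class InitialismSearchSketch {

    /** Splits text on whitespace and drops empty tokens. */
    static List<String> words(String text) {
        List<String> out = new ArrayList<>();
        for (String w : text.trim().split("\\s+")) {
            if (!w.isEmpty()) out.add(w);
        }
        return out;
    }

    /** Counts positions where consecutive words start with the letters of pattern, in order. */
    static int countInitialismHits(String text, String pattern) {
        if (pattern.isEmpty()) return 0;
        List<String> ws = words(text);
        String p = pattern.toLowerCase();
        int hits = 0;
        for (int i = 0; i + p.length() <= ws.size(); i++) {
            boolean match = true;
            for (int j = 0; j < p.length() && match; j++) {
                char first = Character.toLowerCase(ws.get(i + j).charAt(0));
                match = (first == p.charAt(j));
            }
            if (match) hits++;
        }
        return hits;
    }

    /** Counts positions where consecutive words equal the phrase, word for word. */
    static int countExactHits(String text, String phrase) {
        List<String> ws = words(text);
        List<String> target = words(phrase);
        if (target.isEmpty()) return 0;
        int hits = 0;
        for (int i = 0; i + target.size() <= ws.size(); i++) {
            boolean match = true;
            for (int j = 0; j < target.size() && match; j++) {
                match = ws.get(i + j).equalsIgnoreCase(target.get(j));
            }
            if (match) hits++;
        }
        return hits;
    }
}
```

For example, `countInitialismHits("my bonnie lies over the ocean", "MBLO")` counts one hit, because the first four words begin with M, B, L and O.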

Michael

This Week
  1. Found an issue with Arachnid with respect to character encoding: getting null titles for UTF-8 HTML pages. Also often getting "handleBadIO" errors, which isn't very promising
  2. Began looking into Wget
  3. Started progress report
Next Week Plan
  1. Major focus is to complete our Progress Report
  2. Secondary task is to get Wget running on my laptop so that I can begin exploring with downloading web pages and pulling out text/URLs

Semester 1 Week 11

Kevin

This Week
  1. Still working on the progress report.
  2. Found some errors in my Java code when attempting different cases and had to debug and modify to get it working correctly. Searching for exact initialisms seems to work perfectly now.
Next Week Plan
  1. Finish progress report.
  2. Implement ignoring HTML. Only way I can think of is ignoring characters/words between "<" and ">".

Michael

This Week
  1. Made a start on the progress report
  2. Still getting errors with character encoding but have a few things to try
Next Week Plan
  1. Getting the progress report done will be the priority, along with assignments for other subjects
  2. Try to make progress with the character encoding issue if possible, otherwise this may need to wait until I get time between studying for exams

Semester 1 Week 12

Kevin

This Week
  1. Finished progress report.
  2. Implemented ignoring HTML by ignoring characters between "<" and ">". Also implemented wildcard searches when "*" is used.
  3. Both implemented methods seem to work, though haven't done much testing yet.
Next Week Plan
  1. Test and debug above implementations more.
  2. Figure out a way to implement "similar" initialisms.
  3. Read up on how to implement my code with the web crawler.
  4. These will most likely have to wait until after exams because I'm busy with studying and such.
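
A rough sketch of the two ideas above: skipping everything between "<" and ">", and treating "*" as a wildcard. The assumption that "*" stands for exactly one letter is mine, since the log does not say how the wildcard behaves, and the names below are hypothetical.

```java
/** Hypothetical sketch: naive tag stripping and single-letter wildcard matching. */
final class HtmlAndWildcardSketch {

    /**
     * Drops every character from '<' up to and including the next '>'.
     * Each tag is replaced by a single space so neighbouring words stay separate.
     * A stray '>' outside a tag is dropped as well.
     */
    static String stripTags(String html) {
        StringBuilder out = new StringBuilder();
        boolean inTag = false;
        for (char c : html.toCharArray()) {
            if (c == '<') {
                inTag = true;
                out.append(' ');
            } else if (c == '>') {
                inTag = false;
            } else if (!inTag) {
                out.append(c);
            }
        }
        return out.toString();
    }

    /** True if letters match pattern position by position; '*' matches any one letter (assumed). */
    static boolean matchesWithWildcard(String letters, String pattern) {
        if (letters.length() != pattern.length()) return false;
        for (int i = 0; i < pattern.length(); i++) {
            char p = pattern.charAt(i);
            if (p == '*') continue;
            if (Character.toLowerCase(p) != Character.toLowerCase(letters.charAt(i))) return false;
        }
        return true;
    }
}
```

This approach is deliberately simple: text that legitimately contains "<" (inline scripts, for instance) would be cut short, so it only suits pages where angle brackets appear in tags alone.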

Michael

This Week
  1. Completed the Progress Report
  2. looked further into Wget
Next Week Plan
  1. Exam study will take up most of my time for the next few weeks however after exams there is considerable time to focus solely on the project
  2. After exam plans:
  • Get Wget working properly and try to use it to retrieve data from websites
  • Decide whether to continue with Arachnid or focus on Wget
  • Work with Kevin to start amalgamating our parts of the project

Semester 2 Week 1

Kevin

Last Few Weeks
  1. Tested ignoring HTML code and wildcards. Seemed to work properly.
  2. Changed code so that it could parse through multiple files.
  3. Added code to allow user to enter a directory for files to be parsed. There were some errors involving the directory list but it was fixed and now searches the entered directory and all sub-directories.
  4. The parser now also outputs the results to a text file called "Results.txt". Also had to make sure that the parser filters out files with "Result" in the name so it doesn't parse through the results file.
  5. Started working with Michael to bring parser and web crawler together to tabulate results.
Next Week Plan
  1. Get more parsing results from web sites and analyse them.
  2. Modify parser to work with Cyrillic alphabet for Russian, etc.
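
The directory handling described above (search the entered directory and all sub-directories, skip anything with "Result" in the name, write to "Results.txt") could look something like this. It is a sketch using `java.nio.file`, not the project's actual code; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical sketch: collect files to parse and write a results file. */
final class DirectoryScanSketch {

    /** Returns every regular file under root, including sub-directories, skipping names containing "Result". */
    static List<Path> filesToParse(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                .filter(Files::isRegularFile)
                .filter(p -> !p.getFileName().toString().contains("Result"))
                .sorted()
                .collect(Collectors.toList());
        }
    }

    /** Writes one line per result to Results.txt inside root and returns its path. */
    static Path writeResults(Path root, List<String> lines) throws IOException {
        return Files.write(root.resolve("Results.txt"), lines);
    }
}
```

Filtering by file name keeps the parser from reading its own output on a second run, which is the problem the log entry mentions.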

Michael

Last Few Weeks
  1. While researching wget I came across website mirroring software HTTrack and found it quite useful
  2. Developed a batch file to take an input URL from either a file or user input and collect html files using HTTrack
  3. Experimented with parameters and settings to avoid downloading images and to alter search depth
  4. Began working with Kevin to get results using his pattern algorithm
  5. Combined the pattern code and HTTrack to be able to parse several URLs and run pattern matching searches on the results
Next Week Plan
  1. Determine sites/searches of interest and collect results for initialism
  2. Go through Somerton Man code to find most likely initialism sequences
  3. Work with Kevin on his code to optimise and extend
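
The original was a batch file; as an illustration in Java, the same HTTrack call could be assembled as below. This is a sketch under assumptions: that `httrack` is on the PATH, and that the output path (`-O`), mirror depth (`-rN`) and exclusion filters such as `-*.jpg` are the relevant options. The exact settings the project used are not recorded here, and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: build and run an HTTrack command that skips common image types. */
final class HttrackCommandSketch {

    /** Builds the argument list for mirroring url into outputDir down to the given link depth. */
    static List<String> buildCommand(String url, String outputDir, int depth) {
        List<String> cmd = new ArrayList<>();
        cmd.add("httrack");      // assumes the executable is on the PATH
        cmd.add(url);
        cmd.add("-O");           // output directory for the mirror
        cmd.add(outputDir);
        cmd.add("-r" + depth);   // maximum mirror depth
        cmd.add("-*.jpg");       // exclusion filters for image files
        cmd.add("-*.jpeg");
        cmd.add("-*.png");
        cmd.add("-*.gif");
        return cmd;
    }

    /** Runs the command, forwarding its output to this process, and returns the exit code. */
    static int run(List<String> cmd) throws IOException, InterruptedException {
        return new ProcessBuilder(cmd).inheritIO().start().waitFor();
    }
}
```

Keeping command construction separate from execution makes it easy to loop over a list of URLs read from a file and then point the pattern search at each output directory.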

Semester 2 Week 2

Kevin

This Week
  1. Worked with Michael to get some results ready for our meeting.
  2. Added a results summary to output file for easy reading and made the user terminal interface for code more user friendly.
  3. Had meeting with Derek and Matt to talk about what we've done and what we should be doing now.
Next Week Plan
  1. Get some results from the Bible, Shakespeare and Rubaiyat text files.
  2. Try to work on some more of the things we talked about in our meeting.

Michael

This Week
Next Week Plan

Semester 2 Week 3

Kevin

This Week
  1. Was busy this week with assignments and studying for tests.
  2. Edited the parser to take into account periods, commas, exclamation marks, etc. that separate words (i.e. 'word1.word2' was previously read as 1 word, but is now counted as 2 words). This makes the parser more accurate and lowers the chances of finding false positives in the results.
Next Week Plan
  1. Work on stuff from last week.
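
One way to get the word splitting described above is to split on any run of characters that are not letters, digits or apostrophes. This is a sketch of the idea rather than the parser's actual rule, and the class name is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: punctuation-aware word splitting. */
final class TokenizerSketch {

    /** Splits on runs of characters that are not Unicode letters, digits or apostrophes. */
    static List<String> words(String text) {
        List<String> out = new ArrayList<>();
        for (String w : text.split("[^\\p{L}\\p{N}']+")) {
            if (!w.isEmpty()) out.add(w);   // a leading separator yields an empty first token
        }
        return out;
    }
}
```

With this rule `words("word1.word2")` yields two tokens. Because `\p{L}` matches any Unicode letter, the same split would also apply to non-Latin alphabets.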

Michael

This Week
  1. Worked with Kevin to optimise the pattern code.
  2. Added an English probability calculation algorithm so that results are displayed after a search.
  3. obtained some useful text files to run tests on and collect results
  4. got a few more intoxicated letter samples - still not enough for a useful result
Next Week Plan
  1. Main aim is to collect intoxicated letter samples and run tests with our code to compile results

Semester 2 Week 4

Michael

This Week
  1. Finalised collection of intoxicated letter samples
  2. Ran initial tests with Kevin on the Rubaiyat of Omar Khayyam and Shakespeare texts
  3. Developed new pattern matching algorithm to search for substitution patterns. Currently works for all possible 4-symbol combinations, i.e. @@@@ or @@@# or @@## etc.
Next Week Plan
  1. Produce graph of intoxicated letter samples and see what the result looks like
  2. Get the pattern matching algorithm embedded into the main "FindMatch.java" file
  3. Produce a 3 symbol pattern matching method as this will probably be more useful with the Mystery Code letters.
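
A substitution pattern such as @@## can be matched by reducing both the pattern and a candidate run of letters to a canonical "shape" and comparing the two. The sketch below assumes that different symbols must stand for different letters; it is not the FindPattern code itself, and the names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: match letter runs against a substitution pattern by canonical shape. */
final class SubstitutionPatternSketch {

    /** Rewrites s so its first distinct character becomes 'a', the second 'b', and so on. */
    static String shape(String s) {
        Map<Character, Character> seen = new HashMap<>();
        StringBuilder out = new StringBuilder();
        for (char c : s.toCharArray()) {
            Character mapped = seen.get(c);
            if (mapped == null) {
                mapped = (char) ('a' + seen.size());
                seen.put(c, mapped);
            }
            out.append(mapped);
        }
        return out.toString();
    }

    /** True if letters repeat in exactly the way the pattern's symbols repeat. */
    static boolean samePattern(String letters, String pattern) {
        return shape(letters.toLowerCase()).equals(shape(pattern));
    }

    /** Counts windows of letters (same length as pattern) that share the pattern's shape. */
    static int countPatternHits(String letters, String pattern) {
        if (pattern.isEmpty()) return 0;
        int hits = 0;
        for (int i = 0; i + pattern.length() <= letters.length(); i++) {
            if (samePattern(letters.substring(i, i + pattern.length()), pattern)) hits++;
        }
        return hits;
    }
}
```

Because the comparison works on shapes, the same code handles 3-symbol or longer patterns without change. Under this reading there are 15 distinct four-symbol shapes.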

Kevin

This Week
  1. Integrated Michael's FindPattern code into the parser. I think he still needs to add more work to it and do some testing.
  2. Ran parser and got results from The Rubaiyat, Shakespeare's works and the Bible (RSV and KJV). Used random 4 character long parts from the Somerton Man code to find the expected and actual proportion of initialisms in all of the texts. Also did the same thing using the reverse of the chosen sequences.
Next Week Plan
  1. Put the results into a readable table to easily compare.
  2. Try different parts of the Somerton Man code.
  3. Try using 3 character long segments instead of 4.
  4. As above, but for other texts or websites that might be useful.
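
The log does not record how the expected proportion was calculated. One plausible reading, shown here purely as an assumption, is to multiply the first-letter frequencies measured in the text (treating word initials as independent) and compare that with the fraction of word windows that actually match. The names below are hypothetical.

```java
import java.util.List;

/** Hypothetical sketch: expected vs actual proportion of an initialism sequence in a word list. */
final class ProportionSketch {

    /** Fraction of words whose first letter is the given letter (case-insensitive). */
    static double initialFrequency(List<String> words, char letter) {
        if (words.isEmpty()) return 0.0;
        char target = Character.toLowerCase(letter);
        int count = 0;
        for (String w : words) {
            if (!w.isEmpty() && Character.toLowerCase(w.charAt(0)) == target) count++;
        }
        return (double) count / words.size();
    }

    /** Expected proportion if word initials were independent: product of per-letter frequencies. */
    static double expectedProportion(List<String> words, String sequence) {
        double p = 1.0;
        for (char c : sequence.toCharArray()) p *= initialFrequency(words, c);
        return p;
    }

    /** Actual proportion: matching windows divided by the total number of windows. */
    static double actualProportion(List<String> words, String sequence) {
        int windows = words.size() - sequence.length() + 1;
        if (sequence.isEmpty() || windows <= 0) return 0.0;
        int hits = 0;
        for (int i = 0; i < windows; i++) {
            boolean match = true;
            for (int j = 0; j < sequence.length() && match; j++) {
                String w = words.get(i + j);
                match = !w.isEmpty()
                    && Character.toLowerCase(w.charAt(0)) == Character.toLowerCase(sequence.charAt(j));
            }
            if (match) hits++;
        }
        return (double) hits / windows;
    }

    /** Reverses a sequence so the same comparison can be run backwards. */
    static String reversed(String sequence) {
        return new StringBuilder(sequence).reverse().toString();
    }
}
```

Running both `sequence` and `reversed(sequence)` through the same two methods gives the forward and reverse comparison mentioned above.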

Semester 2 Week 5

Kevin

This Week
  1. Put last week's results into a table.
  2. Ran parser to get similar results as above using 3 character long chains and put them into a results table.
Next Week Plan
  1. Try different texts/websites.
  2. See if Michael needs help finishing pattern finding code.

Michael

This Week
  1. Compiled graph of intoxicated letter samples
  2. Pattern finding algorithm now in FindMatch
  3. Working on 3 symbol algorithm
Next Week Plan
  1. continue work on 3 symbol algorithm and run some tests

Semester 2 Week 6

Kevin

This Week
  1. Re-did results for the Rubaiyat after altering the text file so that each poem is spread over 1 line rather than 4.
  2. Started working on a rough outline for our Final Seminar.
Next Week Plan
  1. Get more results.
  2. Work on what we're gonna say for Final Seminar more.

Michael

This Week
  1. Have been interstate but began working on a rough outline for our seminar
  2. Looked for interesting websites to run the pattern matching algorithm on
Next Week Plan
  1. I have a large test and assignment next week; however, the plan is to work with Kevin to finalise the code in order to get some good results to show in the seminar
  2. Continue brainstorming for the seminar and then focus completely on that from Friday onwards

Semester 2 Week 7

Kevin

This Week
  1. Was busy with assignments and several tests. Will probably be busy with work until next Friday.
Next Week Plan
  1. Get more results to show during Final Seminar.
  2. Get slides and speech prepared for Final Seminar.

Michael

This Week
  1. Similar to Kevin I have had too much other work to focus on this week
Next Week Plan
  1. Work on our seminar presentation

Semester 2 Week 8

Kevin

This Week
  1. Worked with Michael to select texts and websites to parse through using our code.
  2. Got 3 and 4 letter initialism results for those texts and put them into a table.
Next Week Plan
  1. Put all of those results into a pretty graph for our Final Seminar.
  2. Continue writing and making slides for the seminar.
  3. Possibly try to get more results.

Michael

This Week
  1. worked with Kevin to select interesting portions of the mystery code
  2. did some debugging with the Find Match code
Next Week Plan
  1. compiling huge amounts of results!
  2. writing and practising our final seminar presentation

Semester 2 Mid-Semester Break

Kevin

Last 2 Weeks
  1. Re-did some of the results after slightly modifying the code. Also tidied the results up a bit to make it more presentable.
  2. Made graphs and slides for Final Seminar.
  3. Wrote a short project description for the Exhibition Brochure.
  4. Had Final Seminar and I think it went pretty well.
Next Week Plan
  1. Start writing up Final Report.
  2. Work on some of the things Matt said in his e-mail about our Seminar to improve our Final Report.
  3. Make Project Exhibition Poster.
  4. Try to improve our results if we have enough time.

Michael

Last 2 Weeks
  1. Worked on finalising results
  2. Compiled results into graphs and produced results booklet for seminar with Kevin
  3. Had Final Seminar
Next Week Plan
  1. Get started on our final report and Exhibition poster
  2. Run some final tests on short poems to see significance of the lack of findings in the Rubaiyat results (as suggested by Matt after our Seminar)

Semester 2 Week 9

Kevin

Last Week
  1. Started working on the Exhibition Poster.
  2. Started writing up Final Report.
Next Week Plan
  1. Continue working on/finish Exhibition Poster and Final Report.

Michael

Last Week
  1. Began working on final report sections
  2. Did some brainstorming for the poster and exhibition pitch
  3. Began running some tests on short poems
Next Week Plan
  1. Main aim now is to finalise the small poem tests and compile our final report

Semester 2 Week 10

Kevin

Last Week
  1. Continued to work on Final Report and Exhibition Poster.
  2. Was a bit busy with a test and assignment from other subjects.
Next Week Plan
  1. Finish Final Report.
  2. Finish Exhibition Poster.

Michael

Last Week
  1. Worked on compiling final report
  2. Developed exhibition poster with Kevin
  3. Worked with Kevin on our pitch for exhibition
Next Week Plan
  1. Finalise our final report ready for hand up
  2. finalise exhibition poster

Semester 2 Week 11

Kevin

Last Week
  1. Finished Final Report.
  2. Finished Exhibition Poster.
  3. Handed up project files on CD to Derek's office.
Next Week Plan
  1. Have the actual project exhibition. Think more about the pitch to better sell our project.
  2. Do some of the additional stuff that Derek and Matt suggested:
    1. Getting results for the poetry texts truncated to a similar size as The Rubaiyat.
    2. Making tables for the raw hits of text types for exact initialisms and putting them into the Final Report.
    3. Making additional format changes to the Final Report.
  3. Get the YouTube video done.

Michael

Last Week
  1. Finalised the report with Kevin and got feedback from Derek and Matt
  2. Finished Exhibition Poster with Kevin
  3. Handed up project files on CD to Derek's office.
Next Week Plan
  1. Compile the YouTube video - this will be a steep learning curve!
  2. Fill in the suggested parts of the final report that were left out of the parts I did
  3. Get another updated copy of my home directory to Derek because there will be a lot more in it after making the video
  4. Enjoy the exhibition on Friday!

Semester 2 Week 12

Kevin

Last Week
  1. Had Project Exhibition. Went pretty well.
  2. Finished making all the changes and additions for the Final Report.
  3. Finished the Project :)

Michael

Last Week
  1. Completed the YouTube video (woohoo!)
  2. Had the exhibition - all went pretty smoothly and ended up presenting our work to quite a few people.
This Week
  1. Still need to burn my home directory again to give to Derek.