Cracking the Voynich Code 2015 weekly progress

From Derek
Revision as of 10:18, 21 October 2015 by A1614410 (Talk | contribs)

Weekly progress and questions

This is where you record your progress and ask questions. Make sure you update this every week. The deadline is every Friday evening. However, if you sometimes slip a little into the weekend (so long as you don't do it too often) we won't be too hard on your marks.

Please remember that we make use of this progress section to give you your project mark. Your mark will suffer if you don't complete this section.

Briefly record (i) progress, (ii) problems/questions, (iii) goals for following week.

Semester 1 Week 1

Andrew

  1. Week 1 progress:
    • Set up and attended meeting with Lifei and Supervisors.
    • Started research on background information about Voynich Manuscript.
    • Reviewed work completed by the previous Team.
  2. Goals for Week 2:
    • Start on project proposal preparation.
    • Plan out Project Plan.
    • Continue research on Voynich Manuscript.

Week 1 Log Book Scans

Lifei

  1. Week 1 progress:
    • First group meeting with supervisors.
    • Started reading and collecting the background information about the Voynich Manuscript.
    • Arranged the alphabet that appears in the Voynich Manuscript.
  2. Goals for Week 2:
    • Start on project proposal preparation.
    • Continue researching the Voynich Manuscript and articles about it.

Semester 1 Week 2

Andrew

  1. Week 2 progress:
    • Group meeting with Lifei (Wednesday)
    • Coded pre-processing of the Interlinear Archive into respective transcribers
      • Code available on Google Drive
      • Minor issue with inline formatting (to be fixed)
    • Continued research on Voynich Manuscript
    • Began looking into Research Proposal
  2. Goals for Week 3:
    • Complete Interlinear Archive pre-processing
      • Integrate Lifei's code
    • Progress meeting with Lifei and Supervisors
    • Begin MATLAB code for basic Voynich statistics analysis

Week 2 Log Book Scans

Lifei

  1. Week 2 progress:
    • Group meeting with Andrew (Wednesday)
      • Discuss the current progress
      • Discuss the content for the proposal seminar
    • Started on the code used to count the features of a given article.
    • Continued research on past VMS work of word frequency.
  2. Goals for Week 3:
    • Meeting with Andrew and supervisor
    • Finish the code of feature count (word length, word frequency, etc).
    • Finish the slides of proposal seminar.
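
The feature count mentioned above (word length, word frequency) can be sketched as follows. This is a minimal Python illustration, not the team's own code; the function name `count_features` and the sample words are assumptions made for the example, and words are assumed to be separated by whitespace.

```python
from collections import Counter


def count_features(text):
    """Return (word frequencies, word-length distribution) for a text.

    Assumes words are separated by whitespace; no punctuation handling.
    """
    words = text.split()
    word_freq = Counter(words)                      # how often each word occurs
    length_dist = Counter(len(w) for w in words)    # how many words have each length
    return word_freq, length_dist


# Illustrative sample only; not taken from the project's data files.
sample = "daiin shol daiin chol shol daiin"
freq, lengths = count_features(sample)
print(freq.most_common(2))
print(sorted(lengths.items()))
```

Keeping both results as `Counter` objects makes it straightforward to normalise them later or compare distributions between texts.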

Semester 1 Week 3

Andrew

  1. Week 3 progress:
    • Completed Interlinear Archive pre-processing code.
    • Completed MATLAB code for reading all text files within a separate folder.
    • Began MATLAB code for character and word counting.
    • Held meeting with Lifei, Derek and Maryam.
      • Discussed progress.
      • Went over work to be done (text characterization).
    • Began MATLAB code for text characterization (character token counter).
  2. Goals for Week 4:
    • Complete basic characterization code.
      • Use on VMS.
    • Begin analysis on English Alphabet using basic characterization code.
    • Meet with Lifei.
      • Discuss Proposal Seminar.
        • Slides and Content.
        • Schedule Seminar Time.
    • If time permits, meet with supervisors to discuss findings.

Lifei

  1. Week 3 progress:
    • Attended literature search training (Tuesday).
    • Completed C++ code for counting words and characters.
    • Started writing the C++ code for determining the position of given words in a text.
    • Had meeting with Andrew, Derek and Maryam.
      • Discussed project progress and the work to be done.
    • Started making the PowerPoint slides for Proposal Seminar.
  2. Goals for Week 4:
    • Complete the C++ code for determining word position.
    • Finish the slides of Proposal Seminar.
    • Meet with Andrew.
      • Discuss the slides and content of Proposal Seminar.

Semester 1 Week 4

Andrew

  1. Week 4 progress:
    • Completed basic characterization code
      • Used on VMS, output tables available on Google Drive
    • Continued with Proposal Seminar preparation
      • Met with Lifei and discussed Slide composition
      • Worked on Slides
      • Worked on presentation transcript for practice purposes
  2. Goals for Week 5:
    • Attend and complete Proposal Seminar
    • Begin actual write-up of Research Proposal Draft
    • Write up Wiki page for VMS characterization results
      • Discuss issues with '*' character within transcriptions
    • Begin characterization of English alphabet

Lifei

  1. Week 4 progress:
    • Finished the slides of Proposal Seminar.
    • Wrote a draft for the proposal presentation.
    • Started writing the Matlab code
      • Determines which Voynich tokens appear only at the start of words and which appear only at the end.
    • Had meeting with Andrew
      • Discussed proposal seminar plan and slides.
  2. Goals for Week 5:
    • Try to finish the Matlab code about analysing tokens.
    • Present at Proposal Seminar (Tuesday, 31 March).
    • Meet with Andrew.
      • Discuss the progress of project.
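
The start/end token analysis described above could look roughly like this. It is a hedged Python sketch rather than the Matlab code referred to in the log; it assumes each character of a transcribed word is one token, and the function name `positional_tokens` and the sample words are hypothetical.

```python
def positional_tokens(words):
    """Return (start_only, end_only) sets of character tokens.

    A token is start-only if every occurrence is the first character of a
    word, and end-only if every occurrence is the last character of a word.
    A token in a one-character word counts as both, so it lands in neither set.
    """
    seen_start, seen_end, seen_elsewhere = set(), set(), set()
    for word in words:
        last = len(word) - 1
        for i, ch in enumerate(word):
            if i == 0:
                seen_start.add(ch)
            if i == last:
                seen_end.add(ch)
            if i != 0 and i != last:
                seen_elsewhere.add(ch)
    start_only = seen_start - seen_end - seen_elsewhere
    end_only = seen_end - seen_start - seen_elsewhere
    return start_only, end_only


# Illustrative words only; not the project's transcription data.
start_only, end_only = positional_tokens(["qokay", "qoly", "okay"])
print(start_only, end_only)
```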

Semester 1 Week 5

Andrew

  1. Week 5 progress:
    • Met with Lifei
      • Completed Proposal Seminar Slides
      • Practiced Proposal Seminar
    • Attended and completed Proposal Seminar
    • Began Wiki write-up for characterization code and results (very rough, in Code section)
    • Began English characterization.
  2. Goals for Week 6:
    • Complete Wiki write-up for characterization code and results
    • Continue English characterization (Phase 2)
    • Continue Research Proposal Draft
    • Hold formal progress meeting with Supervisors

Lifei

  1. Week 5 progress:
    • Continued to write the Matlab code
      • Tokens that appear only at the start or only at the end of words.
    • Met with Andrew
      • Completed Proposal Seminar Slides.
      • Practiced Proposal Seminar.
    • Attended and completed Proposal Seminar.
  2. Goals for Week 6:
    • Meeting with Supervisors.
    • Complete the Matlab token analysis code.
    • Start Research proposal and Progress report draft.

Semester 1 Week 6

Andrew

  1. Week 6 progress:
    • Held formal progress meeting with Lifei, Derek, and Maryam.
    • Continued Draft Research Project Proposal.
    • Used characterization code on section of "The Hunt for Red October".
  2. Goals for Week 6b:
    • Complete and Deliver Draft Research Project Proposal.
    • Determine most relevant UDHR languages and plot against Voynich.
    • Make characterization code more efficient.
    • Continue English Characterization:
      • Bigrams.
      • Alphabet extraction code.

Lifei

  1. Week 6 progress:
    • Held formal progress meeting with Andrew, Derek, and Maryam.
    • Completed the Matlab token analysis code.
    • Started Research proposal and Progress report draft.
  2. Goals for Week 6b (Holidays)
    • Finish Research proposal and Progress report draft.
    • Complete Matlab token location code and results.

Semester 1 Break

Andrew

  1. Semester Break progress:
    • Completed Draft Research Project Proposal
      • Uploaded hard-copy .pdf to Wiki
      • Attempted basic conversion to wiki page
    • Continued work on English Investigation
  2. Goals for Week 7:
    • Complete bigram code for English Investigation
      • Complete re-work of characterization code
      • Plot and analyze bigrams
    • Attempt to extract English alphabetical characters from texts
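
As a rough illustration of the bigram step planned above, the sketch below counts character bigrams within words. It is written in Python and is not the project's code; treating bigrams as adjacent character pairs that do not cross word boundaries is an assumption, as is the function name `char_bigrams`.

```python
from collections import Counter


def char_bigrams(text):
    """Count adjacent character pairs inside each whitespace-separated word."""
    counts = Counter()
    for word in text.split():
        # Slide a window of width 2 along the word; one-character words add nothing.
        counts.update(word[i:i + 2] for i in range(len(word) - 1))
    return counts


bigrams = char_bigrams("the then")
print(bigrams.most_common())
```

The resulting counts can be plotted directly or divided by their total to give relative bigram frequencies.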

Lifei

  1. Semester Break progress:
    • Completed the Matlab code for counting token positions and obtained results.
  2. Goals for Week 7:
    • Improve the Matlab code.
    • Continue researching how to separate tokens from characters.
    • Start Phase 4 (Stylometry Investigation).

Semester 1 Week 7

Andrew

  1. Week 7 progress:
    • Continued English Investigation - Phase 2 Work
      • Plotted base bigrams
      • Began attempting English Alphabet extraction
  2. Goals for Week 8:
    • Begin researching into morphology analysis techniques
    • Continue English Extraction attempts
      • Run on Voynich

Lifei

  1. Week 7 progress:
    • Continued tokens analysis.
      • Used the data from the Matlab code.
      • Plotted diagrams.
    • Continued research of Stylometry analysis (Phase 4).
  2. Goals for Week 8:
    • Complete all the Matlab code about tokens analysis.

Semester 1 Week 8

Andrew

  1. Week 8 progress:
    • Added 'de-voweled' English to Word-Length Distributions
    • Continued English Alphabet Extraction
      • Issues with some punctuation characters (e.g. ")
    • Continued research on Morphology Analysis techniques
  2. Goals for Week 9:
    • Complete English Alphabet Extraction
    • Begin Morphology coding
      • Begin, again, using English as a base
    • Hold progress meeting with supervisors

Lifei

  1. Week 8 progress:
    • Completed the Matlab code about tokens analysis.
      • Tokens that appear at the start of words and at the end.
    • Continued Alphabet extraction.
    • Continued research of Stylometry analysis (Phase 4).
  2. Goals for Week 9:
    • Hold progress meeting with Supervisors
    • Work out how to extract the alphabet in English.
      • Matlab code to count token frequency, token recurrence interval

Semester 1 Week 9

Andrew

  1. Week 9 progress:
    • Held Progress Meeting with Derek
      • Displayed current progress with phase 2
    • Continued working on extracting English alphabet
      • Could not complete; issues with characters other than most lower-case letters
      • Compiling Lifei's data into the extraction code
  2. Goals for Week 10:
    • Continue and complete the English Extraction code
    • Begin writing up final Research Proposal and Progress Report
    • Continue research on aspects of Morphology

Lifei

  1. Week 9 progress:
    • Held Progress Meeting with Derek
      • Displayed current progress
    • Continued working on extracting English alphabet
    • Continued working on token analysis.
  2. Goals for Week 10:
    • Begin writing up final Research Proposal and Progress Report
    • Continue research on Stylometry.

Semester 1 Week 10

Andrew

  1. Week 10 progress:
    • Continued work on English Token Extraction
      • Adding Lifei's Start/End token data
      • Adding 'cross-checking' between data sets to increase probability of extraction
    • Began Final Research Project Proposal and Progress Report Document
      • Making Phase 2 Data more 'report-friendly'
    • No significant progress on Morphology unfortunately
  2. Goals for Week 11:
    • Continue Final Research Project Proposal and Progress Report Document
    • Test Voynich against Token Extraction Code
    • Begin coding Morphology Investigation (Phase 3)

Lifei

  1. Week 10 progress:
    • Began Final Research Project Proposal and Progress Report Document
    • Completed the Token analysis code
      • Token recurrence interval
      • Token frequency
      • Record all the data
  2. Goals for Week 11:
    • Meet with Andrew to discuss the final Research Proposal and Progress Report
    • Continue final Research Proposal and Progress Report
    • Continue research on Stylometry.

Semester 1 Week 11

Andrew

  1. Week 11 progress:
    • Continued writing Research Project Proposal and Progress Report
      • Adding Implementation and Testing section
      • Setting up data into a more presentable summary
    • Continued English Extraction
      • Mostly completed but have issues with some tokens, specifically:
        • " ' ( - currently considered possible alphabet tokens
        • X x currently considered possible non-alphabet tokens
  2. Goals for Week 12:
    • Complete Research Project Proposal and Progress Report
    • Refine English Extraction Results where possible
    • Work on Phase 3 coding
    • Hold progress meeting

Lifei

  1. Week 11 progress:
    • Continued writing Research Project Proposal and Progress Report
    • Added content to phase 4
      • Illustration Investigation
      • Started from f102r2 ("frog page")
  2. Goals for Week 12:
    • Finish the code that will be used in Phase 4
      • Find unique word tokens in each page and section
      • Determine the location of a given word token
      • Determine the frequency of a given word token
    • Complete Research Project Proposal and Progress Report
    • Continue Illustration Investigation
      • Sorting data of f102r2

Semester 1 Week 12

Andrew

  1. Week 12 progress:
    • Completed Research Project Proposal and Progress Report
      • Handed up separate, individual version via MyUni
    • Held progress meeting with Derek and Lifei
      • Need to rework the Extraction software
    • Began Phase 3 Software
  2. Goals for Week 13:
    • Rework Extraction Software
      • Remove relative threshold values
      • Use basic data from Bigrams
      • Retest

Lifei

  1. Week 12 progress:
    • Held progress meeting with Derek and Andrew
    • Completed Research Project Proposal and Progress Report
      • Wiki page version and PDF version
    • Continued work on Illustration Investigation
  2. Goals for Week 13:
    • Compare the data of folio f102r2 and other folios
    • Continue work on the search engine of Phase 4

Semester 1 Week 13

Andrew

  1. Week 13 progress:
    • Continued work on Extraction code
      • Attempting to optimize Bigrams
      • Checking with various sample sizes
    • Minor research on Minimum Description Length models
      • May not fit with planned Voynich use
  2. Goals for Week 14 (SWOTVAC):
    • Complete second implementation of Extraction code
    • Expand Morphology code

Lifei

  1. Week 13 progress:
    • Sorted the data of folio f102r2 and other folios
      • Compared the similarities and differences between them
    • Continued work on the search engine of Phase 4 (Illustration)
      • In one Excel file, listed all the "special" words in f102r2
      • Clicking any of them goes to the folios that contain the same word
  2. Goals for Week 14 (SWOTVAC):
    • Try to improve the search engine
    • Complete the analysis of folio f102r2 and start a new folio

Semester 1 Week 14 (SWOTVAC)

Andrew

  1. Week 14 progress:
    • Continued Phase 2
      • Expansion of Extraction Code (less reliant on threshold values)
  2. Goals for Semester Break:
    • Complete Phase 2
      • Re-work and plot error rate of updated Extraction Code
    • Complete Phase 3
      • Determine possible prefixes within Voynich
      • Determine possible suffixes within Voynich

Lifei

  1. Week 14 progress:
  2. Goals for Semester Break:

Semester 2 Week 1

Andrew

  1. Week 1 Progress:
    • Continued Phase 3
      • Determining possible 4-character token suffixes within Voynich
      • Plotting most common cases
    • Researching on possible follow-up avenues for Phase 5 onwards
  2. Week 2 Goals:
    • Attend Progress Meeting with Supervisors
    • Complete plots for Phase 3
    • Determine follow-up avenues for Phase 5
    • Begin Phase 5
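
The suffix search described above (4-character token suffixes and their most common cases) might be approached along these lines. This Python sketch is only an assumption about one simple way to do it: it counts the last `n` characters of every word long enough to leave a non-empty stem. The function name `suffix_counts`, the `min_stem` parameter and the sample words are all hypothetical.

```python
from collections import Counter


def suffix_counts(words, n=4, min_stem=1):
    """Count the final n characters of each word that has at least
    min_stem characters left over in front of that suffix."""
    counts = Counter()
    for word in words:
        if len(word) >= n + min_stem:
            counts[word[-n:]] += 1
    return counts


# Illustrative words only; not the project's transcription data.
words = ["qokaiin", "okaiin", "daiin", "chol"]
suffixes = suffix_counts(words, n=4)
print(suffixes.most_common(3))
```

Prefixes can be counted the same way by taking `word[:n]` instead of `word[-n:]`.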

Lifei

  1. Week 1 Progress:
    • Continued Phase 4
      • Continued working on the search engine
      • Continued working on the Excel file
        • Clicking any word in folio f102 goes to the folios that contain the same word
  2. Week 2 Goals:
    • Attend Progress Meeting with Supervisors
    • Finish the Excel file

Semester 2 Week 2

Andrew

  1. Week 2 Progress:
    • Held Progress Meeting
    • Continued work on Morphology
      • Removing punctuation and cases from English
    • Creating summary table for all different transcriptions
  2. Week 3 Goals:
    • Create error rate graph for English Extraction
      • Continue updating/upgrading code
    • Re-create all prefix/suffix graphs
    • Help Lifei begin Wiki pages for unique words on each Voynich page
    • Research on Word Correlations

Lifei

  1. Week 2 Progress:
    • Held Progress Meeting
    • Continued work on Phase 4 Illustration Investigation
      • Moved the work from the Excel file to a wiki page
    • Created a new wiki page for the Illustration Investigation
    • Sorted data for each folio, starting from 1r
  2. Week 3 Goals:
    • Finish folios 1r and 1v on the wiki page.
    • Start comparing the illustration between folio 102r and other folios.

Semester 2 Week 3

Andrew

  1. Week 3 Progress:
    • Continued Research on Word Correlations
    • Re-created Prefix/Suffix Graphs
    • Created Error Rate Graph for Phase 2
  2. Week 4 Goals:
    • Begin coding for possible Word Correlation Statistics
      • Run tests on various texts to determine plausibility
    • Update English Extraction Code

Lifei

  1. Week 3 Progress:
    • Improved the search engine in wiki page
      • Finished folio 1r and 1v in wiki page.
      • Added data and content of the search engine based on the transcription by Takeshi Takahashi.
    • Started comparing the illustration between folio 102r and other folios.
  2. Week 4 Goals:
    • Continue working on and improve the search engine in wiki page.
    • Continue comparing folio 102r's illustrations with other folios

Semester 2 Week 4

Andrew

  1. Week 4 Progress:
    • Continued Word Correlations
      • Began Coding in MATLAB
      • Continued research on how word correlations have been used
  2. Week 5 Goals:
    • Complete basic code and tests on Word Correlations
      • Texts using English and various languages
    • Set-up Progress Meeting

Lifei

  1. Week 4 Progress:
    • Continued working on and improving the search engine on the wiki page.
    • Improved the Matlab code so that it provides the following function:
      • Given an absolute frequency range, find the words in the Voynich Manuscript whose frequency falls within that range
  2. Week 5 Goals:
    • Continue working on the search engine
    • Find out similar illustrations in the Voynich Manuscript
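
The frequency-range lookup described above can be illustrated with a short Python sketch. It is not the Matlab code from the log; it assumes "absolute frequency" means the raw occurrence count of a word and that both ends of the range are inclusive. The function name `words_in_frequency_range` is hypothetical.

```python
from collections import Counter


def words_in_frequency_range(words, low, high):
    """Return, in sorted order, the words whose occurrence count c
    satisfies low <= c <= high."""
    counts = Counter(words)
    return sorted(w for w, c in counts.items() if low <= c <= high)


# Illustrative input only.
words = "a a a b b c".split()
print(words_in_frequency_range(words, 2, 3))
```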

Semester 2 Week 5

Andrew

  1. Week 5 Progress:
    • Completed basic word correlation code
      • Needs improvement - Slow
    • Created basic word correlation tables
    • Minor improvement to English Extraction Code
  2. Week 6 Goals:
    • Write code to 'scramble' English text for word correlation testing
    • Improve Word Correlation code
    • Reschedule Progress Meeting

Lifei

  1. Week 5 Progress:
    • Listed the similar illustrations in the Voynich Manuscript in an Excel file
    • Completed the frame of the search engine in wiki page
    • Started to write Matlab code about Stylometry
  2. Week 6 Goals:
    • Attend Progress Meeting with Supervisors
    • Continue working on search engine in wiki page
    • Improve Stylometry code

Semester 2 Week 6

Andrew

  1. Week 6 Progress:
    • Held Progress Meeting
    • Completed C++ text 'scrambling' code
      • Only scrambles words per line for efficiency
    • Plotted scrambled English correlations
    • Minor efficiency improvements to Correlation code.
  2. Week 7 Goals:
    • Improve 'scrambler' code
      • Run on English and Voynich
      • Create separate tables for each
    • Add extra languages to suffix extraction
      • Plot and determine ratio of top 2
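
The per-line scrambling mentioned in the Week 6 progress can be sketched as below. The log states the original was written in C++; this is a separate Python illustration under the assumption that "scrambling" means randomly reordering the words within each line while leaving the line breaks in place. The function name `scramble_lines` and the optional `seed` parameter are hypothetical.

```python
import random


def scramble_lines(text, seed=None):
    """Shuffle the word order within each line; lines stay in their original order."""
    rng = random.Random(seed)   # a seed makes the shuffle repeatable
    scrambled = []
    for line in text.splitlines():
        words = line.split()
        rng.shuffle(words)      # in-place shuffle of this line's words
        scrambled.append(" ".join(words))
    return "\n".join(scrambled)


original = "the quick brown fox\njumps over the lazy dog"
shuffled = scramble_lines(original, seed=1)
print(shuffled)
```

Because only word order changes, word and character statistics are preserved, so any difference in word correlations between the original and scrambled text comes from ordering alone.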

Lifei

  1. Week 6 Progress:
    • Held Progress Meeting
    • Compared the transcriptions of the pages which contain similar illustrations
    • Completed Stylometry code
  2. Week 7 Goals:
    • Improve the Matlab code to achieve the following functions:
      • Find the unique words in each page of the Voynich
      • Find the unique words of the Voynich in each page.
    • Continue working on the search engine
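
One possible reading of the two "unique word" goals above is (a) the distinct words on each page and (b) the words that occur on that page and on no other page. The Python sketch below follows that reading; it is an assumption about the intent, not the Matlab code from the log, and the function name `page_word_stats` and the page labels in the example are illustrative.

```python
from collections import Counter


def page_word_stats(pages):
    """pages maps a page label to its list of words.

    For each page, report the total word count, the distinct words,
    and the words found on that page only.
    """
    # Count on how many pages each word appears (not how many times).
    page_count = Counter()
    for words in pages.values():
        page_count.update(set(words))

    stats = {}
    for label, words in pages.items():
        vocab = set(words)
        stats[label] = {
            "total_words": len(words),
            "distinct_words": sorted(vocab),
            "only_on_this_page": sorted(w for w in vocab if page_count[w] == 1),
        }
    return stats


# Illustrative input only; not real transcription data.
stats = page_word_stats({"f1r": ["a", "b", "a"], "f1v": ["b", "c"]})
print(stats["f1r"])
```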

Semester 2 Week 7

Andrew

  1. Week 7 Progress:
    • Completed English Randomised Collocations
      • Random 'sentence' lengths of 100, 200, 500, 1000
    • Ran into issues with randomising Voynich pages
      • Outputs missing words within sentences
      • Re-writing code to fix this issue
  2. Week 8 Goals:
    • Complete Voynich Random Collocations
      • Graph and compare results with English
    • Complete Prefix and Suffix extraction for other languages
      • Compare top 2

Lifei

  1. Week 7 Progress:
    • Improved the Matlab code to achieve the following functions:
      • Find the unique words in each page of the Voynich
      • Find the unique words of the Voynich in each page.
    • Continued working on the search engine in wiki page
  2. Week 8 Goals:
    • Improve the Matlab code to generate a separate txt file for each page
      • The txt file should contain the following data
        • Total number of words on each page
        • Total word length of each page as well as the average word length
        • List of unique words in each page as well as the total number of unique words
        • List of unique words of the Voynich in each page as well as the total number.
    • Complete uploading the pictures and transcriptions of each folio for the search engine on the wiki page

Semester 2 Week 8

Andrew

  1. Week 8 Progress:
    • Completed basic Random Collocations for both English and Voynich
      • To be expanded to other languages
    • Continued Prefix and Suffix extraction
      • Did not manage to complete; more data on other languages is required
  2. Week 8a Goals:
    • Find and run Collocation tests on other languages
      • Improve Collocation extraction code to be more efficient
    • Complete Prefix and Suffix extraction on other languages
    • Set-up Final Seminar times and answer exhibition questions.

Lifei

  1. Week 8 Progress:
    • Improved the Matlab code to generate a separate txt file containing the statistical data for each page.
    • Completed the Matlab code for analysing each section of the Voynich
      • Total number of words in each section.
      • Word frequencies in each section.
    • Completed uploading the pictures and transcriptions of each folio for the search engine on the wiki page
  2. Week 8a Goals:
    • Collect and sort data using the Matlab program.
    • Complete the search engine on the wiki page
    • Improve the Matlab code
      • Compare each section of the Voynich with the others and work out the words that appear only in that section.

Semester 2 Week 8a (Semester Break Wk 1)

Andrew

  1. Week 8a Progress:
    • Submitted seminar times and exhibition questions
    • Added extra languages to:
      • Collocation Tests
      • Prefix and Suffix Test
      • Possible issues with removal of punctuation and lower-case in languages
        • Removes accented letters
  2. Week 8b Goals:
    • Begin seminar preparation
    • Improve punctuation removal and lower-case conversion code
      • Re-run Collocation, Prefix and Suffix Tests
    • Test different ranking method(s) on Collocations
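
The issue noted above, where punctuation removal and lower-casing also strip accented letters, typically arises when filtering is based on the ASCII range. One Unicode-aware alternative is sketched below in Python; it is an assumption about a possible fix, not the project's code. It keeps any character whose Unicode general category is a letter or combining mark, keeps whitespace, and drops everything else (punctuation, digits, symbols). The function name `normalise` is hypothetical.

```python
import unicodedata


def normalise(text):
    """Drop punctuation, digits and symbols, keep letters (including
    accented ones) and whitespace, then convert to lower case."""
    kept = []
    for ch in text:
        category = unicodedata.category(ch)
        if category.startswith("L") or category.startswith("M"):
            kept.append(ch)          # letters and combining marks
        elif ch.isspace():
            kept.append(" ")         # collapse any whitespace char to a space
        # all other categories are dropped
    return "".join(kept).lower()


print(normalise("Café, DÉJÀ vu!"))
```

Note that `str.lower()` follows Unicode case mappings, which may still be unsuitable for scripts or languages with special casing rules.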

Lifei

  1. Week 8a Progress:
    • Completed the search engine in wiki page
    • Improved the Matlab code
      • Compare each section of the Voynich with the others and work out the words that appear only in that section.
  2. Week 8b Goals:
    • Make figure to show the words that are unique to each folio and section
    • Analyse the words that are unique to each section.
      • Most frequent unique words of each section
      • Find out the folios which contain the unique words

Semester 2 Week 8b (Semester Break Wk 2)

Andrew

  1. Week 8b Progress:
    • Began Final Seminar Transcription
    • Reran tests using modified punctuation and lower-case code
      • Added different ranking methods to collocations
    • Began separate final report/thesis document
  2. Week 9 Goals:
    • Complete Final Seminar preparation
    • Continue final report/thesis
    • Begin exhibition preparation
      • Exhibition poster
      • Continue Exhibition Wiki page

Lifei

  1. Week 8b Progress:
  2. Week 9 Goals:

Semester 2 Week 9

Andrew

  1. Week 9 Progress:
    • Making final touches to Final Seminar Slides
      • Practicing using transcript
    • Writing separate final report/thesis as per course requirements
  2. Week 10 Goals:
    • Complete final seminar on Tuesday
    • Continue final report/thesis
    • Work on exhibition preparation
      • Update exhibition wiki
      • Make exhibition poster

Lifei

  1. Week 9 Progress:
  2. Week 10 Goals:

Semester 2 Week 10

Andrew

  1. Week 10 Progress:
    • Completed and Recorded Final Seminar Presentation
    • Continuing Final Report/Thesis
    • Updated/Reran Collocation Tests due to error in Vietnamese
      • Removed lower-case converter as it may not function correctly with languages other than English
  2. Week 11 Goals:
    • Complete Final Report/Thesis
      • Convert to wiki page (written as separate .pdf files as per course requirements discussed with Hong Gunn)
    • Complete Exhibition Poster
      • Determine the different 3-minute speeches for expo
    • Begin planning 5 minute YouTube video.

Lifei

  1. Week 10 Progress:
  2. Week 11 Goals:

Semester 2 Week 11

Andrew

  1. Week 11 Progress:
  2. Week 12 Goals:

Lifei

  1. Week 11 Progress:
  2. Week 12 Goals:

Semester 2 Week 12

Andrew

  1. Week 12 Progress:

Lifei

  1. Week 12 Progress:

END OF PROJECT
