Electronic Voynich Transcriptions


We noticed in our initial research, and throughout the year, that a large portion of the research on the Voynich manuscript (VMS) has been done on isolated sections of the manuscript. To make a broader selection of data available for testing, we assessed the quality of each transcription available in the EVT file. All transcriptions in this file have been converted, where necessary, to the Electronic Voynich Alphabet (EVA), which takes an analytic approach to character selection and avoids some of the problems faced by the more simplistic Currier alphabet.

We first looked at the “completeness” of each transcription, measured as the percentage of all lines included in the EVT that the transcription covers. The results of this test can be seen in Figure X4. By this metric, the Takahashi transcription was the clear leader: it included data for 97.2% of the 5239 lines in the manuscript (21% more than any other transcription). Of the lines missing from the Takahashi transcription, nearly all were short labels from the astronomical and cosmological folios that were transcribed by Stolfi and/or Grove.
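The completeness measure described above can be sketched in a few lines of Python. This is a minimal illustration, not the script used for Figure X4: the function name `completeness`, the transcriber codes and the line identifiers are all hypothetical, and the toy data is not taken from the EVT.

```python
def completeness(lines_by_transcriber, total_lines):
    """Return, for each transcriber, the percentage of all lines covered.

    lines_by_transcriber maps a transcriber code to the set of line
    identifiers for which that transcriber supplied a reading.
    total_lines is the number of distinct lines in the whole file.
    """
    return {
        code: 100.0 * len(line_ids) / total_lines
        for code, line_ids in lines_by_transcriber.items()
    }


# Hypothetical toy data (assumption: not real EVT content).
coverage = {
    "H": {"f1r.1", "f1r.2", "f1r.3", "f1v.1"},
    "C": {"f1r.1", "f1r.2"},
}

# The denominator is every line that appears anywhere in the file.
total = len(set().union(*coverage.values()))

for code, pct in sorted(completeness(coverage, total).items()):
    print(f"{code}: {pct:.1f}%")
```

Using the union of all transcribed lines as the denominator mirrors the definition in the text, where completeness is relative to the lines included in the EVT rather than to an independent count of the manuscript.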

We also looked at the accuracy of each transcription, as recorded in the EVT. This was difficult to test rigorously without developing our own transcription from scans of the manuscript, so we used an approximate method: counting the characters for which a transcription agrees with the ‘majority rule’, that is, with the reading given by most of the transcriptions of that character. The results for selected transcriptions can be seen in Table X5. Takahashi again leads in this metric, with an “accuracy” of 99.1%.
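One way such a majority-rule comparison could be implemented is sketched below. It rests on several assumptions that the text does not state: the readings of a line are already aligned character by character and have equal length, a character position only counts when one reading has a strict majority, and the name `majority_agreement` and the sample strings are hypothetical.

```python
from collections import Counter


def majority_agreement(versions):
    """Count, per transcriber, the characters that match the majority reading.

    versions maps a transcriber code to that transcriber's reading of one
    line. Readings are assumed to be aligned and of equal length.
    Returns {code: (agreeing_characters, counted_positions)}.
    """
    codes = list(versions)
    lengths = {len(text) for text in versions.values()}
    if len(lengths) != 1:
        raise ValueError("readings must be aligned to the same length")
    length = lengths.pop()

    agree = {code: 0 for code in codes}
    counted = 0
    for i in range(length):
        column = [versions[code][i] for code in codes]
        top_char, top_count = Counter(column).most_common(1)[0]
        # Assumption: skip positions where no reading has a strict majority.
        if top_count * 2 <= len(column):
            continue
        counted += 1
        for code in codes:
            if versions[code][i] == top_char:
                agree[code] += 1
    return {code: (agree[code], counted) for code in codes}


# Hypothetical aligned readings of a single word.
sample = {"H": "daiin", "C": "daiin", "F": "dairn"}
for code, (hits, total) in majority_agreement(sample).items():
    print(f"{code}: {hits}/{total} characters agree with the majority")
```

Summing the two counts over every line and dividing would give a per-transcriber percentage comparable in spirit to the “accuracy” reported above, although how ties and unaligned readings were actually treated is not specified here.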

Finally, we used the same approximate method to find the more ‘reliable’ pages and sections in the EVT. These data are shown in Figure X5. Each bar represents a page, and a higher bar represents more agreement between transcriptions. In Figure X5a, the colour of a bar represents the average number of transcriptions of that page; more purple means more transcriptions. In Figure X5b, the colour of each bar represents the section to which the page belongs, with a legend as shown.
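The per-page values behind a chart like Figure X5 could be aggregated roughly as follows. This is a sketch under assumptions: each record is a hypothetical tuple of page name, agreeing characters, counted characters and number of transcriptions for one line, the agreement for a page is pooled over all of its characters, and the function name `page_reliability` is invented for illustration.

```python
from collections import defaultdict


def page_reliability(records):
    """Aggregate per-line agreement counts into per-page figures.

    records is an iterable of (page, agree_chars, total_chars, n_transcriptions),
    one tuple per line. Returns {page: (agreement_fraction, mean_transcriptions)}.
    Pages with no counted characters are left out.
    """
    # Per page: [agreeing chars, counted chars, sum of transcription counts, lines]
    totals = defaultdict(lambda: [0, 0, 0, 0])
    for page, agree, total, n_transcriptions in records:
        entry = totals[page]
        entry[0] += agree
        entry[1] += total
        entry[2] += n_transcriptions
        entry[3] += 1

    return {
        page: (entry[0] / entry[1], entry[2] / entry[3])
        for page, entry in totals.items()
        if entry[1] > 0
    }


# Hypothetical per-line records (assumption: not real EVT figures).
lines = [
    ("f1r", 9, 10, 3),
    ("f1r", 10, 10, 2),
    ("f1v", 6, 10, 2),
]
for page, (agreement, mean_n) in page_reliability(lines).items():
    print(f"{page}: agreement {agreement:.2f}, mean transcriptions {mean_n:.1f}")
```

In this sketch the first value would set the bar height and the second the colour in a chart like Figure X5a; replacing the second value with a section label per page would give the colouring of Figure X5b.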

Figure X5a: Transcription accuracy gradient.png
Figure X5b: Transcription accuracy stolfisection.png