Editing
Final Report/Thesis 2015
(section)
Jump to navigation
Jump to search
Warning:
You are not logged in. Your IP address will be publicly visible if you make any edits. If you
log in
or
create an account
, your edits will be attributed to your username, along with other benefits.
Anti-spam check. Do
not
fill this in!
=====Parametric Testing===== Firstly, a group of test texts were prepared, using 3 groups of 44 letters from the English, French and German languages from the ''Universal Declaration of Human Rights'', as well as the Somerton Man Code. The test texts were analysed using ANOVA in Excel. Once these results were obtained, the ''p-value'' calculation method used was tested using various pieces of software. A test case with known ''p-value'' result from the Engineering Maths IIA notes was run through ANOVA in Microsoft Excel, as well as using built-in MATLAB functions found in Engineering Maths IIA notes, and by hand. All of these methods produced the same ''p-value'' using the test case with known ''p-value'' and so the ANOVA method was verified to be functioning correctly. ======Parametric Testing Results====== ANOVA One Way analysis was used in Microsoft Excel to compare letter frequency between texts, as well as ANOVA Two Way Analysis of Variance, where Factor 1 was the letter, Factor 2 was the text and the response was the letter frequency. Both of these methods did not produce valid ''p-values'' as they used the comparison of total means and variances. Paired data analysis using a paired sample t-test was researched and attempted based on the Engineering Maths IIA Notes. This was computed using MATLAB, but also produced unusable ''p-values'' since the method uses mean and standard deviation of the differences between samples to calculate the ''p-value'', thus letter and text type data is lost in the process and so is not applicable. The ''p-values'' calculated using these methods fluctuated depending on the type of data used. If the raw number of letter frequencies for the sample text or code (44 letters) versus the ''Universal Declaration of Human Rights'' reference text (1000-2000 letters depending on language) was used, then the ''p-value'' became very small since the total means of each text were vastly different. Whereas, if the frequency of each letter as a percentage of the total number of letters in each text were used, this gave a ''p-value'' of 1 since the means became the same. Thus this method of mean comparison did not work for letter frequency analysis. ======Parametric Testing Evaluation====== All of these methods did not compare individual data entries from one group to their corresponding entry in another group. Instead they used the sample size, sum, mean, and variance of each group to compare to the other group. Because of this, this method could not be used when comparing letter frequency between languages. For example the frequency of the letter A in one text, must be compared with the frequency of the letter A in another text, rather than the frequency of all letters in one text being compared to the frequency of all letters in another text.
Summary:
Please note that all contributions to Derek may be edited, altered, or removed by other contributors. If you do not want your writing to be edited mercilessly, then do not submit it here.
You are also promising us that you wrote this yourself, or copied it from a public domain or similar free resource (see
Derek:Copyrights
for details).
Do not submit copyrighted work without permission!
Cancel
Editing help
(opens in new window)
Navigation menu
Personal tools
Not logged in
Talk
Contributions
Create account
Log in
Namespaces
Page
Discussion
English
Views
Read
Edit
View history
More
Search
Navigation
Main page
Recent changes
Random page
Help about MediaWiki
Tools
What links here
Related changes
Special pages
Page information