Difference between revisions of "Authorship detection: 2011 group"

 
(83 intermediate revisions by 3 users not shown)
Line 89: Line 89:
 
'''Progress and Status This Week'''

# Attend the literature search training
# Identify the algorithms used in this project
# Prepare the proposed algorithms and complete the slides for the presentation
# Further reading on research papers

''' Plan and Goals for Next Week '''

# Set up the Work Breakdown Structure, Milestones, Gantt Chart and Project Budget
# Send the presentation slides to supervisors
# Prepare the presentation for next week's proposal seminar
# Analyze the proposed algorithms used in this project
# Research and discuss the classifier
# Have a team meeting with the other members

====Zhaokun Wang====
'''Progress and Status This Week'''

# Write the abstract for the proposal seminar.
# Allocate a seminar role to each group member.
# Prepare outline PowerPoint slides.
# Identify the basic idea of the project.

''' Plan and Goals for Next Week '''

# Present the proposal seminar.
# Identify the methods for the project.
# Identify the classifiers for the project.

===Semester 2, Week 3===
Line 116: Line 121:
 
====Yan Xie====
'''Progress and Status this week:'''
# Complete the Gantt Chart, Work Breakdown Structure, Milestones, Budget and risk analysis with the other team members
# Modify the presentation slides
# Prepare the presentation
# Introduce the Common N-grams method

'''Plan and Goals for new week:'''
# Research the SVM classifier to be used with the Common N-grams algorithm
# Start to design the Common N-grams method
# Make a template for the stage one progress report

====Kai He====
'''Progress and Status This Week'''

# Modify the slides after getting feedback from Brian
# Prepare this week's presentation
# Identify the classifiers to be used with the algorithms
# Plan the upcoming goals for the proposed algorithms
# Start to design the method: Maximal Frequent Word Sequence

''' Plan and Goals for Next Week '''

# Review the Maximal Frequent Word Sequence method in detail
# Understand the Naïve Bayes classifier
# Prepare the stage one progress report

====Zhaokun Wang====
'''Progress and Status This Week'''

# Discuss the proposal slides with Brian.
# Modify the slides.
# Present the proposal seminar.

''' Plan and Goals for Next Week '''

# Further research on the methods.
# Prepare for the stage one report.

===Semester 2, Week 4===
Line 155: Line 163:
 
====Yan Xie====
'''Progress and Status this week:'''
# Work on the Common N-grams method in Java
# Read the papers on the algorithm and the classifier in full
# Discuss the design of Common N-grams with the other members
# Delegate tasks of the stage one progress report to individual members

'''Plan and Goals for new week:'''
# Complete the Executive Summary, Previous Studies, Coding Requirements and Tasks on Stage Two Report parts of the stage one progress report
# Modify the Work Breakdown Structure, Risk Assessment, Milestones, Monitoring Scheme and Proposed Budget
# Complete the write-up on Common N-grams and SVM
# Write up the draft of the stage one progress report and send it to supervisors for feedback
# Revise the stage one progress report until the deadline

====Kai He====
'''Progress and Status This Week'''

# Research on the Maximal Frequent Word Sequence method has been completed
# Coding of Maximal Frequent Word Sequence
# Have a meeting with the other members to delegate the tasks of the stage one progress report
# Write the Project Background and Significance, Technical Background, Motivations and Key Requirements sections of the stage one progress report
# Modify the stage one report against the criteria
# Grammar checking

''' Plan and Goals for Next Week '''

# Coding of Maximal Frequent Word Sequence
# Complete my tasks for the stage one report
# Send the draft to supervisors
# Modify and format

====Zhaokun Wang====
'''Progress and Status This Week'''

# Test previous methods.
# Compare with previous research, then clarify and identify the methods and classifiers we will use.
# Work on the stage one report.

''' Plan and Goals for Next Week '''

# Finish the stage one report.
# Allocate report roles to each group member.

===Semester 2, Week 5===
Line 194: Line 209:
 
====Yan Xie====
'''Progress and Status this week:'''
# Finished my allocated parts of the stage one report
# Attend the weekly group meeting and discuss uncompleted sections
# Help with formatting
# Send the report draft to supervisors
# Modify the report after getting feedback from supervisors

'''Plan and Goals for new week:'''
# Develop the Common N-grams method
# Read papers
# Learn to use SVM

====Kai He====
'''Progress and Status This Week'''

# Finish Project Background and Significance, Technical Background, Motivations and Key Requirements
# Write Input and Output Specifications, and Testing and Verification
# Help to write the Project Management part
# Grammar checking and formatting
# Modify the stage one progress report after getting feedback from supervisors
# Finish the final version of the stage one progress report and submit it
# Coding of Maximal Frequent Word Sequence

''' Plan and Goals for Next Week '''

# Coding of Maximal Frequent Word Sequence
# Have a meeting with the other members to discuss the upcoming goals
# Review papers

====Zhaokun Wang====
'''Progress and Status This Week'''

# Allocate stage one report roles.
# Research method allocated to me: Common N-gram.
# Classifier method allocated to me: dissimilarity calculation.
# Modify the stage one report after feedback.

''' Plan and Goals for Next Week '''

# Coding and developing N-gram
# Researching dissimilarity

===Semester 2, Week 6===
Line 233: Line 256:
 
====Yan Xie====
'''Progress and Status this week:'''
# Read the papers on the Common N-grams algorithm
# Have an overall program structure for Common N-grams
# Review the paper on SVM
# SVM classifier: still considering how to use the produced output text file as the SVM input
# Participate in the group meeting

'''Plan and Goals for new week:'''
# Discuss the code with the team
# Coding of Common N-grams
# Design SVM

====Kai He====
'''Progress and Status This Week'''

# Research on Maximal Frequent Word Sequence
# Develop the program for Maximal Frequent Word Sequence
# Debugging
# Help the other members with coding

''' Plan and Goals for Next Week '''

# Complete about 30%-40% of the code for data extraction using Maximal Frequent Word Sequence
# Discuss classifiers

====Zhaokun Wang====
'''Progress and Status This Week'''

# Learning and coding N-gram
# Debugging

''' Plan and Goals for Next Week '''

# Discuss coding within the team
# Design the classifier method

===Semester 2, Week 7===
Line 272: Line 295:
 
====Yan Xie====
'''Progress and Status this week:'''
# Discuss the Common N-grams problems with the other members
# Finish about 50% of the code for data extraction using Common N-grams
# Have a group meeting with the other two members to report my current progress on the Common N-grams extraction method
# Introduce the stage two report

'''Plan and Goals for new week:'''
# Continue coding of Common N-grams
# Participate in the meeting about the stage two report with the other members
# Try to figure out how to use the SVM function in MATLAB

====Kai He====
'''Progress and Status This Week'''

# Review the papers:
#* Algorithm for Maximal Frequent Sequences in Document Clustering
#* Experimenting with Maximal Frequent Sequences for Multi-Document Summarization
#* Discovery of Frequent Word Sequences in Text
# Finished 30% of the code for data extraction using Maximal Frequent Word Sequence
# Review the paper Augmenting Naïve Bayes Classifiers with Statistical Language Models
# Review the criteria for the stage two report

''' Plan and Goals for Next Week '''

# Coding and debugging
# Discuss how to pass the output data from Maximal Frequent Word Sequence to the Naïve Bayes classifier
# Prepare the stage two report

====Zhaokun Wang====
'''Progress and Status This Week'''

# Group meeting, discussing with other team members.
# Coding N-gram
# Structuring the stage two report

''' Plan and Goals for Next Week '''

# Keep coding N-gram
# Group meeting about the stage two report
# Begin coding the dissimilarity classifier

===Semester 2, Week 8===
Line 311: Line 340:
 
====Yan Xie====
'''Progress and Status this week:'''
# Coding of Common N-grams
# Discuss the project management
# Investigate SVM in MATLAB
# Participate in a meeting to discuss how to apply the generated data to the classifier

'''Plan and Goals for new week:'''
# Complete software coding v1.0 by the end of Week 11
# Start to write the stage two report
# Review SVM from the previous attempt

====Kai He====
'''Progress and Status This Week'''

# Coding and debugging of Maximal Frequent Word Sequence
# Further research on Naïve Bayes
# Discuss the Naïve Bayes classifier with the other members

''' Plan and Goals for Next Week '''

# Write the project management section of the stage two report
# Continue coding and debugging
# Weekly meeting with the other team members

====Zhaokun Wang====
'''Progress and Status This Week'''

# Coding N-gram
# Group meeting about the stage two report
# Try to begin coding the dissimilarity classifier
# Researching the dissimilarity classifier

''' Plan and Goals for Next Week '''

# Write the stage two report
# Group meeting

===Semester 2, Week 9===
Line 350: Line 381:
 
====Yan Xie====
'''Progress and Status this week:'''
# Add some new classes to the Common N-grams code
# Code modification
# Weekly meeting with the other team members to report the progress of Common N-grams coding
# Write the Project Objectives, Background, Algorithm Programming and Project Management parts of the stage two report
# Get feedback on the stage one progress report from Brian

'''Plan and Goals for new week:'''
# Complete software coding v1.0 by the end of Week 11
# Continue code modification
# Testing

====Kai He====
'''Progress and Status This Week'''

# Finished 60% of the code for data extraction using Maximal Frequent Word Sequence
# Help debug the Common N-grams code
# Report the code progress so far in the team meeting
# Set up the upcoming goals: Software Coding V1.0, Stage 2 Report Due, Software Testing V1.0 and Software Coding V2.0
# Start to design the training and classification processes using the Naïve Bayes classifier
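
The training and classification processes named in the list above could be organized roughly as follows. This is only a minimal sketch of a multinomial Naïve Bayes classifier with add-one smoothing, not the group's program: the class name <code>NaiveBayesSketch</code>, its method names, and the representation of a document as a list of feature strings are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical multinomial Naive Bayes with add-one (Laplace) smoothing. */
final class NaiveBayesSketch {
    private final Map<String, Map<String, Integer>> countsByAuthor = new HashMap<>();
    private final Map<String, Integer> totalByAuthor = new HashMap<>();
    private final Map<String, Integer> docsByAuthor = new HashMap<>();
    private final Set<String> vocabulary = new HashSet<>();
    private int totalDocs = 0;

    /** Training: accumulate feature counts for one document of a known author. */
    void train(String author, List<String> features) {
        Map<String, Integer> counts =
                countsByAuthor.computeIfAbsent(author, a -> new HashMap<>());
        for (String f : features) {
            counts.merge(f, 1, Integer::sum);
            totalByAuthor.merge(author, 1, Integer::sum);
            vocabulary.add(f);
        }
        docsByAuthor.merge(author, 1, Integer::sum);
        totalDocs++;
    }

    /** log P(author) plus the sum of log P(feature | author), smoothed for unseen features. */
    double logScore(String author, List<String> features) {
        Map<String, Integer> counts = countsByAuthor.get(author);
        int total = totalByAuthor.getOrDefault(author, 0);
        double score = Math.log(docsByAuthor.get(author) / (double) totalDocs);
        for (String f : features) {
            int c = counts.getOrDefault(f, 0);
            score += Math.log((c + 1.0) / (total + vocabulary.size()));
        }
        return score;
    }

    /** Classification: return the trained author with the highest log score. */
    String classify(List<String> features) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String author : countsByAuthor.keySet()) {
            double s = logScore(author, features);
            if (s > bestScore) {
                bestScore = s;
                best = author;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        NaiveBayesSketch nb = new NaiveBayesSketch();
        nb.train("A", List.of("the", "sea", "the", "ship"));
        nb.train("B", List.of("a", "court", "a", "law"));
        System.out.println(nb.classify(List.of("the", "ship")));
    }
}
```

Working in log space avoids numerical underflow when many small probabilities are multiplied, and the add-one term keeps a feature never seen for an author from forcing that author's score to zero.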
  
 
''' Plan and Goals for Next Week '''
 
''' Plan and Goals for Next Week '''
  
#
+
# Write the parts of Introduction, Objectives, Background, Algorithm Definition, Work Breakdown Structure, Milestones and Budgets on the stage two report
#
+
# Choose some simple text files to test
#
+
# Further research on the classifier of Naïve Bayes
  
 
====Zhaokun Wang====
 
====Zhaokun Wang====
 
'''Progress and Status This Week'''
 
'''Progress and Status This Week'''
  
#
+
# Coding and debugging N-gram
#
+
# Writing stage two report
#
+
# Changing a little bit progress about schedule
 +
# Group meeting with group member and report the stages up to now
 +
 
  
 
''' Plan and Goals for Next Week '''
 
''' Plan and Goals for Next Week '''
  
#
+
# Writing the stage two report
#
+
# Developing on dissimilarity classifier
#
+
# Testing
  
 
===Semester 2, Week 10===
 
===Semester 2, Week 10===
Line 389: Line 426:
 
====Yan Xie====
'''Progress and Status this week:'''
# Finished most of the Common N-grams code
# Delete the unused inner classes
# Discuss SVM with the other team members

'''Plan and Goals for new week:'''
# Complete software coding v1.0 by the end of Week 11
# Figure out SVM
# Try to test the code using some simple text files
# Write the stage two report

====Kai He====
'''Progress and Status This Week'''

# Write the stage two report
# Complete the coding of Maximal Frequent Word Sequence
# Work on a modified Maximal Frequent Word Sequence
# Test efficiency using different input texts

''' Plan and Goals for Next Week '''

# Modify the Maximal Frequent Word Sequence code
# Design the Naïve Bayes classifier
# Report the progress in the team meeting
# Continue writing the stage two report

====Zhaokun Wang====
'''Progress and Status This Week'''

# Testing and debugging the N-gram code
# Writing the stage two report
# Coding the dissimilarity classifier
# Group meeting

''' Plan and Goals for Next Week '''

# Finish coding N-gram
# Coding the dissimilarity classifier
# Writing the report
# Group meeting to report N-gram coding

===Semester 2, Week 11===
Line 428: Line 473:
 
====Yan Xie====
'''Progress and Status this week:'''
# Complete software coding v1.0 of the Common N-grams
# Use my own text to verify that the code works properly
# Compare results from a small test file and a large test file
# Begin building large sets of training and testing data for SVM by randomly collecting extracted features from the author profiles
# Finished the draft of the stage two report

'''Plan and Goals for new week:'''
# Modify the stage two report
# Submit the stage two report
# Use the same training data and unknown data to test the two extraction algorithms

====Kai He====
'''Progress and Status This Week'''

# Complete the draft of the stage two report
# Grammar checking and formatting
# The output of the Maximal Frequent Word Sequence code is not correct; modification is needed

''' Plan and Goals for Next Week '''

# Deliver the stage two report
# Complete the code of Maximal Frequent Word Sequence
# Test the output
# Have a meeting to discuss the upcoming goals

====Zhaokun Wang====
'''Progress and Status This Week'''

# Modify and finish N-gram
# Testing the N-gram code using training texts
# Coding the dissimilarity classifier
# Working on the stage two report

''' Plan and Goals for Next Week '''

# Modify the stage two report
# Use training data to test the N-gram code

===Semester 2, Week 12===
Line 467: Line 517:
 
====Yan Xie====
'''Progress and Status this week:'''
# Submit the stage two report and send it to supervisors
# Report my individual work done so far
# Report that the Common N-grams code is completed and tested
# Report the progress of SVM
# Discuss the upcoming goals with the other members

'''Plan and Goals for new week:'''
# Prepare for exams

====Kai He====
'''Progress and Status This Week'''

# Send my stage two report to supervisors
# Weekly meeting with the other team members to report the progress of the project

''' Plan and Goals for Next Week '''

# Stop project work
# Work on exams

====Zhaokun Wang====
'''Progress and Status This Week'''

# Submit the stage two report
# Group meeting to report progress
# Coding the dissimilarity classifier

''' Plan and Goals for Next Week '''

# None (prepare for final exams)

===Semester 1, Week 1===
Line 506: Line 551:
 
====Yan Xie====
'''Progress and Status this week:'''
# Review two algorithms and three classifiers
# Group members present their individual reports so far at the weekly group meeting
# Work on coding the SVM program
# Check the Milestones for the upcoming goals

'''Plan and Goals for new week:'''
# Email supervisors to arrange a meeting to report the progress of the report
# Discuss the performance of the current progress
# Modify the SVM program
# Prepare the project description and images for the project exhibition

====Kai He====
'''Progress and Status This Week'''

# Meet with the team members to discuss the classifiers
# Simplify the code of Maximal Frequent Word Sequence
# Work on the Naïve Bayes classifier
# Do some testing

''' Plan and Goals for Next Week '''

# Arrange a meeting time with supervisors
# Discuss the key methods used in Naïve Bayes with the team

====Zhaokun Wang====
'''Progress and Status This Week'''

# Group meeting to report progress of the project during the summer break
# Keep coding the dissimilarity classifier
# Do testing on training data

''' Plan and Goals for Next Week '''

# Plan a meeting with the supervisor
# Modify and code the dissimilarity classifier

===Semester 1, Week 2===
Line 545: Line 592:
 
====Yan Xie====
'''Progress and Status this week:'''
# Confirm a meeting time with supervisors
# Complete a project description and image, and email them to Braden
# Discuss SVM with the team members
# Continue working on SVM

'''Plan and Goals for new week:'''
# Meet up with supervisors
# Code modification
# Plan the upcoming goals within the team
# Test programs using English text
# Start to prepare the exhibition and final seminar

====Kai He====
'''Progress and Status This Week'''

# Finished half of the Naïve Bayes program
# Change the classes in the program
# Code modification
# Check the project description and image
# Have a brief meeting with the team members

''' Plan and Goals for Next Week '''

# Have a meeting with supervisors
# Develop software
# Prepare the exhibition and final seminar

====Zhaokun Wang====
'''Progress and Status This Week'''

# Group meeting
# Modify and code the dissimilarity classifier
# Working on the project description and image

''' Plan and Goals for Next Week '''

# Meeting with supervisor
# Keep coding
# Prepare for the final seminar

===Semester 1, Week 3===

====Yan Xie====
'''Progress and Status this week:'''
# Get feedback from the meeting with supervisors
# Consider punctuation removal, lowercase conversion, space combination and word overlapping
# Develop the Java code for Common N-gram
# Analyze the poor results from texts with chapter numbers and titles
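
The text clean-up steps listed above (punctuation removal, lowercase conversion, space combination) can be illustrated with a short sketch. It is not the group's code: the class and method names are hypothetical, "space combination" is interpreted here as merging runs of whitespace into a single space, and the <code>\p{Punct}</code> pattern only covers ASCII punctuation, so Greek-specific marks would need extra handling.

```java
import java.util.Locale;

/** Hypothetical text clean-up applied before n-gram extraction. */
final class TextNormalizerSketch {

    /** Removes ASCII punctuation, lowercases, and merges whitespace runs into one space. */
    static String normalize(String text) {
        String noPunctuation = text.replaceAll("\\p{Punct}", "");
        String lower = noPunctuation.toLowerCase(Locale.ROOT);
        return lower.replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) {
        System.out.println(normalize("Hello,   World!\nIt's  fine."));
    }
}
```

Removing chapter numbers and titles is a separate step that depends on how each source text marks them, so it is left out of this sketch.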

'''Plan and Goals for new week:'''
# Complete the Java code for Common N-gram
# Test the 155 English texts, 82 Federalist Papers and 27 Greek New Testament texts

====Kai He====
'''Progress and Status This Week'''

# Have a meeting with supervisors to discuss our project’s progress.
# Consider how to implement overlap detection using colors in Java.
# Continue developing the Maximal Frequent Word Sequence algorithm
# Start preparing for the final seminar in Week 6.

''' Plan and Goals for Next Week '''

# Finish coding the Maximal Frequent Word Sequence algorithm
# Have a draft for the final seminar.

====Zhaokun Wang====
'''Progress and Status This Week'''

# Getting feedback from supervisor
# Fixing N-gram (following suggestions from supervisors)
# Group meeting with team members

''' Plan and Goals for Next Week '''

# Keep working on the dissimilarity classifier
# Finish fixing N-gram

===Semester 1, Week 4===

====Yan Xie====
'''Progress and Status this week:'''
# Remove all chapter numbers and titles
# Add a ranking method to the program
# Finish the Common N-gram code
# Run the completed program on the 155 English texts, 82 Federalist Papers and 27 Greek New Testament texts
# Draft the structure of the final seminar PPT
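
As an illustration of the n-gram counting and ranking steps in the list above, the sketch below counts n-grams and keeps the most frequent ones. It is written under assumptions: the log does not say whether the group's n-grams were taken over characters or words, so character n-grams are assumed, and the names <code>NGramProfileSketch</code>, <code>countNGrams</code> and <code>rank</code> are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical character n-gram profile builder with frequency ranking. */
final class NGramProfileSketch {

    /** Counts every character n-gram of length n in the text. */
    static Map<String, Integer> countNGrams(String text, int n) {
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i + n <= text.length(); i++) {
            counts.merge(text.substring(i, i + n), 1, Integer::sum);
        }
        return counts;
    }

    /** Sorts by descending count (ties alphabetically) and keeps at most `limit` entries. */
    static List<Map.Entry<String, Integer>> rank(Map<String, Integer> counts, int limit) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        return new ArrayList<>(entries.subList(0, Math.min(limit, entries.size())));
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = countNGrams("the theme of the thesis", 2);
        for (Map.Entry<String, Integer> e : rank(counts, 5)) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```

Cutting the ranked list at a fixed length is one simple way to drop the long tail of rare n-grams; a minimum-count threshold would be an alternative.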

'''Plan and Goals for new week:'''
# Analyze the output for the tested texts and consider removing the tail and setting a threshold for the large training data set
# Discuss the test results with the group members
# Prepare the slides for the final seminar with the group members

====Kai He====
'''Progress and Status This Week'''

# The Maximal Frequent Word Sequence code is completed and combines features for different thresholds n.
# Remove titles and redundant information from the allocated corpus of 150 English texts.
# Generate extracted features from the text corpora.
# A first draft PowerPoint is completed for the final seminar.
# Researched the overlapping problem and found it cannot be done using Java, since the corpora are plain text and do not support color highlighting.

''' Plan and Goals for Next Week '''

# Finish coding the Naïve Bayes classifier to take multiple input files.
# Assemble the PowerPoint and start practicing.

====Zhaokun Wang====
'''Progress and Status This Week'''

# Finish coding N-gram
# Removing unnecessary marks from the testing texts
# Run the N-gram code on all texts
# Group meeting about the final seminar
# Finalize the dissimilarity classifier

''' Plan and Goals for Next Week '''

# Prepare for the final seminar
# Finish running N-gram on the texts
# Compare with training data and analyze the output for the tested texts

===Semester 1, Week 5===

====Yan Xie====
'''Progress and Status this week:'''
# Set a threshold on the output for the tested texts
# Analyze the input format of SVM
# Work on preparing the final seminar

'''Plan and Goals for new week:'''
# Send the draft PPT to Brian
# Modify the PPT slides
# Prepare the presentation with the group members

====Kai He====
'''Progress and Status This Week'''

# Naïve Bayes classifier code 80% modified; there are still bugs in the code.
# Group meeting to prepare for the final seminar.
# PowerPoint slides are merged into one file; roles and tasks are allocated to each member.

''' Plan and Goals for Next Week '''

# Finish debugging.
# Send the completed PowerPoint to our supervisors for feedback.
# Prepare the final seminar

====Zhaokun Wang====
'''Progress and Status This Week'''

# Allocation for the final seminar
# Finish the dissimilarity classifier
# Fixing the input format of the dissimilarity classifier
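
The log does not give the formula behind the dissimilarity classifier. One measure commonly paired with Common N-gram profiles is the squared relative difference of normalized frequencies, summed over every n-gram in either profile; the sketch below uses it purely as an assumed example. The class name, method names and the use of <code>Map<String, Double></code> profiles are hypothetical.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical profile dissimilarity and nearest-profile classification. */
final class DissimilaritySketch {

    /** Sum over all n-grams in either profile of (2 * (f1 - f2) / (f1 + f2))^2. */
    static double dissimilarity(Map<String, Double> p1, Map<String, Double> p2) {
        Set<String> all = new HashSet<>(p1.keySet());
        all.addAll(p2.keySet());
        double sum = 0.0;
        for (String gram : all) {
            double f1 = p1.getOrDefault(gram, 0.0);
            double f2 = p2.getOrDefault(gram, 0.0);
            if (f1 + f2 == 0.0) {
                continue; // nothing to compare for this n-gram
            }
            double relative = 2.0 * (f1 - f2) / (f1 + f2);
            sum += relative * relative;
        }
        return sum;
    }

    /** Returns the author whose profile is least dissimilar to the unknown text's profile. */
    static String closestAuthor(Map<String, Map<String, Double>> authorProfiles,
                                Map<String, Double> unknown) {
        String best = null;
        double bestDistance = Double.POSITIVE_INFINITY;
        for (Map.Entry<String, Map<String, Double>> e : authorProfiles.entrySet()) {
            double d = dissimilarity(e.getValue(), unknown);
            if (d < bestDistance) {
                bestDistance = d;
                best = e.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Map<String, Map<String, Double>> authors = Map.of(
                "A", Map.of("ab", 0.6, "ba", 0.4),
                "B", Map.of("cd", 1.0));
        System.out.println(closestAuthor(authors, Map.of("ab", 0.5, "ba", 0.5)));
    }
}
```

An n-gram present in only one profile contributes the maximum value of 4 to the sum, so profiles with little overlap end up far apart.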

''' Plan and Goals for Next Week '''

# Modify the PPT slides for the final seminar
# Prepare for the final seminar

===Semester 1, Week 6===

====Yan Xie====
'''Progress and Status this week:'''
# Classify all authors’ output files after setting the threshold, for N from 2 to 10
# Update the Common N-gram Java code:
#* e.g. in the 155 English texts, when N = 2, combine six authors’ features and create a master list.
#* From N = 2 to N = 10, this gives 9 master lists. For each author, find the frequency of occurrence of each master-list feature, and list only the frequencies as one part of the SVM input format.
# Also classify the output files of the Federalist Papers and Greek New Testament
# Finish the SVM input format and write the MATLAB code for SVM
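
The master-list step described above can be sketched as follows for a single value of N. This is an illustration under assumptions, not the group's code: the names <code>MasterListSketch</code>, <code>buildMasterList</code> and <code>toFrequencyVector</code> are hypothetical, the master list is sorted alphabetically only to give every author the same column order, and a feature missing from an author's profile is assumed to get frequency 0.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

/** Hypothetical conversion of per-author n-gram profiles into fixed-length frequency rows. */
final class MasterListSketch {

    /** Union of all authors' features for one N, in a fixed (sorted) order. */
    static List<String> buildMasterList(List<Map<String, Integer>> authorProfiles) {
        TreeSet<String> union = new TreeSet<>();
        for (Map<String, Integer> profile : authorProfiles) {
            union.addAll(profile.keySet());
        }
        return new ArrayList<>(union);
    }

    /** One input row: the profile's frequency for each master-list feature, 0 if absent. */
    static int[] toFrequencyVector(List<String> masterList, Map<String, Integer> profile) {
        int[] vector = new int[masterList.size()];
        for (int i = 0; i < vector.length; i++) {
            vector[i] = profile.getOrDefault(masterList.get(i), 0);
        }
        return vector;
    }

    public static void main(String[] args) {
        Map<String, Integer> authorA = Map.of("th", 5, "he", 3);
        Map<String, Integer> authorB = Map.of("he", 2, "an", 4);
        List<String> master = buildMasterList(List.of(authorA, authorB));
        System.out.println(master);
        System.out.println(java.util.Arrays.toString(toFrequencyVector(master, authorA)));
        System.out.println(java.util.Arrays.toString(toFrequencyVector(master, authorB)));
    }
}
```

Repeating this for N = 2 to N = 10 would yield the nine master lists mentioned in the log, each producing its own block of frequencies per author.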

'''Plan and Goals for new week:'''
# Prepare the final report, which is due in Week 11
# SVM code modification
# Do some testing

====Kai He====
'''Progress and Status This Week'''

# Naïve Bayes classifier debugged. Now consider how to present the output results.
# Have a meeting with Brian to talk about our PowerPoint slides.
# Finalize our PowerPoint.
# More practice for the final seminar.
# Gave our final seminar on Friday.

''' Plan and Goals for Next Week '''

# Consider the structure of the final report.
# Further tests on the methods.

====Zhaokun Wang====
'''Progress and Status This Week'''

# Classify the output files of the Federalist Papers and Greek New Testament
# Fixing problems with the input format of the dissimilarity classifier
# Classify all authors’ output files, setting N from 2 to 10

''' Plan and Goals for Next Week '''

# Modify the dissimilarity classifier
# Do testing

===Semester 1, Week 7===

====Yan Xie====
'''Progress and Status this week:'''
# Amend the SVM MATLAB code
# Test the 155 English texts, 82 Federalist Papers and 27 Greek New Testament texts, and produce the output for the disputed texts
# Obtain the performance results and arrive at a conclusion (possible authors)
# Meet with the other group members and discuss the results
# Build the structure of the final report

'''Plan and Goals for new week:'''
# Analyze the Common N-gram results and, with group members, compare the classification accuracy with that of the other algorithm, Maximal Frequent Word Sequence
# Give some suggestions on potential modifications
# Start working on some parts of the final report

====Kai He====
'''Progress and Status This Week'''

# Have a brief idea of how the final report will be structured.
# Capture test results for the final report.
# Meeting with the group.

''' Plan and Goals for Next Week '''

# Modify the output file for use with SVM.
# Evaluate results.
# Plan to upload things to this wiki

====Zhaokun Wang====
'''Progress and Status This Week'''

# Group meeting with group members
# Doing tests using the dissimilarity method
# Test the 132 English texts, Federalist Papers and Greek New Testament, and produce the output for the disputed texts
# Layout for the final report

''' Plan and Goals for Next Week '''

# Writing the final report
# Analyze the accuracy of the two methods

===Semester 1, Week 8===

====Yan Xie====
'''Progress and Status this week:'''
# Summarize the results from the Common N-gram and Maximal Frequent Word Sequence algorithms
# Test the other text files (English New Testament) using the Common N-gram algorithm and SVM classification
# Write the Common N-gram part of the final report
# Have a meeting with the other group members to discuss the upcoming goals

'''Plan and Goals for new week:'''
# Analyze the English New Testament output obtained from SVM classification and compare it with the results from the Maximal Frequent Word Sequence algorithm with Naïve Bayes classification
# Write the final report

====Kai He====
'''Progress and Status This Week'''

# Group meeting.
# Obtained test results from the Federalist Papers and New Testament texts.
# Finish coding in order to use SVM.
# Help debug code from other group members.

''' Plan and Goals for Next Week '''

# More tests and writing
# Upload things to the wiki

====Zhaokun Wang====
Line 580: Line 879:
 
#
 
#
  
===Semester 1, Week 3===
+
===Semester 1, Week 9===
  
 
====Yan Xie====
 
====Yan Xie====
 
'''Progress and Status this week:'''
 
'''Progress and Status this week:'''
#  
+
# Find the small part of the generated outputs of text files using Common N-gram need to modify and write few lines of code to achieve, e.g. Duplicate feature adding
#  
+
# All the text file including 155 English texts, 82 Federalist Paper and 27 Greek New Testament, are needed to generate again, and process the output into SVM as input to perform the possibility
#
+
# Also, try testing the English version text of New Testament, which contains 27 texts, as well
 +
# Analysis the gained results and compared with the algorithm of Maximal Frequent Word Sequence, and documentation
 +
# Work on writing the final report due to two weeks left
  
 
'''Plan and Goals for new week:'''
 
'''Plan and Goals for new week:'''
#  
+
# Commence the section of SVM of the final report
#
+
# Email supervisors about the final report due to some queries
#
+
# Discuss the YouTube video and poster due over the next three weeks
  
 
====Kai He====
 
====Kai He====
 +
'''Progress and Status This Week'''
 +
 +
# Compare results with Common N-gram.
 +
# Upload and help format stage reports on the wiki page.
 +
# Upload my weekly reports onto the wiki.
 +
# Write final report.
 +
# Methods code modification.
 +
 +
''' Plan and Goals for Next Week '''
 +
 +
#      Have a draft final report.
 +
 +
====Zhaokun Wang====
 
'''Progress and Status This Week'''
 
'''Progress and Status This Week'''
  
Line 605: Line 919:
 
#
 
#
 
#
 
#
 +
 +
===Semester 1, Week 10===
 +
 +
====Yan Xie====
 +
'''Progress and Status this week:'''
 +
# Write up the SVM in the final report
 +
# Meet up with the group for the final report
 +
# Consider the video and poster
 +
 +
'''Plan and Goals for new week:'''
 +
# Work on the final report
 +
# Email supervisors to arrange a time to run tests, report what we have done and predict the potential authors of the Letter to the Hebrews
 +
# Prepare the poster
 +
 +
====Kai He====
 +
'''Progress and Status This Week'''
 +
 +
# Group meeting for the poster and video.
 +
# Write final report
 +
 +
''' Plan and Goals for Next Week '''
 +
 +
# Plan to have a meeting with supervisors to report our progress.
 +
# Finish the final report.
 +
# Upload the rest of my weekly reports to the wiki
  
 
====Zhaokun Wang====
 
====Zhaokun Wang====
Line 619: Line 958:
 
#
 
#
  
===Semester 1, Week 4===
+
===Semester 1, Week 11===
  
 
====Yan Xie====
 
====Yan Xie====
 
'''Progress and Status this week:'''
 
'''Progress and Status this week:'''
#  
+
# Output data analysis and documentation
#  
+
# Write the Common N-gram and SVM sections
#
+
# Complete the final report
 +
# Prepare the poster
 +
# Meet with supervisors and present the potential author of the Letter to the Hebrews
  
 
'''Plan and Goals for new week:'''
 
'''Plan and Goals for new week:'''
#  
+
# Send the poster to Braden
#
+
# Prepare the project exhibition
#
+
# Start recording video with the other group members
  
 
====Kai He====
 
====Kai He====
 +
'''Progress and Status This Week'''
 +
 +
#Write the project final report
 +
#Have a meeting with supervisors to present the project's outcomes
 +
#Prepare poster and video for the exhibition
 +
 +
 +
''' Plan and Goals for Next Week '''
 +
 +
#Finalise the poster and video
 +
#Prepare the exhibition
 +
 +
====Zhaokun Wang====
 
'''Progress and Status This Week'''
 
'''Progress and Status This Week'''
  
Line 644: Line 998:
 
#
 
#
 
#
 
#
 +
 +
===Semester 1, Week 12===
 +
 +
====Yan Xie====
 +
'''Progress and Status this week:'''
 +
# Finish poster and send it to Braden
 +
# Discuss the structure of the video within the team
 +
# Finish video
 +
# Present results at the project exhibition
 +
 +
'''Plan and Goals for new week:'''
 +
# Upload documents to the wiki page
 +
# Project closeout
 +
 +
====Kai He====
 +
'''Progress and Status This Week'''
 +
 +
#Send poster to Braden
 +
#Make video
 +
#Demonstrate the project's outcomes at exhibition
 +
#Project closeout
 +
 +
''' Plan and Goals for Next Week '''
 +
 +
#Upload document to project wiki page
  
 
====Zhaokun Wang====
 
====Zhaokun Wang====
Line 657: Line 1,036:
 
#
 
#
 
#
 
#
 
  
  
Line 663: Line 1,041:
 
==See also==
 
==See also==
 
*[[Authorship detection: Who wrote the Letter to the Hebrews?]]
 
*[[Authorship detection: Who wrote the Letter to the Hebrews?]]
*[[Minutes of Meeting 2011: Who wrote the Letter to the Hebrews?]]
+
*[[Proposal Seminar 2011: Who wrote the Letter to the Hebrews?]]
*[[Critical design review 2011: Who wrote the Letter to the Hebrews?]]
+
*[[Final Seminar 2011: Who wrote the Letter to the Hebrews?]]
*[[Progress Report 2011: Who wrote the Letter to the Hebrews?]]
+
*[[Stage One Progress Report 2011: Who wrote the Letter to the Hebrews?]]
*[[Final report 2011: Who wrote the Letter to the Hebrews?]]
+
*[[Stage Two Progress Report 2011: Who wrote the Letter to the Hebrews?]]
 +
*[[Final Report 2011: Who wrote the Letter to the Hebrews?]]
 +
*[[Exhibition Poster 2011: Who wrote the Letter to the Hebrews?]]
 
*[[Youtube Video Presentation 2011: Who wrote the Letter to the Hebrews?]]
 
*[[Youtube Video Presentation 2011: Who wrote the Letter to the Hebrews?]]
  

Latest revision as of 03:58, 7 June 2012

Supervisors

Collaborators

2011 Students

Weekly progress and questions

Semester 2, Week 1

Yan Xie

Progress and Status this week:

  1. All team members had their first meeting with supervisor Derek and co-supervisors Brian and Maryam
  2. The basic idea and various applications were introduced by Derek
  3. Discussed previous attempts and further exploration at the meeting
  4. Research the topics of authorship detection and data mining
  5. Review the research of past years’ students

Plan and Goals for new week:

  1. Further study of past research
  2. Search for suitable algorithms
  3. Have a group meeting with the other members Kai and Zhaokun

Kai He

Progress and Status This Week

  1. Met with supervisor Derek and co-supervisors Brian and Maryam.
  2. The supervisors introduced the concept of this project and discussed the outcomes of last year's project students
  3. Research on authorship detection
  4. Study the previous algorithms

Plan and Goals for Next Week

  1. Literature search training will be held next week
  2. Have a meeting with team members
  3. Research on various methods
  4. Read papers on authorship detection

Zhaokun Wang

Progress and Status This Week

  1. First meeting with Derek and Brian and the other group members, Kai and Yan.
  2. Derek and Brian introduced the outline and background of this project.
  3. Based on previous years' research, Derek gave some suggestions for the research to follow.
  4. Derek passed the previous research resources to us.


Plan and Goals for Next Week

  1. Read through and understand the previous research reports.
  2. Research on the controversy.
  3. Research on various methods.
  4. Prepare the proposal seminar.

Semester 2, Week 2

Yan Xie

Progress and Status this week:

  1. Review the three methods from the past year: word frequency, word recurrence interval and trigram Markov model
  2. Ongoing research
  3. Attend a literature search training session with the other members
  4. Discuss the algorithms chosen for this project at the meeting
  5. Prepare for the proposal seminar in Week 3

Plan and Goals for new week:

  1. Modify the slides and send them to the supervisors
  2. Prepare the presentation
  3. Analyse the chosen algorithms
  4. Discuss project management with the other members at the next meeting

Kai He

Progress and Status This Week

  1. Attend the literature search training
  2. Identify the algorithms to be used in this project
  3. Prepare the proposed algorithms and complete the slides for the presentation
  4. Further reading on research papers

Plan and Goals for Next Week

  1. Set up the Work Breakdown Structure, Milestones, Gantt Chart and Project Budget
  2. Send the presentation slides to supervisors
  3. Prepare the presentation for next week's proposal seminar
  4. Analyse the proposed algorithms used in this project
  5. Research and discuss the classifier
  6. Have a team meeting with the other members

Zhaokun Wang

Progress and Status This Week

  1. Write the abstract for the proposal seminar.
  2. Allocate a seminar role to each group member.
  3. Prepare outline PowerPoint slides.
  4. Outline the basic idea of the project.


Plan and Goals for Next Week

  1. Present the proposal seminar.
  2. Identify the methods for the project.
  3. Identify the classifiers for the project.

Semester 2, Week 3

Yan Xie

Progress and Status this week:

  1. Complete the Gantt Chart, Work Breakdown Structure, Milestones, Budget and risk analysis with the other team members
  2. Modify the presentation slides
  3. Prepare the presentation
  4. Introduce the Common N-grams method

Plan and Goals for new week:

  1. Research the SVM classifier to be used with the Common N-grams algorithm
  2. Start designing the Common N-grams implementation
  3. Make a template for the stage one progress report

Kai He

Progress and Status This Week

  1. Modify the slides after getting feedback from Brian
  2. Prepare the presentation this week
  3. Identify the classifiers to be used with the algorithms
  4. Plan the upcoming goals for the proposed algorithms
  5. Start to design the method: Maximal Frequent Word Sequence

Plan and Goals for Next Week

  1. Review the Maximal Frequent Word Sequence method in detail
  2. Understand the Naïve Bayes classifier
  3. Prepare the stage one progress report

Zhaokun Wang

Progress and Status This Week

  1. Discuss the proposal slides with Brian.
  2. Modify the slides.
  3. Present proposal seminar.


Plan and Goals for Next Week

  1. Further research on the methods.
  2. Prepare for stage one report

Semester 2, Week 4

Yan Xie

Progress and Status this week:

  1. Work on implementing the Common N-grams method in Java
  2. Read the papers on the algorithm and classifier in full
  3. Discuss the design of Common N-grams with the other members
  4. Delegate tasks of the stage one progress report to individual members

Plan and Goals for new week:

  1. Complete the Executive Summary, Previous Studies, Coding Requirements and Tasks on Stage Two Report sections of the stage one progress report
  2. Modify Work Breakdown Structure, Risk Assessment, Milestones, Monitoring Scheme and Proposed Budget
  3. Complete writing on Common N-grams and SVM
  4. Write up the draft of the stage one progress report and send it to supervisors for feedback
  5. Keep revising the stage one progress report until the deadline

Kai He

Progress and Status This Week

  1. Research on the Maximal Frequent Word Sequence method is complete
  2. Coding on Maximal Frequent Word Sequence
  3. Have a meeting with the other members to delegate the tasks of the stage one progress report
  4. Write Project Background and Significance, Technical Background, Motivations and Key Requirements of the stage one progress report
  5. Revise the stage one report against the criteria
  6. Grammar checking


Plan and Goals for Next Week

  1. Coding on Maximal Frequent Word Sequence
  2. Complete my tasks on stage one report
  3. Send the draft to supervisors
  4. Modify and format

Zhaokun Wang

Progress and Status This Week

  1. Test previous methods.
  2. Compare with previous research, then clarify and identify the methods and classifiers we will use.
  3. Work on the stage one report.

Plan and Goals for Next Week

  1. Finish stage one report.
  2. Allocate report roles to each group member.

Semester 2, Week 5

Yan Xie

Progress and Status this week:

  1. Completed my allocated parts of the stage one report
  2. Attend a weekly group meeting and discuss uncompleted sections
  3. Help with formatting
  4. Send the report draft to supervisors
  5. Modify the report after getting feedback from supervisors

Plan and Goals for new week:

  1. Develop the method of Common N-grams
  2. Read papers
  3. Learn to use SVM

Kai He

Progress and Status This Week

  1. Finish Project Background and Significance, Technical Background, Motivations and Key Requirements
  2. Write Input and Output Specifications, and Testing and Verification
  3. Help to write the part of Project Management
  4. Grammar checking and formatting
  5. Revise the stage one progress report after getting feedback from supervisors
  6. Complete the final version of the stage one progress report and submit it
  7. Coding on Maximal Frequent Word Sequence


Plan and Goals for Next Week

  1. Coding on Maximal Frequent Word Sequence
  2. Have a meeting with the other members discussing the upcoming goals
  3. Review papers

Zhaokun Wang

Progress and Status This Week

  1. Allocate stage one report roles.
  2. Research method allocated to me: Common N-gram.
  3. Classifier method allocated to me: dissimilarity calculation.
  4. Modify stage one report after feedback.


Plan and Goals for Next Week

  1. Coding and developing the N-gram method
  2. Researching the dissimilarity measure
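
The dissimilarity calculation referred to in this entry is not spelled out anywhere in the log. As a rough illustration only, the sketch below assumes the squared relative-difference measure that is commonly paired with Common N-gram profiles in the literature. It is written in Python for brevity (the group's own code was in Java), and the function names and the dictionary-based profile layout are hypothetical.

```python
def cng_dissimilarity(profile_a, profile_b):
    """Sum of squared relative differences over the union of both profiles.

    Each profile maps an n-gram to its normalised frequency (assumed layout).
    """
    total = 0.0
    for gram in set(profile_a) | set(profile_b):
        fa = profile_a.get(gram, 0.0)
        fb = profile_b.get(gram, 0.0)
        if fa + fb == 0.0:
            continue  # n-gram effectively absent from both profiles
        total += (2.0 * (fa - fb) / (fa + fb)) ** 2
    return total


def closest_author(unknown, author_profiles):
    """Return the author whose profile is least dissimilar to the unknown text."""
    return min(
        author_profiles,
        key=lambda name: cng_dissimilarity(unknown, author_profiles[name]),
    )
```

An n-gram present in only one profile contributes the maximum term of 4, so profiles with little overlap score as highly dissimilar.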

Semester 2, Week 6

Yan Xie

Progress and Status this week:

  1. Read the papers on the Common N-grams algorithm
  2. Have an overall structure for the Common N-grams program
  3. Review the paper on SVM
  4. SVM classifier: still considering how to use the generated output text file as the SVM input
  5. Participate in the group meeting

Plan and Goals for new week:

  1. Discuss the code with the team
  2. Coding on Common N-grams
  3. Design SVM

Kai He

Progress and Status This Week

  1. Research on Maximal Frequent Word Sequence
  2. Develop the program for Maximal Frequent Word Sequence
  3. Debugging
  4. Help the other members with coding

Plan and Goals for Next Week

  1. Complete about 30% - 40% of the code for data extraction using Maximal Frequent Word Sequence
  2. Discuss classifiers

Zhaokun Wang

Progress and Status This Week

  1. Learning and coding on N-gram
  2. Debugging

Plan and Goals for Next Week

  1. Discussing coding within the team
  2. Design classifier method

Semester 2, Week 7

Yan Xie

Progress and Status this week:

  1. Discuss the Common N-grams problems with the other members
  2. Finish about 50% of the code for data extraction using Common N-grams
  3. Have a group meeting with the other two members to report my current progress on the Common N-grams extraction method
  4. Introduce the stage two report

Plan and Goals for new week:

  1. Continue coding of Common N-grams
  2. Participate in the meeting about the stage two report with the other members
  3. Try to figure out how to use the SVM function in MATLAB

Kai He

Progress and Status This Week

  1. Review the papers:
    • Algorithm for Maximal Frequent Sequences in Document Clustering
    • Experimenting with Maximal Frequent Sequences for Multi-Document Summarization
    • Discovery of Frequent Word Sequences in Text
  2. Done 30% of the code for data extraction using Maximal Frequent Word Sequence
  3. Review the paper Augmenting Naïve Bayes Classifiers with Statistical Language Models
  4. Review the criteria of the stage two report

Plan and Goals for Next Week

  1. Coding and Debugging
  2. Discuss how to pass the output data from Maximal Frequent Word Sequence to the Naïve Bayes classifier
  3. Prepare the stage two report

Zhaokun Wang

Progress and Status This Week

  1. Group meeting, discussing with other team members.
  2. Coding on N-gram
  3. Structuring the stage two report


Plan and Goals for Next Week

  1. Keep coding N-gram
  2. Group meeting about stage two report
  3. Begin coding the dissimilarity classifier

Semester 2, Week 8

Yan Xie

Progress and Status this week:

  1. Coding of Common N-grams
  2. Discuss the project management
  3. Investigation on SVM in MATLAB
  4. Participate in a meeting to discuss how to apply the generated data to the classifier

Plan and Goals for new week:

  1. Complete software coding v1.0 at the end of Week 11
  2. Start to write the stage two report
  3. Review SVM from previous attempt

Kai He

Progress and Status This Week

  1. Coding and Debugging on Maximal Frequent Word Sequence
  2. Further research on Naïve Bayes
  3. Discuss the Naïve Bayes Classifier with the other members

Plan and Goals for Next Week

  1. Write the project management of the stage two report
  2. Continue coding and debugging
  3. Weekly meeting with the other team members

Zhaokun Wang

Progress and Status This Week

  1. Coding N-gram
  2. Group meeting about stage two report
  3. Try to begin coding dissimilarity classifier
  4. Researching on dissimilarity classifier


Plan and Goals for Next Week

  1. Write stage two report
  2. Group meeting

Semester 2, Week 9

Yan Xie

Progress and Status this week:

  1. Add some new classes to the Common N-grams code
  2. Code modification
  3. Weekly meeting with the other team members to report the progress of Common N-grams coding
  4. Write the Project Objectives, Background, Algorithm Programming and Project Management sections of the stage two report
  5. Get feedback on the stage one progress report from Brian

Plan and Goals for new week:

  1. Complete software coding v1.0 at the end of Week 11
  2. Continue code modification
  3. Testing

Kai He

Progress and Status This Week

  1. Completed 60% of the code for data extraction using Maximal Frequent Word Sequence
  2. Help debug the Common N-grams code
  3. Report the code progress so far in the team meeting
  4. Set up the upcoming goals: Software Coding V1.0, Stage 2 Report Due, Software Testing V1.0 and Software Coding V2.0
  5. Start to design the training process and classification process using Naïve Bayes Classifier

Plan and Goals for Next Week

  1. Write the Introduction, Objectives, Background, Algorithm Definition, Work Breakdown Structure, Milestones and Budgets sections of the stage two report
  2. Choose some simple text files to test
  3. Further research on the classifier of Naïve Bayes

Zhaokun Wang

Progress and Status This Week

  1. Coding and debugging N-gram
  2. Writing stage two report
  3. Adjusting the schedule slightly
  4. Group meeting with group members to report progress to date


Plan and Goals for Next Week

  1. Writing the stage two report
  2. Developing on dissimilarity classifier
  3. Testing

Semester 2, Week 10

Yan Xie

Progress and Status this week:

  1. Completed most of the Common N-grams code
  2. Delete the unused inner classes
  3. Discuss SVM with the other team members

Plan and Goals for new week:

  1. Complete software coding v1.0 at the end of Week 11
  2. Figure out SVM
  3. Try to test the code using some simple text files
  4. Write the stage two report

Kai He

Progress and Status This Week

  1. Write the stage two report
  2. Complete the coding of Maximal Frequent Word Sequence
  3. Working on modified Maximal Frequent Word Sequence
  4. Test efficiency using different input texts


Plan and Goals for Next Week

  1. Modify code of Maximal Frequent Word Sequence
  2. Design the Naïve Bayes classifier
  3. Report the progress in the team meeting
  4. Continue writing the stage two report

Zhaokun Wang

Progress and Status This Week

  1. Testing N-gram code and debugging
  2. Writing stage two report
  3. Coding dissimilarity classifier
  4. Group meeting


Plan and Goals for Next Week

  1. Finish coding on N-gram
  2. Coding on dissimilarity classifier
  3. Writing report
  4. Group meeting to report N-gram coding

Semester 2, Week 11

Yan Xie

Progress and Status this week:

  1. Complete software coding v1.0 of the Common N-grams
  2. Use my own text to verify that the code works properly
  3. Compare results from a small test file and a large test file
  4. Begin building large sets of training and testing data for SVM by randomly collecting extracted features from the author profiles
  5. Completed the draft of the stage two report

Plan and Goals for new week:

  1. Modify the stage two report
  2. Submit the stage two report
  3. Use the same training data and unknown data to test the two extraction algorithms

Kai He

Progress and Status This Week

  1. Complete the draft of the stage two report
  2. Grammar checking and formatting
  3. The output of the Maximal Frequent Word Sequence code is incorrect; modification is needed


Plan and Goals for Next Week

  1. Deliver the stage two report
  2. Complete the code of Maximal Frequent Word Sequence
  3. Test the output
  4. Have a meeting to discuss the upcoming goals

Zhaokun Wang

Progress and Status This Week

  1. Modify and finish N-gram
  2. Testing N-gram code using training texts
  3. Coding dissimilarity classifier
  4. Working on stage two report


Plan and Goals for Next Week

  1. Modify stage two report
  2. Using training data to test N-gram coding

Semester 2, Week 12

Yan Xie

Progress and Status this week:

  1. Submit the stage two report and send it to supervisors
  2. Report my individual work done so far
  3. Report that the Common N-grams code is completed and tested
  4. Report the progress of SVM
  5. Discuss the upcoming goals with the other members

Plan and Goals for new week:

  1. Prepare for exams

Kai He

Progress and Status This Week

  1. Send my stage two report to supervisors
  2. Weekly meeting with the other team members to report the progress of the project

Plan and Goals for Next Week

  1. Stop project
  2. Work on exams

Zhaokun Wang

Progress and Status This Week

  1. Submit stage two report
  2. Group meeting to report progress
  3. Coding on dissimilarity classifier

Plan and Goals for Next Week

  1. None (prepare for final exams)

Semester 1, Week 1

Yan Xie

Progress and Status this week:

  1. Review two algorithms and three classifiers
  2. Group members present their individual reports so far at the weekly group meeting
  3. Work on coding SVM program
  4. Check the Milestones for the upcoming goals

Plan and Goals for new week:

  1. Email supervisors to arrange a meeting to report the progress of the report
  2. Discuss the performance of the current progress
  3. Modify the SVM program
  4. Prepare the project description and images for project exhibition

Kai He

Progress and Status This Week

  1. Meet with the team members discussing the classifiers
  2. Simplify the code of Maximal Frequent Word Sequence
  3. Work on the Naïve Bayes classifier
  4. Do some testing

Plan and Goals for Next Week

  1. Arrange a meeting time with supervisors
  2. Discuss the key methods used in Naïve Bayes with the team

Zhaokun Wang

Progress and Status This Week

  1. Group meeting to report progress of project during the summer break
  2. Keep coding on dissimilarity classifier
  3. Do testing on training data


Plan and Goals for Next Week

  1. Plan a meeting with the supervisor
  2. Modify and code the dissimilarity classifier

Semester 1, Week 2

Yan Xie

Progress and Status this week:

  1. Confirm a meeting time with supervisors
  2. Complete a project description and image, and email them to Braden
  3. Discuss SVM with the team members
  4. Continue working on SVM

Plan and Goals for new week:

  1. Meet up with supervisors
  2. Code modification
  3. Plan the upcoming goals within the team
  4. Test programs using English text
  5. Start to prepare the exhibition and final seminar

Kai He

Progress and Status This Week

  1. Completed half of the Naïve Bayes program
  2. Change the classes in the program
  3. Code modification
  4. Check the project description and image
  5. Have a brief meeting with the team members

Plan and Goals for Next Week

  1. Have a meeting with supervisors
  2. Develop software
  3. Prepare the exhibition and final seminar

Zhaokun Wang

Progress and Status This Week

  1. Group meeting
  2. Modify and code the dissimilarity classifier
  3. Working on project description and image


Plan and Goals for Next Week

  1. Meeting with supervisor
  2. Keep coding
  3. Prepare for the final seminar

Semester 1, Week 3

Yan Xie

Progress and Status this week:

  1. Get feedback from meeting with supervisors
  2. Consider punctuation removal, lowercase conversion, merging of consecutive spaces and word overlapping
  3. Develop the Java code of the Common N-gram
  4. Analyse the poor results from texts containing chapter numbers and titles
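
The text clean-up steps listed in this entry (punctuation removal, lowercase conversion and merging of spaces) can be sketched roughly as follows. This is an illustration in Python rather than the group's Java code, the function name is hypothetical, and it assumes that only ASCII punctuation needs stripping, which would not cover Greek punctuation marks.

```python
import re
import string


def normalise_text(raw):
    """Lowercase the text, strip ASCII punctuation and collapse whitespace."""
    lowered = raw.lower()
    # Assumption: only ASCII punctuation is removed; Greek marks such as
    # the ano teleia would need to be added to this table separately.
    no_punct = lowered.translate(str.maketrans("", "", string.punctuation))
    # Collapse any run of spaces, tabs or newlines into a single space.
    return re.sub(r"\s+", " ", no_punct).strip()
```

Normalising every text the same way before feature extraction keeps differences in layout or capitalisation from showing up as spurious n-gram features.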

Plan and Goals for new week:

  1. Complete the Java code of the Common N-gram
  2. Test the 155 English texts, 82 Federalist Papers and 27 Greek New Testament texts

Kai He

Progress and Status This Week

  1. Have a meeting with supervisors to discuss our project’s progress.
  2. Consider how to realize overlapping detection using colors in Java.
  3. Continue developing the Maximal Frequent Word Sequence Algorithm
  4. Start preparing for the final seminar in Week 6.

Plan and Goals for Next Week

  1. Finish coding the Maximal Frequent Word Sequence Algorithm
  2. Have a draft for the final seminar.

Zhaokun Wang

Progress and Status This Week

  1. Getting feedback from supervisor
  2. Fixing the N-gram code (following suggestions from supervisors)
  3. Group meeting with team members


Plan and Goals for Next Week

  1. Keep working on the dissimilarity classifier
  2. Finish fixing N-gram

Semester 1, Week 4

Yan Xie

Progress and Status this week:

  1. Remove all chapter numbers and titles
  2. Add a ranking method to the program
  3. Finish the code of Common N-gram
  4. Run the completed program on the 155 English texts, 82 Federalist Papers and 27 Greek New Testament texts
  5. Draft the structure of the final seminar PPT
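
The ranking step mentioned in this entry is not described in detail. One plausible reading, sketched here in Python purely for illustration, is to count n-grams, rank them by frequency and keep only the top entries. The use of character-level n-grams, the `top_l` cut-off parameter and the function name are all assumptions and are not taken from the group's Java program.

```python
from collections import Counter


def ngram_profile(text, n, top_l):
    """Return the top_l most frequent character n-grams with normalised frequencies."""
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    total = len(grams)
    if total == 0:
        return []  # text shorter than n characters
    counts = Counter(grams)
    # Rank by descending count; ties are broken alphabetically so the
    # output is deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(gram, count / total) for gram, count in ranked[:top_l]]
```

Truncating the ranked list is one way to drop the long tail of rare n-grams, which tends to matter more as the training texts grow.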

Plan and Goals for new week:

  1. Analyse the output of the tested texts and consider removing the tail and setting a threshold for the large training data set
  2. Discuss the test results with the group members
  3. Prepare the slides of final seminar with the group members

Kai He

Progress and Status This Week

  1. Maximal Frequent Word Sequence code is completed to combine features for different threshold n.
  2. Remove titles and redundant information from the allocated corpus of 150 English texts.
  3. Generate extracted features from the text corpora.
  4. A first draft PowerPoint is completed for the final seminar.
  5. Researched the overlapping problem and found that it cannot be done in Java: the corpora are plain text, which does not support colour highlighting.

Plan and Goals for Next Week

  1. Finish coding the Naïve Bayes classifier to take multiple input files.
  2. Assemble the PowerPoint and start practicing.

Zhaokun Wang

Progress and Status This Week

  1. Finish coding N-gram
  2. Removing unnecessary marks on the testing texts
  3. Run all texts using N-gram code
  4. Group meeting about final seminar
  5. Finalize dissimilarity classifier

Plan and Goals for Next Week

  1. Prepare for final seminar
  2. Finish running the N-gram code on the texts
  3. Compare with the training data and analyse the output for the tested texts

Semester 1, Week 5

Yan Xie

Progress and Status this week:

  1. Set a threshold on the output of the tested texts
  2. Analyse the input format of SVM
  3. Work on preparing final seminar

Plan and Goals for new week:

  1. Send the draft of PPT to Brian
  2. PPT Slides modification
  3. Prepare the presentation with the group members

Kai He

Progress and Status This Week

  1. Naïve Bayes classifier code 80% modified; the code still has bugs.
  2. Group meeting to prepare for the final seminar.
  3. PowerPoint slides are merged into one deck; roles and tasks are allocated to each member.


Plan and Goals for Next Week

  1. Finish debugging.
  2. Send the completed PowerPoint to our supervisors for feedback.
  3. Prepare the final seminar

Zhaokun Wang

Progress and Status This Week

  1. Allocate roles for the final seminar
  2. Finish dissimilarity classifier
  3. Fixing input format on dissimilarity classifier


Plan and Goals for Next Week

  1. Modify PPT slides for final seminar
  2. Preparing final seminar

Semester 1, Week 6

Yan Xie

Progress and Status this week:

  1. Classify all authors’ output files after setting the threshold, for N from 2 to 10
  2. Update the Java code of Common N-gram:
    • E.g. for the 155 English texts with N = 2, combine the six authors’ features to create a master list.
    • N = 2 to N = 10 gives 9 master lists. For each author, look up the frequency of occurrence of each feature in the master list, and list only the frequencies as one part of the SVM input format.
  3. Also classify the output files of the Federalist Papers and Greek New Testament
  4. Finish the input format of SVM and write the MATLAB code for SVM
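
The master-list step described in this entry can be sketched as follows: the features of all authors are merged into one ordered list, and each author is then represented by a vector of frequencies in that order. This is a Python illustration only (the group used Java for extraction and MATLAB for SVM); the function names, the dictionary layout, and the choice of 0.0 for features an author never uses are assumptions.

```python
def build_master_list(author_profiles):
    """Merge every author's features into one sorted master list."""
    features = set()
    for profile in author_profiles.values():
        features.update(profile)
    return sorted(features)


def to_feature_vector(profile, master_list):
    """List only the frequencies, in master-list order, for one author."""
    # Assumption: a feature missing from this author's profile counts as 0.0.
    return [profile.get(feature, 0.0) for feature in master_list]
```

Because every vector follows the same master-list order, the vectors have equal length and their columns line up, which is what a classifier such as SVM needs from its input rows. One master list would be built per value of N.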

Plan and Goals for new week:

  1. Prepare the final report, which is due in Week 11
  2. SVM code modification
  3. Do some testing

Kai He

Progress and Status This Week

  1. Naïve Bayes classifier debugged. Now consider how to present the output results.
  2. Have a meeting with Brian to talk about our PowerPoint slides.
  3. Finalize our PowerPoint.
  4. More practice on the final seminar.
  5. Did our final seminar on Friday.

Plan and Goals for Next Week

  1. Consider the structure of the final report.
  2. Further testing of the methods.

Zhaokun Wang

Progress and Status This Week

  1. Classify the output files of federalist paper and Greek New Testament
  2. Fixing problems about input format on dissimilarity classifier
  3. Classify all authors' output files for N from 2 to 10

Plan and Goals for Next Week

  1. Modify dissimilarity classifier
  2. Do testing

Semester 1, Week 7

Yan Xie

Progress and Status this week:

  1. Amend the SVM MATLAB code
  2. Test the 155 English texts, 82 Federalist Papers and 27 Greek New Testament texts, and produce the output for the disputed text
  3. Obtain the performance results and arrive at a conclusion (possible authors)
  4. Meet with the other group members and discuss the results
  5. Build the structure of the final report

Plan and Goals for new week:

  1. Analyse the Common N-gram results and, with group members, compare the classification accuracy against the Maximal Frequent Word Sequence algorithm
  2. Give some suggestions on potential modification
  3. Start working on some parts of the final report

Kai He

Progress and Status This Week

  1. Have a brief idea of how the final report will be structured.
  2. Capture test results for the final report.
  3. Meeting with the group.

Plan and Goals for Next Week

  1. Modify the output file for using SVM.
  2. Evaluate results.
  3. Plan to upload things to this wiki

Zhaokun Wang

Progress and Status This Week

  1. Group meeting with group members
  2. Doing tests using dissimilarity method
  3. Test the 132 English texts, Federalist Papers and Greek New Testament texts, and produce the output for the disputed texts
  4. Layout for final report


Plan and Goals for Next Week

  1. Writing final report
  2. Analyse the accuracy of the two methods

Semester 1, Week 8

Yan Xie

Progress and Status this week:

  1. Summarise the results from the Common N-gram and Maximal Frequent Word Sequence algorithms
  2. Test the other text files (English New Testament) using Common N-gram algorithm and SVM classification
  3. Write the part of Common N-gram in the final report
  4. Have a meeting with the other group members discussing the upcoming goal

Plan and Goals for new week:

  1. Analyse the English New Testament output from SVM classification and compare it with the results of the Maximal Frequent Word Sequence algorithm with Naïve Bayes classification
  2. Write the final report

Kai He

Progress and Status This Week

  1. Group meeting.
  2. Obtained test results from the Federalist Papers and New Testament texts.
  3. Finish coding in order to use SVM.
  4. Help debug other group members' code.

Plan and Goals for Next Week

  1. More testing and writing
  2. Upload material to the wiki

Zhaokun Wang

Progress and Status This Week

Plan and Goals for Next Week

Semester 1, Week 9

Yan Xie

Progress and Status this week:

  1. Found that a small part of the output generated by Common N-gram needs modifying; write a few lines of code to fix it, e.g. the adding of duplicate features
  2. All the text files (155 English texts, 82 Federalist Papers and 27 Greek New Testament texts) need to be regenerated, and the output fed into SVM as input to compute the probabilities
  3. Also try testing the English version of the New Testament, which contains 27 texts
  4. Analyse the results, compare them with the Maximal Frequent Word Sequence algorithm, and document them
  5. Work on the final report, as only two weeks are left

Plan and Goals for new week:

  1. Begin the SVM section of the final report
  2. Email supervisors with some queries about the final report
  3. Discuss the YouTube video and poster due over the next three weeks

Kai He

Progress and Status This Week

  1. Compare results with Common N-gram.
  2. Upload and help format stage reports on the wiki page.
  3. Upload my weekly reports onto the wiki.
  4. Write final report.
  5. Methods code modification.

Plan and Goals for Next Week

  1. Have a draft of the final report ready.

Zhaokun Wang

Progress and Status This Week

Plan and Goals for Next Week

Semester 1, Week 10

Yan Xie

Progress and Status this week:

  1. Write up the SVM section of the final report
  2. Meet with the group about the final report
  3. Consider the video and poster

Plan and Goals for new week:

  1. Work on the final report
  2. Email the supervisors to arrange a time to run tests, report what we have done, and predict the potential authors of the Letter to the Hebrews
  3. Prepare the poster

Kai He

Progress and Status This Week

  1. Group meeting for the poster and video.
  2. Write final report

Plan and Goals for Next Week

  1. Plan to have a meeting with supervisors to report our progress.
  2. Finish the final report.
  3. Upload the rest of my weekly reports to the wiki

Zhaokun Wang

Progress and Status This Week

Plan and Goals for Next Week

Semester 1, Week 11

Yan Xie

Progress and Status this week:

  1. Output data analysis and documentation
  2. Write the Common N-gram and SVM sections
  3. Complete the final report
  4. Prepare the poster
  5. Meet with the supervisors and present the likely author of the Letter to the Hebrews

Plan and Goals for new week:

  1. Send the poster to Braden
  2. Prepare for the project exhibition
  3. Start recording video with the other group members

Kai He

Progress and Status This Week

  1. Write the project final report
  2. Meet with the supervisors to present the project's outcomes
  3. Prepare poster and video for the exhibition


Plan and Goals for Next Week

  1. Finalise the poster and video
  2. Prepare for the exhibition

Zhaokun Wang

Progress and Status This Week

Plan and Goals for Next Week

Semester 1, Week 12

Yan Xie

Progress and Status this week:

  1. Finish poster and send it to Braden
  2. Discuss the structure of the video within the team
  3. Finish video
  4. Present results at the project exhibition

Plan and Goals for new week:

  1. Upload documents to the wiki page
  2. Project closeout

Kai He

Progress and Status This Week

  1. Send poster to Braden
  2. Make video
  3. Demonstrate the project's outcomes at exhibition
  4. Project closeout

Plan and Goals for Next Week

  1. Upload documents to the project wiki page

Zhaokun Wang

Progress and Status This Week

Plan and Goals for Next Week


See also

Back