Talk for me application 2024
Revision as of 13:27, 5 November 2024
Supervisors
Honours students
General project description
With constant improvements in technology, there has been a clear shift in the ways that non-verbal neurodivergent individuals can communicate with the general population. This includes tools such as phone applications that use text-to-speech (TTS) to act as an artificial voice for the individual. This project takes one such tool and improves it to create a near-seamless artificial voice, which also benefits those who struggle with the process of typing.
These tools are more commonly known as Augmentative and Alternative Communication (AAC) tools/applications, communication devices, or “talkers”, and assist in the construction of sentences/phrases by linking symbols/keywords and through predictive keyboards [1]. With numerous options on the market, they aim to meet the accessibility needs of neurodivergent people, categorised as cognitive, physical, and speech needs. The main features expected of an AAC application are a simple layout, usually with larger buttons to assist those with physical needs who may struggle to type on a keyboard, and predefined options or the ability to categorise and store symbols/keywords for repeated use in communication. Because specific needs vary greatly from person to person, a trial period is often required to determine whether an application/tool is appropriate for the individual. This can be a slow and tedious process, so a solution that meets all needs is desirable.
Talk for Me is an application developed by Across the Cloud Ltd. In its current alpha version, it presents the user with images/terms to select and passes the selection to a Large Language Model (LLM), which creates a sentence to be spoken via TTS. It was created by Dr Matthew Berryman, who suffered a haemorrhagic stroke that left him paralysed and unable to speak for three weeks. During this time, he was frustrated by the limited tools the hospital provided for him to communicate, citing paper charts that did not even include things such as the television in his room. As a result, Talk for Me is aimed predominantly at the needs of stroke patients, with other disabilities that affect speech also in mind.
The improvements we have made include User Interface (UI) changes to better suit the application to the needs of its neurodivergent user base, focusing on simple, easy-to-understand, and easy-to-reach buttons. We have implemented features that use the user’s current location and the time to tailor the experience, by recommending menu items from nearby restaurants and by sorting the provided keywords to recommend those the user commonly selects. Finally, we tested the performance of a few LLMs available to the team to determine the most suitable one for the application and use case, since the focus is on a near-seamless experience of talking. This report goes into greater detail on each component of our improvements, with a Literature Review exploring some of the concepts we considered for the project, as well as how we achieved the features we have included.
Specific tasks
The tasks identified by the group for this project focus on improving three areas: the UI, the functionality of the app, and the performance of the app through LLM optimisation.
UI
- Update the UI to fit with similar apps on the market
Functionality of the App
- Receive location/time information to make informed suggestions
- Suggest menu items based on the user's location
- Suggest previously used sentences given a selection of words
- Generate one or more messages from the LLM
- Sort the keywords presented to the user based on the location/time information acquired above
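As a rough illustration of the keyword-sorting idea only: the app's actual implementation is covered by the NDA and is written in Swift, so the Python sketch below is not the project's code. It assumes that each keyword selection is logged together with a place label and a timestamp, and that keywords are then ranked by how often they were chosen in the same context. The names `time_bucket`, `KeywordRanker`, `record`, and `rank`, the four time-of-day buckets, and the fallback to overall usage are all assumptions made for this example.

```python
from collections import Counter, defaultdict
from datetime import datetime


def time_bucket(moment: datetime) -> str:
    """Map a timestamp to a coarse part of the day (assumed buckets)."""
    hour = moment.hour
    if 5 <= hour < 12:
        return "morning"
    if 12 <= hour < 17:
        return "afternoon"
    if 17 <= hour < 22:
        return "evening"
    return "night"


class KeywordRanker:
    """Counts keyword selections per (place, time bucket) context."""

    def __init__(self) -> None:
        self._by_context = defaultdict(Counter)
        self._overall = Counter()

    def record(self, keyword: str, place: str, moment: datetime) -> None:
        """Log one selection of `keyword` at `place` and `moment`."""
        self._by_context[(place, time_bucket(moment))][keyword] += 1
        self._overall[keyword] += 1

    def rank(self, keywords: list[str], place: str, moment: datetime) -> list[str]:
        """Order keywords by use in this context, then by overall use."""
        context = self._by_context.get((place, time_bucket(moment)), Counter())
        # sorted() is stable, so never-used keywords keep their original order.
        return sorted(keywords, key=lambda k: (-context[k], -self._overall[k]))


ranker = KeywordRanker()
breakfast = datetime(2024, 11, 5, 8, 30)
ranker.record("coffee", "cafe", breakfast)
ranker.record("coffee", "cafe", breakfast)
ranker.record("toast", "cafe", breakfast)
print(ranker.rank(["water", "toast", "coffee"], "cafe", breakfast))
# prints ['coffee', 'toast', 'water']
```

Falling back to overall usage when a context has no history means a new place still gets a sensible ordering rather than an arbitrary one.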
Large Language Models
- Run and compare the time taken to respond over a number of datasets reflecting 1-3 word combinations
- Compare the number of outliers in each model's performance
- Compare each model's "reliability", that is, how well it follows the provided instructions and returns a response that contains the expected words (and a sentence that can be used for TTS).
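The three comparisons above could be computed along the following lines. This is a hedged Python sketch, not the team's benchmarking code: the outlier definition (Tukey's 1.5 × IQR rule), the whole-word matching used for "reliability", and the names `time_call`, `count_outliers`, `contains_expected_words`, and `reliability` are all assumptions. `generate` stands in for whatever function sends a prompt to a model.

```python
import re
import statistics
import time


def time_call(generate, prompt: str):
    """Return (response, seconds elapsed) for one call to a model function."""
    start = time.perf_counter()
    response = generate(prompt)
    return response, time.perf_counter() - start


def count_outliers(times: list[float]) -> int:
    """Count response times outside 1.5 x IQR of the quartiles (assumed rule)."""
    q1, _, q3 = statistics.quantiles(times, n=4)
    spread = q3 - q1
    low, high = q1 - 1.5 * spread, q3 + 1.5 * spread
    return sum(1 for t in times if t < low or t > high)


def contains_expected_words(response: str, expected: list[str]) -> bool:
    """True if every expected word appears as a whole word in the response."""
    words = set(re.findall(r"[a-z']+", response.lower()))
    return all(word.lower() in words for word in expected)


def reliability(results: list[tuple[str, list[str]]]) -> float:
    """Fraction of (response, expected words) pairs that pass the check."""
    if not results:
        return 0.0
    passed = sum(contains_expected_words(r, e) for r, e in results)
    return passed / len(results)


# Made-up timings in seconds, for illustration only.
times = [0.8, 0.9, 0.9, 1.0, 1.0, 1.1, 1.1, 1.2, 6.0]
print(statistics.median(times), count_outliers(times))
# prints 1.0 1
```

Whole-word matching is deliberately strict: a response that uses "coffees" when "coffee" was expected would count as a failure, so a real comparison might need stemming or a looser match.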
Deliverables
Semester 1
- Project Proposal File:TalkForMe-ProjectPlan.pdf
- Seminar [1] password: 3LC!+USl
- Progress Report: File:TalkForMe-ProgressReport.pdf
Semester 2
- Ingenuity Poster File:Talk For Me Ingenuity Poster.pdf
- Ingenuity Video [2]
- Final Report File:TalkForMe-FinalReport.pdf
Weekly progress and questions
Weekly progress and questions were recorded in our minutes, see below:
- Semester 1: File:Talk For Me - Semester 1 Minutes.zip
Approach and methodology
The approach involved a number of methods listed in our Progress and Final Reports. Due to our NDA, the exact methods cannot be provided, but they can be summarised as accessing the existing code and modifying or adding to it to achieve the desired functions/changes.
Relationship to possible career path
The coding and project management skills you pick up in the project will serve you in a wide range of careers. Notably, because the project uses the Swift programming language, it could lead well into a career in app development for iOS.