IS480 Team wiki:2017T2 Zenith Midterm Wiki
- 1 Project Progress Summary
- 2 Project Management
- 3 Quality of product
- 4 Reflections
Project Progress Summary
(insert link here)
Our project schedule is divided into 13 iterations.
- We are currently on our 9th iteration (12 Feb - 25 Feb 2018).
- As of 20 Feb 2018, we have completed 80.56% of development (29 out of 36 features).
- 1 User Acceptance Test was conducted before Midterms. The results are shown here.
- Achieved and exceeded Midterms X-factor.
- New team of clients.
- Cancellation of one User Acceptance Test by clients due to busy schedules.
- List of requirement changes after Acceptance can be viewed here.
Iteration Progress: 9 of 13
Features Completion: 80.56% (29 out of 36 features)
Confidence Level: 100%
A breakdown of tasks is shown in our project scope.
Project Schedule (Plan Vs Actual):
|Score||TM <= 50||50 < TM <= 75||75 < TM <= 125||125 < TM <= 150||TM > 150|
|Action||1. Inform supervisor within 24 hours.
2. Re-estimate tasks for future iterations.
3. Consider dropping tasks.
|1. Re-estimate tasks for future iterations.
2. Deduct number of days behind from buffer days.
3. If there are no more buffer days, decide which functionalities to drop.
|1. Our estimates are fairly accurate, and we are roughly on track.
2. Add/deduct number of days ahead/behind to/from buffer days.
|1. Re-estimate tasks for future iterations.
2. Add number of days ahead to buffer days.
|1. Inform supervisor within 24 hours.
2. Re-estimate tasks for future iterations.|
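The score bands above can be read as a simple lookup. The following is a minimal sketch, assuming the TM score has already been computed (this section does not show how) and reading the last band as any score above 150. The function name `schedule_actions` is hypothetical.

```python
from typing import List


def schedule_actions(tm: float) -> List[str]:
    """Return the planned actions for a given schedule metric (TM) score."""
    if tm <= 50:
        return [
            "Inform supervisor within 24 hours.",
            "Re-estimate tasks for future iterations.",
            "Consider dropping tasks.",
        ]
    if tm <= 75:
        return [
            "Re-estimate tasks for future iterations.",
            "Deduct number of days behind from buffer days.",
            "If there are no more buffer days, decide which functionalities to drop.",
        ]
    if tm <= 125:
        return [
            "Estimates are fairly accurate; roughly on track.",
            "Add/deduct number of days ahead/behind to/from buffer days.",
        ]
    if tm <= 150:
        return [
            "Re-estimate tasks for future iterations.",
            "Add number of days ahead to buffer days.",
        ]
    # Assumed reading of the last column: any score above 150.
    return [
        "Inform supervisor within 24 hours.",
        "Re-estimate tasks for future iterations.",
    ]


for step in schedule_actions(110):
    print(step)
```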
|Severity||Low Impact||High Impact||Critical Impact|
|Description||User interface display errors, such as misalignment or colours that do not follow the theme.
These do not affect the functionality of the system.
|The system is functional, but some non-critical functionalities are not working.||The system is not functional.
Bugs have to be fixed before proceeding.|
|Points||BM <= 5||5 < BM < 10||BM >= 10|
|Action||The bug does not need immediate fixing; it can be fixed during buffer time or during coding sessions.||Coders use the planned debugging time in the iteration to solve the bug.||The team has to stop all current development and resolve the bug immediately.|
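As a rough illustration, the bug metric thresholds above map to a response as follows. This sketch assumes the BM total for an iteration is already known; the points assigned to each severity level are not stated in this section, so they are not modelled. The name `bug_action` is hypothetical.

```python
def bug_action(bm: int) -> str:
    """Map a bug metric (BM) total to the team's response, per the thresholds above."""
    if bm <= 5:
        return "Fix during buffer time or during coding sessions."
    if bm < 10:
        return "Use planned debugging time in the iteration."
    return "Stop all current development and resolve the bug immediately."


print(bug_action(7))
```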
|S/N||Risk Type||Description||Likelihood||Impact Level||Threat Level||Mitigation Plan|
|1||Technical||Ransomware attacks on Database||Low (but it happened)||High||B||System Architect to improve database security|
|2||Organizational||New members in NUS MedSense Team||Medium||Medium||B||Project Manager will be in constant communication with new members, and will regularly review the scope with them.|
|3||Project Management||Members falling sick or going overseas during the school term, reducing the team's available manpower. This can cause potential delays in the project.||Low||Medium||C||Team members should regularly check on the health and well-being of one another, and update the Project Manager of any overseas plans as early as possible.|
Natural Language Processing
Natural Language Processing, or NLP for short, is broadly defined as the automatic manipulation of natural language, like speech and text, by software. The study of natural language processing has been around for more than 50 years and grew out of the field of linguistics with the rise of computers.
Our team decided to employ NLP techniques to perform automated marking for open-ended questions. This benefits users: professors do not need to mark each answer, and students receive immediate feedback on their answers.
Data structure for storing tokens: Trie
A trie is an efficient information retrieval data structure. With a trie, search complexity can be brought down to the optimal limit, the key length. If we store keys in a well-balanced binary search tree (BST), a lookup takes time proportional to M * log N, where M is the maximum string length and N is the number of keys in the tree. With a trie, we can search for a key in O(M) time. The penalty is the trie's storage requirement. Every trie node has multiple branches, and each branch represents a possible character of a key. We need to mark the last node of every key as an end-of-word node, so each node carries a field, isEndOfWord, for this purpose. A node for the English alphabet therefore needs only an array of child references, one per letter, and the isEndOfWord flag.
Inserting a key into a trie is straightforward. Every character of the input key is inserted as an individual trie node. Note that children is an array of pointers (or references) to next-level trie nodes, and the key character acts as an index into this array. If the input key is new or an extension of an existing key, we construct the missing nodes of the key and mark the last node as end of word. If the input key is a prefix of an existing key in the trie, we simply mark the last node of the key as end of word. The key length determines the trie depth.
Searching for a key is similar to insertion, except that we only compare characters and move down. The search can terminate either because the end of the string is reached or because the key is missing from the trie. In the first case, the key exists in the trie if the isEndOfWord field of the last node is true. In the second case, the search terminates without examining all the characters of the key, since the key is not present in the trie.
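The node layout, insertion and search described above can be sketched as follows. This is an illustrative Python version, not the team's actual implementation; it assumes keys contain only lowercase letters a to z, and the class and method names are chosen for the example.

```python
ALPHABET_SIZE = 26  # assumption: keys use lowercase English letters only


class TrieNode:
    def __init__(self):
        # One slot per letter; None means no branch for that character.
        self.children = [None] * ALPHABET_SIZE
        self.is_end_of_word = False


class Trie:
    def __init__(self):
        self.root = TrieNode()

    @staticmethod
    def _index(ch):
        # The key character acts as an index into the children array.
        idx = ord(ch) - ord("a")
        if not 0 <= idx < ALPHABET_SIZE:
            raise ValueError(f"unsupported character: {ch!r}")
        return idx

    def insert(self, key):
        node = self.root
        for ch in key:
            i = self._index(ch)
            if node.children[i] is None:
                node.children[i] = TrieNode()  # construct the missing node
            node = node.children[i]
        node.is_end_of_word = True  # mark the last node of the key

    def search(self, key):
        node = self.root
        for ch in key:
            node = node.children[self._index(ch)]
            if node is None:
                return False  # key is not present; stop early
        return node.is_end_of_word


trie = Trie()
for token in ["heart", "hear", "lung"]:
    trie.insert(token)
print(trie.search("hear"), trie.search("hea"))
```

Both operations visit at most one node per character, which is where the O(M) bound comes from. The fixed 26-slot array in every node is the storage penalty mentioned above.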
Below are some of the techniques used:
Given a character sequence and a defined document unit, tokenization is the task of chopping it up into pieces, called tokens, perhaps at the same time throwing away certain characters, such as punctuation.
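A minimal sketch of tokenization, assuming answers are plain English text and that lowercasing and dropping punctuation are acceptable. The function name `tokenize` and the regular expression are illustrative choices, not taken from the project.

```python
import re


def tokenize(text):
    """Split text into lowercase alphanumeric tokens, discarding punctuation."""
    return re.findall(r"[a-z0-9]+", text.lower())


print(tokenize("Chest pain, radiating to the left arm!"))
```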
Sometimes, extremely common words that appear to be of little value in helping select documents matching a user need are excluded from the vocabulary entirely. In natural language processing, stopwords are words that are so frequent that they can safely be removed from a text without altering its meaning. Hence, for automated marking, we removed all common words from the submitted answer. Doing this significantly reduces the number of tokens our system has to match and store.
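Stopword removal can be sketched as a filter over the token list. The stopword set below is a small hypothetical sample for illustration; the list actually used by the system is not given here.

```python
# Hypothetical sample; a real stopword list would be much longer.
STOPWORDS = {"a", "an", "and", "in", "is", "of", "the", "to", "with"}


def remove_stopwords(tokens):
    """Keep only tokens that are not in the stopword set, preserving order."""
    return [token for token in tokens if token not in STOPWORDS]


print(remove_stopwords(["the", "patient", "is", "in", "pain"]))
```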
For grammatical reasons, answers are going to use different forms of a word, such as organize, organizes, and organizing. Additionally, there are families of derivationally related words with similar meanings, such as democracy, democratic, and democratization. In these situations, we have to treat these words as the same, since they share the same root meaning.
The goal of stemming is to reduce inflectional forms, and sometimes derivationally related forms, of a word to a common base form.
We decided to use the Porter Stemming Algorithm (https://tartarus.org/martin/PorterStemmer/index.html) as it is the most popular algorithm for stemming English and has been shown to be empirically very effective.
Porter's algorithm consists of 5 phases of word reductions, applied sequentially. Within each phase there are various conventions to select rules, such as selecting the rule from each rule group that applies to the longest suffix.
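To illustrate the longest-suffix convention, the sketch below implements only Step 1a of Porter's algorithm (the plural-handling rule group: SSES to SS, IES to I, SS to SS, S to nothing). It is a partial illustration and not a complete stemmer; the remaining steps and their measure conditions are omitted.

```python
# Rule group for Porter Step 1a: (suffix, replacement).
STEP_1A_RULES = [("sses", "ss"), ("ies", "i"), ("ss", "ss"), ("s", "")]


def porter_step_1a(word):
    """Apply the Step 1a rule whose suffix is the longest match for the word."""
    matches = [(suffix, repl) for suffix, repl in STEP_1A_RULES if word.endswith(suffix)]
    if not matches:
        return word
    suffix, repl = max(matches, key=lambda rule: len(rule[0]))
    return word[: -len(suffix)] + repl


for w in ["caresses", "ponies", "caress", "cats", "organizes"]:
    print(w, "->", porter_step_1a(w))
```

Selecting the longest matching suffix is what stops "caress" from losing its final "s": the SS rule wins over the shorter S rule.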
Together with the NUS team, we developed a few basic game rules.
For our application, we introduce a leveling system for students. In its simplest form, leveling up occurs by gaining experience points until a target experience point total is reached. Once the target is met, the student's character "levels up," and a new target experience threshold is set. Students gain experience points (XP) by attempting a medical case. The amount of XP depends on how well the student scores.
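A minimal sketch of this leveling loop is shown below. The threshold formula in `xp_target` is a hypothetical placeholder (100, 300, 600, ... total XP); the actual targets agreed with the NUS team are not stated here.

```python
def xp_target(level):
    """Hypothetical total XP needed to advance beyond the given level."""
    return 50 * level * (level + 1)


class StudentProgress:
    def __init__(self):
        self.level = 1
        self.xp = 0  # cumulative experience points

    def add_xp(self, amount):
        """Add XP from a case attempt and level up while the target is met."""
        if amount < 0:
            raise ValueError("XP gained cannot be negative")
        self.xp += amount
        while self.xp >= xp_target(self.level):
            self.level += 1  # a new, higher target now applies
        return self.level


progress = StudentProgress()
progress.add_xp(120)
progress.add_xp(200)
print(progress.level, progress.xp)
```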
Currently, the main selling point of the medical cases is to practice for their exams. By introducing the leveling system, we hope to further incentivise students to attempt medical cases on our application. The idea is to ensure that students will be playing the cases throughout the year rather than just during peak (exam) periods.
Anti Cheat Mechanism for MCQ Scoring
There can be more than one correct answer for each MCQ, so we use multi-select MCQs in our game. The problem is that students can simply tick all the options to be sure of including every correct answer. Hence, we developed a rule that penalizes the student for every wrong option selected.
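One possible form of such a penalty rule is sketched below. The exact formula is an assumption made for illustration: each correct option selected earns one point, each wrong option selected costs one point, and the result is floored at zero and normalized by the number of correct options.

```python
def score_mcq(selected, correct):
    """Score a multi-select MCQ in the range 0.0 to 1.0, penalizing wrong picks."""
    selected, correct = set(selected), set(correct)
    if not correct:
        raise ValueError("a question needs at least one correct option")
    hits = len(selected & correct)    # correct options that were ticked
    wrong = len(selected - correct)   # wrong options that were ticked
    return max(0, hits - wrong) / len(correct)


# Ticking every option no longer yields full marks.
print(score_mcq({"A", "B", "C", "D"}, {"A", "C"}))
print(score_mcq({"A", "C"}, {"A", "C"}))
```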
Anti Repeat Mechanism for experience points (XP)
Another rule we developed is to halve the total amount of experience points gained for each subsequent game play of the same case. This is to reduce the incentive of repeating the same case again and again for the sake of leveling up. Furthermore, students are already expected to score better after doing the case, as the answers are revealed at the end of each case.
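The halving rule can be sketched as follows, assuming the first play of a case earns the full amount and each later play of the same case earns half of the previous one. Rounding down to whole points is an assumption, and the function name is hypothetical.

```python
def xp_awarded(base_xp, previous_plays):
    """XP granted for a case, halved for each earlier play of the same case."""
    if base_xp < 0 or previous_plays < 0:
        raise ValueError("base_xp and previous_plays must be non-negative")
    return base_xp // (2 ** previous_plays)


# Same case played four times with the same base score.
print([xp_awarded(100, plays) for plays in range(4)])
```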
Added security features
In December 2017, our MongoDB database was compromised and held hostage by ransomware. We were instructed to pay 0.1 bitcoin (USD $1594) for the return of our data. Fortunately, we had backed up the data so there was no need for us to pay the ransom. Since this incident, we have taken additional measures to ensure that this does not happen again.
Quality of product
|Analysis||Use case diagram|
|Testing||User Acceptance Test 1 (11 - 13 Feb 2018)|
- Creation of test cases during development.
- Functionality testing after completion of function.
- Regression testing at the end of every iteration.
- We expect to complete 2 UATs by the end of the project.
- 1 UAT has been completed before Midterms. To view the results of this UAT, click here.
Our team has learnt the importance of working together closely with our clients, to ensure an ideal alignment between the client business requirements and our end-product. Throughout our development process, we are constantly plagued with minor hiccups and bugs. However, our team feels that our most valuable quality is resilience, as we tackle each and every problem with our ineffable tenacity and innovation.
It has been a very experimental journey managing the team since the project started. New challenges are always popping up, and I have learnt to manage the workload of each member more carefully. I believe communication is an important skill today, be it raising issues with the team or simply disseminating information. That is why I ensure that there are no communication lapses in our team.
It has been a challenging 7 weeks since our acceptance. The application has undergone multiple major changes, especially our database schema, to ensure better quality and continuity for our clients. The database attack was an eye-opening experience for me, and I have learnt never to take data security for granted.
Being the previous Business Analyst, it was an enriching and mind-boggling experience researching on various Natural Language Processing techniques and trying to appraise their adequacy and appropriateness for our application. Moving on as the Quality Assurance lead, I hope I do not offend everyone involved in development by scrutinizing every single corner case in the application.
The opportunity to apply the entire design thinking process, from the creation of prototypes to the actual web development, has been an arduous but rewarding journey. The final designs of the application are a product of multiple revamps, each better than the previous. I have learnt the importance of gathering feedback from the clients as well as real users.
Having to pick up Node.js and React.js in a short period of time really pushed me to learn and adapt at a much faster pace. There are many front-end and back-end considerations to factor in during the design process, such as the communication between parent and child components and the use of the Redux store versus a direct HTTP request to the database. I believe that my technical skills have definitely improved over the past few weeks.
It has been an amazing experience seeing how the project has grown from infancy to where it is today, and developing my own technical competence along the way. Over the past few months of working on the project, I have learnt that helping my team members with their tasks, be it coding or non-coding work, is even more important than completing the work assigned to me. This enables smoother and faster progress as a team.