Difference between revisions of "ISSS608 2017-18 T1 Assign ZHENG MIANYI"

From Visual Analytics and Applications
</div>
 
</div>

Revision as of 12:33, 15 October 2017



By Zheng Mianyi


Background

An epidemic broke out in Smartpolis, a major metropolitan area. Using the information provided (the city population, the disease symptoms, a geographical map, the city's weather and, most importantly, the residents' microblogs), I tried to trace how the disease was transmitted.


Data Preparation

The initial dataset stores latitude and longitude together in a single field, and the main information is contained in more than 1 million microblog records. Hence, I split the coordinates into two columns, namely latitude and longitude.
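
As a rough illustration of this split, here is a minimal Python sketch. It assumes the combined field holds the two numbers separated by whitespace. The actual delimiter in the dataset, the sample coordinates and the field names "Location", "Latitude" and "Longitude" are all assumptions made for illustration, not details taken from the data.

```python
from typing import Tuple


def split_location(location: str) -> Tuple[float, float]:
    """Split a combined 'latitude longitude' string into two floats.

    Assumes the two values are separated by whitespace (an assumption;
    adjust the separator if the real data uses a comma or similar).
    """
    lat_text, lon_text = location.split()
    return float(lat_text), float(lon_text)


# Hypothetical record: the field names and coordinates are made up.
record = {"Location": "42.25 93.5", "text": "feeling fine today"}
record["Latitude"], record["Longitude"] = split_location(record["Location"])
```

Keeping the original combined field alongside the two new columns makes it easy to check the split against the raw data.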


Subsequently, I chose keywords to select the relevant records. Personally, I prefer a relatively small dataset with higher accuracy to a large dataset with lower accuracy. After many trials, I set the target words as: "fever", "chill", "fatigue", "cough", "difficult", "nausea", "vomit", "diarrhea", "lymph" and "throat".
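
One possible way to apply these target words is a case-insensitive substring match, sketched below in Python. The matching rule is an assumption, since the exact filtering method is not described here, and the function names are hypothetical. Substring matching lets a stem such as "difficult" also catch "difficulty", but it can also pick up unrelated words that happen to contain a stem.

```python
# Target words as listed in the text above.
KEYWORDS = ["fever", "chill", "fatigue", "cough", "difficult",
            "nausea", "vomit", "diarrhea", "lymph", "throat"]


def matched_keywords(text):
    """Return the target words found in a microblog, in KEYWORDS order."""
    lowered = text.lower()
    return [keyword for keyword in KEYWORDS if keyword in lowered]


def is_relevant(text):
    """Keep a microblog only if it mentions at least one target word."""
    return bool(matched_keywords(text))


# Hypothetical sample messages, not taken from the dataset.
sample = ["I got fever and my throat is on fire.", "Nice weather today"]
relevant = [message for message in sample if is_relevant(message)]
```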


Last but not least, I attempted to explore more information. For instance, are there any initial symptoms before patients become ill? In addition, after viewing the symptoms, we can initially group them into two main problems: flu (those with fever, chills, fatigue, coughing, breathing difficulty, sore throat and enlarged lymph nodes) and stomach problems (those with nausea, vomiting or diarrhea). I stored these two types of problem in the "Type" column. As for symptoms, for patients who reported two or more, I created additional rows so that each symptom is stored separately in the "Symptom" column (e.g. a record like "I got fever and my throat is on fire." is recorded twice, once with the "fever" tag and once with the "sore throat" tag).
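
The tagging described above could be sketched in Python as follows. This is only an illustration under stated assumptions: the pairing of each target word with a symptom label is inferred from the two groups listed in the text, and the `expand_record` helper and the "text" field name are hypothetical.

```python
# Target word -> (symptom tag, problem type). The pairing is an
# assumption inferred from the flu / stomach problem groups above.
SYMPTOMS = {
    "fever": ("fever", "flu"),
    "chill": ("chills", "flu"),
    "fatigue": ("fatigue", "flu"),
    "cough": ("coughing", "flu"),
    "difficult": ("breathing difficulty", "flu"),
    "throat": ("sore throat", "flu"),
    "lymph": ("enlarged lymph nodes", "flu"),
    "nausea": ("nausea", "stomach problem"),
    "vomit": ("vomiting", "stomach problem"),
    "diarrhea": ("diarrhea", "stomach problem"),
}


def expand_record(text):
    """Return one row per symptom mentioned in a microblog."""
    lowered = text.lower()
    return [
        {"text": text, "Symptom": symptom, "Type": problem_type}
        for keyword, (symptom, problem_type) in SYMPTOMS.items()
        if keyword in lowered
    ]


# The example record from the text yields two rows, one per symptom.
rows = expand_record("I got fever and my throat is on fire.")
```

Storing one symptom per row means each symptom can be counted or filtered independently, at the cost of a message appearing more than once in the table.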