Difference between revisions of "ISSS608 2017-18 T1 Assign MA XIAOLIU Data Preparation"

From Visual Analytics and Applications

Revision as of 23:22, 15 October 2017

Epidemic Spread in Smartpolis

Original data

According to the overview, there are three datasets. Their contents are shown below:

Name          Description
Microblogs    The microblog text, the posting location, and the poster's ID.
Population    Total population and daytime population of 13 zones.
Weather       The weather, wind direction, and wind strength.

Find the useful microblogs

To decide whether a microblog is relevant, we first need keywords, that is, words related to the epidemic illness. In this case, the observed symptoms are largely flu-like and include fever, chills, sweats, aches and pains, fatigue, coughing, breathing difficulty, nausea and vomiting, diarrhea, and enlarged lymph nodes. As the disease spreads, it is reasonable to assume that words related to these symptoms will become more frequent. Based on the symptoms and the description of the flu, I set some keywords. If a text contains any of the keywords, it is treated as a useful text.

Keywords: 'flu', 'fever', 'chills', 'sweats', 'aches', 'pains', 'fatigue', 'coughing', 'breathing', 'nausea', 'vomiting', 'diarrhea', 'lymph', 'death'

Note: One possible objection is that most people posting are ordinary residents, not doctors or nurses, so they may use everyday words rather than medical terms, and this method will then miss many useful texts. However, we cannot be sure that a text which seems to be about disease but contains none of the keywords is actually related to this flu-like illness. The method is therefore still reasonable: it helps to find texts that more precisely fit the characteristics of the disease.

I extract the text, convert it to lowercase, remove the stop words, and apply stemming. If the processed text shares at least one word with the keyword list, it is a target text.
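The filtering step can be sketched roughly as follows. This is only an illustration, not the code used for the assignment: the stop-word set and `crude_stem` are small hypothetical stand-ins for a real stop-word list and stemmer, the keywords are stemmed as well so that both sides match (an assumption), and the sample posts are made up.

```python
import re

# Keyword list from this section.
KEYWORDS = ['flu', 'fever', 'chills', 'sweats', 'aches', 'pains', 'fatigue',
            'coughing', 'breathing', 'nausea', 'vomiting', 'diarrhea',
            'lymph', 'death']

# Assumption: a tiny illustrative stop-word set, not a complete list.
STOP_WORDS = {'a', 'an', 'the', 'i', 'my', 'is', 'am', 'are', 'and',
              'of', 'to', 'in', 'have', 'has', 'with', 'this', 'it'}


def crude_stem(word):
    """Strip a trailing 'ing' or 's' (hypothetical stand-in for a real stemmer)."""
    for suffix in ('ing', 's'):
        # Keep at least three letters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[:-len(suffix)]
    return word


# Stem the keywords once so they are comparable with the stemmed tokens.
KEYWORD_STEMS = {crude_stem(k) for k in KEYWORDS}


def preprocess(text):
    """Lowercase, tokenize, drop stop words, and stem the remaining tokens."""
    tokens = re.findall(r'[a-z]+', text.lower())
    return {crude_stem(t) for t in tokens if t not in STOP_WORDS}


def is_relevant(text):
    """True if the text shares at least one stem with the keyword list."""
    return bool(preprocess(text) & KEYWORD_STEMS)


# Made-up sample posts.
posts = ['I have a FEVER and keep coughing',
         'Great concert at the Vastopolis Dome tonight']
relevant = [p for p in posts if is_relevant(p)]
```

Here `relevant` keeps only the first post, because 'fever' and 'coughing' reduce to stems that also appear in the stemmed keyword list.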

Other adjustments

Location

I split the location into latitude and longitude. Because the longitudes are in the western hemisphere, I changed them to negative numbers.
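A minimal sketch of this split, assuming each location is stored as one string of the form "latitude longitude" with the longitude given as an unsigned number of degrees west. The function name `split_location` and the sample value are hypothetical.

```python
def split_location(location):
    """Return (latitude, longitude), with the west longitude made negative."""
    lat_text, lon_text = location.split()
    latitude = float(lat_text)
    # Assumption: degrees west are recorded as a positive number, so force
    # the sign to negative. Using abs() keeps an already negative value as-is.
    longitude = -abs(float(lon_text))
    return latitude, longitude


# Made-up sample value.
lat, lon = split_location('42.2337 93.3388')
```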

Symptom

I also added a column named 'Symptom' that records the keyword found in each text. This helps to learn more about the flu, such as which symptom appears first and how the symptoms change over time. All of this can be revealed from the text.

Text symptom.jpg
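Once each post carries a Symptom value, questions such as which symptom appears first become simple aggregations. The sketch below assumes each row is a pair of an ISO-formatted date string and the list of symptom keywords found in that post. The rows, dates, and function names are made up for illustration.

```python
from collections import Counter

# Hypothetical rows: (date, symptom keywords found in the post).
rows = [
    ('2017-01-01', ['fever', 'chills']),
    ('2017-01-01', ['fever']),
    ('2017-01-02', ['diarrhea']),
    ('2017-01-02', ['fever', 'coughing']),
]


def first_seen(rows):
    """Map each symptom to the earliest date on which it appears."""
    earliest = {}
    for date, symptoms in rows:
        for symptom in symptoms:
            # ISO date strings compare correctly as plain strings.
            if symptom not in earliest or date < earliest[symptom]:
                earliest[symptom] = date
    return earliest


def daily_counts(rows):
    """Count how often each symptom is mentioned on each date."""
    counts = {}
    for date, symptoms in rows:
        counts.setdefault(date, Counter()).update(symptoms)
    return counts


earliest = first_seen(rows)
per_day = daily_counts(rows)
```

Comparing `per_day` across dates gives a rough view of how the mix of symptoms shifts, while `earliest` points to candidate initial symptoms.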

Map

1. Water Supply (blue) - Residents and businesses get their drinking water by pumping it from nearby reservoirs or rivers. These distributed water systems are both publicly and privately owned.

2. Entertainment (yellow) - Vastopolis has two stadiums (Vastopolis Dome and Westside Stadium) for sports, concerts, and other events. The various lakes and the Vast River, which flows south at a steady rate of three miles per hour, are used for water-based sports and recreation.

3. City Administration (green) - Vastopolis has several locations of significance, including a state courthouse, a capitol building, a convention center, and a large airport.

4. Various hospitals

Map.jpg