Difference between revisions of "ISSS608 2016-17 T1 Assign2 Ye Jiatao"
Revision as of 22:29, 24 September 2016
Abstract
Wikipedia is a multilingual, web-based, free-content encyclopedia project supported by the Wikimedia Foundation and based on a model of openly editable content. Anyone can share their knowledge and insight through Wikipedia, which also makes it a very useful tool for educational purposes. Teachers can use Wikipedia to design educational activities, share teaching materials, and search for specific information. In this project, we use data visualization to explore the wikiHE4 data set and deliver useful insights to our readers. The theme of this case focuses on the popularity and usefulness of Wikipedia among different user groups.
Problems
In this project, we mainly answer the following questions related to the theme mentioned above.
- How popular is Wikipedia among different user groups?
- How do different user groups perceive the usefulness of Wikipedia?
- Is there a strong relationship between social image and behavioral intention?
Data-set
The wikiHE4 data set comes from UCI and was extracted from a survey of faculty members from two Spanish universities on teaching uses of Wikipedia. The data set consists of two main parts: the first part contains demographic information about each participant, and the second part contains the responses to a series of Likert scale questions covering a wide range of user experiences with Wikipedia. The raw data set is in .csv format, as shown below.
Approaches
Data Preparation
Before using the data for visualization, we need to carefully clean and reshape it into an appropriate format. In this case, we cannot use the original data set directly in Tableau, because most of the dimensions correspond to Likert scale questions. In addition, to make our visualizations more reader-friendly, we need to map the raw coded values back to their original meanings using the data dictionary. The detailed data preparation steps are as follows.
- Using Excel to separate the .csv data into corresponding columns.
- Mapping the demographic variables to readable values using the data dictionary.
- Combining the variables "UOC_POSITON" and "OTHERSTATUS" to derive a new variable "POSITON", which indicates each participant's occupational title.
- Deleting the 5 rows whose "POSITON" value is unknown.
- Using the Tableau add-in for Excel to reshape the data set, which splits each row into multiple rows, one per Likert scale question.
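The steps above were carried out in Excel and Tableau; the following is only a minimal Python sketch of the same idea, for readers who prefer a scripted version. Several details are assumptions not stated in this report: the ";" delimiter, the "?" marker for unknown values, the rule for combining the two position columns (fall back to "OTHERSTATUS" when "UOC_POSITON" is unknown), the gender codes in `GENDER_LABELS`, and the sample rows, which are made up for illustration.

```python
import csv
import io

MISSING = "?"  # assumed marker for unknown values

# Hypothetical excerpt of a data dictionary; the real codes may differ.
GENDER_LABELS = {"0": "Male", "1": "Female"}


def prepare(raw_text, likert_columns, delimiter=";"):
    """Clean the raw survey text and reshape it from wide to long format."""
    reader = csv.DictReader(io.StringIO(raw_text), delimiter=delimiter)
    long_rows = []
    for respondent_id, row in enumerate(reader, start=1):
        # Assumed combination rule: prefer UOC_POSITON, else OTHERSTATUS.
        position = row["UOC_POSITON"]
        if position == MISSING:
            position = row["OTHERSTATUS"]
        # Drop respondents whose derived position is still unknown.
        if position == MISSING:
            continue
        # Map the coded demographic value to a readable label.
        gender = GENDER_LABELS.get(row["GENDER"], row["GENDER"])
        # Emit one output row per Likert scale question.
        for question in likert_columns:
            long_rows.append({
                "ID": respondent_id,
                "GENDER": gender,
                "POSITON": position,
                "QUESTION": question,
                "SCORE": row[question],
            })
    return long_rows


# Made-up sample rows, not taken from the real data set.
SAMPLE = """GENDER;UOC_POSITON;OTHERSTATUS;PU1;PU2
0;2;?;4;3
1;?;5;2;5
1;?;?;3;3
"""

rows = prepare(SAMPLE, ["PU1", "PU2"])
for r in rows:
    print(r)
```

The long format (one row per respondent and question) is what lets a tool such as Tableau treat the question as a single dimension and the score as a single measure, instead of handling each Likert column separately.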
Popularity of Wikipedia
Usefulness of Wikipedia
Relationship between social image and behavioral intention
Tool Utilized
Tools used: Tableau, JMP, Parallel Set.
Charts used: tree-plot, mosaic plot, trellis, stacked bar chart, parallel set, bar chart.