Difference between revisions of "ISSS608 2016-17 T1 Assign2 XU Qiuhui"

From Visual Analytics and Applications

Revision as of 16:28, 26 September 2016

Data Sources

The dataset comes from the UCI Machine Learning Repository: a survey of faculty members from two Spanish universities on teaching uses of Wikipedia.

Source: E. Aibar, J. Lladós, A. Meseguer, J. Minguillón (jminguillona[at]uoc[dot]edu), M. Lerga. Universitat Oberta de Catalunya, Barcelona, Spain.

Theme of Interest and Motivation

This analysis aims to uncover the overall impressions that different user segments have of Wikipedia, and their usage behavior, from the answers to a high-dimensional set of survey questions, and then to propose recommendations for Wikipedia's future development. The analysis mainly addresses the following questions:

  1. Relationships between user impressions and user behaviors.
  2. Relationships between user behaviors and external environments.

Data Preparation

Convert Data Types

Variable | Original Data Type | Converted Data Type | Reason
Gender | Numeric | Categorical | Per the dataset dictionary, Gender is a coded category, so analyzing it as a numeric value is meaningless.
PhD | Numeric | Categorical | Per the dataset dictionary, PhD is a coded category, so analyzing it as a numeric value is meaningless.
University | Numeric | Categorical | Per the dataset dictionary, University is a coded category, so analyzing it as a numeric value is meaningless.
YearsExp | Categorical | Numeric | Years of experience is continuous data, so it is converted to numeric first, then binned into several groups that are used for classification.

Bin Numeric Data

Original Variable | Transformed Variable | Formula
Age | Age(bin) | If(:AGE <= 30,"20~30",If(:AGE <= 40,"30~40",If(:AGE <= 50,"40~50",If(:AGE <= 60,"50~60","60~70"))))
YearsExp | YearsExp(bin) | If( :YEARSEXP <= 10,"0~10",If( :YEARSEXP <= 20,"10~20",If( :YEARSEXP <= 30,"20~30","more than 30")))
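For readers without JMP, the two nested If() formulas above can be approximated in Python. This is only a sketch of the same binning logic, not the code used in the assignment; the function names `bin_age` and `bin_years_exp` are hypothetical, and the labels and upper bounds are copied from the formulas.

```python
from bisect import bisect_left

# Inclusive upper bounds and labels copied from the JMP formulas.
AGE_EDGES = [30, 40, 50, 60]
AGE_LABELS = ["20~30", "30~40", "40~50", "50~60", "60~70"]

EXP_EDGES = [10, 20, 30]
EXP_LABELS = ["0~10", "10~20", "20~30", "more than 30"]


def bin_age(age):
    """Return the Age(bin) label; each edge belongs to the lower bin (<=)."""
    # bisect_left gives the index of the first edge that is >= age,
    # which matches the chain of "<=" tests in the nested If().
    return AGE_LABELS[bisect_left(AGE_EDGES, age)]


def bin_years_exp(years):
    """Return the YearsExp(bin) label using the same inclusive-edge rule."""
    return EXP_LABELS[bisect_left(EXP_EDGES, years)]
```

As in the original formulas, a value exactly on a boundary (for example, age 30) falls into the lower bin, and anything above the last edge falls into the final bin regardless of how large it is.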

Group Categorical Data

Map every survey answer scored 1-5 to one of three degrees: “Low”, “Mid”, or “High”.

Scores Degree
1 Low
2 Low
3 Mid
4 High
5 High
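The mapping in the table above is a plain lookup. A minimal Python sketch, assuming scores arrive as integers; the names `SCORE_TO_DEGREE` and `to_degree` are hypothetical, and returning `None` for anything outside 1-5 (such as a missing answer) is an assumption, since the source does not say how such values were handled.

```python
# Lookup table copied from the Scores/Degree table.
SCORE_TO_DEGREE = {1: "Low", 2: "Low", 3: "Mid", 4: "High", 5: "High"}


def to_degree(score):
    """Map a 1-5 score to its degree; return None for any other value (assumption)."""
    return SCORE_TO_DEGREE.get(score)
```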

Insert New Column

Insert a new column, UserID, that uniquely identifies each user in the dataset.

Variable | Data Type | Example | Description
UserID | Categorical | “U1”, “U2”, …, “U913” | Each UserID uniquely identifies a user in the dataset.
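One way to generate such identifiers is to number the rows in order, starting at 1. The sketch below assumes each row is held as a Python dict; the function name `add_user_ids` is hypothetical and this is not the procedure used in JMP or Excel for the assignment.

```python
def add_user_ids(rows):
    """Return copies of the rows with a UserID of the form "U1", "U2", ... added."""
    # enumerate(start=1) follows the row order, so IDs are unique and stable
    # as long as the row order does not change.
    return [{"UserID": f"U{i}", **row} for i, row in enumerate(rows, start=1)]
```

Applied to a table with 913 rows, the last identifier produced is "U913", matching the example in the table above.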

Visualization

Parallel Set

Analysis

Users Overview

[Figure: overview]
  1. Among survey respondents, the numbers with and without a PhD are comparable, with somewhat more respondents who do not hold a PhD.
  2. As years of experience increase, the number of respondents decreases.
  3. Almost half of the respondents come from an unknown domain; the others mainly come from arts & humanities, engineering, and law.
  4. Among all respondents, adjuncts are the dominant group.

User impressions and behaviors

Overall Impressions

[Figure: Overall Impression]

We selected four dimensions (usefulness, ease of use, enjoyment, and quality) and examined the score distribution and average score of each to gauge people's overall impression of Wikipedia. The average scores are generally higher than 3, the midpoint of the 1-5 scale, so we consider the overall impression of Wikipedia to be positive. We then drilled down into each dimension. Quality has the lowest average score, and its number of responses falls as the score increases. Among the four dimensions, we therefore consider quality to be Wikipedia's weakest feature.

Relationships

[Figures: 1.1, 1.2]

A large proportion of respondents who do not use Wikipedia for teaching nevertheless have a very good impression of it; they are potential users.

[Figure: impression]

Information on Wikipedia is considered up to date and relatively reliable, but still of lower quality than other educational resources.

[Figure: impression2]

Even though Wikipedia is considered to be of lower quality, users still trust its editing system.

User behaviors and external environments

[Figure: 2.1]

The external environment tends to have a strong influence on behavioral intention. The parallel set shows clearly that almost all respondents whose colleagues neither use Wikipedia nor regard it favorably do not intend to use Wikipedia in their teaching in the future.

Relationships beneath the surface

Conclusions and Recommendations

Key Findings

Recommendations

Tools Utilized

  1. High-D - For initial data exploration and analysis
  2. JMP 12, MS Excel - For data preparation
  3. d3.js, Tableau, Treemap - For data visualization