ISSS608 2016-17 T1 Assign2 Ho Li Chin

From Visual Analytics and Applications

Abstract

In this assignment, interactive data exploration and analysis techniques are applied to discover patterns in multivariate data.

Theme of Interest

The dataset chosen for this analysis is the “wiki4HE Data Set” (https://archive.ics.uci.edu/ml/datasets/wiki4HE).

The theme of interest is to explore the factors affecting the use of Wikipedia in Higher Education. Wiki technology emerged in higher education teaching and learning experiences as early as 1999 and is integrated into many courses for its ability to provide a collaborative environment for Academic staff and students.

The objective of this assignment is to understand the main factors that influence the teaching uses of Wikipedia among university faculty staff, and whether any of those factors have a significant direct impact on the Behavioral Intention to adopt Wikipedia in higher education. The findings are intended to support more effective and efficient methods of maximizing the teaching and learning experience.

Examine the Survey Questions

In this section, we first examine the survey questions in the dataset. The survey questions are grouped into broader categories, and each category is mapped to one construct measure for investigation purposes. These measures are then used to derive the questions for investigation in the next section.

Survey Question Category: Perceived Usefulness
Survey Questions (Variables):

  • PU1: The use of Wikipedia makes it easier for students to develop new skills
  • PU2: The use of Wikipedia improves students' learning
  • PU3: Wikipedia is useful for teaching

Measure: User perception of technological innovations using wiki technology

Survey Question Category: Perceived Ease of Use
Survey Questions (Variables):

  • PEU1: Wikipedia is user friendly
  • PEU2: It is easy to find in Wikipedia the information you seek
  • PEU3: It is easy to add or edit information in Wikipedia

Measure: User perception of technological innovations using wiki technology

Survey Question Category: Perceived Enjoyment
Survey Questions (Variables):

  • ENJ1: The use of Wikipedia stimulates curiosity
  • ENJ2: The use of Wikipedia is entertaining

Measure: User perception of technological innovations using wiki technology

Survey Question Category: Use behavior
Survey Questions (Variables):

  • USE1: I use Wikipedia to develop my teaching materials
  • USE2: I use Wikipedia as a platform to develop educational activities with students
  • USE3: I recommend my students to use Wikipedia
  • USE4: I recommend my colleagues to use Wikipedia
  • USE5: I agree my students use Wikipedia in my courses

Measure: Motivation to use wiki

Survey Question Category: Experience
Survey Questions (Variables):

  • EXP1: I consult Wikipedia for issues related to my field of expertise
  • EXP2: I consult Wikipedia for other academic related issues
  • EXP3: I consult Wikipedia for personal issues
  • EXP4: I contribute to Wikipedia (editions, revisions, articles improvement...)
  • EXP5: I use wikis to work with my students

Measure: Motivation to use wiki

Survey Question Category: Job relevance
Survey Questions (Variables):

  • JR1: My university promotes the use of open collaborative environments in the Internet
  • JR2: My university considers the use of open collaborative environments in the Internet as a teaching merit

Measure: Motivation to use wiki

Survey Question Category: Sharing attitude
Survey Questions (Variables):

  • SA1: It is important to share academic content in open platforms
  • SA2: It is important to publish research results in other media than academic journals or books
  • SA3: It is important that students become familiar with online collaborative environments

Measure: Collaborative Mindset / Attitude

Survey Question Category: Profile 2.0
Survey Questions (Variables):

  • PF1: I contribute to blogs
  • PF2: I actively participate in social networks
  • PF3: I publish academic content in open platforms

Measure: Collaborative Mindset / Attitude

Survey Question Category: Quality
Survey Questions (Variables):

  • QU1: Articles in Wikipedia are reliable
  • QU2: Articles in Wikipedia are updated
  • QU3: Articles in Wikipedia are comprehensive
  • QU4: In my area of expertise, Wikipedia has a lower quality than other educational resources
  • QU5: I trust in the editing system of Wikipedia

Measure: Perceived quality of wiki information

Survey Question Category: Social Image
Survey Questions (Variables):

  • IM1: The use of Wikipedia is well considered among colleagues
  • IM2: In academia, sharing open educational resources is appreciated
  • IM3: My colleagues use Wikipedia

Measure: Social Influence

Survey Question Category: Visibility
Survey Questions (Variables):

  • VIS1: Wikipedia improves visibility of students' work
  • VIS2: It is easy to have a record of the contributions made in Wikipedia
  • VIS3: I cite Wikipedia in my academic papers

Measure: Social Influence

Survey Question Category: Behavioral intention
Survey Questions (Variables):

  • BI1: In the future I will recommend the use of Wikipedia to my colleagues and students
  • BI2: In the future I will use Wikipedia in my teaching activity

Measure: Intention to use wiki


Questions for Investigation

Based on the construct measures defined in Section 3, the following list of questions for investigation was derived. Different visualizations will be constructed to present the data and to answer the questions.

List of questions for investigation:

  • a) What is the relationship between the different factors that could influence the use of Wikipedia among university faculty?
    • What are the positive factors and the negative influence factors?


  • b) What are the likely factors that negatively influence the Behavior Intention to adopt Wikipedia?
    • Are there identifiable attributes (e.g. age, academic position, current wiki user) that result in a negative Behavior Intention to use Wikipedia?
    • Are there skeptical attitudes among university faculty regarding the use of Wikipedia in class?
   The two survey questions regarding Behavior Intention were:
   BI1: In the future I will recommend the use of Wikipedia to my colleagues and students
   BI2: In the future I will use Wikipedia in my teaching activity


  • c) How do the measures listed below influence the Behavior Intention to use Wikipedia?
    • User perception of technological innovations using wiki technology
    • Motivation factors to use Wikipedia
    • Social Influence
    • Perceived quality of information using wiki


Analysis of Wiki For Higher Education Dataset through Visualisation

Data Source

The data source for the Wiki dataset can be obtained from

  1. UCI dataset [[1]]

Data Preparation

Data preparation was done mainly in JMP. The steps can be summarised as follows:

  • Load the wiki4HE.csv file into JMP
  • Examine the attribute data types
    • Convert the data types of some attributes from Numeric to Categorical
  • Recode the values of some categorical attributes, e.g. Gender was recoded as "Male" and "Female"
  • Check the missing data pattern
  • For categorical variables such as Domain and UOC_Position, recode the missing values as "Unknown"
  • For Survey Scale values, recode the missing values as a "Don't Know" response
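
For readers without JMP, the recoding steps above could be approximated in Python. This is only a minimal sketch, not the JMP workflow used in this assignment. It assumes that missing cells are marked with "?" and that Gender is coded 0 for Male and 1 for Female; the helper name recode_row and the sample record are illustrative.

```python
# Assumptions (not stated in this write-up): missing cells are "?",
# and Gender is coded "0" = Male, "1" = Female.
MISSING = "?"
GENDER_LABELS = {"0": "Male", "1": "Female"}
CATEGORICAL = {"Domain", "UOC_Position"}  # categorical columns named in the steps above


def recode_row(row):
    """Return a recoded copy of one survey record (a dict of column -> string)."""
    out = {}
    for col, val in row.items():
        if col == "Gender":
            out[col] = GENDER_LABELS.get(val, "Unknown")
        elif val == MISSING:
            # Categorical attributes become "Unknown"; every other column is
            # treated here as a survey scale item and becomes "Don't Know".
            out[col] = "Unknown" if col in CATEGORICAL else "Don't Know"
        else:
            out[col] = val
    return out


# Illustrative record, not taken from the dataset.
sample = {"Gender": "1", "Domain": "?", "UOC_Position": "2", "PU1": "4", "PU2": "?"}
cleaned = recode_row(sample)
print(cleaned)
```

Keeping "Don't Know" separate from "Unknown" mirrors the distinction made above between missing survey responses and missing profile attributes.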

Data Cleansing & Transformation

The data transformation was done in JMP over a few iterations.

  • First, summary and distribution analysis was done for the categorical variables (univariate analysis).
  • Then, for the continuous survey scale responses, multivariate analysis was done to examine the relationships between the variables. Pair-wise correlation, Ternary Plot and Parallel Plot were used for this data exploration.
  • To prepare the dataset for Likert scale analysis of the survey questions, all survey question variables were stacked using the JMP Stack function, turning the columns into rows under two columns, namely "Survey Qns" and "Scale".
    • One new column was created to recode the Scale as a Categorical type, labelling each Likert scale value as follows: 1-Strongly Disagree, 2-Disagree, 3-Neutral, 4-Agree, 5-Strongly Agree. Missing values were recoded as "Don't Know".
  • Lastly, the cleaned data file was imported into Tableau and Qlik Sense for further analysis through data visualisation.
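
The stacking step described above (wide survey columns turned into "Survey Qns" and "Scale" rows) can be illustrated with a short Python sketch. It is an approximation of what the JMP Stack function does, not the JMP implementation; the function name stack, the "RespondentID" column and the "Scale Label" column name are assumptions made for illustration.

```python
# Likert labels as listed in the data transformation steps.
LIKERT = {
    1: "Strongly Disagree",
    2: "Disagree",
    3: "Neutral",
    4: "Agree",
    5: "Strongly Agree",
}


def stack(rows, id_col, question_cols):
    """Turn one row per respondent into one row per (respondent, question)."""
    stacked = []
    for row in rows:
        for question in question_cols:
            scale = row.get(question)  # None stands for a missing response
            stacked.append({
                id_col: row[id_col],
                "Survey Qns": question,
                "Scale": scale,
                # Missing or out-of-range values fall back to "Don't Know".
                "Scale Label": LIKERT.get(scale, "Don't Know"),
            })
    return stacked


# Illustrative wide-format records, not taken from the dataset.
wide = [
    {"RespondentID": 1, "PU1": 4, "PU2": None},
    {"RespondentID": 2, "PU1": 2, "PU2": 5},
]
long_rows = stack(wide, "RespondentID", ["PU1", "PU2"])
for r in long_rows:
    print(r)
```

The long format is what makes it possible to chart all survey questions on one Likert scale axis in the later dashboards.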

Visualisation to Answer the Questions under investigation

Demographics Profiles of the University Faculty Staff

Before analysing the survey variables that could affect the adoption of Wikipedia among university staff, let’s take a look at the demographic information of the university faculty staff who participated in this survey.

Dashboard #1

Dashboard #1 Demographics Info [[2]] (click here)

Visualisation Design

The above Tableau dashboard is interactive: the user can apply Gender and WikiUser as filters to explore the University, PhD and Domain details.

Findings
In summary, a total of 913 university faculty staff participated in the survey. Some key demographic characteristics of the staff are:

  • There were more male staff (57.5%) than female staff (42.5%)
  • About 86% of the staff were non-Wiki users
  • About 88% of the staff were from UOC University, and of these, 56.6% had PhD qualifications
  • Across all staff, apart from Others (which has the highest percentage), 20% were from the Arts & Humanities domain, followed by 15% from Engineering & Sciences and 11% from Law & Politics

Staff Profile Dashboard

The following Tableau dashboard provides an interactive visualisation for drilling down into more detailed profile information: Age Group by Domain, Gender % for each domain, and the academic position profile for each domain area.

Dashboard #2

Dashboard #2 Staff Profile Dashboard [[3]]


Findings

  • In all domains, Adjunct was the most common academic position among the faculty staff
  • From the Age Group vs Domain profile, for almost all domains the main age group was 41 to 45, followed by 36 to 40. In other words, most staff were middle-aged, between 36 and 45.
  • Drilling down further into the Arts & Humanities faculty, a high percentage of staff (75%) were Adjunct staff. Of these, 57.7% were female, and about 50% of the Adjunct staff were in the age range 41-50.

ArtsHumanities


The Various Factors that Influence the Use of Wikipedia

Next, we will analyse at a macro level all the various factors that could influence the adoption of Wikipedia among the University staff.

The following Tableau dashboard provides an interactive visualisation to examine the positive and negative factors that affect the use of Wikipedia among university staff in higher education.

Dashboard #3

Dashboard #3 Survey Analysis Dashboard [[4]] (Click here)

Visualisation Design

The above Tableau dashboard contains a Heat Map that visualizes the % of survey responses at each scale value for each survey category. Note that the % is computed as % of Total within each category, which means the percentages for each category should add up to 100%. This allows us to identify the positive and negative factors (variables) that influence the Behavior Intention to use Wikipedia.

Selecting any category in the Heat Map lets the user drill down into the divergent bar chart in the right quadrant, which shows the Likert scale responses for each survey question in that category.

This dashboard allows the user to examine the question “What are the positive and negative influence factors that could affect the uses of Wikipedia among university faculty?”

In addition, the Age Group, Academic Profile and Years of Experience were included in the bottom quadrant so that we can further examine the likely factors that negatively influence the Behavior Intention to adopt Wikipedia, and whether any identifiable attributes (e.g. age, academic position, current wiki user) are associated with a negative Behavior Intention to use Wikipedia.
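
The "% of Total within each category" computation used by the Heat Map can be sketched outside Tableau as follows. This is a hedged illustration of the arithmetic only, not the Tableau table calculation itself; the function name percent_by_category and the sample records are made up for the example.

```python
from collections import Counter, defaultdict


def percent_by_category(records):
    """records: iterable of (category, scale_label) pairs, one per response.

    Returns {category: {scale_label: percent}}, where the percentages
    within each category sum to 100.
    """
    counts = defaultdict(Counter)
    for category, label in records:
        counts[category][label] += 1

    result = {}
    for category, counter in counts.items():
        total = sum(counter.values())
        result[category] = {
            label: 100.0 * n / total for label, n in counter.items()
        }
    return result


# Made-up responses for illustration; these are not values from the survey.
records = [
    ("Quality", "Agree"),
    ("Quality", "Neutral"),
    ("Quality", "Neutral"),
    ("Quality", "Disagree"),
    ("Sharing Attitude", "Agree"),
    ("Sharing Attitude", "Strongly Agree"),
]
heat = percent_by_category(records)
print(heat["Quality"])
```

Because each category is normalised separately, categories with different numbers of questions remain comparable in the Heat Map.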

Findings
To answer Investigation (a), “What are the positive factors and the negative influence factors?”

  • 1) Positive factors (a high response % of Strongly Agree + Agree) are:
    Sharing Attitude (80.6%), Perceived Ease of Use (67%), Perceived Enjoyment (66.2%)
  • 2) Negative factors (a high response % of Disagree + Strongly Disagree) are:
    Use Behavior (51.5%) and Profile 2.0 (51.3%)
  • 3) Some survey question categories received more than 30% Neutral responses. They are:
    Perceived Usefulness, Quality, Social Image, Visibility, and Behavioral Intention


To answer Investigation (b), “What are the likely factors that negatively influence the Behavior Intention to adopt Wikipedia?”

From the Heat Map of % of Scale Response for Each Category, it is observed that for Behavior Intention 30.4% gave a positive response (SA + A), 34% gave a negative response (D + SD), and 36.7% were neutral.

HeatMap

Next, we compare Wiki users with Non-Wiki users to see whether there are any patterns in their positive and negative responses for the Behavior Intention to adopt Wikipedia.

Wiki Users

WikiUsers


Findings

  • Wiki users largely had less than 10 years of work experience and were in the age group of 41 to 45. Most of the staff in this group were Adjunct staff, across all domain areas.
  • Almost 55% of the Wiki users were positive about the Behavior Intention to use Wikipedia, 30% were Neutral, and about 15% indicated D + SD.
  • They were generally very positive about Perceived Ease of Use, Perceived Enjoyment and Sharing Attitude.
  • On the other hand (as shown in the visualisation below), Use Behavior and Profile 2.0 scored poorly in terms of % of D + SD responses.
  • The survey questions with a high % of D + SD responses were:
    • I use Wikipedia as a platform to develop educational activities with students
    • I use Wikipedia to develop my teaching materials
    • I contribute to blogs
    • The use of Wikipedia is well considered among colleagues

WikiUser_BI


Non-Wiki Users

NonWiki

Findings

  • Non-Wiki users, like Wiki users, largely had less than 10 years of work experience and were in the age group of 41 to 45. Most of the staff in this group were Adjunct staff, across all domain areas.
  • Non-Wiki users gave more Neutral responses in most survey categories than Wiki users.
  • Non-Wiki users were more negative about Behavior Intention (about 36% D + SD, compared with 15% for Wiki users). Less than 30% indicated A + SA, and about 38% were Neutral.
  • Like Wiki users, they were generally more positive about Perceived Ease of Use, Perceived Enjoyment and Sharing Attitude.
  • On the other hand, as shown in the visualisation below, Use Behavior, Profile 2.0 and Social Image scored poorly in terms of % of D + SD responses.
  • The survey questions with a high % of D + SD responses were:
    • I use Wikipedia as a platform to develop educational activities with students
    • I use Wikipedia to develop my teaching materials
    • I contribute to blogs
    • The use of Wikipedia is well considered among colleagues
    • I publish academic content in open platforms

NonWiki_BI


The Key Factors that Negatively Influence the Behavior Intention to adopt Wikipedia

The two survey questions for BI are:

    • BI1: In the future I will recommend the use of Wikipedia to my colleagues and students
    • BI2: In the future I will use Wikipedia in my teaching activity

Here, we use JMP to compute Pairwise Correlations as part of the multivariate analysis. In particular, we want to identify the factors (variables) that are highly correlated with Behavioral Intention.

JMP

From the above, the variables that are highly correlated with Behavioral Intention are:

  • Use Behavior, Experience, Visibility, Quality and Social Image
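
As a rough illustration of what a pairwise correlation measures, the sketch below computes the Pearson correlation coefficient between a construct's average scale and Behavioral Intention. It assumes Pearson correlation is the statistic of interest; the function name pearson and all numbers are toy values invented for the example, not results from the wiki4HE data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Toy average scales for five hypothetical respondents (not survey data).
behavioral_intention = [1, 2, 3, 4, 5]
constructs = {
    "Use Behavior": [1, 2, 2, 4, 5],
    "Perceived Ease of Use": [4, 4, 5, 4, 5],
}

# Rank constructs by how strongly they correlate with Behavioral Intention.
ranked = sorted(
    constructs,
    key=lambda name: pearson(constructs[name], behavioral_intention),
    reverse=True,
)
print(ranked)
```

A coefficient close to 1 means respondents who score high on the construct also tend to score high on Behavioral Intention; it does not by itself establish that the construct causes the intention.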

Next, a Parallel Plot is used to check whether the above are the key factors worth investigating in detail.

JMP

The plot was colored by Behavior Intention, as shown below.

JMP

From the Parallel Plot above, it is noticeable that the bright green markers are clustered in the lower range for Use Behavior, Experience and Visibility. We will therefore examine the following measures further in the next section.

  • Motivation to use wiki (Use Behavior, Experience)
  • Social Influence (Social Image, Visibility)
  • Perceived Quality (Quality)


Investigation on key factors

We now investigate further the key measures identified in the section above.

  • Motivation factors to use Wikipedia (Use Behavior, Experience)
  • Social Influence (Social Image, Visibility)
  • Perceived quality of information using wiki (Quality)

A Qlik Sense App has been created to provide an interactive visualisation for drilling down into the relationships between various attributes (age, academic position, domain, current wiki users, survey questions) with regard to the positive and negative responses for each measure. The visualisation objects used in the Qlik Sense app include Radar Chart, Dependency Wheel, Bar Charts and Sankey Diagram.

Dashboard #5

Qlik Sense App [[5]] (click here)

Youtube [[6]] (click here)


Visualisation Design

There are three dashboards in the Qlik Sense App. The purpose of these dashboards is to uncover the following insights:

  • Identify any attributes (e.g. age, academic position, current wiki user) that could influence the staff's Behavior Intention to use Wikipedia
  • Detect any skeptical attitudes among university faculty regarding the use of Wikipedia in class
  • Establish any relation to disciplinary factors (e.g. academic position or domain area) or any implicit conflict between the scientific academic culture and Wikipedia culture.


Findings

Here, we focus the discussion on findings in three areas, namely the motivation factors to use wiki, the social influence, and lastly the perceived quality of wiki information.


  a) Motivation Factors to adopt Wiki (Use Behavior, Experience)

In the Survey Response Dashboard, we first filter on only the two Survey Categories Use Behavior and Experience, as shown below. All the chart visualizations are now associated with the filtered data based on this selection.

Motivation_a


Next, we identify some attributes observed in the Disagree and Strongly Disagree (D + SD) response group.

Motivation_b


Motivation_c


The above two visualisations provide the following insights.

Note: The % in brackets is the % of all staff who indicated A+SA in the survey questions regarding Motivation.

  • "Poor in Motivation" responses (Disagree + Strongly Disagree)
    • Survey questions with a high number of (Disagree + Strongly Disagree) responses are:
      • EXP4 (17.6%): I contribute to Wikipedia (editions, revisions, articles improvement...)
      • USE2 (15.8%): I use Wikipedia as a platform to develop educational activities with students
      • USE1 (14.0%): I use Wikipedia to develop my teaching materials
  • "Good Motivation" responses were contributed mainly by Experience, followed by Use Behavior
    • Survey questions with a high number of (Agree + Strongly Agree) responses are:
      • EXP3 (20.4%): I consult Wikipedia for personal issues
      • EXP2 (18.1%): I consult Wikipedia for other academic related issues
      • USE5 (15.0%): I agree my students use Wikipedia in my courses
  • Demographics
    • The main group was Adjunct staff.
    • Most had less than 20 years of experience.


  b) Social Influence (Social Image, Visibility)

Using the same interactive techniques as described above, we now examine the effect of Social Influence on the adoption of Wikipedia.

SI_a

SI_b


Observations:

  • When only those with an average Behavior Intention scale of less than 2.0 are selected, most of the responses given for Social Image and Visibility are clustered around a scale below 2.50.
  • The biggest contributions to a low average Behavior Intention scale come from Disagree (30.6%), Neutral (26.7%) and Strongly Disagree responses to the Social Image and Visibility survey questions.

  c) Perceived quality of wiki information (Quality)

Lastly, the observations for the perceived quality of Wikipedia information are as follows:

  • QU4 (26.6%): In my area of expertise, Wikipedia has a lower quality than other educational resources
  • QU3 (21.9%): Articles in Wikipedia are comprehensive (25%)
  • QU5(18.9%): I trust in the editing system of Wikipedia (22.6%)


Tools Used and Visualisation Links

Visualisation Toolkit:
1) Tableau 10.0
2) JMP
3) Qlik Sense


The visualisation links are as follows:

1) Tableau

  • Demographics Info [[7]]
  • Staff Profile Dashboard [[8]]
  • Survey Analysis Dashboard [[9]]

2) Qlik Sense

  • Qlik Sense App [[10]]
  • Youtube Link

Walk through of Qlik Sense App via Youtube [[11]]


3) JMP 12.2

4) Other tools used in the exploration

  • Mondrian
  • High-D from MacroFocus

Conclusion

When a dataset has a high number of dimensions and several questions to be answered, we first need to identify the relevant parameters for each question. The type of analysis varies with the nature of the data and the specific relationships that we want to discover and understand. There are different ways to visualise multivariate data, and each type of investigation is most effectively pursued with particular types of visualization and particular techniques for interacting with the data. In this assignment, several visualizations, e.g. heat maps, glyphs, parallel plots and divergent bar charts, were used to bring the data to light.