Introduction

Most faculty members tend to avoid using Wikipedia as a reference source or as teaching material. The given dataset comes from a survey of faculty members about Wikipedia and online platforms. The theme I have chosen is the usage of Wikipedia. I would like to find out why faculty members use Wikipedia, what kind of faculty members tend to use it, and whether UOC and UPF members differ in their usage and perception of Wikipedia. Answering these questions will help me understand why faculty members encourage, or might in future encourage, the use of Wikipedia.

Questions for Investigation:

What are the main reasons for faculty members to use Wikipedia?
What is the profile of faculty members who are users of Wikipedia?
Are there differences between faculty members solely in UOC and those in UPF with regard to their usage of Wikipedia?

Data

The data was taken from Wiki4HE. It consists of responses to survey questions given to university faculty members to find out about their perceptions and practices of using Wikipedia.

The attribute table can also be found on Wiki4HE.

AGE: numeric
GENDER: 0=Male; 1=Female
DOMAIN: 1=Arts & Humanities; 2=Sciences; 3=Health Sciences; 4=Engineering & Architecture; 5=Law & Politics
PhD: 0=No; 1=Yes
YEARSEXP (years of university teaching experience): numeric
UNIVERSITY: 1=UOC; 2=UPF
UOC_POSITION (academic position of UOC members): 1=Professor; 2=Associate; 3=Assistant; 4=Lecturer; 5=Instructor; 6=Adjunct
OTHER (main job in another university for part-time members): 1=Yes; 2=No
OTHER_POSITION (work as part-time in another university and UPF members): 1=Professor; 2=Associate; 3=Assistant; 4=Lecturer; 5=Instructor; 6=Adjunct
USERWIKI (Wikipedia registered user): 0=No; 1=Yes
The 43 survey items are rated on a Likert scale (1-5) ranging from strongly disagree / never (1) to strongly agree / always (5).

Data Cleaning

Recoding data

Inconsistent attributes with the given attributes information table

Inconsistent data

Inverse data values

“QU4: In my area of expertise, Wikipedia has a lower quality than other educational resources”
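
QU4 is worded negatively, so a high score is unfavourable to Wikipedia, unlike the other quality items. A minimal sketch of the inversion, assuming each response is an integer from 1 to 5 and a missing answer is stored as None. The function name and sample values are hypothetical, and the original recoding was presumably done in a spreadsheet or in the visualisation tool rather than in code.

```python
def reverse_likert(value, scale_min=1, scale_max=5):
    """Flip a Likert response so that a high score is favourable to Wikipedia."""
    if value is None:
        # Keep missing answers missing rather than inventing a score.
        return None
    if not scale_min <= value <= scale_max:
        raise ValueError(f"response {value} is outside {scale_min}-{scale_max}")
    # On a 1-5 scale this maps 1->5, 2->4, 3->3, 4->2, 5->1.
    return scale_min + scale_max - value


# Hypothetical QU4 answers, not taken from the dataset.
qu4_raw = [1, 2, 3, 4, 5, None]
qu4_inverted = [reverse_likert(v) for v in qu4_raw]
```

After the inversion, QU4 can be read on the same red-to-blue scale as QU1 to QU3.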

New columns

Grouping survey questions into different categories

Question Categories

I categorised the questions according to their codes.

Perceived Usefulness
PU1: The use of Wikipedia makes it easier for students to develop new skills
PU2: The use of Wikipedia improves students' learning
PU3: Wikipedia is useful for teaching

In this case, PU1, PU2 and PU3 will be placed in the group named “PU”. I categorised all of the other questions the same way.

Group Categories

Categories	Questions	Code
Teaching Resource	QU1, QU2, QU3, QU4, VIS3, USE1, BI1, BI2, EXP1, EXP2	TR
Collaborative Platform	VIS1, VIS2, EXP4, EXP5, USE2	CP
Perception of Online Platforms	PF1, PF2, PF3, SA1, SA2, SA3	OPP
Perception of Wikipedia	ENJ1, ENJ2, PEU1, PEU2, PEU3, PU1, PU2, PU3	WP
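
The grouping in the table above can be expressed as a simple lookup. This is only a sketch of the mapping; the variable and function names are hypothetical, while the question codes and category codes are copied from the table.

```python
# Category code -> question codes, as listed in the Group Categories table.
CATEGORIES = {
    "TR": ["QU1", "QU2", "QU3", "QU4", "VIS3", "USE1", "BI1", "BI2", "EXP1", "EXP2"],
    "CP": ["VIS1", "VIS2", "EXP4", "EXP5", "USE2"],
    "OPP": ["PF1", "PF2", "PF3", "SA1", "SA2", "SA3"],
    "WP": ["ENJ1", "ENJ2", "PEU1", "PEU2", "PEU3", "PU1", "PU2", "PU3"],
}

# Inverted lookup: question code -> category code.
QUESTION_TO_CATEGORY = {
    question: category
    for category, questions in CATEGORIES.items()
    for question in questions
}


def category_of(question_code):
    """Return the category code for a question, or None if it is not grouped."""
    return QUESTION_TO_CATEGORY.get(question_code.upper())
```

For example, category_of("PU1") gives "WP", and a code that is not in the table gives None.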

Transposing the data in Excel
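
A sketch of what this step might look like in code, assuming that the transpose means reshaping the sheet from one column per question (wide) to one row per respondent and question (long), which is the layout that divergent bar charts of survey data usually need. The function name, the output column names and the sample rows are hypothetical.

```python
def wide_to_long(rows, id_column, question_columns):
    """Turn one row per respondent into one row per (respondent, question)."""
    long_rows = []
    for row in rows:
        for question in question_columns:
            answer = row.get(question)
            if answer is None:
                # Skip unanswered questions instead of emitting empty rows.
                continue
            long_rows.append(
                {"ID": row[id_column], "Question": question, "Value": answer}
            )
    return long_rows


# Hypothetical respondents, not taken from the dataset.
wide = [
    {"ID": 1, "PU1": 4, "PU2": 5},
    {"ID": 2, "PU1": 2, "PU2": None},
]
long = wide_to_long(wide, "ID", ["PU1", "PU2"])
```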

Data Exploration

Iteration 1

In iteration 1, I found the mean and median values of each survey question and plotted them for comparison. However, the responses are on a Likert scale, which is ordinal, so comparing mean and median values across questions is not meaningful. A mean score of 2.5, for example, tells the reader nothing about the actual percentage of each response or the variation within a question.
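
The limitation of the mean can be illustrated with two made-up questions that share the same mean but have very different response distributions. This is a sketch with hypothetical data and a hypothetical helper name, not output from the survey.

```python
from collections import Counter
from statistics import mean


def likert_percentages(responses, scale=(1, 2, 3, 4, 5)):
    """Percentage of responses at each Likert level, including empty levels."""
    counts = Counter(responses)
    total = len(responses)
    return {level: 100 * counts[level] / total for level in scale}


# Hypothetical answers: one polarised question and one neutral question.
polarised = [1, 1, 1, 5, 5, 5]
neutral = [3, 3, 3, 3, 3, 3]

same_mean = mean(polarised) == mean(neutral)  # both means equal 3
polarised_pct = likert_percentages(polarised)
neutral_pct = likert_percentages(neutral)
```

Both questions average 3, yet half of the respondents to the first question strongly disagree and half strongly agree, while every respondent to the second question is neutral.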

Iteration 2

As such, I had to revisit my data and find alternative ways to analyse it. I created divergent bar charts with reference to Data Revelations. The step-by-step instructions can be found here and here. With the help of this online reference, I managed to create a basic divergent bar chart for the survey questions. Below is a screenshot of these charts.

The actual dashboard can also be found here.

Visualisation 1: What are the main reasons for faculty members to use Wikipedia?

Methodology

To answer the question, I decided to build visualisations around two categories, which correspond to the two main reasons why faculty members would use Wikipedia.

First, divergent bar charts were used to display the questions related to using Wikipedia as a Teaching Resource or a Collaborative Platform. Divergent bar charts show the percentage of each Likert scale value (1-5) on the same bar, so users can see the distribution of scores for each question. The average score was also included in the bar charts. Although the average score cannot be relied on by itself, it comes in handy when paired with a divergent bar chart.

The divergent bar charts were coloured on a spectrum of two colours, red to blue. Values of 5, which represent "Strongly Agree", can be identified by the portions of the bars in dark blue. Values of 1, which represent "Strongly Disagree", can be identified by the portions of the bars in dark red.
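
The layout of one divergent bar can be sketched as follows. This assumes the common convention of centring the neutral value (3) on zero, so that disagreement extends to the left and agreement to the right; the exact calculation used in the Tableau dashboard is not stated here, and the function name and sample percentages are hypothetical.

```python
def diverging_segments(percentages):
    """Return (level, start, end) for each Likert level of one bar.

    percentages maps each level 1-5 to its share of responses in percent.
    The bar is shifted so that the middle of the neutral segment sits at 0.
    """
    # Everything left of zero: levels 1 and 2 plus half of the neutral level.
    left_edge = -(percentages[1] + percentages[2] + percentages[3] / 2)
    segments = []
    start = left_edge
    for level in (1, 2, 3, 4, 5):  # dark red through to dark blue
        end = start + percentages[level]
        segments.append((level, start, end))
        start = end
    return segments


# Hypothetical distribution for one question.
example = diverging_segments({1: 10, 2: 20, 3: 20, 4: 30, 5: 20})
```

In this example the bar runs from -40 to 60, so the reader can see at a glance that agreement outweighs disagreement.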

Other variables were added to the dashboard as well, such as Domain, Position, and UserWiki. These variables were represented in bar charts. The area of the bar represents the distinct count of IDs within each attribute of the variable. A filter function was added to each of these bar charts as well. As such, in the dashboard, the user will be able to see which category of each variable uses Wikipedia more as a teaching resource and/or a collaborative platform, as the divergent bar charts change according to the filtered variables.

A screenshot of the dashboard can be seen below. It was done using Tableau.

Second, Parallel Coordinates were also used to allow for better comparison. In the following visualisation that was done using Tibco Spotfire, I plotted the parallel coordinates plot of both categories against each other, and included an additional variable "Domain". As such, one will be able to view how the distribution of coordinates changes when a domain changes. In this visualisation, as seen in the screenshot, I highlighted the values that corresponded with faculty members who were registered users of Wikipedia.

A screenshot of the parallel coordinate plot can be seen below.

Questions

Qn Code	Question - Teaching Resource
QU1	Articles in Wikipedia are reliable
QU2	Articles in Wikipedia are updated
QU3	Articles in Wikipedia are comprehensive
QU4	In my area of expertise, Wikipedia has a lower quality than other educational resources
VIS3	I cite Wikipedia in my academic papers
USE1	I use Wikipedia to develop my teaching materials
BI1	In the future I will recommend the use of Wikipedia to my colleagues and students
BI2	In the future I will use Wikipedia in my teaching activity
EXP1	I consult Wikipedia for issues related to my field of expertise
EXP2	I consult Wikipedia for other academic related issues

Qn Code	Question - Collaborative Platform
VIS1	Wikipedia improves visibility of students' work
VIS2	It is easy to have a record of the contributions made in Wikipedia
EXP4	I contribute to Wikipedia (editions, revisions, articles improvement...)
EXP5	I use wikis to work with my students
USE2	I use Wikipedia as a platform to develop educational activities with students

Visualisation

Parallel Coordinates by Tibco Spotfire:

Divergent Bar Charts by Tableau:

The actual Dashboard can be found here.

Observation & Insights

Visualisation 2: What is the profile of faculty members who are users of Wikipedia?

Methodology

For this visualisation, I decided to use a treemap to display the findings. Treemaps allow us to visualise and analyse hierarchical data. In this case, as I wanted to find out the profile of faculty members who are more likely to be users of Wikipedia, I could achieve an organised multivariate hierarchical visualisation by using a pivot-by-size layout. I made two different treemaps, one using Tibco Spotfire and the other using Tableau. The two treemaps differ because of the visualisation options provided by each tool.

The hierarchy that I've set for the visualisation (in Tibco Spotfire) is as follows:

Domain: values of 1-6 refer to the different domain categories (refer to attribute table above)
PhD: 0 represents no PhD, 1 represents PhD
Gender: 0 represents Male, 1 represents Female
Years of Experience in bins

I also added a filter to ensure that only the registered users of Wikipedia were captured in the treemap visualisation. The size and the colour of each treemap area are based on the count of distinct ID values. I used a range of blues, from the lightest shade, which represents the smallest area, to the darkest shade, which represents the largest area.
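
The quantity behind each treemap rectangle, the count of distinct IDs per branch of the hierarchy after filtering to registered users, can be sketched as below. The bin width of 10 years, the YEARS_BIN column name, the function names and the sample respondents are all assumptions for illustration; the bins used in Spotfire are not stated.

```python
from collections import defaultdict


def bin_years(years, width=10):
    """Label a number of years with its bin, e.g. 14 -> '10-19' (assumed width)."""
    lower = (years // width) * width
    return f"{lower}-{lower + width - 1}"


def treemap_sizes(respondents, hierarchy):
    """Count distinct IDs of registered Wikipedia users per hierarchy path."""
    ids_per_leaf = defaultdict(set)
    for r in respondents:
        if r["USERWIKI"] != 1:
            # Filter: keep registered Wikipedia users only.
            continue
        path = tuple(r[level] for level in hierarchy)
        ids_per_leaf[path].add(r["ID"])
    return {path: len(ids) for path, ids in ids_per_leaf.items()}


# Hypothetical respondents, not taken from the dataset.
respondents = [
    {"ID": 1, "DOMAIN": 2, "PhD": 1, "GENDER": 0, "YEARSEXP": 14, "USERWIKI": 1},
    {"ID": 2, "DOMAIN": 2, "PhD": 1, "GENDER": 0, "YEARSEXP": 11, "USERWIKI": 1},
    {"ID": 3, "DOMAIN": 4, "PhD": 0, "GENDER": 1, "YEARSEXP": 3, "USERWIKI": 1},
    {"ID": 4, "DOMAIN": 4, "PhD": 0, "GENDER": 1, "YEARSEXP": 5, "USERWIKI": 0},
]
for r in respondents:
    r["YEARS_BIN"] = bin_years(r["YEARSEXP"])

sizes = treemap_sizes(respondents, ["DOMAIN", "PhD", "GENDER", "YEARS_BIN"])
```

The same counts drive both the size and the shade of blue of each rectangle, so a larger count gives a larger and darker area.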

The hierarchy that I've set for the visualisation (in Tableau) is as follows:

University
Position in UPF
Position in UOC
Domain
PhD

I added three filters to this representation. The first is Categories, which lets users see the typical profile of users for each categorical grouping. The second is Likert Value, which captures the respondents' responses. The last is a filter on the years of experience of each faculty member, so that users can view the difference between members with a lot of experience and those with the least experience.

The size and the colour of each treemap area are based on the count of distinct ID values. I used a range of blues, from the lightest shade, which represents the smallest area, to the darkest shade, which represents the largest area.

Visualisation

Tibco Spotfire Treemap:

Tableau Treemap:

The Dashboard for the visualisation above can be found here: User Profile Dashboard

Observation & Insights

Visualisation 3: Are there differences between faculty members solely in UOC and those in UPF with regard to their usage of Wikipedia?

Methodology

I decided to examine the differences in Wikipedia usage between faculty members who teach solely in UOC and UPF professors (some of whom may also teach in UOC) because of the differences in their university environments. UOC is an internet-centred open university, whereas UPF is a public university. Both are based in Spain. By finding these differences, I would be able to see whether the environment that these faculty members come from affects the survey results.

To compare the differences, I used three different visualisation methods to understand the survey results better.

First, using Tibco Spotfire, I created another treemap. In this case, as I wanted to find out the profile of faculty members of the different universities and their likelihood of using Wikipedia, I could achieve an organised multivariate hierarchical visualisation by using a pivot-by-size layout.

The hierarchy that I've set for the visualisation (in Tibco Spotfire) is as follows:

Userwiki: 0 represents not a registered Wikipedia user, 1 represents a registered Wikipedia user
UPF: 0 represents faculty members solely based in UOC, 1 represents faculty members in UPF
Domain: values of 1-6 refer to the different domain categories (refer to attribute table above)
PhD: 0 represents no PhD, 1 represents PhD
Years of Experience in bins

I also added a filter to ensure that only the registered users of Wikipedia were captured in the treemap visualisation. The size and the colour of each treemap area are based on the count of distinct ID values. I used a range of blues, from the lightest shade, which represents the smallest area, to the darkest shade, which represents the largest area.

The next visual representation I used was parallel coordinates in Tibco Spotfire. This was done for the categories Perception and Usage. I used parallel coordinates because they allow the user to look at the variation of answers per question and to spot trends. The charts are coloured as well: blue represents responses of faculty members who were only from UOC, and green represents responses of faculty members from UPF. From the charts below, we can see the differences in responses between the two universities. I also filtered out responses of users who were not registered users of Wikipedia, as the key question relates to registered users of Wikipedia.
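
A numeric counterpart to this visual comparison can be sketched as below. It is a swapped-in summary, not the method used in Spotfire: it computes, for one question, the share of registered users in each university who answered 4 or 5. It assumes UNIVERSITY is coded 1=UOC and 2=UPF as in the attribute table; the function name and the sample rows are hypothetical.

```python
def agreement_share_by_university(rows, question):
    """Percentage of registered users answering 4 or 5, per university."""
    totals = {}
    agree = {}
    for row in rows:
        if row["USERWIKI"] != 1 or row.get(question) is None:
            # Keep registered users with an answer to this question only.
            continue
        group = "UPF" if row["UNIVERSITY"] == 2 else "UOC"
        totals[group] = totals.get(group, 0) + 1
        if row[question] >= 4:
            agree[group] = agree.get(group, 0) + 1
    return {g: 100 * agree.get(g, 0) / totals[g] for g in totals}


# Hypothetical responses to USE1, not taken from the dataset.
rows = [
    {"UNIVERSITY": 1, "USERWIKI": 1, "USE1": 5},
    {"UNIVERSITY": 1, "USERWIKI": 1, "USE1": 2},
    {"UNIVERSITY": 2, "USERWIKI": 1, "USE1": 4},
    {"UNIVERSITY": 2, "USERWIKI": 0, "USE1": 1},
]
shares = agreement_share_by_university(rows, "USE1")
```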

The last visualisation method I used was divergent bar charts. Three categories of divergent bar charts were introduced in the dashboard: University Environment, Perception, and Usage. These charts allow us to see the variation in individuals' responses.

The divergent bar charts were coloured on a spectrum of two colours, red to blue. Values of 5, which represent "Strongly Agree", can be identified by the portions of the bars in dark blue. Values of 1, which represent "Strongly Disagree", can be identified by the portions of the bars in dark red. Additional filters, such as "YearsExp" and "Age", were added so that the user can filter the data and see how the responses in the divergent bar charts change.

Visualisation

Tibco Spotfire Treemap:

Tibco Spotfire Parallel Coordinates:

Tableau Dashboard:

The Dashboard for the visualisation above can be found here: Dashboard

ISSS608 2016-17 T1 Assign2 Lim Hui Ting Jaclyn
