ISSS608 2016-17 T1 Assign2 Lim Hui Ting Jaclyn
Introduction
Most faculty members tend to avoid using Wikipedia as a reference source or as teaching material. The given dataset comes from a survey of faculty members about Wikipedia and online platforms. The theme I have chosen is the usage of Wikipedia. I would like to find out why faculty members use Wikipedia, what kind of faculty members tend to use it, and whether UOC and UPF members differ in their usage and perception of Wikipedia. Answering these questions will help me understand why faculty members use, or would encourage the use of, Wikipedia.
Questions for Investigation:
- What are the main reasons for faculty members to use Wikipedia?
- What is the profile of faculty members who are users of Wikipedia?
- Are there differences between faculty members solely in UOC and faculty members in UPF with regard to their usage of Wikipedia? If so, why?
Data
The data was taken from Wiki4HE. It consists of responses to survey questions given to university faculty members to find out about their perceptions and practices in using Wikipedia.
The attribute table is also available from Wiki4HE.
AGE: numeric
GENDER: 0=Male; 1=Female
DOMAIN: 1=Arts & Humanities; 2=Sciences; 3=Health Sciences; 4=Engineering & Architecture; 5=Law & Politics
PhD: 0=No; 1=Yes
YEARSEXP (years of university teaching experience): numeric
UNIVERSITY: 1=UOC; 2=UPF
UOC_POSITION (academic position of UOC members): 1=Professor; 2=Associate; 3=Assistant; 4=Lecturer; 5=Instructor; 6=Adjunct
OTHER (main job in another university for part-time members): 1=Yes; 2=No
OTHER_POSITION (work as part-time in another university and UPF members): 1=Professor; 2=Associate; 3=Assistant; 4=Lecturer; 5=Instructor; 6=Adjunct
USERWIKI (Wikipedia registered user): 0=No; 1=Yes
The 43 survey items are rated on a Likert scale (1-5) ranging from strongly disagree / never (1) to strongly agree / always (5).
Data Cleaning
- Recoding data: Some columns contained “?” cells. These denote no response, but the “?” caused the column's data type to be Character instead of Numeric. These cells were recoded from “?” to NULL. I also recoded OTHER_POSITION from “2” to “0”, as these values represent faculty members with no other position.
- Inconsistent attributes with the given attribute information table: The attribute table lists only 5 values for “Domain”. However, the column itself contains 6 different values as well as null values. I assumed that “6” represents the category “Others”. Similarly, the column “OTHERSTATUS” contains 7 values instead of the 6 listed, so I assumed that “7” represents other faculty positions.
- Inconsistent data: OTHER_POSITION only contains the values ?, 1 and 2, unlike what is stated in the attribute table. There is also an additional column named “OTHERSTATUS”, which is not mentioned in the attribute table. Its values are ? and 0 to 7. A probable explanation is that OTHER_POSITION indicates whether a faculty member holds another position, and OTHERSTATUS gives the position held there.
- Inverse data values: Most of the questions were positively phrased, except for QU4, which was negatively phrased: “QU4: In my area of expertise, Wikipedia has a lower quality than other educational resources”. Hence, its values had to be reversed. In the original data, “5” represents “Strongly Agree”, meaning that Wikipedia has a lower quality than other educational resources. After recoding, these responses take the value “1”, so “1” now represents people who strongly agree that Wikipedia has a lower quality, and the new value of “5” represents people who strongly disagree, i.e. who do not consider Wikipedia to be of lower quality than other educational resources.
- New columns: I found the columns “University”, “Other Position” and “Other status” confusing and difficult to analyse, so I created additional columns in JMP. “UOC” contains 1 if the faculty member is from UOC and 0 otherwise, and “UPF” contains 1 if the faculty member is from UPF and 0 otherwise. Also, “UOC_Position” contains a value from 1 to 6 if the faculty member is from UOC, and “UPF_Position” contains a value from 1 to 7 if the faculty member is from UPF.
- Grouping survey questions into different categories
- Question Categories
- Group Categories
- Transpose data in Excel: To create a “response” column holding the scores and a “questions” column, I had to add a column of arbitrary ID numbers and use a Tableau add-in function in Excel to create a pivot table.
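The recoding steps in the list above were carried out in JMP. As an illustration only, the same logic can be sketched in Python. The sample rows, the function names (`to_number`, `reverse_likert`, `clean`) and the choice of columns are assumptions made for this example, not part of the original workflow.

```python
# Hypothetical sample rows; the real file has many more columns and rows.
rows = [
    {"UNIVERSITY": "1", "QU4": "5", "PU1": "4"},
    {"UNIVERSITY": "2", "QU4": "?", "PU1": "2"},
    {"UNIVERSITY": "1", "QU4": "2", "PU1": "?"},
]


def to_number(cell):
    """Turn a raw cell into an int, treating "?" (no response) as missing."""
    return None if cell == "?" else int(cell)


def reverse_likert(score, low=1, high=5):
    """Flip a Likert score so that 5 becomes 1, 4 becomes 2, and so on."""
    return None if score is None else low + high - score


def clean(row):
    """Apply the recoding steps to one respondent's row."""
    cleaned = {name: to_number(cell) for name, cell in row.items()}
    # QU4 is the negatively phrased item, so it is reverse-coded.
    cleaned["QU4"] = reverse_likert(cleaned["QU4"])
    # Indicator columns derived from UNIVERSITY (1=UOC, 2=UPF).
    cleaned["UOC"] = 1 if cleaned["UNIVERSITY"] == 1 else 0
    cleaned["UPF"] = 1 if cleaned["UNIVERSITY"] == 2 else 0
    return cleaned


cleaned_rows = [clean(row) for row in rows]
```

Missing responses stay missing after the reversal, so they are not mistaken for a score when averages are computed later.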
I categorised the questions according to their codes.
Perceived Usefulness
PU1: The use of Wikipedia makes it easier for students to develop new skills
PU2: The use of Wikipedia improves students' learning
PU3: Wikipedia is useful for teaching
In this case, PU1, PU2 and PU3 will be placed in the group named “PU”. I categorised all of the other questions the same way.
Categories | Questions | Code |
Teaching Resource | QU1, QU2, QU3, QU4, VIS3, USE1, BI1, BI2, EXP1, EXP2 | TR |
Collaborative Platform | VIS1, VIS2, EXP4, EXP5, USE2 | CP |
Perception of Online Platforms | PF1, PF2, PF3, SA1, SA2, SA3 | OPP |
Perception of Wikipedia | ENJ1, ENJ2, PEU1, PEU2, PEU3, PU1, PU2, PU3 | WP |
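The transpose step and the grouping in the table above were done with Excel and the Tableau add-in. A rough Python equivalent is sketched below for illustration. The mapping is copied from the table, while the function name `to_long`, the output column names and the sample rows are assumptions.

```python
# Question-to-category mapping copied from the table above.
CATEGORIES = {
    "TR": ["QU1", "QU2", "QU3", "QU4", "VIS3", "USE1", "BI1", "BI2", "EXP1", "EXP2"],
    "CP": ["VIS1", "VIS2", "EXP4", "EXP5", "USE2"],
    "OPP": ["PF1", "PF2", "PF3", "SA1", "SA2", "SA3"],
    "WP": ["ENJ1", "ENJ2", "PEU1", "PEU2", "PEU3", "PU1", "PU2", "PU3"],
}
QUESTION_TO_CODE = {q: code for code, qs in CATEGORIES.items() for q in qs}


def to_long(wide_rows):
    """Reshape one-row-per-respondent data into one row per (ID, question)."""
    long_rows = []
    for respondent_id, row in enumerate(wide_rows, start=1):
        for question, response in row.items():
            if question not in QUESTION_TO_CODE:
                continue  # skip columns that are not in the category table
            long_rows.append({
                "ID": respondent_id,  # arbitrary ID number
                "Question": question,
                "Response": response,
                "Category": QUESTION_TO_CODE[question],
            })
    return long_rows


# Hypothetical sample rows in the original wide layout.
wide = [
    {"AGE": 40, "QU1": 3, "VIS1": 2, "PU1": 4},
    {"AGE": 35, "QU1": 5, "VIS1": None, "PU1": 1},
]
long_rows = to_long(wide)
```

In this long layout, each row carries its category code, so scores can be filtered or aggregated by category without touching the original columns.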
Data Exploration
Iteration 1
Iteration 2
Visualisation 1
Question: What are the main reasons for faculty members to use Wikipedia?
Methodology
To answer the question, I decided to build visualisations around the two main categories of reasons why faculty members would use Wikipedia: as a Teaching Resource and as a Collaborative Platform.
First, divergent bar charts were used to display the questions related to using Wikipedia as a Teaching Resource or a Collaborative Platform. Divergent bar charts display the percentages of each Likert scale value (1-5) on the same bar, so users can see the distribution of scores for each question. The average score was also included in the bar charts. Although the average score cannot be relied on by itself, it comes in handy when paired with a divergent bar chart.
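The numbers behind one bar of such a chart can be sketched as follows. The dashboard itself was built in Tableau; this Python snippet only illustrates the calculation. The function names and sample responses are hypothetical, and centring the neutral value (3) on zero is one common layout, assumed here rather than taken from the dashboard.

```python
from collections import Counter


def likert_summary(responses, levels=(1, 2, 3, 4, 5)):
    """Return the percentage of each Likert level and the average score."""
    answered = [r for r in responses if r is not None]  # drop missing answers
    if not answered:
        raise ValueError("no responses to summarise")
    counts = Counter(answered)
    total = len(answered)
    percentages = {level: 100 * counts[level] / total for level in levels}
    mean = sum(answered) / total
    return percentages, mean


def diverging_offsets(percentages):
    """Left and right extent of the bar, with the neutral level split at zero."""
    left = percentages[1] + percentages[2] + percentages[3] / 2
    right = percentages[4] + percentages[5] + percentages[3] / 2
    return -left, right


# Hypothetical responses to a single question.
sample = [1, 2, 2, 3, 3, 4, 4, 4, 5, 5, None]
percentages, mean = likert_summary(sample)
left, right = diverging_offsets(percentages)
```

Two questions can share the same average while having very different distributions, which is why the average is shown alongside the bar rather than on its own.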
Other variables were added to the dashboard as well, such as Domain, Position, and UserWiki. Domain will allow us to see the percentage of people who used
A screenshot of the dashboard can be seen below. It was done using Tableau.
Second, parallel coordinates were used to allow for better comparison. In the following visualisation, done using Tibco Spotfire, I plotted both categories against each other in a parallel coordinates plot and included an additional variable, "Domain".
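One possible way to prepare data for such a plot is sketched below in Python. It assumes that each line in the plot represents one respondent and that each category axis shows that respondent's average score across the category's questions. The Spotfire configuration is not described, so this is only an illustration, and the function names and sample row are hypothetical.

```python
# Question codes of the two categories compared in the plot.
TR = ["QU1", "QU2", "QU3", "QU4", "VIS3", "USE1", "BI1", "BI2", "EXP1", "EXP2"]
CP = ["VIS1", "VIS2", "EXP4", "EXP5", "USE2"]


def mean_score(row, questions):
    """Average a respondent's answers over the given questions, skipping gaps."""
    scores = [row[q] for q in questions if row.get(q) is not None]
    return sum(scores) / len(scores) if scores else None


def parallel_axes(row):
    """Build one record per respondent with the three axes of the plot."""
    return {
        "Domain": row.get("DOMAIN"),
        "TR": mean_score(row, TR),
        "CP": mean_score(row, CP),
    }


# Hypothetical respondent with only a few answered questions.
respondent = {"DOMAIN": 2, "QU1": 4, "QU2": 2, "VIS1": 3, "VIS2": None}
axes = parallel_axes(respondent)
```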
Visualisation
Qn Code | Question - Teaching Resource |
QU1 | Articles in Wikipedia are reliable |
QU2 | Articles in Wikipedia are updated |
QU3 | Articles in Wikipedia are comprehensive |
QU4 | In my area of expertise, Wikipedia has a lower quality than other educational resources |
VIS3 | I cite Wikipedia in my academic papers |
USE1 | I use Wikipedia to develop my teaching materials |
BI1 | In the future I will recommend the use of Wikipedia to my colleagues and students |
BI2 | In the future I will use Wikipedia in my teaching activity |
EXP1 | I consult Wikipedia for issues related to my field of expertise |
EXP2 | I consult Wikipedia for other academic related issues |
Qn Code | Question - Collaborative Platform |
VIS1 | Wikipedia improves visibility of students' work |
VIS2 | It is easy to have a record of the contributions made in Wikipedia |
EXP4 | I contribute to Wikipedia (editions, revisions, articles improvement...) |
EXP5 | I use wikis to work with my students |
USE2 | I use Wikipedia as a platform to develop educational activities with students |
Parallel Coordinates: insert picture
Divergent Bar Charts: Dashboard
Observation & Insights
Visualisation 2
Question: What is the profile of faculty members who are users of Wikipedia?
Methodology
For this visualisation, I decided to use a treemap to display the findings.
Visualisation
Observation & Insights
Visualisation 3
Question: Are there differences between faculty members solely in UOC and faculty members in UPF with regard to their usage of Wikipedia? If so, why?