ISSS608 2016-17 T1 Assign2 Lim Hui Ting Jaclyn
Introduction
Most faculty members tend to avoid using Wikipedia as a reference source or as teaching material. The given dataset comes from a survey of faculty members about Wikipedia and online platforms. The theme I have chosen is the usage of Wikipedia. I would like to find out why faculty members use Wikipedia, what kind of faculty members tend to use it, and whether UOC and UPF members differ in their usage and perception of Wikipedia. Answering these questions will help me understand why faculty members use, or would encourage the use of, Wikipedia.
Questions for Investigation:
- What are the main reasons for faculty members to use Wikipedia?
- What is the profile of faculty members who are users of Wikipedia?
- Are there differences between faculty members solely in UOC and faculty members in UPF with regard to their usage of Wikipedia? If so, why?
Data
The data was taken from Wiki4HE. It consists of responses to survey questions given to university faculty members to find out about their perceptions and practices in using Wikipedia.
The attribute table is also available from Wiki4HE.
AGE: numeric
GENDER: 0=Male; 1=Female
DOMAIN: 1=Arts & Humanities; 2=Sciences; 3=Health Sciences; 4=Engineering & Architecture; 5=Law & Politics
PhD: 0=No; 1=Yes
YEARSEXP (years of university teaching experience): numeric
UNIVERSITY: 1=UOC; 2=UPF
UOC_POSITION (academic position of UOC members): 1=Professor; 2=Associate; 3=Assistant; 4=Lecturer; 5=Instructor; 6=Adjunct
OTHER (main job in another university for part-time members): 1=Yes; 2=No
OTHER_POSITION (work as part-time in another university and UPF members): 1=Professor; 2=Associate; 3=Assistant; 4=Lecturer; 5=Instructor; 6=Adjunct
USERWIKI (Wikipedia registered user): 0=No; 1=Yes
The 43 survey items are ranked on a Likert scale (1-5) ranging from strongly disagree / never (1) to strongly agree / always (5).
Data Cleaning
- Recoding data: Some columns contained “?” cells. Although these indicated no response, the “?” caused the column's data type to be Character instead of Numeric. These cells were recoded from “?” to NULL. I also recoded OTHER_POSITION from “2” to “0”, as this value represents faculty members with no other position.
- Inconsistent attributes with the given attribute information table: The attribute table lists only 5 values for “Domain”. However, the column itself contains 6 distinct values as well as null values. I assumed that “6” represents the category “Others”. Similarly, the column “OTHERSTATUS” contains 7 values instead of the 6 listed, so I assumed that “7” represents other faculty positions.
- Inconsistent data: Contrary to the attribute table, OTHER_POSITION only contains the values ?, 1 and 2. There is also an additional column, “OTHERSTATUS”, which is not mentioned in the attribute table and contains ? and values from 0 to 7. A probable explanation is that OTHER_POSITION indicates whether a faculty member holds another position, and OTHERSTATUS gives the position held there.
- Inverse data values: Most of the questions were positively phrased, except for QU4, which was negatively phrased: “QU4: In my area of expertise, Wikipedia has a lower quality than other educational resources”. Its values therefore had to be reversed. Originally, “5” represented “Strongly Agree”, that is, agreement that Wikipedia has a lower quality than other educational resources. After recoding, these respondents have the value “1”, and the new value of “5” represents respondents who consider that Wikipedia is not of lower quality than other educational resources.
- New columns: As I found the columns “University”, “Other Position” and “Other status” quite confusing and difficult to analyse, I created additional columns in JMP: “UOC”, which is 1 if the faculty member is from UOC and 0 otherwise, and “UPF”, which is 1 if the faculty member is from UPF and 0 otherwise. In addition, “UOC_Position” contains a value from 1 to 6 if the faculty member is from UOC, and “UPF_Position” contains a value from 1 to 7 if the faculty member is from UPF.
- Grouping survey questions into different categories
- Question Categories
- Group Categories
- Transpose data in Excel: To create a “response” column holding the scores and a “questions” column, I added a column of arbitrary ID numbers and used the Tableau add-in for Excel to create a pivot table.
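The recoding steps described in the list above were carried out in JMP and Excel. As a rough illustration only, the same logic could be sketched in plain Python as follows. The sample rows, the helper names (`to_number`, `reverse_likert`, `clean_row`) and the choice of `None` for missing cells are assumptions made for this sketch, not part of the original workflow.

```python
# Hypothetical raw rows; the values are invented for illustration and are
# not taken from the real survey file.
raw_rows = [
    {"UNIVERSITY": "1", "OTHER_POSITION": "2", "QU4": "5", "PU1": "4"},
    {"UNIVERSITY": "2", "OTHER_POSITION": "1", "QU4": "?", "PU1": "2"},
    {"UNIVERSITY": "1", "OTHER_POSITION": "?", "QU4": "2", "PU1": "?"},
]

LIKERT_MAX = 5  # survey items use a 1-5 Likert scale


def to_number(cell):
    """Convert a raw cell to int, treating "?" (no response) as missing."""
    return None if cell == "?" else int(cell)


def reverse_likert(score):
    """Flip a 1-5 score (5 -> 1, 4 -> 2, ...); missing values stay missing."""
    return None if score is None else (LIKERT_MAX + 1) - score


def clean_row(row):
    """Apply the recoding steps to one respondent (assumed column names)."""
    cleaned = {name: to_number(cell) for name, cell in row.items()}
    # OTHER_POSITION: 2 means "no other position", recoded to 0.
    if cleaned["OTHER_POSITION"] == 2:
        cleaned["OTHER_POSITION"] = 0
    # QU4 is negatively phrased, so its scale is reversed.
    cleaned["QU4"] = reverse_likert(cleaned["QU4"])
    # Indicator columns derived from UNIVERSITY (1=UOC, 2=UPF).
    cleaned["UOC"] = 1 if cleaned["UNIVERSITY"] == 1 else 0
    cleaned["UPF"] = 1 if cleaned["UNIVERSITY"] == 2 else 0
    return cleaned


cleaned_rows = [clean_row(row) for row in raw_rows]
```

Reversing with `6 - score` keeps the neutral value 3 unchanged, so only the direction of agreement is flipped.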
I categorised the questions according to their codes.
Perceived Usefulness
PU1: The use of Wikipedia makes it easier for students to develop new skills
PU2: The use of Wikipedia improves students' learning
PU3: Wikipedia is useful for teaching
In this case, PU1, PU2 and PU3 will be placed in the group named “PU”. I categorised all of the other questions the same way.
Categories | Questions | Code |
Teaching Resource | QU1, QU2, QU3, QU4, VIS3, USE1, BI1, BI2, EXP1, EXP2 | TR |
Collaborative Platform | VIS1, VIS2, EXP4, EXP5, USE2 | CP |
Perception of Online Platforms | PF1, PF2, PF3, SA1, SA2, SA3 | OPP |
Perception of Wikipedia | ENJ1, ENJ2, PEU1, PEU2, PEU3, PU1, PU2, PU3 | WP |
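The grouping in the table above, combined with the wide-to-long reshaping mentioned under Data Cleaning, could be sketched in plain Python as below. This is only an illustration: the original work used Excel with the Tableau add-in, and the function names, the sample rows and the decision to skip missing responses are assumptions made for this sketch.

```python
# Group codes and their questions, copied from the table above.
GROUPS = {
    "TR": ["QU1", "QU2", "QU3", "QU4", "VIS3", "USE1", "BI1", "BI2", "EXP1", "EXP2"],
    "CP": ["VIS1", "VIS2", "EXP4", "EXP5", "USE2"],
    "OPP": ["PF1", "PF2", "PF3", "SA1", "SA2", "SA3"],
    "WP": ["ENJ1", "ENJ2", "PEU1", "PEU2", "PEU3", "PU1", "PU2", "PU3"],
}

# Reverse lookup: question code -> group code.
QUESTION_TO_GROUP = {q: group for group, qs in GROUPS.items() for q in qs}


def question_category(question):
    """Strip the trailing digits from a question code, e.g. "PU1" -> "PU"."""
    return question.rstrip("0123456789")


def to_long(rows):
    """Reshape wide rows (one column per question) into one record per answer."""
    long_rows = []
    # The respondent ID is arbitrary: simply the row position, starting at 1.
    for respondent_id, row in enumerate(rows, start=1):
        for question, response in row.items():
            # Assumption: ungrouped columns and missing answers are skipped.
            if question not in QUESTION_TO_GROUP or response is None:
                continue
            long_rows.append({
                "ID": respondent_id,
                "Question": question,
                "Category": question_category(question),
                "Group": QUESTION_TO_GROUP[question],
                "Response": response,
            })
    return long_rows


# Hypothetical cleaned rows; the scores are invented for illustration.
wide_rows = [
    {"PU1": 4, "PU2": 3, "QU4": 1, "VIS1": None},
    {"PU1": 2, "PU2": None, "QU4": 5, "VIS1": 3},
]
long_rows = to_long(wide_rows)
```

Each record in `long_rows` carries both the question category (such as “PU”) and the group code (such as “WP”), so responses can be summarised at either level.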
Data Exploration
Iteration 1
Iteration 2
Visualisation 1
Question: What are the main reasons for faculty members to use Wikipedia?
Methodology
Observation & Insights
Visualisation 2
Question: What is the profile of faculty members who are users of Wikipedia?
Methodology
Observation & Insights
Visualisation 3
Question: Are there differences between faculty members solely in UOC and faculty members in UPF with regard to their usage of Wikipedia? If so, why?