ISSS608 2016-17 T1 Assign2 Mukund Krishna Ravi

From Visual Analytics and Applications
Revision as of 19:10, 25 September 2016 by Mukundkr.2015 (talk | contribs)

Overview

In this digital economy age, massive and complex data have been captured and stored in organizational databases and data warehouses. By and large, these data contain a large number of variables describing a particular product, customer or activity. Due to limitations in perception and screen space, the graphical techniques available in traditional business intelligence systems tend to be confined to univariate and bivariate displays such as bar charts, pie charts and scatter plots. As a result, many important relationships in these data remain undiscovered. For instance, the wiki4HE dataset contains many relationships between the survey responses and the different academic segments. These relationships are hidden, and uncovering them requires more complex visualization techniques.

Theme of Interest and Motivation

The wiki4HE data set comes from ongoing research on university faculty perceptions and practices of using Wikipedia as a teaching resource. Based on a Technology Acceptance Model, the research analyzes the relationships within the internal and external constructs of the model. Both the perception of colleagues' opinion about Wikipedia and the perceived quality of the information in Wikipedia play a central role in the obtained model. For this problem I have chosen to focus on only a few areas of interest and on discovering intricate relationships in the data set that would not be visible with basic visualization techniques. The key aspects of the problem are as follows:

How do various user segments and domains rate Wikipedia and the perceived quality of its information?

To understand this behavior, we analyse the following criteria:
1. How different domains rate perceived usefulness
2. How different users rate perceived usefulness
3. How different users rate the experience of Wikipedia
4. How different users rate the quality of Wikipedia
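Each of the comparisons above amounts to grouping respondents by a segment and averaging a rating. The following is only a minimal sketch in Python of that idea, not the method used in this assignment. It assumes each response is a dictionary with a hypothetical `DOMAIN` field and a 1 to 5 rating field named `PU1`; the function name, the field names and the sample values are all illustrative assumptions.

```python
from collections import defaultdict
from statistics import mean


def mean_rating_by_group(rows, group_key, rating_key):
    """Return {group: mean rating}, skipping rows with a missing rating."""
    ratings = defaultdict(list)
    for row in rows:
        value = row.get(rating_key)
        if value is None:  # unanswered survey item
            continue
        ratings[row[group_key]].append(value)
    return {group: mean(values) for group, values in ratings.items()}


# Illustrative rows only (assumed field names); not real wiki4HE responses.
sample = [
    {"DOMAIN": "Sciences", "PU1": 4},
    {"DOMAIN": "Sciences", "PU1": 2},
    {"DOMAIN": "Law", "PU1": 5},
    {"DOMAIN": "Law", "PU1": None},
]
print(mean_rating_by_group(sample, "DOMAIN", "PU1"))
```

The same helper could be reused for the other criteria by passing a different grouping field or rating field.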

Do registered users of a particular age have any preference towards Wikipedia (rating of 4 or 5)?

To understand this behavior, we examine how different age segments perceive the usefulness and quality of Wikipedia by analyzing the associated parameters, such as the usefulness and quality ratings.

Relationship between registered users and Age

To understand the relationship between registered users and age, we need to uncover correlations between the two parameters.
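One simple way to quantify such a correlation is the Pearson coefficient; when one of the two variables is binary (registered or not), this is the point-biserial correlation. The sketch below is an assumption about how this could be computed, not the analysis performed here. The 0/1 coding of registration and the sample values are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences of at least 2 values")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Illustrative values only; not real wiki4HE responses.
ages = [28, 35, 41, 52, 60]
registered = [1, 1, 0, 0, 0]  # 1 = registered Wikipedia user (assumed coding)
r = pearson(ages, registered)
print(f"r = {r:.2f}")
```

A coefficient near zero would suggest little linear association between age and registration, while a value closer to -1 or 1 would suggest a stronger one.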

Data Set

For this assignment, I have chosen the wiki4HE dataset, which comes from ongoing research on university faculty perceptions and practices of using Wikipedia as a teaching resource. Based on a Technology Acceptance Model, the relationships within the internal and external constructs of the model are analysed. Both the perception of university faculty teaching staff's opinion about Wikipedia and the perceived quality of the information in Wikipedia play a central role in the obtained model. The original data set can be found on the UC Irvine Machine Learning Repository's website [1]. It is formatted as a CSV file.
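Since the data set is distributed as a CSV file, a first step could be reading it into a list of records. The sketch below uses Python's standard `csv` module. The `;` separator, the `?` marker for missing answers and the column names in the sample are assumptions that should be checked against the actual file; `load_survey` is a hypothetical helper name.

```python
import csv
import io


def load_survey(handle, delimiter=";", missing="?"):
    """Read survey rows, mapping the missing marker to None and digits to int."""
    rows = []
    for record in csv.DictReader(handle, delimiter=delimiter):
        row = {}
        for key, value in record.items():
            if value is None or value == missing:
                row[key] = None  # unanswered or absent field
            elif isinstance(value, str) and value.isdigit():
                row[key] = int(value)
            else:
                row[key] = value
        rows.append(row)
    return rows


# Stand-in for the downloaded file; the values are illustrative, not real data.
sample_csv = io.StringIO("AGE;DOMAIN;PU1\n40;1;4\n35;?;5\n")
rows = load_survey(sample_csv)
print(rows)
```

With the real file, the same function could be given a handle opened with `open(path, newline="")`, where `path` points to wherever the CSV was saved.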

Data Preparation

Analysis

Tools Utilized

Conclusion

References