Difference between revisions of "IS428 2016-17 Term1 Assign2 Wu Wei"

From Visual Analytics for Business Intelligence

Revision as of 02:44, 26 September 2016

Theme of Interest

Have you ever heard of the term "Open Access"[1]? If you have not, I guess you have never tried to get access to scholarly articles or professional reports online. It is a painful experience to finally find a research paper related to your group project on Google Scholar, only to discover that it requires a purchase to access. Personally, I have had this experience occasionally while researching for school projects. This is one reason why I am a strong supporter of the Open Access Movement[2], the worldwide effort to provide free access to scientific and scholarly research. In 2013, the US government "issued United States' Federal Agencies with more than $100 million in annual R&D expenditures to develop plans within six months to make the published results of federally funded research freely available to the public within one year of publication" [3]. Other countries such as China, Russia, Japan and India are also trying to achieve open access for individuals. Thus, my questions arise: How is the Open Access Movement progressing? Are we getting fruitful results? Or are we currently stuck at a bottleneck?

Find Appropriate Data

Breakdown of questions

The 101-innovations-research-tools-survey data set includes the survey responses of 20663 researchers from all over the world. The data set is very large and contains information that is irrelevant to my questions. Thus, to narrow down the scope, I need to break my questions down into specific parts:

  1. What kinds of tools/sites do researchers use when they want to search for relevant content?
  2. Are they able to get free access to these tools/sites?
  3. Are people from all countries using the same tools/sites?
  4. Could there be a relationship between research role and the tools/sites used?
  5. Could there be a relationship between research category and the tools/sites used?
  6. Do researchers support the idea of open access?
  7. Could there be a relationship between a researcher's research category and support for open access?

After I have these breakdowns, I am able to select specific data accordingly.

Data Reconstruction using JMP Pro

After opening survey_cleaned_variable_list.csv, I can see all the survey questions. To address the breakdown questions above, I am only interested in survey questions 1A, 1B, 1C, 2A, 2B and 8B. Questions 1A, 1B and 1C capture respondents' personal information, whereas 2A, 2B and 8B capture their responses to specific questions. Using the "Subset" method in JMP Pro to extract part of the table,

001.jpg

I will construct the following 3 new data sets:

  • Question 2A: What tools/sites do you use to search literature/data?

002.jpg
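The subsetting step above was done in JMP Pro. As a rough equivalent outside JMP, here is a minimal Python sketch of the same idea: keep the respondent-information columns and add one question column per sub-table. The column names (`Q1A_role`, `Q2A_tool`, and so on) and the two sample rows are hypothetical stand-ins, not the real variable names or responses from the survey file.

```python
import csv
import io

# Hypothetical column names; the real survey export uses its own variable names.
ID_COLUMNS = ["Q1A_role", "Q1B_country", "Q1C_discipline"]


def subset_columns(rows, keep):
    """Return new rows that contain only the columns listed in `keep`."""
    return [{name: row[name] for name in keep} for row in rows]


# Made-up sample standing in for the survey export (not real responses).
sample_csv = io.StringIO(
    "Q1A_role,Q1B_country,Q1C_discipline,Q2A_tool,Q2B_tool,Q8B_support\n"
    "Librarian,Canada,Law,Google Scholar,Open Access Button,Yes\n"
    "PhD student,Japan,Medicine,Scopus,Institutional access,No\n"
)
rows = list(csv.DictReader(sample_csv))

# One sub-table per question, each keeping the respondent information.
q2a = subset_columns(rows, ID_COLUMNS + ["Q2A_tool"])
q2b = subset_columns(rows, ID_COLUMNS + ["Q2B_tool"])
q8b = subset_columns(rows, ID_COLUMNS + ["Q8B_support"])
```

Keeping the respondent-information columns in every sub-table is what later allows each question to be broken down by role, country or research category.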

Similar to the first table, I construct two more sub-tables for:

  • Question 2B: What tools/sites do you use to get access to literature/data?

Here, to make the data clearer, I create a new column called "Research category", which groups "Physical Sciences", "Engineering & Technology", "Life Sciences", "Medicine" and "Social Sciences & Economics" as "science", and "Law" and "Arts & Humanities" as "non-science".

003.jpg
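The recoding rule for "Research category" can be written down as a small lookup. The sketch below is in Python rather than JMP Pro and only illustrates the grouping described above; the function name and the choice to return `None` for labels outside the two groups are my own assumptions.

```python
# Discipline groups exactly as described in the text.
SCIENCE = {
    "Physical Sciences",
    "Engineering & Technology",
    "Life Sciences",
    "Medicine",
    "Social Sciences & Economics",
}
NON_SCIENCE = {"Law", "Arts & Humanities"}


def research_category(discipline):
    """Map a discipline label to 'science' or 'non-science'."""
    if discipline in SCIENCE:
        return "science"
    if discipline in NON_SCIENCE:
        return "non-science"
    # Assumption: labels outside the two groups stay unclassified, not guessed.
    return None
```

For example, `research_category("Medicine")` gives "science" and `research_category("Law")` gives "non-science".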

By applying the same logic, I create another new column called "Open Access", which is "Yes" when the respondent uses the open access button and "No" in all other cases.

004.jpg
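The "Open Access" column follows the same pattern as the previous recoding: one value maps to "Yes" and everything else to "No". A minimal Python sketch, assuming the access method is stored as a text label; the exact label string "Open Access Button" is a hypothetical placeholder for whatever the survey file actually uses.

```python
def open_access_flag(access_method, oa_label="Open Access Button"):
    """Return 'Yes' only when the access method is the open access button."""
    # `oa_label` is an assumed label; adjust it to match the real data.
    return "Yes" if access_method.strip() == oa_label else "No"
```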

  • Question 8B: Do you support the goal of Open Access?

Analysis & Visualization Construction Process

The following sections show how different visual analytics techniques are applied and how these visual tools are used to analyze the data.

Question 2A

To gain an overview of the kinds of tools/sites researchers use to search, we first plot a treemap:

Treemap001.jpg

From the treemap, we can see that Google Scholar is the top choice when researchers conduct a search. However, the articles found through Google Scholar are not always free: sometimes readers can view only a few pages rather than the full article. Other common choices such as Mendsearch, Web of Science and Scopus require users to log in with an institutional account. These are not free either, as institutions generally pay a lot of money for institutional accounts. At the same time, not everyone in an institution gets a unique account. Most of the time, people need to share a few accounts, and the number of logins is limited on the server side. The current situation still shows a lack of unified open access for most researchers.

Now, we can add "Country" to the treemap to compare differences across countries.

Treemap002.jpg

Although Google Scholar is the main search tool in all countries, the percentages differ when we compare countries. In countries like the United States, the United Kingdom, Canada and Brazil, almost 90% of respondents use Google Scholar for searching. Countries like Germany, Italy, Japan and Spain rely less on Google Scholar than the previous group. Additionally, China appears to have the lowest percentage of Google Scholar use. This is not surprising, as access to Google services is restricted in mainland China.
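The per-country percentages compared above come down to a simple ratio: respondents in a country who listed a tool, divided by all respondents in that country. The Python sketch below shows that calculation under assumed field names (`country`, `tools`); the four sample respondents and the country labels "A" and "B" are made up for illustration and are not survey results.

```python
from collections import defaultdict


def share_using_tool(rows, tool="Google Scholar"):
    """Return {country: percentage of respondents who listed `tool`}."""
    totals = defaultdict(int)
    users = defaultdict(int)
    for row in rows:
        totals[row["country"]] += 1
        if tool in row["tools"]:
            users[row["country"]] += 1
    return {country: 100.0 * users[country] / totals[country] for country in totals}


# Made-up respondents for illustration only (not survey data).
rows = [
    {"country": "A", "tools": {"Google Scholar", "Scopus"}},
    {"country": "A", "tools": {"Google Scholar"}},
    {"country": "B", "tools": {"Web of Science"}},
    {"country": "B", "tools": {"Google Scholar"}},
]
shares = share_using_tool(rows)
```

Because each respondent can list several tools, the percentages for different tools within one country do not need to add up to 100%.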

Question 2B

To find out how researchers in different roles and fields make use of the open access button, I plot a mosaic plot showing the relationship between research role, research category and access method:

Mosiac.jpg

The red area denotes researchers who use open access when doing research. Interestingly, librarians have a much higher percentage than other researchers: nearly half of the librarians use open access. This could reflect that librarians generally receive more training in research methods, so it is not surprising that they take advantage of open access. Meanwhile, bachelor and master students have the lowest percentage among all roles. A possible explanation is that university students do less research work than PhD students, professors and research experts, and are therefore less aware of open access as a research option.
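To make the reading of the mosaic plot concrete, the sketch below computes tile sizes for a simplified two-variable mosaic (role by open access), whereas the plot above also splits by research category. Each role gets a column whose width is its share of all respondents, and the height of each tile is the share of that open-access value within the role. This is a Python illustration, not how Mondrian or JMP Pro draws the plot, and the roles and counts are made up.

```python
from collections import Counter


def mosaic_tiles(pairs):
    """Compute (width, height) for each tile of a two-variable mosaic plot.

    `pairs` is a list of (role, open_access) tuples. Width is the role's
    share of all respondents; height is the share of the open-access value
    within that role.
    """
    role_counts = Counter(role for role, _ in pairs)
    cell_counts = Counter(pairs)
    total = len(pairs)
    tiles = {}
    for (role, flag), count in cell_counts.items():
        width = role_counts[role] / total
        height = count / role_counts[role]
        tiles[(role, flag)] = (width, height)
    return tiles


# Made-up respondents for illustration only (not survey data).
pairs = [
    ("Role A", "Yes"),
    ("Role A", "No"),
    ("Role B", "No"),
    ("Role B", "No"),
]
tiles = mosaic_tiles(pairs)
```

Comparing the heights of the "Yes" tiles across roles is what the red areas in the plot allow at a glance, independent of how many respondents each role has.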



Tools

  1. Tableau Desktop 10.0
  2. JMP Pro 12
  3. Mondrian 15b

Reference

  1. [1], wiki definition of open access
  2. [2], Timeline of the Open Access Movement
  3. [3], Information retrieved from the US official file
