Difference between revisions of "Talk:Lesson04"

From Visual Analytics for Business Intelligence

Revision as of 12:32, 2 October 2016

Lucky us!

We have already had four classes, but Visual Analytics is such a huge field that the number of classes surely won't be enough to cover everything. All this data is an infinite resource we can play with. We saw it again in Lesson 04: visualization becomes impressive and interesting just by playing with the different dimensions and categories. Of course, data now comes from everywhere, and not everyone looks on that favourably. For my part, I see it as an opportunity: an opportunity for the world to improve itself by analyzing data, but also an opportunity for me and for us. Data is creating new jobs, such as data scientist, and these jobs are not boring. There is not just one type of data; there are too many to count, so there will always be something to do and no reason to say that we dislike our job because it is boring.

I don't know about Singapore, but in my country, Belgium, data analysis of this kind is quite new. Of course there are experts, but the volume of data is so huge that there are not enough of them. Companies are looking for people like us, who are able to understand the data, interpret it and give it a story, always with a business perspective and knowledge of what we are working with. Still, such people are difficult to find in Belgium because we do not specialize in this at school. We talk about Big Data and we have Data Mining classes, but that's it. When I looked through the catalogue here, the offering seemed more developed. So I was wondering how it works here. Do you think employers are searching more and more for data scientists, or is that already the case in Singapore?

For my part, I am happy to learn about all these techniques and programs because they will surely make a real difference on my CV. I also think that the quicker enterprises realize how important it is to have people able to deal with data (us), the better it will be for them. It also means that this field will keep growing and that our qualifications will definitely be valuable.

Don't hesitate to comment with your point of view, and to say whether Singapore already offers more of these jobs than Belgium.

-Margot Stelleman

Multivariate Analysis Using Parallel Coordinates

This article discusses the benefits of using parallel coordinates to analyze multivariable data.

To someone who has never examined this kind of graph, it can appear overwhelming and messy. The huge clutter of overlapping lines seems to offer little insight into any patterns or trends. However, after reading this article, I have a new appreciation for parallel coordinates, as they offer a perspective for comparison that I was previously not aware of.

I think that parallel coordinates allow users to get a quick sense of what the overall data looks like and what the general patterns are. For example, cars with more cylinders tend to have the lowest MPG (miles per gallon). This can be seen easily by brushing the lines with high cylinder counts, following the highlighted lines and observing where they connect on the other axes.
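To make the brushing idea concrete, here is a minimal sketch in plain Python of what a brush does behind the scenes: it selects the records that fall in a range on one axis, and we then look at where those records land on another axis. The car records, the field names and the `brush` helper are all hypothetical and invented for illustration; they are not taken from the article or from any particular tool.

```python
# Hypothetical car records; the values are invented for illustration only.
cars = [
    {"name": "car_a", "cylinders": 4, "mpg": 32.0},
    {"name": "car_b", "cylinders": 4, "mpg": 28.0},
    {"name": "car_c", "cylinders": 6, "mpg": 21.0},
    {"name": "car_d", "cylinders": 6, "mpg": 19.0},
    {"name": "car_e", "cylinders": 8, "mpg": 14.0},
    {"name": "car_f", "cylinders": 8, "mpg": 15.0},
]


def brush(records, axis, low, high):
    """Return the records whose value on `axis` lies within [low, high]."""
    return [r for r in records if low <= r[axis] <= high]


def mean(values):
    """Arithmetic mean of a non-empty iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


# Brush the top of the "cylinders" axis, as one would do with the mouse.
brushed = brush(cars, "cylinders", 8, 8)
rest = [r for r in cars if r not in brushed]

# Follow the highlighted lines to the "mpg" axis and compare with the rest.
brushed_mpg = mean(r["mpg"] for r in brushed)
rest_mpg = mean(r["mpg"] for r in rest)

print("mean MPG of brushed cars:", brushed_mpg)
print("mean MPG of the other cars:", rest_mpg)
```

In an interactive plot the same selection is shown visually as highlighted lines; the numbers here are only a stand-in for what the eye reads off the second axis.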

Parallel coordinates are uncommon because their full benefits can only be realized when the plot is interactive. By hovering the mouse over the graph, the user highlights the selected line and can easily contrast it against the other data and identify patterns. This is not possible in hard copy. In fact, parallel coordinates are messy if the important lines are not appropriately highlighted for comparison. Given this, it is important to consider what medium the graphs will be presented on and decide whether parallel coordinates are still suitable.

-Arnold Lee Wai Tong

Parallel Sets

Besides the Mosaic Plot, Parallel Sets is another interactive visualization technique for displaying multidimensional categorical data. It is similar to the parallel coordinates plot, except that records in the same category are “bundled” together.

To read a parallel sets plot: for each dimension, a horizontal bar is shown for each of its possible categories. The width of the bar indicates the absolute number (frequency) of records in that category. Between the dimension bars are ribbons that connect categories; they show how the combinations of categories are distributed and how a particular subset can be further subdivided. As such, the plot can be useful for exploring relationships in the data that might be elusive when you are faced with many categories.
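The quantities behind the bars and ribbons can be sketched in a few lines of plain Python. This is only an illustration under assumptions: the records, the two dimensions ("segment" and "region") and the helper names `bar_widths` and `ribbon_widths` are hypothetical and do not come from any parallel sets tool.

```python
from collections import Counter

# Hypothetical categorical records as (segment, region); invented for illustration.
records = [
    ("Consumer", "East"),
    ("Consumer", "East"),
    ("Consumer", "West"),
    ("Corporate", "East"),
    ("Corporate", "West"),
    ("Corporate", "West"),
]


def bar_widths(rows, dim):
    """Count records per category of one dimension (the bar widths)."""
    return Counter(row[dim] for row in rows)


def ribbon_widths(rows, dim_a, dim_b):
    """Count records per pair of categories across two dimensions (the ribbon widths)."""
    return Counter((row[dim_a], row[dim_b]) for row in rows)


segment_bars = bar_widths(records, 0)
region_bars = bar_widths(records, 1)
ribbons = ribbon_widths(records, 0, 1)

print("segment bars:", dict(segment_bars))
print("region bars:", dict(region_bars))
for (segment, region), count in sorted(ribbons.items()):
    print(f"ribbon {segment} -> {region}: {count}")
```

Note that the ribbons leaving a category always add up to that category's bar, which is why a ribbon can be read as a subdivision of the subset above it.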

Although parallel sets give our charts a nice and refreshing look, they also have disadvantages. For instance, unlike the Mosaic Plot, a parallel sets plot is unable to show a strong degree of association between dimensions based on its nice regular pattern. Therefore, it is imperative to perform a thorough exploratory analysis of the data and to try various kinds of visualizations by trial and error before we can reap the benefits of using a parallel sets plot.

-Lim Kim Yong