Difference between revisions of "Group05 Proposal"

From Visual Analytics and Applications
Banner Photo from [https://www.pexels.com/photo/man-riding-bicycle-on-city-street-310983/ Pexels]
 

Revision as of 17:15, 20 November 2018

Visual Application for Time Series Clustering

Project Proposal


 


Abstract

XXXXX

Background on Time Series Clustering

What is Time Series Clustering?
Clustering is a data analysis technique for organizing observed data (e.g. people, things, events, brands, companies) into meaningful taxonomies, groups or clusters without prior knowledge of how the groups are defined. Clusters are formed from combinations of input variables so that cases within each cluster are as similar as possible, while the clusters themselves, which are initially unknown, are as dissimilar from one another as possible. Time-series clustering is a special type of clustering applied to time-series data, which are dynamic: their feature values change as a function of time.
Key Parameters of Time-Series Clustering:

Parameters Algorithm
Type
  • Hierarchical Clustering
  • Partitional Clustering
Distances
  • Dynamic Time Warping (DTW)
  • Global Alignment Kernels (GAK)
  • Shape-Based Distance (SBD)
Centroid
  • DTW Barycenter Averaging (DBA)
  • Partitioning Around Medoids (PAM)
  • Shape Averaging (Shape)
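
To make the DTW distance listed above more concrete, the following is a minimal Python sketch of the classic dynamic-programming formulation. It is an illustration only and is not taken from dtwclust: the function names `dtw_distance` and `pointwise_distance` are hypothetical, the local cost is assumed to be the absolute difference, and no window constraint is applied. The package's own implementation may differ in all of these respects.

```python
from math import inf


def dtw_distance(a, b):
    """Unconstrained DTW distance with absolute-difference local cost (illustrative sketch)."""
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] is matched to an already-used point of b
                cost[i][j - 1],      # b[j-1] is matched to an already-used point of a
                cost[i - 1][j - 1],  # both series advance together
            )
    return cost[n][m]


def pointwise_distance(a, b):
    """Sum of absolute differences without warping, for series of equal length."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Hypothetical toy data: a peak, the same peak one step earlier, and a flat line.
series = [0, 0, 1, 2, 1, 0, 0]
shifted = [0, 1, 2, 1, 0, 0, 0]
flat = [1, 1, 1, 1, 1, 1, 1]

print(dtw_distance(series, shifted))        # warping absorbs the time shift
print(pointwise_distance(series, shifted))  # lock-step comparison penalizes the shift
print(dtw_distance(series, flat))           # a different shape stays far away
```

In this toy example the shifted peak is treated as much closer to the original under DTW than under a lock-step comparison, which is the property that makes DTW attractive for grouping series with similar shapes that are out of phase.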


Packages Used

This dashboard mainly uses the dtwclust package for R.

dtwclust:

XXXX

Objective

To build an application for Time Series Clustering
Time-series data are of interest because they are ubiquitous in areas ranging from science, engineering, business, economics and healthcare to government. This dashboard aims to let users cluster time-series data and uncover patterns that have potential use cases in their respective domains.