IS428 2016-17 Term1 Assign3 Gwendoline Tan Wan Xin

From Visual Analytics for Business Intelligence
Revision as of 21:05, 22 October 2016 by Wx.tan.2013 (talk | contribs)
Jump to navigation Jump to search

Problem & Motivation

After the successful resolution of the 2014 kidnapping at GAStech’s Abila, Kronos office, GAStech officials determined that Abila offices needed a significant upgrade. At the end of 2015, the growing company moved into a new, state-of-the-art three-story building near their previous location. The new office is built to the highest energy efficiency standard, but as with any new building, there are still several HVAC issues to work out. Each zone is instrumented with sensors that report building temperatures, heating and cooling system status values, and concentration levels of various chemicals such as carbon dioxide (abbreviated CO2) and hazium (abbreviated Haz), a recently discovered and possibly dangerous chemical. Staff members are also given proximity cards which will register their movements in the building by fixed proximity sensors in each zone and Rosie, a robotic mail delivery system equipped with proximity sensors.

With the huge amount of data collected by the various sensors installed in the building, there is a need to build an interactive data visualization tool to help the management efficiently identify typical patterns and issues of concern in building operations.

The interactive visualization can be targeted at employees from the following departments:

  • Security Department: To track movement of employees within the building so as to ensure that employees are not entering prohibited zones/areas
  • Facilities Department: To understand building operations and the respective building data elements so that anomalies can be spotted quickly
  • Executive Department: To understand power usage of the new building so as to properly plan ahead on possible policies to implement (e.g. campaigns to save power consumption)

Dataset Analysis & Transformation Process

Before the analysis began, the dataset given is analysed to identify its respective format and attributes. There were 6 different zip files provided in the assignment and each has its own unique ways to process and make sense of the data to bring value in the analysis. This section will elaborate on the dataset analysis and transformation process for each dataset in order to prepare the data for import and analysis on an interactive visualization.

Fixed/Mobile Proximity Data & Employee List

There are 2 main types of proximity detectors installed in the building to enhance the safety and security of its employees. These detectors include fixed proximity sensors installed in each zone of the building and a mobile proximity sensor that is attached to Rosie, a robotic mail delivery system. Data will be captured when employees, holding their own proximity card, pass a proximity sensor in the building.

The formats of the captured data by both the fixed proximity sensor and mobile proximity sensor are similar. However, the fixed proximity sensor provides the zone in which employees pass through while the mobile proximity sensor provides the exact coordinates of the employee’s location. The following shows the main differences between the two different types of data captured:

Fixed Mobile Original Data Formats GwendolineTanWanXin.png

The following section illustrates the issues faced in the data analysis phase leading to a need to transform the data into a specified format.

Issue: Because of the differences in the types of data captured, the data cannot be correlated with each other for display in the same chart. For example, one can plot employee’s exact location using the X/Y coordinates (from the mobile proximity sensor) but the zone coordinates (from the fixed proximity sensor) were not known and hence, could not be plotted. This will present an issue during the analysis since we are unable to clearly identify which area the employees are present on the map.

Solution: To resolve this issue, the coordinates for each zone has to be identified. This is done using an online tool that allow analysts to get coordinates for custom polygons on a map. By plotting these custom polygons on a map, we will be able to get the centre point coordinates of the zone. This will then allow us to plot the employees’ location on each zone. The following details the process of getting individual zone coordinates.

  1. Download the online tool from the link: https://github.com/bryantbhowell/tableau-map-pack/blob/master/draw_tableau_polygons_on_background_image.html
  2. Open the tool and choose the map to be plotted. In the zone maps given, there were white spaces and additional text that were not part of the map itself. As such, the map has to be cropped to get the actual map itself only. The following shows the cropping of maps and this process is repeated for each of the 3 levels.

    Cropping of Maps GwendolineTanWanXin.png

  3. Open the map using the online tool and plot each of the custom polygons. During the process of plotting, the following assumptions were made:
    1. For some areas with multiple zones separated on the map, a small area of the zone is being selected. For example, zone 4 have 3 different segments in level 1. However, they all refers to the elevators and stairs. By plotting only one of the zones, we will still be able to know that the employee is taking the stairs/elevators. There isn’t a need to identify which stairs or elevators they took. As such, for zone 4, only one of the custom polygon is drawn.
    2. For areas that were too big, only a small segment of the zone was plotted. For example, zone 1 encompasses the entire corridor spaces. If all the corridor spaces are drawn, the centre point will not be situated in zone 1. Therefore, only one part of the zone is drawn. As long as employees are within the same zone, they are in the corridor areas.
      The outcome of plotting these polygons will result in the following display on the online tool. The same process is repeated for each of the 3 levels.

      Plotting of Zones GwendolineTanWanXin.png

  4. After the polygons have been plotted, the results can be exported into a .csv file format for analysis in Tableau. Once the data has been imported into Tableau, the centre coordinates of each zone can then be retrieved. The following shows the centre points that were plotted on the chart and the ability to export the data. This will then allow us to retrieve the coordinates for each zone to be plotted onto the map.

    Individual Zone Coordinates GwendolineTanWanXin.png

  5. However, in level 2 and 3, shapes of each zones were not consistent and this has resulted in errors in the centre point of the zone. The following shows the errors that has been encountered during the process.

    Error In Centre Point GwendolineTanWanXin.png

  6. In order to resolve this error, a manual task has to be performed to get the coordinate point of each erroneous zone coordinate. This can be performed in Tableau using the annotate function. The following shows an example of how the annotate function can provide the X/Y coordinate values. The process is repeated for each of the erroneous centre points.

    Annotate Area With Point GwendolineTanWanXin.png

  7. After the completion of this process, the issue of lacking individual zone coordinates will be resolved.


Issue: Currently, the data for the fixed proximity sensors and mobile proximity sensors are in 2 separate spreadsheets. This makes the analysis process difficult with the need to blend or join data in Tableau.

Solution: To simplify the process of analysis, both the data from the fixed and mobile proximity sensors are combined into one single file. With the above process completed, we are also able to get the coordinates for each of the zone to be plotted on the chart. However, we do not have the zone information for the mobile proximity sensor. In this analysis, it does not matter as we are only interested in plotting the zone coordinates onto the chart. The following shows the data format of the transformed data:

Combined Proximity Data Format GwendolineTanWanXin.png


Issue: During the process of correlating the zone data and the coordinates, the dataset has a zone that stated “Server Room”. However, there is no point coordinate for this zone on our generated polygons.

Solution: To resolve this issue, the annotate function is also used to help us get the centre point coordinate of the server room. This result is then, updated into the file for analysis.


Issue: With the zone coordinates and the X/Y coordinates from the mobile proximity sensor, the data can be plotted onto a map to identify where employees are located. However, with only the proximity ID in the captured data, we are unable to correlate the employee with each proximity ID. The employee list provided in the dataset also does not show the mapping between their name and the proximity ID.

Solution: Upon close observation of the employee name and the proximity ID, a correlation can be identified. The proximity ID is derived from the first letter of the employee’s first name and the entire string of the last name. The last three numbers refer to the number of times each employee request for a new pass. By default, the count starts from 001. The following formula is then applied to identify a correlation between the employee and the proximity ID:

Employee and Proximity ID GwendolineTanWanXin.png

Building Data Elements (HVAC Sensor Readings & Status Information)

The building data elements provide multiple readings collected from different sensors in the entire building. However, the data is not structured in a way that allows for flexibility in the choice of sensor readings to be analysed and to apply filters by different floors and zones. As a result, there is a need to reshape the data to provide for this flexibility.

There are 3 main categories of data in the building data dataset:

  • Overall Building Data (e.g. Drybulb Temperature, Pump Power etc.)
  • Floor Data (e.g. Air Loop Inlet Temperature, Cooling Coil Power etc.)
  • Floor/Zone Data (e.g. Reheat Coil Power, CO2 Concentration etc.)

To better analyse the data using Tableau, the following is being done:

  1. The entire building data dataset is grouped into different Excel Spreadsheet as follows:

    Building&Floor Data Spreadsheet GwendolineTanWanXin.png Floor Zone Data File GwendolineTanWanXin.png

  2. Tableau provides an Excel plugin that allow one to reshape data into a format for easy processing especially if multiple attributes were present for selection. With the use of the tool, the following is then performed to each of the grouped datasheet:
    1. For the building data spreadsheet, a data reshape is performed to obtain the following result:

      Building Data Reshape GwendolineTanWanXin.png

    2. For the floor data spreadsheet, a data reshape is performed. The output of the reshaped data is similar to the building data, with the exception of the difference in attributes between the overall building and each floor attributes.
    3. For the floor/zone data spreadsheet, a data reshape is also performed on its attributes. However, due to the volume of the data generated, additional processing needs to be done as follows:

      Process Floow Zone Data GwendolineTanWanXin.png

With the reshaped data, it can then be imported into Tableau for analysis.

Hazium Readings

The sensor readings for hazium concentration level were saved as 4 different files in the given dataset. These readings are consolidated into one single file. The result is similar to the building dataset elements. As such, similar processing techniques were used to process the data - Excel Tableau’s plugin for data reshape. The following illustrates the steps taken to achieve the final transformed dataset.

Hazium Data File GwendolineTanWanXin.png

Dataset Import Structure & Process

With the dataset analysis and transformation phase completed, the following files will have to be imported into Tableau for analysis:

Data Sources GwendolineTanWanXin.png

Each of the data file is added as a data source in Tableau. The relationships defined between each data source is the timestamp, floor and zone. This will allow analysis to be conducted across all the data sources at any one point.

Additional processing is performed to the first data source - Proximity Data and Employee List.

  • Import the proximity data as a data source.
  • Add new data connections to the proximity data source. The new data connection file will be the Employee List.
  • Perform a left join between the proximity data and the employee list to correlate the employee ID together. This will allow us to identify the employee with the captured proximity card ID. The following shows the configuration of the join:

Tableau Data Join Of Proximity Data GwendolineTanWanXin.png

To process and display the dates in a readable format for the analyst, each of the data sources will have a new calculated field to derive the date and day of the week. This will convert the date from the format “31/5/2016” to a format of “31 May (Tuesday)”.

The formula used is as follows:

STR(DATEPART('day', [Date/Time])) + " " + DATENAME('month',[Date/Time]) + " (" + DATENAME('weekday', [Date/Time]) + ")"

Interactive Visualization

The interactive visualization can be accessed here: https://public.tableau.com/views/Assignment3_145/PowerConsumption?:embed=y&:display_count=yes

For the best experience, adjust your screen resolution to 1366x768 and enable full screen on the browser. Adjust the dashboard so that all elements can be clearly visible without the need to scroll up/down.

Throughout all the different dashboards, useful guides/tips are provided to help users navigate through the different filters and actions so that their analysis can be performed smoothly. The following interactivity elements are also used throughout all the dashboards to maintain consistency:

Interactive Technique Rationale Brief Implementation Steps
Filter dates with the use of time range slider
Filter by Date GwendolineTanWanXin.png
To provide flexibility for analysts to choose the time period that they are interested to analyse. The use of checkboxes or dropdown list requires the analyst to check/uncheck each date manually which is time-consuming. As such, a time range slider is preferred.
  1. The date/time field have to be duplicated with its data type set to “date”
  2. Add the new field to be filtered.
Filter each floor/zone using a single selection drop down list
Drop down list GwendolineTanWanXin.png
To allow analysts to concentrate on the data collected from each level with the use of a single selection. Use of a drop down list also allow analysts to easily choose the building level that they are interested to analyse.
  1. Configure the filter selection to be a single selection drop down list
Change and zoom of floor plans based on each floor filter
Dynamic Floor Plan GwendolineTanWanXin.png
To allow for easy reference of mapping each building data elements with each zone. When a user filters from one floor to another, the floor plan also changes to provide for quick and easy reference. Due to the space constraint, each floor plan has to be zoomed for users to identify and see the zone areas clearly.
  1. Create calculated fields for the x and y axis.
  2. Put the 2 calculated fields into the worksheet view.
  3. To hide the mark, set the colour as transparent.
  4. Navigate to Maps > Background Images. Add the floor plans into the background images and configure it according to the filter condition.
  5. Put the “floor” attribute as a filter.

The following sections elaborates on other interactivity techniques integrated into each of the individual dashboard.

Home Dashboard

There is a large amount of data attributes captured in the dataset provided. As such, it will not be possible to display all the attributes for a proper analysis in a single dashboard. At the same time, although many of these attributes are interrelated to each other, there is no clear and correct order in which data attributes should be analysed. Therefore, use of a story point does not seem plausible in this context. To resolve this issue, flexibility has to be provided for users to navigate between different dashboards. To do so, a homepage is created with 3 different data categories in mind – employee movement, building data elements and power consumption. Each of these categories are further broken down into its respective sub-categories for users to conduct their analysis. This allow users to choose the analysis that they are interested to look in.

The following shows the home dashboard:

Dashboard Home GwendolineTanWanXin.png

To allow for flexibility in navigation, the following interactive techniques have been employed:

Interactive Technique Rationale Brief Implementation Steps
Navigate across dashboards with buttons
To provide users the flexibility of moving from one dashboard to the other, with an easy and simple to use interface
  1. Create images of the buttons and placed these images into the folder directory as follows: My Tableau Repository > Shapes.
  2. In Tableau, create a new calculated field.
  3. Drag the new calculated field into the “Shapes” mark. Configure the shapes that you want to use by selecting the Shape Palette loaded into the folder previously.
  4. In the dashboard, drag the worksheet with the button into an empty area. Configure the action, with the use of the filters action.
Display tooltips when users hover over each button
To provide users with contextual information about the dashboard and the expected charts that they will look at after the button click
  1. Configure the tooltips information on the marks shelf

Employee Movement Dashboard

The following shows the employee movement dashboard:

Dashboard Employee Movement GwendolineTanWanXin.png

The following interactive techniques have been employed in this dashboard:

Interactive Technique Rationale Brief Implementation Steps
Traverse through an hourly time series by clicking on the arrows
To analyse and draw connections between employees and their movement in the building over a series of time
  1. Drag the date/time field into the “Pages” shelf
  2. Change the field to “Hour” to enable the animation on an hourly basis

The following shows a screenshot of the configuration:

Pages Shelf GwendolineTanWanXin.png
Filter data source types based on fixed or mobile proximity type
To allow the analyst to understand the fixed and mobile proximity data
  1. Add a checkbox filter into the dashboard
Highlight department upon selection
To highlight departments that are of interest to the analyst
  1. Add the shape legends for each department
  2. Select the new object and check “Highlight selected items”.

The following shows a screenshot of the configuration:

Enable Highlighting GwendolineTanWanXin.png

To help analysts get started with the analysis of the data, a typical pattern of GAStech employees have also been identified and included in the dashboard.

HVAC Control System & Chemical Levels (Heating)

The following shows the HVAC Control System (Heating) dashboard:

Dashboard Heating GwendolineTanWanXin.png

The following interactive techniques have been employed in this dashboard:

Interactive Technique Rationale Brief Implementation Steps
Highlight temperature type, upon selection
To enable analysts to visualize the differences between the water temperature and its respective set-points, all 3 lines have been placed on the same chart. However, situation arises when there is a need to look at one single attribute only. Therefore, the use of highlighting will help analyst to perform its tasks better.
The implementation steps are similar to the use of highlighting in the “Employee Movement” dashboard.
Highlight across all the different charts
To allow analysts to see how one data point is correlated to another data point in another chart, given the same date/time
  1. Navigate to Dashboard > Actions > select “highlight”.
  2. Configure the source and target sheets for the highlighting of charts in the dashboard.

HVAC Control System & Chemical Levels (Natural Ventilation)

The following shows the HVAC Control System (Natural Ventilation) dashboard:

Dashboard Natural Ventilation GwendolineTanWanXin.png

The following interactive techniques have been employed in this dashboard:

Interactive Technique Rationale Brief Implementation Steps
Highlight across all the different charts
To allow analysts to see how one data point is correlated to another data point in another chart, given the same date/time
The implementation steps are similar to the use of highlighting in the “HVAC Control System & Chemical Levels (Heating)” dashboard.
Link filter with “Mechanical Ventilation” dashboard
To allow analysts to conduct their analysis smoothly such that when one filters for level 2 data in mechanical ventilation dashboard, they will also use level 2 data when they go into the natural ventilation dashboard.
The rationale is that both ventilation elements are related to each other and analysts might be interested to analyse both elements one after the other.
  1. When configuring the filters, select the worksheets that will be affected by the filters.
  2. Other than selecting the worksheets in the dashboard, show all the worksheets in the workbook and check those that applies.

The following shows an example of the configuration:

Apply Filter to other Worksheets GwendolineTanWanXin.png

HVAC Control System & Chemical Levels (Mechanical Ventilation)

The following shows the HVAC Control System (Mechanical Ventilation) dashboard:

Dashboard Mechanical Ventilation GwendolineTanWanXin.png

The following interactive techniques have been employed in this dashboard:

Interactive Technique Rationale Brief Implementation Steps
Highlight across all the different charts
To allow analysts to see how one data point is correlated to another data point in another chart, given the same date/time
The implementation steps are similar to the use of highlighting in the “HVAC Control System & Chemical Levels (Heating)” dashboard.

HVAC Control System & Chemical Levels (Air-Conditioning)

The following shows the HVAC Control System (Air-Conditioning) dashboard:

Dashboard AirCon GwendolineTanWanXin.png

The following interactive techniques have been employed in this dashboard:

Interactive Technique Rationale Brief Implementation Steps
Highlight temperature type, upon selection
To enable analysts to visualize the differences between the thermostat temperature and its respective set-points, all 3 lines have been placed on the same chart. However, situation arises when there is a need to look at one single attribute only. Therefore, the use of highlighting will help analyst to perform its tasks better.
The implementation steps are similar to the use of highlighting in the “Employee Movement” dashboard.

HVAC Control System & Chemical Levels (Chemicals)

The following shows the HVAC Control System (Chemicals) dashboard:

Dashboard Chemicals GwendolineTanWanXin.png

The following interactive techniques have been employed in this dashboard:

Interactive Technique Rationale Brief Implementation Steps
Show different views of the Hazium data in the dashboard
To provide different perspectives for an analyst to conduct his investigation. For example, the line graph allows analysts to view the change in hazium concentration level overtime while the heatmap allow analysts to easily view the concentration levels of hazium based on colours representation.
-
Highlight across different charts
To show the correlation between the same set of data that has been represented in 2 different chart layouts
The implementation steps are similar to the use of highlighting in the “HVAC Control System & Chemical Levels (Heating)” dashboard.
Configure safe range of CO2 level
To provide a clear view for the analysts to see if the level of carbon dioxide concentration exceeds a safe range


To provide flexibility for analysts to configure a safe range, as defined by the organization
  1. Switch to the “Analytics” view and choose to add a “Reference Band” into the chart.
  2. For the value, choose to add a new custom parameter. The following shows a screenshot of the new parameter created:
    Defining Parameters GwendolineTanWanXin.png
  3. To create a band, define two parameters that is the typical safe range level to a maximum safe range level. This will help to create a reference band. The following shows the configuration of a reference band:
    Defining Reference Band GwendolineTanWanXin.png
  4. Add both of these parameters onto the dashboard and analysts will be able to configure the values based on the organization’s preference.
Clicking on data point to explore ventilation elements that might lead to variations in CO2 level
To allow co-referencing of carbon dioxide levels with the ventilation elements and analyse possible reasons that led to the changes in carbon dioxide concentration
To ensure that the filter applied in the dashboard is linked with the “HVAC Control System Ventilation Elements” dashboard, similar steps were taken as mentioned previously to link filters across different dashboards.
To allow clicking of the chart to another dashboard, actions were defined based on the following configuration:
Action to Link Dashboard GwendolineTanWanXin.png

Power Consumption in Kronos Office Building

The following shows the Power Consumption dashboard:

Dashboard Power Consumption GwendolineTanWanXin.png

The following interactive techniques have been employed in this dashboard:

Interactive Technique Rationale Brief Implementation Steps
Click on data point to explore possible causes of variations in HVAC power demand
To allow analysts who are interested to identify reasons as to why the HVAC control system is consuming so much power with just a single click on a data point
The implementation steps are similar to the use of linking technique used in the “HVAC Control System & Chemical Levels (Chemicals)” dashboard.

Interesting & Anomalous Observations

Q1: Typical Patterns In Proximity Card Data & Typical Day Of GAStech Employees

Q2: Interesting Patterns & Its Significance In Building Data

Q3: Notable Anomalies/Unusual Events In Data

Q4: Observed Relationships Between Proximity Card Data & Building Data Elements