IS428 2016-17 Term1 Assign3 Thomas Joseph Thio Kit Sun temp

From Visual Analytics for Business Intelligence

Building the visualization

1. Exploration

I used parallel coordinates on the cleaned data to see relationships. Some variables were not helping at all with the building data, e.g. loop temp schedule, pump power, water heater supply and supply inlet flow rate, and could be removed from any cross-analysis. Deli fan power seems to be either at 45 or 0. HVAC electric demand power and total electric demand power seem to have a few outliers, which could be worth looking at. Water heater tank temperature was also right-skewed, while water heater gas rate was left-skewed.
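The skew direction of a variable can also be checked numerically rather than by eye. The following is a minimal Python sketch, not the JMP/Tableau steps used here; the function name and the sample readings are made up for illustration and are not taken from the data set.

```python
from statistics import mean, pstdev

def skewness(values):
    """Population skewness: the mean of the cubed z-scores."""
    m = mean(values)
    s = pstdev(values)
    if s == 0:
        return 0.0  # a constant series has no skew (and adds nothing to cross-analysis)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

# Hypothetical readings, NOT taken from the data set.
tank_temp = [50, 50, 51, 51, 52, 52, 53, 60, 68]  # long tail to the right
gas_rate = [0, 9000, 14000, 14500, 15000, 15000, 15200, 15300, 15400]  # long tail to the left

print("tank temp right-skewed:", skewness(tank_temp) > 0)
print("gas rate left-skewed:", skewness(gas_rate) < 0)
```

A positive value suggests a right-skewed variable and a negative value a left-skewed one; a standard deviation of zero marks a variable that never changes.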


2. Unifying data by time

Thereafter, I thought of plotting the proximity data onto the provided maps, helping me to visualize the movements or detections of the mobile and fixed sensors respectively. This was done using Tableau, which has the 'Page' feature, allowing me to animate the movements across the three levels. This would come in useful for the cross-analysis with hazium levels and building data, both in general and per level. However, I encountered extremely slow rendering due to the large number of data points if I added minute data. Hourly data could work, but since the data all had different timestamps, it seemed to make more sense to see fluctuations by day, and zoom in afterwards to see which variables (staff/hazium/building data) could be causing any issues. In this way, I could use standard deviations and represent fluctuations or abnormalities using boxplots.
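To illustrate the idea of collapsing readings with differing timestamps into one row per day, here is a minimal Python sketch. It is an assumption about how such a roll-up could look, not the Tableau aggregation itself; `daily_summary` and the sample readings are hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, pstdev

def daily_summary(readings):
    """Group (timestamp, value) pairs by calendar day -> {date: (mean, std dev)}."""
    by_day = defaultdict(list)
    for ts, value in readings:
        by_day[ts.date()].append(value)
    return {day: (mean(vals), pstdev(vals)) for day, vals in sorted(by_day.items())}

# Hypothetical readings with irregular timestamps, NOT taken from the data set.
readings = [
    (datetime(2016, 6, 7, 9, 0), 24.0),
    (datetime(2016, 6, 7, 9, 5), 26.0),
    (datetime(2016, 6, 8, 14, 30), 30.0),
]
summary = daily_summary(readings)
for day, (avg, spread) in summary.items():
    print(day, avg, spread)
```

Days with an unusually large spread are then the ones worth zooming into at hourly or minute level.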

I had to decide on a common 'timestep', that is, should the animation be played in intervals of minutes or hours? Would this choice make us miss interesting patterns?

Thereafter, how would we track the movements of the staff? Thankfully Tableau also supports tracking the last N steps, creating a trail left behind when the staff 'move' from location to location. I also added an ID column to the employee list so that we can match the employees' movements to the departments they are in, rather than seeing raw movement patterns of people in general. This was done using JMP's substring, lowercase and concatenation features. Some data cleaning had to be done; for example, staff names with hyphens took the first component rather than the full last name. There were some mismatches in the given employee list - for example, the last name is 'paredes', but the employee list's ID is 'parades'. I took the given employee list ID as the source of truth to simplify things.
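The ID construction can be sketched in Python as follows. This is not the JMP formula used here; it assumes the ID is the first initial followed by the first component of the last name, lowercased, and the names and function names below are hypothetical.

```python
def make_prox_id(first_name, last_name):
    """Assumed scheme: first initial + first part of the last name, lowercased."""
    surname = last_name.strip().split("-")[0]  # hyphenated names keep only the first part
    return (first_name.strip()[:1] + surname).lower()

def find_mismatches(employees):
    """Return rows whose listed ID differs from the derived one, for manual review."""
    return [e for e in employees if make_prox_id(e["first"], e["last"]) != e["id"]]

# Hypothetical rows, NOT taken from the employee list.
employees = [
    {"first": "Jane", "last": "Smith-Jones", "id": "jsmith"},
    {"first": "Alex", "last": "Doe", "id": "adow"},  # deliberate mismatch
]
print(make_prox_id("Jane", "Smith-Jones"))
print(find_mismatches(employees))
```

Rows returned by `find_mismatches` would keep the ID from the employee list, in line with treating that list as the source of truth.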

It was useful to join the employee list and fixed/mobile proximity tables though, as I could now represent the movements by the employee and departments where they are.

To look at the patterns staff would take, Tableau allows selecting specific targets and tracking their most recent paths, to see common movement on a daily basis. This way, we can see if they have deviated from their usual path.


3. Organization & Flow of Analysis

*Used the HVAC system energy map.

The main concern was how to help users hunt for and show strange fluctuations across 20+ variables, for each floor! One way to do this was to use the average or median - however, since we are looking out for 'strange' activity, we want to tune in to outliers - standard deviation could thus be used instead of plotting all values of each variable. Tableau's highlighting feature could help compare the values of a specific day and hour of one chart with another.
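As a rough illustration of using standard deviation to pick out 'strange' values, the Python sketch below flags readings that sit more than a chosen number of standard deviations from the mean. The threshold of 2.0, the function name and the sample values are assumptions for illustration only.

```python
from statistics import mean, pstdev

def flag_outliers(values, threshold=2.0):
    """Return the indexes of values more than `threshold` std devs from the mean."""
    m, s = mean(values), pstdev(values)
    if s == 0:
        return []  # nothing can stand out in a constant series
    return [i for i, v in enumerate(values) if abs(v - m) / s > threshold]

# Hypothetical hourly supply fan power in watts, NOT taken from the data set.
hourly_power = [210, 205, 220, 215, 208, 2100, 212, 209]
print(flag_outliers(hourly_power))
```

Only the flagged hours would then need to be plotted or highlighted, rather than every value of every variable.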

However, the organization of these charts would matter to the end user - and there is a limitation here to what I perceive as clear to my users, as my own understanding of HVAC systems or building infrastructure is limited. At the time of this writing, I am merely reading some books and applying my assumptions on how I group or generalize the data presented to produce meaningful findings - thus the findings may not be realistic or practical, unless reviewed by a practitioner in the field. Nevertheless, this is my attempt at organizing the data into coherent parts, so they can be cross-analyzed one by one:


As hazium levels were given for selected floors and zones, they were included in the aspects of the visualization for that specific floor and zone.

Employee Data

Fixed proximity analysis, by floor

Screen Shot 2016-10-22 at 10.39.54 AM.png


Mobile proximity analysis, by floor

Screen Shot 2016-10-22 at 10.41.54 AM.png

Building data

Building data related to the environment and hazium were grouped by the following:

Building Temperature Regulation

Dry Bulb Temp (Celsius): Dry bulb temperature of the outside air, aka external temperature

Supply Side Inlet Temp (Celsius): Temperature of the water entering the hot water heater

Supply Side Outlet Temp (Celsius): Temperature of the water exiting the hot water heater

Supply Side Inlet Mass Flow Rate (kg/s): Flow rate of water entering the hot water heater


Building Power

Total Electric Demand Power (Watts): Total power used by the building

Deli Fan Power (Watts): Power used by the deli exhaust fan

Pump Power (Watts): Power used by the hot water system pump

HVAC Electric Demand Power (Watts): Total power used by the building's HVAC system including coils, fans and pumps.


Building Water

Water Heater Tank Temp (Celsius): Temperature of the water inside the hot water heater

Water Heater Gas Rate (Watts): Rate at which the water heater burns natural gas

Water Heater Setpoint (Celsius): Water heater set point temperature

Loop Temp Schedule (Celsius): Temperature set point of the hot water loop. This is the temperature at which hot water is delivered to hot water appliances and fixtures.


With each floor having its own Tableau visualization so as to split the computation of graphics, each zone's charts covered its Temperature, Airflow and Power analysis. If the zone had hazium data, it would be included in the zone's grid:

TEMPERATURE

Thermostat Cooling Setpoint (Celsius): Cooling set point schedule for the zone

Thermostat Heating Setpoint (Celsius): Heating set point schedule for the zone

Thermostat Temp (Celsius): Temperature of the air inside the zone

Supply Inlet Temp (Celsius): Temperature of the air entering the zone from its air supply box


AIR FLOW

Return Outlet CO2 Concentration (parts per million): Concentration of CO2 measured at the zone's return air grille

Mechanical Ventilation Mass Flow Rate (kg/s): Ventilation rate of the zone exhaust fan, where applicable (not all zones have this)

VAV Reheat Damper Position (Open or Closed): Position of the zone's air supply box damper. 1 corresponds to fully open, 0 corresponds to fully closed

Supply Inlet Mass Flow Rate (kg/s): Flow rate of the air entering the zone from its air supply box


POWER

Reheat Coil Power (Watts): Power used by the zone air supply box reheat coil

Equipment Power (Watts): Power used by the electric equipment in the zone

Lights Power (Watts): Power used by the lights in the zone


The visualization would be best displayed at 80%, allowing Tableau's grids and embedded HTML frames to fill the white space. Analysis is then conducted by viewing charts in a grid form. Hovering over any date/time in its matrix highlights the relevant date/times in the rest of the charts in the grid. The idea is that placing relevant data points side by side makes comparison simpler, reducing the common inspection dimension to just time. Should the user find an interesting observation or pattern, they just need to note down the day and time, find somewhere else with interesting patterns or anomalies at the same time, and form inferences from there. Thus, we need to keep track of specific days and times to do a cross-analysis from another dimension, for example temperature data on a specific floor and zone.

The gray-blue-red levels for each building data chart denote increasing levels of the building or hazium data - grayish-blue indicates normal, average levels, while red indicates levels beyond the average, so one can see whether a value is in the normal range just by color and by area of the chart. The intention is for the user to visually scan across the variables they want to compare, for example, temperature analysis on a specific floor or zone, and zoom in to reddish spots or patterns that denote anomalies for further investigation.

I conducted my analysis based on a general approach (i.e. general building data given), followed by analysis of temperature, airflow and power. The analyses were done by floor and by zone, prioritizing the zones with hazium data, then interesting zones/areas where employees displayed erratic behavior outside their norm. By doing so, we can start to see patterns between building or hazium data and the behaviour of the employees - eventually we may find interesting insights that point to possible causes between any of the three dimensions.

Answers to questions:

1. What are the typical patterns in the prox card data? What does a typical day look like for GAStech employees?

The day begins at 9am, cuts off at 9:47 - possibly lunch periods - and resumes from 2:30, and the day 'ends' at 2:45 for mobile proximity data

Visual map: Floor 1: Security (Fusil & Lagos) are at the office (zone 8) at 9am. At 9.06, Facilities staff are usually at the Deli - they rotate between different staff, possibly taking turns to buy breakfast.

Possible Anomalies: On 6/8/2016, rparade is seen at the server room for the first time. 6/9/2016, he is there at 9.02am again. On 6/13/2016, dscozzes is at the server room. At 2.04, iccarra is at the spot between the deli and zone 3.

Blue, an executive, only comes to the deli once on the 6th at 2pm.

Floor 2: At 9.09am, the early arrivers come in for work. At 9.10am, starting from zone 2, the engineering team and one member from administration (holly) arrive for work. At 9.12, another batch of engineering arrives. At 9.13, Facilities team members arrive with some of engineering. Most of the facilities members appear to be outside the meeting/training room, possibly for a daily standup.

Other IT and facilities staff arrive at 9.15

At 9.28am, engineering members are seen in zone 1

2.15 pm, Two members of engineering are in meeting/training room.

Floruinau and Castellanos, from administration and security respectively, are often seen in their offices, being among the earliest to arrive.


Floor 3: Floor 3 seems like the executive floor.

Vasco, an executive, is usually seen crossing from his office to zone 6. Mintz, another executive, is perpetually in his office.


In another view, this showed the frequency of visits to a zone by the hour and per day.

Floor 1 Zone 1: Zone 2:...


Floor 2


Floor 3
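The visit-frequency view described above amounts to counting detections per zone, per day and per hour. Below is a minimal Python sketch of that count; the record layout (`floor`, `zone`, `ts`) and the sample detections are assumptions for illustration, not the actual Tableau calculation.

```python
from collections import Counter
from datetime import datetime

def visit_counts(detections):
    """Count detections per (floor, zone, date, hour)."""
    return Counter(
        (d["floor"], d["zone"], d["ts"].date(), d["ts"].hour) for d in detections
    )

# Hypothetical detections, NOT taken from the proximity data.
detections = [
    {"floor": 1, "zone": 8, "ts": datetime(2016, 6, 8, 9, 0)},
    {"floor": 1, "zone": 8, "ts": datetime(2016, 6, 8, 9, 40)},
    {"floor": 2, "zone": 2, "ts": datetime(2016, 6, 8, 9, 10)},
]
counts = visit_counts(detections)
for key, n in sorted(counts.items()):
    print(key, n)
```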



2. Describe up to ten of the most interesting patterns that appear in the building data. Describe what is notable about the pattern and explain its possible significance.

1. Based on Temperature/Airflow/Power at a specific day and time, are there any anomalies? Is this recurring? Any changes that had a lasting effect? Use this to highlight and see the patterns in other data

If a color is dominant, it means it represents 'normal' levels. Any other color that is small in proportion to the rest of the color grid indicates a possible anomaly. Blue represents the lowest values, gray moderate values, and red high values.


Building data visualization General: The dry bulb temperature, indicating the temperature outside the building, goes up from 9am onwards.

During this time, the total electric demand power goes up from 8am, possibly due to the starting up of air conditioning and other facilities.

The 4th and 5th have relatively low wattage of power use compared to the other days, indicating that they are weekends.

The deli fan power seems to be on from 8am to 4pm daily, with the exception of the 4th. This could mean that the deli is open 6 days a week, taking a break on the Sunday (5th)

HVAC electric demand peaked from the 11th-13th for the entire day - this rise in electricity usage for the HVAC system started from 6pm on 10th.


Temperature:

Floor 2: VAV supply fan outlet temperature rises 7 out of the 14 days at 10pm. The difference in temperature was not as great as in floor 1.

Hazium - Zone 2 and 4: In both these zones, the data was similar to floor 1. The thermostat heating and cooling setpoints followed the same pattern on the 7th and 8th. Thus, the zone's thermostat levels increased as a result, only decreasing once the setpoint temperatures were reduced. The supply inlet temperature was also low. This indicates something else that raised the temperature of that zone.

Supply inlet temperature was particularly high from the 11th to 13th, indicating that the thermostat heating setpoint now affects the supply inlet temperature - previously, there were no similar patterns.

Hazium levels were also high when supply inlet and thermostat heating setpoint temperatures were high.


Floor 3: VAV supply fan outlet temperature has peculiar behavior on the 7th, alternating between rising and falling 2 degrees every hour starting from 11am. This pattern is also seen on the 8th, but stops by 1pm.


Airflow: Floor 1: Air loop mass flow peaked on the 7th and 8th at 1am. This was almost twice the usual amount of volume flowing. The same amounts of airflow were seen on the 10th starting from 5pm - the volume does not normalize to standard levels (blue/gray) until 3 days later at 5am

The air loop temperature was raised by 4 degrees on the 7th and 8th from 8am till 9pm

The supply fan outflow temperature was about twice its normal level on the 31st from 12am to 4pm. The mass flow rate of the supply fan outlet always increases from 7am till 8-9pm, with the exception of the 4th and 5th, being weekends. However, from the 10th onwards, the mass flow rate remained high till the 13th at 4am.



Floor 2: Similar to floor 1, air loop mass flow rose in the wee hours of the morning on the 7th, but started from midnight and normalizes by 7am. Air loop mass flow rate remained high on the 10th, lasting till the 13th at 5am as well.

Air loop temperature also rose, but by 7 degrees on the 7th and 8th from 7am to 9pm.

The outdoor air mass flow rate is usually high throughout these 14 days, but was at its lowest at 6am every day except the 5th and 12th.

In contrast to floor 1, the supply fan outflow temp was consistent throughout.

Hazium - Zones 2 & 4: For both zones 2 and 4, CO2 concentration increased on the 7th and 8th, both starting from 11am and only normalizing at 11pm.

Hazium levels increased on the 11th from 2pm and normalized at 11pm.


Floor 3: Hazium - Zone 1: Floor 3 had the similar timings of CO2 and hazium levels increasing, with additional occurrences. CO2 levels increased earlier and lasted longer, starting from 8am on the 8th and lasted till 12am. The occurrence on the 7th lasted till midnight as well. This could indicate floor 3 zone 1 being a source or where CO2 has difficulty dissipating.

High hazium levels were apparent on the 3rd and 9th from 4am till about 9am, with a similar occurrence on the 11th lasting much longer - till 6am on the 12th.

The floor's reheat damper is open, and the supply inlet mass flow rate has a very large proportion of high levels even in the early hours of the morning and late into the night, indicating high usage of the air flow system, possibly air conditioning left on. Levels only go down from 2pm till about 6pm every day, after which they start to increase again - this is also inclusive of weekends!

Together with the findings conducted during the temperature analysis, we can isolate these times and zones for further investigation with employee pattern data, or power data to form conclusions.


Power: Floor 1: Bath exhaust is high most of the time, and we can assume the bathroom fan is always operational

The heating coil is perpetually low, while the cooling coil power is seen to rise to almost 10 times its normal levels at the same occurrences as the supply fan power.

Hazium - Zone 8A: Reheat coil power only increases on the 7th and 8th at 7pm. The rest of the days are at 0, indicating it is not used.

Equipment and light power maintains their levels throughout the 14 days.

Hazium levels start to rise from the 8th at 7pm, spiking on the 9th at 7pm. There was another spike on the 11th, when levels were higher than normal from 3pm to 9pm.

Floor 2: This floor has similar patterns to floor 1 in supply fan power levels. However, the bath exhaust fan power is seen to drop significantly for 12 out of 14 days at 6pm, before rising back to its higher values in the 47 watt region.

The cooling coil power spiked on the 7th and 8th at 10m, and did not go back down to normal levels from the 10th starting from 8am

Hazium - Zone 2: Reheat coil power rises from 0 to the 4000 watt region on the 11th till 13th at midnight. This signifies that the reheater has been turned on for this period.

Lights and Equipment power is shown to have a consistent pattern, rising from 8am and dropping at 6pm, indicating its usage times during working hours.

Hazium in this zone starts to rise on the 11th, from 1pm till midnight, where it normalizes again.

Hazium - Zone 4: Similarly, hazium levels rise at the same time as in zone 2, and the reheat coil, equipment and lights power follow zone 2 as well.


Floor 3:

Hazium - Zone 1: Reheat coil power rises at 6pm and only normalizes at 4am the next day. This is also peculiar as this happens after office hours.

Lights and equipment power follow each other's patterns exactly.


3. Describe up to ten notable anomalies or unusual events you see in the data. Prioritize those issues that are most likely to represent a danger or a serious issue for building operations.

Floor 1: Supply fan power is 10 times higher on the 7th and 8th, from midnight to 7pm. There was another, longer occurrence starting from the 10th and ending only on the 13th at 5pm. Supply fan power when normal (blue) is only 200+ watts - these occurrences increase wattage to levels of 2000 and above!

Hazium - Zone 8: *Supply inlet temperature's only spike was on the 7th and 8th at 7pm. This was at least twice the normal temperature

From that point on, the thermostat in zone 8A remained high, only normalizing at 9pm on the 8th. Supply inlet temp remained at normal levels, indicating that something else raised the temperature of that zone.

The thermostat cooling and heating setpoints were also raised at the exact times, and lowered at the same times in similar magnitude (>1-2 times the norm). Something or someone may have triggered this.


The CO2 concentration peaking on the 6th from 7-10pm seems rather dangerous. On other days, their levels are also in the 400 levels from 7pm onwards, only with exception of the 31st, 5th and 11-13th.

Hazium levels increased from 6am onwards on the 9th, before normalizing by 11am. It rose to high levels again on the 11th, this time from 3pm to 10pm.

In all these periods, the reheat damper is open and the supply inlet mass flow rate is at its highest levels - if the CO2 and hazium (if airborne) were to circulate, this would be very dangerous.


Floor 2: *Slight difference to floor 1 is that the reheat damper and supply inlet mass flow rate were open and at their highest only where hazium levels were also high. CO2 levels were high when reheat damper and supply inlet mass flow rates were normal - however this may indicate that the CO2 has somehow got into floor 2 without help from the HVAC system!


Hazium - Zone 2: Reheat coil power rises from 0 to the 4000 watt region on the 11th till 13th at midnight. This signifies that the reheater has been turned on for this period.

CO2 concentration increased from the 7th and 8th, both starting from 11am and normalizing at 11pm.

Hazium levels increased on the 11th from 2pm and normalized at 11pm.


Floor 3: VAV supply fan outlet temperature has peculiar behavior on the 7th, rising and falling 2 degrees every hour starting from 11am. This pattern is also seen on the 8th, but stops by 1pm.

Hazium - Zone 1: Out of all the zones with hazium levels detected, this zone had the highest number of occurrences. They were on the 3rd (increasing from 3am and normalizing at 11pm), the 9th (increasing from 3am and normalizing at 12pm) and the 11th (increasing from 3pm and normalizing at 7am the next day). The 11th seems to be the most significant finding, as it is also the longest period for which the hazium levels have been that high.

From the 2nd to the 13th, this particular floor and zone had unusual temperature patterns. For example, the thermostat cooling and heating setpoints were high (35 degrees) for most of the day, starting from 1pm till 4 am the next day. Temperature goes back to normal (10 degrees) from 5am to 12pm.

Supply inlet and thermostat temperature followed the same pattern and never seemed to go down after day 2, even on weekends. This is an interesting finding from floor 3! This means there is something worth investigating to unravel the causes - possibly a malfunctioning HVAC system, or something/someone else causing it to behave in a peculiar manner.

It is also worth noting that the temperature levels are as high as, if not higher than, the dry bulb temperature! This meant that Zone 1 was as hot as the outside temperature, even on weekends!

Reheat coil power rises at 6pm and only normalizes at 4am the next day. This is also peculiar as this happens after office hours.

Floor 3 had the similar timings of CO2 and hazium levels increasing, with additional occurrences. CO2 levels increased earlier and lasted longer, starting from 8am on the 8th and lasted till 12am. The occurrence on the 7th lasted till midnight as well. This could indicate floor 3 zone 1 being a source or where CO2 has difficulty dissipating.

High hazium levels were apparent on the 3rd and 9th from 4am till about 9am, with a similar occurrence on the 11th lasting much longer - till 6am on the 12th.

The floor's reheat damper and supply inlet mass flow rate have a very large proportion of high levels even in the early hours of the morning and late into the night, indicating high usage of the air flow system, possibly air conditioning left on. Levels only go down from 2pm till about 6pm every day, after which they start to increase again - this is also inclusive of weekends!



4. Describe up to five observed relationships between the proximity card data and building data elements. If you find a causal relationship (for example, a building event or condition leading to personnel behavior changes or personnel activity leading to building operations changes), describe your discovered cause and effect, the evidence you found to support it, and your level of confidence in your assessment of the relationship.

I formed my inferences on the conclusions based on the relationships between building, hazium and employee - each could affect one or another in some way. For example:

  1. By hazium levels, get date and time
  2. By erratic employee behaviour, get date and time
  3. By building data anomalies, get date and time
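The three steps above boil down to collecting the dates and times from each source and looking for overlaps. A minimal Python sketch of that intersection follows; matching at hourly granularity, the function names and the sample timestamps are all assumptions for illustration.

```python
from datetime import datetime

def hour_key(ts):
    """Truncate a timestamp to the start of its hour."""
    return ts.replace(minute=0, second=0, microsecond=0)

def shared_hours(*event_lists):
    """Return the hours in which every source recorded something unusual."""
    key_sets = [{hour_key(ts) for ts in events} for events in event_lists]
    return sorted(set.intersection(*key_sets)) if key_sets else []

# Hypothetical anomaly timestamps, NOT taken from the data set.
hazium = [datetime(2016, 6, 11, 15, 10)]
employee = [datetime(2016, 6, 11, 15, 45), datetime(2016, 6, 9, 6, 0)]
building = [datetime(2016, 6, 11, 15, 0)]

overlap = shared_hours(hazium, employee, building)
print(overlap)
```

An overlap only shows that events coincide in time; any causal reading still needs the supporting evidence discussed per floor.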


Floor 1: There were engineers working from 7pm till midnight on the 31st in zone 1 and 4, explaining the rise in supply fan power as they were still in the building. Perhaps they were doing some modifications to the building, or fooling around.


Floor 3: Peculiar temperature patterns:

On the 11th, thermostat cooling and heating setpoints were very high, at 35 degrees. This started from 1pm, which was also the last recorded instance of an executive at zone 1. This is the room where an executive is frequently found, possibly meaning it is his office. The executive seems to be the only one frequenting this room for the 14 days.

Hazium - Zone 1: On the 3rd, the last to leave the zone at 6pm was an executive. On the 9th, a facilities personnel was the last at 6pm. On the 11th, the last instance was the executive at 1pm.

These two occurrences can be supported by the fact that he is the only person in the room - if anyone adjusted the setpoints, it was highly likely to be that particular executive (Sanjorge Jr.)

Comparisons between software

I did attempt to do the above steps with various software, but faced difficulties in getting the organization, analysis and interaction capabilities. Tableau does hit its limitation at some point, which I will discuss further.

*Tableau renders the charts very slowly - I believe that it works well for small data sets, but scales poorly when dealing with large datasets - especially one dealing with spatio-temporal aspects. JMP does well here, with their Graph Builder feature creating time series comparisons amongst each variable such as building data quickly. Tableau takes some time to load each of the variables, and loading them all at once even causes Tableau to crash.

*Other than using JMP, I attempted using QlikView for this project. On first use though, QlikView is very slow when conducting data exploration, as variables have to be dragged or selected one by one.


Tableau does not let you do a full join or right join as they consider it a deprecated feature... thus manual data cleaning would have to be done. To get all the timestamps synced with one another, I created all the possible minutes, hours and days and joined them to the existing building data Excel file, then replaced the old timestamps. There would be some blanks in the building data, but these ensured that the joins with the rest of the proximity and hazium data could be synced properly.
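The timestamp scaffold can be sketched in Python as below. This is an illustration of the idea under stated assumptions, not the Excel steps used here: every possible minute is generated once, and each table is then left-joined onto that grid so that gaps show up as blanks. The function names and sample values are hypothetical.

```python
from datetime import datetime, timedelta

def minute_grid(start, end):
    """Every minute from start to end, inclusive."""
    n = int((end - start).total_seconds() // 60)
    return [start + timedelta(minutes=i) for i in range(n + 1)]

def align_to_grid(grid, *tables):
    """Left-join each {timestamp: value} table onto the grid; gaps become None."""
    return [(ts, *(table.get(ts) for table in tables)) for ts in grid]

# Hypothetical values, NOT taken from the data set.
start = datetime(2016, 5, 31, 0, 0)
grid = minute_grid(start, start + timedelta(minutes=3))
building = {start: 21.5, start + timedelta(minutes=2): 21.7}
hazium = {start + timedelta(minutes=1): 0.3}

rows = align_to_grid(grid, building, hazium)
for row in rows:
    print(row)
```

Because every table shares the same grid, a plain left join is enough and no full or right join is needed.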

However, Tableau did crash several times when creating the worksheets required to represent each floor/zone, and by pattern/anomaly detection! The more worksheets that were created, the slower the rendering of dashboards - thus I had to split the visualization into parts. I wanted to keep them coherent, so I split by Employee Patterns, General Building Data, Building Data by Floors.

Tableau Server does not have the same features as Tableau Desktop - an important component was the Play and Pause button, which was supposed to play the animation of staff moving around the floor maps, allowing users to find unusual employee behavior patterns easily. In Tableau Server, one can only click on a specific time and use it like a filter. Furthermore, the large number of dimensions found in data sets such as this one would be better visualized in custom displays, such as a parallel coordinates chart: See screenshot, with various interactions which could help the user see relevant information either side-by-side, or open up the filtered information on the page itself.

The limitation of my current visualization is that it requires the user to:

  1. Be familiar with terms of the data set
  2. Analyze by floor, then by zone. This restricts other kinds of cross-analysis, such as reversing the analysis to go by zone or employee, and then to a specific building data chart.
  3. Navigate amongst many Tableau workbooks and worksheet pages due to performance constraints or limitations of Tableau server