Difference between revisions of "IS428 2016-17 Term1 Assign3 Margot Marie T Stelleman"

From Visual Analytics for Business Intelligence
Jump to navigation Jump to search
Line 89: Line 89:
 
#Peak of the reheat coil power on the 7th and 8th of June. Definitely something not normal happened that day. There was also a peak for the CO2 concentration these days in the afternoon till late.   
 
#Peak of the reheat coil power on the 7th and 8th of June. Definitely something not normal happened that day. There was also a peak for the CO2 concentration these days in the afternoon till late.   
 
#It seems that the hazium concentration is higher in concentrate offices as the zone 1, third floor and lower in zone were the turnover and passage is bigger like the zone 8A at the floor 1, the corridor. But we can not forget that the corridor is split in a lot of part so the hazium could actually go and spare in the other parts of the corridor, maybe the concentration is in total higher there.
 
#It seems that the hazium concentration is higher in concentrate offices as the zone 1, third floor and lower in zone were the turnover and passage is bigger like the zone 8A at the floor 1, the corridor. But we can not forget that the corridor is split in a lot of part so the hazium could actually go and spare in the other parts of the corridor, maybe the concentration is in total higher there.
#Peak of Hazium, specially at second floor on the 11th june.  Specially a lot coming from the zone 4 at level 2, offices.
+
#Peak of Hazium, specially at second floor on the 11th june.  Specially a lot coming from the zone 4 at level 2, offices.Which is interesting because no one was working in that zone that day.
  
 
==Task 4==
 
==Task 4==

Revision as of 10:34, 24 October 2016

Abstract

At the end of 2015, Abila, division of GAStech officials moved in new offices, near the old ones. It helps to rise the mood of employees but even some are still unhappy about this change. This new location has been designed and built to meet the highest energy efficiency standard but some HVAC issues are still there and need to be worked out. Hazium a possibly new dangerous chemical that forces the CEO to want new and very expensive sensors in each HVAC zones which divide the new offices. Sensors are able to inform about hazium concentration levels as well as the CO2 level but also temperatures, heating and cooling system status values for example. Because these sensors are really expensive, the company can afford only a small number of them. So not HVAC zones will have Hazium sensors. There are also new security procedures but the staff members are still not used and not very constant in their utilizations and respect of the rules. They must wear a prox card while in the building. These cards can be read by passive prox card readers that are each in a different zone of the building. The zones are not necessary the same as the HVAC ones. How does it work from now?

  • Each time a prox card enter a zone it is detected and recorded.
  • Most of the zones are still open to people not having their cards.
  • People are careless but a new prox card can be obtained at the security desk if the old one is mislaid.

There is also a free robotic mail delivery system, named Rosie. She moves periodically, able to move between floors. She has a mobile prox sensor identifying the prox card in the area where she is. All this card and sensors are producing a massive amount of data. Several questions are ask by the company and thanks to Visual Analytics and the large amount of data available, they will find an answer.

Motivation

I’m a visual analytics expert who’s been hired to help the company understand its operations data. After I had well understand the situation and my differents tasks by rewriting the project task as an abstract and summary, the next step is to prepare the data to finally analyze them and deliver some patterns.

Data preparation and cleaning

Before the data could give us patterns, stories and been used, we need to prepare them and clean them.

Creating a new id column to link Employee list, proxMobileOut-MC2 and proxOut-MC2 files

To answer the first question, we’ll need information about employees and prox card data. The files we will use are the Employee list file on Excel and the proxMobileOut-MC2 and proxOut-MC2 files. The information related to employees in these two last files is the prox-id which is not appearing in the Employee list. Since we want to interpret information from all the three files together we need to create it in the Employee list so they can be linked. When we look carefully at the name of the employees we understand fast that the prox-id is constructed as followed: (all in lower case) the first letter of the first name + the entire last name + “00x” where x is the number of cards that the employee asked since he can ask for a new one when he lost it. This last part with the numbers is complicated to standardize so in each file there will be a column “id” with in lower case the first letter of the first name + the entire last name. For example, in the employee list, Bramar Mat will be “mbramar” and construct with this formula (in French): =MINUSCULE(CONCATENER(STXT(C2;1;1);B2)) where Minuscule is for lower case, stxt to take the fist letter of the first name, and concatener to add to the first letter the last name. In the prox files, a new column id is created by removing the 3 last characters of the column prox-id, using the formula: =GAUCHE(C2;NBCAR(C2)-3). But attention, the work is not finished. Some people have to last name like “Mies Haber” so we need to check and remove the blank space in the new id or when a hyphen, remove the second last name. There is some exceptions as Sanjorge Jr. Sten which will be jsanjorge and not ssanjorge.

Creating coordinates for ProxOut-MC2

For the question 1, we would like to know the typical day of an employee. For that we are going to use the prox data, to know at what time is going to start his day or to know what he is doing during the day regarding how is moving in the building. In the ProxOut-MC2, we are able to know the floor and the zone but to have the visualization as its best we want to use a map, since that we need to have the coordinates of the zone to be able to put it on the map. The other file, ProxMobileOut-MC2 has already the coordinates. So we want to know the coordinates of our different zones, Tableau can help us to do it once we have import and can display the maps of each floor. In data visualization section, this is explained. Once this step is done, we can use the annotation function to know the coordinates of a point. (http://onlinehelp.tableau.com/current/pro/desktop/en-us/help.htm#annotations_annotations_add.html ) For each zone of each floor I just select a point inside more or less in the middle of the zone and annotate then report the coordinates in an excel file. Sometimes some zones are split and/or are not a regular rectangle but it is always the same type of rooms such as for the zones 4 which are the elevators or stairs. Each zone has a number but at floor 3 one zone has no number but is called “Server Room”, we also need to assign her coordinates. The final step is to join the ProxOut-MC2 file with the coordinates of each zone that I put in an excel file. I have used JMP to do it since Excel does not allow it to do so and not as easy. Both tables are joined using Floor=Floor and Zone=Zone. Basically we had 29763 lines, after joining, we have 29513 lines left. Server room could be not recognized and needed to be change as a number the zone is then now at floor 3 known as zone 9. After this change, we still have some lines missing since we have in total 29542 lines. I took a lot of time to know which ones and why they were missing but it is still a mystery. Every zone since to have data so it has been take as assumption that it could not matter since the number of data is very high.

Image1.png
Organizing the building data

When we open the building data file, we are literally lost. There is too much and the data are not organized easily for us to understand and access the information fast. Visualization will definitely help the user to see and search in an effective way. But to have an effective visualization, the data must be organized in a different way on Excel, first. Three major changes need to be done.

  1. Split the data in three different major categories: Some data are for the entire building, others are for one particular zone of a particular floor and the rest is for an entire particular floor. We can split the data in three different sheets according to the space that is taking the measure for. We will have then one sheet for the building, one for the entire floors and one for all the data by zone for a floor.
  2. Changing tables organization for filtering the concentration measurements: if the file is left like that each measurement is a title of a column. Since that the user won’t be able to use interactively the data. We need to drop each column to the dashboard for them to appear. BUT if the measurements are in one and only column call “measurement”, then the user will be able to select them interactively as they could select the floor or the zone for example. To be able to do that, I put my split file in JMP and using the function STACK allow me to put my labels in a column with each appropriate time and value. It has been done for each split part of the big original file.
  3. Creating columns for floor and zone for filtering by localization: In prox-card data, we have for each row a particular floor column and a particular zone column. It allows us to filter and to get into the data, be more precise and accurate AND allow good interactivity. We want to use the same principle for the concentration measurements. For the new floor sheet/file, on JMP we use in Columns>Utilities>Text in Columns to separate the new “measurement” column once using “_” as separator and next “ ” as second time, we delete the column we don’t need and rename the one we need. But because we separate the column with a blank space some we will need to be recombine to form again the name of the measurement. Similar process is done for the zone by floor file.
Computing the Hazium concetration file

All the hazium files have been merged in one in excel and then the similar process for organizing the data than for building data has been followed.


Data Visualization

How to import and display our own map and multiple floors on Tableau?

I knew that by changing the X and Y columns in real coordinates as latitude (Y) and longitude (X) tableau will be able to display the real map of the world but that was not the purpose here. We wanted to insert our own map of the building to be able to visualize the movement and the activity of the employees. Tableau has a lot of helpful tutorials. Here is the one I found to help me and be able to see the building map as background. (https://onlinehelp.tableau.com/current/pro/desktop/en-us/bkimages_add.html ) A challenge was to know how to be able to show the floor 1 map when it was only the data of the floor 1 so be able to display the 3 floors and not only one. For that we needed to go in the options when adding a new picture and there adding a “only show when” condition. My floor variable was ranked as a measure so I needed to change it into a dimension one to be able to do a such condition. The final result is like that:

Image2.png

All the images needed to be cropped and it has been done easily on Paint. We must specify on Tableau that X is going from 0 to 189 and Y from 0 to 111. X and Y must be also changed after as dimension to be all displayed on the building map. Floor will be a filter and we don’t want to show all the floor at the same time but floor by floor so we click right, customize and unselect “all”.

Trouble with the prox-card dates when importing on Tableau

When I was importing my prox-card data file, systematically the dates with a day number lower than a number of a month so lower than 13 was interpreting like a month. That's why in my visualization, we are dealing with all the 6 january, 6 february, for example. I tried to modify the format on excel but even with a regular one, the dates are systematically interepreted as American. Because of that, the visualization is very less accurate and it is a big trouble, I should say that is not clear for the users. If anyone of the class has a suggestion, let me hear of it.

My dashboards

https://public.tableau.com/views/Assignment3VF/Tableaudebord5?:embed=y&:display_count=yes

Task 1

What are the typical patterns in the prox card data? What does a typical day look like for GAStech employees?

To answer this task, the files Employee List, ProxOut-MC2 and ProxMobileOut-MC2 have been used and before the two prox-card data have been concatenate and then this file has been join with employee list with the RECHERCHEV function (VLOOKUP in English).

  1. Emile Arpa from the Facilities is the only person staying during the night.
  2. Only one or two people have been working (either from Administration or Executive) on sunday and only between 8am and 11am.
  3. Security is only there between 7am and 5pm.
  4. People from Engineering, Information Technology and Facilities are the one working later than the others departments that have more a regular day like 7am-4pm.
  5. There is always people in the zone 1 and zone 4 which are the corridor and elevator/stairs.
  6. There is always a security person at the elevator zones excepting at 11am and 2pm.

Typical day:

  1. Starting work around 7-8am. But a lot of people are actually at 9 am in their offices.
  2. Around 11-12am, they are having lunch. After 12am there is no more people in the deli.
  3. Going back home around 4-5pm.

The use of the elevator regarding each department is confirming that also.

Task 2

Describe up to ten of the most interesting patterns that appear in the building data. Describe what is notable about the pattern and explain its possible significance.

  1. The HVAC electric demand power is higher during the weekend than during the weekdays. Which is quite surprising since the aircon for example should be more used when people are actually working.
  2. The DELI-fan power is getting as its lowest level around 6-7 am and back at its lowest level around 5-9pm. It is surprising that the fan is working a lot before 6 am. More surprising it will be low on Saturday but particularly high on the Sunday but no one is working.
  3. The VAV SYS AIR LOOP INLET Mass flow rate is higher in the third floor and lowest in first floor probably because the heat goes up. It is surprisingly low on the 7th and 8th of June. Why?
  4. The temperature of the previous point is then lower in the third floor. And will be surprisingly low on the 7th and 8th of June. Why? I don’t understand the words and the measurement, I’m not able to give a precise explanation.
  5. Cooling cool power higher for the third floor
  6. Equipment power very high for the third floor, specially for the zone number 9. It is normal because it is the server room. And we that during the weekend the equipment power is getting a little bit lower but the drop is not coming from the server zone at all.
  7. The Deli is using the most the light power. It’s at the first floor and this floor is using even lights during the night.
  8. CO2 concentration higher at the second floor. Specially coming from training room and offices.
  9. Supply outlet mass flow rate very high for third floor and specially for weekends. Coming from the server room.
  10. Thermostat heating setpoint is getting higher and then lower while thermostat cooling setpoint is getting lower and then higher.

Task 3

Describe up to ten notable anomalies or unusual events you see in the data. Prioritize those issues that are most likely to represent a danger or a serious issue for building operations.

  1. All the concentration are very low during the day for the 7th and 8th of June
  2. The fan supply/power is very high during the night around 4am, and very high on the weekend on the third floor. Why using the fan and then electricity when people are not there?
  3. Someone has been using the light at the second floor during the weekend of the 11-12 june between 7am and 1pm.
  4. Ventilation going lower usually during the weekend but particularly high on the Sunday 12th June.
  5. Peak of the reheat coil power on the 7th and 8th of June. Definitely something not normal happened that day. There was also a peak for the CO2 concentration these days in the afternoon till late.
  6. It seems that the hazium concentration is higher in concentrate offices as the zone 1, third floor and lower in zone were the turnover and passage is bigger like the zone 8A at the floor 1, the corridor. But we can not forget that the corridor is split in a lot of part so the hazium could actually go and spare in the other parts of the corridor, maybe the concentration is in total higher there.
  7. Peak of Hazium, specially at second floor on the 11th june. Specially a lot coming from the zone 4 at level 2, offices.Which is interesting because no one was working in that zone that day.

Task 4

Describe up to five observed relationships between the proximity card data and building data elements. If you find a causal relationship (for example, a building event or condition leading to personnel behavior changes or personnel activity leading to building operations changes), describe your discovered cause and effect, the evidence you found to support it, and your level of confidence in your assessment of the relationship.

  1. The total electricity demand is getting higher as soon as people arrive to work and getting lower as soon as people are going back home. The same appears for the water heater gas rate and actually for a lot of the attributes. Which is logical since people need electricity to work and also hot water for the kitchen for example or to wash their hands. And I don’t know about the technics but if the hot water is used to heat it makes sense also.
  2. Someone has been using the light at the second floor during the weekend of the 11-12 june between 7am and 1pm. It is because one person has been working at this hours during the weekend.

Excuse me for the poor explanations but I had a lot of trouble translating all the different terms used.

Tools used and their effectiveness

JMP: to clean and reorganize the data. But JMP to reorganize the data is much more “fluent” than Excel. In the way that we can easily split a text column, is this a function translated into a button, we don’t need to find a long formula to be able to do it. Plus, JMP can deal with much more data than Excel. On Excel, the data of zone by floor needed to be split in two separate sheet and it takes a long time to open the files. JMP is also more effective to join different tables and reshape the data with stack function, for example, which is very handful.

Excel: to clean and reorganize the data

Tableau: really good to display data on custom map and moreover, for interactivity and specially a good point for Tableau Public which allows to share our work easily and keeping the interactive way to play with data. BUT Tableau has a lot of trouble with uploading excel sheets. For example, I have tried several times to upload my prox-card data and Tableau was taking an old version of the file and not the actual one.

References

http://onlinehelp.tableau.com/current/pro/desktop/en-us/help.htm#annotations_annotations_add.html

https://onlinehelp.tableau.com/current/pro/desktop/en-us/bkimages_add.html

Comments