ChenNannan-Data preparation

From Visual Analytics and Applications

Binrndc.jpg ISSS608 Assign ChenNannan-MC2

Data Quality Issues

The data set contains no missing values.

Cdn1.png

At least 2.5% of the records have a value of 0, which is not meaningful.

Cdn2.png
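The two checks above (missing values and the share of zeros) can be sketched in a few lines. This is only an illustration in plain Python, not the code behind the screenshots; the helper names and the sample numbers are made up.

```python
def missing_count(values):
    """Count entries that are missing (represented here as None)."""
    return sum(1 for v in values if v is None)


def zero_share(values):
    """Fraction of the non-missing entries that are exactly 0."""
    present = [v for v in values if v is not None]
    return sum(1 for v in present if v == 0) / len(present)


# Made-up readings, for illustration only.
values = [0.0, 1.2, 3.4, 0.0, 5.6, 7.8, 2.1, 0.9]

print(missing_count(values))
print(zero_share(values))
```

A zero share well above what the measuring process could plausibly produce is a hint that 0 is being used as a placeholder rather than as a real reading.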

The years 1998 and 1999 were imported incorrectly. The time series should range from 1998 to 2016.

Cdn3.png
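One way to spot this kind of import problem is to count records per year and flag any year outside the expected 1998 to 2016 range. The sketch below uses plain Python. The page does not state what the wrong years looked like, so the out-of-range dates here are invented purely for illustration.

```python
from collections import Counter
from datetime import date

# Expected range of the time series, as stated in the text (1998 to 2016).
VALID_YEARS = range(1998, 2017)


def years_out_of_range(dates):
    """Return {year: record count} for every year outside VALID_YEARS."""
    counts = Counter(d.year for d in dates)
    return {y: n for y, n in sorted(counts.items()) if y not in VALID_YEARS}


# Invented dates: the first two stand in for mis-imported records.
dates = [date(2098, 1, 11), date(2099, 3, 2), date(2005, 6, 30), date(2016, 12, 1)]

print(years_out_of_range(dates))
```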

Some records with the same location, sample date, and measure have different values.

Cdn4.png
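A minimal sketch of how such conflicting records could be found, in plain Python: group the rows by location, sample date, and measure, then keep the keys that carry more than one distinct value. The field names and rows are assumptions for illustration, not taken from the data set.

```python
from collections import defaultdict


def conflicting_keys(rows):
    """Map each (location, sample_date, measure) key to its distinct values,
    keeping only keys that have more than one distinct value."""
    seen = defaultdict(set)
    for r in rows:
        seen[(r["location"], r["sample_date"], r["measure"])].add(r["value"])
    return {k: sorted(v) for k, v in seen.items() if len(v) > 1}


# Made-up rows: the first two share a key but disagree on the value.
rows = [
    {"location": "A", "sample_date": "2005-01-11", "measure": "Iron", "value": 2.0},
    {"location": "A", "sample_date": "2005-01-11", "measure": "Iron", "value": 4.0},
    {"location": "B", "sample_date": "2005-01-11", "measure": "Iron", "value": 1.5},
]

print(conflicting_keys(rows))
```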

Data Preparation

Recode the sample date.

Cdn5.png
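The recoding step might look like the sketch below, written in plain Python. The page does not say which wrong values the years 1998 and 1999 were imported as, so the mapping `YEAR_FIXES` is a hypothetical placeholder that would have to be replaced with the values actually observed in the data.

```python
from datetime import date

# Hypothetical mapping from a wrongly imported year to the intended year.
# These wrong values are NOT taken from the source; adjust them to the data.
YEAR_FIXES = {2098: 1998, 2099: 1999}


def recode_date(d, fixes=YEAR_FIXES):
    """Return d with its year corrected if it appears in the fixes mapping."""
    return d.replace(year=fixes.get(d.year, d.year))


print(recode_date(date(2098, 1, 11)))
print(recode_date(date(2005, 6, 30)))
```

Dates whose year is not in the mapping pass through unchanged, so the function can be applied to the whole column.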

Use a summary function to collapse duplicate records into their mean.

Cdn6.png
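As an illustration of this step, the following plain-Python sketch averages all values that share the same location, sample date, and measure, leaving one record per key. The field names and rows are assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean


def mean_by_key(rows):
    """Collapse rows sharing (location, sample_date, measure) into their mean."""
    groups = defaultdict(list)
    for r in rows:
        groups[(r["location"], r["sample_date"], r["measure"])].append(r["value"])
    return {key: mean(vals) for key, vals in groups.items()}


# Made-up rows: the first two are duplicates of the same key.
rows = [
    {"location": "A", "sample_date": "2005-01-11", "measure": "Iron", "value": 2.0},
    {"location": "A", "sample_date": "2005-01-11", "measure": "Iron", "value": 4.0},
    {"location": "B", "sample_date": "2005-01-11", "measure": "Iron", "value": 1.5},
]

print(mean_by_key(rows))
```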

'Dcast' the data (reshape it from long to wide format).

Cdn7.png
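The name 'Dcast' suggests the long-to-wide reshaping function found in R packages such as reshape2. The sketch below shows the same idea in plain Python rather than R, assuming the wide table has one row per location and sample date and one column per measure; that layout and all names are assumptions.

```python
def cast_wide(rows):
    """Reshape long rows into {(location, sample_date): {measure: value}}."""
    wide = {}
    for r in rows:
        key = (r["location"], r["sample_date"])
        wide.setdefault(key, {})[r["measure"]] = r["value"]
    return wide


# Made-up long-format rows, already free of duplicates.
rows = [
    {"location": "A", "sample_date": "2005-01-11", "measure": "Iron", "value": 3.0},
    {"location": "A", "sample_date": "2005-01-11", "measure": "Zinc", "value": 0.2},
    {"location": "B", "sample_date": "2005-01-11", "measure": "Iron", "value": 1.5},
]

print(cast_wide(rows))
```

Duplicates need to be resolved before this step, because a later row with the same key and measure would silently overwrite an earlier one.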

Standardize the values within each measure, because the measures are recorded in different units.

Cdn8.png
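The page does not say which standardization was used. A common choice is the z-score (subtract the measure's mean and divide by its standard deviation), and the plain-Python sketch below assumes that choice. The measure names and numbers are made up.

```python
from statistics import mean, stdev


def standardize(columns):
    """Z-score each measure's values: (x - mean) / sample standard deviation.

    Each measure needs at least two values; a constant column maps to 0.0.
    """
    result = {}
    for measure, values in columns.items():
        m, s = mean(values), stdev(values)
        result[measure] = [(x - m) / s if s else 0.0 for x in values]
    return result


# Made-up columns on very different scales.
columns = {"Iron": [1.0, 2.0, 3.0], "Zinc": [10.0, 20.0, 30.0]}

print(standardize(columns))
```

After this step both measures are on the same unitless scale, so they can be compared or plotted together.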

'Melt' the data (reshape it back from wide to long format).

Cdn9.png
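'Melt' is the reverse reshaping, from wide back to long. A plain-Python sketch of the idea follows, assuming the wide table is keyed by location and sample date with one entry per measure; the structure and names are assumptions for illustration.

```python
def melt_long(wide):
    """Reshape {(location, sample_date): {measure: value}} into long rows."""
    return [
        {"location": loc, "sample_date": day, "measure": measure, "value": value}
        for (loc, day), measures in wide.items()
        for measure, value in measures.items()
    ]


# Made-up wide table.
wide = {
    ("A", "2005-01-11"): {"Iron": 3.0, "Zinc": 0.2},
    ("B", "2005-01-11"): {"Iron": 1.5},
}

for row in melt_long(wide):
    print(row)
```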