ANLY482 AY2016-17 T2 Group10 Project Overview: Methodology
Revision as of 18:01, 21 February 2017 by Jxsim.2013
Data Preparation
Data preparation involves cleaning, transformation, and integration. These are standard procedures for reconciling datasets that differ in format, contain data-entry errors, and are recorded at different levels of granularity. We will first examine each data file and determine the best way to standardize its format, then aggregate the more granular data so that the files can be integrated.
MCCP
Invoice Details
Data Cleaning
A brief scan of the entire Invoice Details data table identified three main areas to be cleaned:
- Missing values in Price$ column
- Negative values in Sales Qty and Amount$ columns
- Some Postal Code values with only 5 digits (because they start with 0, which was dropped)
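The three checks listed above could be sketched in Python roughly as follows. This is only an illustration, not the procedure the team used: it assumes the table has been read into a list of dictionaries keyed by the column names mentioned above, with every value held as text. The function names `scan_invoice_rows` and `pad_postal_code` are hypothetical, and padding to six digits is an assumption inferred from the note that the 5-digit codes start with 0.

```python
from typing import Dict, List, Optional

Row = Dict[str, Optional[str]]


def _is_negative(value: Optional[str]) -> bool:
    """True if the text parses as a number below zero; blanks and non-numbers are not negative."""
    try:
        return float(value) < 0
    except (TypeError, ValueError):
        return False


def scan_invoice_rows(rows: List[Row]) -> Dict[str, List[int]]:
    """Return, for each of the three issues, the indices of the rows that show it."""
    issues: Dict[str, List[int]] = {
        "missing_price": [],
        "negative_qty_or_amount": [],
        "short_postal_code": [],
    }
    for i, row in enumerate(rows):
        # 1. Missing values in the Price$ column
        price = row.get("Price$")
        if price is None or price.strip() == "":
            issues["missing_price"].append(i)

        # 2. Negative values in the Sales Qty or Amount$ columns
        if _is_negative(row.get("Sales Qty")) or _is_negative(row.get("Amount$")):
            issues["negative_qty_or_amount"].append(i)

        # 3. Postal codes that are all digits but only 5 characters long
        postal = (row.get("Postal Code") or "").strip()
        if postal.isdigit() and len(postal) == 5:
            issues["short_postal_code"].append(i)
    return issues


def pad_postal_code(code: str) -> str:
    """Restore a dropped leading zero (assumption: valid codes have 6 digits)."""
    return code.strip().zfill(6)
```

Keeping Postal Code as text rather than a number is the main design choice here: once the column is parsed as an integer, the leading zero is lost again. The sketch only flags the missing and negative values, since how they should be treated depends on what they represent in the data.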