Difference between revisions of "Kiva Project Findings Final"

From Analytics Practicum
Line 47: Line 47:
  
 
<!--Content-->
 
<!--Content-->
==<div style="background: #FFD700; line-height: 0.3em; border-left: #B22222 solid 13px;"><div style="border-left: #FFFFFF solid 5px; padding:15px;"><font face ="Elephant" color= "black" size="3">Data Cleaning</font></div></div>==
+
==<div style="background: #FFD700; line-height: 0.3em; border-left: #B22222 solid 13px;"><div style="border-left: #FFFFFF solid 5px; padding:15px;"><font face ="Elephant" color= "black" size="3">Area of study</font></div></div>==
 
<div style="height: 1em"></div>
 
<div style="height: 1em"></div>
 
<div><font face="Arimo" size="4">
 
<div><font face="Arimo" size="4">
 +
 
===Missing Value===
 
===Missing Value===
 
[[Image:G22F1.png|900px]]<br/>
 
[[Image:G22F1.png|900px]]<br/>
Line 56: Line 57:
 
The screenshot above of loan_themes_by_region.csv shows that the old geocode column for the Kiva regions has many missing values (14536 out of 15736 records). As there are far too many missing values in this column to derive any meaningful information regarding shifts in location regions for particular loan themes, we removed this column entirely.<br/>
 
The screenshot above of loan_themes_by_region.csv shows that the old geocode column for the Kiva regions has many missing values (14536 out of 15736 records). As there are far too many missing values in this column to derive any meaningful information regarding shifts in location regions for particular loan themes, we removed this column entirely.<br/>
  
[[Image:G22F2.png|600px]]<br/>
 
<small>Figure 2: Screenshot of kiva_mpi_region_locations</small><br/>
 
  
The table above shows the erroneous records of kiva_mpi_region_locations. These records have no location name (the main identifier/primary key for this table) and are missing every other column except the geocode, which contains only (1000.0,1000.0) values that are not actual geocodes. Hence, all 1788 of these rows, which had no useful information, were removed.
+
==<div style="background: #FFD700; line-height: 0.3em; border-left: #B22222 solid 13px;"><div style="border-left: #FFFFFF solid 5px; padding:15px;"><font face ="Elephant" color= "black" size="3">Analysis</font></div></div>==
===Redundant data===
+
<div style="height: 1em"></div>
We also removed some data that either repeated information from other columns or was erroneous. In the file kiva_mpi_region_locations.csv, there was a column geo that combined the latitude and longitude. As the individual columns were more useful for our analysis, we removed the geo column.
+
<div><font face="Arimo" size="4">
 +
 
 +
===Kernel Density Analysis===
 +
===Spatial Autocorrelation Analysis===
 +
====Methodology====
 +
====Area Analysis of Visayas ====
  
[[Image:F3.png|600px]]<br/>
 
<small>Figure 3: Screenshot of kiva_mpi_region_locations.csv with geo column having duplicate information</small><br/>
 
  
Lastly, we removed a single invalid record from kiva_loans.csv, where the funded time was before the posted time. This should never be the case, as a loan cannot be funded before it is posted.
 
  
 +
==<div style="background: #FFD700; line-height: 0.3em; border-left: #B22222 solid 13px;"><div style="border-left: #FFFFFF solid 5px; padding:15px;"><font face ="Elephant" color= "black" size="3">Reference</font></div></div>==
 +
<div style="height: 1em"></div>
 +
<div><font face="Arimo" size="4">
  
 
</font></div>
 
</font></div>
 
<div style="height: 2em"></div>
 
<div style="height: 2em"></div>
 
<!--/Content-->
 
<!--/Content-->

Revision as of 15:24, 15 April 2018


 



Area of study

Missing Value

G22F1.png
Figure 1: Screenshot of loan_themes_by_region.csv

The screenshot above of loan_themes_by_region.csv shows that the old geocode column for the Kiva regions has many missing values (14536 out of 15736 records, roughly 92%). As there are far too many missing values in this column to derive any meaningful information regarding shifts in location regions for particular loan themes, we removed this column entirely.
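
The missing-value check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure the team actually ran: the column name geocode_old, the 0.5 threshold, and the four sample rows are hypothetical stand-ins for the real file.

```python
import csv
import io

# Hypothetical stand-in for loan_themes_by_region.csv. The column name
# "geocode_old" and the rows below are made up for illustration only.
SAMPLE_CSV = """region,geocode_old,lat,lon
Region A,,1.0,2.0
Region B,"(3.0, 4.0)",3.0,4.0
Region C,,5.0,6.0
Region D,,7.0,8.0
"""


def missing_fraction(rows, column):
    """Return the share of rows whose value in `column` is empty or absent."""
    if not rows:
        return 0.0
    missing = sum(1 for row in rows if not (row.get(column) or "").strip())
    return missing / len(rows)


def drop_sparse_columns(rows, threshold):
    """Return copies of the rows without columns whose missing share exceeds `threshold`."""
    if not rows:
        return rows
    sparse = {col for col in rows[0] if missing_fraction(rows, col) > threshold}
    return [{k: v for k, v in row.items() if k not in sparse} for row in rows]


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
ratio = missing_fraction(rows, "geocode_old")    # 3 of the 4 sample rows are empty
cleaned = drop_sparse_columns(rows, threshold=0.5)
```

Applied to the real file, the same ratio would be 14536 / 15736, which is about 0.92, so the column would be dropped under any threshold below that value. The threshold itself is a judgment call and is not stated in the original write-up.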


Analysis

Kernel Density Analysis

Spatial Autocorrelation Analysis

Methodology

Area Analysis of Visayas

Reference