Difference between revisions of "Group14 proposal"
Revision as of 10:49, 2 March 2020
Motivation and Objectives
Industries worldwide face fierce competition. With the development of telecom technology and social media, telecommunications (telco) companies play an increasingly important role in society, and the number of wireless carriers keeps growing. The U.S. alone has four main wireless carriers and many smaller ones, so it is no surprise that companies in this industry compete intensely. Under these conditions, the most significant problem for these organizations is customer retention. Telco companies often have a customer service department whose goal is to win back customers who have churned, because it is generally acknowledged that recovering a long-term customer can be worth much more to a company than acquiring a new one. To understand more directly the main factors that affect customer churn, and to better maintain relationships with customers, we will build models to select and visualize the important variables. Lastly, we will compare the models on recall, accuracy, precision and F1 score to evaluate their performance.
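The four comparison metrics can all be derived from a model's confusion matrix, treating churn as the positive class. The following is a minimal sketch in Python, used here for illustration only. The function name and the counts are hypothetical and are not results from this project.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts.

    tp: churners predicted as churners      fp: non-churners predicted as churners
    fn: churners predicted as non-churners  tn: non-churners predicted as non-churners
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a model predicts no positives
    # or when the data contains no positives.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts for one model, chosen only to illustrate the arithmetic.
metrics = classification_metrics(tp=30, fp=10, fn=20, tn=140)
```

Because churners are usually the minority class, accuracy alone can look high even when many churners are missed, which is why recall and F1 are compared alongside it.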
Critique of Existing Visualization
Data Source
Data Description
We collected the dataset from the IBM Community. The dataset contains five spreadsheets covering customers' demographics, location, population, services and status. Demographics covers each customer's gender, age range, and whether they have a partner or dependents. Location gives the customer's detailed location, such as country and city. Status records whether the customer churned and the reason for churning. There are 7043 records in the dataset, and each customer is identified by the Customer_ID column. There are 42 columns with 40 attributes. The Churn_Value column flags customers who left within the last month: churned customers are recorded as 1 and non-churned customers as 0.
Data Fields | Description | Example | Datatype |
---|---|---|---|
Customer ID | Unique ID that identifies each customer | 7590-VHVEG | Text |
gender | Whether the customer is a male or a female | Female | Binary |
SeniorCitizen | Whether the customer is a senior citizen or not (1, 0) | 0 | Binary |
Partner | Whether the customer has a partner or not (Yes, No) | Yes | Binary |
tenure | Number of months the customer has stayed with the company | 1 | Numeric |
PhoneService | Whether the customer has a phone service or not (Yes, No) | No | Binary |
MultipleLines | Whether the customer has multiple lines or not (Yes, No, No phone service) | No phone service | Categorical |
InternetService | Customer’s internet service provider (DSL, Fiber optic, No) | DSL | Categorical |
OnlineSecurity | Whether the customer has online security or not (Yes, No, No internet service) | No | Categorical |
OnlineBackup | Whether the customer has online backup or not (Yes, No, No internet service) | No | Categorical |
DeviceProtection | Whether the customer has device protection or not (Yes, No, No internet service) | No | Categorical |
TechSupport | Whether the customer has tech support or not (Yes, No, No internet service) | No | Categorical |
StreamingTV | Whether the customer has streaming TV or not (Yes, No, No internet service) | No | Categorical |
StreamingMovies | Whether the customer has streaming movies or not (Yes, No, No internet service) | No | Categorical |
Contract | The contract term of the customer (Month-to-month, One year, Two year) | Month-to-month | Categorical |
PaperlessBilling | Whether the customer has paperless billing or not (Yes, No) | Yes | Binary |
PaymentMethod | The customer’s payment method (Electronic check, Mailed check, Bank transfer (automatic), Credit card (automatic)) | Electronic check | Categorical |
MonthlyCharges | The amount charged to the customer monthly | 29.85 | Numeric |
TotalCharges | The total amount charged to the customer | 29.85 | Numeric |
Churn | Whether the customer churned or not (Yes or No) | No | Binary |
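As a rough illustration of how these fields might be read and typed before modelling, the sketch below parses a small CSV sample and maps the Churn field (Yes/No) to the 1/0 coding described for Churn_Value. It is written in Python for illustration only. The header names, the helper functions and the second and third rows are assumptions made for this sketch; only the first row mirrors the Example column of the table.

```python
import csv
import io

# Illustrative sample: row 1 follows the Example column of the table above;
# rows 2 and 3 are made up for this sketch and are not taken from the dataset.
SAMPLE_CSV = """Customer ID,tenure,Contract,MonthlyCharges,Churn
7590-VHVEG,1,Month-to-month,29.85,No
0000-AAAAA,2,Month-to-month,53.85,Yes
0000-BBBBB,34,One year,56.95,No
"""


def load_rows(text):
    """Parse CSV text and cast each field to the datatype listed in the table."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        rows.append({
            "customer_id": raw["Customer ID"],               # identifier, kept as text
            "tenure": int(raw["tenure"]),                    # months with the company
            "contract": raw["Contract"],                     # categorical
            "monthly_charges": float(raw["MonthlyCharges"]), # numeric
            "churn_value": 1 if raw["Churn"] == "Yes" else 0,  # Yes -> 1, No -> 0
        })
    return rows


def churn_rate(rows):
    """Return the share of customers whose churn_value is 1."""
    return sum(r["churn_value"] for r in rows) / len(rows)


rows = load_rows(SAMPLE_CSV)
rate = churn_rate(rows)
```

Keeping the customer ID as text avoids treating an identifier such as 7590-VHVEG as a number, and the binary churn_value gives the models a numeric target.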
Methodology and Approach
Proposed R Packages
Team Members
References