Difference between revisions of "IS484 IS Project Experience (FinTech)"


Revision as of 19:15, 10 March 2020

Course Description:

  • This is an SMU-X course designed in collaboration with CitiVentures Innovation Lab. Citibank will supply a minimum of 5 project ideas to select from.
  • Students will form teams of 5 or 6, and select one of the Citibank project ideas to work on. Project selections do not need to be unique, meaning multiple teams can select the same project idea.
  • Each student project team will be assigned to a Citibank sponsor and an SMU faculty supervisor.
  • Citibank will provide project scope and management for student teams to have practical industry learning experiences.
  • Student teams will have weekly check-in meetings, either virtually or physically, with their Citibank sponsor.
  • Citibank will specify the technologies to be used, including development tools/languages, OS, database, third-party libraries, and target deployment environment (e.g. a cloud environment).

Project Timeline:

  • Week 1 - Attend orientation session, where teams will be formed (if you have not already formed a team), and projects are selected from a set of predefined projects provided by CitiVentures.
  • Week 8 - Midterm presentation and demo
  • Week 15 - Final presentation and demo

Project deliverable:

  1. Student project teams will be expected to develop a working software application prototype, to be delivered to Citibank at the end of the course.
  2. A formal, in-person midterm and final presentation will be facilitated by Citibank.

Citibank Projects

  • Projects to be selected/assigned to project teams during Week 1 orientation session.
Item Project Description Project Deliverables Project Sponsor/Stakeholders
1 Private Banking Client Dashboard - Citi Private Bank (CPB) Investment Counsellors and Advisors provide frequent consultation to HNWI and UHNWI (high and ultra-high net-worth individuals) on how to manage their investment portfolios. To do their job, they need high-speed access to a client's positions, real-time market data and publicly available sentiment on the portfolio's constituents. The portfolio is usually composed of capital market securities and various funds (hedge, mutual, real estate, private equity).

The work would entail coming up with investment solutions (capital market products and alternative funds) for use by qualified Investment Counsellors and Advisors. Careful thought needs to be put into providing an enriching UX/UI and leveraging machine/deep learning capability to provide robust recommendations. The users will use the information to service CPB's HNWI and UHNWI clients both proactively and reactively.

A working dashboard that provides a real-time view of a client's position. The view should be contextual, based on the type of holdings (Cash/Liabilities, Equity, Fixed Income, Derivatives and Alternative Investments). The view would give an instrument and profitability analysis based on market data (Bloomberg/Reuters). Furthermore, there will be a recommendation engine that looks at a client's current and past positions and suggests tradable ideas to the advisor based on upcoming announcements, trending public sentiment and the client's personal interests.
  • APAC Innovation Lead for CPB Investments
  • Investment Counsellor Team Lead
  • Head of APAC Investment Technology
2 Predictive Analysis of Risk Utilization - Citi's institutional clients place millions of orders on any given trading day through its electronic execution platforms. As orders come in through Citi's systems, they are evaluated against several risk parameters (such as credit limits) before the order is sent to the market. Currently, breaches of these parameters can be identified only at the moment the orders are placed; the next-generation risk management system requires predictive analytics of such breach events. This will enable Citi's clients and client-facing officials to prevent regulatory violations and navigate trading disruptions by proactively taking measures to prevent such breaches, for example by allocating funds or changing their trading strategy. Students executing this project will be expected to arrive at a machine learning solution that predicts imminent movement of the risk parameters based on historical trading patterns. The solution should be able to take a data feed of supplemental information (triple witching dates, FTSE/MSCI rebalancing, other events that affect the market such as the Coronavirus threat) to predict exceptional scenarios more accurately.

Tasks:

  • Understand Citi's current data model for storing historical data.
  • Build adapters to funnel data to a central data pool to run analytics on the data.
  • Analyze and find inflection data points and patterns.
  • Build a supplemental data feed to establish market sentiment in the system and use it to augment the prediction models.
  • Build a user interface/data conduit through which Citi clients/users can be notified of any breaches found.
  • TBD
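As a rough illustration of what "predicting imminent movement of a risk parameter" could mean at its simplest, the sketch below fits a straight line to recent utilization readings and estimates how many further readings remain before the line reaches the limit. This is only a naive baseline under stated assumptions, not the machine learning solution the project asks for: the function name `steps_until_breach`, the evenly spaced readings and the linear-trend assumption are all hypothetical choices made for this example.

```python
import math

def steps_until_breach(utilization, limit):
    """Estimate how many future readings remain before a linear trend hits the limit.

    `utilization` is a list of evenly spaced historical readings (hypothetical input).
    Returns 0 if the latest reading is already at or above the limit,
    and None if the fitted trend is flat or decreasing.
    """
    n = len(utilization)
    if n < 2:
        raise ValueError("need at least two readings to fit a trend")
    if utilization[-1] >= limit:
        return 0

    # Ordinary least-squares fit of utilization against the reading index 0..n-1.
    x_mean = (n - 1) / 2
    y_mean = sum(utilization) / n
    numerator = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(utilization))
    denominator = sum((i - x_mean) ** 2 for i in range(n))
    slope = numerator / denominator
    if slope <= 0:
        return None

    intercept = y_mean - slope * x_mean
    crossing_index = (limit - intercept) / slope  # index where the line reaches the limit
    return max(1, math.ceil(crossing_index - (n - 1)))
```

A real solution would replace the straight-line fit with a trained model and would take the supplemental event feed into account; the baseline is mainly useful as a reference point against which such a model can be compared.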
3 Machine Learning Model Performance - Machine learning models are trained on historical data. But in the commercial world, conditions change rapidly, which may leave the model biased with respect to new data as well as rescaled old data. Before the model is retrained, there is an immediate need to understand which levers can be applied to adjust the old model's output to reach the required accuracy, so that the business opportunity can be captured within a very short turnaround time. When models are unable to digest new data, they generate inaccurate recommendations and predictions for the business, resulting in missed opportunities for increased revenue.

A solution or program which can accomplish the following:
  • Detect the root cause of low accuracy with a given model input, model output and model binary.
  • Generate corrective recommendations to increase accuracy without re-building the model.
  • Perform regression testing with recommendations, to demonstrate the expected accuracy.
  • The program is expected to be able to analyse any supervised learning model for the given input and output.
  • TBD
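One possible starting point for the root-cause step is to check whether the model's inputs have drifted away from the data it was trained on. The sketch below ranks numeric features by how far their mean has moved, measured in standard deviations of the reference data. It is a minimal sketch under assumptions: the function names, the row-of-dicts input format and the choice of a mean-shift score are all hypothetical, and other drift measures could be substituted.

```python
from statistics import mean, pstdev

def mean_shift(reference, current):
    """Distance between the two means, in standard deviations of the reference values."""
    spread = pstdev(reference)
    if spread == 0:
        # A constant reference feature: any change at all counts as an unbounded shift.
        return 0.0 if mean(current) == mean(reference) else float("inf")
    return abs(mean(current) - mean(reference)) / spread

def rank_drifted_features(reference_rows, current_rows):
    """Return (feature, shift) pairs sorted from most shifted to least shifted.

    Both arguments are lists of dicts mapping feature name to a numeric value
    (a hypothetical input format chosen for this example).
    """
    features = reference_rows[0].keys()
    shifts = {
        name: mean_shift(
            [row[name] for row in reference_rows],
            [row[name] for row in current_rows],
        )
        for name in features
    }
    return sorted(shifts.items(), key=lambda item: item[1], reverse=True)
```

Features at the top of the ranking are candidates for corrective steps such as rescaling the input before it reaches the unchanged model, and the same ranking can be recomputed after the correction as part of regression testing.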
4 Customer Mailing Address Analysis - Addresses of people and businesses contain important information about them. To gain insight from addresses, more data about their locations is required. For example, population, geographic and economic indicators, crime rates, etc. can be helpful. We need to collect such information about countries and cities to make the addresses usable in models and other analytics.

A solution or program which can accomplish the following:
  • Collect information about countries from IMF data.
  • Collect information about cities from DBPedia data.
  • Build schedules to keep the above data fresh, as new data is available.
  • Make this data available for lookup by country and city names, to be used by models and analytics queries.
  • Generate an embedding of countries and an embedding of cities, to be used as features in models.
  • Unstructured addresses (where country, city are not marked separately, but part of large address text) need to be parsed before lookup.
  • Make this information available by joining the addresses of people and businesses with the collected data, using countries and cities as join keys.
  • Measure how much the model performance improves, after using this additional information.
  • TBD
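The parse, lookup and join steps for project 4 can be sketched as follows. This is a minimal illustration under assumptions: the tables `COUNTRIES` and `CITIES` and their field names and values are placeholders standing in for data that would be collected from IMF and DBPedia, and the whole-phrase matching used for unstructured addresses is a deliberately simple stand-in for a real address parser.

```python
import re

# Placeholder rows standing in for collected country and city data.
# The field names and values are made up for illustration only.
COUNTRIES = {
    "singapore": {"country_feature": 1.0},
    "france": {"country_feature": 2.0},
}
CITIES = {
    "paris": {"city_feature": 10.0},
    "singapore": {"city_feature": 20.0},
}

def normalize(text):
    """Lower-case and collapse whitespace so that lookups are case-insensitive."""
    return re.sub(r"\s+", " ", text).strip().lower()

def find_known_name(address, table):
    """Return the longest key of `table` that appears as a whole phrase in the address."""
    text = normalize(address)
    for name in sorted(table, key=len, reverse=True):
        if re.search(r"\b" + re.escape(name) + r"\b", text):
            return name
    return None

def enrich(address):
    """Join an unstructured address with country and city data, using the names as keys."""
    country = find_known_name(address, COUNTRIES)
    city = find_known_name(address, CITIES)
    features = {"country": country, "city": city}
    features.update(COUNTRIES.get(country, {}))
    features.update(CITIES.get(city, {}))
    return features
```

For example, `enrich("10 Rue de Rivoli, 75001 Paris, France")` yields the city and country keys together with both placeholder feature values, while an address with no known names yields `None` for both keys. Measuring the model improvement would then amount to training once with and once without these joined columns.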