Difference between revisions of "IS480 Team wiki: 2018T1 analyteaka research"
Revision as of 17:19, 17 August 2018
Under the PDPA’s guidance, we are not legally obligated to care for personal data. However, we will follow its best-practice tips:
1. Set out how the personal data in our custody can be well protected.
2. Classify the personal data to make housekeeping easier to manage.
3. Set clear timelines for the retention of the various personal data, and cease to retain documents containing personal data that is no longer required for business or legal purposes.
4. For transfers of personal data overseas, use contractual agreements with the organizations involved in the transfer to provide a comparable standard of protection overseas.
The classification below is based on our interpretation of Federal Information Processing Standards (FIPS) Publication 199, published by the National Institute of Standards and Technology (NIST), as stated by Carnegie Mellon University. It reflects the level of impact on the company if confidentiality, integrity or availability is compromised.
Potential Impact table
Security Objective | Low | Moderate | High |
---|---|---|---|
Confidentiality | Leakage of information could be expected to have a limited adverse effect on the company’s operation, assets or individuals. | Leakage of information could be expected to have a serious adverse effect on the company’s operation, assets or individuals. | Leakage of information could be expected to have a severe or catastrophic adverse effect on the company’s operation, assets or individuals. |
Integrity | Unauthorized modification or destruction of information could be expected to have a limited adverse effect on the company’s operation, assets or individuals. | Unauthorized modification or destruction of information could be expected to have a serious adverse effect on the company’s operation, assets or individuals. | Unauthorized modification or destruction of information could be expected to have a severe or catastrophic adverse effect on the company’s operation, assets or individuals. |
Availability | The disruption of the information or system could be expected to have a limited adverse effect on the company’s operation, assets or individuals. | The disruption of the information or system could be expected to have a serious adverse effect on the company’s operation, assets or individuals. | The disruption of the information or system could be expected to have a severe or catastrophic adverse effect on the company’s operation, assets or individuals. |
Based on the above tips and impact levels, we decided to split the data into five classes.
- Class 1 contains at least 2 high impacts
- Class 2 contains at least 1 high impact
- Class 3 contains at least 1 moderate impact
- Class 4 contains 0 impacts
- Class 5 contains 0 impacts and is easily accessible public data
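The class rules above can be sketched as a small function. This is only an illustration under stated assumptions: the function name `classify`, the level names and the `is_public` flag are hypothetical, the rules are checked in order from Class 1 downwards, and "0 impacts" is assumed to mean that no objective is rated above low.

```python
from collections import Counter

# Assumed vocabulary for impact levels; not defined in the original text.
VALID_LEVELS = {"none", "low", "moderate", "high"}


def classify(confidentiality, integrity, availability, is_public=False):
    """Return the data class (1 to 5) for one data item.

    Each argument is the impact level for one security objective.
    Rules are evaluated from the strictest class down, so the first
    match wins.
    """
    impacts = [confidentiality, integrity, availability]
    for level in impacts:
        if level not in VALID_LEVELS:
            raise ValueError(f"unknown impact level: {level!r}")

    counts = Counter(impacts)
    if counts["high"] >= 2:
        return 1
    if counts["high"] == 1:
        return 2
    if counts["moderate"] >= 1:
        return 3
    # Assumption: nothing above "low" counts as "0 impacts".
    return 5 if is_public else 4
```

For example, `classify("high", "high", "low")` falls into Class 1, while `classify("none", "none", "none", is_public=True)` falls into Class 5.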
Class Level | Description | Example | Action |
---|---|---|---|
1 | Highly confidential data | CVV code, credit card number | Never stored or processed. |
2 | Uniquely personally identifiable information | Fingerprints, eye scan, session token, NRIC, password | Never stored; processed and then discarded. |
3 | Personally identifiable information | DoB, email, address | Store only the hashed value. |
4 | Non-personally identifiable information | State, city, region, subzone | Can be stored as is. |
5 | Publicly available website content | Item details, category, item price | Can be stored as is. |
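The actions in the table could be applied at the point where data is written to storage. The sketch below is a minimal illustration, not our actual implementation: the function name `prepare_for_storage` and the `secret_key` parameter are hypothetical, and the choice of a keyed HMAC-SHA256 is an assumption (the table only says "hashed value"). A keyed hash is used here because fields such as a date of birth have few possible values and a plain hash of them is easy to guess by brute force.

```python
import hashlib
import hmac


def prepare_for_storage(value, class_level, secret_key):
    """Return what may be written to storage for a value of the given class.

    Returns None when the value must not be stored at all.
    """
    if class_level in (1, 2):
        # Class 1 is never stored or processed; Class 2 is processed
        # in memory and discarded. Neither reaches storage.
        return None
    if class_level == 3:
        # Assumption: keyed SHA-256 (HMAC) as the hash function.
        digest = hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256)
        return digest.hexdigest()
    if class_level in (4, 5):
        # Non-personal or public data can be stored as is.
        return value
    raise ValueError(f"unknown class level: {class_level!r}")
```

Because the same key and input always give the same digest, hashed Class 3 values can still be compared for equality (for example, to match records by email) without storing the original value.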
Data Analytics is the process of examining data sets in order to draw conclusions about the information they contain, increasingly with the aid of specialized systems and software.
Typical mechanisms: Database (only Data)
Typical timeframe: Offline
The outcome of analytics is informed business decisions, or the verification or disproval of scientific models, theories and hypotheses. The typical goals are to improve efficiency, optimize processes, increase revenues, etc.
The hardest part of an analytics project is asking the right question. As Robert Half once said, "Asking the right questions takes as much skill as giving the right answers."
Descriptive analytics | Insight into the past. Data operations: |
Predictive analytics | Understanding the future |
Prescriptive analytics | Advise on possible outcomes |
Based on the above, our modules are split into the following categories:
Descriptive:
- Customer Profile module
- Store Profile
- Staff Profile
Predictive:
- Machine Learning
Prescriptive:
- Data visualisation module
- Analytics and reporting module
What's Machine Learning?
In a nutshell, machine learning uses algorithms to parse data, learn from it, and then make a determination or prediction. To be specific, it’s a field of computer science that uses statistical techniques to give computer systems the ability to “learn” (i.e. progressively improve performance on a specific task) with data, without being explicitly programmed.
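The parse, learn and predict steps can be illustrated with a very small example. This is a toy sketch only: the data is made up, the names `fit_line` and `predict` are hypothetical, and ordinary least squares on a single feature stands in for whatever model a real project would use.

```python
def fit_line(xs, ys):
    """Learn slope and intercept from data using ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Make a prediction for a new input using the learned parameters."""
    slope, intercept = model
    return slope * x + intercept


# 1. Parse: turn raw text (made-up toy data) into numbers.
raw = "1,3\n2,5\n3,7\n4,9"
rows = [line.split(",") for line in raw.splitlines()]
xs = [float(x) for x, _ in rows]
ys = [float(y) for _, y in rows]

# 2. Learn: the relationship is estimated from the data, not hand-coded.
model = fit_line(xs, ys)

# 3. Predict: apply the learned model to an unseen input.
prediction = predict(model, 5.0)
```

Nowhere in the code is the rule relating the two columns written down; it is recovered statistically from the examples, which is what "without being explicitly programmed" refers to.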
Misconceptions
However, there are some misconceptions about machine learning.
- It's not logic-based; it's statistics-based.
- It's not a solution without proper understanding and expectations.
- AI vs. ML vs. deep learning (Source: http://larshorsbol.com/the-difference-between-machine-learning-deep-learning-and-artificial-intelligence/)
  - AI: Human intelligence exhibited by machines
  - ML: An approach to achieve AI
  - Deep learning: A technique for implementing machine learning
- Lastly, there's nothing new about the concept of machine learning (it has existed since as early as the 1950s). It has just become much more relevant due to the rise of IoT devices and the potential to store seemingly endless data.