Group01 Report

From Visual Analytics and Applications

Cybersecurity


Introduction

PUT YOUR CONTENT HERE

Objective and Motivations

PUT YOUR CONTENT HERE

Previous Works

PUT YOUR CONTENT HERE

Dataset and Data Preparation

PUT YOUR CONTENT HERE

Design Framework and Visualization Methodologies

PUT YOUR CONTENT HERE

Insights and Implications

It is impossible to tell from a single network connection whether it is an attack. The intent could be malicious, but it could equally be benign, for instance when a person mistypes a URL and thereby establishes a connection by mistake.

Diagnosing a cyber-attack therefore requires analysing network traffic over specific time intervals. An abnormal amount of traffic during an interval would call for further investigation and intervention. Based on the dataset explored in this project, intervals of one minute are a good starting point for detecting unusual network activity.
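The interval-based check described above can be sketched as follows. This is a minimal Python illustration, not the project's own code: the function name, the sample timestamps and the threshold rule (a minute is flagged when its count exceeds a multiple of the median per-minute count) are all assumptions made for this example.

```python
from collections import Counter
from datetime import datetime
from statistics import median


def flag_busy_minutes(timestamps, factor=5.0):
    """Return {minute: count} for minutes whose connection count exceeds
    `factor` times the median per-minute count (a hypothetical rule)."""
    # Truncate each timestamp to the start of its one-minute interval.
    per_minute = Counter(ts.replace(second=0, microsecond=0) for ts in timestamps)
    if not per_minute:
        return {}
    # Note: minutes with no connections at all are not part of the baseline.
    baseline = median(per_minute.values())
    return {m: c for m, c in sorted(per_minute.items()) if c > factor * baseline}


# Illustrative data only: light traffic plus a burst in a single minute.
sample = [datetime(2018, 1, 1, 18, m, s) for m in (37, 38, 39, 41) for s in (5, 35)]
sample += [datetime(2018, 1, 1, 18, 40, s) for s in range(60)]
flagged = flag_busy_minutes(sample)
```

A median baseline is used here because a single large burst barely shifts it. A real deployment would need a threshold tuned to the traffic being monitored.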

Cyber-attacks can happen at any time, on any day. Attackers can programme machines to carry out their attempts at specific times, or have them triggered by certain events (e.g. phishing emails where an unsuspecting user clicks on a link that then triggers the installation of malware). This dataset contains connections across all twenty-four hours of any given day. While the Iranian attack took place between 18:40 and 18:41 hours, it could just as easily have taken place in the early hours of the morning. It is therefore important for defenders to possess real-time capabilities to detect and fend off cyber-attacks.

Cyber-attacks are also becoming increasingly complex. While this dataset does not give a full account of the complexities involved, the high variability in the IP addresses and TCP/UDP ports of the connections already proved challenging for us to interpret and manipulate for analysis. Furthermore, the volume of cyber-attacks is growing as techniques become more sophisticated and the cost of machines falls. Hence defenders need to invest correspondingly in both powerful hardware and software to implement advanced cyber-defence techniques.


Limitations and Future Work

The app in its current iteration is not designed for real-time monitoring. Future work would include adapting the code to ingest real-time data and adding a loop that refreshes the analysis periodically. Each refresh would need to complete in less time than the interval over which network traffic is analysed for suspicious activity; otherwise the analysis would fall behind the incoming data. The app could also be deployed within a big data architecture that uses Apache Spark for analysis, a common choice, as the Spark engine comes with APIs for R. With enough data points, the dashboard could even be expanded to include a predictive module that anticipates where and when the next cyber-attack might take place.
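The periodic refresh and its timing constraint can be illustrated with a short sketch. It is written in Python for brevity and is not the app's code: `run_refresh_loop`, `fetch` and `analyse` are hypothetical names, and the injectable `clock` and `sleep` parameters exist only so that the loop can be exercised without waiting.

```python
import time


def run_refresh_loop(fetch, analyse, interval_s=60.0, cycles=3,
                     clock=time.monotonic, sleep=time.sleep):
    """Re-run `analyse` on freshly fetched data once per interval.

    Returns a list of (result, on_time) pairs; on_time is False when a
    refresh took at least as long as the interval it is meant to cover.
    """
    outcomes = []
    for _ in range(cycles):
        start = clock()
        result = analyse(fetch())
        elapsed = clock() - start
        outcomes.append((result, elapsed < interval_s))
        # Sleep only for whatever remains of the current interval.
        sleep(max(0.0, interval_s - elapsed))
    return outcomes


# Illustrative run: a stand-in data source and analysis, with a tiny interval.
demo = run_refresh_loop(lambda: [1, 2, 3], len, interval_s=0.01, cycles=2)
```

Recording whether each refresh finished on time makes it easy to notice when the analysis starts to lag behind the data, which would be the signal to shorten the analysis or lengthen the interval.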

The interactivity of the app could also be enhanced. More control elements could be implemented so that users can explore the data for themselves. For example, the current Sankey visualisation only allows users to examine the connections by source country (e.g. Iran) and does not allow them to select a time window to inspect.
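A time-window control of this kind amounts to one extra filter applied before the Sankey links are counted. The sketch below shows the idea in Python. The record fields (`time`, `src_country`, `target`), the function name and the sample rows are hypothetical and do not reflect the app's actual data schema.

```python
from collections import Counter
from datetime import datetime


def sankey_links(records, source_country, start, end):
    """Count connections per (source, target) pair for one source country
    within the half-open time window [start, end)."""
    links = Counter()
    for rec in records:
        in_window = start <= rec["time"] < end
        if rec["src_country"] == source_country and in_window:
            links[(rec["src_country"], rec["target"])] += 1
    return dict(links)


# Illustrative rows only; field names and values are assumptions.
records = [
    {"time": datetime(2018, 1, 1, 18, 40, 5), "src_country": "Iran", "target": "port 23"},
    {"time": datetime(2018, 1, 1, 18, 40, 30), "src_country": "Iran", "target": "port 23"},
    {"time": datetime(2018, 1, 1, 18, 40, 45), "src_country": "Iran", "target": "port 80"},
    {"time": datetime(2018, 1, 1, 9, 15, 0), "src_country": "Iran", "target": "port 80"},
    {"time": datetime(2018, 1, 1, 18, 40, 10), "src_country": "France", "target": "port 23"},
]
window_links = sankey_links(
    records, "Iran", datetime(2018, 1, 1, 18, 40), datetime(2018, 1, 1, 18, 41)
)
```

The resulting counts are what a Sankey diagram would use as link weights, so the same function could serve both the existing country selector and a new time-range control.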

While the Sankey and Network visualisations currently perform specific and distinct functions within the app, the information they convey to users could overlap. Hence future work would include revising the code to see whether either one could be omitted for an even simpler app interface.

Conclusion

This project attempts to tackle the complexity of cybersecurity and to visualise, in a meaningful and intuitive manner, suspicious connections that are highly likely to be actual attacks. That is not an easy task, given that cyber-attacks can take place at any time, from anywhere, at any intensity (e.g. number of connections) and in many different forms. Hence tools that help cybersecurity experts detect and defend against cyber-attacks need to be continually refined and upgraded. This project is a first step in that direction.
