Cyber Crime: How Machine Learning and the Confusion Matrix Help in Cyber Security

Shivangi Kesharwani
6 min read, Jun 6, 2021

Task 05:

Cyber Crime

A cyber crime is a criminal act that involves the use of a computer or another digital device and a network. It is primarily an attack on information that is of significant importance to an individual, business, or government, and its disclosure can result in major threats, infrastructure damage, financial loss, and even death.

In our research, we have three main goals. The first is to predict the cyber-crime method using actual cyber-crime data as input and to compare the accuracy of the results. The second is to see whether cyber-crime culprits can be predicted from the data available. The final goal is to figure out how victim profiles affect cyber-attacks.

To address these issues, researchers have recently developed security models and made predictions using artificial intelligence and machine learning technologies. The literature contains a large number of crime prediction methods. Most of them, however, lack the ability to forecast cyber-crime and cyber-attack strategies.

Types Of Cyber Attacks

Gartner expects the global information security market to reach $170.4 billion in 2022. In 2019, spear phishing attacks targeted over 88 percent of enterprises throughout the world. In the first half of 2020, data breaches exposed 36 billion records. Of all breaches, 86 percent were motivated by money, whereas 10 percent were motivated by espionage. Hacking was used in 45 percent of breaches, malware in 17 percent, and phishing in 22 percent.

Healthcare firms accounted for 15 percent of breaches, while the financial industry accounted for 10 percent and the governmental sector for 16 percent. Ransomware attacks cost the healthcare industry an estimated $25 billion in 2019. The average cost of a data breach in the financial services industry is $5.85 million.

As a result, recognizing the various cyber-attacks in a network is crucial. This is where a machine learning model comes into play in building a successful Intrusion Detection System (IDS). A binary classification model can be used to determine what is going on in the network, such as whether or not an attack is under way.

Using machine learning to detect malicious activity and stop attacks

Machine learning has emerged as a critical technique in the field of cybersecurity. It uses pattern detection, real-time cyber crime mapping, and rigorous penetration testing to discover threats and vulnerabilities in security infrastructure, making the environment extremely intelligent and mainly automated.

Machine learning has many applications in cyber security, but we’ll focus on just one today.

Machine learning algorithms assist organizations in detecting harmful conduct that is occurring or is about to occur, and stopping it or alerting security staff on the job.

However, the accuracy of its predictions may vary. Four conditions categorize the output, and the confusion matrix is a table that maps them out.

The Role of the Confusion Matrix in Cyber Security

The confusion matrix discussed here describes a binary classification. It can work on any prediction task that makes a yes-or-no, or true-or-false, distinction.

To organize the results, the confusion matrix uses specialized terminology. False positives and false negatives exist alongside true positives and true negatives. These values are displayed as actual versus predicted classes for the two categories, and the same layout extends to a larger confusion matrix when more than two classes are compared.

  • True Positive (TP) - A true positive test result is one that detects the condition when the condition is present.
  • False Positive (FP) - Also known as a Type I error, a false positive test result is one that detects the condition when the condition is absent.
  • False Negative (FN) - Also known as a Type II error, a false negative test result is one that does not detect the condition when the condition is present.
  • True Negative (TN) - A true negative test result is one that does not detect the condition when the condition is absent.

Error is calculated with different ratios and formulas based on these four states. Depending on the relative cost of a Type I or a Type II error, the way the error is measured can be adjusted.
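
The four states can be counted directly from a list of actual and predicted labels. The following is a minimal sketch in Python, assuming labels are encoded as 1 for "attack" and 0 for "normal"; the function name `confusion_counts` and the ten sample events are hypothetical and only serve as an illustration.

```python
def confusion_counts(actual, predicted):
    """Count TP, FP, FN, TN for binary labels (1 = attack, 0 = normal)."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            tp += 1  # attack present and detected
        elif a == 0 and p == 1:
            fp += 1  # no attack, but an alert was raised (Type I error)
        elif a == 1 and p == 0:
            fn += 1  # attack present, but missed (Type II error)
        else:
            tn += 1  # no attack and no alert
    return tp, fp, fn, tn


# Hypothetical labels for ten network events (illustrative data only)
actual    = [1, 0, 0, 1, 0, 0, 1, 0, 0, 0]
predicted = [1, 0, 1, 0, 0, 0, 1, 0, 1, 0]

tp, fp, fn, tn = confusion_counts(actual, predicted)
print(tp, fp, fn, tn)  # 2 2 1 5
```

In this made-up sample, two attacks are caught, two alerts are false alarms, one attack is missed, and five normal events are correctly left alone.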

False Positives in Cyber Security

False positives are security warnings that indicate a threat when there isn’t one. These false/non-malicious notifications might add to the backend teams’ workload while they analyze the alarm.

Assume a security team is continuously monitoring but depends on sensor software to identify any threat linked to attacks or breaches. Every false alert that the software raises must still be investigated, which takes time away from real threats.

False Negatives in Cyber Security

Uncaught cyber threats, or those that aren’t recognized, are known as false negatives. There can be a number of causes for this, including the use of inactive security tools, as well as a weak or overly complicated security infrastructure.

Although false positives occur more frequently than false negatives, false negatives pose a greater risk of harm.

Measuring Error

Type I Error (False Positive):

This sort of error isn’t very harmful: the model predicted an attack, but our system is actually secure. The team would be contacted and would investigate the suspicious activity. Apart from the time spent on the investigation, this has no negative consequences. These errors are known as false alarms.

Type II Error (False Negative):

This sort of error can be very dangerous. Our system predicted no attack, but one occurred, so no notice reaches the security team and nothing can be done to prevent it. This category includes the False Negative cases mentioned above, and one of the model’s goals is to reduce this number.
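
Since a missed attack usually costs more than a false alarm, the two kinds of error can be weighted differently when comparing models. The sketch below assumes, purely for illustration, that a missed attack is ten times as costly as a false alarm; the function name `weighted_error_cost`, the cost values, and the error counts are all hypothetical.

```python
def weighted_error_cost(fp, fn, cost_fp=1.0, cost_fn=10.0):
    """Total cost of a model's errors.

    The default costs are illustrative assumptions, not measured values:
    a missed attack (fn) is treated as ten times worse than a false alarm (fp).
    """
    return fp * cost_fp + fn * cost_fn


# Hypothetical error counts for two detection models on the same traffic
noisy_model = weighted_error_cost(fp=20, fn=1)  # many false alarms, few misses
quiet_model = weighted_error_cost(fp=5, fn=4)   # few false alarms, more misses

print(noisy_model, quiet_model)  # 30.0 45.0
```

Under these assumed costs, the model with more errors overall (21 versus 9) is still the cheaper one, because it misses fewer attacks.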

We can use the confusion matrix to calculate various metrics:

  1. Accuracy: The values of the confusion matrix are used to calculate the accuracy of the model. It is the ratio of all correct predictions to all predictions (total values).

Accuracy = (TP + TN) / (TP + TN + FP + FN)

  2. Precision: (true positives / predicted positives) = TP / (TP + FP)

  3. Recall: (true positives / all actual positives) = TP / (TP + FN)

  4. Specificity: (true negatives / all actual negatives) = TN / (TN + FP)

  5. Misclassification: (all incorrect / all) = (FP + FN) / (TP + TN + FP + FN)

It can also be calculated as 1 - Accuracy.
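
These formulas translate directly into code. Here is a minimal Python sketch that computes all five metrics from the four counts; the helper names `safe_ratio` and `confusion_metrics` and the sample counts are hypothetical, and returning 0.0 for an empty denominator is an assumption made to keep the example simple.

```python
def safe_ratio(numerator, denominator):
    """Divide, returning 0.0 when the denominator is zero (an assumed convention)."""
    return numerator / denominator if denominator else 0.0


def confusion_metrics(tp, fp, fn, tn):
    """Compute the five metrics above from the four confusion matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": safe_ratio(tp + tn, total),
        "precision": safe_ratio(tp, tp + fp),
        "recall": safe_ratio(tp, tp + fn),
        "specificity": safe_ratio(tn, tn + fp),
        "misclassification": safe_ratio(fp + fn, total),
    }


# Hypothetical counts for 1,000 network events (illustrative data only)
counts = dict(tp=40, fp=10, fn=5, tn=945)
for name, value in confusion_metrics(**counts).items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts the accuracy is 0.985, yet recall is only about 0.889, because 5 of the 45 real attacks were missed. When attacks are rare, accuracy alone can hide the false negatives that matter most.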

Use Case: How the Confusion Matrix Helps in Cyber Security

In the 21st century, more and more cyber crimes are taking place on online platforms.

The growing popularity of social media platforms and their communities has resulted in a new type of cybercrime, in which people create fake accounts and begin committing acts that are unlawful and in violation of cyber laws. These unlawful acts include “publishing naked images,” “spreading abandoned posts,” “cyberbullying,” and using obscene comments in personal chats.

Thank you for taking the time to read this blog. I hope you found it informative and useful.

Please leave a comment below and let me know what you think.😊
