Jasmeen Kaur
Jr. Associate, Montreal
Data
The financial services industry is one of the most targeted industries for cyber-attacks: 71% of all data breaches are financially motivated. As the frequency and scale of attacks grow year over year, records show that the average total cost of a data breach in the financial sector reached $6.08 million per incident in 2024.
When it comes to AI-driven data ecosystems, businesses opt for “live experimentation” to drive growth and innovation. But in the wake of massive data exfiltration attacks, this experimentation carries significant financial and reputational risk. To balance innovation against the inevitability of cyber incidents, successful approaches to AI security must marry exploring new markets, testing novel ideas, and adapting swiftly to changing customer needs with resilience against cyber-attacks and exposures.
Unsupervised learning (UL) models are particularly vulnerable to cyber threats because their security has received less attention than that of supervised and reinforcement learning, posing risks to financial integrity and user privacy. The quality of the data powering AI is often underestimated, yet it is crucial for model performance and ethical standards. There is an urgent need to enhance AI security through a defense-in-depth strategy that integrates multiple layers of security throughout the operational lifecycle and the software development lifecycle (SDLC). This is especially true for UL, where current defenses fall short against sophisticated adversarial attacks, underscoring the necessity of robust, secure AI development.
Unsupervised AI models use a type of machine learning (ML) that learns from the data itself, without human supervision. In these models, unlabeled data reveals hidden patterns or groupings. Common examples of UL models include clustering, association-rule mining, dimensionality reduction, and anomaly detection; a minimal sketch of clustering-based anomaly detection follows.
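To make this concrete, here is a hypothetical sketch using scikit-learn: k-means finds groupings in unlabeled transaction data, and points far from every cluster centre are flagged as potential anomalies. The features, cluster count, and threshold are illustrative only.

```python
# Unsupervised learning sketch: cluster unlabeled transaction features
# with k-means, then flag points far from their centroid as anomalies.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(42)
# Unlabeled data: [amount, hour_of_day] for 500 hypothetical transactions.
transactions = rng.normal(loc=[50.0, 14.0], scale=[20.0, 3.0], size=(500, 2))

model = KMeans(n_clusters=3, n_init=10, random_state=0).fit(transactions)

# Distance of each point to its assigned centroid.
distances = np.linalg.norm(
    transactions - model.cluster_centers_[model.labels_], axis=1
)
# Treat the top 1% most distant points as candidate anomalies for review.
threshold = np.quantile(distances, 0.99)
print("candidate anomalies:", np.flatnonzero(distances > threshold))
```

No labels are used anywhere: the structure (clusters and outliers) is inferred from the data alone, which is precisely what makes these models hard to validate and attractive to attackers.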
Adversarial attack: An attacker disrupts a machine learning model’s classification ability by injecting faulty input, harming business applications. For example, if an attacker mimics a maintenance worker's routine recognized by a security camera's ML model, they could gain unauthorized access without triggering alarms, exploiting the model's tendency to overlook threats. Common examples of adversarial attacks include evasion, poisoning, model extraction, and inference attacks; an evasion-style example is sketched below.
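As an illustration of an evasion attack, the following toy sketch implements the fast gradient sign method (FGSM) in PyTorch: the input is nudged in the direction of the loss gradient to push the classifier toward a wrong answer. The model here is an untrained placeholder, so a misclassification is not guaranteed; the point is the mechanism.

```python
# FGSM evasion sketch: perturb an input along the sign of the loss
# gradient so a trained classifier misreads it.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(1, 4, requires_grad=True)  # clean input
y = torch.tensor([1])                      # its true label

# Compute the gradient of the loss with respect to the input itself.
loss = loss_fn(model(x), y)
loss.backward()

# FGSM step: move the input by epsilon in the sign of that gradient.
epsilon = 0.1
x_adv = (x + epsilon * x.grad.sign()).detach()

print("clean pred:", model(x).argmax(1).item(),
      "adversarial pred:", model(x_adv).argmax(1).item())
```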
Key defensive methods are essential to prevent these vulnerabilities from being exploited, but such methods tend to rely more on positive testing than negative testing. If vulnerabilities, flaws, and weaknesses can be found and fixed through negative test cases before they impact end users, and the system is regularly maintained, malicious actors can be thwarted. The following are the main defensive methods for avoiding exploitation.
Adversarial training: A process in which the model is intentionally fed faulty inputs, known as adversarial examples, that cause machine learning models to fail. The model then learns to classify these known harmful inputs as threats. This is akin to how a machine learning model trains itself to classify data as part of its regular process; here it also teaches itself to reject disturbances. Continuous maintenance and supervision are of utmost importance for this approach, since attempts to manipulate machine learning models keep evolving. A sketch of such a training loop follows.
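A minimal sketch of adversarial training, assuming a PyTorch classifier and FGSM-crafted perturbations as above: each batch is trained on both its clean and adversarial versions. The architecture, data, and epsilon are placeholders.

```python
# Adversarial training sketch: every batch mixes clean examples with
# FGSM-perturbed copies so the model learns to classify both correctly.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
loss_fn = nn.CrossEntropyLoss()
opt = torch.optim.SGD(model.parameters(), lr=0.1)
epsilon = 0.1

for step in range(100):
    x = torch.randn(32, 4)          # stand-in for a real data loader
    y = (x[:, 0] > 0).long()

    # Craft adversarial copies of the current batch.
    x_pert = x.clone().requires_grad_(True)
    loss_fn(model(x_pert), y).backward()
    x_adv = (x_pert + epsilon * x_pert.grad.sign()).detach()

    # Train on clean and adversarial examples together.
    opt.zero_grad()
    loss = loss_fn(model(x), y) + loss_fn(model(x_adv), y)
    loss.backward()
    opt.step()
```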
Defensive distillation: This is done by training a teacher and a student network. Using standard training protocols, a teacher neural network is first trained on the original dataset. After training, the teacher network's class probabilities, which are more informative than hard labels, are employed as soft targets for the next stage. These targets are then used to train a student network, which may or may not share the teacher's architecture. By imitating the teacher's output distribution, including the confidence levels for every class, the student network becomes better at generalizing. A sketch of the distillation stage follows.
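Here is a hedged sketch of that second stage in PyTorch, assuming an already-trained teacher (teacher training is omitted): the student is fit to the teacher's temperature-softened probabilities with a KL-divergence loss. The temperature and architectures are illustrative.

```python
# Defensive distillation sketch: the teacher's softened class
# probabilities become the soft targets the student learns to imitate.
import torch
import torch.nn as nn
import torch.nn.functional as F

teacher = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 2))
student = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 2))
opt = torch.optim.Adam(student.parameters(), lr=1e-3)
T = 20.0  # distillation temperature; higher T means softer targets

for step in range(200):
    x = torch.randn(32, 4)  # stand-in for the original dataset

    # Soft targets: the teacher's temperature-softened distribution.
    with torch.no_grad():
        soft_targets = F.softmax(teacher(x) / T, dim=1)

    # Train the student to match that distribution (KL divergence).
    log_probs = F.log_softmax(student(x) / T, dim=1)
    loss = F.kl_div(log_probs, soft_targets, reduction="batchmean")

    opt.zero_grad()
    loss.backward()
    opt.step()
```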
Real-time monitoring: Real-time monitoring of prompts and responses can help promptly identify non-compliant responses, much like how public surveillance cameras are used for crime control and prevention. This can be done with automated systems that identify potentially hazardous or non-compliant content, as sketched below.
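A minimal sketch of such an automated check, with a deliberately simplistic blocklist; the patterns and screening logic are placeholders, not a production content filter.

```python
# Response-monitoring sketch: every candidate model response passes
# through an automated compliance check before it is released.
import re

BLOCKED_PATTERNS = [
    re.compile(r"\b\d{16}\b"),             # possible payment card numbers
    re.compile(r"password\s*[:=]", re.I),  # possible credential leakage
]

def screen_response(text: str) -> tuple[bool, str]:
    """Return (compliant, reason) for a candidate model response."""
    for pattern in BLOCKED_PATTERNS:
        if pattern.search(text):
            return False, f"matched {pattern.pattern!r}"
    return True, "ok"

ok, reason = screen_response("Your account password: hunter2")
if not ok:
    print("response withheld, alert raised:", reason)
```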
Proactive monitoring of circumvention techniques: One of the most important aspects of surveillance is understanding how network controls are circumvented. To reach restricted websites or services, an actor can switch DNS providers, use VPNs, or set up proxy servers for messaging applications. Knowledge of these same techniques can be applied to keep an eye on potential abusers' behavior; one simple signal is sketched below.
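One such signal, sketched here under the assumption that a threat-intelligence feed supplies known VPN/proxy address ranges; the CIDR blocks shown are documentation placeholders, not real feed data.

```python
# Circumvention-monitoring sketch: flag clients whose traffic
# originates from known VPN/proxy address ranges.
import ipaddress

KNOWN_VPN_RANGES = [
    ipaddress.ip_network("198.51.100.0/24"),  # placeholder range
    ipaddress.ip_network("203.0.113.0/24"),   # placeholder range
]

def is_suspect_source(client_ip: str) -> bool:
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in KNOWN_VPN_RANGES)

for ip in ["203.0.113.7", "192.0.2.10"]:
    print(ip, "-> flagged" if is_suspect_source(ip) else "-> ok")
```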
Attribute-based access control (ABAC): ABAC can help design policies that limit access to contextual datasets and categories of data that require heightened sensitivity and stronger reinforcement through layered entitlement defenses. For example, customer records may require specific segmentation for different data elements and consumption patterns, conditioned on the user's role, location, affiliated organization, time of day, or other contextual controls, as in the sketch below.
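A minimal ABAC sketch in plain Python; the attribute names, values, and policy rule are illustrative, not a product API.

```python
# ABAC sketch: a request is evaluated against a policy rule over user
# and context attributes rather than a fixed role list.
from datetime import time

def abac_allows(user: dict, resource: str, context: dict) -> bool:
    """Allow access to sensitive customer records only for analysts,
    from an approved location, during business hours."""
    if resource != "customer_records:sensitive":
        return True  # only the sensitive category is restricted here
    return (
        user.get("role") == "analyst"
        and user.get("location") in {"Montreal", "Toronto"}
        and time(9) <= context["time_of_day"] <= time(17)
    )

user = {"role": "analyst", "location": "Montreal"}
context = {"time_of_day": time(14, 30)}
print(abac_allows(user, "customer_records:sensitive", context))  # True
```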
Implementing reliable AI concepts, validating models, and utilizing data poisoning detection technologies are also recommended; one simple detection approach is sketched below.
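One hedged example of poisoning detection: screening the training set for statistical outliers with scikit-learn's isolation forest before the model ever sees them. The contamination rate and data are synthetic placeholders.

```python
# Poisoning-detection sketch: hold out statistical outliers in the
# training data for human review before training.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
clean = rng.normal(0.0, 1.0, size=(490, 4))
poison = rng.normal(6.0, 0.5, size=(10, 4))  # injected outliers
training_data = np.vstack([clean, poison])

detector = IsolationForest(contamination=0.02, random_state=0)
flags = detector.fit_predict(training_data)  # -1 marks outliers

suspect_rows = np.flatnonzero(flags == -1)
print(f"{len(suspect_rows)} suspect rows held out for review:", suspect_rows)
```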
An attack framework for AI systems called the “Adversarial Threat Landscape for Artificial-Intelligence Systems" (ATLAS) was developed by MITRE in collaboration with Microsoft and 11 other businesses. Modeled on MITRE ATT&CK, it describes 12 tactics used in attacks on machine learning systems.
Counterfit, announced by Microsoft, is an open-source automation tool for testing the security of AI systems that can be used in red team operations. It can also be used during the AI development process to identify vulnerabilities before release into production.
Additionally, IBM's open-source defensive tool for adversarial robustness, the Adversarial Robustness Toolbox (ART), is currently managed as a Linux Foundation project. The project includes 39 attack modules divided into four main categories: evasion, poisoning, extraction, and inference, and it supports all popular ML frameworks. A brief usage sketch follows.
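For a sense of the workflow, here is a hedged sketch of generating evasion examples with ART against a scikit-learn model; class names and signatures may vary across ART versions, so treat this as illustrative rather than definitive.

```python
# ART usage sketch: wrap an ordinary classifier, generate FGSM evasion
# examples, and compare clean vs. adversarial accuracy.
import numpy as np
from sklearn.linear_model import LogisticRegression
from art.estimators.classification import SklearnClassifier
from art.attacks.evasion import FastGradientMethod

# Train an ordinary classifier on toy data.
X = np.random.rand(200, 4).astype(np.float32)
y = (X[:, 0] > 0.5).astype(int)
model = LogisticRegression().fit(X, y)

# Wrap it for ART and generate evasion examples.
classifier = SklearnClassifier(model=model)
attack = FastGradientMethod(estimator=classifier, eps=0.1)
X_adv = attack.generate(x=X)

# Accuracy typically drops on the perturbed inputs.
print("clean:", model.score(X, y), "adversarial:", model.score(X_adv, y))
```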
In conclusion, as AI comes to influence many aspects of our lives, the security of unsupervised learning models becomes crucial. While current defensive techniques like defensive distillation and adversarial training provide some protection, they have gaps, drawbacks, and trade-offs that call for further research and development. Although new tools and technologies have the potential to improve AI security, attackers and defenders remain locked in an arms race. The industry must place a high priority on creating reliable, secure AI systems to protect against evolving threats and guarantee the security and dependability of AI applications.
At Synechron, we specialize in providing tailored solutions to enhance your data pipeline and make use of new AI capabilities. Leveraging our extensive experience in technology consulting within the financial services sector, we have cultivated an in-depth understanding of the industry's nuances. Our proficiency in implementing advanced analytics solutions, coupled with our industry expertise, positions us uniquely to help you make the most of your data. We are committed to propelling your business to unprecedented heights of success.
If you are interested in learning more about AI security or Synechron’s data & analytics services, please reach out to our local leadership team.