AI security is a topic gaining significant traction as organizations take on the responsibility of defending AI technologies across their ecosystems.
Previous episodes of Security Sparks covered organizational and programmatic considerations for AI security. Today’s focus shifts to the technology side, breaking down the complex layers of an AI stack. For those not involved in AI daily, this will provide a clearer understanding of the components they are tasked with securing.
Understanding the AI Stack
To begin, it's essential to outline the key elements of an AI stack. AI infrastructure is vast, and while there have been numerous innovations in this space, some areas of AI defense remain underdeveloped. Fortunately, organizations like MITRE and NIST are leading efforts to establish AI-specific threat defense models.
What does an AI stack look like?
- Infrastructure Foundation: As with other tech stacks, AI relies on foundational infrastructure, typically hosted in cloud or on-premises environments. This foundation is supported by Kubernetes, Infrastructure as a Service (IaaS), and object stores.
- Foundational Data Layer: The next layer is the foundational data layer, which includes data lakes, data lineage management, and segmentation. These elements are crucial for governing how data is processed for AI and machine learning (ML) models.
- Logging and Monitoring: Logging and monitoring are essential for both AI engines and the security infrastructure, ensuring the quality of AI processes and enabling the detection of anomalies.
- Core AI Functions: Core AI functions encompass data ingestion, experimentation engines, training engines, deployment to production, and serving engines. These integrate into the application stack using the AI infrastructure.
- Experimentation and Testing: AI development often involves a machine learning tech stack that operates in parallel with the core infrastructure, dedicated specifically to experimentation and testing.
- Identity Management and Access Control: Identity management governs access for humans, machines, and applications within the AI stack. Role-based access control is essential for securing the pipeline.
- Data Development Platforms: Data is the backbone of AI, and this area includes external data sourcing, synthetic data generation, and other mechanisms that bring data into the AI platform.
- Data Engineering and Orchestration: Data engineering involves moving and transforming data between different engines to make it accessible and usable throughout the AI stack.
- Supportive Microservices: The final layer of the AI stack consists of supportive services such as code repositories, code management technologies, dashboards, and notebooks.
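To make the identity management layer more concrete, the role-based access control it describes can be sketched as a simple policy lookup over pipeline actions. This is a minimal, hypothetical illustration: the role names, actions, and policy table are assumptions for this sketch, not any specific product's API.

```python
# Minimal sketch of role-based access control for an AI pipeline.
# Roles, actions, and the policy table below are illustrative assumptions.

ROLE_PERMISSIONS = {
    "data_engineer": {"ingest_data", "transform_data"},
    "ml_engineer": {"run_experiment", "train_model"},
    "ml_ops": {"deploy_model", "serve_model"},
    "auditor": {"read_logs"},
}

def is_allowed(role: str, action: str) -> bool:
    """Return True if the given role may perform the pipeline action."""
    return action in ROLE_PERMISSIONS.get(role, set())

# Example checks: a data engineer may ingest data but not deploy models.
print(is_allowed("data_engineer", "ingest_data"))   # True
print(is_allowed("data_engineer", "deploy_model"))  # False
```

In practice the policy table would live in an identity provider or policy engine rather than in code, but the shape of the check, mapping an identity's role to an explicit set of permitted pipeline actions, stays the same.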
Protecting the AI Environment
With an understanding of the AI stack, attention turns to safeguarding it. While many cutting-edge technologies are emerging in this space, starting with the basics is crucial.
Key steps to protect the AI environment include:
- Micro-Segmentation: Separate various components of the AI infrastructure to limit the spread of potential threats.
- XDR and EDR: Utilize extended detection and response (XDR) and endpoint detection and response (EDR) solutions to log information centrally for further analysis, integrating these with SIEM (Security Information and Event Management) or data security platforms.
- Playbooks: Develop incident response playbooks specifically for AI environments, and ensure AI environments are included in the Configuration Management Database (CMDB) for streamlined patch management.
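One way to make the XDR/SIEM point above concrete is to emit structured, machine-parseable events from the AI environment so they can be collected centrally. The sketch below is a hedged illustration: the event fields and component names are assumptions, and a real deployment would ship these records to an actual SIEM or data security platform rather than just building them.

```python
import json
import datetime

def make_security_event(source: str, component: str,
                        event_type: str, detail: str) -> str:
    """Build a JSON-encoded security event suitable for central
    collection (e.g., forwarding to a SIEM or data security platform).
    Field names here are illustrative assumptions."""
    event = {
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "source": source,          # which AI environment produced the event
        "component": component,    # e.g., training engine, serving engine
        "event_type": event_type,  # e.g., anomaly, access_denied
        "detail": detail,
    }
    return json.dumps(event)

# Example: an anomaly detected in the model-serving layer.
evt = make_security_event(
    "ai-prod", "serving_engine", "anomaly",
    "unexpected spike in inference requests",
)
print(evt)
```

The value of a fixed schema is that downstream analysis, whether in an XDR console or a SIEM correlation rule, can filter and join events from every layer of the AI stack in one place.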
Focus on Data Defense
Since data is at the core of AI, its defense is non-negotiable. Consider implementing Data Security Posture Management (DSPM) capabilities to address critical questions, such as:
- Where is the data?
- Who has access to it?
- How is it being used?
- Where did it come from, and where is it going?
Catalog data repositories to ensure they contain only appropriate informational assets. This is critical from both a threat and regulatory perspective. Additionally, monitor the regionality of data to comply with regulatory requirements related to data control, custody, and usage.
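The DSPM questions above can be approximated with a simple data-asset catalog: one record per data set, with fields answering where the data is, who can access it, how it is used, and where it came from and is going. The sketch below is a hypothetical illustration, not a specific DSPM product; the field names, sample assets, and allowed-region policy are all assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class DataAsset:
    """Catalog entry answering the core DSPM questions for one data set."""
    name: str
    location: str                 # where is the data stored?
    region: str                   # which jurisdiction holds it?
    accessors: list               # who has access to it?
    purpose: str                  # how is it being used?
    source: str                   # where did it come from?
    destinations: list = field(default_factory=list)  # where is it going?

def region_violations(catalog, allowed_regions):
    """Flag assets stored outside the regions permitted by policy."""
    return [a.name for a in catalog if a.region not in allowed_regions]

# Illustrative catalog entries (all values are made up for this sketch).
catalog = [
    DataAsset("training-set-v1", "s3://lake/train", "eu-west-1",
              ["ml_engineer"], "model training", "internal CRM",
              ["feature-store"]),
    DataAsset("eval-set-v1", "s3://lake/eval", "us-east-1",
              ["ml_engineer", "auditor"], "model evaluation",
              "synthetic generator"),
]

# With an EU-only policy, the US-hosted evaluation set is flagged.
print(region_violations(catalog, {"eu-west-1"}))  # ['eval-set-v1']
```

Even this toy structure shows why cataloging pays off twice: the same record that answers an auditor's regionality question also tells an incident responder who could have touched a compromised data set.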
Frameworks for AI Defense
Organizations should adopt frameworks from MITRE and NIST to prioritize AI risks and develop AI-specific defense capabilities. For example, the MITRE ATLAS framework helps map and defend the most vulnerable parts of AI systems, applying a risk-based approach to known attack vectors.
It’s important to note that AI defense capabilities are still evolving. Organizations must continuously monitor, update, and improve their defenses.
Final Thoughts
Defending an AI ecosystem is a complex and ongoing process. From foundational infrastructure to data pipelines, every component plays a critical role in the overall security of AI environments. By utilizing frameworks such as MITRE ATLAS, implementing advanced technologies, and addressing basic preventative controls, organizations can safeguard their AI systems from emerging threats.
The insights shared here are intended to "spark" ideas for teams working on AI security. Stay tuned for the next episode in the Security Sparks: Insights for Pros series.