Organizations must understand how to secure their AI systems. This in-depth course delves into the AI security landscape, addressing vulnerabilities like prompt injection, denial-of-service attacks, model theft, and more. Learn how attackers exploit these weaknesses and gain hands-on experience with proven defense strategies and security APIs.

Discover how to securely integrate LLMs into your applications, safeguard training data, build robust AI infrastructure, and ensure effective human-AI interaction. By the end of this course, you'll be equipped to protect your organization's AI assets and maintain the integrity of your systems.

This course will cover the following topics:

Introduction to AI Security
Types of AI Systems and Their Vulnerabilities
Understanding and Countering AI-specific Attacks
Ethical and Reliable AI
Prompt Injection
Model Jailbreaks and Extraction Techniques
Visual Prompt Injection
Denial of Service Attacks
Secure LLM Integration
Training Data Manipulation
Human-AI Interaction
Secure AI Infrastructure

Learning Outcomes

Gain a comprehensive understanding of AI technologies and the unique security risks they pose
Learn to identify and mitigate common AI vulnerabilities
Gain practical skills in securely integrating LLMs into applications
Understand the principles of responsible, reliable, and explainable AI
Become familiar with security best practices for AI systems
Stay updated with the evolving threat landscape in AI security
Engage in hands-on exercises that simulate real-world scenarios

No prerequisites, aside from a general understanding of AI principles.

Day 1

Introduction to AI security

What is AI Security?
- Defining AI
- Defining Security
- AI Security scope
- Beyond this course
Different types of AI systems
- Neural networks
- Models
- Integrated AI systems
From Prompts to Hacks
- Use cases of AI systems
- Attacking Predictive AI systems
- Attacking Generative AI systems
- Interacting with AI systems
What does 'Secure AI' mean?
- Responsible AI
- Reliable, trustworthy AI
- Explainable AI
- A word on alignment
- To censor or not to censor
Exercise: Using an uncensored model
- Using an uncensored model

Using AI with malicious intent

Deepfake scam earns $25M
- You would never believe, until you do
- Behind deepfake technology
Voice cloning for the masses
- Imagine yourself in their shoes
- Technological dissipation
Social engineering on steroids
Levelling the playing field
Profitability from the masses
Shaking the fundamentals of reality
Donald Trump arrested
Pentagon explosion shakes the US stock market
How humans amplify a burning Eiffel tower
Image watermarking by OpenAI
Exercise: Image watermarking
- Real or fake?

The AI Security landscape

Attack surface of an AI system
- Components of an AI system
- AI systems and model lifecycle
- The supply chain is more important than ever
- Models accessed via APIs
- APIs accessed by models
- Non-AI attacks are here to stay
OWASP Top 10 and AI
- About OWASP and its Top 10 lists
- OWASP ML Top 10
- OWASP LLM Top 10
- Beyond OWASP Top 10
Threat modeling an LLM-integrated application
- A quick recap on threat modeling
- A sample AI-integrated application
- Sample findings
- Mitigations
Exercise: Threat modeling an LLM-integrated application
- Meet TicketAI, a ticketing system
- TicketAI's data flow diagram
- Find potential threats

Prompt Injection

Attacks on AI systems - Prompt injection
- Prompt injection
- Impact
- Examples
- Indirect prompt injection
- From prompt injection to phishing
Advanced techniques - SudoLang: pseudocode for LLMs
- Introducing SudoLang
- SudoLang examples
- Behind the tech
- A SudoLang program
- Integrating an LLM
- Integrating an LLM with SudoLang
Exercise: Translate a prompt to SudoLang
- A long prompt
- A different solution
Exercise: Prompt injection - Get the password for levels 1 and 2
- Get the password!
- Classic injection defense
- Levels 1-2
- Solutions for levels 1-2

Day 2

Prompt Injection

Attacks on AI systems - Model jailbreaks
- What's a model jailbreak?
- How do jailbreaks work?
Jailbreaking ChatGPT
- The most famous ChatGPT jailbreak
- The DAN 6.0 prompt
- AutoDAN
Exercise: Jailbreaking - Get the password for levels 3, 4, and 5
- Get the password!
- Levels 3-5
- Use DAN against levels 3-5
Tree of Attacks with Pruning (TAP)
- Tree of Attacks explained
Attacks on AI systems - Prompt extraction
- Prompt extraction
Exercise: Prompt Extraction - Get the password for levels 6 and 7
- Get the password!
- Level 6
- Level 7
- Extract the boundaries of levels 6 and 7
Defending AI systems - Prompt injection defenses
- Intermediate techniques
- Advanced techniques
- More Security APIs
- ReBuff example
- Llama Guard
- Lakera
Attempts against a similar exercise
- Gandalf from Lakera
- Types of Gandalf exploits
Exercise: The Real Challenge - Get the password for levels 8 and 9
- Get the password!
- Level 8
- Level 9
Other injection methods
- Attack categories
- Reverse Psychology
Exercise: Reverse Psychology
- Write an exploit with the ChatbotUI
Other protection methods
- Protection categories
- A different categorization
- Bergeron method
Sensitive Information Disclosure
- Relevance
- Best practices

Visual Prompt Injection

Attack types
- New Tech, New Threats
- Trivial examples
- Adversarial attacks
Tricking self-driving cars
- How to fool a Tesla
- This is just the beginning
Exercise: Image recognition with OpenAI
- Invisible message
- Instruction on image
Exercise: Adversarial attack
- Untargeted attack with Fast Gradient Sign Method (FGSM)
- Targeted attack
Protection methods
- Protection methods

Denial of Service

Chatbot examples
- Attack scenarios
- Denial of Service
- DoS attacks on LLMs
- Risks and Consequences of DoS Attacks on LLMs
Prompt routing challenges
- Attacks
- Protections
Exercise: Denial of Service
- Halting Model Responses

Model theft

Know your enemy
- Risks
Attack types
- Training or fine-tuning a new model
- Dataset exploration
Exercise: Query-based model stealing
- OpenAI API parameters
- How to steal a model
Protection against model theft
- Simple protections
- Advanced protections

Day 3

LLM integration

The LLM trust boundary
- An LLM is a system just like any other
- It's not like any other system
- Classical problems in novel integrations
- Treating LLM output as user input
- Typical exchange formats
- Applying common best practices
Exercise: SQL Injection via an LLM
Exercise: Generating XSS payloads
LLM interaction with other systems
- Typical integration patterns
- Function calling dangers
- The rise of custom GPTs
- Identity and authorization across applications
Exercise: Making a call with invalid parameters
Exercise: Privilege escalation via prompt injection
Principles of security and secure coding
Racking up privileges
- The case for a very capable model
- Exploiting excessive privileges
- Separation of privileges
- A model can't be cut in half
- Designing your model privileges
A customer support bot going wild
Exercise: Breaking out of a sandbox
Best practices in practice
- Input validation
- Output encoding
- Use frameworks

Training data manipulation

What you train on matters
- What data are models trained on?
- Model assurances
- Model and dataset cards
Exercise: Verifying model cards
A malicious model
A malicious dataset
- Datasets and their reliability
- Attacker goals and intents
- Effort versus payoff
- Techniques to poison datasets
Exercise: Let's construct a malicious dataset
Verifying datasets
- Getting clear on objectives
- A glance at the dataset card
- Analysing a dataset
Exercise: Analysing a dataset
A secure supply chain
- Proving model integrity is hard
- Cryptographic solutions are emerging
- Hardware-assisted attestation

Human-AI interaction

Relying too much on LLM output
- What could go wrong?
- Countering hallucinations
- Verifying the verifiable
- Referencing what's possible
- The use of sandboxes
- Building safe APIs
- Clear communication is key
Exercise: Verifying model output

Secure AI infrastructure

Requirements of a secure AI infrastructure
- Monitoring and observability
- Traceability
- Confidentiality
- Integrity
- Availability
- Privacy
Privacy and the Samsung data leak
LangSmith
Exercise: Experimenting with LangSmith
BlindLlama
