AI Vendor Due Diligence: 30 Questions in Five Categories

You are evaluating an AI vendor and need a structured approach to due diligence that goes beyond the standard security questionnaire. Traditional software procurement checklists miss AI-specific risks: model opacity, training data provenance, benchmark reliability, and the unique ways AI systems can fail.

This guide provides 30 questions organized in five categories, a red flags table for interpreting vendor responses, and a comparison framework for evaluating multiple vendors side by side.

Category 1: Model transparency (questions 1-6)

These questions assess whether you can understand what the model does, how it was built, and where it fails.

**1. What model architecture does the system use, and is it proprietary or based on an open model?** You need to know whether you are buying a wrapper around GPT-4, a fine-tuned open model like Llama, or a proprietary architecture. This affects your ability to evaluate, audit, and switch vendors.

**2. What training data was used, and can you describe the data sources, time period, and curation process?** The vendor may not share a full data manifest, but they should describe the categories of data, the time range, and whether the data was licensed, scraped, or user-contributed.

**3. Has the model been evaluated on benchmarks relevant to your use case, and can you share the results?** Generic benchmark scores (MMLU, HumanEval) may not reflect performance on your specific tasks. Ask for evaluation results on tasks that match your deployment scenario.

**4. What are the known limitations and failure modes of the model?** Every AI system has documented weaknesses. A vendor that claims none is either being evasive or has not tested thoroughly. Look for specific descriptions, not vague disclaimers.

**5. How often is the model updated, and how are updates communicated?** Model updates can change behavior in production. You need advance notice, changelog documentation, and the ability to test new versions before they replace the current one in your environment.

**6. Can we evaluate the model on our own data before purchasing?** A vendor that resists evaluation on your data may be hiding performance gaps. A trial period with your real data is the most reliable way to assess fit.

Category 2: Data practices (questions 7-12)

These questions assess how the vendor handles your data and your users' data.

**7. Is our data used to train or improve the model?** Many vendors use customer data for model training by default. Confirm whether you can opt out, whether the opt-out is retroactive, and whether it applies to all processing (including human review of flagged outputs).

**8. Where is data stored, processed, and transmitted?** Get specific data center locations and jurisdictions. This matters for GDPR compliance, data residency requirements, and sector-specific regulations.

**9. What is the data retention policy, and what happens to our data if we cancel?** Confirm: how long data is retained after each API call, what data is retained after contract termination, and whether you receive a data export before deletion.

**10. Who within the vendor organization can access our data?** Specify: engineering teams, support staff, contractors, subprocessors, and any government access requirements. Get a list of subprocessors and their roles.

**11. Is data encrypted at rest and in transit, and can we bring our own encryption keys?** Encryption at rest and in transit should be standard. Customer-managed encryption keys (CMEK) are important for regulated industries and give you control over data access.

**12. How are data breaches detected, and what is the notification process?** Get specific: detection mechanisms, notification timeline (GDPR requires 72 hours), incident response process, and your role in the response plan.

Category 3: Security (questions 13-18)

These questions assess the vendor's security posture, with attention to AI-specific attack surfaces.

**13. Has the vendor completed SOC 2 Type II certification?** Or equivalent certifications relevant to your industry (ISO 27001, HITRUST, FedRAMP). Ask for the most recent audit report, not just a certification badge.

**14. Has the model been tested for prompt injection vulnerabilities?** Prompt injection is a class of attacks where malicious inputs cause the model to execute unintended actions or reveal sensitive information. Ask for testing methodology and results.

**15. Has the model been tested for data extraction attacks?** Can an attacker extract training data, other users' data, or system prompts through crafted queries? Ask whether the vendor conducts extraction testing and what mitigations are in place.

**16. What access controls are available for API access?** Evaluate: API key management, role-based access control, IP allowlisting, rate limiting, and audit logging of API calls.

**17. How is the model protected against adversarial inputs?** Adversarial inputs are crafted to cause the model to produce incorrect or harmful outputs. Ask about input validation, output filtering, and the vendor's adversarial testing program.

**18. What is the incident response process for AI-specific security events?** Traditional incident response may not cover AI-specific events like model poisoning, systematic hallucination, or coordinated prompt injection campaigns. Ask how these are classified and handled.

Category 4: Compliance (questions 19-24)

These questions assess the vendor's regulatory compliance, with focus on AI-specific requirements.

**19. How does the vendor classify this AI system under the EU AI Act?** The EU AI Act establishes risk categories (unacceptable, high, limited, minimal). If the vendor has not classified their system, they may not be prepared for compliance obligations.

**20. Can the vendor provide an AI impact assessment?** Under the EU AI Act, high-risk systems require documented impact assessments. Even for lower-risk systems, an impact assessment demonstrates responsible governance.

**21. Does the vendor maintain documentation required by the EU AI Act Annex IV?** Annex IV requires technical documentation including: system description, design specifications, training and testing data information, performance metrics, and risk management measures.

**22. How does the vendor handle data subject requests (GDPR Articles 15-22)?** If the AI system processes personal data of EU residents, data subjects can request access, correction, deletion, and explanation of automated decisions. Ask how the vendor supports these requests.

**23. Does the vendor carry professional liability insurance for AI-related claims?** AI errors can cause financial harm, discrimination, or safety incidents. Ask about the vendor's insurance coverage for errors and omissions, professional liability, and cyber liability.

**24. What regulatory reporting obligations does the vendor fulfill?** Some jurisdictions require reporting when AI is used in hiring (NYC Local Law 144), credit decisions (ECOA), or healthcare (FDA). Ask whether the vendor handles reporting or if the obligation falls on you.

Category 5: Commercial terms (questions 25-30)

These questions assess the business relationship, with attention to AI-specific commercial risks.

**25. What is the pricing model, and how does cost scale with usage?** AI pricing varies widely: per-call, per-token, per-seat, per-outcome, or flat rate. Get pricing for your expected usage plus 10x and 100x growth scenarios.

**26. Are there minimum commitments or take-or-pay clauses?** Annual contracts with minimum spend can lock you in even if the model underperforms. Understand the financial commitment before signing.

**27. Who owns the outputs generated by the AI system?** AI-generated content, decisions, and analyses need clear IP ownership terms. Some vendors claim rights to outputs or to derivative works based on outputs.

**28. What is the exit strategy if we need to switch vendors?** Evaluate: data export capabilities, format portability, notice period, transition support, and whether you can continue operating during a migration period.

**29. Can we audit the AI system independently?** For high-stakes deployments, you may need the right to conduct independent audits of the model's performance, bias, and security. Confirm whether the contract allows third-party audits.

**30. What happens when the model version we are using reaches end of life?** AI models have shorter lifecycles than traditional software. Ask about end-of-life timelines, migration paths, and whether you will have advance notice before your model version is deprecated.

Red flags table

When evaluating vendor responses, these answers should raise concerns:

Question Area	Red Flag Response	Why It Concerns You
Training data	"We cannot disclose any information about training data"	You cannot assess data quality, bias risk, or legal compliance
Model limitations	"Our model has no known limitations"	Every model has failure modes; this answer indicates insufficient testing or evasiveness
Data usage for training	"By default, yes, but you can opt out" (with no details on scope)	Opt-out may not cover all processing; retroactive deletion may not be possible
Bias testing	"We tested for bias" (no methodology or results shared)	Vague claims without evidence are not verifiable
Prompt injection	"That does not apply to our system"	Prompt injection affects nearly all language model systems; dismissing it suggests lack of testing
EU AI Act classification	"We have not classified our system yet"	Indicates the vendor is not prepared for regulatory compliance
Output ownership	"We retain rights to use outputs for model improvement"	Your confidential business outputs may be used to train models serving competitors
Independent audit	"We do not allow third-party audits"	You cannot independently verify the vendor's claims
Exit strategy	"We can discuss that when the time comes"	No exit plan means you are committing without a safety net

Vendor comparison framework

When evaluating multiple vendors, use this framework to compare them on the criteria that matter most:

Criteria	Weight	Vendor A	Vendor B	Vendor C
Model transparency (questions 1-6)	20%	Score 1-5	Score 1-5	Score 1-5
Data practices (questions 7-12)	25%	Score 1-5	Score 1-5	Score 1-5
Security (questions 13-18)	20%	Score 1-5	Score 1-5	Score 1-5
Compliance (questions 19-24)	20%	Score 1-5	Score 1-5	Score 1-5
Commercial terms (questions 25-30)	15%	Score 1-5	Score 1-5	Score 1-5
Weighted total	100%

Adjust weights based on your context. Regulated industries should weight compliance higher. Organizations handling sensitive data should weight data practices and security higher.

Scoring guide

Score	Meaning
5	Exceeds expectations. Proactive disclosure, industry-leading practices.
4	Meets expectations. Clear answers, documented practices, willing to share evidence.
3	Acceptable with conditions. Some gaps that can be addressed through contract terms.
2	Below expectations. Vague answers, limited documentation, resistance to specific commitments.
1	Unacceptable. Red flag responses, refusal to engage, or missing capabilities.

Any vendor scoring 1 in data practices, security, or compliance should be disqualified unless the deficiency can be contractually remediated before deployment.

How to use this guide

Send questions 1-30 to each vendor in writing. Verbal answers are harder to enforce.
Score each response using the 1-5 scale. Document the evidence for each score.
Check responses against the red flags table. Any red flag warrants a follow-up conversation.
Complete the comparison framework. Let the weighted scores guide, but not replace, your judgment.
Include the vendor's responses as an exhibit in the contract. This makes their representations binding.

AI Vendor Due Diligence: 30 Questions in Five Categories

Category 1: Model transparency (questions 1-6)

Category 2: Data practices (questions 7-12)

Category 3: Security (questions 13-18)

Category 4: Compliance (questions 19-24)

Category 5: Commercial terms (questions 25-30)

Red flags table

Vendor comparison framework

Scoring guide

How to use this guide

Sources

Related

AI Procurement Checklist

AI Risk Register Template

How to Read AI Leaderboards

Data Leakage Risk