Executive summary
Most GenAI vendor evaluations are still treated like “SaaS with a chatbot.”
That’s a mistake.
GenAI tools create a new combination of risks and costs:
- data leakage risk (employees paste sensitive inputs)
- training / usage risk (your data improves someone else’s model)
- model drift risk (behavior changes without notice)
- auditability gaps (you can’t prove what happened)
- variable cost exposure (usage-based fees and token spend)
CFOs and COOs don’t need a PhD in ML to approve GenAI safely.
They need a procurement-ready checklist that forces clarity on the only questions that matter:
- What data touches the system?
- What does the vendor do with it?
- Can we audit it?
- Can we bound it?
Below is a 10-question GenAI vendor checklist you can reuse for tools, copilots, and agent platforms.
The GenAI vendor checklist (10 questions)
1) What data will users actually paste into this tool?
Don’t accept “we’ll train employees not to paste PII.”
Assume reality:
- customer emails will get pasted
- invoices and contracts will get pasted
- internal metrics and forecasts will get pasted
Ask the vendor:
- What categories of data are you assuming will appear in prompts?
- Do you support automatic redaction (PII/PCI/PHI) before processing?
- Do you support client-side or gateway-based controls (so inputs can be filtered before they reach the model)?
Why it matters: if you don’t define inputs, you can’t define controls.
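If the vendor can’t filter inputs, you can do it at your own network edge. Here’s a minimal sketch of gateway-side redaction, assuming a simple regex pass before the prompt leaves your environment. The patterns and the `redact` helper are illustrative only; production redaction should use a dedicated PII detection service.

```python
import re

# Illustrative patterns only; real PII/PCI/PHI detection needs a
# dedicated service, not regexes alone.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "CARD": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def redact(prompt: str) -> str:
    """Replace likely PII with typed placeholders before the prompt
    leaves your network."""
    for label, pattern in PATTERNS.items():
        prompt = pattern.sub(f"[{label}]", prompt)
    return prompt

print(redact("Refund jane.doe@example.com, card 4111 1111 1111 1111"))
```

Even a crude gateway like this turns “we’ll train employees not to paste PII” into an enforceable control.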
2) Is customer data used for training (now or later)?
You want a clear “no” by default.
Ask:
- Is any customer data used to train foundation models?
- Is any customer data used to train vendor fine-tunes?
- Is any customer data used to improve the product via “human review” (labeling) or “quality monitoring”?
If the answer is “yes,” you need explicit constraints:
- opt-in only
- separate agreement
- data minimization
- strong deletion guarantees
CFO/COO translation: you’re deciding whether your company becomes an unpaid R&D lab.
3) What is the data retention policy for prompts, outputs, and logs?
Vendors often say “we store logs for debugging.” That can quietly become indefinite storage.
Ask for retention periods for:
- prompts
- outputs
- embeddings / vector stores (if applicable)
- conversation transcripts
- tool-call logs (if it’s an agent)
Then ask:
- Can we set retention to 0 days (no storage) or a short window?
- Can we enforce retention via contract (not just “policy”)?
- What’s the deletion SLA after termination?
Why it matters: retention is where “oops” becomes a reportable incident.
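Once the contract specifies windows, retention becomes something you can verify rather than trust. A small sketch, assuming you encode the negotiated windows per artifact type (the numbers below are placeholders, not recommendations):

```python
from datetime import date, timedelta

# Hypothetical retention windows (days) per artifact type, as encoded
# from the contract; 0 means "no storage permitted".
RETENTION_DAYS = {
    "prompts": 0,
    "outputs": 30,
    "embeddings": 30,
    "transcripts": 7,
    "tool_call_logs": 90,
}

def purge_due(artifact_type: str, created: date, today: date) -> bool:
    """True if the artifact has outlived its contractual window."""
    window = RETENTION_DAYS[artifact_type]
    return today >= created + timedelta(days=window)

print(purge_due("transcripts", date(2024, 1, 1), date(2024, 1, 9)))  # True
```

A check like this, run against the vendor’s deletion reports, is how “policy” becomes something you can audit.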
4) Where does the data live (and who can access it)?
You’re looking for a simple map:
- data residency (US/EU/etc.)
- sub-processors
- admin access model
- support access model
Ask:
- Which countries will data be stored or processed in?
- Which sub-processors touch the data?
- Do you offer tenant isolation and encryption at rest + in transit?
- Do you support customer-managed keys (CMK)?
CFO/COO translation: if you can’t explain where the data goes, you can’t defend the purchase.
5) Do we get audit logs that are actually useful?
“Logs exist” is not enough.
You want logs that answer:
- who used the tool
- what data was accessed
- what actions were taken
- what the model/tool produced
- what was sent externally (if anything)
Ask:
- Do you provide immutable audit logs?
- Can we export logs to our SIEM?
- Do you log prompt + output (or hashes/pointers) in a way that supports investigations?
- Can we separate “business telemetry” from “sensitive content”?
Why it matters: without logs, you can’t do incident response, cost control, or compliance.
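One pattern worth asking vendors about: logging content hashes instead of raw text, so investigations can verify what was exchanged without the log itself becoming sensitive. A rough sketch of such a record (field names are illustrative):

```python
import hashlib
import json
from datetime import datetime, timezone

def audit_record(user: str, action: str, prompt: str, output: str) -> dict:
    """Capture who/what/when plus content hashes, so an investigation
    can confirm what was said without storing the sensitive text."""
    return {
        "ts": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "action": action,
        "prompt_sha256": hashlib.sha256(prompt.encode()).hexdigest(),
        "output_sha256": hashlib.sha256(output.encode()).hexdigest(),
    }

record = audit_record("j.smith", "summarize", "Q3 forecast...", "Summary...")
print(json.dumps(record))  # ship this line to your SIEM
```

This is also how you get the “business telemetry vs. sensitive content” separation: the record above is safe to export broadly; the raw text stays behind tighter access controls.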
6) How do you prevent data exfiltration and prompt injection?
If the product connects to internal systems (Google Drive, CRM, ticketing), it’s exposed to classic attack patterns:
- prompt injection in documents
- malicious links
- cross-tenant leakage (rare but catastrophic)
Ask:
- What controls exist for prompt injection?
- Are external tool calls restricted via allowlists?
- Do you support a “read-only mode” for connectors?
- Can we enforce domain allowlists for outbound actions?
CFO/COO translation: this is the difference between “assistant” and “unauthorized integration.”
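Of these controls, a domain allowlist is the simplest to reason about. Sketched here as a gateway check, assuming every outbound agent call passes through it (the domains are placeholders):

```python
from urllib.parse import urlparse

# Hypothetical allowlist: the only domains an agent may send data to.
ALLOWED_DOMAINS = {"api.internal.example.com", "crm.example.com"}

def outbound_allowed(url: str) -> bool:
    """Gate every external tool call on an explicit domain allowlist,
    so a prompt-injected link cannot exfiltrate data."""
    host = urlparse(url).hostname or ""
    return host in ALLOWED_DOMAINS

print(outbound_allowed("https://crm.example.com/tickets"))    # True
print(outbound_allowed("https://attacker.example.net/leak"))  # False
```

If the vendor can’t support something this simple, the connectors should stay read-only.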
7) What happens when the model changes?
This is the quiet risk most teams miss.
GenAI products change behavior when:
- the underlying model version changes
- the vendor updates system prompts
- safety policies shift
- routing changes between models
Ask:
- Do you pin model versions?
- Do you provide release notes for model/prompt changes?
- Do you offer an evaluation harness or staging environment?
- Can we delay upgrades (change window) and roll back?
Why it matters: if outputs are used in customer comms, finance, or ops, drift becomes business risk.
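You can make drift measurable with a small golden set: fixed prompts with known-good answers, replayed before any model or prompt upgrade is accepted. A minimal sketch (the prompts and labels are illustrative):

```python
# A tiny regression harness: replay fixed prompts through the candidate
# model version, then diff its answers against saved baselines before
# accepting the upgrade.
GOLDEN_CASES = {
    "Classify: 'invoice overdue 90 days'": "collections",
    "Classify: 'password reset request'": "it_support",
}

def drift_check(candidate_answers: dict) -> list:
    """Return the prompts whose candidate answer changed vs. baseline."""
    return [
        prompt
        for prompt, expected in GOLDEN_CASES.items()
        if candidate_answers.get(prompt) != expected
    ]

# Run the same prompts through the candidate model, then:
unchanged = drift_check(dict(GOLDEN_CASES))
print(unchanged)  # [] means no drift on the golden set
```

Even 20–50 golden cases are enough to catch the silent prompt or routing changes that matter to your workflows.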
8) What are the cost controls (and who is accountable)?
GenAI cost is often variable:
- usage-based fees
- per-seat + usage hybrids
- token-based charges
Ask:
- Can we set budgets, quotas, and caps?
- Can we enforce per-team or per-workflow budgets?
- Do you alert on spend anomalies?
- Do you provide cost breakdown by user/workflow/model?
CFO/COO translation: you’re buying a new variable-cost line item. Treat it like cloud.
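Caps only count if they’re hard caps. A sketch of a per-team hard cap, assuming token-based pricing (the rate and cap figures are illustrative):

```python
class TeamBudget:
    """Hard monthly cap on token spend per team; requests over the cap
    are rejected rather than billed. Prices here are illustrative."""

    def __init__(self, monthly_cap_usd: float, usd_per_1k_tokens: float):
        self.cap = monthly_cap_usd
        self.rate = usd_per_1k_tokens
        self.spent = 0.0

    def charge(self, tokens: int) -> bool:
        cost = tokens / 1000 * self.rate
        if self.spent + cost > self.cap:
            return False  # block the request; alert finance
        self.spent += cost
        return True

budget = TeamBudget(monthly_cap_usd=500.0, usd_per_1k_tokens=0.01)
print(budget.charge(2_000_000))   # True: $20 of a $500 cap
print(budget.charge(60_000_000))  # False: would blow through the cap
```

The design choice that matters is `return False`: a soft alert that still bills the overage is a report, not a control.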
9) What is the vendor’s security posture (in terms you can verify)?
You don’t need a 50-page security deck. You need evidence.
Ask:
- SOC 2 Type II (or equivalent) — yes/no
- SSO/SAML + SCIM — yes/no
- MFA for admins — yes/no
- penetration testing cadence — what is it?
- incident response SLA — what is it?
Also ask for:
- sub-processor list
- breach notification timelines
Why it matters: a GenAI vendor often handles more sensitive text than most SaaS tools do.
10) What are the contractual protections (indemnity, liability, and IP)?
This is where CFOs should be firm.
Ask for clarity on:
- IP indemnity (training data and output claims)
- limitation of liability (what’s excluded?)
- confidentiality terms that cover prompts/outputs
- data deletion and return provisions
If the tool is used to generate customer-facing material or internal policy decisions, push for:
- explicit indemnity scope
- clear dispute/notice process
- defined security obligations (not “commercially reasonable”)
CFO/COO translation: this is the “who pays if it goes wrong?” question.
A simple approval pattern: allow levels, not “yes/no”
If you want to move fast without taking dumb risk, don’t treat GenAI procurement as a binary decision.
Treat it like authority levels:
- Level 1: Read-only — summarize docs, analyze data, no external sending
- Level 2: Draft-only — produce drafts, human reviews and sends
- Level 3: Execute-with-approval — agent actions with explicit gates
- Level 4: Execute-with-caps — autonomous execution within strict limits
The checklist above becomes easier when you map answers to each level.
A vendor might be acceptable for Level 1–2 today, but not Level 3–4 until:
- audit logs improve
- caps exist
- change control exists
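The levels also translate directly into an enforceable policy: each action requires a minimum level, and a vendor approved at Level 2 simply cannot trigger Level 3 actions. A sketch with illustrative action names:

```python
from enum import IntEnum

class Allow(IntEnum):
    READ_ONLY = 1
    DRAFT_ONLY = 2
    EXECUTE_WITH_APPROVAL = 3
    EXECUTE_WITH_CAPS = 4

# Minimum authority level each action requires (names are illustrative).
REQUIRED_LEVEL = {
    "summarize_docs": Allow.READ_ONLY,
    "draft_email": Allow.DRAFT_ONLY,
    "send_email": Allow.EXECUTE_WITH_APPROVAL,
    "issue_refund": Allow.EXECUTE_WITH_CAPS,
}

def permitted(action: str, vendor_level: Allow) -> bool:
    """An action is allowed only if the vendor's approved level covers it."""
    return vendor_level >= REQUIRED_LEVEL[action]

print(permitted("draft_email", Allow.DRAFT_ONLY))  # True
print(permitted("send_email", Allow.DRAFT_ONLY))   # False
```

Promoting a vendor from Level 2 to Level 3 then becomes a one-line, auditable change rather than a new procurement cycle.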
Closing thought
The CFO/COO job is to convert uncertainty into bounded risk.
GenAI doesn’t change that job.
It just changes the questions.
If you run this checklist before you sign, you can approve GenAI tools quickly and avoid the two failure modes that kill adoption:
- security teams saying “no” forever
- business teams adopting tools anyway (shadow AI)
If you want help building a repeatable GenAI procurement + governance playbook (so approvals become fast and predictable), CDS can help you set up the controls once—then scale.