AI Ethics for Product Teams: A Practical Checklist
A structured checklist for product managers and engineers building AI-powered features. Covers data sourcing, model selection, user disclosure, and ongoing monitoring.
You are a product manager or engineer about to add an AI-powered feature to your product. What questions should you answer before shipping?
This checklist covers the decisions that matter most. It is organized by phase: before you build, while you build, and after you ship. Each item includes the reasoning behind it.
Before you build
1. Define the problem without AI first
Before selecting a model or vendor, write down the problem you are solving in plain language. If the problem can be solved with rules, thresholds, or simple automation, AI may add complexity without value.
**Ask yourself:**
- Can this be solved with a decision tree or lookup table?
- What is the cost of a wrong answer?
- Does the problem require pattern recognition across unstructured data?
If the answer to the first question is yes, AI likely adds complexity without value; if the cost of a wrong answer is also high, reconsider whether an AI system is the right approach at all. A rules-based version, as in the sketch below, is deterministic and auditable.
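A minimal sketch of the rules-first alternative, assuming a hypothetical refund-eligibility feature; the category names and the 30-day window are invented for illustration:

```python
# Hypothetical refund-eligibility check solved with plain rules.
# Category names and the 30-day window are illustrative, not prescriptive.

def refund_eligible(days_since_purchase: int, category: str) -> bool:
    """Deterministic rules: every outcome traces to a line of code."""
    non_refundable = {"gift_card", "digital_download"}
    if category in non_refundable:
        return False
    return days_since_purchase <= 30

assert refund_eligible(10, "apparel")
assert not refund_eligible(10, "gift_card")
assert not refund_eligible(45, "apparel")
```

If a function this small covers the problem, a model mostly adds failure modes you then have to test for.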
2. Map the people affected
Every AI system has stakeholders beyond its users. A hiring tool affects candidates. A content moderation system affects creators. A credit scoring model affects applicants.
| Stakeholder | What they need | What can go wrong |
|---|---|---|
| Direct users | Accurate, fast results | Over-reliance on AI output |
| Affected parties | Fair treatment, recourse | Discrimination, no appeal process |
| Operators | Clear decision support | Alert fatigue, automation bias |
| Your organization | Reduced risk, compliance | Liability, reputation damage |
Document each group before writing a single line of code.
3. Check your training data
Data issues are the most common source of AI harm. Before training or fine-tuning:
- Where did the data come from? Is it licensed for this use?
- Does it represent the population your system will serve?
- What time period does it cover? Will it become stale?
- Does it contain protected characteristics (race, gender, age, disability)?
- Has anyone audited it for labeling errors or systemic bias?
If you cannot answer these questions, you are not ready to train a model.
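Several of these checks can be automated. The sketch below runs representation, staleness, and per-group label-balance checks over a tiny in-memory sample; the field names (`group`, `label`, `collected`) are assumptions standing in for your real schema:

```python
# Minimal data-audit sketch; field names are placeholders for your schema.
from collections import Counter
from datetime import date

rows = [
    {"label": 1, "group": "A", "collected": date(2021, 3, 1)},
    {"label": 0, "group": "A", "collected": date(2022, 7, 9)},
    {"label": 1, "group": "B", "collected": date(2020, 1, 15)},
]

# Representation: does every group the system will serve appear at volume?
print(Counter(r["group"] for r in rows))      # Counter({'A': 2, 'B': 1})

# Staleness: what time period does the data actually cover?
dates = [r["collected"] for r in rows]
print(min(dates), "to", max(dates))           # 2020-01-15 to 2022-07-09

# Label balance per group: a first signal of systemic labeling bias.
for g in sorted({r["group"] for r in rows}):
    labels = [r["label"] for r in rows if r["group"] == g]
    print(g, sum(labels) / len(labels))       # A 0.5, B 1.0
```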
While you build
4. Choose your model with eyes open
Model selection is a risk decision, not just a performance decision.
| Factor | What to check |
|---|---|
| Accuracy | Benchmark results on tasks similar to yours (not generic benchmarks) |
| Failure modes | What does the model do when uncertain? Does it say "I don't know"? |
| Bias testing | Has the model been evaluated for disparate impact on protected groups? |
| Explainability | Can you explain why the model produced a specific output? |
| Vendor lock-in | Can you switch models without rebuilding your product? |
| Data handling | Where does your data go? Is it used to train future model versions? |
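One practical consequence of the accuracy row: score candidates on your own labeled examples, not published leaderboards. A sketch, where the lambda "models" are placeholders for whichever client each vendor provides:

```python
# Compare candidates on your task, not on generic benchmarks.
# The lambda "models" below stand in for real vendor clients.

def task_accuracy(predict, examples) -> float:
    """Fraction of your labeled, task-specific examples the model gets right."""
    return sum(predict(x) == y for x, y in examples) / len(examples)

examples = [  # labeled cases drawn from your actual product traffic
    ("ticket: refund not received", "billing"),
    ("ticket: app crashes on login", "technical"),
]

candidates = {
    "model_a": lambda text: "billing",
    "model_b": lambda text: "technical",
}
for name, predict in candidates.items():
    print(name, task_accuracy(predict, examples))  # each scores 0.5 here
```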
5. Build in human oversight
No AI system should make high-stakes decisions without a human review path. This means:
- A human can override any AI decision
- Users know when they are interacting with AI
- There is an appeal or escalation process
- Edge cases are flagged for manual review
The EU AI Act requires human oversight for high-risk AI systems. Even if your system is not classified as high-risk, human oversight reduces liability and catches errors the model cannot detect.
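A minimal routing sketch for the review path, assuming the model exposes a confidence score; the 0.85 threshold and the `high_stakes` flag are illustrative values to tune per product:

```python
# Route low-confidence or high-stakes cases to a human queue.
# The threshold and flag names are assumptions, not recommendations.

REVIEW_THRESHOLD = 0.85

def route(confidence: float, high_stakes: bool) -> str:
    if high_stakes or confidence < REVIEW_THRESHOLD:
        return "human_review"   # a person decides; the AI output is advisory
    return "auto"               # still logged and overridable after the fact

print(route(0.92, high_stakes=False))  # auto
print(route(0.92, high_stakes=True))   # human_review
print(route(0.60, high_stakes=False))  # human_review
```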
6. Disclose AI use to users
Users have a right to know when AI is involved in decisions that affect them. At minimum:
- State that AI is used in the feature
- Explain what the AI does (in plain language, not marketing copy)
- Describe what data the AI uses
- Provide a way to opt out or request human review
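One way to keep disclosure consistent across surfaces is a single source of truth the UI renders from. A sketch; every key and string here is illustrative copy, not required wording:

```python
# Single source of truth for AI disclosure; all keys and copy are examples.
AI_DISCLOSURE = {
    "uses_ai": True,
    "what_it_does": "Drafts a suggested reply; you can edit or discard it.",
    "data_used": ["message text", "conversation history"],
    "opt_out_path": "/settings/ai-features",  # hypothetical route
    "human_review": "Select 'Request review' to escalate to a person.",
}
```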
7. Test for harm before launch
Standard software testing (unit tests, integration tests, load tests) is necessary but not sufficient. AI systems need additional testing:
- **Bias testing:** Run the system on demographic subgroups and compare outcomes
- **Adversarial testing:** Try to make the system produce harmful outputs
- **Edge case testing:** What happens with unusual inputs, empty data, or conflicting signals?
- **Failure mode testing:** What happens when the model is uncertain? When the API is down?
Document results and set thresholds for acceptable performance across all groups.
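The bias test in particular is easy to start on. A sketch of a subgroup outcome comparison: compute the favorable-outcome rate per group and compare the gap against a threshold you committed to in advance (the 0.1 gap and the sample data below are invented for illustration):

```python
# Compare favorable-outcome rates across subgroups; the 0.1 gap is a
# placeholder threshold and the data is invented for illustration.

def outcome_rates(results: dict[str, list[int]]) -> dict[str, float]:
    return {g: sum(v) / len(v) for g, v in results.items()}

results = {          # 1 = favorable decision, keyed by subgroup
    "group_a": [1, 1, 0, 1],
    "group_b": [1, 0, 0, 0],
}
rates = outcome_rates(results)
gap = max(rates.values()) - min(rates.values())
status = "PASS" if gap <= 0.1 else "FAIL"
print(rates, f"gap={gap:.2f}", status)   # gap=0.50 FAIL on this sample
```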
After you ship
8. Monitor continuously
AI systems degrade over time. The world changes, user behavior shifts, and data drifts. You need:
- Automated monitoring of accuracy metrics over time
- Alerts when performance drops below thresholds
- Regular bias audits (quarterly at minimum)
- A process for users to report problems
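A rolling-window accuracy alert is a reasonable starting point for the first two items; the window size and threshold below are illustrative defaults:

```python
# Rolling accuracy monitor; window and threshold are illustrative.
from collections import deque

class AccuracyMonitor:
    def __init__(self, window: int = 500, threshold: float = 0.90):
        self.results = deque(maxlen=window)
        self.threshold = threshold

    def record(self, correct: bool) -> None:
        self.results.append(correct)
        full = len(self.results) == self.results.maxlen
        if full and self.accuracy() < self.threshold:
            self.alert()

    def accuracy(self) -> float:
        return sum(self.results) / len(self.results)

    def alert(self) -> None:
        # In production: page on-call, open a ticket, etc.
        print(f"ALERT: rolling accuracy {self.accuracy():.0%} below threshold")

monitor = AccuracyMonitor(window=5, threshold=0.80)
for outcome in [True, True, False, False, True]:
    monitor.record(outcome)   # window fills at 60% accuracy and alerts
```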
9. Maintain an incident log
When something goes wrong, document it. An incident log should include:
- What happened
- Who was affected
- Root cause analysis
- What was changed to prevent recurrence
- Who reviewed the fix
The AI Incident Database (incidentdatabase.ai) catalogs public AI failures. Review it periodically to learn from others' mistakes.
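Even a lightweight structured record beats free-form notes, because it forces every field above to be filled in. A sketch; the sample incident is invented for illustration:

```python
# Structured incident record mirroring the fields above; the sample
# incident is invented for illustration.
from dataclasses import dataclass, field
from datetime import date

@dataclass
class Incident:
    what_happened: str
    who_was_affected: str
    root_cause: str
    fix: str
    reviewed_by: str
    occurred_on: date = field(default_factory=date.today)

incident_log: list[Incident] = [
    Incident(
        what_happened="Moderation model flagged benign posts in Welsh",
        who_was_affected="Welsh-language creators",
        root_cause="Language underrepresented in training data",
        fix="Low-resource languages now routed to manual review",
        reviewed_by="Trust & Safety lead",
    )
]
```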
10. Plan for model retirement
Every model has a lifespan. Plan for:
- How you will migrate to a new model
- What happens to data collected during the model's operation
- How you will notify users of changes
- What documentation you need to preserve for compliance
Decision table
Use this table to assess whether your feature is ready to ship:
| Checkpoint | Status | Notes |
|---|---|---|
| Problem defined without AI | ||
| Stakeholder map complete | ||
| Training data audited | ||
| Model bias tested | ||
| Human oversight built in | ||
| AI use disclosed to users | ||
| Harm testing completed | ||
| Monitoring configured | ||
| Incident process documented | ||
| Retirement plan exists |
If any checkpoint is blank, the feature is not ready.
Sources
- [1] NIST AI Risk Management Framework (primary source)
- [2] EU AI Act, Regulation 2024/1689 (legal source)
- [3] Google PAIR: People + AI Guidebook (vendor claim)
- [4] AI Incident Database, incidentdatabase.ai (independent review)