| UI or UX | UI + UX - First-use or new-feature onboarding that teaches AI capabilities, limits, uncertainty, user control, and recovery paths | UI + UX - First-run or new-feature orientation that leads to first value | UI + UX - Contextual in-flow feedback | UI + UX - Severe-consequence warning copy before an action | UI + UX - Calibrated reliability and uncertainty display for AI or automated predictions before user action | UI + UX - Whole-answer source coverage and grounding evidence display | UI + UX - AI clarification surface that asks for missing task boundaries before answering, planning, retrieving, or using tools | UI + UX - Runtime checkpoint that pauses AI or automation until an eligible human authorizes the next step |
| UI guidance | Render AI limitation onboarding as a short benefit-and-boundary surface that names what the AI can help with, what it cannot do, what data it uses, where uncertainty appears, when human review is needed, and how to recover if the AI fails. | Render onboarding as a short purposeful path with a visible benefit, current step, skip or later path when safe, persistent resume point, and the next product action users can take immediately after finishing. | Render the message inside the same row, card, panel, form section, or task region that it describes, with a clear tone, concise title, body text, and at most one local action or detail disclosure. | Render warning text as a short high-emphasis statement with a warning icon, visible or visually hidden warning label, and explicit consequence copy placed before the relevant action, declaration, or instruction. | Render confidence and uncertainty as labelled reliability information with confidence band, reason, input scope, calibration status, review threshold, freshness, and the next safe action. | Render source grounding as an answer-wide evidence panel that separates source scope, searched sources, retrieved sources, used sources, supported claims, partially supported claims, unsupported claims, and unresolved source states. | Render scope clarification as a focused question tied to the user's request, with the missing boundary named in plain language: object set, timeframe, audience, source set, authority, output depth, risk limit, or action target. | Render a human approval gate as a paused automation checkpoint with the proposed action, tool or workflow step, triggering rule, risk level, payload snapshot, requester or agent, approver eligibility, timeout, and explicit approve, reject, edit, cancel, or bypass controls. |
| UX guidance | Use AI limitation onboarding when users are first encountering an AI feature, a changed AI capability, or an agent with enough autonomy that users may over-trust, under-trust, anthropomorphize, or misuse it. | Use onboarding only when users need orientation, minimal setup, personalization, or instruction before the normal interface can deliver value, and remove steps that merely market features or repeat what the UI already explains. | Use inline messages when users need contextual feedback while continuing nearby work, such as a row-level warning, section-level success, local policy note, or task-specific next step. | Use warning text when users must understand a serious consequence before acting or failing to act, such as a fine, loss of access, permanent deletion, eligibility impact, or legal responsibility. | Use a confidence and uncertainty display when users need to decide whether an AI prediction, classification, recommendation, extraction, risk assessment, or generated answer is reliable enough to apply. | Use a source grounding display when users need to judge whether an AI answer is backed by the right body of evidence, not merely open one citation. | Use scope clarification when an AI request is too broad, underspecified, conflicting, high-impact, or likely to produce the wrong answer unless the user narrows the task first. | Use a human approval gate when automation is ready to act but policy, risk, confidence, cost, access, publication, deployment, customer impact, or legal consequence requires a human decision before execution continues. |
| Good UI | A contract assistant opens with "I can summarize clauses and flag missing dates, but I may miss legal nuance; review source text before sending" and then offers a sample contract to try safely. | A project-management app asks for role and team size, creates a sample board, highlights the first Add task action, and lets users skip the tour while keeping a setup checklist available. | An invoice row shows Missing billing contact directly beneath the affected customer with Add contact as the only action. | Before Submit declaration, a warning with an exclamation icon says the user may be fined if they provide false information. | A claim classifier shows Medium confidence with a 71 to 78 percent calibrated range, a review threshold of 80 percent, and a conflicting account-age signal, then routes the case to manual review. | A policy answer includes a Grounding panel showing 4 sources searched, 3 retrieved, 2 used, 5 supported claims, 1 partially supported claim, and 1 unsupported claim with a Review action. | An assistant receives "Summarize the issues" and asks "Which issue set should I summarize?" with Current sprint, All open issues, and Selected label as choices before generating. | An AI support agent pauses before issuing a refund, shows the proposed amount, customer, policy match, confidence, source grounding, approver role, timeout, Approve refund, Edit amount, Reject, and Stop run controls. |
| Bad UI | A chat widget says "Meet your expert teammate" and starts answering customer questions with no limits, source scope, review warning, or fallback. | A first launch shows six promotional slides about every feature, requires Next on each slide, and lands on an empty dashboard. | A vague Important message appears above the whole page with no object reference. | A red sentence says "Important" below the submit button after the user has already acted. | A generated answer shows "97 percent sure" without calibration, threshold, source coverage, or review path. | The answer shows a green Grounded badge even though only one citation supports one paragraph. | The assistant guesses all workspaces, creates a long answer, and never reveals that the original request lacked an object boundary. | A banner says "Human approval needed" but does not show the tool call, payload, approver, timeout, or resume consequence. |
| Good UX | A new analyst sees that the AI can draft summaries from selected reports, cannot verify private spreadsheets unless attached, should not be used as final approval, and can be corrected after each answer. | A new admin selects Invite teammates as their goal, imports two sample users, sees progress saved, skips notification setup, and arrives on the team page with the invite action focused. | Users can reveal why export is limited, add the missing contact, and see the local message resolve without losing table context. | Users see the fine or eligibility consequence before checking the declaration and can pause to verify their answer. | A reviewer sees low confidence and out-of-distribution input, opens the reason panel, collects the missing invoice, and avoids auto-denying the claim. | A reviewer opens the grounding panel, sees that the answer used the current policy but not the outdated FAQ, and flags one unsupported claim before publishing. | A user asks for a customer email draft; the assistant asks whether the audience is trial users, enterprise admins, or all affected accounts before drafting. | A billing lead opens the paused refund gate, sees that the amount is under policy but source grounding is partial, edits the refund to the verified amount, approves, and the agent resumes only that step. |
| Bad UX | A user assumes the assistant has read all company documents because onboarding says "Ask me anything", then acts on an answer missing restricted policy files. | A user is forced to configure integrations, notifications, billing, and profile details before they know whether the product solves their task. | The message disappears like a toast even though users still need the invoice reference. | A benefit-loss warning appears only after submission, so users cannot change the decision it warns about. | Users treat a high-confidence label as proof even though the answer has no source grounding and the claim still needs evidence. | A user trusts a generated answer because the product says "Grounded", but the source scope was only web search and did not include internal policy. | A user asks to update permissions and the AI acts on every team because the UI did not clarify the target scope. | A human approves a stale agent action from email and the agent applies it to a different customer state. |
| Best fit | A user is introduced to an AI feature whose abilities, limits, data scope, uncertainty, or review needs are not obvious from the normal interface. | New users need orientation, setup, personalization, or instruction before the regular interface can deliver value. | A visible object or section has local status, warning, success, or next-step information. | A user must understand a serious consequence before taking or skipping an action. | Users must judge whether an AI prediction, classification, recommendation, extraction, risk score, or generated answer is reliable enough to use. | Users need answer-wide evidence coverage before trusting generated content. | A user's AI request has missing object, audience, timeframe, source, workspace, permission, output-depth, or action-target boundaries. | An AI agent, workflow, deployment, or automation is ready to perform a high-impact step and must pause for human authorization. |
| Avoid when | The product only needs standard setup, account orientation, or a first-value path unrelated to AI capability limits. | The product is already understandable through the normal interface. | The message is a one-field validation correction. | The message is a dynamic task status that must be announced when it appears. | The system cannot estimate uncertainty or calibration honestly. | The system cannot determine source scope, retrieval status, or claim support reliably. | The request is clear enough to answer safely and clarification would only slow the user down. | The action has already happened and users only need an audit log. |
| Required state | First-run welcome state with user benefit, AI role, and plain-language capability boundary. | First-run welcome state with benefit-focused copy and one clear next action. | Neutral local context with no message. | No-warning state where the action has no severe consequence. | High confidence state with calibration scope, reason, and whether direct apply is allowed. | Default grounded state with source scope, searched sources, retrieved sources, used sources, and supported-claim count. | Ambiguous request state with original prompt preserved and missing scope named. | Paused gate state with proposed action, payload snapshot, reason for gate, and run context. |
| Accessibility burden | Expose capability, limitation, data scope, uncertainty, review, feedback, and fallback content as readable text, not only icons or color. | Keep onboarding screens in normal heading order with clear titles and step labels. | Keep the message in the reading order near the context it describes. | Do not rely on color alone; include visible or programmatic warning wording and a non-color cue such as an icon. | Expose confidence label, uncertainty reason, threshold, freshness, and gated action as text rather than relying on color, position, or animation alone. | Expose grounding summary, source scope, status counts, unsupported claims, and source groups as text. | Expose original request, missing boundary, choices, selected scope, default, blocked state, and resume status as text. | Expose gate status, proposed action, target, payload summary, risk, approver rule, timeout, and current run state as text. |
| Common misuse | Using charming human-like copy that implies broad expertise while hiding actual AI limits. | Forcing all users through a feature tour before they can do useful work. | Using an inline message for a single field error that should be connected to that input. | Using warning text for routine hints, explanations, or mild reminders. | Showing a fake percent or exact decimal for an uncalibrated model score. | Showing a global Grounded badge when only some claims have evidence. | Guessing broad scope without revealing the assumption. | Showing Approve without the exact action, payload, target, risk, or resume consequence. |