Editor’s Note: This article is Part 2 of a three-part series on AI implementation by Kathleen Pearson. In the next installment, she dives into the 3 Steps to Launch Your AI Program. The first article in the series covered Why Most AI Implementations Fail.
Starting with a verified problem and focusing on outcomes ensures artificial intelligence initiatives address meaningful problems and achieve adoption. “No pain, no project” should be the mantra.
One key problem that HR leaders can focus on with AI is the annual performance evaluation process. Drafting performance evaluations is an acute pain point for HR leaders, the process is easy to quantify (in hours lost and complaints logged), and leaders can quickly gain executive sponsorship from those ready to champion a solution because they feel the pain, too.
Here are four steps to prepare for implementing AI to improve the performance evaluation process.
Step 1: Identify a Real Pain Point
Objective: Focus on work that matters and that will earn adoption because stakeholders already feel the friction.
In this case, the pain practically screams: the annual performance evaluation process. Year after year, HR chases managers and employees to complete review forms. Even then, nearly half of reviews are submitted late, and many require follow-up.
Managers often provide sparse write-ups or, worse, use language that violates HR guidelines (such as casually comparing one employee to another or referencing medical leave affecting output — comments that are inappropriate and legally problematic).
From the employee perspective, reviews don't deliver value; employees often report that the feedback is too generic to be helpful. Gallup research shows that only 14% of employees feel inspired to improve by their reviews. This is work that matters: fixing a broken process affecting engagement, fairness, and legal risk.
Method: Run focused conversations asking stakeholders what exactly makes the process painful.
Responses typically map to these themes:
Time and scale: Managers with large teams say, "I simply don't have time to write 10 or 15 quality review narratives." They describe the process as daunting — a pile of blank forms each requiring careful thought, usually squeezed between normal work. This confirms the "blank-page problem": Starting from scratch for each employee is a major bottleneck.
Knowledge and consistency: Some managers admit they're unsure how to phrase feedback, especially constructive criticism, in a way that is both honest and diplomatically worded. They fear saying something "wrong," so they either keep it overly vague or unknowingly include risky remarks. This highlights an education gap: Managers need just-in-time guidance on what not to write.
Process and compliance: HR coordinators point out the heavy audit workload after reviews are submitted. HR must catch mentions of absences protected by the Family and Medical Leave Act, overly subjective comparisons, or a lack of supporting examples for ratings. Manually reading and correcting hundreds of reviews takes weeks. This pain isn't just inefficiency; until every review is audited, it is unchecked legal risk.
The initial drafting step emerges as the root cause: If managers could draft reviews correctly and quickly the first time, it would alleviate the time crunch and reduce downstream rework.
Step 2: Clarify the Desired Outcome
Objective: Avoid automating existing steps that might be unnecessary; instead, clarify the "what" (the end goal) independent of the current flawed process. Define success in terms of the decision or deliverable needed when the process ends.
For the performance evaluation project, ask: What is the true purpose and end state of this process? The answer: a constructive, fair performance feedback narrative for each employee, delivered on time, leading to improved performance and engagement. Deliberately separate this "what" from the current "how" of the process (managers writing paragraphs in a form by a deadline and then HR fixing them).
Outcome Definitions — Key Elements to Clarify:
Intended outcome: Produce high-quality performance evaluation narratives for all employees by a set date, with minimal manual editing needed.
Time and quality targets: Aim for "95% of draft evaluations meet quality and compliance criteria on the first pass," meaning very few need major HR revision. Define "quality" in a granular manner: Each evaluation should contain at least two concrete examples of accomplishments or improvements, at least one actionable development suggestion, and zero policy violations or flagged terms.
Acceptance criteria: Outline what a "good enough" first release looks like:
Accuracy: Content must reflect user intent correctly and not attribute false achievements or criticisms (factual accuracy).
Tone: The narrative should be professional and constructive — not overly stiff, but not too casual. It should align with cultural values and the performance framework.
Completeness: Cover key topics: strengths, areas for improvement, skills gaps, and goals for next year. No section should be skipped.
Compliance flags: There should be absolutely no mentions of protected information (such as health issues or leave status), no biased language, and no comparisons between employees. The presence of any of these would be a failed outcome for the AI draft, indicating that it needs to be reworked.
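The compliance criterion above lends itself to an automated pre-check before any human audit. Below is a minimal sketch; the pattern list and category names are illustrative stand-ins for an organization's actual HR guidelines, not a complete rule set.

```python
import re

# Hypothetical red-flag patterns illustrating the compliance checks described
# above; a real deployment would derive these from HR's own policy guidelines.
FLAG_PATTERNS = {
    "protected_info": r"\b(FMLA|medical leave|health (issue|condition)|pregnan\w+)\b",
    "peer_comparison": r"\b(compared to|better than|worse than) (his|her|their) (peers?|colleagues?|teammates?)\b",
}

def scan_for_flags(narrative: str) -> list[str]:
    """Return the names of any rule categories the draft narrative violates."""
    hits = []
    for name, pattern in FLAG_PATTERNS.items():
        if re.search(pattern, narrative, flags=re.IGNORECASE):
            hits.append(name)
    return hits

draft = "Taylor delivered strong results but took medical leave in Q3."
print(scan_for_flags(draft))  # flags the protected-information mention
```

A draft that returns any flags would fail the acceptance criteria and be sent back for rework, matching the "zero policy violations" target in Step 2.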
This outcome definition framework is a quick exercise that prevents scope creep and debates about what the AI agent could or should do later.
Step 3: Map the Information Required
Objective: Ensure the AI model can do the job with the right data and that humans can validate the results.
For the performance evaluator agent to draft meaningful and compliant reviews, identify several categories of information and plan how to provide each:
Inputs from Humans:
The manager's knowledge of employees’ performance is crucial. Simplify this into four prompt questions that the user answers:
- What did the employee do well this year?
- What could they improve upon?
- Are there any skills gaps or development needs?
- What are the goals and expectations for next year?
These are structured extractions of manager feedback. This Q&A format guides managers to provide the raw facts and reflections the AI needs. It tackles the blank-page problem by breaking the task into answerable chunks. Managers can input short bullet points or brief notes in response.
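The four questions above can be captured as a structured payload before anything reaches the model. A minimal sketch, with illustrative field names rather than a prescribed schema:

```python
from dataclasses import dataclass, asdict

# Illustrative intake structure for the four prompt questions; the field
# names and sample answers are hypothetical, not a prescribed schema.
@dataclass
class ManagerFeedback:
    did_well: str         # What did the employee do well this year?
    could_improve: str    # What could they improve upon?
    skills_gaps: str      # Any skills gaps or development needs?
    next_year_goals: str  # Goals and expectations for next year?

feedback = ManagerFeedback(
    did_well="Shipped the billing migration on time; mentored two new hires",
    could_improve="Status updates are infrequent",
    skills_gaps="Needs deeper SQL performance-tuning experience",
    next_year_goals="Lead one cross-team project end to end",
)
print(asdict(feedback))
```

Capturing answers as discrete fields (rather than one free-text box) keeps the blank-page problem broken into answerable chunks and gives the downstream prompt a predictable shape.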
Internal Knowledge Sources:
The AI agent can't rely solely on bullet points; it needs context about roles and policies. Incorporate:
Organization performance review policy and HR guidelines: Feed the AI a summary of do's and don'ts (e.g., "do focus on behavior and results," "don't make comparisons to peers," and "don't mention age, health, or family status"). Essentially, instruct the AI to avoid forbidden phrases or topics. This acts as built-in compliance checking — the model will be less likely to generate something violating these rules and, in many cases, will proactively warn users if their input includes red-flag statements.
Career pathing guides and role expectations: Provide data on what competencies or results are expected for the employee's job level. For instance, if the employee is a Software Engineer III, the guide might say, "expected to lead small projects and mentor junior staff." The AI agent can use this to gauge if inputs suggest the employee is meeting, exceeding, or falling short of expectations.
By the end of this step, have a minimal data specification listing all inputs (such as manager answers, policy documents, and role guides) and a clear plan of how data flows through the system. This map of the information required gives HR leaders confidence to proceed because it shows the task is feasible with available data and has controls for accuracy and security.
Step 4: Match the Task to the Right AI Approach
Objective: Choose the appropriate AI platform for the task at hand rather than being swayed by hype or defaulting to one tool for everything.
Examine the nature of this task: generating a narrative based on various inputs (user points, policies, and role expectations) in a way that's fluent and contextually appropriate. This clearly points to using a large language model (LLM) — essentially, an AI writing assistant. Specifically, it's an ideation and drafting task with some need for policy compliance checks. Map the task's requirements to the LLM's capabilities to confirm the fit.
Translating a Simple Task to a Capability Map:
- For ideation, drafting, and refinement, using an LLM as a thought partner via well-crafted prompts is ideal.
- Strict deterministic output isn't needed, but best-effort natural language generation is required, which LLMs excel at.
- The volume of reviews (hundreds to thousands) is high enough that automating drafting saves significant time, justifying AI use.
RAG Approach and Hybrid Checks:
Recognize that some questions need company-specific information to be answered effectively. While LLMs are trained on large data sets and can have access to the web, a model that uses a retrieval-augmented generation (RAG) approach can also be directed to look at particular files before responding to a prompt. Use an LLM that can pull in snippets of your policy or guide texts as it formulates the review, ensuring accuracy and alignment with your standards.
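To make the retrieval step concrete, here is a toy sketch that ranks policy snippets by keyword overlap with a query. The snippets are invented examples, and a production RAG system would typically use embedding similarity rather than word overlap.

```python
import re

# Invented policy snippets standing in for real HR guideline documents.
POLICY_SNIPPETS = [
    "Do focus feedback on observable behavior and results.",
    "Don't compare an employee to peers or colleagues.",
    "Don't mention age, health, medical leave, or family status.",
    "Ratings must be supported by at least two concrete examples.",
]

def words(text: str) -> set[str]:
    """Lowercase word set, ignoring punctuation."""
    return set(re.findall(r"[a-z]+", text.lower()))

def retrieve(query: str, snippets: list[str], top_k: int = 2) -> list[str]:
    """Rank snippets by word overlap with the query; return the best matches."""
    ranked = sorted(snippets, key=lambda s: len(words(query) & words(s)), reverse=True)
    return ranked[:top_k]

context = retrieve("supported by concrete examples for ratings", POLICY_SNIPPETS)
print(context)
```

Whatever retrieval method is used, the point is the same: the model formulates each review with the relevant policy text in front of it rather than from memory alone.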
The solution architecture combines a generative pre-trained transformer (GPT) model with custom prompts and retrieval. The list below outlines the general process.
Solution Sketch — Simple Architecture:
- Manager input interface: A web form where managers answer four questions and click "Generate Draft."
- Data retrieval: The system fetches relevant snippets (e.g., policy on language and the role's competency model).
- LLM prompt construction: A carefully designed built-in prompt is used each time behind the scenes.
- Draft generation: The AI agent returns a draft narrative (typically a few paragraphs covering strengths and areas for improvement).
- User review: The draft, along with any flagged sections, is presented to the manager for editing.
- Submission to HR: Once satisfied, the manager submits the narrative into the human resource information system.
- System of record update: After approval, the finalized review is saved in the HR system and shared with the employee.
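The middle steps of the pipeline above can be sketched end to end. In this sketch, `call_llm` is a placeholder stub standing in for whatever LLM API the organization actually uses, and the prompt wording is illustrative.

```python
def call_llm(prompt: str) -> str:
    """Placeholder for the organization's actual LLM API call."""
    return f"[draft narrative generated from a prompt of {len(prompt)} characters]"

def build_prompt(answers: dict[str, str], context: list[str]) -> str:
    """Combine manager answers with retrieved policy snippets into one prompt."""
    sections = ["Write a professional, constructive performance review narrative."]
    sections += [f"Policy: {snippet}" for snippet in context]      # data retrieval output
    sections += [f"{question}: {answer}" for question, answer in answers.items()]
    return "\n".join(sections)

def generate_review(answers: dict[str, str], context: list[str]) -> str:
    prompt = build_prompt(answers, context)  # LLM prompt construction
    return call_llm(prompt)                  # draft generation, returned for manager review

answers = {
    "Did well": "Led the billing migration; mentored two new hires",
    "Could improve": "More frequent status updates",
}
print(generate_review(answers, ["Don't compare an employee to peers."]))
```

The manager-facing form, HR submission, and system-of-record steps wrap around this core: the form collects `answers`, retrieval supplies `context`, and the returned draft goes to the manager for editing before submission.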
Kathleen Pearson is the national director of human resources at Lewis Brisbois, a leading law firm with over 55 offices throughout the U.S. Pearson has more than two decades of expertise in human capital management across global teams and is a recognized thought leader on AI’s transformative potential in HR. She is known for pioneering innovative people strategies that integrate advanced AI solutions into talent management, employee experience, and organizational growth.