Fatos Kusari is a big proponent of continuous improvement—and it pays off. Last year, Kusari, head of employee listening at health care company Johnson & Johnson (J&J), decided to give employees personalized results based on their feedback on an annual engagement survey. He also began using generative AI (GenAI) to summarize the vast number of survey comments.
The idea of giving workers personalized “fulfillment” reports from their engagement survey responses—rather than the standard practice of supplying those findings only to leaders or supervisors—was born of two beliefs, according to Kusari, who recently spoke at experience management provider Qualtrics’ X4 conference in Salt Lake City.
“We wanted to build greater awareness in employees about their level of happiness at work, and we also wanted to empower and equip them with the resources to help improve that satisfaction on their own,” Kusari told conference attendees.
The intention wasn’t to send a message that employees are solely responsible for their own happiness at work, he stressed, but rather that they have a role equal to the organization’s in increasing their own job satisfaction and engagement.
J&J employees were asked to respond to 10 survey questions identified by internal research as strong predictors of attrition and future performance. The questions focused on skill development, growth, performance feedback, autonomy, using their strengths, the complexity of work, alignment, well-being, belonging, and inclusion.
Over 90% of employees opted in to receive personalized reports based on their responses to those questions, Kusari said. Each employee was then given two areas to focus on where the data showed opportunities to increase their satisfaction at work, along with recommended resources in J&J’s internal university to help them work on those areas. Those resources included courses, experiential exercises, reflection questions, and assessments.
“We personalized and created employee learning journeys tied to each of those 10 questions in the survey,” Kusari said.
While he chose not to give J&J managers access to the personalized engagement reports at this time, it’s something he’s considering for the future.
“My vision is that employees might one day bring their personalized reports to their managers and have a discussion about things like team dynamics, culture, and how to improve engagement of an entire team,” Kusari said.
Using GenAI to Summarize Survey Comments
Like many of his HR peers, Kusari was intrigued by the potential of GenAI tools such as OpenAI’s ChatGPT to save time in summarizing the large volume of employee comments received via the engagement survey. “Those comments provide rich nuances not expressed in scaled survey items, which makes them very valuable to us,” Kusari said.
The concern was whether ChatGPT could accurately summarize and categorize unstructured data, namely open-ended responses. So Kusari conducted a series of tests to gauge ChatGPT’s validity and reliability in capturing survey comments.
His team first built an interface between OpenAI and J&J, then uploaded 140,000 employee comments to J&J’s IT infrastructure to keep them secure. Comments to be summarized were generated from two questions on the annual survey: What is one thing the company does well, and what is one thing it needs to improve?
Kusari then had to craft a prompt for ChatGPT that would properly summarize that large number of comments—a task that proved more challenging than initially thought.
“I spent about two months creating and refining that prompt,” he said. “I needed the prompt to do three things. It had to create an accurate summary, create a meaningful summary, and also ensure the summary would be engaging and inspiring to read.”
In the end, the ChatGPT prompt consisted of 100 lines of text and 30 different commands, Kusari said. Two of the commands were “Start with positive themes, transition smoothly to mixed themes, and end with more negative themes” and “Be sure to include a section that highlights employee recommendations.”
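As an illustration only, a long multi-command prompt like the one described could be organized in code as a list of discrete instructions. The two commands quoted above come from Kusari’s account; the surrounding structure, the other command text, and the `build_prompt` helper below are assumptions, not J&J’s actual prompt.

```python
# Hypothetical sketch: a long summarization prompt kept as a list of
# discrete commands. Only the two quoted commands come from the article;
# everything else is illustrative.
COMMANDS = [
    "Summarize the employee survey comments provided below.",
    ("Start with positive themes, transition smoothly to mixed themes, "
     "and end with more negative themes."),
    "Be sure to include a section that highlights employee recommendations.",
    # ...the full prompt reportedly ran to about 100 lines and 30 commands...
]

def build_prompt(comments):
    """Number the commands, then append the raw comments to summarize."""
    numbered = "\n".join(f"{i}. {c}" for i, c in enumerate(COMMANDS, 1))
    body = "\n".join(f"- {c}" for c in comments)
    return f"{numbered}\n\nEMPLOYEE COMMENTS:\n{body}"
```

Keeping each instruction as its own list item makes a 30-command prompt easier to reorder, test, and refine one command at a time.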
The final product contained 15 summaries that would ultimately be sent to J&J’s top leadership team for review. “But before I released those summaries, I had to ensure the validity and reliability of the data,” Kusari said.
To measure the summaries’ accuracy, he compared polarity scores of ChatGPT and Text iQ, a text and sentiment analysis tool from Qualtrics that he uses. Text iQ assigns a sentiment score for each employee comment; Kusari used a similar scale for ChatGPT. Scores from the two tools proved to be 90% consistent.
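A consistency check of that kind could look something like the sketch below, which assumes both tools’ output has been mapped onto the same simple polarity scale (-1 negative, 0 neutral, +1 positive). The scale, the sample data, and the `polarity_agreement` function are illustrative assumptions, not J&J’s actual method.

```python
def polarity_agreement(scores_a, scores_b, tol=0):
    """Fraction of comments on which two sentiment tools agree,
    within a tolerance, once both are on the same numeric scale."""
    assert len(scores_a) == len(scores_b)
    matches = sum(1 for a, b in zip(scores_a, scores_b) if abs(a - b) <= tol)
    return matches / len(scores_a)

# Toy example: per-comment polarity from two tools on a -1/0/+1 scale
chatgpt = [1, 1, 0, -1, 1, 0, -1, 1, 0, 1]
text_iq = [1, 1, 0, -1, 1, -1, -1, 1, 0, 1]
polarity_agreement(chatgpt, text_iq)  # → 0.9
```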
To further confirm validity, Kusari compared ChatGPT’s summary outputs to those of human coders in his organization. “I had my team code the themes, and we compared them with ChatGPT,” he said. The findings were very similar.
Kusari also tested ChatGPT’s reliability, meaning the consistency of its results over time, and found that while the results were good, they raised more concerns than the validity tests had.
“The result of the experiment was that while GenAI produced highly valid or accurate results, it produced moderate consistency over time,” Kusari said. “Regarding reliability, I compare it to a chef who cooks a great dish every time he makes it, but the flavor of the dish might be somewhat different each time.”
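One way to quantify that “same dish, different flavor” behavior is to run the same prompt several times and measure how much the extracted theme sets overlap. The Jaccard-similarity sketch below is an assumed approach for illustration, not the test Kusari actually ran.

```python
from itertools import combinations

def jaccard(a, b):
    """Overlap between two theme sets: |A and B| / |A or B|."""
    return len(a & b) / len(a | b) if (a | b) else 1.0

def mean_pairwise_similarity(runs):
    """Average Jaccard similarity across every pair of repeated runs;
    1.0 means perfectly repeatable themes, lower values mean more drift."""
    pairs = list(combinations(runs, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)

# Illustrative theme sets from three hypothetical runs of one prompt
runs = [
    {"career growth", "pay", "workload"},
    {"career growth", "pay", "communication"},
    {"career growth", "pay", "workload"},
]
mean_pairwise_similarity(runs)  # ≈ 0.67
```

A score well below 1.0 on repeated runs would flag exactly the “moderate consistency” Kusari describes.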
Test findings were convincing enough overall that Kusari opted to release ChatGPT’s summaries to J&J’s top leaders, with a disclaimer about the reliability findings. It turned out to be the right move: An HR executive in the company gave the summaries a thumbs-up after reading 500 original employee comments and comparing them to ChatGPT’s related summaries. J&J’s CEO also referenced Kusari’s work during a global town hall meeting, saying he was able to easily get a barometer of employee sentiment by reading the GenAI summaries.
Lessons Learned
Kusari learned several lessons in his first time using GenAI that he believes can help other HR professionals who are new to the technology. Among them:
* Using GenAI is like conducting an orchestra. “Your cues as the conductor shape the melody of the orchestra, but be ready for some surprise notes,” Kusari said. “That makes validating your results critical.”
* GenAI is an eager assistant. The technology is designed to please those providing it with prompts or instructions, Kusari said, an attribute that can sometimes have downsides such as producing hallucinations or inaccuracies. “My recommendation is to start with datasets you know extremely well so you have a good idea of what GenAI should produce in terms of summaries, to limit the hallucinations,” he said. To ensure greater accuracy, Kusari also used prompts to summarize qualitative data, such as, “Only show me themes supported by 10 or more employee comments” and “Ensure no external sources are used and all analysis and feedback is based on internal comments.”
* GenAI rarely produces the same output twice. “I found GenAI will produce similar outputs, but not the exact same ones,” Kusari said. “That initially was challenging for me, especially since many of our organization’s leaders come from scientific backgrounds and expect precision in data.”
* Writing good prompts is like learning a second language. “I started with small prompts and gradually improved over time with feedback and practice,” said Kusari, who noted that prompts are highly responsive even to slight word changes. He also found that asking ChatGPT itself to evaluate the quality of his prompts paid dividends: “ChatGPT provided feedback like telling me some of my prompts were too complex and suggesting ways to simplify them.”
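The “themes supported by 10 or more employee comments” constraint Kusari mentions is also easy to mirror as a post-hoc check on model output. The sketch below is an illustrative assumption, not part of Kusari’s pipeline: it takes the themes a model tagged each comment with and drops any theme lacking enough support.

```python
from collections import Counter

def supported_themes(comment_themes, min_support=10):
    """Given one list of themes per comment (as tagged by the model),
    keep only themes backed by at least `min_support` distinct comments."""
    counts = Counter(t for themes in comment_themes for t in set(themes))
    return {t for t, n in counts.items() if n >= min_support}

# Toy data: each inner list is the themes tagged for one comment
tagged = [["pay"], ["pay", "growth"], ["pay"], ["growth"]]
supported_themes(tagged, min_support=3)  # → {"pay"}
```

Verifying the constraint in code, rather than trusting the prompt alone, guards against the model surfacing a “theme” invented from a handful of comments.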
Dave Zielinski is principal of Skiwood Communication, a business writing and editing firm in Minneapolis.