Spotlight

Research-driven design for GenAI bias reporting.

Role

UX Design Lead

UX Researcher

Duration

August 2024-December 2024

What I Did

Storyboarding/Speed Dating

Think-Aloud Testing

Usability Testing

Metrics Evaluation

UX Flows

Rapid Prototyping

Team

Riya Mody (Project Manager)

Nikita Kulkarni (UX Research Lead)

Jasmine Kim (Visual Design Lead)

At-a-Glance

As part of CMU's User-Centered Research and Design course, I collaborated with a team to explore how everyday users could better report AI bias.

While our team collaborated on research and concept development, I spearheaded the design execution - creating all wireframes and prototypes in Figma and iterating on our interface based on research findings.

This academic project, while using ChatGPT as a case study, was not affiliated with OpenAI.

PROJECT BACKGROUND

Current bias reporting processes in GenAI systems are flawed.

While AI companies rely on technical experts to test for algorithmic bias, these expert-led audits often miss critical issues that only everyday users can discover through their real-world interactions with AI systems.

Primary Goal

How might we empower everyday users to seamlessly report harmful AI behaviors in a way that helps AI teams detect and address biases they might otherwise miss?

Scope

While everyday algorithm auditing could apply to many AI systems, we focused specifically on ChatGPT to deeply understand user behaviors and validate our solution approach.

DISCOVERY

01

DISCOVERY

01

Understanding Current User Behavior

Initial usability testing revealed fundamental flaws in ChatGPT's existing reporting system.

Research Setup

We conducted usability testing with 4 regular ChatGPT users, employing think-aloud protocols to understand their natural interactions with the current reporting system. Each session was recorded via Zoom with two researchers present - one interviewer and one notetaker.

METHOD DEEP-DIVE

Our test was structured to:

  • Have users explore ChatGPT's existing feedback features while thinking out loud

  • Observe their natural reporting behaviors

  • Allow participants to express confusion or uncertainty in real-time

Design Lens

As a designer observing these usability tests, I noticed users consistently highlighting text - a behavior that later inspired our core interaction design.

KEY FINDINGS

Through our analysis of interpretation notes and affinity clustering, we uncovered several critical insights:

🤔

Uncertainty Rules

Users are often uncertain if their reports actually have any impact.

⚡️

Wrong Tool for the Job

Existing feedback system, designed for general comments, lacks the structure needed for meaningful bias reporting.

🤖

Trust Issues

Users showed conflicting attitudes towards bias in AI, with their level of trust affecting their likelihood to report issues.

With these learnings in hand, we moved into our speed dating phase to test different concepts that could make reporting feel more intuitive and integrated into users' workflows.

IDEATION

02

IDEATION

02

SPEED DATING

Early concepts revealed users' desire for lightweight yet meaningful reporting.

Research Setup

We began with a creative ideation session using the Crazy 8s method for four How Might We questions. This rapid ideation helped us generate diverse concepts before developing more detailed storyboards for user testing.

Our Crazy 8s session.

Design Lens

During our Crazy 8s session, I sketched several interactions, drawing from my observation of users' natural behaviors in the discovery phase.

METHOD DEEP-DIVE

Our speed dating process involved:

  • 12 storyboards of various psychological risk levels

  • Presenting varied bias reporting solutions to understand user preferences

  • Gathering rapid feedback on each approach

A storyboard we used during speed dating.

KEY FINDINGS

From these speed dating sessions, we surfaced important findings about user needs.

🎯

Context is Key

There is a distinction between harmful representational bias and beneficial personalized experiences.

🌊

Flow Matters

A reporting system needs to seamlessly integrate into users' existing workflows.

🎨

Personalization Preferred

One-size-fits-all reporting doesn't work; users need options that match their current cognitive load.

With clear user preferences for natural, flexible reporting, we refined our concept into a paper prototype for further testing.

VALIDATION

03

VALIDATION

03

Paper Prototyping

Lo-fi testing validated our highlight-to-report approach.

The Concept

Building on insights from speed dating, we envisioned a seamless reporting system that would activate when users naturally highlight text in ChatGPT.

Research Setup

We created a low-fidelity paper prototype combining printed ChatGPT screenshots with hand-drawn elements to test our new reporting system. We designed specific tasks to test key interactions and assumptions.

Our paper prototype elements.

Design Lens

While creating the paper prototype elements, I worked with my team to ensure careful design of each interaction to test our highlight-to-report hypothesis.

KEY FINDINGS

As testers interacted with our paper prototype and shared their thoughts, several critical patterns emerged.

Natural Interaction

Users naturally gravitated toward highlighting as their first action when spotting problematic content.

⏱️

Time Sensitivity

Decision to provide additional details heavily influenced by users' immediate time constraints and task context.

📝

Flexible Input

Users appreciated the ability to choose how much detail to provide based on their current situation.

Building on the success of our paper prototype, we evolved the concept into Spotlight, enhancing the interface while maintaining the core highlighting interaction users loved.

FINAL SOLUTION

04

FINAL SOLUTION

04

MEET SPOTLIGHt

Spotlight makes reporting AI bias as natural as highlighting text.

Design Lens

Based on our paper prototype findings, I made several key design decisions:

  • Positioned the reporting dialog close to highlighted text to maintain context

  • Simplified the category selection to reduce cognitive load

  • Added progressive disclosure to make detailed reporting optional

Here's how it works:

Highlighting potentially biased text in ChatGPT's response would trigger a "Report Bias" dialog box

After clicking "Report Bias," users can select from predefined bias categories in the same dialog

Users can optionally "Tell Us More" which opens a side panel for additional context

Upon submission, users receive a reference number to track their report's status

SUCCESS METRICS

  • System Usability Scale (SUS) Score: 81.9 (Excellent)

  • Users praised the natural workflow integration

  • Positive feedback on the ability to track report status

  • Strong appreciation for flexible detail levels

"Nice, it's super quick. Not having to like, stop everything I'm doing and go somewhere else."

— P2, on final prototype

"Hah, I'd totally find this by accident just highlighting stuff. Like, it's right there where you're already working."

— P3, on natural workflow integration

Design Lens

Drawing from our research insights, I designed Spotlight's interface to feel as natural as possible. I went through several iterations of the highlighting interaction and dialog placement, testing different approaches to find the most intuitive flow.

View the click-through prototype on a larger screen!

RETROSPECTIVE

06

RETROSPECTIVE

06

RETROSPECTIVE

06

REFLECTING ON MY EXPERIENCE

Learning to think like a researcher: reflecting on my growth.

Initial Learning Goals

Reflections on Achievement

Build a UX research toolkit for future projects

Goal: Build a UX research toolkit for future projects

I gained hands-on experience with a variety of research methods such as think-aloud protocols, speed dating, and usability testing. I also now understand how to select appropriate research methods that serve different research needs.

Become more well-rounded as a UX/product designer

I learned to translate research insights into design decisions, making me a more well-rounded product designer who can bridge research and design.

Learn to effectively collaborate with UX researchers

As both a UX researcher and designer on this project, I gained a unique perspective on research-design collaboration. I developed the ability to understand research processes and how to best incorporate insights into the design process.

Impact on Future Work

While I've always recognized the importance of research, this project deepened my understanding of how research and design truly interweave.

I experienced firsthand how research insights continually shape and refine design decisions at every stage, and hands-on experience transformed my theoretical knowledge into practical expertise.