Optimizing Comedy Robots

Human-Robot Interaction

Systematic review on making robots funny through AI and performative techniques

Lead Researcher & Author
Academic Term (2021)
Noor Naila Imtinan Himam
UBC Library Summon, IEEE Xplore, Systematic Review Protocol
Overview

Can robots be funny? This systematic review synthesized six empirical studies on comedy robots performing for human audiences. The review found that the most effective comedy robots combine natural language processing (NLP) for joke generation, reinforcement learning (RL) for audience adaptation, and human performative techniques like gaze coordination and gestural timing.

Research Question

As social robots increasingly enter public-facing roles, humor becomes a critical social skill. But what makes a robot actually funny to a human audience? Existing comedy robots range from pre-scripted joke-tellers to AI-driven performers, with inconsistent results. This review aimed to identify what combination of technical and performative strategies produces the most positive audience response.

6 studies meeting inclusion criteria

197+ total participants across studies

p<0.05: significant effects in all 6 studies

Process

Search Strategy

Systematic search using UBC Library Summon across peer-reviewed HCI conferences and journals (IEEE, ACM). Boolean keywords: (comedy OR humour OR joke) AND audience AND robot.
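Read with explicit grouping, the search string is (comedy OR humour OR joke) AND audience AND robot. The sketch below shows that logic applied to a few made-up titles. It is only an illustration of the Boolean structure, not how UBC Library Summon evaluates queries; the function name `matches_query` and the sample titles are assumptions.

```python
import re

# Keyword groups taken from the search string; the grouping is made explicit here.
HUMOUR_TERMS = {"comedy", "humour", "joke"}   # any one of these (OR)
REQUIRED_TERMS = {"audience", "robot"}        # all of these (AND)


def matches_query(text: str) -> bool:
    """Return True if text satisfies (comedy OR humour OR joke) AND audience AND robot."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    has_humour_term = bool(words & HUMOUR_TERMS)
    has_required_terms = REQUIRED_TERMS <= words
    return has_humour_term and has_required_terms


# Hypothetical titles, invented for illustration only.
titles = [
    "A robot joke teller adapts to its audience",
    "Comedy writing for human performers",
    "Robot navigation with a live audience",
]
hits = [t for t in titles if matches_query(t)]
print(hits)
```

This sketch matches whole words only, so "robots" or "jokes" would not count; a real library search typically handles plurals and variant spellings itself.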

Screening

Inclusion criteria: English-language papers published 2015-2021; empirical studies (quantitative or qualitative) with human participants that evaluate comedy robot performance.
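The inclusion criteria can be written as a simple filter. This is a minimal sketch, assuming each candidate paper has already been coded by hand for language, year, study type and participants; the `Paper` fields and the sample records are hypothetical and do not come from the review's actual screening sheet.

```python
from dataclasses import dataclass


@dataclass
class Paper:
    # Hypothetical screening fields, filled in manually per candidate paper.
    title: str
    year: int
    language: str
    empirical: bool
    human_participants: bool
    evaluates_comedy_robot: bool


def meets_inclusion_criteria(paper: Paper) -> bool:
    """Apply the review's stated inclusion criteria to one coded paper."""
    return (
        paper.language == "English"
        and 2015 <= paper.year <= 2021
        and paper.empirical
        and paper.human_participants
        and paper.evaluates_comedy_robot
    )


# Placeholder records, invented for illustration only.
candidates = [
    Paper("Placeholder A", 2018, "English", True, True, True),
    Paper("Placeholder B", 2012, "English", True, True, True),    # outside date range
    Paper("Placeholder C", 2019, "English", False, False, True),  # not empirical
]
included = [p.title for p in candidates if meets_inclusion_criteria(p)]
print(included)
```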

Data Extraction

Extracted methodology, robot platform, audience size, comedy type, AI techniques used, and statistical significance of findings from each study.

Synthesis

Compared approaches across studies, categorizing each by joke generation method, adaptation strategy, and performative technique.

Critical Analysis

Identified gaps in existing research, including sample size limitations, venue constraints, and unexplored modalities such as voice tone.

Key Findings Across Studies

1. Performative gaze and pointing gestures significantly affect audience response (p<0.01): robots that look at and gesture toward audiences get a better response.

2. Manzai-style robots using NLP can generate original jokes from web news articles, producing routines rated as interesting and understandable.

3. Reinforcement learning with social adaptation (gaze, prosody, smile detection) outperforms static or table-based approaches.

4. Time adaptivity is critical: robots that adjust timing based on audience reactions are significantly funnier (p=0.001).

5. Street-style public performances attract more natural audience reactions than lab settings, but audiences often respond out of politeness.

6. Monotonous robot voice remains the biggest gap: no study has successfully addressed vocal expressiveness in comedy delivery.
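
To make the idea of learning from audience reactions concrete, here is a minimal sketch of an epsilon-greedy bandit, one of the simplest forms of reinforcement learning. It is not the method used in any of the reviewed studies. The class name `JokeSelector`, the joke categories and the reward values are assumptions; in a real system the reward would come from social signals such as smile or laughter detection.

```python
import random


class JokeSelector:
    """Epsilon-greedy choice among joke categories (illustrative only)."""

    def __init__(self, categories, epsilon=0.1, seed=0):
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = {c: 0 for c in categories}    # times each category was told
        self.values = {c: 0.0 for c in categories}  # running mean audience reward

    def choose(self):
        # Explore a random category with probability epsilon, otherwise exploit.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.counts))
        return max(self.values, key=self.values.get)

    def update(self, category, reward):
        # Incremental mean: new_mean = old_mean + (reward - old_mean) / n
        self.counts[category] += 1
        n = self.counts[category]
        self.values[category] += (reward - self.values[category]) / n


# Hypothetical categories and audience rewards (1.0 = laugh, 0.0 = silence).
selector = JokeSelector(["pun", "slapstick", "observational"], epsilon=0.0)
selector.update("pun", 1.0)
selector.update("pun", 0.0)
selector.update("slapstick", 0.0)
print(selector.values)
print(selector.choose())
```

With `epsilon=0.0` the selector always picks the category with the highest average reward so far; a non-zero epsilon lets it occasionally try other categories, which is how it could keep adapting as the audience changes.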

Study Comparison

Katevas et al. (2015): 50 participants, performative gaze + gestures

Umetani et al. (2016): 11 participants, NLP joke generation (Manzai)

Ritschel (2020): 30 participants, RL with social adaptation

Weber et al. (2018): 24 participants, real-time RL + social signals

Vilk & Fitter (2020): 10-20 participants, timing adaptivity

Swaminathan et al. (2021): 72 participants, street-style comedy
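
The participant counts listed for the six studies can be summed to check the 197+ total. The sketch below stores each study as a (minimum, maximum) pair, because one study reports a range of 10-20; the variable names are illustrative.

```python
# Participant counts as listed in the study comparison: (minimum, maximum).
studies = {
    "Katevas et al. (2015)": (50, 50),
    "Umetani et al. (2016)": (11, 11),
    "Ritschel (2020)": (30, 30),
    "Weber et al. (2018)": (24, 24),
    "Vilk & Fitter (2020)": (10, 20),
    "Swaminathan et al. (2021)": (72, 72),
}

total_min = sum(low for low, _ in studies.values())
total_max = sum(high for _, high in studies.values())

# Studies whose largest reported sample is still below 30 participants.
under_30 = [name for name, (_, high) in studies.items() if high < 30]

print(f"Total participants: {total_min}-{total_max}")
print(f"Studies with fewer than 30 participants: {len(under_30)} of {len(studies)}")
```

The lower bound of 197 matches the headline figure; the upper bound is 207 if the timing adaptivity study is counted at 20 participants.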

Conclusions & Implications

1. The ideal comedy robot

The ideal comedy robot combines NLP-based joke generation, reinforcement learning for real-time audience adaptation, performative gaze and gestures, and time-adaptive delivery. No single study achieved all four.

2. Lab vs. real world

Lab settings provide controlled conditions but may inflate positive responses. Street-style studies reveal that audiences often respond out of social politeness rather than genuine amusement.

3. The voice gap

Across all six studies, monotonous robot voice was identified as a major limitation. Future research should prioritize vocal expressiveness and prosody variation.

4. Small sample sizes

Four of the six studies used 30 or fewer participants, limiting generalizability. Larger, more diverse audience samples are needed to validate these findings.

Systematic Review Methodology, Boolean Search Strategy, Critical Analysis, HRI Literature, Academic Writing