MPA (Misleading Pun Accuracy): Correctly recognizing false puns.
(MPA− and MPA+ indicate whether the original pun's subject has been substituted with a dissimilar or a similar item, respectively.)
| # | Model | CPA | MPA− | MPA+ | MPA |
|---|---|---|---|---|---|
| 1 | o3-mini ⭐ 🧠 | 78.3 | 6.0 | 3.4 | 4.7 |
| 2 | Gemini-2.0-Flash-Think ⭐ 🧠 | 71.1 | 6.9 | 24.6 | 15.8 |
| 3 | LLaMA-3.3 (70B) | 70.0 | 29.4 | 29.1 | 29.3 |
| 4 | GPT-4o ⭐ | 64.9 | 14.9 | 17.4 | 16.2 |
| 5 | Phi-4 (14B) | 64.6 | 9.4 | 13.7 | 11.6 |
| 6 | Gemini-2.0-Flash ⭐ | 44.6 | 44.9 | 35.7 | 40.3 |
| 7 | GPT-4o-mini ⭐ | 36.3 | 10.9 | 11.7 | 23.6 |
| 8 | LLaMA-3.1 (8B) | 29.7 | 0.0 | 0.3 | 0.2 |
| 9 | Phi-3.5 (3B) | 14.9 | 48.3 | 47.1 | 47.7 |
| 10 | Humans 👤 | 87.9 | 87.3 | 94.4 | 90.9 |
Note: ⭐ Closed-source models. 🧠 Reasoning-focused models.
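
As a rough illustration of how a score like MPA could be computed, the sketch below counts how often false puns are correctly rejected on each split and then combines the two splits. It rests on two assumptions not stated above: that scores are percentages, and that the overall MPA is the unweighted mean of MPA− and MPA+ (most rows of the table appear consistent with this). The function name and the sample judgements are hypothetical.

```python
from statistics import mean


def misleading_pun_accuracy(said_pun):
    """Percentage of false puns that were correctly rejected.

    said_pun: list of bools, one per misleading (false) pun.
    True means the model called it a pun (an error);
    False means the model rejected it (correct).
    """
    if not said_pun:
        raise ValueError("need at least one judgement")
    correct = sum(1 for verdict in said_pun if not verdict)
    return 100.0 * correct / len(said_pun)


# Hypothetical judgements, for illustration only.
dissimilar_split = [False, False, True, False]  # subject swapped for a dissimilar item
similar_split = [True, False, True, False]      # subject swapped for a similar item

mpa_minus = misleading_pun_accuracy(dissimilar_split)
mpa_plus = misleading_pun_accuracy(similar_split)

# Assumption: overall MPA is the unweighted mean of the two splits.
mpa = mean([mpa_minus, mpa_plus])

print(f"MPA-={mpa_minus:.1f}  MPA+={mpa_plus:.1f}  MPA={mpa:.1f}")
```

If the two splits contain different numbers of items, a weighted mean over all items would give a different overall figure, so the choice of averaging matters when comparing against reported values.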