Pistachio: Towards Synthetic, Balanced, and Long-Form Video Anomaly Benchmarks

Jie Li1*    Hongyi Cai2*    Mingkang Dong2    Muxin Pu3    Shan You4    Fei Wang4    Tao Huang5†

1University of Science & Technology Beijing    2Universiti Malaya    3University

4SenseTime Research    5Shanghai Jiao Tong University

*Equal contribution    †Corresponding author

Abstract

Automatically detecting abnormal events in videos is crucial for modern autonomous systems, yet existing Video Anomaly Detection (VAD) benchmarks lack the scene diversity, balanced anomaly coverage, and temporal complexity needed to reliably assess real-world performance. Meanwhile, the community is increasingly moving toward Video Anomaly Understanding (VAU), which requires deeper semantic and causal reasoning but remains difficult to benchmark due to the heavy manual annotation effort it demands. In this paper, we introduce Pistachio, a new VAD/VAU benchmark constructed entirely through a controlled, generation-based pipeline. By leveraging recent advances in video generation models, Pistachio provides precise control over scenes, anomaly types, and temporal narratives, effectively eliminating the biases and limitations of Internet-collected datasets. Our pipeline integrates scene-conditioned anomaly assignment, multi-step storyline generation, and a temporally consistent long-form synthesis strategy that produces coherent 41-second videos with minimal human intervention. Extensive experiments demonstrate the scale, diversity, and complexity of Pistachio, revealing new challenges for existing methods and motivating future research on dynamic and multi-event anomaly understanding.

Video Anomaly Detection

Explore our collection of normal and anomalous videos

Video Anomaly Understanding

Detailed clip-level and event-level annotations for comprehensive anomaly analysis

Clip-level Annotation

1. Fixed wide-angle shot: A quiet urban street with a red-brick building on the right and a black metal fence in front. A fire hydrant is situated near the curb, and a man (wearing a blue shirt and jeans) walks past it heading left.

2. Fixed wide-angle shot: The fire hydrant suddenly bursts open, spraying water forcefully to the left and right, creating a chaotic stream that splashes onto the sidewalk and street. The water flow is uncontrolled and spreads rapidly.

3. Fixed wide-angle shot: The water continues to spray, forming a misty cloud as it hits the pavement. The man who walked past earlier is now seen running away from the hydrant, moving quickly to the left.

4. Fixed wide-angle shot: A woman (wearing a white blouse and black pants) approaches the area from the right, seemingly unaware of the situation. She stops abruptly when she notices the water spray and begins to back up.

5. Fixed wide-angle shot: The woman appears distressed, clutching her chest and gasping for air. She stumbles backward, her face contorted in pain, as she begins to experience sudden illness symptoms.

6. Fixed wide-angle shot: The woman collapses to the ground, her body convulsing in a seizure. She lies motionless on the wet pavement, surrounded by the spreading water.

7. Fixed wide-angle shot: Bystanders gather cautiously from both sides of the street, observing the situation.

8. Fixed wide-angle shot: Some individuals move closer to assess the woman's condition, while others keep their distance due to the water hazard.

Event-level Annotation

A fire hydrant suddenly bursts, spraying water uncontrollably across a street, after which a woman approaching the area abruptly stops, experiences a medical emergency, and collapses to the ground convulsing. Bystanders then gather to assess the situation while maintaining a cautious distance from the water hazard.
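
The two annotation levels shown above can be pictured as a simple record: an ordered list of clip captions plus one event-level summary. The following is a minimal sketch of that idea, not the released Pistachio format. The class names, field names, and the `find_clips_mentioning` helper are hypothetical, and the captions are shortened paraphrases of the sample above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClipAnnotation:
    """One clip-level caption (hypothetical schema, not the official one)."""
    index: int        # 1-based position of the clip within the video
    description: str  # free-text caption for that clip


@dataclass
class VideoAnnotation:
    """Clip-level captions plus a single event-level summary."""
    clips: List[ClipAnnotation]
    event_summary: str


def find_clips_mentioning(annotation: VideoAnnotation, keyword: str) -> List[int]:
    """Return the indices of clips whose caption contains the keyword."""
    needle = keyword.lower()
    return [c.index for c in annotation.clips if needle in c.description.lower()]


# Shortened paraphrases of the sample annotation above, for illustration only.
captions = [
    "A man walks past a fire hydrant on a quiet urban street.",
    "The fire hydrant suddenly bursts open, spraying water.",
    "The man runs away from the hydrant.",
    "A woman approaches and stops when she notices the water spray.",
    "The woman appears distressed, clutching her chest.",
    "The woman collapses to the ground.",
    "Bystanders gather cautiously.",
    "Some individuals move closer to assess the woman's condition.",
]

sample = VideoAnnotation(
    clips=[ClipAnnotation(i, text) for i, text in enumerate(captions, start=1)],
    event_summary=(
        "A fire hydrant bursts, a woman suffers a medical emergency and "
        "collapses, and bystanders gather at a cautious distance."
    ),
)

hydrant_clips = find_clips_mentioning(sample, "hydrant")
woman_clips = find_clips_mentioning(sample, "woman")
```

Keeping the clip captions ordered and indexed makes it straightforward to relate a phrase in the event-level summary back to the clips that support it, which is the kind of lookup a VAU evaluation might need.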

Anomaly Distribution by Scene

Statistical distribution across different scene categories

Citation

If you use Pistachio in your research, please cite:

@misc{li2025pistachiosyntheticbalancedlongform,
      title={Pistachio: Towards Synthetic, Balanced, and Long-Form Video Anomaly Benchmarks}, 
      author={Jie Li and Hongyi Cai and Mingkang Dong and Muxin Pu and Shan You and Fei Wang and Tao Huang},
      year={2025},
      eprint={2511.19474},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2511.19474}
}