📝 Publications

🏹 Selected Projects

TMM 2025

Enhancing Weakly Supervised Multimodal Video Anomaly Detection through Text Guidance
Shengyang Sun, Jiashen Hua, Junyi Feng, Xiaojin Gong

Project

  • TGMVAD is the first work to apply in-context learning (ICL) to weakly supervised multimodal video anomaly detection, using it to augment text samples.
TCSVT 2025

Delving Into Instance Modeling for Weakly Supervised Video Anomaly Detection
Shengyang Sun, Jiashen Hua, Junyi Feng, Dongxu Wei, Baisheng Lai, Xiaojin Gong

  • To our knowledge, this is the first work to deliberately explore anomaly contamination and dilution along the temporal dimension, an issue overlooked by prior MIL-based weakly supervised video anomaly detection methods.
CVPR 2023

Hierarchical Semantic Contrast for Scene-aware Video Anomaly Detection
Shengyang Sun, Xiaojin Gong

Code

  • We build a scene-aware reconstruction framework composed of scene-aware feature encoders and object-centric feature decoders for anomaly detection.
  • We propose hierarchical semantic contrastive learning to regularize the encoded features in the latent spaces, making normal features more compact within the same semantic classes and separable between different classes.
ACM MM 2024

TDSD: Text-Driven Scene-Decoupled Weakly Supervised Video Anomaly Detection
Shengyang Sun, Jiashen Hua, Junyi Feng, Dongxu Wei, Baisheng Lai, Xiaojin Gong

  • This is the first work to address scene-dependent video anomaly detection under a weakly supervised setting.
ICME 2024

Multi-scale Bottleneck Transformer for Weakly Supervised Multimodal Violence Detection
Shengyang Sun, Xiaojin Gong

Code

  • We propose a fusion module based on a multi-scale bottleneck transformer (MSBT). It uses a reduced number of bottleneck tokens to transmit gradually condensed information from one modality to another, and a bottleneck token-based weighting scheme to weight the fused features, addressing the information redundancy and modality imbalance problems.
ICME 2023

Long-Short Temporal Co-Teaching for Weakly Supervised Video Anomaly Detection
Shengyang Sun, Xiaojin Gong

Code

  • We employ a co-teaching strategy to train short- and long-term networks alternately and iteratively, so that the two networks can explicitly learn from abnormal events of varying durations.

🌟 Latest Works