Title: Beyond the Norm: A Survey of Synthetic Data Generation for Rare Events

Jun 04, 2025

Jingyi Gu, Xuan Zhang, Guiling Wang

Abstract: Extreme events, such as market crashes, natural disasters, and pandemics, are rare but catastrophic, often triggering cascading failures across interconnected systems. Accurate prediction and early warning can help minimize losses and improve preparedness. While data-driven methods offer powerful capabilities for extreme event modeling, they require abundant training data, yet extreme event data is inherently scarce, creating a fundamental challenge. Synthetic data generation has emerged as a powerful solution. However, existing surveys focus on general data with privacy preservation emphasis, rather than extreme events' unique performance requirements. This survey provides the first overview of synthetic data generation for extreme events. We systematically review generative modeling techniques and large language models, particularly those enhanced by statistical theory as well as specialized training and sampling mechanisms to capture heavy-tailed distributions. We summarize benchmark datasets and introduce a tailored evaluation framework covering statistical, dependence, visual, and task-oriented metrics. A central contribution is our in-depth analysis of each metric's applicability in extremeness and domain-specific adaptations, providing actionable guidance for model evaluation in extreme settings. We categorize key application domains and identify underexplored areas like behavioral finance, wildfires, earthquakes, windstorms, and infectious outbreaks. Finally, we outline open challenges, providing a structured foundation for advancing synthetic rare-event research.
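
As an illustration of what a statistical metric for heavy-tailed data might look like, the sketch below compares Hill tail-index estimates for a "real" and a "synthetic" sample. This is only a minimal example under stated assumptions: the abstract does not name the Hill estimator, and the function `hill_tail_index`, the stand-in datasets, and the choice of `k` are all hypothetical choices made here, not taken from the survey.

```python
import math
import random


def hill_tail_index(samples, k):
    """Hill estimate of the tail index alpha from the k largest samples.

    Hypothetical helper for illustration; a smaller alpha means a heavier tail.
    """
    ordered = sorted(samples, reverse=True)
    if not 0 < k < len(ordered):
        raise ValueError("k must satisfy 0 < k < len(samples)")
    threshold = ordered[k]  # the (k+1)-th largest value
    if threshold <= 0:
        raise ValueError("the k+1 largest samples must be positive")
    # Mean log-excess of the top k values over the threshold.
    mean_log_excess = sum(math.log(x / threshold) for x in ordered[:k]) / k
    if mean_log_excess == 0:
        raise ValueError("top samples are all equal; tail index is undefined")
    return 1.0 / mean_log_excess


rng = random.Random(0)

# Stand-in "real" data (assumption): Pareto samples with tail index 2.0.
real = [rng.paretovariate(2.0) for _ in range(20000)]

# Stand-in "synthetic" data (assumption): a generator that misses the heavy
# tail, modeled here as a shifted exponential with a much lighter tail.
synthetic = [1.0 + rng.expovariate(1.0) for _ in range(20000)]

k = 500  # number of upper order statistics; an arbitrary illustrative choice
alpha_real = hill_tail_index(real, k)
alpha_synth = hill_tail_index(synthetic, k)

print(f"Hill tail index, real:      {alpha_real:.2f}")
print(f"Hill tail index, synthetic: {alpha_synth:.2f}")
```

A synthetic sample whose estimated tail index is clearly larger than that of the real data has a lighter tail, which suggests the generator under-represents extremes. In practice the estimate is sensitive to `k`, so it is usually inspected over a range of values rather than at a single one.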
