Generative AI for synthetic data creation in medical imaging

Amélie Durand; Lucas Moreau

Abstract:

The increasing reliance on data-driven algorithms in medical imaging is constrained by the limited availability of labeled datasets, patient privacy restrictions, and class imbalance across disease categories. This study evaluates the role of generative artificial intelligence (AI), specifically diffusion-based models, in creating high-fidelity synthetic medical imaging data to augment real-world datasets and improve diagnostic model performance. Multimodal imaging datasets, including MRI, CT, and X-ray, were used to train and evaluate several generative frameworks: Variational Autoencoders (VAE), StyleGAN2, Denoising Diffusion Probabilistic Models (DDPM), and a differentially private DDPM variant (DP-DDPM). Fréchet Inception Distance (FID), Multi-Scale Structural Similarity (MS-SSIM), and Inception Score (IS) were used to assess realism, while downstream performance was validated on classification and segmentation benchmarks under both internal and external conditions. DDPM consistently achieved the best synthesis quality (FID < 20, MS-SSIM ≈ 0.95) and improved downstream task performance by approximately 4% over real-only baselines. Adding differential privacy noise (DP-DDPM) reduced re-identification risk to below 1% with negligible loss in fidelity. Radiologist validation confirmed over 90% clinical plausibility of synthetic images across modalities. The integrated Fidelity-Utility-Privacy (FUP) score provided a structured evaluation framework, enabling balanced trade-offs between realism, diagnostic utility, and data protection. Overall, the study demonstrates that diffusion-based generative AI can effectively augment medical imaging datasets, enhance model robustness, and support privacy-preserving AI development.
The findings highlight the importance of establishing standardized evaluation protocols, radiologist-guided validation, and governance-aligned scorecards for responsible clinical adoption of synthetic data. This research offers a reproducible blueprint for ethical, scalable, and privacy-conscious data augmentation in medical imaging, promoting equitable access to high-quality AI training data across healthcare institutions.
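
As a rough illustration of the realism metric named in the abstract, the sketch below computes a Fréchet distance between two sets of feature vectors. It is a minimal sketch under stated assumptions, not the study's evaluation code: standard FID uses Inception-v3 features and full covariance matrices, whereas this version assumes diagonal covariance so that it runs with the Python standard library alone. The function name `frechet_distance_diagonal` and the toy feature values are hypothetical.

```python
import math
from statistics import fmean, pvariance


def frechet_distance_diagonal(real_feats, synth_feats):
    """Squared Frechet distance between two Gaussians fitted per dimension.

    Assumption: covariances are diagonal, so the trace term of the full
    formula reduces to a sum of squared differences of standard deviations.
    Inputs are lists of equal-length feature vectors (lists of floats).
    """
    dims = len(real_feats[0])
    total = 0.0
    for d in range(dims):
        real_col = [row[d] for row in real_feats]
        synth_col = [row[d] for row in synth_feats]
        # Squared difference of the per-dimension means.
        mean_term = (fmean(real_col) - fmean(synth_col)) ** 2
        # Squared difference of the per-dimension population standard deviations.
        std_term = (math.sqrt(pvariance(real_col)) - math.sqrt(pvariance(synth_col))) ** 2
        total += mean_term + std_term
    return total


# Toy features (hypothetical): the synthetic set is shifted by 1.0 in the first dimension.
real = [[0.0, 1.0], [2.0, 3.0]]
synth = [[1.0, 1.0], [3.0, 3.0]]
distance = frechet_distance_diagonal(real, synth)
```

Lower values indicate that the synthetic feature distribution lies closer to the real one; identical feature sets give a distance of zero.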

Pages: 18-22

How to cite this article:

Amélie Durand and Lucas Moreau. Generative AI for synthetic data creation in medical imaging. J. Mach. Learn. Data Sci. Artif. Intell. 2025;2(2):18-22.

Vol. 2, Issue 2, Part A (2025)
