PhD Thesis: Differentially Private Synthetic Data Generation for Mobile Money Fraud Detection
Abstract
We live in an era where routine transactions, from paying domestic bills to buying groceries, are carried out using mobile financial services. However, the rapid growth and uptake of these services have amplified security and privacy risks, including SIM swap attacks, identity fraud, data theft, refund fraud, and unauthorized fees. Advances in machine learning (ML) show potential for detecting financial fraud in mobile money transactions, yet this requires access to large volumes of transaction data. Research on mobile money fraud has been hindered by data sensitivity and privacy concerns that restrict access to such datasets. In addition, real mobile money datasets are class-imbalanced, with far fewer fraudulent than legitimate transactions, which biases ML models toward the majority class. This thesis presents a differentially private synthetic data generation approach for mobile money transaction datasets to support financial modeling and fraud detection.
Developing a synthetic data generation model for tabular data that preserves the intricate, high-order correlations that drive fraud while guaranteeing differential privacy remains notoriously difficult. This challenge stems from calibration fragility in high-dimensional spaces and a parameter search space that expands exponentially, requiring thousands of stochastic runs for model convergence. Existing synthetic data generation methods do not accurately model sparse, event-driven features, while simpler resampling techniques risk leaking private information and struggle to capture evolving fraud tactics in real mobile money ecosystems.
This thesis develops synthetic data generation techniques to address these limitations. It introduces MoMTSim, a multi-agent-based simulation model that simulates interactions among clients, merchants, and banks. MoMTSim is calibrated using transaction aggregates derived from a real mobile money transaction dataset. Its fidelity is assessed using the sum of squared errors, Kolmogorov–Smirnov tests, and visual diagnostics such as Bland–Altman plots and kernel density estimates. The results show a close resemblance to real data, with a total error of 2.0010 at the 100 000-client benchmark. The thesis also presents MoMTSimDP, a differentially private extension of MoMTSim that applies the Gaussian mechanism and satisfies an (ε, δ)-differential privacy guarantee with ε = 1.0 and δ = 10⁻⁶. MoMTSimDP maintains high fidelity, achieving a comparable total error of 2.0070 at 100 000 clients.
Inference fidelity analysis shows that ML models trained on MoMTSim and MoMTSimDP data preserve key structural and multivariate relationships. Random forest and XGBoost maintain high feature-importance agreement with real-data models, even under differential privacy. Classification results also show that both models remain resilient, achieving AUCs of at least 0.79. The simulation model is embodied in MoMTLab, a graph-based platform built to enable visual analysis of mobile money transaction patterns.
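As an illustration of the Gaussian mechanism mentioned in the abstract, the following is a minimal sketch of how noise might be calibrated for a privacy budget of ε = 1.0 and δ = 10⁻⁶. It is not the MoMTSimDP implementation. It assumes the analytic Gaussian mechanism condition of Balle and Wang (2018), which holds for any ε > 0, and finds the noise scale by bisection. The function names, the example aggregates, and the L2 sensitivity of 1.0 are all hypothetical.

```python
import math
import random


def std_normal_cdf(x: float) -> float:
    """Standard normal CDF, written with erfc for accuracy in the tails."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def gaussian_delta(sigma: float, epsilon: float, sensitivity: float = 1.0) -> float:
    """Smallest delta for which N(0, sigma^2) noise is (epsilon, delta)-DP.

    Uses the analytic Gaussian mechanism condition (assumed here, not
    taken from the thesis).
    """
    a = sensitivity / (2.0 * sigma)
    b = epsilon * sigma / sensitivity
    return std_normal_cdf(a - b) - math.exp(epsilon) * std_normal_cdf(-a - b)


def calibrate_sigma(epsilon: float, delta: float, sensitivity: float = 1.0) -> float:
    """Bisection for a near-minimal sigma that meets the (epsilon, delta) target."""
    lo, hi = 1e-6 * sensitivity, sensitivity
    # Grow the upper bracket until it satisfies the privacy target.
    while gaussian_delta(hi, epsilon, sensitivity) > delta:
        hi *= 2.0
    # Invariant: lo violates the target, hi satisfies it.
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if gaussian_delta(mid, epsilon, sensitivity) > delta:
            lo = mid
        else:
            hi = mid
    return hi


def privatize(values, epsilon, delta, sensitivity, rng):
    """Add i.i.d. Gaussian noise to a vector of aggregates.

    `sensitivity` is the L2 sensitivity of the whole vector (hypothetical here).
    """
    sigma = calibrate_sigma(epsilon, delta, sensitivity)
    return [v + rng.gauss(0.0, sigma) for v in values]


if __name__ == "__main__":
    # Hypothetical transaction-count aggregates, not data from the thesis.
    aggregates = [1200.0, 340.0, 58.0]
    noisy = privatize(aggregates, epsilon=1.0, delta=1e-6,
                      sensitivity=1.0, rng=random.Random(0))
    print(noisy)
```

The classical calibration σ = Δ·sqrt(2 ln(1.25/δ))/ε is simpler, but its standard proof covers ε < 1, so the analytic condition is used here because it is valid at ε = 1.0.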
Publications
Azamuke, Denish, M. Katarahweire, and E. Bainomugisha. 2025. “A labeled synthetic mobile money transaction dataset,” Elsevier Data in Brief.
Azamuke, Denish, M. Katarahweire, and E. Bainomugisha. 2024. “MoMTSim: A multi-agent-based simulation platform calibrated for mobile money transactions,” IEEE Access.
Azamuke, Denish, M. Katarahweire, and E. Bainomugisha. 2025. “MoMTSimDP: A Differentially Private Simulator for Mobile Money Transactions,” in 2025 IEEE/ACM Symposium on Software Engineering in the Global South (SEiGS), Ottawa, ON, Canada, pp. 53–58.
Azamuke, Denish, M. Katarahweire, and E. Bainomugisha. 2023. “Financial fraud detection using rich mobile money transaction datasets,” in Towards New E-Infrastructure and e-Services for Developing Countries. AFRICOMM 2023. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering.
Azamuke, Denish, M. Katarahweire, J. Muleesi Businge, S. Kizza, C. Opio, and E. Bainomugisha. 2023. “Refining detection mechanism of mobile money fraud using MoMTSim platform,” in Pan African Conference on Artificial Intelligence, Springer, pp. 62–82.
Azamuke, D. et al. 2025. “DeepFakesUG: Detecting Counterfeit Ugandan Banknotes Using Deep Learning,” in Girma Debelee, T., Ibenthal, A., Schwenker, F., Megersa Ayano, Y. (eds), Pan-African Conference on Artificial Intelligence. PanAfriCon AI 2024. Communications in Computer and Information Science, vol 2550.
Azamuke, Denish, M. Katarahweire, and E. Bainomugisha. 2022. “Scenario-based synthetic dataset generation for mobile money transactions,” in Proceedings of the Federated Africa and Middle East Conference on Software Engineering, pp. 64–72.