Patient-centric synthetic data generation: a new methodology for Chronic Kidney Disease

Access to medical data is often restricted due to privacy and security policies. Synthetic data generation from real data is a widely adopted technique to address these limitations. This research presents a patient-centric methodology for generating synthetic data, specifically designed for patients...

Bibliographic Details
Main authors: Álvarez, Candelaria; Ibeas, José; Balladini, Javier; Suppi, Remo
Format: Conference object
Language: English
Published: 2024
Subjects: Computer Science; patient-centric methodology; synthetic data generation; chronic kidney disease
Online access: http://sedici.unlp.edu.ar/handle/10915/176195
Contributed by: SEDICI (UNLP), Universidad Nacional de La Plata
id I19-R120-10915-176195
record_format dspace
spelling I19-R120-10915-176195
2025-02-06T20:05:40Z
http://sedici.unlp.edu.ar/handle/10915/176195
Patient-centric synthetic data generation: a new methodology for Chronic Kidney Disease
Álvarez, Candelaria
Ibeas, José
Balladini, Javier
Suppi, Remo
2024-10
2024
2025-02-06T13:02:44Z
en
Computer Science
patient-centric methodology
synthetic data generation
chronic kidney disease
Access to medical data is often restricted due to privacy and security policies. Synthetic data generation from real data is a widely adopted technique to address these limitations. This research presents a patient-centric methodology for generating synthetic data, specifically designed for patients diagnosed with Chronic Kidney Disease (CKD). The key advantage of this proposal is its explainability and the traceability of the results, as it relies on statistics and data analysis rather than AI algorithms. The MIMIC-III clinical dataset serves as the foundation for generating synthetic patients in this study. This article details the data preprocessing and filtering applied to this dataset. Subsequently, synthetic data for CKD patients is generated using the proposed methodology. A comparison is then conducted between the synthetic data and the real data. Additionally, the synthetic data is compared with results obtained using the AI algorithm known as SMOTE. Generally, the metrics for the synthetic data generated by SMOTE are slightly superior. However, the results obtained with the proposed methodology exhibit minimal deviations from the MIMIC data across most CKD stages.
Red de Universidades con Carreras en Informática
Conference object
http://creativecommons.org/licenses/by-nc-sa/4.0/
Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
application/pdf
280-289
institution Universidad Nacional de La Plata
institution_str I-19
repository_str R-120
collection SEDICI (UNLP)
language English
topic Computer Science
patient-centric methodology
synthetic data generation
chronic kidney disease
description Access to medical data is often restricted due to privacy and security policies. Synthetic data generation from real data is a widely adopted technique to address these limitations. This research presents a patient-centric methodology for generating synthetic data, specifically designed for patients diagnosed with Chronic Kidney Disease (CKD). The key advantage of this proposal is its explainability and the traceability of the results, as it relies on statistics and data analysis rather than AI algorithms. The MIMIC-III clinical dataset serves as the foundation for generating synthetic patients in this study. This article details the data preprocessing and filtering applied to this dataset. Subsequently, synthetic data for CKD patients is generated using the proposed methodology. A comparison is then conducted between the synthetic data and the real data. Additionally, the synthetic data is compared with results obtained using the AI algorithm known as SMOTE. Generally, the metrics for the synthetic data generated by SMOTE are slightly superior. However, the results obtained with the proposed methodology exhibit minimal deviations from the MIMIC data across most CKD stages.
format Conference object
author Álvarez, Candelaria
Ibeas, José
Balladini, Javier
Suppi, Remo
title Patient-centric synthetic data generation: a new methodology for Chronic Kidney Disease
publishDate 2024
url http://sedici.unlp.edu.ar/handle/10915/176195