Improving information retrieval in functional analysis
Transcriptome analysis is essential to understand the mechanisms regulating key biological processes and functions. The first step usually consists of identifying candidate genes; to find out which pathways are affected by those genes, however, functional analysis (FA) is mandatory. The most frequen...
Published: | 2016 |
---|---|
Subjects: | |
Online access: | https://bibliotecadigital.exactas.uba.ar/collection/paper/document/paper_00104825_v79_n_p10_Rodriguez http://hdl.handle.net/20.500.12110/paper_00104825_v79_n_p10_Rodriguez |
Contributed by: |
id |
paper:paper_00104825_v79_n_p10_Rodriguez |
---|---|
record_format |
dspace |
spelling |
paper:paper_00104825_v79_n_p10_Rodriguez 2023-06-08T14:34:29Z |
institution |
Universidad de Buenos Aires |
institution_str |
I-28 |
repository_str |
R-134 |
collection |
Biblioteca Digital - Facultad de Ciencias Exactas y Naturales (UBA) |
topic |
Big omics data; Biological insight; Breast cancer; Functional class scoring; Gene set enrichment analysis; Knowledge discovery; Over representation analysis; R framework; Singular enrichment analysis; Computational efficiency; Data mining; Diseases; Functional analysis; Genes; Information retrieval; Functional class; Gene expression |
description |
Transcriptome analysis is essential to understand the mechanisms regulating key biological processes and functions. The first step usually consists of identifying candidate genes; to find out which pathways are affected by those genes, however, functional analysis (FA) is mandatory. The most frequently used strategies for this purpose are Gene Set and Singular Enrichment Analysis (GSEA and SEA) over Gene Ontology. Several statistical methods have been developed and compared in terms of computational efficiency and/or statistical appropriateness. However, whether their results are similar or complementary, the sensitivity to parameter settings, or possible bias in the analyzed terms has not been addressed so far. Here, two GSEA and four SEA methods and their parameter combinations were evaluated in six datasets by comparing two breast cancer subtypes with well-known differences in genetic background and patient outcomes. We show that GSEA and SEA lead to different results depending on the chosen statistic, model and/or parameters. Both approaches provide complementary results from a biological perspective. Hence, an Integrative Functional Analysis (IFA) tool is proposed to improve information retrieval in FA. It provides a common gene expression analytic framework that grants a comprehensive and coherent analysis. Only a minimal user parameter setting is required, since the best SEA/GSEA alternatives are integrated. IFA utility was demonstrated by evaluating four prostate cancer and the TCGA breast cancer microarray datasets, which showed its biological generalization capabilities. © 2016 Elsevier Ltd |
title |
Improving information retrieval in functional analysis |
publishDate |
2016 |
url |
https://bibliotecadigital.exactas.uba.ar/collection/paper/document/paper_00104825_v79_n_p10_Rodriguez http://hdl.handle.net/20.500.12110/paper_00104825_v79_n_p10_Rodriguez |
_version_ |
1768542062509031424 |