Nonparametric inference for classification and association with high dimensional genetic data

García-Magariños, Manuel

Supervised by:

Antonio Salas Ellacuriaga Supervisor
Wenceslao González Manteiga Supervisor
Ricardo Cao Abad Supervisor

University of defense: Universidade de Santiago de Compostela

Date of defense: 29 January 2010

Examination committee:

Ángel Carracedo Álvarez Chair
Carmen María Cadarso Suárez Secretary
Ignacio López de Ullibarri Galparsoro Member
Vincent Macaulay Member
Thore Egeland Member

Type: Thesis

Teseo: 286596 DIALNET

Abstract

In recent years, advances in genetics have brought about a revolution that has spread beyond the field itself, influencing the future of many other scientific areas. As the boom in genetics has given rise to countless high-dimensional datasets containing DNA/RNA profiles, statistics is the science required to deal with them. Not only do new tools need to be developed, but existing methods can also be adapted, and their performance evaluated, for application to genetic data. The term genetic data covers a wide variety of datasets that have in common only the fact that they derive from DNA information: from SNPs (categorical data) to gene expression measures (continuous data). This DNA information could hold the answer to many common diseases with a complex basis (psychiatric disorders, cancer, diabetes, etc.), so the main aim of statistics is to provide proper, powerful techniques able to unravel the underlying nature of complex diseases. This thesis contains several statistical approaches to both gene expression data and SNP/STR data. There is room here for penalized regression, machine learning and tree-based methods. Although the emphasis lies on clinical genetics, statistical tools for population and forensic genetics are also explained.