Ebook: Bioinformatics and Computational Biology Solutions Using R and Bioconductor
- Tags: Computational Biology/Bioinformatics, Statistics for Life Sciences Medicine Health Sciences, Bioinformatics, Animal Genetics and Genomics
- Series: Statistics for Biology and Health
- Year: 2005
- Publisher: Springer-Verlag New York
- Edition: 1
- Language: English
- Format: PDF
Bioconductor is a widely used open source and open development software project for the analysis and comprehension of data arising from high-throughput experimentation in genomics and molecular biology. Bioconductor is rooted in the open source statistical computing environment R. This volume's coverage is broad and ranges across most of the key capabilities of the Bioconductor project, including
- importation and preprocessing of high-throughput data from microarray, proteomic, and flow cytometry platforms
- curation and delivery of biological metadata for use in statistical modeling and interpretation
- statistical analysis of high-throughput data, including machine learning and visualization
- modeling and visualization of graphs and networks.
The developers of the software, who are in many cases leading academic researchers, jointly authored chapters. All methods are illustrated with publicly available data, and a major section of the book is devoted to exposition of fully worked case studies.
This book is more than a static collection of descriptive text, figures, and code examples that were run by the authors to produce the text; it is a dynamic document. Code underlying all of the computations that are shown is made available on a companion website, and readers can reproduce every number, figure, and table on their own computers.
Robert Gentleman is Head of the Program in Computational Biology at the Fred Hutchinson Cancer Research Center in Seattle. He is one of the two authors of the original R system and a leading member of the R core team. Vincent Carey is Associate Professor of Medicine (Biostatistics), Channing Laboratory, Brigham and Women's Hospital, Harvard Medical School. Gentleman and Carey are co-founders of the Bioconductor project. Wolfgang Huber is Group Leader in the European Molecular Biology Laboratory at the European Bioinformatics Institute in Cambridge. He has made influential contributions to the error modeling of microarray data. Rafael Irizarry is Associate Professor of Biostatistics at the Johns Hopkins Bloomberg School of Public Health in Baltimore. He is co-developer of RMA and GCRMA, two of the most popular methodologies for preprocessing high-density oligonucleotide arrays. Sandrine Dudoit is Assistant Professor in the Department of Biostatistics at the University of California, Berkeley. She has made seminal discoveries in the fields of multiple testing and generalized cross-validation and spearheaded the deployment of these findings in applied genomic science.
Content:
Front Matter....Pages i-xix
Preprocessing Overview....Pages 3-12
Preprocessing High-density Oligonucleotide Arrays....Pages 13-32
Quality Assessment of Affymetrix GeneChip Data....Pages 33-47
Preprocessing Two-Color Spotted Arrays....Pages 49-69
Cell-Based Assays....Pages 71-90
SELDI-TOF Mass Spectrometry Protein Data....Pages 91-109
Meta-data Resources and Tools in Bioconductor....Pages 113-133
Querying On-line Resources....Pages 135-146
Interactive Outputs....Pages 147-160
Visualizing Data....Pages 161-179
Analysis Overview....Pages 183-187
Distance Measures in DNA Microarray Data Analysis....Pages 189-208
Cluster Analysis of Genomic Data....Pages 209-228
Analysis of Differential Gene Expression Studies....Pages 229-248
Multiple Testing Procedures: the multtest Package and Applications to Genomics....Pages 249-271
Machine Learning Concepts and Tools for Statistical Genomics....Pages 273-292
Ensemble Methods of Computational Inference....Pages 293-311
Browser-based Affymetrix Analysis and Annotation....Pages 313-326
Introduction and Motivating Examples....Pages 329-336
Graphs....Pages 337-346
Bioconductor Software for Graphs....Pages 347-368
Case Studies Using Graphs on Biological Data....Pages 369-394
limma: Linear Models for Microarray Data....Pages 397-420
Classification with Gene Expression Data....Pages 421-430
From CEL Files to Annotated Lists of Interesting Genes....Pages 431-442
Back Matter....Pages 443-473