Analyzing Network Data in Biology and Medicine, 1st edition, by Nataša Pržulj

Product details:
ISBN 10: 1108389846
ISBN 13: 9781108389846
Author: Nataša Pržulj
The increased and widespread availability of large network data resources in recent years has resulted in a growing need for effective methods for their analysis. The challenge is to detect patterns that provide a better understanding of the data. However, this is not a straightforward task because of the size of the data sets and the computer power required for the analysis. The solution is to devise methods for approximately answering the questions posed, and these methods will vary depending on the data sets under scrutiny. This cutting-edge text introduces biological concepts and biotechnologies producing the data, graph and network theory, cluster analysis and machine learning, before discussing the thought processes and creativity involved in the analysis of large-scale biological and medical data sets, using a wide range of real-life examples. Bringing together leading experts, this text provides an ideal introduction to and insight into the interdisciplinary field of network data analysis in biomedicine.
Table of contents:
1 From Genetic Data to Medicine: From DNA Samples to Disease Risk Prediction in Personalized Genetic
1.1 Background
1.2 Genetic Tests in Healthcare
1.2.1 Types of Genetic Tests
1.2.2 Genetic Tests Providers
1.3 Common Technologies and Algorithms for SNPs Identification
1.3.1 Microarrays
1.3.1.1 Affymetrix SNP Microarrays
1.3.1.2 Illumina SNP BeadChips
1.3.1.3 Algorithms for Genotyping
1.3.2 Next Generation Sequencing
1.3.2.1 The Illumina NGS Platform
1.3.2.2 Algorithms for SNP Calling and Genotyping
1.3.3 Pros and Cons of Microarrays and NGS
1.4 Algorithms to Predict SNP-Disease Association
1.4.1 Single-SNP Association Studies
1.4.2 Multi-SNP Association Studies
1.4.2.1 Logistic Regression Models
1.4.2.2 Support Vector Machines (SVMs)
1.4.2.3 Random Forests (RFs)
1.4.2.4 Bayesian Networks (BNs)
1.4.3 Predictive genetic risk models in DTC services
1.5 Perspectives and Recommendations
1.6 Exercises
1.7 Acknowledgments
References
2 Epigenetic Data and Disease
2.1 Background
2.2 DNA Methylation and its Role in Genome Regulation
2.2.1 DNA Demethylation and its Role in Genomic Profiles
2.2.2 Different Experimental Strategies for the DNA Methylation Analysis
2.2.3 Processing and Analysis Methods and Tools for DNA Methylation Data from Bisulfite Based Assays
2.2.3.1 Bisulfite Conversion
2.2.3.2 Methylation Microarrays
2.3 The Post-Translational Modifications of Histones
2.3.1 Experimental Evaluation of Post-Translational Modifications of Histones
2.3.2 ChIP-seq Data Analysis
2.4 Higher Order Chromatin Organization
2.4.1 Technologies to Study Chromatin Conformation
2.4.1.1 The 3C, 4C, 5C, and ChIA-PET Technologies
2.4.1.2 The Hi-C Technology
2.4.2 Bioinformatic Methods of Hi-C Analysis
2.4.3 Mapping and Filtering
2.4.4 Normalization
2.4.5 Statistical Analysis
2.4.6 Visualization of Hi-C Data
2.4.7 Topological Associated Domain Identification from Hi-C Data
2.5 Long Non-Coding RNAs, Novel Molecular Regulators
2.5.1 The Implications of lncRNAs in Precision Medicine
2.5.2 Bioinformatic Tools for lncRNAs Analysis
2.5.3 Analysis of Annotated lncRNAs
2.5.4 Analysis of Unannotated lncRNAs
2.6 Epigenetic Databases
2.6.1 Encyclopedia of DNA Elements in the Human Genome
2.6.2 The Roadmap Epigenomics Project
2.6.3 Functional Annotation of the Mammalian Genome
2.6.4 BLUEPRINT Epigenome
2.6.5 The International Human Epigenome Consortium
2.7 Conclusion and Final Remarks
2.8 Exercises
2.9 Acknowledgements
References
3 Introduction to Graph and Network Theory
3.1 Motivation
3.2 Background
3.2.1 Mathematical Background
3.2.1.1 Matrix Operations
3.2.1.2 Special Matrices
3.2.1.3 Sets of Vectors
3.2.1.4 Matrix Spectral Decomposition
3.2.2 Computational Complexity
3.3 Graph Theory
3.3.1 Definitions
3.3.2 Degree and Neighborhood
3.3.3 Subgraphs and Connectedness
3.3.4 Types of Graphs
3.3.5 Classic Graph Theory Problems
3.3.5.1 Eulerian Circuit
3.3.5.2 Hamiltonian Paths
3.3.5.3 Matching
3.3.6 Data Structures and Search Algorithms for Graphs
3.3.6.1 Data Structures
3.3.6.2 Graph Search Algorithms
3.3.7 Spectral Graph Theory
3.4 Network Measures
3.4.1 Network Properties
3.4.2 Network Models
3.4.2.1 Erdős–Rényi Random Graphs
3.4.2.2 Scale-free Networks
3.4.2.3 Geometric Networks
3.4.2.4 Stickiness Index Based Networks
3.5 Summary
3.6 Exercises
3.7 Acknowledgments
References
4 Protein–Protein Interaction Data, their Quality, and Major Public Databases
4.1 Protein–Protein Interactions: Introduction and Motivation
4.2 Experimental Detection and Computational Prediction of PPIs
4.2.1 Experimental Methods
4.2.2 Computational Methods
4.2.3 Errors and Challenges
4.2.3.1 Error Rates
4.2.3.2 Biases
4.3 Challenges of Data Integration
4.3.1 Heterogeneity in Biological Data
4.4 Protein–Protein Interaction Databases
4.4.1 Curated Databases
4.4.2 Prediction Databases
4.4.2.1 Integrated Databases
4.4.3 PPI Context Annotation
4.4.3.1 Subcellular Localization
4.4.3.2 Tissue Annotation
4.4.3.3 Disease
4.5 Protein–Protein Interaction Networks and their Properties
4.5.1 Short Introduction to Networks
4.5.2 Network Construction
4.5.3 Properties of PPI Networks
4.5.3.1 Degree and Betweenness Centrality
4.5.3.2 Articulation Points
4.5.3.3 Graph Density
4.5.3.4 Distance
4.5.3.5 Clustering Coefficient
4.5.3.6 Cliques
4.5.3.7 Other Properties
4.5.4 PPI Network Annotations and Visualization
4.5.4.1 Qualitative Annotations
4.5.4.2 Quantitative Annotations
4.6 Applications of PPI Network Analysis
4.6.1 Identification of Disease-associated Genes
4.6.2 Improvement of Gene Signatures
4.6.3 Prediction of Drug Targets
4.6.4 Annotation of Protein Functions
4.7 Integrative Computational Biology Workflow
4.8 Closing Remarks
4.9 Exercises
4.10 Acknowledgments
References
5 Graphlets in Network Science and Computational Biology
5.1 Introduction
5.2 Graphlets and Graphlet-based Measures of Network Topology
5.2.1 Graphlets
5.2.1.1 Original Graphlets
5.2.1.2 Directed Graphlets
5.2.1.3 Dynamic Graphlets
5.2.1.4 Heterogeneous Graphlets
5.2.1.5 Ordered Graphlets
5.2.2 Graphlet-based Measures of Topological Position of Individual Nodes, Edges, or Non-edges
5.2.2.1 Graphlet Orbits
5.2.2.2 Graphlet Degree Vector (GDV)
5.2.2.3 GDV-similarity
5.2.2.4 GDV-centrality
5.2.3 Graphlet-based Measures of Entire Network Topology
5.2.3.1 Graphlet Frequency Vector (GFV)
5.2.3.2 Graphlet Degree Distributions (GDDs)
5.2.3.3 Graphlet Correlation Matrix (GCM)
5.3 Computational Approaches Based on the Graphlet Measures
5.3.1 Clustering of Nodes or Edges in a Network
5.3.2 Dominating Set of a Network
5.3.3 Link Prediction
5.3.4 Network Comparison
5.3.4.1 Alignment-free Network Comparison
5.3.4.2 Alignment-based Network Comparison: Network Alignment (NA)
5.4 Biological Applications of the Graphlet Measures
5.4.1 Protein Function Prediction
5.4.2 Aging
5.4.2.1 Static Analysis of the Human PPI Network in the Context of Aging
5.4.2.2 Dynamic Analysis of the Human PPI Network at Different Ages
5.4.2.3 Transfer of Aging-related Knowledge from Model Species to Human via Network Alignment (NA)
5.4.3 Disease
5.4.3.1 Cancer
5.4.3.2 Pathogenicity
5.4.4 Health-related Applications Beyond Computational Biology: Social Networks
5.5 Graphlet-based Software Tools
5.5.1 General-purpose Software for Graphlet Counting
5.5.2 Task-specific Graphlet-based Software
5.6 Exercises
5.7 Acknowledgment
References
6 Unsupervised Learning: Cluster Analysis
6.1 Formal Definitions
6.1.1 Clustering
6.1.2 Data Formats
6.2 Cluster Analysis
6.3 Preprocessing
6.3.1 Normalization and Standardization
6.3.2 Feature Selection
6.3.3 Principal Component Analysis
6.4 Proximity Calculation
6.4.1 Continuous Variables
6.4.1.1 Euclidean Distance
6.4.1.2 Minkowski Distance
6.4.1.3 Correlation
6.4.2 Categorical Values
6.4.2.1 Boolean Variables
6.4.2.2 General Categorical Variables
6.4.3 Practical Issues
6.5 Clustering Algorithms
6.5.1 Cluster Approaches
6.5.2 k-means
6.5.2.1 Algorithm
6.5.2.2 Initialization Strategies
6.5.2.3 Other Variants
6.5.3 Hierarchical Clustering
6.5.3.1 Algorithm
6.5.3.2 Linkage Functions
6.5.3.3 Discussion
6.5.4 DBSCAN
6.5.4.1 Algorithm
6.5.4.2 Discussions
6.5.5 Transitivity Clustering
6.5.5.1 Transitive Graph Projection Problem
6.5.5.2 Heuristic Solution
6.5.6 Discussion
6.6 Cluster Evaluation
6.6.1 External Cluster Evaluation
6.6.2 Internal Cluster Evaluation
6.6.3 Optimization Strategies
6.6.3.1 k is not a Parameter
6.6.3.2 k as a Parameter
6.6.3.3 The Gap Statistic
6.7 Final Remarks
6.8 Exercises
References
7 Machine Learning for Data Integration in Cancer Precision Medicine: Matrix Factorization Approache
7.1 Introduction
7.2 Precision Medicine
7.3 The Different Types of Data Integration Methods
7.3.1 Homogeneous and Heterogeneous Data Integration
7.3.2 Early, Intermediate, and Late Integration
7.3.3 Supervised, Unsupervised, and Semi-supervised Data Integration
7.4 Summary of Data-Integration Methods
7.4.1 Network-based Data Integration
7.4.2 Bayesian Approaches
7.4.3 Kernel-based Methods
7.5 Homogeneous Data Integration with Non-Negative Matrix Factorization
7.5.1 Principles and Properties
7.5.2 Solving NMF
7.5.3 Homogeneous Data Integration with NMF
7.5.3.1 Simultaneous Decomposition
7.5.3.2 Graph Regularization
7.6 Heterogeneous Data Integration with Non-Negative Matrix Tri-Factorization
7.6.1 Principle and Properties
7.6.2 Solving NMTF
7.6.2.1 Optimizing Non-Linear Constrained Continuous Optimization Problems
7.6.2.2 Applying KKT Conditions to NMTF
7.6.2.3 From KKT Conditions to Multiplicative Update Rules
7.6.3 Heterogeneous Data Integration with NMTF
7.7 Concluding Remarks
7.8 Exercises
7.9 Acknowledgments
References
8 Machine Learning for Biomarker Discovery: Significant Pattern Mining
8.1 Introduction
8.2 The Problem of Significant Pattern Mining
8.2.1 Terminology and Problem Statement
8.2.1.1 Significant Itemset Mining
8.2.1.2 Significant Subgraph Mining
8.2.2 Statistical Association Testing in Significant Pattern Mining
8.2.2.1 Pearson’s χ² Test
8.2.2.2 Fisher’s Exact Test
8.2.3 Multiple Testing Correction
8.2.3.1 The Bonferroni Correction
8.2.3.2 Tarone’s Improved Bonferroni Correction for Discrete Data
8.3 A Framework for Significant Pattern Mining Using Tarone’s Method
8.3.1 Evaluating Tarone’s Minimum Attainable P-value
8.3.2 Designing a Pruning Condition
8.3.3 Implementation Considerations
8.4 Accounting for the Redundancy Between Patterns
8.4.1 Empirically Approximating the FWER Using Random Permutations
8.4.2 Permutation Testing in Significant Pattern Mining
8.5 Accounting for a Categorical Covariate
8.5.1 Conditional Association Testing in Significant Pattern Mining
8.5.2 Deriving the Minimum Attainable P-value for the CMH Test
8.5.3 A Search Space Pruning Condition for the CMH Test
8.6 Summary and Outlook
8.6.1 Software
8.6.2 Outlook
8.7 Exercises
8.8 Acknowledgments and Funding
References
9 Network Alignment
9.1 Introduction
9.1.1 Proteins and their Functions
9.1.2 Protein Interactions and Network Alignment
9.2 Pairwise Network Alignment
9.2.1 Formal Definitions
9.2.2 Scoring Alignments
9.2.2.1 Scoring Local Network Alignments
9.2.2.2 Scoring Global Network Alignments
9.2.2.3 Agreement and Trade-off Between Scores
9.2.3 Example Pairwise Local Alignment Method: PathBlast
9.2.4 Overview of Other Pairwise Local Alignment Methods
9.2.5 Global Pairwise Network Alignment Methods
9.2.5.1 IsoRank
9.2.5.2 Global Alignment Methods: GRAAL and H-GRAAL
9.2.5.3 Overview of Other Pairwise Global Alignment Methods
9.3 Multiple Network Alignment
9.3.1 Context and Formal Definitions
9.3.2 Scoring Multiple Network Alignments
9.3.3 Multiple Network Alignment Method: SMETANA
9.3.4 Multiple Network Alignment Method: FUSE
9.3.5 Overview of Other Multiple Network Alignment Methods
9.4 Aligning Other Types of Networks
9.4.1 Probabilistic Networks
9.4.2 Multilayer Networks
9.4.3 Directed Networks
9.4.4 Hyper-Networks
9.5 Concluding Remarks
9.6 Exercises
9.7 Acknowledgments
References
10 Network Medicine
10.1 Introduction
10.2 Networks in Medicine
10.2.1 Overview
10.2.2 Molecular Networks
10.2.2.1 Protein–Protein Interaction Networks
10.2.2.2 Metabolic Networks
10.2.2.3 Regulatory Networks
10.2.2.4 Co-Expression Networks
10.2.2.5 Genetic Interactions
10.2.3 Disease Networks
10.2.4 Social Networks
10.2.4.1 Transportation Networks
10.2.4.2 Social Contagion
10.3 Interactome Analysis
10.3.1 Interactome Construction
10.3.2 Basic Interactome Properties
10.3.3 Interactome Topology and Biological Function
10.3.4 Diseases in the Interactome
10.3.5 Localization in Networks
10.3.6 Randomization of Network Properties
10.3.6.1 Randomizing the Network Topology
10.3.6.2 Randomizing Node Properties
10.3.6.3 Degree Preserving Label Permutation
10.4 Disease Module Analysis
10.4.1 Overview
10.4.2 Seed Cluster Construction
10.4.2.1 Interactome Construction
10.4.2.2 Seed Gene Selection
10.4.2.3 Evaluation of the Seed Cluster
10.4.3 Network-Based Disease Gene Prioritization
10.4.3.1 Connectivity-Based Methods
10.4.3.2 Path-Based Methods
10.4.3.3 Diffusion-Based Methods
10.4.4 Validation and Enrichment
10.4.4.1 Cross-validation of Prediction Performance
10.4.4.2 Enrichment with Independent Biological Data
10.4.5 Biological Interpretation
10.5 Summary and Outlook
10.6 Exercises
References
11 Elucidating Genotype-to-Phenotype Relationships via Analyses of Human Tissue Interactomes
11.1 Introducing Genotypes, Phenotype, and Molecular Interaction Networks
11.1.1 What are Genotypes and Phenotypes?
11.1.2 Molecular Interactions Play a Role in Genotype–Phenotype Relationships
11.1.3 Molecular Interaction Networks
11.2 Network Approaches for Elucidating Disease-Related Genotype–Phenotype Relationships
11.2.1 Causal Variants Tend to Perturb Their Surrounding Molecular Network
11.2.2 Network Approaches to Identify the Molecular Basis of Diseases
11.2.2.1 Network Approaches to Monogenic Diseases
11.2.2.2 Network Approaches to Complex Diseases
11.2.2.3 Disease Modules
11.2.2.4 Inter-Organismal Networks
11.3 Tissue-Sensitive Molecular Interaction Networks
11.3.1 Characterizing the Composition of Human Tissues
11.3.2 Constructing Tissue-Sensitive Interactomes
11.3.3 Constructing Other Types of Context-Sensitive Interactomes
11.4 Using Tissue Interactomes to Illuminate Genotype–Phenotype Relationship
11.4.1 Differential Network Analysis
11.4.2 Meta-Analysis of Tissue Interactomes for Elucidating Genotype–Phenotype Relationship
11.5 Conclusion
11.6 Exercises
11.7 Acknowledgments
References
12 Network Neuroscience
12.1 Introduction
12.2 Structural Brain Networks
12.3 Functional Brain Networks
12.4 Node Definition in Structural and Functional Networks
12.5 Tools for Brain Networks Analysis
12.6 Topological Measures in Complex Networks
12.6.1 Stochastic Measures
12.6.2 Deterministic Measures
12.7 Network Neuroscience in the Latent Geometric Space
12.8 Brain Network Disorders
12.9 Exercises
References
13 Cytoscape: A Tool for Analyzing and Visualizing Network Data
13.1 Introduction
13.2 Network Taxonomy
13.2.1 Interaction Networks
13.2.2 Pathways
13.2.3 Similarity Networks
13.3 Visualizing Networks
13.3.1 Visual Representations of Networks
13.3.2 Node-Link Diagrams
13.3.2.1 Visual Properties
13.3.2.2 Visual Mappings
13.3.2.3 Layouts
13.3.2.4 Animation
13.3.2.5 Layering Visualizations
13.4 Getting Started with Cytoscape
13.4.1 Cytoscape in a Nutshell
13.4.2 The Cytoscape User Interface
13.4.2.1 Control Panel
13.4.2.2 Network Panel
13.4.2.3 Results Panel
13.4.2.4 Table Panel
13.4.2.5 View Menu
13.4.3 Getting Data into Cytoscape
13.4.3.1 Importing Networks from Public Databases
13.4.3.2 Importing Networks from Files
13.4.3.3 Importing Tables from Files
13.4.3.4 Calculated Data
13.4.4 Visualizing Data
13.4.4.1 Visual Styles
13.4.4.2 Visual Property Editors
13.4.4.3 Charts and Graphs
13.4.4.4 Layouts
13.4.5 Network Analysis
13.4.6 Apps
13.4.6.1 The Cytoscape App Store
13.4.6.2 App Manager
13.4.6.3 Example Apps
13.5 Functional Enrichment Workflow
13.5.1 Apps
13.5.2 Data
13.5.3 Step-by-Step Workflow
13.6 Scripting Cytoscape
13.7 Command Language
13.7.1 CyREST
13.8 Exercises
13.9 Acknowledgments
References
14 Analysis of the Signatures of Cancer Stem Cells in Malignant Tumors Using Protein Interactomes an
14.1 Outline
14.2 Introduction
14.2.1 Metastases and Cancer Stem Cells
14.3 From Proteins to Interactomes in Cancer
14.4 Cancer Stem Cells Biomarkers within a Cell’s Protein Interaction Network
14.5 Exercises
14.6 Acknowledgments
References
Index