# 2012 Summer School Outline

### University of Rochester Center for Biodefense Immune Modeling

### Summer School on Computational Immunology

### Topic Outline (draft)

**Hulin Wu: June 10, 9:00-10:00AM**

**Summer School Introduction and Overview: Benefits for immunologists of using “modeling” approaches**

- Basic concepts of computational biology, bioinformatics, and modeling for immunology
- Integrating mathematical modeling, computing, statistics and bioinformatics techniques for a new multidisciplinary field: Computational immunology
- Benefits for immunologists of using modeling approaches:
    - Design novel experiments
    - Quantify immunological concepts and experiments
    - Better analyze experimental data
    - Extract more information from and better interpret experimental data
    - Predictions and simulations to generate testable hypotheses for new experiments
- Overview and introduction:
    - Summer school lectures and speakers
    - Symposium

**Martin Zand: June 10, 10:15-11:15AM**

**Introduction to modeling for immunologists**

In this talk, I will introduce concepts of computational and mathematical models relevant to immunology. Numerous examples of immune process models will be presented, with an emphasis on choosing the right modeling approach for each immunological problem. Several advantages and pitfalls of modeling will be explored, all within the context of modeling immune responses.

**Hongyu Miao: June 10, 11:30AM-12:30PM**

**Basic concepts of mathematical modeling** (1 hour)

- Who is doing modeling?
- Why modeling?
    - Representation of knowledge
    - Data Analysis Tool
    - Prediction and Hypothesis Generation
    - Full Control
    - Cost-effective
- Types of models and when to use what
- Starter models and terminology

**Hongyu Miao: June 10, 1:30-3:45PM**

**Basic concepts of computational methods**

- Dynamics in homogeneous space (ODE)
    - Regular ODE
    - Handle sharp turns (stiffness)
    - Model delayed interactions (DDE)
    - Model with constraints (DAE)
- Dynamics with spatial dependence (PDE)
    - Finite element (FE)
    - Finite difference (FD)
    - Many more…
- Stochastic Simulation
    - Random number generation
    - SDE
    - Discrete event simulation
    - Rare event simulation
    - Markov Chain Monte Carlo method
- Optimization
    - Least squares (linear regression)
    - Search a local solution (gradient-based algorithms)
    - Search a global solution (evolutionary algorithms)
    - Hybrid algorithms

**Hulin Wu: June 10, 4:00-5:30PM**

**Basic Concepts of Statistics for Immunologists**

- Introduction
- What can statisticians do for immunologists?
- Random variables and distributions (discrete and continuous)
- Summary statistics
- Data types and study endpoints
- Statistical inference: Basics
- Introduction to statistical lectures on Day 3 (Tuesday, June 12)

**Hongyu Miao: June 11, 9:00AM-5:30PM**

**ODE Modeling**

- Lectures (3 hours)
    - Example of cell kinetics: growth, death, differentiation and migration
    - Create a model: thinking in diagrams
    - Break down a model: thinking in mathematics
        - State variables
        - Parameters
        - Initial conditions
    - Simulation
    - Prediction
    - Sensitivity analysis
    - Semi-mechanistic modeling
    - Refine a model
- Tutorial (2 hours)
    - Create a model: thinking in diagrams (CellDesigner)
    - Break down a model: terminology (DEDiscover)
    - Simulation (DEDiscover)
    - Prediction (DEDiscover)
    - Sensitivity analysis (DEDiscover)
- Projects (1.5 hours)
    - T cell kinetics & trafficking
    - B cell differentiation
    - HIV infection
    - Primary influenza infection in lung
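
As a rough illustration of the lecture topics above (state variables, parameters, initial conditions, simulation and sensitivity analysis), the following is a minimal Python sketch of a two-compartment cell kinetics model. It is not taken from the course materials, which use CellDesigner and DEDiscover. The model structure, the parameter names (`rho`, `delta`, `k`, `mu`) and all numerical values are assumptions made up for this example, and the integrator is a plain fixed-step Runge-Kutta scheme rather than a production ODE solver.

```python
import math

def cell_kinetics(t, y, p):
    """Right-hand side of a hypothetical two-compartment model.

    State variables: P (proliferating cells), D (differentiated cells).
    dP/dt = (rho - delta - k) * P   growth, death, differentiation
    dD/dt = k * P - mu * D          inflow from P, death of D
    """
    P, D = y
    dP = (p["rho"] - p["delta"] - p["k"]) * P
    dD = p["k"] * P - p["mu"] * D
    return [dP, dD]

def rk4(f, y0, t0, t1, n_steps, p):
    """Integrate dy/dt = f(t, y, p) with the classical fixed-step RK4 scheme."""
    h = (t1 - t0) / n_steps
    t, y = t0, list(y0)
    for _ in range(n_steps):
        k1 = f(t, y, p)
        k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)], p)
        k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)], p)
        k4 = f(t + h, [yi + h * ki for yi, ki in zip(y, k3)], p)
        y = [yi + h / 6 * (a + 2 * b + 2 * c + d)
             for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]
        t += h
    return y

# Parameters (per day) and initial conditions: illustrative values only.
params = {"rho": 0.5, "delta": 0.1, "k": 0.2, "mu": 0.05}
y0 = [100.0, 0.0]

# Simulation: state of the system after 10 days.
P_end, D_end = rk4(cell_kinetics, y0, 0.0, 10.0, 1000, params)

def relative_sensitivity(name, rel_step=1e-4):
    """Finite-difference estimate of d ln D(10) / d ln(parameter)."""
    bumped = dict(params)
    bumped[name] = params[name] * (1 + rel_step)
    _, D_bumped = rk4(cell_kinetics, y0, 0.0, 10.0, 1000, bumped)
    return (D_bumped - D_end) / D_end / rel_step

print(f"P(10) = {P_end:.2f}, D(10) = {D_end:.2f}")
for name in params:
    print(f"sensitivity of D(10) to {name}: {relative_sensitivity(name):+.3f}")
```
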

**Hulin Wu: June 12, 9:00-10:00AM**

**Experimental Design for Immunologists**

- Introduction
- Hypothesis: Formulation and statement of the problem
- Experimental design
- Perform experiments to collect data
- Data management, processing and analysis
- Result interpretation and annotation
- Standard experimental design and complicated experimental design: Examples
- Experiment design for systems biology and immune modeling

**Hongmei Yang: June 12, 10:15-11:15AM**

**Statistical Inference**

- Point Estimation
- Interval Estimation
- Hypothesis Testing
- Data Problems
- More Complicated Data
- Immunological Data Analysis
- Examples
- Exploratory Data Analysis
- Graphic Methods
- Description of Data: Summary Statistics
- Basic Statistical Methods
- Univariate & Bivariate Analysis
- Multivariate Analysis
- Immunological Data Processing
- ELISPOT Data Processing
- ELISA & Luminex Data Processing
- Hemagglutination Data Processing
- Welcome to the world of *-omics data!
- Microarrays (Affymetrix GeneChip platform, Illumina BeadArray)
- Protein arrays
- RNA-seq
- A brief introduction to data preprocessing
- Now we have the data. Which genes are “interesting”?
- *t*-test, Wilcoxon rank-sum test, and permutation test.
- ANOVA *F*-test for multiple group comparisons.
- Signal-to-noise ratio (the statistical approach) is usually better than signal (fold-change) alone.
- A compromise: the SAM approach
- How to control false discoveries for 20,000 comparisons
- The rationale
- FWER: Bonferroni procedure and variants
- FDR: Benjamini-Hochberg procedure, empirical Bayes, and the stability of gene selection
- From data to knowledge: machine learning techniques
- Discriminant analysis (supervised machine learning) and the cross-validation principle.
- Cluster analysis (unsupervised machine learning)
- Beyond analyzing probe sets
- Multiple probe sets per gene
- Working with annotation and metadata, the Gene Ontology (GO) database
- Gene Set Enrichment Analysis (GSEA)
- Brief mention of some other popular techniques, such as pathway reconstruction (a.k.a. network analysis), differential association analysis, and interaction analysis.
- Model-based experiment design
- Use identifiability analysis
- Use sensitivity analysis
- Use simulation
- Estimate parameters of ODE models
- Use local vs. global optimization methods
- Evaluate the significance of a parameter (FIM vs bootstrap)
- Verify alternative hypotheses
- Develop different models based on alternative hypotheses
- Balance model complexity and goodness of fit
- Network data analysis
- Use ODE to model a dynamic network
- What are agent-based models (ABMs) and what are they good for?
- Steps for translating a biological system into an ABM: identifying the agents, the rules, and the neighbourhood.
- The Do's and Don'ts of ABMs
- Key features of ABMs
- Grid topologies
- Boundary conditions
- Updating rules (synchronous vs sequential)
- Discrete agents versus a field representation
- 2D versus 3D grids
- The benefits of ABMs
- Contributions of ABMs to immunology and microbiology/virology
- Popular ABM development platforms
- Introduction to affinity maturation and the germinal center reaction
- Modeling the germinal center reaction using individual cell-based simulation.
- Introduction to *Daphne* for the study of affinity maturation.
- Hypothesis generation and the analysis of systems-level data using *Daphne*.
- An introduction to the NetLogo agent-based model (ABM) development platform.
- Opening and running existing modules.
- Learning the meaning and use of the workbook tabs.
- An introduction to Tom Kepler's ABM development platform for lymphocyte trafficking and intercellular communication.
- Implementation of a sample ABM following step-by-step instructions provided.
- Implementation of your very own (group or individual) ABM based on your own research in consultation with and with assistance from the instructors.
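
To make the ABM ingredients named in this outline concrete (agents, rules, neighbourhood, grid topology, boundary conditions and synchronous updating), here is a minimal Python sketch of a toy infection-spread model on a square grid. It is not the NetLogo or *Daphne* material used in the tutorial. The three cell states, the infection rule, and the parameter names `p_infect` and `lifespan` are assumptions invented for illustration only.

```python
import random

HEALTHY, INFECTED, DEAD = 0, 1, 2

def neighbours(i, j, n):
    """Von Neumann neighbourhood on an n x n grid with periodic boundaries."""
    return [((i - 1) % n, j), ((i + 1) % n, j), (i, (j - 1) % n), (i, (j + 1) % n)]

def step(state, age, p_infect, lifespan, rng):
    """Synchronous update: every new state is computed from the old grid."""
    n = len(state)
    new_state = [row[:] for row in state]
    new_age = [row[:] for row in age]
    for i in range(n):
        for j in range(n):
            if state[i][j] == HEALTHY:
                # Each infected neighbour gets one chance to infect this cell.
                for a, b in neighbours(i, j, n):
                    if state[a][b] == INFECTED and rng.random() < p_infect:
                        new_state[i][j] = INFECTED
                        new_age[i][j] = 0
                        break
            elif state[i][j] == INFECTED:
                # Infected cells age by one step and die at the end of their lifespan.
                new_age[i][j] = age[i][j] + 1
                if new_age[i][j] >= lifespan:
                    new_state[i][j] = DEAD
    return new_state, new_age

def run(n=21, steps=3, p_infect=1.0, lifespan=2, seed=0):
    """Seed one infected cell in the centre and advance the grid."""
    rng = random.Random(seed)
    state = [[HEALTHY] * n for _ in range(n)]
    age = [[0] * n for _ in range(n)]
    state[n // 2][n // 2] = INFECTED
    for _ in range(steps):
        state, age = step(state, age, p_infect, lifespan, rng)
    return state

def count(state, kind):
    """Number of grid cells currently in the given state."""
    return sum(row.count(kind) for row in state)

final = run()
print("healthy:", count(final, HEALTHY),
      "infected:", count(final, INFECTED),
      "dead:", count(final, DEAD))
```

Switching `step` to write into `state` directly instead of a copy would turn the synchronous update into a sequential one, which is one of the design choices listed under the key features of ABMs.
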

**Introduction to Flow Cytometry data - why is manual processing no longer adequate?**

- Large datasets
- Multiple parameters, up to 20 or even more
- Diverse cell populations
- Compensation, normalization, standardization
- Current practice depends on subjective manual gating - results differ from person to person
- Main problem - humans don't visualize in more than three dimensions.

**Data pre-processing**

- Normalization: different reagents, machine settings, flow cytometers
- Compensation - current algorithms not accurate
- Compensation limits, background versus fluor compensation
- Transformations to handle negative numbers on a log scale - arcsinh, logicle etc.
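
The arcsinh transformation mentioned above can be sketched in a few lines of Python. Unlike a plain logarithm, it is defined for zero and negative values, which compensation can produce. The `cofactor` of 150 and the sample intensities below are illustrative assumptions, not values prescribed by the course or by FLOCK or SWIFT, and the logicle transform is not covered by this sketch.

```python
import math

def arcsinh_transform(values, cofactor=150.0):
    """Map raw fluorescence values onto an arcsinh scale.

    Near zero the transform is approximately linear (value / cofactor);
    for large values it behaves like ln(2 * value / cofactor).
    """
    return [math.asinh(v / cofactor) for v in values]

# Made-up compensated intensities, including negative values.
raw = [-200.0, -10.0, 0.0, 10.0, 1000.0, 100000.0]
for value, scaled in zip(raw, arcsinh_transform(raw)):
    print(f"{value:>10.1f} -> {scaled:8.4f}")
```
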

**Clustering - goals**

- Ideally, one cluster = one biological population
- In practice, actual diversity is probably >> flow cytometry diversity
- Clustering is a practical separation of useful populations
- Ambiguity - decision about when to split a population further
- Ambiguity - alternative solutions

**Clustering methods:**

- Model-based, Gaussian, skewed Gaussian, etc.
- Grid-based
- Thresholding, Watershed
- SPADE
- Others.....

**Sample comparison and inference**

- Cluster matching - difficult, inter-related with normalization
- How to incorporate expert knowledge?

**Evaluation of data processing algorithms**

- Difficulty of obtaining ground truth - how to define "correct" clustering?
- Evaluating cells versus evaluating populations

**FLOCK**

- Example datasets? Question to be asked?

**SWIFT**

- Construction of test datasets (mix and match input files in different proportions)
- +/- stimulation datasets, identification of rare populations
- Choice of parameters

**Bioconductor packages?**

- Before the course, participants will be encouraged to bring their own datasets, within limits (e.g. 100,000 events, 20 parameters). Data file structures will be carefully specified.
- Handouts available for a tutorial on each method (FLOCK, SWIFT, maybe others?)
- Brief description of each method, strengths and weaknesses
- Participants will choose one program to analyze the model datasets, or their own data. Output as .fcs files that they may take away.

**Hua Liang: June 12, 11:30AM-12:30PM**

**An introduction to R for basic statistical concepts, and a few advanced methods**

This lecture provides a brief introduction to R, a free software environment for statistical computing and graphics, and to statistical data analysis, with an emphasis on immunology datasets. I will explain how to fit generalized linear models and how to handle variable selection for high-dimensional datasets. I will also introduce a nonstandard, data-driven model-fitting method for estimating IC50, which is useful when a closed-form model is too restrictive or not easy to identify, and illustrate why such methods are helpful at the exploratory data analysis stage. Another example shows how to identify significant factors when the number of factors is larger than the sample size.

**Hongmei Yang: June 12, 1:30-2:30PM**

**Immunological Data Processing & Analysis**

**Xing Qiu: June 12, 2:45-3:45PM**

**Statistical Methods in Bioinformatics**

**Hongyu Miao: June 12, 4:00-5:30PM**

**Selected Statistical Methods for Mathematical Models**

**Catherine Beauchemin: June 13, 9:00-10:45AM**

**Mathematical modeling of immune responses: Agent-based models**

**Thomas Kepler: June 13, 11:00AM-12:30PM**

**Agent-based models applied to affinity maturation and the germinal center reaction**

**Catherine Beauchemin & Thomas Kepler: June 13, 1:30-5:30PM**

**Hands-on Agent-based Model Tutorial**

**Tim Mosmann & Richard Scheuermann: June 14, 9:00AM-12:30PM**

__Special workshop:__ **Automatic flow gating methods and software hands-on tutorial, FLOCK, SWIFT**

**Class session June 14, 9:00 - 10:15AM including discussion**

**Practical session (Computer lab) June 14, 10:30AM-12:30PM**

Instructors available for help.