R is a free software environment for statistical computing and graphics (http://www.r-project.org) and offers rich statistical and graphical tools to handle large data sets. The 2-credit course offers intensive hands-on summer training in R over 6 full weekdays. The goal is to provide students with an opportunity to gain skills in data analysis and graphics using R. It is designed for students who are new to R but have had some basic experience working with computers. This course meets the prerequisite for two summer courses: M21-515 Fundamentals of Genetic Epidemiology and M21-550 Introduction to Bioinformatics.
Curriculum and Degree Requirements
Students in the MSIBS program complete core coursework, an internship, electives and/or a mentored research project, and will choose either the Biostatistics or Statistical Genetics pathway.
The MSIBS degree is an 18-month program beginning in the Summer semester (last week of June or first week of July, depending on the academic calendar). Full-time students will graduate in the following Fall semester. Part-time status is available, and part-time students have 3 years from the start of the program to complete the degree.
Students must complete 42 credit hours:
- 24 credits of required core course work
- 3-6 credits* of internship (on campus experience)
- 6 credits of either electives or the Mentored Research course
- 6 credits of required pathway course work
*Students who take a 3-credit internship replace the remaining 3 credit hours with an approved elective.
Statistical Computing with SAS®
Intensive hands-on summer training in SAS® over 7 full weekdays. Starting with a brief introduction to the computing environment and Unix, students will learn how to use the SAS® System for handling, managing, and analyzing data. Instruction is provided in the use of the SAS® programming language and procedures. The course will teach students how to become effective, self-reliant SAS® users, and will instruct them in data management and basic exploratory data analysis using SAS®. Topics include, but are not limited to: Reading External Files into SAS; Examining and Manipulating the Contents of SAS Datasets; and SAS Macro Variables and Programs. Students will learn how to output results and create high-quality tables and graphs in SAS. A brief introduction to statistics in SAS will also be included. An instruction manual and computer lab will be provided. This course meets the prerequisite for M21-560 Biostatistics I.
Fundamentals of Genetic Epidemiology
Intensive two-week summer course. Lectures cover causes of phenotypic variation, familial resemblance and heritability, Hardy-Weinberg Equilibrium, ascertainment, study designs, and basic concepts in genetic segregation, linkage, and association. The computer laboratory portion is designed as hands-on practice of fundamental concepts. Students will gain practical experience with various genetics computer programs (e.g., MERLIN, QTDT, and PLINK) and data QC using R-programming. Prerequisites: Must have taken the R-programming course (M21-506) or have equivalent R-programming experience, and must have experience with the Unix/Linux computing environment. Auditors will not have access to the computer lab sessions.
Introduction to Bioinformatics
This course provides broad exposure to the basic concepts, methodologies, and applications of bioinformatics. Students will learn to use online databases and mining tools, and will acquire an understanding of mathematical algorithms in sequence analysis (sequence alignment, gene finding, and hidden Markov models), gene expression microarray analysis (data QC and normalization, univariate and multivariate differential expression analysis), next-generation sequence analysis (short-read data format and processing, variant calling algorithms), and topics on other high-throughput biomedical experiments. Students will become familiar with popular bioinformatics software, online tools, and R/BioConductor packages. We will discuss methods for high-dimensional data analysis including classification and clustering analysis, principal component analysis (PCA), statistical/machine learning, and Bayesian inference. There will also be additional seasonal lectures on topics such as proteomics and applications of bioinformatics to real studies of complex diseases.
As an important component of this course, students will conduct hands-on computer labs to learn the basics of online bioinformatics databases and tools, and to practice computer programming. The labs require using the statistical computing environment R (i.e., the R primer is required), though an introduction to BioConductor basics will be provided. Students will use specialized software and R packages to accomplish tasks including designing experiments, low-level analysis of expression levels, univariate differential expression analysis, and various multivariate analysis techniques taught in class. A variety of software will be used for NGS data analysis covering alignment, variant calling, differential analysis, and visualization of results. Through the lab exercises, students will learn how state-of-the-art computational tools are applied to solve bioinformatics problems in real studies of human diseases.
This course is designed for students who want to develop a working knowledge of basic methods in biostatistics. The course is focused on biostatistical and epidemiological concepts and on practical hints and hands-on approaches to data analysis rather than on details of the theoretical methods. We will cover basic concepts in hypothesis testing, will introduce students to several of the most widely used probability distributions, and will discuss classical statistical methods that include t-tests, chi-square tests, regression analysis, and analysis of variance. Both in-class examples and homework assignments will involve extensive use of SAS®. Auditors will not have access to the computer lab sessions. Prerequisite: M21-502 Statistical Computing with SAS® or student must have practical experience with SAS®.
This course is designed for students who have taken Biostatistics I or the equivalent and who want to extend their knowledge of biostatistical applications to more modern and more advanced methods. Biostatistical methods to be discussed include logistic and Poisson regression, survival analysis, Cox regression analysis, and several methods for analyzing longitudinal data. Students will be introduced to modern topics that include statistical genetics and bioinformatics. The course will also discuss clinical trial design, the practicalities of sample size and power computation, and meta-analysis, and will ask students to read journal articles with a view towards encouraging a critical reading of the medical literature. Both in-class examples and homework assignments will involve extensive use of SAS®. Auditors will not have access to the computer lab sessions. Prerequisite: M21-560 Biostatistics I or its equivalent as judged by the course masters.
Study Design and Clinical Trials
This course will focus on statistical and epidemiological concepts of study design and clinical trials. Topics include: different phases of clinical trials, various types of medical studies (observational studies, retrospective studies, cross-over design, factorial design, and group sequential design), and power analysis, along with statistical methods for the various types of studies. Study management, randomization methods, and survey data analysis are also addressed. Students will be expected to write up a proposed design for a study of their choice, and to practice power analysis/sample size estimation during lab sessions. Permission of the Course Master required. Prerequisites: M21-560 Biostatistics I and M21-570 Biostatistics II or the equivalent as determined by the course masters.
Ethics in Biostatistics
This course prepares biostatisticians to analyze and address ethical and professional issues in the practice of biostatistics across the range of professional roles and responsibilities of a biostatistician. The primary goals are for biostatisticians to recognize complex situational dynamics and ethical issues in their work and to develop professional and ethical problem-solving skills. The course specifically examines ethical challenges related to research design, data collection, data management, ownership, security, and sharing, data analysis and interpretation, and data reporting and provides practical guidance on these issues. The course also examines fundamentals of the broader research environment in which biostatisticians work, including principles of ethics in human subjects and animal research, regulatory and compliance issues in biomedical research, publication and authorship, and collaboration in science.
Advanced Math Course
MSIBS students are required to enroll in one of the following advanced math courses: Linear Statistical Models (L24-439), Numerical Applied Mathematics (L24-449), Statistical Computation (L24-475), or Probability (L24-493).
The primary goal of the Internship program is for all students to acquire critical professional experience so that they will be well prepared to enter the job market upon graduation. This provides an opportunity for students to test-drive the job market, develop contacts, build marketable skills, and figure out likes and dislikes in the chosen field. Students are required to spend a total of 220 or 440 hours (depending on the number of credit hours) in the research centers of their Internship match. One of two types of projects may be pursued as part of the Internship experience. A student may elect to pursue a “Data Analysis Project” involving data management and extensive analyses of data, which may lead to a publication-quality manuscript (possibly earning co-authorship for the student). Alternatively, a student may choose a highly focused research-oriented project and carry out “Mentored Research” by working closely with the mentor. In this case, the student will assist the mentor by preparing a publication-quality manuscript as part of the Internship. In either case, as part of the Internship requirements, each student will submit a one-page abstract of the work performed as part of the Internship, and make a 5-minute presentation of the internship experience. Internship presentations will be scheduled in late summer. The grade for each student will be determined in consultation with the mentor. Students will register for either 3 or 6 credit hours. If 3 credits are taken, the student will replace the remaining 3 credits with an approved elective.
Mentored research or electives (6 credit hours)
All students enrolled in the Mentored Research course will complete a Master’s thesis, which may involve conducting and reporting a comprehensive data analysis or conducting research and reporting on a focused methodological problem; the latter may include a computer simulation approach to solve a problem, an in-depth review of available methods in a certain topical area, or developing new methods. The Internship experience may provide leads for the MS thesis. Each student will work closely with a Mentor who has expertise in biostatistics or a related quantitative field. The grade for each student will be determined in consultation with the mentor. MSIBS students who do not enroll in this course option are required to take 6 credit hours of approved electives.
Elective course options are in the Mathematics (L24) and Engineering (E62, E81) departments. Additional approved electives can be found under Other. Students should review the course prerequisites, and seek instructor approval if necessary prior to the start of classes.
- L24-429: Linear Algebra
- L24-439: Linear Statistical Models
- L24-4392: Advanced Linear Statistical Models
- L24-449: Numerical Applied Mathematics
- L24-459: Bayesian Statistics
- L24-475: Statistical Computation
- L24-493: Probability
- L24-494: Mathematical Statistics
- E62-537: Computational Molecular Biology
- E81-554A: Geometric Computing for Biomedicine
- E81-587A: Algorithms for Computational Biology
- E81-417T: Introduction to Machine Learning
- E81-514A: Data Mining
- E81-530S: Database Management Systems
- L32-584C: Multi-Level Models in Quantitative Research
- M19-502: Intermediate Epidemiology
- L41-5488: Genomics
Pathway core courses
Introduction to Epidemiology
This course introduces the basic principles and methods of epidemiology, with an emphasis on critical thinking, analytic skills, and application to clinical practice. Topics include outcome measures, methods of adjustment, surveillance, quantitative study designs, and sources of data. Designed for those with a clinical background, the course will provide tools for critically evaluating the literature and skills to practice evidence-based medicine.
This course will cover the basic applied and theoretical aspects of survival analysis techniques to analyze time-to-event data. Basic concepts will be introduced, and topics include survival function, hazard function, censoring and truncation, Kaplan-Meier and Nelson-Aalen estimators, cohort life table, likelihood construction for censored and truncated data, estimating hazard and survival functions, the Cox proportional hazards (PH) model with fixed and time-dependent covariates, and model selection. Additional topics will include regression diagnostics for survival models, the stratified PH model, parametric regression models, and competing risks. Computer lab sessions are designed to provide intensive hands-on experience in analyzing real-life datasets. Prerequisites: Biostat I and II, mathematical statistics (covers probabilities, distributions, likelihood, etc.), Calculus II or III, and SAS programming, or permission from the course master.
Human Linkage and Association
Basic genetic concepts: meiosis, inheritance, Hardy-Weinberg Equilibrium, linkage, segregation analysis; Linkage analysis: definition, crossing over, map functions, phase, LOD scores, penetrance, phenocopies, liability classes, multi-point analysis, non-parametric analysis (sibpairs and pedigrees), quantitative trait analysis, determination of power for Mendelian and complex trait analysis; Linkage Disequilibrium analyses: allelic association (case-control designs and family-based studies), QQ and Manhattan plots, whole genome association analysis; population stratification; Quantitative Trait Analysis: measured genotypes and variance components. Hands-on computer lab experience doing parametric linkage analysis with the program LINKAGE, model-free linkage analyses with Genehunter and Merlin, power computations with SLINK, quantitative trait analyses with SOLAR, LD computations with Haploview, and family-based and case-control association analyses with PLINK and SAS. The methods and exercises are coordinated with the lectures, and students are expected to understand underlying assumptions and limitations, and the basic calculations performed by these computer programs. Auditors will not have access to the computer lab sessions. Prerequisite: M21-515 Fundamentals of Genetic Epidemiology. For details, to register, and to receive the required permission of the Coursemaster, contact the MSIBS Program Manager at firstname.lastname@example.org or by telephone at 314-362-1384.
Computational Statistical Genetics
This course is designed to give students computational experience with the latest statistical genetics methods and concepts, so that they will be able to computationally implement the method(s)/model(s) developed as part of their thesis. Concentrating on the applications of genomics and SAS® computing, it deals with creating efficient new bioinformatic tools to interface with some of the latest, most important genetic epidemiological analysis software, as well as how to derive, design, and implement new statistical genetics models. The course also includes didactic instruction on the theory and application of the maximum likelihood principle and the differences between Bayesian and Frequentist approaches to analysis in genetic modeling. Specialized topics include haplotype estimation and modeling of genotype to phenotype relationships, LD and LE mapping, rare vs. common variant analysis methods, growth curve analysis, data mining and variable selection methods, the fundamentals of meta-analysis, importance sampling, permutation tests and empirical p-values, as well as the design of Monte Carlo simulation experiments. Course not available to auditors. Prerequisite: M21-5483 Human Linkage & Association, M21-560 Biostatistics I, and M21-570 Biostatistics II or, with permission of the Course Master, the equivalents.