Department of Biostatistics (GRAD)
The Department of Biostatistics is recognized as a worldwide leader in research and practice. Members of the faculty are interested both in the development of statistical methodology and in the application of statistics in applied research. The research strengths include: development of new statistical methods to address pressing issues in medicine and public health sciences; design of innovative clinical trials that allow faster evaluation of new therapeutic agents; collaborative work focused on important public health concerns, including infectious diseases, cancer, cardiovascular disease, obesity, and drinking water safety; and utilization of strong quantitative skills to improve the health of human beings around the globe.
The mission of the Department of Biostatistics is to forge dramatic advances in health science research that benefit human health in North Carolina, the U.S., and globally through the development of profound and paradigm-shifting innovations in biostatistical methodology and the thoughtful implementation of biostatistical practice to solve public health problems.
For more information, please refer to the Academic Information Manual on the department's website.
Master of Public Health (M.P.H.)
The redesigned UNC Gillings School of Global Public Health’s master of public health (M.P.H.) program is for people who are passionate about solving urgent local and global public health problems. With a legacy of outstanding education, cutting-edge research, and globally recognized leadership, the UNC Gillings School is creating the next generation of public health leaders through our integrated training program and 21st-century curriculum. The Department of Biostatistics hosts the public health data science concentration.
Master of Science (M.S.)
The master of science (M.S.) degree in the Department of Biostatistics provides students with research-oriented training in the theory and methodology of biostatistics and its application to solving problems in the health sciences.
Doctor of Philosophy (Ph.D.)
The doctor of philosophy (Ph.D.) degree in the Department of Biostatistics provides advanced, research-oriented training in theory and methodology of biostatistics to prepare individuals for careers in academia, government, and industry.
Public Health, Master's Program (M.P.H.) — Public Health Data Science Concentration
The Public Health Data Science concentration, one of the first applied data science programs situated within a school of public health, gives students the skills and knowledge to employ cutting-edge data science tools and respond to pressing public health issues with effective solutions. Data science draws upon multiple disciplines, combining the statistical skills to manipulate data and make inferences, the mathematical skills to model phenomena and make predictions, and the computer science skills to manage and analyze large data sets. Steeped in the public health context, our program offers a unique focus on leveraging the foundational statistical, mathematical, and computer science elements of data science to generate useful information from data sources relevant to public health.
Course Requirements
Requirements for the M.P.H. degree in the Public Health Data Science concentration
Code | Title | Hours |
---|---|---|
M.P.H. Integrated Core | ||
SPHG 711 | Data Analysis for Public Health Fall 1 | 2 |
SPHG 712 | Methods and Measures for Public Health Practice Fall 1 | 2 |
SPHG 713 | Systems Approaches to Understanding Public Health Issues Fall 1 | 2 |
SPHG 701 | Leading from the Inside-Out Spring 1 | 2 |
SPHG 721 | Public Health Solutions: Systems, Policy and Advocacy Spring 1 | 2 |
SPHG 722 | Developing, Implementing, and Evaluating Public Health Solutions (MPH Comprehensive Exam administered in class) Spring 1 | 4 |
M.P.H. Practicum | ||
SPHG 703 | MPH Pre-Practicum Assignments Spring 1 | 0.5 |
SPHG 707 | MPH Post-Practicum Assignments Fall 2 | 0.5 |
M.P.H. Concentration | ||
BIOS 512 | Data Science Basics Fall 1 | 3 |
BIOS 650 | Basic Elements of Probability and Statistical Inference I Fall 1 | 3 |
BIOS 635 | Introduction to Machine Learning Spring 1 | 3 |
BIOS 645 | Principles of Experimental Analysis Spring 1 | 3 |
EPID 710 | Fundamentals of Epidemiology Fall 2 | 3 |
M.P.H. Electives | ||
Elective (Graduate-level courses, 400+ level at Gillings, 500+ level at UNC) | 3 | |
Elective (Graduate-level courses, 400+ level at Gillings, 500+ level at UNC) | 3 | |
Elective (Graduate-level courses, 400+ level at Gillings, 500+ level at UNC) | 3 | |
M.P.H. Culminating Experience | ||
BIOS 992 | Master's (Non-Thesis) Spring 2 | 3 |
Minimum Hours | 42 |
Admissions
Please visit Applying to the Gillings School first for details and information. Application to the residential M.P.H. is a two-step process. Please apply separately to (1) SOPHAS and (2) UNC–Chapel Hill (via the Graduate School application). Visit the Graduate School website for more details. If you are interested in the online M.P.H., please visit the MPH@UNC website and fill out an inquiry form.
Milestones
- Master's Committee
- Master's Written Examination/Approved Substitute (Comprehensive Exam)
- Thesis Substitute (Culminating Experience)
- Residence Credit
- Exit Survey
- Master's Professional Work Experience (Practicum)
Master of Science in Biostatistics (M.S.)
The master of science (M.S.) program is designed to provide research-oriented training in the theory and methodology of biostatistics and its applications to the solution of problems in the health sciences.
Course Requirements
Code | Title | Hours |
---|---|---|
Public Health Foundation Courses | ||
SPHG 600 | Introduction to Public Health 1 | 3 |
EPID 600 | Principles of Epidemiology for Public Health | 3 |
or EPID 710 | Fundamentals of Epidemiology | |
Core Courses | ||
BIOS 511 | Introduction to Statistical Computing and Data Management | 4 |
BIOS 660 | Probability and Statistical Inference I | 3 |
BIOS 661 | Probability and Statistical Inference II | 3 |
BIOS 662 | Intermediate Statistical Methods | 4 |
BIOS 663 | Intermediate Linear Models | 4 |
BIOS 667 | Applied Longitudinal Data Analysis | 3 |
BIOS 680 | Introductory Survivorship Analysis | 3 |
BIOS 691 | Field Observations in Biostatistics | 1 |
BIOS 841 | Principles of Statistical Collaboration and Leadership | 3 |
BIOS 843 | Seminar in Biostatistics (two semesters, 2 credit hours) 2 | 2 |
Electives 3,4,5 | ||
Six hours of course work that can include BIOS 635, 664, 665, and 668 or any course higher than 668 but not including 680 in Biostatistics. | 6 | |
Thesis/Substitute or Dissertation Course | ||
BIOS 992 | Master's (Non-Thesis) | 3 |
Minimum Hours | 45 |
- 1: Students with a prior public health degree are not required to take SPHG 600; exemptions are available for those with non-public health degrees from accredited SPHs. Students should discuss with their Academic Coordinator.
- 2: BIOS 843 Seminar must be taken two semesters for two credit hours after comprehensive exams.
- 3: Six hours of course work that can include BIOS 635, 664, 665, and 668 or any course higher than 668 but not including 680 in Biostatistics, or equivalent in the Department of Statistics and Operations Research (STOR) at UNC, or in the Department of Statistics at North Carolina State University (NCSU); these hours are considered individually and must be approved by the DGS.
- 4: Students interested in substituting a graduate-level course (600 level or higher) from outside the Gillings School of Global Public Health should submit a request to the Academic Coordinator for review by the DGS.
- 5: 700-level courses as approved by DGS would also count. Please refer to your student specialist. Please note that BIOS 990, BIOS 992, and BIOS 994 do not count towards the electives requirement.
Code | Title | Hours |
---|---|---|
Biostatistics Elective Course Options | ||
BIOS 635 | Introduction to Machine Learning | 3 |
BIOS 664 | Sample Survey Methodology | 4 |
BIOS 668 | Design of Public Health Studies | 3 |
BIOS 665 | Analysis of Categorical Data | 3 |
BIOS 672 | Topics in Real Analysis, Introduction to Measure Theory | 1 |
BIOS 673 | Intermediate Statistical Inference | 1 |
BIOS 669 | Working with Data in a Public Health Research Setting | 3 |
Milestones
The following list of milestones (non-course degree requirements) must be completed; view this list of standard milestone definitions for more information.
- Master's Committee
- Master's Written Exam / Approved Substitute
- Thesis Substitute
- Residence Credit
- Exit Survey
- Master's Written Exam 2
Doctor of Philosophy in Biostatistics (Ph.D.)
The doctor of philosophy (Ph.D.) degree in the Department of Biostatistics provides advanced, research-oriented training in theory and methodology of biostatistics to prepare individuals for careers in academia, government, and industry.
Course Requirements
Code | Title | Hours |
---|---|---|
Public Health Foundation Courses | ||
SPHG 600 | Introduction to Public Health 1 | 3 |
EPID 600 | Principles of Epidemiology for Public Health | 3 |
or EPID 710 | Fundamentals of Epidemiology | |
Core Courses | ||
BIOS 611 | Introduction to Data Science | 4 |
BIOS 660 | Probability and Statistical Inference I | 3 |
BIOS 661 | Probability and Statistical Inference II | 3 |
BIOS 662 | Intermediate Statistical Methods | 4 |
BIOS 663 | Intermediate Linear Models | 4 |
BIOS 672 | Topics in Real Analysis, Introduction to Measure Theory | 1 |
BIOS 673 | Intermediate Statistical Inference | 1 |
BIOS 735 | Statistical Computing - Basic Principles and Applications | 4 |
BIOS 760 | Advanced Probability and Statistical Inference I | 4 |
BIOS 761 | Advanced Probability and Statistical Inference II | 4 |
BIOS 762 | Theory and Applications of Linear and Generalized Linear Models | 4 |
BIOS 767 | Longitudinal Data Analysis | 4 |
BIOS 780 | Theory and Methods for Survival Analysis | 3 |
BIOS 841 | Principles of Statistical Collaboration and Leadership | 3 |
BIOS 843 | Seminar in Biostatistics 2 | 4 |
BIOS 850 | Training in Statistical Teaching in the Health Sciences | 3 |
Electives | ||
700-level Biostatistics or (Mathematical) Statistics course from the list below, or approval of the DGS | 9 | |
Thesis/Substitute or Dissertation Course | ||
BIOS 994 | Doctoral Research and Dissertation | 6 |
Minimum Hours | 71 |
- 1: Students with a prior public health degree are not required to take SPHG 600; exemptions are available for those with non-public health degrees from accredited SPHs. Students should discuss with their Academic Coordinator.
- 2: Four hours of BIOS 843 Seminar, taken individually as 1 credit hour each.
Code | Title | Hours |
---|---|---|
Biostatistics Elective Course Options | ||
BIOS 740 | Specialized Methods in Health Statistics | 3-4 |
BIOS 752 | Design and Analysis of Clinical Trials | 3 |
BIOS 764 | Advanced Survey Sampling Methods | 3 |
BIOS 765 | Models and Methodology in Categorical Data | 3 |
BIOS 772 | Statistical Analysis of MRI Images | 3 |
BIOS 773 | Statistical Analysis with Missing Data | 3 |
BIOS 774 | Advanced Machine Learning | 3 |
BIOS 782 | Statistical Methods in Genetic Association Studies | 3 |
BIOS 784 | Introduction to Computational Biology | 3 |
BIOS 785 | Statistical Methods for Gene Expression Analysis | 3 |
BIOS 775 | Statistical Methods in Diagnostic Medicine | 3 |
BIOS 776 | Causal Inference in Biomedical Research | 3 |
BIOS 777 | Precision Medicine and Machine Learning | 3 |
BIOS 779 | Bayesian Statistics | 4 |
BIOS 781 | Statistical Methods in Human Genetics | 4 |
STOR 701 | Statistics and Operations Research Colloquium | 1 |
STOR 712 | Optimization for Machine Learning and Data Science | 3 |
STOR 713 | Mathematical Programming II | 3 |
STOR 722 | Integer Programming | 3 |
STOR 734 | Stochastic Processes | 3 |
STOR 743 | Reinforcement Learning and Markov Decision Processes | 3 |
STOR 754 | Time Series and Multivariate Analysis | 3 |
STOR 757 | Bayesian Statistics and Generalized Linear Models | 3 |
STOR 767 | Advanced Statistical Machine Learning | 3 |
Milestones
The following list of milestones (non-course degree requirements) must be completed; view this list of standard milestone definitions for more information.
- Doctoral Committee
- Doctoral Oral Comprehensive Exam
- Doctoral Written Exam
- Prospectus Oral Exam
- Advanced to Candidacy
- Dissertation Defense
- Doctoral Dissertation Approved/Format Accepted
- Residence Credit
- Exit Survey
Following the faculty member's name is a section number that students should use when registering for independent studies, reading, research, and thesis and dissertation courses with that particular professor.
Professors
Kevin Anstrom (70), Clinical Trials, Statistical Consulting, Causal Inference, Data Safety Monitoring, Pragmatic Clinical Trials, and Coordinating Center Operations
Jianwen Cai (93), Survival Analysis and Regression Models, Clinical Trials, Analysis of Correlated Responses
David J. Couper (77), Epidemiological Methods, Longitudinal Data, Data Quality
Michael Hudgens (42), Nonparametric Estimation, Group Testing, Causal Inference, Infectious Diseases
Joseph G. Ibrahim (11), Bayesian Inference, Missing Data Problems, Bayesian Survival Analysis, Generalized Linear Models, Genomics
Anastasia Ivanova (83), Clinical Trials Design, Sequential Design of Binary Response Experiments, Statistical Methodology in Biostatistics
Gary G. Koch (14), Categorical Data Analysis, Nonparametric Methods
Michael R. Kosorok (88), Biostatistics, Bioinformatics, Empirical Processes, Statistical Learning, Data Mining, Semiparametric Inference, Monte Carlo Methods, Survival Analysis, Clinical Trials, Personalized Medicine, Cancer, Cystic Fibrosis
Yun Li (59), (Joint with the Department of Genetics), Statistical Genetics
Danyu Lin (31), Survival Analysis, Semiparametric Statistical Methods, Clinical Trials
Feng-Chang Lin (71), Survival Analysis, Generalized Linear Models, Longitudinal Analysis, Heart Disease and Stroke, Infectious Disease, Neuroscience
Yufeng Liu (73), (Joint with the Department of Statistics and Operations Research), Statistical Machine Learning and Data Mining, High-Dimensional Data Analysis, Nonparametric Statistics and Functional Estimation, Bioinformatics, Design and Analysis of Experiments
James Stephen Marron (82), (Joint with the Department of Statistics and Operations Research), High Dimension Low Sample Size (HDLSS), Data and/or Data, Exotic Data Types such as Manifold and Tree-Structured Data
Jane Monaco (43), Survival Analysis, Correlated Failure Time Data
Andrew Nobel, (Joint with the Department of Statistics and Operations Research), Data Mining, Statistical Analysis of Genomic Data, Machine Learning
John S. Preisser Jr. (89), Categorical Data, Longitudinal Data Analysis
Bahjat Qaqish (94), Generalized Linear Models, Survival Analysis, Statistical Computing
Todd A. Schwartz (13), Categorical Data, Clinical Trials
Richard Smith, (Joint with the Department of Statistics and Operations Research), Spatial Statistics, Time Series Analysis, Extreme Value Theory, Bayesian Statistics
Daniela T. Sotres-Alvarez (74), Linear Mixed Models, Latent Variable Models, Dietary and Physical Activity Patterns
Xianming Tan (50), Finite Mixture Models, Design of Clinical Studies, Variable Selection for Zero-Inflated Models, Non-Parametric Regression
Kinh N. Truong (90), Time Series Analysis, Nonparametric Regression, Bootstrap Methods, Hazard Regression, Splines
Haibo Zhou (40), Missing/Auxiliary Data, Survival Analysis, Human Fertility
Hongtu Zhu (48), Neuroimaging Statistics, Structural Equation Models, Statistical Computing, Diagnostic Methods
Fei Zou (4), Statistical Genetics
Associate Professors
Robert Agans (78), Population-Based Research Methods, Multimode Data Collection Procedures, Questionnaire Development, Standardization and Validation, Hard-to-Reach Populations and Minorities
Jamie B. Crandell (64), (Joint with the School of Nursing), Bayesian Methods, Longitudinal Analysis and Measurement Error Modeling
Tanya P. Garcia (67), Survival Analysis, Semiparametric Theory, Longitudinal Data Analysis
Annie Green Howard (75), Cardiovascular Disease, Global Health
Yuchao Jiang (91), Statistical Modeling, Method Development and Data Analysis in Genetics and Genomics
Quefeng Li (81), High Dimensional Data Analysis, Integrative Analysis of Omics Data, Robust Statistics, Factor Models
Michael I. Love (39), (Joint with the Department of Genetics), Statistical Modeling of Genetics Data, High-Throughput Sequencing, RNA Sequencing (RNA-seq), Empirical Bayes Methods
Naim Rashid (79), Cancer, Genomics, High Throughput Sequencing, High Dimensional Data Analysis, Variable Selection
Di Wu (51), (Joint with the School of Dentistry), Statistical Bioinformatics and Biostatistics for Preprocess and Integration of High-Dimensional Biomedical Data
Baiming Zou (97), Robust Modeling of Data with Complex Structures, Machine Learning Methods for Large Scale Electronic Health Record Data Analysis
Assistant Professors
Ethan Alt, Bayesian Methods, Clinical Trial Design
Didong Li (80), Geometric Data Analysis, Information Geometry, Nonparametric Bayes, Spatial Statistics
Xihao Li, Statistical Genetics and Genomics, Integrative Analysis of WGS/WES and Multi-Omics Data, Functional Genomics and Annotations, Data Integration and Meta-Analysis, Multivariate Analysis, Machine Learning
Yusha Liu, Cancer, Single-Cell Modeling, Multi-Omics Data Integration, Bayesian Inference, Functional Data Analysis, and Quantile Regression
Kara McCormack (85), Statistical Pedagogy, Classroom Accessibility and Inclusivity
Bonnie Shook-Sa (98), Causal Inference, Survey Sampling, Infectious Diseases, Epidemiology
Instructors
Jane Eslinger (62)
Kinsey Helton
Marcus Herman-Giddens
Jeff Laux
Vincent Toups (17)
Adjunct Professors
Haoda Fu
Eric Laber
Sean Simpson
Wei Sun
William Valdar
Clarice Weinberg
Donglin Zeng
Richard Zink
Adjunct Associate Professors
Shanshan Zhao
Xiaojing Zheng
Adjunct Assistant Professors
Charles Pepe-Ranney
Matthew Psioda
Zhengwu Zhang
Professors Emeriti
Shrikant I. Bangdiwala
Lloyd E. Chambless
Clarence E. Davis
James E. Grizzle
Ronald W. Helms
William D. Kalsbeek
Lawrence L. Kupper
Lisa M. LaVange
Keith E. Muller
Dana E. Quade
Michael J. Symons
BIOS
Advanced Undergraduate and Graduate-level Courses
Access to SAS and Excel required. Permission of instructor for nonmajors. Introductory course in probability, data analysis, and statistical inference designed for B.S.P.H. biostatistics students. Topics include sampling, descriptive statistics, probability, confidence intervals, tests of hypotheses, chi-square distribution, 2-way tables, power, sample size, ANOVA, non-parametric tests, correlation, regression, survival analysis.
Required preparation, previous or concurrent course in applied statistics. Permission of instructor for nonmajors. Introduction to use of computers to process and analyze data, concepts and techniques of research data management, and use of statistical programming packages and interpretation. Focus is on use of SAS for data management and reporting.
Students will gain proficiency with R, data wrangling, data quality control and cleaning, data visualization, exploratory data analysis, with an overall emphasis on the principles of good data science, particularly reproducible research. The course will also develop familiarity with several software tools for data science best practices, such as Git, Docker, Jupyter, Make, and Nextflow.
Arrangements to be made with the faculty in each case. A course for students of public health who wish to make a study of some special problem in the statistics of the life sciences and public health. Honors version available.
Required preparation, knowledge of basic descriptive statistics. Major topics include elementary probability theory, probability distributions, estimation, tests of hypotheses, chi-squared procedures, regression, and correlation.
Topics will include gaining proficiency with R and Python, data wrangling, data quality control and cleaning, data visualization, exploratory data analysis, and introductory applied optimization, with an overall emphasis on the principles of good data science, particularly reproducible research. Some emphasis will be given to large data settings such as genomics or claims data. The course will also develop familiarity with software tools for data science best practices, such as Git, Docker, Jupyter, and Nextflow.
This course will be an introductory course to machine learning. The goal is to equip students with knowledge of existing tools for data analysis and to prepare students for more advanced courses in machine learning. Students in the SPH Master of Public Health with a Public Health Data Science concentration receive priority for enrollment.
Designed for health care professionals who need to appraise the design and analysis of medical and health care studies and who intend to pursue academic research careers. Covers basics of statistical inference, analysis of variance, multiple regression, categorical data analysis. Previously offered as PUBH 741. Permission of instructor.
Continuation of BIOS 641. Main emphasis is on logistic regression; other topics include exploratory data analysis and survival analysis. Previously offered as PUBH 742.
Required preparation, basic familiarity with statistical software (preferably SAS, with the ability to run multiple linear regression) and introductory biostatistics, such as BIOS 600. Continuation of BIOS 600. Analysis of experimental and observational data, including multiple regression and analysis of variance and covariance. Previously offered as BIOS 545. Permission of the instructor for nonmajors.
Required preparation, two semesters of calculus (such as MATH 231, 232). Fundamentals of probability; discrete and continuous distributions; functions of random variables; descriptive statistics; fundamentals of statistical inference, including estimation and hypothesis testing.
Required preparation, three semesters of calculus (such as MATH 231, 232, 233). Introduction to probability; discrete and continuous random variables; expectation theory; bivariate and multivariate distribution theory; regression and correlation; linear functions of random variables; theory of sampling; introduction to estimation and hypothesis testing.
Distribution of functions of random variables; Helmert transformation theory; central limit theorem and other asymptotic theory; estimation theory; maximum likelihood methods; hypothesis testing; power; Neyman-Pearson Theorem, likelihood ratio, score, and Wald tests; noncentral distributions.
Principles of study design, descriptive statistics, sampling from finite and infinite populations, inferences about location and scale. Both distribution-free and parametric approaches are considered. Gaussian, binomial, and Poisson models, one-way and two-way contingency tables.
Required preparation, BIOS 662. Matrix-based treatment of regression, one-way and two-way ANOVA, and ANCOVA, emphasizing the general linear model and hypothesis, as well as diagnostics and model building. Reviews matrix algebra. Includes statistical power for linear models and binary response regression methods.
Fundamental principles and methods of sampling populations, with emphasis on simple random, stratified, and cluster sampling. Sample weights, nonsampling error, and analysis of data from complex designs are covered. Practical experience through participation in the design, execution, and analysis of a sampling project.
Introduction to the analysis of categorized data: rates, ratios, and proportions; relative risk and odds ratio; Cochran-Mantel-Haenszel procedure; survivorship and life table methods; linear models for categorical data. Applications in demography, epidemiology, and medicine.
Matrix-based longitudinal data analysis emphasizing applications and interpretation. Linear and generalized linear, marginal and mixed regression models. Fixed effects and random effects. Maximum likelihood, REML, GEE. Regression diagnostics. Sample size. Simulation of longitudinal data.
Statistical concepts in basic public health study designs: cross-sectional, case-control, prospective, and experimental (including clinical trials). Validity, measurement of response, sample size determination, matching and random allocation methods.
Provides a foundation and training for working with data from clinical trials or research studies. Topics: issues in study design, collecting quality data, using SAS and SQL to transform data, typical reports, data closure and export, and working with big data.
Source and interpretation of demographic data; rates and ratios, standardization, complete and abridged life tables; estimation and projection of fertility, mortality, migration, and population composition.
Selected topics in calculus, real analysis including Taylor's series, Riemann, Stieltjes and Lebesgue integration, and complex variables. Introduction to measure theory.
This course introduces intermediate concepts and theories in statistical inferences, including multivariate transformation, convergence of random vectors, sufficient and complete statistics, methods of estimation, and advanced problems such as information inequality, unbiased estimators, Bayes estimators, asymptotically efficient estimation, nonparametric estimation, and simultaneous confidence intervals.
Introduction to concepts and techniques used in the analysis of time to event data, including censoring, hazard rates, estimation of survival curves, regression techniques, applications to clinical trials.
Field/topical/research seminar. Instructors use this course to offer instruction in particular topics or approaches.
Field visits to, and evaluation of, major nonacademic biostatistical programs in the Research Triangle area. Field fee: $25.
Directed research. Written and oral reports required.
Directed research. Written and oral reports required.
Graduate-level Courses
Permission of the department, for students with a passing grade on either doctoral qualifying examination in biostatistics. BIOS 700 will introduce doctoral students in biostatistics to research skills necessary for writing a dissertation and for a career in research.
Required preparation, one undergraduate-level programming class. Teaches important concepts and skills for statistical software development using case studies. After this course, students will have an understanding of the process of statistical software development, knowledge of existing resources for software development, and the ability to produce reliable and efficient statistical software.
Permission of the instructor. Statistical theory applied to special problem areas of timely importance in the life sciences and public health. Lectures, seminars, and/or laboratory work, according to the nature of the special area under study.
This course will introduce the methods used in clinical trials. Topics include dose-finding trials, allocation to treatments in randomized trials, sample size calculation, interim monitoring, and non-inferiority trials.
Theory and application of nonparametric methods for various problems in statistical analysis. Includes procedures based on randomization, ranks and U-statistics. A knowledge of elementary computer programming is assumed.
Measure space, sigma-field, measurable functions, integration, conditional probability, distribution functions, characteristic functions, convergence modes, SLLN, CLT, Cramer-Wold device, delta method, U-statistics, martingale central limit theorem, UMVUE, estimating function, MLE, Cramer-Rao lower bound, information bounds, LeCam's lemmas, consistency, efficiency, EM algorithm.
Elementary decision theory: admissibility, minimaxity, loss functions, Bayesian approaches. Hypothesis testing: Neyman-Pearson theory, UMP and unbiased tests, invariance, confidence sets, contiguous alternatives. Elements of stochastic processes: Poisson processes, renewal theory, Markov chains, martingales, Brownian motion.
Linear algebra, matrix decompositions, estimability, multivariate normal distributions, quadratic forms, Gauss-Markov theorem, hypothesis testing, experimental design, general likelihood theory and asymptotics, delta method, exponential families, generalized linear models for continuous and discrete data, categorical data, nuisance parameters, over-dispersion, multivariate linear model, generalized estimating equations, and regression diagnostics.
Continuation of BIOS 664 for advanced students: stratification, special designs, multistage sampling, cost studies, nonsampling errors, complex survey designs, employing auxiliary information, and other miscellaneous topics.
Theory and application of methods for categorical data including maximum likelihood, estimating equations and chi-square methods for large samples, and exact inference for small samples.
Presents modern approaches to the analysis of longitudinal data. Topics include linear mixed effects models, generalized linear models for correlated data (including generalized estimating equations), computational issues and methods for fitting models, and dropout or other missing data.
Required preparation, integral calculus. Life table techniques; methods of analysis when data are deficient; population projection methods; interrelations among demographic variables; migration analysis; uses of population models.
The course will review major statistical methods for the analysis of MRI and its applications in various studies.
Fundamental concepts, including classifications of missing data, missing covariate and/or response data in linear models, generalized linear models, longitudinal data models, and survival models. Maximum likelihood methods, multiple imputation, fully Bayesian methods, and weighted estimating equations. Focus on biomedical sciences case studies. Software packages include WinBUGS, SAS, and R.
This advanced machine learning course, designed for PhD students in biostatistics and related fields, centers on cutting-edge tools in ML, encompassing theory, methods, and applications. It is motivated by complex biomedical data problems, offering in-depth exploration of technical details, model understanding, and the strengths and weaknesses of various approaches. The aim is to provide a comprehensive understanding of state-of-the-art ML tools for effectively analyzing and solving intricate biomedical data challenges.
Statistical concepts and techniques for evaluating medical diagnostic tests and biomarkers for detecting disease. Measures for quantifying test accuracy. Statistical procedures for estimating and comparing these quantities, including regression modeling. Real data will be used to illustrate the methods. Developments in recent literature will be covered.
This course will consider drawing inference about causal effects in a variety of settings using the potential outcomes framework. Topics covered include causal inference in randomized experiments and observational studies, bounds and sensitivity analysis, propensity scores, graphical models, and other areas.
In this course, we will address precision medicine from a statistical and machine learning perspective with numerous examples of application. We will develop a working knowledge of the following inter-related areas in the context of precision medicine and precision health: dynamic treatment regimes; causal inference for precision medicine; study designs such as SMARTs; and basic and advanced machine learning and artificial intelligence tools, including deep learning, outcome weighted learning, reinforcement learning, and Markov decision processes.
Topics include Bayes' theorem, the likelihood principle, prior distributions, posterior distributions, predictive distributions, Bayesian modeling, informative prior elicitation, model comparisons, Bayesian diagnostic methods, variable subset selection, and model uncertainty. Markov chain Monte Carlo methods for computation are discussed in detail.
Counting process-martingale theory, Kaplan-Meier estimator, weighted log-rank statistics, Cox proportional hazards model, nonproportional hazards models, multivariate failure time data.
An introduction to statistical procedures in human genetics, Hardy-Weinberg equilibrium, linkage analysis (including use of genetic software packages), linkage disequilibrium and allelic association.
This course provides a comprehensive survey of the statistical methods for the design and analysis of genetic association studies, including genome-wide association studies and next-generation sequencing studies. Students will learn the theoretical justifications for the methods as well as the skills to apply them to real studies.
Molecular biology, sequence alignment, sequence motifs identification by Monte Carlo Bayesian approaches, dynamic programming, hidden Markov models, computational algorithms, statistical software, high-throughput sequencing data and its application in computational biology.
Clustering algorithms, classification techniques, statistical techniques for analyzing multivariate data, analysis of high dimensional data, parametric and semiparametric models for DNA microarray data, measurement error models, Bayesian methods, statistical software, sample size determination in microarray studies, applications to cancer.
Theory and applications of empirical process methods to semiparametric estimation and inference for statistical models with both finite and infinite dimensional parameters. Topics include bootstrap, Z-estimators, M-estimators, semiparametric efficiency.
An introduction to the statistical collaborative process and leadership skills. Emphasized topics include problem solving, study design, data analysis, ethical conduct, teamwork, career paths, data management, written and oral communication with scientists and collaborators.
Under supervision of a faculty member, the student interacts with research workers in the health sciences, learning to abstract the statistical aspects of substantive problems, to provide appropriate technical assistance, and to communicate statistical results.
This seminar course is intended to give students exposure to cutting-edge research topics and to help them in their choice of a thesis topic. It also allows the student to meet and learn from major researchers in the field.
Using lectures and group exercises, students are taught where and how biostatisticians can offer leadership in both academic and nonacademic public health settings.
Required preparation, a minimum of one year of graduate work in statistics. Principles of statistical pedagogy. Students assist with teaching elementary statistics to students in the health sciences. Students work under the supervision of the faculty, with whom they have regular discussions of methods, content, and evaluation of performance.
Permission of the instructor. Seminar on new research developments in selected biostatistical topics.
Individual arrangements may be made by the advanced student to spend part or all of his or her time in supervised investigation of selected problems in statistics.
Department of Biostatistics
Chair
Michael G. Hudgens
Associate Chair
Todd A. Schwartz