Academic credit for approved internship experience.
Directed readings or laboratory study. May be taken more than once. Two to six laboratory hours a week.
Access to SAS, Excel required. Permission of instructor for nonmajors. Introductory course in probability, data analysis, and statistical inference designed for B.S.P.H. biostatistics students. Topics include sampling, descriptive statistics, probability, confidence intervals, tests of hypotheses, chi-square distribution, 2-way tables, power, sample size, ANOVA, non-parametric tests, correlation, regression, survival analysis.
Required preparation, previous or concurrent course in applied statistics. Permission of instructor for nonmajors. Introduction to use of computers to process and analyze data, concepts and techniques of research data management, and use of statistical programming packages and interpretation. Focus is on use of SAS for data management and reporting.
Students will gain proficiency with R, data wrangling, data quality control and cleaning, data visualization, exploratory data analysis, with an overall emphasis on the principles of good data science, particularly reproducible research. The course will also develop familiarity with several software tools for data science best practices, such as Git, Docker, Jupyter, Make, and Nextflow.
Arrangements to be made with the faculty in each case. A course for students of public health who wish to make a study of some special problem in the statistics of the life sciences and public health. Honors version available.
Required preparation, knowledge of basic descriptive statistics. Major topics include elementary probability theory, probability distributions, estimation, tests of hypotheses, chi-squared procedures, regression, and correlation.
Topics will include gaining proficiency with R and Python, data wrangling, data quality control and cleaning, data visualization, exploratory data analysis, and introductory applied optimization, with an overall emphasis on the principles of good data science, particularly reproducible research. Some emphasis will be given to large data settings such as genomics or claims data. The course will also develop familiarity with software tools for data science best practices, such as Git, Docker, Jupyter, and Nextflow.
This course is an introduction to machine learning. The goal is to equip students with knowledge of existing tools for data analysis and to prepare them for more advanced courses in machine learning. This course is restricted to SPH Master of Public Health students.
Designed for health care professionals who need to appraise the design and analysis of medical and health care studies and who intend to pursue academic research careers. Covers basics of statistical inference, analysis of variance, multiple regression, and categorical data analysis. Previously offered as PUBH 741. Permission of instructor.
Continuation of BIOS 641. Main emphasis is on logistic regression; other topics include exploratory data analysis and survival analysis. Previously offered as PUBH 742.
Required preparation, basic familiarity with statistical software (preferably SAS, with the ability to run multiple linear regression) and introductory biostatistics, such as BIOS 600. Continuation of BIOS 600. Analysis of experimental and observational data, including multiple regression and analysis of variance and covariance. Previously offered as BIOS 545. Permission of the instructor for nonmajors.
Required preparation, two semesters of calculus (such as MATH 231, 232). Fundamentals of probability; discrete and continuous distributions; functions of random variables; descriptive statistics; fundamentals of statistical inference, including estimation and hypothesis testing.
Required preparation, three semesters of calculus (such as MATH 231, 232, 233). Introduction to probability; discrete and continuous random variables; expectation theory; bivariate and multivariate distribution theory; regression and correlation; linear functions of random variables; theory of sampling; introduction to estimation and hypothesis testing. Students may not receive credit for both BIOS 660 and BIOS 672.
Distribution of functions of random variables; Helmert transformation theory; central limit theorem and other asymptotic theory; estimation theory; maximum likelihood methods; hypothesis testing; power; Neyman-Pearson Theorem, likelihood ratio, score, and Wald tests; noncentral distributions. Students may not receive credit for both BIOS 661 and BIOS 673.
Principles of study design, descriptive statistics, sampling from finite and infinite populations, inferences about location and scale. Both distribution-free and parametric approaches are considered. Gaussian, binomial, and Poisson models, one-way and two-way contingency tables.
Required preparation, BIOS 662. Matrix-based treatment of regression, one-way and two-way ANOVA, and ANCOVA, emphasizing the general linear model and hypothesis, as well as diagnostics and model building. Reviews matrix algebra. Includes statistical power for linear models and binary response regression methods.
Fundamental principles and methods of sampling populations, with emphasis on simple random, stratified, and cluster sampling. Sample weights, nonsampling error, and analysis of data from complex designs are covered. Practical experience through participation in the design, execution, and analysis of a sampling project.
Introduction to the analysis of categorized data: rates, ratios, and proportions; relative risk and odds ratio; Cochran-Mantel-Haenszel procedure; survivorship and life table methods; linear models for categorical data. Applications in demography, epidemiology, and medicine.
Analysis of variance and multiple linear regression course at the level of BIOS 663 required. Familiarity with matrix algebra required. Univariate and multivariate repeated measures ANOVA, GLM for longitudinal data, linear mixed models. Estimation and inference, maximum and restricted maximum likelihood, fixed and random effects.
Statistical concepts in basic public health study designs: cross-sectional, case-control, prospective, and experimental (including clinical trials). Validity, measurement of response, sample size determination, matching and random allocation methods.
Provides a foundation and training for working with data from clinical trials or research studies. Topics: issues in study design, collecting quality data, using SAS and SQL to transform data, typical reports, data closure and export, and working with big data.
Source and interpretation of demographic data; rates and ratios, standardization, complete and abridged life tables; estimation and projection of fertility, mortality, migration, and population composition.
Required preparation, three semesters of calculus. Introduction to probability; discrete and continuous random variables; combinatorics; expectation; random sums, multivariate distributions; functions of random variables; theory of sampling; convergence of sequences, power series, types of convergence, L'Hopital's rule, differentiable functions, Lebesgue integration, Fubini's theorem, convergence theorems, complex variables, Laplace transforms, inversion formulas.
Distribution of functions of random variables; central limit theorem and other asymptotic theory; estimation theory; hypothesis testing; Neyman-Pearson Theorem, likelihood ratio, score, and Wald tests; noncentral distributions. Advanced problems in statistical inferences, including information inequality, best unbiased estimators, Bayes estimators, asymptotically efficient estimation, nonparametric estimation and tests, simultaneous confidence intervals.
Introduction to concepts and techniques used in the analysis of time to event data, including censoring, hazard rates, estimation of survival curves, regression techniques, applications to clinical trials.
Field/topical/research seminar. Instructors use this course to offer instruction in particular topics or approaches.
Field visits to, and evaluation of, major nonacademic biostatistical programs in the Research Triangle area. Field fee: $25.
Directed research. Written and oral reports required.
Directed research. Written and oral reports required.
Permission of the department for students with a passing grade on either doctoral qualifying examination in biostatistics. BIOS 700 introduces doctoral students in biostatistics to the research skills necessary for writing a dissertation and for a career in research.
Required preparation, one undergraduate-level programming class. Teaches important concepts and skills for statistical software development using case studies. After this course, students will have an understanding of the process of statistical software development, knowledge of existing resources for software development, and the ability to produce reliable and efficient statistical software.
Permission of the instructor. Statistical theory applied to special problem areas of timely importance in the life sciences and public health. Lectures, seminars, and/or laboratory work, according to the nature of the special area under study.
This graduate-level course concentrates on up-to-date views of intercellular signal processing, with emphasis on signal transduction mechanisms as they relate to cellular/physiological responses in both normal development and disease. Signaling mechanisms that will be discussed include autocrine, paracrine, juxtacrine signaling and cell-matrix interactions.
This course will introduce the methods used in clinical trials. Topics include dose-finding trials, allocation to treatments in randomized trials, sample size calculation, interim monitoring, and non-inferiority trials.
Theory and application of nonparametric methods for various problems in statistical analysis. Includes procedures based on randomization, ranks and U-statistics. A knowledge of elementary computer programming is assumed.
Topics include correlograms, periodograms, fast Fourier transforms, power spectra, cross-spectra, coherences, ARMA and transfer-function models, spectral-domain regression. Real and simulated data sets are discussed and analyzed using popular computer software packages.
Measure space, sigma-field, measurable functions, integration, conditional probability, distribution functions, characteristic functions, convergence modes, SLLN, CLT, Cramer-Wold device, delta method, U-statistics, martingale central limit theorem, UMVUE, estimating function, MLE, Cramer-Rao lower bound, information bounds, LeCam's lemmas, consistency, efficiency, EM algorithm.
Elementary decision theory: admissibility, minimaxity, loss functions, Bayesian approaches. Hypothesis testing: Neyman-Pearson theory, UMP and unbiased tests, invariance, confidence sets, contiguous alternatives. Elements of stochastic processes: Poisson processes, renewal theory, Markov chains, martingales, Brownian motion.
Linear algebra, matrix decompositions, estimability, multivariate normal distributions, quadratic forms, Gauss-Markov theorem, hypothesis testing, experimental design, general likelihood theory and asymptotics, delta method, exponential families, generalized linear models for continuous and discrete data, categorical data, nuisance parameters, over-dispersion, multivariate linear model, generalized estimating equations, and regression diagnostics.
Continuation of BIOS 664 for advanced students: stratification, special designs, multistage sampling, cost studies, nonsampling errors, complex survey designs, employing auxiliary information, and other miscellaneous topics.
Theory and application of methods for categorical data including maximum likelihood, estimating equations and chi-square methods for large samples, and exact inference for small samples.
Presents modern approaches to the analysis of longitudinal data. Topics include linear mixed effects models, generalized linear models for correlated data (including generalized estimating equations), computational issues and methods for fitting models, and dropout or other missing data.
Required preparation, integral calculus. Life table techniques; methods of analysis when data are deficient; population projection methods; interrelations among demographic variables; migration analysis; uses of population models.
The course will review major statistical methods for the analysis of MRI and its applications in various studies.
Fundamental concepts, including classifications of missing data, missing covariate and/or response data in linear models, generalized linear models, longitudinal data models, and survival models. Maximum likelihood methods, multiple imputation, fully Bayesian methods, and weighted estimating equations. Focus on biomedical sciences case studies. Software packages include WinBUGS, SAS, and R.
Introductory overview of statistical learning methods and high-dimensional data analysis. Involves three major components: supervised or unsupervised learning methods, statistical learning theory, and statistical methods for high-dimensional data including variable selection and multiple testing. Real examples are used.
Statistical concepts and techniques for evaluating medical diagnostic tests and biomarkers for detecting disease. Measures for quantifying test accuracy. Statistical procedures for estimating and comparing these quantities, including regression modeling. Real data will be used to illustrate the methods. Developments in recent literature will be covered.
This course will consider drawing inference about causal effects in a variety of settings using the potential outcomes framework. Topics covered include causal inference in randomized experiments and observational studies, bounds and sensitivity analysis, propensity scores, graphical models, and other areas.
Permission of the instructor. A detailed presentation of natality models, including necessary mathematical methods, and applications; deterministic and stochastic models for population growth, migration.
Topics include Bayes' theorem, the likelihood principle, prior distributions, posterior distributions, predictive distributions, Bayesian modeling, informative prior elicitation, model comparisons, Bayesian diagnostic methods, variable subset selection, and model uncertainty. Markov chain Monte Carlo methods for computation are discussed in detail.
Counting process-martingale theory, Kaplan-Meier estimator, weighted log-rank statistics, Cox proportional hazards model, nonproportional hazards models, multivariate failure time data.
An introduction to statistical procedures in human genetics, Hardy-Weinberg equilibrium, linkage analysis (including use of genetic software packages), linkage disequilibrium and allelic association.
This course provides a comprehensive survey of the statistical methods for the design and analysis of genetic association studies, including genome-wide association studies and next-generation sequencing studies. Students will learn the theoretical justifications for the methods as well as the skills to apply them to real studies.
Molecular biology, sequence alignment, sequence motif identification by Monte Carlo Bayesian approaches, dynamic programming, hidden Markov models, computational algorithms, statistical software, high-throughput sequencing data and its application in computational biology.
Clustering algorithms, classification techniques, statistical techniques for analyzing multivariate data, analysis of high dimensional data, parametric and semiparametric models for DNA microarray data, measurement error models, Bayesian methods, statistical software, sample size determination in microarray studies, applications to cancer.
Theory and applications of empirical process methods to semiparametric estimation and inference for statistical models with both finite and infinite dimensional parameters. Topics include bootstrap, Z-estimators, M-estimators, semiparametric efficiency.
An introduction to the statistical collaborative process and leadership skills. Emphasized topics include problem solving, study design, data analysis, ethical conduct, teamwork, career paths, data management, written and oral communication with scientists and collaborators.
Under supervision of a faculty member, the student interacts with research workers in the health sciences, learning to abstract the statistical aspects of substantive problems, to provide appropriate technical assistance, and to communicate statistical results.
This seminar course is intended to give students exposure to cutting-edge research topics and to help them choose a thesis topic. It also allows students to meet and learn from major researchers in the field.
Using lectures and group exercises, students are taught where and how biostatisticians can offer leadership in both academic and nonacademic public health settings.
Required preparation, a minimum of one year of graduate work in statistics. Principles of statistical pedagogy. Students assist with teaching elementary statistics to students in the health sciences. Students work under the supervision of the faculty, with whom they have regular discussions of methods, content, and evaluation of performance.
Permission of the instructor. Seminar on new research developments in selected biostatistical topics.
Individual arrangements may be made by the advanced student to spend part or all of his or her time in supervised investigation of selected problems in statistics.