Statistics and Data Science
Ellen Haberoth, Regents Math 307
507-786-3113
habero1@stolaf.edu
wp.stolaf.edu/mscs
(Mathematics, Statistics, and Computer Science)
With the growing abundance of data gathered in nearly every field, statistics and data science methods have become invaluable for transforming data into useful information. As a subject, statistics and data science is interdisciplinary, spanning the sciences (natural and social), the humanities, and even the arts. Application areas include economics, biology, health, education, actuarial science, and law. An increasing number of majors and concentrations require or recommend a statistics or data science course.
Overview of the Concentration
At St. Olaf, students can combine their interests in statistics and data science with any major and acquire a background that leads to graduate study and abundant career opportunities. To find out more about the statistics and data science concentration, visit the Statistics and Data Science program.
Requirements for the Concentration
Code | Title | Credits |
---|---|---|
Required Core Courses: | | |
MSCS 164 | Data Science 1 | 1.00 |
STAT 272 | Statistics 2 | 1.00 |
The prerequisite for STAT 272 can be fulfilled with any of the following: | | |
AP Statistics | | |
ECON 260 | Introductory Econometrics | |
STAT 110 | Principles of Statistics | |
STAT 172 | Statistics 1 | |
Select one (1) Level III course from the following: | | 1.00 |
ECON 384 | Econometrics: Cross-Sectional and Panel Data ¹ | |
ECON 385 | Econometrics: Time Series and Forecasting ¹ | |
MSCS 341 | Algorithms for Decision Making | |
STAT 316 | Advanced Statistical Modeling | |
STAT 322 | Statistical Theory | |
STAT 382 | Advanced Topics in Statistics | |
Select one (1) elective from the following: | | 1.00 |
CSCI 125 | Computer Science for Scientists and Mathematicians | |
MATH 262 | Probability Theory | |
MSCS 264 | Introduction to Data Science | |
Analyzing Politics and Policies | | |
PSYCH 230 | Research Methods in Psychology | |
SOAN 371 | Foundations of Social Science Research: Quantitative Methods | |
STAT 270 | Intermediate Statistics for Social Science Research | |
STAT 282 | Topics in Statistics | |
STAT 284 | Biostatistics: Design and Analysis | |
Any of the Level III courses listed above ¹ | | |
Experiential Learning Component (optional, see below) | | |
Total Credits | | 4 |
¹ Only one of ECON 384 or ECON 385 may count toward the concentration.
Experiential Learning Component (Optional)
Each concentrator is encouraged to participate in experientially based research or employment that takes statistical methods beyond the traditional classroom. This can occur on or off campus. Earning credit requires prior approval by the director of the statistics and data science program and a follow-up letter from a supervisor. Excellent opportunities for experiential learning in statistics and data science are available through STAT 294 Academic Internship and MSCS 389 Research Methods, offered through the Center for Interdisciplinary Research (CIR). As CIR fellows, students can work with faculty during the academic year or summer on research from a variety of disciplines.
Note: For students considering graduate school in statistics or a closely related field, the following courses are recommended:
Code | Title | Credits |
---|---|---|
MATH 126 | Calculus II | 1.00 |
or MATH 128 | Honors Calculus II | |
MATH 220 | Elementary Linear Algebra | 1.00 |
MATH 226 | Multivariable Calculus | 1.00 |
MATH 230 | Differential Equations I | 1.00 |
MATH 242 | Modern Computational Mathematics | 1.00 |
MATH 244 & MATH 344 | Real Analysis I and Real Analysis II | 2.00 |
CSCI 221 | Introduction to Data Structures in C++ | 1.00 |
STAT 110, STAT 172, and ECON 260 all provide an introduction to statistics, and students should not take more than one; they can all serve as a prerequisite for further courses, although ECON 260 is geared toward majors in economics. Students coming from STAT 110, ECON 260, or AP Statistics who would like to transition into the statistics and data science concentration are encouraged to begin in MSCS 164.
STAT 110: Principles of Statistics
This is an introductory course for the liberal arts. Students learn study design principles and develop statistical literacy and reasoning. They learn to describe distributions, assess whether known distributions fit their data, estimate population values with confidence intervals, and assess statistical significance with hypothesis tests (e.g., chi-square, z-, and t-tests, ANOVA, correlation, and regression). Not recommended for students who have completed a term of calculus. STAT 110, STAT 172, and ECON 260 all provide an introduction to statistics, and students should not take more than one; they all can serve as a prerequisite for further courses. Offered each semester. Also counts toward environmental studies (social science emphasis) and kinesiology majors and the public health studies concentration.
STAT 172: Statistics 1
A first course in statistical methods, this course addresses study design and its implications as well as exploratory and inferential techniques for analyzing and modeling data. Topics include exploratory graphics, descriptive techniques, randomization tests, statistical designs, hypothesis testing, confidence intervals, and simple/multiple regression. Offered each semester. Enrollment limited for seniors. STAT 110, STAT 172, and ECON 260 all provide an introduction to statistics, and students should not take more than one; they all can serve as a prerequisite for further courses. Also counts toward environmental studies major (natural science and social science emphases), kinesiology major, and business and management studies, mathematical biology, and public health studies concentrations.
STAT 270: Intermediate Statistics for Social Science Research
This course focuses on the use of statistics in a social science context. Students investigate three essential questions: How can one reliably measure something? How does one design valid research? How does one analyze research results? Topics include ANOVA designs (for example, one-way and two-way with interaction), data reduction methods, and principles of measurement. Interdisciplinary groups work together on case studies throughout the term. Offered alternate years. Also counts toward public health studies concentration.
Prerequisites: STAT 110 or STAT 172 or ECON 260 or equivalent preparation, or permission of the instructor.
STAT 272: Statistics 2
This course takes a case-study approach to the fitting and assessment of statistical models with application to real data. Specific topics include multiple regression, model diagnostics, logistic regression, experimental design and ANOVA. The approach focuses on problem-solving tools, interpretation, model assumptions underlying analysis methods, and written statistical reports. Offered each semester. Also counts toward environmental studies major (natural science and social science emphases) and business and management studies, mathematical biology, neuroscience, and public health studies concentrations.
Prerequisite: STAT 172, ECON 260, or equivalent preparation (STAT 110 and MSCS 264, or AP Statistics and MSCS 264), or permission of instructor.
STAT 282: Topics in Statistics
Students explore special topics in statistics. Topics vary from year to year. May be repeated if topic is different. Offered periodically.
STAT 284: Biostatistics: Design and Analysis
The course investigates issues in health-related settings using a quantitative, research-oriented perspective. Course material focuses on global and public health issues, study design, methods for analyzing health data, and communication of research findings. Design topics include controlled trials, case-control, cohort and other observational studies. Methods include survival analysis and causal inference for observational studies. Communication emphasizes writing up findings and interpreting published research. Also counts toward mathematical biology concentration. Offered alternate years.
Prerequisite: completion of STAT 272 or permission of the instructor.
STAT 294: Academic Internship
STAT 298: Independent Study
STAT 316: Advanced Statistical Modeling
This course extends and generalizes methods introduced in STAT 272 by introducing generalized linear models (GLMs) and correlated data methods. GLMs covered include logistic and Poisson regression, among others. Correlated data methods include longitudinal data analysis and multilevel models. Applications are drawn from across the disciplines. Offered annually. Also counts toward neuroscience concentration.
Prerequisite: STAT 272.
STAT 322: Statistical Theory
This course is an investigation of modern statistical theory along with classical mathematical statistics topics such as properties of estimators, likelihood ratio tests, and distribution theory. Additional topics include Bayesian analysis, bootstrapping, Markov Chain Monte Carlo, and other computationally intensive methods. Offered annually. Also counts toward neuroscience concentration.
Prerequisite: STAT 272 and MATH 262.
STAT 382: Advanced Topics in Statistics
Students work intensively on a special topic in statistics. Topics may vary from year to year. May be repeated if topics are different. Offered periodically.
Prerequisite: permission of instructor.
STAT 394: Academic Internship
STAT 396: Directed Undergraduate Research
This course provides a comprehensive research opportunity, including an introduction to relevant background material, technical instruction, identification of a meaningful project, and data collection. The topic is determined by the faculty member in charge of the course and may relate to their research interests. Offered based on department decision. May be offered as a 1.00-credit or 0.50-credit course.
Prerequisite: determined by individual instructor.
STAT 398: Independent Research
Related Courses
CSCI 125: Computer Science for Scientists and Mathematicians
This course teaches introductory programming with a focus on handling data. Emphases include programming concepts and structures, writing computer code to solve quantitative problems, and the use of programming to analyze data. The primary tool is the Python programming language. Students work individually and in teams to apply basic principles and explore real-world datasets with a sustainability theme. Offered annually. Also counts toward statistics and mathematical biology concentrations; one of CSCI 121, CSCI 125, or CSCI 251 counts toward applied linguistics concentration.
Prerequisite: calculus or permission of the instructor.
ECON 260: Introductory Econometrics
This course emphasizes skills necessary to understand and analyze economic data. Topics include descriptive statistics, probability and random variables, sampling theory, estimation and hypothesis testing, and practical and theoretical understanding of simple and multiple regression analysis. Applications to economic and business problems use real data, realistic applications, and econometric/statistical software. Offered each semester. ECON 260 is required for economics majors who do not take both STAT 272 and either ECON 384 or ECON 385. Credit toward the economics major will not be given for ECON 260 following completion of STAT 272. Also counts toward environmental studies major (social science emphasis) and public health studies concentration.
Prerequisite: MATH 119 or MATH 120 and one of ECON 110 - ECON 121, or permission of instructor.
ECON 384: Econometrics: Cross-Sectional and Panel Data
This course emphasizes theoretical foundations, mathematical structure, and applications of major econometric techniques appropriate for cross-sectional and panel data. Topics to be covered include generalized least squares, dummy variables, non-linear models, instrumental variables techniques, fixed- and random-effects models, and limited dependent variable models. This course is recommended for students interested in analysis of issues in microeconomics and public policy. Offered annually. ECON 384 and ECON 385 may not both be used to satisfy the economic analysis requirements for either the economics or quantitative economics major.
Prerequisite: ECON 262 and one of ECON 260, ECON 263, or STAT 272; or permission of instructor.
ECON 385: Econometrics: Time Series and Forecasting
This course emphasizes the theoretical foundations, mathematical structure, and applications of major econometric techniques appropriate for time-series data. Topics covered include generalized least squares, single-equation time-series models, multi-variable time-series models, forecasting and forecast evaluation, and seasonality. This course is recommended for students interested in analysis of issues in macroeconomics and finance. Offered annually. ECON 384 and ECON 385 may not both be used to satisfy the economic analysis requirements for either the economics or quantitative economics major. Completion of MATH 220 may be helpful but is not required.
Prerequisites: ECON 261 and one of ECON 260 or ECON 263 or STAT 272; or permission of instructor.
MATH 262: Probability Theory
This course introduces the mathematics of randomness. Topics include probabilities on discrete and continuous sample spaces, conditional probability and Bayes' Theorem, random variables, expectation and variance, distributions (including binomial, Poisson, geometric, normal, exponential, and gamma) and the Central Limit Theorem. Students use computers to explore these topics. Offered each semester. Also counts toward business and management studies concentration.
Prerequisite: MATH 126 or MATH 128.
MSCS 150: Statistical and Data Investigations
Students learn basic techniques to analyze, manage, visualize, and model data. Instruction focuses on the analysis of "real," salient datasets in a computer-equipped classroom. In small groups students discuss, analyze, and solve case study-based problems. Class sessions include the Inquiry-Based Learning technique, which engages students in frequent presentations of their solutions to the class. Students use the R statistical software to perform statistical computing and data visualizations. Offered annually.
MSCS 164: Data Science 1
Data is the currency of the modern world. At the intersection between statistics and computer science, data science is about gleaning information and making decisions from data. Using data from a variety of contexts and disciplines, students learn to summarize and extract insight from data, create compelling data visualizations, wrangle data, practice literate programming, and explore ethical issues in data science. No prior experience with programming is expected. This course cannot be taken after MSCS 264.
MSCS 264: Introduction to Data Science
Data is the currency of the modern world, and data science is a field that sits at the intersection between statistics and computer science. At its heart, data science is about gleaning information and making decisions from data; this course provides a solid foundation in the most important data science tools. Students develop a common language for creating visualizations, wrangling data, programming in a literate manner, producing reproducible research, and communicating results. Offered each semester. Counts toward statistics and data science concentration.
MSCS 341: Algorithms for Decision Making
This course introduces students to the subject of machine learning. The primary focus is the development and application of powerful machine learning algorithms applied to complex, real-world data. Topics covered include linear regression, nearest neighbor models, k-means clustering, shrinkage methods, decision trees and forests, boosting, bagging, support vector machines, and hierarchical clustering. Applications are taken from a wide variety of disciplines, including biology, economics, public policy, public health, and sports. Offered on a regular basis. Counts toward computer science and mathematics majors and statistics and data science concentration.
Prerequisite: MSCS 164 or MSCS 264 or permission of the instructor.
MSCS 389: Math, Statistics, and Computer Science Research Methods (0.50)
Students focus on writing scientific papers, preparing scientific posters, and giving presentations in the context of a specific, year-long, interdisciplinary research project. In addition, this weekly seminar series builds collaborative research skills such as working in teams, performing reviews of math, statistics, and computer science literature, consulting effectively, and communicating proficiently. Exposure to post-graduate opportunities in math, statistics, and computer science disciplines is also provided. Open to students accepted into the Center for Interdisciplinary Research.
MSCS 390: Mathematics Practicum
Students work in groups on substantial problems posed by, and of current interest to, area businesses and government agencies. The student groups decide on promising approaches to their problem and carry out the necessary investigations with minimal faculty involvement. Each group reports the results of its investigations with a paper and an hour-long presentation to the sponsoring organization. Offered alternate years during January Term.
Prerequisite: Permission of instructor.
PSYCH 230: Research Methods in Psychology
This course prepares students with tools for understanding how research studies in psychology are conceptualized, designed, and ethically conducted, and how data is analyzed, interpreted, and disseminated. Students apply this understanding in independent and small group research projects. In the process, students develop critical reading, thinking, and scientific writing skills. Students attend lectures plus one two-hour laboratory per week. Offered each semester. Also counts toward environmental studies major, kinesiology major, and statistics and data science and public health studies concentrations.
Prerequisites: PSYCH 125, and STAT 110 or STAT 172 or ECON 260.
SOAN 371: Foundations of Social Science Research: Quantitative Methods
Students gain the skills necessary to conduct and critically evaluate quantitative research. Students learn the underlying theoretical assumptions and orientations of quantitative research, including research design, sampling techniques, strategies for data collection, and approaches to analysis. Students gain practice in data analysis by conducting a research project and using the Statistical Package for the Social Sciences (SPSS), a standard in sociology. Offered annually in the fall semester. Also counts toward environmental studies major (social science emphasis) and business and management studies and public health studies concentrations.
Prerequisite: STAT 110 or STAT 172; open to junior or senior sociology/anthropology majors only.
Director, 2023-2024
Kathryn Ziegler-Graham
Associate Professor of Mathematics, Statistics, and Computer Science
biostatistics
Laura Boehm
Assistant Professor of Mathematics, Statistics, and Computer Science
statistics; spatial data analysis
Jaime I. Davila
Assistant Professor of Mathematics, Statistics, and Computer Science
Francesca Gandini
Assistant Professor of Mathematics, Statistics, and Computer Science
Kimberly (Kim) Mandery
Visiting Instructor of Mathematics, Statistics, and Computer Science
Rachael Norton
Assistant Professor of Mathematics, Statistics, and Computer Science
Thomas (T.J.) Reinartz
Visiting Assistant Professor of Mathematics, Statistics, and Computer Science
Paul J. Roback
Kenneth Bjork Distinguished Professor of Mathematics, Statistics, and Computer Science
statistics
Joseph Roith
Associate Professor of Practice in Mathematics, Statistics, and Computer Science
statistics
Jack Wolf
Adjunct Instructor of Mathematics, Statistics, and Computer Science
Martha Zillig
Visiting Assistant Professor of Mathematics, Statistics, and Computer Science