Meeting the demand for skilled IT and data science workers

The MS program in Engineering Science with a focus on Data Science gives students a core foundation in big data and analysis through knowledge, expertise and training in data collection and management, data analytics, scalable data-driven discovery, and fundamental concepts.

Program Director

Johannes Hachmann

612 Furnas Hall

hachmann@buffalo.edu

STEM Approved

The Engineering Science programs are STEM approved, allowing international students the opportunity to apply for the 24-month STEM OPT extension.

Who is this for?

For engineering and natural/mathematical science students.

This applied program trains students in the emerging, high-demand area of data and computing sciences. Many employment surveys have highlighted the need for suitably trained professionals in these areas, estimating shortfalls of as many as 150,000 people a year in the US alone.

Students will be trained in sound basic theory with an emphasis on practical aspects of data, computing and analysis. Graduates will be able to serve the analytics needs of employers and will be exposed to several areas of application. The degree can be specialized using electives and a project. Classes will be modestly sized and emphasize best classroom practices while employing online resources to reinforce the classroom experience.

Students in this program will need some prior knowledge of mathematics, statistics and computing (commensurate with that from an engineering/natural science/math undergraduate program; see below for details). The program can be completed in one calendar year of study.

The University at Buffalo has responded aggressively to these trends, first by establishing a doctoral program in Computational and Data Sciences. UB has been a research pioneer in these areas. Its faculty bring deep expertise and decades of experience, supported by assets such as the world-leading Center for Computational Research, with its unmatched facilities for big computing and data.

Some prior knowledge of mathematics, statistics and computing (commensurate with that from an engineering/natural science/math undergraduate program) is required.

Equivalent of a B average or better in a recognized undergraduate program; GRE: 300+ (waived for recent UB undergraduate students)

Calculus, Multivariate Calculus, Linear Algebra (e.g., UB course MTH 309)

Basic Statistics and Probability

Programming (at least one language - C/C++/Python/Java), Data Structures (e.g., UB course CSE 113)

- This program is currently taught in a cohort-based model and offers both Fall AND Spring admission.
- Students will take a combination of core courses (18 credits), electives (6 credits), a data science survey + capstone course (3 credits) and the data science project (3 credits) for a total of 30 credits.
- Students have the opportunity to complete an internship in industry for their data science project requirements. Alternatively, students can opt to complete a research project with a faculty member.
- The program may be completed in one calendar year of study.

**Course plan for full-time students:**

- First semester – 4 core courses (Math and Stats Basics)
- Second semester – 3 core courses + 1 elective
- Third semester – 1 Data Science Survey course + 1 Project/Capstone

- EAS 502: Introduction to Probability Theory for Data Science (3 credits; see description below)
- EAS 501: Introduction to Numerical Mathematics for Computing and Data Scientists (3 credits; see description below)
- EAS 595: Statistical Learning and Data Mining I *(as of spring 2021)*
- EAS 503: Programming and Database Fundamentals for Data Scientists (3 credits; see description below)
- EAS 596: Statistical Learning and Data Mining II *(as of spring 2021)*
- CSE 574: Introduction to Machine Learning (3 credits)
- Elective 1 (3 credits; see list below)
- CSE 560: Data Models Query Language (3 credits) *(as of Fall 2018)*
- EAS 504: Data Science Survey Course**
- EAS 560: Data Science Project***

*** The Data Science Survey Course will include weekly modules on application-oriented and other relevant topics, including: data science for bioinformatics, data science for health informatics, data science for engineering applications, ethics and privacy, and data science for finance.*

**** Students will work with an affiliated faculty member on a Data Science Project. Projects will be sourced from industry where feasible.*

**EAS 502 Introduction to Probability Theory for Data Science**

This course provides basic background on probability theory at a beginning graduate level. Topics include introductory probability concepts, discrete and continuous random variables and probability distributions, joint probability distributions, random sampling and data description, point estimation of parameters, random variables, derived probability distributions, discrete and continuous transforms and random incidence. As time permits, the course introduces elementary stochastic processes including Bernoulli and Poisson processes.

**EAS 501 Introduction to Numerical Mathematics for Computing and Data Scientists**

The aims of this course are:

- To develop the ability to formulate and solve problems using mathematical methods and tools
- To apply knowledge gained in lower level mathematics courses
- To introduce concepts and methods of linear algebra
- To introduce a broad range of numerical methods
- To develop an ability to identify, understand, and solve algebraic equations
- To develop an ability to identify, understand, and solve differential equations
- To develop experience with numerical and symbolic mathematical software and their use in problem solving

**EAS 595 Statistical Learning and Data Mining I**

An introduction to the mathematical theory and computational methodology at the heart of statistical learning. Using a Bayesian paradigm, this first semester considers supervised learning, including classification (support vector machines, k-nearest neighbors, Naive Bayes, logistic regression, tree methods and forests, bagging and ensemble methods), as well as Gaussian processes and neural networks, and methods for validation and testing. The R programming language will be used. Students will develop a facility for statistical learning from data and will become proficient in writing computer code to analyze datasets and draw conclusions from the analysis.

This course has both a traditional lecture component and an online computational lab component; labs will be run approximately every third week.

**EAS 503 Programming and Database Fundamentals for Data Scientists**

This course introduces students to computer science fundamentals for building basic data science applications. The course has two parts. The first part covers the fundamentals of programming with Python and Python libraries for data manipulation, visualization, and machine learning. The second part covers database design and use of databases in applications.

**EAS 596 Statistical Learning and Data Mining II**

An introduction to the mathematical theory and computational methodology at the heart of statistical learning. Using a Bayesian paradigm, this second semester considers unsupervised learning, including dimension reduction, clustering, Gaussian mixture methods, graph models, and model averaging. The course will examine parametric and non-parametric regression, including Gaussian process regression. The R programming language will be used. Students will develop a facility for statistical learning from data and will become proficient in writing computer code to analyze datasets and draw conclusions from the analysis.

This course has both a traditional lecture component and an online computational lab component; labs will be run approximately every third week.

**CSE 574 Intro to Machine Learning**

Involves teaching computer programs to improve their performance through guided training and unguided experience. Takes both symbolic and numerical approaches. Topics include concept learning, decision trees, neural nets, latent variable models, probabilistic inference, time series models, Bayesian learning, sampling methods, computational learning theory, support vector machines, and reinforcement learning.

**CSE 560 Data Models Query Language**

The course focuses on the issues of data models and query languages that are relevant for building present-day database applications. The following topics are addressed: Entity-Relationship data model, relational data model, relational query languages, object data models, constraints and triggers, XML and Web databases, the basics of indexing and query optimization.

**EAS 504 Applications of Data Science: Industry Overview**

This course will provide students with an overview of data-driven analytics in different industry sectors. The class will feature a series of visiting lecturers; the faculty member teaching the class will provide overview and continuity and will grade homework and term papers.

**EAS 560 Data Science Project**

This course will provide students with a final integrative project experience, obtained either in industry or at the university. In either case, students will use the skills acquired in their other classes to execute project goals. Students will provide short reports to supervising faculty to ensure that learning objectives are being met.

Two out of the following courses can be selected as electives.

CSE 531 Algorithms Analysis + Design

CSE 535 Information Retrieval

CSE 546 Reinforcement Learning

CSE 562 Database Systems

CSE 573 Computer Vision

CSE 586 Large-Scale Distributed Systems

CSE 587 Data Intensive Computing

CSE 601 Data Mining for Bioinformatics

CSE 633 Parallel Algorithms

CSE 635 Multimedia Information Retrieval

CSE 636 Data Integration

CSE 676 Deep Learning

*Students must have successfully completed CSE 574 before taking CSE 676. CSE 676 cannot be taken in the same semester as CSE 574.*

CSE 740 Machine Learning and Big Data

CSE 674 Advanced Machine Learning

STA 517 Categorical Data Analysis

STA 567 Bayesian Statistics

CDA 609 High Performance Computing

IE 575 Stochastic Methods

IE 535 Human Computer Interaction

EE 634 Principles of Information Theory and Coding

MTH 558/559 Mathematical Finance

**The Institute for Computational and Data Sciences is temporarily suspending the GRE requirement for admission to our master's and PhD programs for the Spring 2021 and Fall 2021 entry terms.**