The Master of Professional Studies in Data Sciences and Applications program will train students in analytics, including standard methods in data mining and machine learning, so they will possess the expertise to obtain insights from large and heterogeneous data sets.
Students will learn data management and manipulation, including database management, distributed and big data management, and cloud-based methodologies.
Students from all majors interested in data sciences and applications skills are encouraged to apply.
This program was created in consultation with companies such as IBM, HP, Sentient Science, Calspan, M&T, and Moog, which provided input on the skills that they see as difficult to find within current hiring pools and that they anticipate will be needed in the future.
The McKinsey Global Institute estimates that the job market will need an additional 140,000–190,000 trained personnel for “deep analytical talent positions” and 1.5 million more “data-savvy managers” to take full advantage of big data in the United States. A recent New York Times article states, “Universities can hardly turn out data scientists fast enough.” It is estimated that the national shortage for such talent is at least 60%.
This program is STEM-approved, giving international students the opportunity to apply for the 24-month STEM OPT extension.
Rachael Hageman Blair
709 Kimball Tower
Upon successful completion of the MPS degree, students will be expected to be able to:
This Master of Professional Studies degree is skills-oriented and provides training in the practice of data, computing and analysis. Students will need some prior knowledge of mathematics, statistics and computing, and bridge classes are available to prepare students for success in the program. In particular, we are interested in students from non-traditional backgrounds with an interest in and need for the skills that are the focus of this program.
We accept applications on a rolling basis throughout the year, but encourage all prospective students to submit their applications by the deadlines noted below.
Apply by October 1
Apply by February 15
A nonrefundable fee of $85 is required to apply. You may pay the application fee online with a credit card or e-check.
Copies of transcript(s) for all post-secondary schoolwork must be uploaded with the online application for initial review. Upon an offer of admission, accepted applicants will be required to submit official transcripts and proof of degree(s).
Two letters of recommendation are required to apply to this program. Letters are to be requested through the online application by providing the names and email addresses of recommenders. While we will accept letters from professional sources, we strongly prefer letters from professors who are acquainted with your academic interests, achievements, and abilities.
The GRE is optional for this program. If you would like to take the GRE, arrangements to take the exam can be made through the Educational Testing Service.
Note: the University at Buffalo Institutional Code is 2925.
Provide an account of your education and experience that describes your qualifications for the program.
International applicants, and all applicants whose native language is not English, are required to provide proof of English proficiency via a Test of English as a Foreign Language (TOEFL) score, International English Language Testing System (IELTS) score, or Pearson Test of English Academic (PTE Academic) score. Please use Institution Code 2925 to provide your scores to us.
The University at Buffalo has the following minimum admission requirements for these tests:
The exam results must be dated within two years of your proposed date of admission and remain valid upon entering the term for which you applied. For example, Fall 2020 begins in August 2020; therefore, your exam results must be valid until August 2020.
Information and arrangements to take the exam can be made by contacting the Educational Testing Service. We strongly recommend making test arrangements early in the year so that there is sufficient time for the results to be reported before our application deadline.
Fill out the International Applicant Financial Form. You will need to fill out the form labeled "Standard Graduate" for the appropriate academic year. The Financial Form and supporting bank documents may be uploaded to your application after an admissions decision has been made.
All international applicants must submit a completed financial statement. Answer all questions thoroughly. An I-20 cannot be issued without this statement documenting the necessary funds for each year of intended study (five years for a PhD program).
Please note that original financial documents must be brought to the school in person upon arrival at UB orientation.
Please do not mail application materials. All items should be submitted electronically with your online application. Please log in to the Application Management System frequently to ensure that all of your supporting documents have been received.
Course plan for full-time students:
All courses are 3 credit hours. An asterisk (*) indicates a new course that is being finalized for approval.
This course introduces students to computer science fundamentals for building basic data science applications. The course has two components. The first part introduces students to algorithm design and implementation in a modern, high-level programming language (currently, Python). It emphasizes problem-solving by abstraction. Topics include data types, variables, expressions, basic imperative programming techniques (including assignment, input/output, subprograms, parameters, selection, iteration, and the Boolean type and expressions), and the use of aggregate data structures, including arrays. Students will also be introduced to the basics of abstract data types and object-oriented design. The second part covers regression analysis and an introduction to linear models. Topics include multiple regression, analysis of covariance, least squares means, logistic regression, and nonlinear regression. Students learn to implement the regression models as a computer program and use the developed application to analyze synthetic and real-world data sets.
This course provides a basic understanding of relational databases, including normalization, database schemas, and relational algebra. Students will learn to create, update, query, and delete tables using standard SQL statements; understand workflows such as ETL (extract, transform, and load) that aggregate data from multiple sources and integrate it into databases and data warehouses; use, manage, and customize NoSQL databases, including key-value, wide-column, document, and graph stores, and apply them to non-tabular data; and use, manage, and customize graph databases and apply them to multi-dimensional datasets.
A first course on the design and implementation of numerical methods to solve the most common types of problem arising in science and engineering. Most such problems cannot be solved in terms of a closed analytical formula, but many can be handled with numerical methods learned in this course. Topics for the two semesters include: how a computer does arithmetic, solving systems of simultaneous linear or nonlinear equations, finding eigenvalues and eigenvectors of (large) matrices, minimizing a function of many variables, fitting smooth functions to data points (interpolation and regression), computing integrals, solving ordinary differential equations (initial and boundary value problems), and solving partial differential equations of elliptic, parabolic, and hyperbolic types. We study how and why numerical methods work, and also their errors and limitations. Students gain practical experience through course projects that entail writing computer programs.
Topics include: review of probability, conditional probability, Bayes' Theorem; random variables and distributions; expectation and properties; covariance, correlation, and conditional expectation; special distributions; Central Limit Theorem and applications; estimation, including Bayes estimators and maximum likelihood estimators, and their properties. Includes use of sufficient statistics to 'improve' estimators, distribution of estimators, unbiasedness, hypothesis testing, linear statistical models, and statistical inference from the Bayesian point of view.
This course presents statistical models for data mining, inference and prediction. The focus will be on supervised learning, which concerns outcome prediction from input data. Students will be introduced to a number of methods for supervised learning, including: linear and logistic regression, shrinkage methods, lasso, partial least squares, tree-based methods, model assessment and selection, model inference and averaging, and neural networks. Computational applications will be presented using R and high dimensional data to reinforce theoretical concepts.
This course presents the topic of data mining from a statistical perspective, with attention directed towards both applied and theoretical considerations. An emphasis will be placed on unsupervised learning methods, especially those designed to discover and exploit hidden structures in high-dimensional data. Topics include: hierarchical and center-based clustering, principal component analysis, data visualization, random forests, directed and undirected graphical models, and special considerations when n>>p. Computational applications to high-dimensional data will be presented using Matlab and R to illustrate methods and concepts.
Humans have an uncanny ability to learn from their mistakes and adapt to new environments by relying on their past experience. Machine learning focuses on the question, "How can we write a computer program that can improve its performance through experience?" Machine learning has a huge number of practical applications, more so in the present era of Big Data, where staggering volumes of diverse data in almost every facet of society, science, engineering, and commerce are presenting opportunities for valuable discoveries. For example, machine learning is being used to understand financial markets, the impact of climate change on society, protein-protein interactions, and diseases. Machine learning also has far-ranging applications, from self-driving cars to never-ending language learning systems. This course will focus on understanding the mathematical and statistical foundations of machine learning. We will also cover the core set of techniques and algorithms needed to understand the practical applications of machine learning. The course will present an integrated view of machine learning, statistics (classical and Bayesian), data mining, and information theory. A basic understanding of probability, statistics, algorithms, and linear algebra is expected. Familiarity with Python is required for homework assignments and for understanding in-class demonstrations.
This course covers the present-day terms, philosophies, technologies, and strategies that go into buttressing an organization’s cybersecurity posture. Managing the resources of a corporate information assurance program, while continually improving a risk footprint and response, is an underpinning of all topics that will be covered. Students will critically examine concepts such as networking, system administration, and system security, and will identify and apply basic security hardening techniques. Students will gain practical experience through a virtualized lab environment where they will build and secure a small corporate network.
This course will provide students with an overview of data-driven analytics in different industry sectors. The class will feature a series of visiting lecturers, with the faculty member teaching the class providing overview and continuity and grading homework and term papers.
This course will provide students with a final integrative project experience, obtained either in industry or at the university. In either case, students will use the skills acquired in their other classes to execute the project goals. Students will provide short reports to supervising faculty to ensure that learning objectives are being met.
Total of 30 credit hours