Training Machine Learning Models with Uncertainty

Graphical depiction of SVM training with Gaussian Uncertainty. From: C. Tzelepis, V. Mezaris and I. Patras, "Linear Maximum Margin Classifier for Learning from Uncertain Data," in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 40, no. 12, pp. 2948-2962, 1 Dec. 2018.

Combine data science, machine learning and statistics to build algorithms that autonomously classify data.

Project is Not Currently Available

This project has reached full capacity for the current term. Please check back next semester for updates.

Project description

Machine Learning (ML) is an efficient and effective way for algorithms to assess data based on historical events. Support Vector Machines (SVMs) are a type of ML model that classifies input data into one of two categories. These models are typically trained on deterministic data, that is, data without any uncertainty or randomness. However, in real life all data is known only to some level of accuracy, and the ML model must account for this. During this project, we will examine how to train an SVM model using uncertain data. One such approach has been proposed in the literature and will be investigated. Its results will be compared against an alternative way of training the model based on statistically significant sampling, which may offer computational savings, a primary concern when training ML algorithms. Results from this project will be summarized in a final report and, ideally, a published academic article.
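As a rough illustration of the sampling idea described above, the Python sketch below draws several random samples around each uncertain training point and then trains an ordinary linear SVM on the enlarged data set. It is only a sketch under stated assumptions, not the project's algorithm or the method from the literature: each point is assumed to have isotropic Gaussian uncertainty with a known standard deviation, the function names (sample_uncertain_points, train_linear_svm, predict) are hypothetical, and a plain stochastic subgradient descent on the regularized hinge loss stands in for a production SVM solver.

```python
import random


def sample_uncertain_points(means, sigmas, labels, n_samples, rng):
    """Draw n_samples points around each mean (assumes isotropic Gaussian noise)."""
    xs, ys = [], []
    for mu, sigma, y in zip(means, sigmas, labels):
        for _ in range(n_samples):
            xs.append([rng.gauss(m, sigma) for m in mu])
            ys.append(y)
    return xs, ys


def train_linear_svm(xs, ys, lam=0.01, eta=0.01, epochs=100, rng=None):
    """Fit w, b by stochastic subgradient descent on the L2-regularized hinge loss."""
    rng = rng or random.Random(0)
    w = [0.0] * len(xs[0])
    b = 0.0
    order = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = xs[i], ys[i]
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            # The regularization term shrinks w on every step.
            w = [wj - eta * lam * wj for wj in w]
            if margin < 1.0:
                # Hinge loss is active: push the boundary away from this sample.
                w = [wj + eta * y * xj for wj, xj in zip(w, x)]
                b += eta * y
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on which side of the hyperplane x falls."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0.0 else -1


if __name__ == "__main__":
    # Illustrative data only: four uncertain points, two per class.
    rng = random.Random(42)
    means = [[2.0, 2.0], [2.5, 1.5], [-2.0, -2.0], [-1.5, -2.5]]
    sigmas = [0.3, 0.3, 0.3, 0.3]
    labels = [1, 1, -1, -1]
    xs, ys = sample_uncertain_points(means, sigmas, labels, 20, rng)
    w, b = train_linear_svm(xs, ys, rng=rng)
    print("weights:", w, "bias:", b)
```

The number of samples drawn per point controls the trade-off between how well the uncertainty is represented and how much training costs, which is the kind of computational question the project description raises.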

Project outcome

During this project the student will learn what Support Vector Machine (SVM) classification is and how to train an SVM model. They will also learn about uncertainty in data, how operations on uncertain data affect model outputs, and about statistically significant sampling. The student will then apply their knowledge of sampling to develop an SVM training algorithm. The results from this algorithm will be compared against those of an existing algorithm from the literature, which inspired this project. The student will also learn how to present their results by writing a final report or academic publication summarizing their work.
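To give a feel for how uncertainty in an input carries through to a model output, here is a small Python sketch for a linear classifier. It assumes, purely for illustration, that an input x is Gaussian with mean mu and isotropic standard deviation sigma; the score w·x + b is then also Gaussian, with mean w·mu + b and standard deviation sigma·‖w‖, so the chance that the point lands on the wrong side of the boundary has a closed form. The function name misclassification_probability is hypothetical.

```python
import math


def misclassification_probability(w, b, mu, sigma, y):
    """P(y * (w.x + b) < 0) for x ~ N(mu, sigma^2 I) and label y in {+1, -1}."""
    mean_score = sum(wj * mj for wj, mj in zip(w, mu)) + b
    std_score = sigma * math.sqrt(sum(wj * wj for wj in w))
    if std_score == 0.0:
        # No uncertainty in the score: the outcome is deterministic.
        return 0.0 if y * mean_score > 0.0 else 1.0
    # Standard normal CDF evaluated at -y * mean / std.
    z = -y * mean_score / std_score
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
```

A point whose mean lies exactly on the decision boundary gives a probability of 0.5, and the probability falls as the mean moves further onto the correct side relative to sigma.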

Project details

Timing, eligibility and other details
Length of commitment: Longer than a semester; 6-9 months
Start time: Summer (May/June)
In-person, remote, or hybrid: Hybrid project (can be remote and/or in-person; to be determined by mentor and student)
Level of collaboration: Individual student project
Benefits: Stipend
Who is eligible: Juniors and seniors who have experience in MATLAB and/or Python programming, linear algebra, and introductory statistics

Core partners

Project mentor

Christopher Nebelecky

Research Scientist

Mechanical and Aerospace Engineering

Phone: (716) 418-2366

Email: ckn@buffalo.edu

Start the project

  1. Email the project mentor using the contact information above to express your interest and get approval to work on the project. (Here are helpful tips on how to contact a project mentor.)
  2. After you receive approval from the mentor to start this project, click the button to start the digital badge. (Learn more about ELN's digital badge options.) 

Preparation activities

Once you begin the digital badge series, you will have access to all the necessary activities and instructions. Your mentor has indicated they would like you to also complete the specific preparation activities below. Please reference this when you get to Step 2 of the Preparation Phase. 

To prepare for this project, watch the following video to gain a high-level, foundational understanding of Support Vector Machines:

https://www.youtube.com/watch?v=Y6RRHw9uN9o

Keywords

Mechanical and Aerospace Engineering, Machine Learning