This year's IAD Days was held April 26-28, 2022. The event was a celebration of big data and high-performance computing at the University at Buffalo.
IAD Days is a signature event hosted by the Institute for Artificial Intelligence and Data Science. Previously called CDSE Days, the event brings some of the nation's most prominent scholars of data-enabled science to Buffalo for a week of workshops, lectures and networking. The initiative increases educational opportunity and employability for students, attracts new graduate students to UB, and boosts research opportunities for aligned faculty members.
Tuesday, April 26

Time | Topic | Speaker | Location |
---|---|---|---|
1:00-3:00pm | A Hands-On Tutorial on AI Fairness and AI Explainability Using Open-Source Libraries | Moninder Singh | 330 Student Union |
3:00-3:15pm | Designing Scalable and Situation-Adaptive Robot Swarms via Optimization and Machine Learning | Souma Chowdhury | 330 Student Union |
3:15-3:30pm | Expressive Structured Matrices for Efficient and Accurate Training | Atri Rudra | 330 Student Union |
3:30-3:45pm | Early Detection of Autism in High-Risk 6-Month-Old Infants Using AI | Ifeoma Nwogu | 330 Student Union |
3:45-4:00pm | Systems Challenges in Visual Spatial Reasoning | Karthik Dantu | 330 Student Union |
4:00-5:00pm | Capturing Real-World Difficulties of AI Deployment | Noel Codella | 330 Student Union |
5:00-6:00pm | Empowering Medical Scanners with Autonomy | Ziyan Wu | 330 Student Union |
Wednesday, April 27

Time | Topic | Speaker | Location |
---|---|---|---|
1:00-2:00pm | Going with the flow: The role of cytoplasmic streaming during 3D cell migration | Wanda Strychalski | 330 Student Union |
2:00-3:00pm | Genomic surveillance of SARS-CoV-2 in Erie County highlights the need for regional surveillance and the partnerships to support it | Jennifer Surtees | 330 Student Union |
3:00-4:00pm | UB Faculty Research | David Doermann | 330 Student Union |
4:00-5:00pm | What 1936 Can Teach Us about 2016: Another Look at Nonresponse Bias and the Limits of Poll Aggregation | Jacob Neiheisel | 330 Student Union |
5:00-6:00pm | Entertainment-Education: Using Narrative Engagement and Emerging Technologies for Sexual and Reproductive Health Promotion | Helen Wang | 330 Student Union |
Thursday, April 28

Time | Topic | Speaker | Location |
---|---|---|---|
1:00-2:00pm | Fighting DeepFakes with AI Technology | Siwei Lyu | 330 Student Union |
2:00-3:00pm | Fighting Misinformation with Discovery Oriented Education | Naniette H. Coleman | 330 Student Union |
3:00-4:00pm | UB Faculty Research | David Doermann | 330 Student Union |
4:00-5:00pm | PhD Student Talks | | 330 Student Union |
5:00-6:00pm | Poster Session | | 215 Lockwood |
Souma Chowdhury: Associate Professor, Department of Mechanical and Aerospace Engineering, and Adjunct Associate Professor, Department of Computer Science and Engineering, University at Buffalo
Noel Codella: Microsoft Research
Karthik Dantu: Associate Professor, Department of Computer Science and Engineering, University at Buffalo
David Doermann: Empire Innovation Professor, Department of Computer Science and Engineering, and Director, Institute for Artificial Intelligence and Data Science, University at Buffalo
Siwei Lyu: Empire Innovation Professor, Department of Computer Science and Engineering, University at Buffalo
Ifeoma Nwogu: Associate Professor, Department of Computer Science and Engineering, University at Buffalo
Wanda Strychalski: Associate Professor, Department of Mathematics, Applied Mathematics and Statistics, Case Western Reserve University
Motivated by observations of phenomenal cooperative behavior in nature, large teams of simple robots promise unprecedented task efficiency and resilience benefits over sophisticated standalone systems. To advance this paradigm of large multi-robot teams, aka robot swarms, we have developed a suite of algorithms that guide the collective behavior of multiple ground or aerial robots for operations such as signal source localization, multi-location response, area coverage and boundary mapping. These algorithms are built using the machinery of optimization, graph theory and metaheuristics. To demonstrate their potential impact, we have explored their application in simulated use cases such as flood response, finding skiers trapped under avalanches, and offshore oil spill mapping. More recently, we have broken new ground on scalability of such autonomous robot swarm operations by innovating graph learning approaches to generate real-time policies for multi-robot task allocation and capacitated vehicle routing. A culmination of our work in swarm robotics has been the demonstration of new reinforcement learning frameworks and gaming environments (for human-swarm interaction) to model situation-adaptive tactical decisions for teams of UAVs and UGVs performing victim search in complex urban environments.
Modern robotics and mixed reality are driven by the ability of our devices/robots to perceive the environment using cameras and depth sensors. Localization and Mapping is a fundamental task in such reasoning. While research in Visual Simultaneous Localization and Mapping (VSLAM) has been underway for over a decade, it is still challenging to efficiently run these pipelines on modern embedded systems. I will discuss Edge-SLAM, a novel Visual SLAM pipeline that splits the computing between the mobile device and an edge server to efficiently use resources. I will also discuss other systems challenges in VSLAM with potential for future research.
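To make the device/edge split concrete, here is a minimal sketch of the pattern only, not the published Edge-SLAM code: the function names, keyframe rule, and queue-based plumbing below are illustrative stand-ins. The device runs cheap per-frame tracking and ships keyframes to an edge process, which performs the heavy map refinement and streams results back.

```python
import queue
import threading

keyframes = queue.Queue()    # device -> edge: selected keyframes
map_updates = queue.Queue()  # edge -> device: refined map entries

def edge_mapping():
    """Edge-server side: heavy local mapping / optimization (stubbed)."""
    while True:
        item = keyframes.get()
        if item is None:                    # sentinel: device is done
            break
        frame_id, pose = item
        map_updates.put((frame_id, pose))   # stand-in for bundle adjustment

def device_tracking(n_frames=10):
    """Device side: lightweight per-frame tracking (stubbed)."""
    local_map = []
    for frame_id in range(n_frames):
        pose = 0.1 * frame_id               # stand-in visual-odometry estimate
        if frame_id % 3 == 0:               # stand-in keyframe-selection rule
            keyframes.put((frame_id, pose))
        while not map_updates.empty():      # fold in whatever the edge sent back
            local_map.append(map_updates.get())
    keyframes.put(None)
    return local_map

edge = threading.Thread(target=edge_mapping)
edge.start()
local_map = device_tracking()
edge.join()
while not map_updates.empty():              # drain late-arriving refinements
    local_map.append(map_updates.get())
print(local_map)                            # [(0, 0.0), (3, 0.3), (6, 0.6), (9, 0.9)]
```

The design point is that the device never blocks on the expensive optimization; it folds in refined map entries asynchronously whenever they arrive.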
In this talk, I will cover two difficulties encountered when translating AI from research to deployment settings.
The first concerns the application of AI in medical imaging settings, specifically dermatology. Previous studies of artificial intelligence (AI) applied to dermatology have shown AI to have higher diagnostic classification accuracy than expert dermatologists; however, these studies did not adequately assess clinically realistic scenarios, such as how AI systems behave when presented with images of disease categories that are not included in the training dataset or images drawn from statistical distributions with significant shifts from training distributions. We aimed to simulate these real-world scenarios and evaluate the effects of image source institution, diagnoses outside of the training set, and other image artifacts on classification accuracy, with the goal of informing clinicians and regulatory agencies about safety and real-world accuracy. We have identified specific deficiencies and safety issues in AI diagnostic systems for skin cancer that should be addressed in future diagnostic evaluation protocols to improve safety and reliability in clinical practice.
The second concerns deployment of few-shot learning methods. Recent progress on few-shot learning largely relies on annotated data for meta-learning: base classes sampled from the same domain as the novel classes. However, in many applications, collecting data for meta-learning is infeasible or impossible. This leads to the cross-domain few-shot learning problem, where there is a large shift between base and novel class domains. While investigations of the cross-domain few-shot scenario exist, these works are limited to natural images that still contain a high degree of visual similarity. Here, we propose the Broader Study of Cross-Domain Few-Shot Learning (BSCD-FSL) benchmark, consisting of image data from a diverse assortment of image acquisition methods. This includes natural images, such as crop disease images, but additionally images with increasing dissimilarity to natural images, such as satellite images, dermatology images, and radiology images. Extensive experiments on the proposed benchmark are performed to evaluate state-of-the-art meta-learning approaches, transfer learning approaches, and newer methods for cross-domain few-shot learning.
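For readers unfamiliar with the evaluation protocol these methods are scored on, the sketch below runs a single 5-way 5-shot episode with a nearest-centroid ("prototype") classifier. This is a generic few-shot baseline for illustration only, not a method from the BSCD-FSL paper, and the Gaussian feature vectors are a synthetic stand-in for embeddings from a pretrained network.

```python
import numpy as np

rng = np.random.default_rng(0)
n_way, k_shot, n_query, dim = 5, 5, 15, 64

# Synthetic stand-in for pretrained features: one Gaussian cluster per class.
centers = rng.standard_normal((n_way, dim))
support = centers[:, None, :] + 0.5 * rng.standard_normal((n_way, k_shot, dim))
query   = centers[:, None, :] + 0.5 * rng.standard_normal((n_way, n_query, dim))

# Nearest-centroid classification: each class is summarized by the mean
# ("prototype") of its k support examples; queries go to the closest prototype.
prototypes = support.mean(axis=1)                     # (n_way, dim)
q = query.reshape(-1, dim)                            # (n_way * n_query, dim)
dists = ((q[:, None, :] - prototypes[None]) ** 2).sum(-1)
pred = dists.argmin(axis=1)
truth = np.repeat(np.arange(n_way), n_query)
print(f"episode accuracy: {(pred == truth).mean():.2%}")
```

A benchmark score is then the mean accuracy over many such randomly sampled episodes; the cross-domain setting simply draws the episode classes from a different domain (e.g., radiology) than the one the features were trained on.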
Higher education remains a reliable countermeasure against misinformation. With the proliferation of Deepfakes, fake news, fake data sets, and even fake academic journals, a foundational skill set geared toward combating misinformation is increasingly critical for basic participation in academia and society. While well-researched, thoroughly cited answers to our questions are always out there, so is deliciously convincing conspiracy-theory clickbait. If we all fall for it, how do we prevent the next generation from suffering the same fate? Now more than ever, college students and faculty need opportunities to develop digital literacy skills while also learning how to decipher and interrogate the maelstrom of information we all encounter daily. In this presentation, I will introduce an innovative misinformation-combating initiative founded at UC Berkeley. The award-winning, semester-long, discovery-oriented research (and professional) skills training program focuses on the review, discussion, translation, and dissemination of privacy, surveillance, and cybersecurity knowledge to the public.
Recent years have witnessed an unexpected and astonishing rise of AI-synthesized fake media (aka DeepFakes), thanks to the rapid advancement of technology and the omnipresence of social media. Together with other forms of online disinformation, DeepFakes are eroding our trust in online information and have already caused real damage (think of the recent fake video of President Zelensky during the war in Ukraine). It is thus important to develop countermeasures to limit the negative impacts of DeepFakes. In this presentation, I will first give an overview of the types and making of DeepFake media. I will then highlight recent technical developments in fighting DeepFakes with AI, covering detection technologies and the current state of the art. I will introduce DeepFake-o-meter, a DeepFake detection platform developed in collaboration with Reality Defender Inc. to help users authenticate suspected DeepFakes. In addition, I will briefly introduce more recent preemptive techniques that protect users from becoming victims of DeepFake attacks. I will also discuss the current drawbacks and limitations of these counter-technologies, and the future of DeepFakes and forensics. This talk aims to help law enforcement officers and media forensics practitioners understand the capabilities of current DeepFake counter-technologies so they can better employ them in practice.
Public opinion polling’s reputation took a sizable hit in the wake of the 2016 elections, as statewide polls in key battleground states ended up understating popular support for Donald Trump. Forecasting models that relied heavily on national polls similarly missed the mark. In response to these apparent shortcomings, the American Association for Public Opinion Research (AAPOR) conducted a thorough post-mortem review of what went wrong with the polls in 2016. In this talk, I provide an overview of the AAPOR report and situate its findings within the context of a similar effort written in response to another (in)famous polling disaster—the 1936 Literary Digest Poll. I also take this opportunity to provide a preliminary analysis of nonresponse bias attributable to differential levels of distrust in the media—a potential contributor to polling errors in 2016 that went largely overlooked by the AAPOR report. Finally, I briefly touch upon issues related to poll aggregation of the type popularized by sites like FiveThirtyEight and (in an earlier period) Pollster.com, incorporating this discussion into a broader conversation of the limits of big data approaches to measuring public opinion.
In recent times, deep learning techniques for training artificial neural networks have been highly successful and transformative in a wide range of AI applications. Such techniques, however, are “data hungry”, requiring large amounts of data to learn the substantial number of parameters in their architectures. Because our research involves very limited data, we propose intelligent training methods to circumvent this limitation.
We will explore the notion of generating new samples of parent-infant pairs based on probabilistic deep-learning models trained on actual dyads. Such a model can be useful for anonymizing existing data and for extending our extant 60-dyad dataset. Furthermore, such a generative model can be developed more extensively within the context of causal reasoning methods, to provide more explainable predictions than can be obtained from previous approaches. This work benefits from the interplay among the interdisciplinary researchers involved, including investigators from developmental + quantitative psychology, behavioral science + nonverbal communication, and affective computing + machine learning.
Large neural networks excel in many domains, but they are expensive to train and fine-tune. A popular approach to reduce their compute or memory requirements is to replace dense weight matrices with structured ones (e.g., sparse, low-rank, Fourier transform). These methods have not seen widespread adoption (1) in end-to-end training due to unfavorable efficiency–quality tradeoffs, and (2) in dense-to-sparse fine-tuning due to lack of tractable algorithms to approximate a given dense weight matrix. To address these issues, we propose a class of matrices (Monarch) that is hardware-efficient (they are parameterized as products of two block-diagonal matrices for better hardware utilization) and expressive (they can represent many commonly used transforms). Surprisingly, the problem of approximating a dense weight matrix with a Monarch matrix, though nonconvex, has an analytical optimal solution. These properties of Monarch matrices unlock new ways to train and fine-tune sparse and dense models.
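As a rough illustration of the structure (a minimal NumPy sketch under simplifying assumptions, not the paper's implementation), a Monarch-style matrix-vector product can be computed as a block-diagonal multiply, a fixed "transpose" permutation, a second block-diagonal multiply, and the inverse permutation:

```python
import numpy as np

rng = np.random.default_rng(0)
b = 4                                # number of blocks; simplest square case n = b * b
n = b * b
L = rng.standard_normal((b, b, b))   # b diagonal blocks of L, each b x b
R = rng.standard_normal((b, b, b))   # b diagonal blocks of R, each b x b

def monarch_matvec(x):
    """Compute M x for M = P^T L P R, where L and R are block-diagonal and
    P is the permutation induced by transposing the b x b reshape of x."""
    y = np.einsum('kij,kj->ki', R, x.reshape(b, b))  # block-diagonal R
    y = np.einsum('kij,kj->ki', L, y.T)              # permute, then block-diagonal L
    return y.T.reshape(n)                            # permute back

x = rng.standard_normal(n)
print(monarch_matvec(x))
# Cost: 2 * b * b^2 = O(n * sqrt(n)) multiply-adds and parameters,
# versus O(n^2) for an unstructured dense weight matrix.
```

The batched block multiplies map directly onto efficient batched-GEMM kernels, which is the hardware-utilization point the abstract alludes to.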
It is critical to assess the risks of AI used in high-stakes enterprise workflows along several dimensions beyond accuracy. In this hands-on session, you will learn to build trust in AI along two important dimensions – fairness and explainability – using two open-source libraries, AI Fairness 360 and AI Explainability 360.
The tutorial will first introduce various types of unwanted bias and algorithmic fairness. Then, using samples of working code (Jupyter notebooks), participants will be led through the process of evaluating, as well as mitigating, bias in AI models using a variety of fairness metrics and mitigation methods. We will then explore various techniques for generating explanations from AI models and walk through multiple scenarios, with associated code samples, for explaining model predictions.
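As a taste of what the fairness notebooks cover, here is a minimal sketch using AI Fairness 360 on a tiny synthetic dataset (the 'sex' column and its use as the protected attribute are illustrative choices, not from the tutorial materials): it computes two group-fairness metrics, applies the Reweighing pre-processing mitigation, and re-measures.

```python
import pandas as pd
from aif360.datasets import BinaryLabelDataset
from aif360.metrics import BinaryLabelDatasetMetric
from aif360.algorithms.preprocessing import Reweighing

# Tiny synthetic dataset: 'sex' is the protected attribute (1 = privileged),
# 'label' is the favorable outcome (1 = favorable).
df = pd.DataFrame({
    'sex':   [1, 1, 1, 1, 0, 0, 0, 0],
    'score': [0.9, 0.8, 0.4, 0.7, 0.6, 0.3, 0.2, 0.5],
    'label': [1, 1, 0, 1, 1, 0, 0, 0],
})
data = BinaryLabelDataset(df=df, label_names=['label'],
                          protected_attribute_names=['sex'])

priv, unpriv = [{'sex': 1}], [{'sex': 0}]
metric = BinaryLabelDatasetMetric(data, unprivileged_groups=unpriv,
                                  privileged_groups=priv)
print('statistical parity difference:', metric.statistical_parity_difference())
print('disparate impact:', metric.disparate_impact())

# Mitigate dataset bias by reweighing training examples, then re-measure.
rw = Reweighing(unprivileged_groups=unpriv, privileged_groups=priv)
data_rw = rw.fit_transform(data)
metric_rw = BinaryLabelDatasetMetric(data_rw, unprivileged_groups=unpriv,
                                     privileged_groups=priv)
print('after reweighing:', metric_rw.statistical_parity_difference())
```

On this toy data the favorable-outcome rate is 0.75 for the privileged group versus 0.25 for the unprivileged one (statistical parity difference -0.5, disparate impact 0.33); reweighing adjusts instance weights so the weighted rates equalize.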
Cell migration is critical for many vital processes, such as wound healing, as well as for harmful processes, like cancer metastasis. Recent experiments on cells migrating in 3D fibrous matrices reminiscent of biological tissue show that some cells experience large shape changes and are thought to utilize cytoplasmic streaming and intracellular pressure to generate leading-edge protrusions. It has been hypothesized that, in contrast to cell migration on flat surfaces, the adhesion of the cell to its external environment does not play an important role in this type of motility. It is not well understood how cells generate forces to facilitate migration using this amoeboid mode. In this talk, dynamic computational models of single-cell migration in a fibrous extracellular matrix and in a confined channel are presented. Results show the non-trivial relationship between cell rheology and its external environment during migration.
Genomic sequencing has the power to trace pathogen movement through time and space by monitoring changes in the viral genome. The SARS-CoV-2 pandemic is the first time this approach has been deployed on such a vast, global scale. As a result, we have been able to track not only movement of this virus globally, but also its evolution in real time. Mutagenesis of this virus through millions of infections has resulted in variants of concern that have increased infectivity and/or pathogenicity of SARS-CoV-2. Early in the SARS-CoV-2 pandemic, we established a whole genome sequencing pipeline to assess lineages circulating in Western New York (WNY), in collaboration with the Erie County Department of Health (ECDOH). This effort evolved into a collaborative network of regional partners performing SARS-CoV-2 PCR testing, including ECDOH, Kaleida Health, Catholic Health, Erie County Medical Center, Cattaraugus County Department of Health and KSL Diagnostics. We are also now part of a New York State SARS-CoV-2 sequencing consortium. Initial sequences revealed entry into the region via Europe, similar to observations in New York City (NYC). However, as the pandemic progressed and variants of concern emerged, we observed distinct patterns in lineages relative to NYC and other parts of the state. For example, B.1.427 (epsilon variant of concern) became dominant in WNY, before it was displaced by B.1.1.7 (alpha variant of concern), while epsilon was never dominant in NYC. Our analysis of alpha, delta and omicron lineages in WNY indicated unique lineage patterns, multiple introductions into WNY and community spread. Through early identification of variants of concern, we were able to inform public policy through collaboration with ECDOH. Our work highlights the importance of widespread, regional surveillance of SARS-CoV-2 across the United States. More recently, we have implemented wastewater sequencing and have been able to monitor transitions from delta to omicron and BA.1 to BA.2 variants in Erie County. This regional network has expanded our ability to monitor viral pathogens beyond SARS-CoV-2 and potentially identify novel pathogens.
Storytelling is one of the oldest forms of human communication and a fundamental mechanism for information sharing, sense-making, and social transformation. Entertainment-education is a globally recognized strategy that leverages the power of storytelling purposefully incorporated in entertainment programming for social and behavior change. In this presentation, I will use three examples – one from the United States, one from India, and one from China – to illustrate the potential and challenges of using emerging technologies such as streaming platforms, social media, and AI chatbots to tell compelling stories, create transformative experiences, and facilitate positive change.
Medical scanners are widely used in screening and diagnosis in order to develop a personalized treatment plan for patients. The current patient examination workflow with medical scanners comprises several critical pre-scan events that involve manual operations and physical interactions between patients and medical professionals, such as directing patients to the examination room, assisting them to pose according to scanning protocols, and positioning them for the scan. This talk will cover several computer vision algorithms and systems we developed to make medical scanners more intelligent and efficient by automating the scanning procedure, as well as real-world examples of their applications.