CDSE Days 2021

March 30 - April 1, 2021 | Virtual Event via Zoom

Abstracts

Azure Machine Learning: Leveraging the Cloud to Accelerate Machine Learning Workflows

Sharat Chikkerur

We will introduce Azure Machine Learning, a platform for developers and data scientists to build machine learning services in the cloud and on the edge. Using hands-on examples, we will demonstrate tools such as (a) Azure AutoML for fully automated workflows, (b) AzureML designer for graphical, drag-and-drop workflows, and (c) AzureML notebooks for more custom modeling workflows. In each of these cases, we will demonstrate end-to-end development that includes data ingestion, model training, evaluation, and deployment as API services. Along the way, we will also show how AzureML enables reproducible data science and supports MLOps for automated training and deployment of models.

Quantum Computing in Practice

Grant Salton

Quantum computers are machines that process information encoded in objects whose behavior is governed by quantum physics. This nascent technology has the potential to dramatically speed up today’s most challenging computational tasks. Recent advances in quantum hardware and software now make this technology accessible beyond a handful of scientific laboratories. This talk will provide a survey of some recent technological and scientific developments for quantum computers, an overview of the current state of the art, and practical considerations for breaking into this exciting interdisciplinary field.

What to do with a Near-Term Quantum Computer

Alán Aspuru-Guzik

In 2019, Google researchers carried out an experiment in which they achieved "quantum supremacy": the point at which a quantum computer performs *any* task whose outcome a classical computer would take an impractical amount of time to simulate. This is a huge milestone for the field, but several milestones lie ahead. In this talk, I will describe a potential algorithmic road to "quantum advantage", i.e., the moment when a quantum computer carries out a *practical* task on a system that is unreachable by current classical computers. Reaching quantum advantage will presumably take many more years of hardware and algorithmic development. The current era of quantum computing, dubbed the noisy intermediate-scale quantum (NISQ) era by John Preskill, presents many challenges and opportunities. I will discuss this algorithmic road in the context of algorithms for the simulation of molecules and materials, and will briefly mention quantum machine learning applications.

Introduction to Agent Based Models

Andrew Crooks and Sara Metcalf

This session will introduce the method of agent-based modeling, give a tutorial, and discuss a range of applications. Agent-based models facilitate dynamic simulation of multi-scalar feedback mechanisms and interactions between heterogeneous individual agents and their environments. Agents may represent people, animals, organizations, or other kinds of discrete decision-making entities. Participants who wish to practice developing the agent-based models demonstrated in this session should install the free NetLogo and AnyLogic PLE (Personal Learning Edition) software, which can be downloaded from https://ccl.northwestern.edu/netlogo/download.shtml and https://www.anylogic.com/downloads/personal-learning-edition-download/.
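As a rough illustration of the idea (not one of the NetLogo or AnyLogic models from the session), the following Python sketch places agents on a ring, gives each a different adoption threshold, and lets each agent react to its two neighbors. The `Agent` class, the threshold values, and the ring layout are all assumptions made for this example.

```python
import random
from dataclasses import dataclass


@dataclass
class Agent:
    # Heterogeneity: each agent needs a different share of active neighbors.
    threshold: float
    active: bool = False


def step(agents):
    """Advance the model one tick; return how many agents became active."""
    n = len(agents)
    # Snapshot the state so every agent reacts to the same previous tick.
    snapshot = [a.active for a in agents]
    newly_active = 0
    for i, agent in enumerate(agents):
        if agent.active:
            continue
        neighbors = [snapshot[(i - 1) % n], snapshot[(i + 1) % n]]
        share = sum(neighbors) / len(neighbors)
        if share >= agent.threshold:
            agent.active = True
            newly_active += 1
    return newly_active


def run(n_agents=50, n_seeds=3, max_ticks=100, seed=0):
    """Run until nothing changes; return the active count after each tick."""
    rng = random.Random(seed)
    # Hypothetical thresholds: 0.5 = one neighbor suffices, 1.0 = both needed.
    agents = [Agent(threshold=rng.choice([0.5, 1.0])) for _ in range(n_agents)]
    for i in rng.sample(range(n_agents), n_seeds):
        agents[i].active = True
    history = [sum(a.active for a in agents)]
    for _ in range(max_ticks):
        if step(agents) == 0:
            break
        history.append(sum(a.active for a in agents))
    return history


if __name__ == "__main__":
    print(run())
```

The feedback loop is the point of the sketch: each newly active agent changes the environment its neighbors see on the next tick, so the aggregate curve emerges from local rules rather than being specified directly.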

Building Large Scale Listing Recommender Systems

Sriganesh (Sri-G) Madhvanath

Think about the last movie you watched or the last book you read. Chances are, you did not find them by actively searching for them online using keywords. More likely they popped up as some form of recommendation. Recommendations, when done right, are fundamentally a way to help users deal with information overload. In this talk, I will introduce the role of recommendations in e-commerce marketplaces, and our approach to generating and serving recommendations at eBay, with its highly unstructured inventory of over 1.5B listings. Time permitting, I will touch upon new approaches based on deep learning, and the production engineering architecture required to serve 4B impressions daily across the web and native applications.

Opioid Epidemic and Public Health Issues Associated with It

Donald Burke

For almost four decades, the curve of drug overdose deaths in the USA has tracked along a remarkably predictable exponential growth trajectory. To understand the mechanisms driving this sustained pattern of growth, and so be able to implement more effective epidemic interventions, we conducted analytical and simulation modeling of the epidemic. Analysis by year of birth and age at death disclosed substantial epidemic structure, with a sharp emergence of risk in the cohorts born immediately after World War II, an inexorable youth-ward shift in the age at death for all subsequent birth cohorts, and another intensification in younger generations. Patterns have also varied systematically according to demographic factors (sex, race, urbanicity) and geography, and with the introduction of new drugs. Novel methods of data visualization will be used to show and explain these epidemiological patterns, and progress toward the development of computational simulations of the epidemic will be discussed.
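To make "exponential growth trajectory" concrete, the sketch below fits a curve of the form n(t) = n0 * exp(r * t) by linear regression on the logarithm of the counts and derives the doubling time ln(2) / r. The series is synthetic and the growth rate of 0.07 per year is an arbitrary illustrative value; neither comes from the talk or from real overdose data.

```python
import math


def fit_exponential(years, counts):
    """Least-squares fit of log(count) = log(n0) + r * (year - years[0])."""
    t = [y - years[0] for y in years]
    logs = [math.log(c) for c in counts]
    n = len(t)
    t_mean = sum(t) / n
    log_mean = sum(logs) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (li - log_mean) for ti, li in zip(t, logs))
    r = sxy / sxx                           # growth rate per year
    n0 = math.exp(log_mean - r * t_mean)    # fitted count at years[0]
    return n0, r


def doubling_time(r):
    """Years needed for an exponentially growing count to double."""
    return math.log(2) / r


# Synthetic series for illustration only (NOT real mortality data).
years = list(range(2000, 2010))
counts = [1000 * math.exp(0.07 * (y - 2000)) for y in years]

n0, r = fit_exponential(years, counts)

if __name__ == "__main__":
    print(f"rate per year: {r:.3f}, doubling time: {doubling_time(r):.1f} years")
```

On a log scale, a series like this plots as a straight line, which is why sustained exponential growth is often described as "predictable".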

Intel Distribution for Python and AI Toolkit

Rachel Oberman

During this session, Intel will provide instruction on how to achieve faster Python, right out of the box, with minimal or no changes to your code. The course will cover how to:

  • Accelerate NumPy, SciPy, and scikit-learn* with integrated Intel® Performance Libraries such as Intel® Math Kernel Library and Intel® Data Analytics Acceleration Library (Intel® DAAL)
  • Access the latest vectorization and multithreading instructions, Numba* and Cython, composable parallelism with Threading Building Blocks, and more

Intel will also provide instruction on the Intel® AI Analytics Toolkit, which gives data scientists, AI developers, and researchers familiar Python* tools and frameworks to accelerate end-to-end data science and analytics pipelines on Intel® architectures. The components are built using oneAPI libraries for low-level compute optimizations, which maximizes performance from preprocessing through machine learning. The toolkit portion will cover how to:

  • Deliver high-performance deep learning (DL)
  • Achieve drop-in acceleration for data analytics and machine learning workflows with compute-intensive Python* packages: Modin*, NumPy, Numba, scikit-learn*, and XGBoost* optimized for Intel
  • Gain direct access to Intel analytics and AI optimizations to ensure that your software works together seamlessly

No preparation is required for this session.

Data in Sports

Helen Drew, Gerry Meehan and Jason Nightingale

This roundtable on sports and data is moderated by Professor Nellie Drew and includes Sabres Director of Analytics Jason Nightingale and former Sabres player, GM, and EVP Gerry Meehan. The roundtable will discuss a wide range of issues concerning the use of data in professional and amateur sports and the business of sport. The audience will be able to ask questions of the panelists and hear about new developments at the nexus of sports, data, the law, and society.

Event Start Date: March 30, 2021