First International Workshop on

Data Driven System Management for Large Clusters

Held in conjunction with ICS ’13, June 10, 2013, in Eugene, Oregon

    The purpose of this workshop is to establish the state of the art in, and to advance, the management of large clusters that are built on open source software stacks and used for scientific and engineering computation.  In particular, the workshop will focus on the requirements for an information infrastructure that effectively supports all of the stakeholders of these systems (users, systems administrators and system management).  Such an infrastructure should include the measurement, analysis and reporting processes and tools that provide the information each class of stakeholder requires.

    In the past, the information infrastructure for effective management of large clusters has been approached on a piecemeal basis:

(i) the tools for systems administration are largely ad hoc,

(ii) systematic feedback to users is largely lacking, and

(iii) systematic support for system-level evaluation and planning is lacking.

The result has been a loss of return on the investment of time, effort and money by all classes of stakeholders. The heavy reliance of science and engineering on large-scale computers makes widespread availability of such a comprehensive information infrastructure critically important.  The rapid evolution of the technologies in these systems drives a need for continuous evolution of the supporting information infrastructure to enable valid assessment of needs and resources.


Workshop Format & Schedule

The workshop will be one full day on June 10, 2013.  An approximate format and schedule for the workshop is:

  • Keynote Lecture – ¾ hour
  • Panel on Stakeholder Requirements and Perspectives – 1 hour
  • Invited Lecture or Lectures – 1 hour
  • Invited Vendor Presentation – ¾ hour
  • Contributed Papers – ~3 hours
  • Summary and Wrap-Up – ½ hour

The final schedule and allocation of time to the items on the schedule may vary.  A final schedule and program will be available on or about June 1, 2013.


Call for Contributions

The topics appropriate for this workshop include but are not limited to:

1. Processes and tools for measurement, analysis and reporting on resource use to users, systems administrators and system management.

2. Processes and tools for systems administration including diagnosis and management of job and system faults and failures.

3. Processes and tools for integration and unified management of data from multiple sources to support comprehensive system management.

4. Case studies of measurement-data-driven resource use at both job and system levels.

5. Case studies on job and system level fault/failure diagnosis.

6. Case studies on system evaluation and planning.

7. Processes and tools for promoting interaction and communication among users, systems administration and system management.

8. Modeling and analysis using detailed system, job and user level data and information.

Contributions should be prepared as extended (three page) abstracts using the ACM standard paper format specified at  and submitted via the form below. Each contribution will be reviewed by the entire program committee. The extended abstracts and complete papers of up to eight pages are due on the dates given below:

Deadline for Papers: Midnight US CDT, April 28, 2013

Notification of Acceptances: May 12, 2013

Deadline for Completed Papers: June 1, 2013

All accepted contributions including the keynote and the invited talks as well as the contributed papers will be available electronically.  A proposal for publication of the contributions by ACM SIGHPC will be prepared.


Program Committee Members:

Jim Browne (Co-Chair)
University of Texas at Austin

Abani Patra (Co-Chair)
State University of NY at Buffalo

Bill Barth
University of Texas at Austin

Matt Jones
State University of NY at Buffalo

Tom Furlani
State University of NY at Buffalo

Haihang You
NICS, University of Tennessee, Knoxville

Greg Bronevetsky
Lawrence Livermore National Laboratory

Kathryn Mohror
Lawrence Livermore National Laboratory

Ajay Mahimkar
AT&T Labs

John Stearley
Sandia National Laboratories

Rob Pennington
NCSA, University of Illinois at Urbana-Champaign