Published December 1, 2014
Amazing. Some of UB’s core network systems have been online for years at a time without failing or being shut down. This is no small feat, and it doesn’t happen by accident.
UBIT Senior Communication Systems Engineer Joe Pautler sat down with UBIT News to talk about all the hard work that goes into ensuring that UB’s networks are as reliable as possible.
“Ten years ago, the system had many more of what we’d call ‘single points of failure,’” Joe said. The term “single point of failure” has become something of a buzzword among the UBIT team; it refers to any physical point in the campus network that would cause a service outage if it were to fail. Minimizing these single points of failure and making the system as a whole more reliable requires a tremendous amount of time, money and effort.
According to Joe, having a reliable network in 2014 isn’t just a point of pride: it’s a necessity.
“Every department has services they consider ‘mission-critical,’” Joe said. “If you think about all the departments and services we have at UB, you realize the scope of what we do.” From a fully operational dental clinic on South Campus, to heating and alarm systems in each building, to the video uplink that broadcasts Bulls games on ESPN, to the campus police emergency hotline and the “blue light phones” around campus: they all depend on UB’s networks being constantly up and running.
“We’ve carefully designed redundancy and reliability into the systems to avoid outages,” said J. Brice Bible, UB Vice President and Chief Information Officer. “I’m proud of our team for their hard work to keep systems running, especially when we aren’t able to refresh our equipment as often as we should.”
Gone are the days when networks could be taken offline for diagnostics and upgrades, even for an hour or two, even in the middle of the night. For this reason, UBIT strives to keep all network devices and links in redundant pairs, and has taken further steps to ensure that the networks stay online. For example, UBIT keeps spare equipment for every device it supports because, as Joe told us, “even one business day delivery is now too slow to make critical repairs.”
UBIT also makes the most of these spare devices by using them to maintain a “test network” where the team tries out delicate changes to the system and troubleshoots problems. As for updating the actual system, the one used daily by 30,000 students, faculty and staff members, there is a detailed checks-and-balances process that includes documentation, supervisor approval and mandated “backout plans” to roll back changes when problems arise.
All this is an effort to support a system that once was considered a convenience, or even a luxury, but has become crucial in the age of “the Internet of everything.” Keeping it all going, up and running in some cases for over half a decade with no downtime, is no easy task. For Joe Pautler and the team at UBIT, it’s just another day on the job.