A word from Inder Monga: The Road to ESnet6 (Part 1)

Inder Monga, Executive Director of ESnet.

Dear Friends, Well-wishers, Colleagues, and all of ESnet,

In October of this year we will launch ESnet6, a next-generation network featuring an entirely new, software-driven network design that enhances the ability to rapidly invent, test, and deploy new innovations to meet the data needs of the Office of Science/DOE.

We put forth the vision for ESnet6 in 2016. Since then, this $151M project (total project cost, in DOE 413.3 parlance, including contingency) has overcome pandemic-induced issues like site lockdowns, differing vaccination and inter-state travel policies, and variable supply chain delays, and is now in its final stages of implementation. As I prepare for this historic unveiling, I can’t help but look back at what the team accomplished last year.

This is the first post in a series of blog posts about the people, partnerships, and innovations that have paved the road to ESnet6.

2021 was a year for growth within ESnet. We have 100+ people in the organization now—a 30% increase from last year—and it has been great to have new employees on-boarded, integrated, and productive in this challenging environment. 

A diagram showing the dimensions of growth within ESnet: Foundations, Innovation, Co-design, and Culture. Foundations, Innovation, and Co-design all point outward in separate directions, while Culture lies alongside all three Axes, growing in tandem with them.
The dimensions of growth for ESnet

Looking towards the future, we think of ESnet growing around four dimensions. The three spatial axes are: 

  • Foundations: Next Generation Network and Services 
  • Innovation: Testbeds and Advanced Research and Development
  • Co-design: Partnerships with Science for new data and network solutions

The fourth axis, Culture, is pervasive across all three dimensions. 

The main reason for choosing this very technical representation is to illustrate that these are not independent thrusts: success in each of these dimensions depends on the capabilities of the others.

In this post, I’d like to focus on that first axis: Foundations. In the next few posts, I will focus on the Innovation and Co-Design dimensions and share more thoughts about our focus for 2022 and beyond.

Major capacity improvements

In 2021, we installed a brand new routing infrastructure on our network backbone, while decommissioning a portion of the previous generation packet processors in parallel. We seamlessly transitioned all ESnet customers and peers onto the forty new backbone routers before the holidays, and the remaining router upgrades at our customer sites are in progress and scheduled through 2022.

The greenfield optical infrastructure (installed at 300 locations in 2020, another noteworthy accomplishment) is getting a wonderful upgrade: 400G wavelengths are being standardized across our national backbone as we complete the second phase of optical upgrades.

In addition to our team’s intricate efforts to decommission the existing network, we added another 100G on the ring in Europe (thanks to our collaboration with GEANT). This ensured that the first Large Hadron Collider Data Challenge had enough bandwidth to accommodate both ESnet scientific data and LHC data challenge (test) streams. We also established a new point of presence in Dallas to support new peerings and the FABRIC project.

ESnet network map showing LHC data challenge traffic sending nearly 100Gbps from Amsterdam to Boston

Creating a smarter network

The vision laid out in 2016 focused not only on capacity, but also on improving the essential framework of how we operate with the network. 

We made a significant investment in building out a high-availability site within 10ms of our main data center, in addition to our disaster-recovery site on the east coast, so that any planned or unplanned power outages will be handled without a scramble. While supply chain issues prevented the site from being ready for operations, we are making steady progress and look forward to completing it this year.

The software orchestration team made tremendous progress in laying down the vision and framework for automation. They were supported by strong internal collaboration with the engineering team. Many repetitious deployments were automated, and I know it took diligent effort to make these tools available in the right time frame, aligned with evolving constraints of the deployments. A few examples of where automation was used include:

  • Deployment of optical wavelengths on our backbone
  • Deployment of routers and base configurations, and service provisioning
  • Automatic generation of customer migration configurations, from the old network to the new equipment, using the ESnet Database (ESDB)
  • Development of a virtualized test environment to try out new tools and services before actual in-field deployment
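
As a rough illustration of the third item, generating configurations from inventory records, here is a minimal Python sketch. It is not ESnet's tooling: the record fields, the template text, and the function name `render_migration_config` are all hypothetical, and real router configuration syntax will differ.

```python
from string import Template

# Hypothetical template; real router configuration syntax will differ.
INTERFACE_TEMPLATE = Template(
    "interface $port\n"
    "  description $customer via $site\n"
    "  vlan $vlan\n"
)


def render_migration_config(records):
    """Render one config stanza per inventory record, sorted by port."""
    stanzas = []
    for record in sorted(records, key=lambda r: r["port"]):
        # substitute() raises KeyError if a record is missing a field,
        # so incomplete inventory data fails loudly instead of silently.
        stanzas.append(INTERFACE_TEMPLATE.substitute(record))
    return "\n".join(stanzas)


# Made-up records standing in for rows exported from an inventory database.
SAMPLE_RECORDS = [
    {"port": "et-0/0/2", "customer": "lab-b", "site": "site-2", "vlan": 200},
    {"port": "et-0/0/1", "customer": "lab-a", "site": "site-1", "vlan": 100},
]

if __name__ == "__main__":
    print(render_migration_config(SAMPLE_RECORDS))
```

The point of the sketch is that the database stays the single source of truth: each stanza is derived from a record, so a missing field surfaces as an error at generation time and not as a hand-typed mistake in the field.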

This year, we prepare to bring the official DOE 413.3 ESnet6 project to a close, but as you know the network never sleeps, data never stops growing, and we have to constantly evolve the network. I can proudly say that we have the core foundations of the enduring ESnet user facility ready to handle the next big challenges of Data, AI, and Integrated multi-facility research that the scientists and National Labs are actively pursuing.

Wishing you all a very Happy New Year from ESnet. 

Inder

This post is part of a series of posts reflecting on the road to ESnet6. Check back soon to see upcoming posts from Inder focusing on innovation, co-design, and his vision for ESnet6 and beyond.

Next Generation ESnet6 Routers Installed and Accepted!

ESnet6 took a major step forward last week with the completed installation and acceptance of all 40 “greenfield” routers on the network backbone. These new routers will enable ESnet to operate at speeds up to 400 Gbps across our national fiber network, and provide the backbone infrastructure behind our next generation scientific data mobility capabilities.

A new ESnet6 backbone router in its native habitat.

The installation and acceptance process at each location across the continental US required careful coordination between subcontractors, colocation facility personnel, Lab site staff, and multiple teams across ESnet. Following local health regulations and access requirements, ESnet arranged physical access for the subcontractors at each location, and all parties participated in a turn-up conference call as the routers were installed and brought online.

In addition to networking capabilities, the ESnet6 team implemented new software automation capabilities simplifying the installation and acceptance process.  These capabilities included enhancements to the ESnet inventory system to support bulk planning data import, automatic bill of materials generation, automatic site survey generation, and automated generation of all backbone links within the network.  In addition, the team introduced new workflow orchestration, automated provisioning, and inventory discovery capabilities to help with the installation process.
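
To give a flavor of what bulk planning import and bill-of-materials generation can look like, here is a small Python sketch. It assumes planning data arrives as CSV rows of site, part, and quantity; the column names, part names, and the function `bill_of_materials` are illustrative assumptions, not the ESnet inventory system's actual schema or interface.

```python
import csv
import io
from collections import Counter

# Made-up planning rows; columns and part names are illustrative only.
PLANNING_CSV = """site,part,quantity
site-1,router-chassis,1
site-1,400g-optic,4
site-2,router-chassis,1
site-2,400g-optic,2
"""


def bill_of_materials(csv_text):
    """Sum the quantity of each part across all sites in the planning data."""
    totals = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["part"]] += int(row["quantity"])
    return dict(totals)


if __name__ == "__main__":
    for part, count in sorted(bill_of_materials(PLANNING_CSV).items()):
        print(f"{part}: {count}")
```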

The acceptance of the ESnet6 greenfield routers is a major milestone for the ESnet6 Project and the team has already migrated a significant portion of customer traffic onto the new routers. Despite the extra challenges presented by the COVID-19 pandemic, the project has made steady progress and is on track to finish ahead of schedule. 

ESnet6 Achieves 2021 Annual Review Milestone – the future research and education network is one step closer!

The ESnet6 2021 Annual Status Review was a great success, and the Review Committee, led by DOE, concluded that the ESnet6 Project is being managed and executed well!

Given that the project’s budget, scope, and schedule were approved in February 2020, this was the first official Annual Status Review – and what a year it has been! The 2021 Review was a major milestone, allowing the Project to formally present the project performance over the past year and, consequently, during the COVID-19 pandemic. I continue to be amazed by the entire project team, and I felt very honored to be the one to introduce the astounding progress we made during an extremely challenging year. Not only that, it was all done while operating the current ESnet5 production network at the same time.

The project execution continued at full speed while some of us started carving out time over the past several months to prepare for the Review. Pulling together all of the information required, synthesizing it into a clear and concise set of briefings and documents, and presenting it to leaders in our field is a monumental task under any circumstances, but the pandemic made this especially difficult. However, the project team, backed by strong support across LBNL (Procurement, Project Management Office, Project Management Advisory Board members, and many others), made everything appear seamless. The impressive level of teamwork did not go unnoticed and was specifically mentioned repeatedly during the Closeout session. I am grateful for, and proud of, all of the members of the team who contributed to this terrific success.

The Review Committee consisted of three Subcommittees (Technical; Cost & Schedule; Project Management & Environment, Safety & Health), all charged with answering a set of questions to determine if we were on schedule, achieving scope, within budget, and performing all tasks safely. The answer to every charge question: Yes! It was an all-encompassing couple of days, but we really couldn’t have asked for a better result. In short, there were no formal recommendations, so we’ll be considering how best to implement several of the Review Committee’s extremely helpful comments as we proceed onward. Our hard work, on the project as well as on the Review itself, paid off!

With the formal Review complete for the year, we’re all back to our daily project plan of execution, while keeping the network “lights on” in the process, of course.

On the Path to ESnet6—Seeing the Light

ESnet6 Network

Three years ago, ESnet unveiled its plan to build ESnet6, its next-generation network dedicated to serving the Department of Energy (DOE) national lab complex and overseas collaborators. With a projected early finish in 2023, ESnet6 will feature an entirely new software-driven network design that enhances the ability to rapidly invent, test, and deploy new innovations. The design includes:

  • State-of-the-art optical, core and service edge equipment deployed on ESnet’s dedicated fiber optic cable backbone
  • A scalable switching core architecture coupled with a programmable services edge to facilitate high-speed data movement
  • 100–400Gbps optical channels, with up to eight times the potential capacity compared to ESnet5
  • Services that monitor and measure the network 24/7/365 to ensure it is operating at peak performance, and
  • Advanced cybersecurity capabilities to protect the network, assist its connected sites, and defend its devices in the event of a cyberattack

Later this month, ESnet staff will present an online update on ESnet6 to the ESnet Site Coordinators Committee (ESCC). Despite the challenges of deploying new equipment at over 300 distinct sites across the country and lighting up approximately 15,000 miles of dark fiber during a pandemic, the team is making great progress, according to ESnet6 Project Director Kate Mace.

“We’ve had some delays, but our first priority is making sure the work is being done safely,” Mace said. “We have a lot of subcontractors and we are working closely with them to make sure they’re safe, they’re following local pandemic rules and they’re getting the access they need for installs.

“The bottom line is that we have a lot of pretty amazing people putting in a lot of hours and hard work to keep the project moving forward,” Mace said.

When completed in 2023, ESnet6 will provide the DOE science community with a dedicated backbone capable of carrying at least 400 Gigabits per second (Gbps), with some spans capable of carrying more than 1 Terabit per second.

The current network, known as ESnet5, comprises a series of interconnected backbone rings, each with 100Gbps or higher bandwidth. ESnet5 operates on a fiber footprint owned by and shared with Internet2. Once the switch is complete, Internet2 will take over ESnet’s share of the fiber spectrum to provide more bandwidth to the U.S. education community.

“We’re almost done with the optical layer, which is a big deal,” Mace said. “It’s been a major procurement of new optical line equipment from Infinera to light up the new optical footprint.”

Mapping the road to ESnet6 

Back in 2011, using Recovery Act funds for its Advanced Networking Initiative, ESnet secured the long-term rights to a pair of fibers on a national fiber network that had been built, but not yet used. Because there was a surplus of installed fiber cable at the time, ESnet was able to negotiate advantageous terms for the network.

As part of the ESnet6 project, ESnet and its subcontractors began installing optical equipment along the ESnet fiber footprint starting in November 2019. The optical network consists of seven large fiber rings east to west across the U.S., and smaller “metro” rings in the Chicago and San Francisco Bay areas.

At this point, Infinera has completed the installation of the equipment at all locations. The four large eastern-most rings have passed ESnet’s rigorous testing and verification process ensuring that they are configured and working as designed, and most ESnet services in these areas have been transitioned over to the new optical system.

Infinera has turned over the other three large rings and is working closely with ESnet staff to address a number of minor issues identified during testing.

ESnet and Infinera are collaborating on turning up, testing, and rolling services to the new network in the Chicago and Bay Area rings. The installation in these areas is more complex because it is re-using the ESnet5 fiber going into the DOE Laboratories.  

“The ESnet and Infinera teams have worked really well together to overcome all of the typical challenges we expected on a network build of this scale, as well as some unexpected obstacles,” said Joe Metzger, the ESnet6 Implementation Lead. 

The challenges ranged from the expected, such as installing thousands of perfectly clean (microscopically verified) fiber connections, to the unexpected, such as engineers driving for hours to a remote, isolated location to install equipment, only to find the access road drifted in with snow or the lock changed.

Most of the unexpected challenges were related to COVID-19.

“It was amazing to see how the facility providers, including the DOE Laboratories, ESnet and Infinera teams worked together to find safe, workable solutions to the COVID-19-related access constraints that we encountered during the installation,” said Metzger.  

The team expects the optical system build to be fully accepted and all services transitioned over to it by Oct. 1, completing what they are calling ESnet5.5, the first major step in the transition from ESnet5 to ESnet6.

To get to this point, ESnet’s network engineers needed extensive, hands-on training on the new Infinera equipment. To provide it, a specialized test lab was built at Berkeley Lab, where engineers take a weeklong session learning how to configure, operate, and troubleshoot the equipment deployed in the field.

The next major step will be the installation of new routers for the packet layer, which is expected to begin in early 2021, Mace said.

And of course, this is all being carried out while ESnet keeps its production network and services in regular operation and with the undercurrent of stress from the COVID-19 pandemic. 

“We’ve got to keep the network running,” Mace said. “And we are hiring additional network engineers, software engineers and technical project managers.”

ESnet is supported by DOE’s Office of Science.

Written by Jon Bashor