Looking back at ESnet’s 2020

Advancing our strategy and shaping our position on the board.
Some thoughts from Inder on the year-that-was.

Miniature from Alfonso X’s Libro del axedrez dados et tablas (Book of chess, dice, and tables), c. 1283. Public domain, via Wikimedia Commons.

Dear Friends, Well-wishers, Colleagues, and all of ESnet,

Chess! 2020 has been far more challenging than this game. It has also been a year in which we communicated through the squares on our Zoom screens, filled with the faces of our colleagues, collaborators, and loved ones.

In January, Research and Education leaders came together in Hawaii at the Pacific Telecommunications Council meeting to discuss the future of networking across the oceans. It was impossible to imagine then that we would not see each other again for such a long time. Though, thanks to those undersea cables, we have been able to communicate seamlessly across the globe.

Looking back at 2020, we not only established a solid midgame position on our ESnet chessboard but also achieved winning positions despite the profound challenges. The ESnet team successfully moved our network operations to be fully remote (and 24/7) and accomplished several strategic priorities.

ESnet played some really interesting gambits this year: 

  1. Tackled COVID-related network growth and teleworking issues for the DOE complex
    • We saw a 4x spike in remote traffic and worked closely with several Labs to upgrade their connectivity. We continue to address the ever-growing demand in a timely manner.

    • As we all shifted to telework from home, ESnet engineers developed an impromptu guide that was valuable to troubleshoot our home connectivity issues. 
  2. Progressed greatly on implementing our next-generation network, ESnet6
    • We deployed and transitioned to the ESnet6 optical backbone network, with 300 new site installations and hundreds of 100G waves provisioned in just six months of effort, all while following pandemic safety constraints. I am grateful to our partners Infinera (Carahsoft) and Lumen for working with our engineers to make this happen. Check out below how we decommissioned the ESnet5 optical network and lit up the ESnet6 network.
    • Installed a brand-new management network and upgraded our security infrastructure, along with significant performance improvements.
    • We awarded the new ESnet6 router RFP (Congratulations Nokia and IMPRES!); the installs start soon.
    • Issued another RFP for optical transponders, and will announce the winner shortly.
  3. Took initiative on several science collaborations to address current and future networking needs
    • We brainstormed new approaches with the Rubin Observatory project team, Amlight, and DOE and NSF program managers to meet the performance and security goals for traffic originating in Chile. That traffic moves across several countries in South America before reaching the continental U.S. in Florida (Amlight), and eventually the U.S. Data Facility at SLAC via ESnet.
    • ESnet engineers engaged deeply with High Energy Physics program physicists, drawing insights on how to expediently serve the data needs of their current and planned experiments.
      Due to the pandemic, a two-day immersive in-person meeting turned into a multi-week series of Zoom meetings, breakouts, and discussions.
    • When an instrument produces tons of data, how do you build the data pipeline reliably? ESnet engineers took on this challenge, and worked closely with the GRETA team to define and develop the networking architecture and data movement design for this instrument. This contributed to a successful CD 2/3 review of the project—a challenging enough milestone during normal times, and particularly tough when done remotely. 
    • Exciting opening positions were created with EMSL, FRIB, DUNE/SURF, LCLS-II…these games are still in progress, more will be shared soon. 
  4. Innovated to build a strong technology portfolio with a series of inspired moves
    • AI/ML
      • We demonstrated Netpredict, a tool using deep learning models and real-time traffic statistics to predict when and where the network will be congested. Mariam’s web page showcases some of the other exciting investigations in progress. 
      • Richard and his collaborators published “Real-time flow classification by applying AI/ML to detailed network telemetry.”
    • High-touch ESnet6 project
      • Ever dream of having the ability to look at every packet, a “packetscope,” at your fingertips? An ability to create new ways to troubleshoot, performance-engineer, and gain application insights? We demonstrated a working prototype of that vision at the SC20 XNET workshop.
    • SENSE
      • We deployed a beta version of software that gives science applications the ability to orchestrate large data flows across administrative domains securely. What started as a small research project five years ago (Thanks ASCR!) is now part of the AutoGOLE initiative, in addition to being used by the Exascale Computing Project (ECP) application ExaFEL.
    • TCP
      • Initiated the Q-Factor project this year, a research collaboration with Amlight, funded by NSF. The project will enable ultra-high-speed data transfer optimization by TCP parameter tuning through the use of programmable dataplane telemetry: https://q-factor.io/
      • We thoroughly tested, on our testbed, the interactions between the TCP congestion control algorithms BBRv2 and CUBIC. A detailed conversation with Google, the authors of the BBRv2 implementation, is in progress.
  5. Initiated strategic new games, with a high potential for impact
    • FABRIC/FAB
      • Executed on the vision and design of a nationwide at-scale research testbed, working alongside a superstar multi-university team.
      • With the new FAB grant, FABRIC went international with plans to put nodes in Bristol, Amsterdam, Tokyo and Geneva. More locations and partners are possibilities for the future.  
    • Edge Computing
      • Created a prototype FPGA-based edge-computing platform for data-intensive science instruments in collaboration with the Computational Research Division and Xilinx. Look for exciting news on the blog as we complete the prototype deployment of this platform.
    • Quantum
    • 5G
      • What are the benefits of widespread deployment of 5G technology on science research? We contributed to the development of this important vision at a DOE workshop. New and exciting pilots are emerging that will change the game on how science is conducted. Stay tuned. 
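The BBRv2/CUBIC testbed experiments above hinge on pinning each test flow to a specific congestion control algorithm. On Linux this can be done per-socket with the TCP_CONGESTION socket option; here is a minimal sketch, purely an illustration and not ESnet’s actual test harness (whether a name such as “bbr” or “bbr2” is selectable depends on the running kernel, so the demo uses the stock default, “cubic”):

```python
import socket

def socket_with_cc(algorithm: str) -> socket.socket:
    """Create a TCP socket pinned to a specific congestion control
    algorithm via the Linux TCP_CONGESTION socket option."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, algorithm.encode())
    return s

# "cubic" is the stock Linux default; "bbr"/"bbr2" must be built into
# or loaded in the kernel before they can be selected.
s = socket_with_cc("cubic")
# Read back the algorithm actually in effect (kernel may null-pad the name).
in_effect = s.getsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, 16)
print(in_effect.rstrip(b"\x00").decode())
s.close()
```

A per-socket option like this lets two flows with different algorithms share the same host and path, which is exactly the kind of head-to-head interaction study described above.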

Growth certainly has its challenges. But as we grew, we evolved from our old game into an adept new playing style. I am thankful for the trust that all of you placed in ESnet leadership, vital for our numerous, parallel successes. Our 2020 reminds me of the scene in The Queen’s Gambit where the young Beth Harmon played all the members of a high-school chess team at the same time.

Several achievements could not make it to this blog, but are important pieces on the ESnet chess board. They required immense support from all parts of ESnet, CS Area staff, Lab procurement, Finance, HR, IT, Facilities, and Communications partners.

I am especially grateful to the DOE Office of Science, Advanced Scientific Computing Research leadership, NSF, and our program manager Ben Brown, whose unwavering support has enabled us to adapt and execute swiftly despite blockades. 

All this has only been possible due to the creativity, resolve, and resilience of ESnet staff — I am truly proud of each one of you. I am appreciative of the new hires who entrusted their careers to us and joined us remotely, without shaking hands or even setting foot in the lab.

My wish is for all to stay safe this holiday season, celebrate your successes, and enjoy that extra time with your immediate family. In 2021, I look forward to killer moves on the ESnet chessboard, while humanity checkmates the virus. 

Signing off for the year, 

Inder Monga

Three questions with Derek Howard

Three questions with a new ESnet staff member!  

Derek Howard is a software developer from Columbia, MO. Prior to joining ESnet, Derek worked as an HPC system administrator for the University of Missouri. Derek also created Augur (https://github.com/chaoss/augur) which is part of the Linux Foundation’s CHAOSS group (https://chaoss.community/), a working group focused on measuring the health and sustainability of open source software. 


Derek is part of the Network Services Automation group under John MacAuley, where he will be working primarily on our internal ESnet Database (ESDB).

Question 1: What brought you to ESnet?

I worked with George Robb at the University of Missouri; he joined ESnet a while ago, and it seemed like a great place to work. I asked him whether there were any positions at ESnet he thought might be a good fit for me, and he referred me to the position I am in now. I’m really happy I joined; it is as great as I expected!

Question 2: What is the most exciting thing going on in your field right now?

With so much work underway for ESnet6, exciting changes are happening every day. We are pushing to get features out for all of our software as fast as possible. Right now, I am working on a feature in ESDB to make it easier for network engineers to verify that hardware was installed correctly during router installs.

As far as the broader field goes, I am excited about DDR5 memory becoming commercially available soon. 

Question 3: What book would you recommend?

Randall Munroe’s “What If?” – It’s a wonderful collection of serious answers to silly questions by the creator of XKCD.

On the Path to ESnet6—Seeing the Light

ESnet6 Network

Three years ago, ESnet unveiled its plan to build ESnet6, its next-generation network dedicated to serving the Department of Energy (DOE) national lab complex and overseas collaborators. With a projected early finish in 2023, ESnet6 will feature an entirely new software-driven network design that enhances the ability to rapidly invent, test, and deploy new innovations. The design includes:

  • State-of-the-art optical, core and service edge equipment deployed on ESnet’s dedicated fiber optic cable backbone
  • A scalable switching core architecture coupled with a programmable services edge to facilitate high-speed data movement
  • 100–400 Gbps optical channels, with up to eight times the potential capacity compared to ESnet5
  • Services that monitor and measure the network 24/7/365 to ensure it is operating at peak performance, and
  • Advanced cybersecurity capabilities to protect the network, assist its connected sites, and defend its devices in the event of a cyberattack

Later this month, ESnet staff will present an online update on ESnet6 to the ESnet Site Coordinators Committee (ESCC). Despite the challenges of deploying new equipment at over 300 distinct sites across the country and lighting up approximately 15,000 miles of dark fiber during a pandemic, the team is making great progress, according to ESnet6 Project Director Kate Mace.

“We’ve had some delays, but our first priority is making sure the work is being done safely,” Mace said. “We have a lot of subcontractors and we are working closely with them to make sure they’re safe, they’re following local pandemic rules and they’re getting the access they need for installs.

“The bottom line is that we have a lot of pretty amazing people putting in a lot of hours and hard work to keep the project moving forward,” Mace said.

When completed in 2023, ESnet6 will provide the DOE science community with a dedicated backbone capable of carrying at least 400 Gigabits per second (Gbps), with some spans capable of carrying more than 1 Terabit per second.

The current network, known as ESnet5, comprises a series of interconnected backbone rings, each with 100Gbps or higher bandwidth. ESnet5 operates on a fiber footprint owned by and shared with Internet2. Once the switch is complete, Internet2 will take over ESnet’s share of the fiber spectrum to provide more bandwidth to the U.S. education community.

“We’re almost done with the optical layer, which is a big deal,” Mace said. “It’s been a major procurement of new optical line equipment from Infinera to light up the new optical footprint.”

Mapping the road to ESnet6 

Back in 2011, using Recovery Act funds for its Advanced Networking Initiative, ESnet secured the long-term rights to a pair of fibers on a national fiber network that had been built, but not yet used. Because there was a surplus of installed fiber cable at the time, ESnet was able to negotiate advantageous terms for the network.

As part of the ESnet6 project, ESnet and its subcontractors began installing optical equipment along the ESnet fiber footprint starting in November 2019. The optical network consists of seven large fiber rings east to west across the U.S., and smaller “metro” rings in the Chicago and San Francisco Bay areas.

At this point, Infinera has completed the installation of the equipment at all locations. The four large easternmost rings have passed ESnet’s rigorous testing and verification process, ensuring that they are configured and working as designed, and most ESnet services in these areas have been transitioned to the new optical system.

Infinera has turned over the other three large rings and is working closely with ESnet staff to address a number of minor issues identified during testing.

ESnet and Infinera are collaborating on turning up, testing, and rolling services to the new network in the Chicago and Bay Area rings. The installation in these areas is more complex because it reuses the ESnet5 fiber going into the DOE Laboratories.

“The ESnet and Infinera teams have worked really well together to overcome all of the typical challenges we expected on a network build of this scale, as well as some unexpected obstacles,” said Joe Metzger, the ESnet6 Implementation Lead. 

The expected challenges included installing thousands of perfectly clean (microscopically verified) fiber connections. The unexpected ones included engineers driving for hours to reach a remote, isolated location to install equipment, only to find the access road drifted in with snow or the lock changed.

Most of the unexpected challenges were related to COVID-19.

“It was amazing to see how the facility providers, including the DOE Laboratories, ESnet and Infinera teams worked together to find safe, workable solutions to the COVID-19-related access constraints that we encountered during the installation,” said Metzger.  

The team expects the optical system build to be fully accepted and all services transitioned over to it by Oct. 1, completing what they are calling ESnet5.5, the first major step in the transition from ESnet5 to ESnet6.

To get to this point, ESnet’s network engineers needed extensive hands-on training on the new Infinera equipment, so a specialized test lab was built at Berkeley Lab. There, engineers take a weeklong session learning how to configure, operate, and troubleshoot the equipment deployed in the field.

The next major step will be the installation of new routers for the packet layer, which is expected to begin in early 2021, Mace said.

And of course, this is all being carried out while ESnet keeps its production network and services in regular operation and with the undercurrent of stress from the COVID-19 pandemic. 

“We’ve got to keep the network running,” Mace said. “And we are hiring additional network engineers, software engineers, and technical project managers.”

ESnet is supported by DOE’s Office of Science.

Written by Jon Bashor

How a future-facing ESnet project reaches back to Berkeley Lab’s roots

Eric Pouyoul and Mike Witherell

ESnet’s Eric Pouyoul (left) talks to Berkeley Lab Director Mike Witherell (right) about a specialized network that he’s helping to build for the GRETA experiment, short for Gamma Ray Energy Tracking Array. (Photo: Berkeley Lab)

While ESnet staff are known for building an ever-evolving network that’s super fast and super reliable, along with specialized tools to help researchers make effective use of the bandwidth, there is also a side of the organization where things are pushed, tested, broken and rebuilt: ESnet’s testbed.

For example, in conjunction with the rollout of its nationwide 100Gbps backbone network, the staff opened up a 100Gbps testbed in 2009 with Advanced Networking Initiative funding through the American Recovery and Reinvestment Act. This allowed scientists to test their ideas on a separate but equally fast network so that if something crashed, ESnet traffic would continue to flow unimpeded across the network. Six years later, ESnet upped the ante and launched the 400Gbps network — the first science network to hit this speed — to help NERSC move its massive data archive from Oakland to Berkeley Lab.

Eric Pouyoul is the principal investigator for the testbed and the things he’s learned on past projects can be applied to others. His most recent project also pushed the boundaries of what the organization does in supporting DOE science. With funding from the lab’s Nuclear Physics Division, Pouyoul developed a pair of uniquely specialized data processing systems for the GRETA experiment, short for Gamma Ray Energy Tracking Array. The gamma ray detector will be installed at DOE’s Facility for Rare Isotope Beams (FRIB) located at Michigan State University in East Lansing.

When an early version of GRETA goes online at the end of 2023, it will house an array of 120 detectors that will produce up to 480,000 messages per second — totaling 4 gigabytes of data per second — and send them through a computing cluster for analysis. Not only did Pouyoul write the software for the first stage, which reduces the amount of data by an order of magnitude in real time; he also designed the physics simulation software that generates realistic data to test the system.

For the second data-handling phase of GRETA, called the Global Event Builder, he wrote the software that takes all of the data from the first phase and, using the timestamps, aggregates the messages in order and sorts them by event. This data will then be stored for future analysis.
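The core of an event builder like this — merging time-ordered per-detector streams into one timestamp-ordered stream and grouping nearby messages into events — can be sketched as follows. This is a minimal illustration only, with invented field names and a hypothetical coincidence window, not GRETA’s actual code:

```python
import heapq

def build_events(detector_streams, window_ns=100):
    """Merge per-detector message streams (each already time-ordered) into a
    single timestamp-ordered stream, then group messages whose timestamps fall
    within `window_ns` of the first message of the group into one event."""
    merged = heapq.merge(*detector_streams, key=lambda msg: msg["ts"])
    events, current = [], []
    for msg in merged:
        # Start a new event when the gap from the event's first message
        # exceeds the coincidence window.
        if current and msg["ts"] - current[0]["ts"] > window_ns:
            events.append(current)
            current = []
        current.append(msg)
    if current:
        events.append(current)
    return events

# Toy data: two detectors, messages tagged with nanosecond timestamps.
d1 = [{"ts": 0, "det": 1}, {"ts": 250, "det": 1}]
d2 = [{"ts": 40, "det": 2}, {"ts": 260, "det": 2}]
print([len(e) for e in build_events([d1, d2])])  # → [2, 2]
```

Because `heapq.merge` consumes the streams lazily, a design like this can keep up with a continuous firehose of detector messages without buffering everything in memory first.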

Even though he designed and built the systems to simulate the behavior of the nuclear physics that will occur inside the detector, “don’t expect me to understand it,” Pouyoul said. “I never did anything like this before.”

A rendering of GRETA, the Gamma-Ray Energy Tracking Array. (Credit: Berkeley Lab)

GRETA is the first of its kind in that it will track the positions of the scattering paths of the gamma rays using an algorithm specifically developed for the project. This capability will help scientists understand the structure of nuclei, which is not only important for understanding the synthesis of heavy elements in stellar environments, but also for applied-science topics in nuclear energy, nuclear forensics, and stockpile stewardship.

“This has been my most exciting project and it only could have happened here,” he said. “I think it takes me back to the origins of the Lab when scientists and engineers worked together to create new physics. We know it will work, but we don’t even know how the results will turn out, we don’t know what will be discovered.”

Before joining ESnet at Berkeley Lab 11 years ago, he had worked in the private sector. At one point in his career, he wrote code for control systems for nuclear power plants. Looking back, he estimates that maybe three lines of his code made it into the final library. He’s quick to point out that he doesn’t consider himself a software engineer, nor does he think of himself as a network engineer. At ESnet, those engineers are responsible for designing and deploying robust systems that keep the data moving in support of DOE’s research missions.

“I really like to work with prototypes, one-time projects like in the testbed,” he said. “I know how to build stuff.”

He developed that skill as a high school student in Paris, where he preferred to roam the sidewalks looking for discarded electronics he could take home, repair, and sell. He did manage to attend classes often enough to pass his exams and graduate; that diploma is the only one he’s ever received.

Since then, he’s learned by working on things, not sitting in lecture halls. Some of it he picked up working for a supercomputing startup company. He learned how to tune networks for maximum performance by tweaking data transfer nodes, the equipment that takes in data from experiments, observations, or computations and speeds them on their way to end-users. 

He sees the GRETA project as a pilot, and it’s already drawing interest from other researchers. The idea is that if ESnet can work with scientists from the start, it will be more efficient and effective than trying to tack on the networking components afterward. Pouyoul is looking forward to the next one.

“I’m really not specialized, but I do understand different aspects of projects,” he said. “I only have fun when I’m not in my comfort zone — and I had a lot of fun working on GRETA.”

Interested in working at ESnet? Apply to our open jobs: http://m.rfer.us/LBLt9j2yC 

Read more about ESnet’s contributions to the GRETA project: https://bit.ly/ESnetGRETA

Written by Jon Bashor