It’s rare for any technology project to be completed early and under budget, let alone a massively complex one involving extensive hardware and software upgrades across many states. Yet Energy Sciences Network’s (ESnet) ESnet6 project was finished more than two years ahead of schedule and for less than its estimated cost. In recognition of this unusual feat, the Department of Energy (DOE) recently presented ESnet with a special Project Assessment Award. (As an IT project, ESnet6 is not eligible for the DOE’s Project Management Awards.)
ESnet6 is the newest iteration of the DOE’s high-performance network, also known as the “data circulatory system” for the DOE science complex. Not only did ESnet6 boost bandwidth to more than 46 Terabits per second, a significant increase, it also automated network operations for scalability and reliability, improved security services, and replaced aging equipment. In addition, ESnet6 offers greater programmable network flexibility that will support evolving computation and data models in the emerging exabyte data era.
Six years in the making, ESnet6 was completed well under budget six months before the forecasted early finish date of January 2023 – and more than two years ahead of the forecasted CD-4 date in January 2025.
DOE Office of Project Assessment Director Kurt W. Fisher presented the award in a private ceremony at the DOE Project Management Workshop in Washington, DC, in April. ESnet Network Services Group Lead Kate Petersen Mace, ESnet6’s project director, accepted on behalf of the ESnet6 team.
“ESnet6 represents the culmination of several years of extraordinary commitment and tireless dedication by all of ESnet’s staff,” said Inder Monga, ESnet’s executive director. “We’re grateful to Berkeley Lab for its support and to DOE for recognizing the collective efforts of the team behind this critical piece of scientific infrastructure.”
In March 2020, the U.S. Government Office of Management and Budget (OMB) released a draft memo outlining a required migration to IPv6 only. Memorandum M-21-07 was made official on November 19, 2020. Among other things, this memo mandates that 80% of IP-enabled assets on Federal networks are operating in IPv6-only environments by the end of FY 2025.
ESnet is in the process of planning this transition now, to ensure that we provide our users with the support and resources they need to continue their work uninterrupted and unimpeded by the transition. Practically speaking, this means for ESnet that by 2025, all of our nodes will be transitioned to IPv6 address space, and we will not support dual-stacking with IPv4 and IPv6 addresses.
Transitioning to an IPv6-only network has been over a quarter-century in the making for ESnet. Here’s a look back at our history with IPv6.
IPv6: Past and Present
ESnet’s history of helping to develop, support, and operationalize new protocols begins well before the advent of IPv6.
In the early 1990s, Cathy Aronson, an employee of Lawrence Livermore National Laboratory working on ESnet, helped establish a production implementation and support plan for the Open Systems Interconnect (OSI) Connectionless-mode Network Service (CLNS) suite of network protocols. Crucially, Aronson developed a scalable network addressing plan that provided a model for the utilization of the kinds of massive address spaces that OSI CLNS and, later, IPv6 would come to use. CLNS itself was a logical progression from DECnet which had been embraced and supported by ESnet’s precursors (MFEnet and HEPnet).
As the IPv6 draft standard (RFC2460) developed in the 1990s, ESnet staff created an operational support model for the new protocol. The stakes were high; if IPv6 were to succeed in supplanting IPv4, and prevent the ill effects of IPv4 address exhaustion, it would need a smooth roll-out. Bob Fink, Tony Hain, and Becca Nitzan spearheaded early IPv6 adoption processes, and their efforts reached far beyond ESnet and the Department of Energy (DOE). The trio were instrumental in establishing a set of operational practices and testbeds under the auspices of the Internet Engineering Task Force–the body where IPv6 was standardized–and this led to the development of a worldwide collaboration known as the 6bone. 6bone was a set of tunnels that allowed IPv6 “islands” to be connected, forming a global overlay network. More importantly, it was a collaboration that brought together commercial and research networks, vendors, and scientists, all with the goal of creating a robust internet protocol for the future.
Not only were Fink, Hain, and Nitzan critical in this development of what would become a production IPv6 network (their names appear on a number of IETF RFCs), they would also spearhead the adoption of the protocol within ESnet and DOE. In the summer of 1996, ESnet was officially connected to the 6bone; by 1999, the Regional Internet Registries had received their production allocations of IPv6 address space. Just one month later, the first US allocation of that space was made–to ESnet. ESnet has the distinction of being the first IPv6 allocation from ARIN – assigned on August 3, 1999, with the prefix 2001:0400::/32.
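To give a sense of how much room that first allocation provides, here is a minimal sketch using Python's standard `ipaddress` module. The prefix comes from the text above; the split into /48 site prefixes and /64 subnets is only an illustrative assumption, not a description of ESnet's actual addressing plan.

```python
import ipaddress

# ESnet's first ARIN allocation, as quoted in the text above.
esnet_prefix = ipaddress.ip_network("2001:0400::/32")

# The module prints the prefix in its compressed canonical form.
print(esnet_prefix)

# A /32 leaves 96 bits of address space inside the allocation.
total_addresses = esnet_prefix.num_addresses

# Hypothetical subdivision: how many /48 site prefixes and /64 subnets fit.
sites_48 = 2 ** (48 - esnet_prefix.prefixlen)
subnets_64 = 2 ** (64 - esnet_prefix.prefixlen)

print(f"/48 site prefixes: {sites_48:,}")
print(f"/64 subnets:       {subnets_64:,}")
```

Even under this simple split, a single /32 holds as many /64 subnets as there are addresses in the entire IPv4 space.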
Nitzan continued her pioneering work, establishing native IPv6 support on ESnet, and placing what we believe was the first workstation on a production IPv6 network. This was part of becoming the first production network in North America to adopt IPv6 in tandem with IPv4 via the use of an IPv6 “dual-stack.” As US Government requirements and mandates developed in 2005, 2012, and 2014, the ESnet team met these requirements for increased IPv6 adoption, while also providing support and consultation for the DOE community.
Although Aronson, Fink, Hain, and Nitzan have all moved on from ESnet, a new generation of staff continued the spirit of innovation and early adoption. In the early 2010s, ESnet’s internal routing protocols were consolidated around the use of multi-topology Intermediate System to Intermediate System (IS-IS). This allowed for the deployment of flexible and disparate IPv4 and IPv6 topologies, paving the way for the creation of IPv6-only portions of ESnet and allowing the use of optimized routing protocols for the entire network. ESnet’s acquisition strategy has long emphasized IPv6 support and feature parity between IPv4 and IPv6.
For our customers and those connected to us, here’s what this means:
ESnet will be ready, willing, and able to support connectors, constituents, and partners in their journey to deploying IPv6-only across our international network.
ESnet planning and architecture team members have been included in the Department of Energy Integration and Product Team (DOE IPT) for migration to IPv6-only, and are supporting planning and documentation efforts for the DOE Complex.
We look forward to supporting our customers and users, as we all make this change to IPv6 together.
Advancing our strategy and shaping our position on the board. Some thoughts from Inder on the year-that-was.
Dear Friends, Well-wishers, Colleagues, and all of ESnet,
Chess! 2020 has been much more challenging than this game. It’s also been a year where we communicated through the squares on our Zoom screens, filled with the faces of our colleagues, collaborators, and loved ones.
In January, Research and Education leaders came together in Hawaii at the Pacific Telecommunications Council meeting to discuss the future of networking across the oceans. It was impossible to imagine then that we would not be able to see each other again for such a long time. Though thanks to those underwater cables, we have been able to communicate seamlessly across the globe.
Looking back at 2020, we not only established a solid midgame position on our ESnet chessboard, but succeeded in ‘winning positions’ despite the profound challenges. The ESnet team successfully moved our network operations to be fully remote (and 24/7) and accomplished several strategic priorities.
ESnet played some really interesting gambits this year:
Tackled COVID-related network growth and teleworking issues for the DOE complex
We saw a 4x spike in remote traffic and worked closely across several Labs to upgrade their connectivity. We continue to address the ever-growing demand in a timely manner.
As we all shifted to telework from home, ESnet engineers developed an impromptu guide that was valuable for troubleshooting our home connectivity issues.
Progressed greatly on implementing our next-generation network, ESnet6
We deployed and transitioned to the ESnet6 optical backbone network, with 300 new site installations and hundreds of 100G waves provisioned, in just six months of effort and while following pandemic safety constraints. I am grateful to our partners Infinera (Carahsoft) and Lumen for working with our engineers to make this happen. Check out below how we decommissioned the ESnet5 optical network and lit up the ESnet6 network.
Installed a brand new management network and security infrastructure upgrades along with significant performance improvements.
We awarded the new ESnet6 router RFP (Congratulations Nokia and IMPRES!); the installs start soon.
Issued another RFP for optical transponders, and will announce the winner shortly.
Took initiative on several science collaborations to address current and future networking needs
We brainstormed new approaches with the Rubin Observatory project team, Amlight, DOE and NSF program managers to meet the performance and security goals for traffic originating in Chile. That traffic moves across several countries in South America before reaching the continental U.S. in Florida (Amlight), and eventually the U.S. Data Facility at SLAC via ESnet.
Drew insights through deep engagement of ESnet engineers with the High Energy Physics program physicists, for serving the data needs of their current and planned experiments expediently. Due to the pandemic, a two-day immersive in-person meeting turned into a multi-week series of Zoom meetings, breakouts, and discussions.
When an instrument produces tons of data, how do you build the data pipeline reliably? ESnet engineers took on this challenge, and worked closely with the GRETA team to define and develop the networking architecture and data movement design for this instrument. This contributed to a successful CD 2/3 review of the project—a challenging enough milestone during normal times, and particularly tough when done remotely.
Exciting opening positions were created with EMSL, FRIB, DUNE/SURF, LCLS-II…these games are still in progress, more will be shared soon.
Innovated to build a strong technology portfolio with a series of inspired moves
We demonstrated Netpredict, a tool using deep learning models and real-time traffic statistics to predict when and where the network will be congested. Mariam’s web page showcases some of the other exciting investigations in progress.
Richard and his collaborators published Real-time flow classification by applying AI/ML to detailed network telemetry.
High-touch ESnet6 project
Ever dream of having the ability to look at every packet, a “packetscope”, at your fingertips? An ability to create new ways to troubleshoot, performance engineer, and gain application insights? We demonstrated a working prototype of that vision at the SC20 XNET workshop.
We deployed a beta version of software that provides science applications the ability to orchestrate large data flows across administrative domains securely. What started as a small research project five years ago (Thanks ASCR!) is now part of the AutoGOLE project initiative, in addition to being used for the Exascale Computing Project (ECP) project ExaFEL.
Initiated the Q-Factor project this year, a research collaboration with Amlight, funded by NSF. The project will enable ultra-high-speed data transfer optimization by TCP parameter tuning through the use of programmable dataplane telemetry: https://q-factor.io/
Executed on the vision and design of a nationwide @scale research testbed working alongside a superstar multi-university team.
With the new FAB grant, FABRIC went international with plans to put nodes in Bristol, Amsterdam, Tokyo and Geneva. More locations and partners are possibilities for the future.
Created a prototype FPGA-based edge-computing platform for data-intensive science instruments in collaboration with the Computational Research Division and Xilinx. Look for exciting news on the blog as we complete the prototype deployment of this platform.
What are the benefits of widespread deployment of 5G technology on science research? We contributed to the development of this important vision at a DOE workshop. New and exciting pilots are emerging that will change the game on how science is conducted. Stay tuned.
Growth certainly has its challenges. But, as we grew, we evolved from our old game into an adept new playing style. I am thankful for the trust that all of you placed in ESnet leadership, vital for our numerous, parallel successes. Our 2020 reminds me of the scene in Queen’s Gambit where the young Beth Harmon played all the members of a high-school chess team at the same time.
Several achievements could not make it to this blog, but are important pieces on the ESnet chess board. They required immense support from all parts of ESnet, CS Area staff, Lab procurement, Finance, HR, IT, Facilities, and Communications partners.
I am especially grateful to the DOE Office of Science, Advanced Scientific Computing Research leadership, NSF, and our program manager Ben Brown, whose unwavering support has enabled us to adapt and execute swiftly despite blockades.
All this has only been possible due to the creativity, resolve, and resilience of ESnet staff — I am truly proud of each one of you. I am appreciative of the new hires that trusted their careers with us and joined us remotely—without shaking hands or even stepping foot at the lab.
My wish is for all to stay safe this holiday season, celebrate your successes, and enjoy that extra time with your immediate family. In 2021, I look forward to killer moves on the ESnet chessboard, while humanity checkmates the virus.
While ESnet staff are known for building an ever-evolving network that’s super fast and super reliable, along with specialized tools to help researchers make effective use of the bandwidth, there is also a side of the organization where things are pushed, tested, broken and rebuilt: ESnet’s testbed.
For example, in conjunction with the rollout of its nationwide 100Gbps backbone network, the staff opened up a 100Gbps testbed in 2009 with Advanced Networking Initiative funding through the American Recovery and Reinvestment Act. This allowed scientists to test their ideas on a separate but equally fast network so if something crashed, ESnet traffic would continue to flow unimpeded across the network. Six years later, ESnet upped the ante and launched a 400Gbps network, the first science network to hit this speed, to help NERSC move its massive data archive from Oakland to Berkeley Lab.
Eric Pouyoul is the principal investigator for the testbed and the things he’s learned on past projects can be applied to others. His most recent project also pushed the boundaries of what the organization does in supporting DOE science. With funding from the lab’s Nuclear Physics Division, Pouyoul developed a pair of uniquely specialized data processing systems for the GRETA experiment, short for Gamma Ray Energy Tracking Array. The gamma ray detector will be installed at DOE’s Facility for Rare Isotope Beams (FRIB) located at Michigan State University in East Lansing.
When an early version of GRETA goes online at the end of 2023, it will house an array of 120 detectors that will produce up to 480,000 messages per second, totaling 4 gigabytes of data per second, and send them through a computing cluster for analysis. Not only did Pouyoul write the software for the first stage, which will reduce the amount of data by an order of magnitude in real time, he also designed the physics simulation software that generates realistic data to test the system.
For the second data handling phase of GRETA, called the Global Event Builder, he wrote the software that will take all of the data from the first phase and, using the timestamps, aggregate them in order, as well as sort them by event. This data will then be stored for future analysis.
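The idea of ordering messages by timestamp and then grouping them by event can be sketched in a few lines of Python. This is only an illustration of the general technique, not GRETA's actual software: the `Hit` fields, the `build_events` function, and the fixed coincidence `window` are all assumptions made for the example.

```python
import heapq
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass(frozen=True)
class Hit:
    """A hypothetical first-stage message; field names are illustrative."""
    timestamp: int    # clock ticks assigned upstream
    detector_id: int
    energy: float


def build_events(streams: Iterable[Iterable[Hit]], window: int) -> Iterator[List[Hit]]:
    """Merge time-sorted streams, then group hits into events.

    A hit joins the current event if its timestamp is within `window`
    ticks of the first hit in that event; otherwise a new event starts.
    """
    # heapq.merge lazily interleaves already-sorted inputs in timestamp order.
    merged = heapq.merge(*streams, key=lambda hit: hit.timestamp)
    event: List[Hit] = []
    for hit in merged:
        if event and hit.timestamp - event[0].timestamp > window:
            yield event
            event = []
        event.append(hit)
    if event:
        yield event


# Three made-up detector streams, each already sorted by timestamp.
streams = [
    [Hit(100, 1, 0.5), Hit(205, 1, 1.2)],
    [Hit(102, 2, 0.7), Hit(300, 2, 0.3)],
    [Hit(101, 3, 0.9), Hit(207, 3, 0.4)],
]
events = list(build_events(streams, window=10))
for number, event in enumerate(events):
    print(number, [(hit.timestamp, hit.detector_id) for hit in event])
```

Because the merge is lazy, a sketch like this never needs to hold all of the input in memory, which matters when the upstream stage keeps producing data continuously.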
Even though he designed and built the systems to simulate the behavior of the nuclear physics that will occur inside the detector, “don’t expect me to understand it,” Pouyoul said. “I never did anything like this before.”
GRETA is the first of its kind in that it will track the positions of the scattering paths of the gamma rays using an algorithm specifically developed for the project. This capability will help scientists understand the structure of nuclei, which is not only important for understanding the synthesis of heavy elements in stellar environments, but also for applied-science topics in nuclear energy, nuclear forensics, and stockpile stewardship.
“This has been my most exciting project and it only could have happened here,” he said. “I think it takes me back to the origins of the Lab when scientists and engineers worked together to create new physics. We know it will work, but we don’t even know how the results will turn out, we don’t know what will be discovered.”
Before joining ESnet at Berkeley Lab 11 years ago, he had worked in the private sector. At one point in his career, he wrote code for control systems for nuclear power plants. Looking back, he estimates that maybe three lines of his code made it into the final library. He’s quick to point out that he doesn’t consider himself a software engineer, nor does he think of himself as a network engineer. At ESnet, those engineers are responsible for designing and deploying robust systems that keep the data moving in support of DOE’s research missions.
“I really like to work with prototypes, one-time projects like in the testbed,” he said. “I know how to build stuff.”
He developed that skill as a high school student in Paris, where he preferred to roam the sidewalks, looking for discarded electronics he could take home, repair, and sell. He did manage to attend classes often enough to pass his exams and graduate with a degree. That was the only diploma he’s ever received.
Since then, he’s learned by working on things, not sitting in lecture halls. Some of it he picked up working for a supercomputing startup company. He learned how to tune networks for maximum performance by tweaking data transfer nodes, the equipment that takes in data from experiments, observations, or computations and speeds them on their way to end-users.
He sees the GRETA project as a pilot and it’s already drawing interest from other researchers. The idea is that if ESnet can work with scientists from the start, it will be more efficient and effective than trying to tack on the networking components afterward. Pouyoul is looking forward to the next one.
“I’m really not specialized, but I do understand different aspects of projects,” he said. “I only have fun when I’m not in my comfort zone — and I had a lot of fun working on GRETA.”
Funded through a grant from the National Science Foundation (NSF) and directly from ESnet, the program funds eight early to mid-career women in the research and education (R&E) network community to participate in the 2016 setup, build out and live operation of SCinet, the Supercomputing Conference’s (SC) ultra high performance network. SCinet supports large-scale computing demonstrations at SC, the premier international conference on high performance computing, networking, data storage and data analysis, which is attended by over 10,000 of the leading minds in these fields.
The SC16 WINS program kicked off this week as the selected participants from across the U.S. headed to Salt Lake City, the site of the 2016 conference, to begin laying the groundwork for SCinet inside the Salt Palace Convention Center. The WINS participants join over 250 volunteers that make up the SCinet engineering team and will work side by side with the team and their mentors to put the network into full production service when the conference begins on November 12. The women will return to Salt Lake City a week before the conference to complete the installation of the network.
“We are estimating that SCinet will be outfitted with a massive 3.5 Terabits per second (Tbps) of bandwidth for the conference and will be built from the ground up with leading edge network equipment and services (even pre-commercial in some instances) and will be considered the fastest network in the world during its operation,” said Corby Schmitz, SC16 SCinet Chair.
The WINS participants will support a wide range of technical areas that comprise SCinet’s incredible operation, including wide area networking, network security, wireless networking, routing, network architecture and other specialties.
“While demand for jobs in IT continues to increase, the number of women joining the IT workforce has been on the decline for many years,” said Marla Meehl, Network Director from UCAR and co-PI of the NSF grant. “WINS aims to help close this gap and help to build and diversify the IT workforce giving women professionals a truly unique opportunity to gain hands-on expertise in a variety of networking roles while also developing mentoring relationships with recognized technical leaders.”
“Not only is WINS providing hands-on engineering training to the participants but also the opportunity to present their experiences with the broader networking community throughout the year. This experience helps to expand important leadership and presentation skills and grow their professional connections with peers and executives alike,” said Wendy Huntoon, president and CEO of KINBER and co-PI of the NSF grant.
The program also represents a unique cross-agency collaboration between the NSF and DOE. Both agencies recognize that the pursuit of knowledge and science discovery that these funding organizations support depends on bringing the best ideas from people of various backgrounds to the table.
“Bringing together diverse voices and perspectives to any team in any field has been proven to lead to more creative solutions to achieve a common goal,” says Lauren Rotman, Science Engagement Group Lead, ESnet. “It is vital to our future that we bring every expert voice, every new idea to bear if our community is to tackle some of our society’s grandest challenges from understanding climate change to revolutionizing cancer treatment.”
2016 WINS Participants are:
Denise Grayson, Sandia National Labs (Network Security Team), DOE-funded
Julia Locke, Los Alamos National Lab (Fiber and Edge Network Teams), DOE-funded
Angie Asmus, Colorado State (Edge Network Team), NSF-funded
Kali McLennan, University of Oklahoma (WAN Transport Team), NSF-funded
Amber Rasche, North Dakota State University (Communications Team), NSF-funded
Jessica Shaffer, Georgia Institute of Tech (Routing Team), NSF-funded
Julia Staats, CENIC (DevOps Team), NSF-funded
Indira Kassymkhanova, Lawrence Berkeley National Lab (DevOps and Routing Teams), DOE-funded
The WINS Supporting Organizations: The University Corporation for Atmospheric Research (UCAR) http://www2.ucar.edu/
The Keystone Initiative for Network Based Education and Research (KINBER) http://www.kinber.org
Created in 1986, the U.S. Department of Energy’s (DOE’s) Energy Sciences Network (ESnet) is a high-performance network built to support unclassified science research. ESnet connects more than 40 DOE research sites—including the entire National Laboratory system, supercomputing facilities and major scientific instruments—as well as hundreds of other science networks around the world and the Internet.
Step 9: Why just rent fiber? Pick up your own dark fiber network at a bargain price for future expansion. In the meantime, boost your bandwidth to 100G for everyone. (2012)
Step 10: Here’s a cool idea, come up with a new network design so that scientists moving REALLY BIG DATASETS can safely avoid institutional firewalls, call it the Science DMZ, and get research moving faster at universities around the country. (2012)
Step 12: 100G is fast, but it’s time to get ready for 400G. To pave the way, ESnet installs a production 400G network between facilities in Berkeley and Oakland, Calif., and even provides a 400G testbed so network engineers can get up to speed on the technology. (2015)
Step 13: Celebrate 30 years as a research and education network leader, but keep looking forward to the next level. (2016)
ESnet Director Greg Bell will give an invited talk on “Cyber Security and Data Mobility in a World of Ultrafast Networks and Data-Intensive Science” to senior IT managers and security staff at the Centers for Disease Control (CDC) in Atlanta on Wednesday, June 4. The CDC has just completed the first phase of its “Research Grade Network” upgrade project, which was significantly influenced by ESnet designs and recommendations. Because the CDC has staff in 50+ countries, effective use of global networks is key to its success.
Bell was invited to give the talk by CDC Chief Technology Officer Jaspal Sagoo after they met when Bell gave a keynote talk on “Networking for Discovery” at Scaling Networks Securely and Cost Effectively, a one-day program targeted at government network providers and IT staff on Feb. 20, 2014.
We are pleased to announce three influential keynote speakers for the upcoming Focused Technical Workshop titled “Improving Data Mobility and Management for International Climate Science”, which will be hosted by the National Oceanic and Atmospheric Administration (NOAA) in Boulder, CO from July 14-16, 2014.
The first keynote will be delivered by NOAA’s Dr. Alexander “Sandy” MacDonald, Chief Science Advisor and Director of the Earth System Research Laboratory (ESRL), who is known for his influential work in weather forecasting and high performance computing at NOAA.
Also from NOAA’s Geophysical Fluid Dynamics Laboratory (GFDL) and Princeton University, Dr. V. Balaji, Head of the Modeling Systems Group, will share the importance of bridging the worlds between science and software with workshop attendees.
And finally, Eli Dart, a highly-acclaimed network engineer from the Department of Energy’s ESnet who is credited with co-developing the Science DMZ model, will wrap up the workshop with the final keynote focused on how to create cohesive strategies for data mobility across computer systems, networks and science environments.
Inspired by each of the keynote speakers’ integral roles in climate science, computing and network architectures, the workshop intends to spark lively, interactive discussions between the research and education (R&E) and climate science communities to build long-term relationships and create useful tools and resources for improved climate data transport and management.
This January, the Earth System Grid Federation (ESGF) started a new working group, the International Climate Network Working Group, to help set up and optimize network infrastructures for their climate data sites around the world. They need network connections that can deal with petabytes of modeling and observational data, which will traverse more than 13,000 miles of networks (more than half the circumference of the Earth!), spanning two oceans.
By the end of 2014, this working group aims to obtain at least 4Gbps of data transfer throughput at five of their climate data centers: PCMDI/LLNL (US), NCI/ANU (AU), CEDA/STFC (UK), DKRZ (DE), and KNMI (NL). This goal runs in parallel with the Enlighten Your Research Global international networking program award that ESGF received in November 2013. This initiative is led by Dean Williams of Lawrence Livermore National Lab and ESnet’s Science Engagement Team, along with collaborating international network organizations in Australia (AARnet), Germany (DFN), the Netherlands (SURFnet), and the UK (Janet). We are helping to shepherd ESGF’s project and working group to make sure all their climate sites get up and running at proficient network speeds for the future peta-scale climate data that is expected within the next 5 years.
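For a rough sense of what the 4Gbps goal means in practice, the back-of-the-envelope calculation below estimates how long one petabyte takes to move at that sustained rate. It is a sketch only: it assumes a decimal petabyte (10^15 bytes) and ignores protocol overhead, retransmissions, and storage limits, and the function name is made up for the example.

```python
def transfer_days(data_bytes: float, rate_bits_per_s: float) -> float:
    """Days needed to move `data_bytes` at a sustained `rate_bits_per_s`."""
    seconds = data_bytes * 8 / rate_bits_per_s
    return seconds / 86_400  # seconds per day


PETABYTE = 10 ** 15   # decimal petabyte (an assumption for this example)
GOAL_RATE = 4e9       # the 4 Gbps throughput goal mentioned above

days = transfer_days(PETABYTE, GOAL_RATE)
print(f"{days:.1f} days per petabyte at 4 Gbps")
```

At that rate a single petabyte takes a little over three weeks to move, which is why sustained end-to-end throughput between sites matters so much for peta-scale climate data.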
As we work closely with ESGF to pave the way for climate science, we look forward to developing a new set of networking best practices to help future climate science collaborations. In all, we are excited to get this started and see their science move forward!
As a research and education network, one of ESnet’s accomplishments came to light at the end of 2013 during an SC13 demo in Denver, CO. Using ESnet’s 100 Gbps backbone network, NASA Goddard’s High End Computer Networking (HECN) Team achieved a record single host pair network data transfer rate of over 91 Gbps for a disk-to-disk file transfer. By close collaboration with ESnet, Mid-Atlantic Crossroads (MAX), Brocade, Northwestern University’s International Center for Advanced Internet Research (iCAIR), and the University of Chicago’s Laboratory for Advanced Computing (LAC), the HECN Team showcased the ability to support next generation data-intensive petascale science, focusing on achieving end-to-end 100 Gbps data flows (both disk-to-disk and memory-to-memory) across real-world cross-country 100 Gbps wide-area networks (WANs).
To achieve 91+ Gbps disk-to-disk network data transfer rate between a single pair of high performance RAID servers, this demo required a number of techniques working in concert to avoid any bottlenecks in the end-to-end transfer process. This required parallelization using multiple CPU cores, RAID controllers, 40G NICs, and network data streams; a buffered pipelined approach to each data stream, with sufficient buffering at each point in the pipeline to prevent data stalls, including application, disk I/O, network socket, NIC, and network switch buffering; a completely clean end-to-end 100G network path (provided by ESnet and MAX) to prevent TCP retransmissions; synchronization of CPU affinities for the application process and the disk and network NIC interrupts; and a suitable Linux kernel.
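The buffered, pipelined approach can be illustrated with a toy Python sketch. This is not the HECN Team's software, and Python threads will not come anywhere near these speeds; the function names and the `depth` parameter are assumptions for the example. What it does show is the structure: each stage runs concurrently, and a bounded queue between stages absorbs short stalls while applying back-pressure when a downstream stage falls behind.

```python
import queue
import threading

SENTINEL = object()  # marks the end of the stream


def stage(work, inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Apply `work` to every item from `inbox` and pass results to `outbox`."""
    while True:
        item = inbox.get()
        if item is SENTINEL:
            outbox.put(SENTINEL)
            return
        outbox.put(work(item))


def run_pipeline(chunks, stages, depth: int = 8):
    """Push `chunks` through `stages`, linked by bounded queues of size `depth`."""
    queues = [queue.Queue(maxsize=depth) for _ in range(len(stages) + 1)]
    workers = [
        threading.Thread(target=stage, args=(fn, queues[i], queues[i + 1]))
        for i, fn in enumerate(stages)
    ]

    def feed() -> None:
        # put() blocks when the first queue is full, which is the back-pressure.
        for chunk in chunks:
            queues[0].put(chunk)
        queues[0].put(SENTINEL)

    workers.append(threading.Thread(target=feed))
    for worker in workers:
        worker.start()

    # Drain the final queue in the calling thread so no stage blocks forever.
    results = []
    while True:
        item = queues[-1].get()
        if item is SENTINEL:
            break
        results.append(item)
    for worker in workers:
        worker.join()
    return results


# Stand-ins for "read from disk" and "send to network": transform, then measure.
sizes = run_pipeline([b"ab", b"cde", b"f"], [bytes.upper, len], depth=2)
print(sizes)
```

Sizing `depth` is the design choice that matters: too small and one slow stage stalls the whole chain, too large and memory is wasted without improving throughput.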
The success of the HECN Team SC13 demo proves that it is possible to fully utilize real-world 100G networks to transfer and share large-scale datasets in support of petascale science, using Commercial Off-The-Shelf system, RAID, and network components, together with open source software.