ESnet’s Data Mobility Exhibition: Moving to petascale with the research community

Capacity planning and user requirements for Research and Education Networks (RENs) differ from those faced by commodity internet service providers serving home users. One key difference is that scientific workflows can require the REN to carry large, unscheduled, high-volume data transfers, or “bursts” of traffic. Experiments may be impossible to duplicate, and even one underperforming network link can cause an entire data transfer to fail. Another set of challenges stems from the federated nature of scientific collaboration and networking. Because network performance standards cannot be centrally enforced, good performance depends on the entire REN community working together to identify best practices and resolve issues. For example:

  • Data Transfer Nodes (DTNs), which connect network endpoints to local data storage systems, are owned by individual institutions, facilities, or labs. DTNs can be deployed with various equipment configurations, with local or networked storage, and connected to internal networks in many different ways.
  • Research institutions have diverse levels of resources and varied data transfer requirements; DTNs and local networks are maintained and operated based on these local considerations.
  • Devising performance benchmarks for “how fast a data transfer should be” is difficult, because the capacity, flexibility, and general capabilities of the networks linking scientists and resources constantly evolve and are not consistent across the research ecosystem.

ESnet has long focused on developing ways to streamline workflows and reduce network operational burdens on scientific programs and researchers, both for those we directly serve and on behalf of the entire R&E network community. Building on the successful Science DMZ design pattern and the Petascale DTN project, the Data Mobility Exhibition (DME) was developed to improve the predictability of data movement between research sites and universities. Many sites use perfSONAR to test end-to-end network performance. The DME allows sites to take this a step further and test end-to-end data transfer performance.

The DME is a resource that lets a site calibrate the data transfer performance of its DTNs against ESnet’s own test environment, at scale. As part of the DME, system/storage administrators and network engineers have a variety of resources available to analyze data transfer performance against ESnet’s standard DTNs, obtain help from ESnet Science Engagement (or, for universities, from the Engagement and Performance Operations Center) to tune equipment, and share performance data and network designs with the community to help others. For instance, a 10 Gbps DTN should be capable of transferring, at a minimum, one terabyte per hour. However, we would like to see DTNs faster than 10G, or clusters of 10G DTNs, transfer at petascale rates of 6 TB/hr, or about 1 PB/week.
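The rates quoted above can be cross-checked with a little arithmetic. The following is a minimal sketch, assuming decimal units (1 TB = 10^12 bytes, 1 PB = 10^15 bytes); the function names are illustrative and not part of any DME tooling.

```python
# Decimal units, as is conventional for network rates (assumption).
BITS_PER_BYTE = 8
TB = 10**12  # bytes
PB = 10**15  # bytes
HOURS_PER_WEEK = 7 * 24


def tb_per_hour_to_gbps(tb_per_hour: float) -> float:
    """Average line rate in Gb/s needed to sustain a given TB/hr."""
    bytes_per_second = tb_per_hour * TB / 3600
    return bytes_per_second * BITS_PER_BYTE / 1e9


def tb_per_hour_to_pb_per_week(tb_per_hour: float) -> float:
    """Volume moved in one week at a sustained TB/hr rate."""
    return tb_per_hour * TB * HOURS_PER_WEEK / PB


if __name__ == "__main__":
    for rate in (1, 6):
        print(f"{rate} TB/hr = {tb_per_hour_to_gbps(rate):.2f} Gb/s "
              f"= {tb_per_hour_to_pb_per_week(rate):.3f} PB/week")
```

Under these assumptions, 1 TB/hr works out to roughly 2.2 Gb/s sustained, well within a 10G interface, while 6 TB/hr needs about 13.3 Gb/s, which is why the petascale target calls for faster DTNs or a cluster of 10G DTNs.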

Currently, the DME has geographically dispersed benchmarking DTNs in three research locations:

  • Cornell Center for Advanced Computing in Ithaca, NY, connected through NYSERnet
  • NCAR GLADE in Boulder, CO, connected through Front Range Gigapop
  • Petrel system at Argonne National Lab, connected through ESnet

Benchmarking DTNs are also deployed in two commercial cloud environments: Google Drive and Box. All five DME DTNs can be used for both upload and download testing, allowing users to calibrate and compare their network’s data transfer performance. Additional DTNs are being considered to add capacity in the future. Next-generation ESnet6 DTNs supporting this data transfer testing framework will be added in FY22-23.

The DME provides calibrated data sets ranging in size from 100 MB to 5 TB, so that the performance of different-sized transfers can be studied.
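As a rough illustration of how a site might score its own test runs, the sketch below converts a completed transfer into TB/hr and compares it with the 1 TB/hr baseline and the 6 TB/hr petascale target mentioned earlier. The `TransferResult` class, its field names, and the timings are hypothetical and are not measured DME results.

```python
from dataclasses import dataclass

TB = 10**12  # bytes, decimal units (assumption)
BASELINE_TB_PER_HR = 1.0   # minimum expected of a 10 Gb/s DTN (from the text)
PETASCALE_TB_PER_HR = 6.0  # roughly 1 PB/week (from the text)


@dataclass
class TransferResult:
    """One completed test transfer; field names are hypothetical."""
    label: str
    bytes_moved: int
    seconds: float

    @property
    def tb_per_hour(self) -> float:
        return self.bytes_moved / TB / (self.seconds / 3600)


def classify(result: TransferResult) -> str:
    """Label a run against the two reference rates."""
    rate = result.tb_per_hour
    if rate >= PETASCALE_TB_PER_HR:
        return "petascale"
    if rate >= BASELINE_TB_PER_HR:
        return "baseline met"
    return "below baseline"


if __name__ == "__main__":
    # Made-up timings for illustration only.
    runs = [
        TransferResult("100 MB set", 100 * 10**6, 2.0),
        TransferResult("5 TB set", 5 * TB, 4 * 3600),
    ]
    for r in runs:
        print(f"{r.label}: {r.tb_per_hour:.2f} TB/hr -> {classify(r)}")
```

Running the same check across several data set sizes makes it easier to see whether a shortfall is tied to transfer size or shows up everywhere.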

DOE scientists or infrastructure engineers can use the DME testing framework, built from the Petascale DTN model, with their peers to better understand the performance that institutions are achieving in practice. Here are examples of how past Petascale DTN data mobility efforts have helped large scientific data transfers:

  1. 768 TB of DESI data was transferred between OLCF and NERSC over ESnet, automatically via Globus, in 20 hours. Despite an interruption for maintenance activity at ORNL, the transfer resumed seamlessly without any user involvement.
  2. Radiation-damage-free, high-resolution SFX structures of the SARS-CoV-2 main protease, obtained at near-physiological temperature, offer invaluable information for immediate drug-repurposing studies for the treatment of COVID-19. This work required near-real-time collaboration and data movement between LCLS and NERSC via ESnet.

To date, over 100 DTN operators have used DME benchmarking resources to tune their own data transfer performance. In addition, the DME has been added to the six main scientific networking consulting support services of the NSF-funded Engagement and Performance Operations Center (EPOC) program, bringing this capability to a wide set of US research universities.

As the ESnet lead for this project, I invite you to contact me for more information (consult@es.net). We also have information on our knowledge base website, fasterdata.es.net. DME is an easy, effective way to ensure your network, data transfer, and storage resources are operating at peak efficiency!

40G Data Transfer Node (DTN) now Available for User Testing!

ESnet’s first 40 Gb/s public data transfer node (DTN) has been deployed and is now available for community testing. This new DTN is the first of a new generation of publicly available networking test units, provided by ESnet to the global research and engineering network community as part of promoting high-speed scientific data mobility. This 40G DTN will provide four times the speed of previous-generation DTN test units, as well as the opportunity to test a variety of network transfer tools and calibrated data sets.

The 40G DTN server, located at ESnet’s El Paso location, is based on an updated reference implementation of our Science DMZ architecture. This new DTN (and others that will soon follow in other locations) will allow our collaborators throughout the global research and engineering network community to test large, demanding, high-speed data transfers as part of improving their own network performance. The deployment provides a resource enabling the global science community to reach levels of data networking performance first demonstrated in 2017 as part of the ESnet Petascale DTN project.

The El Paso 40G DTN has Globus installed for GridFTP and parallel file transfer testing. Additional data transfer applications may be installed in the future. To help users evaluate their own network capabilities, ESnet Data Mobility Exhibition (DME) test data sets will be loaded on this new 40G DTN shortly.

All ESnet DTN public servers can be found at https://app.globus.org/file-manager. ESnet will continue to support existing 10G DTNs located at Sunnyvale, Starlight, New York, and CERN. 

ESnet's 40G DTN Reference Architecture Block Diagram

The full 40G DTN reference architecture and more information on the design of these new DTNs can be found here:

A second 40G DTN will be available in the next few weeks, and will be deployed in Boston. It will feature Google’s bottleneck bandwidth and round-trip propagation time (BBR2) software, allowing improved round-trip-time measurement and the ability for users to explore BBR2 enhancements to standard TCP congestion control algorithms.
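For readers who want to experiment with congestion control from a Linux client, the sketch below lists the algorithms the local kernel offers and requests one on a TCP socket. It is only an illustration under stated assumptions: it relies on the Linux-specific `TCP_CONGESTION` socket option and the standard procfs listing, and the algorithm name `bbr2` is a guess at what a BBRv2-enabled kernel might expose (upstream Linux uses `bbr`). It does not describe how the ESnet DTNs themselves are configured.

```python
import socket
from pathlib import Path

# Standard Linux procfs file listing the algorithms the kernel offers.
AVAILABLE = Path("/proc/sys/net/ipv4/tcp_available_congestion_control")


def parse_algorithms(text: str) -> list[str]:
    """Split the space-separated procfs listing into algorithm names."""
    return text.split()


def available_algorithms() -> list[str]:
    """Return the kernel's available algorithms, or [] if unreadable."""
    try:
        return parse_algorithms(AVAILABLE.read_text())
    except OSError:
        return []


def socket_with_algorithm(name: str) -> socket.socket:
    """Create a TCP socket that requests the named congestion control."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_CONGESTION is Linux-specific; setsockopt raises OSError if the
    # algorithm is not loaded or not permitted for this user.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, name.encode())
    return sock


if __name__ == "__main__":
    algos = available_algorithms()
    print("available:", algos)
    # "bbr2" is an assumed name; check the listing printed above.
    for candidate in ("bbr2", "bbr"):
        if candidate not in algos:
            continue
        try:
            with socket_with_algorithm(candidate) as s:
                raw = s.getsockopt(socket.IPPROTO_TCP,
                                   socket.TCP_CONGESTION, 16)
                print("socket uses:", raw.rstrip(b"\x00").decode())
        except OSError as exc:
            print(f"could not select {candidate}: {exc}")
        break
```

Setting the algorithm per socket, rather than system-wide, lets a test tool compare algorithms side by side without changing the host default.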

In an upcoming blog post, I will describe the Boston/BBR2-enabled 40G DTN and perfSONAR servers. In the meantime, ESnet and the deployment team hope that the new El Paso DTN will be of great use to the global research community!