Summer at ESnet: student notes, part 3

Eashan Adhikarla is pursuing a Ph.D. at Lehigh University and joined my group this summer to work on our “DTN as a Service” project. He contributed a lot of energy and novel insights to our work over the summer, and I hope we have the opportunity to collaborate again in the near future. Here are some thoughts from Eashan on the summer student experience at ESnet.

This is my second internship at Berkeley Lab and my first at the Scientific Networking Division (SND). It has been full of excitement, thrills, challenges, and surprises, and it is a dream place to be.

This summer, I have been working at the intersection of machine learning and high-performance computing in data transfer nodes (DTNs). ESnet connects 40 DOE sites to 140 other networks and therefore has a high demand for data transfers ranging from megabytes to petabytes. The team is designing DTN-as-a-Service (DTNaaS), where the goal is to deploy data movement services across various sites and optimize their performance. Managing Transmission Control Protocol (TCP) flows is a key factor in achieving good transfer performance over a wide range of network infrastructure. My research helps automate DTN performance tuning via machine learning, thus improving the overall DTNaaS framework.

At present, most DTN software is deployed on bare-metal servers, which limits the flexibility for operational configuration changes and for automating transfer configurations. Manually inferring the best tuning parameters for a dynamic network is a challenge. To optimize the throughput of a TCP flow, we currently often use a pacing-rate function to control the spacing between packets. Part of my work proposes two alternative approaches (supervised or sparse regression-based models) to better predict the pacing rate, as well as to automate changes to related DTN settings based on the nature of the transfers.
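As a rough illustration of how a pacing rate translates into packet spacing, the sketch below computes the gap between consecutive packets for a given packet size and pacing rate. The function names and the example values (9000-byte packets, 10 Gbit/s) are assumptions chosen for illustration; they are not taken from the DTNaaS code or from the models described above.

```python
def pacing_gap_seconds(packet_bytes: int, pacing_rate_bps: float) -> float:
    """Time between the starts of consecutive packets at a given pacing rate.

    packet_bytes    -- size of each packet in bytes (hypothetical input)
    pacing_rate_bps -- pacing rate in bits per second (hypothetical input)
    """
    if pacing_rate_bps <= 0:
        raise ValueError("pacing rate must be positive")
    return packet_bytes * 8 / pacing_rate_bps


def packets_per_second(packet_bytes: int, pacing_rate_bps: float) -> float:
    """Number of packets released per second at the given pacing rate."""
    return 1.0 / pacing_gap_seconds(packet_bytes, pacing_rate_bps)


if __name__ == "__main__":
    # Assumed example: 9000-byte packets paced at 10 Gbit/s.
    gap = pacing_gap_seconds(9000, 10e9)
    print(f"gap = {gap * 1e6:.2f} microseconds")  # gap = 7.20 microseconds
```

A lower pacing rate widens the gap and smooths bursts, while a higher one narrows it, which is why the right value depends on the path and on the nature of the transfer.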

Overall, my summer research involved getting experience with a wide set of networking areas of interest:

  • Improving the DTN-as-a-Service agent traffic control API with profiles and pacing settings
  • Creating a method for statistics retrieval for the harness toolkit for dynamic data visualization and analysis, and preparing these statistics to train the pacing model
  • Developing a pacing prediction approach that reduces much of the effort for manual pacing rate configuration

I was also able to contribute to a separate team’s project on exploring the use of network congestion control algorithms for DTNs; the resulting paper will be submitted to an SC21 workshop.

For me, one of the best things at ESnet is that the summer interns get to work directly with quintessential research scientists and research engineers in the lab and learn a variety of skills to tackle the most challenging problems on a real-world scale. It’s a place from which I always come out as a better version of myself.


If you are interested in learning more about future summer opportunities with ESnet, please see this link (https://cs.lbl.gov/careers/summer-student-and-faculty-program/). We typically post notices and application information starting in January or February.

Summer at ESnet: the view from our students: Part 1

Summer students are a key part of growing ESnet and supporting the scientific networking community. Every year, we host research projects with talented students working on important research topics. We benefit tremendously from their enthusiasm, talent, and fresh ideas, and they work directly with our staff across a wide set of disciplines.  Here are some thoughts from two current students on what it is like to work with ESnet, and what research excites them.

Sandesh Dhawaskar Sathyanarayana:

I am thrilled with my summer internship at ESnet. During my Multipath Transmission Control Protocol (MPTCP) research, I used in-kernel programs to implement receiver-based network controllers, and I have always wanted to do more of this kind of work because it lets you hook into the kernel and innovate on network protocols. Software Defined Networking (SDN), along with dataplane and kernel network programming, is trending because it enables the telecom world to save billions of dollars and operate networks more efficiently.

My goal for this summer was to work and innovate in the SDN field, and ESnet was the perfect fit for it. At ESnet, I work on the Q-factor project using technologies such as eBPF (extended Berkeley Packet Filter) and XDP (eXpress Data Path) to improve data transfer speeds in science networks. I get to play with the state-of-the-art P4 dataplane programming language for switches and programmable NICs. The project is a collaboration with Florida International University (FIU), so I get to work with amazing people. Our team is small, with great mentors like Richard Cziva and Jeronimo Bezerra.

What I love the most is the freedom to think and solve problems with great support. Having worked in different labs, I used to be stressed most of the time about completing the work. This summer has been a very different experience, with excellent mentorship. I also had other offers and chose ESnet because my advisor and co-advisors strongly recommended it, and I am happy I went with ESnet.

Elias Joseph:

Interning at ESnet has been a really good learning experience for me. The regular seminars from researchers in the lab about their current projects have allowed me to learn about a lot of topics I usually wouldn’t have much exposure to, as well as see how the concepts I have learned about in school are being applied in a professional environment. It is really interesting to see how machine learning is actively being used at the laboratory, and what current advancements are being made with it.

As much as I’m learning from the seminars, I’m learning even more from the project I’m working on. This internship is giving me experience using a lot of tools that are prevalent in computer science but are underutilized in my master’s program, and my mentor has been extremely helpful in getting me up to speed on these tools.

I’ve also found working on my project very fulfilling. Primarily I’ve been working on a tool that displays internet traffic, as well as predictions for future traffic, and seeing it come together over the past month and a half has been really cool.

I do miss the social aspect of working in an office, but the networking and social activities that have been organized have done a lot to alleviate that, and overall, I have grown a lot in the first half of my internship.

If you are interested in learning more about future summer opportunities with ESnet, please see this link — we typically post notices and accept applications for the next summer starting in January or February.

Graduate students publish on network telemetry with ESnet

Two graduate students working with ESnet have published their papers recently in IEEE and ACM workshops.

Bibek Shrestha, a graduate student at the University of Nevada, Reno, and his advisor Engin Arslan worked with Richard Cziva from ESnet to publish “INT Based Network-Aware Task Scheduling for Edge Computing”. In the paper, Bibek investigated the use of in-band network telemetry (INT) for real-time in-network task scheduling. Bibek’s experimental analysis, using various workload types and network congestion scenarios, revealed that enhancing edge computing task scheduling with high-precision network telemetry can lead to up to a 40% reduction in data transfer times and up to a 30% reduction in total task execution times by favoring edge servers in uncongested (or mildly congested) sections of the network when scheduling tasks. The paper will appear in the 3rd Workshop on Parallel AI and Systems for the Edge (PAISE), co-located with the IEEE IPDPS 2021 conference, to be held on May 21st, 2021, in Portland, Oregon.
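The idea of favoring edge servers on uncongested paths can be sketched in a few lines. This is only a toy illustration under stated assumptions, not the scheduler from the paper: the `EdgeServer` fields, the cost model (path delay plus transfer time plus compute time), and all numbers are hypothetical stand-ins for values that telemetry might provide.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EdgeServer:
    name: str
    available_bps: float  # assumed: bandwidth currently available on the path
    path_delay_s: float   # assumed: path delay including queueing
    compute_s: float      # assumed: time to execute the task on this server


def estimated_completion_s(server: EdgeServer, task_bytes: int) -> float:
    """Hypothetical cost model: path delay + data transfer time + compute time."""
    if server.available_bps <= 0:
        return float("inf")  # a fully congested path is never preferred
    transfer_s = task_bytes * 8 / server.available_bps
    return server.path_delay_s + transfer_s + server.compute_s


def pick_server(servers: List[EdgeServer], task_bytes: int) -> EdgeServer:
    """Choose the server with the lowest estimated completion time."""
    return min(servers, key=lambda s: estimated_completion_s(s, task_bytes))


if __name__ == "__main__":
    # Assumed scenario: server "a" computes faster but sits behind a congested link.
    servers = [
        EdgeServer("a", available_bps=0.2e9, path_delay_s=0.020, compute_s=1.0),
        EdgeServer("b", available_bps=0.8e9, path_delay_s=0.005, compute_s=1.2),
    ]
    chosen = pick_server(servers, task_bytes=100_000_000)
    print(chosen.name)  # b
```

In this made-up scenario the slower server wins because the 100 MB input reaches it much sooner, which is the kind of trade-off that network-aware scheduling is meant to capture.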

Zhang Liu, a former ESnet intern and a current graduate student at the University of Colorado at Boulder, worked with the ESnet High Touch Team (Chin Guok, Bruce Mah, Yatish Kumar, and Richard Cziva) on fastcapa-ng, ESnet’s telemetry processing software. In the paper “Programmable Per-Packet Network Telemetry: From Wire to Kafka at Scale,” Zhang showed the scaling and performance characteristics of fastcapa-ng and highlighted the most critical performance considerations that allow it to push 10.4 million telemetry packets per second to Kafka with only 5 CPU cores, which is more than enough to handle 170 Gbit/s of original traffic with 1512B MTU. This paper will appear in the 4th International Workshop on Systems and Network Telemetry and Analytics (SNTA 2021), held at the ACM HPDC 2021 conference in Stockholm, Sweden, on 21-25 June 2021.

Congratulations Bibek and Zhang!


If you are a networked systems research student looking to collaborate with us on network measurements, please reach out to Richard Cziva. If you are interested in a summer internship with ESnet, please visit this page.