3 Questions with Kapil Agrawal

Kapil Agrawal. Juniper the cat was unavailable to photograph.

Kapil Agrawal comes to us from the National Center for Supercomputing Applications (NCSA), where he worked as a Network Engineer focusing on HPC data center networking and all things automation. Before that, Kapil worked as a Network Engineer at GlobalNOC focusing on service provider networking for regional R&E networks. He is passionate about learning and tinkering with new open source technologies in his home lab, intense hackathons, and infrastructure-as-code. In his downtime, Kapil enjoys high-intensity interval training, traveling and exploring new places, competitive gaming, and playing with Juniper (his cat).

What brought you to ESnet?

ESnet’s mission to innovate, build, and support a bleeding-edge network infrastructure for scientific computing empowers researchers to focus on what’s core to them: the science. This is very exciting, but it also comes with challenges in terms of security. We want to be open enough to share the science with our collaborators, but not so open that bad actors take advantage of us. Where does one draw the line? That is the challenge, and that’s what makes cybersecurity in scientific computing so interesting! I am also familiar with the innovative work that ESnet security does for the R&E community, and I am excited about the opportunity to learn and grow with the team and to give back to the community in every way possible.

What is the most exciting thing going on in your field right now?

Coming from a networking background, I find MANRS (Mutually Agreed Norms for Routing Security) very exciting. It’s a herculean effort by the larger networking community to secure the global internet routing infrastructure. 

What book would you recommend?

In order from least to most technical: Atomic Habits, The Phoenix Project, Where Wizards Stay Up Late: The Origins of the Internet, and Internet Routing Architectures.

ESnet Highlights from ZeekWeek’21

Fatema Bannat Wala presenting at ZeekWeek21

Slides and videos from ZeekWeek have just been made available — here are links to ESnet highlights.


ZeekWeek, an annual fall conference organized by the Zeek Project, took place online October 13-15 this year. The conference drew over 2,000 registered participants from the open source user community, who gathered to share the latest and greatest about this cybersecurity and network monitoring software tool.

Berkeley Lab staff member Vern Paxson developed the precursor to the Zeek intrusion detection software, then called Bro, in 1994. As an early adopter, ESnet’s cybersecurity team has strong relationships with the Zeek community, and this ZeekWeek was an opportunity to showcase advances in and uses of the software by ESnet and the entire research and education networking community.


The talk “DNS and Spoofed traffic investigation with Zeek,” presented by Fatema Bannat Wala, discussed how Zeek is being used to do network traffic analysis/investigations at ESnet by triaging abnormal activities when these occur on our network.

The talks “A Better Way to Capture Packets with DPDK” and “Details for DPDK plugin development and performance measurement,” presented by Vlad Grigorescu and Scott Campbell, detailed the development process of the plugin and the performance enhancements it brings to network packet capture.

Fatema Bannat Wala also did a training session on “Introduction to Zeek,” which provided hands-on experience with Zeek tools and information about how to get involved with the collaboration.

ESnet’s cybersecurity team looks forward to continued collaboration with the Zeek community, attending next year’s ZeekWeek, and to contributing future code enhancements to this great software ecosystem.

3 Questions with Michael Haberman

Michael comes to ESnet’s Cybersecurity group after working as a software engineer at the National Center for Supercomputing Applications (NCSA) and in the Automated Learning Group at the University of Illinois Urbana-Champaign (UIUC). Recently, he has also been an instructor for a data science and machine learning course within the School of Informatics (iSchool).

Michael Haberman

What brought you to ESnet?
The classes I taught at UIUC were designed around mastery-based learning and evidence-based teaching. I built a framework that instrumented the assignments (similar to observability) so that I could get a good pulse on where students were struggling and where they weren’t. Creating the end-to-end workflows for the students made me realize how much I missed architecting (and building) software. I knew several great ESnet people and it was just perfect timing that the security group had an opening where they were receptive to bringing on someone with a software design background and also enthusiastic about letting me continue climbing the data analytics and machine learning mountain (I’m at the base). I also love that ESnet’s mission enables science.

What’s the most exciting thing happening in your field?
There’s a lot going on and staying current is a challenge. If I had to pick a topic that is ripe for potential (or hype) it’s using blockchain “decentralized ledger” technology (now being used for databases, voting, and electronic currencies), to create applications in digital identity, and remove unnecessary intermediaries from transactions. It seems like there are new application ideas for blockchain every day.

Although I do not know much about cryptocurrency (or its future), the idea of using its decentralized ‘bookkeeping’ architecture for secure transactions with provenance seems intriguing.

What book would you recommend?
I remember reading The Cuckoo’s Egg in high school and it’s one of the books that got me interested in both computer science and security. When I saw this question I remembered that the main character is from LBL! Perhaps the security group will want me to look into an accounting discrepancy?

ESnet Machine Learning Researchers Win Best Paper at MLN 2021!

MLN 2021 Best Paper Award Notification

Sheng Shen, Mariam Kiran, and Bashir Mohammed have just been awarded the Best Paper award at the International Conference on Machine Learning for Networking (MLN). Sponsored by the Conservatoire National des Arts et Métiers (CNAM), the École Supérieure d’Ingénieurs en Électrotechnique et Électronique (ESIEE), and Laboratoire d’Informatique Gaspard-Monge (LIGM), MLN is being held virtually 1-3 December 2021.

The paper, “DynamicDeepFlow: An Approach for Identifying Changes in Network Traffic Flow Using Unsupervised Clustering,” combines a deep learning variational autoencoder model with shallow learning k-means clustering to help identify unique traffic patterns across ESnet. These unique patterns can help identify whether a new experiment has started or whether current network bandwidth is changing.

DynamicDeepFlow (DDF) model structure
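
To make the clustering half of that hybrid concrete, here is a minimal sketch of k-means (Lloyd's algorithm) applied to latent vectors. It is not the paper's code: the autoencoder is omitted entirely, the 2-D `latent` values are invented placeholders standing in for encoder output, and the function names are hypothetical.

```python
from math import dist
from statistics import fmean


def nearest_cluster(point, centroids):
    """Index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda j: dist(point, centroids[j]))


def kmeans(points, initial_centroids, iterations=10):
    """Plain Lloyd's algorithm on a small list of equal-length tuples."""
    centroids = [tuple(c) for c in initial_centroids]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [nearest_cluster(p, centroids) for p in points]
        # Update step: each centroid moves to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centroids[j] = tuple(fmean(axis) for axis in zip(*members))
    labels = [nearest_cluster(p, centroids) for p in points]
    return labels, centroids


# Placeholder latent codes (assumption): in the real pipeline these would
# come from the encoder of a trained variational autoencoder.
latent = [(0.1, 0.2), (0.0, 0.1), (0.2, 0.0),
          (3.0, 3.1), (2.9, 3.0), (3.1, 2.8)]

labels, centroids = kmeans(latent, initial_centroids=[latent[0], latent[3]])
```

A newly observed flow would be encoded the same way and matched with `nearest_cluster`; a flow that lands far from every centroid is one possible signal of a new traffic pattern.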

“We’re very excited to receive this recognition and the conference was a wonderful opportunity to exchange thoughts and ideas with peers in France. MLN is a conference dedicated to discussing machine learning applications in networks. Our next task is to integrate DynamicDeepFlow with Netpredict to show real-time information in ESnet data,” said Mariam Kiran.

Papers from MLN will be published as post-proceedings in Springer’s Lecture Notes in Computer Science (LNCS).

ESnet Highlights from the National Science Foundation’s Cybersecurity Summit ’21

Trusted CI, the National Science Foundation (NSF) Cybersecurity Center of Excellence, hosts a yearly cybersecurity summit, inviting people from various NSF-funded research organizations to share innovations and ideas. Here are some videos of ESnet presentations.

Scott Campbell presented “ESnet Security Group Impact on Network Architecture” where he discussed some of the social, technical, and architectural outcomes of the ESnet6 network upgrade that were beneficial to the organization. By being involved early, security design elements were incorporated into workflows at early stages and were both tightly integrated and vetted during the core design process. This early involvement also heightened the security group’s visibility, which led to a better understanding of how the various groups interact and their different methods of problem-solving and time management.

Eli Dart and Fatema Bannat Wala presented “Best practices for securing Science DMZ,” focusing on disentangling security policies and enforcement for science flows from traditional security approaches for business systems, and use of the Science DMZ model to protect high-performance science flows. They discussed thinking of the Science DMZ as a security architecture that provides useful and implementable security controls without impacting performance. 

ESnet Scientists awarded best paper at SC21 INDIS!

A combined team from ESnet and Lehigh University was awarded the best paper for “Exploring the BBRv2 Congestion Control Algorithm for use on Data Transfer Nodes” at the 8th IEEE/ACM International Workshop on Innovating the Network for Data-Intensive Science (INDIS 2021), which was held in conjunction with the 2021 IEEE/ACM International Conference for High Performance Computing, Networking, Storage and Analysis (SC21) on Monday, November 15, 2021.

The team comprised:

  • Brian Tierney, Energy Sciences Network (ESnet)
  • Eli Dart, Energy Sciences Network (ESnet)
  • Ezra Kissel, Energy Sciences Network (ESnet)
  • Eashan Adhikarla, Lehigh University

The paper can be found here. Slides from the presentation are here. In this Q+A, ESnet spoke with the award-winning team about their research — answers are from the team as a whole.

INDIS 21 Best Paper Certificate

The paper is based on extensive testing and controlled experiments with the BBR (Bottleneck Bandwidth and Round-trip propagation time), BBRv2, and Cubic Function Binary Increase Congestion Control (CUBIC) Transmission Control Protocol (TCP) congestion control algorithms. What was the biggest lesson from this testing?

BBRv2 represents a fundamentally different approach to TCP congestion control. CUBIC, like Hamilton, Reno, and many others, is loss-based, meaning that it interprets packet loss as congestion and therefore requires significant network engineering effort to achieve high performance. BBRv2 is different in that it measures the network path and builds a model of the path; it then paces itself to avoid loss and queueing. In practical terms, this means that BBRv2 is resilient to packet loss in a way that CUBIC is not. This comes through loud and clear in our data.
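
A rough back-of-the-envelope sketch can show why loss matters so much to loss-based algorithms. It uses the well-known Mathis approximation for Reno-style TCP (not CUBIC's own response function, and not a result from the paper) alongside the bandwidth-delay product, the quantity a path model like BBR's is built around. The function names and example path values are assumptions chosen for illustration.

```python
from math import sqrt


def mathis_throughput_bps(mss_bytes, rtt_s, loss_rate):
    """Approximate throughput ceiling for Reno-style loss-based TCP.

    Mathis et al.: rate ~ (MSS / RTT) * C / sqrt(p), with C = sqrt(3/2).
    """
    return (mss_bytes * 8 / rtt_s) * sqrt(1.5) / sqrt(loss_rate)


def bdp_bytes(bottleneck_bps, rtt_s):
    """Bandwidth-delay product: bytes in flight needed to fill the path."""
    return bottleneck_bps * rtt_s / 8


# Illustrative path (assumption): 10 Gbit/s bottleneck, 100 ms round trip,
# 1460-byte segments, 0.01% packet loss.
ceiling = mathis_throughput_bps(mss_bytes=1460, rtt_s=0.1, loss_rate=1e-4)
in_flight = bdp_bytes(bottleneck_bps=10e9, rtt_s=0.1)
```

Under these assumed values the loss-based ceiling works out to roughly 14 Mbit/s, while filling the path requires about 125 MB in flight. This gap is why even small loss rates demand so much engineering effort on long, fast paths when a loss-based algorithm is in use.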

What part of the testing was the most difficult and/or interesting?

We ran a large number of tests in a wide range of scenarios. It can be difficult to keep track of all the test configurations, so we wrote a “test harness” in Python that allowed us to keep track of all the testing parameters and resulting data sets.

The harness also allowed us to better compare results collected over real-world paths to those in our testbed environments. Managing the deployment of the testing environment through containers also allowed for rapid setup and improved reproducibility.
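
The team's harness is not reproduced here. As a minimal sketch of the general idea, a harness can store each test's parameters alongside its measurement so that results from different paths stay comparable. Every name, field, and value below is hypothetical.

```python
import io
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class RunConfig:
    """One test scenario; all field names are illustrative assumptions."""
    algorithm: str   # e.g. "bbr2" or "cubic"
    path: str        # label for a testbed or real-world path
    rtt_ms: float
    loss_pct: float


def record_result(log, config, throughput_gbps):
    """Append one JSON line holding the parameters and the measurement."""
    row = asdict(config)
    row["throughput_gbps"] = throughput_gbps
    log.write(json.dumps(row, sort_keys=True) + "\n")


def load_results(log_text):
    """Parse the JSON-lines log back into a list of dictionaries."""
    return [json.loads(line) for line in log_text.splitlines() if line]


# Placeholder measurements (not real data) written to an in-memory log.
log = io.StringIO()
record_result(log, RunConfig("bbr2", "testbed", 50.0, 0.01), 9.0)
record_result(log, RunConfig("cubic", "testbed", 50.0, 0.01), 2.0)

rows = load_results(log.getvalue())
by_algorithm = {row["algorithm"]: row["throughput_gbps"] for row in rows}
```

Keeping parameters and results in the same record means any later analysis can group or filter by scenario without relying on file names or memory.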

You provide readers with links to great resources so they can do their own testing and learn more about BBRv2. What do you hope readers will learn?

We hope others will test BBRv2 in high-performance research and education environments. There are still some things that we don’t fully understand. For example, there are some cases where CUBIC outperforms BBRv2 on paths with very large buffers. It would be great for this to be better characterized, especially in R&E network environments.

What’s the next step for ESnet research into BBRv2? How will you top things next year?

We want to further explore how well BBRv2 performs at 100G and 400G. We would also like to spend additional time performing a deeper analysis of the current (and newly generated) results to gain insights into how BBRv2 performs compared to other algorithms across varied networking infrastructure. Ideally we would like to provide strongly substantiated recommendations on where it makes sense to deploy BBRv2 in the context of research and educational network applications.

Arecibo Support Wins SC21 HPCwire Readers’ Choice Award!

Arecibo dish after the collapse

As part of a team spanning 15 government, academic, and industrial partners, the Engagement and Performance Operations Center (EPOC), a collaboration between Indiana University and ESnet, was awarded the “Best HPC Collaboration (Academia/Government/Industry)” HPCwire Readers’ Choice award on Tuesday, Nov. 16. The award, which was made at the High Performance Computing, Networking, Storage and Analysis (SC21) conference, recognizes the effort and collaboration required to move and safeguard irreplaceable data (over 50 years of astronomical observations) from the Arecibo observatory following the structural collapse of this scientific resource in 2020.

At ESnet, Ken Miller, George Robb, and Jason Zurawski supported these efforts as both full members of EPOC and ESnet staff. Jason and Ken are both with ESnet’s Science Engagement Team, while George is with ESnet’s Infrastructure Systems group. LightBytes caught up with Jason Zurawski to get his thoughts on the project and award, and an update on the Arecibo effort since our post in April 2021 on this project.


Now that data from Arecibo has been migrated to the Texas Advanced Computing Center (TACC), what happens next, and how will this data be used?

The team at the University of Central Florida has been engaged with TACC on several ways to build up the capabilities for their data analysis and sharing requirements. They are working to deploy a portal that will allow researchers access to the data, as well as build workflows to investigate and process it using computation provided by TACC.

The team at Arecibo is also still going to process much older data that still resides on tape. Due to the delicate state of the media, it is carefully being read and transferred to on-island storage before being transmitted to TACC for archiving. This work will take several more months to complete.

What do you think the lessons from this effort are in terms of getting so many different organizations to work together to support this very challenging problem?

The collapse that Arecibo experienced sent ripples through the R&E community because researchers and technology professionals alike knew there was a limited window to act on replicating important observations gathered over the years. The partners in this effort were motivated to act, and that removed many barriers to putting some solutions in place. Everyone collaborated efficiently with their core competencies, and we continue to work together as the next steps for the scientific collaboration are planned.

Following the loss of this instrument, plans are starting to emerge for a “next generation” Arecibo. How might the next generation of data management resources be shaped by this collaboration?

Now that there has been some time to evaluate the work, it has also spurred UCF and Arecibo to plan for the future with respect to computation, storage, and network connectivity both in Puerto Rico and in Florida.  With these improvements planned, they will be well-positioned to serve the scientific data for years to come.  New instruments will no doubt increase the data demands by many orders of magnitude – addressing all aspects of the data pipeline now, and then gradually increasing the capabilities over time, will help to prepare for these emerging challenges. 

Congratulations to all of the organizations and staff who helped prevent the loss of this data!

Making the Research and Educational Community SAFER: Adam Slagell on the creation of a new global collaboration to combat cyberthreats.

Adam Slagell is ESnet’s Chief Security Officer and a founding member of the newly formed Security Assistance For Education & Research (SAFER) trust group.

SAFER is an operational security entity focused on fighting computer misuse and defending the academic, research, and education (R&E) mission globally.  SAFER brings together expertise and resources from organizations across the Research and Educational cybersecurity community, including CERN, DFN-CERT, ESET, ESnet, LBNL, STFC, and WLCG.

More information can be found at https://www.safer-trust.org/.


What motivates the creation of SAFER and what do you think success will look like for the community?

There are many cybersecurity trust groups out there, some even dedicated to R&E, like REN-ISAC or XSEDE’s trust group consisting of current and former TeraGrid and XSEDE site members. However, there really isn’t anything like this, both permanent and truly international, even though attacks are almost always transnational. So each time there is a new, major campaign, an international group connecting all these regional responders must be created again. What we are trying to do is create that permanent backbone with a core set of highly connected individuals who are a part of these regional and project-specific trust groups around the world.

If we are successful, we will see several things. First, I believe we will see more international cooperation and information sharing, leading to an earlier notice of new attack campaigns. Second, we will be able to activate a response more quickly, pulling in the expertise needed from a broad pool of SAFER members and their trusted colleagues. Finally, it is our hope that we can provide surge capabilities when a member is under attack. Many R&E organizations have limited resources and small teams. It is a tremendous asset if they can get help from their peers, maybe with unique expertise as they are facing a disruptive attack.

What kind of security resources will SAFER provide?

I alluded to some of the services when discussing what success will look like. But ultimately, our security resources will be determined by community needs. The founding members will serve as the steering committee for the first year until we elect the next steering committee. 

One of our first steps is setting up a Malware Information Sharing Platform (MISP) instance to share Indicators of Compromise, e.g., IP addresses, file hashes, and domain names. Usually, there is no requirement for members to share such data, as the rules and regulations differ so much across organizations. But even on day one, we will have enough organizations that can contribute to making this service useful.
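
As an illustration of the indicator types mentioned above, the sketch below sorts raw strings into IP addresses, file hashes, and domain names using only the Python standard library. It does not use the MISP API or its data model; the function name, category labels, and the simplified domain pattern are all assumptions.

```python
import ipaddress
import re

# Hex digest lengths for a few common hash algorithms.
_HASH_LENGTHS = {32: "md5", 40: "sha1", 64: "sha256"}

# Simplified hostname pattern (assumption): dot-separated labels, letter TLD.
_DOMAIN_RE = re.compile(
    r"^(?=.{1,253}$)([a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,63}$"
)


def classify_indicator(value):
    """Return a coarse type label for one indicator-of-compromise string."""
    text = value.strip().lower()
    try:
        ipaddress.ip_address(text)  # accepts both IPv4 and IPv6
        return "ip"
    except ValueError:
        pass
    if re.fullmatch(r"[0-9a-f]+", text) and len(text) in _HASH_LENGTHS:
        return _HASH_LENGTHS[len(text)]
    if _DOMAIN_RE.match(text):
        return "domain"
    return "unknown"


# Example values use documentation address ranges and a reserved domain.
samples = ["192.0.2.10", "2001:db8::1", "example.org",
           "d41d8cd98f00b204e9800998ecf8427e"]
kinds = [classify_indicator(s) for s in samples]
```

Normalizing indicators into a small set of types like this is one way contributions from organizations with different tooling could be compared before they are shared.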

There is also a secure messaging and chat service using decentralized cryptography that all of our members can participate in. These ad hoc conversations about what people are seeing on their networks will hopefully help detect trends early.

Finally, many of the founding members come from large institutions with more resources, and I believe we can quickly help those projects and institutions that might struggle with an attack by providing our expertise while helping to train the next generation of security professionals.

What excites you most about this effort and what is the opportunity to do the most good?

I love the community-building aspect. In a past life, I created the Bro (now Zeek) Leadership Team and really worked hard to build a vibrant community around that software. I think this expertise is where I can be most helpful as I am less technical in my roles today.

I will also say, I am excited about getting young people involved, too. Organizations that contribute time from their teams will really benefit. There is no training for incident response like jumping in, and I expect the variety of issues we will see will prove very useful just from a training and development perspective.

LBL has a long history supporting cybersecurity research, from the early days of Clifford Stoll and The Cuckoo’s Egg to the creation of Bro.  What does the future of cybersecurity look like, and how will that shape the REN community?

Indeed, LBL’s security team is also a SAFER founding member. One of the things I love about working here and at ESnet is that our mission is outward-focused, and when we help the community we raise all boats, so to speak.

Fortune telling, however, is a dangerous game. We have anticipated some things, like cryptocurrency mining coming to HPCs. However, the threat landscape and tools available keep changing. That is part of what makes this job interesting. The important thing that I hope we keep in mind is that security is not done for its own sake, but to enable our mission of scientific research. To me, this means that we must always work to make risk-based security decisions, even when that might challenge pushes for compliance and simple one-size-fits-all solutions.

Next Generation ESnet6 Routers Installed and Accepted!

ESnet6 took a major step forward last week with the completed installation and acceptance of all 40 “greenfield” routers on the network backbone. These new routers will enable ESnet to operate at speeds up to 400 Gbps across our national fiber network, and provide the backbone infrastructure behind our next generation scientific data mobility capabilities.

A new ESnet6 backbone router in its native habitat.

The installation and acceptance process at each location across the continental US required careful coordination between subcontractors, colocation facility personnel, Lab site staff, and multiple teams across ESnet. Following local health regulations and access requirements, ESnet arranged physical access for the subcontractors at each location, and all parties participated in a turn-up conference call as the routers were installed and brought online.

In addition to networking capabilities, the ESnet6 team implemented new software automation capabilities simplifying the installation and acceptance process.  These capabilities included enhancements to the ESnet inventory system to support bulk planning data import, automatic bill of materials generation, automatic site survey generation, and automated generation of all backbone links within the network.  In addition, the team introduced new workflow orchestration, automated provisioning, and inventory discovery capabilities to help with the installation process.

The acceptance of the ESnet6 greenfield routers is a major milestone for the ESnet6 Project and the team has already migrated a significant portion of customer traffic onto the new routers. Despite the extra challenges presented by the COVID-19 pandemic, the project has made steady progress and is on track to finish ahead of schedule. 

Science begins as a Conversation! See how ESnet creates a world where conversations become discovery. Watch our new video now!

Ever want to know how big research data moves around the globe? ESnet plays a significant role in supporting the great scientific conversations, collaborations, and experiments underway, wherever and whenever they occur. We move exabytes of data around the world, creating a global laboratory that accelerates scientific discovery.

To meet the needs of scientists, we are constantly looking for opportunities to expand our capabilities with our next generation network ESnet6, intelligent edge analytics, advanced network testbeds, 5G wireless, quantum networking, and more.

https://www.es.net/scienceconversation/