Three questions with a new staff member! Please meet Rémy Doucet

Rémy comes to us from ByteDance/TikTok, where they worked as a Systems Engineer responsible for large-scale server allocation and bare-metal OS deployment. They have worked as a systems engineer for five years, with experience both in the telecom industry and at large social media companies. Rémy began their career as a software developer in Python but shifted when they discovered a passion for infrastructure and systems.

Rémy Doucet

What brought you to ESnet?

I have a long history of activism and also worked in the nonprofit sector prior to my engineering career. I became dissatisfied working only for social media giants and began seeking a career that married my passion for technology with my drive to make a positive impact on the world. Climate change is the most pressing issue humans are facing today, so I am excited to begin contributing to a place that not only has an impressive legacy of scientific discovery, but is continuing to make strides in areas such as renewable and clean energy.

What is the most exciting thing going on in your field right now?

Although it is not exactly under my purview, I have always been fascinated by artificial intelligence. Not only will it continue to transform our society in unimaginable ways, but I am also curious to see how it will come to be used for systems administration tasks such as monitoring and deployment. Currently, these processes are still largely human- and automation-driven, but I think we will start to see more AI incorporated into them in the future. On a personal level, I enjoy experiencing art and music created by AI.

What book would you recommend?

Simulacra and Simulation by Jean Baudrillard. It is a philosophical treatise that I think will become increasingly relevant in our society.

ESnet’s Nick Buraglio Wins Prize at the Annual LBNL IPO Pitch Competition for Hecate

Three questions with Nick Buraglio about the Pitch Competition

Nick Buraglio took third prize at the LBNL Intellectual Property Office’s Annual Pitch Competition on September 9 for his talk on “Hecate: Directing happiness to internet service provider customers.” Hecate is a software tool that leverages machine learning to automate complex network traffic engineering. The prize includes $1,000 for the Hecate team, supporting continued lab-to-market progress.


How did you develop the technology?

It’s a team of three: myself, Scott Campbell, and Mariam Kiran. Scott is handling the data collection and curation and pipelining the data into AI algorithms being developed by Mariam. Mariam is also working on porting the algorithms to GPUs. I’m handling the overall technology and product strategy, plus network elements supporting large-scale traffic engineering.

The idea came from traffic engineering and segment routing conversations; Mariam had some ideas about bringing in machine learning from the SENSE project, so we sat down over Zoom and sketched things out – it was a natural meeting of minds and very much a virtual “mapping out a project on a napkin” moment, despite the pandemic.

What was it like pulling together the pitch?

I enjoy public speaking, and I like to challenge myself. The Pitch Competition seemed like a good opportunity to test the waters and experiment to see what would work and what might not. The challenge was to fit a complicated technical topic into a 5-minute elevator pitch. The Intellectual Property Office supplied coaching as well.

What’s next?

We are continuing our testing, and we are looking for more opportunities to use the demonstration software on real data, especially with research and education network partners who can give us access to their network data. I’m at buraglio@es.net if a reader is interested in learning more!

Summer at ESnet: student notes, part 3

Eashan Adhikarla is pursuing a Ph.D. at Lehigh University and joined my group this summer to work on our “DTN as a Service” project. He contributed a lot of energy and novel insights to our work over the summer, and I hope we have the opportunity to collaborate again in the near future. Here are some thoughts from Eashan on the summer student experience at ESnet.

This is my second internship at Berkeley Lab and my first at the Scientific Networking Division (SND). It has been full of excitement, thrills, challenges, and surprises, and it is a dream place to be.

This summer, I have been working at the intersection of machine learning and high-performance computing in data transfer nodes (DTNs). ESnet connects 40 DOE sites to 140 other networks and therefore has a high demand for data transfers ranging from megabytes to petabytes. The team is designing DTN-as-a-Service (DTNaaS), where the goal is to deploy and optimize the performance of data movement across various sites. Managing transmission control protocol (TCP) flows is a key factor in achieving good transfer performance over a wide range of network infrastructure. My research helps automate DTN performance tuning via machine learning – thus improving the overall DTNaaS framework.

At present, most DTN software is deployed on bare-metal servers, which limits the flexibility for operational configuration changes and automation of transfer configurations. Manually inferring the best tuning parameters for a dynamic network is a challenge. To optimize the throughput of a TCP flow, we currently often use a pacing-rate function to control packet inter-arrival time. Part of my work proposes two alternative approaches (supervised or sparse regression-based models) to better predict the pacing rate, as well as to automate changes to related DTN settings based on the nature of the transfers.
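As a rough illustration of what a pacing rate means in practice, the sketch below converts a target rate into the gap a sender would leave between packets. It is not taken from the DTNaaS code: the function names and the 1500-byte and 9000-byte packet sizes are assumptions chosen for illustration.

```python
def pacing_gap_seconds(pacing_rate_bps: float, packet_bytes: int = 1500) -> float:
    """Return the time between packet departures that holds a sender
    at pacing_rate_bps (bits per second) for packets of packet_bytes."""
    if pacing_rate_bps <= 0:
        raise ValueError("pacing rate must be positive")
    if packet_bytes <= 0:
        raise ValueError("packet size must be positive")
    return packet_bytes * 8 / pacing_rate_bps


def packets_per_second(pacing_rate_bps: float, packet_bytes: int = 1500) -> float:
    """Return how many packets per second the paced sender emits."""
    return 1.0 / pacing_gap_seconds(pacing_rate_bps, packet_bytes)


if __name__ == "__main__":
    # Hypothetical example rates, not measured ESnet values.
    for rate_gbps, size in [(1, 1500), (10, 1500), (10, 9000)]:
        gap_us = pacing_gap_seconds(rate_gbps * 1e9, size) * 1e6
        print(f"{rate_gbps} Gbps, {size}-byte packets: {gap_us:.2f} microsecond gap")
```

A model that predicts the pacing rate is, in effect, choosing this gap for each transfer; the lower the predicted rate, the more widely the packets are spread out.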

Overall, my summer research gave me experience across a wide range of networking areas:

  • Improving the DTN-as-a-Service agent traffic control API with profiles and setting pacing
  • Creating a method for statistics retrieval for the harness toolkit for dynamic data visualization and analysis, and preparing these statistics to train the pacing model
  • Developing a pacing prediction approach that reduces much of the effort for manual pacing rate configuration

I was also able to contribute to a separate team’s project on exploring the use of network congestion control algorithms for DTNs; the resulting paper will be submitted to an SC21 workshop.

For me, one of the best things at ESnet is that the summer interns get to work directly with quintessential research scientists and research engineers in the lab and learn a variety of skills to tackle the most challenging problems on a real-world scale. It’s a place from which I always come out as a better version of myself.


If you are interested in learning more about future summer opportunities with ESnet, please see this link (https://cs.lbl.gov/careers/summer-student-and-faculty-program/). We typically post notices and application information starting in January or February.

Meeting the Challenge of High Availability through the HASS

Operating a highly optimized network across two continents that meets the needs of very demanding scientific endeavors requires a tremendous amount of automation, orchestration, security, and monitoring.  Any failure to provide these services can create serious operational challenges. 

As we enter the ESnet6 era, ESnet is dedicated to relentlessly pushing the boundaries of operational excellence and to seeking out and reducing operational risks. Our new High Availability Services Site (HASS) in San Jose, CA, will be critical to realizing those goals in our computing platforms and services. ESnet’s HASS will soon provide fully redundant network operations platforms, allowing us to seamlessly maintain services if our operations at LBL are disrupted.

For about a decade, ESnet has augmented its data center operations at Berkeley Lab in California with a small footprint at Brookhaven National Laboratory in New York. This has allowed us to synchronize important information across two sites and to run multiple instances of important services to ensure operational continuity in the case of a failure. While this has provided great stability and reliability, there are limitations. In particular, the 2,500-mile gap across the continent means ESnet cannot restore operations without some delay, because some key services must be transitioned manually. HASS will enable seamless operational continuity, since the shorter distance between Berkeley and San Jose will let us automatically maintain the active synchronization of operational platforms.

Deployment of HASS is a joint effort of our ESnet Computing Infrastructure, Network Engineering, and Security teams, working together to architect and deploy the next evolution in our computing and service reliability strategy. After finalizing our requirements, we are now working with Equinix, a commercial colocation provider, to deploy a site adjacent to the ESnet6 network. Equinix was able to provide a secure suite in their San Jose facility, and this location gives us the capacity and physical adjacency we require to directly connect this suite to ESnet6 and reach our Berkeley data center comfortably within our demanding latency goals (10 ms or less).
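A back-of-the-envelope check shows why distance matters for that latency goal. The sketch below estimates the minimum round-trip time imposed by fiber propagation alone. It assumes light travels through fiber at roughly 200,000 km/s and that the Berkeley to San Jose fiber path is on the order of 50 miles; both figures are assumptions for illustration, not ESnet measurements, and real paths add routing and equipment delay on top.

```python
FIBER_KM_PER_SECOND = 200_000.0  # assumed: roughly two-thirds the speed of light in vacuum
KM_PER_MILE = 1.609344


def min_rtt_ms(distance_miles: float) -> float:
    """Return the propagation-only round-trip time, in milliseconds,
    over a fiber path of the given one-way length."""
    if distance_miles < 0:
        raise ValueError("distance must be non-negative")
    one_way_km = distance_miles * KM_PER_MILE
    return 2 * one_way_km / FIBER_KM_PER_SECOND * 1000


if __name__ == "__main__":
    LATENCY_GOAL_MS = 10.0  # the goal stated in the text
    # 2,500 miles is the cross-country gap from the text; 50 miles is an assumed local path.
    for label, miles in [("cross-country", 2500), ("Berkeley to San Jose (assumed)", 50)]:
        rtt = min_rtt_ms(miles)
        verdict = "within" if rtt <= LATENCY_GOAL_MS else "exceeds"
        print(f"{label}: at least {rtt:.2f} ms round trip ({verdict} the 10 ms goal)")
```

Under these assumptions the cross-country path costs about 40 ms per round trip before any equipment delay, while the local path stays under 1 ms, which is what makes continuous synchronization between the two Bay Area sites practical.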

As part of standing up HASS, we’ll be installing a new routing platform with a 100G upstream connection to ESnet6 in both San Jose and Berkeley. We’ll also be installing new high-performance switching platforms, security services (high-throughput firewalls, tapping, black hole routing, etc.), virtualization resources, and several other redundant internal operational platforms. Our existing virtualization platform (ESXi/vSAN) will “stretch” into the new space as part of the same logical cluster we operate in Berkeley. Once this is deployed, even networking services that lack native high-availability capabilities will be able to simply “float” between the two physical data centers, with data mirrored and striped across both sites.

We’re very excited by the addition of the San Jose HASS. In combination with existing reliability resources at Brookhaven, it will continue to ensure that ESnet6 can meet the scientific networking community’s needs for service hosting, disaster recovery, and offsite data replication.