New Ground-Shaking Science with Dark Fiber

As a Network Engineer at ESnet, I am no stranger to the importance of designing and maintaining a robust fiber-optic network. To operate a network that will “enable and accelerate scientific discovery by delivering unparalleled network infrastructure, capabilities, and tools,” ESnet has acquired an impressive US continental footprint of more than 21,000 kilometers of leased fiber-optic cable. We spend a great deal of effort designing and sourcing redundant fiber-optic paths to support network data connectivity between scores of DOE Office of Science facilities and research collaborators across the country.

But network data transfer is only one of the uses for fiber-optic cable. What about using buried fiber-optic cable for some truly “ground-shaking” science? The answer is “Yes, absolutely!” – and I was fortunate to play a part in exploring new uses for fiber-optic cable networks this past year.

Back in 2017, the majority of our 21,000 km fiber footprint was still considered “dark fiber,” meaning it was not yet in use. At that time, ESnet was actively working on the design to upgrade from our current production network “ESnet5” to our next-generation network “ESnet6,” but we hadn’t yet put our fiber into production.

At the same time, Dr. Jonathan Ajo-Franklin, then-graduate students Nate Lindsey and Shan Dou, and Berkeley Lab’s Earth and Environmental Sciences Area (EESA) were exploring the use of distributed acoustic sensing (DAS) technology, which sends laser pulses down buried fiber-optic cable and measures the backscattered light to detect seismic waves. The timing was perfect to expand the short-range tests that Dr. Ajo-Franklin and his team had been performing at the University of California’s Richmond Field Station onto a section of the unused ESnet dark fiber footprint in the West Sacramento area. ESnet’s own Chris Tracy worked with Dr. Ajo-Franklin and his team to demonstrate that the underground fiber-optic cables running from West Sacramento northwest toward Woodland in California’s Central Valley made an excellent sensor platform for early earthquake detection, groundwater monitoring, and mapping new sources of potential geothermal energy.
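
In a DAS deployment, an interrogator fires repeated laser pulses into the fiber, and the backscattered light turns every few meters of cable into a virtual strain sensor, so the raw output is effectively thousands of seismograms recorded side by side. A classic way to pick an event out of a single channel is a short-term-average/long-term-average (STA/LTA) trigger. The sketch below runs on synthetic data and is purely illustrative of the technique, not the team’s actual processing pipeline:

```python
# Toy STA/LTA event trigger on one synthetic DAS channel.
import numpy as np

def sta_lta(trace, fs, sta_win=0.5, lta_win=10.0):
    """Ratio of short-term to long-term average signal energy."""
    sta_n, lta_n = int(sta_win * fs), int(lta_win * fs)
    energy = trace ** 2
    csum = np.cumsum(energy)
    sta = (csum[sta_n:] - csum[:-sta_n]) / sta_n   # running means via
    lta = (csum[lta_n:] - csum[:-lta_n]) / lta_n   # cumulative sums
    n = min(len(sta), len(lta))
    return sta[-n:] / np.maximum(lta[-n:], 1e-20)

fs = 100.0                                  # 100 samples/s
rng = np.random.default_rng(0)
trace = rng.normal(0.0, 1.0, 6000)          # 60 s of background noise
t = np.arange(200) / fs
trace[3000:3200] += 10 * np.sin(2 * np.pi * 5 * t)   # 2 s "quake" burst
ratio = sta_lta(trace, fs)
print("event detected:", bool((ratio > 5).any()))    # True
```

Production DAS processing typically adds bandpass filtering and checks for coherent arrivals across thousands of channels, but the triggering idea is the same.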

The Sacramento ESnet Dark Fiber Route (left) and seismic events recorded on the array from around the world, including the massive M 8.1 earthquake in Chiapas, Mexico.

Fast forward to May 2019, and Dr. Ajo-Franklin was heading up a new collaborative scientific research project for the DOE’s Geothermal Technologies Office, building on his prior DAS successes using ESnet fiber. The intent was to map potential geothermal energy locations in California’s Imperial Valley south of the Salton Sea, near Calipatria and El Centro. The team, including scientists at EESA, Lawrence Livermore National Laboratory (LLNL), and Rice University, needed a fiber path to conduct the experiment. It would make sense to assume that ESnet’s fiber footprint, which runs through that area, would be an excellent candidate. Fortunately for ESnet’s other users, but unfortunately for the DAS team, by 2018 the ESnet6 team was already “lighting” this previously dark fiber.

However, just because ESnet fiber in the Imperial Valley was no longer a candidate for DAS-based experiments, that didn’t mean there weren’t ways to gain access to unused dark fiber. For every piece of fiber that has been put into production to support ESnet6, there are dozens if not hundreds of other fibers running right alongside it. When fiber-optic providers install new fiber paths, they pull large cables consisting of many individual fibers to lease or sell to as many customers as possible. Because the ESnet fiber footprint was running right through the Imperial Valley, we knew that there was likely unused fiber in the ground; we only had to find a provider willing to lease a small section to Berkeley Lab for Dr. Ajo-Franklin’s experiment.

Making the search a little more complicated, the DAS equipment used for this experiment has an effective sensing range of less than 30 kilometers. Most fiber providers expect to lease long sections of fiber connecting metropolitan areas; the fiber circuits that run through the Imperial Valley, for example, are actually intended to connect metropolitan areas of Arizona to large cities in Southern California. Finding a provider willing to break up a continuous 600 km circuit connecting Phoenix to Los Angeles just to sell a 30 km piece for a year-long research project would be a difficult task.

One of my contributions to the ESnet6 project was sourcing new dark fiber circuits and data center colocation spaces to “fill out” our existing footprint and get ready for our optical system deployments. Because of those efforts, I knew that across the country there were often entire sections of fiber that had been damaged and would likely not be repaired until a new customer wanted to lease them. I was asked to help Dr. Ajo-Franklin and his team engineer a new fiber solution for the experiment; I just had to find someone willing to lease us one of these small damaged sections.

After I spoke with many providers in the area, the communications company Zayo found a section of fiber starting in Calipatria, heading south through El Centro, and then west to Plaster City that was a great candidate for DAS use. This section of fiber had been accidentally cut near Plaster City and was considered unusable for networking purposes. Working with Zayo, we negotiated a lease on this “broken” fiber span, along with a small amount of rack space and power to house the DAS equipment that Dr. Ajo-Franklin’s team would need to move forward with their research.

The Imperial Valley Dark Fiber Array: (A) Team Co-PI Veronica Rodriguez Tribaldos (LBNL) turning on the DAS system. (B) The in-line amplifier (ILA) facility in Calipatria used to house the equipment. (C) The Zayo fiber section currently being used in the experiment. (D) The corresponding DAS data showing a magnitude 2.6 earthquake located near the Salton Sea, to the north.

This cut fiber segment was successfully “turned up” for the project on November 10, 2020, by a team including Co-PI Veronica Rodriguez Tribaldos, Michelle Robertson, and Todd Wood (EESA/LBNL), and the seismic data collection equipment is now up and running. Panel (D) of the figure above shows some great initial data recorded on the array: a small earthquake many miles to the north. There will be many more articles and reports from the Imperial Valley Dark Fiber Team as they continue to gather data and perform their experiments, and I’m sure we’ll begin to see fiber across the country put to use for this type of sensing and research.

I’ve had a great experience working with the different groups that were assembled for this project. By seeing how new technologies and methods are being developed to use fiber-optic cable for important research outside of computing science, I’ve developed a greater appreciation for how our labs and universities are tackling some of our biggest energy and public safety challenges.

Three Questions with Asma Aldaghar

Three questions with a new staff member on our Networking Engineering Team!

Asma earned her Bachelor’s in Computer Science from the Higher Colleges of Technology in Dubai, where she majored in Network Sciences and Engineering. In Dubai, she was a member of IEEE Women in Engineering (WIE), one of the first organizations to recognize women’s presence in engineering in the UAE, and participated in many WIE summits. Asma left Dubai and immigrated to California, where she has worked as a Network Engineer in the San Francisco Bay Area for multinational corporations such as Google and Amazon, in Los Angeles for AT&T, and in the Central Valley as a Technical Infrastructure Lead for the Turlock Irrigation District. Beyond network engineering, Asma is also keenly interested in scripting, virtualization, automation, building databases, and working with open-source operating systems.

In her personal time, Asma enjoys reading, traveling, hiking and baking vegan goods.

Question 1: What brought you to ESnet?

I was introduced to ESnet and LBNL through my professor, who also happens to work at LLNL. After hours of research on the ESnet public website, I was impressed by the lab’s accomplishments and future projects, specifically the ones focused on providing network services for national labs and some international research facilities. At this stage of my career, I wanted to be part of an organization with an impactful mission that goes beyond the bottom line. ESnet seems to satisfy both my professional and personal interests, and I am thrilled about this opportunity!

Question 2: What is the most exciting thing going on right now?

Automation! The vast majority of networking tasks are still executed manually, which can be taxing in time and effort for network engineers. Incorporating automation into network services will help manage repetitive tasks and consequently improve network availability.
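
As a flavor of what that automation looks like, here is a minimal sketch that collects the same health check from a fleet of routers using the open-source Netmiko library. The hostnames, platform, and credentials are placeholders rather than real ESnet devices:

```python
# A sketch of automating one repetitive task: running the same status
# command across many routers. All device details below are hypothetical.
from netmiko import ConnectHandler

ROUTERS = ["rtr-example-1.example.net", "rtr-example-2.example.net"]

def interface_report(host):
    device = {
        "device_type": "juniper_junos",   # assumes a Junos-style CLI
        "host": host,
        "username": "netops",
        "password": "CHANGE-ME",          # use SSH keys or a vault in practice
    }
    conn = ConnectHandler(**device)
    try:
        return conn.send_command("show interfaces terse")
    finally:
        conn.disconnect()

for host in ROUTERS:
    print(f"=== {host} ===")
    print(interface_report(host))
```

The same few lines, scheduled and paired with configuration templates, are how repetitive checks and changes stop consuming engineer hours.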

Question 3: What book do you recommend?

Big Farms Make Big Flu by Rob Wallace. Looking at our current situation with this deadly pandemic, it’s very important to educate ourselves about how we got here. Apart from the fact that I learned a lot from Rob Wallace’s extraordinary analysis of our current agricultural practices, I also made significant changes in my daily life (a plant-based diet, awareness of ethical trade, and supporting sustainable energy).

Looking back at ESnet’s 2020

Advancing our strategy and shaping our position on the board.
Some thoughts from Inder on the year-that-was.

Miniature from Alfonso X’s Libro del axedrez dados et tablas (Book of chess, dice and tables), c. 1283. Public domain, via Wikimedia Commons.

Dear Friends, Well-wishers, Colleagues, and all of ESnet,

Chess! 2020 has been much more challenging than this game. It’s also been a year in which we communicated through the squares of our Zoom screens, filled with the faces of our colleagues, collaborators, and loved ones.

In January, Research and Education leaders came together in Hawaii at the Pacific Telecommunications Council meeting to discuss the future of networking across the oceans. It was impossible to imagine then that we would not be able to see each other again for such a long time. Thanks to those underwater cables, though, we have been able to communicate seamlessly across the globe.

Looking back at 2020, we not only established a solid midgame position on our ESnet chessboard, but succeeded in ‘winning positions’ despite the profound challenges. The ESnet team successfully moved our network operations to be fully remote (and 24/7) and accomplished several strategic priorities. 

ESnet played some really interesting gambits this year: 

  1. Tackled COVID-related network growth and teleworking issues for the DOE complex
    • We saw a 4x spike in remote traffic and worked closely with several Labs to upgrade their connectivity. We continue to address the ever-growing demand in a timely manner.

    • As we all shifted to telework from home, ESnet engineers developed an impromptu guide that proved valuable for troubleshooting our home connectivity issues.
  2. Made great progress on implementing our next-generation network, ESnet6
    • We deployed and transitioned to the ESnet6 optical backbone network, with 300 new site installations and hundreds of 100G waves provisioned, in just six months of effort and while following pandemic safety constraints. I am grateful to our partners Infinera (Carahsoft) and Lumen for working with our engineers to make this happen. Check out below how we decommissioned the ESnet5 optical network and lit up the ESnet6 network.
    • Installed a brand-new management network and upgraded our security infrastructure, along with significant performance improvements.
    • We awarded the new ESnet6 router RFP (Congratulations Nokia and IMPRES!); the installs start soon.
    • Issued another RFP for optical transponders, and will announce the winner shortly.
  3. Took initiative on several science collaborations to address current and future networking needs
    • We brainstormed new approaches with the Rubin Observatory project team, Amlight, and DOE and NSF program managers to meet the performance and security goals for traffic originating in Chile. This traffic moves across several countries in South America before reaching the continental U.S. in Florida (Amlight) and, eventually, the U.S. Data Facility at SLAC via ESnet.
    • Drew insights through deep engagement between ESnet engineers and High Energy Physics program physicists, to serve the data needs of their current and planned experiments expediently.
      Due to the pandemic, a two-day immersive in-person meeting turned into a multi-week series of Zoom meetings, breakouts, and discussions.
    • When an instrument produces tons of data, how do you build the data pipeline reliably? ESnet engineers took on this challenge, and worked closely with the GRETA team to define and develop the networking architecture and data movement design for this instrument. This contributed to a successful CD 2/3 review of the project—a challenging enough milestone during normal times, and particularly tough when done remotely. 
    • Exciting opening positions were created with EMSL, FRIB, DUNE/SURF, LCLS-II…these games are still in progress, more will be shared soon. 
  4. Innovated to build a strong technology portfolio with a series of inspired moves
    • AI/ML
      • We demonstrated Netpredict, a tool using deep learning models and real-time traffic statistics to predict when and where the network will be congested. Mariam’s web page showcases some of the other exciting investigations in progress. 
      • Richard and his collaborators published Real-time flow classification by applying AI/ML to detailed network telemetry.
    • High-touch ESnet6 project
      • Ever dream of having the ability to look at every packet, a “packetscope,” at your fingertips? An ability to create new ways to troubleshoot, performance-engineer, and gain application insights? We demonstrated a working prototype of that vision at the SC20 XNET workshop.
    • SENSE
      • We deployed a beta version of software that gives science applications the ability to orchestrate large data flows across administrative domains securely. What started as a small research project five years ago (Thanks ASCR!) is now part of the AutoGOLE project initiative, in addition to being used for the Exascale Computing Project (ECP) application ExaFEL.
    • TCP
      • Initiated the Q-Factor project this year, a research collaboration with Amlight, funded by NSF. The project will enable ultra-high-speed data transfer optimization by TCP parameter tuning through the use of programmable dataplane telemetry: https://q-factor.io/
      • We thoroughly tested on our testbed the interactions between the TCP congestion-control algorithms BBRv2 and CUBIC (a minimal sketch of the socket-level knob at the heart of such tests appears just after this list). A detailed conversation with Google, the authors of the BBRv2 implementation, is in progress.
  5. Initiated strategic new games, with a high potential for impact
    • FABRIC/FAB
      • Executed on the vision and design of a nationwide at-scale research testbed, working alongside a superstar multi-university team.
      • With the new FAB grant, FABRIC went international with plans to put nodes in Bristol, Amsterdam, Tokyo and Geneva. More locations and partners are possibilities for the future.  
    • Edge Computing
      • Created a prototype FPGA-based edge-computing platform for data-intensive science instruments in collaboration with the Computational Research Division and Xilinx. Look for exciting news on the blog as we complete the prototype deployment of this platform.
    • Quantum
    • 5G
      • What are the benefits of widespread deployment of 5G technology on science research? We contributed to the development of this important vision at a DOE workshop. New and exciting pilots are emerging that will change the game on how science is conducted. Stay tuned. 
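
As promised in the TCP item above, here is a minimal, Linux-only sketch of selecting the congestion-control algorithm a given socket uses, the knob at the center of BBRv2-versus-CUBIC comparisons. The host and port are placeholders, and the real testbed methodology involves much more (pacing, telemetry, repeated transfers):

```python
# Illustrative only: pick a TCP congestion-control algorithm per socket.
# Linux exposes this via the TCP_CONGESTION socket option; the available
# algorithms are listed in /proc/sys/net/ipv4/tcp_available_congestion_control
# (BBR requires the tcp_bbr kernel module to be loaded).
import socket

def connect_with_cc(host, port, algo="cubic"):
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Must be set before connect() so the algorithm governs the whole flow.
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, algo.encode())
    s.connect((host, port))
    return s

# example.org:80 is a placeholder endpoint, not part of the actual tests.
sock = connect_with_cc("example.org", 80, algo="bbr")
actual = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, 16)
print("congestion control in use:", actual.split(b"\x00")[0].decode())
sock.close()
```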

Growth certainly has its challenges. But, as we grew, we evolved from our old game into an adept new playing style. I am thankful for the trust that all of you placed in ESnet leadership, vital for our numerous, parallel successes. Our 2020 reminds me of the scene in Queen’s Gambit where the young Beth Harmon played all the members of a high-school chess team at the same time. 

Several achievements could not make it to this blog, but are important pieces on the ESnet chess board. They required immense support from all parts of ESnet, CS Area staff, Lab procurement, Finance, HR, IT, Facilities, and Communications partners.

I am especially grateful to the DOE Office of Science, Advanced Scientific Computing Research leadership, NSF, and our program manager Ben Brown, whose unwavering support has enabled us to adapt and execute swiftly despite blockades. 

All this has only been possible due to the creativity, resolve, and resilience of ESnet staff — I am truly proud of each one of you. I am appreciative of the new hires who entrusted their careers to us and joined us remotely—without shaking hands or even setting foot in the lab.

My wish is for all to stay safe this holiday season, celebrate your successes, and enjoy that extra time with your immediate family. In 2021, I look forward to killer moves on the ESnet chessboard, while humanity checkmates the virus. 

Signing off for the year, 

Inder Monga

ESnet Builds Morale and Community With a Zoom Competition

Nearly two months into California’s shelter-in-place order, we’ve all been in more than our fair share of video conferences. To boost morale during this difficult time, the Energy Sciences Network (ESnet) staff held a Zoom Background Competition during their all-to-all meeting on Monday, April 27. 

Staff were encouraged to create their own backgrounds and display them during the meeting. There were 21 entries, and ESnet employees voted. Submissions were judged on overall artistry, functionality (not too distracting as a background), whether they elevated the voter’s mood, and whether they made the voter feel included in the ESnet community.

The top three winners got bragging rights. Here they are:

First place: Jeff Berman, NOC Engineer

This Zoom challenge inspired Berman, an avid sailor, to take to the sea. He won the competition with an hour-long video of the San Francisco skyline, one he filmed while sailing on the Bay. Although he typically likes to go sailing with friends and family, he says that sailing solo brings him a sense of peace, calm, and tranquility.

“What is sailing? Most books define it as hours of sheer boredom scattered with white knuckle periods of terror. On a good day, both are true. Both give you an equal sense of accomplishment. How to be with yourself with nothing to do, good training for our current situation,” said Berman.

Second Place: Sartaj Baveja, Software Engineer

This challenge inspired Baveja to create a background meme of office life. In the background, someone (Baveja) is looking over your shoulder to catch a glimpse of your screen and make sure you don’t procrastinate.

Sartaj Baveja

Third Place: Joe Metzger, Network Engineer

This challenge inspired Metzger to use a picture that he took in Barcelona. The focal point of the picture (the blur) is a little girl in a red coat, black dress, and white tights who was just running back and forth between the pools of light and shadow created by the stone arches and rosette windows while her family was sitting in the cafe.

“I used this as my Zoom background because I think it is a really cool picture. It brings to mind a fun evening of strolling around the little squares and back streets of Barcelona and relaxing in cafes with a good glass of wine,” said Metzger.

Girl in Red

Written by Linda Vu, Berkeley Lab Computing Sciences.

The Risks of Not Deploying IPv6 in the R&E Community

Observations from ESnet’s resident IPv6 Expert Michael Sinatra

When having discussions with CIOs of various colleges, universities, and national laboratories, I often hear about issues such as “risk,” “return on investment,” “up-front costs,” “CAPEX/OPEX,” and the like. When the topic turns to IPv6, costs are cited, as well as potential risks involved with adopting IPv6. However, any good risk assessment should include the risks and costs of not doing something as well as of doing it. Until recently, much of the risk of not deploying IPv6 centered on running out of IPv4 addresses and not much more. Organizations that had a lot of IPv4 addresses (or thought they did) presumably didn’t have to consider such risks. In the discussion below, I note several more risks of not deploying IPv6, advantages of IPv6, and reasons to move forward. This discussion can be combined with the more traditional risks and costs associated with deploying IPv6 to provide the seeds of a more complete risk assessment.

Adoption, not migration

It’s important to understand that adoption, not migration, is the principal concern. It is widely understood that IPv4 will remain active for some time and that a dual-stack environment (where networks, computers, and other devices all run both IPv4 and IPv6 simultaneously) is still the “best” way to achieve an IPv6 transition. My concern is principally with the adoption portion, where we add IPv6 functionality to networks and hosts in order to achieve a dual-stack environment. Indeed, this assumes a reasonable abundance of IPv4 addresses within an organization.

Risk 1: Security

Conventional wisdom has it that adopting IPv6 brings with it a range of security issues, largely owing to the traditionally poor support for IPv6 in security appliances. Although open-source firewalls have long had near-parity between IPv4 and IPv6, many proprietary firewall and IDS devices have lacked sufficient IPv6 features. While it is true that security equipment has in the past turned a blind eye to IPv6, this is changing rapidly as vendors move to support IPv6.

Nevertheless, there are risks to ignoring IPv6 on the campus or the lab site. Because of the widespread and largely “default-on” support of IPv6 tunneling technologies, such as 6to4 and Teredo, IPv6 tunnels can and do easily exist on IPv4-only networks. Security devices which don’t understand IPv6 are unlikely to understand these tunneling technologies, and they will be unable to peel open the tunnel layers to see what’s really going on inside the tunnels.

Many black-hats and grey-hats understand this and will attempt to use IPv6 to transport illegal peer-to-peer content and malware. If the bad guys are adopting IPv6, the good guys need to adopt it as well so they can see the bad stuff and clean it up. Some IDSes and firewalls do understand IPv6, and some even understand the tunneling protocols. While it’s possible to block some of the tunneling protocols wholesale, it isn’t easy to block them all, especially if there are legitimate users of tunnels on your network. Moreover, blocking protocols at the border or at a router doesn’t block them within your enterprise or within individual LANs.
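
To make the tunneling point concrete: 6to4 and 6in4 carry IPv6 inside IPv4 packets whose IP protocol field is 41, while Teredo hides inside UDP (commonly port 3544). Below is a rough sketch of flagging protocol-41 traffic with the open-source scapy library; it is illustrative monitoring only (it requires root privileges) and is no substitute for an IPv6-aware IDS:

```python
# Watch an "IPv4-only" network for 6to4/6in4 tunnels: IPv6 encapsulated
# directly in IPv4 uses IP protocol number 41.
from scapy.all import IP, IPv6, sniff

def report(pkt):
    msg = f"tunneled IPv6: {pkt[IP].src} -> {pkt[IP].dst}"
    if IPv6 in pkt:  # scapy decodes the encapsulated IPv6 header for us
        msg += f" (inner {pkt[IPv6].src} -> {pkt[IPv6].dst})"
    print(msg)

# The BPF filter matches IPv4 packets whose protocol field equals 41.
sniff(filter="ip proto 41", prn=report, store=False)
```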

If you haven’t been including IPv6 support (and, ideally, feature parity between IPv4 and IPv6) in your purchasing decisions and RFPs for security equipment, you are exposing yourself to this risk. Ignoring IPv6 simply won’t keep it off your network. However, adopting IPv6 as a natively routed protocol will bring most of the tunneled traffic out in the open as native IPv6 traffic, where it will be easier to detect anomalies. That, combined with making IPv6 a key consideration in purchasing decisions, will help mitigate the security risk.

Of course, purchasing life-cycles often span three- or five-year periods. That’s why it’s crucial to start thinking about IPv6 now, so that you can get IPv6 requirements embedded into purchasing processes before you really need IPv6. This is true for both network (routers, switches) and security equipment. Realizing you need IPv6 after a purchasing cycle has completed is not a good position to be in.

Risk 2: Your eyeballs

Of course, I am speaking here of “eyeball networks”: networks with client computers that access content and data. For colleges and universities, these include your wireless networks, your residence hall networks, and lab and research networks that consume and process data. That last category is also prevalent at national lab sites. Many of these organizations feel that they have enough IPv4 address space to satisfy network growth for several years. However, that does not mitigate the risks these organizations face: that eyeball networks, even those with abundant IPv4 address space, will still need access to IPv6 content and data, and that “several years’” worth of IPv4 address space still may not be enough.

As is the case with security, ignoring the need for IPv6 on those eyeball networks also poses risks. While users may be perfectly happy consuming content and data over IPv4, there is no guarantee that that content and data will always be available over IPv4. Indeed, with the recent run-out of IANA’s IPv4 free pool, and of the Asia-Pacific region’s IPv4 address space, IPv4 address space is becoming scarce in the larger Internet. Secondary markets are beginning to open in IPv4 address space, and prices thus far have been around $10 per IP address–far more than most R&E organizations are used to spending (or, for that matter, can afford). Government and foundation grants are unlikely to support shopping for IPv4 addresses on the secondary market, and funders will tend to view IPv6 as a more viable (and cheaper) alternative.

New colleges and universities will not have access to an abundance of IPv4 addresses. Moreover, as new scientific sites and special instruments come online around the world, it is increasingly likely that those (especially in Europe and Asia) will have access to fewer and fewer IPv4 addresses. These research centers will have two options: entirely forgo IPv4 addresses, or get a very small number of IPv4 addresses and run some sort of NAT and/or protocol translation to support IPv4. In this latter scenario, IPv6 can be supported end-to-end, due to address abundance, while the limited IPv4 space requires middleboxes.

ESnet’s work with the Science DMZ concept has revealed that middleboxes have a detrimental effect on network performance for data-intensive science applications. The lack of a clean end-to-end path for IPv4 will mean that IPv4 can only be supported as a legacy “slow-path” protocol. For real performance, IPv6 will be necessary. Even legacy support for IPv4 may not be available in certain regions for much longer.

A reasonable scenario that may be encountered within the IT departments of research institutions–universities and national labs–is as follows: faculty members and research staff will need access to data from a particular instrument or reactor. They will either be unable to get the data they need over IPv4, or they will have to go through a middlebox or bastion to get to the data, with serious performance implications for the researchers. They will approach the IT department and request IPv6. Lead times for such requests often run on the order of hours or days, which raises the question: will you be ready when a researcher comes to you and asks for IPv6 connectivity in her lab “by the end of the week”?

Risk 3: Your Content

As the developing world mobilizes economically, corporations, foundations, entrepreneurs, benefactors, and prospective students will increasingly hail from countries such as China, India, and Brazil, and from other parts of Asia and Latin America. These prospective donors, collaborators, and scholars will need access to information resources at your university or lab. And increasingly, these people will have better access to IPv6 than to IPv4. The next-generation research network in China, CERNET2, is IPv6-only, for example.

Moreover, this is not solely true in the developing world. As ISPs in the US and Europe become further strapped for IPv4 resources, they will turn to large-scale NAT (LSN, aka “carrier-grade NAT”). LSN promises to reduce performance and increase troubleshooting headaches. Avoiding these pitfalls requires IPv6 support, which is why large ISPs like Comcast are proceeding aggressively with IPv6 adoption. As the effects of IPv4 run-out become more pronounced, more people will be trying to access your information resources via IPv6. Will they be able to reach you easily?

As campus development and public-affairs offices continue to push the outreach envelope, using social media and a variety of Internet-based technologies, they will want to ensure that they are reaching the maximum range of benefactors and prospective students. How will you answer the vice president of development when he asks you if your institution is doing all it can to make information resources available to the maximum range of prospective donors? How will you respond when a researcher at your lab site asks about how to improve real-time collaboration experiences with partners in India? How will you assure the director of admissions that prospective students in China, India, and Brazil will have easy access to admissions materials and information about programs of study? IPv6 plays an important role in answering these questions. How well you answer them depends on how well positioned you are for IPv6 adoption.

Risk 4: Even if you have a lot of IPv4 addresses, you don’t have enough

There are a lot of applications for which NAT, or the use of private IPv4 addresses without NAT, is “good enough.” Home networks frequently use NAT. Large high-performance compute clusters (HPCCs) frequently use private IPv4 space to number internal nodes. Small labs at your campus or site may use consumer-grade NAT devices. In many cases, these devices work fine. But often, the network could work much better if each device could be individually addressed.

In the case of HPCCs, I frequently encounter cases where two clusters at disparate sites use the same private IPv4 range (usually the lower end of 10.0.0.0/8). When the cluster owners decide to connect the HPCCs together via a private layer-2 or layer-1 (wave) link, they suddenly have address collisions. I have seen several cases where rounds of iterative negotiation are needed to properly renumber hosts into non-colliding ranges. Surprisingly, this is not infrequent in HPCCs, and it certainly has the potential to occur in many other applications. IPv6 solves this problem in two ways. First, because of its massive address space, a chunk of globally routable address space can be used to number hosts with little impact; in the HPCC example, a single /64 can number all of the internal nodes in both clusters! Second, even if a similar “private” address space is desired, IPv6 provides a mechanism called Unique Local Addressing (ULA), which allows different sites to “create” their own private address space. The algorithm specified by RFC 4193 gives a high likelihood of uniqueness, so that if and when clusters are eventually merged, address collisions won’t occur. ULAs aren’t an exact replacement for IPv4 private addresses, but they are useful in certain circumstances, such as this HPCC example.
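
For the curious, the RFC 4193 algorithm is simple enough to sketch: hash the current time together with a machine identifier, keep the least-significant 40 bits of the digest as the Global ID, and append it to the fd00::/8 prefix to form a /48. The version below substitutes the host MAC for a proper EUI-64, so treat it as an illustration of the idea rather than a strictly compliant implementation:

```python
# Pseudo-random ULA /48 prefix in the spirit of RFC 4193.
import hashlib
import ipaddress
import time
import uuid

def generate_ula_prefix():
    # 64-bit NTP-format timestamp (seconds since 1900, fixed-point).
    ntp_time = int((time.time() + 2208988800) * 2**32).to_bytes(8, "big")
    mac = uuid.getnode().to_bytes(6, "big")   # stand-in for an EUI-64
    digest = hashlib.sha1(ntp_time + mac).digest()
    global_id = int.from_bytes(digest[-5:], "big")   # low 40 bits
    prefix_int = (0xFD << 120) | (global_id << 80)   # fd00::/8 + Global ID
    return ipaddress.IPv6Network((prefix_int, 48))

print(generate_ula_prefix())   # e.g. fd3b:19c2:8a7e::/48 (varies per run)
```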

In this case, the use of IPv6 lowers the risk of escalating OPEX costs in maintaining a private address space. Moreover, the internal nodes could be numbered using EUI-64-based stateless address autoconfiguration (SLAAC), further lowering costs. Because of the closed nature of the network, SLAAC may be a good candidate for an easy and maintainable configuration. Using EUI-64, which derives the address from the hardware address of the physical interface, makes documentation easy (hardware addresses are often readily known in these clusters, so hosts files and internal DNS can easily be generated from the known MAC addresses) and greatly reduces the likelihood of numbering collisions.
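
The EUI-64 expansion itself is mechanical, which is exactly why generated hosts files and DNS can stay in sync with hardware inventory: split the 48-bit MAC, insert ff:fe in the middle, and flip the universal/local bit. A small sketch (the prefix and MAC below are made up):

```python
# Derive a SLAAC-style address from a MAC using the EUI-64 rules.
import ipaddress

def eui64_address(prefix, mac):
    octets = bytearray(int(x, 16) for x in mac.split(":"))
    octets[0] ^= 0x02                      # flip the universal/local bit
    iid = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    net = ipaddress.IPv6Network(prefix)
    return ipaddress.IPv6Address(
        int(net.network_address) | int.from_bytes(iid, "big"))

print(eui64_address("fd12:3456:789a::/64", "02:16:3e:5d:1a:0b"))
# -> fd12:3456:789a:0:16:3eff:fe5d:1a0b
```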

While large-scale HPCCs can benefit from IPv6, it can also help with small-scale NAT installations. Even on my own home network, I find IPv6 to be valuable over the existing IPv4 NAT system. I often need to manage individual hosts, and sometimes I need end-to-end transparency for such things as video and voice conferencing. Using IPv6 is much easier and more efficient than trying to poke holes or configure special redirects in my NAT box. I can access individual hosts at home directly over IPv6 without having to go through a NAT box. Now, some people may view this as a security risk: to them, having my hosts “exposed” on the Internet is a big risk that NAT otherwise “solves.”

I view this security issue not as a risk but as a benefit. Instead of my security policy being dictated by the technology, I am able to develop my own security policy for my home network and use stateful firewalls to enforce that policy technically. Moreover, this produces a much cleaner security policy than having to place redirects and other kludges in my NAT configuration to support video conferencing and other needs.

IT readiness: The overarching risk

How many large-scale IT projects in your organization finish on schedule, let alone can be completed on short notice? When that PR person or researcher comes to you and needs IPv6 “real soon now,” do you really want to be in the position of having never enabled IPv6 on any of your networks, or–worse yet–having no plan for IPv6 adoption? You certainly don’t want to wait until a prominent member of your scientific staff or faculty is demanding IPv6 on her network before you even start thinking about IPv6, do you?

IT projects are hard. They have a lot of dependencies. They have a lot of unforeseen obstacles. And, of course, there are risks that arise from deploying IPv6, as there are with any large IT project. The big problem for IT managers will occur when the risks of not deploying IPv6 begin to outweigh the risks of deploying IPv6, and there’s suddenly a lot of pressure to move forward quickly. How do you ensure that things aren’t moving too quickly?  How do you mitigate all of the risks, or at least the majority?

There is a simple answer: start before you really need to. Ideally, you should have already started, and you may even be enabling IPv6 on networks and services right now. But if you haven’t begun yet, now is the time to start. There are a number of things you need to do just to get going, and ESnet has put together a useful checklist to get you started. You’re going to run into problems; it will definitely not all be smooth sailing. That’s why you need to start adopting IPv6 before one of the risks I have identified comes to bear. Even if you can’t deploy IPv6 on your production network just yet, you can get a feel for how it works and what the pitfalls are by creating a special “IPv6 DMZ.” Better yet, if you plan to build a Science DMZ, make sure that it supports both IPv4 and IPv6 from day one. That will go a long way toward ensuring that you and your colleagues and staff fully understand IPv6, and it will provide improved connectivity options for the Science DMZ itself.

By now it should be clear that none of the risks I have identified are mitigated in any way by how much IPv4 address space you have. Simply put, having lots of IPv4 address space–even your own /8–is not reason enough to delay IPv6 implementation for one second. Your IPv4 address space no longer matters in the IPv6 equation. In this increasingly interconnected world, you need to be able to reach everyone else, and they need to be able to reach you. If you delay adopting IPv6, you make it less likely that your resources will be available to all, and that poses risks to you and your institution.

I.T. in-depth at DUSEL

This guest blog is contributed by Warren Matthews, Cyber-Infrastructure Chief Engineer at the Deep Underground Science and Engineering Lab (DUSEL).

Guest Blogger: Warren Matthews, DUSEL

The Deep Underground Science and Engineering Laboratory (DUSEL) is a research lab being constructed in the former Homestake gold mine in Lead, South Dakota, now resurrected to mine data about the earth, new life forms, and the universe itself. When finished, DUSEL will explore fundamental questions in particle physics, nuclear physics and astrophysics. Biologists will study life in extreme environments. Geologists will study the structure of the earth’s crust. Early science programs have already begun to explore some of these questions. In addition, DUSEL education programs are underway to inspire students to pursue careers in science, technology, engineering, and mathematics. This interdisciplinary collaboration of scientists and engineers is led by the University of California at Berkeley and the South Dakota School of Mines and Technology.

I am the cyberinfrastructure chief engineer for DUSEL. As such, my concern is the research environment and the advanced services that will be needed to accomplish our scientific goals. To enable future discoveries, scientists will need to capture, analyze, and exchange their data. We will have to deploy, and perhaps even develop, new technologies to provide the scientists with the technical and logistical support for their research. We expect that the unique research opportunities and instrumentation established at DUSEL will draw scientific teams from all over the world to South Dakota, so high-speed national and international network connectivity will also be critical.

National laboratories have made many important contributions to the development of IT and networking technology. I’m very pleased that DUSEL is the newest member of the ESnet community, and I have no doubt that we’ll be leveraging their expertise. In conversations with numerous colleagues at other labs, it has become apparent that although DUSEL is starting with a clean slate and has no legacy systems to support, we still have common issues and some difficult decisions to consider. All the labs face the challenge of meeting the needs of both large and small scientific collaborations. We all feel the budget crunch and are streamlining our support infrastructure. We are all wondering how we can optimize our use of the Cloud.

Delving into underground research

At DUSEL we have our own particular challenges, starting with an extreme underground environment. On the surface, the Black Hills of South Dakota may be freezing, but the further down you go in the mine, the hotter it gets. Rock temperatures at the 4850′ level, where the mid-level campus is under construction, are around 70°F (21°C), and humidity is around 88%. At the 7400′ level, where the deep-level campus is planned, temperatures hover around 120°F (50°C). These high levels of temperature and humidity have a significant impact on computer equipment. We’ll figure out our challenges as we go, drawing on shared expertise. After all, national labs were created to focus effort and advance knowledge where no one university could marshal the resources required. Our goal is to provide a platform where science, technology, and innovation are able to flourish.

We anticipate technology partnerships with the many experiments that are going underground at DUSEL. Currently we are expanding IPv6 and deploying perfSONAR. We are leveraging HD video conferencing. We are worrying about identity management and cyber security. We are establishing the requirements for dynamic network provisioning. And at the same time, we’re wondering what other technologies will emerge in the next 20 or 30 years and what will be required to dig for new discoveries. You can keep track of our progress at the Sanford Laboratory YouTube channel.

–Warren Matthews

Did you get the memo?

Last September, Federal CIO Vivek Kundra issued a memo mandating that government agencies take aggressive steps to adopt IPv6. IPv6 is the next-generation Internet Protocol, which allows for a vastly increased address space: 340 undecillion (340 followed by 36 zeroes) as compared to 4.3 billion addresses for the existing protocol, IPv4. The memo stipulates that agencies are to make all public-facing Internet resources IPv6-accessible by September 30, 2012, and that all internal connectivity within agencies and sites must use IPv6 by September 30, 2014. Kundra’s aggressiveness appeared to be well-placed when, on February 3, 2011, the Internet Assigned Numbers Authority (IANA) allocated the very last chunks of IPv4 address space to the Regional Internet Registries (RIRs). This officially signaled the beginning of the end of IPv4.
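
Those head-spinning counts follow directly from the address widths, as a quick check shows:

```python
# IPv4 addresses are 32 bits wide; IPv6 addresses are 128 bits wide.
print(f"IPv4: {2**32:,}")    # 4,294,967,296 (~4.3 billion)
print(f"IPv6: {2**128:,}")   # ~3.4 x 10**38, i.e. 340 undecillion
```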

Time is running out on IPv4's billions of Internet addresses

It’s not about our addresses.  While many organizations, government agencies, universities, and companies feel that they have sufficient IPv4 address space so that they don’t need to implement IPv6, that’s not really the problem. The depletion of IPv4 addresses signals the advent of IPv6-only organizations and networks. New scientific research facilities, community organizations, and educational institutions around the world will soon find it much harder–and more expensive–to obtain IPv4 addresses. For these organizations, their only hope for having a presence on the Internet will be to make extensive–and possibly exclusive–use of IPv6.

Thus, even those who have abundant IPv4 resources may soon need to access IPv6-only resources.  Network staff need to be ready to act when IPv6-only users and collaborators elsewhere in the world need to access resources at their sites, or when their researchers request access to IPv6-only remote facilities.  This means that everyone needs to proceed in earnest with IPv6 adoption, so that the inevitable kinks can be worked out before IPv6 becomes a mission-critical requirement.

Many people wonder if Network Address Translation (NAT), which allows certain IPv4 addresses to be re-used throughout the Internet, can help stave off the need for IPv6. While this was a common argument in the 1990s and early 2000s, the acceleration of IPv4 depletion in the latter part of the last decade calls this assertion into question.  NAT technologies have been known to work–with some difficulty–on a small scale, but the kinds of large-scale NAT installations required to continue with an IPv4-only Internet are expensive and come with their own reliability and security issues.  Some, such as Lorenzo Colitti of Google, believe that large-scale NAT will make the Internet, as a whole, “slower and flakier.”

Currently, IPv6 is our best way forward when it comes to maintaining the reliability of the Internet.  Having adopted IPv6 many years ago, ESnet is well-positioned to provide help to others making the transition. But it’s important to get moving now.  The depletion of IANA IPv4 resources and the Federal mandate should provide good motivation.

–Michael Sinatra

ESnet 2010 Round-up: Part 2

Our take on ANI, OSCARS, perfSONAR, and the state of things to come.

ANI Testbed

In 2010 ESnet led the technology curve in the testbed: we put together a great multi-layer design, deployed specially tuned 10G IO Testers, became early adopters of the OpenFlow protocol by deploying NEC switches, and built a research breadboard of end-hosts leveraging open-source virtualization and cloud technologies.

The first phase of the ANI testbed is concluding. After six-plus months of operational life, with exciting research projects like ARCHSTONE, Flowbench, HNTES, climate studies, and more leveraging the facilities, we are preparing to move the testbed to its second phase on the dark fiber ring in Long Island. Our call for proposals, which closed October 1st, garnered excellent ideas from researchers and was reviewed by academic and industry stalwarts on the panel. We are tying up loose ends as we light the next phase of testbed research.

OSCARS

This year the OSCARS team has been extremely productive. We added enhancements to create the next version (0.5.3) of the current production OSCARS software, progressed on architecting and developing a highly modular and flexible platform for the next-generation OSCARS (0.6), built a PCE-SDK targeted at network researchers focused on creating complex path-computation algorithms, and developed FENIUS to support the GLIF Automated GOLE demonstrator.

Not only did the ESnet team multitask on various ANI, operational network, and OSCARS deliverables, it also spent significant time supporting our R&E partners like Internet2, SURFnet, NORDUnet, RNP, and others interested in investigating the capabilities of this open-source software. We also appreciate Internet2’s commitment to dedicate testing resources for OSCARS 0.6 starting next year, to ensure a thoroughly vetted and stable platform in the April timeframe. This is just one example of the accomplishments possible for the R&E community by committing to partnership and collaboration.

perfSONAR collaboration

perfSONAR kept up its rapid pace of feature additions and new releases in joint collaboration with Internet2 and others. In addition to rapid progress in software capabilities, ESnet is aggressively rolling out perfSONAR nodes at its 10G and 1G POPs, creating an infrastructure where the network can be tuned to hum. With multiple thorny network problems now solved, perfSONAR has proven to be a great tool delivering real value. This year we focused on making perfSONAR easily deployable and adding the operational features to transform it into a production service. An excellent workshop in August succinctly captured the challenges and opportunities in leveraging perfSONAR, both for operational troubleshooting and for researchers seeking to understand how to improve networks further. Joint research projects continue to stimulate further development, with a focus on solving end-to-end performance issues.

The next networking challenge?

2011

Life in technology tends to be interesting, even though people keep warning about the commoditization of networking gear: the focus area for innovation just shifts, but never goes away. Some areas of interest as we evaluate our longer-term objectives next year:

  • Enabling the end-to-end world: What new enhancements or innovations are needed to deploy performance measurement and control techniques that enable seamless end-to-end application performance?
  • Life in a Terabit digital world: What network innovations are needed to meet the requirement for Terabit connectivity between supercomputer centers in the 2015-2018 timeframe?
  • Life in a carbon economy: What are the low-hanging fruit for networks to become more energy-efficient and/or to enable energy efficiency in the IT ecosystem in which they play? Cloud-y or Clear?

We welcome your comments and contributions,

Happy New Year

Inder Monga and the folks at ESnet

ESnet gives Cisco Nerd Lunch talk, learns televangelism is harder than it seems

As science transitions from a lab-oriented to a distributed, computational, and data-intensive activity, the research and education (R&E) networking community is tracking the growing data needs of scientists. Huge instruments like the Large Hadron Collider are being planned and built. These projects require global-scale collaborations and contributions from thousands of scientists, and as the data deluge from the instruments grows, even more scientists are interested in analyzing it for the next breakthrough discovery. Suffice it to say that even though worldwide video consumption on the Internet is driving a similar increase in commercial bandwidth, the scale, characteristics, and requirements of scientific data traffic are quite different.

And this is why ESnet got invited to Cisco Systems’ headquarters last week to talk about how we handle data as part of their regular Nerd Lunch talk series. What I found interesting, although not surprising, was that with Cisco being a big evangelist of telepresence, more employees attended the talk from their desks than in person. This was a first for me, and I came away with a new appreciation for the challenges of collaborating across distances.

From a speaker’s perspective, the lesson I learned was to brush up on my acting skills. My usual preparation is to rehearse the difficult transitions and focus on remembering the few important points to make on every slide. When presenting, the slide-presentation portion of my brain goes on auto-pilot, while my focus turns toward evaluating the impact on the audience. When speaking at a podium, one can observe when someone in the audience opens a notebook to jot down a thought, when their attention drifts to email on the laptop in front of them, or when a puzzled look appears on someone’s face as they try to figure out the impact of the point I’m trying to make. But these visual cues go missing with a largely webcast audience, making it harder to know when to stop driving home a point or when to explain it further. In the future, I’ll have to be better at keeping the talk interesting without the usual clues from my audience.

Maybe the next innovation in virtual-reality telepresence is just waiting to happen?

Notwithstanding the challenges of presenting to a remote audience, enabling remote collaboration is extremely important to ESnet. Audio, video, and web collaboration is a key service we offer to the DOE labs. ESnet employees use video extensively in our day-to-day operations. The “ESnet watercooler,” a 24×7 open video bridge, is used internally by our distributed workforce to discuss technical issues as well as to have ad hoc meetings on topics of interest. As science goes increasingly global, scientists are also using this important ESnet service for their collaborations.

With my brief stint on the stage now over, it is back to ESnet and then on to the 100G invited panel/talk at the IEEE ANTS conference in Mumbai. Wishing all of you a very Happy New Year!

Inder Monga

Fenius takes another big step forward

As I wrote back in June, ESnet has been promoting global interoperability in virtual circuit provisioning via our work on the Fenius project. Recently this effort took another step forward by enabling four different provisioning systems to cooperate for the Automated GOLE demonstration at the GLIF workshop held at CERN in beautiful Geneva, Switzerland.

For the uninitiated, GOLE stands for GLIF Open Lightpath Exchange, a concept similar to an IP Internet exchange but oriented toward interconnecting lightpaths and virtual circuits. Several GOLEs already exist and collaborate in the GLIF forum, but until recently, interconnecting them has been a manual process initiated by the network administrators at each GOLE. Because of the lack of standards, any automation in the process was only accessible through a proprietary interface. This lack of interoperability has hindered the development and use of virtual circuit services that cross more than a few GOLEs at a time.

Our objective in Geneva was to demonstrate that where there’s a will there is a way: that we can indeed have automated, dynamic GOLEs that provision virtual circuits with no manual intervention, initiated by the end user through the Fenius common interface.

This project involved several different GOLEs and networks from around the world. In North America, both MANLAN and StarLight participated along with Internet2’s ION service and USLHCNet. The majority of GOLEs and networks were European: NorthernLight, CERNLight, CzechLight, and PSNC, as well as NetherLight and University of Amsterdam. Finally, AIST and JGN2+ participated from Japan, making this a demonstration that spanned sixteen (!) timezones and utilized four transoceanic links.

The demonstration was a complete success and resulted in what is, to my knowledge, a global first: a virtual circuit set up fully automatically through five different networks and four different provisioning systems. And it was completed in a short amount of time – it only took about five minutes from the initiating request until packets were flowing from end to end.

During the weeks leading up to the demonstration, software developers and network engineers from almost every organization mentioned collaborated closely to develop, test, and deploy the Fenius interface on all the various GOLEs and networks. Several people worked day and night. This level of commitment can only mean good things for the long-term prospects of the Fenius and Automated GOLE efforts.

Our success is also worth noting because the software, hardware, and network infrastructure set up for this demo has been committed to remain available for use and experimentation for the next year. We hope to replicate this success at Supercomputing 2010, extended with even more GOLEs and networks joining. Since Fenius and the automated GOLE applications clearly demonstrated the value of interoperability, the next steps will be to help define and develop an open-source implementation of NSI, a standard protocol that will establish native interoperability between the various provisioning software systems like OSCARS, AutoBAHN, G-Lambda, Open-DRAC, Argia, and others.

These are exciting times, and it’s great to see our efforts finally bearing fruit. I can’t wait to see how the newly interoperable GOLEs can benefit our user community and scientific networking in general.