The Fenius project: enabling virtual circuits around the globe

ESnet has been one of the leading research and education networks in the adoption of virtual circuit technology, which has allowed ESnet customers to sidestep traditional limitations of wide area networking and transfer data at high speed between geographically distant sites at a minimal cost. Each day, tens of terabytes of scientific data flow over ESnet’s Science Data Network between supercomputers, clusters, data storage sites, and experimental data sources like the LHC at CERN.

Essentially, virtual circuits provide an Ethernet pipeline with guaranteed bandwidth between two locations. This traffic is isolated from the rest, allowing our users to run “impolite” protocols like UDP, which would otherwise clog up their regular Internet connection. Our homegrown software, code-named OSCARS, enables ESnet to easily monitor this traffic for trends, engineer its routes to plan for growth, and rearrange capacity according to the needs of our customers.

This is a win-win situation for both us and our customers, and we’re not alone in recognizing this. An increasing number of global research and education backbones and exchange points are deploying such services and writing their own software to manage them: Internet2 is providing the ION service (previously called DCN), based on the OSCARS platform. Across the Atlantic, GÉANT is developing AutoBAHN, and SURFnet is using Nortel’s DRAC. An international consortium developed Harmony under the Phosphorus project and is now starting up GEYSERS. In Japan, AIST has been developing the G-lambda suite, while Korea’s KISTI is in the process of coding its DynamicKL project – and there are certainly other projects out there.

Can’t we all just talk?

Now for the bad news: since there isn’t a globally accepted standard for this kind of service, the different software suites don’t quite communicate with one another. OSCARS communicates using the OSCARS application interface, DRAC uses the DRAC interface, and so forth. This, unfortunately, stymies our ambitions to automatically “stitch” virtual circuits across multiple networks. With everyone speaking a different language, this is impossible to accomplish.

A solution is to have a standard software interface; different implementations would then be able to interoperate as long as they were compliant. A standards effort is in progress in the Open Grid Forum Network Services Interface working group, but an actual standard is probably at least several months away.

A bit of history

Several software developers made an effort to solve the interoperability issue at the GLIF meeting co-located with Joint Techs back in early 2008. After a few presentations, it became evident that all of these APIs, stripped of their cosmetic differences and special features, looked remarkably alike in terms of the raw pieces of information they handled. The consensus of the meeting was that there was no real reason not to have basic interoperability, even if many of the bells and whistles were stripped away. The developers then formed the GNI API task force under the umbrella of the GLIF Control Plane technical group, with the objective of duct-taping together an interoperability solution until actual standards emerged.

A mythical reference

They conceived the Fenius project, named for the legendary king of Scythia, Fenius Farsaid. According to Irish folklore, after the collapse of the Tower of Babel, Fenius collected the best parts of the confused tongues of the world and invented a new language.

The Fenius project is built on a fairly simple idea: it defines a bare-bones API for virtual circuit services as an interim pseudo-standard. Developers can then easily write code to automatically translate between the “standard” API and a specific software suite such as OSCARS; several translators already exist. The rest of the project is software “glue” that allows Fenius to run standalone, publishing its API as a web service and routing incoming requests to the appropriate translator.
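In essence, this is the classic adapter pattern: one common front-end interface, with per-suite translators behind it. A minimal sketch of the idea might look like the following – note that all class, method, and domain names here are illustrative assumptions, not the actual Fenius API:

```python
from abc import ABC, abstractmethod

class CircuitTranslator(ABC):
    """Translates the common 'standard' API into one suite's native calls.
    (Illustrative sketch only; not the real Fenius interface.)"""

    @abstractmethod
    def reserve(self, src: str, dst: str, bandwidth_mbps: int) -> str:
        """Request a circuit between two endpoints; return a reservation ID."""

class OscarsTranslator(CircuitTranslator):
    """Hypothetical translator for an OSCARS-managed domain."""
    def reserve(self, src, dst, bandwidth_mbps):
        # A real translator would call the OSCARS web service here.
        return f"oscars-{src}-{dst}-{bandwidth_mbps}"

class Fenius:
    """The 'glue': publishes one API and routes each request
    to the translator registered for the target domain."""
    def __init__(self):
        self.translators = {}

    def register(self, domain: str, translator: CircuitTranslator):
        self.translators[domain] = translator

    def reserve(self, domain, src, dst, bandwidth_mbps):
        return self.translators[domain].reserve(src, dst, bandwidth_mbps)

fenius = Fenius()
fenius.register("es.net", OscarsTranslator())
print(fenius.reserve("es.net", "lbl-mr2", "bnl-mr3", 1000))
```

The point of the design is that adding support for another suite (DRAC, AutoBAHN, and so on) only requires writing one new translator class; nothing about the common front-end API changes.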

We demonstrated Fenius with good results during last year’s GLIF conference in Daejeon, Korea, as well as during Supercomputing 2009 in Portland, OR, using Fenius to provision virtual circuit services on demand across three networks – via completely different technologies and two different software suites – from a lab in Japan to the NICT booth on the conference show floor.

The next step for the project is to update its “standard” API according to some important lessons learned during last year’s demos, and to become the de facto external interface of production virtual circuit facilities. We plan to make an appearance at this year’s GLIF conference in Geneva, as well as at Supercomputing 2010 in New Orleans, LA. Fenius is also slated to become a component of OpenDRAC (http://www.opendrac.org/) soon.

We hope that Fenius will be able to provide ESnet customers and the international research and education community wider access to the network infrastructure, and that it will enable virtual circuits to become a truly global infrastructure capability in the service of science, research, and education worldwide.

Purchase of dark fiber launches ESnet into new era

What sets us apart? ESnet has always focused, and always will focus, on anticipating the needs of the extended DOE science community. This shapes our network strategy, from services and architecture to topology and reach. It also distinguishes ESnet from university research & education networks, which are driven by the broader needs of the general university population. Vis-à-vis commercial networks, ESnet has specialized in handling the relatively small number of very large flows of large-scale science data rather than the enormous number of relatively small data flows traversing commercial carrier networks today. Our desire to stay a step ahead of the constantly evolving network needs of the scientific community has driven ESnet to take the bold step of purchasing and lighting our first segment of dark fiber.

Owning the road

By owning a tiny but powerful pair of optical fibers, ESnet will no longer have to rely on the vagaries of the commercial market – we will be able to deliver services when we choose and where they are needed.  For example, the DOE envisions using ESnet to link its supercomputing centers with a terabit of capacity by 2015. Our network will be key to enabling the scientific community to accomplish exascale computing by 2020.

Ramping up is no slam-dunk

But providing terabit capacity using 10 100G waves through commercial services is no slam-dunk and could be cost-prohibitive. Without owning the fiber and transport infrastructure, the same is likely to be true when near-terabit waves become available around 2020. One also loses spectral efficiency: a terabit wave won’t fit within the ITU-standard 50 GHz channel spacing, so it is necessary to plan for non-standard spacing, with current research pointing toward 200 GHz to accommodate the signal.
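The channel-spacing arithmetic can be sketched in a few lines, using only the figures quoted above (10 x 100G waves, the 50 GHz ITU grid, and a projected ~200 GHz slot for a terabit wave); this is a back-of-the-envelope illustration, not an engineering plan:

```python
# Back-of-the-envelope channel-spacing arithmetic, using the figures
# from the text: the standard ITU grid is 50 GHz per channel, while a
# single ~1 Tb/s wave is expected to need roughly 200 GHz.
itu_slot_ghz = 50        # standard ITU-T channel spacing
terabit_wave_ghz = 200   # projected spacing for a terabit wave

# A single terabit wave straddles several standard grid slots,
# which is why a non-standard channel plan becomes necessary:
slots_needed = terabit_wave_ghz // itu_slot_ghz
print(f"A terabit wave spans {slots_needed} standard 50 GHz slots")

# The alternative from the text: ten 100G waves on the standard grid,
# each occupying its own 50 GHz slot.
spectrum_10x100 = 10 * itu_slot_ghz
print(f"10 x 100G on the standard grid consumes {spectrum_10x100} GHz")
```

In other words, a terabit wave occupies four standard slots' worth of spectrum, so it simply cannot be dropped into a 50 GHz grid channel, and any terabit deployment forces a departure from the standard channel plan.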

But just solving this problem is not enough, as ESnet’s massive bandwidth requirements don’t end with the supercomputers.  ESnet must deliver steadily increasing amounts of data generated by the Large Hadron Collider as well as similar data sets shared within the climate, fusion, and genomics communities to scientists around the world.

Lighting the way forward

It is clear to us that the only way to scale the network to meet the rapidly growing needs of large-scale science is by lighting our own dark fiber. Although this relatively small 200-mile loop linking New York City to Brookhaven National Lab barely registers with most in the networking community, it represents an exciting sea change in ESnet’s approach to serving our customers.

–Steve Cotter

New 100GE Ethernet Standard IEEE 802.3ba (and 40GE as well)

From Charles Spurgeon's Ethernet Website


History is being written: from a simple diagram published in 1976 by Dr. Robert Metcalfe, with a data rate of 3 Mbps, Ethernet surely has come a long way in the last 30 years. Coincidentally, ESnet’s parent, MFEnet, was launched around the same time as a result of the new Fusion Energy supercomputer center at Lawrence Livermore National Laboratory (LLNL) (http://www.es.net/hypertext/esnet-history.html). It is remarkable to note that right now, with the 100GE standard newly ratified, ESnet engineers are very much on the ball, busy putting 100GE-enabled routers through their paces in our labs.

For ESnet and the Department of Energy, it is all about the science. To enable large-scale scientific discovery, very large scientific instruments are being built. You have read on the blog about DUSEL, and you are familiar with the LHC. These instruments – particle accelerators, synchrotron light sources, large supercomputers, and radio telescope farms – are generating massive amounts of data, and large collaborations of scientists are needed to extract useful research results from it. The Office of Science is looking to ESnet to build and operate a network infrastructure that can scale up to meet the highly demanding performance needs of scientific applications. The Advanced Networking Initiative (ANI), to build the nationwide 100G prototype network and a research testbed, is a great start. If you are interested in being part of this exciting initiative, do bid on the 100G Transport RFP.

As a community, we need to keep advancing the state of networking to meet the oncoming age of the digital data deluge.

To wit, the recent IEEE 802.3ba press release: http://standards.ieee.org/announcements/2010/ratification8023ba.html. Note the quote from our own Steve Cotter:

Steve Cotter, Department Head, ESnet at Lawrence Berkeley National Laboratory
“As the science community looks at collaboratively solving hard research problems to positively impact the lives of billions of people, for example research on global climate change, alternative energy and energy efficiency, as well as projects including the Large Hadron Collider that probe the fundamental nature of our universe – leveraging petascale data and information exchange is essential. To accomplish this, high-bandwidth networking is necessary for distributed exascale computing. Lawrence Berkeley National Laboratory is excited to leverage this standard to build a 100G nationwide prototype network as part of ESnet’s participation in the DOE Office of Science Advanced Networking Initiative.”

Got a networking idea you want to test? ANI testbed opening for business

Want to try out some new ideas in network research? ESnet invites you to submit a proposal to run experiments on its reconfigurable testbed. ESnet’s ARRA-funded Advanced Networking Initiative testbed is a high-performance environment where researchers will have the opportunity to prototype, test, and validate cutting-edge networking concepts.

Instructions for submitting proposals can be found at https://sites.google.com/a/lbl.gov/ani-testbed/. Proposals are due October 1, 2010. Decisions will be made January 10, 2011, when the Phase 1 version of the testbed is up and running. The Phase 1 version is a set of 10 Gbps-connected layer 1, 2, and 3 equipment that will be deployed on a dark fiber ring we acquired in Long Island (LIMAN: Long Island Metropolitan Area Network). This will mainly be of interest to researchers doing experiments at layers 1-3, or middleware/application research at 10 Gbps.

The testbed will support research including multi-layer, multi-domain hybrid networks, network protocols, component testing for future capabilities, protection and recovery, automatic classification of large bulk data flows, high-throughput middleware and applications, and any other innovative ideas you may want to try out in a realistic network environment – with no risk of breaking anything.

Try us. We’re open to suggestions.

100Gbps Prototype Network RFP is out!

All those cheers and whoops from the top of the Berkeley Hill? That would be us. ESnet just nailed another ANI milestone: we got our RFP out to vendors for the next stage of the nationwide 100Gbps prototype network.

The network will deliver data at scorching speeds and link three of the Department of Energy’s major supercomputing centers— NERSC at Berkeley Lab, the Argonne Leadership Computing Facility at Argonne National Laboratory in Illinois and the Oak Ridge Leadership Computing Facility at Oak Ridge National Laboratory in Tennessee— and MANLAN, the international exchange point in New York.

Your ARRA stimulus money is busy, this time building the infrastructure to help scientists communicate and deal with all that data proliferating from places like the Large Hadron Collider. We handle petabytes of data a month with our usual aplomb, but terabit networking is not far off in our future. It’s good to get prepared.

The collaboration portal across continents

When organizations are driven by a shared goal of excellence, it is amazing what can be accomplished in two days of non-stop discussions. As I try to consolidate my thoughts from the past two days of meetings, the following quotation rings particularly true.

Coming together is a beginning, staying together is progress, and working together is success. ~ Henry Ford

There is still a lot of work to be done, milestones to be hit and unforeseen roadblocks to overcome before we give ourselves a pat on the back, but we are off to a good start.

For the perpetually curious: SURFnet, ESnet, and NORDUnet stated their intentions to collaborate on furthering innovation in network research by working on open-source middleware to reserve and seamlessly allocate bandwidth across multiple domains (http://www.lbl.gov/cs/Archive/news030910.html). While our first planned meeting was delayed by volcanic ash from Iceland, skies and schedules cleared enough for the Dutch to visit us at our facilities at Berkeley Lab. This week’s two-day meeting of minds is a baby step toward an ambitious goal of openly sharing tools, knowledge, open-source software, and other network-related research with the community.

Taking a Break

We did let our visitors out of the conference room for just a few minutes to enjoy the beautiful, albeit faint, backdrop of the Golden Gate Bridge.

100GE around the bend?

Ever feel the exhilaration of sitting in a race car and going around the track at super-high speed? We came close to that experience when we recently received early editions of a vendor’s 100GE cards for their routers. The experience so far has been phenomenal – no issues getting the card up and running, the optics work great, and packets are being forwarded at line rate. We are putting those cards through our rigorous testing process, though our lips are sealed for now.

For the industry, this is significant progress – just last year, when we started the Advanced Networking Initiative (ANI) project, the prospect of actually seeing a 100GE interface in a router this soon seemed far off. So kudos to the vendor (you know who you are) and to the IEEE 802.3ba 40/100Gbps task force – if you are listening, this stuff is ready to go!

Team update: Sowmya has joined us

Sowmya

As the newest ESnet team member, Sowmya Balasubramanian will be working on perfSONAR (PERFormance Service Oriented Network monitoring ARchitecture), a network testing and troubleshooting system. During her internship at Berkeley Lab last summer, she was a primary developer of ESnet’s Network Weathermap monitoring software.

Sowmya originally hails from the South Indian city of Chennai and just received her Master’s degree in information networking from Carnegie Mellon University. Her projects included developing a communications interface for systems in a distributed network, a small real-time operating system for embedded systems, a peer-to-peer photo-sharing application, and mobile phone applications. We suspect she can get just about anything to work.