Fenius takes another big step forward

As I wrote back in June, ESnet has been promoting global interoperability in virtual circuit provisioning via our work on the Fenius project. Recently this effort took another step forward by enabling four different provisioning systems to cooperate for the Automated GOLE demonstration at the GLIF workshop held at CERN in beautiful Geneva, Switzerland.

For the uninitiated, GOLE stands for GLIF Open Lightpath Exchange, a concept similar to an IP internet exchange, but oriented towards interconnecting lightpaths and virtual circuits. Several GOLEs already exist and collaborate in the GLIF forum, but until recently, interconnecting them has been a manual process initiated by the network administrators at each GOLE. Because of the lack of standards, any automation in the process was only accessible through a proprietary interface. This lack of interoperability has hindered the development and use of virtual circuit services that cross more than a few GOLEs at a time.

Our objective in Geneva was to demonstrate that where there’s a will, there’s a way: that we can indeed have automated, dynamic GOLEs that provision virtual circuits initiated by the end user through the Fenius common interface, with no manual intervention.

This project involved several different GOLEs and networks from around the world. In North America, MANLAN and StarLight participated, along with Internet2’s ION service and USLHCNet. The majority of GOLEs and networks were European: NorthernLight, CERNLight, CzechLight, and PSNC, as well as NetherLight and the University of Amsterdam. Finally, AIST and JGN2+ participated from Japan, making this a demonstration that spanned sixteen (!) time zones and utilized four transoceanic links.

The demonstration was a complete success and resulted in what is, to my knowledge, a global first: a virtual circuit set up completely automatically through five different networks and four different provisioning systems. And it was fast – only about five minutes passed from the initiating request until packets were flowing from end to end.

During the weeks leading up to the demonstration, software developers and network engineers from almost every organization mentioned above collaborated closely to develop, test, and deploy the Fenius interface on the various GOLEs and networks. Several people worked day and night. This level of commitment can only mean good things for the long-term prospects of the Fenius and Automated GOLE efforts.

Our success is particularly worth noting because the software, hardware, and network infrastructure set up for this demo is committed to remain available for use and experimentation for the next year. We hope to replicate this success at Supercomputing 2010, extended with even more GOLEs and networks joining. Since Fenius and the Automated GOLE demonstration clearly showed the value of interoperability, the next step is to help define and develop an open-source implementation of NSI, an emerging standard protocol that will establish native interoperability between the various provisioning software systems such as OSCARS, AutoBAHN, G-Lambda, OpenDRAC, Argia, and others.

These are exciting times and it’s great to see our efforts finally bearing fruit. I can’t wait to see how the newly interoperable GOLEs can benefit our user community and scientific networking in general.

Direct Wormhole to Google Cloud

Earlier this week our network engineers were presented with an interesting problem: researchers from Lawrence Berkeley National Laboratory were moving data in and out of the Google Cloud service, but it looked like the transfers were “slow”, running at a mere 1 gigabit per second. Most people wouldn’t call that slow – but we know that we can do better!

After some investigation, it turned out that all these transfers were going through a bottleneck in the network: an outdated 1Gbps connection to a commercial internet exchange in San Jose, CA, that hadn’t yet been upgraded to the usual 10Gbps.

To resolve this, we decided to do a bit of traffic engineering: create a network “wormhole” that would suck in data from LBNL, move it through the Science Data Network, and drop it off at a different internet exchange point thousands of miles away – in Chicago, IL.

This is a picky wormhole, by the way; it will only suck in data that needs to travel between the researchers’ computers and Google Cloud, leaving other data flows alone. And, as long as the data is traveling in the wormhole, other traffic can’t cause any congestion that would limit throughput. We call these virtual circuits, and the OSCARS software developed here at ESnet provides the ability for our engineers to easily create and manage them.
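In router terms, the “picky” part is just matching on source and destination addresses. Here is a minimal sketch of that selection logic in Python, with made-up prefixes and path names standing in for the real LBNL and Google address ranges and ESnet circuit identifiers, which aren’t given in this post:

```python
from ipaddress import ip_address, ip_network

# Illustrative prefixes only -- stand-ins for the actual LBNL and
# Google Cloud address ranges.
LBNL_NET   = ip_network("198.128.0.0/16")   # hypothetical LBNL-side prefix
GOOGLE_NET = ip_network("203.0.113.0/24")   # hypothetical Google-side prefix

def next_hop(src: str, dst: str) -> str:
    """Send only LBNL<->Google flows into the circuit; leave the rest alone."""
    s, d = ip_address(src), ip_address(dst)
    # Match both directions so return traffic also uses the circuit.
    if (s in LBNL_NET and d in GOOGLE_NET) or (s in GOOGLE_NET and d in LBNL_NET):
        return "sdn-circuit-via-chicago"   # the "wormhole": no competing traffic
    return "default-route-san-jose"        # everything else: normal routing

print(next_hop("198.128.1.10", "203.0.113.5"))  # -> sdn-circuit-via-chicago
print(next_hop("198.128.1.10", "192.0.2.99"))   # -> default-route-san-jose
```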

Keith Jackson, a scientist in the Advanced Computing for Science department of LBNL’s Computational Research Division, had this to say:

“It was really impressive that we were able, in a matter of hours, to set up a circuit and route this traffic to the Google cloud to avoid this network bottleneck. From my perspective as a researcher, the process looked seamless. This allowed us to conduct tests that we couldn’t have done otherwise.”

ESnet has a lot of virtual circuits snaking through our network – about 30 at last count. This one, though, is special: it’s the first one that connects one of ESnet’s sites with a commercial service such as Google Cloud.

Jackson and other researchers are examining how commercial networks can be used for data-driven computation. They are exploring with Google how fast we will be able to move data and what infrastructure is necessary to do this—one virtual circuit at a time.

[Image: ESnet’s latest method of selective data transmission]

The Fenius project: enabling virtual circuits around the globe

ESnet has been one of the leading research and education networks in the adoption of virtual circuit technology, which has allowed ESnet customers to sidestep traditional limitations of wide area networking and transfer data at high speed between geographically distant sites at a minimal cost. Each day, tens of terabytes of scientific data flow over ESnet’s Science Data Network between supercomputers, clusters, data storage sites, and experimental data sources like the LHC at CERN.

Essentially, virtual circuits provide an Ethernet pipeline with guaranteed bandwidth between two locations. This traffic is isolated from the rest, allowing our users to run “impolite” protocols like UDP, which would otherwise clog up their regular Internet connection. Our homegrown software, code-named OSCARS, enables ESnet to easily monitor this traffic for trends, engineer its route to plan for growth, and rearrange capacity according to the needs of our customers.
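To give a feel for what reserving such a circuit involves, here is a hypothetical request sketched in Python. The field and endpoint names are invented for illustration; the real OSCARS interface is a web service with its own, richer schema:

```python
# Hypothetical circuit reservation, for illustration only; the real
# OSCARS API differs in both structure and naming.
reservation = {
    "source":         "esnet-site-a:eth1/1",    # made-up endpoint identifiers
    "destination":    "exchange-point-b:eth3/2",
    "vlan":           3500,                     # Ethernet VLAN tag for the circuit
    "bandwidth_mbps": 1000,                     # guaranteed, policed rate
    "start_time":     "2010-09-01T08:00:00Z",   # reservations are time-bounded
    "end_time":       "2010-09-08T08:00:00Z",
}

def validate(res: dict) -> None:
    """Basic sanity checks a provisioning system would perform up front."""
    assert 0 < res["bandwidth_mbps"] <= 10000, "rate must fit a 10Gbps link"
    assert res["start_time"] < res["end_time"], "circuit must end after it starts"

validate(reservation)
```

The key point is the guarantee: the reserved rate is set aside for this circuit alone for the whole reservation window, which is what lets “impolite” traffic run at full speed without harming anyone else.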

This is a win-win situation for both us and our customers, and we’re not alone in recognizing this. An increasing number of global research and education backbones and exchange points are deploying such services and writing their own software to manage them: Internet2 provides the ION service (previously called DCN), based on the OSCARS platform. Across the Atlantic, GÉANT is developing AutoBAHN, and SURFnet is using Nortel’s DRAC. An international consortium developed Harmony under the Phosphorus project and is now starting up GEYSERS. In Japan, AIST has been developing the G-lambda suite, while Korea’s KISTI is coding their DynamicKL project – and there are certainly other projects out there.

Can’t we all just talk?

Now for the bad news: since there isn’t a globally accepted standard for this kind of service, the different software suites don’t quite communicate with one another. OSCARS communicates using the OSCARS application interface, DRAC uses the DRAC interface, and so forth. This, unfortunately, stymies our ambitions to automatically “stitch” virtual circuits across multiple networks. With everyone speaking a different language, this is impossible to accomplish.

A solution is to have a standard software interface; then different implementations would be able to interoperate as long as they were compliant. There is a standards effort in progress by the Open Grid Forum Network Services Interface working group, but an actual standard is probably at least several months away.

A bit of history

Several software developers made an effort to solve the interoperability issue at the GLIF meeting co-located with Joint Techs back in early 2008. After a few presentations, it became evident that all of these APIs, stripped of their cosmetic differences and special features, looked remarkably alike in terms of the raw pieces of information they handled. The consensus of the meeting was that there was no real reason not to have basic interoperability, even if many of the bells and whistles had to be stripped away. The developers then formed the GNI API task force under the umbrella of the GLIF Control Plane technical group, with the objective of duct-taping an interoperability solution together until actual standards emerged.

A mythical reference

They conceived the Fenius project, named after the legendary king of Scythia, Fenius Farsaid. According to Irish folklore, after the collapse of the Tower of Babel, Fenius collected the best parts of the confused tongues of the world and invented a new language.

The Fenius project is built on a fairly simple idea: it defines a bare-bones API for virtual circuit services as an interim pseudo-standard. Developers can then easily write code to automatically translate between this “standard” API and a specific software suite such as OSCARS; several translators already exist. The rest of the project is software “glue” that allows Fenius to run standalone, publishing its API as a web service and routing incoming requests to the appropriate translator.
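As a rough sketch of that translator arrangement, here is what the pattern looks like in Python. All names here are invented for illustration and do not match the real Fenius or OSCARS interfaces:

```python
from abc import ABC, abstractmethod

class CircuitService(ABC):
    """The bare-bones 'standard' interface each translator implements."""
    @abstractmethod
    def reserve(self, src: str, dst: str, mbps: int) -> str: ...
    @abstractmethod
    def release(self, circuit_id: str) -> None: ...

class OscarsTranslator(CircuitService):
    """Translates common-interface calls into OSCARS-specific requests."""
    def reserve(self, src, dst, mbps):
        # The real translator would build and send an OSCARS request here.
        return "oscars-%s-%s-%d" % (src, dst, mbps)
    def release(self, circuit_id):
        pass  # ...and cancel the corresponding OSCARS reservation here.

# The "glue": one translator per software suite, chosen per incoming request.
TRANSLATORS = {"oscars": OscarsTranslator()}

def handle_request(suite: str, src: str, dst: str, mbps: int) -> str:
    return TRANSLATORS[suite].reserve(src, dst, mbps)

print(handle_request("oscars", "site-a", "site-b", 1000))
```

Supporting another suite, say AutoBAHN, then amounts to writing one more translator class and registering it in the table – which is what makes the approach practical as a stopgap until a real standard arrives.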

We demonstrated Fenius with good results during last year’s GLIF conference in Daejeon, Korea, as well as during Supercomputing 2009 in Portland, OR, using Fenius to provision virtual circuit services on demand across three networks – built on completely different technologies and running two different software suites – from a lab in Japan to the NICT booth on the conference show floor.

The next step for the project is to update its “standard” API based on some important lessons learned during last year’s demos, and to become the de facto external interface of production virtual circuit facilities. We plan to make an appearance at this year’s GLIF conference in Geneva, as well as at Supercomputing 2010 in New Orleans, LA. Fenius is also slated to become a component of OpenDRAC (http://www.opendrac.org/) soon.

We hope that Fenius will provide ESnet customers and the international research and education community with wider access to the network infrastructure, and that it will enable virtual circuits to become a truly global infrastructure capability in the service of science, research, and education worldwide.