Why are we reincarnating OSCARS?

OSCARS ESnet traffic patterns

The subjects of several recent articles, including new developments in virtual circuits such as Fenius, cloud computing work with Google, Internet2’s announcement of its ION service, and the recently funded DYNES proposal, are all powered by OSCARS, the On-Demand Secure Circuits and Advance Reservation System, a software engine developed with DOE funding. This open-source engine gives us the capability to build a network with highly dynamic, traffic-engineered flows that meet the research data transport needs of scientists. The current release, 0.5.2, has been running as a production service within ESnet for the past three years. We are currently working on the 0.5.3 enhancement release and plan to ship it in the Q4 2010 time frame.

In the course of running this software as a production service and interacting with scientists, network researchers, and the standards community at the OGF, we realized we had to redesign the software architecture into a much more robust and extensible platform. We wanted to be able to easily add new features to the OSCARS platform that would cater to a variety of network engineers and researchers. With this in mind, the re-architected OSCARS is planned as release 0.6. As with any successful product, transitioning from a deployed release to a new one involves thorny issues like backward compatibility and feature parity. Hence the current balancing act: taking something that is quite good and proven (0.5.2) and making it even better, a.k.a. 0.6.

Here are four good reasons why OSCARS 0.6 is the way to go:

1. It can meet production requirements: The modular architecture allows features to be added as distinct modules, so specific deployment requirements can be integrated into the service easily. For example, if a federated AA implementation needs to be supported, the AA modules can be replaced with ones that are compliant with that AA framework (e.g. Shibboleth). Another example is high availability (HA): the 0.6 architecture provides HA on a per-component basis, so the failure of a critical component does not take the service down.

2. It provides new complex features: As end-sites and their operators become comfortable with point-to-point provisioning of virtual circuits, we are receiving more requests for complex feature enhancements. The OSCARS 0.5 software architecture is not well suited to new features like multi-point circuits or multi-layer provisioning, and these requests increase the urgency of moving to the 0.6 release, which has been designed with such enhancements in mind. Moreover, the DOE-funded multi-layer ARCHSTONE research project will use 0.6 as its base research platform.

3. Research/GENI and other testbeds: The research community is a major constituent for OSCARS and its continuing development. This community is now conducting experiments on real infrastructure testbeds like ANI and GENI. To really leverage the power of those testbeds, researchers want to build on the OSCARS software base and framework while innovating on particular algorithms and testing them. The modular architecture of the OSCARS 0.6 platform lets a researcher replace any component with a new algorithmic research module. For example, with the redesigned PCE engine, one can compose a flexible workflow of custom PCEs (see the sketch after this list). This flexibility does not exist in the purpose-built but monolithic architecture of the OSCARS 0.5 codebase.

4. NSI Protocol/Standards: As the European and Asian research and education communities move towards interoperability with the US, it is important to build on a common understanding reached through standards. The NSI protocol being standardized in the OGF NSI working group (http://ogf.org/gf/group_info/view.php?group=nsi-wg) needs to be implemented by open-source network middleware projects like OSCARS. We feel that 0.6 is the right platform on which to adopt the NSI protocol whenever the standard is ready.
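To make the modularity described in items 1 and 3 above a bit more concrete, here is a minimal sketch of what a pluggable PCE workflow could look like. It is written in Python purely for illustration; the class and function names (PCE, BandwidthPCE, HopCountPCE, run_pce_chain) and the toy topology are hypothetical and are not part of the actual OSCARS 0.6 codebase or API.

```python
# Hypothetical sketch of a pluggable PCE (path computation element) workflow.
# None of these names come from the real OSCARS 0.6 code; they only
# illustrate how custom modules could be chained together and swapped out.

class PCE:
    """Base interface: take candidate paths, return a pruned/ranked subset."""
    def compute(self, free_capacity, requested_bw, candidate_paths):
        raise NotImplementedError

class BandwidthPCE(PCE):
    """Keep only paths whose every link has enough free capacity."""
    def compute(self, free_capacity, requested_bw, candidate_paths):
        return [p for p in candidate_paths
                if all(free_capacity[link] >= requested_bw for link in p)]

class HopCountPCE(PCE):
    """Pick the shortest surviving path (a stand-in for a research algorithm)."""
    def compute(self, free_capacity, requested_bw, candidate_paths):
        return sorted(candidate_paths, key=len)[:1]

def run_pce_chain(chain, free_capacity, requested_bw, candidate_paths):
    """Run each PCE module in order; any stage can be replaced independently."""
    paths = candidate_paths
    for pce in chain:
        paths = pce.compute(free_capacity, requested_bw, paths)
    return paths

# Toy example: two candidate paths, one of which crosses a 1 Gbps link.
free_capacity = {"lbl-snv": 1, "snv-chi": 10, "lbl-den": 10, "den-chi": 10}
candidates = [["lbl-snv", "snv-chi"], ["lbl-den", "den-chi"]]
print(run_pce_chain([BandwidthPCE(), HopCountPCE()], free_capacity, 10, candidates))
# -> [['lbl-den', 'den-chi']]
```

In a design like this, trying out a new path computation algorithm amounts to replacing one stage of the chain rather than patching a monolithic code path, which is the kind of flexibility the 0.6 architecture is meant to offer researchers.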

At ESnet, we invest considerable time in new technology development, but balance this with our operational responsibilities. We invite the community to join in developing OSCARS 0.6, which offers greatly improved capabilities over OSCARS 0.5.2. With your participation in the development process, we can bring the re-architected 0.6 software to production quality that much sooner. If this excites you, we welcome you to contribute to the next stage of the OSCARS open source project.

–Chin Guok

A few reasons why ESnet matters to scientists.

Keith Jackson, ESnet Guest Blogger

Recently we’ve been testing the ability to move huge amounts of scientific data in and out of commercial cloud providers like Amazon and Google. We were doing this because if you want to do scientific computation in the cloud, you need to be able to move data in and out efficiently or it will never be useful for science.

We have been working with engineers at Google to test the performance of their cloud storage solution. We were in the midst of transferring data between Berkeley Lab servers and the Google cloud when we noticed the data wasn’t moving as fast as it should.

We tried to figure out the root of the problem. The Google folks talked to their networking people and we talked to our engineers at ESnet.

We found there was a bottleneck in the path between Berkeley and Google on the ESnet side. One link was still only 1 gigabit per second and was scheduled to be upgraded to 10 gigabits per second within the next week or so, but in the meantime it limited our transfers to no more than a gigabit per second.

Using OSCARS, not only did we find the bottleneck, but, as Vangelis described in a prior blog post, we were able to reroute traffic around the slow link, completely bypassing the problem. ESnet not only helped me diagnose the problem right away, but was also able to suggest and quickly deploy a solution.

In thinking about that problem, a few things occurred to me. For a scientist just concerned with getting data through the network, it is probably easier to work with ESnet than a commercial provider for several reasons.

As a research network, ESnet is completely accessible. A commercial provider would have been opaque for proprietary reasons and would have had no incentive to grant outsiders access to its network for troubleshooting. Since serving scientists is not its main mission, its sense of urgency would be different. Moreover, a commercial network’s interfaces are not designed for the particular needs of scientists.

But ESnet exists solely to support science, and scientists. Sometimes we need to be reminded that to scientists, quite literally, the “network matters.”

Direct Wormhole to Google Cloud

Earlier this week our network engineers were presented with an interesting problem: researchers from Lawrence Berkeley National Laboratory were moving data in and out of the Google Cloud service, but it looked like the transfers were “slow”, running at a mere 1 gigabit per second. Most people wouldn’t call that slow – but we know that we can do better!

After some investigation, it turned out that all these transfers were going through a bottleneck in the network: an outdated 1 Gbps connection to a commercial internet exchange in San Jose, CA, which had not yet been upgraded to the usual 10 Gbps.

To resolve this, we decided to do a bit of traffic engineering: create a network “wormhole” that would suck in data from LBNL, move it through the Science Data Network, and drop it off at a different internet exchange point thousands of miles away, in Chicago, IL.

This is a picky wormhole, by the way; it will only suck in data that needs to travel between the researchers’ computers and Google Cloud, leaving other data flows alone. And, as long as the data is traveling in the wormhole, other traffic can’t cause any congestion that would limit throughput. We call these virtual circuits, and the OSCARS software developed here at ESnet provides the ability for our engineers to easily create and manage them.
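As a rough illustration of what makes the wormhole “picky”: a virtual circuit pairs a guaranteed bandwidth and a schedule with a flow filter, and only traffic matching the filter is pulled onto the engineered path. The sketch below shows the idea in Python; the field names, port names, and address prefixes are made-up placeholders, not the actual OSCARS request format or the addresses involved in these tests.

```python
# Illustrative sketch of a virtual-circuit reservation. Field names, port
# names, and prefixes are hypothetical placeholders, not the OSCARS API.
import ipaddress

reservation = {
    "description": "LBNL <-> Google Cloud transfer tests",
    "src_endpoint": "lbnl-router:xe-1/0/0",    # hypothetical port names
    "dst_endpoint": "chic-exchange:xe-2/3/0",
    "bandwidth_mbps": 5000,                    # guaranteed, not best-effort
    "start_time": "2010-07-20T09:00:00Z",
    "end_time":   "2010-08-20T09:00:00Z",
    "flow_filter": {
        # Only traffic between these prefixes rides the circuit; everything
        # else keeps using the normal routed path.
        "src_prefix": "128.3.0.0/16",          # example LBNL-style prefix
        "dst_prefix": "203.0.113.0/24",        # placeholder cloud-side prefix
    },
}

def rides_circuit(flow_filter, src_ip, dst_ip):
    """Toy check: would a packet between these addresses use the circuit?"""
    return (ipaddress.ip_address(src_ip) in ipaddress.ip_network(flow_filter["src_prefix"])
            and ipaddress.ip_address(dst_ip) in ipaddress.ip_network(flow_filter["dst_prefix"]))

print(rides_circuit(reservation["flow_filter"], "128.3.41.7", "203.0.113.25"))   # True
print(rides_circuit(reservation["flow_filter"], "128.3.41.7", "198.51.100.9"))   # False
```

Traffic that does not match the filter simply keeps following the normal routed path, which is why other flows see no change while the Google transfers ride the circuit.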

Keith Jackson, a scientist in the Advanced Computing for Science department of the Computational Research Division at LBNL, had this to say:

“It was really impressive that we were able, in a matter of hours, to set up a circuit and route this traffic to the Google cloud to avoid this network bottleneck. From my perspective as a researcher, the process looked seamless. This allowed us to conduct tests that we couldn’t have done otherwise.”

ESnet has a lot of virtual circuits snaking through our network, about 30 at last count. This one, though, is special: it’s the first that connects one of ESnet’s sites to a commercial service such as Google Cloud.

Jackson and other researchers are examining how commercial networks can be used for data-driven computation. They are exploring with Google how fast we will be able to move data and what infrastructure is necessary to do so, one virtual circuit at a time.

ESnet's latest method of selective data transmission