Henrique Rodrigues, a Ph.D. student in computer science at the University of California, San Diego, who is working with ESnet, won the best student paper award at the Hot Interconnects conference held Aug. 26-28 in Mountain View, Calif. Known formally as the 2014 IEEE 22nd Annual Symposium on High-Performance Interconnects, Hot Interconnects is the premier international forum for researchers and developers of state-of-the-art hardware and software architectures and implementations for interconnection networks of all scales.
“Special thanks to ESnet that gave me the opportunity to work on such an important and interesting topic,” Rodrigues wrote to his ESnet colleagues. “Also to the reviewers of my endless drafts, making themselves available to provide feedback at all times. I hope to continue with the good collaboration moving forward!”
At the recent Supercomputing 2011 (SC11) conference, the bubbly was flowing. ESnet launched its ANI 100 gigabit-per-second network and marked a quarter century of networking for DOE science. That big news may have overshadowed another milestone: SC11 was the first time OSCARS 0.6 was publicly demonstrated in a production environment. Now we’d like to give OSCARS its due.
OSCARS, or On-Demand Secure Circuits and Advance Reservation System, allows users to set up virtual circuits on demand to reserve bandwidth, streamlining the transfer of massive data sets across multiple network domains. OSCARS originated at ESnet, but we open-sourced it to the community long ago. Last spring the more modular OSCARS version 0.6 was released for testers and early adopters.
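To give a concrete flavor of what reserving bandwidth in advance entails, here is a toy Python sketch. The field names and the admission check are our own simplification for illustration, not the actual OSCARS API; a real reservation also carries VLAN and path details, and OSCARS performs path computation across domains.

```python
# Illustrative sketch only: hypothetical field names, not the OSCARS 0.6 API.
# An advance reservation names two endpoints, a guaranteed rate, and a
# start/end window; a scheduler admits it only if concurrent reservations
# fit within link capacity.
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class CircuitRequest:
    src: str             # source edge port (invented naming)
    dst: str             # destination edge port
    bandwidth_mbps: int  # guaranteed rate to reserve
    start: datetime
    end: datetime

def fits(link_capacity_mbps: int, existing: list[CircuitRequest],
         new: CircuitRequest) -> bool:
    """Naive admission check: sum the bandwidth of reservations whose time
    windows overlap the new request, and see if the new one still fits."""
    overlapping = sum(r.bandwidth_mbps for r in existing
                      if r.start < new.end and new.start < r.end)
    return overlapping + new.bandwidth_mbps <= link_capacity_mbps

now = datetime(2011, 11, 14, 9, 0)
existing = [CircuitRequest("a", "b", 40_000, now, now + timedelta(hours=4))]
new = CircuitRequest("a", "b", 50_000, now + timedelta(hours=1),
                     now + timedelta(hours=2))
print(fits(100_000, existing, new))  # True: 40 + 50 Gbps fits in 100 Gbps
```

The essential idea carries over to the real system: because requests are scheduled against capacity over time, a user can count on the bandwidth being there when the transfer starts.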
The performance of OSCARS 0.6 at SC11 showed us that we met our design goal of creating a flexible and modular framework. This was reflected in the demos, which were easy for folks to customize according to their needs. In the demo “Enabling Large Scale Science using Inter-domain Circuits over OpenFlow,” Tom Lehman of ISI used OSCARS to provide the functionality to control OpenFlow switches. Thanks to the flexibility to customize software built into OSCARS 0.6, ESnet’s Eric Pouyoul was able to produce a variation of that application, customizing OSCARS 0.6 for resource brokering. OSCARS also played a part in the successful demonstration of Internet2’s Dynamic Network System (DYNES). The goal of DYNES is to work with regional networks and campuses, using OSCARS to schedule and support scientific data flows from the LHC and other data-intensive science programs such as LIGO, the Virtual Observatory, and other large-scale sky surveys.
Most of the 100 Gbps demos at SC were supported by both the ANI 100 Gbps network and the 100 Gbps SCinet show-floor network. OSCARS 0.6 was used to schedule all eight of the demos using the 100 Gbps ANI network, which included complex visualizations of climate models, the Large Hadron Collider and the VERY early history—13.5 billion years ago, or 100 billion in dog years—of the Universe. OSCARS also controlled the approximately 100 different connections at SCinet, as well as connecting to three other OSCARS instances on the show floor.
We used OSCARS 0.6 to provision the network, scheduling user time-slices of the 100 gigabit-per-second ANI and SCinet network, 24 hours a day, over the period of a week so they could test the demos in advance without having to get up at 3:00 a.m. to do it.
OSCARS 0.6 ended up making certain network engineers’ lives much easier. According to my colleague Evangelos Chaniotakis, a.k.a. Vangelis, who was involved in the gritty details of setting up OSCARS 0.6 at the show, his team had to make last-minute changes to the pre-existing network framework to work with the new hardware but didn’t receive the equipment until the week before the conference. The modularity ESnet built into OSCARS 0.6 helped the team get the network working on short notice.
Less of a Software, More of a Service
Every year the number of reservations and circuits at SC continues to grow. The SC11 network required roughly twice the number of VLANs as the previous year. While the bandwidth wasn’t much bigger, and there were approximately the same number of customers, this year’s users definitely had more requirements. “On the whole OSCARS 0.6 was really stable,” Vangelis reports. “It worked fine.” But the lessons learned at SC11 made us rethink the OSCARS 0.6 service module and requirements. In the near future, we intend to tweak OSCARS 0.6 to provide users more flexibility, making it less of a software and more of a service.
ESnet and its collaborators successfully completed three days of demonstrating its End-to-End Circuit Service at Layer 2 (ECSEL) software at the Open Networking Summit held at Stanford a couple of weeks ago. Our goal is to build “zero-configuration circuits” to help science applications seamlessly use networks for optimized end-to-end data transport. ECSEL, developed in collaboration with NEC, Indiana University, and the University of Delaware, builds on some exciting new conceptual thinking in networking.
Wrangling Big Data
To put ECSEL in context: the proliferating tide of scientific data flows – anticipated to reach 2 petabytes per second as planned large-scale experiments get under way – is already challenging networks to be exponentially more efficient. Wide area networks have vastly increased bandwidth and enable flexible, distributed scientific workflows that involve connecting multiple scientific labs to a supercomputing site, a university campus, or even a cloud data center.
The increasing adoption of distributed, service-oriented computing means that resource and vendor independence for service delivery is a key priority for users. Users expect seamless end-to-end performance and want the ability to send data flows on demand, no matter how many domains and service providers are involved. The hitch is that even though the Wide Area Network (WAN) can have turbocharged bandwidth, at these exponentially increasing rates of network traffic even a small blockage in the network can seriously impair the flow of data, trapping users in a situation resembling commute conditions on sluggish California freeways. These scientific data transport challenges that we and other R&E networks face are just a taste of what the commercial world will encounter with the increasing popularity of cloud computing and service-driven cloud storage.
Abstracting a solution
One key piece of feedback from application developers, scientists and end users is that they do not want to deal with complexity at the infrastructure level while accomplishing their mission. At ESnet, we are exploring various ways to make networks work better for users. A couple of concepts could be game-changers, according to Open Networking Summit presenter and Berkeley professor Scott Shenker: 1) using abstraction to manage network complexity, and 2) extracting and exposing simplicity out of the network. Shenker himself cites Barbara Liskov’s Turing Lecture as inspiration.
ECSEL is leveraging OSCARS and OpenFlow within the Software Defined Networking (SDN) paradigm to elegantly prevent end-to-end network traffic jams. OpenFlow is an open standard to allow application-driven manipulation of network flows. ECSEL is using OSCARS-controlled MPLS virtual circuits with OpenFlow to dynamically stitch together a seamless data plane delivering services over multi-domain constructs. ECSEL also provides an additional level of simplicity to the application, as it can discover host-network interconnection points as necessary, removing the requirement of applications being “statically configured” with their network end-point connections. It also enables stitching of the paths end-to-end, while allowing each administrative entity to set and enforce its own policies. ECSEL can be easily enhanced to enable users to verify end-to-end performance, and dynamically select application-specific protocol forwarding rules in each domain.
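The stitching workflow described above can be sketched in a few lines of Python. Everything here is a hypothetical stand-in for the real OSCARS and OpenFlow controller interfaces (the function names, switch names and identifiers are invented), but it shows the shape of the idea: discover where each host attaches to the network, reserve the WAN circuit between the two attachment points, then program an edge flow rule at each end to splice host traffic onto the circuit.

```python
# Minimal sketch of the ECSEL stitching idea, not the real ECSEL code.
# All functions are invented stand-ins for OSCARS / OpenFlow interfaces.

def discover_attachment(host: str) -> tuple[str, int]:
    # In ECSEL this is discovered dynamically, removing the need for
    # static configuration; here it is a fixed lookup table.
    table = {"host-a": ("switch-a", 3), "host-b": ("switch-b", 7)}
    return table[host]

def reserve_wan_circuit(src_edge: str, dst_edge: str, mbps: int) -> str:
    # Stand-in for an OSCARS reservation; returns a circuit identifier.
    return f"oscars-circuit:{src_edge}:{dst_edge}:{mbps}"

def program_flow(switch: str, in_port: int, circuit_id: str) -> dict:
    # Stand-in for pushing an OpenFlow rule that forwards the host's
    # traffic onto the port facing the WAN circuit.
    return {"switch": switch, "match": {"in_port": in_port},
            "action": f"output:{circuit_id}"}

def stitch(src_host: str, dst_host: str, mbps: int) -> list[dict]:
    sw_a, port_a = discover_attachment(src_host)
    sw_b, port_b = discover_attachment(dst_host)
    circuit = reserve_wan_circuit(sw_a, sw_b, mbps)
    # One rule at each edge splices the host segment onto the circuit.
    return [program_flow(sw_a, port_a, circuit),
            program_flow(sw_b, port_b, circuit)]

rules = stitch("host-a", "host-b", 10_000)
print(len(rules))  # 2: one edge rule per end of the circuit
```

Note how each domain only programs its own edge; that is what lets each administrative entity enforce its own policies while the path is stitched end to end.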
The OpenFlow capabilities, whether in an enterprise/campus setting or within the data center, were demonstrated with the help of NEC’s ProgrammableFlow Switch (PFS) and ProgrammableFlow Controller (PFC). We leveraged a special interface NEC developed to program a virtual path from ingress to egress of the OpenFlow domain; ECSEL accessed this interface programmatically when executing the end-to-end path-stitching workflow.
Our anticipated next step is to develop ECSEL as an end-to-end service by making it an integral part of a scientific workflow. The ECSEL software will essentially act as an abstraction layer, where the host (or virtual machine) doesn’t need to know how it is connected to the network–the software layer does all the work for it, mapping out the optimum topologies to direct data flow and make the magic happen. To implement this, ECSEL is leveraging the modular architecture and code of the new release of OSCARS 0.6. Developing this demonstration yielded sufficient proof that well-architected and modular software with simple APIs, like OSCARS 0.6, can speed up the development of new network services, which in turn validates the value-proposition of SDN. But we are not the only ones who think that ECSEL virtual circuits show promise as a platform for spurring further innovation. Vendors such as Brocade and Juniper, as well as other network providers attending the demo were enthusiastic about the potential of ECSEL.
But we are just getting started. We will reprise the ECSEL demo at SC11 in Seattle, this time with a GridFTP application using Remote Direct Memory Access (RDMA), which has been modified to include the eXtensible Session Protocol (XSP), a signaling mechanism that enables the application to become “network aware.” XSP, conceived and developed by Martin Swany and Ezra Kissel of Indiana University and the University of Delaware, can directly interact with advanced network services like OSCARS, making the creation of virtual circuits transparent to the end user. In addition, once the application is network aware, it can make more efficient use of scalable transport mechanisms like RDMA for very large data transfers over high-capacity connections.
We look forward to seeing you there and exchanging ideas. Until Seattle, any questions or proposals on working together on this or other solutions to the “Big Data Problem,” don’t hesitate to contact me.
Eric Pouyoul, Vertika Singh (summer intern), Brian Tierney: ESnet
We are proud to announce that two of ESnet’s projects have received IDEA (Internet2 Driving Exemplary Applications) awards in Internet2’s 2011 annual competition for innovative network applications that have had the most positive impact and potential for adoption within the research and education community. (see: Internet2’s press release).
Internet2 recognized OSCARS (On-Demand Secure Circuits and Advance Reservation System), developed by the ESnet team led by Chin Guok, including Evangelos Chaniotakis, Andrew Lake, Eric Pouyoul and Mary Thompson. Contributing partners also included Internet2, USC ISI and DANTE.
ESnet’s MAVEN (Monitoring and Visualization of Energy consumed by Networks) proof of concept application was also recognized with an IDEA award in the student category. MAVEN was prototyped by Baris Aksanli during his summer internship at ESnet. Baris is a Ph.D student at the University of California, San Diego conducting research at the System Energy Efficiency Lab with his thesis advisor, Dr. Tajana Rosing. Baris worked closely with his summer advisor, Inder Monga, and Jon Dugan to implement MAVEN as part of ESnet’s new Green Networking Initiative.
The idea behind OSCARS
OSCARS enables researchers to automatically schedule and guarantee end-to-end delivery of scientific data across networks and continents. For scientists, being able to count on reliable data delivery is critical as scientific collaborations become more expansive, often global. Meanwhile, in disciplines ranging from high-energy physics to climate, scientists are using powerful, geographically dispersed instruments like the Large Hadron Collider that are producing increasingly massive bursts of data, challenging the capabilities of traditional IP networks.
OSCARS virtual circuits can reliably schedule time-sensitive data flows – like those from the LHC – round the clock across networks, enabling research and education networks to seamlessly meet user needs. OSCARS code is also being deployed by R&E networks worldwide to support an ever-growing user base of researchers with data-intensive collaboration needs. Internet2, US LHCNet, NORDUnet and RNP in Brazil, as well as more than 10 other regional and national networks, have implemented OSCARS for virtual circuit services. Moreover, Internet2’s NSF-funded DyGIR and DYNES projects will deploy over 60 more instances of OSCARS in 2012 at university campuses and regional networks to support scientists involved in the LHC, Laser Interferometer Gravitational-Wave Observatory (LIGO), Large Synoptic Survey Telescope (LSST) and electronic Very-Long-Baseline Interferometry (eVLBI) programs.
We are proud of the hard work and dedication the OSCARS development team has demonstrated since the start of this project. Just as importantly, we are proud to see this work paying off with new science collaborations and discoveries.
The potential of MAVEN
The Monitoring and Visualization of Energy consumed by Networks (MAVEN) project is a brand new prototype portal that will help network operators and researchers better track live network energy consumption and environmental conditions. MAVEN – implemented by Baris during his summer internship – is a first major step for ESnet in instrumenting our network with the tools to understand these operational dynamics. As networks continue to get bigger and faster, they will require more power and cooling in an era of decreased energy resources. To address this pressing challenge, ESnet is leading a new generation of research aimed at understanding how networks can operate in a more energy-efficient manner. We are grateful for Baris’ significant contributions in leading the development of MAVEN and glad to see that his talent is being recognized by the R&E networking community through this award.
This week, Inder Monga is representing ESnet at the 11th Annual Global LambdaGrid Workshop. The GLIF hosts a meeting of research and education (R&E) network operators, network vendors and researchers that support the paradigm of lambda networking. The GLIF worldwide network is based around a number of lambdas: dedicated high-capacity circuits based on optical wavelengths, which terminate at exchange points known as GOLEs (GLIF Open Lightpath Exchanges). On Monday, a smaller subset of GLIF members, GLIF Americas, will meet to share the various developments in their own R&E networks. ESnet will present the exciting new developments in the Advanced Networking Initiative, including leading work on measuring and sharing network power consumption.
On Tuesday, September 13, at the Museum of Modern Art, ESnet participates in a Network Services Interface (NSI) protocol “plugfest” with OSCARS, its award-winning On-Demand Secure Circuits and Advance Reservation System software, testing it against other bandwidth reservation software to determine its level of interoperability and to find any issues with the specifications. It is encouraging to note that seven independent implementations of NSI are participating in the “plugfest.” OSCARS currently implements the Inter-Domain Control Protocol (IDCP), developed jointly with the DICE working group, to accomplish inter-domain connections today. Converging on a standard NSI protocol will enable the larger GLIF community to participate in federated, multi-domain virtual circuits. For more information on OSCARS and NSI, you can reach Chin Guok, technical lead of OSCARS software development within ESnet; Evangelos Chaniotakis, developer of the NSI protocol for the plugfest; or Inder himself, who is co-chair of the NSI working group in the OGF.
On Wednesday, September 14, the NSI session at GLIF will discuss the current state of the Network Services Interface (NSI) 1.0 standards specifications and the work ahead for the community in getting production instances of the protocol deployed. Up until now, NSI has been purely an academic exercise, but that is changing with the plugfest.
Also that day, Inder will be giving a talk titled “Networks & Power–ESnet’s Initiatives towards Green.” The talk will focus on the recent design and prototype of a network power measurement tool that was developed by Baris Aksanli, a UCSD summer intern, under Inder’s mentorship. It will also give a preview of joint theoretical network energy efficiency research with Baris and his advisor Tajana Rosing at UCSD that is currently being submitted as a conference paper. Research into energy-efficient networking is important to ESnet. Energy efficiency is an issue that will assume international importance as the volume of data carried by scientific networks is relentlessly expanding, putting greater demands on networks in an era of rising energy costs.
ESnet’s Inder Monga and Samrat Ganguly of NEC Corporation made a splash at the Summer 2011 ESCC/Internet2 Joint Techs conference in Fairbanks, Alaska by demonstrating some brand new ways that laboratories, universities, and industry can integrate end-to-end network virtualization across both the local area network (LAN) and wide area network (WAN). Check out the video of their demo “OpenFlow with OSCARS: Bridging the gap between campus, data centers and the WAN.”
Monga and Ganguly combined OpenFlow and the On-Demand Secure Circuits and Advance Reservation System (OSCARS) as centralized controller technologies with NEC ProgrammableFlow switches to demonstrate automation and secure service provisioning that will enable scientists to share data and collaborate more easily while easing the burden on their campus IT infrastructure staff.
Science networking already extensively employs aspects of network virtualization. But mismatches still exist between campus networks, HPC data centers and R&E networks that require manual intervention and limit end-to-end automation and control. OpenFlow-enabled campuses, through integration with OSCARS WAN capabilities, can accomplish automated, end-to-end flow management, closing that gap. Participants also noted the need to define higher-layer API and network virtualization constructs that span both OSCARS and OpenFlow, leading to an end-to-end virtual networking model for applications.
At ESnet we are exploring the potential of OpenFlow because it enables network administrators to “program” flexible manipulation of flows through a well-defined “forwarding instruction set.” Only certain traffic flows, selected by policies pre-programmed into the OpenFlow controller by local network administrators, seamlessly traverse the dynamic WAN virtual circuits enabled by OSCARS. This approach creates a simple, secure and dynamic bridge between multiple OpenFlow-enabled campuses or data centers, instead of requiring the statically (and manually) configured VLAN circuits used today.
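Here is a toy illustration of that policy idea. The addresses, port numbers and the matcher itself are invented for illustration (this is not an OpenFlow controller): only flows an administrator has whitelisted are steered onto the OSCARS circuit, and everything else takes the ordinary routed path.

```python
# Hedged sketch: a simple policy matcher, with invented values throughout.
# Real OpenFlow rules match on header-field tuples in the switch itself;
# this just shows the selection logic the controller would pre-program.

CIRCUIT_PORT = "vc-to-wan"  # port facing the OSCARS virtual circuit

# Pre-programmed policies: (src prefix, dst prefix, transport port) tuples.
POLICIES = [
    ("10.1.0.", "10.2.0.", 2811),  # e.g. GridFTP between two science hosts
]

def select_path(src_ip: str, dst_ip: str, dport: int) -> str:
    for src_pfx, dst_pfx, port in POLICIES:
        if (src_ip.startswith(src_pfx) and dst_ip.startswith(dst_pfx)
                and dport == port):
            return CIRCUIT_PORT  # steer onto the dynamic circuit
    return "normal"             # default routed forwarding

print(select_path("10.1.0.5", "10.2.0.9", 2811))  # vc-to-wan
print(select_path("10.1.0.5", "192.0.2.1", 80))   # normal
```

Because the selection lives in the controller rather than in static VLAN configuration, the policy can change as circuits come and go.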
OSCARS and NEC ProgrammableFlow
OSCARS enables the automated provisioning of guaranteed bandwidth over the WAN in multiple research and education networks around the world. OSCARS allows user applications to reserve bandwidth across multiple wide-area domains in advance and offers reliable end-to-end throughput and quality of service in managing time-sensitive and large data sets.
NEC’s ProgrammableFlow switches and controller software leverage the OpenFlow protocol to automatically monitor and intelligently distribute network traffic across multiple paths, enabling more efficient use of network resources and multiplying available bandwidth within the network. ProgrammableFlow cuts the complexity as it provides a simple interface to implement complex virtualized networks.
This spring ESnet achieved something akin to global presence, both figuratively with our network and in person at conferences, as we traded ideas with the technical community on topics such as the limitations of bandwidth on demand and how to compose services that are easy for end users to understand and use. In May, Steve Cotter, Bill Johnston, and Inder Monga were invited speakers at the TERENA Networking Conference in Prague.
Inder Monga followed with a presentation on “Network Service Interface: Concepts and Architecture,” which discussed the motivation, concepts and architecture of the upcoming Open Grid Forum standard, one that promises to give researchers simple, abstract constructs to dynamically create and manage the communication infrastructure serving their science. During the talk he explained some of the protocol’s differentiating attributes, including its recursive and flexible request-and-response framework and its abstraction of physical topology into a service-layer representation, and declared that composable services are the next logical step in network design. The key to dealing with complex infrastructure is to abstract it into objects the users can understand, but that is just the beginning. A composable services model contains essential elements like abstracted technical requirements in a language all users can understand, failsafe backups, transparent service changes, transport efficiency, and monitoring for “soft” failures. He pointed out that a Topology Service would be the next target for standardization once the Connection Service was fully specified.
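As a rough illustration of the recursive request-and-response idea (the domain names and the path-splitting rule below are invented, and real NSI requests carry far more state), an aggregator can decompose one end-to-end connection request into per-domain segments, each handled by a child agent that may itself aggregate further:

```python
# Toy sketch of recursive request delegation, not the NSI protocol itself.
# A leaf agent provisions its own segment; an aggregator splits the path
# at inter-domain handoff points and recurses into its children.

def request_connection(agent: dict, src: str, dst: str) -> list[str]:
    """Return the list of per-domain segments set up for src->dst."""
    if not agent.get("children"):                  # leaf domain: provision
        return [f"{agent['name']}: {src}->{dst}"]  # its own segment
    segments = []
    hops = [src] + agent["handoffs"] + [dst]       # inter-domain handoffs
    for child, (a, b) in zip(agent["children"], zip(hops, hops[1:])):
        segments += request_connection(child, a, b)  # recurse into children
    return segments

tree = {"name": "aggregator",
        "handoffs": ["exch-1"],
        "children": [{"name": "domain-A"}, {"name": "domain-B"}]}
print(request_connection(tree, "campus-X", "campus-Y"))
# ['domain-A: campus-X->exch-1', 'domain-B: exch-1->campus-Y']
```

The appeal of the recursive framework is exactly this: the requester sees one abstract connection, while each domain independently handles its own piece.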
Steve Cotter talked about meeting user expectations in “Fighting a Culture of ‘Bad is Good Enough,’” asserting that bandwidth on demand on its own is inadequate to meet the growing needs of science. In ESnet’s surveys scientists report that while the technology is often there, they don’t know how to access it or how to make it work. The result is that poor network performance is often the norm at various sites, and scientists are left to fend for themselves without technical assistance. Frustrated, many simply give up attempting to send data via the network and instead use ‘sneakernet.’ But it doesn’t have to be this way. Cotter cited the LHC as one example of the investment of time and commitment needed to do networking right. “For us as a community to succeed, we need to provide intuitive services to researchers, and documentation and assistance to make it easy for them,” said Cotter, before he launched into a run-down of new ESnet tools and ventures.
Now that OSCARS version 0.6 is code-complete, ESnet is taking offers to help test the code. ESnet is also working with its sites to build secure, dedicated enclaves on the perimeters of networks, dubbed Science DMZs, which are fully instrumented with perfSONAR. Separating the campus science traffic from converged network services like VoIP makes it easier to debug and improves performance across the WAN. To make it easier to test and troubleshoot infrastructure, ESnet has created a community knowledge base, http://fasterdata.es.net, that regularly receives more than 2,500 hits a week. ESnet is also developing a multi-function web portal called MyESnet that it will launch at ESCC/Joint Techs in a few weeks. MyESnet will have lots of tools and new features for the scientist and networker, including traffic flow visualizations, high-level information about ‘site health,’ the ESnet maintenance calendar, a discussion forum and idea repository, as well as a one-stop shop where users will be able to log in with Shibboleth or OpenID, initiate perfSONAR tests, and open trouble tickets.
Going beyond just bandwidth on demand
Three weeks later at the NORDUnet conference in Reykjavik, Iceland, Inder Monga discussed the ins and outs of developing composable network services on demand. Given new developments in network virtualization, co-scheduling, cloud services and 100G bandwidth, the network is playing an ever larger role in providing scientists new services.
Incidentally, Inder used high-speed networking to accomplish the enviable feat of being in two places at once without violating any laws of physics. Upon landing in Iceland, Inder promptly presented a talk on green networking, delivered remotely from Iceland to the conference on Green and Sustainable ICT in Delhi, India.
Designing “greener” networks is one of ESnet’s key priorities, and something you will be hearing more about from ESnet in the future.
As the next generation of packet-optical integration permeates multi-layer Internet architecture as well as telecom equipment designs, valuable lessons can be drawn from the hybrid network concepts championed and operationalized by research and education (R&E) networks. In fact, the ESnet4 hybrid architecture, conceived in 2006 and made operational in 2008, consists of separate physical wavelengths for IP-routed and dynamic virtual-circuit services. We are pleased to have an impact on the research and development of these hybrid networking concepts. IEEE Communications Magazine’s special issue on hybrid networking, published in May, includes three ESnet co-authored articles:
Hybrid Networks: Lessons Learned and Future Challenges from the ESnet4 Experience, shares lessons learned from operating a hybrid infrastructure consisting of separate IP-routed and dynamic circuit services. As service requirements are driven by increasingly rigorous application needs as well as impetus to reduce operating costs through automation, working out optimal frameworks will be important as the Internet evolves.
As industry moves towards circuit and packet technology integration, characterizing and abstracting the interactions between the multiple layers will be important to ensure a stable operating environment. Virtual circuit services provided by a network services agent (NSA), such as the OSCARS/ARCHSTONE project funded by DOE, will be able to make complex topology reservations across a multi-layer environment using network capabilities provided by multi-protocol label switching (MPLS), carrier Ethernet, and wavelength/optical switching. The tricky part is to determine the optimal level of coordination and management feedback between the packet- and circuit-switched technologies. This matters greatly not just for Internet and telecommunication service providers, but also for the future development of data center interconnects, cloud computing systems and green networks. We would like to thank the guest editors for championing this important topic. We are honored to participate in this special issue of IEEE Communications Magazine, and look forward to your feedback.
–Inder Monga and Chin Guok on behalf of ESnet’s network and research engineers.
Last month was the first in which the ESnet network crossed a major threshold – over 10 petabytes of traffic! Traffic volume was 40% higher than the prior month and 10 times higher than just a little over 4 years ago. But what’s behind this dramatic increase in network utilization? Could it be the extreme loads ESnet circuits carried for SC10, we wondered?
Breaking down the ESnet traffic highlighted a few things. Turns out it wasn’t all that demonstration traffic sent across thousands of miles to the Supercomputing Conference in New Orleans (151.99 TB delivered), since that accounted for only slightly more than 1% of November’s ESnet-borne traffic. We observed for the first time significant volumes of genomics data traversing the network as the Joint Genome Institute sent over 1 petabyte of data to NERSC. JGI alone accounted for about 10% of last month’s traffic volume. And as we’ve seen since it went live in March, the Large Hadron Collider continues to churn out massive datasets as it increases its luminosity, which ESnet delivers to researchers across the US.
Summary of Total ESnet Traffic, Nov. 2010
Total Bytes Delivered: 10.748 PB
Total Bytes OSCARS Delivered: 5.870 PB
Percentage of OSCARS Delivered: 54.72%
What is really going on is quite prosaic, but to us, exciting. We can follow the progress of distributed scientific projects such as the LHC by tracking our network traffic, as the month-to-month traffic volume on ESnet correlates to the day-to-day conduct of science. Currently, Fermi and Brookhaven LHC data continue to dominate the volume of network traffic, but as we can see, production and sharing of large data sets by the genomics community is picking up steam. What the stats are predicting: as science continues to become more data-intensive, the role of the network will become ever more important.
This award represents teamwork on several fronts. For example, earlier this year, ESnet’s engineering chops were tested when the Joint Genome Institute, one of Magellan’s first users, urgently needed increased computing resources at short notice.
Within a nailbiting span of several hours, technical staff at both centers collaborated with ESnet engineers to establish a dedicated 9 Gbps virtual circuit between JGI and NERSC’s Magellan system over ESnet’s Science Data Network (SDN). Using the ESnet-developed On-Demand Secure Circuits and Advance Reservation System (OSCARS), the virtual circuit was set up within an hour after the last details were finalized.
NERSC raided its closet spares for enough networking components to construct a JGI@NERSC local area network and migrated a block of Magellan cores over to JGI control. This allowed NERSC and JGI staff to spend the next 24 hours configuring hundreds of processor cores on the Magellan system to mimic the computing environment of JGI’s local compute clusters.
With computing resources becoming more distributed, complex networking challenges will occur more frequently. We are constantly solving high-stakes networking problems in our job connecting DOE scientists with their data. But thanks to OSCARS, we now have the ability to expand virtual networks on demand. And OSCARS is just getting better as more people in the community refine its capabilities.
The folks at JGI claim they didn’t feel a thing. They were able to continue workflow and no data was lost in the transition.
Which makes us very encouraged about the prospects for Magellan, and cloud computing in general. Everybody is hoping that putting data out there in the cloud will expand capacity. At ESnet, we just want to make the ride as seamless and secure as possible.
Kudos to Magellan. We’re glad to back you up, whatever the weather.