Outgrowing the limits of briefcase capacity

Traditional method of data transport

At the BES requirements workshop that I led last week in Washington, D.C., for scientists and program managers, I saw one significant consequence of the impending explosion of data that the next generation of light sources and instruments at BES facilities will produce.

The sheer quantity of data involved is going to completely change the scientific process for the scientists who use these facilities. The current model for data transport used by most scientists at light sources does not use networks at all – scientists travel to the light source, run their experiments, and travel home with a USB hard drive loaded with a few hundred gigabytes of data (perhaps a terabyte or two, but even that is tractable with portable media). This model has worked well for this community for years.

However, as instruments are upgraded with new detectors and as new data analysis methods are employed, data sets are going to increase in size by up to a factor of 100 over the next few years. Scientists who might carry home 700 gigabytes of data today will need to move 70 terabytes in the near future. I don’t know about you, but my briefcase isn’t up to the task.

Data will have to be transferred home over the network, or scientists will have to perform the computational analysis on site at the facilities. Other options include streaming data to supercomputer centers for real-time or near-real-time analysis. Whatever happens, the scientists will need more from their networks and from the systems connected to them.
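
To get a feel for what moving 70 terabytes over a network means, here is a rough back-of-the-envelope sketch. The link rates (1, 10 and 100 Gbps) and the `efficiency` parameter are illustrative assumptions of mine, not figures from the workshop, and the function name `transfer_hours` is made up for this example. It uses decimal units (1 TB = 10^12 bytes) and treats the link as fully available unless you lower `efficiency`.

```python
def transfer_hours(size_terabytes: float, rate_gbps: float, efficiency: float = 1.0) -> float:
    """Estimate hours needed to move a data set over a network link.

    size_terabytes: data set size in decimal terabytes (1 TB = 1e12 bytes)
    rate_gbps:      nominal link rate in gigabits per second (assumed value)
    efficiency:     fraction of the nominal rate actually achieved (assumed value)
    """
    bits = size_terabytes * 1e12 * 8                  # bytes -> bits
    seconds = bits / (rate_gbps * 1e9 * efficiency)   # bits / (bits per second)
    return seconds / 3600


# Compare today's 700 GB data set with a 70 TB one at a few assumed link rates.
for size_tb in (0.7, 70):
    for rate in (1, 10, 100):
        hours = transfer_hours(size_tb, rate)
        print(f"{size_tb:>5} TB at {rate:>3} Gbps: {hours:7.1f} hours")
```

Under these idealized assumptions, 70 terabytes takes roughly 156 hours (about six and a half days) at a sustained 1 Gbps and a little under 16 hours at 10 Gbps. Real transfers rarely sustain the full nominal rate, so passing something like `efficiency=0.5` gives a more cautious estimate.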

The increase in data as instrumentation capacity improves will mean a significant change in the science process for these communities. Transferring data will require network capacity upgrades at the scientific facilities and the laboratories that support them, as well as network test and measurement tools such as perfSONAR.

ESnet is ready to help, with pilot projects already underway.

–Eli Dart