Last night, I had the pleasure of attending an Austin Forum on Science and Technology lecture delivered by Dr. David P. Anderson of the University of California, Berkeley. The subject of the talk was technology-enabled citizen science, and Dr. Anderson is well positioned to address this topic. He was one of the co-founders of the SETI@home project, and went on from there to establish the Berkeley Open Infrastructure for Network Computing (BOINC) project.
It was pointed out during the introductions that the primary sponsor/partner for this lecture series, the Texas Advanced Computing Center (TACC), has contributed about 1000 years of CPU time to distributed BOINC projects via IBM’s World Community Grid. (In the interest of full disclosure, I should note that TACC is one of my customers for my day job as an Exchange administrator. In fact, another sponsor, the University of Texas at Austin’s Information Technology Services, or ITS, is my employer.)
The timing of this lecture could not have possibly been better. I’ve actually been preparing a posting about citizen science (which I hope to put out in the next few days), and I gleaned several useful tidbits from the lecture. Here are the broad brush strokes of the talk.
Science has become increasingly computational, with heavy demands for high-performance computing (HPC) and high-throughput computing. Examples of research areas with such intense computational demands include:
- Physical simulation
  - Particle collision simulation
  - Atomic/molecular simulation (bio/nano)
  - Earth climate system modelling
- Compute-intensive data analysis
  - Particle physics (LHC)
- Bio-inspired optimization
  - Genetic algorithms, flocking, ant colony, etc.
While such research sometimes requires large single compute jobs, modern research frequently requires many small jobs. Chaotic systems need many runs, with the parameters slightly varied across each run. This allows researchers to explore the distribution of outcomes over a multi-dimensional parameter space.
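To make the "many small jobs" idea concrete, here is a minimal sketch of a parameter sweep. It was not part of the talk: the logistic map stands in for a real chaotic simulation, and the function names and parameter ranges are my own assumptions, chosen only for illustration.

```python
from itertools import product

def logistic_final_state(r, x0, steps=200):
    """Iterate the logistic map x -> r*x*(1-x) and return the final value."""
    x = x0
    for _ in range(steps):
        x = r * x * (1.0 - x)
    return x

def make_jobs(r_values, x0_values):
    """Build one small, independent job per (r, x0) combination."""
    return [{"r": r, "x0": x0} for r, x0 in product(r_values, x0_values)]

def run_job(job):
    """Run a single job; each call is independent of all the others."""
    return logistic_final_state(job["r"], job["x0"])

# Slightly varied parameters across runs (illustrative values only).
r_values = [3.90 + 0.01 * i for i in range(5)]
x0_values = [0.20 + 0.001 * j for j in range(4)]

jobs = make_jobs(r_values, x0_values)
# Here the jobs run in a local loop; in a volunteer computing setting,
# each one could be handed to a different machine.
results = [run_job(job) for job in jobs]

for job, value in zip(jobs, results):
    print(f"r={job['r']:.2f} x0={job['x0']:.3f} -> {value:.6f}")
```

Because no job depends on any other, the sweep splits naturally across as many machines as happen to be available.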
There are many approaches to tackling these computational demands, including cluster computing (such as the supercomputing clusters at TACC and other facilities), grid computing (which essentially involves connected clusters), cloud computing resources (such as Amazon EC2 and Microsoft Azure), and volunteer computing (which is really the topic of this lecture).
Consumer Digital Infrastructure
The proliferation of consumer computing devices (including PCs, laptops, tablets, smartphones, game consoles, set-top boxes, DVRs, and even appliances), coupled with the widespread availability of commodity Internet connectivity, means that there exists a huge pool of computing power waiting to be tapped, since much of that capacity sits idle much of the time.
Consider an estimate of 1 billion desktop and laptop PCs. That equates to 10 ExaFLOPS of CPU capacity and 1,000 ExaFLOPS (one ZettaFLOPS) of GPU (Graphics Processing Unit) capacity.
Or, how about the estimated 2.5 billion smartphones? That translates to 10 ExaFLOPS of CPU capacity.
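As a quick sanity check on these estimates, the per-device figures they imply can be worked out with a few lines of arithmetic. This is only a back-of-the-envelope sketch using the totals quoted above; the helper name is my own.

```python
EXA = 10**18
GIGA = 10**9

def per_device_gflops(total_exaflops, device_count):
    """Average GFLOPS per device implied by an aggregate ExaFLOPS estimate."""
    return total_exaflops * EXA / device_count / GIGA

pc_cpu = per_device_gflops(10, 1_000_000_000)     # 1 billion PCs, CPU
pc_gpu = per_device_gflops(1000, 1_000_000_000)   # 1 billion PCs, GPU
phone_cpu = per_device_gflops(10, 2_500_000_000)  # 2.5 billion smartphones

print(f"PC CPU:    {pc_cpu:.0f} GFLOPS per device")
print(f"PC GPU:    {pc_gpu:.0f} GFLOPS per device")
print(f"Phone CPU: {phone_cpu:.0f} GFLOPS per device")
```

In other words, the estimates assume roughly 10 GFLOPS of CPU and 1 TeraFLOPS of GPU per PC, and about 4 GFLOPS per smartphone.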
The premise of volunteer computing (which is to say, people donating the spare processing capacity of their computing devices) is based upon three sources of motivation for participation: donating to support science, being part of a community, and competing for recognition (people being the competitive beings that they are).
Volunteer computing has its roots in 1997 with the advent of the Great Internet Mersenne Prime Search (GIMPS) and distributed.net. GIMPS is devoted to identifying a special subset of rare prime numbers known as Mersenne primes (with the 48th such number identified earlier this year by the GIMPS project), whereas distributed.net tackles the computationally intensive task of cracking encryption.
In 1999, with the participation of Dr. Anderson, SETI@home took center stage, devoted to the task of analyzing radio telescope data for indications of intelligent life. That year also marked the debut of Folding@home, a distributed computing effort devoted to the study of protein folding, an area of research with major ramifications for the study of many types of disease. By 2003, Dr. Anderson had moved on to release the first iteration of the BOINC framework, which provides client and server software for creating distributed computing projects.
Dr. Anderson went over several factors which limit the impact of volunteer computing, one being the basic difficulty of getting people interested in participating. In a study of college students [Toth 2006], only 5% said that they would “definitely participate” and 10% would “possibly participate.”
Another issue is PC availability. Systems participating in BOINC projects have 65% average availability [Kondo 2008], and only 35% of participating PCs are available 24/7. The simple fact of the matter is that many people turn off their computers when they aren’t using them.
Dr. Anderson cited network bandwidth as another limiting factor, but without really elaborating on that. I found this mention rather curious given how pervasive the availability of broadband connectivity has become (except for rural areas).
He then went on to discuss memory and disk capacity as potential limiting factors. New PCs on average are equipped with 6 GB RAM, which really is inadequate for certain jobs. Not every computer can handle every distributed computing task.
BOINC: middleware for volunteer computing
At this point, Dr. Anderson dove into the details of the BOINC project. It has been supported by the NSF since 2002, is open source (LGPL), and is based at UC Berkeley. There are numerous projects utilizing the BOINC framework, including LHC@home, Climateprediction.net, Cosmology@home, Einstein@home (gravitational wave detection), Rosetta@home (protein folding), and IBM’s World Community Grid. The client software is designed such that multiple projects can be attached to it, with different amounts of time allocated for each.
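To illustrate what allocating different amounts of time to each attached project might look like, here is a small sketch of proportional sharing. This is my own illustration and not BOINC's actual scheduler; the function name, the share values, and the 8-hour figure are all assumptions.

```python
def allocate_time(shares, available_hours):
    """Split the available hours among projects in proportion to their shares."""
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("at least one project needs a positive share")
    return {name: available_hours * share / total
            for name, share in shares.items()}

# Hypothetical shares for three attached projects.
shares = {"climate": 100, "genetics": 100, "physics": 200}
allocation = allocate_time(shares, available_hours=8.0)

for name, hours in allocation.items():
    print(f"{name}: {hours:.1f} hours")
```

With these numbers, the physics project gets half of the idle time and the other two projects split the remainder evenly.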
Creating a BOINC project consists of installing the BOINC server on a Linux system (and it doesn’t even have to be a powerful one), compiling the apps for the target client platform (Windows/Mac/Linux/Android), and then attracting volunteers.
Volunteer computing today
Today, there are 500,000 active computers performing computations on 50 BOINC projects, averaging 15 PetaFLOPS of computing capacity. As an example of the success of such efforts, large numbers of new pulsars have been found in radio telescope data (including data that had already been analyzed).
As an illustration of the cost savings that distributed volunteer computing can provide to researchers, consider a project requiring 10 TeraFLOPS for 1 year. Accomplishing this with a supercomputing cluster would cost roughly $1.5 million. Using cloud computing resources, such as Amazon’s EC2 service, would cost roughly $4 million. However, with volunteer distributed computing, the cost would be roughly $0.1 million, with most of that cost being for software development.
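Normalizing those rough figures makes the gap easier to see. The sketch below uses only the numbers quoted above; the variable names are my own.

```python
TFLOPS_NEEDED = 10
YEARS = 1

# Rough costs in US dollars, as quoted in the talk.
costs_usd = {
    "cluster": 1_500_000,
    "cloud (EC2)": 4_000_000,
    "volunteer": 100_000,
}

# Cost per TeraFLOPS-year for each approach.
per_tflops_year = {name: cost / (TFLOPS_NEEDED * YEARS)
                   for name, cost in costs_usd.items()}

# How many times more expensive each approach is than volunteer computing.
ratio_to_volunteer = {name: cost / costs_usd["volunteer"]
                      for name, cost in costs_usd.items()}

for name in costs_usd:
    print(f"{name}: ${per_tflops_year[name]:,.0f} per TFLOPS-year, "
          f"{ratio_to_volunteer[name]:.0f}x volunteer")
```

By these estimates, the cluster comes out at about 15 times the cost of the volunteer approach, and the cloud at about 40 times.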
Returning to the current state of the BOINC project, that framework can handle heterogeneous collections of client computers. Macs? PCs? Linux boxes? No problem. As for relying upon untrusted, anonymous computers, mischief is guarded against through result validation, using a process of adaptive replication. When an unknown computer joins the network, the results it reports are checked by duplicating the work on other nodes of the network. As a given computer becomes more trusted by turning in reliably verified results, such double-checks are reduced.
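The general idea behind adaptive replication can be sketched in a few lines. To be clear, this is not BOINC's actual algorithm: the `Host` class, the trust threshold, the probability formula, and the 5% floor are all hypothetical choices of mine, meant only to show how double-checks might taper off as a computer earns trust.

```python
import random
from dataclasses import dataclass

@dataclass
class Host:
    """A volunteer machine and its run of consecutive validated results."""
    name: str
    consecutive_valid: int = 0

def replication_probability(host, trust_threshold=10, floor=0.05):
    """Chance that this host's next result is double-checked on another host.

    Hypothetical policy: always replicate until the host has earned
    `trust_threshold` valid results in a row, then taper off toward `floor`.
    """
    if host.consecutive_valid < trust_threshold:
        return 1.0
    return max(floor, 1.0 / (1 + host.consecutive_valid - trust_threshold))

def needs_replication(host, rng):
    """Randomly decide whether to duplicate this host's next job elsewhere."""
    return rng.random() < replication_probability(host)

def record_validation(host, was_valid):
    """Build trust on a valid result; lose all of it on an invalid one."""
    host.consecutive_valid = host.consecutive_valid + 1 if was_valid else 0

rng = random.Random(0)
newcomer = Host("newcomer")
veteran = Host("veteran", consecutive_valid=500)

print(replication_probability(newcomer))  # 1.0: unknown hosts are always checked
print(replication_probability(veteran))   # 0.05: trusted hosts are spot-checked
print(needs_replication(newcomer, rng))   # True
```

Keeping a nonzero floor means that even long-trusted computers are still spot-checked occasionally, and a single invalid result sends a host back to full replication.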
To appeal to the desire by volunteers for recognition, credit is given for the amount of work done, such that high-performing volunteers can, for example, be recognized in the stats on the project website. And all of this is wrapped up in a consumer-friendly client which runs quietly in the background.
The BOINC client has the ability to detect and schedule tasks on GPUs, the graphics co-processors which are common on modern video cards. These GPUs, primarily from NVIDIA, AMD, and Intel, are highly optimized for number crunching, making them ideal for this type of workload. What’s more, the BOINC client can handle the presence of multiple GPUs, even of mixed types, and provides support for accessing those GPUs through a variety of APIs, including CUDA, OpenCL, and CAL. (This mention of CUDA caught my attention. One item long on my plate has been to learn how to program to the CUDA API against the NVIDIA card in my home workstation.) Some limitations in this area include the lack of preemptive GPU scheduling and the lack of paging of GPU memory.
These days, 4- and 8-core processors have become quite common. It really won’t be long before we start seeing next-generation PCs with 100 cores. BOINC is ready for that with support for multi-core apps via OpenMP, MPI, and OpenCL.
BOINC also supports wrapping client packages in virtual machines and bundling Oracle VirtualBox, so the client software can be created in a single environment, eliminating the need for multiple versions. For example, the client code can be developed in a Linux environment, but bundled in virtual machines that will run on Windows or Macintosh systems. As an added benefit, a VM is a strong “sandbox” which can run untrusted applications. Furthermore, virtualization provides “checkpointing,” which simplifies the resumption of an interrupted computation task.
Recently, there has been a new release of the BOINC client for Android, sporting a new GUI and addressing battery-related issues. It was released in July 2013, and is available via the Google and Amazon app stores. Currently, there are ~50K active Android devices participating in BOINC projects.
Why hasn’t volunteer computing gained traction?
The current model for distributed volunteer computing is built around an “ecosystem of projects,” with lots of competing projects. This is something of a problem, though. Volunteers need simplicity. There is no coherent PR. Instead, there are too many “brands” involved. On top of all of that, creating and operating a project is rather difficult.
How to improve?
One of Dr. Anderson’s suggestions for improving the situation is to leverage umbrella projects, where one project serves many scientists. For example, there could be an umbrella project for particle physics calculations, another for genetics calculations, another for climate modeling, etc.
Another proposed direction is greater integration with other projects and tools, as is the case with HTCondor at the University of Wisconsin, which provides a BOINC-based back end for directing computing tasks to the Open Science Grid or any Condor supercomputing pool, or HUBzero at Purdue, which provides a BOINC-based back end for science portals such as nanoHUB.
Then we came to Dr. Anderson’s big proposal: Science@home. This would provide a single “brand” for volunteer computing, in which volunteers would register for participation in science areas rather than projects (the umbrella projects mentioned earlier). “Let’s see, I devote 50% of spare CPU time to climate simulations, and 50% to genetics.”
But how would the umbrella projects go about allocating computing power among specific projects? This would likely involve decision-making by the HPC & scientific funding communities. In any case, the BOINC architecture should make it simple to do all of this because of features already present in the software.
Volunteer distributed computing is:
- Usable for most HTC applications
- A path to ExaFLOPS computing
- A way to popularize science
The software infrastructure is here. The barriers are largely organizational.
One audience member pointed out that these distributed computing resources are not really “free.” For example, the client systems consume electrical power. Dr. Anderson pointed out that the focus in recent years has been on leveraging GPUs, which provide more computational capability per watt than typical CPUs.
Another expressed concerns about computer security. Dr. Anderson noted that, thus far, there have been no security exploits via BOINC.
Asked about the current status of SETI@home, Dr. Anderson pointed out that the project has been running for 12 years, and that the biggest challenge facing them currently is management of the huge volume of data on the back-end.
Dr. Anderson also revealed that the BOINC team is starting to consider developing a BOINC client for iOS.