Open-source science

Most academic science is funded by the government, so the results essentially belong to the people. But the data paid for by these grants rarely appears outside of expensive, subscription-only journals. The Public Library of Science (PLoS), a free online scientific journal, is a great approach to making scientific results available to the public. And, very soon, Google will be another!

Wired Magazine reports that Google’s next huge world-changing project will be a home for terabytes of scientific data. Wired says:

The storage will be free to scientists and access to the data will be free for all. The project, known as Palimpsest and first previewed to the scientific community at the Science Foo camp at the Googleplex last August, missed its original launch date this week, but will debut soon…

The storage would fill a major need for scientists who want to openly share their data, and would allow citizen scientists access to an unprecedented amount of data to explore. For example, two planned datasets are all 120 terabytes of Hubble Space Telescope data and the images from the Archimedes Palimpsest, the 10th century manuscript that inspired the Google dataset storage project.

Wowzers. For ocean scientists, I envision having real-time and archival monitoring data at our fingertips. Imagine having all the long-term data sets (like CalCOFI or the Scripps pier series) in one place. Or imagine people putting all kinds of photo transects up there. Want to virtually dive on a wall in the Galapagos? Here are 1 m square photos along a 100 m transect – analyze as you will.

Of course, the tricky part will be coaxing scientists into actually putting their hard-earned data up for all to see. But I think once the momentum gets going, it will be too powerful a tool to resist.


4 Responses to Open-source science

  1. Sam says:

    This does sound awesome! But from a librarian perspective: are we talking just about raw data, or actual papers? And if we’re talking papers (or either way, really), what’s to stop it being filled with useless crap, rather than useful data? Journals are ungodly expensive, but part of what you’re paying for is the knowledge that a bunch of hard-asses who know their stuff looked over the data and papers and said, “Yes, this is solid and a worthy contribution to the Store of Scientific Knowledge.” If every scientist who uses the Google data has to effectively peer-review all the data herself first, it may be more trouble than it’s worth.

  2. Sorry if I wasn’t clear – my understanding is that it’s raw data, not papers. (PLoS is papers.) The Google system has two advantages: a high barrier to entry and online review. The barrier to entry is that you essentially have to send your data in by mail – Google mails you some hard drives and you mail them back. For review, the Google system will have a comment/annotate function, so people looking through the data can judge its quality.

    Free raw data has a precedent – an online collection of gene sequences called GenBank. Though I’ve never used it, I’ve been told that while there is lots of crap, the useful parts make it through. Much like the internet itself. :>)

  3. JasonR says:

    What about the Science Commons? How does that factor into this? I never see that get much mention. Maybe I am not paying enough attention.

  4. I didn’t know about Science Commons (clearly, my attention needs expanding). Their Scholar’s Copyright Project seems especially pertinent. Thanks, Jason!
