Malaspina expedition: Data deluge

S 32° 27’ 41″ E 50° 55’ 53″ – It’s pushing midnight in the computer room and the zooplankton team, some of them awake since the 4:30am Neuston net tow, are starting to get cranky.

Federico Maldonado Uribe, a marine physiology graduate student at the University of Las Palmas in Gran Canaria, curses the programmer who wrote their awkward data entry system. The sleep-deprived researchers sway from side to side as their floating laboratory bobs up and down on 3-metre waves.

His colleague Ángel Lamas clicks on yet another drop-down menu on the screen in front of him. Maldonado reads him a figure. Lamas taps the keyboard a couple of times. All told, they will label around 100 samples today, which is just one of 28 planned sampling days on this leg of the cruise. They may spend more than 120 hours between them on this leg clicking drop-down menus and repeating numbers in a late-night monotone drone.

Handling the data from such an ambitious project is no easy task. Some of the samples require near real-time treatment or analysis, such as the bacteria whose dying breaths Maldonado measures for the 50 or so hours they survive after collection. Other samples of water, laden with genetic material, will be preserved for future study by other scientists, like many of the birds and plants collected by the explorer Alessandro Malaspina two centuries ago.

Lamas says that the data collection on this campaign is no different from that on shorter cruises off the coast of Spain. He’d normally write something like “Day 5, 40 metre net” on a sampling jar, he says. “There’s only one 40 metre net sample today so when we get home we’d know exactly what this means.”

It’s what happens next that makes this harder for the researchers aboard: the Malaspina expedition managers have devised a coding system for each sample that should allow any of the collaborators to swiftly pin down where particular data originated. It also requires a noisy bar code printer, which spits out bar codes one by one, with a triumphant buzz.

After Lamas and Maldonado have labeled their jars in the old-fashioned way with a marker in the wet lab, they carry a blue plastic crate full of bottles to the neighboring computer room and begin digitizing the labels, which will total around 16,000 over the course of the 7-month expedition.

“The challenge on this campaign is mostly logistical,” says Jordi Dachs, this leg’s chief scientist. The payoff, according to Dachs, is that an expedition of this scale will help Spain’s fragmented oceanography community to share techniques, create a consistent dataset they can all analyze and discuss, and pioneer new areas such as studying the meta-genome of the deep sea and looking for trace signatures of industrial pollutants in remote locations such as the southern Indian Ocean.

Late at night beside the hum and buzz of the bar code printer the project seems too big for normal human beings, too. I wonder whether it is fair that the admirals of Spanish marine science have bitten off so much for their junior scientist-sailors to chew.

Back at the zooplankton labeling station I find a mutinous email posted on the wall which seems to confirm my suspicion. The writer questions why the group must wake so early to perform the Neuston tow and complains that their sleep deprivation could introduce errors into the complicated data entry system.

Then I ask Maldonado who dared put that in writing, and he answers: “Oh, that’s our boss. She’s taking over on the Auckland-Honolulu leg.”

Previous posts in this series:

Malaspina expedition: Water, water everywhere and not a drop to sample (17 February 2011)
Malaspina expedition: Starting out with a splash (14 February 2011)
Malaspina expedition: Shipping out (11 February 2011)

Images: A sample barcode entry form (top) / Federico Maldonado. Ángel Lamas, Institute of Oceanography, Vigo, Fernando Ángel Piñon Gonzalez, Institute of Oceanography, Gijón, and Federico Maldonado, University of Las Palmas, Gran Canaria, discuss protocol / Lucas Laursen. Map indicates approximate location of dispatches.

Read the rest of this entry at Nature’s The Great Beyond [html].