News aggregator

Photo Essay: Seeking Humanity’s Roots

East Africa’s rift valley is considered by many to be the cradle of humanity. In the Turkana region of northwest Kenya, researchers Christopher Lepre and Tanzhuo Liu of Columbia University’s Lamont-Doherty Earth Observatory are cooperating with colleagues to study questions of human evolution, from the creation of the earliest stone tools to climate swings that have affected developing civilizations. Startling new discoveries are coming from this region at a rapid pace. Here are images from a recent field expedition. READ THE FULL SCIENTIFIC STORY or WATCH A VIDEO

 " data-cycle-speed="750" data-cycle-center-horz="true" data-cycle-caption="#gslideshow_captions" data-cycle-caption-template="{{alt}}">
Northwest Kenya's Turkwel River runs through ancient lands where many key fossils and artifacts left by early humans have come to light. The region is now inhabited by the Turkana people. Near sunset, three boys frolic on their way to fetch water. Further south, at a site called Olorgesalie, professional fossil hunter Bonface Kimeu visits some of thousands of stone axes left by proto-humans who lived here 500,000 to 1 million years ago. Bonface's father, Kamoya Kimeu, has found many of the world's most important fossils; these include a 1.6 million-year-old Homo erectus skeleton dubbed Turkana Boy, still the most complete early human remains ever found.  The Turkana region once had a climate hospitable for humans and their ancestors; now it is brutally hot and dry. Here, a local man crosses what was a lakebed as little as 5,000 years ago. The present-day Turkana people survive mainly by herding goats and camels. Near the Turkwel River, Lamont-Doherty geologist Christopher Lepre surveys layers of rocks and sediments dating back about 3.6 million years. In many places, wind, water and tectonic forces have shredded the surface into badlands, exposing many sequences across time--ideal for finding fossils and artifacts. "You get the whole human story in this one small place," says Lepre. Lamont-Doherty geochemist Tanzhuo Liu inspects eroding sediments left by a long-gone lakebed. Off near the horizon, what was probably once an island where thousands of years ago people erected stone monoliths and harpooned fish. Most research in the remote region is based out of the Turkana Basin Institute, which provides facilities for scientists. A wall at the institute documents the stream of major discoveries coming out of here. Until 2015, the stone tool displayed at lower left was considered the world's oldest, at 2.6 million years. (Actually it is a replica from a set.) Then Lepre and colleagues at Stony Brook University dated a set of tools from the Turkana region to 3.3 million years. This has reset the entire archaeological record. Lepre prepares to sample an outcrop of ancient sediments. He establishes the ages of objects found within or near layers using magnetostratigraphy--the study of how earth's magnetic field periodically reverses itself. Changes in polarity can be identified by the orientation of mineral grains. A whitish layer of compacted volcanic ash, or tuff, runs through one section. As Africa slowly tears apart along the seam of the rift valley, volcanic eruptions are commonly generated. A tuff sample chiseled out for later lab analysis. Lepre has etched an arrow into the sample to indicate which way pointed north before he removed it. Earthquakes are also frequent, as evidenced by this erosion-resistant old fault snaking though the landscape. Faults tend to confuse the chronological picture, because they thrust up some sections of land while dropping others down, putting layers out of sequence. Lepre scrambles to a high point to survey the area. Getting reliable dates means taking dozens of samples, from top to bottom. Measuring radioactive decay of certain minerals is another way to date rocks. Back at the research station, geologist Andrew Gleadow of Australia's University of Melbourne pans out heavy grains used for such analyses. A model skeleton of our own species, Homo sapiens, is used at the research station for teaching comparative anatomy to visiting students. Liu is charting past lake levels using desert varnishes--thin coatings on rocks whose compositions reflect moisture levels at the time they were deposited. Here, he hunts specimens that may have lain undisturbed for millennia. This rock has a nice varnish coating, "Look at this specimen--it's beautiful!" says Liu. Remains of the past are everywhere. By chance, the researchers spot a projectile point--maker and age unknown. They are careful to leave it in place; such artifacts are the province of archaeologists. The petrified roots of a onetime forest, from wetter times during the Pliocene, which started 5.3 million years ago. The soil and trees that grew above them are long gone, eroded away by time. Fossilized animal bones--possibly ancestors of modern antelopes or pigs--bleed from a seasonal riverbed. Nearby, in the 1990s, anthropologists found bone fragments of a distant human relative who lived around 3.5 million years ago, and may have hunted such animals. A Turkana herder stops by. Many Turkana remain deeply traditional, retaining their own language, religion and clothing. Turkana teens each carry the traditional curved stick and portable stool used for long hours of goat herding in the bush. But Kenya is modernizing fast; they also have a cell phone. The diverse hustle of Nairobi, Kenya's capital, is drawing people from many rural areas. Rapid urbanization is a global phenomenon; for the first time ever, more than half of people live in cities--a momentous step in human migration. At Nairobi's Kenya National Museum, two Homo sapiens visit the Homo erectus known as Turkana Boy. Continuing discoveries in evolution serve as powerful reminders that all humans come from, and remain, one family.
Further south, at a site called Olorgesalie, professional fossil hunter Bonface Kimeu visits some of thousands of stone axes left by proto-humans who lived here 500,000 to 1 million years ago. Bonface's father, Kamoya Kimeu, has found many of the world's most important fossils; these include a 1.6 million-year-old Homo erectus skeleton dubbed Turkana Boy, still the most complete early human remains ever found.

The See-Through Sea - The Economist

Featured News - Tue, 06/07/2016 - 12:00
The pioneering maps put together by Lamont's Marie Tharp and Bruce Heezen in the 1950s and 1960s, which first identified the structure of the mid-Atlantic ridge, were mind-expandingly right in their synoptic vision. The Economist looks at the challenges then and now of mapping the sea floor.

Was There an Ice Age in the Southern Hemisphere? - New York Times

Featured News - Mon, 06/06/2016 - 12:00
Lamont's Joerg Schaefer answers a reader's science question for the New York Times: Was there an ice age in the Southern Hemisphere?

Measuring Ice Behavior From Earth’s Glaciers to Saturn’s Frigid Moons - Science Explorer

Featured News - Thu, 06/02/2016 - 12:00
A team of Lamont researchers led by Christine McCarthy has built a new apparatus in the Rock Mechanics Lab to gain insight into the behavior of ice on Earth and elsewhere in the solar system.

Seeing the Seafloor in High Definition: Modern Mapping Offers Increasing Clarity - Earth Magazine

Featured News - Tue, 05/31/2016 - 12:00
Earth Magazine talks with Suzanne Carbotte and other scientists about advances in the mapping of the seafloor that are providing extraordinary detail.

'Dirty Blizzard' Carried Deepwater Horizon Contaminants to Seafloor - E&E

Featured News - Tue, 05/31/2016 - 12:00
Scientists led by Lamont's Beizhan Yan have discovered the mechanism that transported contaminants from the Deepwater Horizon oil spill to the bottom of the Gulf of Mexico.

Deepwater Horizon: Oil Fell on the Seabed Like Snow - Der Spiegel

Featured News - Tue, 05/31/2016 - 12:00
Researchers led by Lamont's Beizhan Yan estimate that 10 to 15 percent of the oil released by the Deepwater Horizon disaster sank to the seabed in the Gulf of Mexico, where it covered hundreds of square miles. (In German)

New Tool Measures the Behavior of Ice on Moons - American Institute of Physics

Featured News - Tue, 05/31/2016 - 12:00
A new study led by Lamont's Christine McCarthy offers a glimpse of what happens inside ice. The scientists developed a device to measure ice as it changes in response to external forces, both on Earth and on the moons of other planets.

The Database Dilemma

Chasing Microbes in Antarctica - Wed, 05/25/2016 - 12:31

Microbial ecologists know they have a problem with data archiving, particularly when it comes to sequence data.  I’m not entirely sure why this is the case; in theory it would be pretty straightforward to build a database searchable by the parameters ecologists are interested in.  I suspect that part of the problem is a lack of interest on the part of NSF (the primary source of funding for ecology) in designing and maintaining databases, and the perception that this is duplicative of the (rather sad) efforts of the NIH-funded NCBI in this area.  I don’t want to be too disparaging of NCBI – they have a much harder job than the simple database I’ve outlined above – but the Sequence Read Archive (SRA) and other NCBI repositories are a near-total disaster.  We’re wasting a lot of collective time, money, and effort developing sequence data that can’t be used for future studies because it can’t be easily accessed.

In response to this a small number of alternate repositories have popped up.  In my experience however, they don’t offer much of an improvement.  This isn’t meant as a harsh criticism of various peoples’ good efforts, I just don’t think the community’s found a viable model yet.  One example is the VAMPS database operated by the Marine Biological Laboratory, which in addition to having an unfortunate name (I believe it was named before the onset of the angsty-teen vampire thing, try googling “VAMPS” and you’ll see what I mean), tries to do too much.  Rather than organizing data and making it searchable the designers opted to try and build an online analysis pipeline.  Instead of easy to find data you end up with an analytical black box and canned results.

The reason for all this soap-boxing is my experience this week trying to obtain some global marine 16S rRNA gene amplicon data for a project that I’m working on.  The data I had hoped to obtain was used in the 2014 PNAS paper Marine bacteria exhibit a bipolar distribution.  My difficulty in obtaining the data highlights failures at the author, reviewer, journal, and repository levels.  I have great respect for many of the authors on this paper, so again this isn’t meant as a harsh criticism (I’m sure I haven’t made data I’ve published sufficiently accessible either), but rather a lesson in something that we all need to do better at.  I’ll end with the Python-based solution I came up with for getting useful sequence data out of the VAMPS flat-file format.

The 2014 paper used 277 samples that were collected as part of the International Census of Marine Microbes (ICOMM).  There are quite a few things one can say about that particular endeavor but we’ll save it for another post.  The authors included the accession numbers for all these studies in a supplementary file.  This file is a pdf, so not the easiest thing to parse, but at least the text is renderable, so the (7-page) table can be copied into a text file and parsed further.  With a little bit of effort you can get a single-column text file containing the accession numbers for each of these studies.  Great.  Unfortunately a good number of these are wildly incorrect, and most of the remainder point to SRA objects that can’t be recognized by the woefully inadequately documented SRA toolkit.  Here are some examples:

SRX011052 – Enter this accession number into the SRA search bar and it will navigate you to a page for a sample with the helpful name “Generic sample from marine metagenome”.  Nevermind for a moment that “metagenome” is an inappropriate name for an amplicon dataset, where is the data?  Click the link for “Run” (with a different accession number: SRR027219) and navigate to a new page with lots of obscure information you don’t care about.  If you’re actually doing this you might, in near desperation, click the “Download” tab in the hope that this gets you somewhere.  Psych!  All that page has is a message that you need to use the SRA toolkit to download this data.  Let’s try the prefetch utility in the toolkit:

>prefetch SRR027219 2016-05-25T15:30:43 prefetch.2.4.2 err: path not found while resolving tree within virtual file system module - 'SRR027219.sra' cannot be found.

Whatever that means.  There’s a trick to using this utility, which I’ll post as soon as I remember what it is.  The documentation suggests that the above command should work.  The absurdity of needing to rely on this utility in place of FTP usually drives me to using the EMBL-EBI site instead.  This is the Euro equivalent of Genbank, anything submitted to one should be accessible through the other.  In general EMBL-EBI is much better organized and more user friendly than SRA.

Entering “SRX011052” into the ENBL-EBI search bar returns some results that also point us to SRR027219.  The page for SRR027219 has such helpful things as an FTP link for direct download as well as some basic info on the sample.  That’s great, but because the authors opted to provide “Experiment” accession numbers for some samples (e.g. SRX011052) there is no way that I can see to generate a script to automate downloads, as we have no way of knowing the run accession numbers.  The solution is to tediously follow the links to all 277 runs.  I actually started doing this which lead me to my second problem.

SRP001129 – At some point in the supplementary table the accession numbers switch from a “SRX” prefix to “SRP”.  Picking one of these at random and entering it into the EMBL-EBI search bar returns a result for a genome sequencing project related to the Human Microbiome Project.  Clearly this isn’t from the ICOMM project.  Roughly a third of the accession numbers reported in the paper aren’t correct!

Clearly I wasn’t going to get the data that I needed through either EMBL-EBI or the SRA.  Knowing that the study had been part of the ICOMM I turned to the MBL VAMPS database where the ICOMM data is also archived.  Thinking that I was on the home stretch I followed the link to the ICOMM portal from the VAMPS homepage.  The right side column gave me two options for download; tar-zipped sff files or a tab-delimited file.  Not wanting to deal with converting from sff (the organic 454 file format) I took a look at the tab-delim option.  The top lines of the file look like this:

project dataset sequence taxonomy knt rank total_knt frequency ICM_ABR_Av6 ABR_0006_2005_01_07 GAAACTCACCAAGGCCGACTGTTAGACGAAGATCAGCGTAATGAGCTTATCGGATTTTCAGAGAG Archaea;Euryarchaeota;Thermoplasmata;Thermoplasmatales;Marine_Group_II 1 family 36172 2.7645693e-05 ICM_ABR_Av6 ABR_0006_2005_01_07 GAAACTCACCAAGGCCGACTGTTAGATGAAGATCAGCGTAATGAGCTTATCGGATTTTCAGA Archaea;Euryarchaeota;Thermoplasmata;Thermoplasmatales;Marine_Group_II 1 family 36172 2.7645693e-05 ICM_ABR_Av6 ABR_0006_2005_01_07 GAAACTCACCAAGGCCGACTGTTAGATGAAGATCAGCGTAATGAGCTTATCGGATTTTCAGAGAG Archaea;Euryarchaeota;Thermoplasmata;Thermoplasmatales;Marine_Group_II 13 family 36172 0.000359394006 ICM_ABR_Av6 ABR_0006_2005_01_07 GAAACTCACCAAGGCCGACTGTTAGATGAAGATCAGCGTAATGAGCTTATCGGATTTTCTAGAGAG Archaea;Euryarchaeota;Thermoplasmata;Thermoplasmatales;Marine_Group_II 1 family 36172 2.7645693e-05 ICM_ABR_Av6 ABR_0006_2005_01_07 GAAACTCACCAAGGCCGACTGTTAGATGAAGATTCAGCGTAATGAGCTTATCGGATTTTCAGAGAG Archaea;Euryarchaeota;Thermoplasmata;Thermoplasmatales;Marine_Group_II 1 family 36172 2.7645693e-05 ICM_ABR_Av6 ABR_0006_2005_01_07 GAAACTCACCAGGGCCGACTGTTAGATGAAGACCAGTGTAACGAACTTGTCGGATTTTCAGAGAG Archaea;Euryarchaeota;Thermoplasmata;Thermoplasmatales;Marine_Group_II 1 family 36172 2.7645693e-05 ICM_ABR_Av6 ABR_0006_2005_01_07 GAAACTCACCAGGGCCGACTGTTATATGAAGACCAATGTGATGAACTTGTCGGATTTTCAGAGAG Archaea;Euryarchaeota;Thermoplasmata;Thermoplasmatales;Marine_Group_II 2 family 36172 5.5291386e-05 ICM_ABR_Av6 ABR_0006_2005_01_07 GAAACTCACCAGGGCCGACTGTTATATGAAGACCAGCGTAATGAGCTTGTCGGATTTTCAGAGAG Archaea;Euryarchaeota;Thermoplasmata;Thermoplasmatales;Marine_Group_II 1 family 36172 2.7645693e-05 ICM_ABR_Av6 ABR_0006_2005_01_07 GAAACTCACCAGGGCCGACTGTTATATGAAGACCAGCGTGATGAACTTGTCGGATTTTCAGAGAG Archaea;Euryarchaeota;Thermoplasmata;Thermoplasmatales;Marine_Group_II 1 family 36172 2.7645693e-05

Okay, so this is a bizarre way to share sequence data but at this point I was willing to take it.  Basically the file gives us unique reads, their abundance (knt), and taxonomy according to VAMPS.  Want I needed to do was convert this into a separate, non-unique fasta file for each sample.  With a little help from Pandas this wasn’t too bad.  The vamps_icomm_surface.csv file specified below is just a csv version of the pdf table in the PNAS publication, after some tidying up.  The second row of the csv file is the sample name in VAMPS (I had to split the first column of the original table on ‘.’ to get this).

import pandas as pd ## get the metadata for surface samples, this was generated ## from the pdf table in the PNAS paper metadata = pd.read_csv('vamps_icomm_surface.csv', index_col = 1) ## get the seqs, downloaded from seqs = pd.read_csv('icomm_data.tsv', delimiter = '\t', index_col = 1) ## iterate across each study in metadata, making a temp dataframe ## of reads from that study for study in metadata.index: temp = seqs.loc[study] temp_bacteria = temp[temp['taxonomy'].str.contains('Bacteria')] ## now iterate across each row in the temp dataframe, exporting ## the sequence the specified number of times to a fasta file with open(study + '.bacteria.fasta', 'w') as output: for row in temp.iterrows(): sequence = row[1]['sequence'] for i in range(1, row[1]['knt']): name = row[0] + '.' + str(i) print name print >> output, '>' + name print >> output, sequence

To end I’d like to circle back to the original dual problems of poor database design and erroneous accession numbers.  Although these problems were not directly connected in this case they exacerbated each other, magnifying the overall problem of data (non)reuse.  Some suggestions:

  1.  The SRA and other databases needs to aggregate the run accession numbers for each new study.  A couple of clicks should lead me to a download link for every sequence associated with a study, whether that study generated the raw data or only used it.  I don’t believe it is currently possible to do this within SRA; initiating a new project assumes that you will contribute new raw sequences (correct me if you think otherwise).  This should be easy to implement, and the association of a publication with a project should be rigorously enforced by journals, departments, and funding agencies.
  2. Accession numbers should be automatically checked by journals the way references are.  If it points to something that isn’t downloadable (or at least has a download link on the page) the author should correct it.  There is no excuse for incorrect, incomplete, or non-run accession numbers.  That this was allowed to happen in PNAS – a top-tier journal – is double frustrating.  And frankly, if you reviewed this paper you owe me beer for not spot checking the refs.
  3. Validating this information is the job of the reviewers.  Almost nobody does this.  Usually when I review a paper containing sequence data I’m the only reviewer that insists the data is available at the time of publication (I can’t think of a single case where I wasn’t the only reviewer to insist).  Not only is this a necessary check, but part of the job of the reviewer is to spot check the analysis itself.  Clearly, without data that’s not being done either…

And that’s enough ranting about data availability for one day, back to work…

Climate Change and Extreme Weather - Climate Connections

Featured News - Wed, 05/25/2016 - 09:30
Was that extreme weather event caused by climate change? It’s a question scientists get asked a lot, and one that they’re increasingly able to answer, says Lamont's Adam Sobel.

Fascinated by the Forces that Contribute to Flow in Ice and Rock - People Behind the Science

Featured News - Mon, 05/23/2016 - 12:00
In this podcast, Lamont's Christine McCarthy talks about life and the science of flowing ice.

Investigators Conduct Deep Sea Search For EgyptAir Flight 804 - NPR

Featured News - Mon, 05/23/2016 - 12:00
The wreckage from EgyptAir Flight 804 is likely in the Mediterranean Sea somewhere between Crete and Egypt. Lamont's David Gallo discusses the challenges of the search.

Signals From Plane Hint at Swift Catastrophe, Aviation Website Reports - New York Times

Featured News - Fri, 05/20/2016 - 12:00
The flight’s track indicated that it crashed about halfway between Crete and Egypt. “If that is correct, then it has landed on a feature we call the Mediterranean Ridge,” Lamont's Bill Ryan told the Times. When sonar is used to scan the area, “you get a complex play of echoes that was nicknamed cobblestone, showing the sea floor is very bumpy."

'Fundamentally Unstable’: Concerns About East Antarctica's Biggest Glacier - Washington Post

Featured News - Wed, 05/18/2016 - 16:44
The Totten glacier ice region is bigger than California, and could raise seas by over 10 feet if it collapsed. The Washington Post talked with scientists, including Lamont's Robin Bell, about the risks.

A Secret Forest No One Noticed - Atlas Obscura

Featured News - Wed, 05/18/2016 - 16:37
Cedar trees living on the steep cliffs of the Niagara Escarpment were centuries old, and no one knew until scientists took a closer look. The Tree Ring Lab at Lamont confirmed their find.

Is It The End of The World as We Know It? - Huffington Post

Featured News - Tue, 05/17/2016 - 16:02
In this short video podcast, Huffington Post's Karah Preiss and Lamont professor Peter de Menocal discuss the historical epochs, the idea of the Anthropocene, and whether it’s possible to change course.

Roadmap to Ocean Worlds: Polar microbial ecology and the search for boring, normal life

Chasing Microbes in Antarctica - Mon, 05/16/2016 - 20:11

Recently congress recommended that NASA create an Ocean Worlds Exploration Program whose primary goal is “to discover extant life on another world using a mix of Discovery, New Frontiers, and flagship class missions”.  Pretty awesome.  In February I was invited to participate on the science definition team for the Roadmap to Ocean Worlds (ROW) initiative.  The ROW is step one in NASA’s response to the congressional recommendation.  Over the last few months the science definition team has been meeting remotely to hash through some challenging questions around the central themes of identifying ocean worlds, defining their habitability, and understanding the life that might live in or on them.  This week we will have our first and only opportunity to meet in person as a team.  As one of very few biologists in the group, and possible the only ecologist, I’ll be giving a short talk on polar microbial ecology as it pertains to discovering and understanding life on other worlds.  As a way to organize my thoughts on the subject and prep for the talk I’m going to try an experiment and write out the main points of the presentation as an article.

I decided to title the talk (and this post) Polar microbial ecology and the search for boring, normal life because it reflects an idea that I and others have been advocating for some time: there isn’t anything that special about extreme environments (gasp!).  Life functions there in pretty much the same way that it functions everywhere else.  One way to think of this is that polar microbes (and other extremophiles) are uniquely adapted but not unique; they follow the standard rules of ecology and evolution, including how they acquire energy and deal with stress.  The nuances of how they interact with their environment reflect the unique characteristics of that environment, but the microbes are generally kind of conventional.

Act I: Challenges and opportunities for polar microbes

As with microbes in any environment, polar microbes are presented with a range of challenges and opportunities that define how they live.  From a microbial perspective challenges are stressors, or things that cause damage.  Opportunities are sources of energy.  Very often these things are the same.  Take sunlight, for example.  This is the single most important source of energy on the planet, but get a little too much of it (or any of it, to be more accurate), and you start doing damage to yourself.  As a result you have to invest some of the acquired energy into offsetting this stress.  We can illustrate this with a very simple conceptual model.

stress_energyThe above figure shows a hypothetical (but plausible) relationship between energy and stress.  Energy is shown both on the x-axis and as the blue 1:1 line.  Stress is the orange line.  In this scenario as energy increases, stress increases logarithmically.  The energy that is available for growth (that which is not dedicated to offsetting the damage caused by stress) is energy – stress, or the difference between the blue and orange lines.  That looks like this:

biomass_stressThe units for biomass are, of course, arbitrary, what matters is the shape of the curve.  Past a critical energy value the amount of biomass an ecosystem can sustain actually decreases as energy increases.  At the point at which stress = energy, no biomass is being accumulated.  It isn’t a coincidence that the shape of the biomass curve is the same as experimentally determined temperature curves for bacterial isolated belonging to different temperature classes (below).  In those curves the energy in the system is increasing as temperature rises.  This enhances enzymatic reactions (producing more biomass) until there is so much energy that the system becomes disordered (enzymes denature).  This process is directly analogous to the one presented here, but takes place at the population level rather than the ecosystem level.


Figure from  Temperature curves for a psychrophilic (optimized to low temperature) bacteria and a mesophilic (optimized to roughly human body temperature) relative.

One thing that I would like to make clear up front is that low temperature itself is not a stressor.  Low temperature alone doesn’t kill single-celled organisms, it only impedes their ability to deal with other stress.  Because of this energy and stress have a very complicated relationship with temperature in cold ecosystems.  For example:

  • As temperature decreases, energy decreases.
  • As energy decreases, stress decreases (remember that in our conceptual model stress is a function of energy).
  • As temperature decreases, access to substrate (and by proxy energy) increases.
  • As temperature decreases, inhibitory compounds (which are stress) increases.

So many of the effects of low temperature are in conflict.  Highly adapted psychrophiles play off these conflicts to establishes niches in cold environments.  We can illustrate this idea by focusing on the last two bullet points, which are specific to environments that contain ice.  Impure ice (which virtually all environmental ice is) does not form a solid structure when it freezes.  It forms a porous matrix, with the porosity determined by the concentration of solutes and the temperature.  You can actually view this in 3D using sophisticated X-ray tomography techniques, as shown in this figure from Pringle et al., 2009.

From Pringle et al., 2009

From Pringle et al., 2009.  X-ray tomography images of saline ice at different temperatures.  The gold represents pore spaces, which decrease in size as the temperature drops.

The gold spaces in this figure are pore spaces within the ice.  As the ice gets colder the pore spaces become smaller and the solutes contained in them become more concentrated.  That’s because whatever was dissolved in the starting material (e.g. salt, sugars, small particles) is excluded from the ice as it forms crystals, the excluded material ends up in the pore spaces.  The colder the ice, the smaller the spaces, the more concentrated the solutes.  Bacteria and ice algae that are also excluded into these spaces will experience a much higher concentration of nutrients, potential sources of carbon, etc.  This equates to more energy, which helps them offset the stress of being in a very salty environment.  In very cold, relatively pure ice, solute concentrations can be 1000x those of the starting material.  Neat, right?

Here’s what decreasing temperature actually means for a bacterial cell trying to make a go of it in a cold environment:

quick_blog_figOur earlier plot of biomass depicted cells in the “growth” phase shown here.  Only very early in the plot, when energy was very low, and at the very end, when stress was very high, did biomass approach zero.  That zero point is called the maintenance phase.  Maintenance phase happens when decreasing temperature suppresses the ability to deal with stress to the point that a cell cannot invest enough resources into creating biomass to reproduce.  The cell is investing all its energy in offsetting the damage caused by stress.

At increasingly low temperatures the synthesis of new biomass becomes increasingly less efficient, requiring a proportionally greater expenditure of energy (we say that bacterial growth efficiency decreases).  At the end of the maintenance phase the bacterial cell is respiring, that is it is creating energy, but this is resulting in virtually no synthesis or repair of cellular components.  This leads inevitably to cell death when enough cellular damage has accumulated.  It’s sad.

If we lift our heads up out of the thermodynamic weeds for a moment we can consider the implications of all this for finding life on another cold world (all the good ones are cold):

quick_blog_fig2On the assumption that everyone’s asleep by this point in the presentation I’m trying to be funny, but the point is serious.  You hear a lot of silly (my bias) talk in the astrobiology community about life at the extremes; how low can it go temp wise etc.  Who cares?  I’m not particularly interested in finding a maintenance-state microbial community, nor do we stand a very good chance of detecting one even if it was right under our (lander’s) nose.  The important contribution to make at this points is to determine where life is actively growing on the relevant ocean worlds.  Then we can try to figure out how to reach those places (no mean task!).  The point of this whole exercise it to find extant life, after all…

Act II: Where are polar microbes distributed and why?

I can think of 8 polar environments that are particularly relevant to ocean worlds:

  • Supraglacial environment (glacier surface)
  • Interstitial glacier environment (glacier interior)
  • Subglacial environment (where the glacier meets rock)
  • Sea ice surface (top’o the sea ice)
  • Interstitial sea ice (sea ice interior)
  • Sea ice-seawater interface (where the sea ice meets the seawater)
  • Water column below ice (the ocean! or a lake)
  • The sediment-water interface (where said lake/ocean meets mud)

Each of these environments has a range of stress, energy, and temperatures, and this to a large extent defines their ability to support biomass.  For the purpose of this discussion I’ll report biomass in units of bacteria ml-1.  To give some sense of what this means consider that what the waste-treatment profession politely calls “activated sludge” might have around 108 bacteria ml-1.  Bottled water might have 103 bacteria ml-1.  Standard ocean water ranges between 104 bacteria ml-1 and 106 bacteria ml-1 at the very highest end.  One further note on biomass… our conceptual model doesn’t account for grazing, or any other trophic transfer of biomass, because that biomass is still in the biological system.  Bacteria ml-1 doesn’t reflect this, because bacteria are consumed by other things.  So the values I give for the ecosystems below are at best a loose proxy for the capacity of each ecosystem to support biomass.

From left to right, top to bottom: supraglacial environment (, interstitial glacier environment (Price, 2000), subglacier environment – arguably a subglacial lake, but a very shallow one with interface characteristics (, sea ice surface (hey, that’s me!), sea ice interior (Krembs et al., 2002), sea ice-seawater interface (I.A. Melnikov), water below ice (from our last season at Palmer Station), sediment-water interface (

Let’s consider the sea ice-seawater interface and the sea ice surface in a little more detail.  The sea ice-seawater interface is interesting because it has, among all polar microbial environments, probably the greatest capacity to support biomass (we’re talking about summertime sea ice, of course).  It is located at the optimum balance point between energy and stress, as shown in the figure below.  Despite the fact that the sea ice surface has relatively high biomass, it is located at the extreme right side of the plot.  The trick is that biomass has accumulated at the sea ice surface as a result of abiotic transport, not in situ growth.  I explored this pretty extensively back in the very early days of my PhD (see here and here).


Despite the fact that the sea ice surface doesn’t function as a microbial habitat, the concentration of biomass there might be relevant to our goal of finding extant life on another world.  The accumulation of bacteria at the ice surface is largely the result of bacteria being passively transported with salt.  Recall that saline ice is a porous matrix containing a liquid brine.  All of the bacteria in the ice are also contained in the liquid brine; as the ice cools the pore spaces get smaller, forcing some of the brine and bacteria to the ice surface.  Porous ice in say, the Europan ice shell would act similarly.  Salt is easier to identify than life (in fact we know that there are large salt deposits on the surface of Europa), so we can target salty areas for deeper search.  In astrobiology we like to “follow things” (because we get lost easy?); “follow the water” and “follow the energy” are often cited and somewhat useful axioms.  So here we have a “follow the salt” situation.

Act III: Can we explore polar microbial ecology in life detection?

Act II ended on a life detection note, so let’s follow this through and consider other details of polar microbial ecology that might give us clues of what to look for if we want to identify life.  Getting ecology to be part of this discussion (life detection) is non-trivial.  Despite the fact that the whole field of astrobiology is really an elaborate re-visioning of classic ecology (who lives where and why), there isn’t a whole lot of interest in studying how organisms interact with their environment.  This simply reflects the field’s disciplinary bias; the most active communities within astrobiology may well be astronomers and planetary geologists.  If the most active community was ecologists we’d probably be ignoring all kinds of important stuff about how planets are formed, etc.  To illustrate the utility of ecology for life detection here are two examples of ecological principles that might lead to life detection techniques.

Case 1 – Biological alteration of the microstructure of ice

From Krembs et al., 2002.

From Krembs et al., 2002.  This sea ice diatom has filled the pore space with EPS as a buffer against high salinity and as a cryoprotectant.

Earlier I put forward that decreasing temperatures can lead to heightened stress because life in ice is exposed to higher concentrations of damaging substances at lower temperatures.  The most obvious example is salt.  In very cold sea ice the pore space salinity approaches 200 ppt, that’s roughly 6 times the salinity of seawater!  Maintaining adequate internal water is a huge challenge for a cell under these conditions.  One of the mechanisms for dealing with this is to produce copious amounts of exopolysaccharides (EPS), a hydrated gel (essentially mucous) containing polysaccharides, proteins, and short peptides.  EPS buffers the environment around a cell, raising the activity of water and, in some cases, interacting with ice.  This produces a highly modified internal structure, as shown in the images below.  This alteration could be a useful way of identifying ice on an ocean world that has been modified by a biological community.

From Krembs et al., 2011.

From Krembs et al., 2011.

Case 2: Motility

Motility has been put forward before as an unambiguous signature of life, but the idea hasn’t really gained a lot of traction.  Clearly one needs to be cautious with this – Brownian motion can look a lot like motility – but I can’t think of anything else that life does that is as easily distinguishable from abiotic processes.  One additional challenge however, is the not all life is motile.  Plants aren’t motile, at least most of them over the timescales that we care to stare at them for.  Not all microbes are motile either, but I would argue that those that aren’t aren’t only because others are.

From Stocker, 2012.

From Stocker, 2012.

Consider the figure at right, which is a cartoon of two modes of bacterial life in the ocean.  One of those modes is motile, and can be seen using flagella to follow chemical gradients (we call this chemotaxis) and optimize their location with respect to phytoplankton, their source of carbon.  The second mode is much smaller and in the background; small-bodied non-motile cells that live on the diffuse carbon that they opportunistically encounter.  This works because that would be an inefficient niche for motile bacteria to exploit.  In the absence of motility however, chemical gradients constitute a very strong selective pressure to evolve motility.  We can see evidence of this in the convergent evolution of flagellar motility (and other forms of motility) in all three domains of life.  Although they may share a common chemotaxis sensory mechanism, the Bacteria, Archaea, and Eukarya all seem to have evolved flagellar motility independently.  This means that it’s probably a pretty good feature to have, and is likely to be shared by microbes on other ocean worlds.

That was quite a lot, so to summarize:

  • Microbial communities are oriented along gradients of energy and stress.  At some optimal point along that gradient a maximum amount of biomass can be supported by surplus energy that is not being used to deal with stress (How much energy do you spend on stress?  Think of how much more biomass you could have!).
  • The relationships between energy, stress, and temperature are complicated, but Earth life generally works at T > -12 °C.  This estimate is probably a little lower than the reality, because laboratory observations of growth at that temperature don’t accurately reflect environmental stress.
  • Life is strongly biased towards surfaces and interfaces, these may provide enhanced opportunities for life detection (follow the salt!).
  • The specific ecology of cold organisms can provide some further insights into life detection strategies.  For example, motility might be an under-appreciated signature of life.


Even for the Fast-Melting Arctic, 2016 Is 'Uncharted Territory' - Washington Post

Featured News - Mon, 05/16/2016 - 16:53
One of the best-established ideas about global warming is that it will hit the Arctic the hardest, creating a feedback loop as melting ice leaves more dark ocean to absorb more energy. It's part of a concept called “Arctic amplification." Already this year, the Arctic has exceeded 4 degrees Celsius above average. Chris Mooney discussed the changes with Lamont's Marco Tedesco.

Dumping Iron in the Pacific Ocean Won't Fix Our Climate - Gizmodo

Featured News - Mon, 05/16/2016 - 16:51
Scientists have discovered a major problem with one popular geoengineering scheme that entails dumping iron into the ocean to fuel algae that can soak up carbon dioxide: Basically, the plan is not supported by the geologic record, at least not in the equatorial Pacific. The study was led by Lamont's Gisela Winckler and Robert Anderson.

Phylogenetic placement re-re-visited

Chasing Microbes in Antarctica - Fri, 05/13/2016 - 15:48

Disclaimer: I banged this out fast from existing scripts to help some folks, but haven’t tested it yet.  Will do that shortly, in the meantime, be careful!


I use phylogenetic placement, namely the program pplacer, in a lot of my publications.  It is also a core part of of the paprica metabolic inference pipeline.  As a result I field a lot questions from people trying to integrate pplacer into their own workflows.  Although the Matsen group has done an excellent job with documentation for pplacer, guppy, and taxtastic, the three programs you need to work with to do phylogenetic placement from start to finish (see also EPA), there is still a steep learning curve for new users.  In the hope of bringing the angle of that curve down a notch or two, and updating my previous posts on the subject (here and here), here is a complete, start to finish example of phylogenetic placement, using 16S rRNA gene sequences corresponding to the new tree of life recently published by Hug et al.  To follow along with the tutorial start by downloading the sequences here.

You can use any number of alignment and tree building programs to create a reference tree for phylogenetic placement.  I strongly recommend using RAxML and Infernal.  After a lot of experimentation this combination seems to be produce the most correct topologies and best supported trees.  You should be aware that no 16S rRNA gene tree (or any other tree) is absolutely “correct” for domain-level let alone life-level analyses, but nothing in life is perfect.  While you’re installing software I also recommend the excellent utility Seqmagick.  Finally, you will need a covariance model of the 16S rRNA gene to feed into Infernal.  You can find that at the Rfam database here.

The workflow will follow these steps:

  1. Create an alignment of the reference sequences with Infernal
  2. Create a phylogenetic tree of the alignment
  3. Create a reference package from the alignment, tree, and stats file
  4. Proceed with the phylogenetic placement of your query reads

Create an alignment of the reference sequences

The very first thing that you need to do is clean your sequence names of any wonky punctuation.  This is something that trips up almost everyone.  You should really have no punctuation in the names beyond “_”, and no spaces!

tr " -" "_" < hug_tol.fasta | tr -d "%\,;():=.\\*[]\"\'" > hug_tol.clean.fasta

Next create an alignment from the cleaned file.  I always like to degap first, although it shouldn’t matter.

## Degap seqmagick mogrify --ungap hug_tol.clean.fasta ## Align cmalign --dna -o hug_tol.clean.align.sto --outformat Pfam hug_tol.clean.fasta ## Convert to fasta format seqmagick convert hug_tol.clean.align.sto hug_tol.clean.align.fasta

Build the reference tree

At this point you should have a nice clean DNA alignment in the fasta format.  Now feed it to RAxML to build a tree.  Depending on the size of the alignment this can take a little bit.  I know you’ve read the entire RAxML manual so of course you are already aware that adding additional cpus won’t help…

raxmlHPC-PTHREADS-AVX2 -T 8 -m GTRGAMMA -s hug_tol.clean.align.fasta -n ref.tre -f d -p 12345

I like having a sensibly rooted tree; it’s just more pleasing to look at.  You can do this manually, or you can have RAxML try to root the tree for you.

raxmlHPC-PTHREADS-AVX2 -T 2 -m GTRGAMMA -f I -t RAxML_bestTree.ref.tre -n root.ref.tre

Okay, now comes the tricky bit.  Clearly you’d like to have some support values on your reference tree, but the Taxtastic program that we will use to build the reference tree won’t be able to read the RAxML stats file if it includes confidence values.  The work around is to build a second tree with confidence values.  You will feed this tree to Taxtastic with the stats file from the tree we already generated.

## Generate confidence scores for tree raxmlHPC-PTHREADS-AVX2 -T 8 -m GTRGAMMA -f J -p 12345 -t RAxML_rootedTree.root.ref.tre -n conf.root.ref.tre -s hug_tol.clean.align.fasta

Now we can use the alignment, the rooted tree with confidence scores, and the stats file without confidence scores to create our reference package.

taxit create -l 16S_rRNA -P hug_tol.refpkg --aln-fasta hug_tol.clean.align.fasta --tree-stats RAxML_info.ref.tre --tree-file RAxML_fastTreeSH_Support.conf.root.ref.tre

Align the query reads

At this point you have the reference package and you can proceed with analyzing some query reads!  The first step is to align the query reads in exactly the same fashion as the reference sequences.  This is important as the alignments will be merged later.

## Clean the names tr " -" "_" < query.fasta | tr -d "%\,;():=.\\*[]\"\'" > query.clean.fasta ## Remove any gaps seqmagick mogrify --ungap ## Align cmalign --dna -o query.clean.align.sto --outformat Pfam query.clean.fasta

Now we use the esl-alimerge command, included with Infernal, to merge the query and reference alignments.

## Merge alignments esl-alimerge --outformat pfam --dna -o query.hug_tol.clean.align.sto query.clean.align.sto hug_tol.refpkg/hug_tol.clean.align.sto ## Convert to fasta seqmagick convert query.hug_tol.clean.align.sto

Phylogenetic placement

Now we’re on the home stretch, we can execute the phylogenetic placement itself!  The flags are important here, so it’s worth checking the pplacer documentation to insure that your goals are consistent with mine (get a job, publish some papers?).  You can probably accept most of the flags for the previous commands as is.

pplacer -o query.hug_tol.clean.align.jplace -p --keep-at-most 20 -c hug_tol.refpkg query.hug_tol.clean.align.fasta

At this point you have a file named query.hug_tol.clean.align.jplace.  You will need to use guppy to convert this json-format file to information that is readable by human.  The two most useful guppy commands (in my experience) for a basic look at your data are:

## Generate an easily parsed csv file of placements, with only a single placement reported for each ## query read. guppy to_csv --point-mass --pp -o query.hug_tol.clean.align.csv query.hug_tol.clean.align.jplace ## Generate a phyloxml tree with edges fattened according to the number of placements. guppy fat --node-numbers --point-mass --pp -o query.hug_tol.clean.align.phyloxml query.hug_tol.clean.align.jplace



Subscribe to Lamont-Doherty Earth Observatory aggregator