News aggregator

Links Between Climate Change & Extreme Weather Increasingly Clear & Present - Washington Post

Featured News - Fri, 03/11/2016 - 12:00
Our science has reached the point where we can look for the human influence on climate in single weather events, and sometimes find it, writes Lamont's Adam Sobel.

Building the paprica database

Chasing Microbes in Antarctica - Fri, 03/11/2016 - 11:04

This tutorial is both a work in progress and a living document.  If you see an error, or want something added, please let me know by leaving a comment.

Building the paprica database provides maximum flexibility with paprica but involves more moving parts and resources than conducting analysis with paprica against the provided database.  Basic instructions for using the script are provided in the manual; this tutorial is intended to provide a more detailed step-by-step guide.


While a laptop running Linux, VirtualBox, or OSX is perfectly adequate for analysis with paprica, you’ll need something a little beefier for building the database (unless you’re really patient).  A high performance cluster is overkill; I build the provided database on a basic 12 core Linux workstation with 32 GB RAM (< $5k).  Something in this ballpark should work fine.  Of course more cores will get the job done faster (but keep an eye on memory usage).

Once you’ve got the hardware requirements sorted out you need to download the dependencies.  I recommend first following all the instructions for the script, then installing RAxML and pathway-tools.  The rest of this tutorial assumes you’ve done just that, including running the test file.

Install remaining dependencies

In addition to the dependencies you have already installed, you need pathway-tools and RAxML.  These are very mainstream programs, but that doesn’t necessarily mean installation is easy.  In particular pathway-tools requires that you request a license (free for academic users).  This takes about 24 hours, after which you’ll receive a link to download the installer.  Regardless of whether you’re sitting at the workstation or accessing it via SSH, a GUI will pop up and guide you through the installation.  In general you can accept the defaults; however, the GUI will ask you where pathway-tools should create the directory ptools-local.  This is where the program will create the pathway-genome databases (PGDBs) that describe (among other things) the metabolic pathways in each genome.  By the time you are done creating the database this directory will be > 100 GB, so pick a location with plenty of space!  This may not be your home directory (the default location).  For example, on my system my home directory is housed on a small SSD.  To keep the home directory from becoming bloated I opted to locate ptools-local on a separate SATA drive.
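Before pointing the installer at a location, it can help to confirm that the drive has room.  The following is a minimal sketch using only the Python standard library; the path and the 100 GB threshold are placeholders chosen for illustration, not values required by pathway-tools.

```python
import shutil


def has_room(path, needed_gb):
    """Return True if the filesystem holding `path` has at least needed_gb free."""
    free_gb = shutil.disk_usage(path).free / 1e9
    return free_gb >= needed_gb


# Hypothetical candidate location; replace "." with the mount point of the
# drive where you intend to put ptools-local.
candidate = "."
print(candidate, "has room for ptools-local:", has_room(candidate, 100))
```

If this prints False for your home directory, that is a good hint to choose a different drive when the GUI asks.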

You will receive a number of download options from the pathway-tools development team.  I recommend that you conduct only the basic installation of pathway-tools, and do not download and install additional PGDBs.  There is nothing wrong with installing these additional, well-curated PGDBs, but they take extra space and time and become ponderous.  You can always add them later if you want to become a metabolic modeling rock star.

To be continued…

Melting of Greenland’s Ice Sheet Accelerating with Loss of Reflectivity - National Geographic

Featured News - Thu, 03/10/2016 - 12:00
A new study led by Lamont's Marco Tedesco finds that the reflectivity, or albedo, of Greenland’s ice sheet could decrease by as much as 10 percent by the end of the century, potentially leading to significant sea-level rise.

Winter Blooms May Be Disrupting the Marine Ecosystem - Science News

Featured News - Wed, 03/09/2016 - 12:00
The dinoflagellate Noctiluca scintillans is taking over in the Arabian Sea, posing a potential threat to its ecosystem. Science News talks with Lamont's Joaquim Goes.

New York's Big Green Clean - BBC

Featured News - Wed, 03/09/2016 - 12:00
The BBC talks with Lamont's Bob Newton about the Billion Oyster Project, an effort to bring oysters back to New York harbor.

Faster-Merging Snow Crystals Speed Greenland Ice Sheet Melting - Eos

Featured News - Wed, 03/09/2016 - 12:00
Satellite data and modeling reveal a trend toward coarser-grained, more-energy-absorbent snow on Greenland, as a new paper by Lamont's Marco Tedesco explains.

We’re Headed for Mozambique!

When Oceans Leak - Tue, 03/08/2016 - 00:15
Sedimentologists Andreas Koutsodendris of University of Heidelberg, Masako Yamane of Japan Agency for Marine-Earth Science and Technology, and Thibaut Caley of University of Bordeaux study freshly split cores aboard the JOIDES Resolution. Photo: Tim Fulton, IODP

Read Sidney Hemming’s first post to learn more about the goals of her two-month research cruise off southern Africa and its focus on the Agulhas Current and collecting climate records for the past 5 million years.

At the time of the previous entry, we were heading toward the waters off Mozambique while hoping government permission would be in hand in time for coring. It was a month after we had left port in Mauritius, and we had a couple of firm deadlines – well, actually one that we later revised due to the delay because of the helicopter evacuation. We decided that if we did not have approval from Mozambique’s Fisheries office by Wednesday, we would give up hope and head to our CAPE site, off the tip of South Africa, with the prospect of another site that was not part of our original plan as a consolation prize.

We did not hear back on Wednesday, so we stopped and brought extra pipe up for the potential extra site. Thursday morning, with no word from Mozambique, we began to head south. Approximately 24 hours later WE GOT PERMISSION! I cannot tell you what an emotional roller coaster this has been for the entire party. Some of us had already started warming up to the alternative site, but everybody is ecstatic that we finally have verbal permission for the Mozambique sites. We hope cores from the Zambezi and Limpopo sites, near major rivers that run through Mozambique, will give us a record of the terrestrial climate variability in southeastern Africa through the last 5 million years that can be compared with the Agulhas Current and other oceanographic factors.

Expedition 361’s coring sites.

We are approaching our northernmost site, a re-occupation of the old Deep Sea Drilling Project (DSDP) site 242 on the Davie Ridge in the northern part of the Mozambique Channel. The site, MZC, which will be IODP 1476, is in some ways more exploratory than the other five sites of our expedition, although there are hints that this will be a good spot for paleoceanography. The original drilling was done in 1972 during the 25th leg of the DSDP; that cruise also sailed from Mauritius, on the Glomar Challenger, and ended in Durban, South Africa.

As an aside, this reminds me how the DSDP and its descendants – the Ocean Drilling Program, Integrated Ocean Drilling Program, and the current International Ocean Discovery Program (IODP) –  have made an incredible legacy of understanding the evolution of the ocean basins and the evolution of the oceans and climate system through the Cenozoic. We would know far less without these extraordinary programs. The DSDP site 242 was drilled to understand the history of separation between Africa and Madagascar, and to establish a mid-latitude faunal succession (the evolutionary change of marine creatures) for the western Indian Ocean. The hole was drilled and cored intermittently to 676 meters, and the bottom sediment recovered was from the Eocene (~50 million years old). The sediment was nannofossil ooze throughout. Nannofossil ooze is sediment that is made up of mostly calcareous nannofossils, which are single-celled organisms that have a calcium carbonate structure. This is also the composition of the first two sites we cored and a very common composition for tropical and subtropical sites without much terrigenous (land-derived) dust and debris. This location is upstream of the Agulhas Current, and it appears to have an important influence on Natal Pulses (turbulent pulses that are triggered by eddies originating in the Mozambique Channel) that pass down the Natal Valley and lead to the Agulhas Leakage.


Barbecue on the JOIDES Resolution‘s “steel beach.” Photo: IODP

We should get to our northernmost site, MZC/1476, in the early morning on Tuesday March 8. By trimming our coring program to include only advanced piston coring, and by going no deeper than needed to capture the 5 million year interval, we think we can still get everything we need at all six sites. It is going to be a really busy final three weeks, but everybody is ready for the challenge.

Meanwhile, the reports are almost finished for site 1475, at the Agulhas Plateau. The correlators were able to put together a splice of cores that provides a continuous section, although there are intervals that will be further scrutinized back home. We had a barbecue on deck Saturday in the nice hot weather, and we are looking forward to the next site.

Sidney Hemming is a geochemist and professor of Earth and Environmental Sciences at Lamont-Doherty Earth Observatory. She uses the records in sediments and sedimentary rocks to document aspects of Earth’s history.

Mercury's Carbon-Rich Crust Is Surprisingly Ancient - Discovery News

Featured News - Mon, 03/07/2016 - 13:01
Before its planned crash into Mercury last year, NASA’s MESSENGER spacecraft gave scientists a parting gift: In its final orbits, MESSENGER confirmed that Mercury’s dark hue is due to carbon. Discovery talked with Lamont Director Sean Solomon, who led the MESSENGER mission.

Why Is Greenland's Ice Getting Darker? - Fox News

Featured News - Fri, 03/04/2016 - 12:00
Greenland can’t seem to catch a break. In a study led by Lamont's Marco Tedesco, researchers have found that the surface has gotten darker over the past two decades, meaning it’s absorbing more solar radiation, which is further increasing snow melt.

Research Is Art and Other Science Outreach - Don't Panic Geocast

Featured News - Fri, 03/04/2016 - 12:00
Lamont graduate student Hannah Rabinowitz talks in a podcast about Lamont's Research Is Art project, Girls' Science Day and other science outreach.

Correctly evaluating metabolic inference methods

Chasing Microbes in Antarctica - Fri, 03/04/2016 - 11:47

Last week I gave a talk at the biennial Ocean Sciences Meeting that included some results from analysis with paprica.  Since paprica is a relatively new method I showed the figure below, which is intended to validate the method.  The figure shows a strong correlation for four metagenomes between observed enzyme abundance and enzyme abundance predicted with paprica (from 16S rRNA gene reads extracted from the metagenome).  This is similar to the approach used to validate PICRUSt and Tax4Fun.

Spearman’s correlation between predicted and observed enzyme abundance in four marine metagenomes.

The correlation looks decent, right?  It’s not perfect, but most enzymes are being predicted at close to their observed abundance (excepting the green points where enzyme abundance is over-predicted because metagenome coverage is lower).

After the talk I was approached by a well known microbial ecologist who suggested that I compare these correlations to correlations with a random collection of enzymes.  His concern was that because many enzymes (or genes, or metabolic pathways) are widely shared across genomes any random collection of genomes looks sort of like a metagenome.  I gave this a shot and here are the results for one of the metagenomes used in the figure above.

Correlation between predicted and observed (red) and random and observed (black) enzyme abundances.

Uh oh.  The correlation is better for predicted than random enzyme abundance, but rho = 0.7 is a really good correlation for the random dataset!  If you think about it, however, this makes sense.  For this test I generated the random dataset by randomly selecting genomes from the paprica database until the total number of enzymes equaled the number predicted for the metagenome.  Because there are only 2,468 genomes in the current paprica database (fewer than the total number of completed genomes because only one genome is used for each unique 16S rRNA gene sequence) the database gets pretty well sampled during random selection.  As a result rare enzymes (which are also usually rare in the metagenome) are rare in the random sample, and common enzymes (also typically common in the metagenome) are common.  So random ends up looking a lot like observed.
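The random-versus-observed test can be sketched in a few lines of Python.  This is an illustration under stated assumptions, not the code used for the figures: the genome contents and abundances are toy values, genomes are drawn with replacement (the text above does not say which way it was done), and Spearman's rho is implemented by hand so the example needs only the standard library.

```python
import random
from collections import Counter


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    if sx == 0 or sy == 0:
        return float("nan")  # undefined when one vector is constant
    return cov / (sx * sy)


def random_enzyme_counts(genomes, target_total, rng):
    """Add randomly chosen genomes until the enzyme total reaches target_total."""
    names = sorted(genomes)
    counts = Counter()
    total = 0
    while total < target_total:
        picked = genomes[rng.choice(names)]
        counts.update(picked)  # adds the enzyme copies of the picked genome
        total += sum(picked.values())
    return counts


# Toy stand-ins (hypothetical): genome -> {EC number: copies}.
genomes = {
    "g1": {"1.1.1.1": 2, "2.7.7.7": 1},
    "g2": {"1.1.1.1": 1, "3.1.3.1": 1},
    "g3": {"1.1.1.1": 1, "2.7.7.7": 1, "4.2.1.1": 1},
}
observed = {"1.1.1.1": 10, "2.7.7.7": 5, "3.1.3.1": 2, "4.2.1.1": 1}
predicted = {"1.1.1.1": 9, "2.7.7.7": 6, "3.1.3.1": 2, "4.2.1.1": 1}

rng = random.Random(0)
random_counts = random_enzyme_counts(genomes, sum(predicted.values()), rng)

ecs = sorted(observed)
rho_predicted = spearman([observed[e] for e in ecs], [predicted[e] for e in ecs])
rho_random = spearman([observed[e] for e in ecs], [random_counts[e] for e in ecs])
print("predicted vs observed:", rho_predicted)
print("random vs observed:", rho_random)
```

Even in this toy case the enzyme shared by every genome dominates both the observed and the random counts, which is the effect described above.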

It was further suggested that I try and remove core enzymes for this kind of test.  Here are the results for different definitions of “core”, ranging from enzymes that appear in less than 100 % of genomes (i.e. all enzymes, since no EC numbers appeared in all genomes) to those that appear in less than 1 % of genomes.

The difference between the random and predicted correlations does change as the definition of the core group of enzymes changes.  Here’s the data aggregated for all four metagenomes in the form of a sad little Excel plot (error bars give standard deviation).

This suggests to me a couple of things.  First, although I was initially surprised at the high correlation between a random and observed set of enzymes, I’m heartened that paprica consistently does better.  There’s plenty of room for improvement (and each new build of the database does improve as additional genomes are completed – the last build added 78 new genomes, see the current development version) but the method does work.  Second, we obtain maximum “sensitivity”, defined as improvement over the random correlation, for enzymes that are present in fewer than 10 % of the genomes in the database.  Above that threshold the correlation is inflated (but not invalidated) by common enzymes; below it we start to lose predictive power.  This can be seen in the sharp drop in the difference between predicted and random rho (Δrho: is it bad form to mix Greek letters with the English version of same?) for enzymes present in less than 1 % of genomes.  Because lots of interesting enzymes are not very common this is where we have to focus our future efforts.  As I mentioned earlier some improvement in this area is automatic; each newly completed genome improves our resolution.
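The “core” filter behind these numbers amounts to computing, for each enzyme, the fraction of database genomes it appears in and keeping only enzymes below a threshold.  A minimal sketch, assuming the database is available as a mapping of genome name to enzyme counts (the toy genomes and the function names are hypothetical):

```python
def prevalence(genomes):
    """Fraction of genomes in which each EC number appears at least once."""
    n_genomes = len(genomes)
    seen = {}
    for enzymes in genomes.values():
        for ec in enzymes:
            seen[ec] = seen.get(ec, 0) + 1
    return {ec: k / n_genomes for ec, k in seen.items()}


def keep_below(genomes, max_fraction):
    """EC numbers present in fewer than max_fraction of the genomes."""
    return {ec for ec, f in prevalence(genomes).items() if f < max_fraction}


# Toy stand-in for the database: genome -> {EC number: copies}.
genomes = {
    "g1": {"1.1.1.1": 2, "2.7.7.7": 1},
    "g2": {"1.1.1.1": 1, "3.1.3.1": 1},
    "g3": {"1.1.1.1": 1, "2.7.7.7": 1},
    "g4": {"1.1.1.1": 1, "4.2.1.1": 1},
}

for threshold in (1.0, 0.5, 0.25):
    print(threshold, sorted(keep_below(genomes, threshold)))
```

Δrho for a given threshold is then the predicted-versus-observed rho minus the random-versus-observed rho, with both correlations restricted to the enzymes that survive the filter.  Unlike the real database, this toy set contains one enzyme found in every genome, so even the 100 % threshold removes something.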

Some additional thoughts on this.  There are parameters in paprica that might improve Δrho.  The contents of closest estimated genomes are determined by a cutoff value – the fraction of descendant genomes a pathway or enzyme appears in.  I redid the Δrho calculations for different cutoff values, ranging from 0.9 to 0.1.  Surprisingly this had only a minor impact on Δrho.  The reason for this is that most of the 16S reads extracted from the metagenomes were placed to closest completed genomes (for which cutoff is meaningless) rather than closest estimated genomes.  An additional consideration is that I did all of these calculations for enzyme predictions/observations instead of metabolic pathways.  The reason for this is that predicting metabolic pathways on metagenomes is rather complicated (but doable).  Pathways have the advantage of being more conserved than enzymes, however, so I expect to see an improved Δrho when I get around to redoing these calculations with pathways.
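To make the cutoff concrete, here is a small sketch of how the contents of an estimated genome could be derived from its descendants.  It follows the description above but is not paprica’s code; in particular, treating the cutoff as inclusive (>=) is an assumption, and the enzyme names are placeholders.

```python
def estimated_genome(descendants, cutoff):
    """Enzymes present in at least `cutoff` (a fraction) of descendant genomes."""
    n = len(descendants)
    counts = {}
    for enzymes in descendants:
        for ec in set(enzymes):
            counts[ec] = counts.get(ec, 0) + 1
    return {ec for ec, k in counts.items() if k / n >= cutoff}


# Three hypothetical descendant genomes of one internal node.
descendants = [{"a", "b"}, {"a", "c"}, {"a", "b"}]

for cutoff in (0.9, 0.5, 0.1):
    print(cutoff, sorted(estimated_genome(descendants, cutoff)))
```

A high cutoff keeps only what nearly all descendants share; a low cutoff keeps almost everything any descendant has.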

Something else that’s bugging me a bit… metagenomes aren’t sets of randomly distributed genomes.  Bacterial community structure is usually logarithmic, with a few dominant taxa and a long tail of rare taxa.  The metabolic inference methods by their nature capture this distribution.  A more interesting test might be to create a logarithmically distributed random population of genomes, but this adds all kinds of additional complexities.  Chief among them is the need to create many random datasets with different (randomly selected) dominant taxa.  That seems entirely too cumbersome for this purpose…
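For illustration only, one way such a skewed random community could be generated is sketched below.  A geometric rank-abundance series stands in for the “logarithmic” structure, which is an assumption; the ratio, the number of cells and the toy genomes are all invented.

```python
import random
from collections import Counter


def skewed_community(genomes, n_taxa, ratio, rng):
    """Pick n_taxa genomes at random and give them geometrically decaying abundances."""
    taxa = rng.sample(sorted(genomes), n_taxa)  # random order = random dominant taxa
    weights = [ratio ** i for i in range(n_taxa)]
    total = sum(weights)
    return {taxon: w / total for taxon, w in zip(taxa, weights)}


def community_enzymes(genomes, abundances, n_cells):
    """Enzyme counts for a community of n_cells with the given relative abundances."""
    counts = Counter()
    for taxon, abundance in abundances.items():
        cells = round(abundance * n_cells)
        for ec, copies in genomes[taxon].items():
            counts[ec] += cells * copies
    return counts


# Toy stand-in for the database: genome -> {EC number: copies}.
genomes = {
    "g1": {"a": 2},
    "g2": {"a": 1, "b": 1},
    "g3": {"b": 1, "c": 1},
}

rng = random.Random(1)
abundances = skewed_community(genomes, 3, 0.5, rng)
print(abundances)
print(community_enzymes(genomes, abundances, 1000))
```

Repeating this with many seeds gives the many random datasets with different dominant taxa mentioned above, which is exactly where the approach becomes cumbersome.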

So to summarize…

  1. Metabolic inference definitively outperforms random selection.  This is good, but I’d like the difference (Δrho) to be larger than it is.
  2. It is not adequate to validate a metabolic inference technique using correlation with a metagenome alone.  The improvement over a randomly generated dataset should be used instead.
  3. paprica, and probably other metabolic inference techniques, have poor predictive power for rare (i.e. very taxonomically constrained) enzymes/pathways.  This shouldn’t surprise anyone.
  4. Alternate validation techniques might be more useful than correlating with the abundance of enzymes/pathways in metagenomes.  Alternatives include correlating the distance in metabolic structure between samples with distance in community structure, as we did in this paper, or correlating predictions for draft genomes.  In that case it would be necessary to generate a distribution of correlation values for the draft genome against the paprica (or other method’s) database, and see where the correlation for the inferred metabolism falls in that distribution.  Because the contents of a draft genome are a little more constrained than the contents of a metagenome I think I’m going to spend some time working on this approach…
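The last step of the draft-genome idea in point 4, locating the inferred-metabolism correlation within the distribution of correlations against the database, could look something like the sketch below.  The correlation values are invented placeholders and the function name is hypothetical.

```python
def fraction_below(null_values, observed):
    """Fraction of the null distribution that falls below the observed value."""
    return sum(v < observed for v in null_values) / len(null_values)


# Invented correlations between one draft genome and each database genome.
null_rhos = [0.41, 0.55, 0.48, 0.62, 0.37, 0.58, 0.44, 0.51]
# Invented correlation between the draft genome and its inferred metabolism.
inferred_rho = 0.60

print(fraction_below(null_rhos, inferred_rho))
```

A value near 1 would mean the inferred metabolism matches the draft genome better than almost any single genome in the database does.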

Scientists Just Found a Surprising Factor Speeding Greenland's Melting - Washington Post

Featured News - Thu, 03/03/2016 - 15:49
A new study from Lamont's Marco Tedesco finds that Greenland's ice sheet is “darkening,” or losing its ability to reflect both visible and invisible radiation, as it melts more and more. That means it’s absorbing more of the sun’s energy, which then drives further melting.

Mideast Drought Worst in 900 Years - CNN

Featured News - Thu, 03/03/2016 - 12:00
A new study led by Lamont's Ben Cook finds that the drought that began in 1998 in the Levant is probably the region's worst in 900 years.

Greenland's Ice Melt Accelerating as Surface Darkens - The Guardian

Featured News - Thu, 03/03/2016 - 12:00
Greenland’s vast ice sheet is in the grip of a dramatic “feedback loop” where the surface has been getting darker and less reflective of the sun, helping accelerate the melting of ice and fuelling sea level rises, new research led by Lamont's Marco Tedesco has found.

The Worst Drought in 900 Years Helped Spark Syria's Civil War - Mashable

Featured News - Wed, 03/02/2016 - 13:28
The drought that played a role in triggering the catastrophic Syrian Civil War was the worst such climate event in at least the past 900 years, according to a new study published this week and led by Lamont's Ben Cook. Mashable also talks with Richard Seager.

Uptick in Small Earthquakes Raises Questions in New York Area - Wall Street Journal

Featured News - Wed, 03/02/2016 - 12:00
A cluster of low-magnitude earthquakes in the New York region has piqued the interest of residents, while some geologists predict the increase in temblors will continue and a large-scale one could be coming. Lamont's Won-Young Kim discusses the science.

Global Warming in New York - Le Figaro

Featured News - Tue, 03/01/2016 - 12:00
Since the ravages of Hurricane Sandy in 2012 and the massive floods on the U.S. East Coast, New York has focused on creating a new ecosystem to contain the risks of sea level rise. Le Figaro talks with Lamont's Klaus Jacob and Adam Sobel. (In French)

Arctic Sea Ice Growth Could Be Lowest on Record Again - ThinkProgress

Featured News - Tue, 03/01/2016 - 09:29
Arctic sea ice growth has been sluggish this winter. And that's a huge problem for the animals and communities that depend on it, says Lamont's Ray Sambrotto.

Preparing for the Inevitable Sea-Level Rise Caused by Climate Change - The Atlantic

Featured News - Mon, 02/29/2016 - 17:47
Scientists are struggling to figure out the timeline for how climate change will affect vulnerable waterfront communities. The Atlantic talks with Lamont's Maureen Raymo about the challenges.

Trials & Tribulations of Coring the Agulhas Plateau

When Oceans Leak - Sun, 02/28/2016 - 14:03
Sedimentologists Thibaut Caley of the University of Bordeaux and Andreas Koutsodendris of the University of Heidelberg and Deborah Tangunan, a paleontologist from the University of Bremen, work in the core lab aboard the JOIDES Resolution. Photo: Tim Fulton/IODP

A lot has happened since my last post. As we were heading south to the Agulhas Plateau, one of the scientists had to be evacuated by helicopter for medical treatment. We were within a day of the Agulhas Plateau site and had to go back to near Port Elizabeth for the handoff and then return to drill the plateau. The weather at the plateau was bad enough that we were probably going to have a delay anyway, so we didn’t lose too much time. Our colleague is fine now, and our drilling on the Agulhas Plateau has been a success.

We have had some trials and tribulations because of the large ocean swells and because the sediments do not have as strong a physical property signal as at the previous site. Both of these factors increased the challenge for the stratigraphic correlators, so it has been a real cliff-hanger to find out if we can splice together a continuous section. Because of the low signal-to-noise ratio of the physical properties, the scanning took longer and the records for correlating are not quite as clear. This has created a backup in the work flow, and it means the descriptions and scanning (and some sampling) of the split cores will be continuing as we begin our transit. It also means that until all this is completed we will not know for sure how continuous a record we have. We are reasonably sure we will have few or no gaps in the splice, but it will be nice to see it all completed.

The end of a fresh core, just brought aboard the JOIDES Resolution. Photo: Tim Fulton/IODP

Meanwhile, we came here thinking that we would get a high accumulation rate record for the last million years, but the accumulation rates are modest between the surface and about 100 meters – approximately 2 cm per thousand years. Below that, they turned out to be really quite nice, approaching 7 cm per thousand years through much of the Pliocene. The low accumulation in the Pleistocene is a disappointment as there is a great interest in the mid-Pleistocene climate transition, but it does look like it is a continuous record. The higher accumulation in the older sediment is exciting because the early Pliocene is a warm time in Earth’s history and the most recent with global temperatures as warm as modern times. So we Earth scientists are quite eager to understand everything we can about this interval. The Agulhas Plateau site, near where the Agulhas Current swings back toward the east, is well situated to provide some important information about linkages of different factors in the climate system.

Again at this site, as with the previous site, the development of the time scale has been fun and exciting to watch. We have four groups of organisms that are aiding in our time scale – in addition to foraminifera and nannofossils, there are abundant diatoms and dinoflagellates here. This is great for the biostratigraphy and also great for our participants whose post cruise research will use diatoms for documenting paleo-environmental changes. The magnetic stratigraphy started out looking bleak because the weak signal was messed up by the coring process in the first hole, due to the ship’s heave in the waves.  They almost gave up, but the second core preserved a great record. So we are going to have an excellent time scale for this site as well.

Expedition 361’s coring sites. APT is the Agulhas Plateau. NV is the Natal Valley. Credit: IODP

Meanwhile the saga continues in our quest to get permission from Mozambique to drill in their waters. We have word from our contact in the American embassy that the form has been signed by the Foreign Ministry and is now with the ministry that deals with fisheries. While that process continues, we have to start toward our next site. Our decision is to head toward the Zambezi site, as it is going to take us six days to get there anyway. If we don’t get permission before we arrive, we’ll have to turn around and head for the Cape site.

The Zambezi and Limpopo sites are near major rivers. We hope they will give us a record of the terrestrial climate variability in southeastern Africa through the last 5 million years that can be compared with the Agulhas Current and other oceanographic factors. The hope is that we will get a continuous record with a variety of proxy data for factors such as precipitation, runoff, distribution of vegetation on the landscape, and surface ocean temperatures. The coring is going to be fast at these sites because they are much shallower. In the happy case that we get to drill there, we will then have another long transit to finish off the analyses.



