Wildlife of Our Homes: Q & A with Rob Dunn

November 27, 2012

The Wildlife of Your Home.  Copyright: Your Wildlife, with permission.

Rob Dunn is a biologist and writer in the Department of Biology at North Carolina State University. His lab group, in collaboration with Noah Fierer’s microbial ecology group at the University of Colorado–Boulder, has launched the Wild Life of Our Homes project, a continental-scale citizen science project that aims to build an atlas of house-associated microbial diversity. Dunn and his team see the homes of North America as the next ecological frontier. They aspire to understand how the physical characteristics of a home, its inhabitants, and the landscape in which it is situated influence the microbial communities that live there. Moreover, they will investigate the reverse: how the presence or absence of home microbes may influence our own health and well-being.

PersonalGenomes.org has partnered with Dunn and Wild Life of Our Homes to create a third-party research opportunity for PGP volunteers. We recently sat down with him for a brief Q & A:

How did you get interested in the microbes of homes?

I guess this story has multiple answers. My first interest was in the context of writing. I was writing The Wild Life of Our Bodies and became really interested in what we do and don’t know about the species we interact with every day. I was struck by how little was known about the species in our homes. No one has ever exhaustively surveyed the species of homes. No one even has a good list of the animals present, much less the smaller beasts. This fascinated me. What fascinated me even more than writing about this problem was that I could do something about it and so we began the Wild Life of Our Homes and other related projects in which we work with the public to study one of the least known but most important habitats on Earth, your home.

What is your big vision for Wild Life of Our Homes?

It depends on the day, I guess, but when I am feeling ambitious I think we might be able to pull off the most complete survey of the insides of our homes ever achieved, and do it in such a way as to understand what determines the species that live with us. The big, big vision is then to move from understanding who is present and why to being able to garden species that benefit us, whether they are animals, plants, fungi or bacteria. We are good at killing species we think are bad, far less effective at gardening species that benefit us (with the exception of our foods).

What special contribution do you think PGP volunteers could make to the Wild Life of Our Homes project?

Oh, well this is exciting. One of the really interesting things to think about in the context of microbes in homes is the interaction between our bodies and their cells. We know that many (but not the majority) of the species living in our homes depend on our bodies and their skin and other bits for food. And so presumably the species that live on you are influencing the species we find in homes. But the really interesting question is whether your genes are actually influencing the species on you and the species floating around you: on your pillow, on your cutting board, anywhere else. Is the composition of your house influenced by the extended influence of your genes? That seems very conceivable but is hard to test. With PGP data we will be able to perform the test. We can also consider the reverse. To what extent do the microbes in your house influence your health and well-being, and how much is that effect contingent on your genes? The rich data provided by PGP volunteers will really be wonderful in the ways that it will allow us to think of human and home as part of a continuous ecosystem.

Why recruit citizen volunteers to accomplish your research goals?

It is literally the only way we can see what is going on. I used to study rain forests. Sometimes it was hard to get to a field site. There were narrow trails, dangerous snakes, malaria parasites, and sketchy buses, but you could get there. When it comes to studying bedrooms on the other hand, access is more difficult. More to the point, even if we could go into 1000 bedrooms, we can go once. The folks we work with can study their homes every day. That, to us, is the great thing, to be able to form a network of public scientists each of whose houses becomes a kind of long-term ecological research site.

I understand you’re in the process of analyzing data from a pilot study of the microbes living on surfaces in 40 homes in North Carolina. What have you learned so far? Have there been any surprises?

Microbiologically, we can’t tell toilet seats and pillowcases apart. I don’t know how that changes my life, but it is true. Also, there are strong and discrete habitats within the home microbiologically speaking, but the big surprise is that there are big differences among houses (just as we have seen in another study, among belly buttons). The fun question, the one we are enlisting PGP participants to help us with, is explaining what accounts for those differences. Outdoor climate? Backyard biodiversity? Your genes? Your dog? Your carpet? The type of house you live in? Any and all of these things might matter, but our data so far suggest that most of them don’t. Our preliminary data do suggest that some aspects of the ecology of our homes may be simpler than we anticipated, but we need to see more houses. We need to test our anecdotes against what we see across North America from sea to shining sea, or should that be from toilet seat to shining toilet seat?

Wild Life of Our Homes is just one of many public science projects spearheaded by Dunn and his team. You can learn more about their work at http://www.yourwildlife.org/

Important Note for PGP participants
PGP volunteers interested in participating in the Wild Life of Our Homes project should log into their PGP account and visit the third-party page to register. We strongly encourage PGP volunteers to sign up for the project in this way so that you may easily link your home microbiome data to your public PGP profile once the data is available.

2012 Trait Surveys

November 13, 2012

Part of what makes Personal Genome Project participant data uniquely valuable is our publicly shared trait data connected to public genetic data. A year ago our project was frustrated when our best resource for importing health data, Google Health, was discontinued. Ward soon got an interface running to import data from Microsoft HealthVault; the CCR-format data it produces is very similar, but isn’t trivially combined with our Google Health records (1). We wanted to improve the quality of our trait data and provide another option for adding traits to public profiles.

And so we created a set of twelve trait surveys (the links below will only work for PGP participants) covering 239 traits and diseases:

  • Cancer
  • Endocrine, Metabolic, Nutritional, and Immunity
  • Blood
  • Nervous System
  • Vision and Hearing
  • Circulatory System
  • Respiratory System
  • Digestive System
  • Genitourinary Systems
  • Skin and Subcutaneous Tissue
  • Musculoskeletal System and Connective Tissue
  • Congenital Traits and Anomalies

The Google Health data was an invaluable resource for selecting which traits and conditions to include. I was able to combine conditions using their ICD-9 codes (or, if unavailable, by internal Google codes). Here are the five most commonly reported traits:

Top five conditions reported on Google Health records contributed by PGP participants.
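
The grouping step described above can be sketched in a few lines of Python. This is only an illustration, not the script actually used: the record layout, the participant IDs, the `g-…` internal codes and the function names are all hypothetical, and the sample rows are invented rather than taken from PGP data. It keys each condition on its ICD-9 code when one is present, falls back to the internal code otherwise, and counts each condition once per participant.

```python
from collections import Counter

# Hypothetical records: (participant_id, icd9_code or None, internal_code, label).
# None of these values come from the actual PGP dataset.
RECORDS = [
    ("p1", "477.9", "g-101", "Allergic rhinitis"),
    ("p2", "477.9", "g-101", "Hay fever"),
    ("p2", "401.9", "g-202", "Hypertension"),
    ("p3", None, "g-303", "Seasonal allergies"),
    ("p3", "401.9", "g-202", "High blood pressure"),
    ("p4", "477.9", "g-101", "Allergic rhinitis"),
    ("p4", None, "g-303", "Seasonal allergies"),
]


def condition_key(icd9_code, internal_code):
    """Prefer the ICD-9 code; fall back to the internal code if it is missing."""
    if icd9_code:
        return ("icd9", icd9_code)
    return ("internal", internal_code)


def most_common_conditions(records, n=5):
    """Count each condition once per participant and return the n most common."""
    seen = set()
    for participant, icd9_code, internal_code, _label in records:
        seen.add((participant, condition_key(icd9_code, internal_code)))
    counts = Counter(key for _participant, key in seen)
    return counts.most_common(n)


if __name__ == "__main__":
    for key, count in most_common_conditions(RECORDS):
        print(key, count)
```

Keying on the code rather than the free-text label is what lets differently worded entries ("Hay fever", "Allergic rhinitis") land in the same bucket.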

We tried to settle on four encodings corresponding to each trait: ICD-9, ICD-10, SNOMED CT, and NCI Metathesaurus CUI (3). I’ve shared our list of traits surveyed, along with the encodings we consider them associated with, as a Google spreadsheet.

A useful aspect of the ICD encodings is their organization by topic, and so our traits were split into twelve survey topics by ICD-9 encoding. It’s impossible to be perfect in a first pass, but we tried to include anything that was fairly well-defined, not too rare (a prevalence of at least 1 in 10,000), and within the twelve ICD-9 ranges selected for the surveys. You might notice that some ICD-9 ranges were not used — most notably, the category of mental traits and disorders. We do hope to survey these as well, but I want to be sure that participants are able to easily manipulate data on their public profiles before adding such a potentially sensitive category.
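
As a rough sketch of how these inclusion criteria might be applied, the Python below checks a trait against a prevalence floor of 1 in 10,000 and a table of ICD-9 ranges. The ranges are the standard ICD-9 chapter boundaries matched to the survey names by assumption; the post does not list the exact ranges used, and the function names are hypothetical. The "fairly well-defined" criterion is a judgment call and is not modeled.

```python
# Standard ICD-9 chapter ranges (three-digit categories). Mapping them to the
# twelve surveys is an assumption; the post only says twelve ranges were
# selected and that mental disorders (290-319) were left out.
SURVEY_RANGES = {
    "Cancer": (140, 239),
    "Endocrine, Metabolic, Nutritional, and Immunity": (240, 279),
    "Blood": (280, 289),
    "Nervous System": (320, 359),
    "Vision and Hearing": (360, 389),
    "Circulatory System": (390, 459),
    "Respiratory System": (460, 519),
    "Digestive System": (520, 579),
    "Genitourinary Systems": (580, 629),
    "Skin and Subcutaneous Tissue": (680, 709),
    "Musculoskeletal System and Connective Tissue": (710, 739),
    "Congenital Traits and Anomalies": (740, 759),
}

MIN_PREVALENCE = 1 / 10_000  # "at least 1 in 10,000"


def survey_for(icd9_code):
    """Return the survey whose range contains the code, or None."""
    try:
        category = int(icd9_code.split(".")[0])
    except ValueError:
        return None  # V and E codes are not numeric categories
    for survey, (low, high) in SURVEY_RANGES.items():
        if low <= category <= high:
            return survey
    return None


def include_trait(icd9_code, prevalence):
    """Apply the prevalence floor and the ICD-9 range check."""
    return prevalence >= MIN_PREVALENCE and survey_for(icd9_code) is not None
```

For example, `survey_for("296.2")` returns `None` because the mental disorders range is deliberately absent from the table.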

All PGP participants are invited to enter public trait data using these surveys — although contributing such information is optional, and not required for participation. Even if you don’t see a condition listed in the survey that you want to add, submitting an empty survey is useful information. I hope to follow up soon with a blog post analyzing some of the resulting data.


1) On top of that, these records contain identifying data (like names and email addresses) that our participants weren’t intending to make public. This meant we couldn’t share the raw data; anything we shared was limited by our private CCR data interpretation process. Ideally we wouldn’t be in this position: sharing raw data and allowing others to interpret it would be better scientifically.

2) If you’re interested, this data was made available as “Dataset S1” in our recent open-access PNAS publication.

3) Why four different coding systems? Several reasons: for redundancy, to facilitate using our data in other systems, to provide a starting point for harmonizing data from imported health records, and because we weren’t (and still aren’t) sure whether or how we’ll be able to work with the licensing issues associated with some of these popular encoding systems.

Video of Open Science Summit 2012 PGP Talk

November 11, 2012

I’ve uploaded a video of the talk I gave at the 2012 Open Science Summit. The talk itself is only twelve minutes long; it’s a fairly fast-paced overview of the Personal Genome Project’s motivations and goals, with updates on recent progress.

PGP and the Opportunity to Contribute to Human Genome Standards via NIST

November 5, 2012

DNA donated by Personal Genome Project participants may be chosen by the National Institute of Standards and Technology (NIST) to become reference materials for new human genome sequencing standards!

NIST’s “Genome in a Bottle” consortium convened in August to initiate the establishment of a human genome standard. This “meter stick of the genome” will serve as an international reference for identifying variation across individual genomes, and be used to establish professional standards for clinical human genome sequencing.  Specimens donated by PGP volunteers are viewed as ideal candidates to serve as these new reference standards due to the depth and availability of public PGP datasets as well as the strength of the consent process used in the Harvard PGP study.

Nothing has been finalized yet, but this may become an exciting opportunity for our participants to contribute to an effort to standardize a new and rapidly evolving field of genomics and personalized medicine. The program is specifically interested in the participation of parent-child trios (including both parents and one or more offspring).

Saturday: PGP Talk at Open Science Summit, Live Streaming

October 18, 2012

I’m going to be flying to Mountain View this evening to attend the Open Science Summit. Various PGP members have attended and spoken in previous years; this year I’m up and scheduled to give a 15-20 minute talk Saturday morning (in the 10:45am-12:00pm PST grouping).

Open science is of course one of our core motivations as a project, and I look forward to meeting many like-minded folks there. Live streaming of the summit should be available at this site: http://fora.tv/conference/open_science_summit_2012/livestream

For Twitter users, the hashtag du jour will be “#OSS12”. Last-minute tickets, if you want to attend in person, are here: http://opensciencesummit2012.eventbrite.com/ (looks like $300 at this point).

PGP Genome Assessment Challenge

October 11, 2012

The Personal Genome Project is working with the Critical Assessment of Genome Interpretation (CAGI) this year to provide a genomic interpretation challenge using PGP data! CAGI’s use of PGP data is a demonstration of how publicly sharing genome & trait data is invaluable to science: because the data is public, the challenge is open to everyone. No restrictions or requirements need to be met to access the data.

How will the challenge work? In the upcoming week or two we will be returning genomes to some participants. Currently genomes automatically become public after 30 days of private access, but participants have the ability to publish a genome immediately should they choose to do so. We’ve added an additional option for CAGI: participants can release the genome to be used as a CAGI genome. When they do this, the genome becomes public — but which participant account the genome belongs to is kept secret. At the end of the challenge genome data will be linked to the specific participant account.
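
The release rules in the paragraph above can be summarized as a small decision function. This is a sketch of the logic as described, not the PGP site's actual code: the function and parameter names are hypothetical, and it assumes a CAGI release takes precedence over the 30-day rule until the challenge ends, an interaction the post does not spell out.

```python
from datetime import date, timedelta

PRIVATE_ACCESS_DAYS = 30  # from the post: genomes go public after 30 days


def genome_status(returned_on, today, published_early=False,
                  released_to_cagi=False, challenge_over=False):
    """Return (genome_is_public, account_is_linked) for one genome.

    Assumption: a CAGI release takes precedence over the 30-day rule until
    the challenge ends; the post does not describe that interaction.
    """
    if released_to_cagi and not challenge_over:
        return True, False  # data is public, owner stays hidden
    private_period_over = today >= returned_on + timedelta(days=PRIVATE_ACCESS_DAYS)
    is_public = published_early or released_to_cagi or private_period_over
    # Outside the CAGI window, a public genome is always shown with its account.
    return is_public, is_public


if __name__ == "__main__":
    returned = date(2012, 10, 15)
    print(genome_status(returned, date(2012, 10, 20)))
    print(genome_status(returned, date(2012, 10, 20), released_to_cagi=True))
```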

Of course, the other half of CAGI is the predictions: we need trait data from participants for the CAGI researchers to try to predict! I’ve been working hard the last couple of weeks to make a set of trait surveys for PGP participants. These surveys aren’t just for CAGI; they’re for all participants, and they’ll remain open after CAGI has ended. For a genome to be used by CAGI, a participant will need to complete all the surveys, but all participants are encouraged to fill them out.

The trait surveys are publicly shared and entirely optional — only choose the items in the survey that you want added to your public profile. You can find the surveys (“PGP Trait & Disease Survey 2012”) at the bottom-center of the screen when you log in to your PGP account.

Vistas and hazards of the foggy Omic Road

October 1, 2012

I am a PGP director and a participant, and in the latter role I received a recent email that urged me to “READ THIS!” It came from a family member and was triggered by the second segment in a recent series on personal genomics featured on National Public Radio’s Morning Edition (in a previous post Madeleine Ball highlighted the first segment featuring the PGP and George Church). The email-triggering segment featured two scientists whose genomes have been sequenced: James Watson, co-discoverer of the structure of DNA, and Mike Snyder, Director of the Center of Genomics and Personalized Medicine at Stanford University. Both Watson and Snyder presented generally supportive views of the process, but Snyder’s story provides more important lessons about the ups and downs of biomedical self discovery.

Beginning with the upside of Professor Snyder’s story, the publication featuring his genome sequence and other “omic” data provides strong evidence that at least some genomic predictions are both accurate and actionable (1). He learned he has an elevated risk of basal cell carcinoma, hypertriglyceridemia, and Type 2 diabetes (T2D). Upon learning of these risks he began to be monitored for these conditions. Consistent with the genomic prediction he did have high triglycerides, and the problem was successfully medicated.

The most interesting biomedical subtext of Snyder’s story began about a year after the prediction of elevated risk of T2D (due to variants in 3 genes). During the first year his blood glucose and proportion of glycated hemoglobin (HbA1c) were normal (HbA1c is a glycated form of hemoglobin, the protein in red blood cells that carries oxygen; glycation is the non-enzymatic attachment of glucose to the protein, and the glycated fraction is a measure of persistently high blood glucose). Immediately following a viral infection, his blood glucose rose rapidly, and in less than a month he had full-blown T2D as measured by high blood glucose (above 126 mg/dL), and about a month later as measured by HbA1c. His levels remained high for about two months, after which he changed his diet and began to exercise more; after six months his glucose and HbA1c levels returned to the normal range. Overall, his fasting glucose level exceeded the threshold value for a clinical diabetes diagnosis for about four months.

But the story doesn’t end there, and this is why I received the exclamatory email. Even though Professor Snyder’s triglycerides and glucose are under control, insurance complications arose from the initial T2D diagnosis. According to NPR:

“After sequencing revealed his high risk for diabetes, his wife tried to increase his life insurance. But because of that high risk, the price shot through the roof. ‘So the bottom line is my life insurance … essentially became prohibitively expensive,’ Snyder says. Federal law bans health insurance companies and employers from penalizing people based on genetic information, but the law doesn’t apply to life insurance or long-term care insurance—leaving people like Snyder vulnerable to discrimination.”

Since there was some uncertainty about this NPR report, I asked Professor Snyder for his input and he emailed me the following clarifications:

  • The genomic prediction of T2D triggered frequent glucose tests, which were ordered by his physician, so the results became part of his medical record;
  • additional life insurance was sought by his wife after the tests showed high glucose and HbA1c levels;
  • the rate of his existing group life-insurance policy did not increase, only the rate for additional insurance.

The NPR story is correct that GINA (the Genetic Information Nondiscrimination Act of 2008) does not protect consumers against genetic discrimination when they purchase life or long-term care insurance. However, the story appears to mischaracterize the causal chain of events in this case: Snyder’s prohibitively high life insurance rate did not result from the discovery of risk alleles in his genome; it resulted from an old-fashioned medical diagnosis of diabetes by his physician based on clinical tests that showed high levels of diabetes-specific biomarkers (2). And despite his insurance problem, Professor Snyder believes genomic self discovery is more positive than negative and that his genome sequence helped him deal with his diabetes in a timely fashion.

It is fairly obvious why Snyder and his family would be happy with the present biomedical outcome, but here is something that isn’t quite as obvious: his frequent testing helped him to chart a data-guided course to recovery, but it also made detection of his transiently elevated glucose much more likely than, say, annual testing would have. A reasonable estimate is that it made detection about three times as likely: he had elevated glucose for about four months and sub-threshold levels for the remaining eight months of the year, so a single annual test would have had roughly a one-in-three chance of catching it. So, in addition to potential benefits, there might be a downside to extremely frequent, elective testing.
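
The comparison between frequent and annual testing can be checked with a back-of-envelope model. The sketch below assumes evenly spaced tests and an elevated-glucose window whose start is uniformly random within the year; both are simplifications, and the function name is hypothetical. Under those assumptions a single annual test catches a four-month episode with probability 1/3, while monthly or more frequent testing catches it with near certainty, a ratio of about three under this simple model.

```python
from fractions import Fraction


def detection_probability(elevated_months, tests_per_year):
    """Chance that at least one of several evenly spaced tests lands in the
    elevated window, when the window's start is uniformly random in the year.
    """
    window = Fraction(elevated_months, 12)   # window length, in years
    spacing = Fraction(1, tests_per_year)    # gap between tests, in years
    # A window longer than the gap between tests is always caught.
    return min(Fraction(1), window / spacing)


if __name__ == "__main__":
    annual = detection_probability(4, 1)
    monthly = detection_probability(4, 12)
    print("annual test:", annual)
    print("monthly tests:", monthly)
    print("ratio:", monthly / annual)
```

The exact figure depends heavily on how often and when the tests are taken, so this is a plausibility check rather than a measurement.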

Professor Snyder’s story reminds us that the builders and first travelers on the road to data-driven healthcare occasionally experience remarkable and previously unknown vistas of self determination, but they also face uncertainty and risk. Many risks can be avoided or managed, but science is often a murky business and news reports on these difficult topics can be misleading, further obscuring the way forward. Our loved ones’ anxieties are heightened by the occasional fog of confusion—triggering emails or discussions of concern, or even alarm. But now that we see the relevant facts in this case—and the red flag posted clearly over one hazard in the road—we also can see ahead more clearly as we push on toward the greater goals of this unprecedented collective experiment.

References and Footnotes
1) Personal omics profiling reveals dynamic molecular and medical phenotypes.  Chen R, Mias GI, Li-Pook-Than J, Jiang L, Lam HY, Chen R, Miriami E, Karczewski KJ, Hariharan M, Dewey FE, Cheng Y, Clark MJ, Im H, Habegger L, Balasubramanian S, O’Huallachain M, Dudley JT, Hillenmeyer S, Haraksingh R, Sharon D, Euskirchen G, Lacroute P, Bettinger K, Boyle AP, Kasowski M, Grubert F, Seki S, Garcia M, Whirl-Carrillo M, Gallardo M, Blasco MA, Greenberg PL, Snyder P, Klein TE, Altman RB, Butte AJ, Ashley EA, Gerstein M, Nadeau KC, Tang H, Snyder M.  Cell. 2012 Mar 16;148(6):1293-307. PMID: 22424236

2) It is important for PGP participants to understand that PGP-generated data and reports are not equivalent to a clinical test and, according to the PGP consent form “are never intended to substitute in any way for professional medical advice, diagnosis or treatment. You may not use any PGP-generated report or any other PGP-supplied data or results for any medical or clinical purpose until you have confirmed the relevant sequence, data, interpretations and/or findings with a licensed healthcare professional.” Nevertheless, just because these data and reports are not clinical diagnoses doesn’t guarantee that they can’t or won’t be used to make decisions that might adversely affect you.

PGP mention on NPR yesterday

September 19, 2012

There was a mention of the Personal Genome Project in a story by Rob Stein on NPR yesterday: “As Genetic Sequencing Spreads, Excitement, Worries Grow”. In the context of the vastly reduced cost of personal genomes, George Church discusses the need for publicly shared personal genome and health data.

In addition to the promise of personalized medicine, the story also highlights privacy concerns. These privacy concerns are reflected in how most other genome research projects are conducted: it is very difficult for most researchers to share data, because research subjects have been promised privacy and this needs to be protected. In contrast, PGP participants have agreed to the public sharing of their data — an unusual waiver of privacy guarantees, made only after they demonstrate an understanding of the risks. This is creating an invaluable public resource, critically needed for scientific progress in this rapidly developing field.

How to donate 23andMe exome pilot data

August 25, 2012

Many thanks to the PGP participants who have already donated exome data! We are working to feature these and other donated data on our site.

Customers using 23andMe’s pilot exome service may have recently received an email from 23andMe notifying them that online hosting of their exome data will end soon (at the end of August). Before this occurs, we would like to explain to our participants how they can donate this data, if they wish to do so.

Ideally the full data set would be donated (the BAM data as well as the VCF file). These files are large and difficult to upload. We believe the easiest method for participants to donate this data is to provide the Personal Genome Project with the information needed for download (as given to you by 23andMe):
(1) the Amazon Web Services URL
(looks like: https://exome-export.s3.amazonaws.com/AB1234_C56DE78a9b.tc)
(2) the decryption password
(looks like a random string, e.g. “XahN7tah4s”)

With this information we can download the data directly and decrypt it. An email from you containing this information will be treated as a “public donation of data” to the PGP and may be made public immediately. (Making the BAM data public may take some time though. You can check the status of things by seeing what is visible on your public profile.)

To email us, please log in to your participant account on my.personalgenomes.org and click the “Contact Us” button. By receiving the information through your participant account, we can confirm that the donation was personally made by the participant.

Please let us know if you would rather share the data with us in some other manner. Thanks again to all our participants!

Latanya Sweeney: MyDataCan.org (2012 GET Conference)

August 4, 2012

Latanya Sweeney: MyDataCan.org (watch video) as presented at the 2012 Genomes Environments and Traits Conference at Harvard Medical School. Dr. Latanya Sweeney is Visiting Professor and Scholar in Computer Science and Director of the Data Privacy Lab at Harvard University. See also Dr. Sweeney’s recent testimony to the Presidential Commission for the Study of Bioethical Issues on the topic of genomics and privacy.