Monday, October 8, 2012

On ENCODE's results regarding junk DNA



After I took part in an AMA ("Ask Me Anything") on reddit, there has been some discussion elsewhere (such as by Ryan Gregory and in the comments of Ewan Birney's blog) of what I and the other ENCODE scientists meant.  In response, I'd like to echo what many others have said regarding the significance of ENCODE for the question of what fraction of the genome is "junk" (or nonfunctional, or unimportant to phenotype, or evolutionarily unconserved).

In its press releases, ENCODE reported finding 80% of the genome with "specific biochemical activity", which turned into (through some combination of poor presentation on the part of ENCODE and poor interpretation on the part of the media) reports that 80% of the genome is functional.  This claim is unlikely given what we know about the genome (here is a good explanation of why), so it created some controversy.

I think very few members of ENCODE believe that the consortium proved that 80% of the genome is functional; no one claimed as much on the reddit AMA, and Ewan Birney has made it clear on his blog that he would not make this claim either.  In fact, I think the importance of ENCODE's results to the question of what fraction of DNA is functional is very small, and that question is much better answered with other analyses, like that of evolutionary conservation.  Lacking proof either way from ENCODE, there was some disagreement on the AMA regarding what the most likely true fraction is, but I think this stemmed from disagreements about definitions and willingness to hypothesize about undiscovered function, not misinterpretation of the significance of ENCODE's results.

I think many members of the consortium (including Ewan Birney) regret the choice of terminology that led to the misinterpretations of the 80% number.  Unfortunately, such misinterpretations are always a danger in scientific communication (both within the scientific community and to the public).  Whether the consortium could have done a better job explaining the results, and whether we should expect the media to represent scientific results more accurately, is hard to say.

I think the contribution of ENCODE lies not in determining what DNA is functional but rather in determining what the functional DNA actually does.  This was the focus of the integration paper and the companion papers, and I would have preferred for this to be the focus of the media coverage.