Functionally enigmatic genes

A case study of the brain ignorome

Ashutosh K. Pandey, Lu Lu, Xusheng Wang, Ramin Homayouni, Robert Williams

Research output: Contribution to journal › Article

30 Citations (Scopus)

Abstract

What proportion of genes with intense and selective expression in specific tissues, cells, or systems are still almost completely uncharacterized with respect to biological function? In what ways do these functionally enigmatic genes differ from well-studied genes? To address these two questions, we devised a computational approach that defines so-called ignoromes. As proof of principle, we extracted and analyzed a large subset of genes with intense and selective expression in brain. We find that publications associated with this set are highly skewed - the top 5% of genes absorb 70% of the relevant literature. In contrast, approximately 20% of genes have essentially no neuroscience literature. Analysis of the ignorome over the past decade demonstrates that it is stubbornly persistent, and the rapid expansion of the neuroscience literature has not had the expected effect on numbers of these genes. Surprisingly, ignorome genes do not differ from well-studied genes in terms of connectivity in coexpression networks. Nor do they differ with respect to numbers of orthologs, paralogs, or protein domains. The major distinguishing characteristic between these sets of genes is date of discovery, early discovery being associated with greater research momentum - a genomic bandwagon effect. Finally we ask to what extent massive genomic, imaging, and phenotype data sets can be used to provide high-throughput functional annotation for an entire ignorome. In a majority of cases we have been able to extract and add significant information for these neglected genes. In several cases - ELMOD1, TMEM88B, and DZANK1 - we have exploited sequence polymorphisms, large phenome data sets, and reverse genetic methods to evaluate the function of ignorome genes.

Original language: English (US)
Article number: e88889
Journal: PLoS One
Volume: 9
Issue number: 2
DOI: 10.1371/journal.pone.0088889
State: Published - Feb 11 2014

Fingerprint

Brain
Genes
case studies
brain
genes
neurophysiology
Neurosciences
genomics
Reverse Genetics
momentum
Polymorphism
Publications
Momentum
Throughput
genetic polymorphism
image analysis
Tissue
Phenotype
Imaging techniques
phenotype

All Science Journal Classification (ASJC) codes

  • Biochemistry, Genetics and Molecular Biology(all)
  • Agricultural and Biological Sciences(all)

Cite this

Functionally enigmatic genes: A case study of the brain ignorome. / Pandey, Ashutosh K.; Lu, Lu; Wang, Xusheng; Homayouni, Ramin; Williams, Robert.

In: PLoS One, Vol. 9, No. 2, e88889, 11.02.2014.

Pandey, Ashutosh K.; Lu, Lu; Wang, Xusheng; Homayouni, Ramin; Williams, Robert. / Functionally enigmatic genes: A case study of the brain ignorome. In: PLoS One. 2014; Vol. 9, No. 2.
@article{7251f040372a441089d1752c73c6c00f,
title = "Functionally enigmatic genes: A case study of the brain ignorome",
abstract = "What proportion of genes with intense and selective expression in specific tissues, cells, or systems are still almost completely uncharacterized with respect to biological function? In what ways do these functionally enigmatic genes differ from well-studied genes? To address these two questions, we devised a computational approach that defines so-called ignoromes. As proof of principle, we extracted and analyzed a large subset of genes with intense and selective expression in brain. We find that publications associated with this set are highly skewed - the top 5{\%} of genes absorb 70{\%} of the relevant literature. In contrast, approximately 20{\%} of genes have essentially no neuroscience literature. Analysis of the ignorome over the past decade demonstrates that it is stubbornly persistent, and the rapid expansion of the neuroscience literature has not had the expected effect on numbers of these genes. Surprisingly, ignorome genes do not differ from well-studied genes in terms of connectivity in coexpression networks. Nor do they differ with respect to numbers of orthologs, paralogs, or protein domains. The major distinguishing characteristic between these sets of genes is date of discovery, early discovery being associated with greater research momentum - a genomic bandwagon effect. Finally we ask to what extent massive genomic, imaging, and phenotype data sets can be used to provide high-throughput functional annotation for an entire ignorome. In a majority of cases we have been able to extract and add significant information for these neglected genes. In several cases - ELMOD1, TMEM88B, and DZANK1 - we have exploited sequence polymorphisms, large phenome data sets, and reverse genetic methods to evaluate the function of ignorome genes.",
author = "Pandey, {Ashutosh K.} and Lu Lu and Xusheng Wang and Ramin Homayouni and Robert Williams",
year = "2014",
month = "2",
day = "11",
doi = "10.1371/journal.pone.0088889",
language = "English (US)",
volume = "9",
journal = "PLoS One",
issn = "1932-6203",
publisher = "Public Library of Science",
number = "2",

}

TY - JOUR

T1 - Functionally enigmatic genes

T2 - A case study of the brain ignorome

AU - Pandey, Ashutosh K.

AU - Lu, Lu

AU - Wang, Xusheng

AU - Homayouni, Ramin

AU - Williams, Robert

PY - 2014/2/11

Y1 - 2014/2/11

N2 - What proportion of genes with intense and selective expression in specific tissues, cells, or systems are still almost completely uncharacterized with respect to biological function? In what ways do these functionally enigmatic genes differ from well-studied genes? To address these two questions, we devised a computational approach that defines so-called ignoromes. As proof of principle, we extracted and analyzed a large subset of genes with intense and selective expression in brain. We find that publications associated with this set are highly skewed - the top 5% of genes absorb 70% of the relevant literature. In contrast, approximately 20% of genes have essentially no neuroscience literature. Analysis of the ignorome over the past decade demonstrates that it is stubbornly persistent, and the rapid expansion of the neuroscience literature has not had the expected effect on numbers of these genes. Surprisingly, ignorome genes do not differ from well-studied genes in terms of connectivity in coexpression networks. Nor do they differ with respect to numbers of orthologs, paralogs, or protein domains. The major distinguishing characteristic between these sets of genes is date of discovery, early discovery being associated with greater research momentum - a genomic bandwagon effect. Finally we ask to what extent massive genomic, imaging, and phenotype data sets can be used to provide high-throughput functional annotation for an entire ignorome. In a majority of cases we have been able to extract and add significant information for these neglected genes. In several cases - ELMOD1, TMEM88B, and DZANK1 - we have exploited sequence polymorphisms, large phenome data sets, and reverse genetic methods to evaluate the function of ignorome genes.

UR - http://www.scopus.com/inward/record.url?scp=84895734902&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84895734902&partnerID=8YFLogxK

U2 - 10.1371/journal.pone.0088889

DO - 10.1371/journal.pone.0088889

M3 - Article

VL - 9

JO - PLoS One

JF - PLoS One

SN - 1932-6203

IS - 2

M1 - e88889

ER -