
Mapping Knowledge Domains (2004) / Chapter Skim

A method for finding communities of related genes
Pages 59-66

The Chapter Skim interface presents what we've algorithmically identified as the most significant single chunk of text within every page in the chapter.


From page 59...
... Merely locating all relevant articles in a database by using a simple search utility would be time consuming, not to mention inefficient and difficult, because of shortcomings of the human gene nomenclature system. In contrast, our method indexes gene symbol occurrences in all articles of a large database such as Medline in <1 day and then can produce a list of communities of functionally related genes in another half day. In this article, we present a method to find communities of related genes.
From page 60...
... And, whereas online biomedical databases provide easy access to abstracts, a manual literature survey would encounter difficulties beyond the large number of results, due to the nomenclature system for human genes. Both the existence of multiple alias symbols for many genes and the frequent occurrence of unrelated abbreviations equivalent to gene symbols interfere with any simple search utility.
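As a rough illustration of the two nomenclature problems named in this excerpt, the sketch below maps alias symbols onto one canonical symbol and skips symbols that also occur as unrelated abbreviations. The alias table, the ambiguous-symbol set, the toy symbols, and the function name are all hypothetical and are not taken from the article.

```python
import re

# Hypothetical alias table: alias symbol -> canonical symbol (toy values).
ALIASES = {"GA1": "GENEA", "GENEA": "GENEA", "GB2": "GENEB", "GENEB": "GENEB"}

# Hypothetical symbols that also occur as unrelated abbreviations.
AMBIGUOUS = {"GB2"}


def gene_symbols(text):
    """Return the set of canonical gene symbols mentioned in text."""
    found = set()
    for token in re.findall(r"[A-Za-z0-9-]+", text):
        if token in AMBIGUOUS:
            continue  # this token cannot be trusted as a gene mention
        if token in ALIASES:
            found.add(ALIASES[token])
    return found
```

Here `gene_symbols("GA1 interacts with GENEA and GB2")` counts the alias and the canonical symbol as one gene and ignores the ambiguous token. A real system would need a curated alias list and a more careful test for ambiguous abbreviations than a fixed set.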
From page 61...
... Our modifications were necessary to make the method applicable to gene graphs, which are large and are created from source information that may by nature be incomplete or flawed. In particular, we identify many possible community structures and average them into a final list of communities.
From page 62...
... By varying which high-betweenness edges are removed early in the process, we may therefore identify many community structures on a graph. By then comparing the structures, we can easily identify tightly knit communities, which do not vary from structure to structure, and ambiguous genes, which migrate from group to group.
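One way to picture the comparison step is sketched below, assuming each community structure is given as a list of sets of gene symbols. The function name and the rule used here (genes form a tight group only if they share a community in every structure) are illustrative assumptions, not the article's exact averaging procedure.

```python
from collections import Counter
from itertools import combinations


def compare_structures(structures):
    """structures: list of partitions, each a list of sets of gene symbols."""
    together = Counter()
    genes = set()
    for partition in structures:
        for community in partition:
            genes |= set(community)
            for a, b in combinations(sorted(community), 2):
                together[(a, b)] += 1

    # Link two genes only if they share a community in every structure.
    n = len(structures)
    neighbors = {g: set() for g in genes}
    for (a, b), count in together.items():
        if count == n:
            neighbors[a].add(b)
            neighbors[b].add(a)

    # Connected components of the "always together" relation.
    seen, tight, ambiguous = set(), [], set()
    for g in sorted(genes):
        if g in seen:
            continue
        stack, component = [g], set()
        while stack:
            x = stack.pop()
            if x in component:
                continue
            component.add(x)
            stack.extend(neighbors[x] - component)
        seen |= component
        if len(component) > 1:
            tight.append(component)
        else:
            ambiguous |= component  # no constant partner across structures
    return tight, ambiguous
```

For two toy structures `[{"A","B","C"}, {"D","E"}]` and `[{"A","B"}, {"C","D","E"}]`, genes A and B stay together, as do D and E, while C migrates between groups. Under this simple rule a gene that is isolated in every structure is also reported as ambiguous, so a looser threshold than "every structure" may be more practical on noisy data.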
From page 63...
... Our modified process can therefore be applied repeatedly to identify different plausible community structures on the graph. This process may erroneously remove an intracommunity edge, which can happen if a large percentage of the centers considered lies in one community.
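The role of the "centers" can be sketched as follows, assuming edge betweenness is accumulated by breadth-first search from a random sample of center nodes and the top-scoring edge is then removed. The function names, the sampling scheme, and the adjacency-dict representation are assumptions made for illustration; the excerpt does not give the authors' exact procedure.

```python
import random
from collections import defaultdict, deque


def partial_edge_betweenness(adj, centers):
    """Sum shortest-path edge scores over BFS trees rooted at the centers."""
    score = defaultdict(float)
    for s in centers:
        dist = {s: 0}
        paths = defaultdict(float)  # number of shortest paths from s
        paths[s] = 1.0
        preds = defaultdict(list)
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, splitting credit among paths.
        credit = defaultdict(float)
        for w in reversed(order):
            for v in preds[w]:
                share = paths[v] / paths[w] * (1.0 + credit[w])
                score[frozenset((v, w))] += share
                credit[v] += share
    return score


def remove_top_edge(adj, sample_size, rng):
    """Remove the edge with the highest sampled betweenness and return it."""
    centers = rng.sample(sorted(adj), min(sample_size, len(adj)))
    score = partial_edge_betweenness(adj, centers)
    edge = max(score, key=score.get)
    a, b = tuple(edge)
    adj[a].discard(b)
    adj[b].discard(a)
    return edge
```

On a toy graph of two triangles joined by a single bridge, sampling every node as a center removes the bridge first. With a small sample drawn mostly from one triangle, an edge inside that triangle can score higher instead, which is the kind of erroneous intracommunity removal the excerpt describes.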
From page 64...
... This step could create a problem if one ended up with huge communities at the end, but we found that in general the largest communities in the final result had only 10 or 15 more genes than the largest communities in each individual structure, which incidentally indicates that our edge removal algorithm had a low error rate. The entire process of determining community structure is displayed in Table 2.
From page 65...
... A good example of nonfunctionally related genes with similar names that are placed in different communities is MMP11 and MMP9 (PMID 8645587). Often nonfunctionally related neighboring genes do appear together in one community in a small ... [PNAS | April 6, 2004 | vol. ...]
From page 66...
... We show that the communities produced in the case of colon cancer have interesting features that give one insight into the function of the component genes. The identification of many similar community structures on each gene graph allows us to recognize those genes that belong in two or more different communities.


This material may be derived from roughly machine-read images, and so is provided only to facilitate research.