Using the Bitola System to Identify Candidate Genes for Parkinson's Disease

The complexity of multifactorial diseases such as Parkinson's disease (PD) often complicates the identification of causal genetic factors by traditional approaches such as positional cloning and candidate gene analysis. PD is an etiologically and genetically complex disease and the second most common neurodegenerative disorder after Alzheimer's disease. Most cases of PD are idiopathic, and a small but growing subset of individuals have a single-gene defect as the cause. The main goal of this research was to identify potential candidate genes for idiopathic PD using the biomedical discovery support system BITOLA. The open discovery version of the BITOLA bioinformatics tool was used to detect potential candidate genes for PD. Data on chromosome location, tissue-specific expression of the potential candidate genes, and their potential association with PD were obtained from Medline, Locus Link, Gene Cards and OMIM. Using the BITOLA system,  genes were identified as potential candidate genes for PD. The role of three genes (MAPT, PARK, UCHL) in PD had been confirmed earlier. Discovering novel candidate genes for multifactorial diseases with the BITOLA bioinformatics tool could offer a new opportunity for researching the genetic basis of PD without using tissue samples from patients.


INTRODUCTION
Parkinson's disease (PD) is a complex multifactorial neurodegenerative disorder which affects people over the age of . It is estimated that most cases of PD are sporadic and have a late age of onset, yet a small but growing subset of individuals has a single-gene defect as the cause [, ]. The identification of the first gene (alpha-synuclein) in familial PD only  years ago was a major step in understanding the molecular mechanisms of neurodegeneration []. To date, at least  loci associated with PD have been identified: PARK, PARK, PARK, PARK, PARK, PARK, PARK, PARK, PARK and GBA []. Linkage analysis and gene expression studies indicate a very large number of candidate genes for PD [, , , , ]. In addition, variation in genes associated with sporadic PD in some cases increases or decreases the risk of PD. Moreover, the concept of susceptibility genes allows for the involvement of gene-gene and gene-environment interactions in sporadic PD []. Previous studies of genes identified as candidates for PD indicate that there are many interactions between the mutated genes. Although they occur in only a small number of patients, the discovery of genetic forms of PD demonstrated conclusively that PD can be inherited, which has opened a new and exciting area of research [].

Traditional approaches such as positional cloning and candidate gene analysis, as well as modern methodologies such as gene expression profiling, tend to fail to discover the genes underlying diseases []. Also, DNA microarray analyses produce hundreds of differentially expressed genes which cannot be distinguished from normally expressed genes. Genes that harbor no such mutations, but that play key roles in parts of the biological network that lead to disease, are systematically missed by the forward genetics approach []. All these strategies fail to help researchers reduce the target genes to a manageable number or prioritize the disease-specific causal genes for further analysis []. For this reason, sophisticated techniques and tools are needed to identify key candidates from the gene lists generated by disease gene discovery methods [].

To support the discovery of new information from the literature, especially candidate genes for various diseases, Peterlin and Hristovski developed the biomedical discovery support system BITOLA []. This system has been used to identify candidate genes of interest for multiple sclerosis (MS) and bilateral polymicrogyria (BPP) [, , ]. By integrating data from the literature, the BITOLA tools can reveal new potential candidate genes for Parkinson's disease. The main goal of this research was to identify candidate genes for PD that meet the criteria defined by the authors in the study design.

MATERIALS AND METHODS
For a gene to be included in the list of candidate genes for PD, it had to meet the following criteria defined by the authors: to show a specific pattern of expression in brain tissue, to be involved in the processes of cell adhesion and cell death, and not to have been previously related to PD in the literature. These criteria were chosen because brain tissue is the target in PD, which is characterized by a progressive loss of dopaminergic neurons of the substantia nigra. In the literature, oxidative stress and apoptotic cell death have been implicated in dopaminergic cell loss. Cell adhesion molecules play a central role in neural development and are also critically involved in axon regeneration and synaptic plasticity in the adult nervous system. Information about chromosome loci, tissue-specific expression and function of the potential candidate genes, and their association with genetic disorders, was extracted from the following databases: Medline [], Locus Link [], Gene Cards [] and Online Mendelian Inheritance in Man (OMIM) [].

The BITOLA system [] was used to find new potential biomedical relations between PD and the pathogenetic mechanisms (cell adhesion and cell death). BITOLA is an interactive literature-based biomedical discovery support system. Its purpose is to help biomedical researchers make new discoveries by finding potentially new relations between biomedical concepts []. The set of concepts currently contains Medical Subject Headings (MeSH), which are used to index Medline, and human genes from the Human Genome Organization (HUGO). The potential new relations are discovered by mining the Medline database []. The system is available in two versions: "closed discovery" and "open discovery" []. Open discovery allows the input of a single concept, then categories for first-order relatives of that concept, then categories for relatives of those first-order concepts []. The BITOLA system was used according to the instructions proposed by its authors [].

The algorithm for discovering new relations between medical concepts is described in Table  []. This algorithm was adapted to PD. As concept X we took Parkinson's disease, as concept Y cell function, and as concept Z gene or gene product. The first step is to find all concepts Y (cell function) related to the starting concept X (PD). Then all concepts Z (gene or gene product) related to Y are found. As the last step, we check whether X (PD) and Z (gene or gene product) appear together in the medical literature; we then evaluate the proposed (X, Z) pairs and select those that deserve further investigation. If the chromosomal region of PD matches the location of the related genes (Z) and there are no MEDLINE documents mentioning both PD and the genes Z, then the genes Z can be proposed as candidate genes for X (PD).

Because each concept in Medline can be associated with many other concepts, the possible number of X-Z combinations can be extremely large []. To deal with this combinatorial problem, the authors of the BITOLA system integrated filtering (limiting) and ordering capabilities into the discovery algorithm. The related concepts can be limited by the semantic type to which they belong; a final possibility for limiting the number of related or falsely related concepts is to set thresholds on the support and confidence measures of the association rules []. The main goal of the ordering is to present the best candidates first, to make human review as easy as possible []. Our discovery algorithm integrated in the open BITOLA system

RESULTS
TABLE 1. The algorithm for discovering new relations between medical concepts [14]

Using the discovery algorithm adapted to PD and integrated into the open BITOLA system, we searched all related concepts Z (gene or gene product) and further limited them to those matching the chromosome loci described in Table . In this manner,  genes were suggested by the open BITOLA system. On further analysis we excluded  genes whose expression pattern did not preferentially include brain tissue, the target tissue in PD. Based on information from Locus Link [], Gene Cards [] and OMIM [] about the tissue-specific expression pattern and cell function (cell adhesion and cell death) of the remaining  genes, we included  genes as potential candidate genes for PD (Table ). In the next step, we checked how frequently those genes appear together with PD in the literature. The genes IL, BDNF, MAPT and PARK had the highest frequencies. The gene IL appears with PD in  Medline records. The gene BDNF appears with PD in Medline records  times. The gene MAPT co-occurs with PD in  Medline documents, and PARK with PD is documented in  records. This can be explained by the fact that these genes have been studied in populations of patients with PD, and the role of MAPT and PARK in the pathogenesis of PD is confirmed. The PTPRU gene appears with concept X (PD) in three Medline documents. However, none of those studies confirms its role in PD, so we did not exclude it. The genes CNTNAP, CNP, TRAF, SORT, ARTN, TMD, NSF, PADI, CDC, GFPT and AMIGO do not appear together with concept X (PD) in Medline documents. One of the rules of the discovery algorithm is that an association between concept X (PD) and concept Z (gene or gene product) must not already exist. We therefore excluded the genes IL, BDNF, MAPT and PARK from further analysis. However, their presence in the list of genes offered by the BITOLA system is important because it indicates the potential role of the BITOLA system in the (re)identification of disease candidate genes. Given the tissue-specific expression pattern of the remaining twelve genes and the fact that they have not been studied in relation to PD, they can be proposed as interesting candidate genes for further analysis.
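The successive exclusion steps reported here can be sketched as a simple filter. This is only an illustration under stated assumptions: the `Gene` record, its field names, the placeholder genes and loci are all hypothetical and carry no real data. A gene is taken to satisfy the functional criterion if it is annotated with either cell adhesion or cell death, and the flag `reviewed_no_role` stands in for the manual judgment that co-occurring records do not establish a role in the disease (the reasoning applied to PTPRU).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    symbol: str
    locus: str                      # hypothetical locus label
    brain_expressed: bool           # preferential expression in brain tissue
    functions: frozenset            # annotated cell functions
    cooccurrences: int              # literature records mentioning gene and disease
    reviewed_no_role: bool = False  # manual review: records do not confirm a role


def select_candidates(genes, disease_loci, required_functions):
    """Apply the selection criteria in the order described in the text."""
    selected = []
    for gene in genes:
        if gene.locus not in disease_loci:
            continue  # location does not match a disease-associated region
        if not gene.brain_expressed:
            continue  # expression pattern does not include the target tissue
        if not (gene.functions & required_functions):
            continue  # not involved in the chosen cell functions
        if gene.cooccurrences > 0 and not gene.reviewed_no_role:
            continue  # relation to the disease is already in the literature
        selected.append(gene)
    return selected


# Placeholder genes only; none of these values come from the study.
genes = [
    Gene("GENE_A", "locus_1", True, frozenset({"cell adhesion"}), 0),
    Gene("GENE_B", "locus_1", False, frozenset({"cell death"}), 0),
    Gene("GENE_C", "locus_2", True, frozenset({"cell death"}), 40),
    Gene("GENE_D", "locus_2", True, frozenset({"cell death"}), 3, reviewed_no_role=True),
    Gene("GENE_E", "locus_9", True, frozenset({"cell adhesion"}), 0),
]

selected = select_candidates(
    genes,
    disease_loci={"locus_1", "locus_2"},
    required_functions={"cell adhesion", "cell death"},
)
print([gene.symbol for gene in selected])
```

Keeping the manual-review flag separate from the co-occurrence count makes the one judgment-based exception explicit instead of hiding it in a threshold.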

DISCUSSION
By analyzing the results obtained with the BITOLA bioinformatics tool, we compiled a list of potential candidate genes for PD. Twelve genes were chosen as potential candidate genes (CNTNAP, NSF, CNP, TRAF, PADI, PTPRU, SORT, ARTN, TMD, CDC, GFPT and AMIGO) whose role in PD should be further explored. The most interesting of these twelve genes are NSF, CDC and GFPT, because their role in the pathogenesis of PD has not yet been studied.

N-ethylmaleimide sensitive factor (NSF) is an ATPase associated with various cellular activities (AAA) protein, broadly required for intracellular membrane fusion [, ]. It appears to interact with other proteins, such as the AMPA receptor subunit GluR and beta-AR, and is thought to affect their trafficking patterns. Recently, it has been shown that NSF can be regulated by hydrogen peroxide. H2O2 is thought to inactivate NSF through oxidation of the Cys in NSF-D []. Consistently, mutation of Cys to threonine eliminates the sensitivity of NSF to H2O2 []. This might suggest that NSF could be a redox sensor in the cell, whose activity decreases when the oxidation state of the cytosol increases []. Interestingly, the NSF gene is located near the MAPT gene. Mutations in the tau gene, MAPT, cause familial frontotemporal dementia with parkinsonism linked to chromosome    [].

Taking into account that CDC is a key component of the cell death machinery in sympathetic neurons [], its potential role in PD should be considered further. Glutamine-fructose--phosphate transaminase  (GFPAT) is the rate-limiting enzyme of the hexosamine pathway, which has been implicated in the pathogenesis of diabetic nephropathy []. Glucosamine -phosphate is subsequently converted to uridine diphosphate N-acetylglucosamine, which is used for the O-glycosylation of intracellular proteins. Although this gene is associated with diabetic nephropathy, it is differentially regulated in PD and may play a role in sporadic cases of PD and represent a candidate for as yet unidentified disease-causing genes [].

Given that NSF, CDC and GFPT have not been related to PD, in contrast to the genes PARK, MAPT and UCHL, they may represent potential targets for further research.

CONCLUSION
On the basis of the above, the role of the NSF, CDC and GFPT genes, as well as the role of the  genes selected in the list of candidate genes for PD by the open BITOLA system, could be readily tested by screening PD patients for mutations in those genes. Discovering novel candidate genes for multifactorial diseases with the BITOLA bioinformatics tool could offer a new opportunity for researching the genetic basis of PD without using tissue samples from patients.

DECLARATION OF INTEREST
The authors state that there is no conflict of interest.

FIGURE 1. The user interface of the open BITOLA system with the integrated discovery algorithm for finding new relations between medical concepts, adapted to Parkinson's disease

As the starting concept X, the name Parkinson's disease was entered. After limiting the related concepts Y by the semantic type Cell function,  concepts were obtained corresponding to MeSH descriptions. According to the research strategy, cell adhesion and cell death were chosen as the most relevant to PD. Using those concepts, all related concepts Z of the semantic type gene or gene products were searched and further limited to those matching chromosome location and discoveries only. All chromosome loci associated with PD in the literature were examined.

TABLE 2. The chromosome regions examined in the open BITOLA system

TABLE 3. The genes extracted from the open BITOLA system for further analysis

(FTDP-), and common variation in MAPT is strongly associated with the risk of progressive supranuclear palsy (PSP), corticobasal degeneration and, to a lesser extent, AD and Parkinson's disease (PD), implicating the involvement of tau in common neurodegenerative pathway(s) []. The genomic complexity around the MAPT locus is emphasized not only by complex arrangements of duplications close to the NSF gene but also by a recently identified de novo microdeletion of - kb of the locus in individuals with developmental delay and learning disabilities []. On the basis of these facts, it would be interesting to investigate a potential interaction between the MAPT and NSF genes. Neuronal apoptosis, or programmed cell death (PCD), is a crucial process occurring not only during normal development and tissue turnover but also in pathological situations such as stroke and neurodegenerative diseases []. Neuronal PCD involves the activation of a number of enzymes and genes and is regulated by specific growth factors, such as neurotrophins, which promote survival of particular neuronal populations by binding to specific cell surface receptors. Overexpression of activated Rac or Cdc in SCG neurons maintained in the presence of NGF induced apoptosis, whereas expression of dominant negative mutants of Cdc or Rac blocked apoptosis following NGF withdrawal. Furthermore, Cdc-induced death was prevented by co-expressing the c-Jun dominant negative FLAGΔ