Abstract
Detecting outliers in gene–protein mappings that reveal the presence of a neuro-degenerative disorder, or that distinguish between two different neuro-degenerative disorders, is an unexplored research area. This work proposes a new graph-based methodology for detecting outliers in gene–protein mappings anchored on the physicochemical properties of the proteins. The results identify the specific protein physicochemical properties involved and the gene mapped to each protein. The work makes the following contributions: (i) a simple graphical approach to visualize the gene–protein mapping for neuro-degenerative disorders based on structural and physicochemical properties; (ii) a pre-processed database generated by feature extraction from multiple web servers; and (iii) a methodology for extracting outliers from tabulated (supervised/unsupervised) data that can be extended to detect outliers in any dataset. The outliers detected by this methodology were further studied using the REVIGO server, which reveals the functional role of the genes in maintaining healthy human activity. The outliers showed no significant contribution, and hence it is believed that this method can be extended to detect noisy outlier data in other biological and clinical datasets.
Data availability statement
The data will be made available on request since the authors are continuing their research on this data.
Acknowledgements
This research work was carried out with funding from the following sources: (i) The Research Council (TRC), Oman, funded project under the Research Grant Scheme titled “Investigations on Computational Methods for the Early Detection and Neuro-Cognitive Development of Children with Autism Spectrum Disorders (ASD) in Oman”, Proposal ID: BFP/RGP/ICT/21/169; (ii) Science and Engineering Research Board (SERB), Department of Science and Technology (DST), funded project under the Young Scientist Scheme – Early Start-up Research Grant, titled “Investigation on the effect of Gene and Protein Mutants in the onset of Neuro-Degenerative Brain Disorders (Alzheimer’s and Parkinson’s disease): A Computational Study”, Reference No. SERB – YSS/2015/000737.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Jacob, S.G., Sulaiman, M.M.B.A., Bennet, B. et al. A graphical approach for outlier detection in gene–protein mapping of cognitive ailments: an insight into neurodegenerative disorders. Netw Model Anal Health Inform Bioinforma 11, 22 (2022). https://doi.org/10.1007/s13721-022-00364-4
DOI: https://doi.org/10.1007/s13721-022-00364-4