Almespar: An Open Reading Frames Detection Tool Using Python
Journal Article
Background and aims. Open reading frames (ORFs) are sections of a reading frame that do not include any stop codons. A reading frame is a sequence of nucleotide triplets read as codons specifying amino acids; a single strand of DNA has three potential reading frames. Long ORFs in a DNA sequence may represent possible protein-coding regions. In addition to long ORFs, which assist in gene locus prediction, there is another type of ORF known as small open reading frames (smORFs), which have 100 codons or fewer. Methods. We developed an offline, cross-platform, and dependable detection tool for the regular ORFs and smORFs prevalent in biomedical studies. Results. The most ORFs were found in the Bos taurus (cattle) insulin gene, which had 17 consecutive ORFs, while the fewest were found in the Canis lupus (dog) insulin gene, which had only 4 ORFs. Conclusion. The software meets the expected boundary constraints. We strongly recommend further research into the detection of nested ORFs.
Osamah Shuhoub Salim Alrouwab, (02-2023), Libya: AlQalam Journal of Medical and Applied Sciences (AJMAS), 6
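The ORF definition given in the Almespar abstract (a stretch of a reading frame with no stop codon, three frames per strand, smORFs of 100 codons or fewer) can be illustrated with a minimal Python sketch. It is not Almespar's actual code: the function names, the `min_codons` parameter, and the choice to report every stop-free codon run without requiring a start codon are assumptions made for illustration.

```python
# Illustrative sketch only; names and parameters are hypothetical,
# not taken from the Almespar source code.
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_codons=1):
    """Return (frame, start, end, n_codons) for each stop-free codon run
    in the three forward reading frames of a single DNA strand."""
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        run_start = frame
        pos = frame
        while pos + 3 <= len(seq):
            if seq[pos:pos + 3] in STOP_CODONS:
                n_codons = (pos - run_start) // 3
                if n_codons >= min_codons:
                    orfs.append((frame, run_start, pos, n_codons))
                run_start = pos + 3  # next run starts after the stop codon
            pos += 3
        # Close the run that reaches the end of the sequence.
        n_codons = (pos - run_start) // 3
        if n_codons >= min_codons:
            orfs.append((frame, run_start, pos, n_codons))
    return orfs

def is_small_orf(n_codons, limit=100):
    """smORFs are ORFs with 100 codons or fewer."""
    return n_codons <= limit
```

Coordinates are 0-based and half-open, and only one strand is scanned; the reverse complement would have to be passed in separately.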
Zenobia: CODIS 13 STR Loci Allele Detection Tool
Journal Article
Short Tandem Repeats (STRs) are among the most mutable regions of the human genome. They comprise tandemly repeated DNA sequences with repeat units two to six base pairs long. Owing to their high mutation rate, they vary considerably in pattern among populations while still being passed on from generation to generation. These loci are widely used in medicine, biology, and criminal investigation. They play a pivotal role in the development of a variety of genetic illnesses and have been intensively investigated in forensics, population genetics, and genetic genealogy. Although many tools for handling STR loci are available, the overwhelming majority rely primarily on Command-Line Interface (CLI) input, which frequently requires tools written in various scripting languages. Installing and launching programs through the command line is time-consuming or unproductive for many students and scholars. The main aim of this project is to develop a cross-platform Graphical User Interface (GUI) package for Combined DNA Index System (CODIS) STR analysis. Zenobia is a Java-based application intended as a step toward making command-line-only programs available to more students and researchers. In general, Zenobia's results satisfy the evaluation metrics for efficiency and time consumption. However, more genetic markers should be added to make the application more useful.
Osamah Shuhoub Salim Alrouwab, (03-2022), iMedPub LTD - 483, Green Lanes London N13 4BS, UK: Genetics and Molecular Biology Research, 6
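The notion of an STR described in the Zenobia abstract (a repeat unit of two to six base pairs repeated in tandem) can be sketched in Python as counting the longest uninterrupted run of a given motif. This is not how Zenobia (a Java application) is implemented; the function name and the regular-expression approach are assumptions made for illustration.

```python
import re

def longest_tandem_run(seq, motif):
    """Return the largest number of back-to-back copies of `motif` in `seq`.

    Hypothetical helper for illustration. Assumes a motif that does not
    overlap itself (for example AGAT), as a left-to-right scan can
    undercount self-overlapping motifs.
    """
    if not 2 <= len(motif) <= 6:
        raise ValueError("STR repeat units are 2 to 6 base pairs long")
    motif = motif.upper()
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    best = 0
    for match in pattern.finditer(seq.upper()):
        best = max(best, len(match.group(0)) // len(motif))
    return best
```

For a forensic-style allele call, the returned repeat count would be the allele number at a locus whose repeat unit is `motif`; real loci often have compound or interrupted repeats that this sketch does not handle.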
Evaluating Efficiency of Some Exact String-Matching Algorithms on Large-Scale Genom
Journal Article
Exact string-matching algorithms have become prominent in many bioinformatics tools. Despite the abundance and diversity of such algorithms, subjecting them to real-time experimental analysis has been critical. This study evaluated the efficiency of ten exact string-matching algorithms on large-scale genomic sequences from a runtime perspective, in order to identify the most efficient algorithms for handling the small alphabet used for nucleic acid coding. The methodology adopted was a factorial experiment with a Randomized Complete Block Design (FRCBD) under four independent factors: four levels of pattern length, four levels of pattern index, two levels of programming language, and ten levels of algorithmic architecture. The runtime of the tested algorithms was measured in nanoseconds. One-way and two-way ANOVA tests with the post-hoc Games-Howell test were used separately for statistical analysis. Two widely used programming languages, C# and Java, were used to examine the possible effect of programming language on algorithm performance.
Osamah Shuhoub Salim Alrouwab, (10-2021), iMedPub LTD - 483, Green Lanes London N13 4BS, UK: American Journal of Computer Science and Information Technology, 9
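As a rough illustration of the kind of measurement described in the abstract above (exact matching timed in nanoseconds), the Python sketch below times a naive exact matcher. The abstract does not name the ten algorithms and the study used C# and Java, so the naive matcher, the function names, and the use of `time.perf_counter_ns` are assumptions made for illustration only.

```python
import time

def naive_search(text, pattern):
    """Return every index at which `pattern` occurs in `text` (brute force)."""
    hits = []
    m = len(pattern)
    for i in range(len(text) - m + 1):
        if text[i:i + m] == pattern:
            hits.append(i)
    return hits

def time_search_ns(search, text, pattern):
    """Run `search` once and return (hits, elapsed nanoseconds)."""
    start = time.perf_counter_ns()
    hits = search(text, pattern)
    elapsed = time.perf_counter_ns() - start
    return hits, elapsed
```

A single timing is noisy; a factorial design such as the one in the study would repeat each measurement across pattern lengths, pattern positions, and algorithms before any statistical comparison.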
Alhudaj: CpG islands Detection Tool in Mammalian Genome Using C++
Journal Article
CpG islands (CGIs) are one of the distinctive features of the mammalian genome and have revolutionized concepts in genetics and molecular pathology. However, the accurate and rapid determination of CpG islands in DNA sequences remains experimentally and computationally challenging. The main goal of this project is to design an offline, cross-platform CpG island detection tool. The algorithm implemented in this study was the traditional sliding window algorithm, written in the C++ programming language. Three datasets were used to evaluate the performance of the application: the ANK1, SPTB, and RET gene sequence files, obtained from NCBI. The most CGIs were found in the ANK1 (ankyrin 1) gene, with 13 successive islands, whereas the fewest were found in the RET (ret proto-oncogene) gene, with only 6 islands. In general, the program respects the boundary limits as expected. For further work, we strongly recommend implementing other algorithms in addition to the sliding window algorithm, such as the Hidden Markov Model (HMM).
Osamah Shuhoub Salim Alrouwab, (10-2021), Spain: International Journal of Progressive Sciences and Technologies (IJPSAT), 29
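The sliding window approach named in the Alhudaj abstract can be sketched as follows. The abstract does not state the thresholds the tool uses, so this Python sketch assumes the commonly cited Gardiner-Garden and Frommer criteria (window of 200 bp, GC content above 50%, observed/expected CpG ratio above 0.6); the function names and the merging of overlapping windows are also assumptions, and the tool itself is written in C++.

```python
def cpg_stats(window):
    """Return (GC content, observed/expected CpG ratio) for one window."""
    n = len(window)
    c = window.count("C")
    g = window.count("G")
    observed = window.count("CG")
    expected = c * g / n
    ratio = observed / expected if expected else 0.0
    return (c + g) / n, ratio

def find_cpg_islands(seq, window=200, step=1, min_gc=0.5, min_ratio=0.6):
    """Slide a fixed window over `seq` and merge qualifying windows.

    Thresholds are assumed defaults, not values taken from the tool.
    Returns 0-based, half-open (start, end) intervals.
    """
    seq = seq.upper()
    islands = []
    for start in range(0, len(seq) - window + 1, step):
        gc, ratio = cpg_stats(seq[start:start + window])
        if gc > min_gc and ratio > min_ratio:
            end = start + window
            if islands and start <= islands[-1][1]:
                # Overlaps or touches the previous island: extend it.
                islands[-1] = (islands[-1][0], end)
            else:
                islands.append((start, end))
    return islands
```

Recomputing the counts for every window keeps the sketch short; an incremental update of the C, G, and CG counts as the window advances would avoid the repeated work on long sequences.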
String Processing Algorithms Problems in Bioinformatics
Conference paper
DNA, RNA, and protein sequences are represented as strings in bioinformatics. For this reason, string processing is a cornerstone of the field, and its problems take a variety of forms, each with a specific meaning. This paper sheds some light on traditional string problems such as local sequence alignment, global sequence alignment, exact pattern matching, approximate pattern matching, finding all maximal palindromes, finding all tandem repeats, and finding all tandem arrays. There is a rich body of research on these problems. This paper presents the major algorithms for them as implemented in BioQt.
Osamah Shuhoub Salim Alrouwab, (01-2019), Libya: The 3rd Libyan Conference on Medical and Pharmaceutical Sciences 2019, 1
BioQt an Integrated Bioinformatics Software Development Kit
Master Thesis
Bioinformatics is a multidisciplinary science that applies computational methods and mathematical statistics to molecular biology. Specializing in bioinformatics offers an opportunity to work with some of the most interesting computational techniques for biological data and to contribute to the diagnosis and treatment of genetic disorders. The purpose of this library, which defines the namespace BioQt, is to provide a set of routines for handling biological sequence data for Qt/C++ users (the full source code is available at https://github.com/alrawab/BioQt). This thesis sheds light on several modules of the BioQt SDK: exact string matching, microsatellite repeats, palindromic sequences, and sequence alignment algorithms (Longest Common Subsequence, Needleman-Wunsch, and Smith-Waterman). It examines and evaluates these challenging bioinformatics problems using Qt/C++.
Osamah Shuhoub Salim Alrouwab, (01-2015), Libya: The Libyan Academy
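Of the alignment algorithms listed in the BioQt thesis abstract, Needleman-Wunsch global alignment is the simplest to show in a few lines. The Python sketch below computes only the optimal global alignment score. It does not reproduce the BioQt API, which is a Qt/C++ library; the function name and the scoring scheme (match +1, mismatch -1, linear gap -1) are assumptions made for illustration.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Return the optimal global alignment score of strings `a` and `b`.

    Hypothetical helper with an assumed scoring scheme. Keeps only two
    rows of the dynamic-programming table, so no traceback is available.
    """
    # Row 0: aligning the empty prefix of `a` against prefixes of `b`.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # column 0: prefix of `a` against the empty string
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]
```

Smith-Waterman local alignment uses the same recurrence with every cell floored at zero and the best cell anywhere in the table taken as the score, so this sketch adapts to it with small changes.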