Master's Thesis

High performance gene prediction for small RNA

Gene prediction is one of the most important and alluring problems in computational biology, and also one of the most time-consuming. Efficient gene prediction therefore calls for high-performance versions of existing tools. Infernal, the tool we aim to accelerate in this project, is widely used to predict small RNAs in prokaryotic genomic sequences. It detects homologs accurately by modeling both the sequence and the secondary structure of RNA families. However, its searches are computationally expensive, and its performance varies with both inputs: the small RNA family and the target database sequence. In this thesis, we develop a highly scalable task-parallel version of CMSEARCH, a program in the Infernal package, using Pthreads and OpenMP; it can also be combined with Infernal's data-parallelization feature. In addition, we develop a pipelined parallel computation model to accelerate the tool further. The pipeline obtains additional performance gains through a load-balancing strategy that dynamically assigns the number of threads to each pipeline stage. In our experiments, the two proposed acceleration schemes, task-parallel and task-parallel with pipelining, showed performance gains of approximately 20% and 50%, respectively.
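
The abstract does not spell out how threads are assigned to pipeline stages, so the following is only a minimal sketch of one plausible load-balancing rule, not the method used in the thesis. It assumes that a per-stage cost (for example, recently measured time per work item) is available, gives every stage at least one thread, and splits the remaining threads in proportion to cost. The function name `allocate_threads` and its parameters are hypothetical.

```python
from typing import List


def allocate_threads(stage_costs: List[float], total_threads: int) -> List[int]:
    """Split total_threads across pipeline stages in proportion to stage cost.

    Hypothetical helper: stage_costs[i] is an assumed measurement of how
    expensive stage i currently is. Every stage keeps at least one thread.
    """
    n = len(stage_costs)
    if n == 0:
        raise ValueError("at least one stage is required")
    if total_threads < n:
        raise ValueError("need at least one thread per stage")
    if any(c < 0 for c in stage_costs):
        raise ValueError("stage costs must be non-negative")

    alloc = [1] * n                  # one guaranteed thread per stage
    spare = total_threads - n        # threads left to distribute
    total_cost = sum(stage_costs)

    if total_cost == 0:
        # No cost information: hand out spare threads round-robin.
        for i in range(spare):
            alloc[i % n] += 1
        return alloc

    # Ideal (fractional) share of the spare threads for each stage.
    shares = [spare * c / total_cost for c in stage_costs]
    floors = [int(s) for s in shares]
    for i in range(n):
        alloc[i] += floors[i]

    # Largest-remainder rounding: leftover threads go to the stages
    # whose fractional share was cut the most by flooring.
    leftover = spare - sum(floors)
    order = sorted(range(n), key=lambda i: shares[i] - floors[i], reverse=True)
    for i in order[:leftover]:
        alloc[i] += 1
    return alloc
```

In a dynamic setting, a scheduler could call such a function periodically with updated cost measurements and resize each stage's worker pool to match; how often to rebalance, and how costs are measured, are design choices this sketch leaves open.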
