An Algorithm Based on Suffix Tree for Identification of Repeats in DNA Sequence

Author: WangXiaoWu
Tutor: HuoHongWei
School: Xi'an University of Electronic Science and Technology
Course: Computer Software and Theory
Keywords: Bioinformatics, Repeat identification, Suffix tree, RepSeeker
CLC: TP301.6
Type: Master's thesis
Year: 2008

Abstract


Repeat identification is one of the primary tools for bioinformatic analysis of genome sequences. Repetitive DNA occupies an important place in eukaryotic genomes, and identifying repeats can reveal patterns of genome evolution and the genetic basis of many diseases. Many repeats, such as transposons, as well as coding regions, recur throughout a genome sequence, so identifying them plays an important role in decoding the genome. Taking both the length and the frequency of repeats into account, this thesis presents RepSeeker, a suffix-tree-based algorithm for identifying primary repeats. The algorithm applies a minimum frequency threshold and extends repeats to their maximal length by merging overlapping ones. It takes the suffix tree of a DNA sequence as input and uses suffix-tree queries to produce a classification of the primary repeats in that sequence. To further improve RepSeeker's efficiency, the suffix tree construction algorithm is adapted: a leaf information array LL (LeafList) is introduced, and each branch node of the suffix tree additionally stores the number of leaf nodes in its subtree. On this basis the suffix-tree query algorithm is improved so that RepSeeker avoids frequent subtree traversals. The modifications to Ukkonen's suffix tree construction algorithm increase its space requirements but have little effect on the time complexity of construction. Several typical DNA sequences from NCBI are used as test data, and the improved algorithm is compared with repeat identification based on the unmodified Ukkonen construction. The results show that RepSeeker greatly reduces running time with no loss of accuracy.
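
The idea of storing a leaf count in each branch node, so that the frequency of a repeat can be read directly from the node instead of by traversing its subtree, can be illustrated with a minimal Python sketch. It is not the thesis's implementation: for brevity it builds an uncompressed suffix trie naively (quadratic time and space) rather than a suffix tree with Ukkonen's algorithm, it omits the LeafList array, and the names `build_suffix_trie`, `frequent_repeats`, `longest_only`, `min_freq` and `min_len` are hypothetical. The `longest_only` filter is only a simplified stand-in for the abstract's step of extending repeats to maximal length.

```python
class Node:
    """Trie node; leaf_count = number of suffixes (leaves) below this node."""
    __slots__ = ("children", "leaf_count")

    def __init__(self):
        self.children = {}
        self.leaf_count = 0


def build_suffix_trie(text):
    """Naive construction: insert every suffix, updating counts on the way."""
    root = Node()
    for start in range(len(text)):
        node = root
        node.leaf_count += 1
        for ch in text[start:]:
            node = node.children.setdefault(ch, Node())
            node.leaf_count += 1
    return root


def frequent_repeats(text, min_freq, min_len):
    """Return {substring: occurrences} for substrings meeting both thresholds.

    The occurrence count is read from the stored leaf_count, so no subtree
    is traversed just to count its leaves.
    """
    root = build_suffix_trie(text)
    found = {}
    stack = [(root, "")]
    while stack:
        node, prefix = stack.pop()
        for ch, child in node.children.items():
            if child.leaf_count < min_freq:
                continue  # every extension is at most as frequent: prune
            candidate = prefix + ch
            if len(candidate) >= min_len:
                found[candidate] = child.leaf_count
            stack.append((child, candidate))
    return found


def longest_only(found):
    """Keep only repeats that are not contained in a longer reported repeat."""
    return {s: n for s, n in found.items()
            if not any(s != t and s in t for t in found)}


if __name__ == "__main__":
    repeats = frequent_repeats("ACGACGT", min_freq=2, min_len=2)
    print(longest_only(repeats))
```

Because a node's count can never exceed its parent's, a branch that falls below the minimum frequency can be skipped entirely, which is where the stored counts save work in this sketch.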

Related Dissertations

  1. BioLab a Bioinformatics Oriented Grid Portal,TP399-C8
  2. Cloning and Expression Analysis of GPx, GST and SAHH Genes in Chlamydomonas Sp. ICE-L from Antarctica,Q943.2
  3. Identification of the Causal Organism of Soybean Bacterial Spots and Cloning and Functional Analysis of Two Type Ⅲ Secreted Effectors,S435.651
  4. Research on Re-sequencing of Next-Generation-Sequencing Data Based on GPU and Compressed Index,Q78
  5. Hereditary cataract gene mutation and protein functional changes,R776.1
  6. Molecular Cloning and Bioinformatics Analysis of an Unknown Function Gene from Musca Domestica,Q78
  7. Rapeseed boron transporter gene cloning and identification,Q943.2
  8. Development and Characterization of Monoclonal Antibody Against Nucleoprotein of Avian Influenza a Virus by DNA Immunization,S855.3
  9. Research of Global Multiple Sequence Alignment Algorithms Based on Information Entropy,Q78
  10. Meta-analysis of the Differentially Expressed Lymphocytic Leukemia-related Genes,R733.7
  11. Microarray Data Analysis of Erectile Dysfunction in Diabetic Rats and the Study of Apoptosis in Corpus Cavernosum Smooth Muscle Cell,R698
  12. Proteomic Analysis of Heparin-related Proteins from Human Serum: A Step Towards Identification of Molecular Markers of Preeclampsia,R714.245
  13. Peptide targeting human FoxMlc lead drug screening and molecular modeling,R730.2
  14. The Design and Realization of Microarray Data Analysis Platform,TP311.52
  15. Analysis of the Differential Proteomics between Fresh and Frozen-thawed Ram Spermatozoa,S826
  16. Polymorphisms and Bioinformatic Analysis of the Gene OLA-DRB1 Exon 3 in Tibetan Sheep,S826
  17. Design and Implementation of Data Service Generation System for Bioinformatics Database,TP311.52
  18. Dissection of Resistant Genes to Northern Corn Leaf Blight in CIMMYT Maize Cultivar Ent17 and Bioinformatics Analysis of Candidate Sequences between Two Flanking Markers Linked with Resistance Gene Ht1,S435.11
  19. Xinjiang pear germplasm molecular markers and self-incompatibility gene,S661.2
  20. The Research and Implementation of Protein Classification Algorithm on the Basic of String Kernel,TP301.6
  21. Construction of Tender Shoots cDNA Library of Camellia Sinensis Cv. Ziyang 1 and Analysis of Expressed Sequence Tags,S571.1

CLC: Industrial Technology > Automation technology, computer technology > Computing technology, computer technology > General issues > Theories, methods > Algorithm Theory