Michael Conry


K-mer matching using the Automata Processor in de novo DNA assembly


De novo assembly of post-sequencing DNA reads is a complex computational task that requires comparing a massive number of sequenced reads against reference sequences. To date, efforts to reduce the intense computational demands on a traditional CPU have included heuristic algorithms as well as implementations of those algorithms on other architectures such as GPUs. This study analyzes the application of a novel non-von Neumann memory hardware, Micron's Automata Processor (AP), which is theoretically capable of dramatically reducing computation time through hardware-based, massively parallel pattern matching [1]. We convert 20-mer DNA sequences into abstract mathematical models known as non-deterministic finite automata (NFAs) and use the Automata Simulator to match them against an input stream. By estimating the reduction in computation time and demonstrating the benefits of the exhaustive nature of this pattern matching technique, we attempt to show the advantages of using this device in place of a traditional CPU.
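
As a rough illustration of the matching idea described above, the following Python sketch simulates a set of k-mer automata over an input stream in software. It is not the Automata Simulator or any Micron API; the function name `match_kmers` and the state representation are assumptions made for this example. Each k-mer is treated as a linear chain of states, one per base, and every chain's start state is re-armed on each input symbol so that a match can begin at any position.

```python
from typing import List, Set, Tuple


def match_kmers(kmers: List[str], stream: str) -> List[Tuple[int, str]]:
    """Return (end_index, kmer) for every exact occurrence of a k-mer in stream.

    Hypothetical helper for illustration; assumes every k-mer is non-empty.
    """
    # An active state is (kmer_index, number_of_bases_matched_so_far).
    active: Set[Tuple[int, int]] = set()
    hits: List[Tuple[int, str]] = []
    for pos, symbol in enumerate(stream):
        next_active: Set[Tuple[int, int]] = set()
        # Start states are enabled on every symbol, so matches may overlap.
        candidates = active | {(i, 0) for i in range(len(kmers))}
        for i, depth in sorted(candidates):
            if kmers[i][depth] == symbol:
                if depth + 1 == len(kmers[i]):
                    hits.append((pos, kmers[i]))  # reporting state reached
                else:
                    next_active.add((i, depth + 1))
        active = next_active
    return hits


if __name__ == "__main__":
    # Toy 3-mers keep the example readable; the study itself uses 20-mers.
    for end, kmer in match_kmers(["ACG", "CGT"], "ACGTACG"):
        print(end, kmer)
```

In this software version the inner loop visits every active state in turn, so the cost per input symbol grows with the number of k-mers. The advantage claimed for the AP is that all such state transitions are evaluated in parallel in hardware for each symbol.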