
Application of Graphics Processing Unit in Matrix Inversion and Normal Mode Analysis

Author: LiuLi
Supervisor: LiHongLin
School: East China University of Science and Technology
Major: Applied Computer Technology
Keywords: Graphics processing unit (GPU); Scientific computing; Parallel processing; Matrix inversion; Normal mode analysis (NMA)
CLC: TP391.41
Type: Master's thesis
Year: 2012


A graphics processing unit (GPU) is a processor designed specifically for graphics processing. In recent years, the peak single-precision performance of GPUs has grown from several Gflops to the Tflops range. As their programmability has improved, GPUs have increasingly been used to accelerate scientific computation. Besides offering massive parallel computing power, the GPU is a low-cost, low-power chip, and it has become an important component of today's high-performance computers. How to apply GPU parallel computing to a wider range of scientific computations is currently a hot topic in the high-performance computing field. To demonstrate the programmability and multi-threaded parallel computing power of the GPU, this thesis presents the following work.

First, matrix inversion is an important matrix operation, but inverting a large matrix serially on a CPU is very time-consuming. We map the matrix inversion calculation onto the GPU using the Compute Unified Device Architecture (CUDA) provided by NVIDIA, taking the hardware characteristics of the GPU into account. The implementation achieves a significant speedup (more than 300-fold) and a peak single-precision performance of 230 Gflops, which can meet the speed requirements of matrix inversion in some scientific computing applications. Based on the results of this program, we analyze the single-precision and double-precision floating-point performance of the GPU. We also analyze the influence of data transfer time on GPU parallel performance and summarize the characteristics of algorithms that suit the GPU, with the aim of applying the GPU to molecular dynamics simulation, a more complex computational system.

Second, normal mode analysis (NMA) is an effective method for predicting collective structural changes in proteins, and it is the most time-consuming part of free-energy sampling in molecular simulation. However, these calculations are limited in time scale, mainly because the required matrix diagonalization is computationally demanding. We accelerate NMA by mapping its most time-consuming part onto the GPU. The GPU-accelerated all-atom NMA achieves a considerable speedup (more than 20-fold) over CPU-based NMA, which significantly reduces the runtime of the diagonalization, and reaches a peak single-precision performance of 180 Gflops. In addition, we analyze how changes in numerical precision affect both the computing performance and the accuracy of the GPU results.
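
To make the matrix inversion part of the abstract more concrete, the following is a minimal CPU-side sketch in Python. The abstract does not say which inversion algorithm the thesis implements in CUDA, so Gauss-Jordan elimination with partial pivoting is assumed here purely for illustration. The function names `invert_gauss_jordan` and `gflops` are hypothetical, and the operation count of roughly 2n^3 used for the rate estimate is a conventional approximation, not a figure taken from the thesis.

```python
def invert_gauss_jordan(a):
    """Invert a square matrix (list of rows) by Gauss-Jordan elimination.

    Illustrative serial version; the algorithm choice is an assumption,
    not the method confirmed by the thesis.
    """
    n = len(a)
    # Build the augmented matrix [A | I].
    aug = [
        [float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(a)
    ]
    for col in range(n):
        # Partial pivoting: pick the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot] = aug[pivot], aug[col]

        # Scale the pivot row so the pivot element becomes 1.
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]

        # Eliminate this column from every other row. Each row update is
        # independent of the others, which is the kind of work that can be
        # spread across many GPU threads.
        for r in range(n):
            if r != col:
                f = aug[r][col]
                if f != 0.0:
                    aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]

    # The right half of the augmented matrix now holds the inverse.
    return [row[n:] for row in aug]


def gflops(n, seconds):
    """Approximate Gflops for an n x n inversion, assuming about 2*n**3 operations."""
    return 2.0 * n ** 3 / seconds / 1e9
```

For example, `invert_gauss_jordan([[4, 7], [2, 6]])` returns a matrix close to `[[0.6, -0.7], [-0.2, 0.4]]`. The inner elimination loop is the natural candidate for parallelization, while the loop over pivot columns stays sequential. That is one reason data transfer and synchronization costs matter when such a method is moved to a GPU.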

CLC: Industrial Technology > Automation technology, computer technology > Computing technology, computer technology > Computer applications > Information processing > Pattern recognition and devices > Image recognition devices
© 2012 www.DissertationTopic.Net