Dissertation
GPU-Based Parallel Numerical Solution of Integral Equation Methods for Electromagnetic Scattering
Author: XuJunYing
Advisor: NieZaiPing
School: University of Electronic Science and Technology
Major: Electromagnetic Field and Microwave Technology
Keywords: Graphics processing unit; CUDA programming model; Parallel method of moments; Parallel multilevel fast multipole method
CLC: O441.4
Type: Master's thesis
Year: 2010
Abstract
Electromagnetic scattering analysis of large targets is an important direction in computational electromagnetics. Its main features are a large number of unknowns, long computation times, and high hardware resource requirements. To solve for target scattering characteristics quickly, fast parallel solvers are usually implemented with MPI and OpenMP programming techniques on large-scale computer clusters or high-performance multi-core CPU servers. In recent years, with the advent of the CUDA programming model for graphics processing units (GPUs), parallel computing on a single machine has been promoted and successfully applied. This thesis uses the GPU-based CUDA programming model to implement parallel computation of integral equation methods for electromagnetic scattering on a standalone platform. The work builds on an existing implementation that parallelized two integral equation solvers on the GPU platform: the method of moments and the multilevel fast multipole method. In that implementation, the method of moments achieved a speedup of about 140 times on the GPU platform, and the multilevel fast multipole method achieved a speedup of about 7 times. Its drawbacks are that the method of moments can handle only fewer than one million unknowns on the current platform, and that the speedup of the multilevel fast multipole method is very low.
Against this background, the existing program was studied carefully, the running efficiency of each part of the code was tested, and its deficiencies were addressed. For the method of moments, the impedance matrix, which was originally stored entirely in GPU memory, is now stored in host memory, and the data needed for the calculation are read into GPU memory sequentially in batches. The number of unknowns that a small workstation can handle increases significantly, although the speedup declines slightly. For the multilevel fast multipole method, two steps were changed. In the aggregation step, individual threads originally read their required data from global memory; they now read it from shared memory. This reduces the number of direct reads from global memory and improves the speed at which threads read data, at the expense of a certain percentage of redundant data in shared memory. In the disaggregation step, the angular spectrum points were originally expanded serially; they are now divided into groups such that the expansions of the points within each group do not overlap, and the points of each group are expanded in parallel. On the current platform, after optimization, the speedup of the multilevel fast multipole method rises from 7 times to 14 times.
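
The batched handling of the impedance matrix described above can be illustrated with a minimal sketch. It is not the thesis code: it is plain Python rather than CUDA, ordinary lists stand in for host and GPU memory, and the names matvec_in_batches, host_matrix, and rows_per_batch are hypothetical. It only shows the idea that the full matrix stays on the host while one block of rows at a time is copied over and multiplied with the vector, as an iterative solver would need.

```python
# Illustrative sketch only: plain Python lists stand in for host and GPU memory.
# All names here are hypothetical and do not come from the thesis.

def matvec_in_batches(host_matrix, x, rows_per_batch):
    """Compute y = Z x while holding at most rows_per_batch rows "on the device"."""
    if rows_per_batch < 1:
        raise ValueError("rows_per_batch must be at least 1")
    n_rows = len(host_matrix)
    y = [0j] * n_rows
    for start in range(0, n_rows, rows_per_batch):
        stop = min(start + rows_per_batch, n_rows)
        # Stand-in for a host-to-device copy of one block of rows.
        device_block = [list(row) for row in host_matrix[start:stop]]
        # Stand-in for the kernel: each row's dot product is independent work.
        for offset, row in enumerate(device_block):
            y[start + offset] = sum(z * xj for z, xj in zip(row, x))
    return y
```

The result does not depend on rows_per_batch; only the amount of data resident at once changes. In a real GPU version, each batch adds a host-to-device transfer, which is consistent with the slight drop in speedup reported above.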

