
Simulation Research on Multi-Core Stream Architecture and Power Consumption of GPGPU

Author: HeRui
Advisor: XingZuoCheng
School: National University of Defense Technology
Course: Electronic Science and Technology
Keywords: GPGPU, processor architecture, simulator, power consumption, verification
CLC: TP332
Type: Master's thesis
Year: 2010


With the progress of microprocessor technology, the chip multiprocessor (CMP) has become the mainstream of processor design. The multi-core stream processor shows tremendous computing capability and has advantages in area utilization, average power consumption and programming flexibility. As a typical multi-core processor, the GPGPU has had a great impact on data-intensive and parallel computing. Studying the GPGPU architecture helps to explore the direction of computer architecture development and provides a path toward a domestically developed general-purpose stream processor.

A simulator is an effective tool for researching processor architecture. The GPGPU architecture combines characteristics of both multi-core processors and stream processors, which makes it quite different from traditional processor architectures, so it requires new simulation techniques and methodology. Therefore, we choose the NVIDIA GPGPU, which is widely used in academia, as the subject of this research.

This thesis analyzes the evolution and architectural characteristics of the GPGPU and, through a study of the CUDA programming model and the multithreaded execution model, discusses in detail the main ideas behind the multi-core stream processor. It builds on the techniques and methodology of the existing simulator GPGPU-Sim: by extending the software and completing its functionality, we use the programming interface and algorithms of Wattch, a well-known power simulator, to establish an architectural power model of the GPGPU.

The simulation results show that the GPGPU simulator can reliably verify GPGPU functionality. When more threads are allocated for execution, the GPGPU achieves better speedup, because the streaming multiprocessors are filled with threads more efficiently. The number of multiprocessors is the main factor determining GPGPU performance; the pipeline configuration, the DRAM scheduler and the clock frequency also affect performance.

Changing the memory hierarchy or the programming mode may have a great effect on GPGPU performance. For applications with ordered data and a single flow of execution, the best performance comes from fully using the coalescing mechanism without a data cache; for more general-purpose computing applications, it is better to use a data cache. On the other hand, the power consumption of the GPGPU increases as the number of multiprocessors or the number of threads increases, and the memory hierarchy and the programming mode are also important factors.
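
As a rough illustration of the kind of architectural power model the abstract refers to, the sketch below uses the standard activity-based dynamic power relation P = a * C * Vdd^2 * f that Wattch-style models are built on. It is not the thesis's implementation: the `Unit` class, the function names and every capacitance and activity value are hypothetical, chosen only to show how total power scales with the number of multiprocessors and with unit activity.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    """One hardware block inside a multiprocessor (hypothetical parameters)."""
    name: str
    capacitance_f: float  # effective switched capacitance, in farads (assumed)
    activity: float       # fraction of cycles in which the unit switches, 0..1


def dynamic_power_w(unit: Unit, vdd_v: float, freq_hz: float) -> float:
    """Activity-based dynamic power of one unit: P = a * C * Vdd^2 * f."""
    return unit.activity * unit.capacitance_f * vdd_v ** 2 * freq_hz


def chip_power_w(units_per_sm: list, num_sms: int,
                 vdd_v: float, freq_hz: float) -> float:
    """Total dynamic power, assuming all multiprocessors are identical."""
    per_sm = sum(dynamic_power_w(u, vdd_v, freq_hz) for u in units_per_sm)
    return num_sms * per_sm


if __name__ == "__main__":
    # Illustrative numbers only; they are not taken from the thesis.
    sm_units = [
        Unit("alu", 1.0e-9, 0.50),
        Unit("register_file", 2.0e-9, 0.25),
    ]
    for sms in (8, 16):
        print(sms, "SMs:", chip_power_w(sm_units, sms, 1.0, 1.0e9), "W")
```

In this simplified model, doubling the number of multiprocessors doubles the estimate, and running more threads would appear as a higher `activity` value per unit, which matches the qualitative trend reported above.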
