Unstructured Grid Computing Acceleration Algorithm Based on Sunway TaihuLight
Abstract
The performance of unstructured grid computing on Sunway TaihuLight, China's domestically developed heterogeneous many-core platform, is limited by sparse storage, discrete memory access, and data dependency. To relieve the sparse storage and discrete memory access problems, this paper proposes an N-order diagonal coloring algorithm, which balances the computing load between the Management Processing Element (MPE) and the Computing Processing Elements (CPEs) and converts global memory accesses into Local Device Memory (LDM) accesses on the CPEs. To resolve the computing competition caused by data dependency, the paper presents an adaptive and independent blocking method that avoids data conflicts in parallel computing. Several further optimizations address the remaining performance bottlenecks: (1) to make full use of hardware resources, the MPE and the CPEs run asynchronously in parallel; (2) to reduce synchronization costs, register communication is avoided, which also improves scalability on the next-generation Sunway platform; (3) to hide memory access latency, memory access is overlapped with computing. The SpMV, Integration, and calcLudsFcc operators are used to verify the validity of the algorithm. The results show that the algorithm achieves an average speedup of about 10 times, and up to 24 times, over the MPE implementation. Moreover, the N-order diagonal coloring algorithm achieves a speedup 5.8 times higher than that of the non-coloring blocking algorithm, indicating that it effectively improves data locality and computational parallelism. The algorithm also accelerates operators with dependency conflicts well, which verifies the effectiveness of the adaptive and independent task partitioning method.
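The abstract does not spell out the N-order diagonal coloring or the adaptive blocking method, so the following is not the authors' algorithm. It is only a minimal sketch of the underlying problem: in an edge-based SpMV on an unstructured grid, two edges that share a cell write to the same output entry, and a coloring can group edges into conflict-free batches. The storage layout (`diag`, `lower`, `upper`, `owner`, `neighbour`), the function names, and the greedy coloring are all assumptions made for illustration.

```python
def color_edges(owner, neighbour):
    """Greedy edge coloring (illustrative, not the paper's N-order scheme).

    Edges that share a cell receive different colors, so all edges of one
    color touch disjoint cells.
    """
    used = {}    # cell index -> set of colors already used by its edges
    colors = []
    for o, n in zip(owner, neighbour):
        taken = used.setdefault(o, set()) | used.setdefault(n, set())
        c = 0
        while c in taken:
            c += 1
        colors.append(c)
        used[o].add(c)
        used[n].add(c)
    return colors


def spmv_colored(diag, lower, upper, owner, neighbour, x):
    """Compute y = A @ x for a matrix stored per cell (diag) and per edge."""
    y = [d * xi for d, xi in zip(diag, x)]
    colors = color_edges(owner, neighbour)
    for c in range(max(colors, default=-1) + 1):
        # Edges of one color write to disjoint entries of y, so this inner
        # loop could be divided among workers without write conflicts.
        for e, ce in enumerate(colors):
            if ce == c:
                o, n = owner[e], neighbour[e]
                y[o] += upper[e] * x[n]
                y[n] += lower[e] * x[o]
    return y


# Hypothetical 4-cell ring mesh with edges 0-1, 1-2, 2-3, 0-3.
owner = [0, 1, 2, 0]
neighbour = [1, 2, 3, 3]
y = spmv_colored([4, 4, 4, 4], [-1] * 4, [-1] * 4, owner, neighbour, [1, 2, 3, 4])
```

On a real many-core target each color batch would be dispatched to the compute cores with a barrier between batches; the sequential loop here only demonstrates that the batches are conflict-free.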
Authors
XU Le, AN Hong, CHEN Junshi, ZHANG Pengfei, WU Zheng
Quick Access
- Publication Year
- 2022
- Source Database
- DOAJ
- DOI
- 10.19678/j.issn.1000-3428.0065567
- Access
- Open Access ✓