Volume 16, no. 1, pages 59-68

Automation of the Application of Data Distribution with Overlapping in Distributed Memory

L.R. Gervich, B.Ya. Steinberg
The article considers block-affine data layouts with overlapping as a means of optimizing parallel computations on distributed-memory systems. Target systems include high-performance clusters and advanced systems on a chip with a large number of computing cores. An array placed with overlaps is described as a new array of slightly greater length, in which the additional elements duplicate the values of some elements of the original array. The article examines whether a compiler can automatically transform the usual allocation of an array in distributed memory into such a new array containing overlaps. The proposed method is illustrated with a well-known numerical algorithm for solving the heat conduction problem.
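The idea of an array with overlaps can be illustrated with a minimal sketch. It assumes a one-dimensional array split into equal contiguous blocks, one per processing node, with a symmetric overlap of fixed width; the function name `with_overlaps` and its parameters are hypothetical and are not taken from the article.

```python
def with_overlaps(a, num_blocks, overlap):
    """Build a new array in which each block of `a` is extended by up to
    `overlap` neighbouring elements on each side (clipped at the array ends).

    Returns the new, slightly longer array and, for each block, the
    half-open (start, stop) range it occupies in the new array.
    Assumption: len(a) is divisible by num_blocks.
    """
    n = len(a)
    if num_blocks <= 0 or n % num_blocks != 0:
        raise ValueError("len(a) must be a positive multiple of num_blocks")
    block = n // num_blocks

    new_array = []
    ranges = []
    for p in range(num_blocks):
        lo = max(0, p * block - overlap)        # left edge including overlap
        hi = min(n, (p + 1) * block + overlap)  # right edge including overlap
        start = len(new_array)
        new_array.extend(a[lo:hi])              # duplicated border elements land here
        ranges.append((start, len(new_array)))
    return new_array, ranges


if __name__ == "__main__":
    original = list(range(8))
    extended, ranges = with_overlaps(original, num_blocks=2, overlap=1)
    print(extended)  # [0, 1, 2, 3, 4, 3, 4, 5, 6, 7]
    print(ranges)    # [(0, 5), (5, 10)]
```

In this sketch the new array is two elements longer than the original: elements 3 and 4 each appear twice, once in the block that owns them and once as an overlap copy in the neighbouring block. A stencil computation such as a heat conduction step could then read its neighbours locally within each block, at the cost of refreshing the duplicated elements between iterations.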
Keywords: automation of parallelization; distributed memory; program transformations; data distribution; data transfer.