# Circular Shift of Loop Body - Programme Transformation, Promoting Parallelism

O.B. SteinbergThe article deals with the programme transformation executing the circular shift of loop body statements. It can be used for vectorizing or parallelizing. This becomes possible due to the fact that when the order of loop body statements is changed, some of the bottom-up arcs become top-down arcs. Besides, sometimes loop carried dependence arcs are substituted by loop independent ones. It should be pointed out that in executing the circular shift the number of loop iterations is reduced by one. The transformation can be used both independently and in conjunction with other transformations promoting parallelism. These could be 'forward substitution', 'scalar expansion', 'privatization', 'array expansion', etc. The transformation under consideration in this article can be used both in hand parallelization and added to a paralleling (optimizing) compiler. Moreover, the application of the transformation results in the equivalent code only for the loops where loop unrolling is the equivalent transformation. Thus, they can contain nested loops, if statements and other programming language statements.Full text

- Keywords
- parallel computations; programme transformations; dependence graph; scalar expansion; loop distribution.
- References
- 1. Allen R., Kennedy K. Optimizing Compilers for Modern Architectures. San Francisco, San Diego, N.Y., Boston, London, Sidney, Tokyo, Morgan Kaufmann Publishers, 2002. 790 p.

2. Wolfe M. High Performance Compilers for Parallel Computing. Redwood City, Addison-Wesley Publishing Company, 1996. 570 p.

3. Steinberg B.J. Matematicheskie metody rasparallelivaniya rekurrentnykh programnykh tsiklov na superkompyutery s parallel'noy pamyatyu [Parallelizing Recurrent Program Cycles with Irregular Superposition Computation]. Rostov-on-Don, Rostov University Publishing House, 2004. 192 p.

4. Duo Liu, Zili Shao, Meng Wang, Minyi Guo, Jingling Xue. Optimal Loop Parallelization for Maximizing Iteration-Level Parallelism. Proceedings of International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES 09). N.Y., ACM, 2009, pp. 67-76. DOI: 10.1145/1629395.1629407

5. Steinberg O.B. [Parallelizing Recurrent Program Cycles with Irregular Superposition Computation]. Izvestiya vuzov. Severo-Kavkazskii region. Natural Science, 2009, no. 2, pp. 18-21. (in Russian)

6. Duo Liu, Yi Wang, Zili Shao, Minyi Guo, Jingling Xue. Optimally Maximizing Iteration-Level Loop Parallelism. IEEE Transactions on Parallel and Distributed Systems, 2012, vol. 23, no. 3, pp. 564-572. DOI: 10.1109/TPDS.2011.171

7. Steinberg O.B., Sukhoverkhov S.E. [Recurrent Program Loops with Stability Check]. Information Technologies, 2010, no. 1, pp. 40-45. (in Russian)

8. Muchnick S.S. Advanced Compiler Design and Implementation. San Francisco, Morgan Kauffman, 1997. 856 p.

9. Aho A.V., Lam M.S., Sethi R., Ullman J.D. Compilers: Principles, Techniques, and Tools. London, Pearson Education, 2007. 1014 p.

10. Evstigneev V.A., Sprogis S.V. [Vectorizing Programmes]. Vektorizatsiya programm: teoriya, metody, realizatsiya [Vectorizing Programmes: Theory, Methods, Implementation]. Moscow, Mir, 1991, pp. 246-267. (in Russian)

11. Shulzhenko A.M. Issledovanie informatsionnykh zavisimostey programm dlya analiza rasparallelivayushchikh preobrazovaniy [Researching Information Dependences of Programs for Analyzing Transformations Used for Parallelizing. The Dissertation for Scientific Degree of the Candidate of Technology]. Rostov-on-Don, 2006, 200 p.

12. Steinberg O.B. Rasparallelivanie tsiklov, dopuskayushchikh rekurrentnye zavisimosti [Parallelizing Loops Allowing Recurrent Dependences. The Dissertation for Scientific Degree of the Candidate of Physics and Mathematical Science]. Institute for System Programming of the Russian Academy of Sciences, Moscow, 2014.

13. Steinberg O.B. [Minimizing the Number of Temporary Arrays in Loop Distribution Problem]. Izvestiya vuzov. Severo-Kavkazskii region. Natural Science, 2011, no. 5, pp. 31-35. (in Russian)

14. Feautrier P. Array Expansion. Proceedings of the 2nd International Conference on Supercomputing, N.Y., ACM, 1988, pp. 429-441. DOI: 10.1145/55364.55406