# Special Aspects of Matrix Operation Implementations for Low-Precision Neural Network Model on the Elbrus Platform

E.E. Limonova, M.I. Neiman-zade, V.L. Arlazarov

This paper investigates whether calculations in low-precision neural network models can be implemented efficiently on the Elbrus platform, which has a VLIW architecture. Such models are widely used in practice to increase the computational efficiency of recognition and are well suited to computers with the x86 and ARM architectures. We consider an 8-bit neural network model in which matrix multiplication is the most resource-intensive part of the implementation. We present an efficient implementation of matrix multiplication that takes into account the features of the Elbrus architecture: several computational channels with various arithmetic and logic units, an array prefetch buffer, and its own SIMD extension. Theoretical and experimental comparisons of the computational efficiency of low-precision and classical neural network models show that Elbrus processors offer considerably more capability for fast floating-point calculations and that new approaches are needed to increase the computational efficiency of neural network models.
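To make the notion of an 8-bit model concrete, the following minimal sketch shows one common way to multiply matrices stored as unsigned 8-bit codes while accumulating in wider integers. It is an illustration under stated assumptions, not the implementation described in the paper: the abstract does not specify the quantization scheme, so an affine mapping (`q = round(x / scale) + zero_point`) is assumed, and the function names `quantize`, `matmul_u8`, and `dequantize` are hypothetical. Plain Python loops are used for readability; none of the Elbrus-specific optimizations (parallel channels, array prefetch buffer, SIMD) are modeled.

```python
def quantize(matrix, scale, zero_point):
    """Map real values to 8-bit codes (assumed affine scheme), clamped to [0, 255]."""
    return [
        [max(0, min(255, round(x / scale) + zero_point)) for x in row]
        for row in matrix
    ]


def matmul_u8(a_q, b_q, a_zero, b_zero):
    """Multiply two quantized matrices, accumulating products in wide integers."""
    rows, inner, cols = len(a_q), len(b_q), len(b_q[0])
    out = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0  # stands in for a 32-bit accumulator on real hardware
            for k in range(inner):
                # Subtract zero points so that the code zero_point means 0.0.
                acc += (a_q[i][k] - a_zero) * (b_q[k][j] - b_zero)
            out[i][j] = acc
    return out


def dequantize(acc, a_scale, b_scale):
    """Convert integer accumulators back to real values."""
    return [[v * a_scale * b_scale for v in row] for row in acc]


# Example values chosen so that every entry is exactly representable.
SCALE = 1.0 / 64   # assumed quantization step for both matrices
ZERO = 128         # assumed zero point for both matrices

a = [[1.0, -0.5], [0.25, 1.5]]
b = [[0.5, 1.0], [-1.0, 0.75]]

a_q = quantize(a, SCALE, ZERO)   # [[192, 96], [144, 224]]
b_q = quantize(b, SCALE, ZERO)   # [[160, 192], [64, 176]]
acc = matmul_u8(a_q, b_q, ZERO, ZERO)
result = dequantize(acc, SCALE, SCALE)
```

In this example the dequantized result matches the floating-point product exactly because the inputs are multiples of the quantization step; in general, quantization introduces rounding error. The inner multiply-accumulate loop is the part that dominates the cost, which is why the abstract singles out matrix multiplication as the operation to optimize.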

Keywords: low-precision neural networks; computational efficiency; Elbrus architecture; matrix operations.