No. 27 (286), issue 13. Pages 109-118

MapReduce-based Image Processing System with Automated Parallelization

A.V. Sozykin, M.L. Goldshtein
The article describes a parallel image processing framework based on Apache Hadoop and the MapReduce programming model. The framework's main advantage is that it hides the details of parallel execution from the application developer behind a simple API that operates on an image already loaded into memory.
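The separation described above can be illustrated with a minimal sketch. This is not the framework's actual API: the names `invert` and `run_map_only` are hypothetical, and a local thread pool stands in for the Hadoop map tasks that would run on cluster nodes. The point is only that the developer writes a sequential function over one in-memory image, and the parallel execution lives elsewhere.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

# A grayscale image already loaded into memory, stored as rows of pixel values.
Image = List[List[int]]


def invert(image: Image) -> Image:
    """User-written, purely sequential code: invert an 8-bit grayscale image."""
    return [[255 - pixel for pixel in row] for row in image]


def run_map_only(images: Dict[str, Image],
                 process: Callable[[Image], Image],
                 workers: int = 4) -> Dict[str, Image]:
    """Hypothetical stand-in for the framework side.

    Applies `process` to every image in parallel. In a Hadoop deployment each
    call would run inside a map task; here a local thread pool plays that role.
    """
    names = list(images)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so results line up with `names`.
        results = list(pool.map(process, (images[name] for name in names)))
    return dict(zip(names, results))


if __name__ == "__main__":
    frames = {"frame_a": [[0, 128], [255, 64]], "frame_b": [[10, 20], [30, 40]]}
    print(run_map_only(frames, invert))
```

The user function never refers to threads, nodes, or file splits, so the same function could in principle be handed to a different executor without modification.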
The main results of the work are the architecture of a Hadoop-based parallel image processing framework and a prototype implementation of that architecture. The prototype has been used to process data from a Particle Image Velocimetry (PIV) system, with input data taken from the PIV Challenge project. Evaluation of the prototype on a four-node Hadoop cluster demonstrates near-linear scalability.
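For reference, "near-linear scalability" is usually judged through speedup and parallel efficiency. The helper below is a sketch of those standard definitions, not the authors' evaluation code; the function names are illustrative, and no timings from the article are assumed.

```python
def speedup(t_one_node: float, t_n_nodes: float) -> float:
    """Speedup S(n) = T(1) / T(n), with both run times in the same unit."""
    if t_one_node <= 0 or t_n_nodes <= 0:
        raise ValueError("run times must be positive")
    return t_one_node / t_n_nodes


def efficiency(t_one_node: float, t_n_nodes: float, nodes: int) -> float:
    """Parallel efficiency E(n) = S(n) / n; values near 1 mean near-linear scaling."""
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    return speedup(t_one_node, t_n_nodes) / nodes
```

Under these definitions, near-linear scalability on a four-node cluster corresponds to a speedup close to 4 and an efficiency close to 1.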
The results can be used in science (processing images from experimental physics facilities, astronomical observations, and satellite images of the Earth's surface), in medical research (processing images from high-tech medical equipment), and in enterprises (analysis of data from security cameras, geographic information systems, etc.).
The suggested approach makes it possible to increase image processing performance by using parallel computing systems. It also improves the productivity of application developers by letting them concentrate on image processing algorithms rather than on the details of parallel implementation.
Keywords: image processing, MapReduce, Hadoop, distributed file system, automated parallelization.