Maximization of Expectation-Parallelism Algorithm Using OpenCL

Authors

  • Ernesto Insua-Suárez Ing. Ernesto Insua Suárez Facultad de Ingeniería Informática, Instituto Superior Politécnico José Antonio Echeverría La Habana, Cuba e-mail: einsuas@ceis.cujae.edu.cu
  • Marlis Fulgueira-Camilo Facultad de Ingeniería Informática Instituto Superior Politécnico José Antonio Echeverría La Habana, Cuba
  • Venus Henry-Fuenteseca Facultad de Ingeniería Informática Instituto Superior Politécnico José Antonio Echeverría La Habana, Cuba

DOI:

https://doi.org/10.21501/21454086.1361

Keywords:

Hybrid architectures, Parallel & Distributed Computation, Maximization of Expectation, Parallel Execution, OpenCL.

Abstract

Nowadays both organizations and companies store big volumes of data to achieve their purposes. One of the variants to obtain valuable information consists on the employment of Data Mining. Inside Data Mining, different tasks exist and one of them is clustering. In this task the data group according to their likenesses among them differ with elements of other groups. One of the algorithms that carry out these clusters is Expectation-Maximization, which presents high times of execution in their data. This article discusses about the parallelization of the mentioned algorithm, using techniques of parallel programming. The design of the proposed algorithm is based on the use of the graphic process unit, GPU. OpenCL, language used for the programming in hybrid architectures, allows to take advantage of the available hardware architectures, which it is possible to diminish the time of execution of the sequential implementation. The reason to improve this time is due to the quantity of parallel processes that can rush in threads of independent prosecutions. For the achievement of the described results, knowledge of the field of Data Mining and Parallel and Distributed Computation are integrated. As part of this investigation, an implementation of the algorithm using the libraries of OpenCL was carried out to diminish the time of execution. The implementation is able to diminish the sequential implementation in 82%, this means that the parallel algorithm is executed 5,5 times quicker that its sequential corresponding implementation.

Downloads

Download data is not yet available.

Author Biographies

Ernesto Insua-Suárez, Ing. Ernesto Insua Suárez Facultad de Ingeniería Informática, Instituto Superior Politécnico José Antonio Echeverría La Habana, Cuba e-mail: einsuas@ceis.cujae.edu.cu

Ing. Ernesto Insua Suárez
Facultad de Ingeniería Informática, 
Instituto Superior Politécnico José 
Antonio Echeverría
La Habana, Cuba
e-mail: einsuas@ceis.cujae.edu.cu

Marlis Fulgueira-Camilo, Facultad de Ingeniería Informática Instituto Superior Politécnico José Antonio Echeverría La Habana, Cuba

Facultad de Ingeniería Informática Instituto Superior Politécnico José Antonio Echeverría La Habana, Cuba

Venus Henry-Fuenteseca, Facultad de Ingeniería Informática Instituto Superior Politécnico José Antonio Echeverría La Habana, Cuba

Facultad de Ingeniería Informática Instituto Superior Politécnico José Antonio Echeverría La Habana, Cuba

References

J. Hernández-Orallo, J. Ramírez-Quintana, "Introduccíon A La Minería De Datos". Primera edición, 2004, Madrid, Pearson Prentice Hall S.A. 680p.

A. P. Dempster, D. B. Rubin,"Maximum Likelihood from Incomplete Data via the EM Algorithm". Journal of the Royal Statistical Society, 1977. pp. 39-45.

Collins, M. "The EM Algorithm", 1997, 28p. Disponible en: http://faculty.washington.edu/fxia/courses/LING572/EM_collins97.pdf

S.Purcell. "Maximum Likelihood Estimation (MLE)", 2007, Disponible en: http://statgen.iop.kcl.ac.uk/bgim/mle/sslike_1.html

I. Foster, "Designing and Building Parallel Programs", 1995: Addison-Wesley. Disponible en: http://www.mcs.anl.gov/~itf/dbpp

OpenMP.org. "The OpenMP API Specification For Parallel Programming". 2014; Disponible en: http://www.openmp.org

R. Pas, "An Overview of OpenMP 3.0", IWOMP 2009, TU Dresden (Alemania), Disponible en: https://iwomp.zih.tu-dresden.de/downloads/2.Overview_OpenMP.pdf

H. Hoeger, "Introducción a la Computación Paralela", Centro Nacional de Cálculo Científico Universidad de Los Andes, Mérida (Venezuela) - CeCalCULA, 2011, 48p

Tosini, M. "Introducción a las Arquitecturas Paralelas", 2011, 51p, Disponible en: http://bit.ly/1H9pylQ

NVIDIA, "High Performance Computing with CUDA", Users Group Conference", San Diego, CA (EEUU), 2009.

J.A. Raya-Vaquera, E.X. Martín-Rull, "Aceleración con CUDA de Procesos 3D". 2009.

Wang, P., "OpenCL Optimization", GPU Technology Conference, San José, CA (EEUU), 2009.

A. Munshi, T. G. Mattson, J. Fung, D. Ginsburg, "OpenCL Programming Guide", Boston, MA (EEUU), Pearson Education, 2012

A. Barak, E. Levy, A. Shiloh, "A Package for OpenCL Based Heterogeneous Computing on Clusters with Many GPU Devices", IEEE International Conference on Cluster Computing Workshops and Posters, Heraklion, Isla de Creta, 2010.

B. R. Gaster, L. Howes, D. R. Kaeli, P. Mistry, D. Schaa, "Chapter 13 - OpenCL Profiling and Debugging", En: Heterogeneous Computing with Opencl, Editado por B. R. Gaster, L. Howes, D R. KaeliPerhaad, M.D. Schaa, M. Kaufmann, Boston, 2013, pp. 243-261, ISBN 9780124058941,

NVIDIA, "NVIDIA CUDA Compute Unified Device Architecture", 2014. Disponible en: http://www.nvidia.com/object/cuda_home_new.html

E.W. Weisstein, "K-Means Clustering Algorithm". 2014. Disponible en: http://mathworld.wolfram.com/K-MeansClusteringAlgorithm.html

M.I. Ángeles-Larrieta, A.M. Santillán-Gómez "Minería de datos: Concepto, características, estructura y aplicaciones", Contaduría y Administración, N° 190, 2004. p. 79-84.

Vallejos, S.J. "Minería de Datos. Argentina: Universidad Nacional del Nordeste" Facultad de Ciencias Exactas, Naturales y Agrimensura, 2006.

P. Graham, "Parallel Computation Notes", 1996. Disponible en: http://www.cs.umanitoba.ca/~comp4510/notesDIR/COMP4510Models_2up.pdf.

Published

2015-01-01

How to Cite

Insua-Suárez, E., Fulgueira-Camilo, M., & Henry-Fuenteseca, V. (2015). Maximization of Expectation-Parallelism Algorithm Using OpenCL. Lámpsakos, (13), 51–61. https://doi.org/10.21501/21454086.1361

Issue

Section

Articles of scientific and technological research