Automatic tuning of the sparse matrix vector product on GPUs based on the ELLR-T approach
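A wide range of applications in engineering and scientific computing are involved in the acceleration of the sparse matrix vector product (SpMV). Graphics Processing Units (GPUs) have recently emerged as platforms that yield outstanding acceleration factors. SpMV implementations for GPUs have already appeared on the scene. This work focuses on the ELLR-T algorithm for computing SpMV on GPU architectures; its performance depends strongly on the optimal selection of two parameters. Therefore, taking into account that memory operations dominate the performance of ELLR-T, an analytical model is proposed to auto-tune ELLR-T for particular combinations of sparse matrix and GPU architecture. The evaluation results with a representative set of test matrices show that the average performance achieved by ELLR-T when auto-tuned with the proposed model is close to the optimum. A comparative analysis of ELLR-T against a variety of previous proposals shows that ELLR-T with the estimated configuration reaches the best performance on GPU architecture for the representative set of test matrices.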
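As a rough illustration of the kind of computation being tuned, the following sketch emulates, sequentially, an SpMV over an ELLPACK-R style layout in which several threads cooperate on each row. It is not the authors' kernel. The abstract does not name the two tuning parameters; the sketch assumes one of them is the number of threads per row (called `T` here, as the name ELLR-T suggests). The function names, the column-major padding scheme, and the final reduction are all illustrative assumptions.

```python
def to_ellpack_r(dense):
    """Convert a dense matrix (list of rows) to ELLPACK-R style arrays.

    Assumed layout: values and column indices are padded to the longest row
    and stored column-major; rl[i] holds the real nonzero count of row i.
    """
    n_rows = len(dense)
    rows = [[(j, v) for j, v in enumerate(row) if v != 0] for row in dense]
    rl = [len(r) for r in rows]      # nonzeros actually stored in each row
    width = max(rl, default=0)       # padded row width
    vals = [0.0] * (width * n_rows)
    cols = [0] * (width * n_rows)
    for i, r in enumerate(rows):
        for k, (j, v) in enumerate(r):
            # Entry k of row i lives at index k * n_rows + i (column-major).
            vals[k * n_rows + i] = float(v)
            cols[k * n_rows + i] = j
    return vals, cols, rl


def spmv_ellr_t(vals, cols, rl, x, T=2):
    """Sequentially emulate y = A x with T cooperating 'threads' per row."""
    n_rows = len(rl)
    y = [0.0] * n_rows
    for i in range(n_rows):
        partial = [0.0] * T          # one partial sum per emulated thread
        for t in range(T):
            # Thread t handles entries t, t + T, t + 2T, ... and stops at
            # rl[i], so padding entries are never read.
            for k in range(t, rl[i], T):
                idx = k * n_rows + i
                partial[t] += vals[idx] * x[cols[idx]]
        y[i] = sum(partial)          # stands in for an on-chip reduction
    return y


# Small usage example with a 3x3 matrix.
A = [[1, 0, 2],
     [0, 3, 0],
     [4, 5, 6]]
vals, cols, rl = to_ellpack_r(A)
print(spmv_ellr_t(vals, cols, rl, [1.0, 2.0, 3.0], T=2))  # [7.0, 6.0, 32.0]
```

On a real GPU the second launch-time choice (for instance the thread block size) has no counterpart in this sequential emulation; an auto-tuner of the kind described above would choose such values per matrix and per device.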
| Authors: | Vázquez, Francisco; Fernández, José Jesús; Garzón, Ester M. |
|---|---|
| Resource type: | article |
| Status: | Version submitted for review and publication |
| Publication date: | 2012 |
| Country: | Spain |
| Institution: | Consejo Superior de Investigaciones Científicas (CSIC) |
| Repository: | DIGITAL.CSIC. Repositorio Institucional del CSIC |
| OAI Identifier: | oai:digital.csic.es:10261/380549 |
| Online access: | http://hdl.handle.net/10261/380549 |
| Access level: | open access |
| Keywords: | Sparse matrix vector product; GPU computing; GPU performance modeling |
| id |
ES_3441dcd5d8ddb517e59b2c92a88be721 |
|---|---|
| oai_identifier_str |
oai:digital.csic.es:10261/380549 |
| network_acronym_str |
ES |
| network_name_str |
Spain |
| spelling |
Automatic tuning of the sparse matrix vector product on GPUs based on the ELLR-T approach; Vázquez, Francisco; Fernández, José Jesús; Garzón, Ester M.; Sparse matrix vector product; GPU computing; GPU performance modeling; A wide range of applications in engineering and scientific computing are involved in the acceleration of the sparse matrix vector product (SpMV). Graphics Processing Units (GPUs) have recently emerged as platforms that yield outstanding acceleration factors. SpMV implementations for GPUs have already appeared on the scene. This work focuses on the ELLR-T algorithm for computing SpMV on GPU architectures; its performance depends strongly on the optimal selection of two parameters. Therefore, taking into account that memory operations dominate the performance of ELLR-T, an analytical model is proposed to auto-tune ELLR-T for particular combinations of sparse matrix and GPU architecture. The evaluation results with a representative set of test matrices show that the average performance achieved by ELLR-T when auto-tuned with the proposed model is close to the optimum. A comparative analysis of ELLR-T against a variety of previous proposals shows that ELLR-T with the estimated configuration reaches the best performance on GPU architecture for the representative set of test matrices.; This work has been funded by grants from the Spanish Ministry of Science and Innovation (TIN2008-01117), Junta de Andalucía (JA-P10-TIC-6002, JA-P08-TIC-3518) and Consejo Superior de Investigaciones Científicas (CSIC-PIE-200920I075).; Peer reviewed; Elsevier; Ministerio de Ciencia e Innovación (España); Junta de Andalucía; Consejo Superior de Investigaciones Científicas (España); Consejo Superior de Investigaciones Científicas [https://ror.org/02gfc7t72]; 2025; 2025; 2012; info:eu-repo/semantics/article; http://purl.org/coar/resource_type/c_6501; Preprint; info:eu-repo/semantics/submittedVersion; application/pdf; http://hdl.handle.net/10261/380549; reponame:DIGITAL.CSIC. Repositorio Institucional del CSIC; instname:Consejo Superior de Investigaciones Científicas (CSIC); English; https://doi.org/10.1016/j.parco.2011.08.003; Yes; info:eu-repo/semantics/openAccess; oai:digital.csic.es:10261/380549; 2026-05-22T06:33:51Z |
| dc.title.none.fl_str_mv |
Automatic tuning of the sparse matrix vector product on GPUs based on the ELLR-T approach |
| dc.creator.none.fl_str_mv |
Vázquez, Francisco; Fernández, José Jesús; Garzón, Ester M. |
| author |
Vázquez, Francisco |
| author_facet |
Vázquez, Francisco; Fernández, José Jesús; Garzón, Ester M. |
| author_role |
author |
| author2 |
Fernández, José Jesús; Garzón, Ester M. |
| author2_role |
author; author |
| dc.contributor.none.fl_str_mv |
Ministerio de Ciencia e Innovación (España); Junta de Andalucía; Consejo Superior de Investigaciones Científicas (España); Consejo Superior de Investigaciones Científicas [https://ror.org/02gfc7t72] |
| dc.subject.none.fl_str_mv |
Sparse matrix vector product; GPU computing; GPU performance modeling |
| description |
A wide range of applications in engineering and scientific computing are involved in the acceleration of the sparse matrix vector product (SpMV). Graphics Processing Units (GPUs) have recently emerged as platforms that yield outstanding acceleration factors. SpMV implementations for GPUs have already appeared on the scene. This work focuses on the ELLR-T algorithm for computing SpMV on GPU architectures; its performance depends strongly on the optimal selection of two parameters. Therefore, taking into account that memory operations dominate the performance of ELLR-T, an analytical model is proposed to auto-tune ELLR-T for particular combinations of sparse matrix and GPU architecture. The evaluation results with a representative set of test matrices show that the average performance achieved by ELLR-T when auto-tuned with the proposed model is close to the optimum. A comparative analysis of ELLR-T against a variety of previous proposals shows that ELLR-T with the estimated configuration reaches the best performance on GPU architecture for the representative set of test matrices. |
| publishDate |
2012 |
| dc.date.none.fl_str_mv |
2012; 2025; 2025 |
| dc.type.none.fl_str_mv |
info:eu-repo/semantics/article; http://purl.org/coar/resource_type/c_6501; Preprint; info:eu-repo/semantics/submittedVersion |
| format |
article |
| status_str |
submittedVersion |
| dc.identifier.none.fl_str_mv |
http://hdl.handle.net/10261/380549 |
| dc.language.none.fl_str_mv |
English |
| dc.relation.none.fl_str_mv |
https://doi.org/10.1016/j.parco.2011.08.003; Yes |
| dc.rights.none.fl_str_mv |
info:eu-repo/semantics/openAccess |
| eu_rights_str_mv |
openAccess |
| dc.format.none.fl_str_mv |
application/pdf |
| dc.publisher.none.fl_str_mv |
Elsevier |
| dc.source.none.fl_str_mv |
reponame:DIGITAL.CSIC. Repositorio Institucional del CSIC; instname:Consejo Superior de Investigaciones Científicas (CSIC) |