An energy-efficient memory unit for clustered microarchitectures
Whereas clustered microarchitectures themselves have been extensively studied, the memory units for these clustered microarchitectures have received relatively little attention. This article discusses some of the inherent challenges of clustered memory units and shows how these can be overcome. Clus...
| Authors: | Bieschewski, Stefan; Parcerisa Bundó, Joan Manuel; González Colás, Antonio María |
|---|---|
| Resource type: | article |
| Publication date: | 2016 |
| Country: | Spain |
| Institution: | Universitat Politècnica de Catalunya (UPC) |
| Repository: | UPCommons. Portal del coneixement obert de la UPC |
| Language: | English |
| OAI Identifier: | oai:upcommons.upc.edu:2117/90303 |
| Online access: | https://hdl.handle.net/2117/90303 https://dx.doi.org/10.1109/TC.2015.2493518 |
| Access Level: | open access |
| Keywords: | Cache memory; Microprocessors; Cache memories; Parallel architectures; Distributed architectures; Clustered architectures; Store buffer; Memòria cau; Microprocessadors; Àrees temàtiques de la UPC::Informàtica::Arquitectura de computadors |
| id | ES_de92b8a43efc7df8b46e18a0496f4b8f |
|---|---|
| oai_identifier_str | oai:upcommons.upc.edu:2117/90303 |
| network_acronym_str | ES |
| network_name_str | Spain |
| repository_id_str | |
| spelling | An energy-efficient memory unit for clustered microarchitectures; Bieschewski, Stefan; Parcerisa Bundó, Joan Manuel (ORCID 0000-0001-5771-8118); González Colás, Antonio María (ORCID 0000-0002-0009-0996); Cache memory; Microprocessors; Cache memories; Parallel architectures; Distributed architectures; Clustered architectures; Store buffer; Memòria cau; Microprocessadors; Àrees temàtiques de la UPC::Informàtica::Arquitectura de computadors; Whereas clustered microarchitectures themselves have been extensively studied, the memory units for these clustered microarchitectures have received relatively little attention. This article discusses some of the inherent challenges of clustered memory units and shows how these can be overcome. Clustered memory pipelines work well with the late allocation of load/store queue entries and physically unordered queues. Yet this approach has characteristic problems such as queue overflows and allocation patterns that lead to deadlocks. We propose techniques to solve each of these problems and show that a distributed memory unit can offer significant energy savings and speedups over a centralized unit. For instance, compared to a centralized cache with a load/store queue of 64/24 entries, our four-cluster distributed memory unit with load/store queues of 16/8 entries each consumes 31 percent less energy and performs 4.7 percent better on SPECint, and consumes 36 percent less energy and performs 7 percent better on SPECfp; Peer Reviewed; 2016; 2016-08-01; 2016; 2016-09-29; journal article; http://purl.org/coar/resource_type/c_6501; AM; http://purl.org/coar/version/c_ab4af688f83e57aa; info:eu-repo/semantics/article; application/pdf; https://hdl.handle.net/2117/90303; https://dx.doi.org/10.1109/TC.2015.2493518; reponame:UPCommons. Portal del coneixement obert de la UPC; instname:Universitat Politècnica de Catalunya (UPC); English; eng; Ministerio de Economía y Competitividad; http://doi.org/10.13039/501100003329; TIN2013-44375-R; MICROARQUITECTURA Y COMPILADORES PARA FUTUROS PROCESADORES III; open access; http://purl.org/coar/access_right/c_abf2; info:eu-repo/semantics/openAccess; oai:upcommons.upc.edu:2117/90303; 2026-05-27T15:37:01Z |
| dc.title.none.fl_str_mv | An energy-efficient memory unit for clustered microarchitectures |
| title | An energy-efficient memory unit for clustered microarchitectures |
| spellingShingle | An energy-efficient memory unit for clustered microarchitectures; Bieschewski, Stefan; Cache memory; Microprocessors; Cache memories; Parallel architectures; Distributed architectures; Clustered architectures; Store buffer; Memòria cau; Microprocessadors; Àrees temàtiques de la UPC::Informàtica::Arquitectura de computadors |
| title_short | An energy-efficient memory unit for clustered microarchitectures |
| title_full | An energy-efficient memory unit for clustered microarchitectures |
| title_fullStr | An energy-efficient memory unit for clustered microarchitectures |
| title_full_unstemmed | An energy-efficient memory unit for clustered microarchitectures |
| title_sort | An energy-efficient memory unit for clustered microarchitectures |
| dc.creator.none.fl_str_mv | Bieschewski, Stefan; Parcerisa Bundó, Joan Manuel (ORCID 0000-0001-5771-8118); González Colás, Antonio María (ORCID 0000-0002-0009-0996) |
| author | Bieschewski, Stefan |
| author_facet | Bieschewski, Stefan; Parcerisa Bundó, Joan Manuel (ORCID 0000-0001-5771-8118); González Colás, Antonio María (ORCID 0000-0002-0009-0996) |
| author_role | author |
| author2 | Parcerisa Bundó, Joan Manuel (ORCID 0000-0001-5771-8118); González Colás, Antonio María (ORCID 0000-0002-0009-0996) |
| author2_role | author; author |
| dc.subject.none.fl_str_mv | Cache memory; Microprocessors; Cache memories; Parallel architectures; Distributed architectures; Clustered architectures; Store buffer; Memòria cau; Microprocessadors; Àrees temàtiques de la UPC::Informàtica::Arquitectura de computadors |
| topic | Cache memory; Microprocessors; Cache memories; Parallel architectures; Distributed architectures; Clustered architectures; Store buffer; Memòria cau; Microprocessadors; Àrees temàtiques de la UPC::Informàtica::Arquitectura de computadors |
| description | Whereas clustered microarchitectures themselves have been extensively studied, the memory units for these clustered microarchitectures have received relatively little attention. This article discusses some of the inherent challenges of clustered memory units and shows how these can be overcome. Clustered memory pipelines work well with the late allocation of load/store queue entries and physically unordered queues. Yet this approach has characteristic problems such as queue overflows and allocation patterns that lead to deadlocks. We propose techniques to solve each of these problems and show that a distributed memory unit can offer significant energy savings and speedups over a centralized unit. For instance, compared to a centralized cache with a load/store queue of 64/24 entries, our four-cluster distributed memory unit with load/store queues of 16/8 entries each consumes 31 percent less energy and performs 4.7 percent better on SPECint, and consumes 36 percent less energy and performs 7 percent better on SPECfp. |
| publishDate | 2016 |
| dc.date.none.fl_str_mv | 2016; 2016-08-01; 2016; 2016-09-29 |
| dc.type.none.fl_str_mv | journal article; http://purl.org/coar/resource_type/c_6501; AM; http://purl.org/coar/version/c_ab4af688f83e57aa |
| dc.type.openaire.fl_str_mv | info:eu-repo/semantics/article |
| format | article |
| dc.identifier.none.fl_str_mv | https://hdl.handle.net/2117/90303; https://dx.doi.org/10.1109/TC.2015.2493518 |
| url | https://hdl.handle.net/2117/90303; https://dx.doi.org/10.1109/TC.2015.2493518 |
| dc.language.none.fl_str_mv | English; eng |
| language_invalid_str_mv | English |
| language | eng |
| dc.relation.none.fl_str_mv | Ministerio de Economía y Competitividad; http://doi.org/10.13039/501100003329; TIN2013-44375-R; MICROARQUITECTURA Y COMPILADORES PARA FUTUROS PROCESADORES III |
| dc.rights.none.fl_str_mv | open access; http://purl.org/coar/access_right/c_abf2 |
| dc.rights.openaire.fl_str_mv | info:eu-repo/semantics/openAccess |
| rights_invalid_str_mv | open access; http://purl.org/coar/access_right/c_abf2 |
| eu_rights_str_mv | openAccess |
| dc.format.none.fl_str_mv | application/pdf |
| dc.source.none.fl_str_mv | reponame:UPCommons. Portal del coneixement obert de la UPC; instname:Universitat Politècnica de Catalunya (UPC) |
| instname_str | Universitat Politècnica de Catalunya (UPC) |
| reponame_str | UPCommons. Portal del coneixement obert de la UPC |
| collection | UPCommons. Portal del coneixement obert de la UPC |
| repository.name.fl_str_mv | |
| repository.mail.fl_str_mv | |
| _version_ | 1869421986314190848 |
| score | 15.300724 |