Fetch unit design for scalable simultaneous multithreading (ScSMT)

Continuous IC process enhancements make possible to integrate on a single chip the re-sources required for simultaneously executing multiple control flows or threads, exploiting different levels of thread-level parallelism: application-, function-, and loop-level. Scalable simultaneous multi-threadi...

Descripción completa

Guardado en:
Detalles Bibliográficos
Autores principales: Moure, Juan Carlos, Rexachs del Rosario, Dolores, Luque Fadón, Emilio
Formato: Articulo
Lenguaje:Inglés
Publicado: 2001
Materias:
Acceso en línea:http://sedici.unlp.edu.ar/handle/10915/9404
http://journal.info.unlp.edu.ar/wp-content/uploads/ipaper1.pdf
Aporte de:
Descripción
Sumario:Continuous IC process enhancements make possible to integrate on a single chip the re-sources required for simultaneously executing multiple control flows or threads, exploiting different levels of thread-level parallelism: application-, function-, and loop-level. Scalable simultaneous multi-threading combines static and dynamic mechanisms to assemble a complexity-effective design that provides high instruction per cycle rates without sacrificing cycle time nor single-thread performance. This paper addresses the design of the fetch unit for a high-performance, scalable, simultaneous multithreaded processor. We present the detailed microarchitecture of a clustered and reconfigurable fetch unit based on an existing single-thread fetch unit. In order to minimize the occurrence of fetch hazards, the fetch unit dynamically adapts to the available thread-level parallelism and to the fetch characteristics of the active threads, working as a single shared unit or as two separate clusters. It combines static and dynamic methods in a complexity-efficient way. The design is supported by a simulation- based analysis of different instruction cache and branch target buffer configurations on the context of a multithreaded execution workload. Average reductions on the miss rates between 30% and 60% and peak reductions greater than 200% are obtained.