Temperature, energy and performance: addressing embedded system challenges through fast cache simulation

Access & Terms of Use
open access
Copyright: Schneider, Josef
Abstract
Temperature, energy and performance are essential design considerations during the conception of modern digital systems. The work presented in this thesis focusses on three aspects that can be used to address these considerations. First, the suitability of dynamic application adaptation is evaluated as a means of controlling the temperature of a Field Programmable Gate Array (FPGA) device. Despite the use of an extremely adaptive custom JPEG encoder, it was determined that application adaptation alone is ineffective for thermal management in an FPGA. Next, a study is performed to assess which components are principally responsible for the rise in temperature in FPGAs. It was found that the external memory interface is a significant heat source in FPGA-based embedded systems, and that device temperature correlates with CPU cache miss rate. The third and main aspect covered in this dissertation is the speeding up of CPU cache simulation. Single-pass cache simulation is a tool that can be employed at design time to select a cache yielding acceptable temperature, system performance and energy consumption. Three Multiple cAche Simulators in Hardware (MASH) or in Software (MASS) are proposed for three cache replacement policies: MASH{lru} for the Least Recently Used (LRU) cache algorithm, MASH{fifo} for First In First Out (FIFO) and MASS{plrut} for Pseudo Least Recently Used tree (PLRUt). The first two are novel in that they are implemented in hardware, and are respectively 53x and 11.10x faster than their software counterparts. The PLRUt simulator presents, for the first time, an optimised hash table-based algorithm, yielding a speedup of 1.93x over an unoptimised solution. All cache simulators exploit cache properties specific to their replacement policies to improve simulator characteristics.
Additionally, it is shown that the hardware (MASH) simulators can be implemented in-system alongside an embedded system, allowing direct trace extraction and cache simulation from within an FPGA. Using in-system simulation, large speedups can be achieved because trace generation and multiple-cache simulation happen simultaneously at high frequencies.
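As a rough illustration of what "single-pass" simulation of multiple caches can mean for LRU, the sketch below uses the classic stack-distance property: with a fixed number of sets and line size, a w-way LRU cache holds exactly the w most recently used tags of each set, so one pass over the trace yields miss counts for every associativity up to a chosen maximum. This is a generic software sketch under those assumptions, not the MASH{lru} hardware design described in the thesis; the function name `lru_single_pass` and its parameters are hypothetical.

```python
from collections import defaultdict

def lru_single_pass(trace, num_sets, line_size, max_ways):
    """Count LRU misses for associativities 1..max_ways in one trace pass.

    Returns a list `misses` where misses[w] is the miss count of a
    w-way set-associative LRU cache (index 0 is unused).
    """
    stacks = defaultdict(list)       # set index -> tags, most recent first
    misses = [0] * (max_ways + 1)
    for addr in trace:
        block = addr // line_size
        index, tag = block % num_sets, block // num_sets
        stack = stacks[index]
        if tag in stack:
            depth = stack.index(tag)  # stack distance, 0 = most recent
            stack.pop(depth)
        else:
            depth = max_ways          # not tracked: a miss at every size
        # A w-way cache keeps the w most recent tags, so it misses
        # exactly when the stack distance is at least w.
        for ways in range(1, min(depth, max_ways) + 1):
            misses[ways] += 1
        stack.insert(0, tag)          # accessed tag becomes most recent
        del stack[max_ways:]          # deeper tags miss everywhere anyway
    return misses

# Example: one set, one-byte lines, associativities 1 to 3.
print(lru_single_pass([0, 1, 0, 2, 1, 0], num_sets=1, line_size=1, max_ways=3))
```

The list-based stack keeps the sketch short; searching it costs time proportional to the maximum associativity per access, which is the kind of overhead that policy-specific optimisations and hardware implementations aim to reduce.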
Author(s)
Schneider, Josef
Supervisor(s)
Parameswaran, Sridevan
Ignjatovic, Aleksandar
Publication Year
2015
Resource Type
Thesis
Degree Type
PhD Doctorate
Files
download public version.pdf 4.08 MB Adobe Portable Document Format