ABSTRACT
Many DTM schemes rely heavily on the accurate knowledge
of the chip’s dynamic thermal state to make optimal performance/
temperature trade-off decisions.
1. INTRODUCTION
Most of today’s high-performance multi-core processors suffer from heavy power/thermal stress.
(page 1, col 2)
If the statistical characteristics of the power dissipation
profile do not change over time, a Kalman-filter-based approach
can generate optimal thermal estimates from sensor observations
[4].
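To make the idea concrete, here is a minimal scalar Kalman tracker. The thermal model and every parameter value (a, b, q, r, p_in) are illustrative assumptions for the sketch, not taken from the paper:

```python
import numpy as np

# Hypothetical scalar thermal model: T[n+1] = a*T[n] + b*p_in + w[n],
# sensor reading y[n] = T[n] + v[n]; all values are illustrative.
a, b = 0.95, 0.4          # thermal decay and power-to-heat gain
q, r = 0.05, 1.0          # process and sensor noise variances
p_in = 10.0               # constant mean power dissipation

rng = np.random.default_rng(0)
T_true, T_est, P = 50.0, 50.0, 1.0
for n in range(200):
    # True dynamics with process noise
    T_true = a * T_true + b * p_in + rng.normal(0.0, np.sqrt(q))
    y = T_true + rng.normal(0.0, np.sqrt(r))   # noisy sensor reading
    # Kalman predict step
    T_pred = a * T_est + b * p_in
    P = a * P * a + q
    # Kalman update step
    K = P / (P + r)
    T_est = T_pred + K * (y - T_pred)
    P = (1.0 - K) * P

print(abs(T_est - T_true))   # small: estimate tracks true temperature
```

If the power statistics (q, p_in) are fixed and correct, the gain K settles to a steady-state value, which is what makes the steady-state variants used later in the paper possible.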
In this paper, we investigate the problem of
adaptive temperature tracking at runtime by considering the
dynamic changes in the statistical characteristics of the power
profile.
2. PRELIMINARY
2.1 System Dynamics
2.2 Kalman Filter Based Thermal Tracking
3. PROBLEM DEFINITION AND
CHALLENGES
The statistical characteristics of each potential power state
could be captured by simulating or experimenting with all
potential application sets (integer vs. floating point, scientific
vs. multimedia, etc.).
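A sketch of what "capturing the statistical characteristics" of a power state could mean in practice: estimate per-state mean and variance of power from traces. The traces below are synthetic stand-ins for simulation output, and the class names are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic per-interval power traces for two hypothetical workload
# classes; in practice these would come from simulating
# representative applications (integer vs. floating point, etc.).
traces = {
    "integer":  rng.normal(8.0, 0.5, size=10_000),
    "floating": rng.normal(14.0, 1.5, size=10_000),
}

# Each power state is summarized by its mean power and variance,
# which parameterize the filter's process-noise model for that state.
power_states = {
    name: {"mean": float(t.mean()), "var": float(t.var())}
    for name, t in traces.items()
}
print(power_states)
```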
4. ADAPTIVE TRACKING BASED ON
RESIDUAL WHITENING
4.1 Autonomous Detection
In this section we explain how the Kalman filter can be used
to autonomously detect switches between power states.
Note that we use Cs instead of C[n|n−1], since we assume
the system has reached steady state.
Once again we use the steady-state gain Ks as a parameter.
(page 4)
Basically, we evaluate the error between the observation and the prediction (the innovation) and estimate its autocorrelation.
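A toy version of the residual-whitening idea: when the filter's model matches reality, the innovations are white; an unmodeled power-state switch leaves them correlated. The bias shape and confidence band below are illustrative assumptions, not the paper's test:

```python
import numpy as np

def lag1_autocorr(residuals):
    """Normalized lag-1 autocorrelation of an innovation sequence."""
    r = np.asarray(residuals) - np.mean(residuals)
    return float(np.dot(r[:-1], r[1:]) / np.dot(r, r))

rng = np.random.default_rng(2)
white = rng.normal(0, 1, 500)        # model matches: white innovations
# A model mismatch (e.g. an unmodeled power-state switch) leaves a
# slowly varying bias in the innovations, making them correlated.
biased = white + np.linspace(0.0, 4.0, 500)

# For N white samples the lag-1 autocorrelation is ~N(0, 1/N); a 95%
# band is 1.96/sqrt(N), so values far outside it flag a state switch.
thresh = 1.96 / np.sqrt(500)
print(lag1_autocorr(white), thresh)    # typically inside the band
print(lag1_autocorr(biased), thresh)   # far outside: switch detected
```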
DAC 2010
Wednesday, September 1, 2010
Friday, August 20, 2010
paper 5.2 Consistent Runtime Thermal Prediction and Control Through Workload Phase Detection
ABSTRACT
Elevated temperatures impact the performance, power consumption, and reliability of processors, which rely on integrated thermal sensors to measure runtime thermal behavior.
1. INTRODUCTION
Dynamic thermal management (DTM) techniques allow processors to optimize performance while avoiding thermal violations. The most well-known DTM techniques include clock gating, dynamic
voltage and frequency scaling (DVFS), and thread migration/scheduling [14, 18, 6, 5].
2. RELATED WORK AND MOTIVATION
3. PROPOSED PHASE-AWARE THERMAL PREDICTION METHODOLOGY
At the highest level, our phase-aware thermal prediction approach
takes raw performance counter data that is periodically measured
for each core during workload operation and translates this data
into a temperature projection for some interval into the future using
the concept of workload phases.
In order to define workload phases and capture temperature dynamics
within them in a computationally efficient manner, we propose
the methodology that is illustrated by Figure 2.
3.1 Offline Thermal Phase Analysis
In order to avoid excessive runtime overhead, global phase analysis
and within-phase temperature modeling are performed offline
using data generated for a set of representative workloads.
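A rough sketch of such an offline step: cluster per-interval performance-counter vectors into global phases (a tiny k-means) and attach a trivial per-phase temperature summary. The counter features, data, and temperatures are all synthetic assumptions; the paper's within-phase models are richer than a per-phase mean:

```python
import numpy as np

rng = np.random.default_rng(3)
# Synthetic per-interval counter vectors (e.g. IPC, memory-access
# rate) for two hypothetical phases: compute-bound and memory-bound.
compute = rng.normal([2.0, 0.1], 0.05, size=(100, 2))
memory  = rng.normal([0.5, 0.9], 0.05, size=(100, 2))
samples = np.vstack([compute, memory])

# Tiny k-means (k=2) to identify global phases offline.
centroids = samples[rng.choice(len(samples), 2, replace=False)]
for _ in range(20):
    d = np.linalg.norm(samples[:, None] - centroids[None], axis=2)
    labels = d.argmin(axis=1)
    new = []
    for k in (0, 1):
        pts = samples[labels == k]
        new.append(pts.mean(axis=0) if len(pts) else centroids[k])
    centroids = np.array(new)

# Stand-in for within-phase temperature modeling: the mean observed
# temperature per phase (synthetic temperatures here).
temps = np.where(np.arange(len(samples)) < 100, 62.0, 78.0)
temps = temps + rng.normal(0, 0.5, len(samples))
phase_temp = {k: float(temps[labels == k].mean()) for k in (0, 1)}
print(phase_temp)
```

At runtime, classifying the current counter vector to the nearest centroid would then select the corresponding phase model for the temperature projection.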
(page 3, col 2)
(page 4)
3.2 Runtime Thermal Prediction and Control
4. EXPERIMENTAL RESULTS
A. Experimental Infrastructure.
(page 5)
5. CONCLUSIONS
Wednesday, August 18, 2010
paper 5.1 Thermal Monitoring of Real Processors: Techniques for Sensor Allocation and Full Characterization
ABSTRACT
1. INTRODUCTION
2. BACKGROUND AND PREVIOUS WORK
3. FREQUENCY DOMAIN TECHNIQUES
4. PROPOSED THERMAL SENSOR ALLOCATION TECHNIQUES
5. PROPOSED FULL RUNTIME THERMAL CHARACTERIZATION TECHNIQUES
A. k-LSE using Pre-determined Thermal Characterization
B. Compressive Sensing
6. EXPERIMENTAL RESULTS
7. CONCLUSIONS
Tuesday, August 17, 2010
paper 4.4 An Effective GPU Implementation of Breadth-First Search
ABSTRACT
1. INTRODUCTION
2. PREVIOUS APPROACHES
3. OUR GPU SOLUTION
3.1 Overview of CUDA on the Nvidia GTX280
3.2 Hierarchical Queue Management
3.3 Hierarchical Kernel Arrangement
4. EXPERIMENTAL RESULTS
Friday, August 13, 2010
paper 4.3 Timing Analysis of Esterel Programs on General-purpose Multiprocessors
ABSTRACT
1. INTRODUCTION
2. OVERVIEW OF ESTEREL
3. CODE GENERATION
4. TIMING ANALYSIS
4.1 Computing Start Times
4.2 Inter-processor Infeasible Paths
4.3 WCET Calculation of a Basic Block
4.4 WCRT Analysis
5. EXPERIMENTAL RESULTS
Tuesday, August 10, 2010
paper 4.2 A Probabilistic and Energy-Efficient Scheduling Approach for Online Application in Real-Time Systems
ABSTRACT
1. INTRODUCTION
2. PRELIMINARIES
2.1 System and Task Model
2.2 Motivating Example
2.3 General Scheduling Concept
2.4 Expected Energy and Time Demand
3. ENERGY MINIMIZATION PROBLEM
4. SOLUTION
4.1 Relaxed Energy Minimization Problem
4.2 General Energy Minimization Problem
4.3 Implementation
5. EXPERIMENTS
5.1 Experimental Setup
5.2 Results
6. CONCLUSIONS
Friday, August 6, 2010
Paper 4.1 LATA: A Latency and Throughput-Aware Packet Processing System
ABSTRACT
1. INTRODUCTION
2. LATA SYSTEM DESIGN
2.1 Program Representation
2.2 Communication Measurement
2.3 Problem Statement
2.4 DAG Generation
3. LATA SCHEDULING, REFINEMENT AND MAPPING
3.1 List-based Pipeline Scheduling Algorithm
3.2 Search-based Refinement Process
3.2.1 Latency Reduction
3.2.2 Throughput Improvement
3.3 Cache-Aware Resource Mapping
3.3.1 Pre-mapping
3.3.2 Real Mapping
4. EXPERIMENT FRAMEWORK
5. PERFORMANCE EVALUATION
5.1 Comparison with Parallel System
5.2 Comparison with Three NP Systems
5.3 Latency Constraint Effect
5.4 Scalability Performance of LATA
5.5 Instruction Cache Size Performance