PolyThrottle: Energy-efficient Neural Network Inference on Edge Devices: Arithmetic Intensity

by Bayesian Inference (@bayesianinference)

April 2nd, 2024

Too Long; Didn't Read

This paper investigates how the configuration of on-device hardware affects energy consumption for neural network inference with regular fine-tuning.
Part of HackerNoon's growing list of open-source research papers, promoting free access to academic material.

This paper is available on arxiv under CC BY-NC-ND 4.0 DEED license.

Authors:

(1) Minghao Yan, University of Wisconsin-Madison (myan@cs.wisc.edu);

(2) Hongyi Wang, Carnegie Mellon University;

(3) Shivaram Venkataraman, University of Wisconsin-Madison.

C ARITHMETIC INTENSITY

The arithmetic intensity of a 2D convolution layer can be computed by the following equation:


[Equation 1: arithmetic intensity of a 2D convolution layer]


The notation used in Equation 1 is defined in Table 8.
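
In common notation, the arithmetic intensity of a 2D convolution is the ratio of the FLOPs it performs to the bytes it moves. A sketch of this standard formulation is given below; the symbols here are illustrative and may differ from those the paper defines in Table 8:

% Sketch of 2D-convolution arithmetic intensity in assumed notation
% (not necessarily Table 8's): B = batch size, C_in / C_out = input /
% output channels, H x W = input spatial size, H_out x W_out = output
% spatial size, K_h x K_w = kernel size, s = bytes per element.
\begin{align}
\mathrm{FLOPs} &= 2\,B\,C_{out}\,H_{out}\,W_{out}\,C_{in}\,K_h\,K_w \\
\mathrm{Bytes} &= s\,\bigl(B\,C_{in}\,H\,W + C_{in}\,C_{out}\,K_h\,K_w + B\,C_{out}\,H_{out}\,W_{out}\bigr) \\
\mathrm{Arithmetic\ Intensity} &= \frac{\mathrm{FLOPs}}{\mathrm{Bytes}}
\end{align}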


The FLOPs term captures the total computation of each workload, while the arithmetic intensity term captures the extent to which compute throughput versus memory bandwidth bounds the final performance. Combining these features with an intercept term, which captures the fixed overhead of neural network inference, we can build a model that predicts inference latency when the hardware operating frequency is held stable.
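
As a concrete sketch, such a predictor can be fit as an ordinary linear regression over per-workload features. The feature values below and the use of scikit-learn are illustrative assumptions, not the paper's implementation:

# Minimal sketch of the latency model described above: a linear fit over
# FLOPs and arithmetic intensity, plus an intercept for fixed overhead.
# Assumes measurements taken at a stable hardware operating frequency.
import numpy as np
from sklearn.linear_model import LinearRegression

# One row per profiled workload: [FLOPs, arithmetic_intensity]
# (values are made up for illustration)
X = np.array([
    [2.1e9, 45.0],
    [8.4e9, 60.2],
    [1.3e9, 18.7],
])
y = np.array([3.2, 9.8, 2.6])  # measured inference latency in ms

# fit_intercept=True models the fixed per-inference overhead
model = LinearRegression(fit_intercept=True).fit(X, y)

# Predict latency for a new workload with known FLOPs and intensity
flops, intensity = 4.0e9, 30.0
pred_ms = model.predict([[flops, intensity]])[0]
print(f"predicted latency: {pred_ms:.2f} ms")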
