PolyThrottle: Energy-efficient Neural Network Inference on Edge Devices: Experimental Results

by Bayesian Inference

April 2nd, 2024

Too Long; Didn't Read

This paper investigates how the configuration of on-device hardware affects energy consumption for neural network inference with regular fine-tuning.

This paper is available on arxiv under CC BY-NC-ND 4.0 DEED license.

Authors:

(1) Minghao Yan, University of Wisconsin-Madison;

(2) Hongyi Wang, Carnegie Mellon University;

(3) Shivaram Venkataraman, University of Wisconsin-Madison. Contact: myan@cs.wisc.edu.

B EXPERIMENTAL RESULTS

In this section, we further demonstrate the tradeoff between memory frequency and maximum GPU frequency by presenting an array of results. These results show that energy consumption patterns vary for the same model operating on different devices. Furthermore, even for the same model-device pairing, the optimization landscape can be significantly influenced by the batch size. This underlines the complexity of energy optimization and the need for an adaptive framework that takes these factors into account. Figures 6-13 show the energy consumption patterns of EfficientNet and BERT on Jetson TX2 and Orin under various batch sizes. Table 7 shows the optimal CPU frequency and the corresponding energy consumption reduction in image preprocessing.
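To make the quantity plotted in the figures concrete, here is a minimal sketch of how per-query energy could be computed and minimized over a grid of GPU and memory frequencies. It is an illustration under stated assumptions, not the paper's implementation: the exhaustive grid search is a stand-in for whatever search the framework actually uses, the function names (`per_query_energy_mj`, `best_configuration`, `toy_measure`) are hypothetical, the frequency values are not the Jetson devices' supported clock steps, and `toy_measure` is a made-up placeholder for real power and latency measurements.

```python
from itertools import product
from typing import Callable, Iterable, Tuple

# (average power in watts, batch latency in milliseconds)
Measurement = Tuple[float, float]


def per_query_energy_mj(avg_power_w: float, batch_latency_ms: float,
                        batch_size: int) -> float:
    """Energy per query in millijoules.

    W * ms = mJ for the whole batch; dividing by the batch size
    gives the cost attributed to a single query.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return avg_power_w * batch_latency_ms / batch_size


def best_configuration(
    gpu_freqs_mhz: Iterable[int],
    mem_freqs_mhz: Iterable[int],
    batch_size: int,
    measure: Callable[[int, int, int], Measurement],
) -> Tuple[int, int, float]:
    """Exhaustively scan the grid and return (gpu_mhz, mem_mhz, energy_mj)
    for the configuration with the lowest per-query energy."""
    best = None
    for gpu, mem in product(list(gpu_freqs_mhz), list(mem_freqs_mhz)):
        power_w, latency_ms = measure(gpu, mem, batch_size)
        energy = per_query_energy_mj(power_w, latency_ms, batch_size)
        if best is None or energy < best[2]:
            best = (gpu, mem, energy)
    if best is None:
        raise ValueError("frequency grid is empty")
    return best


def toy_measure(gpu_mhz: int, mem_mhz: int, batch_size: int) -> Measurement:
    """Placeholder model, NOT measured data: power grows with both clocks,
    latency shrinks with them. Replace with real on-device measurements."""
    power_w = 2.0 + 0.004 * gpu_mhz + 0.002 * mem_mhz
    latency_ms = 5.0 + batch_size * (4000.0 / gpu_mhz + 2000.0 / mem_mhz)
    return power_w, latency_ms


if __name__ == "__main__":
    # Illustrative grids only; real devices expose their own clock steps.
    gpu_grid = [300, 600, 900, 1200]
    mem_grid = [400, 800, 1600]
    for batch in (1, 8, 16):
        gpu, mem, energy = best_configuration(gpu_grid, mem_grid, batch, toy_measure)
        print(f"batch={batch}: gpu={gpu} MHz, mem={mem} MHz, {energy:.2f} mJ/query")
```

Running the search separately for each batch size mirrors the point made above: the best frequency pair is a property of the model, device, and batch size together, so it has to be re-evaluated whenever any of them changes.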


Figure 6. Per-query energy cost for BERT at FP16 on Jetson TX2 as the GPU frequency and memory frequency vary, with batch size fixed at 1.


Figure 7. Per-query energy cost for BERT at FP32 on Jetson TX2 as the GPU frequency and memory frequency vary, with batch size fixed at 1.


Figure 8. Per-query energy cost for BERT at FP16 on Jetson TX2 as the GPU frequency and memory frequency vary, with batch size fixed at 8.


Figure 9. Per-query energy cost for EfficientNet B4 at FP16 on Jetson TX2 as the GPU frequency and memory frequency vary, with batch size fixed at 16.


Figure 10. Per-query energy cost for EfficientNet B7 at FP16 on Jetson TX2 as the GPU frequency and memory frequency vary, with batch size fixed at 16.


Figure 11. Per-query energy cost for EfficientNet B7 at FP16 on Jetson Orin as the GPU frequency and memory frequency vary, with batch size fixed at 8.


Figure 12. Per-query energy cost for EfficientNet B7 at FP16 on Jetson Orin as the GPU frequency and memory frequency vary, with batch size fixed at 1.


Figure 13. Per-query energy cost for EfficientNet B4 at FP16 on Jetson Orin as the GPU frequency and memory frequency vary, with batch size fixed at 8.
