HyperTransformer: F Visualization of The Generated CNN Weights

Written by escholar | Published 2024/04/16

TL;DR: In this paper we propose a new few-shot learning approach that allows us to decouple the complexity of the task space from the complexity of individual tasks.

This paper is available on arXiv under a CC 4.0 license.

Authors:

(1) Andrey Zhmoginov, Google Research & {azhmogin,sandler,mxv}@google.com;

(2) Mark Sandler, Google Research & {azhmogin,sandler,mxv}@google.com;

(3) Max Vladymyrov, Google Research & {azhmogin,sandler,mxv}@google.com.

F VISUALIZATION OF THE GENERATED CNN WEIGHTS.

Figures 9 and 10 show examples of the CNN kernels generated by a single-head, 1-layer transformer for a simple 2-layer CNN model with 9 × 9 stride-4 kernels. The two figures correspond to different approaches to re-assembling the weights from the generated slices: “output” allocation and “spatial” allocation (see Section 3.1 in the main text for more information). Notice that “spatial” weight allocation produces more homogeneous kernels for the first layer than “output” allocation does. In both figures we show how the final kernels differ across 3 variants: a model with both layers generated, a model with one layer generated and one trained, and a model with both layers trained.
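To make the two re-assembly schemes concrete, here is a minimal sketch in plain Python. It assumes that under “output” allocation each generated slice holds all k × k × n_in weights for one output channel, and that under “spatial” allocation each slice holds all n_in × n_out weights for one spatial position. The function names and the flat ordering of weights inside each slice are illustrative assumptions, not details taken from the paper.

```python
def assemble_output(slices, k, n_in):
    """'Output' allocation (assumed layout): slices[o] is a flat list of
    k*k*n_in weights for output channel o, ordered by (row, col, in_channel).
    Returns a nested list indexed as kernel[row][col][in_channel][out_channel]."""
    n_out = len(slices)
    return [[[[slices[o][(i * k + j) * n_in + c] for o in range(n_out)]
              for c in range(n_in)]
             for j in range(k)]
            for i in range(k)]


def assemble_spatial(slices, k, n_in, n_out):
    """'Spatial' allocation (assumed layout): slices[i*k + j] is a flat list of
    n_in*n_out weights for spatial position (i, j), ordered by
    (in_channel, out_channel). Returns the same kernel[row][col][in][out] layout."""
    return [[[[slices[i * k + j][c * n_out + o] for o in range(n_out)]
              for c in range(n_in)]
             for j in range(k)]
            for i in range(k)]
```

Both functions produce a kernel with the same layout, so the schemes differ only in how the weights are grouped into generated slices: one slice per output channel versus one slice per spatial position.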

Trained layers are always fixed at inference time across all episodes, whereas the generated layers vary, albeit not significantly. In Figures 11 and 12 we show the generated kernels for two different episodes and, on the right, the difference between them. It appears that the generated convolutional kernels vary within 10–15% from episode to episode.
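One way to quantify such episode-to-episode variation is a relative difference between two generated kernels. The sketch below uses a relative L2 norm, which is an assumption made here for illustration; the text does not state which measure underlies the 10–15% figure. The function name `relative_difference` is hypothetical.

```python
import math


def relative_difference(kernel_a, kernel_b):
    """Relative L2 difference ||a - b|| / ||a|| between two kernels,
    each given as a flat list of weights of equal length.
    The choice of the L2 norm is an assumption, not the paper's stated metric."""
    if len(kernel_a) != len(kernel_b):
        raise ValueError("kernels must have the same number of weights")
    norm = math.sqrt(sum(a * a for a in kernel_a))
    if norm == 0.0:
        raise ValueError("reference kernel has zero norm")
    diff = math.sqrt(sum((a - b) ** 2 for a, b in zip(kernel_a, kernel_b)))
    return diff / norm
```

For example, `relative_difference([3.0, 4.0], [3.0, 3.5])` evaluates to 0.1, that is, a 10% change relative to the first kernel.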


Published by HackerNoon on 2024/04/16