Authors:
(1) Xiaofan Yu, University of California San Diego, La Jolla, California, USA ([email protected]);
(2) Anthony Thomas, University of California San Diego, La Jolla, California, USA ([email protected]);
(3) Ivannia Gomez Moreno, CETYS University, Campus Tijuana, Tijuana, Mexico ([email protected]);
(4) Louis Gutierrez, University of California San Diego, La Jolla, California, USA ([email protected]);
(5) Tajana Šimunić Rosing, University of California San Diego, La Jolla, USA ([email protected]).
On-device learning has emerged as a prevailing trend that avoids the slow response time and costly communication of cloud-based learning. The ability to learn continuously and indefinitely in a changing environment, and under resource constraints, is critical for real sensor deployments. However, existing designs are inadequate for practical scenarios with (i) streaming data input, (ii) a lack of supervision, and (iii) limited on-board resources. In this paper, we design and deploy the first on-device lifelong learning system, called LifeHD, for general IoT applications with limited supervision. LifeHD is built on a novel, neurally inspired, and lightweight learning paradigm called Hyperdimensional Computing (HDC). We utilize a two-tier associative memory organization to intelligently store and manage high-dimensional, low-precision vectors, which represent historical patterns as cluster centroids. We additionally propose two variants of LifeHD to cope with scarce labeled inputs and power constraints. We implement LifeHD on off-the-shelf edge platforms and perform extensive evaluations across three scenarios. Our measurements show that LifeHD improves unsupervised clustering accuracy by up to 74.8% compared to state-of-the-art NN-based unsupervised lifelong learning baselines, with up to 34.3x better energy efficiency. Our code is available at https://github.com/Orienfish/LifeHD.
Edge Computing, Lifelong Learning, Hyperdimensional Computing
The fusion of artificial intelligence and the Internet of Things (IoT) has become a prominent trend with numerous real-world applications, such as smart cities [10], smart voice assistants [55], and smart activity recognition [62]. However, the predominant current approach is cloud-centric, where sensor devices send data to the cloud for offline training using extensive data sources. This approach faces challenges such as slow updates and costly communication, involving the exchange of large sensor data and models between the edge and the cloud [53]. Instead, recent research has shifted towards edge learning, where machine learning is performed on resource-constrained edge devices right next to the sensors. While most studies have focused on inference-only tasks [32, 33, 50], some recent work has investigated the optimization of computational and memory resources for on-device training [15, 34]. Nevertheless, these efforts often rely on static models for inference or lack the adaptability to accommodate new environments.
To fundamentally address these issues, sensor devices should be capable of "lifelong learning" [42]: learning and adapting with limited supervision after deployment. On-device lifelong learning reduces the need for expensive data collection (including labels) and offline model training, operating in a deploy-and-run manner. This approach enables autonomous learning solely from the incoming samples with minimal supervision, and is thus able to provide real-time decision-making even without a network connection. The lifelong aspect is essential for handling dynamic real-world environments, representing the future of IoT.
Although extensive research has investigated lifelong learning across various scenarios [42], existing techniques face challenges that render them unsuitable for real-world deployments. These challenges include:
(C1) Streaming data input. Edge devices collect streaming data from a dynamic environment. This online learning setting with non-iid data contrasts with the default offline, iid setting, where multiple passes over the entire dataset are allowed [16].
(C2) Lack of supervision. Obtaining ground-truth labels and expert guidance is often challenging and expensive. Most lifelong learning methods rely on some form of supervision, such as class labels [28] or class shift boundaries [46], which are typically unavailable in real-world scenarios.
(C3) Limited device resources. Neural networks (NN) are known for their high resource demands [60]. Furthermore, the main techniques for lifelong learning based on NN, such as regularization [28] and memory replay [35], add extra computational and memory requirements beyond standard NNs, making them inadequate for edge devices.
Real-World Example. To illustrate these challenges, we present a real-world scenario in Fig. 1. Consider a camera deployed in the wild, continuously collecting data from its surrounding environment. Our goal is to train an unsupervised object recognition algorithm on the edge device, purely from the data stream. We construct both iid and sequential (one class appears after the other) streams from CIFAR-100 [6], and adopt the smallest MobileNet V3 model [20] with the popular BYOL unsupervised learning pipeline [16]. As seen in Fig. 1, while the model shows improved accuracy on iid streams, it suffers a significant performance loss on sequentially ordered data, highlighting the "forgetting" effect of NNs in a streaming and unsupervised setting. In terms of efficiency, we measure the training latency of MobileNet V3 (small) [20] on two typical edge platforms, Raspberry Pi (RPi) 4B [2] and Jetson TX2 [1], by running 10 gradient descent steps on a single batch of 32 samples. Even on these very capable edge platforms, training takes up to 17.4 seconds, clearly unsuitable for real-time processing at 30 FPS. Therefore, a novel approach capable of handling non-iid data and offering more efficient updates is necessary to accommodate continual changes in the data.
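To make the latency measurement concrete, the following is a minimal sketch of how such a timing experiment could be run, assuming PyTorch and torchvision are available on the device. It times plain supervised gradient steps rather than the full BYOL pipeline used above, so it is illustrative only, not our exact measurement script.

```python
# Minimal sketch: time 10 gradient descent steps of MobileNet V3 (small)
# on one batch of 32 CIFAR-100-sized samples. Illustrative only; the paper's
# measurements use the BYOL unsupervised pipeline rather than this loss.
import time
import torch
import torch.nn as nn
from torchvision.models import mobilenet_v3_small

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = mobilenet_v3_small(num_classes=100).to(device)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# One batch of 32 placeholder images shaped like CIFAR-100 (3x32x32).
x = torch.randn(32, 3, 32, 32, device=device)
y = torch.randint(0, 100, (32,), device=device)

start = time.time()
for _ in range(10):  # 10 gradient descent steps on a single batch
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()
print(f"10 training steps took {time.time() - start:.2f} s")
```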
To address challenges (C1)-(C3), we draw inspiration from biology, where even tiny insects display remarkable lifelong learning abilities, and do so using "hardware" that requires very little energy [4]. Hyperdimensional computing (HDC) is an emerging paradigm inspired by the information processing mechanisms found in biological brains [24]. In HDC, all data are represented as high-dimensional, low-precision (often binary) vectors known as "hypervectors," which can be manipulated through simple element-wise operations to perform tasks like memorization and learning. HDC is well understood from a theoretical standpoint [56] and shares intriguing connections with biological lifelong learning [52]. Furthermore, its use of basic element-wise operators maps well onto highly parallel and energy-efficient hardware, offering substantial energy savings in IoT applications [11, 23, 27, 65]. While HDC has been reported as a promising avenue, the literature to date has not explored weakly supervised lifelong learning using HDC.
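As a flavor of what "simple element-wise operations" means, the following numpy sketch shows random bipolar hypervectors, binding, bundling, and similarity search. The dimensionality and the toy "cat"/"dog" patterns are purely illustrative and are not LifeHD's actual encoding (detailed in Sec. 3 and Sec. 5).

```python
# Toy numpy sketch of core HDC operations: random bipolar hypervectors,
# binding (element-wise multiply), bundling (element-wise majority), and
# similarity. Illustrative only; not LifeHD's actual encoder.
import numpy as np

D = 10_000                                  # hypervector dimensionality
rng = np.random.default_rng(0)

def random_hv():
    """Random bipolar (+1/-1) hypervector."""
    return rng.choice([-1, 1], size=D)

def bind(a, b):
    """Binding associates two hypervectors (element-wise multiply)."""
    return a * b

def bundle(hvs):
    """Bundling superimposes hypervectors (element-wise sign of the sum)."""
    return np.sign(np.sum(hvs, axis=0))

def similarity(a, b):
    """Normalized dot product; near 0 for unrelated hypervectors."""
    return float(np.dot(a, b)) / D

# Memorize two "patterns" as bundles of feature hypervectors, then recall
# from a noisy query (10% of dimensions flipped).
cat = bundle([random_hv(), random_hv(), random_hv()])
dog = bundle([random_hv(), random_hv(), random_hv()])
query = np.where(rng.random(D) < 0.1, -cat, cat)
print(similarity(query, cat), similarity(query, dog))  # cat scores higher
```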
In this work, we design and deploy LifeHD, the first system for on-device lightweight lifelong learning in an unsupervised and dynamic environment. LifeHD leverages HDC's efficient computation and advantages in lifelong learning, while effectively handling unlabeled streaming inputs. These capabilities extend beyond the scope of existing HDC designs, which have focused overwhelmingly on the supervised setting [23, 27]. Specifically, LifeHD represents the input as high-dimensional, low-precision vectors and, drawing inspiration from work in cognitive science [5], organizes data into a two-tier memory hierarchy: a short-term "working memory" and a long-term memory. The working memory processes incoming data and summarizes it into a group of fine-grained clusters represented by hypervectors called cluster HVs. The long-term memory consolidates frequently appearing cluster HVs from the working memory and is retrieved occasionally for merging and inference. We emphasize that LifeHD is designed to suit a variety of edge devices with diverse resource levels. Further efficiency gains could be achieved by employing optimizations such as pruning and quantization [15, 61], but this is not the focus of our work.
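A highly simplified sketch of the two-tier organization is shown below. The class name, thresholds, and the hit-count promotion rule are hypothetical placeholders; LifeHD's actual novelty detection, centroid update, consolidation, and merging logic are described in Sec. 5.

```python
# Simplified sketch of a two-tier associative memory over cluster HVs.
# Names, thresholds, and the promotion rule are hypothetical; see Sec. 5
# for LifeHD's actual cluster-management design.
import numpy as np

class TwoTierMemory:
    def __init__(self, dim=10_000, novelty_thres=0.3, promote_hits=5):
        self.dim = dim
        self.novelty_thres = novelty_thres  # below this similarity -> new cluster
        self.promote_hits = promote_hits    # hits before consolidation
        self.working = []                   # list of [cluster_hv, hit_count]
        self.longterm = []                  # consolidated cluster HVs

    def _similarity(self, a, b):
        return float(np.dot(a, b)) / self.dim

    def insert(self, hv):
        """Match an encoded hypervector against working-memory clusters."""
        if self.working:
            sims = [self._similarity(hv, c) for c, _ in self.working]
            best = int(np.argmax(sims))
            if sims[best] >= self.novelty_thres:
                # Update the matched cluster centroid and its hit count.
                self.working[best][0] = np.sign(self.working[best][0] + hv)
                self.working[best][1] += 1
                if self.working[best][1] >= self.promote_hits:
                    # Consolidate a frequently seen cluster into long-term memory.
                    self.longterm.append(self.working.pop(best)[0])
                return
        # Novel pattern: start a new fine-grained cluster in working memory.
        self.working.append([hv, 1])
```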
Our basic approach in LifeHD is fully unsupervised. In reality, however, labels may be available (or could be acquired) for a small number of examples. We introduce LifeHDsemi to exploit a limited number of labeled samples as an extension of the purely unsupervised LifeHD. Additionally, we propose LifeHDa, which uses an adaptive scheme inspired by model pruning to adjust the HD embedding dimension on the fly. LifeHDa allows us to further reduce resource usage (power in particular) where necessary.
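The gist of adjusting the HD dimension on the fly can be illustrated with a hypothetical masking scheme: keep only a fraction of dimensions (here, the highest-variance ones across stored cluster HVs) and compute similarities on the reduced vectors. LifeHDa's actual pruning criterion and schedule are given in Sec. 6.

```python
# Hypothetical illustration of adaptive dimension reduction for HDC.
# The variance-based criterion here is an assumption for illustration;
# LifeHDa's actual pruning scheme is described in Sec. 6.
import numpy as np

def prune_dimensions(cluster_hvs, keep_ratio=0.2):
    """Return indices of the dimensions to retain."""
    variances = np.var(np.stack(cluster_hvs), axis=0)
    k = int(keep_ratio * cluster_hvs[0].shape[0])
    return np.argsort(variances)[-k:]  # keep the top-k most informative dims

def pruned_similarity(a, b, keep_idx):
    """Similarity computed only on the retained dimensions."""
    return float(np.dot(a[keep_idx], b[keep_idx])) / len(keep_idx)
```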
In summary, the contributions of this paper are:
(1) We design LifeHD, the first end-to-end system for on-device unsupervised lifelong intelligence using HDC. LifeHD builds upon HDC’s lightweight single-pass training capability and incorporates our novel clustering-based memory design to address challenges (C1)-(C3).
(2) We further propose LifeHDsemi as an extension that fully utilizes the scarce labeled samples along the stream. We also devise LifeHDa, which enables adaptive pruning in LifeHD to reduce real-time power consumption.
(3) We implement LifeHD on off-the-shelf edge devices and conduct extensive experiments across three typical IoT scenarios. LifeHD improves the unsupervised clustering accuracy by up to 74.8%, with 34.3x better energy efficiency, compared to leading unsupervised NN lifelong learning methods [13, 14, 54].
(4) LifeHDsemi improves the unsupervised clustering accuracy by up to 10.25% over the SemiHD [22] baseline under limited label availability. LifeHDa limits the accuracy loss within 0.71% using only 20% of LifeHD’s full HD dimension.
The rest of the paper is organized as follows. We begin with a comprehensive review of related work in Sec. 2 and introduce the salient background on HDC in Sec. 3. We formally define the unsupervised lifelong learning problem we target in Sec. 4. Sec. 5 then describes the details of our main design, LifeHD, and Sec. 6 introduces LifeHDsemi and LifeHDa. Sec. 7 presents the implementation and results of LifeHD, while the evaluations of LifeHDsemi and LifeHDa are reported in Sec. 8. We discuss limitations and future work in Sec. 9 and conclude in Sec. 10.
This paper is available on arxiv under CC BY-NC-SA 4.0 DEED license.