DISNET: Distributed Micro-Split Deep Learning in Heterogeneous Dynamic IoT
Abstract-The key impediment to deploying deep neural networks in Internet of Things edge environments is the gap between the expensive DNN computation and the limited computing capability of IoT devices. Current state-of-the-art machine learning models place significant demands on memory, computation, and energy, which makes it challenging to integrate them with the decentralized operation of heterogeneous and resource-constrained IoT devices. Recent studies have proposed the cooperative execution of DNN models in IoT devices to enhance the reliability, privacy, and efficiency of intelligent IoT systems but disregarded flexible fine-grained model partitioning schemes for optimal distribution of DNN execution tasks in dynamic IoT networks. In this article, we propose distributed micro-split deep learning in heterogeneous dynamic IoT (DISNET). DISNET reduces inference time and energy consumption by combining vertical (layer based) and horizontal DNN partitioning to enable flexible, distributed, and parallel execution of neural network models on heterogeneous IoT devices. DISNET considers the IoT devices' computing and communication resources and the network conditions for resource-aware cooperative DNN inference. Experimental evaluation in dynamic IoT networks shows that DISNET reduces the DNN inference latency and energy consumption by up to five point two times and six times, respectively, compared to two state-of-the-art schemes without loss of accuracy.
One. INTRODUCTION
Internet of Things systems generate large volumes of data from user devices. The raw data generated by IoT devices is very often private or sensitive and can be too large to transmit over the network. For example, wearable devices, such as Google Glass or Apple Watch, gather sensitive data by recording the daily activities of users. Similarly, the data collected from medical sensors, microphones, and cameras is very often voluminous and highly sensitive. This data is essential for executing machine learning models in order to deliver personalized services and other intelligent IoT applications. Consequently, there is a growing demand for implementing machine learning methods without directly accessing and aggregating sensitive raw data from devices at central servers.
Deep neural networks have shown extraordinary power in understanding large-scale data that are massively diverse and complex in several applications. Because of their proximity to the data, conventional consumer-level devices, such as IoT devices, are great candidates for the in-the-edge processing of DNNs. However, current state-of-the-art machine learning models have significant demands on memory, computation, and energy. This is incompatible with the resource-constrained nature of IoT devices, which are characterized by limited energy budget, memory, and computation capability. Model optimization methods, such as weight pruning and precision reduction, enable the running of limited versions of models on IoT devices. However, with the continuous advancement of DNN models for various fields, these approaches are impractical for a wide range of IoT applications, such as health, industrial, and multimedia-based applications, because a restricted model (i.e., one whose computational demands are reduced enough to fit on IoT devices) may provide insufficient accuracy for such sensitive tasks. It is, therefore, necessary to devise methods to deploy models across a network of resource-constrained devices.
Distributed machine learning methods allow joint execution of DNN models by sharing the computation across multiple devices. Such techniques have great potential to benefit from the rich data generated by distributed heterogeneous IoT devices without transmitting vast amounts of raw data over central networks. Among others, distributed machine learning in edge IoT environments brings the following opportunities and challenges: reducing the dependence on cloud resources and high-performance network infrastructure for scenarios with limited Internet connectivity, protecting private or sensitive user data by not exposing it outside the local network, and providing an alternative to the current standard solution of offloading to the cloud by understanding raw data locally.
Deep learning driven applications are computation-intensive because of the depth of the DNN models (i.e., their massive number of layers) and their large input dimensions. Most existing works apply vertical partitioning (Figure one), where the model is partitioned layer by layer, and some of the layers are processed on the devices and the rest at the server or on other devices. Layer-based partitioning can solve the depth problem of deep learning driven applications and reduce the computation burden on end devices. However, it offers no solution for large input data and disregards the advantages of intralayer parallelism, since the layers are executed in a specific sequential order. Furthermore, the approach misses the opportunity to effectively exploit the collective computing power of numerous heterogeneous, ubiquitous, and decentralized IoT devices at the edge. In most variations of this approach, the last model layers are processed on the server, creating server dependency.
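As a rough illustration of layer-based (vertical) partitioning, the following minimal sketch searches for the cut point in a chain of layers that minimizes end-to-end latency between one device and one server. The function name `best_split`, the per-layer cost lists, and the simple additive latency model are assumptions made for illustration only; they are not taken from any of the schemes discussed here.

```python
def best_split(layer_flops, out_bytes, input_bytes, dev_rate, srv_rate, link_rate):
    """Hypothetical helper: pick the layer index k at which to cut a chain.

    Layers [0, k) run on the device and layers [k, n) run on the server.
    layer_flops[i] is the work of layer i, out_bytes[i] the size of its output.
    Rates are in FLOP/s (dev_rate, srv_rate) and bytes/s (link_rate).
    Returns (k, latency_in_seconds) for the cheapest cut.
    """
    n = len(layer_flops)
    best = None
    for k in range(n + 1):
        t_dev = sum(layer_flops[:k]) / dev_rate
        t_srv = sum(layer_flops[k:]) / srv_rate
        if k == n:
            t_tx = 0.0  # everything stays on the device, nothing is sent
        else:
            # Send the raw input (k == 0) or the output of the last local layer.
            sent = input_bytes if k == 0 else out_bytes[k - 1]
            t_tx = sent / link_rate
        total = t_dev + t_tx + t_srv
        if best is None or total < best[1]:
            best = (k, total)
    return best


# Illustrative numbers only: three layers, a slow device, a fast server.
k, latency = best_split(
    layer_flops=[4e8, 4e8, 2e8],
    out_bytes=[2e5, 1e4, 1e3],
    input_bytes=6e5,
    dev_rate=1e9,
    srv_rate=1e10,
    link_rate=1e6,
)
```

Even in this toy model, the layers on each side of the cut still run strictly one after another, which is the sequential-execution limitation noted above.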
Cooperative execution of DNN models among edge devices has significant potential for reducing latency and energy costs by harvesting underutilized computing resources at the edge. Other recent studies have proposed combining data partitioning with horizontal DNN partitioning. These approaches distribute the model input data and allow parallel execution of model layers on multiple devices (Figure two). However, they pay less attention to the heterogeneity and variability of the computing and network resources of IoT networks. Furthermore, input-wise partitioning alone can increase data dependence, leading to overlapped computation and redundant communication.
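The overlap caused by input-wise partitioning can be seen in a small sketch. Assuming a single one-dimensional convolution layer with stride one and no padding (a simplification chosen for illustration, with hypothetical helper names `conv1d` and `split_with_halo`), each tile must carry extra boundary inputs so that the per-device outputs concatenate to the full result.

```python
def conv1d(x, kernel):
    """Valid 1-D cross-correlation of a list of numbers with a kernel."""
    k = len(kernel)
    return [sum(x[i + j] * kernel[j] for j in range(k))
            for i in range(len(x) - k + 1)]


def split_with_halo(x, kernel_size, parts):
    """Split x into contiguous tiles whose conv outputs concatenate exactly.

    A tile that produces `count` outputs needs count + kernel_size - 1 inputs,
    so adjacent tiles share kernel_size - 1 boundary (halo) inputs.
    """
    n_out = len(x) - kernel_size + 1
    base, extra = divmod(n_out, parts)
    tiles, start = [], 0
    for p in range(parts):
        count = base + (1 if p < extra else 0)  # outputs assigned to tile p
        tiles.append(x[start:start + count + kernel_size - 1])
        start += count
    return tiles


x = list(range(10))
kernel = [1, 0, -1]
tiles = split_with_halo(x, len(kernel), parts=3)

# Each tile could be processed on a different device in parallel.
merged = [y for tile in tiles for y in conv1d(tile, kernel)]

# Inputs that had to be sent to more than one device.
redundant_inputs = sum(len(t) for t in tiles) - len(x)
```

In this simplified setting the number of duplicated inputs is (parts - 1) * (kernel_size - 1); it grows with the number of devices, which is one way the redundant communication mentioned above arises.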
To effectively exploit computing resources in edge IoT environments, we must consider the resource-constrained nature and limited energy budget of IoT devices and their heterogeneity. Furthermore, given the highly dynamic IoT network conditions, an optimal workload allocation that jointly considers computation and communication costs to improve latency and energy consumption is desired. More specifically, it is crucial to decide which devices should participate in the cooperative machine learning, how much workload to allocate to each device, and how to keep communication among the devices efficient. In addition, the topology of IoT mesh networks can be utilized to optimize communication by allocating the computation of partitions to devices based on the communication paths and the quality of the communication channels among them.
In this article, we propose distributed micro-split deep learning in heterogeneous dynamic IoT (DISNET), a scheme for distributed DNN computing in resource-constrained heterogeneous IoT devices for low-latency and energy-efficient cooperative deep learning. DISNET models the neural network as a weighted directed acyclic graph that can be flexibly partitioned across a network of IoT devices. By considering both vertical and horizontal DNN partitioning, DISNET enables efficient and parallel processing of a DNN model while utilizing the collective computing power of heterogeneous, ubiquitous, and decentralized IoT devices. The IoT network is considered as a weighted graph where the weights correspond to the available computation and network resources on the devices and their respective wireless channels. DISNET dynamically splits the neural network and allocates the processing of its partitions to the IoT devices in the network to optimize DNN inference tasks. We envision the deployment of DISNET in environments such as smart homes, Industrial IoT, Internet of Medical Things, and smart environmental monitoring, wherein the devices are willing to cooperate and share their resources. In summary, this article makes the following contributions.
One) We propose micro-split deep learning to enable flexible partitioning and distributed computing of neural network models in heterogeneous dynamic IoT for low-latency and energy-efficient cooperative DNN inference.
Two) We model the problem of flexible fine-grained neural network model splitting and device allocation, comprising arbitrary-sized DNN partitions, and dynamic mesh networks with heterogeneous resources. We prove the NP-hardness of the problem and formulate its relaxation.
Three) We propose an efficient heuristic based on the relaxed problem that iteratively combines vertical and horizontal DNN partitioning for distributed execution of neural networks in dynamic IoT with diverse computing capabilities and network conditions without compromising accuracy.
Four) We consider the tradeoffs between minimizing the inference execution time and minimizing the energy consumption on the IoT devices by introducing an energy-sensitivity parameter for the optimization that accounts for the application's energy or time constraints when the two objectives conflict.
Five) We implement a multidevice prototype comprising heterogeneous IoT devices and evaluate DISNET against other state-of-the-art approaches in various dynamic scenarios to corroborate its superior performance.
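To give an intuition for the energy-sensitivity parameter in contribution Four, the sketch below scores candidate devices for one DNN partition with a weighted sum of latency and energy. The parameter name `alpha`, the `Device` fields, the reference scales `t_ref` and `e_ref`, and the linear cost model are all assumptions made for illustration; this is not the DISNET formulation, which is developed later in the article.

```python
from dataclasses import dataclass


@dataclass
class Device:
    """Hypothetical description of one IoT device and its wireless link."""
    name: str
    flops_per_s: float       # available compute rate
    compute_power_w: float   # power drawn while computing
    link_bytes_per_s: float  # rate of the channel used to reach the device
    tx_power_w: float        # power drawn while receiving the partition input


def weighted_cost(dev, flops, in_bytes, alpha, t_ref=1.0, e_ref=1.0):
    """Blend latency and energy; alpha=0 is time-only, alpha=1 is energy-only."""
    t_tx = in_bytes / dev.link_bytes_per_s
    t_cmp = flops / dev.flops_per_s
    latency = t_tx + t_cmp
    energy = t_tx * dev.tx_power_w + t_cmp * dev.compute_power_w
    # Dividing by reference scales makes seconds and joules comparable.
    return (1 - alpha) * latency / t_ref + alpha * energy / e_ref


def pick_device(devices, flops, in_bytes, alpha):
    """Return the device with the lowest weighted cost for this partition."""
    return min(devices, key=lambda d: weighted_cost(d, flops, in_bytes, alpha))


# Illustrative numbers only: a fast, power-hungry device and a slow, frugal one.
devices = [
    Device("fast", 2e9, 5.0, 1e6, 1.0),
    Device("slow", 5e8, 1.0, 1e6, 1.0),
]
time_choice = pick_device(devices, flops=1e9, in_bytes=1e5, alpha=0.0)
energy_choice = pick_device(devices, flops=1e9, in_bytes=1e5, alpha=1.0)
```

With these made-up numbers, a time-sensitive setting selects the fast device while an energy-sensitive setting selects the slow one, which is the kind of tradeoff such a parameter is meant to expose.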