TINYML-ENABLED FRUGAL SMART OBJECTS: CHALLENGES AND OPPORTUNITIES - IOT CRAWLER
←
→
Page content transcription
If your browser does not render page correctly, please read the page content below
Feature TinyML-Enabled Frugal Smart Objects: Challenges and Opportunities Ramon Sanchez-Iborra and Antonio F. Skarmeta ©ISTOCKPHOTO.COM/MONSITJ 4 IEEE CIRCUITS AND SYSTEMS MAGAZINE 1531-636X/20©2020IEEE THIRD QUARTER 2020 Authorized licensed use limited to: Universidad Politecnica de Cartagena. Downloaded on August 18,2020 at 08:22:56 UTC from IEEE Xplore. Restrictions apply.
Frugal objects are independent “things” specialized Abstract in perform concrete operations that can be programmed The TinyML paradigm proposes to integrate Machine Learning (ML)-based mechanisms within small objects powered by Mi- with software rather than requiring specific electron- crocontroller Units (MCUs). This paves the way for the develop- ics for each task, so they can be massively produced at ment of novel applications and services that do not need the inexpensive cost. These devices incorporate enough omnipresent processing support from the cloud, which is power processing and memory resources, sensors, and con- consuming and involves data security and privacy risks. In this work, a comprehensive review of the novel TinyML ecosys- nectivity capabilities to perform the required operation, tem is provided. The related challenges and opportunities are while being highly energy efficient. A concise definition identified and the potential services that will be enabled by the of types of MCU-based devices according to their hard- development of truly smart frugal objects are discussed. As a ware resources is given in [8]. main contribution of this paper, a detailed survey of the available TinyML frameworks for integrating ML algorithms within MCUs The penetration of frugal objects in our lives due to is provided. Besides, aiming at illustrating the given discussion, the emergence of the Internet of Things (IoT) is skyrock- a real case study is presented. Concretely, we propose a multi- eting, thus provisioning them with cognitive skills opens Radio Access Network (RAT) architecture for smart frugal ob- a plethora of opportunities for developing a widespread jects. The issue of selecting the most adequate communication interface for sending sporadic messages considering both the web of collective intelligence (Fig. 1) [9]. Different para- status of the device and the characteristics of the data to be digms such as fog or edge computing are fueling this sent is addressed. To this end, several TinyML frameworks are hyper-connected distributed scheme supported by the evaluated and the performances of a number of ML algorithms new wave of communication technologies, which permit embedded in an Arduino Uno board are analyzed. The attained results reveal the validity of the TinyML approach, which suc- coordinating end-devices efforts to place intelligence cessfully enables the integration of techniques such as Neural close to the users [10]. Nevertheless, given the power Networks (NNs), Support Vector Machine (SVM), decision trees, constraints of MCUs, reducing data transmissions is cru- or Random Forest (RF) in frugal objects with constrained hard- cial for their sustainability as this type of operations are ware resources. The outcomes also show promising results in terms of algorithm’s accuracy and computation performance. more power demanding than computational tasks [11]. In this line, much data captured by sensors are wasted because of transporting costs, bandwidth limita- I. Introduction tions, or power constraints. For those reasons, perform- M achine Learning (ML) has revolutionized our ing on-device ML processing is a buzz trend today. The way of understanding the world by enabling the TinyML community [12] is boosting the integration of ML inference of valuable information and knowledge within MCU-based smart frugal objects. In fact, forecasts from huge amount of data, so far invisible to human eye. predict that the global edge computing market will reach Data are produced by our digital activity and grow at al- 1.12 trillion dollars by 2023 [13] and some strong compa- most unlimited rates [1], hence calling for large pools of nies such as Ericsson are already offering TinyML-as-a- resources in terms of processing power and storage for Service solutions [14]. In this paper, we provide a compre- their handling and analysis. In this regard, research and hensive overview of current challenges and opportunities innovation advances during the last years have been notable in multiple areas of study [2]–[5]. These efforts have focused on obtaining a better response from the wide range of available ML techniques, e.g., Deep Learn- ing (DL), Neural Networks (NNs), Reinforcement Learning (RL), clustering, etc., by exploiting the potential offered by state-of-the-art processors, data-centers, and super- computers [6]. However, a non-negligible segment of processing entities has received scarce attention due to their inherent computing constraints: Microcontroller Units (MCUs). This tiny processing entities are usually embedded in daily-use appliances such as vehicles, medi- cal devices, personal gadgets, etc., but they may also con- stitute the core processing unit of small frugal objects [7]. Digital Object Identifier 10.1109/MCAS.2020.3005467 Figure 1. Interconnected smart services. Date of current version: 12 August 2020 Ramon Sanchez-Iborra and Antonio F. Skarmeta are with the University of Murcia, Murcia, Spain, (ramonsanchez@um.es; skarmeta@um.es). THIRD QUARTER 2020 IEEE CIRCUITS AND SYSTEMS MAGAZINE 5 Authorized licensed use limited to: Universidad Politecnica de Cartagena. Downloaded on August 18,2020 at 08:22:56 UTC from IEEE Xplore. Restrictions apply.
opened by this emerging hot-topic. A dissection of the ML algorithms within a MCU powered device, and the potential services that will benefit from the introduction attained results. Finally, Section V summarizes the most of ML-powered mechanisms into frugal objects is also important findings and draws future research lines. presented. Currently available frameworks that per- mit to embed ML schemes into MCUs are explored and II. Machine Learning in Frugal Objects reviewed. Although some performance evaluations of The society digitalization achieved during recent times certain frameworks have been reported [15], [16], to the has democratized the daily use of a plethora of frugal authors’ knowledge there is not any prior work provid- objects, e.g., wearables, environmental data collectors, ing a comprehensive survey focused on these valuable actuators, etc. The integration of intelligence within tools. Besides, we elaborate on a specific use case which these small devices brings a series of advantages is the intelligent communication interface selection in a that outweigh the possible cons. In the following, we dis- constrained multi-Radio Access Technology (RAT) de- cuss these aspects aiming at exploring the possibilities vice, by considering the system status as well the char- opened by TinyML from a realistic perspective. acteristics of the data to be sent. To this end, an archi- tectural design of the proposed frugal object and a real A. Benefits on-device implementation of a ML-based decision-mak- ing mechanism are finally showed. Therefore, the main 1) Energy Efficiency contributions of this work are the following: This is the first positive point that comes to mind when adopt- ■■ A wide discussion of the TinyML concept as well ing MCU-based solutions. Whereas powerful processors and as the range of potentially benefited applications. Graphics Processing Units (GPUs) demand great amount of ■■ A survey of the principal available frameworks for power, MCUs present a reduced energy consumption, even integrating ML algorithms within MCUs. working at their maximum workload level. This permits ■■ Insights regarding the real implementation of MCUs to rely on batteries or even energy harvesting. ML-based schemes over frugal objects to solve Therefore, smart frugal objects can be placed almost every- the problem of intelligent interface selection in a where without the need of being plugged to the power grid, multi-RAT set up. which opens the door for novel cognitive itinerant applica- The rest of the paper is organized as follows. Sec- tions [17]. Furthermore, having a scarce power consumption tion II provides a comprehensive overview of the TinyML allows the smart unit to be coupled with larger battery-pow- landscape. Section III presents a survey of the available ered devices, hence converting them into connected smart frameworks for integrating ML within frugal objects. entities, e.g., personal mobility devices such as scooters or Section IV focuses on the specific use case of devices segways. Fig. 2 compares the main characteristics of devices with multiple communication interfaces, presenting the powered either by microcontroller or microprocessor. proposed architecture, the implementation process of 2) Low Cost The simplicity and hardware constraints of frugal objects Processing Power High leads to a moderate cost per unit. Clearly, this has been one of the main reasons for the quick and fruitful exploi- Simplicity Medium Memory tation of IoT architectures in a large range of segments: industry, agriculture, e-health, or entertainment, among Low others [18]. These deployments consist of a large number of frugal objects specialized on performing varied tasks. Low-Cost Storage The heterogeneity of employed end-devices, as well as their reprogrammability bring endless opportunities for enriching current deployments with intelligence, hence enabling their evolution into smart and flexible systems. Portability Multi-RAT 3) System Reliability and Data Security Energy Efficiency As stated above, wireless transmissions require much Microcontroller-Based Device energy in comparison with processing tasks. This Microprocessor-Based Device involves a real need for limiting communication activi- ties in frugal objects. Although this could be seen as a Figure 2. Feature comparison between MCU and micropro- negative aspect, it actually allows to increase system’s cessor-powered devices. reliability as well as data security and privacy. Big data 6 IEEE CIRCUITS AND SYSTEMS MAGAZINE THIRD QUARTER 2020 Authorized licensed use limited to: Universidad Politecnica de Cartagena. Downloaded on August 18,2020 at 08:22:56 UTC from IEEE Xplore. Restrictions apply.
applications require to stream raw data over unpredict- sets are suitable for straight-through ML integration, which able and lossy wireless channels from end-devices to the will permit an easy application development with the sup- cloud. This strategy is bandwidth consuming and prone port of device-level software. In this regard, it is important to transmission errors and cyber-attacks, e.g., man-in- that future firmwares could reach high reliability levels and the-middle or eavesdropping, so that the transmitted tackle key aspects such as in-device data security and op- data may be lost or compromised. These issues are timized transmission operations. In addition, cloud players avoided by performing on-device data processing which should develop enhanced platforms and APIs to allow ef- permits to perform fewer transmissions with aggregated ficient data exchanges between objects and cloud. Finally, or meaningless data for an attacker. the adaptation of mainstream data-science frameworks to MCU requirements is crucial for a wide expansion of 4) Latency TinyML. Not only programming libraries but also lifecycle This is another advantage of performing local pro- management tools are needed to simplify solution develop- cessing instead of offloading computing tasks onto the ment, data handling, etc. As discussed in Section III, a first cloud. State-of-the-art IoT RATs, e.g., Low Power Wide wave of TinyML frameworks and developing tools is already Area Network (LPWAN) [19], have notably improved available, which will foster the arrival of novel services and energy efficiency and transmission distances at the applications to the market by attracting new players. expense of reducing data-rate. This fact leads to very long message’s times-on-air that add undesired delay C. Applications to the whole operation. Besides, on-device computation Many IoT applications are already integrated in our daily efficiency of complex ML algorithms is being deeply in- personal and professional activities, although many of vestigated in order to achieve real-time execution and them are prone to be strengthened with local smart pro- notable advances are being achieved [20]. cessing capabilities (Fig. 3). Most of current IoT devices make an extensive use of network-related operations, B. Challenges which is not always the best option as discussed previ- Given the discussed advantages, integrating ML within ously. For instance, wearables such as fitness monitors or frugal objects also presents important challenges to be “smart” watches are not that smart, as they rely on a Blue- overcome. These are mostly related to device heteroge- tooth-coupled smartphone which assumes the majority neity and MCU’s constraints. of processing tasks. Integrating ML within frugal objects permits their independence from a master device, which 1) Device Heterogeneity will notably enrich current applications landscape. Regarding this aspect, the enormous range of existing devices is a notable obstacle for having a general set 1) eHealth of TinyML tools that could be valid for most of MCU- This is one of the segments with greater growth potential. based set-ups. These units may present different power Health monitoring, visual assistance, hearing aids, per- consumption, processing capabilities, memory, storage sonal sensing, activity recognition, etc., are just few ex- capacity, embedded RATs, peripherals, communication amples of applications that will exploit truly-smart end- ports and protocols, etc., which also hinders the design devices. Beyond the health-care perspective, augmented of generic benchmarking methodologies in TinyML sys- reality, real-time voice recognition, language transla- tems [21]. Therefore, the implementation of universal tion, context-aware support systems, or precise indoor/ development and benchmarking frameworks coping outdoor positioning are other wearable-based services with this heterogeneity is not an easy task although it is to come in a near future [23]. highly necessary for increasing awareness and adoption. 2) Smart Spaces 2) MCU Constraints ML-enabled frugal objects will also contribute to the So far computation and memory constraints have not been collective intelligence already present in scenarios such a limitation given the quantity and quality of available re- as smart cities or cognitive buildings, among many sources in modern workstations and data centers. Howev- others [24]. Current IoT-based monitoring and surveil- er, up-scaling computing system’s resources according to lance systems, e.g., traffic, pollution, crowdsensing, etc., the processing needs of ML mechanisms is not a valid ap- will evolve to smart and autonomous entities capable of proach for MCUs [22]. For that reason, TinyML ecosystem making decentralized and quick decisions. The deploy- players should coordinate effective efforts for achieving a ment simplicity and independence from the power grid al- functional convergence between ML and frugal objects. In low things to be placed in remote and rural areas, hence this line, MCU manufacturers should ensure that their chip- creating smart spaces with seamless roaming among them THIRD QUARTER 2020 IEEE CIRCUITS AND SYSTEMS MAGAZINE 7 Authorized licensed use limited to: Universidad Politecnica de Cartagena. Downloaded on August 18,2020 at 08:22:56 UTC from IEEE Xplore. Restrictions apply.
4) Industry 4.0 The potential impact of TinyML is enormous in industrial and manu- facturing sectors as they are expe- Smart City riencing a digitalization process under the umbrella of the Industry 4.0 revolution [28]. The processing tasks in these fields are highly het- erogeneous in terms of computing eHealth Environmental or memory requirements, which Monitoring causes that MCUs are not always ca- pable of running certain tasks. How- ever, the integration of ML-based Decision Support Systems (DSSs) within MCUs may permit them to determine whether to assume a cer- tain computation task or offloading Industry and it onto upper processing layers such Smart Home Business as edge or cloud [29]. Besides, the refinement of production process- es, decision making, asset tracking, etc. will be driven by the integration Vehicular of intelligence along the production chain as well. 5) Smart Agriculture and Farming Figure 3. TinyML ecosystem. These sectors are also under a deep digital transformation [30]. By [25]. Besides, the reduced cost of end-devices will also fos- adopting ML-based smart monitoring and control, both of ter their adoption in disadvantaged areas, which may help them will increment the efficiency and health of crops and to revamp local economy and business activities [26]. animals, which will lead to greater revenues due to the im- provement of production quality. Finally, common daily ac- 3) Vehicular Services tivities and home appliances will also reach the next level Thanks to the alliance between ML and MCUs, the vehicu- with the support of ML-enabled devices. For example, bio- lar market and, concretely, the Cooperative—Intelligent metric e-locks will be generalized as well as context-aware Transportation Systems (C-ITS) will be enriched with a seg- domotic systems and wellbeing applications. Intelligent ment of vehicles that has been ignored so far, namely, per- pet trackers and feeders, smart garden-irrigation control, sonal mobility devices. Urban mobility trends are changing or advanced safety & security home systems, are other during the last times due to the emergence of sustainable use cases that will be enabled thanks to the upcoming and healthy vehicles such as shared bikes, electric mo- TinyML revolution [31]. peds, scooters, etc. However, current vehicular networks just consider traditional road transportation means, i.e., III. Tinyml Frameworks cars, buses, trucks, etc., as their equipped On-Board Units Given the great potential of TinyML to revolutionize mul- (OBUs) require great amount of power to operate. Although tiple sectors, a number of libraries and tools for easing some efforts have been done for connecting motorcycles to the implementation of ML algorithms on constrained plat- C-ITS infrastructures [27], personal vehicles have not been forms are available in the market. TinyML frameworks considered due to their novelty and inherent constraints. can be categorized into three different families. The most However, coupling these devices with MCU-based OBUs common approach relates to the conversion of already paves the way for their integration within the C-ITS and trained models for adapting them to the restrictions of smart-city ecosystems. Therefore, light vehicles will have MCUs. Thus, these tools usually take inference engines access to basic vehicular services such as driving safety, generated by well-known ML libraries such as TensorFlow device status monitoring, route planning, etc. [32], Scikit-Learn [33], or PyTorch [34], among others, 8 IEEE CIRCUITS AND SYSTEMS MAGAZINE THIRD QUARTER 2020 Authorized licensed use limited to: Universidad Politecnica de Cartagena. Downloaded on August 18,2020 at 08:22:56 UTC from IEEE Xplore. Restrictions apply.
and port their code in order to be executed by scarce ect compatible with existing ML frameworks such as Ten- resource devices. The second approach bets on integrat- sorFlow and Caffe [57], and the ONNX format. As ARM-NN ing ML libraries within MCUs, hence providing them with does not provide support to the Cortex-M CPUs, ARM has local training and analysis capabilities. This permits to also released the Cortex Microcontroller Software Inter- generate models from data retrieved by the own device, face Standard-NN (CMSIS-NN) [42], which is a collection of which may improve the accuracy of the model and also NN kernels optimized to be executed on this processor se- enables the implementation of unsupervised learning al- ries. CMSIS-NN is able to convert floating point-based NNs gorithms. Finally, the last approach relies on the availabil- generated by popular libraries, e.g., TensorFlow, PyTorch, ity of a dedicated co-processor that supports the main and Caffe, into fixed-point format that can run on Cortex-M computing unit on ML-specific tasks. This strategy per- devices. Another electronics manufacturer that has devel- mits to enhance the computation performance, although oped specific libraries for its devices is STMicroelectron- it is the less common approach as the price and complex- ics. Concretely, the STM32 Cube.AI toolkit [43] permits ity of the processing platforms are notably increased. to integrate pre-trained NNs within STM32 ARM Cortex- Big technological companies are promoting the expan- M-based microcontrollers. It generates STM32-compatible sion of the TinyML ecosystem by offering open-source C code from NN models generated by Tensorflow and development libraries. Google has launched TensorFlow Keras [58], or in the standard ONNX format. As an inter- Lite [35], which includes a set of tools to adapt Tensor- esting feature, STM32Cube.AI permits to run large NNs by Flow models aiming at making them to be runnable in mo- storing weights and activation buffers in external flash bile and embedded devices. This framework is composed memory and RAM, respectively. of two main elements, namely, the converter, which ports Besides open-source initiatives, some institutions TensorFlow models to optimized code that can be execut- and companies have also launched licensed products. ed in constrained platforms, and the interpreter, which The Fraunhofer Institute for Microelectronic Circuits and actually runs the code generated by the converter. The Systems (IMS) has developed the Artificial Intelligence optimized models, which cover a number of algorithms for Embedded Systems (AIfES) library [44], which is a from the NN family, can be executed in a range of plat- C-based and platform-independent tool for generating forms such as smart-phones, embedded Linux and MCUs. NNs compatible with a range of MCUs, e.g., Arduino UNO, Focusing on the latter, the optimized code is provided in ATMega32U4, or STM32 F4 Series. It can also produce ef- C++ 11 and requires 32-bit processors to be run. It has ficient code for Windows in the form of Dynamic Link Li- been successfully tested in devices with ARM Cortex-M brary (DLL) and Linux platforms such as Raspberry Pi. series processors, e.g., Arduino nano, and other architec- In its current version, it is able to extracts all parameters tures such as ESP32. Given the notoriety of Arduino, a from models produced by TensorFlow or Keras. Differ- specific library for this platform is available through its ent from the frameworks reviewed above, AIfES permits IDE. Besides the development of TensorFlow Lite, another to implement the NN training process on the embedded related Google’s initiative has been the adaptation of the device, i.e., it belongs to the second family of frameworks GO programming language to microcontrollers [36]. In identified above. Cartesiam’s NanoEdge AI Studio [45] is turn, Microsoft has also contributed to the TinyML scene another commercial product which permits to create ML by releasing its open source Embedded Learning Library static libraries to embed them in Cortex-M-based MCUs. (ELL) [37]. This framework permits to design and deploy Similar to AIfES, it allows to integrate the training process pre-trained ML models onto constrained platforms, e.g., within the constrained device but, in addition, it also ARM Cortex-A and Cortex-M architectures such as Ardu- permits to run unsupervised learning algorithms on the ino, Raspberry Pi, and micro:bit. ELL can be understood MCU. As a part of this toolkit, a microcontroller emula- as an optimizing cross-compiler that runs on a regular tor is provided in order to test and debug the generated desktop computer and outputs C++ code ready to be ex- model prior to its deployment on the device. ecuted on the targeted single-board computer. ELL’s API Surprisingly, all the frameworks presented so far only can be employed either from C++ or Python and makes focus on a single type of ML algorithm, namely, NN. How- use of pre-trained NN models produced by the Microsoft ever, different initiatives released by makers, developers, Cognitive Toolkit (CNTK) [38], Darknet [39], or in Open and researchers also consider other ML techniques such Neural Network Exchange (ONNX) format [40]. as Naive Bayes classifier, decision trees, k-Nearest Neigh- ARM is also pushing for the integration of ML in their bors (k-NN), among others. MicroML [46] is a project that devices. The ARM-NN toolkit [41] is part of Linaro’s ML permits to port Support Vector Machine (SVM) and Rel- Initiative [56] and allows to integrate NN workloads within evance Vector Machine (RVM) algorithms to C code that ARM Cortex-A CPUs, ARM Mali GPUs, and Ethos Neural can run in a number of MCUs, e.g., Arduino, ESP8266, Processing Units (NPUs). ARM-NN is an open-source proj- ESP32, and others with C support. It interoperates with the THIRD QUARTER 2020 IEEE CIRCUITS AND SYSTEMS MAGAZINE 9 Authorized licensed use limited to: Universidad Politecnica de Cartagena. Downloaded on August 18,2020 at 08:22:56 UTC from IEEE Xplore. Restrictions apply.
widely-used Scikit-learn toolkit and transforms the mod- Table I summarizes the main characteristics of the els generated by this library in order to be executed on frameworks reviewed in this section. 8-bit microcontrollers with 2 kb of RAM. Also compatible with Scikit-learn, sklearn-porter [47] permits to transpile IV. Use Case: Smart Multi-Rat Device trained estimators to C, Java, JavaScript, GO, Ruby, and Most of the latest efforts related to the evolution of the PHP. Given this variety of supported languages, it is a very IoT ecosystem have focused on the development of novel complete framework as it is also compatible with a range communication technologies and protocols to provide of ML algorithms. Specifically for C language, sklearn- end-devices with ubiquitous connectivity. Thus, it is com- porter is able to port Scikit-learn models of SVM, decision mon to find MCU-based objects with a variety of integrat- trees, Random Forest (RF), and AdaBoost classifier. How- ed RATs, which present different characteristics in terms ever, the versatility of this library hinders it to generate of data-rate, latency, energy consumption, etc. As dis- microcontroller-optimized code in terms of required RAM. cussed previously, a convenient management of commu- A similar tool is m2cgen [48], which is also able to transpile nication activities is crucial for ensuring the sustainability Scikit-learn-trained models into a native code, e.g., Python, of the end-device; besides, the Quality of Service (QoS) C, Java, Go, JavaScript, Visual Basic, etc. In this case, both requirements of data traffic should be taken into account the number of compatible algorithms as well as the target as well. For those reasons, we propose the application of programming languages are even greater than those sup- ML techniques to smartly decide which is the most ad- ported by sklearn-porter. A more limited tool is Weka-por- equate communication interface for transmitting a given ter [49], which coverts decision tree classifiers generated message by considering both the status of the device and by WEKA [59] into C, Java and JavaScript codes. EmbML the characteristics of the message to be sent. Adopting a [50] is a library that permits to convert models trained by ML approach provides flexibility to the interface selection WEKA or Scikit-learn into C++ source-code files that can mechanism in the case of adding new communication in- be compiled and executed in constrained platforms. This terfaces or removing some of the existing ones. Thus, the library has been tested in Arduino and Teensy [60] boards multi-RAT frugal object is enriched with awareness capa- and it supports different ML algorithms such as decision bilities unavailable for current IoT devices. Aiming at pro- trees, SVM, and NN. viding insights regarding the design and implementation emlearn [51] is a framework that produces portable C99 processes, firstly we propose and describe a multi-RAT code from Python libraries such as Scikit-learn or Keras. smart object architecture and, then, we explore the inte- This library is compatible with generated models of a range gration process of TinyML-based algorithms on real hard- of algorithms, e.g., RF, decision trees, Naive Bayes, or linear ware in order to tackle the mentioned decision problem. models, and has been tested in different platforms, namely, AVR Atmega, ESP8266 and Linux. Focused on extending the A. Architecture compatibility of TensorFlow models, uTensor [52] and Ti- The proposed architecture is shown in Fig. 4. The core nyMLgen [53] are two libraries that produce C code ready element of the device is the MCU which gathers the pro- to be integrated within different families of MCUs. While the cessing unit as well as program and data memories. In this former is focused on Mbed boards [61], the latter exports regard, we consider constrained processing platforms as the trained model to a plain C array, hence ready to be used those described in [8]. Focusing on storage, it is usual to in a wider range of microcontrollers. find commercial devices with SD-card m odules to expand Finally, two proposals from the Academia are focused their capacity, so this is not a big limitation in current de- on optimizing the implementation of different types of NNs velopment boards. As observed in the diagram, the device on ARM Cortex-M processors. CMix-NN [54] is an open- is also equipped with a series of embedded sensors as well source mixed-precision library for quantized NNs deploy- as digital/analog pins for enabling the connection of exter- ment on MCUs. This tool supports convolutional kernels nal elements. To this end, a range of communication proto- with precision of 2, 4, or 8 bits, for any of the operands. cols are frequently employed in MCUs, e.g., Universal Asyn- In their work, authors made use of the convolutional NN chronous Receiver/Transmitter (UART), Serial Peripheral models produced by MobileNets [62]. In turn, FANN-on- Interface (SPI), or Inter-Integrated Circuit (I2C). Although MCU [55] is another open-source framework which is built displays are highly consuming, they are useful to show im- upon the Fast Artificial Neural Network (FANN) library [63] portant events or status messages. Besides, recently small aiming at running lightweight NNs on microcontrollers. e-Paper displays have arrived to the MCU market, which Concretely, this toolkit is able to process Multi-Layer Per- present insignificant energy consumption levels [65]. ceptrons (MLPs) trained with FANN and produces code The availability of multiple RATs embedded in the de- that, besides the mentioned ARM Cortex-M processor, can vice is also one of its key characteristics. WiFi or Blue- run on Parallel Ultra-Low Power Platforms (PULP) [64]. tooth are the most usual technologies as they are very 10 IEEE CIRCUITS AND SYSTEMS MAGAZINE THIRD QUARTER 2020 Authorized licensed use limited to: Universidad Politecnica de Cartagena. Downloaded on August 18,2020 at 08:22:56 UTC from IEEE Xplore. Restrictions apply.
Table I. TinyML frameworks. Interoperable Compatible Output External Publicly Framework Algorithms Platforms Languages Libraries Available Main Developer TensorFlow Neural ARM Cortex-M C++ 11 TensorFlow Yes Google Lite [35] networks ELL [37] Neural ARM Cortex-M C CNTK Yes Microsoft networks ARM Cortex-A C++ Darknet Arduino ONNX micro:bit ARM-NN [41] Neural ARM Cortex-A C TensorFlow Yes ARM networks ARM Mali Caffe Graphics ONNX Processors ARM Ethos Processor CMSIS-NN Neural ARM Cortex-M C99 TensorFlow Yes ARM [42] networks Caffe PyTorch STM 32Cube. Neural STM32 C Keras Yes STMicroelectronics AI [43] networks TensorFlow Lite Caffe ConvNetJs Lasagne AIfES [44] Neural Windows (DLL) C TensorFlow No Fraunhofer IMS networks Raspberry Pi Keras Arduino ATMega32U4 STM32 F4 Series ARM Cortex-M4 NanoEdge Unsupervised ARM Cortex-M C — No Cartesiam AI Studio [45] learning MicroMLGen SVM Arduino C Scikit-learn Yes Particular [46] RVM ESP32 developer ESP8266 sklearnporter SVM Multiple C Scikit-learn Yes Particular [47] Decision tree constrained & GO developer Random Forest non- Java Ada Boost constrained JavaScript Classifier platforms PHP k-NN Ruby Naive Bayes Neural networks m2cgen [48] Linear Multiple C Scikit-learn Yes Particular regression constrained & C# developer Logistic non- Dart regression constrained Go Neural platforms Java networks JavaScript SVM PHP Decision tree PowerShell Random Forest Python LGBM R Classifier Visual Basic (Continued ) THIRD QUARTER 2020 IEEE CIRCUITS AND SYSTEMS MAGAZINE 11 Authorized licensed use limited to: Universidad Politecnica de Cartagena. Downloaded on August 18,2020 at 08:22:56 UTC from IEEE Xplore. Restrictions apply.
Table I. TinyML frameworks. (Cotinued) Interoperable Compatible Output External Publicly Framework Algorithms Platforms Languages Libraries Available Main Developer weka-porter Decision trees Multiple C Weka Yes Particular [49] constrained & Java developer non- JavaScript constrained platforms EmbML [50] Decision trees Arduino C++ Scikit-learn No Research group Neural Teensy Weka networks SVM emlearn [51] Decision trees AVR Atmega C Keras Yes Particular Neural ESP8266 Scikit-learn developer networks Linux Naive Gaussian Bayes Random forest uTensor [52] Neural mBed boards C++11 TensorFlow Yes Particular networks developer TinyMLgen Neural ARM Cortex-M C TensorFlow Yes Particular [53] networks ESP32 Lite developer CMix-NN [54] Neural ARM Cortex-M C Mobilenet Yes Research group networks FANN-on- Neural ARM Cortex-M C FANN Yes Research group MCU [55] networks PULP TinyML Sensors Storage Inertial Weather MC + Health Memory RAT 1 ... ... GPS Display RAT ... n I/O Ports Battery APPS Figure 4. Multi-RAT smart object architecture. 12 IEEE CIRCUITS AND SYSTEMS MAGAZINE THIRD QUARTER 2020 Authorized licensed use limited to: Universidad Politecnica de Cartagena. Downloaded on August 18,2020 at 08:22:56 UTC from IEEE Xplore. Restrictions apply.
ell-known standards that can reach high data-rates. w ing the device status, several aspects are considered. The Other options based on cellular communications such as percentage of remaining battery (B) is one of the key factors 4 G/5G may be also considered. However, all these RATs to evaluate as the power consumption of both interfaces require great amount of energy to work. For that reason, clearly differs. The coverage level (C) of each RAT {c1, c2}, LPWAN-based solutions such as LoRaWAN, Sigfox, or is also taken into account in order to avoid transmissions Narrow Band—Internet of Things (NB-IoT) [19] are cur- over a lossy link. Besides, LoRaWAN makes use of license- rently being adopted by the IoT market as they provide free bands, e.g., Industrial, Scientific and Medical (ISM), so very long transmission distances with highly reduced en- the access to this wireless medium is normally limited by ergy consumption, although with the limitation of severely international regulations. Thus, we model the availability limiting the transmission rate. Given the heterogeneous of the LoRaWAN interface with the boolean feature L. We characteristics of the available RATs, it is crucial to precise- consider that NB-IoT does not present this restriction as it ly select the most adequate technology to guarantee both makes use of licensed spectrum. We also assume that there the energy efficiency of the device as well as the QoS of are not dedicated queues for each interface; instead, we the transmitted packets. At this point, TinyML comes into consider a One in—One out message-forwarding system. play with the aim of smartly supporting this decision. As This assumption is made given the memory constraints of discussed in next Section, a number of ML techniques may MCUs, which prevent them of storing great amount of data be employed to this end, thanks to the algorithm adapta- in RAM. Two message features are also evaluated in the de- tions provided by the afore-reviewed TinyML frameworks. cision process. The data size (S), in bytes, is considered as This multi-R AT smart board can be integrated LoRaWAN has a maximum payload length of 240 B. Finally, within a plethora of devices and gadgets, e.g., vehicles, the urgency level of the message (U) is also contemplated wearables, sensors, robots, etc., which will be enriched with the aim of supporting different classes of services. with intelligence and communication capabilities. These With the features identified above, the current elements are connected to the cloud or other devices system status is defined by the following vector through the available communication interfaces in or- (B, " c 1, c 2 ,, L, S, U ), which is employed for determining der to gain access to a range of next-generation services, the most adequate action A ! " a 0, a 1, a 3 , . We tackle this which are finally accessed by me a n s of e nd - u se r’s issue as a supervised multi-class classification problem. personal devices such as smartphones or tablets. To this end, a dataset has been generated to train a num- ber of algorithms that have been embedded in a real con- B. Problem Statement strained device as explained in the following. We consider the problem of intelligently selecting the most adequate communication interface in a multi-RAT set up C. Dataset when certain data is generated to be transmitted in a non- A synthetic training dataset of 10,000 samples, i.e., fea- connection-oriented fashion. Concretely, given the architec- tures vectors, has been generated and each of the ture presented above, we consider an end-device producing features composing the input vectors has received a sporadic messages and that is equipped with two commu- value randomly calculated. The values for the message nication interfaces, namely, LoRaWAN and NB-IoT. As both urgency (U) and the coverage of each RAT (c1, c2) have of them are LPWAN-based solutions, they provide long been generated by assigning them uniformly distribut- range communications with reduced power consumption, ed values in the range [0%, 100%]. The values for the which are highly valued characteristics for moving devices. device’s remaining battery (B) and the packet size (S), However, given these common general attributes, as shown have been randomly calculated in the ranges [1%, 100%] in Table II, both present unique characteristics in terms of and [50 bytes, 500 bytes], respectively, following a uni- bandwidth, uplink/downlink latency, duty-cycle, etc. There- form distribution as well. Finally, the LoRaWAN interface fore, by considering these RAT features as well as the device availability (L), which was modeled as a boolean input, status and the message characteristics, it should be decid- has received true/false values uniformly distributed. ed whether to transmit a message or not, and if so, the most Each of the 10,000 samples has been labelled with the adequate communication interface should be selected. most adequate action considering its input features. To The set of all possible ac- tions (A) is {a0, a1, a2}, where Table II. a0 represents the action of Main characteristics of RATs under consideration. dropping the message and {a1, RAT Throughput Latency RX Sensitivity Power Consumption a2} indicates sending the data LoRaWAN Low High High Low using LoRaWAN or NB-IoT in- terfaces, respectively. Regard- NB-IoT High Low Medium Medium THIRD QUARTER 2020 IEEE CIRCUITS AND SYSTEMS MAGAZINE 13 Authorized licensed use limited to: Universidad Politecnica de Cartagena. Downloaded on August 18,2020 at 08:22:56 UTC from IEEE Xplore. Restrictions apply.
this end, some hard restrictions have been established mized code in terms of memory allocation. We proved this in order to select a communication interface for sending fact by porting to C code a number of models produced by a given message (Table III). With these restrictions and Scikit-learn, namely, RF, decision trees, and SVM. In every considering the characteristics of each RAT (Table II), we case, m2cgen generated non-compilable code for Arduino have developed an automatic labelling scheme as follows. Uno due to unallowed memory and object manipulations. LoRaWAN interface is preferred for small packets with low MicroMLGen is specialized in translating Scikit- urgency, given its highly reduced power consumption. Nev- learn’s SVM models to Arduino-like platforms. It is able ertheless, it is not the best option for transmitting urgent to handle different SVM kernels such as linear, polyno- packets because of its limited throughput and the long la- mial, and Radial Basis Function (rbf). SVMs are highly tency introduced. In turn, NB-IoT interface is more suitable sensitive to the number of training samples as they de- for transporting urgent messages, given its lower latency termine the number of generated support vectors which and higher data-rate. However, its more limited sensitivity have a direct impact on the required memory. Table IV causes that its usage is not recommended with a coverage summarizes the attained results in terms of accuracy level below 10%. The same applies to the battery level, and F1-score provided by Scikit-learn as well as program when the remaining energy is below 10%, using the NB-IoT memory footprint and classification time in Arduino interface is discarded due to its higher power consump- Uno. Regarding the latter, we present the average run- tion. When one of these situations happen, short urgent ning time for 100 random input vectors as well as the packets are sent via the LoRaWAN interface if it is available confidence interval ^a = 0.05 h . The alpha SVM param- (L = true). Of course, considering the mentioned restric- eter has been fixed to 0.0001. It can be seen how the clas- tions, some combinations of the input features may lead to sifier accuracy is improved by increasing the number of the decision of dropping messages. training samples, no matter the kernel employed. The lin- ear kernel is the fastest one and shows better accuracy D. TinyML integration and evaluation than the rbf kernel, which presents the greatest memory Given the notoriety and mass adoption of Arduino Uno, footprint. Observe that both of them surpass the maxi- this board has been chosen as the testbench for evaluat- mum MCU’s program memory when employing 1000 and ing the performance of TinyML frameworks. We consider 1250 training e xamples due to the large number of gen- it as a good candidate for conducting this benchmarking erated support vectors. Finally, the polynomial kernel is due to its processing and memory constraints, as it is the most accurate one and also presents the most mod- equipped with an 8 bit processor at 16 MHz, flash mem- erate memory footprint. However, it is the slowest kernel ory of 32 KB, and 2 kB of Static RAM (SRAM). According although always under the threshold of 200 ms. to the device classification defined in [8], this platform sklearnporter has been evaluated when dealing with belongs to the most constrained type of MCUs (Class 0). SVMs as well. However, the produced C code from the Taking a look at Table I, the only publicly available Scikit-learn’s models is not optimized for constrained frameworks that can produce code for Arduino Uno devices as mentioned previously, hence it provoked an are MicroMLGen [46], sklearnporter [47], m2cgen [48], unstable behaviour of the Arduino unit. Given the incon- and emlearn [51]. The workflow followed by these tools sistent performance obtained, we could not collect reli- is similar in all of them: firstly, a selected algorithm is able outcomes to present here. trained using a Python-based ML library; thereafter, the As discussed above, the family of NN-based algo- model generated by this library is ported to C code by rithms is the one that has received greater attention the corresponding TinyML translation tool; finally, the from the TinyML community. In this case, we have produced code is integrated within an Arduino sketch evaluate the performance of the MLP implementation by importing it as an external library. provided by the emlearn framework. We have employed Regarding m2cgen, as discussed above, although this Scikit-learn for producing trained networks with differ- framework is highly versatile as it supports many algo- ent number of hidden layers and neurons per layer, us- rithms and output languages, it does not generate opti- ing the sigmoid activation function. As the number of Table III. Feature restrictions for generating the dataset. Battery Coverage (C) LoRaWAN Packet Size (S) Urgency (U) RAT Level (B) (%) (%) Availability (L) (bytes) (%) LoRaWAN >5 >5 True 10 — — — 14 IEEE CIRCUITS AND SYSTEMS MAGAZINE THIRD QUARTER 2020 Authorized licensed use limited to: Universidad Politecnica de Cartagena. Downloaded on August 18,2020 at 08:22:56 UTC from IEEE Xplore. Restrictions apply.
training samples does not impact on the weight of the for training and 20% for evaluating the produced classifier. produced model, the 80% of the dataset has been em- Observe how these algorithms capture the logic employed ployed for training and the other 20% for test. The ob- for labelling our synthetic dataset, which is evidenced by tained outcomes are shown in Table V. It can be seen the good accuracy obtained. By increasing the depth of the that in the MLP case the limiting factor is the Arduino decision tree, we improve the classifier performance up to Uno’s SRAM, as the network’s weights and biases are a perfect accuracy of 1 with decision trees of 6 levels. Re- stored there. Thus, when the number of neurons per lay- garding the RF classifier, we have evaluated one model con- er is 10, the maximum number of hidden layers for not sisting of 10 estimators (trees) and we also obtain a perfect consuming all the unit’s SRAM is 3. Regarding the algo- accuracy although the memory footprint and processing rithm’s performance, a trade-off between accuracy and time are notably greater for this scheme. Comparing both network complexity is evidenced. However, employing frameworks, sklearnporter provides faster and lighter deci- one hidden layer with 4 neurons permits an accuracy of sion trees in comparison with emlearn. However, the oppo- 0.97, which is the best figure for this simple configura- site behaviour is obtained for the RF algorithm. tion. It is clear the direct relationship between the over- In the light of the obtained results, the performance all number of neurons in the network and the process- of decision trees and RF are notably better than that ing time for each classification operation, which should of SVM and MLP for solving the intelligent communica- be also considered when selecting the most adequate tion interface selection issue under consideration. They system configuration. However, observe that the times achieve greater accuracy with reduced memory foot- employed by the MLP classifier are notably shorter than print and computation delay. Therefore, for the specific those needed by the MicroMLGen’s SVMs. The achieved case of smart multi-RAT configuration management, accuracy is also better for the MLP algorithm. this family of algorithms would be the most adequate to The adaptation of other well-known ML mechanisms be integrated within the MCU. such as decision trees and RF has been evaluated as well. To sum up, it can be concluded that great advances have To this end, we have employed the sklearn-porter and em- been made so far for integrating ML algorithms within fru- learn frameworks to port the models generated by Scikit- gal devices. These achievements were unimaginable few learn. In this case, sklearnporter produced suitable code time ago and they pave the way for the arrival of smart ap- for Arduino Uno, given that decision trees and RF are light- plications and services that will not require a continuous er than SVMs. A similar performance analysis to those processing support from the cloud. However, we are just in presented previously is shown in Table VI. The dataset the inception of this movement and more efforts are need- has been split as in the case of MLP evaluation, i.e., 80% ed for the integral development of the TinyML ecosystem. Table IV. SVM generated by MicroMLGen. Samples/Kernel Linear Polynomial rbf 250 Accuracy = 0.7816 Accuracy = 0.7992 Accuracy = 0.7355 F1 = 0.7119 F1 = 0.7643 F1 = 0.6298 Memory = 10678 B Memory = 10264 B Memory = 12172 B Time = 11.1±0.2 ms Time = 73.7±0.3 ms Time = 34.7±0.03 ms 500 Accuracy = 0.7871 Accuracy = 0.8552 Accuracy = 0.7472 F1 = 0.7161 F1 = 0.8074 F1 = 0.6661 Memory = 17782 B Memory = 15636 B Memory = 20574 B Time = 22.3±0.1 ms Time = 73.2±0.7 ms Time = 72±0.047 ms 750 Accuracy = 0.7922 Accuracy = 0.8561 Accuracy = 0.75 F1 = 0.723 F1 = 0.8347 F1 = 0.6815 Memory = 25298 B Memory = 20642 B Memory = 26644 B Time = 34.1±0.2 ms Time = 107.2±0.3 ms Time = 98.9±0.07 ms 1000 Accuracy = 0.7921 Accuracy = 0.8688 Accuracy = 0.7757 F1 = 0.7232 F1 = 0.8443 F1 = 0.7402 Memory = 30156 B Memory = 25530 B Memory = 35184 B Time = 41.8±0.2 ms Time = 140.1±0.2 ms (out-of-range) 1250 Accuracy = 0.7939 Accuracy = 0.8757 Accuracy = 0.7818 F1 = 0.725 F1 = 0.845 F1 = 0.7484 Memory = 37930 B Memory = 31968 B Memory = 44058 B (out-of-range) Time = 183.6±0.4 ms (out-of-range) THIRD QUARTER 2020 IEEE CIRCUITS AND SYSTEMS MAGAZINE 15 Authorized licensed use limited to: Universidad Politecnica de Cartagena. Downloaded on August 18,2020 at 08:22:56 UTC from IEEE Xplore. Restrictions apply.
Table V. MLP generated by emlearn. Hidden layers/ Neurons 2 4 6 8 10 1 Accuracy = 0.8835 Accuracy = 0.9735 Accuracy = 0.961 Accuracy = 0.9705 Accuracy = 0.968 F1 = 0.8393 F1 = 0.9737 F1 = 0.9605 F1 = 0.9701 F1 = 0.9679 Memory = 6750 B Memory = 6830 B Memory = 6910 B Memory = 6990 B Memory = 7070 B SRAM = 392 B SRAM = 472 B SRAM = 552 B SRAM = 648 B SRAM = 744 B Time = 1.9+0.006 Time = 2.7+0.009 Time = 3.5+0.01 Time = 4.3+0.01 Time = 4.9+0.03 ms ms ms ms ms 2 Accuracy = 0.8045 Accuracy = 0.907 Accuracy = 0.9544 Accuracy = 0.9742 Accuracy = 0.9691 F1 = 0.7319 F1 = 0.878 F1 = 0.9541 F1 = 0.9742 F1 = 0.9691 Memory = 6790 B Memory = 6926 B Memory = 7094 B Memory = 7294 B Memory = 7526 B SRAM = 430 B SRAM = 566 B SRAM = 734 B SRAM = 950 B SRAM = 1198 B Time = 2.4+0.005 Time = 3.8+0.009 Time = 5.5+0.01 Time = 7.3+0.02 Time = 9.2+0.02 ms ms ms ms ms 3 Accuracy = 0.8903 Accuracy = 0.9746 Accuracy = 0.9543 Accuracy = 0.9705 Accuracy = 0.9795 F1 = 0.8431 F1 = 0.9745 F1 = 0.9532 F1 = 0.9709 F1 = 0.9797 Memory = 6826 B Memory = 7018 B Memory = 7274 B Memory = 7594 B Memory = 7978 B SRAM = 468 B SRAM = 660 B SRAM = 916 B SRAM = 1252 B SRAM = 1652 B Time = 2.9+0.01 Time = 5+0.01 ms Time = 7.4+0.02 Time = 10.1+0.02 Time = 13.2+0.03 ms ms ms ms 4 Accuracy = 0.888 Accuracy = 0.922 Accuracy = 0.924 Accuracy = 0.945 Accuracy = 0.944 F1 = 0.8392 F1 = 0.9242 F1 = 0.9238 F1 = 0.9453 F1 = 0.9422 Memory = 6864 B Memory = 7112 B Memory = 7456 B Memory = 7896 B Memory = 8432 B SRAM = 506 B SRAM = 754 B SRAM = 1098 B SRAM = 1154 B SRAM = 2106 B Time = 3.6+0.009 Time = 6.2+0.03 Time = 9.3+0.03 Time = 13+0.06 ms (out-of-range) ms ms ms 5 Accuracy = 0.715 Accuracy = 0.882 Accuracy = 0.9285 Accuracy = 0.931 Accuracy = 0.9245 F1 = 0.6108 F1 = 0.8385 F1 = 0.9269 F1 = 0.9297 F1 = 0.9183 Memory = 6902 B Memory = 7206 B Memory = 7638 B Memory = 8198 B Memory = 8886 B SRAM = 544 B SRAM = 848 B SRAM = 1280 B SRAM = 1856 B SRAM = 2560 B Time = 3.9+0.03 Time = 7.3+0.03 Time = 11.1+0.07 Time = 15.8+0.08 (out-of-range) ms ms ms ms Table VI. Decision trees and RF generated by sklearnporter and emlearn. Library/ Decision Tree Decision Tree Decision Tree Decision Tree RF Estimators Algorithm Depth = 3 Depth = 4 Depth = 5 Depth = 6 = 10 sklearnporter Accuracy = Accuracy = Accuracy = Accuracy = 1 Accuracy = 1 0.9155 0.9155 0.9935 F1 = 1 F1 = 1 F1 = 0.8886 F1 = 0.8886 F1 = 0.9935 Memory = 3770 B Memory = 27814 B Memory = 3610 B Memory = 3670 B Memory = 3730 B Time = 48.75±5.6 Time = 252.24±7 Time = Time = Time = μs μs 40.56±0.63 μs 45.28±1.55 μs 48.32±9.48 μs emlearn Accuracy = Accuracy = Accuracy = Accuracy = 1 Accuracy = 1 0.9155 0.9155 0.9935 F1 = 1 F1 = 1 F1 = 0.8886 F1 = 0.8886 F1 = 0.9935 Memory = 40016 B Memory = 13160 B Memory = 3840 B Memory = 3904 B Memory = 3952 B Time = Time = Time = Time = Time = 104.08±3.87 μs 193.6±6.06 μs 103.28±3.87 μs 103.58±7.18 μs 103.96±83.82 μs 16 IEEE CIRCUITS AND SYSTEMS MAGAZINE THIRD QUARTER 2020 Authorized licensed use limited to: Universidad Politecnica de Cartagena. Downloaded on August 18,2020 at 08:22:56 UTC from IEEE Xplore. Restrictions apply.
V. Conclusions services, management of wireless mobile networks, green The integration of ML within IoT devices will provide them networking techniques, and IoT/M2M architectures. with a flexibility and processing capabilities never seen be- fore. The TinyML paradigm proposes to adapt advanced Antonio Skarmeta received the B.S. de- ML techniques to the constraints of MCUs in order to con- gree (Hons.) from the University of Mur- vert them in real smart objects. In this work, a comprehen- cia, Spain, the M.S. degree from the Uni- sive review and discussion of this novel and vibrating eco- versity of Granada, and the Ph.D. degree system have been presented. The related challenges and from the University of Murcia, all in com- opportunities have been identified as well as the potential puter science. He has been a Full Profes- services that will be enabled by these intelligent small el- sor with the University of Murcia, since 2009. He has been ements. The TinyML frameworks available in the market part of many EU FP projects and even coordinated some of that currently permit the integration of complex ML mecha- them. He has published more than 200 international arti- nisms within MCUs have been also reviewed and their ap- cles. His main interests include the integration of security plicability has been discussed. Finally, a realistic use case services, identity, the IoT, 5G, and smart cities. has been explored. Concretely, a multi-RAT smart object ar- chitecture has been proposed and the ML integration pro- References cess has been explained. A number of TinyML frameworks [1] L. Columbus, “10 Charts that will change your perspective of big data’s growth,” Tech. Rep., 2018. [Online]. Available: https://software have been employed for solving the problem of selecting strategiesblog.com/2018/06/02/10 - charts-that-will- change -your the most adequate communication interface considering -perspective-of-big-datas-growth/ both the status of the device and the characteristics of the [2] Y. Sun, M. Peng, Y. Zhou, Y. Huang, and S. Mao, “Application of ma- chine learning in wireless networks: Key techniques and open issues,” data to be sent. We have demonstrated that several classifi- IEEE Commun. Surveys Tut., vol. 21, no. 4, pp. 3072–3108, 2019. doi: cation algorithms can be integrated within an Arduino Uno 10.1109/COMST.2019.2924243. [Online]. Available: https://ieeexplore .ieee.org/document/8743390/ board for tackling this issue, namely, SVM, MLP, decision [3] T. Baltrusaitis, C. Ahuja, and L.-P. Morency, “Multimodal ma- trees, and RF. From the attained outcomes, we can conclude chine learning: A survey and taxonomy,” IEEE Trans. Pattern Anal. that decisions trees and RF present the best performance Mach. Intell., vol. 41, no. 2, pp. 423 – 4 43, Feb. 2019. doi: 10.1109/ TPAMI.2018.2798607. [Online]. Available: https://ieeexplore.ieee.org/ in terms of accuracy, memory footprint, and classification document/8269806/ speed. Other algorithms also showed good results in terms [4] Q. Chen et al., “A survey on an emerging area: Deep learning for of classification accuracy but their implementation should smart city data,” IEEE Trans. Emerg. Topics Comput. Intell., vol. 3, no. 5, pp. 392–410, Oct. 2019. doi: 10.1109/TETCI.2019.2907718. [Online]. Avail- be carefully addressed given the memory restrictions of able: https://ieeexplore.ieee.org/document/8704334/ the considered MCU. As future work, we plan to consider [5] G. Li, R. Gomez, K. Nakamura, and B. He, “Human-centered rein- forcement learning: A survey,” IEEE Trans. Human-Mach. Syst., vol. 49, other constrained devices in order to benchmark additional no. 4, pp. 337–349, Aug. 2019. doi: 10.1109/THMS.2019.2912447. [Online]. TinyML frameworks. Besides, other relevant aspects such Available: https://ieeexplore.ieee.org/document/8708686/ as decision safety and model correctness in sensitive fields [6] T. Luo et al., “DaDianNao: A neural network supercomputer,” IEEE Trans. Comput., vol. 66, no. 1, pp. 73–88, Jan. 2017. doi: 10.1109/TC.2016.2574353. of application, e.g., eHealth, will be considered. [Online]. Available: http://ieeexplore.ieee.org/document/7480791/ [7] B. Garbinato, R. Guerraoui, J. Hulaas, M. Monod, and J. H. Spring, “Per- vasive computing with frugal objects,” in Proc. IEEE 21st Int. Conf. Advanced Acknowledgment Information Networking and Applications Workshops (AINAW’07), 2007, pp. This work has been supported by the European Commis- 13–18. [Online]. Available: http://ieeexplore.ieee.org/document/4224076/ sion, under the projects IoTCrawler (Grant No. 779852), [8] C. Bormann, M. Ersue, and A. Keranen, Terminology for Constrained- Node Networks, IETF RFC 7228, 2014. CyberSec4Europe (Grant No. 830929) and Fed4IoT (Grant [9] M. S. Hadj Sassi, F. G. Jedidi, and L. C. Fourati, “A new architecture No. 814918), and by the Spanish Ministry of Science, In- for cognitive internet of things and big data,” Proc. Comput. Sci., vol. 159, pp. 534–543, 2019. doi: 10.1016/j.procs.2019.09.208. [Online]. Available: novation and Universities, under the project PERSEIDES https://linkinghub.elsevier.com/retrieve/pii/S1877050919313924 (Grant No. TIN2017-86885-R with ERDF funds). [10] R. Roman, J. Lopez, and M. Mambo, “Mobile edge computing, Fog et al.: A survey and analysis of security threats and challenges,” Fu- ture Gener. Comput. Syst., vol. 78, pp. 680–698, Jan. 2018. doi: 10.1016/j. Ramon Sanchez-Iborra received the future.2016.11.009. [Online]. Available: https://linkinghub.elsevier.com/ B.Sc. degree in telecommunication engi- retrieve/pii/S0167739X16305635 neering and the M.Sc. and Ph.D. degrees [11] B. Martinez, M. Monton, I. Vilajosana, and J. D. Prades, “The power of models: modeling power consumption for IoT devices,” IEEE Sensors J., in information and communication tech- vol. 15, no. 10, pp. 5777–5789, Oct. 2015. doi: 10.1109/JSEN.2015.2445094. nologies from the Technical University [Online]. Available: http://ieeexplore.ieee.org/document/7122861/ [12] TinyML. 2020. [Online]. Available: https://www.tinyml.org/ of Cartagena, in 2007, 2013, and 2016, re- [13] C. MacGillivray and M. Torchia, “Internet of Things: Spending spectively. He is currently an Assistant Professor and a trends and outlook,” Tech. Rep., 2019. [Online]. Available: https://www Researcher with the Information and Communications .idc.com/getdoc.jsp?containerId=US45161419 [14] H. Doyu, “TinyML as a Service and the challenges of machine learn- Engineering Department, University of Murcia. His main ing at the edge,” Ericsson, Tech. Rep., 2019. [Online]. Available: https:// research interests are evaluation of QoE in multimedia www.ericsson.com/en/blog/2019/12/tinyml-as-a-service THIRD QUARTER 2020 IEEE CIRCUITS AND SYSTEMS MAGAZINE 17 Authorized licensed use limited to: Universidad Politecnica de Cartagena. Downloaded on August 18,2020 at 08:22:56 UTC from IEEE Xplore. Restrictions apply.
You can also read