Post

Applying deep learning to real-time UAV-based forest monitoring: Leveraging multi-sensor imagery for improved results

Published in Expert Systems with Applications (Q1)

Check out the PDF manuscript here!

Highlights

• Real-time detection using UAVs for people and cars in forests based on deep learning.

• 4-channel object detection model with RGB and IR, for low-visibility environments.

• A new annotated and aligned image dataset with four channels (RGB and IR).

• A system architecture for transmitting images between client (UAV) and server.

• Web platform for real-time detection visualization across multiple devices.

Abstract

Rising global fire incidents necessitate effective solutions, with forest surveillance emerging as a crucial strategy. This paper proposes a complete solution that integrates visible and infrared spectrum images captured by Unmanned Aerial Vehicles (UAVs) for enhanced detection of people and vehicles in forest environments. Unlike existing computer vision models that rely on single-sensor imagery, this approach overcomes the limitations of restricted spectrum coverage, particularly addressing challenges in low-light conditions, fog, or smoke. The developed 4-channel model uses both types of images simultaneously to exploit the strengths of each. This article presents the development and implementation of a solution for forest monitoring, ranging from the transmission of images captured by a UAV to their analysis with an object detection model, without human intervention. The model is a new version of the YOLOv5 (You Only Look Once) architecture. After the model analyzes the images, the results can be observed on a web platform on any device, anywhere in the world. For model training, a dataset of thermal and visible images from the aerial perspective was captured with a UAV. The resulting 4-channel model presents a substantial increase in precision and mAP (Mean Average Precision) compared to traditional state-of-the-art (SOTA) models that use only red, green, and blue (RGB) images. Alongside the increase in precision, we confirmed the hypothesis that our model performs better in conditions unfavorable to RGB images, identifying objects in low-light, reduced-visibility situations with partial occlusions. Training the model on our dataset yielded a significant increase in performance on aerial-perspective images.
This study introduces a modular system architecture featuring key modules: multisensor image capture, transmission, processing, analysis, and results presentation. Powered by an innovative object detection deep-learning model, these components collaborate to enable real-time, efficient, and distributed forest monitoring across diverse environments.
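The core idea of the 4-channel model is to stack an aligned infrared frame onto the three RGB channels before feeding the detector (whose first convolution must then accept four input channels). A minimal sketch of that preprocessing step, with illustrative names and toy frame sizes rather than the authors' actual pipeline:

```python
import numpy as np

def make_4channel(rgb, ir):
    """Stack an aligned single-channel IR frame onto an RGB frame.

    rgb: (H, W, 3) uint8 array; ir: (H, W) uint8 array, already
    pixel-aligned with the RGB image. Returns a (H, W, 4) float32
    array in [0, 1], the shape a 4-channel detector would consume.
    """
    if rgb.shape[:2] != ir.shape:
        raise ValueError("RGB and IR frames must be aligned to the same size")
    # dstack promotes the (H, W) IR frame to (H, W, 1) and concatenates
    # along the channel axis, giving channels [R, G, B, IR].
    return np.dstack([rgb, ir]).astype(np.float32) / 255.0

# Toy 2x2 frames: all-black RGB, fully "hot" IR.
rgb = np.zeros((2, 2, 3), dtype=np.uint8)
ir = np.full((2, 2), 255, dtype=np.uint8)
x = make_4channel(rgb, ir)
print(x.shape)  # (2, 2, 4)
```

This is only the input-fusion step; adapting YOLOv5 additionally requires widening its first convolutional layer from 3 to 4 input channels.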

Graphical abstract


Introduction

It is possible to estimate that forest fires now cause 3 million more hectares of tree cover loss annually than they did in 2001 and accounted for more than a quarter of all tree cover loss over the past 20 years, according to data from a recent study by researchers at the University of Maryland (Tyukavina et al., 2022).

Fire prevention and suppression have seen a substantial increase in investment by the Portuguese government. In 2021, “316 million euros were invested in rural fire risk management (9% more than in 2020), 46% for prevention and 54% for suppression” (Lusa, 2022). However, the solutions currently used for prevention, besides being expensive, have some disadvantages, namely the need for constant observation by forest rangers and patrol cars (Minho, 2023). Furthermore, all these solutions require one or more operators 24 hours a day in various locations across the country, and they are restricted to what a human being can observe, which is a relatively small area. With this work, we present a solution that automates some of the processes of detecting potential occurrences and reduces the share of fires that are not detected in time to be prevented. This project proposes using unmanned aerial vehicles to detect people and cars in forest and rural areas at higher risk of occurrences, using artificial intelligence techniques, particularly deep learning. By constantly patrolling these areas, UAVs detect the presence of these targets with the aid of computer vision. The results of this analysis can be observed through a web platform in real time, which can assist decision-making by forest rangers and firefighters.

According to data provided by the ICNF (Institute for the Conservation of Nature and Forests) (Instituto da Conservação da Natureza e das Florestas, 2023), of the 8186 fire occurrences in 2021, 42.8% resulted from negligence, 34.3% from unknown causes, and 16% were intentional. Our solution therefore focuses mainly on preventing occurrences with these three causes: detecting people and cars in cases of negligence and intentional fires (arson), and constant monitoring in cases where the cause is unknown.

Our solution excels mainly when visibility conditions are not optimal, particularly at night and in foggy or smoky conditions. After people or cars are detected in risk areas, the responsible operator is notified and can carry out manual monitoring by taking control of the vehicle or even going on-site for inspection. Because detection performs well under unfavorable conditions, the system can be used not only for prevention but also for locating unconscious victims in fires where visibility is impeded by smoke. The main innovation of our complete solution is a novel deep-learning-based object detection system capable of operating online and in real time under these conditions; to the best of our knowledge, there is no deep-learning-based system in the literature capable of detecting objects (people and vehicles) in conditions of reduced visibility, online and in real time, using multi-spectral images (RGB and infrared) captured by UAVs.

For this project we highlight five main objectives:

• Study of existing technologies for detecting people and vehicles using UAVs.

• Capture and annotation of images in order to create our dataset adapted to the conditions of the problem: low light and reduced visibility.

• Creation and training of a 4-channel deep-learning-based object detection model for detecting people and cars, mainly in low-light and low-visibility conditions with partial occlusions, from the aerial perspective.

• Development of a system architecture for transmitting images between client (UAV) and server.

• Construction of a Web platform for real-time observation of images sent by air vehicles after passing through the detection model.

Fulfilling these objectives, we intend to achieve three main contributions: build and prepare a new annotated and aligned image dataset with four channels, RGB (Red, Green, Blue) and IR (InfraRed); propose a real-time modular system architecture; and develop a deep-learning-based system for the detection of people and vehicles using aerial images captured by UAVs with two camera sensors, also working in real time.

This paper is organized as follows. Section 2 reviews existing technologies, the evolution of the computer vision field, types of algorithms, object detection using various types of sensors, UAVs in image capturing, and existing datasets. In Section 3, we present two possible solutions for monitoring and detecting people in forest environments. For our solution, we present the corresponding modular architecture, with an image capture module, a transmission module, a processing module, an analysis module, and a results presentation module. In Section 3.6, we introduce our 4-channel object detection model. Section 4 contains the results of the initial model training with the FLIR-aligned dataset and a comparison with a standard model that uses only RGB images, the training of the final model with our dataset, and tests in real situations. Finally, in Section 5, we draw conclusions, reflect on the principal contributions of our work, and propose future work.

Section snippets

This section reviews existing research and technologies; we address several topics which are relevant for the process of object detection using UAV imagery captured using multi-sensors. We begin with Existing Technologies, where we tackle computer vision, image classification and object detection areas. Then we explore the existing datasets, and we address the work with multiple sensors. Finally, we talk about using UAVs to capture data.

Solution applying deep learning to UAV-based forest monitoring

After capturing information with the UAV, there is still a long way to go before performing detection. There are two viable solutions: the first is to use an on-site base to receive the images and perform the processing while showing the results to the responsible operator. This solution has some advantages, such as the low latency between capturing the image and making the results available, while not requiring Internet access. However, there are disadvantages associated with this solution: the need
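Either variant of the architecture needs the transmission module to delimit individual image frames on the client-server link. One common way to do this, shown here as a minimal sketch with assumed header fields (the paper does not specify the authors' actual wire format), is length-prefixed framing:

```python
import struct

# An 8-byte header in network byte order: frame id, then payload size.
# These two fields are illustrative assumptions, not the paper's protocol.
HEADER = struct.Struct("!II")

def encode_frame(frame_id: int, jpeg_bytes: bytes) -> bytes:
    """Prepend a header so the server can delimit frames on a byte stream."""
    return HEADER.pack(frame_id, len(jpeg_bytes)) + jpeg_bytes

def decode_frame(packet: bytes):
    """Split one packet back into (frame_id, jpeg_bytes)."""
    frame_id, length = HEADER.unpack_from(packet)
    payload = packet[HEADER.size:HEADER.size + length]
    return frame_id, payload

# Round-trip a fake JPEG payload.
pkt = encode_frame(7, b"\xff\xd8fake-jpeg\xff\xd9")
fid, data = decode_frame(pkt)
print(fid, data == b"\xff\xd8fake-jpeg\xff\xd9")  # 7 True
```

The length prefix is what lets the server reassemble frames correctly even when TCP delivers them in arbitrary chunk sizes.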

Experiments and results

This section addresses the training of the various developed models, their performance, and their comparison with existing models. We also discuss the advantages of a 4-channel input model versus a model that uses only images from the visible spectrum. Finally, we present tests in real situations and their results.
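The precision and mAP comparisons in this section rest on matching predicted boxes to ground truth via Intersection-over-Union (IoU); a prediction counts as correct only when IoU exceeds a threshold (commonly 0.5 for mAP@0.5). A minimal sketch of the metric:

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap rectangle: clamp to zero when the boxes do not intersect.
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# Two 10x10 boxes overlapping in a 5x5 corner: 25 / 175 ≈ 0.143.
print(iou((0, 0, 10, 10), (5, 5, 15, 15)))
```

Averaging precision over recall levels at a fixed IoU threshold, then over classes, gives the mAP figure reported for each model.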

Conclusions and future work

The solution presented in this article leverages three main contributions. Firstly, a recently acquired dataset was built and described, with aerial photos in authentic forest environments, in the visible and infrared spectra (4-channel). These images are aligned using an efficient algorithm and annotated in the YOLO format for future use. This dataset fills a gap in this area for this type of problem that arises from using unaligned multi-sensor imagery. It also reveals an implemented

CRediT authorship contribution statement

Tomás Marques: Conceptualization, Methodology, Development, Writing. Samuel Carreira: Conceptualization, Methodology, Development, Writing. Rolando Miragaia: Conceptualization, Methodology, Neural Networks advising, Guidance. João Ramos: Conceptualization, Methodology, UAVs technology advising, Guidance. António Pereira: Conceptualization, Methodology, Supervision, Project administration, Funding acquisition.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was financed by national funds through the Portuguese Foundation for Science and Technology (FCT), under the project “DBoidS - Digital twin Boids fire prevention System”, Ref. PTDC/CCI-COM/2416/2021.

This post is licensed under CC BY 4.0 by the author.