Caroline Pacheco do E.Silva (Ph.D.)'s profile

Equation Discovery for Visual Analysis

Automated Mathematical Equation Discovery for Visual Analysis


Finding the best mathematical equation to deal with the different challenges found in complex scenarios requires a thorough understanding of the scenario and a trial-and-error process carried out by experts. In recent years, state-of-the-art equation discovery methods have been widely applied in system modelling and identification. However, equation discovery approaches can also be very useful in computer vision, particularly in the field of feature extraction. In this paper, we build on recent AI advances to present a novel framework for automatically discovering equations from scratch, with little human intervention, to deal with the different challenges encountered in real-world scenarios. In addition, our proposal can reduce human bias by designing the search space with a generative network rather than by hand. As a proof of concept, the equations discovered by our framework are used to distinguish moving objects from the background in video sequences. Experimental results show the potential of the proposed approach and its effectiveness in discovering the best equation in video sequences.


Proposed Method

The proposed approach to discovering equations consists of two main components: (1) discover the best VAE to design our search space, which contains a wide variety of mathematical LBP equation structures that we may not have thought of yet; (2) search for the best LBP equation to deal with the different challenges (e.g., changes in lighting, dynamic backgrounds, camera jitter, noise, shadows, and others) faced by a background subtraction algorithm when detecting moving objects in complex scenarios. We describe each of these components below.
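
As a point of reference for what an "LBP equation" looks like, the sketch below computes the classic Local Binary Pattern code of a single 3x3 patch: each neighbour is compared with the centre pixel, and the resulting bits are weighted by powers of two. It assumes the standard 8-neighbour operator; the function name, the neighbour ordering and the sample patch are illustrative choices, and the equation structures discovered by the framework are not reproduced here.

```python
def lbp_code(patch):
    """Classic 8-neighbour LBP code for a 3x3 patch given as a list of 3 rows."""
    center = patch[1][1]
    # Neighbours visited clockwise from the top-left pixel (an assumed, fixed order).
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for p, (row, col) in enumerate(offsets):
        # Threshold the difference: the bit is 1 when the neighbour
        # is at least as bright as the centre pixel.
        if patch[row][col] - center >= 0:
            code += 2 ** p
    return code


patch = [[6, 5, 2],
         [7, 6, 1],
         [9, 8, 7]]
print(lbp_code(patch))  # 241
```

The subtraction inside the threshold is the kind of arithmetic operator that the second component of the framework is described as mutating.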

Search Space Design by VAE to Generate Equation Structures

Given a set of parameters denoted by S, the Multi-Objective Covariance Matrix Adaptation Evolution Strategy (MO-CMA-ES) is responsible for generating a set of optimal parameters to train different VAEs, expressed by Ψ = {ψ1, ψ2, ..., ψL}, where each ψ represents a Variational Autoencoder (VAE) and L is the maximum number of elements ψ ∈ Ψ. The ψ ∈ Ψ whose parameters give the highest performance is then selected and used to design the search space E. A brief overview of this first step of our proposed framework is illustrated in Figure 1 below.
Figure 1: A brief overview of the first step of our proposed framework.
Search Strategy by MO-CMA-ES to Discover the Best Equation

The search starts from a set of LBP equations E = {ε1, ε2, ..., εK}, where each ε expresses an equation structure and K is a user parameter that determines the number of elements ε ∈ E.
Initially, the MO-CMA-ES algorithm seeks the best LBP equation by mutating the arithmetic operators of each equation ε ∈ E, resulting in a new set of mutated equations E = {ε1, ε2, ..., εK}. The performance of each LBP equation is estimated by running a background subtraction algorithm that distinguishes the moving objects from the background in a set of videos. Finally, the ε with the maximum accuracy is selected as the best equation structure. Figure 2 below gives a brief illustration of this step of our framework.
Figure 2: A brief overview of the second step of our proposed framework.
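
The mutate-and-evaluate loop of this second step can be sketched in a few lines. This is a deliberately simplified stand-in, not MO-CMA-ES: it uses a plain "keep the child if it scores better" rule. The operator set, the tuple encoding of an equation structure and `toy_fitness` are all hypothetical; in the framework the score would come from running a background subtraction algorithm on a set of videos.

```python
import random

# Assumed operator set; an equation structure is encoded as a tuple of
# operator symbols filling the slots of a fixed template (illustrative only).
OPERATORS = ["+", "-", "*"]


def mutate(equation, rng):
    """Return a copy of `equation` with exactly one operator slot changed."""
    slot = rng.randrange(len(equation))
    choices = [op for op in OPERATORS if op != equation[slot]]
    mutated = list(equation)
    mutated[slot] = rng.choice(choices)
    return tuple(mutated)


def search(equations, fitness, generations, rng):
    """Mutate every equation once per generation; keep a child only if it scores higher."""
    population = list(equations)
    for _ in range(generations):
        for i, parent in enumerate(population):
            child = mutate(parent, rng)
            if fitness(child) > fitness(parent):
                population[i] = child
    # The equation with the highest score is returned as the best structure.
    return max(population, key=fitness)


def toy_fitness(equation):
    """Hypothetical score: number of slots matching an arbitrary target structure."""
    target = ("-", "+", "*")
    return sum(a == b for a, b in zip(equation, target))


rng = random.Random(0)
initial = [("+", "+", "+"), ("*", "-", "-")]
best = search(initial, toy_fitness, generations=50, rng=rng)
print(best, toy_fitness(best))
```

Because a child replaces its parent only when it scores strictly higher, the best score in the population never decreases from one generation to the next.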
Experimental Results

We present the visual results on individual frames from six different scenes of the CDnet-2014 data set: 'peopleInShade' (frame #318), 'snowFall' (frame #2758), 'canoe' (frame #904), 'busStation' (frame #300), 'skating' (frame #1845) and 'fall' (frame #3987). Figures 3 and 4 show the foreground detection results using the Texture BGS and the proposed method, respectively. The results are shown without any post-processing.
Figure 3: Background subtraction results using the CDnet-2014 data set. From top to bottom: (a) Original frame, (b) Ground truth, (c) Texture BGS method, (d) Proposed method, and (e) LBP texture. The true positives (TP) pixels are in white, the true negatives (TN) pixels are in black, the false positives (FP) pixels are in red, and the false negatives (FN) pixels are in green.
Figure 4: Background subtraction results using the CDnet-2014 data set. From top to bottom: (a) Original frame, (b) Ground truth, (c) Texture BGS method, (d) Proposed method, and (e) LBP texture. The true positives (TP) pixels are in white, the true negatives (TN) pixels are in black, the false positives (FP) pixels are in red, and the false negatives (FN) pixels are in green.
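
The colour coding used in the captions above can be reproduced from a predicted foreground mask and a ground-truth mask, as in the following minimal sketch. It assumes both masks are binary (1 for a moving-object pixel, 0 for background); any additional ground-truth labels are ignored here, and the function names and sample masks are illustrative.

```python
# RGB colours taken from the figure captions: TP white, TN black, FP red, FN green.
COLOURS = {
    "TP": (255, 255, 255),
    "TN": (0, 0, 0),
    "FP": (255, 0, 0),
    "FN": (0, 255, 0),
}


def outcome(predicted, truth):
    """Label one pixel from its predicted and ground-truth foreground flags."""
    if predicted and truth:
        return "TP"
    if predicted:
        return "FP"
    if truth:
        return "FN"
    return "TN"


def colour_map(predicted_mask, truth_mask):
    """Build an RGB image (nested lists) that colours every pixel by its outcome."""
    return [
        [COLOURS[outcome(p, t)] for p, t in zip(pred_row, truth_row)]
        for pred_row, truth_row in zip(predicted_mask, truth_mask)
    ]


def count_outcomes(predicted_mask, truth_mask):
    """Count TP, TN, FP and FN pixels over the whole frame."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for pred_row, truth_row in zip(predicted_mask, truth_mask):
        for p, t in zip(pred_row, truth_row):
            counts[outcome(p, t)] += 1
    return counts


predicted = [[1, 1, 0],
             [0, 0, 0]]
truth = [[1, 0, 0],
         [0, 1, 0]]
print(count_outcomes(predicted, truth))  # {'TP': 1, 'TN': 3, 'FP': 1, 'FN': 1}
```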

Table 1 shows the proposed framework evaluated on the six scenes. The best scores are in bold. The proposed approach achieved the best scores for 'peopleInShade', 'snowFall', 'canoe', 'busStation' and 'skating', while it obtained the worst score for the 'fall' scene. However, the results of our proposed method could be improved by conducting an exhaustive search to find the best LBP equation for this scene.
Table 1: Performance using the CDnet-2014 data set

Publications

2021 - Pacheco do Espírito Silva, C., De Souza, J. M. F., Vacavant, A., Bouwmans, T. and Cordolino Sobral, A. “Automated Mathematical Equation Structure Discovery for Visual Analysis”. Pattern. ArXiv, 2104.08633, 2021 [PDF] [CODE] (submitted to the Journal of Machine Learning Research (JMLR))