# Multi-scale-average-filter-assisted level set segmentation model with local region restoration achievements

One of the most ambitious and challenging tasks in image segmentation is handling intensity inhomogeneity. Imperfections in acquisition, caused for instance by artificial illumination or non-uniform daylight, lead to image inhomogeneity. Intensity inhomogeneity strongly affects segmentation accuracy because the intensities of background and foreground overlap. In the last decade, many promising algorithms and methods were introduced to tackle this problem^{15,16,17,18,19,20}. However, all those methods have limitations: they cannot handle severe intensity inhomogeneity and work only for images with specific properties^{21,22}. For a better understanding of their limitations, we briefly review and comment on some state-of-the-art approaches and techniques. In addition to intensity-based approaches, we discuss the widely used deep learning-based methods and well-known techniques for tackling low-level computer vision problems, for instance de-noising and artifact removal.

The main image segmentation techniques can be classified as: (i) edge-based segmentation approaches; and (ii) region-based segmentation techniques. The edge-based models^{13,23,24} incorporate edge detector functions that drive the active contour towards object boundaries. Such functions rely on the gradient of the image data. The region-based models^{11,16,17} utilize region information, such as variance and mean, to move the contour towards the object's boundary. Because edge-based models depend on local gradient information, they perform poorly in noisy images and miss objects with weak or diluted boundaries. Region-based methods, on the other hand, are unable to handle intensity inhomogeneity, since intensity inhomogeneity is a local property of images rather than a global one. One of the benefits of region-based techniques is that they are potentially less sensitive to noise and outliers, so their segmentation results are better in noisy images. In fact, the majority of region-based models are approximations to the milestone Mumford–Shah (MS) energy functional^{12}. Among them, the active contour without edges proposed by Chan and Vese (CV)^{11,12} gained much popularity in the literature due to its simple implementation. The CV energy functional is given in Eq. (2):

$$\begin{aligned} F^{CV}(c_1,c_2,\Gamma ) =\; & \mu \, Length(\Gamma ) \nonumber \\ & +\lambda _1\int _{inside(\Gamma )}|u_0(\mathbf{x})-c_1|^2\,d\mathbf{x} \nonumber \\ & +\lambda _2\int _{outside(\Gamma )}|u_0(\mathbf{x})-c_2|^2\,d\mathbf{x}, \end{aligned}$$

(2)

where \(u_0(\mathbf{x})\) is the given image, \(\Gamma\) denotes the smooth segmenting curve, \(\mu\), \(\lambda_1\) and \(\lambda_2\) are positive parameters (to be tuned accordingly), and \(c_1\) and \(c_2\) are the mean intensities of \(u_0(\mathbf{x})\) inside and outside of \(\Gamma\), respectively. Although the CV model is commonly used and gives promising results under additive Gaussian noise, its limitations are easily observed when the image suffers from intensity inhomogeneity^{25,26}. This drawback is due to its reliance on the global information of images and its neglect of local feature information^{27}. To enhance the CV model for inhomogeneous image segmentation, the Local Binary Fitting (LBF) model^{16} was introduced. The LBF model employs a kernel function to capture the local intensity statistics of the image and embeds this information into a region-based active contour model in a level set formulation^{28,29}. The LBF energy functional is given in Eq. (3):

$$\begin{aligned} F^{LBF}(\Gamma , g_1, g_2) =\; & \lambda _1\int _{\Omega }\int _{inside(\Gamma )}K_{\sigma }(\mathbf{x}-\mathbf{y})|u_0(\mathbf{y})-g_1(\mathbf{x})|^2\,d\mathbf{y}\,d\mathbf{x} \nonumber \\ & +\lambda _2\int _{\Omega }\int _{outside(\Gamma )}K_{\sigma }(\mathbf{x}-\mathbf{y})|u_0(\mathbf{y})-g_2(\mathbf{x})|^2\,d\mathbf{y}\,d\mathbf{x}, \end{aligned}$$

(3)

where \(\lambda_1\) and \(\lambda_2\) are positive constants and \(K_{\sigma}\) is the Gaussian kernel with standard deviation \(\sigma\). The functions \(g_1\) and \(g_2\) are two smooth functions that approximate the local intensity statistics of the image inside and outside of \(\Gamma\), respectively. Although the LBF model can cope with intensity inhomogeneity, it is very sensitive to the initial contour: changes in the initialization can lead the LBF model to produce undesirable segmentation results. Therefore, to further improve the segmentation of intensity inhomogeneous images and to correct the bias field, Li et al.^{30} suggested a new region-based variational model^{31,32}. The authors in Ref.^{30} defined an objective function for K-means clustering that is weighted, in a neighborhood of every point, with the cluster centers and includes a multiplicative component that estimates the bias within the neighborhood. This function is then integrated over the whole domain and embedded into a level set formulation. Even though the method of Li et al.^{30} outperforms the earlier ones, it still cannot deal with strong image inhomogeneity, since, like the other methods, it is based on the assumption that every intensity inhomogeneous image is homogeneous within a small region. Another problem with these methods is that the scale of the homogeneous region cannot be predicted; when the inhomogeneity is severe, tuning the scale for inhomogeneous regions may produce undesired results. Taking these problems into account, Wang et al.^{20} suggested a multi-scale local (MSL), region-oriented model for segmentation of intensity inhomogeneous images.
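Under the definitions above, the CV fitting constants \(c_1, c_2\) of Eq. (2) and the LBF local fitting functions \(g_1, g_2\) of Eq. (3) can be sketched in a few lines. In this minimal numpy sketch, `inside` is a hypothetical boolean mask standing in for the region enclosed by \(\Gamma\), and a truncated separable Gaussian stands in for \(K_\sigma\):

```python
import numpy as np

def cv_constants(u0, inside):
    # Chan-Vese fitting constants (Eq. 2): region means inside/outside Gamma
    c1 = u0[inside].mean()
    c2 = u0[~inside].mean()
    return c1, c2

def _gauss_blur(img, sigma):
    # separable Gaussian convolution K_sigma, truncated at 3*sigma
    r = max(1, int(3 * sigma))
    x = np.arange(-r, r + 1)
    k = np.exp(-x ** 2 / (2 * sigma ** 2))
    k /= k.sum()
    out = np.apply_along_axis(lambda m: np.convolve(m, k, mode="same"), 0, img)
    return np.apply_along_axis(lambda m: np.convolve(m, k, mode="same"), 1, out)

def lbf_fitting(u0, inside, sigma=3.0):
    # LBF local fitting functions (Eq. 3): the minimizing g1, g2 are
    # kernel-weighted local averages of u0 over each region
    M = inside.astype(float)
    eps = 1e-12                       # avoid division by zero far from a region
    g1 = _gauss_blur(u0 * M, sigma) / (_gauss_blur(M, sigma) + eps)
    g2 = _gauss_blur(u0 * (1.0 - M), sigma) / (_gauss_blur(1.0 - M, sigma) + eps)
    return g1, g2
```

In contrast to the global constants \(c_1, c_2\), the functions \(g_1, g_2\) vary over the image, which is what allows the LBF model to follow a spatially varying bias.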
Assuming that the desired clean image \(u(\mathbf{x})\) is corrupted by additive noise \(\eta(\mathbf{x})\) and intensity inhomogeneity \(\varphi(\mathbf{x})\), the observed image \(u_0(\mathbf{x})\) is described by Eq. (4):

$$\begin{aligned} u_0(\mathbf{x})= \varphi (\mathbf{x})\, u(\mathbf{x})+\eta (\mathbf{x}). \end{aligned}$$

(4)

A generally accepted assumption is that the intensity inhomogeneity is a slowly varying component over the entire image and is constant within a small local region. The goal is to recover the clean image \(u(\mathbf{x})\), which is corrupted by both noise and intensity inhomogeneity. To achieve this, the MSL model defines a circular local region for capturing local statistics and then evaluates these local circular regions for every pixel using multi-scale low-pass filtering. Setting \(\hat{u}(\mathbf{x})=\varphi(\mathbf{x})\,u(\mathbf{x})\), we have the following relationship:

$$\begin{aligned} u_0(\mathbf{x})=\hat{u}(\mathbf{x})+\eta (\mathbf{x}). \end{aligned}$$

(5)

Applying Eq. (1), it is easy to recover \(\hat{u}(\mathbf{x})\) and then treat it as a given image that suffers only from intensity inhomogeneity and is free of noise. Thus, the problem reduces to finding \(u(\mathbf{x})\) from \(\hat{u}(\mathbf{x})=\varphi(\mathbf{x})\,u(\mathbf{x})\), where \(\varphi(\mathbf{x})\) is the intensity inhomogeneity. Applying the logarithmic transformation, we obtain:

$$\begin{aligned} \log (\hat{u}(\mathbf{x}))=\log (\varphi (\mathbf{x}))+\log (u(\mathbf{x})). \end{aligned}$$

(6)

As both the inhomogeneity layer \(\varphi(\mathbf{x})\) and the clean image \(u(\mathbf{x})\) are unknown, directly recovering \(u(\mathbf{x})\) from Eq. (6) is impossible. To overcome this difficulty, Wang et al.^{20} suggested a multi-scale average filter. Local circular regions are used to make the model more adaptable in capturing intensity information around a given pixel. To examine the local circular region at each center pixel \(\mathbf{x}\) of the given image \(\hat{u}\), the multi-scale average filter is designed as illustrated in Eq. (7):

$$\begin{aligned} MSF_{i}(\mathbf{x})=\frac{1}{n}\sum _{\mathbf{y}\in F_{\mathbf{x},i}}\hat{u}(\mathbf{y}), \end{aligned}$$

(7)

where the subscript *i* is the radius of the local circular region, which can also be regarded as a scale parameter, and *n* denotes the number of pixels within the local circular region \(F_{\mathbf{x},i}\) with center \(\mathbf{x}\), defined by Eq. (8):

$$\begin{aligned} F_{\mathbf{x},i}=\left\{ \mathbf{y}: \sqrt{(\mathbf{y}_1-\mathbf{x}_1)^2+(\mathbf{y}_2-\mathbf{x}_2)^2} \le i\right\} . \end{aligned}$$

(8)
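A direct implementation of Eqs. (7) and (8) can be sketched as follows. This is a minimal numpy version; edge-replication padding at the image border is our assumption, since the boundary handling is not specified above:

```python
import numpy as np

def msf(u_hat, i):
    # MSF_i (Eq. 7): mean of u_hat over the circular region F_{x,i} (Eq. 8)
    # for every centre pixel x, via an explicit shift-and-accumulate loop
    h, w = u_hat.shape
    pad = np.pad(u_hat.astype(float), i, mode="edge")  # assumed boundary rule
    acc = np.zeros((h, w))
    n = 0
    for dy in range(-i, i + 1):
        for dx in range(-i, i + 1):
            if dy * dy + dx * dx <= i * i:   # keep only offsets within radius i
                acc += pad[i + dy:i + dy + h, i + dx:i + dx + w]
                n += 1
    return acc / n                            # n = |F_{x,i}|
```

For \(i=1\) the circular region contains \(n=5\) pixels (the centre and its 4-neighbours), and a constant image is left unchanged by the filter.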

Furthermore, \(M_k(\mathbf{x})\) is taken to be the mean of the multi-scale average filters over all scales, as defined in Eq. (9):

$$\begin{aligned} M_k(\mathbf{x})=\frac{1}{k}\sum _{i=1}^k MSF_{i}(\mathbf{x}), \end{aligned}$$

(9)

where *k* represents the total number of scales and needs to be tuned according to the image. If the value of *k* is too small, only a few circular regions are examined for every center pixel, which may lead to an unfavorable result. If, on the other hand, *k* is too large, the computational cost increases, since too many local circular regions are considered for every center pixel. Replacing \(\varphi(\mathbf{x})\) in Eq. (6) by \(M_k(\mathbf{x})\), we get the following relationship:

$$\begin{aligned} \log (\bar{u}(\mathbf{x}))=\log (\hat{u}(\mathbf{x}))-\log (M_k(\mathbf{x}))+\log (M_N). \end{aligned}$$

(10)

Note that \(\bar{u}\) is an approximation of the clean, inhomogeneity-free image *u*, and \(M_N\) is a normalization constant that preserves the mean intensity of \(\bar{u}\). Furthermore, Eq. (10) can be represented in an equivalent form to decrease the computational cost:

$$\begin{aligned} \bar{u}(\mathbf{x})=\hat{u}(\mathbf{x})\,M_N/M_k(\mathbf{x}). \end{aligned}$$

(11)

Equation (11) represents an approximation of the inhomogeneity-free image and shows that \(\bar{u}\) can be obtained by dividing \(\hat{u}(\mathbf{x})\,M_N\) by the multi-scale intensity information \(M_k(\mathbf{x})\). We call this the dual filter formulation, as the image is filtered twice and then divided by its average. To illustrate the dual filter formulation, we show experimental results for a gray-scale synthetic inhomogeneous image and for a color image of a plane with relatively high brightness in the background, as shown in Fig. 1. The dual filter is applied to these two test images with *k* values 10, 20 and 30, respectively, as shown in Figs. 2 and 3.
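Putting Eqs. (7)-(9) and (11) together, the dual filter can be sketched as a self-contained numpy routine. Edge-replication padding and taking \(M_N\) as the global mean of \(\hat{u}\) are our assumptions:

```python
import numpy as np

def dual_filter(u_hat, k, M_N=None):
    # dual filter formulation (Eq. 11): u_bar = u_hat * M_N / M_k
    u_hat = u_hat.astype(float)
    h, w = u_hat.shape
    Mk = np.zeros((h, w))
    for i in range(1, k + 1):                 # Eq. (9): average over scales
        pad = np.pad(u_hat, i, mode="edge")   # assumed boundary rule
        acc = np.zeros((h, w))
        n = 0
        for dy in range(-i, i + 1):
            for dx in range(-i, i + 1):
                if dy * dy + dx * dx <= i * i:   # circular region, Eq. (8)
                    acc += pad[i + dy:i + dy + h, i + dx:i + dx + w]
                    n += 1
        Mk += acc / n                          # MSF_i, Eq. (7)
    Mk /= k
    if M_N is None:
        M_N = u_hat.mean()                     # preserves the mean intensity
    return u_hat * M_N / Mk                    # Eq. (11)
```

Applied to a smooth intensity ramp, which mimics a slowly varying bias, the output is noticeably flatter than the input, which is the intended correction effect.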

From Fig. 2, first row, third column, it is clear that the intensity inhomogeneity is almost removed, but at the same time the edges are diffused and a region around the edges becomes darker, which may cause an unsatisfactory segmentation result. When the value of *k* increases from \(k=10\) to \(k=20\) and then \(k=30\), the inhomogeneity is still removed for higher values of *k*, as is clear from Fig. 2a,c, while the edges are not affected and the region around the edges is no longer damaged. Figure 3 demonstrates the results of the dual filter formulation on a real-world color image of a plane, whose background is very bright due to sunlight, which can make segmentation difficult. Using the filtered image instead of the original one makes the segmentation task easier and more efficient, as the filtered images are clearer than the original. Note, however, that the scale parameter *k* plays a vital role in the dual filter formulation, as shown in the last column of Fig. 3. As the value of *k* increases, more content appears in the resulting image, but the computational cost also increases. Through experiments, we found that *k* can vary from 5 to 35, and a default of \(k=30\) is most appropriate.

Alternatively, adaptively regularized kernel-based techniques have been used to bring local information into the segmentation fitting term. For instance, Elazab et al.^{34} proposed an adaptively regularized kernel-based fuzzy *C*-means clustering (ARKFCM) framework. The suggested framework targets the segmentation of brain MR images and inhomogeneous datasets, with the energy function given in Eq. (12):

$$\begin{aligned} F_{ARKFCM} =\; & 2\sum _{i=1}^N\sum _{j=1}^c u_{ij}^m(1-K(x_i,v_j)) \nonumber \\ & +2\sum _{i=1}^N\sum _{j=1}^c\varphi _i\, u_{ij}^m(1-K(\hat{x}_i,v_j)), \end{aligned}$$

(12)

where \(x_i\), \(i=1:N\), are the image gray levels in *k*-dimensional space, \(v_j\), \(j=1:c\), are the cluster centers, \(u_{ij}\) is the membership value of pixel *i* in the \(j\text{-th}\) cluster, and *K* is the Gaussian radial basis function. Within this framework, three different algorithms have been suggested: (i) the local average gray level is replaced by the gray level of the mean filter (ARKFCM\(_a\)), (ii) the median filter (ARKFCM\(_m\)), or (iii) a devised weighted image (ARKFCM\(_w\)). All these algorithms exploit the heterogeneity of gray levels in the pixel neighborhood and employ this measure as local contextual information. This is achieved by replacing the standard Euclidean distance with the Gaussian radial basis kernel function. The ARKFCM framework is parameter-free, which is one of its main advantages, and it also has promising results for noisy images. However, its limitations can be observed on images with intensity inhomogeneity, which commonly occurs in MR images, as will be shown later in Fig. 10.
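The kernel substitution at the heart of Eq. (12) is easy to illustrate. The sketch below is an illustration only; it omits ARKFCM's adaptive regularization term \(\varphi_i\) and the filtered image \(\hat{x}_i\), and computes the kernel-induced distance \(1-K(x,v)\) together with the resulting memberships, using the standard fuzzy C-means update with the Euclidean distance replaced by the kernel distance:

```python
import numpy as np

def kernel_distance(x, v, sigma=1.0):
    # kernel-induced distance 1 - K(x, v), K a Gaussian radial basis function
    x = np.asarray(x, dtype=float)
    return 1.0 - np.exp(-(x - v) ** 2 / (2.0 * sigma ** 2))

def memberships(x, centers, m=2.0, sigma=1.0):
    # fuzzy memberships u_ij: standard FCM update with the Euclidean
    # distance replaced by the kernel-induced distance (sketch only)
    d = np.array([kernel_distance(x, v, sigma) for v in centers])
    d = np.maximum(d, 1e-12)            # avoid division by zero at d == 0
    w = d ** (-1.0 / (m - 1.0))
    return w / w.sum(axis=0)            # normalize so memberships sum to 1
```

A pixel coinciding with a cluster centre receives a membership close to one for that cluster, and the memberships of each pixel always sum to one.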

Recently, Cai et al.^{14} suggested a variational framework for image segmentation that takes advantage of image restoration techniques. This work established a link between image segmentation and image restoration; in particular, Cai et al.^{35} proved that the solution of the CV model^{11} can be obtained by thresholding the minimizer of the ROF model^{1}. The energy functional of the Cai et al. model^{14} is based on two data fitting terms: (i) one for image restoration, and (ii) one for image segmentation. The functional is given in Eq. (13):

$$\begin{aligned} F^{Cai}(u,c_i,v_i) =\; & \mu \int _{\Omega }(u_0-\mathcal {O}u)^2\,dx\,dy \nonumber \\ & +\lambda \sum _{i=1}^K\int _{\Omega }(u-c_i)^2v_i\,dx\,dy \nonumber \\ & +\sum _{i=1}^K\int _{\Omega }|\nabla v_i|\,dx\,dy, \end{aligned}$$

(13)

where \(v_i\), with \(\sum_{i=1}^K v_i(\mathbf{x})=1\) and \(v_i(\mathbf{x})\in \{0,1\}\), is a fuzzy membership function, and \(\mathcal{O}\) is a blurring operator when blur is observed in the image and the identity operator for a purely noisy observed image. The blurring operator \(\mathcal{O}\) can be computed using various image de-blurring methods and techniques, as suggested in Refs.^{36,37,38}. The Cai et al. model^{14} can efficiently segment images that are corrupted with heavy noise, blur, and/or missing pixels; however, its limitations can still be observed on intensity inhomogeneous images, which is the main shortcoming of the model. This drawback arises because the approach uses only global image information and ignores the local one. This issue can be addressed by integrating machine learning approaches into image segmentation methods. In the next section, we discuss how machine learning based methods can be integrated into existing approaches and used in the field of image segmentation.
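The restoration-then-segmentation idea behind Eq. (13) can be illustrated for the two-phase case (\(K=2\)): first restore the image, then threshold the restored result. The sketch below is an illustration only; repeated 3x3 box smoothing is used as a crude stand-in for the actual ROF/Cai minimizer, and thresholding at the midpoint of the restored intensity range is our assumption:

```python
import numpy as np

def segment_by_restoration(u0, threshold=None, iters=3):
    # two-phase restore-then-threshold sketch in the spirit of Cai et al.
    u = u0.astype(float)
    h, w = u.shape
    for _ in range(iters):
        # 3x3 box smoothing with edge-replication padding (the "restoration")
        pad = np.pad(u, 1, mode="edge")
        u = sum(pad[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
                for dy in (-1, 0, 1) for dx in (-1, 0, 1)) / 9.0
    if threshold is None:
        # threshold at the midpoint of the restored intensity range
        threshold = 0.5 * (u.min() + u.max())
    return (u >= threshold).astype(int)       # binary membership v
```

The smoothing stage removes isolated outliers before the thresholding stage, which mirrors the division of labour between the restoration and segmentation terms of Eq. (13).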