1 Introduction

Texture is among the most fundamental cues on which most living organisms base their visual cognition, and it is a key component of computer vision systems [11]. In a broad sense, every digital image can be regarded as texture. Texture analysis has been applied to many visual problems, such as material categorization, surface inspection, medical image analysis, object recognition, image segmentation, pedestrian detection and face analysis.

Over the years, many texture descriptors have been proposed [12, 20, 22, 28]. Among them, local pattern descriptors have achieved good performance in most texture applications [3, 15, 18]. In particular, the Local Binary Pattern (LBP) is an efficient descriptor of local structure [18]. LBP descriptors offer strong discriminative capability, low computational complexity, and low sensitivity to illumination variation. To further improve the discriminative power of LBP, a large number of variants have been proposed [14]. Most of them pursue one of the following three directions.

The first is to use different forms of information from the original textures. Guo et al. proposed Complete LBP, which uses both the sign and the magnitude information of the local neighborhood [9]. Other methods concentrate on local derivative information with respect to a local region, such as LDP [30], CLDP [29], LDDP [8] and POEM [27]. The second is rotation invariance, an important topic in texture classification. Many methods have been proposed to achieve it, such as SRP [13, 24] and SIFT [15]. The third is feature selection. The exponential growth of the number of features with the patch size is a limitation of the traditional LBP. The uniform LBP descriptor proposed by Ojala et al. [18] was the first attempt to solve this problem.

The main contributions of the paper are threefold. First, we propose an Affine-Gradient based method to describe texture information. The Affine-Gradient (AG) has some properties that the Euclidean-Gradient (EG) lacks, which are elaborated in the following sections. Second, we propose an improved method for determining the local reference direction to achieve rotation invariance; it is fast to compute and effective under rotation transformations. Finally, we propose a simple but effective feature selection method that considers both the distribution of patterns and the intraclass variance on the training datasets. Experiments show that the proposed feature selection method not only increases the discriminative power but also reduces the dimension of the descriptor effectively.

2 Affine-Gradient Based Local Pattern Descriptor

In this section we describe our approach in detail. First, we give a brief review of LBP. Second, we discuss how to make full use of multiple kinds of information, especially the Affine-Gradient (AG), for texture classification, and we discuss the properties of AG in detail. Then we present the proposed method for achieving rotation invariance. Finally, the criteria for feature selection are discussed.

2.1 Overview of LBP Method

The traditional LBP operator extracts information that is invariant to local gray-scale variations in the image. It is computed at each pixel location from the values of a small circular neighborhood around the central pixel, whose gray value is \(g_c\). The LBP is defined as follows:

$$\begin{aligned} LBP_{R,P}=\sum _{p=0}^{P-1}s(g_p-g_c)\cdot 2^p \quad \quad s(x)={\left\{ \begin{array}{ll} 1, x\ge 0 \\ 0, x<0 \end{array}\right. } \end{aligned}$$
(1)

where \(g_c\) is the gray value of the central pixel and \(g_p\) are the gray values of its neighbors, p is the index of the neighbor, R is the radius of the circular neighborhood and P is the number of pixels in the neighborhood. The histogram of these patterns is then used to describe the texture of the image.
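As an illustration of Eq. (1), the following minimal Python sketch computes the LBP code of one pixel and accumulates a histogram of codes. The function names and the sample gray values are hypothetical, and the sketch assumes the P neighbor values have already been sampled on the circle of radius R (the interpolation step is omitted).

```python
def lbp_code(center, neighbors):
    """Eq. (1): threshold each neighbor against the center, weight bit p by 2**p."""
    return sum((1 if g_p >= center else 0) << p
               for p, g_p in enumerate(neighbors))


def lbp_histogram(codes, num_neighbors):
    """Count how often each of the 2**P possible patterns occurs."""
    hist = [0] * (1 << num_neighbors)
    for c in codes:
        hist[c] += 1
    return hist


# Hypothetical gray values of P = 8 neighbors, listed in circular order p = 0..7.
neighbors = [52, 60, 49, 50, 70, 48, 51, 50]
center = 50
code = lbp_code(center, neighbors)
hist = lbp_histogram([code], num_neighbors=8)
```

The histogram has \(2^P\) bins, which is why the descriptor length becomes a concern for larger neighborhoods.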

LBP has three obvious disadvantages. First, it is not rotation invariant. Second, it uses only first-order sign information. Third, the length of the descriptor grows exponentially with the number of neighbors P, which in turn increases with the radius R. The proposed method improves on LBP in these three directions.

2.2 Affine-Gradient Based Descriptors

Here we propose a method based on AG information to increase the discriminative power of the descriptor. The magnitude of the Euclidean Gradient (EG) can be defined as \(G=\sqrt{I_x^2+I_y^2}\). It is the 2-norm of the gradient in Euclidean space and remains invariant only under Euclidean transformations.

Olver et al. [19] showed that there are two basic second-order relative affine differential invariants in the two-dimensional affine space:

$$\begin{aligned} H=I_{xx}I_{yy}-I_{xy}^2 \end{aligned}$$
(2)
$$\begin{aligned} J=I_{xx}I_y^2-2I_xI_yI_{xy}+I_x^2I_{yy} \end{aligned}$$
(3)
Fig. 1. The EG and AG information of an example image: (a) example image; (b) EG magnitudes; (c) AG values in the range 0 to 0.2; (d) AG values in the range 0.2 to 1.

All other second-order differential invariants can be composed from these two expressions, and their ratio constitutes an absolute differential invariant in the affine space. The affine gradient magnitude (affG) can be defined as in Eq. (4). To avoid division by zero, we modify the definition slightly and use \(affG'\) instead.

$$\begin{aligned} affG=\left| \frac{H}{J}\right| ,\quad \quad affG'=\sqrt{\frac{H^2}{J^2+1}} \end{aligned}$$
(4)

The Affine-Gradient is superior to the Euclidean-Gradient (EG) because AG is invariant under affine transformations, whereas EG remains invariant only under Euclidean transformations. Using AG information can therefore improve the robustness of the descriptor to geometric transformations. Ge et al. constructed a new descriptor by replacing the EG in SIFT with the AG, which achieves much better performance than the original SIFT [4]. The gradient and AG information are shown in Fig. 1.
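Equations (2) to (4) can be checked numerically on a small patch. The sketch below is only illustrative: it assumes the image derivatives are estimated with central finite differences (the text does not specify the derivative filters), and the function name and the quadratic test image are hypothetical.

```python
import math


def affine_gradient(img, x, y):
    """Return (H, J, affG') at interior pixel (x, y); img is indexed as img[y][x].

    Derivatives use central finite differences, which is an assumption of
    this sketch rather than something stated in the text.
    """
    ix = (img[y][x + 1] - img[y][x - 1]) / 2.0
    iy = (img[y + 1][x] - img[y - 1][x]) / 2.0
    ixx = img[y][x + 1] - 2.0 * img[y][x] + img[y][x - 1]
    iyy = img[y + 1][x] - 2.0 * img[y][x] + img[y - 1][x]
    ixy = (img[y + 1][x + 1] - img[y + 1][x - 1]
           - img[y - 1][x + 1] + img[y - 1][x - 1]) / 4.0
    h = ixx * iyy - ixy ** 2                                   # Eq. (2)
    j = ixx * iy ** 2 - 2.0 * ix * iy * ixy + ix ** 2 * iyy    # Eq. (3)
    aff_g = math.sqrt(h ** 2 / (j ** 2 + 1.0))                 # affG' in Eq. (4)
    return h, j, aff_g


# Hypothetical 3x3 test image I(x, y) = x^2 + 3y^2 + xy, for which central
# differences reproduce the analytic derivatives exactly.
img = [[x * x + 3 * y * y + x * y for x in range(3)] for y in range(3)]
h, j, aff_g = affine_gradient(img, 1, 1)
```

For this patch H = 11 and J = 110, so \(affG=0.1\), and \(affG'\) is only marginally smaller because the added 1 in the denominator is small compared with \(J^2\).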

In Fig. 2(a) and (b), we can see that the histogram of EG is much more continuous and smooth than that of AG. In fact, AG ranges from 0 to 162 and is not limited to the interval from 0 to 1 shown in Fig. 2(b); the values are simply much sparser above 1. The EG, by contrast, ranges from 0 to 763, as shown in Fig. 2(a). Intuitively, the AG information in the range (0, 1) probably corresponds to that of EG, as shown in Fig. 1(b) and (c), while the AG also contains some local extreme information, as shown in Fig. 1(d).

Fig. 2. The histogram of EG and AG: (a) histogram of the gradient; (b) histogram of the AG.

Table 1. Results of Multi-information based descriptors on Outex12

To further verify the validity of AG, experiments are conducted on the Outex12 dataset. The Local Gradient Pattern (LGP) and the Local Affine-Gradient Pattern (LAGP) can be defined as

$$\begin{aligned} LGP_{R,P} = \sum _{p=0}^{P-1}s(G_p-G_c)\cdot 2^p \end{aligned}$$
(5)
$$\begin{aligned} LAGP_{R,P} = \sum _{p=0}^{P-1}s(affG'_p-affG'_c)\cdot 2^p \end{aligned}$$
(6)

The function s is defined in Eq. (1). The Multi-Information based descriptor MI-G is defined as the concatenation of LGP and LBP. Similarly, MI-AG is the concatenation of LAGP and LBP. The experimental results are listed in Table 1.

The results show that the Multi-Information descriptor based on the Affine-Gradient achieves the best performance in all scenarios. This demonstrates that the AG information can substantially increase the discriminative power of the descriptors.

2.3 Rotation Invariance

Mehta and Egiazarian [16] proposed quantizing the directions into P discrete values and taking the direction with the maximum magnitude of the difference as the reference direction. The reference direction D is defined as [16]:

$$\begin{aligned} D= \mathop {\arg \max }_{p\in (0,1,\ldots ,P-1)}{|g_p-g_c|} \end{aligned}$$
(7)

However, this definition discards the sign of the difference and therefore maps two opposite directions to the same one. In this paper, we take both the sign and the magnitude of the discrete directions into consideration. The reference direction is defined as:

$$\begin{aligned} Ds = (D + \frac{P}{2} \cdot s(g_D-g_c))\mod P \end{aligned}$$
(8)

where s is the sign function defined in Eq. (1). The proposed descriptor is computed by rotating the weights with respect to the reference direction. The rotation-invariant LBP (roLBP) is defined as

$$\begin{aligned} roLBP_{R,P} = \sum _{p=0}^{P-1}s(g_p-g_c)\cdot 2^{(p-Ds)\mod P} \end{aligned}$$
(9)

In the above definition, the weight term \((p-Ds)\mod P\) depends on Ds. Thus, the mod operator circularly shifts the weights with respect to the reference direction Ds.
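A minimal Python sketch of Eqs. (7) to (9) is given below. It assumes that P is even, that ties in Eq. (7) are broken by taking the first maximum, and that an image rotation by a multiple of \(360^\circ /P\) acts as a circular shift of the sampled neighbors; the function names and gray values are hypothetical.

```python
def s(x):
    """Sign function of Eq. (1)."""
    return 1 if x >= 0 else 0


def reference_direction(center, neighbors):
    """Eqs. (7) and (8): signed reference direction Ds (P assumed even)."""
    P = len(neighbors)
    # Eq. (7): index of the largest absolute difference (first one on ties).
    D = max(range(P), key=lambda p: abs(neighbors[p] - center))
    # Eq. (8): shift by P/2 when the dominant difference is non-negative.
    return (D + (P // 2) * s(neighbors[D] - center)) % P


def ro_lbp(center, neighbors):
    """Eq. (9): LBP with weights circularly shifted by Ds."""
    P = len(neighbors)
    ds = reference_direction(center, neighbors)
    return sum(s(g_p - center) << ((p - ds) % P)
               for p, g_p in enumerate(neighbors))


# Hypothetical neighborhood and all of its circular shifts.
neighbors = [52, 60, 49, 50, 70, 48, 51, 50]
center = 50
codes = {ro_lbp(center, neighbors[-k:] + neighbors[:-k]) for k in range(8)}
```

If the maximum in Eq. (7) is unique, as in this example, every circular shift of the neighborhood produces the same code, so `codes` contains a single value.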

To illustrate the advantage of the proposed method, both roLBP and RLBP are evaluated on the Outex12 dataset, and the results are shown in Table 2. To limit the computational complexity, the feature selection method proposed in [16] is applied to both methods to reduce the length of the descriptors; the resulting descriptors are called DRLBP and DroLBP in Table 2. DroLBP achieves the best performance at scale (3, 16). Our method performs better at larger scales because its reference direction is closer to the true gradient direction, whereas the gradient direction has little effect at very small scales. Using an approximate gradient direction as the reference direction may give better results at larger scales, which needs further validation.

Table 2. Experiment results of different reference direction selection descriptors on Outex12

Applying the reference direction selection method to the LAGP descriptor, we obtain the rotation-invariant descriptor roLAGP as follows:

$$\begin{aligned} roLAGP_{R,P} = \sum _{p=0}^{P-1}s(affG'_p-affG'_c)\cdot 2^{(p-Ds)\mod P} \end{aligned}$$
(10)

Then the final descriptor AGLBP can be defined as the concatenation of roLBP and roLAGP.

$$\begin{aligned} AGLBP_{R,P} = roLBP_{R,P}\_{roLAGP}_{R,P} \end{aligned}$$
(11)

2.4 Feature Selection

The dimensionality of the descriptors increases exponentially with the number of neighboring pixels. Mehta and Egiazarian [16] proposed a selection method that depends on the distribution of patterns in the training dataset. However, some patterns may have a negative effect on the final classification result. Our method therefore also uses the intraclass variance on the training datasets as a criterion for feature selection.

In statistics, the variance is defined as \(\frac{1}{n-1}\sum (X-\mu )^2\), where \(\mu \) is the mean value of the array. The distribution of the intraclass variance of all patterns is computed from the training dataset, as shown in Fig. 3.

Fig. 3. The intraclass variance distribution on the Outex12 dataset: (a) the variance distribution of roLBP in the Outex12 training dataset; (b) the variance distribution of roLAGP in the Outex12 training dataset.

The bins of the histogram are sorted in descending order. There are then two ways to select features. One selects the top N patterns in the ordered list; the other selects the bins whose intraclass variance is less than a threshold \(\phi \) as the final descriptor. The patterns finally selected depend on the parameter N or \(\phi \) and on the training dataset, so the final dimensionality of the descriptor is not constant and varies across datasets. The accuracy-parameter curves of the two methods for roLBP on the Outex12 dataset are plotted in Fig. 4.
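The two selection rules can be sketched as follows. This is only one possible reading of the procedure: it assumes that the top-N rule ranks bins by their frequency in the training set, that the threshold \(\phi \) is applied to the intraclass variance, and that the per-class variances of a bin are averaged over classes (the text does not state how they are combined). All function names and the toy histograms are hypothetical.

```python
from statistics import mean, variance


def select_top_n(frequency, n):
    """Keep the indices of the n most frequent patterns."""
    order = sorted(range(len(frequency)), key=lambda k: frequency[k], reverse=True)
    return sorted(order[:n])


def intraclass_variance(histograms_by_class):
    """Per-bin sample variance (1/(n-1) form) within each class, averaged over classes.

    Averaging over classes is an assumption of this sketch.
    """
    classes = list(histograms_by_class.values())
    n_bins = len(classes[0][0])
    return [mean([variance([h[k] for h in hists]) for hists in classes])
            for k in range(n_bins)]


def select_below_threshold(variances, phi):
    """Keep the indices of the bins whose intraclass variance is below phi."""
    return [k for k, v in enumerate(variances) if v < phi]


# Toy training histograms: two classes, two samples each, three bins.
train = {
    "class_a": [[10, 2, 5], [12, 2, 9]],
    "class_b": [[4, 3, 1], [6, 3, 1]],
}
variances = intraclass_variance(train)
kept = select_below_threshold(variances, phi=3)
top = select_top_n([5, 1, 9, 3], n=2)
```

In this toy case the third bin varies strongly inside the first class and is discarded, while the first two bins are kept.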

Fig. 4. The accuracy-parameter curve for roLBP on Outex12 dataset: (a) the accuracy-N curve of roLBP on Outex12 dataset; (b) the accuracy-\(\phi \) curve of roLBP on Outex12 dataset.

It can be observed in Fig. 4(b) that the classification accuracy reaches its peak for threshold values between about 1.6 and 2.0, just above the peak of the distribution in Fig. 3(a). Such a value results in a significant reduction of the dimensionality.

Thus, the proposed approach considers both the statistical frequency and the intraclass variance of the training textures, which not only reduces the dimensionality of the descriptors but also improves the classification accuracy. The effectiveness of the proposed approach is demonstrated in the next section.

2.5 Classification Method

Some state-of-the-art classifiers, such as artificial neural networks (ANN), SVM and AdaBoost, can achieve outstanding classification performance, but they require complex learning procedures and may obscure the analysis of the discriminative capability of the features. To make a fair comparison with other approaches, we use the Nearest Neighbor (NN) classifier with the Chi-Square distance as our classification method. The effectiveness of the Chi-Square distance for classification is demonstrated in [7, 8].
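A nearest-neighbor classifier with the Chi-Square distance can be sketched in a few lines. The sketch assumes the common form \(\chi ^2(h_1,h_2)=\sum _k (h_{1,k}-h_{2,k})^2/(h_{1,k}+h_{2,k})\), which is not spelled out in the text (some authors include an extra factor of 1/2); the function names, labels and histograms are hypothetical.

```python
def chi_square(h1, h2):
    """Chi-Square distance between two histograms; empty bin pairs are skipped."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)


def nearest_neighbor(query, training):
    """Return the label of the training histogram closest to the query.

    training is a list of (histogram, label) pairs.
    """
    best_hist, best_label = min(training, key=lambda item: chi_square(query, item[0]))
    return best_label


# Toy training set with two labeled histograms.
training = [([4, 0, 2], "canvas"), ([0, 5, 1], "wood")]
label = nearest_neighbor([3, 1, 2], training)
```

Because the classifier has no trainable parameters, differences in accuracy can be attributed to the descriptors themselves.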

3 Experiments

To evaluate the proposed descriptor (AGLBP), experiments are conducted on three texture datasets: Outex10, Outex12 and KTH-TIPS2. The Outex10 and Outex12 datasets are designed for rotation-invariant texture classification under rotation and illumination changes. The KTH-TIPS2 dataset is designed for material categorization and includes scale and viewpoint variations. The parameter \(\phi \) of the proposed method is set to 2 in all our experiments. Several state-of-the-art descriptors have been implemented and compared on each dataset, namely \(LBP\text {-}HF\) [1], LBPV [10], LDDP [8], LCP [6], \(VZ\_{MR8}\) [25], \(VZ\_{joint}\) [26], PLBP [21], MDLBP [23], FBLLBP [7], BIF [2, 5], LEP [31] and DRLBP [16].

3.1 Outex12

Outex is a framework for the empirical evaluation of texture classification algorithms [17]. We first conduct experiments on the Outex12 dataset. It consists of 9120 images divided into 24 texture classes captured under different illuminations and rotations. For each class, the dataset contains 20 training images and 360 (2 * 9 * 20) testing images captured under two illuminations and nine orientations. We follow the two problems defined for the dataset [17], problem 000 and problem 001. Because the length of the final descriptor depends on the parameters (R, P), we use the conservative settings (1, 8), (2, 12) and (3, 16). All the LBP-based methods were evaluated, and the results are shown in Table 3.

Table 3. Experiment results of LBP based methods on different datasets

Among these methods, the proposed method with setting (3, 16) achieves the highest accuracy: 97.84% for problem 000 and 97.38% for problem 001. For further analysis, we compare our method with other state-of-the-art methods. The results are shown in Table 4. The proposed descriptor achieves the best result; the close second is DRLBP, with an accuracy of 97.15% for problem 000 and 95.37% for problem 001. Another interesting result is that the proposed method with setting (1, 8) not only fails to improve the performance but yields a lower accuracy. A possible reason is that setting (1, 8) produces too few pattern types to describe the texture information, so feature selection is unnecessary. Feature selection is therefore useful only when the descriptor dimension is large.

Table 4. Experiment results of descriptors on different datasets

3.2 Outex10

Next, experiments are conducted on the Outex10 dataset, which includes 4320 images of 24 different classes. These images are captured under the same illumination but rotated at nine different angles, with 20 images at each angle for each class. Following the problem defined for the dataset [17], the 480 images captured at angle \(0^\circ \) are taken as the training set and the remaining 3840 images captured at the other angles are used for testing.

The results with the various settings are shown in Table 3. For further analysis, AGLBP is compared with other state-of-the-art approaches; the results of these methods are also shown in Table 4. It can be observed that AGLBP performs well under various rotations. The problem of our method with setting (1, 8) also appears here, but our method with setting (3, 16) achieves the highest accuracy of all, 99.22%, which is better than the 99.19% achieved by DRLBP.

3.3 KTH-TIPS2 Dataset

Experiments on the KTH-TIPS2 dataset have also been conducted for material classification. The KTH-TIPS2 database contains 11 texture classes of different materials. For each class, the images are captured from 4 different samples of the material, and each sample is imaged at 9 different scales under 4 different illuminations and 3 different poses. Following the protocol used in most studies [6, 10], the images of one randomly chosen sample from each class are taken as the training set, and the images of the other samples are taken as the testing set.

All the methods were evaluated, and the results are shown in Table 3. As before, AGLBP is also compared with other state-of-the-art approaches, and the results of these methods are shown in Table 4. The proposed descriptor again outperforms all other descriptors. It can be concluded that our method is effective for texture classification.

4 Conclusion

In this paper we have proposed an Affine-Gradient based Local Binary Pattern (AGLBP) descriptor for texture classification. The Affine-Gradient differs from the Euclidean-Gradient and has been shown to improve texture classification. In addition, we have proposed an improved method for determining the local reference direction to achieve rotation invariance. Importantly, the increase in dimension caused by the use of multiple kinds of information is alleviated by the proposed feature selection method, which considers both the statistical frequency and the intraclass variance of the training textures. Three extensive experiments have been conducted on texture datasets that include rotation, scale and viewpoint variations. The results demonstrate that AGLBP performs better than several state-of-the-art approaches for texture classification. AGLBP uses the Affine-Gradient, which has been shown to be robust to viewpoint changes. In future research, information that is invariant under projective transformations should be used to further enhance the robustness to viewpoint changes.