1 Introduction

Land cover refers to the biophysical attributes of the surface of the earth. Features of land covers include texture, shape, colour, contrast and so on. Land cover classification involves classifying the multispectral remotely sensed image into various land covers such as land, vegetation, water, etc. Some of the applications of land cover classification are town planning, conservation of earth’s natural resources, studying the effects of climatic conditions and analyzing change in land forms. Identification of a suitable feature extraction technique and classifier is a challenging task in land cover classification of remotely sensed images.

Texture based methods are widely used in applications like face recognition, content based image retrieval, pattern classification in medical imagery and land cover classification of remotely sensed images. Texture is a surface property that characterizes the coarseness and smoothness of land covers. Pixel based techniques classify a pixel depending only on its own intensity, whereas texture based techniques classify a pixel based on its relationship with its neighbourhood. Texture measures can capture micro as well as macro patterns, as these can be captured by varying the size of the neighbourhood. Most texture based methods are invariant to rotation, illumination, scaling and colour, and are robust to noise. Recent texture based studies reveal that texture measures augmented with a contrast measure characterizing the local neighbourhood yield accurate results, provided conditions such as a sufficient number of precise training samples, a suitable neighbourhood for finding the pattern unit and an optimal window size are satisfied.

Support vector machine (SVM) is basically a binary classifier but can be used for multiclass classification following suitable approaches. The advantage of SVM over other classifiers is that SVM allows a marginal region on both sides of the linear or non-linear boundary separating the classes and classifies the pixels within the support regions based on measures of uncertainty and reliability. This helps ensure that uncertain pixels falling in the support or boundary region are assigned correct class labels. The objective of this research work is to propose a multivariate texture model that performs land cover classification of remotely sensed images with the help of SVM.

1.1 Motivation and justification of the proposed approach

A variety of texture models are found in the literature. The univariate texture model, local binary pattern (LBP) [11], was proposed for gray level images and its classification accuracy proved to be better in many applications. A multivariate extension of the univariate LBP model was proposed for remotely sensed images by [10] as the multivariate local binary pattern (MLBP). They concluded that the MLBP model with an uncertainty measure helped in identifying objects and yielded high classification accuracy. Algorithms using the wavelet transform [2] and rotation invariant features of Gabor wavelets [3] were proposed for performing texture segmentation of gray level images, and the reported results were promising. To provide better pattern discrimination, the advanced local binary pattern (ALBP) [1] was proposed for texture classification and applied on standard texture databases; it was shown that ALBP characterized local and global texture information and was robust in discriminating texture. The local texture pattern (LTP) [15] was proposed for gray level images and later extended to remotely sensed images as the multivariate local texture pattern (MLTP) [16]. Experiments proved that the MLTP model gave high classification accuracy. In the dominant local binary pattern (DLBP) [8], histograms of dominant patterns were used as features for texture classification of standard textures. The local derivative pattern [13] was proposed for face recognition under challenging image conditions. A novel face descriptor named local color vector binary pattern (LCVBP) [7] was proposed to recognize face images with challenges. Two color local texture features, color local Gabor wavelets (CLGWs) and color local binary pattern (CLBP) [4], were proposed for face recognition and combined to maximize the complementary effect of their color and texture information.

Among the many classification algorithms used for texture based classification of remotely sensed images, support vector machine [6, 12] and relevance vector machine [5] are often reported in the literature. [6] suggested that SVM was more suitable for heterogeneous samples for which only a small number of training samples was available. [12] concluded that the SVM classification approach was better than the K nearest neighbour classification algorithm. [9] performed a detailed survey of various classification algorithms including pixel based, sub pixel based, parametric, non parametric, hard and soft classification algorithms. They summarized that the success of an image classification algorithm depends on the availability of high quality remotely sensed imagery, the design of a proper classification procedure and the analyst’s skills.

Among the texture models mentioned earlier, only LBP, LTP, wavelet and Gabor wavelet have already been extended to remotely sensed images. The challenge in spectral methods is that they produce features of high dimensionality, so dimensionality reduction may be required prior to classification. At the same time, the multivariate texture models MLBP [10] and MLTP [17] yield high classification accuracy on remotely sensed images using at most three discrete levels. It is therefore expected that increasing the number of discrete levels will model the relationship between neighbour pixels more precisely. Motivated by this, a multivariate texture model with four discrete levels is proposed for land cover classification of remotely sensed images. Incorporating fuzziness either during feature extraction [14, 18] or during classification can improve the classification accuracy of pattern classification and recognition problems. The support vector machine is a fuzzy classifier often used in classification of remotely sensed images. Moreover, it converges quickly and needs only a minimum number of samples for classification. Justified by these facts, the proposed multivariate texture model is combined with the SVM classification algorithm for performing land cover classification of remotely sensed images. The objective of this research work is to propose a multivariate texture model, MDLTP, that gives high classification accuracy in land cover classification of remotely sensed images.

1.2 Outline of the proposed approach

The proposed approach has a texture feature extraction part as shown in Fig. 1a and a classification part as shown in Fig. 1b. During feature extraction, the centre pixel of each 3 × 3 neighbourhood of a sample is assigned a pattern label using the proposed local texture descriptor. Local contrast variance is also used as a supplementary local feature descriptor. These two local descriptors are then used to form a 2D global histogram of each sample. The 2D global histograms thus formed characterize the global feature of the sample. The SVM classifier works in two phases as shown in Fig. 1b. In the training phase, training samples are extracted from distinct land cover classes of remotely sensed images, and their texture features in the form of 2D global histograms are used to train the SVM classifier. In the testing phase, test samples centred around each pixel of the remotely sensed image are extracted, their 2D global histograms are found and given as input to the SVM. The SVM classifier finds the optimal hyperplane of separation and returns the class label based on its prior learning of the training samples.

Fig. 1 Feature extraction and classification

1.3 Organization of the paper

The second section of the paper gives the overview of the proposed multivariate texture model. The third section describes the SVM classification algorithm. The fourth section gives a detailed account of the experiments conducted with the proposed multivariate texture model for supervised texture classification of remotely sensed image. It also evaluates the performance of the proposed model. The final section discusses the outcomes of various experiments and gives the conclusion.

2 Texture feature extraction

2.1 Local texture description using discrete local texture pattern (DLTP)

The proposed texture model extracts local texture information from a neighbourhood in an image. Consider a 3 × 3 neighbourhood with pixel values \( g_{c}, g_{1}, \ldots, g_{8} \), where \( g_{c} \) is the value of the centre pixel and \( g_{1}, g_{2}, \ldots, g_{8} \) are the pixel values in its neighbourhood. The relationship between the centre pixel and one of its neighbour pixels is described in Eq. (1).

$$ p(g_{i} ,g_{c} ) = \left\{ \begin{array}{ll} - 1 & \quad if\;g_{i} < (g_{c} - m) \\ 0 & \quad if\;(g_{c} - m) \le g_{i} \le g_{c} \\ 1 & \quad if\;g_{c} < g_{i} \le (g_{c} + m) \\ 9 & \quad if\;g_{i} > (g_{c} + m) \\ \end{array} \right. $$
(1)

Here ‘m’ is a threshold that expresses the closeness of a neighbouring pixel to the centre pixel. The value \( p(g_{i} ,g_{c} ) \) stands for the output level assigned to the ith pixel in the neighbourhood. The discrete output levels are fixed numerically to −1, 0, 1 and 9 so that unique pattern values result when the positive and negative levels are summed separately. The output levels characterize the neighbourhood pixel relation, and concatenating them over a neighbourhood gives a pattern unit. A sample calculation of the pattern unit for m = 5 is shown below.

$$ \left[ {\begin{array}{*{20}c} {206} & {194} & {201} \\ {203} & {201} & {198} \\ {212} & {210} & {202} \\ \end{array} } \right] \to \left[ {\begin{array}{*{20}c} 1 & { - 1} & 0 \\ 1 & {} & 0 \\ 9 & 9 & 1 \\ \end{array} } \right] \to \begin{array}{*{20}c} 1 & { - 1} & 0 & 0 & 1 & 9 & 9 & 1 \\ \end{array} \;({\text{Pattern Unit, read clockwise from the top left}}) $$
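The thresholding of Eq. (1) and the pattern-unit construction can be sketched in Python. The clockwise traversal order of the neighbours is inferred from the worked example above and is an assumption of this sketch, as is the function naming.

```python
def output_level(gi, gc, m=5):
    """Four-level quantization of neighbour gi against centre gc, as in Eq. (1)."""
    if gi < gc - m:
        return -1
    if gi <= gc:
        return 0
    if gi <= gc + m:
        return 1
    return 9

def pattern_unit(window, m=5):
    """Concatenate the output levels of the 8 neighbours of a 3x3 window,
    traversed clockwise starting from the top-left corner."""
    gc = window[1][1]
    order = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    return [output_level(window[r][c], gc, m) for r, c in order]

window = [[206, 194, 201],
          [203, 201, 198],
          [212, 210, 202]]
print(pattern_unit(window))  # [1, -1, 0, 0, 1, 9, 9, 1]
```

Running this on the sample neighbourhood reproduces the pattern unit of the worked example.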

The total number of patterns, considering all combinations of the four output levels over the P = 8 pixels in the neighbourhood, is \( 4^{8} \) (65,536). This would increase the number of bins required when these local patterns are accumulated to characterize global regions. In order to reduce the number of possible patterns, a uniformity measure (U) is introduced as defined in Eq. (3). It corresponds to the number of circular spatial transitions between the output levels −1, 0, 1 and 9 in the pattern unit. Patterns whose U value is less than or equal to three are considered uniform and all other patterns are considered non-uniform. The gray scale DLTP for a local region ‘X’ is derived as in Eq. (2). PS stands for the sum of all positive output levels (zero included) and NS stands for the sum of all negative output levels in the pattern unit. For each pair of (NS+1, PS+1) values, a unique DLTP value is obtained from the lookup table ‘L’ for uniform patterns, and 166 is assigned to non-uniform patterns.

$$ DLTP(X) = \left\{ \begin{array}{ll} L(NS + 1,PS + 1) & \quad U \le 3 \\ 166 & \quad Otherwise \\ \end{array} \right. $$
(2)

where

$$ U = s\left( p(g_{P - 1} ,g_{c} ),\;p(g_{0} ,g_{c} ) \right) + \sum\limits_{k = 1}^{P - 1} {s\left( p(g_{k} ,g_{c} ),\;p(g_{k - 1} ,g_{c} ) \right)} $$
(3)

where \( s(x,y) = \left\{ \begin{array}{ll} 1 & \quad if\;\left| {x - y} \right| > 0 \\ 0 & \quad otherwise \\ \end{array} \right. \) and

$$ \begin{gathered} PS = \sum\limits_{i = 0}^{P - 1} {p(g_{i} ,g_{c} )} \quad for\;all\;i\;with\;p(g_{i} ,g_{c} ) \ge 0 \hfill \\ NS = \sum\limits_{i = 0}^{P - 1} {p(g_{i} ,g_{c} )} \quad for\;all\;i\;with\;p(g_{i} ,g_{c} ) < 0 \hfill \\ \end{gathered} $$
(4)

The lookup table (L) shown in Table 1 provides unique pattern values for the different combinations of NS+1 and PS+1 values. The maximum magnitude of the negative sum (NS) is eight, as there can be eight −1’s. The maximum positive sum (PS) is 72, as there can be eight 9’s. So the size of the lookup table is 9 × 73. All valid entries in the table are filled sequentially from 1 to 165 and characterize unique pattern labels; zero entries mark combinations of NS and PS that can never occur. This scheme provides 165 uniform patterns and one non-uniform pattern.
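A minimal sketch of Eqs. (2)–(4): the circular uniformity count, the PS/NS sums and the table lookup. The contents of Table 1 are not reproduced in this excerpt, so a hypothetical row-by-row fill stands in for L below.

```python
def uniformity(pattern):
    """Eq. (3): number of circular transitions between adjacent output levels."""
    # pattern[-1] vs pattern[0] handles the circular wrap-around
    return sum(pattern[k] != pattern[k - 1] for k in range(len(pattern)))

def dltp(pattern, lookup):
    """Eq. (2): a unique label for uniform patterns, 166 otherwise.
    `lookup` is the 9 x 73 table L of Table 1; its exact contents are not
    given here, so any consistent table can be supplied."""
    if uniformity(pattern) > 3:
        return 166                          # single label for non-uniform patterns
    ps = sum(v for v in pattern if v >= 0)  # PS in 0..72 (at most eight 9's)
    ns = -sum(v for v in pattern if v < 0)  # |NS| in 0..8 (at most eight -1's)
    return lookup[ns][ps]                   # 0-based view of L(NS+1, PS+1)

# Hypothetical stand-in for Table 1, filled row by row (for illustration only).
lookup = [[row * 73 + col for col in range(73)] for row in range(9)]

print(uniformity([1, -1, 0, 0, 1, 9, 9, 1]))  # 5
```

The worked pattern unit above has U = 5, so it would receive the non-uniform label 166 under this scheme.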

Table 1 Look up table for DLTP

2.2 Local contrast variance: a supplementary feature

Texture features by themselves do not capture the contrast information of an image. As a result, patterns with the same texture values but different contrast values would be classified into the same class. In order to avoid this, texture is supplemented with contrast information. Rotation invariant local variance is a powerful spatial property that provides contrast information; for a 3 × 3 neighbourhood of a gray scale image it is defined as follows.

$$ VAR = \frac{1}{8}\sum\limits_{i = 0}^{7} {\left( {g_{i} - \mu_{8} } \right)^{2} \quad where\quad } \mu_{8} = \frac{1}{8}\sum\limits_{i = 0}^{7} {g_{i} } $$
(5)

Equal percentile binning is performed for quantization of the variance values. The bin interval for binning the variance values is given by the formula ‘100/B’, where B is the required number of bins.
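The local variance of Eq. (5) and the equal-percentile binning can be sketched as follows. Mapping each value to a bin by the percentile of its rank is one plausible reading of the ‘100/B’ interval; the function names are assumptions of this sketch.

```python
def local_variance(window):
    """Eq. (5): variance of the 8 neighbours of a 3x3 window (centre excluded)."""
    g = [window[r][c] for r in range(3) for c in range(3) if (r, c) != (1, 1)]
    mu = sum(g) / 8.0
    return sum((x - mu) ** 2 for x in g) / 8.0

def percentile_bins(values, B):
    """Equal-percentile binning: the bin interval is 100/B, so each value is
    assigned a bin index 0..B-1 according to the percentile of its rank."""
    ranked = sorted(values)
    out = []
    for v in values:
        pct = 100.0 * ranked.index(v) / len(ranked)
        out.append(min(int(pct // (100.0 / B)), B - 1))
    return out

window = [[206, 194, 201],
          [203, 201, 198],
          [212, 210, 202]]
print(local_variance(window))  # 31.1875
```

On the sample neighbourhood of Sect. 2.1, the eight neighbours have mean 203.25 and variance 31.1875.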

2.3 Extending DLTP and VAR for multispectral bands

The proposed DLTP operator for gray scale images is extended as the multivariate DLTP (MDLTP). Among the multispectral bands, the three bands most suitable for land cover classification are chosen and combined to form an RGB image. Nine DLTP operators are calculated on the RGB image. Out of the nine, three DLTP operators (RR, GG and BB) describe the local texture in each of the three bands R, G and B individually. Six more DLTP operators describe the local texture of the cross relation of each band with the other bands (GR, BR, RG, BG, RB and GB). For example, the GR cross relation is obtained by replacing the centre pixel of the R band neighbourhood with the centre pixel of the G band. The nine DLTP operators thus obtained are arranged in a 3 × 3 matrix, and MDLTP is found by calculating DLTP for the resulting 3 × 3 matrix as shown below. The MDLTP histogram has only 166 bins.

$$ MDLTP = DLTP\left[ {\begin{array}{*{20}c} {DLTP(g_{i}^{R} ,g_{c}^{R} )} & {DLTP(g_{i}^{G} ,g_{c}^{R} )} & {DLTP(g_{i}^{B} ,g_{c}^{R} )} \\ {DLTP(g_{i}^{R} ,g_{c}^{G} )} & {DLTP(g_{i}^{G} ,g_{c}^{G} )} & {DLTP(g_{i}^{B} ,g_{c}^{G} )} \\ {DLTP(g_{i}^{R} ,g_{c}^{B} )} & {DLTP(g_{i}^{G} ,g_{c}^{B} )} & {DLTP(g_{i}^{B} ,g_{c}^{B} )} \\ \end{array} } \right] $$
(6)

where ‘i’ ranges from 0 to 7 (total number of pixels in 3 × 3 neighbourhood).
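The cross-relation construction of Eq. (6) can be sketched as below. Here `dltp_fn` stands for the gray-scale DLTP operator of Sect. 2.1; for testing, any function that maps a 3 × 3 window to a label can be plugged in. The helper names are assumptions of this sketch.

```python
def cross_label(neigh_band, centre_band, dltp_fn):
    """One cross relation, e.g. GR: the 3x3 neighbourhood of one band with its
    centre pixel replaced by the centre pixel of the other band."""
    w = [row[:] for row in neigh_band]   # copy so the band is not modified
    w[1][1] = centre_band[1][1]
    return dltp_fn(w)

def mdltp(r, g, b, dltp_fn):
    """Eq. (6): arrange the nine DLTP labels with the centre band varying by row
    (R, G, B) and the neighbourhood band by column, then apply DLTP again."""
    bands = [r, g, b]
    matrix = [[cross_label(bands[j], bands[i], dltp_fn) for j in range(3)]
              for i in range(3)]
    return dltp_fn(matrix)
```

The diagonal entries of the inner matrix are the RR, GG and BB relations; the off-diagonal entries are the six cross relations.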

The univariate variance measure (VAR) can be extended as Multivariate variance (MVAR) for remotely sensed image as follows. The individual independent variances VAR1, VAR2 and VAR3 of R, G and B bands are found using Eq. (5) and combined into a single composite variance (MVAR) by applying the formula below.

$$ MVAR = \frac{1}{3}\sum\limits_{i = 1}^{3} {(VAR_{i} - \mu_{3} )^{2} \quad where\quad } \mu_{3} = \frac{1}{3}\sum\limits_{i = 1}^{3} {VAR_{i} } $$
(7)
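Eq. (7) is a direct composite-variance computation and can be transcribed as:

```python
def mvar(var_r, var_g, var_b):
    """Eq. (7): composite variance MVAR of the three per-band variances."""
    vs = [var_r, var_g, var_b]
    mu = sum(vs) / 3.0                       # mean of the three band variances
    return sum((v - mu) ** 2 for v in vs) / 3.0
```

MVAR is zero when the three bands have identical local variance and grows as they diverge.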

2.4 Global description through 2D histogram

The multivariate local descriptor describes the texture pattern over any local region. The global description of an image can be obtained through combining multivariate local texture descriptor and multivariate local contrast variance in a 2D histogram. The steps are given below.

  1. Find the multivariate local texture descriptor (MDLTP) and the multivariate local contrast variance descriptor (MVAR) for all pixels by using a sliding window neighbourhood that runs over the image from top left to bottom right.

  2. Compute the occurrence frequency of the ordered pair (MDLTP, MVAR) into a 2D histogram where the x ordinate denotes MDLTP and the y ordinate denotes MVAR.
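The accumulation step above can be sketched as a single pass over precomputed per-pixel values. The number of variance bins (8) is an assumption for illustration, not a value given in the text; labels are taken to run from 1 to 166 as in the lookup scheme.

```python
def global_histogram(labels, var_bins, n_labels=166, n_bins=8):
    """Build the 2D global histogram of (MDLTP label, MVAR bin) pairs.
    `labels` holds per-pixel MDLTP labels in 1..166; `var_bins` holds the
    corresponding quantized MVAR bin indices in 0..n_bins-1."""
    hist = [[0] * n_bins for _ in range(n_labels)]
    for lab, vb in zip(labels, var_bins):
        hist[lab - 1][vb] += 1   # shift labels to 0-based row indices
    return hist
```

The resulting 166 × 8 table is the global texture feature of one sample, fed to the SVM in the next section.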

3 Support vector machine classification algorithm

The SVM classifier is a supervised binary classifier which can classify pixels that are not linearly separable. Support vectors are the samples closest to the separating hyperplane, and SVM orients this hyperplane so that it is as far as possible from the closest samples of both classes. The optimization problem of finding support vectors with maximal margin around the separating hyperplane is solved subject to a tolerance value entered by the user. The classifier solves the optimization problem with the help of a kernel such as linear, sigmoid, radial basis function, polynomial, wavelet or frame. Each new testing sample is classified by evaluating the sign of the SVM output.

Multiclass classification is done in SVM following two approaches, namely one-against-one and one-against-all. We have used the one-against-one approach in this article. In the one-against-one approach, one SVM is used per pair of classes. The whole set of patterns is divided into two classes at a time, and the patterns that get classified into more than one class are finally resolved to a single class using probability measures. The steps involved in multiclass classification are as follows.

3.1 Training phase

  1. If ‘n’ is the number of classes (in Fig. 2, Cl1, Cl2, …, Cln are the classes), then ‘nC2’ support vector machines are needed.

Fig. 2 Working principle of SVM

  2. Each SVM is trained with the 2D histograms of known samples and their class labels (pertaining to the corresponding pair of classes).

3.2 Testing phase

  1. The 2D global histogram of the unknown sample is given as input to all SVMs.

  2. The output of the SVM for each pair of classes is mapped to a local probability value.

  3. The global posterior probability is then found from the individual probabilities to decide the class label of the unknown sample.

The overall working principle of multiclass SVM is outlined in Fig. 2.
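The one-against-one decision can be sketched independently of any SVM library as a simple voting scheme. The pairwise deciders below are hypothetical stand-ins for trained SVMs with the probability mapping described above; majority voting is a common way to combine the nC2 pairwise outputs, though the text combines them via posterior probabilities.

```python
from itertools import combinations

def one_vs_one_predict(sample, classes, pairwise):
    """One-against-one decision: each of the nC2 binary classifiers votes for
    one class of its pair, and the class with the most votes wins.
    `pairwise[(a, b)]` returns the winning label of that pair for `sample`."""
    votes = {c: 0 for c in classes}
    for a, b in combinations(classes, 2):
        votes[pairwise[(a, b)](sample)] += 1
    return max(classes, key=lambda c: votes[c])

# Toy deciders for three land covers (hypothetical, for illustration only).
classes = ["water", "vegetation", "settlement"]
pairwise = {
    ("water", "vegetation"): lambda s: "vegetation",
    ("water", "settlement"): lambda s: "water",
    ("vegetation", "settlement"): lambda s: "vegetation",
}
print(one_vs_one_predict(None, classes, pairwise))  # vegetation
```

With three classes, three pairwise machines vote; here "vegetation" wins two of the three pairwise contests.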

4 Experiments and results

4.1 Experimental data

The remotely sensed image under study is an IRS P6, LISS-IV image supplied by the National Remote Sensing Centre (NRSC), Hyderabad, Government of India. The image was acquired in July 2007 and is of size 2959 × 2959 pixels. It is formed by combining bands 2, 3 and 4 of the LISS-IV data (green, red and near infrared) and is shown in Fig. 3. It covers the area in and around Tirunelveli city located in the state of Tamil Nadu in India. It extends to the suburbs of Nanguneri village in the south, the outskirts of Palayamkottai in the east, the suburbs of Alankulam village in the north and the suburbs of Cheranmahadevi village near Ambasamudram in the west. The river Thamirabarani runs across the diagonal region of the image. In the image, residential areas have either closely packed buildings or partially occupied buildings with shrubs and trees scattered here and there. Some irrigation tanks are present inside the city, and several irrigation tanks and vegetation areas lie to the south of Tirunelveli city leading to Nanguneri village. In the north, bare soil is scattered in some places on the way to Sankarankoil. In the west, on the way leading to Cheranmahadevi, fertile paddy fields and vegetation are present on either side of the perennial river. An updated geological map of the same area has been selected as the reference for the ground truth study.

Fig. 3 IRS P6, LISS-IV remotely sensed image

The experimental classes or training samples are the areas of interest extracted from the source image in Fig. 3; each is of size 16 × 16, as described in Table 2.

4.2 Land cover classification of remotely sensed image with MDLTP/MVAR

In the experiments, the sizes of the training and testing samples are kept the same to get high classification accuracy. Since the size of the training sample was 16 × 16, the size of the testing sample was also fixed to 16 × 16. The multivariate local texture feature (MDLTP) and the multivariate local contrast variance (MVAR) were found and the 2D histograms for global description were formed for all samples as illustrated in Sect. 2. In the training phase, the 2D histograms of the training samples were used to train the SVM. In the testing phase, the 2D histograms of the testing samples were given as input to the SVM, and the classifier returned the class label. The classified image is shown in Fig. 4.

Fig. 4 Classified image using MDLTP/MVAR

The MDLTP/MVAR model discriminates well between the various land covers because it is designed to assign distinct and precise pattern codes to the captured patterns. So the settlement and vegetation-3 classes cluster densely. The thin diagonal line of water running across the image is clearly traced without discontinuity. The vegetation-1 class, which lies on either side of the river, is seen vividly. The vegetation-2 class present around the water tanks is classified precisely.

4.3 Performance evaluation of classified image

The overall classification accuracy and the Kappa coefficient are the performance metrics for assessing the classified image. To compute these values, an error matrix is built as follows. The size of the error matrix is c × c, where ‘c’ is the number of classes. If a pixel that belongs to class ci (where 1 ≤ i ≤ c) is correctly classified, then a count is added to entry (i, i) of the error matrix. If a pixel that belongs to class ci is incorrectly classified into class cj (where 1 ≤ j ≤ c), then a count is added to entry (i, j) of the error matrix. The diagonal entries mark correct classifications while the off-diagonal entries mark incorrect classifications. The overall accuracy (Po) can then be found as follows.

$$ {\text{Overall classification accuracy}}\,\left( {{\text{P}}_{\text{o}} } \right) \, = \frac{{\sum_{i = 1}^{c} {x_{ii} } }}{b} $$
(8)

where ‘b’ is the total number of observations and

xii is the observation in row ‘i’ and column ‘i’ of error matrix.

The classification accuracy expected (Pe) is found as below.

$$ {\text{Accuracy expected}}\,\left( {{\text{P}}_{\text{e}} } \right) \, = \frac{{\sum_{i = 1}^{c} {x_{i + } \,x_{ + i} } }}{{b^{2} }} $$
(9)

where \( x_{i + } \) is the marginal total of row ‘i’ and \( x_{ + i} \) is the marginal total of column ‘i’ of the error matrix. The Kappa coefficient is found using Po and Pe as follows.

$$ {\text{Kappa Coefficient}} = \frac{{P_{o} - P_{e} }}{{1 - P_{e} }} $$
(10)

In our experiments, a set of stratified random samples comprising 2400 pixels was used for building the error matrix. The performance measures Po and Kappa coefficient described above were found for the classified image in Fig. 4 and are shown in Table 3 and Table 4 respectively.
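Eqs. (8)–(10) can be checked with a short routine. The 2 × 2 error matrix used below is illustrative only, not the paper's data.

```python
def accuracy_and_kappa(error_matrix):
    """Eqs. (8)-(10): overall accuracy Po, expected accuracy Pe and Kappa from
    a c x c error matrix (rows: reference class, columns: assigned class)."""
    c = len(error_matrix)
    b = sum(sum(row) for row in error_matrix)            # total observations
    po = sum(error_matrix[i][i] for i in range(c)) / b   # Eq. (8)
    row_tot = [sum(error_matrix[i]) for i in range(c)]
    col_tot = [sum(error_matrix[i][j] for i in range(c)) for j in range(c)]
    pe = sum(row_tot[i] * col_tot[i] for i in range(c)) / (b * b)  # Eq. (9)
    kappa = (po - pe) / (1 - pe)                          # Eq. (10)
    return po, pe, kappa

# Illustrative 2-class error matrix.
po, pe, kappa = accuracy_and_kappa([[50, 10], [5, 35]])
print(round(po, 2), round(kappa, 3))  # 0.85 0.694
```

Kappa discounts the agreement expected by chance (Pe), which is why it is reported alongside the raw overall accuracy.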

Table 2 Training samples and their descriptions
Table 3 Error matrix of proposed model MDLTP/MVAR

The proposed model MDLTP/MVAR gives a classification accuracy of 93.46 % and a kappa coefficient of 0.9156. The model performs well because the neighbourhood pixel relations are precisely captured with the help of four discrete levels.

For evaluating the performance of the proposed model against existing pixel based and texture based classification algorithms, the classification accuracies of various algorithms were found and tabulated in Table 5. The existing texture methods such as Gabor wavelet, multivariate local binary pattern (MLBP), multivariate local texture pattern (MLTP) and wavelet, and the existing pixel based methods such as the maximum likelihood classifier, Mahalanobis distance classifier and minimum distance classifier, are considered for comparison (Table 5).

Table 4 Accuracy totals of proposed model MDLTP/MVAR
Table 5 Comparison of Classification accuracies of proposed model (MDLTP/MVAR) with existing models

From the above table, it is inferred that the pixel based classifiers give classification accuracy only in the order of 75 %. This is because they weigh only the intensity of the current pixel and attach no weightage to the intensities of its neighbourhood. The performance of spectral models drops when the spectral characteristics of different patterns are similar. The MLBP texture model gives 90.42 % classification accuracy; the degree of quantization is higher in MLBP, as only two discrete levels are used for modeling the neighbour pixel relation. MLTP (with the discrete levels 0, 1 and 9) combined with MVAR yields 91.88 % classification accuracy. The proposed model MDLTP/MVAR performs better than all the chosen methods and gives 93.46 % classification accuracy.

5 Discussion and conclusion

A multivariate texture model (MDLTP) is proposed for land cover classification of remotely sensed images. The advantages of the proposed model are threefold: it gives stable results even for small window sizes, it requires only a minimum number of training samples in the training phase, and it captures additional uniform patterns. The model is made complete by adding contrast variance as a supplementary measure. The significance of the method is vividly seen as it captures even minute pattern differences with the help of four discrete levels and the subsequent assignment of unique pattern labels. The SVM classifier augments the texture model by incorporating fuzziness in classifying land covers. The experiments show that the proposed model yields 93.46 % classification accuracy.

In future, it is proposed to extend the model to hyperspectral data. The proposed model may inspire researchers to find the optimal number of discrete levels for each texture model so that maximum classification accuracy can be achieved. The model can also be hybridized with an extreme learning machine or relevance vector machine classifier to yield better classification accuracy.