
1 Introduction

The segmentation stage of an iris recognition system has received attention in many studies, whose main aim is to increase iris recognition rates. Errors that occur in the segmentation stage propagate to the remaining stages of the recognition process. This stage therefore turns out to be one of the most difficult, especially when eye images are captured in less controlled environments, for example at a distance, on the move, or under visible light. The Noisy Iris Challenge Evaluation (NICE) [19] and the Multiple Biometrics Grand Challenge (MBGC) have shown the importance of robust iris segmentation in the latest generation of biometric systems [19]. In recent years, some authors have focused on fusion at the segmentation level (FSL) with the aim of improving the accuracy of iris recognition systems. Segmentation fusion combines the information of several independent segmentations of the same image into a single merged segmentation, which contains more information than each independent segmentation alone.

Uhl and Wild [16] introduced the concept of multi-segmentation fusion to combine the results of several segmentations. The authors evaluated two fusion algorithms on the CasiaV4-Interval database, where the segmentation of the iris region was obtained manually. They experimentally demonstrated that the accuracy of the recognition process increased for different feature extraction algorithms, but they did not report results for segmentation combinations built by automatic segmentation methods. In order to determine the most suitable frame from each video sequence of the MBGC iris video database, Colores-Vargas et al. [1] evaluated seven fusion methods to extract the texture information in normalized iris templates. The experimental results showed that the principal component analysis (PCA) method achieved the best performance. Sanchez-Gonzalez et al. [12] used a Sum-Rule fusion algorithm in FSL, with three automatic methods to segment the iris region; the fusion was performed after the normalization step, and the recognition rates were also increased. Garea Llano et al. [6] evaluated four fusion methods in FSL using two iris segmentation algorithms as a basis. The authors used the same experimental scheme proposed by Sanchez-Gonzalez et al. [12], modifying the input image, which was captured by different sensors. Wild et al. [19] revealed auto-corrective properties of augmented model fusion on masks before FSL in most of the tested cases, proposing a scan of the iris masks before the merging process. Although the recognition rates were also increased, the authors did not take into account the quality of the mask fusion when boundaries are frequently overestimated. Furthermore, non-convexity of the mask can lead to sample points being attributed to the wrong boundary.
In general, most of these fusion methods are based on pixel-by-pixel analysis, so the spatial relationships between pixel structures, and hence the contextual information between them, are not taken into account.

In data clustering, no single clustering algorithm works effectively for every dataset [4]; applying several of them to the same dataset produces different clusterings. One solution may be to find the best of them, but in recent years the idea of combining individual segmentations has gained interest, showing good results. For example, Vega-Pons et al. [18] formalized the segmentation ensemble problem and introduced a new method to solve it, based on the kernel clustering ensemble philosophy; the proposal was evaluated on the Berkeley image database and compared to several state-of-the-art clustering ensemble algorithms. Franek et al. [4] addressed the parameter selection problem by applying general ensemble clustering methods to produce a consensus segmentation. Other works [5, 13] evaluated clustering ensemble algorithms on image segmentation problems, obtaining good results. A similar problem can be seen in the context of iris segmentation: there is no segmentation method able to work correctly under all iris image capturing conditions, and several segmentation algorithms can produce very different segmentations of the same iris image. Therefore, we believe it is necessary to introduce the idea of consensus segmentation based on ensemble clustering algorithms as a new fusion method. This should allow better performance than each of the independent segmentations obtained by different segmentation algorithms from the same eye image.

The main contribution of this work is the proposal of consensus segmentation based on ensemble clustering as a new fusion method for iris recognition at the segmentation level. The aim is to replace the pixel-by-pixel analysis in the fusion process of an iris recognition system with an analysis of the spatial relationships between pixel structures, for a greater impact on the merged image. The consensus segmentation is obtained by the weighted median partition approach [18], modeling the segmentations obtained by different algorithms as a set of super-pixels.

The remainder of this paper is organized as follows. In Sect. 2 we present the proposed general scheme and method for iris consensual segmentation fusion. In Sect. 3 we present and discuss the experimental results. Finally, we present the main conclusions and future work.

Fig. 1. Scheme for consensual iris segmentation fusion

2 Iris Consensual Segmentation Fusion by Segmentation Ensemble via Kernels

Let I(x, y) be an iris image and \(\mathbb {S} = \{ S_1, S_2, \ldots , S_n \}\) a set of segmentation masks of I obtained by different automatic iris segmentation algorithms. It is then possible to model the consensus segmentation \(S^*\) as an ensemble clustering problem. To build \(S^*\), two main steps are performed (see Fig. 1). In the first step, the elements of \(\mathbb {S}\) are represented as binary matrices with values 0 (black) and 1 (white), and a matrix G (see Fig. 1) is obtained from \(\mathbb {S}\). Each column of G is generated by partitioning each segmentation \(S_i \in \mathbb {S}\) into clusters of pixels, or super-pixels. A super-pixel [18] is a connected component in the image, formed by pixels that were grouped into the same cluster in every segmentation \(S_i\). This representation decreases the number of objects (pixels) of \(\mathbb {S}\) while maintaining the spatial relationships between pixel structures in the image. In the matrix G, \(g_{ij}\) is the super-pixel j of the initial segmentation \(S_i\). All super-pixels (\(g_{ij}: i = 1 \ldots n, j = 1 \ldots k\)) in a column j are composed of pixels located in the same region, but in each of the segmentations \(S_i\); pixels inside a cluster have the same color. If a super-pixel contains black pixels, the cluster is labeled with the character b; otherwise it is labeled with the character w. The matrix \(G_L\) shows a possible labeling of G. The partition \(P_i\) is represented by the concatenation of the labels in row i, i.e., \(P_i\) would be the union \(g_{i1} \cup g_{i2} \cup \ldots \cup g_{ik}\) of G, represented in \(G_L\) as \(b - b - \ldots - w\). Once \(P_i\) is obtained for each \(S_i\), in the second step the consensus segmentation \(S^*\) is built. For this process we propose the use of a clustering ensemble algorithm.
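The super-pixel construction above can be sketched as follows: pixels sharing the same label vector across all masks are grouped into connected components, and the label matrix \(G_L\) records each segmentation's color on each super-pixel. This is only an illustrative sketch on toy 2x3 masks, not the authors' implementation; the function and variable names are hypothetical.

```python
from collections import deque

def superpixels(masks):
    """Group pixels that share the same label vector across all masks into
    4-connected components (super-pixels). Returns a list of pixel lists."""
    h, w = len(masks[0]), len(masks[0][0])
    key = lambda r, c: tuple(m[r][c] for m in masks)  # label vector of a pixel
    seen, sps = set(), []
    for r in range(h):
        for c in range(w):
            if (r, c) in seen:
                continue
            # BFS over 4-connected neighbors with the same label vector
            comp, q = [], deque([(r, c)])
            seen.add((r, c))
            while q:
                y, x = q.popleft()
                comp.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w and (ny, nx) not in seen \
                       and key(ny, nx) == key(r, c):
                        seen.add((ny, nx))
                        q.append((ny, nx))
            sps.append(comp)
    return sps

def label_matrix(masks, sps):
    """Build G_L: row i holds segmentation S_i's label ('b'/'w') per super-pixel."""
    return [['b' if m[sp[0][0]][sp[0][1]] == 0 else 'w' for sp in sps]
            for m in masks]

# Two toy 2x3 masks (0 = black, 1 = white)
S1 = [[0, 0, 1],
      [0, 0, 1]]
S2 = [[0, 1, 1],
      [0, 1, 1]]
sps = superpixels([S1, S2])
GL = label_matrix([S1, S2], sps)
print(len(sps))  # 3 super-pixels: one per column of the toy masks
print(GL)        # [['b', 'b', 'w'], ['b', 'w', 'w']]
```

Note how the second super-pixel (the middle column) is the region on which the two toy segmentations disagree; the rows of GL are the partitions \(P_1, P_2\) fed to the ensemble step.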

Clustering ensemble methods combine partitions of the same dataset into a final consensus clustering. Let \(\mathbb {O} = \{o_1, o_2, \ldots , o_m\}\) be a set of objects, \(\mathbb {P} = \{P_1, P_2, \ldots , P_n\}\) a clustering ensemble, where each \(P_i \in \mathbb {P}\) is a partition of the set \(\mathbb {O}\), and \(P_\mathbb {O}\) the set of all possible partitions of \(\mathbb {O}\). The consensus partition \(P^*\) is defined through the median partition problem [18] and obtained by the equation:

$$\begin{aligned} P^* = arg \max _{P \in P_\mathbb {O}} \sum ^{n}_{i = 1} \varGamma (P,P_i), \end{aligned}$$
(1)

where \(\varGamma \) is a similarity function defined between partitions; equivalently, the problem can be solved by minimizing the dissimilarity between partitions [18].

In recent years, several clustering ensemble methods have been proposed for median-partition-based problems. To obtain the consensus segmentation \(S^*\), we propose the use of the WPCK method introduced in [17], because this algorithm can build the consensus partition from partitions generated by any kind of clustering algorithm with any initialization parameters, and because its computational cost is lower than that of most state-of-the-art methods [18]. In WPCK, the theoretical consensus partition is defined as:

$$\begin{aligned} P^* = arg \max _{P \in P_\mathbb {O}} \sum ^{n}_{i = 1} \tilde{k}(P,P_i), \end{aligned}$$
(2)

where \(\tilde{k}\) is a positive definite kernel function [18]. This similarity function measures the significance of all possible subsets of \(\mathbb {O}\) for two partitions (\(P_i, P_j\)).

Given that \(\tilde{k}\) is a kernel function, the problem defined by Eq. 2 can be mapped into the Reproducing Kernel Hilbert Space associated with \(\tilde{k}\), where the exact theoretical solution is defined as:

$$\begin{aligned} \psi = \frac{\sum ^{n}_{i=1}\phi (P_i)}{\Vert \sum ^{n}_{i=1}\phi (P_i)\Vert }, \end{aligned}$$
(3)

where \(\phi \) is the function that maps the partitions of \(P_\mathbb {O}\) into the Hilbert space [18].

Based on the above, \(P^*\) can be expressed as:

$$\begin{aligned} P^* = arg \min _{P \in P_\mathbb {O}} \Vert \phi (P) - \psi \Vert ^2, \end{aligned}$$
(4)

where \(\Vert \phi (P) - \psi \Vert ^2\) can be rewritten in terms of the similarity function as:

$$\begin{aligned} \Vert \phi (P) - \psi \Vert ^2 = 2 - 2 \sum _{i=1}^{n} \tilde{k}(P,P_i) \end{aligned}$$
(5)
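The step from Eq. 4 to Eq. 5 can be sketched as follows, assuming the kernel is normalized so that \(\tilde{k}(P,P) = 1\) (hence \(\Vert \phi (P)\Vert = 1\)) and recalling from Eq. 3 that \(\Vert \psi \Vert = 1\):

$$\begin{aligned} \Vert \phi (P) - \psi \Vert ^2&= \Vert \phi (P)\Vert ^2 + \Vert \psi \Vert ^2 - 2\langle \phi (P), \psi \rangle \\&= 2 - 2\,\frac{\sum ^{n}_{i=1}\langle \phi (P), \phi (P_i)\rangle }{\Vert \sum ^{n}_{i=1}\phi (P_i)\Vert } = 2 - 2\,\frac{\sum ^{n}_{i=1}\tilde{k}(P,P_i)}{\Vert \sum ^{n}_{i=1}\phi (P_i)\Vert }. \end{aligned}$$

Since the denominator is a positive constant independent of P, minimizing this expression is equivalent to minimizing \(2 - 2 \sum ^{n}_{i=1} \tilde{k}(P,P_i)\), as stated in Eq. 5.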

Taking into account that we use the super-pixel representation, a segmentation \(S_i \in \mathbb {S}\) can be defined as the composition of several regions formed by super-pixels. In [18], several region properties of \(S_i\) were defined, proving that the set \(\mathbb {S}_I\) of all segmentations of the image I is a subset of all possible partitions of the super-pixel set of the image. Then, given a set of initial segmentations \(\mathbb {S} = \{S_1, S_2, \ldots , S_n\}\), these are mapped to the Hilbert space and, using Eq. 4, the consensus segmentation \(S^*\) can be built in the following way:

$$\begin{aligned} S^* = arg \min _{S \in \mathbb {S}_I} 2 - 2 \sum ^{n}_{i = 1} \tilde{k}(S,S_i), \end{aligned}$$
(6)

Eq. 6 expresses how close the consensus segmentation \(S^*\) is to any segmentation \(S \in \mathbb {S}_I\). To solve it, we propose the use of the simulated annealing meta-heuristic, selecting as the initial state the segmentation \(S \in \mathbb {S}_I\) closest to the theoretical consensus. In this work, we used the simulated annealing parameters and processing defined in [18].
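A minimal sketch of this annealed search over super-pixel labelings, assuming a toy label-agreement function in place of \(\tilde{k}\) (the actual WPCK kernel weights subsets of objects) and a hypothetical flip-one-super-pixel neighborhood; parameter values here are illustrative, not those of [18]:

```python
import math
import random

def consensus_sa(ensemble, sim, steps=2000, t0=1.0, cooling=0.995, seed=0):
    """Simulated-annealing search for a labeling S* minimizing
    2 - 2 * sum_i sim(S, S_i) over super-pixel labelings (cf. Eq. 6)."""
    rng = random.Random(seed)
    cost = lambda s: 2 - 2 * sum(sim(s, si) for si in ensemble)
    # initial state: the ensemble member closest to the theoretical consensus
    state = min(ensemble, key=cost)[:]
    best, best_cost, t = state[:], cost(state), t0
    for _ in range(steps):
        cand = state[:]
        j = rng.randrange(len(cand))
        cand[j] = 'w' if cand[j] == 'b' else 'b'  # flip one super-pixel label
        delta = cost(cand) - cost(state)
        # accept improvements always, worsenings with Boltzmann probability
        if delta < 0 or rng.random() < math.exp(-delta / max(t, 1e-9)):
            state = cand
            if cost(state) < best_cost:
                best, best_cost = state[:], cost(state)
        t *= cooling
    return best

# Toy similarity: fraction of super-pixels carrying the same label
agree = lambda a, b: sum(x == y for x, y in zip(a, b)) / len(a)
ensemble = [list('bbwww'), list('bbbww'), list('bbwww')]
print(''.join(consensus_sa(ensemble, agree)))  # 'bbwww'
```

With this separable agreement measure the optimum is simply the per-super-pixel majority label; the value of the meta-heuristic lies in handling non-separable kernels such as \(\tilde{k}\), where no closed-form solution is available.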

3 Experimental Design

In this section, we show the results of exploring the capacity of the proposed fusion method to increase recognition rates. The experiments were conducted on an international iris dataset and focused on assessing how the proposed consensual fusion method improves the recognition rates of the independent segmentations obtained by automatic segmentation algorithms. The iris collection used in our experiments was UBIRIS.v1 [10], a dataset comprising 1877 images collected from 241 persons in two distinct sessions. This database incorporates images with several noise factors (contrast, reflections, luminosity, focusing, occlusion by eyelids and eyelashes), simulating less constrained image acquisition environments. In the experiments, three initial segmentations were obtained by automatic segmentation algorithms: Contrast-Adjusted Hough Transform (CHT) [8], a traditional sequential (limbic-after-pupillary) method based on the circular Hough Transform (HT) and contrast enhancement; Weighted Adaptive Hough and Ellipsopolar Transform (WHT) [15], a two-stage adaptive multiscale HT segmentation technique using elliptical models; and the Viterbi-based Segmentation Algorithm (VIT) [14], a circular HT-based method with boundary refinement. These algorithms were selected because they are publicly available as open source software, which favors reproducibility, and because they have shown good results in iris segmentation. From the set of initial segmentations (CHT-WHT-VIT), the consensual segmentation \(S^*\) was obtained. To obtain \(S^*\), two functions were used to measure the similarity between two iris segmentations: the Rand Index (RI) [11] and the Kernel (\(\tilde{k}\)) [18]. RI is a positive definite kernel function, used in [18] to measure the similarity between two image segmentations. Then, using the similarity function in Eq. 6, \(S^*\) can be computed as:

$$\begin{aligned} S^* = arg \min _{S \in \mathbb {S}_I} 2 - 2 \sum ^{n}_{i = 1} RI(S, S_i), \end{aligned}$$
(7)

\(\tilde{k}\) is the similarity function mentioned in Sect. 2. In general, \(\tilde{k}\) and RI take values in the range [0, 1]; values close to one imply the minimum difference between two segmentations. In order to normalize the segmentations, Daugman's rubber sheet normalization [2] was used.
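The Rand Index used here is the standard pair-counting measure: the fraction of object pairs on which two partitions agree (grouped together in both, or separated in both). A minimal sketch via contingency counts, on hypothetical flattened label sequences rather than real iris masks:

```python
from collections import Counter
from math import comb

def rand_index(seg_a, seg_b):
    """Rand Index between two segmentations given as flat label sequences
    (e.g. per-pixel or per-super-pixel labels). Returns a value in [0, 1]."""
    m = len(seg_a)
    pairs = comb(m, 2)
    cells = Counter(zip(seg_a, seg_b))       # joint contingency counts
    rows, cols = Counter(seg_a), Counter(seg_b)
    same_both = sum(comb(c, 2) for c in cells.values())
    diff_both = pairs + same_both \
        - sum(comb(c, 2) for c in rows.values()) \
        - sum(comb(c, 2) for c in cols.values())
    return (same_both + diff_both) / pairs

print(rand_index([0, 0, 1, 1], [0, 0, 0, 1]))  # 0.5
print(rand_index('bbww', 'bbww'))              # identical segmentations -> 1.0
```

The identity case illustrates the convention used above: RI equal to one means the two segmentations are indistinguishable at the pair level.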

To assess the effectiveness of our proposal and the possible influence of the type of features used for recognition, four feature extraction methods were evaluated: the 2D version of Gabor filters (Daugman) [3]; the wavelet transform (Ma) [7]; the 1D Log-Gabor wavelets (Masek) [8]; and the features derived from the zero crossings of the differences between 1D DCT coefficients (Monro) [9]. The results in terms of equal error rate (EER) were obtained by computing and comparing Hamming distances in the verification task. First, the consensual segmentation fusions were obtained using the \(\tilde{k}\) and RI functions. Then, we performed the verification process for each obtained consensual fusion and for each initial segmentation (WHT, CHT and VIT) using the four mentioned feature extraction methods, comparing each segmentation algorithm separately and when their results are fused.
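The Hamming-distance comparison in the verification task is commonly computed only over bits marked valid in both occlusion masks; a small illustrative sketch with hypothetical toy codes and masks (real iris codes have thousands of bits and typically also undergo rotation compensation, omitted here):

```python
def hamming_distance(code_a, code_b, mask_a, mask_b):
    """Fractional Hamming distance over bits valid in both masks
    (1 = usable bit, 0 = occluded); lower means more similar."""
    valid = [ma & mb for ma, mb in zip(mask_a, mask_b)]
    n = sum(valid)
    if n == 0:
        return 1.0  # no comparable bits: treat as maximally distant
    disagreements = sum(v & (a ^ b) for a, b, v in zip(code_a, code_b, valid))
    return disagreements / n

# Toy 8-bit codes and masks
a = [1, 0, 1, 1, 0, 0, 1, 0]
b = [1, 1, 1, 0, 0, 0, 1, 1]
full = [1] * 8
print(hamming_distance(a, b, full, full))               # 3/8 = 0.375
print(hamming_distance(a, b, full, [1, 1, 1, 0, 1, 1, 1, 1]))  # bit 3 masked: 2/7
```

Sweeping a decision threshold over such distances for genuine and impostor comparisons yields the false accept and false reject curves whose crossing point is the EER reported below.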

Fig. 2. Recognition accuracy by EER on UBIRIS.v1 database

3.1 Results and Discussion

Figure 2 reports the results in terms of EER for each of the automatic segmentation results and their consensual fusion obtained by the two similarity functions, RI and \(\tilde{k}\), on the UBIRIS.v1 database.

The results in Fig. 2 show that, under the less controlled conditions of the UBIRIS.v1 database, \(S^*-RI\) achieved a better recognition rate than each of the initial segmentations separately. \(S^*-\tilde{k}\) also achieved a better recognition rate for the Masek and Ma extraction methods. In general, \(S^*-\tilde{k}\) does not exceed the results obtained by \(S^*-RI\). We think these results are due to the fact that \(S^*-RI\) reached a good approximation of the global optimum. However, in the case of \(S^*-\tilde{k}\) with the Daugman feature extraction algorithm, no improvement over the VIT method was obtained. This may mean that the nature of the feature extraction method, combined with the conditions of the database, had a negative influence on the similarity measure used; this should be investigated in future studies. Moreover, although \(S^*-\tilde{k}\) did not reach a good approximation of the global optimum in most of the cases (for example in Daugman for VIT and CHT), the meta-heuristic could find a segmentation close to the theoretical consensus solution in the Hilbert space (\(\psi \)), improving on at least one of the initial segmentations (CHT, VIT, WHT). The results show that the idea of fusion by consensual segmentation is promising in terms of its effect on recognition efficacy. Another advantage is that the fusion process is simplified, because the analysis is done at the super-pixel level, unlike traditional fusion methods that perform pixel-by-pixel analysis; in addition, the spatial relationships between pixel structures are preserved.

4 Conclusions

In this paper we proposed Consensual Iris Segmentation Fusion, which obtains a consensus from the initial segmentations produced by automatic segmentation algorithms. Our proposal is based on the weighted median partition problem and uses the super-pixel representation of the iris image to overcome drawbacks such as the volume of calculations required for pixel-by-pixel image analysis and the loss of spatial relationships between pixel structures. Experimental results show that the proposed approach is promising. The robustness of the approach was demonstrated under the degraded conditions of the UBIRIS.v1 database, where the recognition results were improved in most of the cases. Future work will aim to analyze the influence of the similarity measures used on the results of the fusion, comparing their results with those of an ideal segmentation. Another line of research will be directed to proposing new similarity measures closer to the nature of the iris. The combination of the proposed method with other fusion methods can be a further line of research.