1 Introduction

Diatoms are a group of unicellular algae present in a great variety of aquatic environments. The total number of species is estimated at more than 200,000, although only about 10,000 have been described so far. Since diatoms adapt to their environment, they can be used as natural water quality indicators in environmental studies [5].

Diatoms are formed by two thecae that fit together to create a capsule known as a frustule. The frustule is made of silica and, depending on its shape, diatoms are classified as centric (rounded frustule) or pennate (elongated frustule). Diatoms reproduce both asexually and sexually. In the asexual stage, the frustule separates into its two valves, and each one regenerates the missing half, originating two diatoms of different sizes. This progressive size variation over generations is what is called the life cycle. After several generations, the valve size cannot decrease any further, which triggers sexual reproduction: at this point, the cell forms auxospores that give rise to new full-size algae.

Traditionally, the task of identifying diatoms in samples from different aquatic environments has been carried out by biologists. They usually look for morphometric features (length, width, shape) and frustule ornamentation such as the striae density, and identification is made by comparing against previously described specimens [2]. Doing this task manually involves different challenges due to inter-species similarities and intra-species dissimilarities arising from the various stages of the life cycle.

Different attempts to automate this process have been made [3, 4, 22]. The task is challenging due to several factors, such as the vast number of diatom species, the similarities between them, and the life-cycle-related changes in shape and texture. Some researchers [21] used shape descriptors based on Legendre polynomials and principal component analysis (PCA) to identify the Cymbella cistula species. Others [20] applied PCA to the Fourier descriptors extracted from the contour of the Tabellaria group. There are also recent studies on the application of different classification methodologies and the consideration of different image features such as texture, geometry, morphology and their combination [3]. Convolutional neural networks (CNNs) have also been applied with success to a high number of taxa [22]. However, the main source of errors comes from the misclassification of algae due to their life cycle.

In this paper, we present an extension of the work presented in [24], adding two different contributions to the previous work. The main novelty resides, on the one hand, in increasing the number of classes from 8 to 14 and, on the other hand, in considering a different approach that uses CNNs to classify the diatoms. CNNs have recently been applied to the taxonomic identification of diatoms with 99.51% accuracy over 80 species [22]. However, the dataset used by those authors contains an average of 100 samples per taxon before applying any data augmentation technique. Given the known need for relatively large training datasets when training architectures such as AlexNet or GoogleNet from scratch, we propose to use transfer learning, either as a fine-tuning strategy over the complete model or by fixing the convolutional layers as a feature extractor and retraining the last part of the network [30]. In both cases, the networks are initialized with the weights of their corresponding architectures previously trained on ImageNet. In this work, ResNet18, AlexNet, VGG11, SqueezeNet1.0, DenseNet121, and InceptionV3 have been compared. Finally, a comparison is presented between the results obtained with a traditional image identification workflow (i.e., image preprocessing, segmentation, feature extraction, dimensionality reduction, and classification) and with CNNs.

2 Materials and Methods

2.1 Database

The database used in this work consists of 1085 diatom images belonging to 14 different classes, distributed as shown in Table 1.

Table 1. Number of images per taxa.

2.2 Traditional Image Classification

The first step is image segmentation and contour extraction. Then, three different sets of features are extracted to describe the segmented image and its contour. After that, all the features undergo a dimensionality reduction process. Finally, a classifier is applied to this reduced set of features. The method is described more extensively in [24].

A. Segmentation and Contour Extraction. Semi-automatic global thresholding based on the Otsu method and morphological operations was used. In this process, a few images were manually discarded due to inhomogeneous illumination and noise.
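
As an illustration, the following is a minimal sketch of this step, assuming OpenCV 4.x, a dark diatom on a brighter background, and placeholder values for the file name and structuring element size (none of these details are specified in the paper):

```python
import cv2

# Load a sample image in grayscale ("diatom.png" is a placeholder name).
img = cv2.imread("diatom.png", cv2.IMREAD_GRAYSCALE)

# Global Otsu thresholding; THRESH_BINARY_INV assumes a dark object
# on a brighter background.
_, mask = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

# Morphological opening and closing to remove noise and fill small holes.
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)

# Keep the largest external contour as the diatom outline
# (OpenCV 4.x returns (contours, hierarchy)).
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
contour = max(contours, key=cv2.contourArea)
```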

B. Feature Extraction. Three different descriptors have been used to describe the images. Elliptical Fourier descriptors (EFD) model the diatom contour while Gabor filters and phase congruency (PC) descriptors characterize the diatom ornamentation.

  • Elliptical Fourier descriptors. The method to calculate EFD is described in [16]. It starts with a contour image and calculates the Freeman chain code. Then the x, y projections of the chain code are calculated. Finally, the Fourier coefficients are obtained from these projections (a minimal sketch of this computation is given after this list). It was empirically determined that the first 30 coefficients are sufficient to represent the contour.

  • Phase congruency descriptors. Phase congruency builds on the fact that all Fourier components are in phase where signal features occur, i.e., at corners, edges, and textures of the images. PC descriptors are calculated as in [28]. Starting from the phase congruency maximum (M) and minimum (m) moment images (described in [14]), which combine the phase congruency information of each orientation, the mean and standard deviation were calculated for both images. A total of 4 phase congruency descriptors are obtained.

  • Gabor filter descriptors. Gabor-based descriptors are calculated with the same method as in [3], initially described in [6]. First, the log-Gabor filters are calculated as shifted Gaussians for different orientations and scales. These filters are applied to the images, and then the first- and second-order statistics are obtained for every sub-band.
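
The sketch referenced in the EFD item above, using the third-party pyefd package, which implements the Kuhl–Giardina formulation followed by [16]; the synthetic elliptical contour merely stands in for a real segmented diatom outline:

```python
import numpy as np
from pyefd import elliptic_fourier_descriptors

# Synthetic closed contour standing in for a segmented diatom outline.
t = np.linspace(0, 2 * np.pi, 200, endpoint=False)
contour = np.stack([80 * np.cos(t), 30 * np.sin(t)], axis=1)  # (N, 2) points

# First 30 harmonics; normalize=True makes the coefficients invariant
# to rotation and scale.
coeffs = elliptic_fourier_descriptors(contour, order=30, normalize=True)
efd_features = coeffs.flatten()  # 30 harmonics x 4 coefficients each
```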

C. Dimensionality Reduction. After feature extraction, a total of 223 features were obtained; therefore, dimensionality reduction is needed. For this purpose, Linear Discriminant Analysis (LDA) [7] was selected, as it was shown to enhance classification results over other techniques such as PCA. LDA projects the feature space into a new, smaller subspace that maximizes the separation between classes. With this supervised method, the original 223-dimensional space is reduced to \(N-1\) dimensions, where N is the number of classes in the dataset (\(N=14\) in this work, i.e., 13 dimensions).
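
A minimal scikit-learn sketch of this reduction, using random stand-in data with the stated dimensions:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Stand-in data: 1085 samples, 223 features, 14 class labels.
X = np.random.rand(1085, 223)
y = np.random.randint(0, 14, size=1085)

# Supervised projection onto at most N - 1 = 13 discriminant axes.
lda = LinearDiscriminantAnalysis(n_components=13)
X_reduced = lda.fit_transform(X, y)
print(X_reduced.shape)  # (1085, 13)
```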

D. Classification. In machine learning, a classifier can be defined as a function that takes the values of different features of a sample and outputs a prediction of the class to which the sample belongs [23]. In [24], different supervised and unsupervised classifiers were tested. Among the tested algorithms, Hierarchical Agglomerative Clustering [25] was chosen as it achieved the best results on the proposed dataset. Hierarchical clustering is a machine learning algorithm for clustering unlabeled data points. It produces a set of nested clusters organized as a hierarchical tree that can be visualized using a dendrogram, and these clusters may correspond to meaningful taxonomies, e.g., diatom taxa. Initially, every single observation is a separate cluster. Then a distance function between clusters is computed, and the closest clusters are merged. The algorithm finishes once the predefined number of clusters is reached.
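
A minimal scikit-learn sketch of this classifier; note that the distance and linkage criteria of [24] are not restated here, so Ward linkage below is an assumption:

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

# Stand-in for the LDA-reduced feature matrix.
X_reduced = np.random.rand(1085, 13)

# Merge clusters until the predefined number (14 taxa) is reached.
clusterer = AgglomerativeClustering(n_clusters=14, linkage="ward")
labels = clusterer.fit_predict(X_reduced)
```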

2.3 Deep Learning

The number of images in this dataset is reasonable for traditional machine learning methods but far from the amount required by deep learning techniques, as explained in [22]. This requirement can be lowered to around 100 samples per class by using transfer learning techniques [8]. However, most of the classes have fewer samples than that, and the available number is further reduced by partitioning the dataset into training, validation and test sets. To deal with this problem, we added a data augmentation step that performs:

  1. Horizontal flip

  2. Vertical flip

  3. Random rotation between 0\(^\circ \) and 90\(^\circ \)

A combination of these three transformations is randomly applied each time a batch is requested during training. After this process, images are resized to the network input size, i.e., 224 \(\times \) 224 pixels. Figure 1 shows some examples of this process.
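
A sketch of how such an augmentation and resizing pipeline could look with torchvision; the exact composition and parameters used in the experiments are not reported, so this is illustrative only:

```python
import torchvision.transforms as T

train_transforms = T.Compose([
    T.RandomHorizontalFlip(),             # 1. horizontal flip
    T.RandomVerticalFlip(),               # 2. vertical flip
    T.RandomRotation(degrees=(0, 90)),    # 3. rotation between 0 and 90 degrees
    T.Resize((224, 224)),                 # network input size; aspect ratio lost
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406],  # ImageNet channel statistics
                std=[0.229, 0.224, 0.225]),
])
```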

Fig. 1. Data augmentation examples. Note that after size normalization the aspect ratio of the original images is not preserved, which has a negative effect on the learning process and reduces the final classification accuracy.

Fig. 2. Scheme of the AlexNet network tested. Source: http://alexlenail.me/NN-SVG/AlexNet.html

Since image classification is a common task, several classification network architectures have been proposed in the literature. In this case, we have tested ResNet18 [9], AlexNet [15] (see Fig. 2), VGG11 [26], SqueezeNet1.0 [12], DenseNet121 [10], and InceptionV3 [27]. To deal with convergence problems due to the low number of samples per class, two transfer learning techniques have been applied. One of them is fine-tuning, in which a pre-trained model is used to initialize the network and then all the weights are adjusted during training. The other consists of using the convolutional layers as a feature extractor and training only the last part of the architecture. In all cases, the model weights were initialized with those of the corresponding models pre-trained on ImageNet, since this has been demonstrated to be successful on a wide range of transfer tasks [11]. Therefore, ImageNet is only used to learn good general-purpose features as a starting point for our diatom classification task.
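
A minimal PyTorch sketch of both strategies for one of the tested architectures (DenseNet121). Other architectures expose their classification head under different attribute names (e.g., model.fc for ResNet), and the freezing policy shown is the generic recipe, not necessarily the exact training setup used here:

```python
import torch.nn as nn
import torchvision.models as models

NUM_CLASSES = 14
FEATURE_EXTRACTOR = False  # True: freeze the convolutional layers

# Initialize with ImageNet weights (newer torchvision uses weights=...).
model = models.densenet121(pretrained=True)

if FEATURE_EXTRACTOR:
    # Feature-extractor mode: convolutional weights stay fixed.
    for param in model.parameters():
        param.requires_grad = False

# Replace the 1000-class ImageNet head with a 14-class head; this new
# layer is freshly initialized and trained in both strategies.
model.classifier = nn.Linear(model.classifier.in_features, NUM_CLASSES)
```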

The dataset was split into three parts to train and evaluate the models: 80% of the images were used for training, while the remaining 20% was divided into validation (10%) and test (10%) sets. This was repeated 10 times following a 10-fold cross-validation scheme, and data augmentation was applied after this division. As with the pre-trained models, the inputs were normalized per RGB channel using the ImageNet mean m and standard deviation \(\sigma \): \((m=0.485, \sigma =0.229)\), \((m=0.456, \sigma =0.224)\) and \((m=0.406, \sigma =0.225)\) for the R, G and B channels respectively.
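
A sketch of this splitting scheme with scikit-learn; per-class stratification and the random seeds are assumptions, as the paper does not specify them:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, train_test_split

y = np.random.randint(0, 14, size=1085)  # stand-in class labels
indices = np.arange(len(y))

# Each of the 10 folds holds out 10% for testing; 1/9 of the remaining
# 90% is then split off for validation, giving an 80/10/10 partition.
skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
for train_val_idx, test_idx in skf.split(indices, y):
    train_idx, val_idx = train_test_split(
        train_val_idx, test_size=1 / 9,
        stratify=y[train_val_idx], random_state=0)
```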

3 Results

Two different experiments were carried out with the dataset. In the first, the images were analyzed with a traditional image classification scheme, obtaining a classification accuracy of 99.7%. In the second, different CNNs were tested, with DenseNet achieving the best result at 99.07% accuracy.

Figure 3 is a representation of the clusters using the t-Distributed Stochastic Neighbor Embedding (t-SNE) [17] algorithm to reduce the dimension of the feature vector. In this figure, it can be observed that the separation of the clusters allows each cluster to be identified with a different class. Although not all clusters are perfectly differentiated in this 2-D representation, the classification results, where only 3 observations were misclassified, indicate that the classes are well separated in the 13-dimensional reduced feature space.
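
A minimal sketch of this kind of visualization; the perplexity value is a common default guess, not a setting reported in the paper:

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

X_reduced = np.random.rand(1085, 13)     # stand-in LDA-reduced features
y = np.random.randint(0, 14, size=1085)  # stand-in class labels

# Project the 13-D features to 2-D for visualization only.
embedding = TSNE(n_components=2, perplexity=30,
                 random_state=0).fit_transform(X_reduced)
plt.scatter(embedding[:, 0], embedding[:, 1], c=y, cmap="tab20", s=8)
plt.show()
```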

Figure 4 shows the confusion matrix with the correctly identified samples and the errors produced by the classifier. In addition to classification accuracy, different objective metrics were calculated to assess the clustering performance [13, 29]. These metrics measure the similarity with the ground truth (Rand index), the similarity between elements of the same cluster (Silhouette), the agreement between the cluster assignment and the ground-truth classes (Adjusted Mutual Information), whether a cluster contains only members of the same class (Homogeneity), and whether all the members of the same class are assigned to the same cluster (Completeness). Table 2 shows the values of these metrics; values close to 1 indicate that the clusters are separated and well defined.
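
All of these metrics are available in scikit-learn; a sketch with stand-in arrays follows (the adjusted variants of the Rand index and mutual information are used below, which may differ from the exact variants computed in the paper):

```python
import numpy as np
from sklearn.metrics import (adjusted_mutual_info_score, adjusted_rand_score,
                             completeness_score, homogeneity_score,
                             silhouette_score)

X_reduced = np.random.rand(1085, 13)          # stand-in features
y_true = np.random.randint(0, 14, size=1085)  # ground-truth taxa
y_pred = np.random.randint(0, 14, size=1085)  # cluster assignments

print("Rand (adjusted):", adjusted_rand_score(y_true, y_pred))
print("AMI:", adjusted_mutual_info_score(y_true, y_pred))
print("Homogeneity:", homogeneity_score(y_true, y_pred))
print("Completeness:", completeness_score(y_true, y_pred))
print("Silhouette:", silhouette_score(X_reduced, y_pred))  # uses the features
```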

Fig. 3. Representation of the data using the t-SNE algorithm for visualization. After dimensionality reduction with LDA, the data form well-separated clusters.

Fig. 4. Confusion matrix of the classification results obtained using hierarchical agglomerative clustering. Elements on the main diagonal represent correct identifications, while the remaining elements are errors.

Table 2. Clustering metrics.

Tables 3 and 4 show the accuracies of the CNN models on the test set. All architectures obtained better results with fine-tuning than when used as a feature extractor, with an average accuracy difference of around 11% between the two techniques. DenseNet, ResNet and VGG are the architectures that provide the highest accuracy. Among them, DenseNet shows the best results, with 99.07% of the samples correctly classified and only one image misclassified. When the convolutional layers are used as a feature extractor, SqueezeNet provides the best results with an accuracy of 93.52%. The differences between the two transfer learning techniques may be caused by the dissimilarity between diatoms and the classes of the ImageNet dataset; it is therefore reasonable to obtain better results when the weights of the feature extractor are adjusted to the new dataset.

Table 3. Fine-tuning results.
Table 4. CNN as a feature extractor results.

Regardless of the model used, the average per-class accuracies show that the most challenging classes for both techniques are Nitzschia amphibia, Sellaphora blackfordensis, and Sellaphora pupula. Nitzschia amphibia is often classified as Gomphonema minutum; misclassification between them may be caused by the similarity of their lateral views, as shown in Fig. 5(a)–(b). On the other hand, Sellaphora blackfordensis and Sellaphora pupula are often misclassified as Sellaphora capitata and Sellaphora auldreekie, respectively. The confusion between these classes is most likely caused by their high overall similarity (Fig. 5(c)–(f)).

Fig. 5. Common misclassifications of the CNN models. Nitzschia amphibia (a) is sometimes classified as Gomphonema minutum (b), Sellaphora blackfordensis (c) as Sellaphora capitata (d), and Sellaphora pupula (e) as Sellaphora auldreekie (f).

4 Discussion

This work pursued two main purposes as a sequel to the work previously presented in [24]. On the one hand, to test the method described for diatom life cycle classification on a larger dataset with more classes; on the other hand, to test deep learning CNNs on the same dataset and compare them with the results obtained with traditional classification algorithms.

With the new dataset (14 classes), a 99.7% accuracy was obtained with classical methods, a result similar to the 98% obtained with the smaller dataset (8 classes) in [24].

Despite the good results obtained for 14 classes, the dataset can still be considered small. The absence of a loss of precision when additional classes were included needs to be corroborated with a significantly larger number of classes (e.g., 50–100) together with a sufficiently high number of samples per class. This would be a more realistic situation, in which a higher number of diatom species coexist in the same ecosystem.

Convolutional neural networks correctly classified 99.07% of the samples in the best scenario and 65.74% in the worst case. Concerning per-class accuracies, three classes (Nitzschia amphibia, Sellaphora pupula and Sellaphora blackfordensis) proved the most difficult to classify regardless of the learning technique. The best results were obtained using the fine-tuning strategy, i.e., adjusting all the weights, whereas the worst results were obtained using the first layers of the pre-trained models as fixed feature extractors. This may be caused by the differences between the application domains: models trained on ImageNet learn to classify instances of categories such as animals or everyday objects, and diatoms are very different from those. Therefore, using such models as fixed feature extractors does not provide the features needed for diatom classification. In contrast, models pre-trained on ImageNet can generalize well to other classification problems after some adjustment.

5 Conclusions

Increasing the number of classes in the dataset and, consequently, the number of images has not decreased the accuracy of the method based on image descriptors and a traditional classifier; it remains close to 99%. Moreover, the results obtained using deep learning also reach high classification rates. Although the dataset is small for training a CNN from scratch to classify diatoms according to taxa, a transfer learning procedure was applied, obtaining 99.07% of samples correctly classified. Of the two proposed techniques, fine-tuning (adjusting all the network weights) achieves the best performance, since diatoms differ from the objects of the categories commonly used to pre-train the architectures.