On the Quality of Network Flow Records for IDS Evaluation: A Collaborative Filtering Approach

Catillo, Marta; Del Vecchio, Andrea; Pecchia, Antonio; Villano, Umberto

doi:10.1007/978-3-031-04673-5_16

Marta Catillo, Andrea Del Vecchio, Antonio Pecchia & Umberto Villano

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13045)

Included in the following conference series:

IFIP International Conference on Testing Software and Systems


Abstract

Network flow records consist of categorical and numerical features that provide context data and summary statistics computed from the raw packets exchanged between pairs of nodes in a network. Flow records labeled by human experts are typically used in high-speed networks to design and evaluate intrusion detection systems. Despite the ever-increasing body of literature on flow-based intrusion detection, no contribution investigates how accurately flow records render the class of traffic of the original aggregation of packets.

This paper proposes a collaborative filtering approach to compute sanitized labels for a given set of flow records. Sanitized labels are compared with the labels assigned by human experts. Experiments are done with CICIDS2017, an intrusion detection dataset that provides raw packets and labeled flow records obtained from benign operations and attack conditions. Results indicate that around 3.61% of flow records might fail to render benign aggregations of packets; surprisingly, the percentage of flow records that fail to render aggregations of packets pertaining to attacks ranges from 5.39% to 27.18%, depending on the type of attack. These findings point either to a need to improve the features collected or to potential imperfections in how the flow records are computed.
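The abstract does not spell out how the sanitized labels are computed, so the following is only a minimal sketch of one plausible scheme: each flow record is relabeled with the majority label of its nearest neighbors in feature space, and the sanitized labels are then compared with the expert labels. The function names `sanitize_labels` and `disagreement_rate`, the Euclidean distance, the value of `k`, and the toy data are all assumptions made for illustration; none of them comes from the paper.

```python
import numpy as np

def sanitize_labels(features, labels, k=3):
    """Relabel each record with the majority label of its k nearest neighbors.

    Illustrative assumption only: Euclidean distance and a plain majority
    vote stand in for whatever the paper actually uses.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    # Pairwise Euclidean distances between all records.
    diff = features[:, None, :] - features[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)  # a record is not its own neighbor
    sanitized = []
    for row in dist:
        # Distances are sorted in ascending order: closest records first.
        neighbors = np.argsort(row, kind="stable")[:k]
        values, counts = np.unique(labels[neighbors], return_counts=True)
        # Ties (possible with even k) resolve to the alphabetically first label.
        sanitized.append(values[np.argmax(counts)])
    return np.array(sanitized)

def disagreement_rate(labels, sanitized):
    """Fraction of records whose sanitized label differs from the given label."""
    return float(np.mean(np.asarray(labels) != np.asarray(sanitized)))

# Toy data (hypothetical): two clusters, with the last record labeled ATTACK
# although it sits inside the benign cluster.
features = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10], [0.5, 0.5]]
labels = ["BENIGN"] * 3 + ["ATTACK"] * 4
sanitized = sanitize_labels(features, labels, k=3)
print(sanitized, disagreement_rate(labels, sanitized))
```

On this toy input the only label that changes is the last one, so the disagreement rate is 1/7. The percentages reported in the abstract are this kind of disagreement rate, computed per traffic class on CICIDS2017.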


Notes

1. https://datatracker.ietf.org/doc/html/rfc2722
2. https://www.cisco.com/c/en/us/products/ios-nx-os-software/ios-netflow/index.html
3. https://github.com/ahlashkari/CICFlowMeter
4. https://tranalyzer.com/
5. https://nesg.ugr.es/nesg-ugr16/
6. http://idsdata.ding.unisannio.it
7. https://www.stratosphereips.org/datasets-ctu13
8. http://agnigarh.tezu.ernet.in/~dkb/resources.html
9. http://www.unb.ca/cic/datasets/ids.html
10. If a distance metric is adopted, the computed distances need to be sorted in ascending order; if a similarity metric is adopted, the sorting needs to be in descending order.
11. https://scikit-learn.org/stable/
12. https://www.unb.ca/cic/datasets/ids-2017.html
13. For a small number of flow records the protocol field is unspecified.
14. https://allabouttesting.org/golden-eye-ddos-tool-installation-and-tool-usage-with-examples/
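The remark in note 10 about sort direction can be illustrated with a short sketch. The helper names (`rank_neighbors`, `euclidean_distance`, `cosine_similarity`) and the toy vectors are hypothetical and chosen only for illustration; the point is that the closest candidate comes first only if distances are sorted ascending and similarities descending.

```python
import numpy as np

def euclidean_distance(query, candidates):
    """Distance metric: smaller values mean closer candidates."""
    return np.linalg.norm(candidates - query, axis=1)

def cosine_similarity(query, candidates):
    """Similarity metric: larger values mean closer candidates."""
    norms = np.linalg.norm(candidates, axis=1) * np.linalg.norm(query)
    return candidates @ query / norms

def rank_neighbors(scores, higher_is_closer):
    """Return candidate indices ordered from closest to farthest."""
    scores = np.asarray(scores, dtype=float)
    if higher_is_closer:
        # Similarity: descending order, obtained by sorting the negated scores.
        return np.argsort(-scores, kind="stable")
    # Distance: ascending order.
    return np.argsort(scores, kind="stable")

query = np.array([1.0, 0.0])
candidates = np.array([[3.0, 0.0], [0.0, 1.0], [1.0, 1.0]])

by_distance = rank_neighbors(euclidean_distance(query, candidates), higher_is_closer=False)
by_similarity = rank_neighbors(cosine_similarity(query, candidates), higher_is_closer=True)
print(by_distance.tolist())    # [2, 1, 0]
print(by_similarity.tolist())  # [0, 2, 1]
```

The two metrics also disagree on which candidate is closest here, so the choice of metric matters as much as the sort direction.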


Acknowledgment

Andrea Del Vecchio gratefully acknowledges support by the “Orio Carlini” 2020 GARR Consortium Fellowship.

Author information

Authors and Affiliations

Dipartimento di Ingegneria, Università degli Studi del Sannio, Benevento, Italy
Marta Catillo, Andrea Del Vecchio, Antonio Pecchia & Umberto Villano


Corresponding author

Correspondence to Marta Catillo.

Editor information

Editors and Affiliations

University College London, London, UK
David Clark
Middlesex University, London, UK
Hector Menendez
Telecom SudParis, Evry Cedex, France
Ana Rosa Cavalli


About this paper

Cite this paper

Catillo, M., Del Vecchio, A., Pecchia, A., Villano, U. (2022). On the Quality of Network Flow Records for IDS Evaluation: A Collaborative Filtering Approach. In: Clark, D., Menendez, H., Cavalli, A.R. (eds) Testing Software and Systems. ICTSS 2021. Lecture Notes in Computer Science, vol 13045. Springer, Cham. https://doi.org/10.1007/978-3-031-04673-5_16


DOI: https://doi.org/10.1007/978-3-031-04673-5_16
Published: 10 May 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-04672-8
Online ISBN: 978-3-031-04673-5
eBook Packages: Computer Science, Computer Science (R0)
