On the Quality of Network Flow Records for IDS Evaluation: A Collaborative Filtering Approach

  • Conference paper
Testing Software and Systems (ICTSS 2021)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13045)

Abstract

Network flow records consist of categorical and numerical features that provide context data and summary statistics computed from the raw packets exchanged between pairs of nodes in a network. Flow records labeled by human experts are typically used in high-speed networks to design and evaluate intrusion detection systems. Despite the ever-increasing body of literature on flow-based intrusion detection, no contribution has investigated how accurately flow records render the traffic class of the original aggregation of packets.

This paper proposes a collaborative filtering approach to compute sanitized labels for a given set of flow records. The sanitized labels are then compared with the labels assigned by human experts. Experiments are conducted with CICIDS2017, an intrusion detection dataset that provides raw packets and labeled flow records obtained under benign operations and attack conditions. Results indicate that around 3.61% of flow records might fail to render benign aggregations of packets; surprisingly, the percentage of flow records that fail to render aggregations of packets pertaining to attacks ranges from 5.39% to 27.18%, depending on the type of attack. These findings point either to the need to improve the features collected or to potential imperfections in how the flow records are computed.
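The idea of comparing sanitized labels with expert labels can be illustrated with a minimal sketch. The details of the authors' method are not available in this preview, so the code below assumes a simple stand-in: each flow record receives the majority label of its k nearest neighbors (Euclidean distance, sorted in ascending order), and the disagreement with the original label is measured per class. The function names, the toy two-feature records, and the value of k are all hypothetical; real flow records also contain categorical features that would need encoding.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two numerical feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def sanitize_labels(records, labels, k=3):
    """Assumed stand-in: majority label among the k nearest neighbors.

    This is NOT necessarily the method of the paper; it only illustrates
    how a neighborhood-based vote can produce a 'sanitized' label.
    """
    sanitized = []
    for i, rec in enumerate(records):
        # Distances to all other records, sorted ascending (closest first).
        neighbors = sorted(
            (euclidean(rec, other), j)
            for j, other in enumerate(records)
            if j != i
        )[:k]
        votes = Counter(labels[j] for _, j in neighbors)
        sanitized.append(votes.most_common(1)[0][0])
    return sanitized


def disagreement_rate(labels, sanitized, target):
    """Fraction of records labeled `target` whose sanitized label differs."""
    idx = [i for i, lab in enumerate(labels) if lab == target]
    if not idx:
        return 0.0
    return sum(sanitized[i] != target for i in idx) / len(idx)


# Toy data (hypothetical): two well-separated clusters, plus one record
# labeled ATTACK that sits in the middle of the BENIGN cluster.
records = [
    (0, 0), (0, 1), (1, 0), (1, 1),           # benign cluster
    (10, 10), (10, 11), (11, 10), (11, 11),   # attack cluster
    (0.5, 0.5),                               # labeled ATTACK, looks benign
]
labels = ["BENIGN"] * 4 + ["ATTACK"] * 5

sanitized = sanitize_labels(records, labels, k=3)
print(disagreement_rate(labels, sanitized, "BENIGN"))
print(disagreement_rate(labels, sanitized, "ATTACK"))
```

In this toy setting, only the record placed inside the benign cluster changes label, so the disagreement is confined to the ATTACK class. The percentages reported in the abstract play the same role as these per-class disagreement rates, computed on CICIDS2017 with the authors' own procedure.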

Notes

  1. https://datatracker.ietf.org/doc/html/rfc2722.

  2. https://www.cisco.com/c/en/us/products/ios-nx-os-software/ios-netflow/index.html.

  3. https://github.com/ahlashkari/CICFlowMeter.

  4. https://tranalyzer.com/.

  5. https://nesg.ugr.es/nesg-ugr16/.

  6. http://idsdata.ding.unisannio.it.

  7. https://www.stratosphereips.org/datasets-ctu13.

  8. http://agnigarh.tezu.ernet.in/~dkb/resources.html.

  9. http://www.unb.ca/cic/datasets/ids.html.

  10. If a distance metric is adopted, the computed distances need to be sorted in ascending order; on the other hand, sorting needs to be in descending order in case of similarity metrics.

  11. https://scikit-learn.org/stable/.

  12. https://www.unb.ca/cic/datasets/ids-2017.html.

  13. For a small number of flow records the protocol field is unspecified.

  14. https://allabouttesting.org/golden-eye-ddos-tool-installation-and-tool-usage-with-examples/.

Acknowledgment

Andrea Del Vecchio gratefully acknowledges support by the “Orio Carlini” 2020 GARR Consortium Fellowship.

Author information

Corresponding author

Correspondence to Marta Catillo.

Copyright information

© 2022 IFIP International Federation for Information Processing

About this paper

Cite this paper

Catillo, M., Vecchio, A.D., Pecchia, A., Villano, U. (2022). On the Quality of Network Flow Records for IDS Evaluation: A Collaborative Filtering Approach. In: Clark, D., Menendez, H., Cavalli, A.R. (eds) Testing Software and Systems. ICTSS 2021. Lecture Notes in Computer Science, vol 13045. Springer, Cham. https://doi.org/10.1007/978-3-031-04673-5_16

  • DOI: https://doi.org/10.1007/978-3-031-04673-5_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-04672-8

  • Online ISBN: 978-3-031-04673-5

  • eBook Packages: Computer Science, Computer Science (R0)
