Dynamic Whitelisting Using Locality Sensitive Hashing

Pryde, Jayson; Angeles, Nestle; Carinan, Sheryl Kareen

doi:10.1007/978-3-030-04503-6_19


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11154)

Included in the following conference series:

Pacific-Asia Conference on Knowledge Discovery and Data Mining


Abstract

Computer systems may employ some form of whitelisting for execution control, verification, minimizing false positives from other detection methods, or other purposes. A legitimate file in a whitelist may be represented by its cryptographic hash, such as one generated using the SHA-1 or MD5 hash function. Because any small change to a file results in a completely different cryptographic hash, a file whose hash is in a whitelist may no longer be identifiable in the whitelist if the file is modified even slightly. This prevents a target file from being identified as legitimate even if it is simply a new version of a whitelisted legitimate file.
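The brittleness of exact-match whitelisting can be illustrated with a minimal sketch. The file contents below are made-up stand-ins for two versions of the same program, and the helper names are hypothetical; only the Python standard library is assumed.

```python
import hashlib

def sha1_hex(data: bytes) -> str:
    """Return the SHA-1 digest of data as a hex string."""
    return hashlib.sha1(data).hexdigest()

def bit_difference(hex_a: str, hex_b: str) -> int:
    """Count the bits that differ between two equal-length hex digests."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")

# Made-up contents: two "versions" that differ in a single byte.
original = b"MZ" + b"\x00" * 1024 + b"legitimate program v1.0"
patched = b"MZ" + b"\x00" * 1024 + b"legitimate program v1.1"

# An exact-match whitelist built from the original version only.
whitelist = {sha1_hex(original)}

print(sha1_hex(patched) in whitelist)  # False: the new version is not recognized
print(bit_difference(sha1_hex(original), sha1_hex(patched)), "of 160 bits differ")
```

For a well-behaved cryptographic hash, roughly half of the output bits are expected to change, so the two digests carry no usable information about how similar the inputs are.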

Locality sensitive hashing is a state-of-the-art method in big data and machine learning for the scalable application of approximate nearest neighbor search in high-dimensional spaces [9]. The identification of executable files that are very similar to known legitimate executable files fits well within this paradigm.
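As a rough illustration of the general idea, the sketch below uses MinHash signatures with banding, a textbook locality sensitive hashing scheme of the kind covered in [9]. It is not TLSH, which uses a different construction; the shingle size, number of hash functions, band count, and all names are illustrative assumptions.

```python
import hashlib
from collections import defaultdict

def shingles(data: bytes, k: int = 4) -> set:
    """Return the set of overlapping k-byte substrings of data."""
    if len(data) < k:
        return {data}
    return {data[i:i + k] for i in range(len(data) - k + 1)}

def minhash_signature(items: set, num_hashes: int = 32) -> list:
    """For each seeded hash function, keep the minimum hash over all items."""
    signature = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(4, "big")
        signature.append(
            min(hashlib.blake2b(salt + s, digest_size=8).digest() for s in items)
        )
    return signature

def band_keys(signature: list, bands: int = 8) -> list:
    """Split the signature into bands; each band becomes one bucket key."""
    rows = len(signature) // bands
    return [(b, tuple(signature[b * rows:(b + 1) * rows])) for b in range(bands)]

class LshIndex:
    """Items sharing at least one band land in a common bucket."""

    def __init__(self):
        self.buckets = defaultdict(set)

    def add(self, name: str, data: bytes) -> None:
        for key in band_keys(minhash_signature(shingles(data))):
            self.buckets[key].add(name)

    def candidates(self, data: bytes) -> set:
        found = set()
        for key in band_keys(minhash_signature(shingles(data))):
            found |= self.buckets.get(key, set())
        return found

index = LshIndex()
index.add("known_good", b"The quick brown fox jumps over the lazy dog. " * 20)
# A near-duplicate query: one extra word appended to the same content.
print(index.candidates(b"The quick brown fox jumps over the lazy dog. " * 20 + b"patch"))
```

A query touches only the buckets its own bands map to, so lookup cost does not grow linearly with the number of indexed files. Candidates are approximate: similar inputs are likely, not guaranteed, to collide.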

In this paper, we show the effectiveness of applying TLSH [1, 2], Trend Micro’s implementation of locality sensitive hashing, to identify files similar to legitimate executable files. We start with a brief explanation of locality sensitive hashing and TLSH. We then introduce the concept of whitelisting and describe typical modifications made to legitimate executable files, such as security updates, patches, functionality enhancements, and file corruption. We also describe the scalability problems posed by the sheer number of legitimate executable files available on the Windows OS, and show results of similarity testing against malicious files (malware). Data is provided on the efficacy and scalability of this approach. We conclude with a discussion of how this new methodology may be employed in a variety of computer security applications to improve the functionality and operation of a computer system. Examples include whitelisting, overriding malware detection performed by a machine learning system, identifying corrupted legitimate files, and identifying new versions of legitimate files.
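The whitelisting decision described above can be sketched as a nearest-match lookup under a distance threshold. In the sketch below the digests are short made-up hex strings, the distance is a simple bit-level Hamming distance used as a stand-in, and the threshold of 8 is an arbitrary assumption; none of these come from the paper. With the Python binding from [1], `tlsh.hash` and `tlsh.diff` could presumably take the place of the digest and distance functions.

```python
from typing import Callable, Dict, Optional, Tuple

def hamming_hex(a: str, b: str) -> int:
    """Stand-in distance: differing bits between two equal-length hex digests."""
    if len(a) != len(b):
        raise ValueError("digests must have the same length")
    return bin(int(a, 16) ^ int(b, 16)).count("1")

def nearest_whitelisted(
    target: str,
    whitelist: Dict[str, str],
    distance: Callable[[str, str], int] = hamming_hex,
    threshold: int = 8,  # hypothetical cut-off, to be tuned on real data
) -> Optional[Tuple[str, int]]:
    """Return (name, distance) of the closest whitelisted digest within threshold."""
    best = None
    for name, digest in whitelist.items():
        d = distance(target, digest)
        if d <= threshold and (best is None or d < best[1]):
            best = (name, d)
    return best

# Made-up similarity digests for two known legitimate files.
whitelist = {"notepad.exe v1": "a3f0c2e1", "calc.exe v1": "0b7d9e44"}

print(nearest_whitelisted("a3f0c2e3", whitelist))  # ('notepad.exe v1', 1)
print(nearest_whitelisted("ffffffff", whitelist))  # None
```

The linear scan is only for clarity; at the scale the abstract describes, the candidate set would need to come from an index rather than a full pass over the whitelist.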


References

1. TLSH open source repository on GitHub. https://github.com/trendmicro/tlsh/
2. Oliver, J., Cheng, C., Chen, Y.: TLSH - A Locality Sensitive Hash. In: 4th Cybercrime and Trustworthy Computing Workshop, Sydney, November 2013. https://github.com/trendmicro/tlsh/blob/master/TLSH_CTC_final.pdf
3. Oliver, J., Forman, S., Cheng, C.: Using randomization to attack similarity digests. In: Batten, L., Li, G., Niu, W., Warren, M. (eds.) ATIS 2014. CCIS, vol. 490, pp. 199–210. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-662-45670-5_19
4. Oliver, J., Pryde, J.: Smart Whitelisting Using Locality Sensitive Hashing. https://blog.trendmicro.com/trendlabs-security-intelligence/smart-whitelisting-using-locality-sensitive-hashing/
5. Implementing Application Whitelisting. Australian Signals Directorate. https://www.asd.gov.au/publications/protect/application_whitelisting.htm
6. How CyberCrime Exploits Digital Certificates. InfoSec Institute. http://resources.infosecinstitute.com/cybercrime-exploits-digital-certificates/#gref
7. Krebs, B.: Security Firm Bit9 Hacked, Used to Spread Malware. https://krebsonsecurity.com/2013/02/security-firm-bit9-hacked-used-to-spread-malware/
8. IOPI: Borrowing Microsoft Code Signing Certificates. https://blog.conscioushacker.io/index.php/2017/09/27/borrowing-microsoft-code-signing-certificates/
9. Rajaraman, A., Ullman, J.: Mining of Massive Datasets (2010). (Chapter 3)
10. Ni, Y., Chu, K., Bradley, J.: Detecting Abuse at Scale: Locality Sensitive Hashing at Uber Engineering (2017). https://eng.uber.com/lsh/


Author information

Authors and Affiliations

Trend Micro Inc., PH, 7th Floor, Tower 2, Rockwell Business Center, Ortigas Ave., Pasig, Philippines
Jayson Pryde, Nestle Angeles & Sheryl Kareen Carinan


Corresponding author

Correspondence to Jayson Pryde.

Editor information

Editors and Affiliations

University of Melbourne, Melbourne, VIC, Australia
Mohadeseh Ganji
University of Melbourne, Melbourne, VIC, Australia
Lida Rashidi
McGill University, Montreal, QC, Canada
Benjamin C. M. Fung
Griffith University, Gold Coast, QLD, Australia
Can Wang


About this paper

Cite this paper

Pryde, J., Angeles, N., Carinan, S.K. (2018). Dynamic Whitelisting Using Locality Sensitive Hashing. In: Ganji, M., Rashidi, L., Fung, B., Wang, C. (eds) Trends and Applications in Knowledge Discovery and Data Mining. PAKDD 2018. Lecture Notes in Computer Science, vol 11154. Springer, Cham. https://doi.org/10.1007/978-3-030-04503-6_19


DOI: https://doi.org/10.1007/978-3-030-04503-6_19
Published: 21 November 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-04502-9
Online ISBN: 978-3-030-04503-6
eBook Packages: Computer Science, Computer Science (R0)
