Symposium on Electronic Crime Research

Accepted Research Papers - 2016

eCrime 2016 Accepted Research Papers

Profiling Underground Merchants Based on Network Behavior

Srikanth Sundaresan, UC Berkeley

Damon McCoy, New York University

Sadia Afroz, UC Berkeley

Vern Paxson, UC Berkeley

The domain name system (DNS) is a fundamental protocol of the Internet, and cyber attacks leave their marks deeply buried in massive DNS traffic. In this paper we propose FLOAT, an off-line reputation system based on our large-scale, real-life passive DNS databases. We use Support Vector Machine (SVM) and Random Forest (RF) to build models for known legitimate domains and malicious domains, and utilize these models to compute benign or malicious reputation scores for billions of domains. Our goal is to distinguish malicious domains from others by assigning them very low scores, while allowing a relatively small false positive rate. Experimental results show that either model is effective. SVM is slightly less accurate, and more expensive in terms of training and prediction. RF produces similar accuracy in a faster manner. Our experiments show that FLOAT can identify malicious domains with high accuracy (a true positive rate of 99.40% and a low false positive rate of 0.13%), covering a variety of malicious activities (e.g. phishing, fraud, malicious, fake medicine, fast-flux, DGA, etc.).
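
For readers unfamiliar with the two rates quoted in the abstract, the following minimal sketch shows how a true positive rate and a false positive rate are computed from classification counts. The counts are hypothetical, chosen only so that the arithmetic lands on the reported percentages; they are not taken from the paper, and the function name `rates` is illustrative.

```python
def rates(tp, fn, fp, tn):
    """Return (true positive rate, false positive rate) from confusion counts."""
    tpr = tp / (tp + fn)  # share of malicious domains correctly flagged
    fpr = fp / (fp + tn)  # share of legitimate domains wrongly flagged
    return tpr, fpr


# Hypothetical counts (assumption): 10,000 malicious and 10,000 legitimate domains.
tpr, fpr = rates(tp=9940, fn=60, fp=13, tn=9987)
print(f"TPR = {tpr:.2%}, FPR = {fpr:.2%}")
```

The two rates use different denominators: the TPR is measured over malicious domains only and the FPR over legitimate domains only, so a low FPR can still mean many false alarms when legitimate domains vastly outnumber malicious ones.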

“Smells Phishy?”: An Educational Game about Online Phishing Scams

Malak Baslyman, University of Ottawa

Sonia Chiasson, Carleton University

We propose Smells Phishy?, a board game that contributes to raising users’ awareness of online phishing scams. We designed and developed the board game and conducted user testing with 21 participants. The results showed that after playing the game, participants had a better understanding of phishing scams and learnt how to better protect themselves. Participants enjoyed playing the game and said that it was a fun and exciting experience. The game increased knowledge and awareness, and encouraged discussion.

ENTRADA: Enabling DNS Big Data Applications

Maarten Wullink, SIDN

Moritz Müller, SIDN

Marco Davids, SIDN

Giovane C. M. Moura, SIDN

Cristian Hesselman, SIDN

DNS operators, TLD registries, hosting providers, and other Internet operators are frequently faced with the same question: how to draw insights and knowledge from their respective network traffic data in order to improve their services, security, and operations? “Big data” processing solutions play a major role as an enabling platform in this sense, especially with the increasing growth of the volume of Internet traffic.
With this in mind, we have developed and presented in a previous work ENTRADA, an open-source high-performance Hadoop-based data streaming warehouse designed to both ingest continuous streams of data and deliver interactive response times over large datasets, even in a small cluster. Whereas in the previous study we focused on the architecture and performance evaluation, in this paper we present a series of use cases and applications that cover phishing, botnets, email security, and visualizations. These applications can be directly used by DNS operators, TLD registries, and researchers to quickly analyze their network data and improve their services, security, and operations.

Taking Down Websites to Prevent Crime

Alice Hutchings, University of Cambridge

Richard Clayton, University of Cambridge

Ross Anderson, University of Cambridge

Website takedown has been used to disrupt criminal activities for well over a decade. Yet little is known about its overall effectiveness, particularly as many websites can be replaced rapidly and at little cost. We conducted lengthy interviews with a range of people actively engaged in website takedown, including commercial companies that offer specialist services, organisations targeted by criminals, UK law enforcement and service providers who respond to takedown requests. We found that law enforcement agencies are far less effective at takedown than commercial firms, who get an awful lot more practice. We conclude that the police must either raise their game, or subcontract the process.

REAPER: An automated, scalable solution for mass user credential harvesting and OSINT

Blake Butler, PayPal

Brad Wardman, PayPal

Nate Pratt, PayPal

Releases of usernames and passwords, referred to as credential dumps, have become an increasingly popular shared resource over the past decade, especially within underground communities. Cyber-criminals share compromised credentials to demonstrate technical capability, increase reputation, and augment their legitimacy within criminal communities. There has been minimal research demonstrating standardized methods for identifying the distribution of credential dumps or the origin(s) of where a dump first surfaced. There has also been a lack of research related to the open source intelligence that can be obtained through tracing the distribution of dumps across the Internet. This research presents a method called REAPER which demonstrates how to leverage unique data points within credential dumps to identify their distribution, while also providing an in-depth look into intelligence that can be gained by observing the criminal activities associated with the credentials dumped.

Knowing your Enemies: Leveraging Data Analysis to Expose Phishing Patterns Against a Major US Financial Institution

Javier Vargas, EasySolutions

Alejandro Correa Bahnsen, EasySolutions

Sergio Villegas, EasySolutions

Phishing attacks against financial institutions constitute a major concern and force them to invest thousands of dollars annually in prevention, detection and takedown of these kinds of attacks. This operation is so massive and time critical that there is usually no time to perform analysis to look for patterns and correlations between attacks. In this work we summarize our findings after applying data analysis and clustering analysis to the record of attacks registered for a major financial institution in the US. We use HTML structure and content analysis, as well as domain registration records and DNS RRSets information of the sites, in order to look for patterns and correlations between phishing attacks. It is shown that by understanding and clustering the different types of phishing sites, we are able to identify different strategies used by criminal organizations. Furthermore, the findings of this study provide us valuable insight into who is targeting the institution and their modus operandi, which gives us a solid foundation for the construction of more and better tools for detection and takedown, and eventually for forensic analysts who will be able to correlate cases and perform focused searches that speed up their investigations.

Revisiting Password Rules: Facilitating Human Management of Passwords

Leah Zhang-Kennedy, Carleton University

Sonia Chiasson, Carleton University

Paul van Oorschot, Carleton University

Password rules were established in the context of past security concerns. Recent work in computer security challenges the conventional wisdom of expert password advice, such as change your passwords often, do not reuse your passwords, or do not write your passwords down. The effectiveness of these rules for protecting user accounts against real world attacks is questioned. We review the latest research examining password rules for general-purpose user authentication on the web, and discuss the arguments behind the continued acceptance or the rejection of the rules based on empirical evidence and solid justifications. Following the review, we recommend an updated set of password rules.

Geo-Phisher: The Design and Evaluation of Information Visualizations about Internet Phishing Trends

Leah Zhang-Kennedy, Carleton University

Elias Fares, Carleton University

Sonia Chiasson, Carleton University

Robert Biddle, Carleton University

We designed an information visualization about phishing trends and phishing prevention for the general public to examine the effects of interactivity on information finding, user perceptions and security behaviour intentions, and effectiveness of learning. In a user study (N = 30) with two experimental conditions (HI – high interactivity, and LO – low interactivity control condition), the results show that the HI interactivity condition supported more accurate information finding and resulted in greater perceived interactivity and perceived knowledge than the LO interactivity condition, but did not affect attitudes toward the visualization or security behaviour intentions for proactive awareness. Furthermore, the HI interactivity condition led to greater learning effects and a deeper understanding of phishing prevention than the control condition.

When Cybercrimes Strike Undergraduates

Morvareed Bidgoli, The Pennsylvania State University

Bart P. Knijnenburg, Clemson University

Jens Grossklags, The Pennsylvania State University

Cybercrimes can cause various kinds of harm to those affected. This paper focuses on how cybercrimes impact undergraduate students, a group particularly vulnerable to cybercrimes due to their extensive use of technology and their recently gained financial responsibility and social independence. We present a mixed methods study to understand students’ knowledge, perceptions, and behaviors regarding cybercrimes. Ten semi-structured interviews provided the groundwork for a theoretical model, which was subsequently tested on a sample of 222 survey responses. We find that roughly half of the undergraduates in our studies have experienced one or more cybercrimes while in college, with malware, hacking, and phishing being the most prominently experienced cybercrimes. Furthermore, we find that students acquire their knowledge of cybercrimes predominantly through the media and through people they personally know who have been victimized by a cybercrime. Our model shows how students’ knowledge of cybercrimes and their self-control in using the Internet influence their perceived cybercrime self-efficacy and their fear of cybercrimes. Self-efficacy and fear, in turn, influence their tendency to take preventative measures to avoid enabling behaviors and to report eventual cybercrimes to the appropriate entities. We also find that despite the reported importance of adequate cybercrime reporting and access to comprehensive cybercrime statistics, the majority of students do not know how to officially report a cybercrime.

Behind Closed Doors: Measurement and Analysis of CryptoLocker Ransoms in Bitcoin

Kevin Liao, Arizona State University

Ziming Zhao, Arizona State University

Adam Doupe, Arizona State University

Gail-Joon Ahn, Arizona State University

Bitcoin, a decentralized cryptographic currency that has experienced proliferating popularity over the past few years, is the common denominator in a wide variety of cybercrime. We perform a measurement analysis of CryptoLocker, a family of ransomware that encrypts a victim’s files until a ransom is paid, within the Bitcoin ecosystem from September 5, 2013 through January 31, 2014. Using information collected from online fora, such as reddit and BitcoinTalk, as an initial starting point, we generate a cluster of 968 Bitcoin addresses belonging to CryptoLocker. We provide a lower bound for CryptoLocker’s economy in Bitcoin and identify 795 ransom payments totalling 1,128.40 BTC ($310,472.38), but show that the proceeds could have been worth upwards of $1.1 million at peak valuation. By analyzing ransom payment timestamps both longitudinally across CryptoLocker’s operating period and transversely across times of day, we detect changes in distributions and form conjectures on CryptoLocker that corroborate information from previous efforts. Additionally, we construct a network topology to detail CryptoLocker’s financial infrastructure and obtain auxiliary information regarding CryptoLocker’s actions, victims, and potential conspirators. Most notably, we find evidence that suggests connections to popular Bitcoin services, such as Bitcoin Fog and BTC-e, and subtle links to other cybercrimes surrounding Bitcoin, such as the Sheep Marketplace scam of 2013. We use our study to exemplify the value of measurement analyses and threat intelligence in understanding the erratic cybercrime landscape.
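
As a quick sanity check on the figures quoted in the abstract, the sketch below derives a few averages from the reported totals. Only the three input numbers and the $1.1 million peak figure come from the abstract; the derived values are back-of-the-envelope arithmetic, not results reported by the authors, and the peak price is approximate because the abstract says "upwards of".

```python
# Figures quoted in the abstract
total_btc = 1128.40       # BTC collected across identified ransom payments
total_usd = 310_472.38    # USD value reported alongside the BTC total
payments = 795            # number of identified ransom payments
peak_usd = 1_100_000      # approximate worth at peak valuation

# Derived values (illustrative arithmetic only)
avg_btc_per_payment = total_btc / payments
avg_usd_per_payment = total_usd / payments
implied_usd_per_btc = total_usd / total_btc       # effective average exchange rate
implied_peak_usd_per_btc = peak_usd / total_btc   # price needed to reach the peak figure

print(f"average payment: {avg_btc_per_payment:.2f} BTC (${avg_usd_per_payment:.2f})")
print(f"implied average rate: ${implied_usd_per_btc:.2f} per BTC")
print(f"implied peak rate: ${implied_peak_usd_per_btc:.2f} per BTC")
```

The gap between the implied average rate and the implied peak rate illustrates why the abstract reports both a lower bound and a peak valuation: the dollar value of the same Bitcoin holdings varied several-fold over the period.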

Evaluating Randomness in Cyber Attack Textual Artifacts

Xuan Zhao, Cylance

Matt Wolff, Cylance

Jay Luan, Cylance

Textual data indicators can provide valuable insight for identifying potential malicious activity. There are various scenarios where cyber attacks will leave textual clues, for example the use of DNS names or string data in binary files. Several techniques can be used to evaluate whether these textual clues provide useful information for the purpose of detecting attacks. In this paper, we aim to determine whether textual data can be considered human-generated text or randomly generated by computers. Here we particularly consider the specific textual artifact of filenames. As dropping/copying/creating files with randomly generated filenames is a common behavior of malware, detecting this behavior through detecting randomly generated filenames would help identify a cyber attack. To measure whether a filename is human-generated, we discuss several features designed to differentiate randomly generated text from human-generated text, and build a classification model based on these features. On test data of 1 million human-generated filenames and 1 million randomly generated filenames, our model achieves an accuracy of 98.2940% in classifying human-generated filenames, and an accuracy of 97.8378% in classifying randomly generated filenames.
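
The abstract does not list the features used, so the following is only a sketch of the kind of signal such a classifier might consume. The two features here, normalized character entropy and vowel ratio, are assumptions chosen for illustration and are not claimed to be the authors' features; all function names are hypothetical.

```python
import math
from collections import Counter

VOWELS = set("aeiou")


def char_entropy(name):
    """Shannon entropy (bits per character) of the character distribution."""
    if not name:
        return 0.0
    total = len(name)
    counts = Counter(name)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def vowel_ratio(name):
    """Fraction of alphabetic characters that are vowels."""
    letters = [ch for ch in name.lower() if ch.isalpha()]
    if not letters:
        return 0.0
    return sum(ch in VOWELS for ch in letters) / len(letters)


def features(filename):
    """Illustrative feature vector computed on the filename without its extension."""
    stem = filename.rsplit(".", 1)[0]
    entropy = char_entropy(stem)
    # Divide by the maximum possible entropy for this length so that
    # short and long names are comparable (1.0 means no repeated characters).
    max_entropy = math.log2(len(stem)) if len(stem) > 1 else 1.0
    return {
        "norm_entropy": entropy / max_entropy,
        "vowel_ratio": vowel_ratio(stem),
    }


print(features("quarterly_report.docx"))
print(features("xq7zk2vjw9.exe"))
```

Human-chosen names tend to repeat characters and contain a natural share of vowels, while uniformly random strings rarely do either. A real classifier would combine many such features and learn the decision boundary from labeled data rather than relying on fixed thresholds.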