Analyzing HTTPS encrypted traffic to identify user's operating system, browser and application

Published on Jan 1, 2017 in CCNC (Consumer Communications and Networking Conference)
DOI: 10.1109/CCNC.2017.8013420
Jonathan Muehlstein (Ariel University), estimated H-index: 2
Yehonatan Zion (Ariel University), estimated H-index: 2
+ 4 authors
Ofir Pele (Ariel University), estimated H-index: 8
Desktops and laptops can be maliciously exploited to violate privacy. There are two main types of attack scenarios: active and passive. In this paper, we consider the passive scenario, where the adversary does not interact actively with the device but is able to eavesdrop on the device's network traffic from the network side. Most internet traffic is encrypted, which makes passive attacks challenging. We show that an external attacker can identify the operating system, browser and application behind encrypted HTTP traffic (HTTPS). To the best of our knowledge, this is the first work that shows this. We provide a large data set of more than 20,000 examples for this task. Additionally, we suggest new features for this task. We run a thorough set of experiments, which show that our classification accuracy is 96.06%.
  • References (48)
  • Citations (11)
#1 Ran Dubin (BGU: Ben-Gurion University of the Negev), H-Index: 5
#2 Amit Dvir (Ariel University), H-Index: 8
Last. Ofer Hadar (Ariel University), H-Index: 16
view all 4 authors...
Desktops can be exploited to violate privacy. There are two main types of attack scenarios: active and passive. We consider the passive scenario where the adversary does not interact actively with the device, but is able to eavesdrop on the network traffic of the device from the network side. In the near future, most Internet traffic will be encrypted and thus passive attacks are challenging. Previous research has shown that information can be extracted from encrypted multimedia streams. This in...
9 Citations
#1 Martin Husák (Masaryk University), H-Index: 5
#2 Milan Čermák (Masaryk University), H-Index: 5
Last. Pavel Čeleda (Masaryk University), H-Index: 11
view all 4 authors...
The encryption of network traffic complicates legitimate network monitoring, traffic analysis, and network forensics. In this paper, we present real-time lightweight identification of HTTPS clients based on network monitoring and SSL/TLS fingerprinting. Our experiment shows that it is possible to estimate the User-Agent of a client in HTTPS communication via the analysis of the SSL/TLS handshake. The fingerprints of SSL/TLS handshakes, including a list of supported cipher suites, differ among cl...
14 Citations
#1 Shan Suthaharan (UNCG: University of North Carolina at Greensboro), H-Index: 14
Support Vector Machine is one of the classical machine learning techniques that can still help solve big data classification problems. Especially, it can help the multidomain applications in a big data environment. However, the support vector machine is mathematically complex and computationally expensive. The main objective of this chapter is to simplify this approach using process diagrams and data flow diagrams to help readers understand theory and implement it successfully. To achieve this o...
716 Citations
#1 Ran Dubin, H-Index: 5
#2 Amit Dvir, H-Index: 8
Last. Ofir Trabelsi, H-Index: 2
view all 6 authors...
The increasing popularity of HTTP adaptive video streaming services has dramatically increased bandwidth requirements on operator networks, which attempt to shape their traffic through Deep Packet Inspection (DPI). However, Google and certain content providers have started to encrypt their video services. As a result, operators often encounter difficulties in shaping their encrypted video traffic via DPI. This highlights the need for new traffic classification methods for encrypted HTTP adaptive...
8 Citations
#1 Mauro Conti (UNIPD: University of Padua), H-Index: 36
#2 Luigi V. Mancini, H-Index: 31
Last. Nino Vincenzo Verde, H-Index: 14
view all 4 authors...
Mobile devices can be maliciously exploited to violate the privacy of people. In most attack scenarios, the adversary takes the local or remote control of the mobile device, by leveraging a vulnerability of the system, hence sending back the collected information to some remote web service. In this paper, we consider a different adversary, who does not interact actively with the mobile device, but he is able to eavesdrop the network traffic of the device from the network side (e.g., controlling ...
68 Citations
#1 Tomasz Bujlow (AAU: Aalborg University), H-Index: 9
#2 Valentín Carela-Español (UPC: Polytechnic University of Catalonia), H-Index: 7
Last. Pere Barlet-Ros (UPC: Polytechnic University of Catalonia), H-Index: 14
view all 3 authors...
Deep Packet Inspection (DPI) is the state-of-the-art technology for traffic classification. According to the conventional wisdom, DPI is the most accurate classification technique. Consequently, most popular products, either commercial or open-source, rely on some sort of DPI for traffic classification. However, the actual performance of DPI is still unclear to the research community, since the lack of public datasets prevent the comparison and reproducibility of their results. This paper presen...
67 Citations
#1 Zigang Cao (CAS: Chinese Academy of Sciences), H-Index: 4
#2 Gang Xiong (CAS: Chinese Academy of Sciences), H-Index: 12
Last. Li Guo (CAS: Chinese Academy of Sciences), H-Index: 25
view all 5 authors...
With the widespread use of encryption techniques in network applications, encrypted network traffic has recently become a great challenge for network management. Studies on encrypted traffic classification not only help to improve the network service quality, but also assist in enhancing network security. In this paper, we first introduce the basic information of encrypted traffic classification, emphasizing the influences of encryption on current classification methodology. Then, we summarize t...
20 Citations
Aug 28, 2014 in DCNET (International Conference on Data Communication Networking)
#1 Petr Matousek (Brno University of Technology), H-Index: 5
#2 Ondrej Rysavy (Brno University of Technology), H-Index: 6
Last. Martin Vymlatil (Brno University of Technology), H-Index: 2
view all 4 authors...
This paper deals with identification of operating systems (OSs) from the Internet traffic. Every packet injected on the network carries a specific information in its packet header that reflects the initial settings of a host's operating system. The set of such features forms a fingerprint. The OS fingerprint usually includes an initial TTL time, a TCP initial window time, a set of specific TCP options, and other values obtained from IP and TCP headers. Identification of OSs can be useful for mon...
10 Citations
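The fingerprint fields named in the abstract above (initial TTL, TCP window value, TCP options) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that a passive observer sees a TTL that has already been decremented once per router hop, and that senders commonly start from 32, 64, 128 or 255, so the initial value is estimated by rounding the observed TTL up to the nearest of those. The names `COMMON_INITIAL_TTLS`, `estimate_initial_ttl`, `OSFingerprint` and `build_fingerprint` are hypothetical, as are the sample packet values.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

# Assumption: initial TTL values commonly used by IP stacks.
COMMON_INITIAL_TTLS = (32, 64, 128, 255)


def estimate_initial_ttl(observed_ttl: int) -> Optional[int]:
    """Round an observed TTL up to the nearest common initial TTL.

    Returns None when the observed value is outside the valid 1..255 range.
    """
    if not 1 <= observed_ttl <= 255:
        return None
    for candidate in COMMON_INITIAL_TTLS:
        if observed_ttl <= candidate:
            return candidate
    return None


@dataclass(frozen=True)
class OSFingerprint:
    """Hypothetical container for passively observed header features."""
    initial_ttl: Optional[int]
    window_size: int
    tcp_options: Tuple[str, ...]


def build_fingerprint(observed_ttl: int, window_size: int,
                      tcp_options: Sequence[str]) -> OSFingerprint:
    """Combine header fields of one packet into a hashable fingerprint."""
    return OSFingerprint(
        initial_ttl=estimate_initial_ttl(observed_ttl),
        window_size=window_size,
        tcp_options=tuple(tcp_options),  # option order is preserved
    )


if __name__ == "__main__":
    # Illustrative values only; not taken from any real capture.
    print(build_fingerprint(57, 29200, ["MSS", "SACK", "TS", "NOP", "WS"]))
```

Because the fingerprint is a frozen dataclass, two packets from the same host observed at different hop distances compare equal as long as they round to the same initial TTL, which is the property a lookup table of known fingerprints would rely on.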
#1 Walter de Donato, H-Index: 14
#2 Antonio Pescape, H-Index: 30
Last. Alberto Dainotti (UCSD: University of California, San Diego), H-Index: 21
view all 3 authors...
The availability of open source traffic classification systems designed for both experimental and operational use, can facilitate collaboration, convergence on standard definitions and procedures, and reliable evaluation of techniques. In this article, we describe Traffic Identification Engine (TIE), an open source tool for network traffic classification, which we started developing in 2008 to promote sharing common implementations and data in this field. We designed TIE's architecture and funct...
36 Citations
Jan 1, 2013 in TMA (Traffic Monitoring and Analysis)
#1 Silvio Valenti (ENST: Télécom ParisTech), H-Index: 12
#2 Dario Rossi (ENST: Télécom ParisTech), H-Index: 30
Last. Marco Mellia (Polytechnic University of Turin), H-Index: 37
view all 6 authors...
Traffic classification has received increasing attention in the last years. It aims at offering the ability to automatically recognize the application that has generated a given stream of packets from the direct and passive observation of the individual packets, or stream of packets, flowing in the network. This ability is instrumental to a number of activities that are of extreme interest to carriers, Internet service providers and network administrators in general. Indeed, traffic classificati...
44 Citations
Cited By (11)
#1 Omar Richardson (Karlstad University), H-Index: 3
#2 Johan Garcia (Karlstad University), H-Index: 8
High level traffic characteristics have the potential to be useful for inference of various host characteristics. This work proposes the novel Flow-Discretize Order (FDO) approach for describing session characteristics in an intuitive manner, while also retaining flow ordering information. The FDO approach allows for flexible construction of flow descriptors, by using different flow properties and applying appropriate discretization. The individual flow descriptors are concatenated to form sessi...
view all 2 authors...
HTTPS is gaining widespread popularity for performing secure transactions. Most popular sites have made default choice as HTTPS. Therefore, this paper makes a survey through various study done in the area and it has comprehensively explored the various tools, technologies, and mechanisms to deal with secured network in a robust way. We make a complete analysis and evaluation of HTTPS protocol–is it ensuring security or are we entering into a vicious cycle of finding weaknesses and trying to fill...
#2 Daniel Morato, H-Index: 9
Last. Mikel Izal, H-Index: 8
view all 4 authors...
The measurement of response time in web based applications is a common task for the evaluation of service responsiveness and the detection of network or server problems. Traffic analysis is the most common strategy for obtaining response time measurements. However, when the traffic is encrypted, the analysis tools cannot provide these measurement results. In this paper we propose a methodology for measuring the response time in HTTPS traffic based on the flow of data in each direction. We have v...
#1 Furat Al-Obaidy (RyeU: Ryerson University), H-Index: 2
#2 Shadi Momtahen (RyeU: Ryerson University)
Last. Farah Mohammadi (RyeU: Ryerson University), H-Index: 7
view all 4 authors...
Increasing the deployment of encryption in network protocols and applications poses a challenge for traditional traffic classification approaches. Social media applications such as Skype, WhatsApp, Facebook, YouTube etc. as popular representatives of encrypted traffics have attracted big attention to communication and entertainment. Therefore, the accurate identification of them within encrypted traffic has become a big issue and a hot topic to explore them in detail. In this context, Machine Le...
#1 Xinxin Lou (Bielefeld University), H-Index: 1
#2 Karl Waedt (Areva), H-Index: 1
Last. Deeksha Gupta (TUD: Dresden University of Technology), H-Index: 1
view all 6 authors...
The cybersecurity issue becomes increasingly important with the development of the Industrial IoT (IIoT) and Industrial 4.0 architectures. The instance of cyberattacks against infrastructures and Industrial Control System (ICS) in safety critical domains is increasing every year. How to alleviate this situation is a challenging topic. In many cases, threats to cybersecurity are only discovered after they have led to a disaster. In this paper, we are going to analyze those cybersecurity issues fr...
1 Citations
#1 Erik Arestrom (Linköping University)
#2 Niklas Carlsson (Linköping University), H-Index: 20
Timely and accurate flow classification is important for identifying flows with different service requirements, optimized network management, and for helping network operators simultaneously operate networks at higher utilization while providing end users good quality of experience (QoE). With most services starting to use end-to-end encryption (HTTPS and QUIC), traditional Deep Packet Inspection (DPI) and port-based approaches are no longer applicable. Furthermore, most flow-level-based approac...
Network traffic classification, which has numerous applications from security to billing and network provisioning, has become a cornerstone of today's computer networks. Previous studies have developed traffic classification techniques using classical machine learning algorithms and deep learning methods when large quantities of labeled data are available. However, capturing large labeled datasets is a cumbersome and time-consuming process. In this paper, we propose a semi-supervised approach th...
6 Citations
#1 Anshu Priya (National Institute of Technology, Arunachal Pradesh)
#2 Sunit Kumar Nandi (National Institute of Technology, Arunachal Pradesh)
Last. R S Goswami (National Institute of Technology, Arunachal Pradesh)
view all 3 authors...
The Internet is drastically becoming part of our life as well as work. Every aspect of life is somehow associated with Internet due to which communication and technology needs to be getting advanced day by day. With the affluent usage of encrypted network data, its classification has been predominantly accepted these days. The Classification of network traffic can be defined as the procedure of identification and analysis of application and protocol in the network. It has power to manage and sol...
#1 Jan Kohout (CTU: Czech Technical University in Prague), H-Index: 4
#2 Tomáš Komárek (CTU: Czech Technical University in Prague), H-Index: 2
Last. Jakub Lokoč (Charles University in Prague), H-Index: 13
view all 5 authors...
Encrypted communication on the Internet using the HTTPS protocol represents a challenging task for network intrusion detection systems. While it significantly helps to preserve users’ privacy, it also limits a detection system’s ability to understand the traffic and effectively identify malicious activities. In this work, we propose a method for modeling and representation of encrypted communication from logs of web communication. The idea is based on introducing communication snapshots...
4 Citations