
Justin Wickett

4/28/07

Duke 2010

 

The Evolution of the World Wide Web and Its Effects on Security

 

            Over recent years, the World Wide Web has rapidly evolved from a small computer science project into an international collaboration of global networks that drives many of the world’s markets. Because the Internet and the World Wide Web have changed so drastically, the underlying protocols that allow people to communicate have been modified to support new services. To preserve backward compatibility, many of today’s web services still rely on these older protocols to accomplish their goals on the Internet. However, these web services push the protocols’ limits by forcing them to perform tasks they were not initially designed to handle. As a result, many of these new web services contain exploitable security flaws. In order to stay ahead of their competition, many developers integrate new features with older protocols and do not take the time to implement the appropriate security patches. The result is that an increasing number of web services are insecure and pose a risk to user privacy. In order to protect their users, web developers must choose wisely which features and protocols to incorporate into their web services and take the time to patch the vulnerabilities that compromise system security.

            When the World Wide Web originally took shape and started to become popular, it was powered by stateless protocols that rendered static pages. Using simple hypertext markup, programmers were able to connect pages together by relying on basic links. Tim Berners-Lee, the director of the World Wide Web Consortium and the inventor of the HTTP protocol as well as the HTML language, states that “the Web’s major goal was to be a shared information space through which people and machines could communicate” [1]. This goal is still in the process of being achieved today, even though the Web we use is very different from what was originally created. Many of today’s web services, which have been defined as “Web 2.0” by [2], are based on user-generated content that is uploaded into a SQL database via server-side programming languages. Since this private information may be shared between parties or used as a basis for commerce, proper levels of security must be implemented. HTTP statelessness has been replaced with AJAX, Flash, Java applets, frames, and many other technologies to create the illusion of a seamless browsing experience. However, problems arise since many of the underlying concepts and principles that were used to build the original basic web pages are still being used today to create advanced functionality. Even though web developers constantly implement new features, they rarely dedicate the time to account for the increased security risks. By adding onto the existing simple architecture and by improving the standard Document Object Model, developers hope to promote interoperability and allow for any necessary expansions. The result has been a “new” World Wide Web powered by innovative and complex protocols that take advantage of what the original system had to offer. However, many of these changes were not anticipated when the Web was first implemented. 
As a result, the intricacy of this new Web creates a convoluted environment where security vulnerabilities are more prominent, and thus users run a higher risk of having their privacy breached.

            Over the years, the need for a more robust client-to-server interaction scheme has become a high priority concern. In order to address this problem, older layer 7 protocols that provided services directly to client applications were modified to contain information concerning the user’s state. While the Representational State Transfer Architecture (REST) worked fantastically well for traditional websites offering stateless content that didn’t require any form of authentication, developers wanted to expand on the simple HTTP header to make use of cookies on end users’ machines. HTTP cookies act as “memory” for websites, which turns a web application into a finite state machine based on the contents stored in the cookie. RFC 2109 proposed by Kristol and Montulli laid out the foundation for cookies and sessions to store state so that users could enjoy a custom tailored web experience [3]. However, by incorporating cookies and other HTTP state management mechanisms, the REST philosophy was being violated due to additional complexity that would have to be carefully accounted for [4]. Ever since the introduction of cookies, malicious hackers have attempted to steal a user’s state to take on his or her identity. Many web services have made use of cookies without fully addressing the associated security risks. Because these web developers are busy incorporating intricate features with their service, cookies are often left vulnerable to being attacked. Since a single vulnerability can compromise an entire system, these flaws pose an immediate risk to the overall security. With the introduction of HTTP cookies, the architecture of the Web has evolved from being REST-based to relying on Remote Procedure Calls (RPC) where state is stored on the client and server side. Because of this new design, programmers and users alike have traded in simplicity and speed due to efficient caching for a more dynamic web experience. 
However, due to the modifications required to implement cookies, new vulnerabilities in the application layer protocol have rendered the modern dynamic Web susceptible to various attacks.
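
The way a cookie turns a stateless exchange into a stateful one can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the cookie name `sid`, the in-memory `sessions` table, and both function names are hypothetical and are not taken from RFC 2109 or from any particular site. The server stores state under a random identifier, hands that identifier to the browser in a Set-Cookie header, and looks the state up again when the browser echoes the cookie back.

```python
from http.cookies import SimpleCookie
import secrets

# Hypothetical in-memory session table: session id -> per-user state.
sessions = {}


def login_response_headers(username):
    """Create server-side state and return the Set-Cookie value that points to it."""
    session_id = secrets.token_hex(16)  # unguessable random identifier
    sessions[session_id] = {"user": username}
    cookie = SimpleCookie()
    cookie["sid"] = session_id
    cookie["sid"]["path"] = "/"
    return cookie["sid"].OutputString()


def user_for_request(cookie_header):
    """Recover the stored state from the Cookie header sent back by the browser."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    if "sid" not in cookie:
        return None
    state = sessions.get(cookie["sid"].value)
    return state["user"] if state else None
```

Note that `user_for_request` trusts whoever presents the cookie value. Anyone who obtains a copy of that value is treated as the original user, which is the weakness the paragraph above describes.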

            Even though HTTP cookies allowed websites to have state, developers were still in need of an efficient method to transfer user-specific data from one page to another. Web 2.0 sites began to make extensive use of JSON objects and arrays as specified in [5] to implement lightweight messaging schemes. As a result, JSON rapidly became an essential part of AJAX web programming to retrieve data in real time without needing to reload the page. However, JSON has come under scrutiny for security reasons as developers and users have discovered potential ways it can be used to leak sensitive data. Even though JSON is a powerful format that allows for innovative web services, developers must take into consideration its associated security risks. In order to create versatile web services, companies such as Yahoo! and Google began to exchange data using JSON. However, even though it is widely deployed and popular among major web services, developers should reevaluate security risks after having incorporated JSON into their website’s service. Because JSON loaded through a script tag is not subject to the same origin policy, [6] states that a malicious hacker can create a webpage that imports JSON objects from the sites a user is logged into. By doing this, a malicious hacker would be able to access and steal a user’s sensitive information that is being transmitted within a JSON object. For this reason, the use of JSON can be classified as a potential security threat. Because JSON is requested over the HTTP protocol, the browser attaches the user’s private cookies for the target site to the request, so the server answers as if the user had asked for the data. The contents of the response can then be sent back to the malicious hacker’s computer via an XMLHttpRequest where they can be stored for future retrieval. With this information, a malicious hacker may attempt to gain access to a user’s account, which has the potential to lead to an unsuspected privacy and security breach. 
However, there are also benefits associated with allowing JSON to bypass the same origin policy. Web developers can create efficient mashups by pulling together JSON objects from different sites to build a page that combines elements from web services that interest the user. Because of this powerful feature, JSON is likely to retain its unique ability despite the security vulnerability that puts users at risk of attacks such as session stealing and snooping. Therefore, it is up to developers who make use of JSON to ensure that it is properly secured. Fortify Software termed this type of attack “JavaScript Hijacking” and noted that many of the popular frameworks utilized by Web 2.0 companies are vulnerable [6]. Still, many web developers are not aware of this newly discovered exploit associated with JSON and use it without implementing additional security layers or considering a more secure alternative messaging scheme. In order to protect against these types of attacks, security experts recommend requiring unique identifiers, which conform to the same origin policy (such as cookies), to be embedded with the original JSON request [6]. By doing this, servers can verify that the request is coming from a legitimate user and not a malicious site. Because traditional websites that were around when the Web first started could not incorporate AJAX functionality, this security threat never applied to them. However, because the Web has changed so drastically, developers must take new security vulnerabilities into account and attempt to patch them. Sending JSON-RPC data over stream connections instead of using the HTTP protocol is another potential patch developers should consider. By transmitting JSON objects as TCP data, web developers would effectively eliminate the transmission of cookies within the HTTP header. However, this option is not always available to RPC-based services that rely on HTTP cookies. 
RFC 4627, which describes JSON, was published in July of 2006 and is still relatively new, so a revision may appear soon to address this problem [5]. Still, as the Web evolves and becomes more complex, JSON is expected to remain a standard way of performing lightweight data interchange over a network connection due to the numerous language bindings that are publicly available. As a result, web developers must dedicate the time required to implement additional layers of security to prevent a malicious hacker from violating a user’s privacy by exploiting these JSON weaknesses.
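
The defense recommended in [6], embedding a same-origin identifier in the JSON request itself, can be sketched as follows. This is a minimal illustration and not the implementation of any framework: the function name `handle_json_request` and its parameters are hypothetical. It assumes the site’s own script copies the session cookie value into a request parameter. A page on another origin can cause the browser to send the cookie, but it cannot read the cookie and therefore cannot supply a matching parameter.

```python
import hmac
import json


def handle_json_request(cookie_session_id, request_token, payload):
    """Return (status, body); release JSON only when the token matches the cookie.

    cookie_session_id: value the browser sent automatically in the Cookie header.
    request_token: value the page's own script placed in the request (hypothetical).
    """
    if not cookie_session_id or not request_token:
        return 403, ""
    # Constant-time comparison avoids leaking how many characters matched.
    matches = hmac.compare_digest(
        cookie_session_id.encode("utf-8"), request_token.encode("utf-8")
    )
    if not matches:
        return 403, ""
    return 200, json.dumps(payload)
```

A request forged by a third-party page arrives with the cookie but without the matching token, so it receives a 403 and no data.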

            SOAP is another messaging scheme that has the potential to serve as an alternative to JSON. Like JSON, SOAP is used to pass data between pages to create a streamlined user experience, although its data is stored in XML format. SOAP is popular among Web 2.0 services because XML DOM and SAX parsing libraries are widely available and because the W3C has endorsed it. Still, it is not without flaws of its own. Just like JSON, SOAP often makes use of the HTTP protocol. Because SOAP builds on top of the older XML-RPC messaging scheme, many of the same principles apply. Akin to the JSON format, the XML DOM provides a vendor-neutral solution that can interoperate across platforms and thus is scalable. However, SOAP is vulnerable to all of XML’s flaws, and if input data is not properly parameterized and escaped of dangerous characters, it has the potential to cause havoc [7]. Deep packet inspection (DPI) is one way to ensure that the SOAP data being transmitted to or from the server is free of malicious content; however, DPI firewalls are very expensive and take time to traverse each packet while looking for pre-defined matching strings [8]. Because SOAP is a data exchange format and is not based on JavaScript, it does not suffer from the JavaScript hijacking attack. In order to minimize the vulnerabilities surrounding SOAP messaging, WS-Security was introduced to enhance SOAP messaging by ensuring message integrity, message confidentiality, and message authentication [9]. Even so, because SOAP messages may pass through intermediaries, the packets can be modified along the way by a malicious actor, which violates the authenticity of the packets. X.509 certificates may be used to sign and validate SOAP packets originating from the intended server, which are then allowed by browsers such as Internet Explorer and Firefox [9]. 
Elements from WS-Security have allowed for a safer web environment, but sites must take advantage of them and incorporate the technology into their website’s architecture. Failure to do so leaves common SOAP vulnerabilities at risk of being exploited by malicious hackers.
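
The point about escaping dangerous characters can be illustrated with a short Python sketch. The `GetUser` operation and `name` element are hypothetical and exist only for this example; the envelope namespace is the SOAP 1.1 one. The sketch escapes user input before placing it in the message body, so that markup supplied by an attacker is carried as text rather than interpreted as new elements.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"


def build_envelope(username):
    """Build a SOAP envelope for a hypothetical GetUser call with escaped input."""
    safe = escape(username)  # replaces &, < and > with XML entities
    return (
        f'<soap:Envelope xmlns:soap="{SOAP_NS}">'
        f"<soap:Body><GetUser><name>{safe}</name></GetUser></soap:Body>"
        "</soap:Envelope>"
    )


# An input that tries to close <name> early and inject an <admin> element.
hostile = "</name><admin>true</admin><name>"
root = ET.fromstring(build_envelope(hostile))
injected = root.find(".//admin")   # None: no element was injected
carried = root.find(".//name").text  # the hostile string, preserved as plain text
```

Building the message with an XML library instead of string formatting would achieve the same result; explicit escaping is shown here only because it makes the step visible.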

            Another vulnerability with today’s World Wide Web revolves around the usage of signed scripts. While signed code does allow a user to validate the authenticity of the client-side macros running in his/her browser, signed scripts can also be used for malicious purposes. Signed scripts make sense in an intranet setting; however, they are practically useless on the World Wide Web and pose a security threat. The idea behind script signing is to confirm the application’s identity, and it has little to do with the code’s security. Because of this, a programmer can easily get away with signing a malicious piece of JavaScript code. To add complexity to the situation, different browsers have unique ways of interpreting signed scripts. While signed JavaScript is allowed in Mozilla Firefox, Internet Explorer relies on the notion of Authenticode, which only applies to Microsoft’s ActiveX objects. Furthermore, browsers do not make any attempt to determine the security risks associated with the code they are about to execute, which may lead to privacy and security violations. In the Firefox browser, a signed script prompts an unsuspecting user to decide whether he/she wants to allow the page to run with extended privileges [10]. A malicious programmer can make it appear to the user that the site requires extended privileges to function properly, and therefore users are more likely to grant it full access. The result is that the same origin policy no longer applies to this seemingly innocent site. Once a user allows a signed script to execute, there is no telling what types of actions that script will take. The malicious coder can then make use of cross-site request forgeries to steal sensitive cookie information from other sites, while transmitting that data back to a base server via a simple XMLHttpRequest. Once a user grants a signed script access, developers lack the power to prevent their site from being attacked. 
In order to minimize the effects of a signed script attack, browser developers must take action to protect their users’ privacy. If browsers provide more information about the signed scripts that are being executed, users will be better able to judge whether to run a script. By implementing different security levels at which a signed script may execute, developers can help minimize the overall damage of a signed script by limiting its resources. At the very least, browser developers must make an attempt to educate users about the dangers associated with signed scripts and encourage them to view the source code prior to granting the signed script unrestricted access.

            Facebook is one of the many Web 2.0 sites exhibiting common security vulnerabilities. Due to the competitive environment, these Web 2.0 companies are forced to allocate their resources so that new features and ideas can be implemented in a timely fashion. Because all of the developers are working to create a finished product before their competition has a chance to release their own version, little emphasis beyond the bare minimum is placed on security and user privacy. As a result, many of these Web 2.0 sites lack innovative security tactics to hide sensitive information. As these sites incorporate new technologies such as client- and server-side scripting capabilities, SQL databases, and messaging services, the chances of being vulnerable to a security exploit significantly increase. Because Facebook was developed in PHP and relies on common libraries in the PHP API, it is vulnerable to known flaws within the language. Furthermore, the way it implements certain features can lead to further vulnerabilities. This can be demonstrated by the way Facebook uses PHP sessions to store user information. When a user logs into Facebook, a session and a corresponding hashed session key are created on the Facebook server. Several cookies are set on the client’s machine by appending a Set-Cookie line to the response packet’s HTTP header as specified in [3]. After this step is completed, the user is officially logged into Facebook. One of these cookies, named ‘xs’, contains the unique session key. In order to successfully navigate pages that require the user to be logged into Facebook, both the ‘xs’ and the ‘c_user’ cookies must be present on the client’s web browser. The ‘c_user’ cookie contains the user’s unique id, which is used in the URL as a form of identification when viewing a profile page. 
While the information in the ‘c_user’ cookie is publicly available, the information within the ‘xs’ cookie should be kept private by all possible means. Both of these cookies must contain the correct values in order for a user to have access to the site. The usage of sessions is very popular amongst RPC-based web services, and thus Facebook’s approach cannot be criticized. However, Facebook’s site architecture allows malicious attackers to access a user’s sensitive cookie information with ease. If a malicious user gains access to both the sensitive hashed session key and the victim’s unique id, he/she can take on the victim’s identity and has full privileges on the victim’s account. Because there are no additional security layers and the HTTP header is transmitted in plain text, malicious sniffers can gain full access to a user’s account. They can then perform actions as the “hacked” user that include posting on other people’s walls, viewing a friend’s otherwise restricted profile, sending, viewing, and deleting private messages, and tagging, uploading, and deleting pictures. The result is a major security and privacy breach that could have a negative effect on a victimized user’s account.

Almost all data after a user logs in is transmitted over a regular HTTP connection, and thus lacks suitable encryption. The client’s browser automatically includes all relevant cookies in the HTTP headers that are sent with each page request to be processed by Facebook’s internal servers. Because the cookies are sent in plain text, their contents can be “stolen” by others along the path the packets travel. Because most college campuses create an open ESS by providing wireless access using a single SSID that students are accustomed to, a malicious user can simply set up a wireless repeater and sniff the link by implementing a man-in-the-middle attack. With such a setup, innocent users who connect can fall victim to this attack and have their Facebook identity violated. A more advanced hacker could set up his/her own website that would make use of the vulnerabilities mentioned above, such as script signing, to perform illicit cross-site request forgeries. By doing so, he/she would be able to gain access to session information stored in an unsuspecting user’s cookies. He/she could then easily transmit that information back to a base server, where it could be stored in a database. The mitigating factor is that PHP sessions eventually expire after a given amount of time or when the user logs out, so an attacker does not have control for very long. However, this is only the case when the sessions are properly implemented. It has been discovered that when two users using the same HTTP cookie credentials access Facebook, and one logs out, the other user still has full access to the site. This is not only a major security flaw, but it also places an unnecessary burden on Facebook’s servers. These sessions, which are presumably stored in RAM, consume precious resources while providing no benefit to the system. 
An educated guess leads me to believe that Facebook simply deletes a user’s cookies when he/she logs out, but forgets to destroy the session residing on the server. Because of this, a malicious user can still have full control over a victim’s account even after the victim has logged out.
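
The difference between clearing a cookie and destroying a session can be sketched in Python. The `SessionStore` class below is purely hypothetical and says nothing about how Facebook’s servers are actually built; it only illustrates the behavior described above. When logout removes the server-side entry, a copied session key stops working for everyone who holds it.

```python
import secrets


class SessionStore:
    """Hypothetical server-side session table keyed by a random session key."""

    def __init__(self):
        self._sessions = {}

    def create(self, user_id):
        """Start a session and return the key that would be sent in a cookie."""
        key = secrets.token_hex(16)
        self._sessions[key] = user_id
        return key

    def lookup(self, key):
        """Return the user id for a session key, or None if no such session exists."""
        return self._sessions.get(key)

    def logout(self, key):
        """Destroy the server-side session, not merely the browser's cookie."""
        self._sessions.pop(key, None)


store = SessionStore()
victim_key = store.create("42")
stolen_key = victim_key                  # the attacker holds a copy of the cookie value
before = store.lookup(stolen_key)        # the copy works while the session exists
store.logout(victim_key)                 # the victim logs out
after = store.lookup(stolen_key)         # the copy no longer maps to any user
```

If `logout` only instructed the browser to delete its cookie, the entry would stay in `_sessions` and the stolen key would keep working, which matches the flaw guessed at above.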

            Facebook was responsive to these vulnerabilities and claims to be working on a fix to remedy the problems. As of April 30, 2007, Facebook appears to have implemented the required steps to properly destroy user sessions. However, malicious users can still gain access to the accounts of other users who are logged into the system. As Facebook looks to improve its service, this vulnerability must be appropriately addressed and fixed. Because of the fast-paced nature of Web 2.0, security features often get overlooked or left to be implemented in the future when time permits. Incorporating new features usually takes precedence over users’ privacy and security, which may explain why this vulnerability remained open for over a month after being reported. Since the time the vulnerabilities were reported, Facebook has modified its original layout, promoted mobile access to its web service, and released an RPC-based framework, all of which goes to show that incorporating new features is favored over ensuring security and privacy. Facebook can easily provide additional security layers to make session stealing more difficult for the malicious hacker. One way to do this involves storing the user’s IP address or another unique identifier inside the spawned session. By doing so, the user making the GET request would have to have the same IP address as the user who originally logged into the system. If a malicious user were to spoof the original user’s IP address to gain access to the user’s session, the information requested would be sent to the original user’s computer. While this is not a viable fix for users residing behind a NAT router, such an implementation would make it more difficult for the malicious user to gain access to a victim’s account. Another potential fix would involve transmitting all data over an HTTPS connection. 
By encrypting the packets sent between a client and the server, a malicious user would be unable to view the cookie information in plain text. This would prevent malicious hackers relying on man-in-the-middle attacks from stealing session cookies. However, HTTPS leads to a large amount of data overhead due to complex encryption schemes. Because of this, encrypting every packet to and from the Facebook server is not feasible, and thus this solution should not be implemented. Because Facebook now properly destroys a user’s session when he/she logs off, the chances of doing a significant amount of damage through session stealing are low. By implementing the necessary changes to fix the remaining vulnerabilities, Facebook will greatly improve its overall system security and provide users with the level of privacy they deserve and have come to expect in today’s online world.
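
The IP-binding idea proposed above can be sketched as follows. The names `bound_sessions`, `open_session`, and `authorize` are hypothetical, and the sketch assumes the server can see the client’s address on every request. The session remembers the address that logged in and rejects the session key when it is presented from anywhere else.

```python
import secrets

# Hypothetical table: session key -> (user id, IP address seen at login).
bound_sessions = {}


def open_session(user_id, client_ip):
    """Create a session that remembers which address logged in."""
    key = secrets.token_hex(16)
    bound_sessions[key] = (user_id, client_ip)
    return key


def authorize(key, client_ip):
    """Return the user id only if the key exists and the address matches."""
    entry = bound_sessions.get(key)
    if entry is None:
        return None
    user_id, login_ip = entry
    return user_id if client_ip == login_ip else None
```

As the text notes, this check cannot tell apart two machines behind the same NAT router, since both present the same public address. It raises the cost of session stealing without eliminating it.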

            As the World Wide Web continues to expand, the protocols, the markup language, and the browsers will evolve as new features are incorporated into the existing standards. While these changes allow for advanced functionality, web services will also become more complicated and difficult to manage. In order to minimize the negative effects, developers must choose wisely the most beneficial features to incorporate into their service. As users continue to contribute the personal information that fuels the growth and success of many of today’s popular web services, developers must give users confidence that their privacy and security have not been overlooked. Even though the end result might be fewer features and less functionality, developers must dedicate time to implement the needed security to build user trust. While the Internet serves as a repository full of open source frameworks, developers must make an effort to protect users from any associated security risks before deploying new functionality. Should developers choose to take advantage of advanced features, they must be willing to properly secure them to ensure that user privacy and security are not put at risk of being compromised. Developers must not become reliant on browsers to secure their applications, and browsers should not assume that a developer’s code is secure. By following these simple principles, higher levels of security and user privacy can be attained, which leads to a safer browsing experience for all.


References

[1]       T. Berners-Lee, “WWW: Past, Present, and Future,” [Online document], 1996 Oct, [cited 2007 May 5], Available HTTP: http://ieeexplore.ieee.org/iel1/2/11670/00539724.pdf

[2]       T. O’Reilly, “What is Web 2.0,” [Online document], 2005 Sept 30, [cited 2007 May 5], Available HTTP: http://www.oreillynet.com/pub/a/oreilly/tim/news/2005/09/30/what-is-web-20.html

[3]       D. Kristol and L. Montulli, “HTTP State Management Mechanism,” [Online document], 1997 Feb, [cited 2007 May 5], Available HTTP: http://tools.ietf.org/rfc/rfc2109.txt

[4]       R.T. Fielding and R.N. Taylor, “Principled Design of the Modern Web Architecture,” [Online document], 2000, [cited 2007 May 5], Available HTTP: http://www.ics.uci.edu/~fielding/pubs/webarch_icse2000.pdf

[5]       D. Crockford, “The application/json Media Type for JSON,” [Online document], 2006 July, [cited 2007 May 5], Available HTTP: http://www.ietf.org/rfc/rfc4627.txt

[6]       B. Chess, Y.T. O’Neil, and J. West, “JavaScript Hijacking,” [Online document], 2007 March 12, [cited 2007 May 5], Available HTTP: http://www.fortifysoftware.com/servlet/downloads/public/JavaScript_Hijacking.pdf

[7]       S. Faust, “SOAP Web Services Attacks,” [Online document], 2005, [cited 2007 May 5], Available HTTP: http://www.spidynamics.com/whitepapers/SOAP_Web_Security.pdf

[8]       T. Porter, “The Perils of Deep Packet Inspection,” [Online document], 2005 Jan 11, [cited 2007 May 5], Available HTTP: http://www.securityfocus.com/infocus/1817

[9]       IBM Research, “Web Services Security,” [Online document], 2004 March 1, [cited 2007 May 5], Available HTTP: http://www-128.ibm.com/developerworks/library/specification/ws-secure/

[10]     J. Ruderman, “Signed Scripts in Mozilla,” [Online document], 2001 Dec 6, [cited 2007 May 5], Available HTTP: http://www.mozilla.org/projects/security/components/signed-scripts.html

 


Creative Commons License
This work is licensed under a Creative Commons Attribution-No Derivative Works 3.0 License.