By John Floyd
I recently took a Microsoft C# MCSD exam and while studying for this test I ran across this question:
Frankly, I had no idea what a ‘Cryptographic Hashing Algorithm’ was, and I was not able to answer the question correctly. This, I felt, required some research. I found a couple of excellent articles on the subject on the Internet and, having read through them, decided that to really learn about this technology I would put it into practice by creating something I could use for a project I was developing for one of our clients. The pages below describe much of what I learned and how I was able to use this information in a practical way to create tamper-resistant web applications.
Using Cryptographic Hash Algorithms to detect web software intrusion.
A cryptographic hash function is an algorithm that takes an arbitrary block of data and returns a fixed-size bit string, the (cryptographic) hash value, such that any (accidental or intentional) change to the data will (with very high probability) change the hash value. The data to be encoded are often called the “message.” The hash value is sometimes called the message digest or simply a digest.
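To make the definition concrete, here is a minimal sketch that computes a digest for a short message. It assumes SHA-256 (the article does not say which algorithm was used), and the class and method names are hypothetical, chosen only for illustration.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper, for illustration only.
public static class DigestDemo
{
    // Returns the SHA-256 digest of a UTF-8 encoded message as lowercase hex.
    // SHA-256 is an assumption; any cryptographic hash would behave similarly.
    public static string Sha256Hex(string message)
    {
        using (SHA256 sha = SHA256.Create())
        {
            byte[] digest = sha.ComputeHash(Encoding.UTF8.GetBytes(message));
            return BitConverter.ToString(digest).Replace("-", "").ToLowerInvariant();
        }
    }
}
```

Whatever the length of the message, the result is always 64 hex characters (256 bits). Calling DigestDemo.Sha256Hex("abc") and DigestDemo.Sha256Hex("abd") gives two completely different digests even though the messages differ by a single character.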
Using a hash algorithm, you can compute and validate hash codes to ensure that code running on your machine or server has not been altered since its deployment. ASP.NET provides a mechanism for this: code that resides within a web site can be hooked into request processing to perform the hash code validation.
One of the biggest concerns in deploying a web-based software solution, especially a “customer facing” one, is the fear of the software being maliciously altered by a third party or an “outside influence”. Certainly, web software solutions are almost always deployed behind a trustworthy, secure firewall and are in most cases not subject to the injection of a worm, Trojan or other virus. However, it is always best to mitigate this possibility, and the use of hash algorithms is a very plausible method for doing just that.
Creating page digests for your web pages.
By using a hash algorithm, each page in a website can have its own personal identity that is only changeable by modifying the contents of the page. This identity can be thought of as a fingerprint. Each page has its own unique fingerprint (or digest) when a hash algorithm is applied to it. If a single character (byte) in that page changes, the page’s fingerprint will change when the hash code is recalculated. Using this fingerprint, each page can be verified as the original page that was deployed, one that has not been tampered with or altered since. First, let’s take a look at how Microsoft’s ASP.NET handles web page requests from the Internet.
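The fingerprint idea can be sketched for a file on disk as follows. This is only an illustration under assumptions: SHA-256 as the algorithm, Base64 as the text form of the digest, and a hypothetical class name.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical helper, for illustration only.
public static class PageFingerprint
{
    // Reads the file at 'path' and returns its SHA-256 digest as Base64 text.
    // Both the algorithm and the Base64 encoding are assumptions.
    public static string ComputeFileDigest(string path)
    {
        using (SHA256 sha = SHA256.Create())
        using (FileStream stream = File.OpenRead(path))
        {
            return Convert.ToBase64String(sha.ComputeHash(stream));
        }
    }
}
```

Computing the digest of a page, changing one byte of the file, and computing it again yields two different strings, which is exactly the property the validation relies on.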
The following diagram shows the typical flow of web pages through ASP.NET:
Figure 1 ASP.NET Logic Flow
In Figure 1, you can see that a web browser initiates a request for a web (.aspx) page, which is routed through the ASP.NET load event. Once the page is loaded, the server side logic associated with the web page that was loaded is executed. An HTML page is the result, and is sent back to the web browser. Note also that there may or may not be server side code executed for a particular page request but in either case, HTML is generated and sent back to the browser.
ASP.NET also provides a hook that runs before the requested ASPX page is loaded and before the server side code is processed. This is the “IHttpModule” interface, which has been part of ASP.NET since its earliest versions. It is in a class implementing this interface that hash code (digest) checking can be done to ensure that the fingerprint of each ASPX page to be loaded is correct.
There are three entry points to a class that implements IHttpModule: the interface’s own Init() and Dispose() methods, and an event handler that Init() registers with ASP.NET.
- Init(). This is called when the web site is first starting up (just prior to when the ‘default.aspx’ or the first .aspx page is loaded). This allows any one-time code needed to prepare for examining the hash codes to be processed on the server.
- The request event handler (named HashAuthorization() in the module described later in this article). Once registered by Init(), this method is called just prior to any page or file being loaded by ASP.NET and prior to any server side code being processed.
- Dispose(). This method is called when the web site is shutting down and no more pages are going to be loaded.
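The three entry points can be sketched as a skeleton module. The namespace, class name and handler signature follow the names used elsewhere in this article; hooking the BeginRequest event is an assumption (the article does not say which event is used), and the code needs a reference to System.Web in a .NET Framework web project.

```csharp
using System;
using System.Web;

namespace VerificationNameSpace
{
    public class VerificationModule : IHttpModule
    {
        // Called by ASP.NET when the module is created at start-up.
        public void Init(HttpApplication context)
        {
            // Assumption: BeginRequest is the event used to run the check
            // before the page and its server side code are processed.
            context.BeginRequest += HashAuthorization;
        }

        // Called before each request is processed.
        private void HashAuthorization(object sender, EventArgs e)
        {
            HttpApplication application = (HttpApplication)sender;
            // Physical location of the file being requested.
            string physicalPath = application.Request.PhysicalPath;
            // The digest check for physicalPath would go here.
        }

        // Called when the application is shutting down.
        public void Dispose()
        {
        }
    }
}
```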
To activate these events in a website, an entry must be placed in either the Web.Config file for the web site or in the Machine.Config file for the .NET Framework version for which the web site has been built. This entry will typically look like the following:
For Machine.Config or Web.Config – in the <httpModules> element of the “system.web” section the following line must exist:
<add name="VerificationNameSpace" type="VerificationNameSpace.VerificationModule, AssemblyName" />
“VerificationNameSpace” is the namespace assigned to the software module that will contain the three module entry points.
“VerificationNameSpace.VerificationModule” is the Class for the software module that will contain the three module entry points.
“AssemblyName” is the assembly name of the web site when it was built for deployment.
NOTE: These names can be anything that the developer wishes them to be as long as they match what is placed in the .Config file and the code-behind file.
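For context, the registration entry sits inside the configuration file roughly as sketched below. The names are the placeholders used above. This layout applies to the classic <system.web> registration; on IIS 7 and later running in integrated pipeline mode, modules are registered under <system.webServer><modules> instead, so treat this as a sketch to adapt to the hosting environment.

```xml
<configuration>
  <system.web>
    <httpModules>
      <!-- name: a label for the module; type: "Namespace.Class, AssemblyName" -->
      <add name="VerificationNameSpace"
           type="VerificationNameSpace.VerificationModule, AssemblyName" />
    </httpModules>
  </system.web>
</configuration>
```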
Generation of the hash digests.
Before hash digests can be validated for each module, a digest must be calculated and stored somewhere for the web pages/components that are going to be validated. The storage location for these codes needs to be somewhere outside the URL space, so that it is safe from tampering by an intruder who gains access to the application files of a hosted site through its normal publishing point. For purposes of demonstrating this hash validation capability, I have chosen a table inside a SQL Server database, but this could really be any “safe” storage location that is available for any particular client.
As pointed out in a Microsoft article published this year by Jason Coombs, this is a Catch-22 for automated systems that need a way to verify hash codes dynamically. Jason notes:
“The hash codes that are trusted as authentic must be available to the automated system in real time in order for it to validate hashes. Those stored hash codes are themselves subject to tampering, so how do you authenticate the hash codes? Digital signatures can be applied to each stored hash code so that the authenticity of the hash against which each file is compared can be established, but that only pushes the problem down another layer. You can hash the key, its certificate, and the entire chain of trust to make sure it hasn’t been changed, but then the question becomes how to authenticate the authentic key, certificate, and chain of trust. The fact is that it too is subject to tampering unless it is embedded in non-programmable hardware along with the logic that makes use of it. The extent to which your automated system needs a guarantee of data security defines the extent to which you will need to unravel this tangle and how sensible it is to use extra layers of validation.”
The Hash Generator
To demonstrate the feasibility of this technology, I have developed a C# console application that will read through all files in a given folder, generate the appropriate hash digest for each module in the deployed web solution and, using a designated assembly name, store these values in a SQL Server database table. This program can be run from the command line using the following syntax:
>HashGenerator -f"\\FolderName" -a"AssemblyName"
This will use the supplied folder to gather a list of file names that need to have hash digests generated, generate the hash values for each module and store them along with the specified assembly name in a SQL table named ‘HashModuleTable’. The database connection string is embedded in the app.config file for this assembly and would need to be altered for each client using this tool. All command line options that are available using the HashGenerator.EXE file are as follows:
-f – the folder where the web pages are deployed.
-a – the assembly name for the web site application.
-s – show all files that are being processed.
-c – create a new instance of the ‘HashModuleTable’ in the database specified in the connection string.
-? – Help – show all commands that can be entered at the command line.
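The core of such a generator can be sketched as below. This is not the HashGenerator source, only an illustration under assumptions: SHA-256 digests stored as Base64 text, subfolders included, and the results returned in an in-memory dictionary rather than written to the SQL table. The class name is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Security.Cryptography;

// Hypothetical helper, for illustration only.
public static class HashCatalog
{
    // Maps each file's path (relative to 'folder') to its Base64 SHA-256 digest.
    public static Dictionary<string, string> Build(string folder)
    {
        var digests = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        string root = Path.GetFullPath(folder);

        using (SHA256 sha = SHA256.Create())
        {
            // Assumption: subfolders are scanned as well.
            foreach (string file in Directory.EnumerateFiles(root, "*", SearchOption.AllDirectories))
            {
                // Strip the root folder so the key is a relative path.
                string relative = file.Substring(root.Length)
                    .TrimStart(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar);

                using (FileStream stream = File.OpenRead(file))
                {
                    digests[relative] = Convert.ToBase64String(sha.ComputeHash(stream));
                }
            }
        }
        return digests;
    }
}
```

A real tool would then write each path and digest, together with the assembly name, to the chosen safe storage location.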
The Hash Validator
To validate each web page as it is loaded by ASP.NET, I have developed a C# web class that can be included in any web project to perform hash digest validation for all web pages in the deployed web site. This module assumes that all pages have been “tagged” and catalogued using the HashGenerator.exe program described earlier in this document. As indicated earlier, there are three entry points to this class: Init(), HashAuthorization() and Dispose(). These methods are described below:
Init() –
This method is called when the web site/assembly is started by ASP.NET. The hash information for each web page in the solution is loaded from the SQL WebHashTable from the database specified in the app.config file. This data is loaded into a Dictionary<string, HashInformation>.
The event handler that ASP.NET will call before each web page is loaded is also established here. The firing of this event (and the calls to Init() and Dispose()) assumes that you have added the following line to the <system.web> section of either the machine.config file for the .NET Framework this is targeting, or the Web.Config file for this web site:
<add name="HashVerification" type="HashVerification.HashVerificationModule, AssemblyName" />
NOTE: The “AssemblyName” in this section must match the name of the assembly (web site) to which you are adding this code.
Dispose() –
Called when the web site is unloading and no more pages are going to be loaded.
HashAuthorization(object sender, EventArgs e) –
This method is called each time before a web page is loaded by ASP.NET. This method will read the contents of the file being loaded from the deployment file location, calculate a Cryptographic Hash Code for the contents and compare it to what was stored in the SQL WebHashTable for the same page. If the two do not match, the module is not loaded and the web session is effectively ended after a message is sent to the browser indicating that there was an error in processing.
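The comparison at the heart of this method can be sketched as a small, separately testable function. It assumes the trusted digests are held in a dictionary keyed by page name with Base64 SHA-256 values, and that a page missing from the dictionary is rejected; both are assumptions, and the names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Security.Cryptography;

// Hypothetical helper, for illustration only.
public static class HashCheck
{
    // Returns true only when the file's current digest equals the trusted one.
    public static bool IsUnmodified(IDictionary<string, string> trustedDigests,
                                    string pageKey, string physicalPath)
    {
        string expected;
        // Assumption: an uncatalogued or missing page is treated as a failure.
        if (!trustedDigests.TryGetValue(pageKey, out expected) || !File.Exists(physicalPath))
        {
            return false;
        }

        using (SHA256 sha = SHA256.Create())
        using (FileStream stream = File.OpenRead(physicalPath))
        {
            string actual = Convert.ToBase64String(sha.ComputeHash(stream));
            return string.Equals(expected, actual, StringComparison.Ordinal);
        }
    }
}
```

In the module, a false result would lead to the error response described above instead of the page being processed.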
Hash validation flow through ASP.NET.
The following diagram depicts how ASP.NET will handle web pages when the hash validation module has been inserted into the normal flow of web page processing:
As you can see in the above diagram, as soon as ASP.NET is ready to load a web page, the “ValidateHash” method is called to validate the hash digest for the page. The valid hash digest is read from storage, the current hash digest is computed for the page about to be loaded and the two are compared. If these fail to match, an error is returned to the browser via an HTML page. Otherwise, flow continues as normal for the page that is being loaded.
As long as there are entities willing to attack a web site or server, there will always be risks that need to be taken into consideration when thinking about security requirements to protect against intrusion. The lengths to which an organization might go to protect against intrusion must be weighed against factors such as the cost and effort of implementing each solution. The method described in this article is but one small step that can be taken, but it is a viable way to detect pages that have been altered and to keep tampered code in a web-based solution from being served.