There is a threat for file access named path canonicalization. Canonicalization is a process for converting data in standard (or canonical) form and it refers to the action that builds a path in a safe form. The next picture shows this process:

Path canonicalization in action. If an attacker passes special characters, such as ..\or .\, that user can alter the routine path and access files in other directories

Path canonicalization in action. If an attacker passes special characters, such as .. or ., that user can alter the routine path and access files in other directories

A web server is protected by default from this attack known as directory traversal vulnerability. If you have some code inside a physical directory named c:\inetpub\sitename\ and you are requesting something like https://localhost/../../../somefile.txt, the corresponding physical request will not be processed. The problem isn’t in how the web server is processing these requests, but how you dynamically compose a path. Usually in web application the path is composed by using parameter values. This method is used in different situations: from a downloading system to user-generated files, dealing with dynamic path building is a common issue. Because by default all user input is potentially evil, you have to take actions to sanitize it. Your aim is to safe system that composes a path dynamically and avoids path canonicalization vulnerability.

Parameter values used to compose paths, in web applications, are important. If you want to build a local file path by using some user input, you will end up with string concatenation or you will use the Combine() static method of the Path class from the Sysem.IO namespace. This method is very useful because it handles leading and trailing slashes automatically, but it does not deal with directory traversal. For example if an attacker passes c:\inetpub\sitename as the first part of the path and ..\..\windows\system32\cmd.exe as the second part the result will be c:\windows\system32\cmd.exe. This result isn’t the one you want, and, depending on your application’s behavior, the vulnerability might become quite dangerous.

 

The best approach in this case is to check for unwanted characters in the specified parameter and to follow the next steps:

1. Check for invalid characters in the parameter value by using the GetInvalidFileNameChars() method of the Path class.

2. Use the Combine method.

3. Perform a last check on the results to make sure the resulting path starts with the base path.

 

You can use the next code lines to implement this approach:

public static class PathExtensions

{

public static string CanonicalCombine(string basePath, string path)

{

if (String.IsNullOrEmpty(basePath) || string.IsNullOrEmpty(path))

throw new ArgumentNullException();

basePath = HttpUtility.UrlDecode(basePath);

 

path = HttpUtility.UrlDecode(path);

// Check for invalid characters

if (path.IndexOfAny(Path.GetInvalidFileNameChars()) > -1)

throw new FileNotFoundException(“FileName not valid”);

 

// Use Path.Combine

string filePath = Path.Combine(basePath, path);

 

// Check the composed path

if (!filePath.StartsWith(basePath))

throw new FileNotFoundException(“Path not valid”);

return filePath;

}

}

 

string filPath = PathExtensions.CanonicalCombine(basePath, PathValue.Text);

 

Important note:

Path canonicalization affects a lot of applications, and their developers likely don’t understand the implications of such threats. Inadvertently giving users access to the server disk is bad in terms of security because often, along with code, the server has the configuration data. That data could be used to bypass other security defenses.