One threat to file access is known as path canonicalization. Canonicalization is the process of converting data into a standard (or canonical) form; for file paths, it refers to resolving a path into a single, safe form. The next picture shows this process:
By default, a web server is protected from the related attack, known as directory traversal. If your code lives in a physical directory named c:\inetpub\sitename\ and someone requests something like https://localhost/../../../somefile.txt, the corresponding physical request will not be processed. The problem isn’t in how the web server processes these requests, but in how you dynamically compose a path. In web applications, paths are usually composed from parameter values. This technique appears in many situations, from download systems to user-generated files, so dynamic path building is a common issue. Because all user input is potentially evil by default, you have to take action to sanitize it. Your aim is to build a safe system that composes paths dynamically and avoids the path canonicalization vulnerability.
Parameter values used to compose paths are a critical point in web applications. If you want to build a local file path from user input, you will end up with string concatenation or with the Combine() static method of the Path class from the System.IO namespace. This method is very useful because it adds the directory separator between the two parts when it is missing, but it does not deal with directory traversal. For example, if the first part of the path is c:\inetpub\sitename and an attacker passes ..\..\windows\system32\cmd.exe as the second part, the combined path resolves to c:\windows\system32\cmd.exe. This result isn’t the one you want, and, depending on your application’s behavior, the vulnerability might become quite dangerous.
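The behavior described above can be checked with a short console sketch. It assumes a Windows machine (drive letters and backslash separators) and uses only Path.Combine and Path.GetFullPath from System.IO; the module name CombineDemo and the sample values are illustrative and not part of the original text.

```vb
Imports System
Imports System.IO

Module CombineDemo
    Sub Main()
        ' Trusted base directory, as in the example above.
        Dim basePath As String = "c:\inetpub\sitename"
        ' Hypothetical attacker-controlled value.
        Dim userInput As String = "..\..\windows\system32\cmd.exe"

        ' Combine only joins the two parts; the ".." segments are kept as-is.
        Dim combined As String = Path.Combine(basePath, userInput)
        Console.WriteLine(combined)

        ' GetFullPath resolves the ".." segments and shows the file
        ' that would actually be opened.
        Console.WriteLine(Path.GetFullPath(combined))

        ' A rooted second argument makes Combine discard the base path.
        Console.WriteLine(Path.Combine(basePath, "c:\windows\win.ini"))
    End Sub
End Module
```

On Windows, the first line printed still contains the ..\..\ segments, the second is c:\windows\system32\cmd.exe, and the third is c:\windows\win.ini, which shows that neither case is stopped by Combine itself.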
The best approach in this case is to check the specified parameter for unwanted characters by following these steps:
1. Check for invalid characters in the parameter value by using the GetInvalidFileNameChars() method of the Path class.
2. Use the Combine method.
3. Perform a last check on the results to make sure the resulting path starts with the base path.
You can use the following code to implement this approach:
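Imports System
Imports System.IO
Imports System.Web

Public Module PathExtensions
    Public Function CanonicalCombine(ByVal BasePath As String, _
                                     ByVal MyPath As String) As String
        If String.IsNullOrEmpty(BasePath) OrElse _
           String.IsNullOrEmpty(MyPath) Then
            Throw New ArgumentNullException()
        End If
        BasePath = HttpUtility.UrlDecode(BasePath)
        MyPath = HttpUtility.UrlDecode(MyPath)
        ' Check for invalid characters (on Windows this also rejects \, / and :)
        If MyPath.IndexOfAny(Path.GetInvalidFileNameChars()) > -1 Then
            Throw New FileNotFoundException("FileName not valid")
        End If
        ' Use Path.Combine, then resolve "." and ".." so that a value
        ' such as ".." cannot pass the final check
        BasePath = Path.GetFullPath(BasePath)
        Dim FilePath As String = Path.GetFullPath(Path.Combine(BasePath, MyPath))
        ' Check the composed path: it must stay below the base path
        Dim Prefix As String = BasePath.TrimEnd(Path.DirectorySeparatorChar) & _
                               Path.DirectorySeparatorChar
        If Not FilePath.StartsWith(Prefix, StringComparison.OrdinalIgnoreCase) Then
            Throw New FileNotFoundException("Path not valid")
        End If
        Return FilePath
    End Function
End Module

' Usage, for example from a page's code-behind
Dim FilePath As String = PathExtensions.CanonicalCombine(BasePath, PathValue.Text)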
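To see how the three checks behave on typical inputs, the following console sketch applies the same sequence (invalid-character check, Combine, final prefix check) to a few sample values. It is a minimal stand-alone variant, not the module above: the names CanonicalCheckDemo and TryResolve are hypothetical, URL decoding is left out so the sketch needs no reference to System.Web, and Path.GetFullPath is used as an assumption so that a bare ".." is resolved before the prefix comparison. It assumes Windows paths.

```vb
Imports System
Imports System.IO

Module CanonicalCheckDemo
    ' Hypothetical helper: returns the resolved path, or Nothing when the
    ' value is rejected by one of the checks.
    Function TryResolve(ByVal basePath As String, ByVal fileName As String) As String
        If String.IsNullOrEmpty(basePath) OrElse String.IsNullOrEmpty(fileName) Then
            Return Nothing
        End If

        ' Step 1: reject separators, colons and other invalid file-name characters.
        If fileName.IndexOfAny(Path.GetInvalidFileNameChars()) > -1 Then
            Return Nothing
        End If

        ' Step 2: combine, then resolve "." and ".." segments
        ' (assumption: GetFullPath is used for the resolution).
        Dim fullBase As String = Path.GetFullPath(basePath)
        Dim separator As String = Path.DirectorySeparatorChar.ToString()
        If Not fullBase.EndsWith(separator, StringComparison.Ordinal) Then
            fullBase &= separator
        End If
        Dim fullPath As String = Path.GetFullPath(Path.Combine(fullBase, fileName))

        ' Step 3: the resolved path must still be inside the base directory.
        If Not fullPath.StartsWith(fullBase, StringComparison.OrdinalIgnoreCase) Then
            Return Nothing
        End If
        Return fullPath
    End Function

    Sub Main()
        Dim basePath As String = "c:\inetpub\sitename"
        Dim samples As String() = {"report.pdf", "..", "..\web.config", "c:\windows\win.ini"}
        For Each sample As String In samples
            Dim result As String = TryResolve(basePath, sample)
            Console.WriteLine("{0} -> {1}", sample, If(result, "rejected"))
        Next
    End Sub
End Module
```

On Windows, only report.pdf is accepted. The values containing a separator or a colon fail the first check, and the bare ".." passes it but fails the final prefix comparison once it has been resolved to the parent directory.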
Important note:
Path canonicalization affects a lot of applications, and their developers often don’t fully understand the implications of such threats. Inadvertently giving users access to the server disk is bad for security because the server often holds configuration data along with the code. That data could be used to bypass other security defenses.