Path Traversal
What is path traversal?
It's also known as directory traversal.
These vulnerabilities enable an attacker to read arbitrary files on the server that is running an application. This might include:
Application code and data
Credentials for back-end systems
Sensitive operating system files
In some cases an attacker may also be able to write to arbitrary files on the server, which lets them modify application data or behavior and ultimately take full control of the server.
Reading arbitrary files via path traversal
Example: a shopping app displays images of items for sale. The HTML used to load an image:
<img src="/loadingImage?filename=218.png">
The loadingImage URL takes a filename parameter and returns the contents of the specified file.
Image files are stored on disk in /var/www/images/. To return an image, the application appends the requested filename to the base directory and uses a filesystem API to read the file contents.
The path for a normal request:
/var/www/images/218.png
This application implements no defense against path traversal attacks. An attacker can request the following URL to retrieve /etc/passwd:
/loadingImage?filename=../../../etc/passwd
The application reads from:
/var/www/images/../../../etc/passwd
The sequence ../ steps up one level in the directory structure. Three consecutive ../ sequences step up from /var/www/images/ to the filesystem root, yielding:
/etc/passwd
On Unix-based systems, /etc/passwd contains details of registered users on the server.
On Windows, both ../ and ..\ are valid directory traversal sequences.
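The vulnerable lookup described above can be sketched in Java. This is only an illustration on a Unix-like system: the class NaiveImageLoader and the helper resolveUnsafe are hypothetical names, and the real application's code is not shown in these notes.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class NaiveImageLoader {
    // Base directory taken from the example in the text.
    static final String BASE_DIR = "/var/www/images/";

    // Hypothetical helper: builds the path the way the vulnerable app is
    // described, by plain string concatenation with no validation, then
    // normalizes it to show which file would actually be opened.
    static Path resolveUnsafe(String filename) {
        return Paths.get(BASE_DIR + filename).normalize();
    }

    public static void main(String[] args) {
        // A normal request stays inside the image directory.
        System.out.println(resolveUnsafe("218.png"));
        // Three ../ sequences climb from /var/www/images/ to the root.
        System.out.println(resolveUnsafe("../../../etc/passwd"));
    }
}
```

The operating system performs the same collapsing of ../ when the file is opened, so the normalize() call is there only to make the final target visible.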
Lab: File path traversal, simple case
Common obstacles to exploiting path traversal vulnerabilities
Applications that place user input into file paths often implement defenses against path traversal attacks.
How to bypass these filters:
Developers can strip or block directory traversal sequences from the user-supplied filename. You can bypass this by using an absolute path from the filesystem root:
filename=/etc/passwd
This directly references a file without using traversal sequences.
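A rough sketch of why the absolute path gets through, assuming a hypothetical filter that only looks for traversal sequences and an application that joins paths with Path.resolve on a Unix-like system. The class and method names are made up for illustration.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class AbsolutePathBypass {
    static final Path BASE_DIR = Paths.get("/var/www/images");

    // Hypothetical filter: rejects any filename containing a traversal sequence.
    static boolean passesFilter(String filename) {
        return !filename.contains("../") && !filename.contains("..\\");
    }

    // Path.resolve returns its argument unchanged when that argument is
    // absolute, so the base directory is silently discarded.
    static Path resolve(String filename) {
        if (!passesFilter(filename)) {
            throw new IllegalArgumentException("traversal sequence blocked");
        }
        return BASE_DIR.resolve(filename).normalize();
    }

    public static void main(String[] args) {
        System.out.println(resolve("218.png"));     // stays under the base directory
        System.out.println(resolve("/etc/passwd")); // escapes it without any ../
    }
}
```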
Lab: File path traversal — traversal sequences blocked with absolute path bypass
Another common bypass in labs:
Use nested traversal sequences such as ....// or ....\/. These revert to simple traversal sequences when the inner sequence is stripped.
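To see why nested sequences survive, here is a small sketch assuming a hypothetical filter that removes ../ and ..\ in a single pass. The names stripOnce and stripUntilStable are illustrative only.

```java
class NonRecursiveStrip {
    // Hypothetical filter: removes "../" and "..\" in a single pass.
    static String stripOnce(String filename) {
        return filename.replace("../", "").replace("..\\", "");
    }

    // Repeats the strip until the input stops changing.
    static String stripUntilStable(String filename) {
        String previous;
        do {
            previous = filename;
            filename = stripOnce(filename);
        } while (!filename.equals(previous));
        return filename;
    }

    public static void main(String[] args) {
        String payload = "....//....//....//etc/passwd";
        // Removing the inner "../" from each "....//" leaves a fresh "../".
        System.out.println(stripOnce(payload));
        // Stripping repeatedly removes the sequences that reappear.
        System.out.println(stripUntilStable(payload));
    }
}
```

Repeated stripping closes this particular gap, but it is still string filtering; checking the canonicalized path is the more robust defense.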
Lab: File path traversal — traversal sequences stripped non-recursively
Another common bypass in labs:
In some contexts (URL path or the filename parameter of a multipart/form-data request), web servers may strip directory traversal sequences before passing input to the application.
You can bypass this by URL-encoding or double URL-encoding the traversal characters.
../ can be encoded as %2e%2e%2f or double-encoded as %252e%252e%252f. There are also various non-standard encodings such as ..%c0%af or ..%ef%bc%8f.
Burp Suite Professional's Intruder provides predefined payload lists (Fuzzing-path-traversal) with encoded paths.
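The double-encoding case can be sketched as follows. It assumes a hypothetical front-end filter that inspects the value after one URL-decode and an application that then decodes the value a second time. The method names are made up, and URLDecoder.decode(String, Charset) needs Java 10 or later.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

class SuperfluousDecode {
    static String decode(String s) {
        return URLDecoder.decode(s, StandardCharsets.UTF_8);
    }

    // Hypothetical front-end filter: looks at the value after the single
    // decode the web server performs and blocks visible traversal sequences.
    static boolean blockedByFilter(String rawValue) {
        return decode(rawValue).contains("../");
    }

    // Hypothetical application step: decodes the already-decoded value again.
    static String valueSeenByApp(String rawValue) {
        return decode(decode(rawValue));
    }

    public static void main(String[] args) {
        String single = "%2e%2e%2f%2e%2e%2f%2e%2e%2fetc/passwd";
        String dbl = "%252e%252e%252f%252e%252e%252f%252e%252e%252fetc/passwd";
        System.out.println(blockedByFilter(single)); // filter sees ../ after one decode
        System.out.println(blockedByFilter(dbl));    // filter sees only %2e%2e%2f
        System.out.println(valueSeenByApp(dbl));     // second decode restores ../
    }
}
```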
Lab: File path traversal — traversal sequences stripped with superfluous URL-decode
Another common bypass in labs:
Some applications require the user-supplied filename to start with the expected base folder, such as /var/www/images. In this case you can include the base folder followed by a suitable traversal sequence:
filename=/var/www/images/../../../etc/passwd
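A sketch of why a prefix check on the raw string is not enough, assuming a hypothetical validator that only calls startsWith on the submitted value (Unix-like paths; names are illustrative).

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class StartOfPathCheck {
    static final String BASE_DIR = "/var/www/images";

    // Hypothetical validation: only checks the raw string prefix.
    static boolean passesPrefixCheck(String filename) {
        return filename.startsWith(BASE_DIR);
    }

    // The file that is actually opened once the ../ sequences are resolved.
    static Path effectivePath(String filename) {
        return Paths.get(filename).normalize();
    }

    public static void main(String[] args) {
        String payload = "/var/www/images/../../../etc/passwd";
        System.out.println(passesPrefixCheck(payload)); // the prefix matches
        System.out.println(effectivePath(payload));     // but the target is outside
    }
}
```

The check and the file access disagree because the check runs on the string before it is resolved, while the filesystem acts on the resolved path.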
Lab: File path traversal — validation of start of path
Another common bypass in labs:
An application may require the user-supplied filename to end with an expected extension such as .png. It may be possible to use a null byte (URL-encoded as %00) to effectively terminate the file path before the required extension:
filename=../../../etc/passwd%00.png
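The mismatch behind the null byte trick can be illustrated with a simulation. This sketch assumes a hypothetical extension check on the raw string and an underlying file API that treats the name as NUL-terminated; asSeenByNativeApi only imitates that truncation and is not a real filesystem call.

```java
class NullByteBypass {
    // Hypothetical validation: only checks the suffix of the raw string.
    static boolean passesExtensionCheck(String filename) {
        return filename.endsWith(".png");
    }

    // Simulates how a NUL-terminated (C-style) API would see the string:
    // everything from the first NUL character onward is ignored.
    static String asSeenByNativeApi(String filename) {
        int nul = filename.indexOf('\0');
        return nul >= 0 ? filename.substring(0, nul) : filename;
    }

    // A stricter check that rejects embedded NUL characters outright.
    static boolean passesStrictCheck(String filename) {
        return filename.indexOf('\0') < 0 && passesExtensionCheck(filename);
    }

    public static void main(String[] args) {
        // Decoded form of ../../../etc/passwd%00.png
        String payload = "../../../etc/passwd\0.png";
        System.out.println(passesExtensionCheck(payload)); // suffix check is satisfied
        System.out.println(asSeenByNativeApi(payload));    // truncated before ".png"
        System.out.println(passesStrictCheck(payload));    // NUL is rejected
    }
}
```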
Lab: File path traversal — validation of file extension with null byte bypass
How to prevent a path traversal attack
Avoid passing user-supplied input to filesystem APIs altogether.
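One way to do this is to let the client send an identifier and look the file up on the server. The sketch below is an assumption for illustration: the ImageCatalog class, the ID 219, and the id parameter are hypothetical and not part of the example application.

```java
import java.util.Map;
import java.util.Optional;

class ImageCatalog {
    // Hypothetical server-side table mapping image IDs to files on disk.
    static final Map<Integer, String> IMAGES = Map.of(
        218, "/var/www/images/218.png",
        219, "/var/www/images/219.png"
    );

    // The client supplies only an ID; it never contributes to the path.
    static Optional<String> pathFor(String idParam) {
        try {
            return Optional.ofNullable(IMAGES.get(Integer.parseInt(idParam)));
        } catch (NumberFormatException e) {
            // Anything that is not a plain integer is treated as unknown.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(pathFor("218"));
        System.out.println(pathFor("../../../etc/passwd"));
    }
}
```

Because the path comes entirely from the server-side table, a traversal string has nothing to act on.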
If user input must reach a filesystem API, use two layers of defense:
Validate user input before processing it. Ideally, compare it with a whitelist of permitted values; failing that, verify that it contains only permitted content (for example, only alphanumeric characters).
After validation, append the input to the base directory and use a platform filesystem API to canonicalize the path. Verify that the canonicalized path starts with the expected base directory.
Example Java code to validate the canonical path of a file based on user input:
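A minimal sketch of both layers, under stated assumptions: the base directory /var/www/images comes from the earlier example, the allowlist pattern is relaxed from "alphanumeric only" to permit a single extension dot, and the class and method names are hypothetical.

```java
import java.io.File;
import java.io.IOException;
import java.util.regex.Pattern;

class SafeImageResolver {
    // Base directory from the example; adjust for a real deployment.
    static final File BASE_DIRECTORY = new File("/var/www/images");

    // Layer 1 (assumed allowlist): alphanumeric name plus one alphanumeric extension.
    static final Pattern ALLOWED = Pattern.compile("[A-Za-z0-9]+\\.[A-Za-z0-9]+");

    // Layer 2: canonicalize and confirm the result is still under the base.
    // Path.startsWith compares whole path components, not raw characters.
    static boolean isInsideBase(String userInput) {
        try {
            File base = BASE_DIRECTORY.getCanonicalFile();
            File file = new File(base, userInput).getCanonicalFile();
            return file.toPath().startsWith(base.toPath());
        } catch (IOException e) {
            return false; // fail closed if the path cannot be canonicalized
        }
    }

    static File resolve(String userInput) {
        if (userInput == null || !ALLOWED.matcher(userInput).matches()) {
            throw new IllegalArgumentException("filename not permitted");
        }
        if (!isInsideBase(userInput)) {
            throw new IllegalArgumentException("path escapes base directory");
        }
        return new File(BASE_DIRECTORY, userInput);
    }

    public static void main(String[] args) {
        System.out.println(resolve("218.png"));
        System.out.println(isInsideBase("../../../etc/passwd"));
    }
}
```

Comparing Path objects rather than raw strings avoids accepting a sibling directory whose name merely begins with the base directory's name.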