Vulnerabilities > CVE-2024-47883 - Path Traversal vulnerability in Openrefine Butterfly
Summary
The OpenRefine fork of the MIT Simile Butterfly server is a modular web application framework. The Butterfly framework uses the `java.net.URL` class to refer to (what are expected to be) local resource files, like images or templates. This works: "opening a connection" to these URLs opens the local file. However, prior to version 1.2.6, if a `file:/` URL is directly given where a relative path (resource name) is expected, this is also accepted in some code paths; the app then fetches the file, from a remote machine if indicated, and uses it as if it was a trusted part of the app's codebase. This leads to multiple weaknesses and potential weaknesses. An attacker that has network access to the application could use it to gain access to files, either on the the server's filesystem (path traversal) or shared by nearby machines (server-side request forgery with e.g. SMB). An attacker that can lead or redirect a user to a crafted URL belonging to the app could cause arbitrary attacker-controlled JavaScript to be loaded in the victim's browser (cross-site scripting). If an app is written in such a way that an attacker can influence the resource name used for a template, that attacker could cause the app to fetch and execute an attacker-controlled template (remote code execution). Version 1.2.6 contains a patch.
Vulnerable Configurations
Part | Description | Count |
---|---|---|
Application | 1 |
Common Weakness Enumeration (CWE)
Common Attack Pattern Enumeration and Classification (CAPEC)
- Relative Path Traversal An attacker exploits a weakness in input validation on the target by supplying a specially constructed path utilizing dot and slash characters for the purpose of obtaining access to arbitrary files or resources. An attacker modifies a known path on the target in order to reach material that is not available through intended channels. These attacks normally involve adding additional path separators (/ or \) and/or dots (.), or encodings thereof, in various combinations in order to reach parent directories or entirely separate trees of the target's directory structure.
- Directory Traversal An attacker with access to file system resources, either directly or via application logic, will use various file path specification or navigation mechanisms such as ".." in path strings and absolute paths to extend their range of access to inappropriate areas of the file system. The attacker attempts to either explore the file system for recon purposes or access directories and files that are intended to be restricted from their access. Exploring the file system can be achieved through constructing paths presented to directory listing programs, such as "ls" and 'dir', or through specially crafted programs that attempt to explore the file system. The attacker engaging in this type of activity is searching for information that can be used later in a more exploitive attack. Access to restricted directories or files can be achieved through modification of path references utilized by system applications.
- File System Function Injection, Content Based An attack of this type exploits the host's trust in executing remote content including binary files. The files are poisoned with a malicious payload (targeting the file systems accessible by the target software) by the attacker and may be passed through standard channels such as via email, and standard web content like PDF and multimedia files. The attacker exploits known vulnerabilities or handling routines in the target processes. Vulnerabilities of this type have been found in a wide variety of commercial applications from Microsoft Office to Adobe Acrobat and Apple Safari web browser. When the attacker knows the standard handling routines and can identify vulnerabilities and entry points they can be exploited by otherwise seemingly normal content. Once the attack is executed, the attackers' program can access relative directories such as C:\Program Files or other standard system directories to launch further attacks. In a worst case scenario, these programs are combined with other propagation logic and work as a virus.
- Using Slashes and URL Encoding Combined to Bypass Validation Logic This attack targets the encoding of the URL combined with the encoding of the slash characters. An attacker can take advantage of the multiple way of encoding an URL and abuse the interpretation of the URL. An URL may contain special character that need special syntax handling in order to be interpreted. Special characters are represented using a percentage character followed by two digits representing the octet code of the original character (%HEX-CODE). For instance US-ASCII space character would be represented with %20. This is often referred as escaped ending or percent-encoding. Since the server decodes the URL from the requests, it may restrict the access to some URL paths by validating and filtering out the URL requests it received. An attacker will try to craft an URL with a sequence of special characters which once interpreted by the server will be equivalent to a forbidden URL. It can be difficult to protect against this attack since the URL can contain other format of encoding such as UTF-8 encoding, Unicode-encoding, etc.
- Manipulating Input to File System Calls An attacker manipulates inputs to the target software which the target software passes to file system calls in the OS. The goal is to gain access to, and perhaps modify, areas of the file system that the target software did not intend to be accessible.