Vulnerabilities > CVE-2021-41125 - Insufficiently Protected Credentials vulnerability in multiple products
Summary
Scrapy is a high-level web crawling and scraping framework for Python. If you use `HttpAuthMiddleware` (i.e. the `http_user` and `http_pass` spider attributes) for HTTP authentication, all requests will expose your credentials to the request target. This includes requests generated by Scrapy components, such as `robots.txt` requests sent by Scrapy when the `ROBOTSTXT_OBEY` setting is set to `True`, or as requests reached through redirects. Upgrade to Scrapy 2.5.1 and use the new `http_auth_domain` spider attribute to control which domains are allowed to receive the configured HTTP authentication credentials. If you are using Scrapy 1.8 or a lower version, and upgrading to Scrapy 2.5.1 is not an option, you may upgrade to Scrapy 1.8.1 instead. If you cannot upgrade, set your HTTP authentication credentials on a per-request basis, using for example the `w3lib.http.basic_auth_header` function to convert your credentials into a value that you can assign to the `Authorization` header of your request, instead of defining your credentials globally using `HttpAuthMiddleware`.
Vulnerable Configurations
Common Weakness Enumeration (CWE)
Common Attack Pattern Enumeration and Classification (CAPEC)
- Session Sidejacking Session sidejacking takes advantage of an unencrypted communication channel between a victim and target system. The attacker sniffs traffic on a network looking for session tokens in unencrypted traffic. Once a session token is captured, the attacker performs malicious actions by using the stolen token with the targeted application to impersonate the victim. This attack is a specific method of session hijacking, which is exploiting a valid session token to gain unauthorized access to a target system or information. Other methods to perform a session hijacking are session fixation, cross-site scripting, or compromising a user or server machine and stealing the session token.
- Lifting credential(s)/key material embedded in client distributions (thick or thin) An attacker examines a target application's code or configuration files to find credential or key material that has been embedded within the application or its files. Many services require authentication with their users for the various purposes including billing, access control or attribution. Some client applications store the user's authentication credentials or keys to accelerate the login process. Some clients may have built-in keys or credentials (in which case the server is authenticating with the client, rather than the user). If the attacker is able to locate where this information is stored, they may be able to retrieve these credentials. The attacker could then use these stolen credentials to impersonate the user or client, respectively, in interactions with the service or use stolen keys to eavesdrop on nominally secure communications between the client and server.
- Password Recovery Exploitation An attacker may take advantage of the application feature to help users recover their forgotten passwords in order to gain access into the system with the same privileges as the original user. Generally password recovery schemes tend to be weak and insecure. Most of them use only one security question . For instance, mother's maiden name tends to be a fairly popular one. Unfortunately in many cases this information is not very hard to find, especially if the attacker knows the legitimate user. These generic security questions are also re-used across many applications, thus making them even more insecure. An attacker could for instance overhear a coworker talking to a bank representative at the work place and supplying their mother's maiden name for verification purposes. An attacker can then try to log in into one of the victim's accounts, click on "forgot password" and there is a good chance that the security question there will be to provide mother's maiden name. A weak password recovery scheme totally undermines the effectiveness of a strong password scheme.
References
- https://github.com/scrapy/scrapy/commit/b01d69a1bf48060daec8f751368622352d8b85a6
- https://w3lib.readthedocs.io/en/latest/w3lib.html#w3lib.http.basic_auth_header
- https://github.com/scrapy/scrapy/security/advisories/GHSA-jwqp-28gf-p498
- http://doc.scrapy.org/en/latest/topics/downloader-middleware.html#module-scrapy.downloadermiddlewares.httpauth
- https://lists.debian.org/debian-lts-announce/2022/03/msg00021.html