-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added detection for untrusted domains in script content - edited existing library and added new query and tests ("the Polyfill PR") #16886
Added detection for untrusted domains in script content - edited existing library and added new query and tests ("the Polyfill PR") #16886
Conversation
…ting library and added new query and tests
QHelp previews: javascript/ql/src/Security/CWE-830/PolyfillIOCompromisedScript.qhelperrors/warnings:
|
QHelp previews: javascript/ql/src/Security/CWE-830/PolyfillIOCompromisedScript.qhelperrors/warnings:
|
QHelp previews: javascript/ql/src/Security/CWE-830/FunctionalityFromUntrustedDomain.qhelpUntrusted domain used in script or other contentContent Delivery Networks (CDNs) are used to deliver content to users quickly and efficiently. However, they can change hands or be operated by untrustworthy owners, risking the security of the sites that use them. Some CDN domains are operated by entities that have used CDNs to deliver malware, which this query identifies. For example, Including a resource from an untrusted source or using an untrusted channel may allow an attacker to include arbitrary code in the response. When including an external resource (for example, a Even when See the [`CUSTOMIZING.md`](https://github.com/github/codeql/blob/main/javascript/ql/src/Security/CWE-830/CUSTOMIZING.md) file in the source code for this query for information on how to extend the list of untrusted domains used by this query. RecommendationCarefully research the ownership of a Content Delivery Network (CDN) before using it in your application. If you find code that originated from an untrusted domain in your application, you should review your logs to check for compromise. To help mitigate the risk of including a script that could be compromised in the future, consider whether you need to use polyfill or another library at all. Modern browsers do not require a polyfill, and other popular libraries were made redundant by enhancements to HTML 5. If you do need a polyfill service or library, move to using a CDN that you trust. When you use a You can also choose to self-host an uncompromised version of the service or library. ExampleThe following example loads the Polyfill.io library from the <html>
<head>
<title>Polyfill.io demo</title>
<script src="https://cdn.polyfill.io/v2/polyfill.min.js" crossorigin="anonymous"></script>
</head>
<body>
...
</body>
</html> Instead, load the Polyfill library from a trusted CDN, as in the next example: <html>
<head>
<title>Polyfill demo - Cloudflare hosted with pinned version (but no integrity checking, since it is dynamically generated)</title>
<script src="https://cdnjs.cloudflare.com/polyfill/v3/polyfill.min.js?version=4.8.0" crossorigin="anonymous"></script>
</head>
<body>
...
</body>
</html> If you know which browsers are used by the majority of your users, you can list the hashes of the polyfills for those browsers: <html>
<head>
<title>Polyfill demo - Cloudflare hosted with pinned version (with integrity checking for a *very limited* browser set - just an example!)</title>
<script src="https://cdnjs.cloudflare.com/polyfill/v3/polyfill.min.js?version=4.8.0" integrity="sha384-i0IGVuZBkKZqwXTD4CH4kcksIbFx7WKFMdxN8zUhLFHpLdELF0ym0jxa6UvLhW8/ sha384-3d4jRKquKl90C9aFG+eH4lPJmtbPHgACWHrp+VomFOxF8lzx2jxqeYkhpRg18UWC" crossorigin="anonymous"></script>
</head>
<body>
...
</body>
</html> References
javascript/ql/src/Security/CWE-830/FunctionalityFromUntrustedSource.qhelpInclusion of functionality from an untrusted sourceIncluding a resource from an untrusted source or using an untrusted channel may allow an attacker to include arbitrary code in the response. When including an external resource (for example, a When including external resources, it is possible to verify that the responding server is the intended one by using an Even when Subresource integrity (SRI) checking is commonly recommended when importing a fixed version of a library - for example, from a CDN (content-delivery network). Then, the fixed digest of that version of the library can easily be added to the A dynamic service cannot be easily used with SRI. Nevertheless, it is possible to list multiple acceptable SHA hashes in the See the [`CUSTOMIZING.md`](https://github.com/github/codeql/blob/main/javascript/ql/src/Security/CWE-830/CUSTOMIZING.md) file in the source code for this query for information on how to extend the list of hostnames required to use SRI by this query. RecommendationWhen an When using a ExampleThe following example loads the jQuery library from the jQuery CDN without using <html>
<head>
<title>jQuery demo</title>
<script src="http://code.jquery.com/jquery-3.6.0.slim.min.js" crossorigin="anonymous"></script>
</head>
<body>
...
</body>
</html> Instead, loading jQuery from the same domain using <html>
<head>
<title>jQuery demo</title>
<script src="https://code.jquery.com/jquery-3.6.0.slim.min.js" integrity="sha256-u7e5khyithlIdTpu22PHhENmPcRdFiHRjhAuHcs05RI=" crossorigin="anonymous"></script>
</head>
<body>
...
</body>
</html> References
|
Because Polyfill.io dynamically generates its content, SRI is apparently impossible. I'm going to redraft this tomorrow to account for this. I will refactor the library I made to separate known-bad domains from those requiring SRI. I also need to remove any references to SRI being a mitigation, and edit the Cloudflare example somehow to make it work - my current example assumes SRI will function correctly. |
Hmmm. From reading the updates in the article: https://sansec.io/research/polyfill-supply-chain-attack it sounds like the problem has been fixed. |
That's totally true. An alert about using this domain isn't going to change future exposure, but rather alert people to the fact that there was prior exposure. That allows them to start a security response. |
Moved lists of domains to data extensions, including adding those to the overall qlpack.yml Expanded scope of new query to further domains operated by the untrusted owners of polyfill.io
I've made a few changes in the interest of getting this merged. I've extended the query now using data extensions, so it includes more domains operated by the same CDN operator who used Nevertheless, I moved it to an experimental query. I'd rather it was merged into the main queries for widespread visibility of code that is or was exposed to this risk, but will accept merging into experimental and escalating a move to the main queries later. It also allows users to extend the query using a data extensions model pack. I also changed the existing CDN query to allow the extension of the CDN list that requires an SRI more easily in a data extension file. It no longer requires escaping dots in a regex - you just add the domain to the next line in a string in a tuple in the YAML file. Hope that motivates merging this. |
# new location of the polyfill.io CDN, which was used to serve malware. See: https://www.cside.dev/blog/the-polyfill-attack-explained | ||
- ["polyfill.com"] | ||
- ["polyfillcache.com"] | ||
|
||
# domains operated by the same owner as polyfill.io, which was used to serve malware. See: https://www.cside.dev/blog/the-polyfill-attack-explained | ||
- ["bootcdn.net"] | ||
- ["bootcss.com"] | ||
- ["staticfile.net"] | ||
- ["staticfile.org"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, those are still active.
That removes my argument for keeping the new query in the experimental/
folder, you can move the new query to the default suite.
I've moved the query into the main set of queries, and slightly updated the |
I did some evaluations, and they look good: (nightly, default (internal links)). There is a lot of documentation here, and I would like some eyes from the doc team, but the PR is good to merge after that. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few small docs suggestions, but it's good to have help with lots of context like this 👍🏻
javascript/ql/src/Security/CWE-830/FunctionalityFromUntrustedDomain.qhelp
Outdated
Show resolved
Hide resolved
javascript/ql/src/Security/CWE-830/FunctionalityFromUntrustedDomain.qhelp
Outdated
Show resolved
Hide resolved
javascript/ql/src/Security/CWE-830/FunctionalityFromUntrustedDomain.qhelp
Outdated
Show resolved
Hide resolved
javascript/ql/src/Security/CWE-830/FunctionalityFromUntrustedDomain.qhelp
Outdated
Show resolved
Hide resolved
javascript/ql/src/Security/CWE-830/FunctionalityFromUntrustedDomain.ql
Outdated
Show resolved
Hide resolved
Docs suggestions accepted, thank you 🙏 Co-authored-by: Felicity Chapman <felicitymay@github.com>
…hub.com/aegilops/codeql into aegilops/polyfill-io-compromised-script
@felicitymay thanks so much for the suggestions! I transferred my note about using SRI with dynamic services into the other query help too. It's the same wording in both places now. I thought about usability and added a note on It describes how someone can extend the queries with data extensions, to add their own hostname that needs SRI, or to add their own untrusted domain. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding information about how to customize the queries 💖
This looks good to me from a docs point of view, but I'd recommend moving the new article into a location that's easier to link to and where it sits alongside related content for discoverability.
I suspect that this addition might require a 👍🏻 from the CodeQL dynamic language team 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest that we move this information into the Customizing Library Models for JavaScript — CodeQL article as a practical example. Alternatively, we could create a new article in https://github.com/github/codeql/tree/main/docs/codeql/codeql-language-guides, which could be extended with future examples.
This would require converting the format to RST, but the content is fairly simple so this shouldn't be too much of an overhead and we can provide support if needed.
Can I propose using the current in-repo markdown approach, and coming back after next week with a second PR to merge it into CodeQL docs, to improve things? I'll definitely accept your suggestions when I'm back at my IDE, but I don't really want to slow down merging this to get this customisation docs part perfect. Sound OK, or no go? |
Happy to accept this approach. In that case, can we link directly to where the new file will be, that is: https://github.com/github/codeql/blob/main/javascript/ql/src/Security/CWE-830/CUSTOMIZING.md instead of just mentioning the file name? |
Co-authored-by: Felicity Chapman <felicitymay@github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the updates 👍🏻
Polyfill.io was a popular JavaScript polyflll Content Delivery Network (CDN), used to support new web browser standards on older browsers.
In February 2024 the domain was sold, and in June 2024 it was widely publicised that the domain had been used to serve malicious scripts. It was taken down later in that month, leaving a window where sites that used the service could have been compromised.
Subresource Integrity (SRI) would have prevented the undetected replacement of the Polyfill script code.
This CodeQL query builds on a library,
semmle/javascript/security/FunctionalityFromUntrustedSource.qll
, which was abstracted from an existing query,javascript/ql/src/Security/CWE-830/FunctionalityFromUntrustedSource.ql
, to detect specific cases where a script frompolyfill.io
orcdn.polyfill.io
was used without Subresource Integrity, and so the service using it was exposed to compromise.I've modified that existing library to add
polyfill.io
andcdn.polyfill.io
as domains that require SRI, which makes the existing query detect such uses, and I have added agetUrl()
predicate to classes in the library so that I can filter down only to results frompolyfill.io
in a new query.I added a specific test for
polyfill.io
use in its most common context, in an HTML script tag.This query will work for cases in HTML or in JavaScript code, but will not find cases in any other languages such as PHP or Python where server-side scripting is used to generate client-side HTML or JavaScript.
A code search across github.com hosted sites will find more results manually.