Commit

Merge pull request #73 from owasp-dep-scan/feature/go-binary
go binary sbom
prabhu authored Mar 17, 2024
2 parents da0718f + fe782c8 commit 2bcaacb
Showing 5 changed files with 175 additions and 44 deletions.
11 changes: 10 additions & 1 deletion .github/workflows/bintests.yml
@@ -27,12 +27,21 @@ jobs:
poetry install
- name: Test binaries
run: |
mkdir -p bintests
mkdir -p bintests gobintests
cd bintests
wget -q https://github.com/owasp-dep-scan/dosai/releases/download/v0.1.1/Dosai.exe
wget -q https://github.com/owasp-dep-scan/dosai/releases/download/v0.1.1/Dosai
wget -q https://github.com/owasp-dep-scan/dosai/releases/download/v0.1.1/Dosai-osx-arm64
cd ..
cd gobintests
wget -q https://github.com/containerd/containerd/releases/download/v1.7.14/containerd-1.7.14-linux-amd64.tar.gz
wget -q https://github.com/containerd/nerdctl/releases/download/v1.7.4/nerdctl-1.7.4-windows-amd64.tar.gz
tar -xvf containerd-1.7.14-linux-amd64.tar.gz
tar -xvf nerdctl-1.7.4-windows-amd64.tar.gz
rm containerd-1.7.14-linux-amd64.tar.gz
rm nerdctl-1.7.4-windows-amd64.tar.gz
cd ..
poetry run blint sbom -i bintests -o reports/bom.json --deep
poetry run blint sbom -i gobintests -o reports/bom.json --deep
env:
SCAN_DEBUG_MODE: "debug"
39 changes: 24 additions & 15 deletions README.md
@@ -1,8 +1,7 @@
# BLint

![blint logo](blint.png)

BLint is a Binary Linter to check the security properties, and capabilities in your executables. It is powered by [lief](https://github.com/lief-project/LIEF). Since version 2, blint can also generate Software Bill-of-Materials (SBOM) for supported binaries.
![blint logo]
BLint is a Binary Linter that checks the security properties and capabilities of your executables. It is powered by [lief](https://github.com/lief-project/LIEF). Since version 2, blint can also generate Software Bill-of-Materials (SBOM) for supported binaries.

[![BLint Demo](https://asciinema.org/a/438138.png)](https://asciinema.org/a/438138)

@@ -13,15 +12,13 @@ Supported binary formats:
- PE (exe, dll)
- Mach-O (x64, arm64)

You can run blint on Linux, Windows and Mac against any of these binary formats.
You can run blint on Linux, Windows, and Mac against any of these binary formats.

## Motivation

Nowadays, vendors distribute statically linked binaries produced by golang or rust or dotnet tooling. Users are used to running antivirus and anti-malware scans while using these binaries in their local devices. Blint augments these scans by listing the technical capabilities of a binary. For example, whether the binary could use network connections, or can perform file system operations and so on.

The binary is first parsed using lief framework to identify the various properties such as functions, static, and dynamic symbols present. Thanks to YAML based [annotations](./blint/data/annotations) data, this information could be matched against capabilities and presented visually using a rich table.

NOTE: The presence of capabilities doesn't imply that the operations are always performed by the binary. Use the output of this tool to get an idea about a binary. Also, this tool is not suitable to review malware and other heavily obfuscated binaries for obvious reasons.
Nowadays, vendors distribute statically linked binaries produced by Golang, Rust, or Dotnet tooling. Users are used to running antivirus and anti-malware scans while using these binaries in their local devices. Blint augments these scans by listing the technical capabilities of a binary. For example, whether the binary could use network connections or can perform file system operations and so on.
The binary is first parsed using the lief framework to identify the various properties, such as functions and the presence of symtab and dynamic symbols. Thanks to YAML-based annotation data, this information can be matched against capabilities and presented visually using a rich table.
NOTE: The presence of capabilities doesn't imply that the binary always performs the operations. Use the output of this tool to get an idea about a binary. Also, this tool is not suitable for reviewing malware and other heavily obfuscated binaries for obvious reasons.

## Use cases

@@ -39,7 +36,7 @@ pip install blint

### Single binary releases

You can download single binary builds from the [blint-bin releases](https://github.com/OWASP-dep-scan/blint/releases). These executables should work with requiring python to be installed. The macOS .pkg file is signed with a valid developer account.
You can download single binary builds from the [blint-bin releases](https://github.com/OWASP-dep-scan/blint/releases). These executables should work without requiring python to be installed. The macOS .pkg file is signed with a valid developer account.

## Usage

@@ -83,7 +80,7 @@ options:
operation.
```
To test any binary including default commands
To test any binary, including default commands
```bash
blint -i /bin/netstat -o /tmp/blint
@@ -111,12 +108,24 @@ blint sbom -i /path/to/apk -o bom.json
blint sbom -i /directory/with/apk/aab -o bom.json
```
To parse all files including `.dex` files, pass `--deep` argument.
To parse all files, including `.dex` files, pass `--deep` argument.
```shell
blint sbom -i /path/to/apk -o bom.json --deep
```
The following binaries are supported:
- Android (apk/aab)
- Dotnet executable binaries
- Go binaries
```shell
blint sbom -i /path/to/go-binaries -o bom.json --deep
```
For all other binaries, the symbols will be collected and represented as properties with `internal` prefixes for the parent component. Child components and dependencies would be missing.
PowerShell example
![PowerShell](./docs/blint-powershell.jpg)
@@ -127,7 +136,7 @@ Blint produces the following json artifacts in the reports directory:
- blint-output.html - HTML output from the console logs
- exename-metadata.json - Raw metadata about the parsed binary. Includes symbols, functions, and signature information
- findings.json - Contains information from the security properties audit. Useful for CI/CD based integration
- findings.json - Contains information from the security properties audit. Useful for CI/CD integrations
- reviews.json - Contains information from the capability reviews. Useful for further analysis
- fuzzables.json - Contains a suggested list of methods for fuzzing
@@ -140,10 +149,10 @@ sbom command generates CycloneDX json.
## Discord support
The developers could be reached via the [discord](https://discord.gg/DCNxzaeUpd) channel.
The developers can be reached via the [Discord](https://discord.gg/DCNxzaeUpd) channel.
## Sponsorship wishlist
If you love blint, you should consider [donating](https://owasp.org/donate?reponame=www-project-dep-scan&title=OWASP+dep-scan) to our project. In addition, consider donating to the below projects which make blint possible.
If you love blint, you should consider [donating](https://owasp.org/donate?reponame=www-project-dep-scan&title=OWASP+dep-scan) to our project. In addition, consider donating to the below projects, which make blint possible.
- [LIEF](https://github.com/sponsors/lief-project/)
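
Not part of this commit: a minimal sketch of how a consumer might list the Go modules found in the `bom.json` produced by the `blint sbom -i /path/to/go-binaries -o bom.json --deep` command documented in the README change above. It assumes the output follows the usual CycloneDX JSON layout, with a top-level `components` array whose entries carry a `purl`. The helper name `golang_purls` and the inline sample document are hypothetical.

```python
import json


def golang_purls(bom: dict) -> list[str]:
    """Return the purl of every component that looks like a Go module."""
    return [
        c["purl"]
        for c in bom.get("components", [])
        # Guard against a missing or null purl before checking the prefix.
        if (c.get("purl") or "").startswith("pkg:golang/")
    ]


# Hypothetical, heavily trimmed stand-in for a generated bom.json.
sample_bom = json.loads("""
{
  "bomFormat": "CycloneDX",
  "components": [
    {"type": "library", "name": "golang.org/x/sys",
     "purl": "pkg:golang/golang.org%2Fx%2Fsys@v0.18.0"},
    {"type": "library", "name": "Example.Lib",
     "purl": "pkg:nuget/Example.Lib@1.0.0"}
  ]
}
""")

print(golang_purls(sample_bom))
```

In practice the sample would be replaced by `json.load()` on the real report file. Filtering on the purl prefix keeps the helper independent of how the Go components are nested or typed.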
84 changes: 71 additions & 13 deletions blint/binary.py
@@ -229,7 +229,7 @@ def ignorable_symbol(symbol_name: str | None) -> bool:
"""
if not symbol_name:
return True
for pref in ("$f64.", "__"):
for pref in ("_$f", "$f64.", "__"):
if symbol_name.startswith(pref):
return True
return False
@@ -656,18 +656,19 @@ def parse_macho_symbols(symbols):
if not symbol_name or isinstance(symbol_name, lief.lief_errors):
symbol_name = symbol.name
symbol_name = symbol_name.replace("..", "::")
if not exe_type:
exe_type = guess_exe_type(symbol_name)
symbols_list.append(
{
"name": (f"{libname}::{symbol_name}" if libname else symbol_name),
"short_name": symbol_name,
"type": symbol.type,
"num_sections": symbol.numberof_sections,
"description": symbol.description,
"value": symbol_value,
}
)
if not ignorable_symbol(symbol_name):
if not exe_type:
exe_type = guess_exe_type(symbol_name)
symbols_list.append(
{
"name": (f"{libname}::{symbol_name}" if libname else symbol_name),
"short_name": symbol_name,
"type": symbol.type,
"num_sections": symbol.numberof_sections,
"description": symbol.description,
"value": symbol_value,
}
)
except (AttributeError, TypeError):
continue
return symbols_list, exe_type
@@ -759,6 +760,8 @@ def add_elf_metadata(exe_file, metadata, parsed_obj):
metadata["functions"] = parse_functions(parsed_obj.functions)
metadata["ctor_functions"] = parse_functions(parsed_obj.ctor_functions)
metadata["dotnet_dependencies"] = parse_overlay(parsed_obj)
metadata["go_dependencies"], metadata["go_formulation"] = parse_go_buildinfo(parsed_obj)

return metadata


@@ -927,6 +930,59 @@ def parse_overlay(parsed_obj: lief.Binary) -> dict[str, dict]:
return deps


def parse_go_buildinfo(parsed_obj: lief.Binary) -> (dict[str, dict[str, str]], dict[str, str]):
"""
Parse the go build info section to extract go dependencies
Args:
parsed_obj (lief.Binary): The parsed object representing the binary.
Returns:
tuple(dict[str, str], dict[str, str]): Tuple representing the dependencies and formulation.
"""
formulation = {}
deps = {}
build_info_str: str = ""
# Look for specific buildinfo sections for ELF and MachO binaries
build_info: lief.Section = None
if isinstance(parsed_obj, lief.ELF.Binary):
build_info = parsed_obj.get_section(".go.buildinfo")
elif isinstance(parsed_obj, lief.MachO.Binary):
build_info = parsed_obj.get_section("__go_buildinfo")
if build_info and build_info.size:
build_info_str = (
codecs.decode(build_info.content.tobytes(), encoding="utf-8", errors="replace")
.replace("\0", "")
.replace("\uFFFD", "")
.replace("\t", " ")
).strip()
build_info_str = build_info_str.encode('ascii', 'ignore').decode('ascii')
elif isinstance(parsed_obj, lief.PE.Binary):
# For PE binaries look for .data section
s: lief.PE.Section = parsed_obj.get_section(".data")
build_info_str = codecs.decode(s.content.tobytes()[:int(s.size / 32)], encoding="ascii",
errors="replace").replace("\0", "").replace("\uFFFD", "").replace("\t", " ")
lines = build_info_str.split("\n")
for line in lines:
if line.startswith("Go buildinf:"):
tmp_a = line.split("Go buildinf:")
formulation["go_version"] = tmp_a[-1].split("\x19")[0].split(" ")[-1]
if "path " in line:
tmp_a = line.split("path ")
formulation["path"] = tmp_a[-1]
if line.startswith("mod "):
tmp_a = line.split("mod ")
formulation["module"] = tmp_a[-1]
if line.startswith("dep "):
tmp_a = line.removeprefix("dep ").split(" ")
deps[tmp_a[0]] = {"version": tmp_a[1],
"hash": tmp_a[2] if len(tmp_a) == 3 and tmp_a[2].startswith("h1:") else None}
if line.startswith("build "):
tmp_a = line.removeprefix("build ").split("=")
formulation[tmp_a[0].replace("-", "")] = tmp_a[1]

return deps, formulation


def add_pe_metadata(exe_file: str, metadata: dict, parsed_obj: lief.PE.Binary):
"""Adds PE metadata to the given metadata dictionary.
@@ -978,6 +1034,7 @@ def add_pe_metadata(exe_file: str, metadata: dict, parsed_obj: lief.PE.Binary):
if i == 14 and dd.type.value == lief.PE.DataDirectory.TYPES.CLR_RUNTIME_HEADER.value:
metadata["is_dotnet"] = True
metadata["dotnet_dependencies"] = parse_overlay(parsed_obj)
metadata["go_dependencies"], metadata["go_formulation"] = parse_go_buildinfo(parsed_obj)
tls = parsed_obj.tls
if tls and tls.sizeof_zero_fill:
metadata["tls_address_index"] = tls.addressof_index
@@ -1138,6 +1195,7 @@ def add_mach0_metadata(exe_file, metadata, parsed_obj):
metadata = add_mach0_commands(metadata, parsed_obj)
metadata = add_mach0_functions(metadata, parsed_obj)
metadata = add_mach0_signature(exe_file, metadata, parsed_obj)
metadata["go_dependencies"], metadata["go_formulation"] = parse_go_buildinfo(parsed_obj)
return metadata
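
Not part of this commit: a standalone sketch of the line-oriented parsing that `parse_go_buildinfo` performs once the section bytes have been decoded. It deliberately leaves out the lief section lookup and the `Go buildinf:` version handling, and it is slightly stricter than the hunk above (prefix matching for `path`, and a guard for short `dep` and `build` lines). The function name `parse_buildinfo_text`, the module names, and the `h1:` value in the sample are hypothetical.

```python
def parse_buildinfo_text(text: str) -> tuple[dict[str, dict], dict[str, str]]:
    """Split decoded Go build info text into (dependencies, formulation)."""
    deps: dict[str, dict] = {}
    formulation: dict[str, str] = {}
    # Fields in the embedded build info are tab separated; normalise to
    # spaces first, as the committed code does.
    for line in text.replace("\t", " ").split("\n"):
        if line.startswith("path "):
            formulation["path"] = line.removeprefix("path ")
        elif line.startswith("mod "):
            formulation["module"] = line.removeprefix("mod ")
        elif line.startswith("dep "):
            parts = line.removeprefix("dep ").split(" ")
            if len(parts) < 2:
                continue  # malformed entry: no version present
            has_hash = len(parts) >= 3 and parts[2].startswith("h1:")
            deps[parts[0]] = {
                "version": parts[1],
                "hash": parts[2] if has_hash else None,
            }
        elif line.startswith("build "):
            # Split on the first "=" only, so values may contain "=".
            key, sep, value = line.removeprefix("build ").partition("=")
            if sep:
                formulation[key.replace("-", "")] = value
    return deps, formulation


# Hypothetical sample; the hash is a placeholder, not a real checksum.
sample = (
    "path\tgithub.com/example/app\n"
    "mod\tgithub.com/example/app\tv1.0.0\n"
    "dep\tgolang.org/x/sys\tv0.18.0\th1:PLACEHOLDER=\n"
    "build\t-compiler=gc\n"
    "build\tGOOS=linux\n"
)
deps, formulation = parse_buildinfo_text(sample)
print(deps)
print(formulation)
```

Using `partition("=")` rather than `split("=")` is a design choice of this sketch: it tolerates build settings whose values contain `=` and skips lines that have no `=` at all.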


83 changes: 69 additions & 14 deletions blint/sbom.py
@@ -1,5 +1,8 @@
import base64
import binascii
import codecs
import os
import urllib.parse
import uuid
from datetime import datetime
from typing import Any, Dict
@@ -317,44 +320,44 @@ def process_exe_file(
# If this is unsuccessful then store the information as a property
lib_components += components_from_symbols_version(symbols_version)
if not lib_components and symbols_version:
parent_component.properties += [
parent_component.properties.append(
Property(
name="internal:symbols_version",
value=", ".join([f["name"] for f in symbols_version]),
)
]
)
internal_functions = [f["name"] for f in metadata.get("functions", []) if not f["name"].startswith("__")]
if internal_functions:
parent_component.properties += [
parent_component.properties.append(
Property(
name="internal:functions",
value=SYMBOL_DELIMITER.join(internal_functions),
)
]
)
symtab_symbols = [f["name"] for f in metadata.get("symtab_symbols", [])]
if symtab_symbols:
parent_component.properties += [
parent_component.properties.append(
Property(
name="internal:symtab_symbols",
value=SYMBOL_DELIMITER.join(symtab_symbols),
)
]
)
all_imports = [f["name"] for f in metadata.get("imports", [])]
if all_imports:
parent_component.properties += [
parent_component.properties.append(
Property(
name="internal:imports",
value=SYMBOL_DELIMITER.join(all_imports),
)
]
)
dynamic_symbols = [f["name"] for f in metadata.get("dynamic_symbols", [])]
if dynamic_symbols:
parent_component.properties += [
parent_component.properties.append(
Property(
name="internal:dynamic_symbols",
value=SYMBOL_DELIMITER.join(dynamic_symbols),
)
]
)
if not sbom.metadata.component.components:
sbom.metadata.component.components = []
_add_to_parent_component(sbom.metadata.component.components, parent_component)
@@ -370,6 +373,18 @@ def process_exe_file(
if metadata.get("dotnet_dependencies"):
pe_components = process_dotnet_dependencies(metadata.get("dotnet_dependencies"), dependencies_dict)
lib_components += pe_components
# Convert go dependencies
if metadata.get("go_dependencies"):
go_components = process_go_dependencies(metadata.get("go_dependencies"))
lib_components += go_components
# Convert go formulation section
for k, v in metadata.get("go_formulation", {}).items():
parent_component.properties.append(
Property(
name=f"internal:{camel_to_snake(k)}",
value=str(v).strip(),
)
)
if lib_components:
components += lib_components
track_dependency(dependencies_dict, parent_component, lib_components)
@@ -485,9 +500,10 @@ def process_dotnet_dependencies(dotnet_deps: dict[str, dict], dependencies_dict:
purl = f"pkg:nuget/{tmp_a[0]}@{tmp_a[1]}"
hash_content = ""
try:
hash_content = str(base64.b64decode(v.get("sha512", "").removeprefix("sha512-"), validate=True), "utf-8")
except Exception:
pass
hash_content = codecs.encode(base64.b64decode(v.get("sha512").removeprefix("sha512-"), validate=True),
encoding="hex")
except binascii.Error:
hash_content = str(v.get("hash").removeprefix("sha512-"))
comp = Component(
type=Type.application if v.get("type") == "project" else Type.library,
name=tmp_a[0],
@@ -505,7 +521,7 @@ def process_dotnet_dependencies(dotnet_deps: dict[str, dict], dependencies_dict:
comp.bom_ref = RefType(purl)
components.append(comp)
targets: dict[str, dict[str, dict]] = dotnet_deps.get("targets", {})
for tk, tv in targets.items():
for _, tv in targets.items():
for k, v in tv.items():
tmp_a = k.split("/")
purl = f"pkg:nuget/{tmp_a[0]}@{tmp_a[1]}"
@@ -518,6 +534,45 @@ def process_dotnet_dependencies(dotnet_deps: dict[str, dict], dependencies_dict:
return components


def process_go_dependencies(go_deps: dict[str, str]) -> list[Component]:
"""
Process the go dependencies metadata extracted for binary overlays
Args:
go_deps (dict[str, str]): dependencies metadata
Returns:
list: New component list
"""
components = []
# Key is the name and value is the version
# We need to construct a purl by pretending the module name is the name with no namespace
# This would make this compatible with cdxgen and depscan
# See https://github.com/CycloneDX/cdxgen/issues/897
for k, v in go_deps.items():
purl = f"""pkg:golang/{urllib.parse.quote_plus(k)}@{v.get("version")}"""
comp = Component(
type=Type.library,
name=k,
version=v.get("version"),
purl=purl,
scope=Scope.required,
evidence=create_component_evidence(k, 1.0)
)
hash_content = ""
if v.get("hash"):
try:
hash_content = codecs.encode(base64.b64decode(v.get("hash").removeprefix("h1:"), validate=True),
encoding="hex")
except binascii.Error:
hash_content = str(v.get("hash").removeprefix("h1:"))
if hash_content:
comp.hashes = [Hash(alg=HashAlg.SHA_256, content=hash_content)]
comp.bom_ref = RefType(f"""pkg:golang/{k}@{v.get("version")}""")
components.append(comp)
return components


def track_dependency(
dependencies_dict: dict[str, set], parent_component: Component, app_components: list[Component]
) -> None:
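
Not part of this commit: a small sketch of the two conversions that `process_go_dependencies` applies to each entry, namely building a `pkg:golang/` purl from the module path and turning an `h1:` hash into hex. It uses only the standard library, so the CycloneDX model classes (`Component`, `Hash`, `RefType`) are left out. The helper names `go_purl` and `h1_to_hex` are hypothetical, and unlike the committed code, which uses `codecs.encode(..., encoding="hex")` and therefore yields `bytes`, this sketch returns a `str` via `bytes.hex()`.

```python
import base64
import binascii
import urllib.parse


def go_purl(module: str, version: str) -> str:
    """Build a purl that treats the whole module path as the name."""
    # quote_plus also encodes "/", so the path is not split into a namespace.
    return f"pkg:golang/{urllib.parse.quote_plus(module)}@{version}"


def h1_to_hex(h1_hash: str) -> str:
    """Convert an 'h1:<base64>' module hash to hex, falling back to raw text."""
    raw = h1_hash.removeprefix("h1:")
    try:
        return base64.b64decode(raw, validate=True).hex()
    except binascii.Error:
        # Not valid base64: keep the stripped value, as the commit does.
        return raw


print(go_purl("golang.org/x/sys", "v0.18.0"))
# Synthetic 32-byte digest, used only to exercise the conversion.
print(h1_to_hex("h1:" + base64.b64encode(bytes(32)).decode()))
```

Note that the committed code quotes the module path in `purl` but builds `bom_ref` from the unquoted path, so the two strings differ for any module whose path contains `/`.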