SCM Inventory Module

The SCM Inventory module automates the deployment of the resources needed to scan source code management (SCM) platforms and pull an inventory from them. It currently supports GitHub: it pulls an organization's repositories and, optionally, their issues and pull requests, to generate an inventory and keep it up to date.

By default, the inventory also records the top 5 languages and the top 5 topics of each repository. This can be customized to include additional data.

This Terraform module provisions an AWS EC2 instance, configures it with the necessary permissions, and sets up a workflow that fetches GitHub inventory data and pushes it to an S3 bucket. The module is designed to be flexible and can be customized to support additional SCM platforms and data sources.

Supported SCM

  • GitHub: For more information, see the Python module github_inventory stored in this repository.

Prerequisites

  • AWS CLI configured with appropriate credentials
  • Access to an AWS account with permissions to create EC2 instances, IAM roles, policies, and S3 buckets
  • A GitHub token with permissions to access the repositories and organizations you wish to scan

Usage

Configure AWS Credentials

Ensure your AWS CLI is configured with credentials that have the necessary permissions to create the resources defined in this module.

Prepare GitHub Token

Store your GitHub token in AWS Secrets Manager. Note the ARN of the secret as it will be used in the Terraform variables.

Set Terraform Variables

Customize the Terraform variables defined in the variables.tf file or provide a terraform.tfvars file with your specific values.

We recommend setting the variables in a terraform.tfvars file based on the provided terraform.tfvars.example file.

Key variables include:

  • aws_profile: The AWS profile to use for authentication.
  • aws_region: The AWS region where resources will be deployed.
  • s3_bucket_name: The name of the S3 bucket where the inventory will be stored. This bucket must be created beforehand.
  • github_token_secret_name: The ARN of the AWS Secrets Manager secret containing your GitHub token. This secret must be provisioned separately.
  • project_name: A name for your project.
  • scanned_org: The GitHub organization you wish to scan.
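
As an illustration of the key variables above, a terraform.tfvars file might look like the following sketch. The variable names come from this module. Every value (profile, bucket, account ID, secret ARN, organization) is a made-up placeholder to replace with your own.

```hcl
# terraform.tfvars (illustrative placeholder values only)

# AWS profile and region used to deploy the resources
aws_profile = "my-aws-profile"
aws_region  = "us-east-1"

# Existing S3 bucket that receives the scripts and the inventory results
s3_bucket_name = "my-inventory-bucket"

# Secrets Manager secret holding the GitHub token (hypothetical ARN)
github_token_secret_name = "arn:aws:secretsmanager:us-east-1:123456789012:secret:github-token-AbCdEf"

# Project name and the GitHub organization to scan
project_name = "secrets-detection"
scanned_org  = "my-github-org"
```

Since terraform.tfvars is loaded automatically by Terraform, no extra command-line flag is needed once the file sits next to the module's .tf files.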

Initialize Terraform

Run terraform init in the infrastructure/inventory/aws/scm-inventory/ directory to initialize the Terraform project.

Apply Terraform Configuration

Execute terraform apply to create the resources. Review the plan and confirm the action.

Access the Inventory

Once the EC2 instance completes its run, the generated inventory is available in the specified S3 bucket. By default, the instance terminates automatically after completion.

Additional Notes

The EC2 instance uses the t2.micro instance type by default to avoid unnecessary costs. Set the instance_type variable to use a different instance type.

It is also possible to keep the EC2 instance running after the inventory has been generated, which can be useful for debugging. To do so, set the terminate_instance_after_completion variable to false.

The module supports optional fetching of issues and pull requests from the scanned GitHub organizations by setting the fetch_issues and fetch_pr variables.
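
The optional settings described in these notes could be combined in terraform.tfvars as in the sketch below. The variable names are the module's own. The chosen values, including the t3.small instance type, are only examples and not recommendations.

```hcl
# Optional overrides (example values, adjust to your needs)

# Use a larger instance than the default t2.micro
instance_type = "t3.small"

# Keep the instance alive after the scan, e.g. for debugging
terminate_instance_after_completion = false

# Also fetch issues and pull requests (both default to false)
fetch_issues = true
fetch_pr     = true
```

Keeping the instance running means it continues to incur costs until it is terminated manually.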

The inventory script is located in the scripts/inventory/github_inventory directory.

For detailed information on the resources created and managed by this module, refer to the automatically generated documentation below.

Requirements

| Name | Version |
|------|---------|
| terraform | >=1.7 |
| aws | ~> 5.0 |

Providers

| Name | Version |
|------|---------|
| aws | ~> 5.0 |
| local | n/a |
| null | n/a |

Modules

No modules.

Resources

| Name | Type |
|------|------|
| aws_iam_instance_profile.ec2_instance_profile | resource |
| aws_iam_policy.permissions_for_ec2_instance | resource |
| aws_iam_policy.s3_access_policy | resource |
| aws_iam_role.ec2_role | resource |
| aws_iam_role_policy_attachment.PermissionsForEC2InstancePolicyAttachment | resource |
| aws_instance.ec2_inventory | resource |
| aws_s3_object.poetry_dist | resource |
| null_resource.poetry_build | resource |
| aws_ami.amazon_ami | data source |
| aws_caller_identity.current | data source |
| aws_iam_policy_document.ec2_assume_role | data source |
| aws_iam_policy_document.policy_document_permissions_for_ec2_instance | data source |
| aws_iam_policy_document.s3_access_policy_document | data source |
| aws_s3_bucket.resources_and_results | data source |
| aws_secretsmanager_secret.github_token_secret | data source |
| aws_security_group.default | data source |
| aws_security_groups.custom_security_groups | data source |
| aws_subnet.selected | data source |
| aws_subnets.default | data source |
| aws_vpc.selected | data source |
| local_file.dist | data source |

Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|----------|
| ami_image_filter | Filter used to find the Amazon Machine Image (AMI) for the EC2 instance; the name can contain wildcards. Only GNU/Linux images are supported. | string | "amzn2-ami-hvm*" | no |
| ami_owner | Owner of the Amazon Machine Image (AMI) to use for the EC2 instance | string | "amazon" | no |
| aws_default_security_groups_filters | Filters to use to find the default security groups | list(string) | [] | no |
| aws_profile | AWS profile to use for authentication | string | n/a | yes |
| aws_region | AWS region where to deploy resources | string | "us-east-1" | no |
| ec2_workdir | Working directory for the EC2 instance | string | "~/github-inventory" | no |
| environment_type | Environment (PRODUCTION, PRE-PRODUCTION, QUALITY ASSURANCE, INTEGRATION TESTING, DEVELOPMENT, LAB) | string | "PRODUCTION" | no |
| fetch_issues | Indicates whether to fetch issues for the repositories | bool | false | no |
| fetch_pr | Indicates whether to fetch pull requests for the repositories | bool | false | no |
| github_token_secret_name | ARN of the AWS Secrets Manager secret containing the GitHub token of the Service Account | string | n/a | yes |
| instance_type | Instance type to use for fetching the inventory | string | "t2.micro" | no |
| inventory_project_dir | Path to the directory containing the inventory project | string | "../../../../scripts/inventory/github_inventory" | no |
| permissions_boundary_arn | Permissions boundary to use for the IAM role | string | null | no |
| project_name | Name of the project | string | "secrets-detection" | no |
| project_version | Version of the project | string | "0.1.0" | no |
| s3_bucket_name | S3 bucket name where to upload the scripts and results | string | n/a | yes |
| scanned_org | Name of the organization to scan | string | n/a | yes |
| subnet_name | Filter to select the subnet to use; this can use wildcards. | string | null | no |
| tags | A map of tags to add to the resources | map(string) | {} | no |
| terminate_instance_after_completion | Indicates whether the instance should be terminated once the scan has finished (set to false for debugging purposes) | bool | true | no |
| vpc_name | Filter to select the VPC to use; this can use wildcards. | string | "" | no |
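
For the network, image, and tagging inputs listed above, a hedged sketch of terraform.tfvars entries follows. The variable names and the default AMI filter come from the inputs table. The VPC and subnet patterns and the tag keys are hypothetical, and how each filter is matched (for example against a Name tag) is an assumption to verify in the module's data sources.

```hcl
# Network selection (hypothetical name patterns, wildcards allowed)
vpc_name    = "inventory-vpc-*"
subnet_name = "inventory-private-*"

# AMI lookup (these are the module defaults, shown for reference)
ami_image_filter = "amzn2-ami-hvm*"
ami_owner        = "amazon"

# Tags added to the created resources (example keys and values)
tags = {
  Team    = "security"
  Purpose = "scm-inventory"
}
```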

Outputs

| Name | Description |
|------|-------------|
| ec2_instance_arn | n/a |
| ec2_instance_id | n/a |
| ec2_role_arn | n/a |