Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[flake8-bandit] Detect patterns from multi line SQL statements (S608) #13574

Open
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

DataEnggNerd
Copy link

Summary

The regex written for identifying possible sql injections was not identifying some patterns, mostly multiple lined queries.

Fixes include:

Test Plan

Copy link
Contributor

ruff-ecosystem results

Linter (stable)

ℹ️ ecosystem check detected linter changes. (+17 -191 violations, +0 -0 fixes in 4 projects; 50 projects unchanged)

apache/airflow (+0 -161 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --no-preview --select ALL

- airflow/example_dags/tutorial_objectstorage.py:118:22: S608 Possible SQL injection vector through string-based query construction
- airflow/migrations/utils.py:38:9: S608 Possible SQL injection vector through string-based query construction
- airflow/migrations/versions/0003_2_7_0_add_include_deferred_column_to_pool.py:46:24: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/amazon/aws/hooks/redshift_data.py:230:15: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/amazon/aws/transfers/redshift_to_s3.py:153:16: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/amazon/aws/transfers/s3_to_redshift.py:173:30: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/amazon/aws/transfers/s3_to_redshift.py:191:17: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/amazon/aws/transfers/s3_to_redshift.py:192:17: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/amazon/aws/utils/openlineage.py:47:11: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/apache/cassandra/hooks/cassandra.py:210:17: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/apache/hive/operators/hive_stats.py:133:15: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/apache/hive/operators/hive_stats.py:146:15: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/apache/hive/operators/hive_stats.py:155:19: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/apache/hive/sensors/metastore_partition.py:74:24: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/common/sql/operators/sql.py:460:20: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/common/sql/operators/sql.py:671:20: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/common/sql/operators/sql.py:946:16: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/databricks/operators/databricks_sql.py:319:24: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/databricks/sensors/databricks_partition.py:146:19: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/google/cloud/example_dags/example_facebook_ads_to_gcs.py:114:26: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/google/cloud/example_dags/example_presto_to_gcs.py:115:26: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/google/cloud/example_dags/example_presto_to_gcs.py:125:13: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/google/cloud/example_dags/example_presto_to_gcs.py:164:26: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/google/cloud/example_dags/example_presto_to_gcs.py:175:13: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/google/cloud/example_dags/example_presto_to_gcs.py:67:13: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/google/cloud/example_dags/example_presto_to_gcs.py:76:13: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/google/cloud/example_dags/example_salesforce_to_gcs.py:107:26: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/google/cloud/hooks/dataproc_metastore.py:679:17: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/google/cloud/transfers/gcs_to_bigquery.py:477:17: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/microsoft/mssql/hooks/mssql.py:129:16: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/microsoft/mssql/hooks/mssql.py:90:13: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/mysql/hooks/mysql.py:228:13: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/oracle/hooks/oracle.py:315:19: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/postgres/hooks/postgres.py:330:15: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/salesforce/hooks/salesforce.py:222:17: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/teradata/transfers/azure_blob_to_teradata.py:107:22: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/teradata/transfers/s3_to_teradata.py:107:22: S608 Possible SQL injection vector through string-based query construction
- airflow/utils/db_cleanup.py:149:43: S608 Possible SQL injection vector through string-based query construction
... 123 additional changes omitted for project

apache/superset (+0 -30 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --no-preview --select ALL

- superset/db_engine_specs/gsheets.py:274:40: S608 Possible SQL injection vector through string-based query construction
- superset/db_engine_specs/mssql.py:155:17: S608 Possible SQL injection vector through string-based query construction
- superset/db_engine_specs/postgres.py:465:17: S608 Possible SQL injection vector through string-based query construction
- superset/db_engine_specs/presto.py:505:39: S608 Possible SQL injection vector through string-based query construction
- superset/db_engine_specs/redshift.py:179:17: S608 Possible SQL injection vector through string-based query construction
- superset/migrations/shared/utils.py:114:17: S608 Possible SQL injection vector through string-based query construction
- superset/migrations/versions/2022-04-01_14-38_a9422eeaae74_new_dataset_models_take_2.py:871:9: S608 Possible SQL injection vector through string-based query construction
- superset/utils/encrypt.py:136:29: S608 Possible SQL injection vector through string-based query construction
- superset/utils/encrypt.py:186:18: S608 Possible SQL injection vector through string-based query construction
- superset/utils/mock_data.py:284:30: S608 Possible SQL injection vector through string-based query construction
... 20 additional changes omitted for project

freedomofpress/securedrop (+4 -0 violations, +0 -0 fixes)

+ securedrop/alembic/versions/2e24fc7536e8_make_journalist_id_non_nullable.py:67:68: RUF100 [*] Unused `noqa` directive (unused: `S608`)
+ securedrop/alembic/versions/2e24fc7536e8_make_journalist_id_non_nullable.py:84:80: RUF100 [*] Unused `noqa` directive (unused: `S608`)
+ securedrop/alembic/versions/2e24fc7536e8_make_journalist_id_non_nullable.py:89:73: RUF100 [*] Unused `noqa` directive (unused: `S608`)
+ securedrop/alembic/versions/b7f98cfd6a70_make_filesystem_id_non_nullable.py:46:57: RUF100 [*] Unused `noqa` directive (unused: `S608`)

indico/indico (+13 -0 violations, +0 -0 fixes)

+ indico/cli/database.py:54:49: RUF100 [*] Unused `noqa` directive (unused: `S608`)
+ indico/cli/database.py:56:96: RUF100 [*] Unused `noqa` directive (unused: `S608`)
+ indico/core/db/sqlalchemy/util/management.py:98:19: RUF100 [*] Unused `noqa` directive (unused: `S608`)
+ indico/migrations/versions/20171124_1138_2af245be72a6_review_questions_models.py:47:15: RUF100 [*] Unused `noqa` directive (unused: `S608`)
+ indico/migrations/versions/20171124_1138_2af245be72a6_review_questions_models.py:48:93: RUF100 [*] Unused `noqa` directive (unused: `S608`)
+ indico/migrations/versions/20190118_1514_416f9c877300_add_reservationlink_table.py:70:11: RUF100 [*] Unused `noqa` directive (unused: `S608`)
+ indico/migrations/versions/20230712_1742_9b3fc740b722_fix_user_name_format_values.py:38:15: RUF100 [*] Unused `noqa` directive (unused: `S608`)
+ indico/migrations/versions/20230712_1742_9b3fc740b722_fix_user_name_format_values.py:47:15: RUF100 [*] Unused `noqa` directive (unused: `S608`)
+ indico/migrations/versions/20231124_0941_fb0ca1440185_add_redirect_uri_checkin_app.py:26:11: RUF100 [*] Unused `noqa` directive (unused: `S608`)
+ indico/migrations/versions/20231124_0941_fb0ca1440185_add_redirect_uri_checkin_app.py:34:11: RUF100 [*] Unused `noqa` directive (unused: `S608`)
... 3 additional changes omitted for project

Changes by rule (2 rules affected)

code total + violation - violation + fix - fix
S608 191 0 191 0 0
RUF100 17 17 0 0 0

Linter (preview)

ℹ️ ecosystem check detected linter changes. (+17 -191 violations, +0 -0 fixes in 4 projects; 50 projects unchanged)

apache/airflow (+0 -161 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview --select ALL

- airflow/example_dags/tutorial_objectstorage.py:118:22: S608 Possible SQL injection vector through string-based query construction
- airflow/migrations/utils.py:38:9: S608 Possible SQL injection vector through string-based query construction
- airflow/migrations/versions/0003_2_7_0_add_include_deferred_column_to_pool.py:46:24: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/amazon/aws/hooks/redshift_data.py:230:15: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/amazon/aws/transfers/redshift_to_s3.py:153:16: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/amazon/aws/transfers/s3_to_redshift.py:173:30: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/amazon/aws/transfers/s3_to_redshift.py:191:17: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/amazon/aws/transfers/s3_to_redshift.py:192:17: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/amazon/aws/utils/openlineage.py:47:11: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/apache/cassandra/hooks/cassandra.py:210:17: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/apache/hive/operators/hive_stats.py:133:15: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/apache/hive/operators/hive_stats.py:146:15: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/apache/hive/operators/hive_stats.py:155:19: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/apache/hive/sensors/metastore_partition.py:74:24: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/common/sql/operators/sql.py:460:20: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/common/sql/operators/sql.py:671:20: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/common/sql/operators/sql.py:946:16: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/databricks/operators/databricks_sql.py:319:24: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/databricks/sensors/databricks_partition.py:146:19: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/google/cloud/example_dags/example_facebook_ads_to_gcs.py:114:26: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/google/cloud/example_dags/example_presto_to_gcs.py:115:26: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/google/cloud/example_dags/example_presto_to_gcs.py:125:13: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/google/cloud/example_dags/example_presto_to_gcs.py:164:26: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/google/cloud/example_dags/example_presto_to_gcs.py:175:13: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/google/cloud/example_dags/example_presto_to_gcs.py:67:13: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/google/cloud/example_dags/example_presto_to_gcs.py:76:13: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/google/cloud/example_dags/example_salesforce_to_gcs.py:107:26: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/google/cloud/hooks/dataproc_metastore.py:679:17: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/google/cloud/transfers/gcs_to_bigquery.py:477:17: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/microsoft/mssql/hooks/mssql.py:129:16: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/microsoft/mssql/hooks/mssql.py:90:13: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/mysql/hooks/mysql.py:228:13: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/oracle/hooks/oracle.py:315:19: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/postgres/hooks/postgres.py:330:15: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/salesforce/hooks/salesforce.py:222:17: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/teradata/transfers/azure_blob_to_teradata.py:107:22: S608 Possible SQL injection vector through string-based query construction
- airflow/providers/teradata/transfers/s3_to_teradata.py:107:22: S608 Possible SQL injection vector through string-based query construction
- airflow/utils/db_cleanup.py:149:43: S608 Possible SQL injection vector through string-based query construction
... 123 additional changes omitted for project

apache/superset (+0 -30 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview --select ALL

- superset/db_engine_specs/gsheets.py:274:40: S608 Possible SQL injection vector through string-based query construction
- superset/db_engine_specs/mssql.py:155:17: S608 Possible SQL injection vector through string-based query construction
- superset/db_engine_specs/postgres.py:465:17: S608 Possible SQL injection vector through string-based query construction
- superset/db_engine_specs/presto.py:505:39: S608 Possible SQL injection vector through string-based query construction
- superset/db_engine_specs/redshift.py:179:17: S608 Possible SQL injection vector through string-based query construction
- superset/migrations/shared/utils.py:114:17: S608 Possible SQL injection vector through string-based query construction
- superset/migrations/versions/2022-04-01_14-38_a9422eeaae74_new_dataset_models_take_2.py:871:9: S608 Possible SQL injection vector through string-based query construction
- superset/utils/encrypt.py:136:29: S608 Possible SQL injection vector through string-based query construction
- superset/utils/encrypt.py:186:18: S608 Possible SQL injection vector through string-based query construction
- superset/utils/mock_data.py:284:30: S608 Possible SQL injection vector through string-based query construction
... 20 additional changes omitted for project

freedomofpress/securedrop (+4 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview

+ securedrop/alembic/versions/2e24fc7536e8_make_journalist_id_non_nullable.py:67:68: RUF100 [*] Unused `noqa` directive (unused: `S608`)
+ securedrop/alembic/versions/2e24fc7536e8_make_journalist_id_non_nullable.py:84:80: RUF100 [*] Unused `noqa` directive (unused: `S608`)
+ securedrop/alembic/versions/2e24fc7536e8_make_journalist_id_non_nullable.py:89:73: RUF100 [*] Unused `noqa` directive (unused: `S608`)
+ securedrop/alembic/versions/b7f98cfd6a70_make_filesystem_id_non_nullable.py:46:57: RUF100 [*] Unused `noqa` directive (unused: `S608`)

indico/indico (+13 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview

+ indico/cli/database.py:54:49: RUF100 [*] Unused `noqa` directive (unused: `S608`)
+ indico/cli/database.py:56:96: RUF100 [*] Unused `noqa` directive (unused: `S608`)
+ indico/core/db/sqlalchemy/util/management.py:98:19: RUF100 [*] Unused `noqa` directive (unused: `S608`)
+ indico/migrations/versions/20171124_1138_2af245be72a6_review_questions_models.py:47:15: RUF100 [*] Unused `noqa` directive (unused: `S608`)
+ indico/migrations/versions/20171124_1138_2af245be72a6_review_questions_models.py:48:93: RUF100 [*] Unused `noqa` directive (unused: `S608`)
+ indico/migrations/versions/20190118_1514_416f9c877300_add_reservationlink_table.py:70:11: RUF100 [*] Unused `noqa` directive (unused: `S608`)
+ indico/migrations/versions/20230712_1742_9b3fc740b722_fix_user_name_format_values.py:38:15: RUF100 [*] Unused `noqa` directive (unused: `S608`)
+ indico/migrations/versions/20230712_1742_9b3fc740b722_fix_user_name_format_values.py:47:15: RUF100 [*] Unused `noqa` directive (unused: `S608`)
+ indico/migrations/versions/20231124_0941_fb0ca1440185_add_redirect_uri_checkin_app.py:26:11: RUF100 [*] Unused `noqa` directive (unused: `S608`)
+ indico/migrations/versions/20231124_0941_fb0ca1440185_add_redirect_uri_checkin_app.py:34:11: RUF100 [*] Unused `noqa` directive (unused: `S608`)
... 3 additional changes omitted for project

Changes by rule (2 rules affected)

code total + violation - violation + fix - fix
S608 191 0 191 0 0
RUF100 17 17 0 0 0

@DataEnggNerd
Copy link
Author

@MichaReiser Can you review the changes please?

Regex::new(r"(?i)\b(select\s.+\sfrom\s|delete\s+from\s|(insert|replace)\s.+\svalues\s|update\s.+\sset\s)")
.unwrap()
Regex::new(
r"(?ixs)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the motivation for including additional modifiers like x and s? We don't use any comments in the regex so x shouldn't be required. And the \s group should include the newline character so we might not require the s modifier as well. I might be wrong here so it would be helpful if you can provide some context here :)

Comment on lines -91 to -92
// f"select * from table where val = {val}"
Expr::FString(f_string) => concatenated_f_string(f_string, checker.locator()),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should avoid checking f-strings otherwise we're reducing the surface area for the rule. The fix should be to just remove the escape_default call but we still need to concatenated the f-strings.

The reason that string literals aren't being concatenated explicitly is because the to_str call above (string.value.to_str()) will concatenated them internally.

Does this make sense?

@dhruvmanila dhruvmanila added the bug Something isn't working label Oct 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants