sast-orchestration
Install command
npx skills add https://github.com/hardw00t/ai-security-arsenal --skill sast-orchestration
Skill documentation
SAST Orchestration
This skill enables comprehensive static application security testing through tool orchestration, custom rule development, finding triage, and CI/CD integration using industry-standard SAST tools.
When to Use This Skill
This skill should be invoked when:
- Scanning source code for security vulnerabilities
- Writing custom detection rules for Semgrep, CodeQL, or other SAST tools
- Triaging and prioritizing SAST findings
- Setting up automated security scanning in CI/CD pipelines
- Comparing results across multiple SAST tools
- Reducing false positives in security scans
Trigger Phrases
- “scan this code for vulnerabilities”
- “write a Semgrep rule to detect…”
- “triage these SAST findings”
- “set up security scanning in CI/CD”
- “find SQL injection in this codebase”
- “analyze the security scan results”
SAST Tool Selection Matrix
| Tool | Languages | Strengths | Best For |
|---|---|---|---|
| Semgrep | 30+ languages | Fast, custom rules, low FP | Custom patterns, quick scans |
| CodeQL | 10 languages | Deep dataflow, taint tracking | Complex vulnerability chains |
| Bandit | Python | Python-specific, easy setup | Python security audits |
| gosec | Go | Go-specific patterns | Go security scanning |
| Brakeman | Ruby/Rails | Rails-aware analysis | Rails applications |
| SpotBugs + FindSecBugs | Java | Bytecode analysis | Java/JVM apps |
| ESLint + security plugins | JavaScript/TS | IDE integration | Frontend/Node.js |
| PHPStan + security rules | PHP | Type-aware analysis | PHP applications |
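One way to apply the matrix automatically is to pick tools from the file types present in a repository. The sketch below is a minimal illustration: the `TOOLS_BY_EXTENSION` mapping and the `select_tools` helper are hypothetical names, and the tool labels are plain strings, not commands.

```python
from pathlib import Path

# Hypothetical mapping from file extension to the language-specific tools in the matrix.
TOOLS_BY_EXTENSION = {
    ".py": ["bandit"],
    ".go": ["gosec"],
    ".rb": ["brakeman"],
    ".java": ["spotbugs+findsecbugs"],
    ".js": ["eslint-security"],
    ".ts": ["eslint-security"],
    ".php": ["phpstan"],
}


def select_tools(paths):
    """Return Semgrep plus every language-specific tool whose extension appears in paths."""
    tools = ["semgrep"]  # multi-language baseline, always included
    for path in paths:
        for tool in TOOLS_BY_EXTENSION.get(Path(path).suffix.lower(), []):
            if tool not in tools:
                tools.append(tool)
    return tools


print(select_tools(["app/main.py", "web/index.ts", "README.md"]))
# ['semgrep', 'bandit', 'eslint-security']
```

CodeQL is left out of the mapping on purpose, since it needs a database build step and is usually scheduled separately from the quick scanners.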
Semgrep
Quick Start
# Install
pip install semgrep
# or
brew install semgrep
# Run with default security rules
semgrep --config=auto .
# Run specific rule packs
semgrep --config=p/security-audit .
semgrep --config=p/owasp-top-ten .
semgrep --config=p/cwe-top-25 .
# Run with custom rules
semgrep --config=./rules/ .
# Output formats
semgrep --config=auto --json -o results.json .
semgrep --config=auto --sarif -o results.sarif .
Rule Packs for Security
# Comprehensive security scanning
semgrep --config=p/security-audit \
--config=p/secrets \
--config=p/supply-chain \
--config=p/default .
# Language-specific
semgrep --config=p/python .
semgrep --config=p/javascript .
semgrep --config=p/java .
semgrep --config=p/golang .
# Framework-specific
semgrep --config=p/django .
semgrep --config=p/flask .
semgrep --config=p/react .
semgrep --config=p/nodejs .
Writing Custom Semgrep Rules
# Basic pattern matching
rules:
  - id: hardcoded-password
    pattern: password = "..."
    message: Hardcoded password detected
    languages: [python]
    severity: ERROR
    metadata:
      cwe: "CWE-798: Use of Hard-coded Credentials"
      owasp: "A07:2021 - Identification and Authentication Failures"

  # Using metavariables (pattern-either: match either form)
  - id: sql-injection-format-string
    pattern-either:
      - pattern: |
          $QUERY = f"...{$USER_INPUT}..."
          $CURSOR.execute($QUERY)
      - pattern: |
          $CURSOR.execute(f"...{$USER_INPUT}...")
    message: SQL injection via f-string
    languages: [python]
    severity: ERROR

  # Constraining a metavariable with metavariable-pattern
  - id: dangerous-subprocess
    patterns:
      - pattern: subprocess.$METHOD(..., shell=True, ...)
      - metavariable-pattern:
          metavariable: $METHOD
          pattern-either:
            - pattern: run
            - pattern: call
            - pattern: Popen
    message: Subprocess with shell=True is dangerous
    languages: [python]
    severity: WARNING

  # Taint tracking (cross-function and cross-file taint requires Semgrep Pro)
  - id: xss-vulnerability
    mode: taint
    pattern-sources:
      - pattern: request.args.get(...)
      - pattern: request.form.get(...)
    pattern-sinks:
      - pattern: render_template_string(...)
      - pattern: Markup(...)
    message: User input flows to unsafe output
    languages: [python]
    severity: ERROR
Advanced Semgrep Patterns
rules:
  # Pattern negation - exclude safe patterns
  - id: unsafe-deserialization
    patterns:
      - pattern: pickle.loads($DATA)
      - pattern-not-inside: |
          if validate_signature($DATA):
            ...
    message: Unsafe deserialization without validation
    languages: [python]
    severity: ERROR

  # Constraining a metavariable to secret-like names
  - id: timing-attack-comparison
    patterns:
      - pattern: $SECRET == $USER_INPUT
      - metavariable-pattern:
          metavariable: $SECRET
          patterns:
            - pattern-either:
                - pattern: password
                - pattern: token
                - pattern: api_key
    message: Use constant-time comparison for secrets
    languages: [python]
    severity: WARNING
    fix: hmac.compare_digest($SECRET, $USER_INPUT)

  # Pattern disjunction - either alternative matches
  - id: jwt-none-algorithm
    patterns:
      - pattern-either:
          - pattern: jwt.decode($TOKEN, ..., algorithms=["none"], ...)
          - pattern: jwt.decode($TOKEN, ..., options={"verify_signature": False}, ...)
    message: JWT verification disabled
    languages: [python]
    severity: ERROR

  # Regex-based detection
  - id: aws-access-key
    pattern-regex: 'AKIA[0-9A-Z]{16}'
    message: AWS Access Key ID detected
    languages: [generic]
    severity: ERROR

  # Path-scoped rule - only applies to production files
  - id: flask-debug-production
    patterns:
      - pattern-inside: |
          if __name__ == "__main__":
            ...
      - pattern: app.run(..., debug=True, ...)
    paths:
      include:
        - "**/*prod*.py"
        - "**/production/**"
    message: Debug mode enabled in production file
    languages: [python]
    severity: ERROR
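Before shipping a `pattern-regex` rule such as `aws-access-key`, it can help to sanity-check the expression outside Semgrep. The sketch below does that with Python's `re` module. The sample key is the placeholder ID used in AWS documentation, and `find_key_ids` is a hypothetical helper name.

```python
import re

# Same expression as the aws-access-key rule's pattern-regex.
AWS_KEY_ID = re.compile(r"AKIA[0-9A-Z]{16}")


def find_key_ids(text):
    """Return every substring that looks like an AWS access key ID."""
    return AWS_KEY_ID.findall(text)


sample = 'aws_access_key_id = "AKIAIOSFODNN7EXAMPLE"\nnote = "AKIA-not-a-key"'
print(find_key_ids(sample))
# ['AKIAIOSFODNN7EXAMPLE']
```

The second line of the sample shows a near miss: the `AKIA` prefix alone is not enough, because the rule also requires 16 uppercase alphanumeric characters after it.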
CodeQL
Setup and Basic Usage
# Install CodeQL CLI
# Download from https://github.com/github/codeql-cli-binaries
# Create database
codeql database create ./codeql-db --language=python --source-root=./src
# Run security queries
codeql database analyze ./codeql-db \
codeql/python-queries:codeql-suites/python-security-extended.qls \
--format=sarif-latest \
--output=results.sarif
# Run specific query
codeql database analyze ./codeql-db \
./custom-queries/sql-injection.ql \
--format=csv \
--output=results.csv
Writing CodeQL Queries
/**
 * @name SQL Injection
 * @description User input flows to SQL query without sanitization
 * @kind path-problem
 * @problem.severity error
 * @security-severity 9.8
 * @id py/sql-injection
 * @tags security
 *       external/cwe/cwe-089
 */
import python
// Module-based data flow API; the older SqlInjection::Configuration class is deprecated
import semmle.python.security.dataflow.SqlInjectionQuery
import SqlInjectionFlow::PathGraph

from SqlInjectionFlow::PathNode source, SqlInjectionFlow::PathNode sink
where SqlInjectionFlow::flowPath(source, sink)
select sink.getNode(), source, sink, "SQL injection from $@.", source.getNode(), "user input"

/**
 * @name Hardcoded credentials
 * @kind problem
 * @problem.severity warning
 * @id py/hardcoded-credentials
 */
import python

// In the Python libraries an assignment statement is `Assign`, and it can have several targets
from Assign a, Name target, StringLiteral s
where
  a.getValue() = s and
  a.getATarget() = target and
  target.getId().regexpMatch("(?i).*(password|secret|key|token|credential).*") and
  s.getText().length() > 5
select a, "Potential hardcoded credential in variable: " + target.getId()
CodeQL for Taint Tracking
/**
 * @name Command injection
 * @kind path-problem
 * @problem.severity error
 * @id py/custom-command-injection
 */
import python
import semmle.python.dataflow.new.DataFlow
import semmle.python.dataflow.new.TaintTracking
import semmle.python.ApiGraphs

// Module-based configuration; the older TaintTracking::Configuration class is deprecated
module CommandInjectionConfig implements DataFlow::ConfigSig {
  predicate isSource(DataFlow::Node source) {
    // Flask request inputs
    source = API::moduleImport("flask").getMember("request").getMember(_).getACall()
  }

  predicate isSink(DataFlow::Node sink) {
    // subprocess calls
    exists(DataFlow::CallCfgNode call |
      call = API::moduleImport("subprocess").getMember(_).getACall() and
      sink = call.getArg(0)
    )
    or
    // os.system
    exists(DataFlow::CallCfgNode call |
      call = API::moduleImport("os").getMember("system").getACall() and
      sink = call.getArg(0)
    )
  }

  predicate isBarrier(DataFlow::Node node) {
    // shlex.quote sanitizes command injection
    node = API::moduleImport("shlex").getMember("quote").getACall()
  }
}

module CommandInjectionFlow = TaintTracking::Global<CommandInjectionConfig>;

import CommandInjectionFlow::PathGraph

from CommandInjectionFlow::PathNode source, CommandInjectionFlow::PathNode sink
where CommandInjectionFlow::flowPath(source, sink)
select sink.getNode(), source, sink, "Command injection from $@.", source.getNode(), "user input"
Language-Specific SAST Tools
Python – Bandit
# Install
pip install bandit
# Basic scan
bandit -r ./src
# With severity filtering
bandit -r ./src -ll # Medium and above
bandit -r ./src -lll # High only
# Specific tests
bandit -r ./src -t B301,B302,B303 # Specific checks
bandit -r ./src -s B101 # Skip assert check
# Output formats
bandit -r ./src -f json -o bandit-results.json
bandit -r ./src -f sarif -o bandit-results.sarif # SARIF output needs the extra: pip install "bandit[sarif]"
# Configuration file
bandit -r ./src -c bandit.yaml
# bandit.yaml
skips: ['B101'] # Skip assert_used
tests: ['B301', 'B302', 'B303', 'B304', 'B305', 'B306', 'B307', 'B308', 'B309', 'B310', 'B311', 'B312', 'B313', 'B314', 'B315', 'B316', 'B317', 'B318', 'B319', 'B320', 'B321', 'B322', 'B323', 'B324', 'B325']
exclude_dirs: ['tests', 'venv']
Go – gosec
# Install
go install github.com/securego/gosec/v2/cmd/gosec@latest
# Basic scan
gosec ./...
# With severity filtering
gosec -severity medium ./...
# Specific rules
gosec -include=G101,G102,G103 ./...
gosec -exclude=G104 ./...
# Output formats
gosec -fmt=json -out=results.json ./...
gosec -fmt=sarif -out=results.sarif ./...
JavaScript/TypeScript – ESLint Security
# Install
npm install --save-dev eslint eslint-plugin-security eslint-plugin-no-unsanitized
# Run
npx eslint --ext .js,.ts ./src
// .eslintrc.json
{
  "plugins": ["security", "no-unsanitized"],
  "extends": ["plugin:security/recommended-legacy"],
  "rules": {
    "security/detect-object-injection": "error",
    "security/detect-non-literal-require": "error",
    "security/detect-non-literal-fs-filename": "error",
    "security/detect-eval-with-expression": "error",
    "security/detect-child-process": "warn",
    "no-unsanitized/method": "error",
    "no-unsanitized/property": "error"
  }
}
Java – SpotBugs + Find Security Bugs
<!-- pom.xml -->
<plugin>
  <groupId>com.github.spotbugs</groupId>
  <artifactId>spotbugs-maven-plugin</artifactId>
  <version>4.8.2.0</version>
  <configuration>
    <plugins>
      <plugin>
        <groupId>com.h3xstream.findsecbugs</groupId>
        <artifactId>findsecbugs-plugin</artifactId>
        <version>1.13.0</version>
      </plugin>
    </plugins>
    <effort>Max</effort>
    <threshold>Low</threshold>
  </configuration>
</plugin>
# Run
mvn spotbugs:check
# Generate report
mvn spotbugs:spotbugs
Finding Triage Workflow
Severity Classification
## Triage Priority Matrix
| Severity | Exploitability | Data Sensitivity | Priority |
|----------|---------------|------------------|----------|
| Critical | Easy | High | P0 - Immediate |
| High | Easy | Medium | P1 - This sprint |
| High | Difficult | High | P1 - This sprint |
| Medium | Easy | Low | P2 - Next sprint |
| Medium | Difficult | Medium | P2 - Next sprint |
| Low | Any | Any | P3 - Backlog |
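The matrix can be encoded as a lookup table so that triage tooling assigns priorities consistently. The sketch below is one possible encoding. The matrix does not cover every combination, so the per-severity fallback in `SEVERITY_DEFAULT` is an assumption, and `triage_priority` is a hypothetical helper name.

```python
# Rows of the matrix, keyed by (severity, exploitability, data sensitivity).
PRIORITY_MATRIX = {
    ("critical", "easy", "high"): "P0",
    ("high", "easy", "medium"): "P1",
    ("high", "difficult", "high"): "P1",
    ("medium", "easy", "low"): "P2",
    ("medium", "difficult", "medium"): "P2",
}

# Assumption: combinations the matrix does not list fall back to a per-severity default.
SEVERITY_DEFAULT = {"critical": "P0", "high": "P1", "medium": "P2", "low": "P3"}


def triage_priority(severity, exploitability, sensitivity):
    """Map a finding's attributes to a priority label (P0 to P3)."""
    key = (severity.lower(), exploitability.lower(), sensitivity.lower())
    if key[0] == "low":
        return "P3"  # the "Low | Any | Any" row
    return PRIORITY_MATRIX.get(key, SEVERITY_DEFAULT.get(key[0], "P3"))


print(triage_priority("High", "Difficult", "High"))  # P1
print(triage_priority("Low", "Easy", "High"))        # P3
```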
False Positive Identification
## Common False Positive Patterns
### SQL Injection FPs
- Parameterized queries flagged incorrectly
- ORM methods (SQLAlchemy, Django ORM)
- Constant/hardcoded queries
- Query builders with proper escaping
### XSS FPs
- Auto-escaping template engines (Jinja2 with autoescape)
- React/Vue automatic escaping
- Server-side only code paths
- Sanitization libraries in use
### Command Injection FPs
- Hardcoded command arguments
- Validated/allowlisted inputs
- Proper escaping with shlex.quote
### Crypto FPs
- Test/development environments
- Non-sensitive data encryption
- Legacy code marked for migration
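To make the first SQL injection false-positive pattern concrete, the sketch below shows why a parameterized query is safe even when the input looks hostile. It uses the standard-library `sqlite3` module with an in-memory database. The table and the `find_user` helper are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])


def find_user(conn, name):
    """Look up a user by name with a bound parameter."""
    # The driver binds `name` as data, so it cannot change the structure of the SQL.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()


print(find_user(conn, "alice"))        # [(1, 'alice')]
print(find_user(conn, "' OR '1'='1"))  # []
```

A finding on code shaped like `find_user` can usually be closed as a false positive, provided the SQL string itself is a constant.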
Triage Decision Tree
## Triage Process
1. **Is it reachable?**
   - Dead code? → FP
   - Test code only? → Low priority
   - Production path? → Continue
2. **Is user input involved?**
   - Hardcoded values only? → FP
   - Internal-only data? → Reduce severity
   - User-controlled? → Continue
3. **Are there mitigations?**
   - Sanitization present? → Verify effectiveness
   - WAF protection? → Defense-in-depth
   - Authentication required? → Reduce severity
4. **What's the impact?**
   - RCE possible? → Critical
   - Data breach? → High
   - DoS only? → Medium
   - Information disclosure? → Context-dependent
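The decision tree can be sketched as a small function. The `Finding` fields and the `triage` helper are hypothetical, and several choices are assumptions rather than part of the tree:

- Each "reduce severity" step lowers the result by one level.
- Information disclosure defaults to medium.
- Sanitization and WAF checks are left to manual review, because the tree asks for verification there, not an automatic downgrade.

```python
from dataclasses import dataclass

LEVELS = ["low", "medium", "high", "critical"]


@dataclass
class Finding:
    reachable: bool              # False for dead code
    test_only: bool              # True if only test code reaches it
    input_source: str            # "hardcoded", "internal", or "user"
    impact: str                  # "rce", "data_breach", "dos", or "info_disclosure"
    auth_required: bool = False


def triage(f):
    """Walk the four questions and return 'false_positive' or a severity level."""
    if not f.reachable or f.input_source == "hardcoded":
        return "false_positive"
    if f.test_only:
        return "low"
    # Step 4: base severity from impact (info disclosure -> medium is an assumption).
    base = {"rce": "critical", "data_breach": "high", "dos": "medium"}.get(f.impact, "medium")
    level = LEVELS.index(base)
    # Steps 2 and 3: each reducing factor lowers severity by one level (assumption).
    if f.input_source == "internal":
        level -= 1
    if f.auth_required:
        level -= 1
    return LEVELS[max(level, 0)]


print(triage(Finding(True, False, "user", "rce")))  # critical
```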
Multi-Tool Orchestration
Parallel Scanning Script
#!/bin/bash
# sast_scan.sh - Orchestrate multiple SAST tools
PROJECT_DIR="${1:-.}"
OUTPUT_DIR="${2:-./sast-results}"
mkdir -p "$OUTPUT_DIR"
echo "[*] Starting SAST scan orchestration..."

# Run tools in parallel, each in its own background subshell
(
  echo "[*] Running Semgrep..."
  semgrep --config=auto "$PROJECT_DIR" --json -o "$OUTPUT_DIR/semgrep.json" 2>/dev/null
  echo "[+] Semgrep complete"
) &
(
  echo "[*] Running Bandit..."
  bandit -r "$PROJECT_DIR" -f json -o "$OUTPUT_DIR/bandit.json" 2>/dev/null
  echo "[+] Bandit complete"
) &
(
  echo "[*] Running gitleaks..."
  gitleaks detect --source="$PROJECT_DIR" --report-path="$OUTPUT_DIR/gitleaks.json" --report-format=json 2>/dev/null
  echo "[+] Gitleaks complete"
) &

# Wait for all tools
wait
echo "[+] All scans complete. Results in $OUTPUT_DIR"
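The shell script discards each tool's exit status and stderr, so a crashed or missing scanner looks the same as a clean run. A minimal sketch of the same fan-out in Python that keeps per-tool exit codes is shown below. `run_tool` and `run_all` are hypothetical helpers, and the demo commands are stand-ins so the sketch runs without any scanner installed.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_tool(name, argv):
    """Run one scanner; return (name, exit code), or (name, None) if the binary is missing."""
    try:
        proc = subprocess.run(argv, capture_output=True, text=True)
        return name, proc.returncode
    except FileNotFoundError:
        return name, None


def run_all(commands):
    """Run every command concurrently and collect exit codes by tool name."""
    with ThreadPoolExecutor(max_workers=len(commands) or 1) as pool:
        futures = [pool.submit(run_tool, name, argv) for name, argv in commands.items()]
        return dict(f.result() for f in futures)


# Stand-in commands; replace with the real scanner invocations from the script above.
demo = {
    "clean": [sys.executable, "-c", "raise SystemExit(0)"],
    "findings": [sys.executable, "-c", "raise SystemExit(1)"],
}
print(run_all(demo))
# {'clean': 0, 'findings': 1}
```

Threads are sufficient here because each worker only waits on a child process.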
Result Aggregation
#!/usr/bin/env python3
"""Aggregate SAST results from multiple tools."""
import json
from pathlib import Path


def normalize_cwe(value):
    """Reduce a CWE field (str, int, list, or None) to a hashable ID such as 'CWE-89'."""
    if isinstance(value, (list, tuple)):
        value = value[0] if value else None
    if value is None:
        return None
    text = str(value).split(':')[0].strip()
    return text if text.upper().startswith('CWE-') else f'CWE-{text}'


def load_semgrep(path):
    """Parse Semgrep JSON output."""
    findings = []
    with open(path) as f:
        data = json.load(f)
    for result in data.get('results', []):
        extra = result.get('extra', {})
        findings.append({
            'tool': 'semgrep',
            'rule': result.get('check_id'),
            'severity': extra.get('severity', 'unknown'),
            'file': result.get('path'),
            'line': result.get('start', {}).get('line'),
            'message': extra.get('message'),
            # Semgrep metadata may hold a string or a list of strings
            'cwe': normalize_cwe(extra.get('metadata', {}).get('cwe')),
        })
    return findings


def load_bandit(path):
    """Parse Bandit JSON output."""
    findings = []
    with open(path) as f:
        data = json.load(f)
    for result in data.get('results', []):
        findings.append({
            'tool': 'bandit',
            'rule': result.get('test_id'),
            'severity': result.get('issue_severity'),
            'file': result.get('filename'),
            'line': result.get('line_number'),
            'message': result.get('issue_text'),
            # Bandit reports the CWE as a numeric id
            'cwe': normalize_cwe((result.get('issue_cwe') or {}).get('id')),
        })
    return findings


def deduplicate(findings):
    """Deduplicate findings across tools by (file, line, CWE)."""
    seen = set()
    unique = []
    for f in findings:
        key = (f['file'], f['line'], f.get('cwe'))
        if key not in seen:
            seen.add(key)
            unique.append(f)
    return unique


def aggregate_results(results_dir):
    """Aggregate all SAST results."""
    findings = []
    semgrep_path = Path(results_dir) / 'semgrep.json'
    if semgrep_path.exists():
        findings.extend(load_semgrep(semgrep_path))
    bandit_path = Path(results_dir) / 'bandit.json'
    if bandit_path.exists():
        findings.extend(load_bandit(bandit_path))
    # Deduplicate and sort by severity (unknown or missing severities sort last)
    findings = deduplicate(findings)
    severity_order = {'ERROR': 0, 'HIGH': 0, 'WARNING': 1, 'MEDIUM': 1, 'INFO': 2, 'LOW': 2}
    findings.sort(key=lambda x: severity_order.get(str(x.get('severity') or '').upper(), 3))
    return findings
CI/CD Integration
GitHub Actions
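name: SAST Scanning
on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]
jobs:
  sast:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      security-events: write  # required to upload SARIF results
    steps:
      - uses: actions/checkout@v4
      # CodeQL must be initialized before analysis; languages are an input of init
      - name: Initialize CodeQL
        uses: github/codeql-action/init@v3
        with:
          languages: python, javascript
      # Note: this action is deprecated upstream in favor of running `semgrep ci` directly
      - name: Run Semgrep
        uses: returntocorp/semgrep-action@v1
        with:
          config: >-
            p/security-audit
            p/secrets
            p/owasp-top-ten
      - name: Run CodeQL
        uses: github/codeql-action/analyze@v3
      - name: Run Bandit
        run: |
          pip install "bandit[sarif]"
          bandit -r . -f sarif -o bandit.sarif || true
      - name: Upload SARIF results
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: bandit.sarif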
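The `|| true` on the Bandit step keeps the job green whatever the scan finds. If a blocking gate is wanted, one option is a separate step that evaluates the aggregated findings against a threshold. The sketch below assumes findings are dictionaries with a `severity` key, as produced by the aggregation script. `should_fail` is a hypothetical helper, and treating unknown severities as lowest rank is an assumption.

```python
# Severity labels grouped the same way as the aggregation script's severity_order.
SEVERITY_RANK = {"INFO": 0, "LOW": 0, "WARNING": 1, "MEDIUM": 1, "ERROR": 2, "HIGH": 2}


def should_fail(findings, threshold="ERROR"):
    """Return True if any finding's severity is at or above the threshold."""
    limit = SEVERITY_RANK[threshold.upper()]
    return any(
        # Unknown or missing severities are treated as the lowest rank (assumption).
        SEVERITY_RANK.get(str(f.get("severity", "")).upper(), 0) >= limit
        for f in findings
    )


findings = [{"severity": "WARNING"}, {"severity": "HIGH"}]
print(should_fail(findings))      # True
print(should_fail(findings[:1]))  # False
```

In a pipeline, the calling script would exit with a non-zero status when `should_fail` returns `True`.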
GitLab CI
sast:
  stage: test
  image: python:3.11
  before_script:
    - pip install semgrep "bandit[sarif]"
  script:
    - semgrep --config=auto . --sarif -o semgrep.sarif || true
    - bandit -r . -f sarif -o bandit.sarif || true
  artifacts:
    # reports:sast expects GitLab's own SAST JSON schema, not SARIF,
    # so the SARIF files are kept as plain job artifacts
    paths:
      - semgrep.sarif
      - bandit.sarif
    when: always

# Dedicated Semgrep job
semgrep:
  stage: test
  image: returntocorp/semgrep
  script:
    - semgrep ci
  variables:
    SEMGREP_RULES: "p/security-audit p/secrets"
Pre-commit Hooks
# .pre-commit-config.yaml
repos:
  - repo: https://github.com/returntocorp/semgrep
    rev: v1.52.0
    hooks:
      - id: semgrep
        args: ['--config', 'p/secrets', '--error']
  - repo: https://github.com/PyCQA/bandit
    rev: 1.7.7
    hooks:
      - id: bandit
        args: ['-ll', '-ii']
        exclude: tests/
  - repo: https://github.com/gitleaks/gitleaks
    rev: v8.18.1
    hooks:
      - id: gitleaks
Common Vulnerability Patterns
Injection Patterns
# Semgrep rules for common injections
rules:
  - id: sql-injection-python
    patterns:
      - pattern-either:
          - pattern: cursor.execute("..." + $VAR + "...")
          - pattern: cursor.execute(f"...{$VAR}...")
          - pattern: cursor.execute("...%s..." % $VAR)
          - pattern: cursor.execute("...{}...".format($VAR))
    message: Potential SQL injection
    languages: [python]
    severity: ERROR

  - id: command-injection-python
    patterns:
      - pattern-either:
          - pattern: os.system($CMD)
          - pattern: subprocess.call($CMD, shell=True, ...)
          - pattern: subprocess.run($CMD, shell=True, ...)
    message: Potential command injection
    languages: [python]
    severity: ERROR

  - id: xpath-injection
    patterns:
      - pattern: |
          $TREE.xpath("..." + $INPUT + "...")
    message: Potential XPath injection
    languages: [python]
    severity: ERROR
Authentication/Authorization Patterns
rules:
  - id: missing-auth-decorator
    patterns:
      - pattern: |
          @app.route(...)
          def $FUNC(...):
            ...
      # Flask requires the route decorator to be outermost, so auth decorators sit below it
      - pattern-not: |
          @app.route(...)
          @login_required
          def $FUNC(...):
            ...
      - pattern-not: |
          @app.route(...)
          @auth.required
          def $FUNC(...):
            ...
    paths:
      exclude:
        - "**/public/**"
        - "**/health/**"
    message: Route may be missing authentication
    languages: [python]
    severity: WARNING

  - id: jwt-weak-secret
    patterns:
      - pattern: jwt.encode(..., $SECRET, ...)
      - metavariable-regex:
          metavariable: $SECRET
          regex: '".{1,20}"'
    message: JWT secret appears to be weak
    languages: [python]
    severity: WARNING
Crypto Patterns
rules:
  - id: weak-hash-algorithm
    patterns:
      - pattern-either:
          - pattern: hashlib.md5(...)
          - pattern: hashlib.sha1(...)
    message: Weak hash algorithm - use SHA-256 or better
    languages: [python]
    severity: WARNING

  - id: weak-cipher
    patterns:
      - pattern-either:
          - pattern: DES.new(...)
          - pattern: ARC4.new(...)
          - pattern: Blowfish.new(...)
    message: Weak cipher algorithm
    languages: [python]
    severity: ERROR

  - id: hardcoded-iv
    patterns:
      - pattern: AES.new(..., iv=$IV, ...)
      - metavariable-regex:
          metavariable: $IV
          regex: 'b".*"'
    message: Hardcoded IV detected - use random IV
    languages: [python]
    severity: ERROR
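For the `weak-hash-algorithm` and `hardcoded-iv` rules, the remediation side can be sketched with the standard library alone. `fingerprint` and `new_iv` are hypothetical helper names. The cipher call itself is omitted because it depends on a third-party library.

```python
import hashlib
import secrets


def fingerprint(data: bytes) -> str:
    """SHA-256 digest for integrity checks (not suitable for password storage)."""
    return hashlib.sha256(data).hexdigest()


def new_iv(block_size: int = 16) -> bytes:
    """Fresh random IV per encryption, instead of a hardcoded bytes literal."""
    return secrets.token_bytes(block_size)


print(len(fingerprint(b"hello")))  # 64 hex characters
print(len(new_iv()))               # 16
```

A new IV should be generated for every message and stored or sent alongside the ciphertext. It does not need to be secret, only unpredictable and never reused with the same key.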
Reporting Template
# SAST Scan Report
## Executive Summary
- Scan Date: YYYY-MM-DD
- Repository: [name]
- Commit: [hash]
- Tools Used: Semgrep, CodeQL, Bandit
- Total Findings: X (Critical: Y, High: Z)
## Critical Findings
### [CRITICAL] SQL Injection in user_service.py
- **Location**: src/services/user_service.py:42
- **Tool**: Semgrep (sql-injection-format-string)
- **CWE**: CWE-89
- **Code**:
```python
query = f"SELECT * FROM users WHERE id = {user_id}"
cursor.execute(query)
```
- **Remediation**: Use parameterized queries, e.g. `cursor.execute("SELECT * FROM users WHERE id = ?", (user_id,))`
## Finding Summary by Category
| Category | Critical | High | Medium | Low |
|---|---|---|---|---|
| Injection | 2 | 3 | 1 | 0 |
| Authentication | 0 | 2 | 4 | 1 |
| Cryptography | 1 | 1 | 2 | 0 |
| Secrets | 0 | 5 | 0 | 0 |
## Tool Coverage
| Tool | Findings | FP Rate | Coverage |
|---|---|---|---|
| Semgrep | 45 | 12% | All languages |
| Bandit | 23 | 18% | Python only |
| CodeQL | 12 | 5% | Python, JS |
## Recommendations
- [P0] Fix all SQL injection vulnerabilities immediately
- [P1] Rotate exposed secrets and implement secret scanning
- [P2] Upgrade weak cryptographic algorithms
- [P3] Add authentication to unprotected endpoints
---
## Bundled Resources
### scripts/
- `sast_scan.sh` - Multi-tool orchestration script
- `aggregate_results.py` - Result aggregation and deduplication
- `sarif_to_csv.py` - SARIF to CSV converter
### references/
- `semgrep_rules.md` - Custom Semgrep rule reference
- `cwe_mapping.md` - CWE to tool rule mapping
- `false_positive_patterns.md` - Known FP patterns by tool
### checklists/
- `triage_checklist.md` - Finding triage checklist
- `ci_integration_checklist.md` - CI/CD setup checklist
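The bundled `sarif_to_csv.py` is not reproduced here. As an illustration of what such a converter involves, the sketch below flattens SARIF 2.1.0 results into CSV rows. It is not the bundled script: the column set and the helper names `sarif_to_rows` and `rows_to_csv` are assumptions, and only the first location of each result is used.

```python
import csv
import io

FIELDS = ["tool", "rule", "level", "file", "line", "message"]


def sarif_to_rows(sarif):
    """Flatten a parsed SARIF document into one dict per result."""
    rows = []
    for run in sarif.get("runs", []):
        tool = run.get("tool", {}).get("driver", {}).get("name", "")
        for result in run.get("results", []):
            locations = result.get("locations") or [{}]
            phys = locations[0].get("physicalLocation", {})
            rows.append({
                "tool": tool,
                "rule": result.get("ruleId", ""),
                # SARIF treats a missing level as "warning"
                "level": result.get("level", "warning"),
                "file": phys.get("artifactLocation", {}).get("uri", ""),
                "line": phys.get("region", {}).get("startLine", ""),
                "message": result.get("message", {}).get("text", ""),
            })
    return rows


def rows_to_csv(rows):
    """Render rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


example = {"runs": [{
    "tool": {"driver": {"name": "Bandit"}},
    "results": [{
        "ruleId": "B602",
        "level": "error",
        "message": {"text": "subprocess call with shell=True"},
        "locations": [{"physicalLocation": {
            "artifactLocation": {"uri": "src/app.py"},
            "region": {"startLine": 42},
        }}],
    }],
}]}
print(rows_to_csv(sarif_to_rows(example)))
```

To convert a file, load it with `json.load` and pass the resulting dictionary to `sarif_to_rows`.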