loki-logging
Install command
npx skills add https://github.com/bagelhole/devops-security-agent-skills --skill loki-logging
Skill Documentation
Grafana Loki
Aggregate and query logs with Grafana Loki, the Prometheus-inspired logging system.
When to Use This Skill
Use this skill when:
- Implementing cost-effective log aggregation
- Building logging for Kubernetes environments
- Integrating logs with Grafana dashboards
- Querying logs with label-based filtering
- Preferring a lighter-weight alternative to the ELK stack
Prerequisites
- Docker or Kubernetes
- Grafana for visualization
- Promtail (or another log shipper)
Architecture Overview
┌─────────────┐      ┌──────────┐      ┌──────────┐
│ Application │─────▶│ Promtail │─────▶│   Loki   │
└─────────────┘      └──────────┘      └──────────┘
                                             │
                                             ▼
                                       ┌──────────┐
                                       │ Grafana  │
                                       └──────────┘
Docker Deployment
# docker-compose.yml
version: '3.8'

services:
  loki:
    image: grafana/loki:2.9.0
    ports:
      - "3100:3100"
    volumes:
      - ./loki-config.yaml:/etc/loki/local-config.yaml
      - loki-data:/loki
    command: -config.file=/etc/loki/local-config.yaml

  promtail:
    image: grafana/promtail:2.9.0
    volumes:
      - ./promtail-config.yaml:/etc/promtail/config.yaml
      - /var/log:/var/log:ro
      - /var/lib/docker/containers:/var/lib/docker/containers:ro
      - /var/run/docker.sock:/var/run/docker.sock  # required by the docker_sd_configs job in promtail-config.yaml
    command: -config.file=/etc/promtail/config.yaml

  grafana:
    image: grafana/grafana:10.2.0
    ports:
      - "3000:3000"
    volumes:
      - grafana-data:/var/lib/grafana
      - ./grafana/provisioning:/etc/grafana/provisioning
    environment:
      - GF_AUTH_ANONYMOUS_ENABLED=true
      - GF_AUTH_ANONYMOUS_ORG_ROLE=Admin

volumes:
  loki-data:
  grafana-data:
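Before wiring up Grafana dashboards, it helps to confirm Loki is ready. A minimal sketch of a Compose healthcheck against Loki's /ready endpoint is shown below; it assumes the grafana/loki image ships a BusyBox-style wget, so adjust the test command if your image differs.
# Hedged sketch: add under the loki service in docker-compose.yml
    healthcheck:
      test: ["CMD-SHELL", "wget -q --spider http://localhost:3100/ready || exit 1"]
      interval: 10s
      timeout: 5s
      retries: 5
# Other services can then wait on it:
#   depends_on:
#     loki:
#       condition: service_healthy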
Loki Configuration
# loki-config.yaml
auth_enabled: false

server:
  http_listen_port: 3100

common:
  path_prefix: /loki
  storage:
    filesystem:
      chunks_directory: /loki/chunks
      rules_directory: /loki/rules
  replication_factor: 1
  ring:
    kvstore:
      store: inmemory

schema_config:
  configs:
    - from: 2020-10-24
      store: boltdb-shipper
      object_store: filesystem
      schema: v11
      index:
        prefix: index_
        period: 24h

storage_config:
  boltdb_shipper:
    active_index_directory: /loki/index
    cache_location: /loki/cache
    shared_store: filesystem

limits_config:
  reject_old_samples: true
  reject_old_samples_max_age: 168h
  max_query_series: 5000
  max_query_parallelism: 2

chunk_store_config:
  max_look_back_period: 168h

table_manager:
  retention_deletes_enabled: true
  retention_period: 168h
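With boltdb-shipper, recent Loki releases enforce retention through the compactor rather than the table manager, so the table_manager block above may have no effect depending on version. A minimal sketch of the compactor alternative, assuming the single-binary filesystem setup shown here (paths and intervals are illustrative):
# Hedged sketch: compactor-based retention for boltdb-shipper (Loki 2.x)
compactor:
  working_directory: /loki/compactor   # illustrative path
  shared_store: filesystem
  retention_enabled: true
  retention_delete_delay: 2h
  compaction_interval: 10m
# With compactor retention, the retention window itself is set in limits_config:
# limits_config:
#   retention_period: 168h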
Promtail Configuration
# promtail-config.yaml
server:
  http_listen_port: 9080
  grpc_listen_port: 0

positions:
  filename: /tmp/positions.yaml

clients:
  - url: http://loki:3100/loki/api/v1/push

scrape_configs:
  # System logs
  - job_name: system
    static_configs:
      - targets:
          - localhost
        labels:
          job: varlogs
          __path__: /var/log/*.log

  # Docker container logs
  - job_name: docker
    docker_sd_configs:
      - host: unix:///var/run/docker.sock
        refresh_interval: 5s
    relabel_configs:
      - source_labels: ['__meta_docker_container_name']
        regex: '/(.*)'
        target_label: 'container'
      - source_labels: ['__meta_docker_container_log_stream']
        target_label: 'stream'

  # Application logs with parsing
  - job_name: application
    static_configs:
      - targets:
          - localhost
        labels:
          job: application
          __path__: /var/log/app/*.log
    pipeline_stages:
      - json:
          expressions:
            level: level
            message: message
            timestamp: timestamp
      - labels:
          level:
      - timestamp:
          source: timestamp
          format: RFC3339
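On systemd hosts, Promtail can also read the journal directly; whether this works depends on how Promtail was built (the static release binaries lack journal support). A minimal sketch to append under scrape_configs, assuming persistent journald storage under /var/log/journal:
  # Hedged sketch: systemd journal scraping
  - job_name: journal
    journal:
      max_age: 12h
      path: /var/log/journal          # assumes persistent journal storage
      labels:
        job: systemd-journal
    relabel_configs:
      - source_labels: ['__journal__systemd_unit']
        target_label: 'unit'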
Kubernetes Deployment
# Using Helm
helm repo add grafana https://grafana.github.io/helm-charts
helm install loki grafana/loki-stack \
--namespace monitoring \
--create-namespace \
--set grafana.enabled=true \
--set promtail.enabled=true
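Instead of repeating --set flags, chart defaults can be overridden with a values file. A minimal sketch for the loki-stack chart is below; key names vary between chart versions, so verify them with helm show values grafana/loki-stack before relying on this.
# values.yaml (hedged sketch for the loki-stack chart)
loki:
  persistence:
    enabled: true        # keep chunks and index across pod restarts
    size: 10Gi
promtail:
  enabled: true
grafana:
  enabled: true
# helm install loki grafana/loki-stack -n monitoring --create-namespace -f values.yaml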
Promtail DaemonSet
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: promtail
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: promtail
  template:
    metadata:
      labels:
        app: promtail
    spec:
      containers:
        - name: promtail
          image: grafana/promtail:2.9.0
          args:
            - -config.file=/etc/promtail/promtail.yaml
          volumeMounts:
            - name: config
              mountPath: /etc/promtail
            - name: varlog
              mountPath: /var/log
            - name: varlibdockercontainers
              mountPath: /var/lib/docker/containers
              readOnly: true
      volumes:
        - name: config
          configMap:
            name: promtail-config
        - name: varlog
          hostPath:
            path: /var/log
        - name: varlibdockercontainers
          hostPath:
            path: /var/lib/docker/containers
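The DaemonSet references a promtail-config ConfigMap that is not shown above. A minimal sketch is below, discovering pod logs through the Kubernetes API; the __path__ relabeling mirrors the upstream Promtail example, the Loki Service name is an assumption, and the ServiceAccount/RBAC needed to list pods is omitted.
# Hedged sketch of the promtail-config ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
  name: promtail-config
  namespace: monitoring
data:
  promtail.yaml: |
    server:
      http_listen_port: 9080
    positions:
      filename: /tmp/positions.yaml
    clients:
      - url: http://loki:3100/loki/api/v1/push   # assumes Loki is reachable via this Service name
    scrape_configs:
      - job_name: kubernetes-pods
        kubernetes_sd_configs:
          - role: pod
        relabel_configs:
          - source_labels: ['__meta_kubernetes_namespace']
            target_label: 'namespace'
          - source_labels: ['__meta_kubernetes_pod_name']
            target_label: 'pod'
          - source_labels: ['__meta_kubernetes_pod_container_name']
            target_label: 'container'
          - source_labels: ['__meta_kubernetes_pod_uid', '__meta_kubernetes_pod_container_name']
            separator: '/'
            target_label: '__path__'
            replacement: '/var/log/pods/*$1/*.log'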
LogQL Queries
Basic Queries
# All logs from a job
{job="application"}
# Filter by label
{job="application", level="error"}
# Multiple labels
{namespace="production", container="api"}
# Regex match
{job=~"app.*"}
Log Pipeline
# Filter by content
{job="application"} |= "error"
# Exclude content
{job="application"} != "debug"
# Regex filter
{job="application"} |~ "user_id=[0-9]+"
# JSON parsing
{job="application"} | json | level="error"
# Line format
{job="application"} | json | line_format "{{.level}}: {{.message}}"
Metric Queries
# Count logs per second
count_over_time({job="application"}[5m])
# Rate of errors
rate({job="application", level="error"}[5m])
# Sum by label
sum by (level) (count_over_time({job="application"}[5m]))
# Top services by error count
topk(5, sum by (service) (count_over_time({level="error"}[1h])))
Aggregations
# Average of an unwrapped numeric value (line_length must exist as an extracted label)
avg_over_time({job="application"} | unwrap line_length [5m])
# Percentile of numeric field
quantile_over_time(0.95, {job="application"} | json | unwrap response_time [5m])
# Error percentage
sum(rate({job="application", level="error"}[5m]))
/
sum(rate({job="application"}[5m])) * 100
Pipeline Stages
# promtail-config.yaml
pipeline_stages:
  # Parse JSON logs
  - json:
      expressions:
        level: level
        message: msg
        trace_id: trace_id

  # Extract with regex
  - regex:
      expression: 'user_id=(?P<user_id>\d+)'

  # Add labels from parsed fields
  - labels:
      level:
      user_id:

  # Modify timestamp
  - timestamp:
      source: timestamp
      format: '2006-01-02T15:04:05.000Z'

  # Filter logs
  - match:
      selector: '{level="debug"}'
      action: drop

  # Add static labels
  - static_labels:
      environment: production

  # Modify log line
  - template:
      source: message
      template: '{{ ToUpper .Value }}'
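Stack traces and other multi-line events can be folded into a single entry before the other stages run by putting a multiline stage first in the pipeline. A minimal sketch, assuming each new event starts with an ISO date:
# Hedged sketch: merge multi-line events (must be the first stage)
pipeline_stages:
  - multiline:
      firstline: '^\d{4}-\d{2}-\d{2}'   # assumed event prefix
      max_wait_time: 3s                 # flush incomplete events after this delay
      max_lines: 128
  - json:
      expressions:
        level: level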
Grafana Integration
Data Source Configuration
# grafana/provisioning/datasources/loki.yaml
apiVersion: 1

datasources:
  - name: Loki
    type: loki
    access: proxy
    url: http://loki:3100
    isDefault: false
    jsonData:
      maxLines: 1000
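If log lines carry trace IDs, the data source can turn them into clickable links via derived fields. A hedged sketch extending the provisioning file above; the trace_id= log format and the tempo data source UID are assumptions for illustration:
# grafana/provisioning/datasources/loki.yaml (extended, hedged sketch)
apiVersion: 1
datasources:
  - name: Loki
    type: loki
    access: proxy
    url: http://loki:3100
    jsonData:
      maxLines: 1000
      derivedFields:
        - name: TraceID
          matcherRegex: 'trace_id=(\w+)'   # assumed log format
          url: '$${__value.raw}'           # $$ escapes the literal $ in provisioning files
          datasourceUid: tempo             # assumed tracing data source UID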
Dashboard Panel
{
  "title": "Application Logs",
  "type": "logs",
  "datasource": "Loki",
  "targets": [
    {
      "expr": "{job=\"application\"} | json",
      "refId": "A"
    }
  ],
  "options": {
    "showTime": true,
    "showLabels": true,
    "wrapLogMessage": true
  }
}
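Dashboards containing panels like this can be provisioned automatically by pointing a file provider at a directory of JSON files. A minimal sketch, assuming the ./grafana/provisioning bind mount from the Compose file:
# grafana/provisioning/dashboards/default.yaml (hedged sketch)
apiVersion: 1
providers:
  - name: default
    folder: Logging
    type: file
    options:
      path: /etc/grafana/provisioning/dashboards   # JSON dashboards placed here are loaded on startup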
Recording Rules
# loki-rules.yaml
groups:
  - name: error_rates
    interval: 1m
    rules:
      - record: job:log_errors:rate5m
        expr: |
          sum by (job) (rate({level="error"}[5m]))
Alerting
# loki-alerts.yaml
groups:
  - name: log_alerts
    rules:
      - alert: HighErrorRate
        expr: |
          sum(rate({level="error"}[5m])) > 10
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "High error rate in logs"
          description: "Error rate is {{ $value }} errors/second"
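These rule groups are only evaluated if Loki's ruler is configured to load them. A minimal sketch for the single-binary setup above; the Alertmanager URL is an assumption, and with auth disabled rule files belong under /loki/rules/fake/:
# Hedged sketch: ruler block for loki-config.yaml
ruler:
  storage:
    type: local
    local:
      directory: /loki/rules        # per-tenant subdirectories; tenant is "fake" when auth is disabled
  rule_path: /tmp/loki-rules        # scratch space used by the ruler
  alertmanager_url: http://alertmanager:9093   # assumed Alertmanager address
  enable_api: true
  evaluation_interval: 1m
# Recording rules additionally need ruler remote_write configured to ship samples
# to a Prometheus-compatible endpoint (not shown here).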
Common Issues
Issue: High Memory Usage
Problem: Loki is consuming too much memory.
Solution: Reduce max_query_series and limit the query time range.
Issue: Logs Not Appearing
Problem: Promtail is not shipping logs.
Solution: Check the positions file, verify file paths, and check the label configuration.
Issue: Query Timeout
Problem: LogQL queries are timing out.
Solution: Add more specific label filters and reduce the time range.
Issue: Ingestion Rate Limit
Problem: Logs are being dropped.
Solution: Increase per_stream_rate_limit in limits_config, as in the sketch below.
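A hedged sketch of the relevant limits_config knobs; the values are illustrative starting points, not recommendations:
# Hedged sketch: raise ingestion limits in loki-config.yaml (illustrative values)
limits_config:
  ingestion_rate_mb: 8              # per-tenant ingestion rate
  ingestion_burst_size_mb: 16
  per_stream_rate_limit: 5MB        # per-stream rate limit
  per_stream_rate_limit_burst: 20MB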
Best Practices
- Use meaningful labels (avoid high cardinality)
- Filter by labels before log content
- Parse logs at collection time with Promtail
- Set appropriate retention periods
- Use recording rules for common queries
- Implement proper multitenancy for large deployments
- Monitor Loki’s own metrics
- Use chunk and query-result caching for better performance (see the sketch below)
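A minimal sketch of in-process (embedded) caching for query results and chunks, assuming the single-binary setup from this document; larger deployments usually point these caches at memcached instead. Merge the chunk_store_config keys into the existing block rather than duplicating it.
# Hedged sketch: embedded caches in loki-config.yaml (Loki 2.x)
query_range:
  cache_results: true
  results_cache:
    cache:
      embedded_cache:
        enabled: true
        max_size_mb: 256
chunk_store_config:
  chunk_cache_config:
    embedded_cache:
      enabled: true
      max_size_mb: 256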
Related Skills
- prometheus-grafana – Metrics monitoring
- elk-stack – Alternative logging
- alerting-oncall – Alert management