If you want to get data into Grafana Loki, then Promtail is probably the easiest way to do so. In my case, I wanted to forward Caddy’s access logs to Loki. There are some fields, though, that I don’t want to send over to Loki, like a user’s IP or port number.
There are two ways I could do that: either configure Caddy to not log these fields in the first place, or drop them in Promtail before anything is submitted to Loki. Since I wanted to play around with Promtail’s pipeline feature anyway, I chose the latter option for this little experiment.
Promtail has a couple of flags that make experimenting with configuration changes quite convenient. The primary ones here are -stdin and -dry-run. Let’s say I have a little configuration file located in ./promtail.yaml and want to test some dummy data located inside test.log against that. Then I can run the following command to pipe that data through Promtail without updating any state files or submitting data to Loki but instead logging it to the terminal:
cat test.log | promtail -stdin -dry-run -config.file promtail.yaml
This will give me the transformed log statements like this:
Clients configured:
----------------------
url: http://localhost:30000
batchwait: 1s
batchsize: 1048576
follow_redirects: false
enable_http2: false
backoff_config:
  min_period: 500ms
  max_period: 5m0s
  max_retries: 10
timeout: 10s
tenant_id: ""
stream_lag_labels: ""
2023-01-25T18:32:28.3600988+0100 {level="info"} {"level":"info","request_host":"zerokspot.com","request_method":"GET","request_path":"/tags/licensing/","response_status":"200"}
2023-01-25T18:32:34.4030662+0100 {level="info"} {"level":"info","request_host":"zerokspot.com","request_method":"GET","request_path":"/weblog/2005/03/09/links-on-delicious/","response_status":"200"}
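For anyone who wants to try this without real traffic, a test.log could be generated roughly like the sketch below. The field names (ts, status, request.remote_ip and so on) are assumptions based on Caddy's JSON access log format and on the expressions used in the pipeline further down. The IP address, port and timestamps are made up, and write_test_log is just a hypothetical helper name.

```python
import json


def make_entry(ts, uri):
    """Build one hypothetical Caddy-style access log entry."""
    return {
        "level": "info",
        "ts": ts,  # Unix timestamp with fractional seconds
        "status": 200,
        "request": {
            "remote_ip": "203.0.113.7",  # made-up address (documentation range)
            "remote_port": "51234",      # made-up port
            "method": "GET",
            "host": "zerokspot.com",
            "uri": uri,
        },
    }


def write_test_log(path="test.log"):
    """Write two sample entries, one JSON document per line."""
    entries = [
        make_entry(1674667948.3600988, "/tags/licensing/"),
        make_entry(1674667954.4030662, "/weblog/2005/03/09/links-on-delicious/"),
    ]
    with open(path, "w", encoding="utf-8") as fh:
        for entry in entries:
            fh.write(json.dumps(entry) + "\n")
    return len(entries)


if __name__ == "__main__":
    write_test_log()
```

The resulting file can then be piped through the promtail command shown above.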
If I then want to see more of what’s actually happening in each stage of the pipeline, I can also pass the -inspect flag which will give me something like this:
[inspect: timestamp stage]:
{stages.Entry}.Entry.Entry.Timestamp:
	-: 2023-01-25 19:30:34.767041 +0100 CET
	+: 2023-01-25 18:32:28.3600988 +0100 CET
2023-01-25T18:32:28.3600988+0100 {level="info"} {"level":"info","request_host":"zerokspot.com","request_method":"GET","request_path":"/tags/licensing/","response_status":"200"}
That’s pretty much all I needed to iterate on my configuration until only the data I wanted was submitted to Loki 😊 And just for completeness’ sake, this is the pipeline I’m currently experimenting with:
# Dummy client for local testing
clients:
  - url: http://localhost:30000
scrape_configs:
  - job_name: access_log
    pipeline_stages:
      # Extract all the fields I care about from the
      # message:
      - json:
          expressions:
            "level": "level"
            "timestamp": "ts"
            "response_status": "status"
            "request_path": "request.uri"
            "request_method": "request.method"
            "request_host": "request.host"
            "request_useragent": "request.headers.\"User-Agent\""
      # Promote the level into an actual label:
      - labels:
          level:
      # Regenerate the message as all the fields listed
      # above:
      - template:
          # This is a field that doesn't exist yet, so it will be created
          source: "output"
          template: |
            {{toJson (unset (unset (unset . "Entry") "timestamp") "filename")}}
      - output:
          source: output
      # Set the timestamp of the log entry to what's in the
      # timestamp field.
      - timestamp:
          source: "timestamp"
          format: "Unix"
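To make the effect of the pipeline a bit more tangible, here is a rough Python sketch of what the stages above do to a single line. It is not how Promtail is implemented, only an approximation of the json, labels, template/output and timestamp steps. The sample line is hypothetical: its field names are assumptions based on Caddy's JSON access log format, and the IP, port and timestamp are made up. The User-Agent expression is left out because the simple dotted-path lookup used here doesn't handle quoted keys, and values are turned into strings only to mirror the sample output shown earlier.

```python
import json
from datetime import datetime, timezone

# Mirrors the `expressions` map of the json stage: new name -> dotted path.
# (The User-Agent expression is omitted in this sketch.)
EXPRESSIONS = {
    "level": "level",
    "timestamp": "ts",
    "response_status": "status",
    "request_path": "request.uri",
    "request_method": "request.method",
    "request_host": "request.host",
}

# Hypothetical Caddy-style access log line (field names are assumptions).
SAMPLE_LINE = json.dumps({
    "level": "info",
    "ts": 1674667948.3600988,
    "status": 200,
    "request": {
        "remote_ip": "203.0.113.7",  # made-up address
        "remote_port": "51234",      # made-up port
        "method": "GET",
        "host": "zerokspot.com",
        "uri": "/tags/licensing/",
    },
})


def lookup(doc, dotted_path):
    """Follow a simple dotted path through nested dicts; None if missing."""
    current = doc
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def run_pipeline(line):
    """Return (timestamp, labels, new_line) for one raw log line."""
    doc = json.loads(line)

    # json stage: copy only the listed fields into the extracted map.
    extracted = {}
    for name, path in EXPRESSIONS.items():
        value = lookup(doc, path)
        if value is not None:
            extracted[name] = value

    # labels stage: promote the level to a label.
    labels = {"level": extracted["level"]}

    # timestamp stage: interpret the extracted value as Unix seconds (UTC here).
    ts = datetime.fromtimestamp(extracted["timestamp"], tz=timezone.utc)

    # template + output stages: rebuild the line from the extracted fields,
    # minus the timestamp, as compact JSON with sorted keys.
    fields = {k: str(v) for k, v in extracted.items() if k != "timestamp"}
    new_line = json.dumps(fields, sort_keys=True, separators=(",", ":"))
    return ts, labels, new_line


if __name__ == "__main__":
    ts, labels, new_line = run_pipeline(SAMPLE_LINE)
    print(ts.isoformat(), labels, new_line)
```

Since only the fields named in EXPRESSIONS are copied over, remote_ip and remote_port never make it into the rebuilt line, which is the whole point of the exercise.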