I often find myself wondering if all this is really that much better than the ol...

vimda · on Sept 28, 2021

What we do internally is use syslog-ng (https://github.com/syslog-ng/syslog-ng) to read the journald socket and push to a remote and into Kafka. I think journald works well as a structured logging tool, but it's certainly deficient in other ways.

reacharavindh · on Sept 28, 2021

I have looked into ElasticSearch + Kibana as a solution to aggregate logs. There may be plenty of choices to replace ElasticSearch (ClickHouse, even Postgres, heck even journald), but a nice UI where you can simply search the logs for that random piece of text you need is the missing piece.

Until now, I have not seen a web interface to logs as powerful as Kibana that can work with anything other than ElasticSearch.

This is why I chose to stop my search and pay for Datadog to do this correctly, and simply allow me to search for that keyword in the logs when I need it the most (and not worry about whether I indexed stuff correctly, or balanced some whatever in ElasticSearch, or remembered to set up something far too technical for a log system). Datadog allows you to keep a short period's worth of data in the index and "expire" old content into archives, while retaining the ability to add it back to the index if needed for an investigation.

pas · on Sept 28, 2021

journald is not good at handling a lot of data, nor is it good at managing imported data. (It could be improved, probably "easily", but its main feature is that it's an "always on", not terribly dumb log target; it's not a long-term log management system.)

reacharavindh · on Sept 28, 2021

Hmm. Never tried to use journald at any reasonable scale beyond tens of servers. Good to know its characteristics.

To be honest, I wasn’t looking for a long-term log management system, and that is why journald even came to mind. If it could aggregate logs from several servers and retain them for a week while expiring older logs to an archive, that would be sufficient for my needs.

pas · on Sept 30, 2021

Exactly why I wrote my comment. :) Because it seems it's able to do that, but not really. And it seems easy to fix, but of course patches are welcome. (Hopefully.)

https://github.com/systemd/systemd/issues/5242

Sure, it's not terribly hard to work around it with a cron (or systemd timer) script, but why go uphill when there are better tools?
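For what it's worth, a rough sketch of what such a workaround script might look like, in Python. This is only an illustration under assumptions: the function names and the directory and archive paths are made up for the example, and the overall approach (export everything older than a week in journal export format, then vacuum) is just one way to do it, not something taken from the linked issue. The journalctl options used (`--directory`, `--until`, `--output=export`, `--vacuum-time`) are real, but check how they behave on your systemd version before trusting this.

```python
import subprocess


def build_commands(journal_dir, keep_days=7):
    """Return (export_cmd, vacuum_cmd) as argv lists for journalctl.

    journal_dir is a hypothetical directory holding the aggregated
    journals; keep_days is the retention window in days.
    """
    export_cmd = [
        "journalctl",
        f"--directory={journal_dir}",
        f"--until=-{keep_days}d",   # only entries older than the window
        "--output=export",          # journal export format
    ]
    vacuum_cmd = [
        "journalctl",
        f"--directory={journal_dir}",
        f"--vacuum-time={keep_days}d",  # only removes archived journal files
    ]
    return export_cmd, vacuum_cmd


def archive_and_vacuum(journal_dir, archive_file, keep_days=7):
    """Write old entries to archive_file, then let journalctl vacuum."""
    export_cmd, vacuum_cmd = build_commands(journal_dir, keep_days)
    with open(archive_file, "wb") as out:
        subprocess.run(export_cmd, stdout=out, check=True)
    subprocess.run(vacuum_cmd, check=True)
```

You would call something like `archive_and_vacuum("/var/log/journal/remote", "/srv/log-archive/old.export")` from cron or a timer unit (both paths are placeholders). Note that since vacuuming only touches archived files, consecutive runs can export overlapping entries, so the archive side would need deduplication or a cursor. Which is sort of the point: it works, but it's uphill.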

sofixa · on Sept 28, 2021

Grafana?

reacharavindh · on Sept 28, 2021

Haven’t played with the logs part of Grafana recently, but would it work on top of, say, ClickHouse? I thought it was more tuned for the Loki use case… is it not?

nullify88 · on Sept 28, 2021

I'd agree, it isn't good for exploratory queries. But if you have some predefined ES queries for correlating log messages to metrics, it can be useful to have it all in one dashboard.