• @Hazzia@discuss.tchncs.de
    hexbear
    20
    1 month ago

    I've only ever grepped log files. Nine years into my career now, so I'm not sure which side of the spectrum I'm on (I'm definitely on the spectrum).

  • @Machindo@lemmy.ml
    hexbear
    14
    1 month ago

    I'm running Grafana Loki for my company now and I'll never go back to anything else. Loki acts like grep, is blazing fast, and is low maintenance. If it sounds like magic, it kind of is.


    I saw this post and genuinely thought one of my teammates wrote it.

    I had to manage an ELK stack and it was a full-time job when we were supposed to be focusing on other important SRE work.

    Then we switched to Loki + Grafana and it's been amazing. Loki is literally k8s-wide grep by default, but it also has an amazing query language for filtering and transforming logs into tables, or even running Prometheus-style queries on top of a log query, which gives you a graph.
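To make the "grep plus Prometheus-style queries" point concrete, here is a minimal sketch that only builds LogQL query strings and a URL for Loki's `query_range` HTTP endpoint; it does not send anything. The host name, the `namespace` and `pod` label names, and the helper function names are all assumptions made for illustration, so swap in whatever your cluster actually uses.

```python
from urllib.parse import urlencode

# Hypothetical Loki address, used only for illustration.
LOKI_URL = "http://loki.example.internal:3100"


def log_filter_query(namespace: str, needle: str) -> str:
    """Grep-like LogQL: pick streams by label, keep lines containing `needle`.

    Assumes `needle` contains no double quotes (no escaping is done here).
    """
    return f'{{namespace="{namespace}"}} |= "{needle}"'


def count_query(namespace: str, needle: str, window: str = "5m") -> str:
    """Prometheus-style metric query layered on top of the log query above."""
    inner = log_filter_query(namespace, needle)
    return f"sum by (pod) (count_over_time({inner} [{window}]))"


def query_range_url(query: str, start_ns: int, end_ns: int, limit: int = 100) -> str:
    """Build (but do not send) a request URL for /loki/api/v1/query_range."""
    params = urlencode(
        {"query": query, "start": start_ns, "end": end_ns, "limit": limit}
    )
    return f"{LOKI_URL}/loki/api/v1/query_range?{params}"
```

The first query is the "k8s-wide grep" case; the second wraps the same filter in `count_over_time`, which is the kind of query that gives you a graph in Grafana.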

    Managing Loki is super simple because it makes the trade-off of not indexing anything other than the Kubernetes labels, which are always going to be the same regardless of the app. And retention is a breeze, since all the data is stored in a bucket and not on the cluster.
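As a rough mental model of that trade-off (not Loki's actual implementation), here is a toy in-memory store that indexes only the label set of each stream and brute-force scans the raw lines at query time. The class and method names are made up for this sketch.

```python
from collections import defaultdict


class ToyLogStore:
    """Toy model: labels are indexed, log line contents are not."""

    def __init__(self):
        # Key: frozenset of (label, value) pairs. Value: raw lines of that stream.
        self._streams = defaultdict(list)

    def push(self, labels: dict, line: str) -> None:
        """Append a line to the stream identified by its label set."""
        self._streams[frozenset(labels.items())].append(line)

    def query(self, selector: dict, contains: str) -> list:
        """Select streams by label, then scan their lines like grep would."""
        wanted = set(selector.items())
        hits = []
        for stream_labels, lines in self._streams.items():
            if wanted <= stream_labels:  # cheap: label match only
                # Expensive part, done only for the selected streams.
                hits.extend(line for line in lines if contains in line)
        return hits
```

The index stays small because it grows with the number of label combinations, not with the volume of log text, which is the property the comment above is describing.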

    Sorry for gushing about Loki but I genuinely was that rage wojak before we switched. I am so much happier now.

    • Jo Miran@lemmy.ml
      hexbear
      4
      1 month ago

      We do Grafana + Prometheus for most of our clients, but I think that adding Loki into the mix might be necessary. The number of clients that miss basic events like "you've run out of disk space... two days ago" is too damn high.

      • @Machindo@lemmy.ml
        hexbear
        2
        23 days ago

        I would add Alertmanager to your stack if you haven't already. It's pretty tightly integrated with Prometheus. There are some canned alerting rules based on predicting that disk space will be full in X number of days. We wire Alertmanager to PagerDuty.
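Rules like that are typically built on PromQL's `predict_linear`, which fits a straight line to recent samples and extrapolates it forward. Below is a minimal sketch of the same idea in plain Python, assuming samples are `(unix_seconds, free_bytes)` pairs; the function names are made up for this example and this is not the Prometheus implementation.

```python
def linear_fit(samples):
    """Least-squares slope and intercept for a list of (t, value) samples."""
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("need samples at two or more distinct timestamps")
    cov = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    slope = cov / var_t
    return slope, mean_v - slope * mean_t


def predict(samples, at_time):
    """Extrapolate the fitted line to `at_time`."""
    slope, intercept = linear_fit(samples)
    return slope * at_time + intercept


def will_fill_within(samples, now, horizon_seconds):
    """True if the trend says free space drops below zero within the horizon."""
    return predict(samples, now + horizon_seconds) < 0
```

For example, with three daily samples that lose 10 GB per day starting from 100 GB, `will_fill_within` is false for a 4-day horizon and true for a 9-day one.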

  • @RoadieRich@midwest.social
    hexbear
    7
    1 month ago

    As someone who used to troubleshoot an extremely complex system for my day job, I can say I've worked my way across the entire bell curve.

  • Tabitha ☢️[she/her]
    hexbear
    1
    edit-2
    1 month ago

    I needed to search for something in the AWS log thing the other day. I couldn't figure out how to search for text containing one common character outside a-z, A-Z, 0-9, - and _, and I also couldn't figure out how to negate simple words. I had to do the grep thing and it JustWorked™.
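For reference, the two things that were hard there (matching a literal special character and excluding a word) are what `grep -F` and `grep -v` do. A minimal Python sketch of the same behaviour, with a made-up `grep` helper and made-up sample lines:

```python
import re


def grep(lines, pattern, fixed=False, invert=False):
    """Tiny grep: fixed=True treats the pattern literally, invert=True negates."""
    if fixed:
        pattern = re.escape(pattern)  # special characters lose their regex meaning
    rx = re.compile(pattern)
    return [line for line in lines if bool(rx.search(line)) != invert]


lines = [
    "[error] db timeout",
    "[error] healthcheck failed",
    "info: error budget ok",
]

# Keep lines containing the literal text "[error]", then drop healthcheck noise.
hits = grep(grep(lines, "[error]", fixed=True), "healthcheck", invert=True)
```

Without `fixed=True`, `[error]` would be read as a character class and match all three lines, which is the usual way special characters bite in log search boxes.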