Hey folks!

Unfortunately, roughly 2 hours ago, lemm.ee went offline. The cause was our load balancer: it suddenly decided that all of our servers had become unhealthy, despite all health checks responding successfully when I requested them directly. In such cases, the load balancer stops serving all requests, effectively meaning that lemm.ee is unreachable for all users. I am still not sure what exactly caused the issue, but I will try to investigate more over the weekend.
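
For anyone curious how "every health check passes when requested directly" and "the load balancer considers every server unhealthy" can both be true, here is a toy sketch. It is not lemm.ee's actual load balancer or configuration: the `Backend` and `LoadBalancer` classes, the `probe` callback, and the `unhealthy_threshold` value are all made up for illustration. The point is only that the balancer acts on its own probe results, so if its probes fail for any reason, it ends up with nowhere to send traffic.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Backend:
    name: str
    healthy: bool = True
    consecutive_failures: int = 0


@dataclass
class LoadBalancer:
    backends: List[Backend]
    # Hypothetical probe: returns True if the balancer's own check succeeded.
    probe: Callable[[Backend], bool]
    # Assumed value: failed probes in a row before a backend is marked unhealthy.
    unhealthy_threshold: int = 3
    _next: int = 0

    def run_health_checks(self) -> None:
        """Update each backend's status from the balancer's point of view."""
        for backend in self.backends:
            if self.probe(backend):
                backend.consecutive_failures = 0
                backend.healthy = True
            else:
                backend.consecutive_failures += 1
                if backend.consecutive_failures >= self.unhealthy_threshold:
                    backend.healthy = False

    def route(self) -> Optional[str]:
        """Pick a healthy backend round-robin, or None if there is none."""
        healthy = [b for b in self.backends if b.healthy]
        if not healthy:
            # A real balancer would typically answer with an error here.
            return None
        chosen = healthy[self._next % len(healthy)]
        self._next += 1
        return chosen.name
```

With a probe that always fails (for example because the balancer itself cannot reach the servers), `route()` starts returning `None` after `unhealthy_threshold` rounds of checks, even though each server would still answer a direct request just fine.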

For now, we have partially recovered, and I am continuing to work on remaining issues. Hopefully we will be back to 100% very soon. Sorry for the inconvenience!

  • TLGA
    ↑ 24 · 3 months ago

    I was wondering what was going on: status.lemm.ee said the server was OK but that federation was broken. Thank you for fixing it.

    • sunaurus [OP] [M] [A]
      ↑ 21 · 3 months ago

      Sorry for the delay in updating the status page - I actually had gone out for lunch just a few minutes before the downtime started, so I didn’t even realize anything was up until I was back at my computer about 45 minutes later 💀

      • ToxicWaste
        ↑ 7 · 3 months ago

        No need to apologise. Still a better response time than some of the professionals I work with ;-)

  • don
    ↑ 21 · 3 months ago

    I survived the July 18th lemm.ee downtime, and all I got was this lousy comment.

  • Cyrus Draegur
    English · ↑ 11 · 3 months ago

    All is forgiven, thank you for running this lovely instance

  • p3e7
    ↑ 10 · 3 months ago

    Thanks for your great work and transparency!

  • fossphi
    English · ↑ 9 · 3 months ago

    Thanks for the quick fix! What did you have to do to get the load balancer working again?

    • sunaurus [OP] [M] [A]
      ↑ 16 · 3 months ago

      For now, I just redeployed all of our servers completely, but as I don’t know the actual root cause of the issue yet, I’m still investigating to figure out if anything more is needed.

  • ramble81
    ↑ 9 · 3 months ago

    Nginx? I had an nginx LB shit itself yesterday. Luckily it auto-recovered and I had HA, but it's just weird that it happened.

  • db0@lemmy.dbzer0.com
    ↑ 8 · 3 months ago

    Typically when this happens, the issue is on the LB itself. Maybe its own network had issues?

  • EABOD25
    English · ↑ 8 · 3 months ago

    Would it be in bad taste to blame Russia?

    • TurtlePower
      ↑ 1 · 3 months ago

      Yeah, but it could have been China, India, Iran, or maybe even North Korea. There are a lot of places that think disrupting the rest of the world will get them somewhere.

  • Clot
    English · ↑ 7 · 3 months ago

    Sometimes, downtimes are awesome. Get off your machine and spend time with your family, folks!

  • edric
    English · ↑ 5 · 3 months ago

    I thought the entire lemmy network was down because status.lemm.ee was saying our instance was fine and federation wasn’t working with every other instance. lol

  • JimmyBigSausage
    ↑ 4 · 3 months ago

    Thank goodness! Hopefully discovering these vulnerabilities and protecting against them will help keep Lemmy alive when the big dogs come in to sweep us away! (Worst fears)

  • LedgeDrop
    ↑ 4 · 3 months ago

    Seriously, your professionalism in handling the situation and in reporting it is fantastic.

    It’s totally above and beyond anything we should expect for a service powered by donations!

    Thank you!