Lemmy self-hosters. What is your image cleanup process?

idle@158436977.xyz · 1 year ago

Lemmy self-hosters. What is your image cleanup process?

HTTP_404_NotFound@lemmyonline.com · 1 year ago

Honestly, if I can get posts to stay synced up, that will be a good day for me…

Seriously, federation/sync issues are not fun.

j4k3@lemmy.world · 1 year ago

Name checks out

HTTP_404_NotFound@lemmyonline.com · 1 year ago

Hah!

Matt@netmonkey.tech · 1 year ago

I’ve had a lot of issues with lemmy.ml. I just unsubscribed from everything over there since zero comments were federating over to my instance.

cstine@lemmy.uncomfortable.business · 1 year ago

I noticed that they’ll show up eventually where “eventually” could be like, 10-12 hours.

I suspect that they’re just absolutely slammed to the point they can’t actually push the federated content out to subscribers because EVERYONE is subscribing.

Might be an architectural thing due to not having a sufficiently scalable job queue/worker thread infrastructure, or just like, not enough CPU cycles to do it.

Matt@netmonkey.tech · 1 year ago

It’s hard to say. I don’t know if the admins of Lemmy.ml have been public about their issues or not. I know that Lemmy.world hasn’t been having the same issues, at least from my perspective. Makes me think it’s less an architectural or design problem, but rather a lack of server resources like CPU, as you suggested.

StrayPizza@lemmy.world · 1 year ago

I read somewhere that Lemmy.ml has basically maxed out its VPS with its provider, so they’re stuck for the time being, whereas Lemmy.world actually just upgraded its server hardware. Hoping they’ll migrate to a beefier server soon.

Matt@netmonkey.tech · 1 year ago

Yup, I’ve read something similar. Hopefully they’re able to get things sorted out soon!

HTTP_404_NotFound@lemmyonline.com · 1 year ago

Beehaw has been my bigger problem child.

However, tonight it’s smooth as butter. Things are syncing, I’m getting alerts.

Could be due to some of the maintenance I did earlier too.

Matt@netmonkey.tech · 1 year ago

I’ve not personally noticed any federation issues with Beehaw on my instance. Glad to hear things are better tonight.

a253040@midwest.social · 1 year ago

IIRC, I’ve read comments elsewhere that pictrs caches for 6 months, but I can’t independently verify. I hope this gets a broader answer because I’m still on the fence about getting an instance set up for myself and some small communities.

rs5th@lemmy.scottlabs.io · 1 year ago

I believe the activity table in Postgres is retained for 6 months (although I’m purging mine daily) and the pict-rs cache is 168 hours (1 week).

a253040@midwest.social · 1 year ago

I knew I read something was kept for 6 months ;)

Glad to see that even here, the best way to get the right answer on the internet is to provide a wrong one.

idle@158436977.xyz · 1 year ago

Only 1 week? That should be fine. Thanks!

Jamie@jamie.moe · 1 year ago

I was starting to sweat a little because my instance, that only I use, already has 600MB of pictures after less than 24 hours. The server has more than enough space, but I still wouldn’t like it. A week is far more swallow-able.

cwagner@discuss.tchncs.de · edit-2 1 year ago

deleted by creator

Quindius@lemmy.world · 1 year ago

How do you purge daily? Also, does that delete any post history or anything in a similar vein?

rs5th@lemmy.scottlabs.io · 1 year ago

I’m running the following SQL, although I’m not actually sure it’s as necessary since 0.18.3. It doesn’t delete any post history or anything.

DELETE FROM activity WHERE published < NOW() - INTERVAL '1 day';
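For anyone who would rather script this than paste SQL by hand, here is a minimal sketch of the same age-based purge in Python. It is not Lemmy's own tooling: an in-memory SQLite table stands in for the Postgres `activity` table so the example runs anywhere, and `purge_old_activity` and `demo` are hypothetical names. Against a real instance you would swap in a Postgres connection and check the table and column names for your Lemmy version first.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def purge_old_activity(conn, max_age=timedelta(days=1), now=None):
    """Delete activity rows older than max_age and return how many were removed."""
    now = now or datetime.now(timezone.utc)
    # ISO-8601 strings in the same timezone sort chronologically,
    # so a plain string comparison works for this stand-in table.
    cutoff = (now - max_age).isoformat()
    cur = conn.execute("DELETE FROM activity WHERE published < ?", (cutoff,))
    conn.commit()
    return cur.rowcount


def demo():
    # Assumption: SQLite stands in for Lemmy's Postgres database here.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE activity (id INTEGER PRIMARY KEY, published TEXT)")
    now = datetime(2023, 8, 1, 12, 0, tzinfo=timezone.utc)
    # Four rows: 1h, 12h, 30h and 200h old.
    ages = (1, 12, 30, 200)
    rows = [((now - timedelta(hours=h)).isoformat(),) for h in ages]
    conn.executemany("INSERT INTO activity (published) VALUES (?)", rows)
    removed = purge_old_activity(conn, now=now)
    remaining = conn.execute("SELECT COUNT(*) FROM activity").fetchone()[0]
    return removed, remaining
```

Calling `demo()` purges the two rows older than a day and leaves the two recent ones. Running something like this from a daily cron job or systemd timer would give the same effect as the manual query.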

nii236@lemmy.jtmn.dev · 1 year ago

Related note, pictrs is super cool. It’s like an OSS imgur backend, but no one really talks much about it or its potential.

nephs@lemmygrad.ml · 1 year ago

It would probably be worth it to have that period be configurable by instance admins…

rs5th@lemmy.scottlabs.io · 1 year ago

I think it’s configurable inside pict-rs’s configuration file. I haven’t messed with it though. I’m also not sure if pict-rs has an API that lemmy can use to configure that.

jon@lemmy.tf · 1 year ago

I’m just letting mine do whatever it wants, got plenty of local storage. If/when I have storage issues I’ll add an s3 bucket, pretty easy to modify the entrypoint for pictrs to pass s3 connection info in the docker-compose deployment.

fox@lemmy.fakecake.org · 1 year ago

S3 support is a good thing, thanks for mentioning it.

poVoq@slrpnk.net · edit-2 1 year ago

Remote images are not cached or proxied right now as far as I know. Edit: seems I was wrong and there is some image caching happening. For sure for the small image thumbnails, but also sometimes for other pictures, but it seems very inconsistent.

Your growing pictrs directory might also be due to the extremely verbose default logging that Pictrs (and the Lemmy backend too btw) uses.

idle@158436977.xyz · 1 year ago

When I look in the directories, it’s 100s of images that are definitely from posts. Maybe it only caches the images I clicked on?

poVoq@slrpnk.net · 1 year ago

No, I was wrong and caching is happening somehow, but not always. I think there might be a strict time-out or something like that for pict-rs trying to cache the images, which is why most images do not get cached in my experience.

idle@158436977.xyz · 1 year ago

In any case, a week’s retention is fine by me. I have a couple hundred gigs available, so long as it’s getting cleaned up at some point it’s not a problem for me.
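If you want to check whether cleanup is actually happening, a rough sketch like the one below totals up the files in a directory, split into those newer and older than a week. It assumes file modification time is a fair proxy for when an image was cached, which may not match how pict-rs decides what to expire, and `size_by_age` is a hypothetical helper name. Point it at wherever your pict-rs volume is mounted.

```python
import os
import time


def size_by_age(root, max_age_days=7, now=None):
    """Return (bytes_newer, bytes_older) for files under root, split at max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    newer = older = 0
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file disappeared while we were walking
            # Assumption: mtime approximates when the file was cached.
            if st.st_mtime < cutoff:
                older += st.st_size
            else:
                newer += st.st_size
    return newer, older
```

If the "older" number keeps growing from week to week, the cache is probably not being trimmed the way you expect.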

arkcom@kbin.social · 1 year ago

I thought instances only cached the text of submissions? I could see that ballooning to be insane pretty quick if the fediverse really takes off.