IAMA recovering GPT-2 Bot Operator. Ask Me Anything!

mm_maybe@sh.itjust.works · 3 days ago

this; every time the ublock origin absolutists insist that everyone must use Firefox or die I just wonder if they never open more than one or two tabs anyway. hell, a sufficiently complex web app running in a single tab can make FF choke

mm_maybe@sh.itjust.works · 4 days ago

There are a bunch of reasons why this could happen. First, it’s possible to “attack” some simpler image classification models; if you get a large enough sample of their outputs, you can mathematically derive a way to process any image such that it won’t be correctly identified. There have also been reports that even simpler processing, such as blending a real photo of a wall with a synthetic image at a very low percentage, can trip up detectors that haven’t been trained to be more discerning. But it’s all in how you construct the training dataset, and I don’t think any of this is a good enough reason to give up on using machine learning for synthetic media detection in general; in fact this example gives me the idea of using autogenerated captions as an additional input to the classification model. The challenge there, as in general, is trying to keep such a model from assuming that all anime is synthetic, since “AI artists” seem to be overly focused on anime and related styles…
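For anyone wondering what "blending at a very low percentage" means in practice, here's a rough sketch in plain Python. The function name `blend_pixels`, the 5% default, and the pixel values are all made up for illustration; this is not how any particular detector was tested, and with real images you'd do the same arithmetic on whole arrays.

```python
def blend_pixels(real, synthetic, alpha=0.05):
    """Mix a small fraction `alpha` of a synthetic image into a real one.

    Both inputs are flat lists of 0-255 channel values of equal length
    (a hypothetical stand-in for real image data).
    """
    if len(real) != len(synthetic):
        raise ValueError("images must have the same number of values")
    blended = []
    for r, s in zip(real, synthetic):
        value = round((1 - alpha) * r + alpha * s)
        # Clamp to the valid 8-bit range.
        blended.append(min(255, max(0, value)))
    return blended


# Made-up pixel values, purely for illustration.
real = [200, 180, 160, 40]
synthetic = [0, 255, 100, 240]

blended = blend_pixels(real, synthetic, alpha=0.05)
largest_change = max(abs(b - r) for b, r in zip(blended, real))
print(blended, largest_change)
```

At 5% the largest possible change to any channel is about 13 out of 255, which is why the result still looks like the original photo to a person.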

mm_maybe@sh.itjust.works · 8 days ago

Serious question, is it likely that comedians who he brings on Kill Tony are also conservative racist POS assholes? I hope not because I actually kind of enjoyed Casey Rocket for a bit when he came up through that show. Definitely got a weird vibe from the host that made me wonder who they were and what the show really was (now I know).

mm_maybe@sh.itjust.works · 9 days ago

Well, maybe we need a movement to make physical copies of these games and the consoles needed to play them available in actual public libraries, then? That doesn’t seem to be affected by this ruling and there’s lots of precedent for it in current practice, which includes lending of things like musical instruments and DVD players. There’s a business near me that does something similar, but they restrict access by age to high schoolers and older, and you have to play the games there; you can’t rent them out.

mm_maybe@sh.itjust.works · 11 days ago

r/SubSimGPT2Interactive for the lulz is my #1 use case

i do occasionally ask Copilot programming questions and it gives reasonable answers most of the time.

I use code autocomplete tools in VSCode but often end up turning them off.

Controversial, but Replika actually helped me out during the pandemic when I was in a rough spot. I trained a copyright-safe (theft-free) bot on my own conversations from back then and have been chatting with the me side of that conversation for a little while now. It’s like getting to know a long-lost twin brother, which is nice.

Otherwise, i’ve used small LLMs and classifiers for a wide range of tasks, like sentiment analysis, toxic content detection for moderation bots, AI media detection, summarization… I like using these better than just throwing everything at a huge model like GPT-4o because they’re more focused and less computationally costly (hence also better for the environment). I’m working on training some small copyright-safe base models to do certain sequence prediction tasks that come up in the course of my data science work, but they’re still a bit too computationally expensive for my clients.

mm_maybe@sh.itjust.works · 17 days ago

We don’t. It probably is. Mastodon is the way, but they need to fix a few things themselves.

mm_maybe@sh.itjust.works · 17 days ago

Ok, thanks for clarifying. FWIW, I find the built-in adblocker in Vivaldi extremely dependable, without the performance cost of loading an add-on (especially on top of a base browser that is significantly slower to begin with).

mm_maybe@sh.itjust.works · 17 days ago

Honest question: why is it not safe after then? They developed their own adblocker if I’m not mistaken? What am I missing?

mm_maybe@sh.itjust.works · edit-2 20 days ago

I already know that nobody will mention Vivaldi, or if they do, they will be drowned out by a chorus of voices shouting “Firefox”, but if you like the chromium ecosystem or just don’t like waiting for web pages to function, it’s a great alternative with a built-in adblocker that doesn’t depend on uBlock Origin.

mm_maybe@sh.itjust.works · 20 days ago

may I ask which third-party tool you use? i’m using onedriver and it’s pretty unreliable in my experience

mm_maybe@sh.itjust.works · 20 days ago

It will legit be a fantastic era for Linux on the desktop though… imagine how cheap we’ll be able to get perfectly good hardware.

mm_maybe@sh.itjust.works · 28 days ago

'tis true that women’s bodies hold great power, and not irrelevant at all to the discussion at hand. rather than reiterate and attempt to paraphrase Jaron Lanier on the topic of how male obsession with creating artificial people is linked to womb envy, I’ll just link to a talk in which he explains it himself:

https://youtu.be/rGqiswuJuQI?si=oAKvWrtlji4yrfpd&t=42m05s

mm_maybe@sh.itjust.works · 28 days ago

Like any occupation, it’s a long story, and I’m happy to share more details over DM. But basically due to indecision over my major I took an abnormal amount of math, stats, and environmental science coursework even though my major was in social science, and I just kind of leaned further and further into that quirk as I transitioned into the workforce. bear in mind that data science as a field of study didn’t really exist yet when I graduated; these days I’m not sure such an unconventional path is necessary. however I still hear from a lot of junior data scientists in industry who are miserable because they haven’t figured out yet that in addition to their technical skills they need a “vertical” niche or topic area of interest (and by the way a public service dimension also does a lot to help a job feel meaningful and worthwhile even on the inevitable rough day here and there).

mm_maybe@sh.itjust.works · 28 days ago

My “day job” is doing spatial data science work for local and regional governments that have a mandate to address climate change in how they allocate resources. We totally use AI, just not the kind that has received all the hype… machine learning helps us recognize patterns in human behavior and system dynamics that we can use to make predictions about how much different courses of action will affect CO2 emissions. I’m even looking at small GPT models as a way to work with some of the relevant data that is sequence-like. But I will never, I repeat never, buy into the idea of spending insane amounts of energy attempting to build an AI god or Oracle that we can simply ask for the “solution to climate change”… I feel like people like me need to do a better job of making the world aware of our work, because the fact that this excuse for profligate energy waste has any traction at all seems related to the general ignorance of our existence.

mm_maybe@sh.itjust.works · 29 days ago

I find it very funny that people are so concerned about false positives. Models like these should really only be used as a screening tool to catch things and flag them for human review. In that context, false positives seem less bad than false negatives (although, people seem to demand zero error in either direction, and that’s just silly).
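To make the screening idea concrete, here's a minimal sketch with made-up scores and labels (none of this comes from the actual detector, and the names `screen` and `error_counts` are just for illustration). Lowering the flagging threshold sends more items to human review, trading false negatives for false positives.

```python
def screen(scores, threshold):
    """Flag every item whose score is at or above the threshold for human review."""
    return [s >= threshold for s in scores]


def error_counts(flags, labels):
    """Return (false_positives, false_negatives) for a set of flags."""
    false_pos = sum(1 for f, y in zip(flags, labels) if f and not y)
    false_neg = sum(1 for f, y in zip(flags, labels) if not f and y)
    return false_pos, false_neg


# Hypothetical detector scores and ground truth (True = actually synthetic).
scores = [0.95, 0.80, 0.62, 0.45, 0.30, 0.10]
labels = [True, True, False, True, False, False]

strict = error_counts(screen(scores, 0.7), labels)   # misses one synthetic item
lenient = error_counts(screen(scores, 0.4), labels)  # flags one real item instead
print("strict (FP, FN):", strict)
print("lenient (FP, FN):", lenient)
```

With a human reviewing whatever gets flagged, the lenient setting costs a reviewer a few seconds on one real image, while the strict one lets a synthetic image through unexamined.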

mm_maybe@sh.itjust.works · 29 days ago

If you don’t mind, I’d be interested to see the images you used. The broad validation tests I’ve done suggest 80-90% accuracy in general, but there are some specific categories (anime, for example) on which it performs kinda poorly. If your test samples have something in common it would be good to know so I can work on a fix.

mm_maybe@sh.itjust.works · 30 days ago

Friendly reminder that my AI-generated image detector is available to use free of charge here: https://huggingface.co/spaces/umm-maybe/sdxl-detector

mm_maybe@sh.itjust.works · 1 month ago

We had a classic bot on r/SubSimGPT2Interactive that was trained on discussions about Dwarf Fortress. It was always one of my favorites even though I had never played Dwarf Fortress, and it took me a while to realize that more than half of the absurdity it was generating was due to the game just being like that. GPT-2 was merely shuffling the deck and dealing the cards

mm_maybe@sh.itjust.works · 1 month ago

Me: I’ve cut my coffee intake down to one cup a day! Look how disciplined and restrained I am!

Also me: drinks 1.5 cans of Celsius per day

mm_maybe@sh.itjust.works · 2 months ago

Especially since, like, women have been using machines to pleasure themselves for a while now and it’s still kind of a novelty for most het cis men

mm_maybe@sh.itjust.works · 1 year ago

IAMA recovering GPT-2 Bot Operator. Ask Me Anything!
