From the grifters, to the chanlord fascists, to the pedophiles, to the people whose only crime is just being too cringe, the community is a toxic morass that is completely unsalvageable. And the creepiest part is how it's all just beneath the surface, hidden in plain sight. For example, if you browse checkpoints on Civitai you'll inevitably run across galleries where one image shows off how well the checkpoint can do porn, and the very next one is a SFW picture showing off how well it can generate children, with the creator saying something like "remember to add words like loli and child to the negative prompt when generating NSFW pictures just in case" in the description.
Then there's this absurd gem from a wildcard collection that I poked through to learn how wildcards work (turns out a wildcard is literally just a text file of options, one per line; dynamic ones are the same, except the options can themselves contain wildcards or dynamic prompt syntax, which get expanded over and over until none remains). But before you click this spoiler, I want you to imagine what race.txt contains, really think about what an AI guy would put in there, then see how close you were:
british
czech
european
french
german
hungarian
icelandic
irish
italian
jewish
polish
portuguese
russian
[russian:japanese:0.3]
spanish
swedish
ukrainian
welsh
That's right, it's 100% weird euro brainworms splitting hairs between flavors of european, and one context switching prompt that switches from russian to japanese 30% of the way through generation to make sure that the one non-white entry starts off white. Not sure why he even bothers, since most of the checkpoints are so overtrained on white women that they will always spit out extremely pale figures regardless of what the prompt says.
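For anyone wondering what "a list of options, nested until no syntax remains" looks like mechanically, here's a rough sketch. It assumes the `__name__` and `{a|b}` syntax that the Dynamic Prompts extension uses, as far as I understand it; the function names and the example wildcards are made up for illustration, and this is not the extension's actual code. Anything in square brackets, like `[russian:japanese:0.3]`, passes through untouched, since that part is prompt-editing syntax handled later by the UI rather than by wildcard expansion.

```python
import random
import re

# Assumed syntax: __name__ refers to a wildcard file, {a|b|c} is an inline choice.
WILDCARD = re.compile(r"__([A-Za-z0-9_-]+?)__")
VARIANT = re.compile(r"\{([^{}]*)\}")  # matches only innermost {...} groups


def load_wildcard(text):
    """A wildcard file is one option per line; blank lines are skipped."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def expand(prompt, wildcards, rng, max_passes=50):
    """Substitute repeatedly until no wildcard or variant syntax is left.

    `wildcards` maps a wildcard name to its list of options (hypothetical
    in-memory stand-in for a folder of .txt files).
    """
    for _ in range(max_passes):
        new = WILDCARD.sub(lambda m: rng.choice(wildcards[m.group(1)]), prompt)
        new = VARIANT.sub(lambda m: rng.choice(m.group(1).split("|")), new)
        if new == prompt:  # nothing left to expand
            return new
        prompt = new
    raise ValueError("expansion did not terminate (self-referencing wildcard?)")


if __name__ == "__main__":
    wildcards = {
        "pet": load_wildcard("__color__ cat\n{small|large} dog\n"),
        "color": ["black", "orange"],
    }
    print(expand("a photo of a __pet__", wildcards, random.Random()))
```

The pass limit is only there so a wildcard that refers to itself fails loudly instead of looping forever.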
Second conclusion: stable diffusion itself is actually a pretty fun toy, as long as you ignore the community. Fighting with it to make it not suck is an engaging challenge, and hitting the generate button to see if you've succeeded is like pulling a slot machine lever. Learning how to control this inscrutable eldritch machine is indisputably fun, despite everything around it.
Third conclusion: stable diffusion is fucking terrifying, and at this point is actually good at what it does with modern checkpoints. SDXL is obviously a step up, but even SD1.5 has been refined to the point that it's starting to lose the obvious tells as long as it's used right. The state it's in now is absurdly different from where it was six months ago and almost unrecognizable from where it was a year ago.
Fourth conclusion: stable diffusion is a horrifyingly addictive skinner box that mainlines psychic damage directly into your brain. It's an infinite gacha machine that you pay for with electricity and time instead of microtransactions. It's like introverted doomscrolling. It's so captivating that it's consumed almost every waking moment of my life for the past week, and I've only escaped by sequestering it onto a Linux partition and, while trying to optimize it, breaking my stable diffusion install on Windows in a way that would take a conscious effort to fix.
Fifth conclusion/summary: stable diffusion is a cognitohazard being actively shaped by the worst people alive, and there's no solution in sight. There was some faint hope that Nightshade could slow it down, but so far the buzz around that seems to be that it actually improves the models, because its concept poisoning introduces noise that prevents overtraining while still helping to refine them. Then again, that's coming from the stable diffusion community, so it's unreliable info at best.
Still, the fact that something open source and completely uncontrollable has become as good as stable diffusion already is, and that there's every indication it will only continue to be refined and improved on, is almost a relief compared to the alternative of it being exactly the same but also the private and fully enclosed property of corporations run by the literal worst people alive. I really can't help but take some solace in the fact that open models are competing effectively with the proprietary ones, and may even win out. I sure as hell don't want to see those OpenAI ghouls come out on top, because even if most of the stable diffusion community is irredeemably awful, at least some of it is just sort of cringe.
It really is a fascinating technology, and the deeper you get into manipulating it step-by-step, the more impressive it is that any of this stuff actually works. There are the super complex ComfyUI setups with conditioning and segmenting and other stuff I don't understand, and then there's the new turbo models that can give you a new wacky image as fast as you can type. It's the craziest toy I've ever seen, a hallucinating computer with a whole patchboard full of dials and plugs and cables hidden behind a panel with a giant "make a picture" button that lets you see what you convinced it to dream about.
And then almost everything about the way it's actually used is depressing as hell.
In short:
That reminds me, I should try to get ComfyUI set up. I've just been using auto1111, although that still has weird stuff like cross-attention control which I haven't even started to fuck with.
Yeah, SDXL Turbo is wild. I've heard it's down to like 0.1 seconds per image on a high-end consumer card, and on my (mid-range AMD) card it's a few seconds. The only downsides are the comparative lack of community assets for it (for a variety of reasons) and the lower overall quality compared to SDXL or SD1.5. Well, that and the fact that SDXL Turbo is apparently proprietary and more strictly controlled than the older models. Still, the fact that a model can generate results as good as that with so few sampling passes is just absurd.