From the grifters, to the chanlord fascists, to the pedophiles, to the people whose only crime is just being too cringe, the community is a toxic morass that is completely unsalvageable. And the creepiest part is how it's all just beneath the surface, hidden in plain sight. For example, if you browse checkpoints on civitai you'll inevitably run across galleries showing off how well the checkpoint can do porn in one image, and then the next image will be a SFW picture showing off how well it can generate children, with the creator saying something like "remember to add words like loli and child to the negative prompt when generating NSFW pictures just in case," in the description.

Then there's this absurd gem from a wildcard collection that I poked through to learn how wildcards work (turns out it's literally just a list of options broken up with newlines; and dynamic ones are the same but can be nested until no wildcards or dynamic prompt syntax remains). But before you click this spoiler, I want you to imagine what race.txt contains, really think about what an AI guy would put in there, then see how close you were:
british
czech
european
french
german
hungarian
icelandic
irish
italian
jewish
polish
portuguese
russian
[russian:japanese:0.3]
spanish
swedish
ukrainian
welsh

That's right, it's 100% weird euro brainworms splitting hairs between flavors of european, and one context switching prompt that switches from russian to japanese 30% of the way through generation to make sure that the one non-white entry starts off white. Not sure why he even bothers, since most of the checkpoints are so overtrained on white women that they will always spit out extremely pale figures regardless of what the prompt says.
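For anyone curious about the mechanics rather than the contents, here's a rough sketch of how that kind of expansion could work. It is not the actual extension code. It assumes the `__name__` token for "pick a random line from name.txt" and the `{a|b|c}` variant syntax, uses a made-up in-memory dictionary instead of real files, and the `switch_step` rule (values below 1 are a fraction of the total steps, anything else is an absolute step) is my approximation of how `[from:to:when]` behaves, so treat all the names here as hypothetical.

```python
import random
import re

# Hypothetical stand-in for a folder of wildcard .txt files:
# each key is a file name (minus .txt), each value is that file's lines.
WILDCARDS = {
    "color": ["red", "teal", "__metal__"],  # an option may itself be a wildcard
    "metal": ["gold", "silver"],
    "animal": ["fox", "heron"],
}

WILDCARD_RE = re.compile(r"__(\w+?)__")   # __name__ -> random line of name.txt
VARIANT_RE = re.compile(r"\{([^{}]*)\}")  # {a|b|c}  -> one of the options


def expand(prompt, wildcards, rng, max_passes=20):
    """Resolve wildcards and variants repeatedly until none remain."""
    for _ in range(max_passes):
        if not (WILDCARD_RE.search(prompt) or VARIANT_RE.search(prompt)):
            return prompt
        # Unknown wildcard names raise KeyError rather than passing silently.
        prompt = WILDCARD_RE.sub(
            lambda m: rng.choice(wildcards[m.group(1)]), prompt
        )
        # Innermost {...} groups resolve first; outer ones on later passes.
        prompt = VARIANT_RE.sub(
            lambda m: rng.choice(m.group(1).split("|")), prompt
        )
    raise ValueError("wildcards still unresolved; possible cycle")


def switch_step(when, total_steps):
    """Step at which [from:to:when] swaps prompts, under the assumed rule."""
    step = when * total_steps if when < 1 else when
    return min(total_steps, int(step))


if __name__ == "__main__":
    rng = random.Random()
    print(expand("a {small|large} __color__ __animal__", WILDCARDS, rng))
    print(switch_step(0.5, 20))
```

Under those assumptions, an entry like `[russian:japanese:0.3]` on a 20-step run would use the first prompt for roughly the first 6 steps and the second for the rest, which is why the early, composition-defining steps are driven by the first term.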


Second conclusion: stable diffusion itself is actually a pretty fun toy, as long as you ignore the community. Fighting with it to make it not suck is an engaging challenge, and hitting the generate button to see if you've succeeded is like pulling a slot machine lever. Learning how to control this inscrutable eldritch machine is indisputably fun, despite everything around it.

Third conclusion: stable diffusion is fucking terrifying, and at this point is actually good at what it does with modern checkpoints. SDXL is obviously a step up, but even SD1.5 has been refined to the point that it's starting to lose the obvious tells as long as it's used right. The state it's in now is absurdly different from where it was six months ago and almost unrecognizable from where it was a year ago.

Fourth conclusion: stable diffusion is a horrifyingly addictive skinner box that mainlines psychic damage directly into your brain. It's an infinite gacha machine that you pay for with electricity and time instead of microtransactions. It's like introverted doomscrolling. It's so captivating that it's consumed almost every waking moment of my life for the past week, and I've only escaped by sequestering it onto a linux partition and, while trying to optimize it, breaking my stable diffusion install on windows in a way that would take a conscious effort to fix.

Fifth conclusion/summary: stable diffusion is a cognitohazard being actively shaped by the worst people alive, and there's no solution in sight. There was some faint hope that Nightshade could slow it down, but so far the buzz is that it actually improves the models, because its concept poisoning introduces noise that prevents overtraining while still helping to refine them. Then again, that's coming from the stable diffusion community, so it's unreliable info at best.

Still, the fact that something open source and completely uncontrollable has become as good as stable diffusion already is and that there's every indication it will only continue to be refined and improved on is almost a relief, compared to the alternative of it being exactly the same but also the private and fully enclosed property of corporations run by the literal worst people alive. I really can't help but take some solace in the fact that open models are competing effectively with the proprietary ones, and may even win out. I sure as hell don't want to see those OpenAI ghouls come out on top, because even if most of the stable diffusion community is irredeemably awful at least some of it is just sort of cringe.

  • goose [he/him]
    hexbear
    31
    5 months ago

    It really is a fascinating technology, and the deeper you get into manipulating it step-by-step, the more impressive it is that any of this stuff actually works. There are the super complex ComfyUI setups with conditioning and segmenting and other stuff I don't understand, and then there's the new turbo models that can give you a new wacky image as fast as you can type. It's the craziest toy I've ever seen, a hallucinating computer with a whole patchboard full of dials and plugs and cables hidden behind a panel with a giant "make a picture" button that lets you see what you convinced it to dream about.

    And then almost everything about the way it's actually used is depressing as hell

    In short: yea

    • KobaCumTribute [she/her]
      hexagon
      hexbear
      8
      5 months ago

      There are the super complex ComfyUI setups with conditioning and segmenting and other stuff I don't understand

      That reminds me I should try to get ComfyUI set up. I've just been using auto1111, although that still has weird stuff like cross-attention control which I haven't even started to fuck with.

      the new turbo models that can give you a new wacky image as fast as you can type

      Yeah, SDXL Turbo is wild. I've heard it's down to like .1 seconds per image on a high end consumer card, and on my (mid range AMD) card it's a few seconds. The only downside is the comparative lack of community assets for it (for a variety of reasons), and the lower overall quality than SDXL or SD1.5. Well, that and the fact that SDXL Turbo is apparently proprietary and more strictly controlled than the older models. Still, the fact that a model can generate results as good as that with so few sampling passes is just absurd.

  • bubbalu [they/them]
    hexbear
    23
    5 months ago

    The excitement around this feels like how I imagine it felt when the OG modular synths were being built. It's a lot of people who are good at playing on the surface of the thing, and few truly gifted systems designers.

  • Llituro [he/him, they/them]
    hexbear
    14
    5 months ago

    every time i see something about stable diffusion i confuse it for wavefunction collapse algorithms, and i'm always disappointed that we're talking about the ai

  • dinklesplein [any, he/him]
    hexbear
    14
    5 months ago

    based off some preliminary research it feels like a large part of ai porn all looking the same stylistically seems to come down to ai gooners all reusing the same baseline prompt - i would not be surprised if 95% of it contained a common set of keywords in the stablediffusion prompt.

    • KobaCumTribute [she/her]
      hexagon
      hexbear
      8
      5 months ago

      This is a wild tangent, but for some reason that idea reminds me of the novelization of Myst of all things, where a plot point around the whole "creating worlds by describing them in detail" thing involved the protagonist going into obsessive detail about every minor detail of the setting and being scolded for not being minimalist and exclusively focusing on the functional parts like "there's air" and "this place is useful and also not on fire or made of poison or some shit like that" by his father who erases the added lines, yielding worlds that are shitty and don't work right.

      For all that it's a rather on-the-nose allegory for writing and scene setting in general, it's eerily similar to how stacking the right added details in a prompt can massively impact the entire image, including unrelated parts, in stable diffusion. Left without them, it just sort of fuzzily makes a generic average that might be ok if generic, or it might make a limb fold back in on itself, disappear behind a narrow object, and reappear somewhere else entirely like it's a fucking Looney Tunes gag. But setting up something to painstakingly describe the color and texture of the literal dirt on the ground in the picture can somehow impact and fix the detail and perspective of figures in the scene, like it's trying to make everything match the intricacy and so not falling into the weird impossible contortion and melting zones.

  • fanbois [he/him]
    hexbear
    9
    edit-2
    5 months ago

    For the mentioned reasons, any ai art and its generators should be treated the same way as toxic waste. It's radioactive garbage that should be buried under 120000 tons of rock and salt with a big sign with a skull in front of it.

    It's a hallucinating computer, but it can't die, and we are giving it the worst drug cocktail imaginable and then tweaking whether it needs more fly amanita, gasoline or dried centipede to make the titties just right.

    It is the best intersection of human and machine that we are capable of and the result is exclusively the worst of both worlds.

    • carpoftruth [any, any]
      hexbear
      3
      5 months ago

      Ask ChatGPT to write its own version of the "this is not a place of honour" plaque

  • RyanGosling [none/use name]
    hexbear
    7
    5 months ago

    I’m pretty sure Stable Diffusion and Midjourney are the dominant models. DALLE is basically irrelevant now because it didn’t allow the model to generate using stolen artwork lol. But I’m assuming OpenAI will cave in and open up those restrictions to compete.

    • Awoo [she/her]
      hexbear
      3
      5 months ago

      Dalle is considerably better than both imo. It's massively better at understanding prompts with very complicated requests and the output is rarely a mess.

      The problem is that the current implementation on Bing/create is very limited.

    • KobaCumTribute [she/her]
      hexagon
      hexbear
      13
      5 months ago

      Open source AI image generator that runs locally on consumer GPUs (best on nvidia, but surprisingly usable even on AMD albeit with worse performance and a bit more work required to make it function). I've been using the automatic1111 webui which sets up a local server that you interact with through a browser tab.

    • Awoo [she/her]
      hexbear
      2
      5 months ago

      Depends on your gpu and settings, but generally speaking better gpu will yield much faster results.

  • kot [they/them]
    hexbear
    3
    5 months ago

    It's the exact same people involved with NFTs, what did you expect from their latest bazinga fad?

  • Awoo [she/her]
    hexbear
    3
    edit-2
    5 months ago

    I've played around with this a lot and had pretty much the same experience. Can't really discuss much of it here.

    Still, the fact that something open source and completely uncontrollable has become as good as stable diffusion already is and that there's every indication it will only continue to be refined and improved on is almost a relief, compared to the alternative of it being exactly the same but also the private and fully enclosed property of corporations run by the literal worst people alive. I really can't help but take some solace in the fact that open models are competing effectively with the proprietary ones, and may even win out. I sure as hell don't want to see those OpenAI ghouls come out on top, because even if most of the stable diffusion community is irredeemably awful at least some of it is just sort of cringe.

    One of the primary issues with the community is this problem. The primary motivator of the people working on this shit is the content they're not allowed to have in the private AIs. This invariably means that they're a bunch of porn addicts or those that want genuinely illegal content.

    Either way this guarantees open AI will continue alongside private AI because private isn't going to touch this shit with a barge pole.