
How I’m approaching Generative AI

by cthos

The Duality of AI
Lidiia Lohinova @ Shutterstock #2425460383

Also known as the “Plausible Sentence Generator” and “Art Approximator”

This post is only about Generative AI. There are plenty of other Machine Learning models, and some of them are really useful, we’re not talking about those today.

I feel like every single day I see some new startup or post about how Generative AI is the future of everything and how we’re right on the cusp of Artificial General Intelligence (AGI) and soon everything from writing to art to music to making pop tarts will be controlled by this amazing new technology.

In other words, this is the biggest tech hype cycle I’ve personally witnessed. Blockchain, NFTs, and the like come close (remember companies adding “blockchain” to their products just to get investment in the last bubble?) and maybe the dotcom bubble, but I think this “AI” cycle is even bigger than them all.

There are a lot of reasons for that, which I’m going to get into as part of this … probably very lengthy post about Generative AI in general, where I find it useful, where I don’t, where my personal ethics land on the various elements of GenAI (and I’ll be sure to treat LLMs and Diffusion models differently). So, by the end of this, if I’ve done my job right, you’re going to understand a bit more about why I think there’s a lot of hype and not a lot of substance here — and how we’re going to do a lot of damage in the meantime.

Never you worry friends, I’m going to link to a lot of sources for this one.

If you’ve been living deep in a cave with no access to the news you might not have heard about Generative AI. If you are one of those people and are reading this, I envy you - please take me with you. I’m going to go ahead and define AI for the purposes of this article because the industry has gone and overloaded the term “AI” once again.

I’m going to constrain myself to “Generative AI”, also known as “GenAI”, of two categories: Large Language Models (LLMs) and Diffusion Models (like Dall-E and Stable Diffusion). The way they work is a little bit different, but the way they are used is similar. You give them a “prompt” and they give you some output. For the former, this is text, and for the latter, an image (or video, in the case of Sora). Sometimes we slap them together. Sometimes we slap them together 6 times.

Examples of LLMs: ChatGPT, Claude, Gemini (they might rename it again after this post goes live because Google gonna Google).

I’m going to take my best crack at summarizing how this works, but I’ll link to more in-depth resources at the end of the section. In its most basic terms, an LLM takes the prompt that you entered and then it uses statistical analysis to predict the next “token” in the sequence. So, if you give it the sentence “Cats are excellent”, the LLM might have correlated “hunters” as the next token in the sequence as statistically 60% likely. The word “pets” might be 20%. And so on. It’s essentially “autocomplete with a ton of data fed to it”.

Sidebar: a token is not necessarily a full word. It could be a “.”, it could be a syllable, it could be a suffix, and so on. But for the purposes of the example you can think of them as words.

What the LLM does that makes it “magical” and able to generate “novel” text is that sometimes it won’t pick the statistically most likely next token. It’ll pick a different one (based on its Temperature, Top-P, and Top-K parameters), which then sends it down a different path (because the token chain is now different). This is what enables it to give you a Haiku about your grandma. It’s also what makes it generate “alternative facts”. Also known as “hallucinations”.
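The sampling knobs mentioned above can be sketched in a few lines. This is a toy, not how any real LLM is implemented: the “probabilities” are made-up logits for the “Cats are excellent” example, and `sample_next_token` is only there to show what Temperature and Top-P do to a distribution.

```python
import math
import random

# Made-up next-token logits for the prompt "Cats are excellent".
logits = {"hunters": 2.0, "pets": 0.9, "climbers": 0.5, "senators": -1.0}

def sample_next_token(logits, temperature=1.0, top_p=1.0, rng=random):
    # Temperature rescales the logits: low values sharpen the
    # distribution (more deterministic), high values flatten it.
    scaled = {tok: l / temperature for tok, l in logits.items()}
    total = sum(math.exp(v) for v in scaled.values())
    probs = {tok: math.exp(v) / total for tok, v in scaled.items()}

    # Top-p ("nucleus") sampling: keep only the smallest set of tokens
    # whose cumulative probability reaches top_p, then sample from those.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    nucleus, cumulative = [], 0.0
    for tok, p in ranked:
        nucleus.append((tok, p))
        cumulative += p
        if cumulative >= top_p:
            break
    tokens, weights = zip(*nucleus)
    return rng.choices(tokens, weights=weights, k=1)[0]

# Near-zero temperature almost always yields "hunters"; higher
# temperatures let less likely tokens through.
print(sample_next_token(logits, temperature=0.1))
```

Crank the temperature up and “senators” starts sneaking into the cat sentence, which is exactly the novelty/hallucination trade-off described above.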

This is a feature.

You see, the LLM has no concept of what a “fact” is. It only “understands” statistical associations between the words that have been fed to it as part of its dataset. So, when it makes up court cases, or claims public figures have died when they’re very much still alive, this is what’s happening. OpenAI, Microsoft, and others are attempting to rein this in with various techniques (which I’ll cover later), but ultimately the “bullshit generation” is a core function of how an LLM works.

This is a problem if you want an LLM to be useful as a search engine, or in any domain that relies on factual information, because invariably it will make fictions up by design. Remember that, because it’s going to come up over and over again.

  1. Stephen Wolfram talks about how ChatGPT works.
  2. On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜

I don’t understand diffusion models as well as I understand language models, much like I understand the craft of writing more than I do art, so this is going to be a little “fuzzier”.

Examples of Diffusion Models: Dall-E 3 (Bing Image Creator), Stable Diffusion, Midjourney

Basically, a Diffusion model is the answer to the question “what happens if you train a neural network on tagged images and then introduce progressively more random noise?” The process works (massively simplified) like this:

  1. The model is given an image labeled “cat”
  2. A bit of random noise (or static) is introduced into the image.
  3. Do Step 2 over and over again until the image is totally unrecognizable as a cat.
  4. Congrats! You now know how to make a “Cat” into random noise.

But the question then becomes “can we reverse the process?”. Turns out, yes, you can. To get an image from a prompt of “Give me an image that looks like a cat”, the diffusion model essentially runs the process in reverse:

  1. We generate an image that is nothing but random noise.
  2. The model uses its training data to “remove” that random noise, just a bit
  3. Repeat step 2 over and over again
  4. Finally, you have an image that looks something akin to a cat
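The two numbered loops above can be sketched with a stand-in “model”. This is purely illustrative: a real diffusion model learns to predict noise from millions of tagged images, while this toy just nudges pixels toward one memorized “cat” so you can see the shape of the noise-then-denoise loop.

```python
import random

def add_noise(image, amount, rng):
    # Forward process (the first list): blend in a bit of Gaussian noise.
    return [(1 - amount) * px + amount * rng.gauss(0.0, 1.0) for px in image]

def denoise_step(image, target, amount):
    # Reverse process: a real model predicts the noise to remove from its
    # training data; this stand-in nudges each pixel toward a memorized
    # "cat", purely to show the structure of the loop.
    return [px + amount * (t - px) for px, t in zip(image, target)]

rng = random.Random(0)
cat = [0.9, 0.1, 0.8, 0.2]  # a very low-resolution "cat"

# Noise it over and over until it's unrecognizable.
noisy = cat
for _ in range(100):
    noisy = add_noise(noisy, 0.2, rng)

# Now reverse: start from pure random noise and denoise step by step.
image = [rng.gauss(0.0, 1.0) for _ in cat]
for _ in range(100):
    image = denoise_step(image, cat, 0.1)

print([round(px, 2) for px in image])
```

After enough reverse steps the pixels land very close to the memorized “cat”, which is the (massively simplified) idea behind getting a coherent image out of static.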

Now, on the other side, your model might not have generated a great cat. It doesn’t know what a cat is. So, it asks another model: “Hey, is this an acceptable cat?” Said model will either say “nope, try again”, or it will respond with “heck yes! That’s a cat. Do more like that”.

This is Reinforcement Learning - this is going to come up again later.

So, at its most “basic” representation, the things that are making “AI Art” are essentially random noise de-noiserators. Which, at a technical level, is super cool! Who would have thought you could give a model random noise garbage and get a semi-coherent image out of the other end?

  1. Step by Step Visual Introduction to Diffusion Models by Kemal Erdem
  2. How Diffusion Models Work by Leo Isikdogan (video)

These things are energy efficient and cheap to run, right?


I mean, it’s $20/mo for an OpenAI ChatGPT pro subscription, how expensive could it be?

My friends, this whole industry is propped up by a massive amount of speculative VC / Private Equity funding. OpenAI is nowhere near profitable. Their burn rate is enormous (partly due to server costs, but also because training foundational models is expensive). Sam Altman is seeking $7 trillion for AI chips. Moore’s Law is dead, so we can’t count on the cost of compute getting ever smaller.

Let’s also talk about the environmental impact of some of these larger models. Training them requires a lot of water. Using them uses way less water (well, as much as running a power-hungry GPU would require), but the overall lifecycle of a GenAI large foundational model isn’t exactly sustainable in the world of impending climate crises.

One thing that’s also interesting is there are a number of smaller, usable-ish models that can run on commodity hardware. I’m going to talk about those later.

I think part of what’s fueling the hype here is that only a few companies on the planet can field and develop these large foundational models, and no research institution currently can. If you can roll out “AI” to every person who uses a computer, your potential and addressable markets are enormous.

Because there are only a few players in the space, they’re essentially doing what Amazon is an expert at: Subsidize the product to levels that are unsustainable (that $20/mo, for example) and then jack up the price later once you’ve got a captive market that has no choice in the matter anymore.

Go have a watch of Jon Stewart’s interview with FTC Chair Lina Khan; it’s a good one and touches on this near the end.

We’re already seeing them capture a lot of market here, too, because a ton of startups are building features which simply ask you, the audience, to provide an OpenAI API key. Or, they subsidize the API access cost to OpenAI through other subscription fees. Ultimately, a very small number of players under the hood control access and cost… which is going to be very very “fun” for a lot of businesses later.

I do think OpenAI is chasing AGI… for some definition of AGI. But I don’t think it’s likely they’re going to get there with LLMs. I think they think that they’ll get there, but they’re now chasing profit. They’re incentivized to say they’ve got AGI even if they don’t.

Cool! So we modeled this on the human brain?


I’m getting pretty sick of hearing this one. The concept of a computer Neural Network is pretty neat, but every time someone says “and this is how a human brain works” it drives me a little bit closer to throwing my laptop in a river.

It’s not. Artificial Neural Networks (ANNs) were invented in the late 1950s, and were modeled after a portion of how we thought our brains might work at the time. Since then, we’ve made advances with things like Convolutional Neural Networks (CNNs) starting in the 1980s, and most recently Transformers (this is what ChatGPT uses). None of these ANN models reflects what the human brain is actually doing. We don’t fully understand how the human brain works in the first place, and the entire field of neuroscience is constantly making discoveries.

Did Transformer architecture stumble upon how the human brain works? Unlikely, but, hey, who knows. Let’s throw trillions of dollars at the problem until we get sentient clippy.

Look, I could get into a lengthy discussion about whether free will exists but I’m gonna spare you that one.

Wikipedia covers this better than I could, so go have a read on the history of ANNs.

AI will not just keep getting better the more data we put in


Couple of things here: it’s really hard to model how well a generative AI tool is doing on benchmarks. Pay attention to the various studies that have been released (peer reviewing OpenAI’s studies has been hard, turns out). You’re not getting linear growth with more data. You’re not getting exponential growth (which I suspect is what the investors are wanting).

You’re getting small incremental improvements simply from adding more data. There are some things the AI companies are doing to improve performance for certain queries (this is human reinforcement, as well as some “safety” models and mechanisms) - but the idea that you just keep feeding a foundational model more data and it suddenly becomes much better is a logical fallacy, and there’s not a lot of evidence for it.

Oh, oh gods, no. It can be very biased. It was trained on content curated from the internet.

I cannot describe this any better than this Bloomberg article - it's amazing, and it covers how image generators tend to racially code professions.

I'm skeptical that fixing the bias is even possible. Google tried and wound up making "racially diverse WWII German soldiers". AKA Nazis.

Right, so how did they train these things?


The shortest answer is “a whole bunch of copyrighted content that a non-profit scraped from the internet”. The longer answer is “we don’t actually fully know because OpenAI will not disclose what’s in their datasets”.

One of the datasets, by the way, is Common Crawl - you can block its scraper if you desire. That dataset is available for anyone to download.
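Common Crawl’s scraper identifies itself as CCBot, so blocking it is a two-line addition to your site’s robots.txt (assuming, of course, that the scraper in question actually honors robots.txt, which Common Crawl says it does):

```
User-agent: CCBot
Disallow: /
```

This only stops future crawls; anything already in the published datasets stays there.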

If you’re an artist who had a publicly accessible site, art on DeviantArt, or really anywhere else one of the bots can scrape, your art has probably been used to train one of these models. Now, they didn’t train the models on “the entire internet”: Common Crawl’s dataset is around 90 TB compressed, and most of that is… well, garbage. You don’t want that going into a model. Either way, it’s a lot of data.

If you were a company who wanted to get billions of dollars in investment by hyping up your machine learning model, you might say “this is just how a human learns to do art! They look at art, and they use that as inspiration! Exactly the same.”

I don’t buy that. An algorithm isn’t learning, it’s taking pieces of its training set and reproducing it like a facsimile. It’s not making anything new.

I struggle with this a bit too. One of my favorite art series is Marcel Duchamp’s Readymades - because it makes you question “what is art, really?”. Does putting a urinal on its side make it art? For me, yes, because Duchamp making you question the art is the art. Is “Hey Midjourney give me Batman if he were a rodeo clown” art? Nah.

Thus, OpenAI is willing to go to court to make a fair use argument in order to continue to concentrate the research dollars in their pockets and they’re willing to spend the lobbying dollars to ask forgiveness rather than waiting to ask permission. There’s a decent chance they’ll succeed. They’ll have profited off of all of our labor, but are they contributing back in a meaningful way?

Let’s explore.

Part 2, or “how useful are these things actually”?


Recycling truck
Paul Vasarhelyi @ Shutterstock #78378802

Let’s start with LLMs, which the AI companies claim to be a replacement for writing of all sorts or (in the case of Microsoft) the cusp of a brilliant Artificial General Intelligence which will solve climate change (yeahhhhh no).

Remember above how LLMs take statistically likely tokens and start spitting them out in an attempt to “complete” what you’ve put into the prompt? How are the AI companies suggesting we use this best?

Well, the top things I see being pushed boil down to:

  1. Replace your Developers with the AI that can do the grunt work for you
  2. Generate a bunch of text from some data, like a sales report or other thing you need “summarized”
  3. Replace search engines, because they all kind of suck now.
  4. Writing assistants of all kinds (or, if you’re an aspiring grifter, Book generation machine)
  5. Make API calls by giving the LLM the ability to execute code.
  6. Chatbots! Clippy has Risen again!

There are countless others that rely on the illusion that LLMs can think, but we’re going to stay away from those. We’re talking about what I think is useful here.

The Elephant in the Software Community: Do you need developers?


Okay, there are so many ways I can refute this claim it’s hard to pick the best one. First off, “prompt engineering” has emerged as a prime job, and it’s really just typing various things into the LLM to try to get the best results (again, manipulating the statistics engine into giving you the output you want. Non-deterministic output). That is essentially a development job; you’re using natural language to try to get the machine to do what you want. Because it has a propensity not to do that, though, it’s not the same as a programming language, which does exactly what you tell it to, every time. Devs write bugs, to be sure, but what the code says is what you’re going to get out the other end. With a carefully crafted prompt you will probably get what you want out the other end, but not always (this is a feature, remember?).

The folks who are financially motivated to sell you ever increasingly complex engines are incentivized to tell you that you can cut costs and just let the LLM do the “boring stuff” leaving your most high-value workers free to do more important work.

And you know what, because these LLMs were trained on a bunch of structured code, yeah, you probably can get it to semi-reliably produce working code. It’s pretty decent at that, turns out. You can get it to “explain” some code to you and it’ll do an okay (but often subtly wrong) job. You can feed it some code, tell it to make modifications, or write tests, and it’ll do it.

Even if it’s wrong, we’ve built up a lot of tooling over the years to catch mistakes. Paired with a solid IDE, you can find errors in the LLM’s code more readily than just reading it yourself. Neat!

I actually tried this recently when revamping the GW2 Assistant app. I’ll be doing a post on my experiment soonish, but in the meantime let me summarize my thoughts in two points:

An experienced developer knows when the LLM has produced unsustainable or dangerous code, and if they’re on guard for that and critically examine the output they probably will be more efficient than they were before.

Inexperienced developers will not be able to do that due to unfamiliarity and will likely just let the code go if it “works”. If it doesn’t work, they’re liable to get stuck for far longer than pair programming with a human.

Devin, the AI Agent that claims to be the first AI software engineer, looks pretty impressive! Time for all software devs to take up pottery or something. I want you to keep an eye on those demos and what the human is typing into the prompt engine. One thing I noticed in the headline demo is that the engineer had to tell Devin 3 or 4 times (I kinda lost count) that it was using the wrong model and to “be sure to use the right model”. There were also several occasions where he had to nudge it using specialized knowledge that the average person is simply not going to have. Really, go check it out.

Okay, so, we’re safe for a little bit right?

Well… no. I’m going to link to an article by Baldur Bjarnason: The one about the web developer job market. It’s pretty depressing, but it also summarizes my feelings well. Regardless of the merits of these AI systems (and I have a sneaking suspicion that the bubble’s going to pop sooner rather than later due to the intensity of the hype), CTOs and CEOs who are focused on cutting costs are going to reduce headcount as a money-saving measure, especially in industries that view software as a Cost Center. Hell, if Jensen Huang says we don’t need to train developers, we can be assured that the career is dead.

I think this is a long-term tactical mistake for a few reasons:

  1. I think a lot of the hype is smoke-and-mirrors, and there’s no guarantee that it’s going to be orders-of-magnitude better.
  2. We’ll make our developer talent pool much smaller, and have little to no environment for Juniors to learn and grow, aside from using AI assistants to do work.
  3. Once the cost of using AI tools increases, we’ll be scrambling to either rehire devs at deflated cost, or we’re going to try and wrangle less power hungry models into doing more development things.


This LLM wish fulfillment strategy is essentially “I don’t have time to crunch this data myself, can I get the AI to do it for me and extract only the most important bits?”. The shortest answer is “maybe, to some degree of accuracy”. If you feed it a document, for example, and ask it to summarize, odds are decent that it’ll give you a relatively accurate summary (because you’ve increased the odds that it’ll produce the tokens you want to see from said document) that will also contain some degree of factual error. Sometimes there will be zero factual errors. Sometimes there will be many. Whether those are important or not depends entirely on the context.

Knowing the difference would require you to read the whole document and decide for yourself. But we’re here to save time and be more productive, remember? You’re not going to do that. You’re going to trust that the LLM has accurately summarized the data in the text you’re giving it.

BTW, by itself an LLM can’t do math. OpenAI is trying to overcome this limitation by allowing it to run Python code or connect to Wolfram Alpha but there are still some interesting quirks.

So, you trust that info, and you take it to a board presentation. You’re showcasing your summarized data and it clearly shows that your Star Wars Action Figure sales have skyrocketed. Problem is you’re an oil and gas company and you do not sell Star Wars action figures. Next thing you know, you’re looking like an idiot in front of the board of directors and they’re asking for your resignation. Or, worse, the Judge is asking you to produce the case law your LLM fabricated, and now you’re being disbarred. Neat!

Remember, the making shit up is a feature, not a bug.

But wait! We can technology our way out of this problem! We’ll have the LLM search its dataset to fact check itself! Dear reader, this is Retrieval Augmented Generation (RAG). Based on nothing but my own observations, the most common technique I’ve seen for this is doing a search first for the prompt, injecting those results into the context window, and then having it cite its sources. That can increase the accuracy by nudging the statistics in the right direction by giving it more text. Problem is, it doesn’t always work. Sometimes it’ll still cite fake resources. You can pile more and more stuff on top (like doing another check to see if the text from the source appears in the summary) in an ever-increasing race to keep the LLM honest but ultimately:
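That retrieve-then-inject flow can be sketched as follows. The `search` function here is a hypothetical stand-in for a real search index, and no actual LLM is called; the point is only the prompt assembly, where retrieved text and source ids get stuffed into the context window alongside a “cite your sources” instruction.

```python
def search(query):
    # Stand-in retrieval: a real system would query a search index here.
    corpus = {
        "doc-1": "Cats are obligate carnivores and skilled hunters.",
        "doc-2": "Star Wars action figures were first sold in 1978.",
    }
    return [(doc_id, text) for doc_id, text in corpus.items()
            if any(word in text.lower() for word in query.lower().split())]

def build_rag_prompt(question):
    # Retrieve first, then inject the results into the context window
    # and instruct the model to cite the injected sources.
    hits = search(question)
    sources = "\n".join(f"[{doc_id}] {text}" for doc_id, text in hits)
    return (
        "Answer using ONLY the sources below, citing their ids.\n"
        f"Sources:\n{sources}\n\nQuestion: {question}"
    )

prompt = build_rag_prompt("what do cats hunt")
print(prompt)
```

Note that nothing in this pipeline *forces* the model to stick to the injected sources; it just nudges the statistics, which is why fake citations still slip through.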

LLMs have no connection to “truth” or “fact” - all the text they generate is functionally equivalent, based on statistics.

RAG and Semantic Search are related concepts - you might use a semantic search engine (which attempts to search on what the user meant, not necessarily what they asked) to retrieve the documents you inject into the system.

The other technique we really need to talk about briefly is Reinforcement Learning from Human Feedback (RLHF). This is “we have the algorithm produce a thing, have a human rate it, and then use that human feedback to retrain / refine the model”.

Two major problems with this:

  1. It only works on the topics you decide to do it on, namely “stuff that winds up in the news and we pinky swear to ‘fix’ it”.
  2. It’s done by an army of underpaid contractors.

You’d be surprised just how much of our AI infrastructure is actually Mechanical Turks. Take Amazon’s Just Walk Out, for example.

What we wind up doing is just making the toolchain ever more complicated trying to get spicy autocomplete to stop making up “facts”, and it might have just not been worth the effort in the first place.

But that’s harder to do these days because:

Google’s been fighting a losing battle against “SEO Optimized” garbage sites for well over a decade at this point. Trying to get relevant search results amidst the detritus and paid search results has gotten harder over time. So, some companies have thought “hey! Generative AI can help with this - just ask the bot (see point #6) your question and it’ll give you the information directly”.

Cool, well, this has a couple of direct impacts, even if it works. Remember those hallucinations? They tend to sneak in places where they’re hard to notice, and its corpus of data is really skewed towards English language results. So, still potentially disconnected from reality (but usually augmented via RAG), but how would you know? It’s replaced your search engine - so are you going to now take the extra time to go to the primary source? Nah.

Buuuut, because Generative AI can generate even more of this SEO garbage at a record pace (usually in an effort to get ad revenue) we’re going to see more and more of the garbage web showing up in search. What happens if we’re using RAG on the general internet? Well, it’s an Ouroboros of garbage, or, as some folks theorize, Model Collapse.

The other issue is that if people just take the results the chat bot gives them and do not visit those primary sources, ad revenue and traffic to the primary sources will go down. This disincentivizes those sources from writing more content. The Generative AI needs content to live. Maybe it’ll starve itself. I dunno.

But it’ll help me elevate my writing and be a really good author right?


I’ve been too cynical this whole time. I’m going to give this one a “maybe”. If you’re using it to augment your own writing - having it rephrase certain passages, or call out where there are grammar mistakes, or any of that kind of idea - more power to you.

I don’t do any of that for two reasons, one is practical, the other highlights where I think there’s an ethical line:

  1. I’m not comfortable having a computer wholesale rewrite what I’ve done. I’d rather be shown places that can improve, see some other examples, and then rewrite it myself.
  2. There’s a pretty good chance that the content it regurgitates is copyrighted, and we’re still years out from knowing the legal precedent.

The AI industry has come up with a nice word for “model regurgitates the training data verbatim”. Where we might call it “plagiarism” they call it “overfitting”.

Look, I don’t want to be a moral purist here, but my preferred workflow is to write the thing, do an editing pass myself, and then toss the whole thing into a grammar checker because my stupid brain freaking loves commas. Like, really, really, loves them. Comma.

I do this with a particular tool: Pro Writing Aid. It’s got a bunch of nice reports which will do things like “highlight every phrase I’ve repeated in this piece” so that I can see them and then decide what to do with them. Same deal with the grammar. I ignore its suggestions frequently because if I don’t, the piece will lose my “voice” - and you’ll be able to tell.

They, like everyone else, have started injecting Gen AI stuff into their product, but for me it’s been absolutely useless. The rephrase feature hits the same bad points I mentioned earlier. They’ve also got a “critique” function which always issues the same tired platitudes (gotta try it to understand it, folks).

This raises another interesting point about the people investing heavily in Generative AI. One of those companies is Microsoft. A company who makes a word processor. The parent of Clippy themselves. They could have integrated better grammar tools into their product. They could have invested more in “please show me all the places where I repeated the word ‘bagel’”. They didn’t do this.

That makes me think they didn’t see the business case in “writing assistants” - and it’s why Clippy died a slow death.

Suddenly, though, they have a thing that can approximate human writing, and suddenly there’s a case and a demand for “let this thing help you write”. I feel like they’re grasping at use cases here. We stumbled upon this thing, it’s definitely the “future”, but we don’t… quite… know… how.

I want to take a second here to talk about a lot of what I’m seeing in the business world’s potential use cases. “Use this to summarize meetings!” or “Use this to write a long email from short content” or “Here, help make a presentation”.

After all, one third of meetings are pointless and could be an email! I want to also contend that many emails are pointless.

Essentially what you’re seeing is a “hey, busywork sucks, let’s automate the busywork”. Instead of doing that, why not just…not do the busywork? If you can’t be bothered to write the thing, does it actually have any value?

I’m not talking about documentation, which is often very important (and should be curated rather than generated), but all those little things that you didn’t really need to say.

If you’re going to type a bulleted list into an LLM to generate an email, and the person on the other end is going to just use an LLM to summarize, lossily, I might add, why didn’t you just send the bulleted list?

You’re making more work for yourself. Just… don’t do that?

Let’s give it the ability to make API Calls


Right, so one of the fun things OpenAI has done for some of their GPT-4 products is to give it the ability to make function calls, so that you can have it do things like:

  • Book a flight
  • Ask what the next Guild Wars 2 World Boss is
  • Call your coffee maker and make it start
  • Get the latest news
  • Tie your shoes (not really)

And so on. Anything you can make a function call out to, you can have the LLM do!

It does this by being fed a function signature, so it “knows” how to structure the function call, and then runs it through an interpreter to actually make the call (cause that seems safe).

Here’s the…minor problem. It can still hallucinate when it makes that API call. So, say you have a function that looks like this: buyMeAFlight(destination, maxBudget) and you say to the chatbot “Hey, buy me a flight to Rio under $200”. What the LLM might do is this: buyMeAFlight("Rio de Janeiro", 20000). Congrats, unless you have it confirm what you’re doing you just bought a flight that’s well over your budget.
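One mitigation (beyond the prompting tricks below) is plain old deterministic validation: check the arguments the model produced against what the user actually said before executing anything. A sketch, with `buy_me_a_flight` as a hypothetical booking function standing in for the real API call:

```python
def buy_me_a_flight(destination, max_budget):
    # Hypothetical booking API; a real one would charge your card.
    return f"Booked a flight to {destination} under ${max_budget}"

def guarded_call(tool_args, user_stated_budget):
    # The LLM produced tool_args; never trust them blindly.
    if tool_args["max_budget"] != user_stated_budget:
        raise ValueError(
            f"Model asked to spend ${tool_args['max_budget']}, "
            f"but the user said ${user_stated_budget}; refusing."
        )
    return buy_me_a_flight(
        destination=tool_args["destination"],
        max_budget=tool_args["max_budget"],
    )

# The user said "$200", but the model hallucinated 20000.
hallucinated = {"destination": "Rio de Janeiro", "max_budget": 20000}
try:
    guarded_call(hallucinated, user_stated_budget=200)
except ValueError as err:
    print(err)
```

Of course, writing that guard means parsing what the user said deterministically anyway, at which point you’re most of the way back to a plain web form.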

Now, like all other Generative AI things, there are techniques you can use to increase the accuracy. Making just the perfect prompt, having it repeat output back to you, asking “are you sure”, telling it that it’s a character on star trek. You know, normal stuff.

Alternatively you could just... use something deterministic, like, I don’t know, a web form or any of the existing chat agent software we already had.

Sidebar: Apparently OpenAI has introduced a “deterministic” mode in beta, where you provide a seed to the conversation to get it to reliably reproduce the same text every time. Are you convinced this is a random number generator yet?

And on that note, let’s talk about our obsession with chatbots.

So the “killer” application we’ve come up with, over and over, is “let’s type our question in natural language and it does a thing.” I honestly don’t understand this on a personal level - because I don’t really like talking to chatbots. I don’t want to say “Please book me a flight on Friday to New York” and then forget about it. I want to have control over when I’m going to fly.

Do large swaths of people want executive assistants to do important things like cross-country travel?

Not coincidentally, I really struggle with that kind of delegation and have never really made use of an executive assistant personally.

We’ve decided that the best interface for doing work is “ask the chatbot to do things for you” in the agent format. This is exactly the premise of the Rabbit R1 and the Humane Ai Pin. Why use your phone when you can shout into a thing strapped to you and it’ll do…whatever you ask. Perhaps it’ll shout trivia answers at you.

But guess what, my phone can already do that. Siri’s existed for years and like, I hardly use it. It’s not because it’s not useful. It’s because I can do what I want without shouting at it. In public. For some reason.

We do need to talk about accessibility. One of the things that AI agents would be legitimately useful for is for those folks who cannot access interfaces normally whether that’s situationally (driving a car), or temporarily / permanently (blind, disabled).

If we can use LLMs to get better accessibility tech that is reliable, I’m all for it. Problem is that the companies pushing the technology have a mixed track record on doing accessibility work, and I’m concerned that we’ve decided that LLMs being able to generate text means we can abdicate responsibility for doing actual accessibility work.

Like many other things in the space, we’ve decided that “AI” is magic, and will make things accessible without having to do the work. I mean, no. That’s not how it works.

Remember back to the beginning of this article where I talked about other Machine Learning Models? I think that’s the space where we’re going to make more accessibility advances, like the Atom Limb which uses a non-generative model to interpret individual muscle signals.

Still with me?

If I had to summarize my thoughts on all of the above, it's that we've stumbled upon something really cool: we've got an algorithm that can create convincing-looking text.

The companies that have the resources to push this tech seem to be scrambling for the killer use-case. Many companies are clamoring for things that let them reduce labor costs. Those two things are going to result in bad outcomes for everyone.

I don’t think there’s a silver bullet use case here. There are better tools already for every use case I’ve seen put forward (with some minor exceptions), but we’re shoving LLMs into everything because that’s where the money is. We’re chasing a super-intelligent god that can “solve the climate crisis for us” by making the climate crisis worse in the meantime.

If you were holding NVDA stock, something something TO THE MOON. They’ve been making bank off of every bubble that needs GPUs to function.

This feels exactly like the Blockchain and Web3 bubbles. Lots of hype, not a lot of substance. We’re tying ourselves in knots to get it to not “hallucinate”, but like I’ve repeated over and over again in this piece the bullshit is a feature, not a bug. I recommend reading this piece by Cory Doctorow: What Kind of Bubble is AI? It’ll give you warm fuzzies. But it won’t.

Midjourney, Sora, all those things that can fake voices and make music. We've got a big category of things that are, charitably, “art generators”, but more realistically “plagiarism engines”.

This section is going to be a lot shorter. Let me summarize my feelings:

  • If you’re using one of these things for personal reasons, making character art for your home D&D game, or other things that you’re not trying to profit from - go for it. I don’t care. I’d rather you not give these companies money but I don’t have moral authority here.
    • I’ve used it for this too! I’m not exempt from this statement.
  • If you’re using AI “art” in a commercial product, you don’t have an ethical defense here (but we’ll talk about business risk in a sec). The majority of these models were trained on copyrighted content without consent and the humans who put the work in are not compensated for it.

I personally don’t find the existing AI creations all that inspiring, beyond how neat it is that we’ve gotten a neural network to approximate the images in its training set. Some of the things it spits out are “cool” and “workable”, but I just don’t like it.

Hey, I do empathize with the diffusion models a bit though. Hands are hard.

As I mentioned earlier in the post, as far as we can tell, the art diffusion models were trained on publicly viewable, but still copyrighted content.

If for some reason you’re a business and you’re reading this post: that’s a lot of business risk you’d be shouldering. There are multiple lawsuits in flight right now, many of them on different legal theories, and we don’t actually know how they’ll go. Relatedly, AI art is not copyrightable, so that’s…probably a problem for your business, especially if you’re making a book or other art-heavy product. The best you can do is treat it like stock art: you don’t own exclusive rights to it, and you’re hoping you don’t get slapped with liability in the future.

So, if you’re using an AI Art model in your commercial work, these are all things you have to worry about.

This is where, and I cannot believe I am saying this, I think Adobe is playing it smart. They’ve trained Firefly on art they’ve licensed from their Adobe Stock platform and are (marginally) compensating artists for the privilege. They have also gone so far as to guarantee legal assistance to enterprise customers. If you’re a risk-averse business, that’s a pretty sweet deal (and less ethically concerning, though the artists are getting pennies).

The rest of them? You’re carrying that risk on your business.

But what if your business happens to be “crime”?

AI companies seem hell bent on both automating the act of creation (devaluing artistry and creativity in the process) and also making it startlingly easy for fraudsters to do their thing.

Lemme just… link some articles.

So, we’ve created things that can A) clone a person’s voice, B) imitate their likeness, and C) make them say whatever you want.


I dunno about y’all, but I’m okay not practicing Digital Necromancy (just regular, artisanal necromancy).

The commercial businesses for these categories of generative AI are flat-out fraud engines. OF COURSE criminals are going to use this to defraud people and influence elections. You’ve made their lives so much easier.

Hey, I guess we can take solace in the fact that the fraud mills can do this with fewer employees now. Neat.

But Netflix Canceled My Favorite Show and I Want to Revive It

This is also known as the “democratizing art” argument. First thing I would like to point out is that art is already democratized? It’s a skill. That you can learn. All you need to do is put in the time. It’s not a mystical talent that only a select few possess.

Artists are not gurus who live in the woods and produce art from nothing, and the rest of us are mere drones who are incapable of making art.

So in this case “democratization” really means “can make things without putting in the effort”. The result of that winds up being about as tepid as you might imagine.

Now, there’s a question in there: if a person has to work all the time simply to live, won’t this enable them to “make art”? There’s another way to fix that, by reducing the amount of work they need to do to simply exist, but nah, we’re gonna automate the fun parts.

Hey, awesome artists who are making things with AI tools as a process augmenter: all good. I’m not talking to you. I’m talking to the Willy Wonka Fraud Experience “entrepreneurs”.

But you know what, I’m not even really that concerned with people who want to make stuff on their own that is for their own enjoyment. I don’t think the result is going to be very good, and I’d rather have more people creating good stuff than fewer, but hey more power to ya.

Another aside: I really do not want to verbally talk to NPCs in games. I play single-player games to not talk to people. I don’t want to be subjected to that in the name of “more realistic background dialog”.

It’s just not going to work out like you think it will. What’ll actually happen is:

The primary thing I suspect will happen with AI “art” comes back to the cost-cutting efforts. Where you might have used stock art before, or a junior artist, you’re going to replace that with Dall-E.

For marketing efforts, that’s not an immediately worrying impact. Marketing content is designed to be churned out quickly and shotgunned into people’s feeds in an effort to get you to buy something or feel some way. I don’t think those campaigns will be among the most effective ever made, but eh, we’ll see I guess.

The most concerning uses are going to be the media companies that are going to replace assets in video games and movies. Fewer employees, lower budgets, and … dare I say … lower quality.

You see, diffusion models don’t yet let you tweak the details of their output (although, who knows, maybe if we start using deterministic “seeds” again we’ll get somewhere with how Sora functions). They also have a propensity to not give you what you asked for, so, yeah, let’s spend billions of dollars trying to fix that.
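To be clear about what I mean by deterministic “seeds”: samplers start from pseudo-random noise, and if you pin the seed, the same prompt gives you the same output back, which is what would let you iterate on a result instead of rerolling the dice every time. Here’s a toy sketch of the idea (the `generate` function and its four-number “image” are made up for illustration, not any real diffusion API):

```python
import random
import zlib

def generate(prompt: str, seed: int) -> list[float]:
    # Toy stand-in for a diffusion sampler: the "image" is just
    # four pseudo-random floats, but deriving the RNG state from
    # (prompt, seed) makes the output fully reproducible.
    rng = random.Random(zlib.crc32(prompt.encode()) ^ seed)
    return [round(rng.random(), 4) for _ in range(4)]

a = generate("a cat wearing a hat", seed=42)
b = generate("a cat wearing a hat", seed=42)
c = generate("a cat wearing a hat", seed=7)
print(a == b)  # True: same prompt + same seed, same "image"
print(a == c)  # False: changing the seed rerolls everything
```

Real systems that expose a seed parameter work on the same principle, which is why losing it (or hiding it, as some hosted tools do) makes controlled tweaking so hard.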

So, at risk of trying to predict the future (which I’m definitely bad at), I think we’re going to gut a swath of creatives, devalue their work, and then realize that “oh, no one wants this generative crap”. We’ll rehire the artists at lower rates and we’ll consolidate capital into the hands of a few people.

Meanwhile, we’ve eliminated the entry-level positions where people would traditionally learn those skills, so there won’t be a path into the career at all.


We live in a society that requires money to function. Most of us sell our labor for the money we need to live.

The goal of companies is to replace as many of their workers as they can with these flawed AI tools, especially in an environment where VC money needs to make a return on investment now that the Zero Interest Rate Policy (ZIRP) era is finished.

Now, “every time technology has displaced jobs, we’ve made more jobs” is the common adage. And, generally, that’s true. However, the main fear here isn’t that we won’t be working; it’s that this will depress wages and increase income inequality. CEO compensation relative to the average salary has increased by 1,460% since the 1970s, after all.

What I think is different about this compared to previous technological advances (though hey, the Luddites were facing similar social problems) is that the capital is concentrated in the hands of a few companies, and only a very few of them have the resources to control this new technology. I don’t think they’re altruists.

This is not a post-scarcity Star Trek future we’re living in. I wish we were. I’m sorry.

Right. Uh. I’m not sure I can, but here are some things I’d like to see:

  • There are a number of really small models that can run on commodity hardware, and with enough tuning you can get results comparable to some of the much larger models. Those don’t require an ocean of water to train or use, and they run locally.
  • We’re going to see more AI chips; that’s inevitable. The non-generative models are going to benefit from that too, and there’s a lot of interesting work happening out there.
    • I’m also pretty cool with DLSS and RSR for upscaling video game graphics for lower-powered hardware. That’s great.

I honestly hope I’m wrong and that the fantastical claims about AI solving climate change are real… but the odds of that are really bad.

This is the longest post I think I’ve ever written; we’re well over 7,000 words. I have so many thoughts it’s hard to make them coherent.

Perhaps unsurprisingly, I’ve not used Generative AI (or AI of any kind) for this post. I’ve barely even edited it. You’re getting the entire stream of consciousness word vomit from me.

Call me a doomer if you want, but hey, if you do, I’ve got a Blockchain to sell you.