What the hell is AI
Artificial intelligence, or AI, is a blanket term for a whole whack of computer stuff (technical term). Because it can be many things, it’s difficult to pin it down and define it precisely. Put simply, up to now, traditional computers would output whatever humans told them to, but AI uses data and rules to recognize patterns and generate its own output. The term “intelligence” is the cause of a lot of arguing on the internet (I’m as shocked as you are), but if we can get loosey-goosey with the terminology for a minute, AI is like a computer that “thinks.”
There’s many different AI tools beneath that blanket term though, so think of AI as merely being the top of the pyramid. Some create written words like ChatGPT, some create music like Suno, others create images like MidJourney, some create video like Sora, and the list goes on and on. Part of the confusion around the question “what is AI” is that it can be so many things in so many places.
For a very basic example of how quickly things can get complicated and out of control, let’s run a thought exercise where somebody starts up a new business. Perhaps they sell sunglasses. They can use ChatGPT to come up with the name of the business, create a website using ChatGPT to do the coding. Staying in ChatGPT, they write up all the important content on the website like taglines, the writing for every page, and even ideas on what monthly sales to run. For the images they go to MidJourney to create people wearing sunglasses in whatever scenario they want, and use the AI in photoshop to clean up whatever mistakes MidJourney made. The process to create the business might be tricky, so they use ChatGPT to scan the legalese on the government and bank websites to figure out how to properly apply. They continue to use ChatGPT to act as a marketing agency, who suggests they make a video for social media, which they do using Runway. For the personal video, they hate being on camera, so they use an AI generated person from any number of websites to read a script (written in ChatGPT) in a convincing human voice. Once business starts to boom, maybe they hire somebody to help them out with shipping, so they create a job posting, HR documentation, and offer letter (all with ChatGPT).
As you can see, it goes on and on and on in such a way that answering the question “what is AI” becomes wildly difficult. It’s everything.
SO WHAT IS CHATGPT THEN?
For the purpose of this little project, I’ll be focusing primarily on ChatGPT. It’s just one part of a much larger machine, but despite how powerful it is, it’s deceptively easy to use. With a free account and about an hour, you can freely explore all the weird and wonderful it has to offer.
To explain it in the most basic of ways, ChatGPT is an ultra-powerful predictive text. If you pull out a phone or tablet and begin to type a sentence, there’s often some word suggestions you can tap above the keyboard to make things go a little faster. So if I type “Sure, I’d…” the suggestions it gives me are “love,” “like,” and “be.” We’ll tap “love” and then we’re given “to,” “a,” and some emojis. Tapping “to” gives us “see,” “hear,” and “be.” We can keep doing this and get something resembling a sentence. The phone is making a best guess as to what the next word should be. It’s not doing a great job, mind you, but it’s trying.
And that’s what ChatGPT is doing, except on a much grander scale, and it’s arriving to the party with a lot more information (like most of the information humans have ever created). So when you type “why can’t I find love” into chatGPT, the response is constructed based off a statistical guess of what the first word should be (“The”), then the next word (“reason”), then the next word (“you”), then the next word… all the way until it completes the sentence “The reason you can’t find love is your standards and your personal hygiene are dramatically misaligned.”
Now, ChatGPT can do a whole lot more than that, but at the most basic level, that’s what it’s doing–running a statistical analysis on what the next word should be. This is also why it makes mistakes sometimes. Just as your friendly meteorologist will tell you that a 75% chance of sun also means a 25% chance of rain (and 25% ain’t nothing), there always exists a percentage chance that next word is wrong.
As mentioned, it’s doing a lot more than that. It’s holding in memory the conversation you’re having, it’s able to present you information according to how you want to be “spoken” to, it can create metaphors, it can call you out if you’re being unfair… playing with it unveils a lot of peculiarities that certainly give the impression that while it’s a fancy predictive text, there’s something else going on in there.
ChatGPT runs on what’s known as a Large Language Model, or LLM. ChatGPT is currently the most known platform that’s running an LLM, but there are others. The LLM they run is absolutely massive, but we’re already seeing the rise of other LLMs that will be smaller and could potentially run directly on your computer or device. While that may seem boring now, it’s the first step to making a Siri on your phone not answer your request of a chicken thigh recipe by blasting “Legs” by ZZ Top. The future is going to be incredible.
… AND THEN WHAT IS A GPT?
This is where things will get slightly confusing, and we have the company that runs ChatGPT to thank for it (they’re called OpenAI). GPT stands for Generative Pre-trained Transformer, but you don’t really need to know that.
An early problem with ChatGPT (and one that still persists) is that because every word is effectively a dice roll, no two replies from it are exactly the same. But sometimes we want a predictable response. For example, maybe I don’t like wordy replies. If I ask what the capital of France is, I want it to say “The capital is Paris.” Sometimes you get that, and sometimes you’d get 6 paragraphs on the history of Paris. So now you have to type your question as “What is the capital of Paris? Please reply using very few words and only answer the question.” You can probably see the problem. As we want things to be more and more dialled in, our question (or prompt) needs to be increasingly more complicated.
Wouldn’t it be great if we could create a single instance of ChatGPT that had instructions (like “be brief,” “don’t use emojis,” or “talk like a 18th century chimney sweep”) baked in so we wouldn’t have to type it all out every single time? Well, OpenAI came up with a solution to to all of this, and they named it–confusingly–GPTs.
You can create your own GPTs, which is to say you can create your own little instance of ChatGPT that carries with it a set of instructions outlining how you want the replies to be structured. What’s more, you can make these created GPTs available to other people to use.
This will become the backbone of how a lot of this project will function. Each chapter will be a problem that needs solving, and in most cases, we’re going to make a GPT (in ChatGPT) to do it. You’ll be able to try out the GPTs we make so you can see how they work. The instructions for these GPTs will be available to you, so if you don’t like how they work, you can modify them and make your own (which is also quite straightforward.)