Are we making ChatGPT Dumber?

As we continue to help train ChatGPT's models, are the results getting worse?

October 07, 2023 • Estimated Reading Time: 9 minutes

Photo of a metallic robot with rusted patches, tangled wires hanging loosely, and a screen displaying a puzzled expression amidst a cluttered workshop.

[The image above is generated by Midjourney. DALL-E 3 is newly available, and I am giving it a whirl. The prompt I used to create the image is listed at the end of this email. Look forward to tips on using DALL-E 3 next week.]

In recent weeks and months, the answers I have been getting from ChatGPT seem worse than a few months ago. It made me wonder, is this true as it’s extremely hard to measure, or am I just using the

In July, a group of Stanford and Berkeley researchers published a study titled "How Is ChatGPT’s Behavior Changing over Time?" that delved into the capabilities of GPT-4, a successor of the popular GPT-3 model, and its proficiency in mathematical reasoning. The intriguing findings raised the question: Are we making ChatGPT dumber?

The Evolution of ChatGPT

From the first time I used ChatGPT, I’ve been impressed by its ability to generate reasonably good text. Sometimes, the results have even been exceptional. As time passed, I expected each version to be smarter, faster, and more accurate. But is that the reality?

To examine the consistency of ChatGPT’s underlying GPT-3.5 and GPT-4 models, researchers from Stanford and Berkeley tested the AI’s tendency to “drift,” i.e., offer answers with varying levels of quality and accuracy, and its ability to follow commands properly. Researchers asked GPT-3.5 and GPT-4 to solve math problems, answer sensitive and dangerous questions, visually reason from prompts, and generate code.

The Study's Findings

The study presented a series of mathematical queries to GPT-4, specifically focusing on prime number identification and happy numbers. The results were a mixed bag. While GPT-4's March version seemed adept at identifying prime numbers, it faltered in some arithmetic calculations. Moreover, when tasked with counting happy numbers within smaller intervals, the June version of GPT-4 frequently responded with a singular happy number, regardless of the query's specifics.

A happy number is a number that eventually reaches 1 when it is repeatedly replaced by the sum of the squares of its digits. If the process instead falls into an endless cycle of numbers that includes 4, the number is called an unhappy number.
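
To make the definition concrete, here is a minimal Python sketch of a happy-number check, plus a helper that counts happy numbers in an interval, which is the kind of question described above. The function names are my own for illustration and are not taken from the study.

```python
def digit_square_sum(n: int) -> int:
    """Return the sum of the squares of the decimal digits of n."""
    return sum(int(digit) ** 2 for digit in str(n))


def is_happy(n: int) -> bool:
    """Return True if repeatedly applying digit_square_sum to n reaches 1."""
    if n < 1:
        raise ValueError("happy numbers are defined for positive integers")
    seen = set()
    # Stop when we reach 1 (happy) or revisit a value (stuck in a cycle).
    while n != 1 and n not in seen:
        seen.add(n)
        n = digit_square_sum(n)
    return n == 1


def count_happy(start: int, end: int) -> int:
    """Count the happy numbers in the inclusive interval [start, end]."""
    return sum(1 for n in range(start, end + 1) if is_happy(n))
```

For example, is_happy(19) returns True because 19 → 82 → 68 → 100 → 1, while is_happy(4) returns False because 4 falls into the repeating cycle.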

What Does This Mean?

These findings suggest that while GPT-4 has made strides in certain areas, there are still gaps in its learning. The discrepancies between the March and June versions indicate that updates to the model don't necessarily equate to improvements in all areas. It's akin to human learning; sometimes, when we cram too much information into our brains, other details get pushed out or muddled.

The fluctuating performance of ChatGPT is a subject of debate, with various hypotheses put forward. Some believe OpenAI might deliberately reduce the model's capabilities to conserve computational power. The resources needed to run AI models can escalate as the models evolve, and OpenAI might aim to control these expenses by purposefully scaling back what the model does.

Another perspective suggests that as the model evolves, its complexity increases. This heightened intricacy can pose challenges in training and optimizing the model. Consequently, the model's effectiveness might wane as time progresses.

Early on, ChatGPT was trained on data from the internet, data scientists, and labeled data, which provided a good foundation. Today, we are likely throwing terabytes of data at the model all day, and our collective answers, opinions, and information, whether wrong or right, are filling up ChatGPT’s “brain.” Is it possible that we are overwhelming the platform?

The Bigger Picture

It's essential to understand that machine learning models, including ChatGPT, are only as good as the data they're trained on. If the training data is biased, inaccurate, or incomplete, the model's output will reflect those shortcomings. The "dumbing down" of ChatGPT might not result from the model's inherent flaws but could stem from the quality and breadth of its training data.

In Conclusion

ChatGPT will continue to evolve; there are likely many ways that the company behind ChatGPT will adapt to our usage. As researchers and developers continue to refine and train these models, we can expect both breakthroughs and setbacks. I don’t know if ChatGPT is getting dumber or if we are collectively training it in a way we didn’t expect. I am eager to see how things look in another few months.

Check out this edition of the Artificially Intelligent Enterprise to get tips on how to set custom instructions in ChatGPT. It is a very rudimentary way to train these models ourselves or, more accurately, fine-tune them.

Tip of the Week: How to create an AI-generated image from a piece of stock art

Have you ever found an image you really like, but it lacks a little something? Or maybe you want a unique image rather than one that a competitor or someone else is already using. This is how I use Midjourney to do that for everything from blog images to artwork for presentations.

To get started, find an image you like, and then we’ll work on uploading it to Midjourney.

Web-Integrated Image Uploads for Midjourney

Midjourney offers a seamless experience by supporting image uploads via web URLs. Just paste the link, and you're good to go. However, make sure the link ends in a recognized image extension like .jpg, .png, .webp, or .gif.
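
If you want to sanity-check a link before pasting it, a few lines of Python will do. This is only a sketch: the function name is hypothetical, and the extension list is just the four formats mentioned above, which may not cover everything Midjourney accepts.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Assumption: only the four extensions named in this article.
ALLOWED_EXTENSIONS = {".jpg", ".png", ".webp", ".gif"}


def has_image_extension(url: str) -> bool:
    """Return True if the URL's path ends in one of the allowed extensions."""
    # urlparse separates the path from any query string or fragment,
    # so only the path portion of the link is inspected.
    path = urlparse(url).path
    return PurePosixPath(path).suffix.lower() in ALLOWED_EXTENSIONS
```

Calling has_image_extension("https://example.com/art/robot.PNG") returns True, while a link to a gallery page with no file extension returns False.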

Drag your desired image file into the designated Midjourney channel.
Hit 'Enter' and watch the magic unfold.
Once uploaded, click on the image for a closer look.
Right-click and choose “Copy Link” to prepare your image URL as a Midjourney prompt.

Or you can use the /describe command with the image option, which opens an upload dialog in the chat. Here’s an example of the results:

Creating a new image from the Description

You have two choices here. First, you can hit the Imagine All button, and Midjourney will create new images from all four suggested prompts. Or you can iterate and write your own prompt using details pulled from the /describe output. If you hit Imagine All, you will get four images, each with four variants.

Reroll: The Midjourney Equivalent of Regenerate

Not every result will match your vision. That's where rerolling steps in, offering a fresh take. When you get a resulting image or group of images, you can reroll to have Midjourney automatically regenerate them. For more control, enable Remix Mode by typing /prefer remix in any Midjourney Discord chat. With Remix Mode on, you can tweak prompts, dimensions, and more to craft a fresh masterpiece.

In Conclusion

Using Midjourney to generate images in a style you like is probably better than searching the internet for that perfect image. It provides a quick and easy way to create unique images in a fraction of the time for your websites, PowerPoint presentations, and other business materials.

What I Read this Week

How To Use AI To Brainstorm A Billion-Dollar Business Idea - Crunchbase
AI Detection Startups Say Amazon Could Flag AI Books. It Doesn't - Amazon
Jasper, an Early Generative AI Winner, Cuts Internal Valuation as Growth Slows - The Information
Generative AI exists because of the transformer - The Financial Times
Six Months Ago Elon Musk Called for a Pause on AI. Instead Development Sped Up - Wired

AI Tools I am Evaluating

Creative Reality™ Studio - Use generative AI to create future-facing videos
Collato - Find, summarize, and generate new content based on your own product knowledge. Save hours in manual work so you can return to the parts of your job you love.
AI Prompt Finder - The AI Prompt Finder application is a platform where you can publish prompts that you can use for any artificial intelligence tool.

DALL-E 3 Prompt for Header Image

For every issue of the Artificially Intelligent Enterprise, I include the Midjourney prompt I used to create the header image for that edition. In this edition, I am playing with DALL-E 3. I included the prompt and am showing the comparison image from Midjourney below.

A dumb robot that looks disheveled and confused --s 1000 --ar 16:9
