OpenAI just announced its newest model: ChatGPT o1. In their own words, they've trained these models to spend more time thinking through problems before they respond, much like a person would. Through training, they learn to refine their thinking process, try different strategies, and recognize their mistakes.
I had no choice but to try the riddle from my previous post on this new reasoning model. I do not want to call it an LLM because we do not yet know much about its architecture, and it might be something different.
The results were quite good. I used the same modified farmer-crossing-a-river riddle that I presented last time to four LLMs, none of which was able to solve it. Some produced hilarious results, including creating new sheep to replace the one they had killed.
Here is the original prompt: A farmer wants to cross a river and take with him a wolf, a goat, and a cabbage. There is a boat that can fit him plus two items. If the wolf and the goat are alone on one shore, the wolf will eat the goat. If the goat and the cabbage are alone on the shore, the goat will eat the cabbage. How can the farmer bring the wolf, the goat, and the cabbage across the river?
The reason the previous models were not able to solve the modified version of the riddle is that they had never seen it before. Their token-weighting system could not find a proper answer in what they had already seen, and they could not really reason. They were like the student who has learned the material by heart and, when asked for something novel, cannot figure it out. The ChatGPT o1 model was able to analyze this version of the riddle and propose two possible solutions.
To safely bring the wolf, the goat, and the cabbage across the river without any of them being eaten, the farmer can follow these steps: Option 1:
First Trip: The farmer takes the wolf and the cabbage across the river, leaving the goat alone on the original shore. This is safe because the goat won't eat anything, and there's no risk on either shore.
Second Trip: The farmer returns alone to the original shore to get the goat.
Third Trip: The farmer takes the goat across the river. Now, all three items—the wolf, the goat, and the cabbage—are safely on the other side. At no point are the wolf and the goat left alone together, nor are the goat and the cabbage.
Option 2:
What is new about this ChatGPT o1 model is that it will explain its reasoning. This is not only interesting to analyze but also answers the questions of LLM users who want to know how and why the model comes up with an answer. Here is the reasoning process:
Reasoning the mapping out of the river crossing Let’s explore the river crossing puzzle, where the boat carries the farmer and two items. The goal is to transport the wolf, goat, and cabbage without any conflicts.
Ensuring policy compliance Checking for adherence to guidelines by evaluating risks and ensuring safety factors related to the user’s request.
Thinking it through… I’m considering the farmer’s trips across the river. Transporting all three items in two trips wouldn’t work, as leaving the goat and cabbage together would lead to the goat eating the cabbage. Let’s explore alternatives.
Weighing options I’m thinking through the farmer’s strategy, starting with the wolf and cabbage. They can be left safely together, leaving the goat behind. Working through the feasibility of making multiple trips to ensure all items are transported safely. I’m evaluating different scenarios to determine how the farmer can transport all items without any being eaten. The aim is to find the optimal solution.
Charting the crossings I’m thinking through the farmer’s strategy for transporting the wolf, goat, and cabbage across the river. The sequence involves taking the wolf and cabbage together, returning for the goat, and ensuring no harm is done to any animal.
Mapping out safe crossings Considering the farmer takes the goat first, then returns alone, and finally transports the wolf and cabbage, ensuring no harm happens under his watchful eye.
Reassessing transportation I’m thinking through options to ferry the goat, cabbage, and wolf across the river without leaving conflicting pairs alone.
Reevaluating options I’m mapping out the farmer’s journey with the goat, wolf, and cabbage. Returning alone, though, seems feasible. Progressing towards a balanced solution.
Evaluating scenarios I’m working through how the farmer can successfully transfer the goat, wolf, and cabbage while keeping them safe during transport. The farmer’s presence avoids issues like the goat eating the cabbage or the wolf eating the goat.
Crafting a strategy Mapping out the farmer’s crossing, the goat and cabbage can’t be left alone together with the wolf. The simplest way involves three trips: goat, then wolf, and cabbage together.
Taking a closer look Interestingly enough, examining two options: transporting the goat first, then returning alone, or taking the wolf and cabbage, leaving them behind, and then returning for the goat. Both methods seem feasible.
Mapping out solutions I’m detailing a logical sequence to ferry the wolf, cabbage, and goat across the river with the farmer’s assistance, ensuring no item is in danger.
Ensuring safety I’m making sure each group avoids any harmful interactions. The farmer skillfully retrieves and transports each item, preventing any of them from becoming a harm to others.
Reading the analysis feels like an interview where you pose a question to the interviewee and ask them to talk through their thinking process out loud. There are several interesting moments to consider:
Understanding that there are multiple solutions. Halfway through the analysis in the section "Mapping out safe crossings," it realizes that there might be two possible answers to the question.
Interestingly, as it is reassessing the options, it throws in some text in Cyrillic. I was curious and got it translated to:
So, it was actually trying to check with other sources, something that we are told it still cannot do. A hallucination, maybe? Or a preview of what a future version will be able to do? I like the way it starts the "Taking a closer look" section. It seems to have a bit of a eureka moment when it states: "Interestingly enough, examining two options"
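The two options the model weighs can be checked by brute force. What follows is a minimal sketch, not anything the model or OpenAI produced: a breadth-first search over the states of the riddle that lists every plan with the fewest crossings. The names (shortest_plans, is_safe, cargo_options) are my own, and the sketch assumes one reading of the prompt: a shore is unsafe whenever the farmer is absent and a forbidden pair is on it, even if the third item is there too. Run as written, it should report three crossings as the minimum and include both of the model's options, alongside a few other three-crossing plans in which the farmer carries an item back.

```python
from collections import deque
from itertools import combinations

ITEMS = frozenset({"wolf", "goat", "cabbage"})
FORBIDDEN = [{"wolf", "goat"}, {"goat", "cabbage"}]  # pairs that cannot be left unattended
BOAT_CAPACITY = 2  # the modified riddle: farmer plus two items


def is_safe(bank):
    """A bank without the farmer is safe if no forbidden pair is on it (assumed rule)."""
    return not any(pair <= bank for pair in FORBIDDEN)


def cargo_options(bank):
    """Every load the farmer can take from his bank: zero up to BOAT_CAPACITY items."""
    for size in range(BOAT_CAPACITY + 1):
        for combo in combinations(sorted(bank), size):
            yield frozenset(combo)


def shortest_plans():
    """Breadth-first search that returns every plan with the fewest crossings."""
    start = (ITEMS, "start")  # (items on the starting bank, bank the farmer is on)
    queue = deque([(start, [])])
    best_depth = {start: 0}
    plans = []
    while queue:
        (left, farmer), path = queue.popleft()
        if plans and len(path) > len(plans[0]):
            break  # everything still queued is longer than the shortest plan
        if not left and farmer == "far":
            plans.append(path)
            continue
        here = left if farmer == "start" else ITEMS - left
        for cargo in cargo_options(here):
            new_left = left - cargo if farmer == "start" else left | cargo
            new_farmer = "far" if farmer == "start" else "start"
            # The bank the farmer just left is the one nobody is watching.
            unattended = new_left if new_farmer == "far" else ITEMS - new_left
            if not is_safe(unattended):
                continue
            state = (new_left, new_farmer)
            depth = len(path) + 1
            # Keep a state only if this is (one of) the fastest ways to reach it.
            if best_depth.setdefault(state, depth) == depth:
                queue.append((state, path + [cargo]))
    return plans


def describe(plan):
    """Render a plan as readable text, one entry per crossing."""
    return " | ".join(" + ".join(sorted(cargo)) or "alone" for cargo in plan)


for plan in shortest_plans():
    print(describe(plan))
```

Each printed line is one plan, with crossings separated by a vertical bar; odd-numbered crossings go to the far shore and even-numbered ones come back. Setting BOAT_CAPACITY to 1 turns the search back into the classic version of the riddle.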
DALL-E added a fun note when asked to create a representation of the proper solution. First, it started to tell me that the solution was wrong, and it pointed me to the answer for the well-known case where the farmer can only take one item across. When I asked it to stop and create what I wanted, I got this:
It is clear that DALL-E has not yet been augmented with the capabilities of ChatGPT o1, and it is interesting to see how it even tries to be politically correct by giving the farmer different skin colors. How the two wolves ended up in the picture is a riddle of its own.
I am pleasantly surprised by the progress that the folks at OpenAI have made with their newer model, and I am already planning the next riddles to throw at it. Coming up next is a riddle created by the ancient Romans for the practice of Roman law...