
AI Obesity: Are We Losing Our Critical Thinking Skills?

In his book, “Irreplaceable: The Art of Standing Out in the Age of Artificial Intelligence”, Pascal Bornet introduces the concept of “AI Obesity”.

This term refers to our over-dependence on AI — just as we can become obese from overconsumption of fast food, we can become mentally obese from over-dependence on AI.

He states that the main risk of this over-consumption is that we might lose our critical thinking if we become addicted to fast creativity and fast decisions.

Andrea Rosales, Lead Data Scientist

With a PhD in Computer Science, Andrea Rosales specialises in domain adaptation, transfer learning, continual learning, and generative AI. Andrea is passionate about developing innovative data science models that deliver impactful solutions. She has a proven track record of creating novel deep-learning models to address real-world problems in both industry and academia, and she is recognised as a Global UK Talent.

What is critical thinking?

“Critical thinking is the art of making clear, reasoned judgements based on interpreting, understanding, applying and synthesising evidence gathered from observation, reading and experimentation.” (Burns, T., & Sinfield, S., 2016)

Being critical means analysing evidence from different sources and making reasoned conclusions. When it comes to interpreting responses from an AI tool, we should be able to use our critical thinking skills to evaluate the output and decide whether or not we agree with it.

AI Obesity might make people prioritise speed and efficiency over critical thinking, making decisions without reasoning about the outputs. As Bornet questioned in his book: Are we settling for “good enough” solutions instead of striving for excellence?

The purpose of this blog is to reflect on the importance of critical thinking, but first I'll provide a recap on the history of LLMs, highlighting how language barriers have been gradually overcome while improving reasoning and incorporating elements of critical thinking. I hope you enjoy the read!

From mimicking to critical thinking

The history of language models starts in 1883 with the concept of semantics, developed by the French philologist Michel Bréal, the founder of modern semantics. He studied how languages are organised and how words are connected within a language.

Natural language processing (NLP) gained momentum after the end of World War II in 1945, when peace negotiations made it clear how valuable automatic translation between languages could be.

Active research on NLP started with machine translation projects such as the Georgetown-IBM experiment (1954). Around the same time, IBM's Arthur Samuel created a computer program that played checkers; in 1959, he developed algorithms that allowed the program to improve with experience, coining the term “machine learning”.

In 1958, Frank Rosenblatt combined Hebbian learning with Samuel's work on machine learning to create one of the first artificial neural networks (ANNs), known as the Mark I Perceptron.

ELIZA (1966) was the world's first chatbot: an early natural-language processing program that could conduct human-like conversations. ELIZA recognised simple user inputs and responded from pre-defined scripts, using pattern matching and substitution rules that gave users an illusion of understanding. A striking feature of ELIZA was that it appeared to show emotion and empathy, for example:

Conversation with ELIZA. Source: Wikipedia.
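To make the pattern-matching idea concrete, here is a minimal ELIZA-style sketch in Python. The two rules and the example input are my own illustrative assumptions, not Weizenbaum's original script.

```python
import re

# Minimal ELIZA-style responder: a couple of illustrative pattern/response
# pairs that substitute part of the user's input back into a canned reply.
RULES = [
    (re.compile(r"\bI am (.+)", re.IGNORECASE), "Why do you say you are {0}?"),
    (re.compile(r"\bI feel (.+)", re.IGNORECASE), "What makes you feel {0}?"),
]

def eliza_reply(user_input: str) -> str:
    for pattern, template in RULES:
        match = pattern.search(user_input)
        if match:
            return template.format(match.group(1))
    return "Please tell me more."  # fallback keeps the conversation going

print(eliza_reply("I am sad about my work"))
# -> Why do you say you are sad about my work?
```

Echoing fragments of the input back as questions is what gave users the impression that ELIZA understood and empathised with them.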

Let's jump to the 1990s, when text analysis and speech generation methods such as n-grams and recurrent neural networks became very popular. In 2006, Google Translate was launched as a multilingual statistical machine translation service (it moved to neural machine translation in 2016), able to translate text, documents and websites from one language to another.

In 2011, Apple's Siri became the first widely successful NLP/AI assistant. Siri's automated speech recognition module translates the user's words into digitally interpreted concepts, and the voice-command system then maps those concepts to predefined commands and performs specific actions. For example:

Siri Task 📝 — Answering Call

Siri Command 🤖 — “Hey Siri, answer the phone”

But Siri had many problems recognising and interpreting user commands, especially in the presence of accents, dialects, or noisy environments.

In 2017, the Transformer architecture was introduced by researchers at Google. This model uses self-attention mechanisms to capture dependencies and relationships within input sequences, improving on previous architectures for machine translation.
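To make the self-attention idea concrete, here is a minimal NumPy sketch of scaled dot-product attention. The tiny dimensions and random inputs are illustrative; a real Transformer adds learned query/key/value projections, multiple heads, and positional encodings.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Each position attends to every position, weighted by query/key similarity."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                    # pairwise token similarities
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ V                                 # weighted sum of value vectors

# Toy "sentence" of 3 tokens, each represented by a 4-dimensional vector.
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))
output = scaled_dot_product_attention(x, x, x)         # self-attention: Q = K = V = x
print(output.shape)  # (3, 4): one context-aware vector per token
```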

The Transformer model led to the development of pre-trained systems, such as Generative Pre-trained Transformers (GPT) and Bidirectional Encoder Representations from Transformers (BERT).

The first major breakthrough in text generation came with GPT-2 in 2019, which could generate coherent, semantically meaningful sentences. Then, in 2020, came another big jump with GPT-3, trained in much the same way as GPT-2 but with roughly two orders of magnitude (about 100x) more parameters.

In 2022, OpenAI released ChatGPT. ChatGPT significantly surpasses GPT-3 in a range of tasks, including communicating in human-like English, developing new software, and writing speeches.

Around the same time as GPT-3's release, Google released T5 (Text-to-Text Transfer Transformer). T5 is a transformer-based model that uses a text-to-text approach, where both the input and output are text strings.
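As a rough illustration of this text-to-text interface, the sketch below uses the Hugging Face transformers library with the publicly released t5-small checkpoint; the task prefix follows the convention described for T5, and the exact output may vary.

```python
# Illustrative sketch only; assumes the `transformers` library is installed
# and downloads the public "t5-small" checkpoint on first run.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Every task is phrased as text in, text out: the prefix names the task.
inputs = tokenizer("translate English to German: The house is wonderful.",
                   return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```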

Diagram of the text-to-text framework. Image from https://research.google/blog/exploring-transfer-learning-with-t5-the-text-to-text-transfer-transformer/.

OpenAI's GPT-4 marks a new milestone in the evolution of LLMs. GPT-4 is a multimodal large language model that incorporates three main capabilities: creativity, visual input, and longer texts. These capabilities allow for deeper contextual understanding and multi-step reasoning, laying the groundwork for introducing critical thinking.

OpenAI states that GPT-4 is

“More reliable, creative, and able to handle much more nuanced instructions than GPT-3.5.”

For example:

User 🙎🏻‍♂️ — “Write an essay comparing the economic policies of the UK and the US.”

LLM 🤖 — “Sure. Would you like the comparison to focus on their historical impact, theoretical differences, or real-world applications?”

Despite these capabilities, GPT-4 still makes simple mistakes and produces false statements. For instance, GPT-4 has been shown to have a good grasp of algorithms but to struggle with arithmetic and notation.

Still, GPT-4 performs poorly at critical reasoning. In his research paper, M. M. Jahani Yekta explains that this is likely due to training data that does not generally include domain logic (the thinking process leading to solutions) and to the limitations of the next-word-prediction architectural paradigm.

The importance of critical thinking skills

Within the AI world, LLMs are designed to mimic human behaviour as closely as possible, even with limited context. For example, we can ask ChatGPT to summarise a research paper. ChatGPT will proceed to summarise it without considering the knowledge level of the user or the purpose of the summary.

Now, let's consider the following prompt:

User 🙎🏻‍♂️ — “9.11 and 9.2 which is bigger?”

ChatGPT 🤖 — “9.11 is bigger than 9.2”

What's “wrong” in the response?

Context! ChatGPT answered the question without considering the user's context.
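As a small illustration (this is not how ChatGPT reasons internally), the snippet below shows how the “correct” answer flips depending on the interpretation the user has in mind: decimal numbers versus, say, software version numbers.

```python
# Interpreted as decimal numbers, 9.2 (i.e. 9.20) is the larger value.
print(9.11 > 9.2)   # False

# Interpreted as version numbers (illustrative parsing), 9.11 comes after 9.2.
def version_key(v: str) -> list[int]:
    return [int(part) for part in v.split(".")]

print(version_key("9.11") > version_key("9.2"))   # True: minor version 11 > 2
```

Without knowing which interpretation the user intends, neither answer is unambiguously right, and that is exactly the kind of context the model did not ask about.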

So, let's first see what context means in LLMs.

Screenshot from the author's ChatGPT conversation

What is a context window in LLMs?

Imagine you are asked to summarise a long book, but you can only look at a few pages at a time. This is similar to how language models process information: they have a limit on how much context they can consider before producing an output. This limit is called the context window size.

The context window is the maximum number of tokens (units of text such as words or word pieces) that the model can take into account at once; it delineates the limits within which the AI can actually use information.
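A minimal sketch of this “forgetting” behaviour, assuming a naive word-count tokenizer and an arbitrary 20-token budget (real models use proper tokenizers and far larger windows):

```python
def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: one token per whitespace-separated word.
    return len(text.split())

def fit_to_context(messages: list[str], max_tokens: int = 20) -> list[str]:
    """Keep only the most recent messages that fit inside the token budget."""
    kept, used = [], 0
    for message in reversed(messages):        # walk the history newest-first
        tokens = count_tokens(message)
        if used + tokens > max_tokens:
            break                             # everything older is "forgotten"
        kept.append(message)
        used += tokens
    return list(reversed(kept))

history = [
    "My name is Andrea and I work as a data scientist.",
    "I am writing a blog post about critical thinking.",
    "Can you suggest a title for it?",
]
print(fit_to_context(history))   # the oldest message falls outside the window
```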

Example of context window, illustrating how older data is forgotten and newer information is remembered in the interaction. Image created by author.

A well-sized context window allows LLMs to make more informed predictions and generate higher-quality text. Its size is influenced by several factors:

Model design and objectives: Models meant to analyse documents, create content, or answer questions often require a larger context window so they can process and retain as much relevant information as possible.

Increased computational resources: Larger context windows need more memory and processing power (a rough sketch of why appears after this list).

Training data: The training data used can similarly affect the context window. Models trained on large, diverse datasets will likely require larger context windows.

Performance balance: While a larger context window can improve the model's understanding and output quality, it also demands more computational power and can slow down processing. Finding the right balance between the two is vital.
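On the computational-resources point above, here is a rough, back-of-the-envelope sketch of why longer contexts get expensive: naive self-attention builds an n-by-n score matrix, so memory grows quadratically with the number of tokens. The head count and bytes-per-score below are illustrative assumptions, and real systems use optimisations (such as FlashAttention) that avoid materialising the full matrix.

```python
def attention_matrix_bytes(n_tokens: int, n_heads: int = 32, bytes_per_score: int = 2) -> int:
    """Memory needed for the raw attention score matrices of a single layer."""
    return n_heads * n_tokens * n_tokens * bytes_per_score

for n in (4_000, 32_000, 128_000):
    gib = attention_matrix_bytes(n) / 2**30
    print(f"{n:>7} tokens -> ~{gib:,.1f} GiB of attention scores per layer")
```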

Why do context windows matter in LLMs?

A context window is a critical factor in assessing the performance of an LLM. It acts as a lens through which LLMs view and interpret information. The size and effectiveness of this lens significantly impact the LLM's ability to understand and respond to language in a meaningful way.

Although context windows have come a long way, they are still nowhere near emulating how humans process context. LLMs are semantically shallow; they have difficulty navigating cultural and emotional subtext, and they have limited memory: beyond the context window, the model “forgets” previous information.

Andrew Ye et al. discuss these limitations and more in their case study, which was presented at COLM 2024 and is detailed in the next section.

Beyond the prompt: AI tools for critical thinking

The results of the case study “Language Models as Critical Thinking Tools: A Case Study of Philosophers” suggest that language models (LMs) lack a sense of selfhood and initiative. The authors argue that LMs are not good critical thinking tools for two main reasons: they are too neutral, detached, and nonjudgmental, and they are too servile, passive, and incurious.

However, LMs can act as a muse for ideas, similar to when someone facing writer's block uses a language model to generate creative story prompts. For example:

Writer 🙎🏻‍♂️ — “Suggest a sci-fi story idea involving time travel and AI.”

LM 🤖 — “A scientist discovers a time loop where AI governs the past, present, and future, but one glitch could erase all human history.”

This idea serves as a stimulus, sparking the writer's imagination to develop a unique storyline.

LMs can also assist in refining ideas. For instance, consider a student drafting an essay on climate change.

Student 🙎🏻‍♂️ — “Provide feedback about my climate change essay”

LM 🤖 — “Your argument is strong, but consider expanding on renewable energy solutions and including data to support your claims.”

With this feedback, the student develops an essay that is more impactful and evidence-based. This is where LMs come in handy, providing the initial seeds of new ideas (stimulus) and helping to work through them (refinement). But what about questioning, reorienting, analysing, and building on ideas (critical thinking)?

Some modelling challenges need to be addressed before critical thinking capabilities can be introduced in LLMs. This may require rethinking how LLMs are fine-tuned and aligned. LLMs will need to understand what is happening in the conversation, including what hasn't been said explicitly.
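While we wait for those modelling advances, one partial workaround operates at the prompting level: explicitly asking the model to question and challenge us rather than simply comply. Below is a minimal sketch, assuming the openai Python client and an illustrative model name; the system prompt wording is my own.

```python
from openai import OpenAI

client = OpenAI()  # assumes the OPENAI_API_KEY environment variable is set

# Illustrative system prompt that nudges the model away from servile compliance.
system_prompt = (
    "Before answering, ask one clarifying question about my goal and audience, "
    "point out at least one assumption in my request, and note one way my "
    "framing could be wrong."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name; substitute any chat model
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": "Summarise this research paper for me."},
    ],
)
print(response.choices[0].message.content)
```

This doesn't give the model genuine initiative or selfhood, but it at least prompts some of the questioning and reorienting the study found missing.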

Until these challenges are solved, we can rethink how we work with AI and how we avoid the risk of AI Obesity. As Bornet wrote in his book:

• We must not let AI take over all creative tasks.

• Don't rely on AI for making decisions or solving problems — this weakens our critical thinking.

Many institutions, particularly universities, are working together to develop resources that help students strengthen their critical thinking skills in the AI era. One example is Newcastle University, which developed a checklist of six questions to encourage critical thinking. I've summarised them in the diagram below.

Diagram created by the author using Wepik.

Final thoughts

When you choose to turn to AI tools at work or in your personal life, pause before signing off on their responses. Reflect on the following considerations:

• AI still cannot comprehend the subtleties and nuances of human language and context.

• The “knowledge” that generative AI tools contain rarely reflects data after a certain date, so they don't know much about recent events or sources.

• AI tools respond to prompts we craft. Making good prompts usually takes not only an understanding of how the tool works and the content we need to find but also some level of critical thinking.

• AI generates solutions based on pattern recognition and a pre-defined context window.

• The values that govern our decisions and our moral compass are absent from an AI system, leading to results that are often quite vanilla.

• While AI can be retrained on new data, it lacks human-type reasoning (updating strategies in light of new information or shifting circumstances).

• AI tools rarely report their data sources, nor do they claim to be trained on specific data. This means we often have no idea who created the original information used by the AI and cannot know whether they had the necessary skills, experience, or expertise.

AI in itself is neither good nor bad. However, when used correctly, it can enhance our productivity and make us more valuable at work and in our lives.

Don't settle for “good enough” answers: think critically and avoid falling into AI Obesity.