Muad’Dib learned rapidly because his first training was in how to learn. And the first lesson of all was the basic trust that he could learn. It’s shocking to find how many people do not believe they can learn, and how many more believe learning to be difficult. Muad’Dib knew that every experience carries its lesson. - Frank Herbert, Dune
An article published by the World Economic Forum says that we are “in the middle of a global reskilling emergency” as AI will automate some of our jobs, and the Fourth Industrial Revolution will invent new jobs. According to the article, the most important skill to acquire in the new digital age is the ability to learn. Unfortunately, according to a McKinsey report: “Few adults have been trained in the core skills and mindsets of effective learners”.
There are primarily three types of memory associated with learning: short-term, working, and long-term memory.
As the name implies, short-term memory stores a small amount of information and keeps it for a very short time (several seconds). There is no manipulation of information in short-term memory. For example, while we write a phone number, we focus on the act of writing, nothing else.
Working memory is another type of memory that holds data for a short time. Although both short-term and working memory are similar regarding the time constraints, the main difference between these types of memory is that working memory stores the data to manipulate it.
Working memory allows us to recall and use relevant information in the middle of an activity. For example, the steps required to bake a dessert involve working memory: how many ingredients have I used so far? Did I weigh everything? Is the oven on?
Thinking about what to say next in a conversation is another example of working memory.
The capacity of the working memory is limited. Some researchers believe that, on average, the working memory can hold around four thoughts or concepts (chunks in cognitive psychology). That means that some people might have working memories that could hold only two or three pieces of information, whereas others could hold five or six chunks.
Long-term memory has a much larger storage capacity than short-term and working memory; in theory, it is unlimited. Unless actively maintained and transferred to long-term memory, information in working memory will be lost.
When learning to drive a car, we engage all chunks of the working memory: where should we look? Which pedal to push? Wait, was there a road sign earlier? Oh no, a roundabout! After encoding all driving information into long-term memory, driving becomes habitual, with almost no working memory input.
We cannot change our working memory capacity, as this is a fixed characteristic, like the colour of our eyes. Instead, we can control the transfer of knowledge from working memory to long-term memory.
When learning a topic, the more information we transfer to long-term memory, the less working memory we need, freeing up the chunks of the working memory for something else.
So, how do we transfer information from short-term memory and working memory to long-term memory?
A study from 2013 conducted by John Dunlosky and collaborators analyzed over two hundred studies on revision techniques and compared the ten most commonly used study techniques. According to them, highlighting, summarization (writing summaries of various lengths about subjects to-be-learned), rereading, mnemonics (using the first letters of words within a phrase to make a name), and imagery used for text learning (mentally imagining or drawing pictures of the content we learn) have low utility compared to other study approaches.
What study techniques have moderate or high utility?
Self-explanation is a moderate utility study method. We either explain how new information is related to already known information, or we explain aspects of our thinking during learning. What parts of this page are new to us? What does the statement mean? Is there anything we still don’t understand? This method is efficient for all ages across various topics.
Elaborative interrogation is another moderate utility learning method. This method harnesses the power of our default inquisitive nature. It generates explanations for explicitly stated facts by asking why questions: why is this true? Why does x work this way? Why does it make sense that…? According to Dunlosky et al., self-explanation and elaborative interrogation are highly related but have developed independently. Because of that, researchers considered these methods separately.
Interleaving is a moderate utility study approach that involves practising different problem-solving techniques from various topics in one study session. As new problems with their specific solution methods are introduced, we should first learn and understand the new topics. Then we should interleave, or mix, practice on the new concepts with sets of exercises presented during previous lessons.
Another interleaving approach is to mix topics. Instead of separating subjects into blocks of time (one hour for algebra in which we study one strategy to solve exercises, one hour for trigonometry in which we learn a different problem-solving approach), we mix topics in one session. We examine strategies for both algebra and trigonometry exercises in the same session.
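As a rough illustration of the difference between blocked and interleaved practice, here is a minimal Python sketch. The function name interleave and the exercise labels are made up for this example, and a simple round-robin mix is only one of many ways to interleave.

```python
from itertools import chain, zip_longest


def interleave(*problem_sets):
    """Mix several problem sets round-robin, keeping each set's own order."""
    missing = object()  # placeholder for sets that run out early
    mixed = chain.from_iterable(zip_longest(*problem_sets, fillvalue=missing))
    return [problem for problem in mixed if problem is not missing]


# Hypothetical exercise labels, used only for illustration.
algebra = ["algebra 1", "algebra 2", "algebra 3"]
trigonometry = ["trig 1", "trig 2"]

blocked = algebra + trigonometry          # one topic after the other
mixed = interleave(algebra, trigonometry)  # topics alternate in one session

print(blocked)
print(mixed)
```

In the mixed session, each exercise forces us to first recognise which kind of problem we are facing, which is the point of interleaving.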
Distributed practice is a high utility study method. This method is the opposite of cramming. The focus in distributed practice is on learning less in each session but more frequently.
As we move information from working memory to long-term memory, we need to construct and periodically reinforce the neural networks related to our learning materials. Research has consistently shown that studying small chunks of information spread over time is more effective than one extensive study session. According to the mentioned study, this method works across students of different ages, with a wide variety of materials.
This method might require a study plan across long time periods so that we can effectively plan frequent study sessions. Especially with this method, we need an understanding of how procrastination and habits work. The purpose is to make distributed practice habitual, something we have to do regardless of the stories we might be compelled to tell ourselves.
Self-testing or practice testing is another high utility study technique according to Dunlosky’s research. Practice testing is a technique where we practice retrieving information we learned and evaluate ourselves on the correctness of our answers.
This method can include completing questions from textbooks or previous exams or reviewing flashcards we created through active recalling. Active recalling means actively testing our knowledge and skills.
For example, after reading a page, we can cover the page and recall the main ideas. When walking somewhere, we can use this time to recall the main ideas of the area we are trying to study.
Self-testing is incredibly efficient on its own and even more so when combined with distributed practice. We space the tests in time, and then, based on our objective opinion about our performance, we shorten or lengthen the time intervals between practice testing. We must be entirely objective and transparent in our self-evaluation.
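The idea of lengthening or shortening the gap between tests can be sketched in a few lines of Python. This is only an illustration: the function name next_interval, the doubling and halving factors, and the one-day minimum are assumptions made for this example, not values taken from the research.

```python
def next_interval(days, recalled, grow=2.0, shrink=0.5, minimum=1):
    """Return the number of days to wait before the next self-test.

    The interval grows after a successful recall and shrinks after a
    failed one. The factors are illustrative assumptions.
    """
    factor = grow if recalled else shrink
    return max(minimum, round(days * factor))


# Hypothetical outcomes of four honest self-tests on the same topic.
results = [True, True, False, True]

interval = 1
schedule = []
for recalled in results:
    interval = next_interval(interval, recalled)
    schedule.append(interval)

print(schedule)
```

The sketch is only as good as its input: if we mark a shaky answer as recalled, the next test is pushed further away than it should be.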
Practice testing is not particularly time-intensive, and we can implement it with minimal training. After all, we either understand something, or we identify some shaky areas that need further exploration.
Alternating between focused mode and diffuse mode
Professor Barbara Oakley named the two main types of networks, a highly attentive state network and a more relaxed resting-state network, focused mode and diffuse mode, respectively. The focused mode is associated with the concentrating abilities of the brain. Diffuse-mode thinking happens when we relax our concentration and let our minds wander: taking breaks, doing something that relaxes us, sleeping, etc.
Which would you think is the better learning method? Perhaps being in a focused mode? Oakley and her colleagues argue that the optimal way to learn efficiently is to alternate between focused and diffuse modes.
Going back and forth between focused and diffuse modes is a much better strategy to learn new concepts and create aha! moments. This approach explains why we have our Eureka ideas in the shower or when taking a walk, relaxing, etc.
Chunking
Remember the capacity of the working memory and how easily it can become overwhelmed when we learn something? Chunking is breaking down a massive topic into smaller and more manageable chunks.
We practice with the smaller units until we deeply understand the smaller topics. Thus, retrieving a smaller topic or chunk becomes trivial and leaves the working memory free to handle something else. Then, after we understand every individual piece, we put the smaller chunks together and master the bigger chunks.
Take, for example, learning to read. We first encode each step of the reading process as a chunk in working memory (recognising phonemes, graphemes, connecting a word to its neighbours, reasoning). With enough repetitions, reading becomes a long-term memory activity.
Chunks allow us to create a mental library of systems that we can easily retrieve and see how they fit into the current context.
Flashcards
Flashcards are cards having information on both sides. Each flashcard shows a question on one side and an answer on the other.
Anki is an application that allows for creating online flashcards, which we can use for self-testing. Anki combines spaced repetition (a technique where newly introduced and more difficult flashcards are shown more frequently, while older and less difficult flashcards are shown less frequently), active recall, and interleaving. Based on our answers to flashcards, Anki determines how the review time intervals should grow or shrink.
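To make spaced repetition more concrete, here is a minimal Python sketch of a Leitner-style box system, a classic scheme for paper flashcards. It is not Anki's actual scheduler; the names Card, review, and due_cards and the box intervals are assumptions chosen for this example.

```python
from dataclasses import dataclass

# Assumed gap in days before a card in box 0, 1, 2 or 3 is shown again.
BOX_INTERVALS = [1, 2, 4, 8]


@dataclass
class Card:
    question: str
    answer: str
    box: int = 0      # higher box = better known = longer gap
    due_day: int = 0  # day number on which the card is next due


def review(card, correct, today):
    """Promote the card one box on success, send it back to box 0 on failure."""
    if correct:
        card.box = min(card.box + 1, len(BOX_INTERVALS) - 1)
    else:
        card.box = 0
    card.due_day = today + BOX_INTERVALS[card.box]


def due_cards(deck, today):
    """Return the cards that should be reviewed today."""
    return [card for card in deck if card.due_day <= today]


# Hypothetical two-card deck.
deck = [
    Card("Italian for tomato?", "pomodoro"),
    Card("Roughly how many chunks does working memory hold?", "around four"),
]

review(deck[0], correct=True, today=0)   # known: moves to box 1, due on day 2
review(deck[1], correct=False, today=0)  # missed: stays in box 0, due on day 1

print([card.question for card in due_cards(deck, today=1)])
```

The card we missed comes back the next day, while the one we knew waits longer, which is the behaviour described above in miniature.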
I use Anki for my day-to-day job as a software engineer, where I either create flashcards for every new piece of information that I learn or review existing ones.
Pomodoro Sessions
The Pomodoro technique is a popular method of studying. It alternates 25 minutes of work (doing nothing but work; any other thoughts or activities can be dealt with after the 25-minute session) with five- or ten-minute breaks (anything not work-related). Francesco Cirillo initially used this method with a tomato-shaped kitchen timer (hence the name: pomodoro means “tomato” in Italian).
Some people might want longer pomodoro sessions, but for most of us, an interval of 20-30 minutes is the most we can stay focused on a task. Inevitably, our minds will begin to wander. Some of us might push through this, but the Pomodoro technique anticipates the diminishing returns of work (the longer we work, the less we produce). And so, instead of working past 25-30 minutes, we take a break to recharge and refresh (alternating between focused and diffuse modes).
After four pomodoro sessions, it is advised to take an extended break (15-20 minutes).
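A pomodoro timetable is easy to sketch in Python. The function name pomodoro_schedule and its defaults (25 minutes of work, five-minute short breaks, a 15-minute extended break after every fourth session) are assumptions that follow the numbers above; adjust them to taste.

```python
def pomodoro_schedule(sessions, work=25, short_break=5, long_break=15, per_round=4):
    """Return a list of (label, minutes) pairs for the given number of work sessions."""
    plan = []
    for n in range(1, sessions + 1):
        plan.append(("work", work))
        if n == sessions:
            break  # no break is scheduled after the final session
        if n % per_round == 0:
            plan.append(("long break", long_break))
        else:
            plan.append(("short break", short_break))
    return plan


plan = pomodoro_schedule(5)
for label, minutes in plan:
    print(f"{label}: {minutes} min")

print("total minutes:", sum(minutes for _, minutes in plan))
```

Five sessions with these defaults come to 155 minutes, of which only 125 are work; the rest is deliberate diffuse-mode time.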
Sleep
Sleep, an eminently diffuse mode, is crucial for our brain and especially for learning. During sleep, the brain rehearses the information we want to learn and clears less important information. This is why pulling all-nighters is not an efficient learning strategy: there is almost no chance that knowledge can be transferred from working memory to long-term memory without sleep.
In one study, researchers watched coloured dye flow through the brains of sleeping mice. They discovered that the space between brain cells increases during sleep as the brain cells shrink. This activity allows the brain to flush out toxins accumulated during the day.
Another study recorded the electrical activity and took fMRI images of adults while they slept. During non-REM sleep, oxygen-rich blood flows out of the brain while cerebrospinal fluid rolls in. That cerebrospinal fluid may help clean harmful proteins out of the brain. The video in action is here.
Anecdotally, any parent would know the milestone moments when babies learn how to roll. Babies would start rolling while they sleep during the night (and isn’t that a confusing moment when it happens for the first time), almost like they continue their rolling learning sessions from the daytime.
In the study I mentioned before, summarizing has low utility, but the researchers are careful to remark that this method can be effective if learners are skilled at summarizing. However, many learners simply write things down and reread them. How do we become skilled note-takers, then?
During live classes, we might be taking notes at the same time as listening to the instructor. That means we might not have enough working memory to process the new information. And so, it might be better to either borrow someone’s notes, take notes after classes or take light notes during lessons so that we can focus entirely on the instructor. In this case, online learning has an advantage over live classes as we can pause the videos as much as we need.
An effective note-taking strategy is the Cornell notes system that involves elaborative interrogation, self-explanation, and self-testing through active recalling.
We divide a sheet of paper into three sections. The first is a large column on the right for taking notes in class, usually the main ideas of the lecture or the text. As space is at a premium, we avoid long phrases and use symbols or abbreviations.
The second is a smaller column on the left for relevant questions or keywords (cues) to help with the review of the notes. While reviewing, we can cover the right column while we answer the questions or develop the keywords from the left column. Filling in the questions ourselves and then covering the notes while we answer them works better than passive rereading because we actively engage in retrieving information.
The third is a small section at the bottom of the page, where we summarise the concepts we are learning in a couple of sentences.
Image credit: Wikipedia
The Feynman technique was not invented by the famous physicist Richard Feynman, but it was named in his honour, as Feynman’s nickname was “the great explainer”. If we try to understand a new concept, we should aim to explain it simply. This method involves the following steps:
On a sheet of paper or a digital medium, we try to explain the new concept as if we were teaching it to someone else who doesn’t have the knowledge we assume we have (for example, a 10-year old child or a rubber duck from rubber duck debugging).
The focus is on plain and straightforward language. We review the explanation and pinpoint the weak areas where we don’t know something or the terms are too technical or complex.
Once we have identified those areas, we go back to our notes and examples. How can we simplify this concept? What analogies can we use? We try to explain the new concept again on a sheet of paper or a digital medium, emphasizing the previously discovered gaps. Then we review the explanation looking for weak areas. We repeat the process until we are satisfied with our explanation.
The memory palace method involves remembering a familiar place (the layout of our home) and using it as a visual notepad where we can put concepts dressed in visual metaphors. We simply walk through our memory palace and place our images along the way. This video with four-time USA memory champion Nelson Dellis explains how he uses the memory palace technique to memorise 10,000 digits of pi.
Romans and Greeks used this method in their discourses by extracting the key ideas of a subject, re-arranging the ideas in relation to an argument, then linking the ideas to different places in order.
Although Dunlosky and his collaborators marked mnemonics and imagery as low utility study methods separately, it seems that the combination of mnemonics and imagery (the memory palace technique) is efficient.
I haven’t tried the memory palace method to its full potential yet, but this is something I want to experiment with in the future.
Unsurprisingly, Dunlosky and his colleagues ask pertinent questions at the end of their article:
Why aren’t the best techniques for learning taught in schools? Why do generation after generation of students cram and passively reread overly highlighted books that almost resemble colouring books?
Researchers argue that one reason is that educators themselves might not be taught effective learning techniques during their educational psychology classes. Another reason might be that the learning curriculum is geared toward content rather than learning that content effectively. And so, the authors include some tips for teachers:
At the beginning of each class, teachers could give low-stakes practice tests, with feedback, on the essential ideas from the previous section.
Students should be encouraged to use practice retrieval instead of passive rereading.
Teachers could apply interleaved sets of problems so that students can identify types of problems and use the appropriate solution.
Teachers could give the students a study planner so that students can use distributed repetition instead of cramming.
When introducing key concepts, teachers should encourage students to correlate new information with what students already know.
Being aware of efficient teaching methods is only a part of the equation. We also must keep in check the emotional aspect of learning.
Sometimes, instead of studying, we procrastinate because we need a coping mechanism for negative emotions such as boredom, frustration, self-doubt, and anxiety that we experience when we associate learning with pain (learning is tedious, too difficult, too vague, or exams are too far away in the future).
Or we might find ourselves highly motivated to start learning only to stop completely after a few days. Hope is not a proper learning strategy. We should aim to have study practices so ingrained they become habits. And it takes a step, a tiny, tiny step to start and build knowledge in a kaizen fashion from chunk to chunk to chunk.
Other times, we might fall prey to illusions of competence, where we believe we have already mastered the entire topic, whereas we might have understood only the introductory parts.
Every so often, the narrator’s voice builds a reassuring story: I got this! I can learn this! I will take it step by step!
Then, the same voice haunts us: I’m not good at maths, why should I even try? It’s too late to learn now. I should have started years ago! Why bother? It’s not like I will succeed.
The mentor. The tormentor.
We must be critical about self-fulfilling prophecies because what we tell ourselves becomes our inner voice and what we believe is what we will become.
As with any technique, learning strategies are only as effective as the way we use them. After all, Feynman remarked:
You must not fool yourself, and you are the easiest person to fool.
I wrote this guide primarily for my daughter, who recently started school as a junior infant. My purpose is to help her avoid my learning mistakes and to show her better study methods.
Although I unconsciously applied some techniques while I was a student (I tested myself thoroughly and objectively before exams), I spent a lot of time on ineffective learning strategies. And to think that, as a student, I was so upset with myself because I couldn’t pull all-nighters like other colleagues…
There are some techniques I already use with my daughter (distributed practice: phonics and basic addition or subtraction every day, pomodoro sessions of five minutes for the weekly homework, self-testing for numbers and letters). Still, at her age, everything is done through play and songs, as it should be.
No one can do an article about the meta-skill of learning how to learn without referencing the great free courses provided by Barbara Oakley and her collaborators. These courses are:
Previously published here.