Creating "Aesthetic Families" for Anki Cards with Midjourney
Make your images look unique to remember them better.
In the two previous posts about using Midjourney with Anki, we talked about what Midjourney is and why images help you remember things better. In this post, I’ll put those two pieces together and explain how to create “aesthetic families” for the different topics you want to learn.
A Quick Recap
But first, a quick recap of the Picture Superiority Effect. In particular, we want to read this paragraph again:
One theory argues that because images are more perceptually distinct from one another than words are, they are more likely to be encoded into memory. When images are very similar to each other, the Picture Superiority Effect is not present. This seems to suggest that it’s not images per se that are the key, but rather their distinctiveness.
What does this mean? That you don’t merely want to create images for the things you’re learning, but distinct images. Thankfully, this is easy to do with Midjourney, if you get a little creative.
To show you how, let’s walk through some examples. Let’s say we’re learning two languages at the same time, Spanish and Japanese.
Both languages will have words for common things and ideas. For example, mountain is “montaña” in Spanish and “山 (Yama)” in Japanese.
If, using Midjourney, we type in the prompt “mountain” twice, once for Spanish and once for Japanese, we’ll get something like this:


As you can see, there is no real difference between the images. If I show you one of them, you can’t tell whether it belongs to Spanish or Japanese. This makes the word itself harder to remember.
I used the example of mountain because the actual research study used images of mountains to highlight that differentiation is key to the Picture Superiority Effect:
…subjects saw many different photographs, but they came from a small number of concept categories (e.g. pictures of landscapes divided into only a handful of sub categories, such as mountains, deserts, bodies of water, forests). Subjects could only rely on visual information to discriminate old from new items (“I saw many pictures of mountains, but did I see a picture of this particular mountain?”).
Memory for pictures: Sometimes a picture is not worth a single word
So what’s the solution? Create an “aesthetic family,” or group of key visual terms, to apply to all images related to a specific topic. You are essentially being an art director, creating a unified artistic vision for the images you want to create.
While you can create this yourself, I think it’s easier to simply draw on the visual art associated with the language you’re learning. That includes film, painting, drawing, architecture, and many other art forms.
For example, here are a few ideas:
Spanish – Pablo Picasso, Mexico City, Dali, Goya, Madrid, Frida Kahlo, Diego Rivera, Miami, Miami Vice / the 80s Neon Aesthetic
Japanese – Futuristic, Tokyo, cyberpunk, Anime, Manga, Woodblock prints, Hokusai, Buddhist temples, Shinto shrines
German – Klimt, Vienna architecture, Berlin, German Expressionism, Dürer, Fritz Lang
Italian – Renaissance sketches, Michelangelo, Da Vinci, Rome, Venice, Florence
British English – London architecture, fog, Sherlock Holmes, Hyde Park, red buses, London Tube, Buckingham Palace
If you don’t know much about the visual culture of a particular place, I recommend googling “[country] artists” or “[country] filmmakers” and browsing the results to find one that you like. You can also browse this extensive site of Midjourney styles. Type your desired country into the search bar; e.g., “French artist.” This list of filmmakers is also super useful.
If you’re learning a more abstract topic that doesn’t lend itself immediately to visuals, try to come up with a series of keywords that relate to it:
Mathematics: chalkboard, chalk, classroom, patterns, graph paper, numbers
Technology-related topic: old school sci-fi magazine illustration, robots, computers, high-tech
Philosophy: academics, ancient Greece, Plato’s Academy, The Thinker sculpture
The key is to create a unified aesthetic that is tied only to a single language or educational topic, then generate distinct images in that aesthetic. The stranger and more bizarre an image is, the more memorable it will be.
Don’t be afraid to exaggerate, even to a stereotypical degree. While Miami doesn’t actually look like Miami Vice, images of a neon-filled city with 80s motifs are vastly more eye-catching than the real-world city.
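If you like keeping things organized, one way to track your families is a small lookup table. The Python sketch below is only an illustration: the dictionary, the helper name, and the particular keywords I picked from the lists above are my own assumptions, not a required setup. It also flags any keyword that shows up in more than one family, since a shared keyword would blur the line between topics.

```python
from itertools import combinations

# Hypothetical aesthetic families, using keywords picked from the lists above.
AESTHETIC_FAMILIES = {
    "Spanish": ["Pablo Picasso", "Goya", "Miami Vice", "80s neon aesthetic"],
    "Japanese": ["cyberpunk", "anime", "woodblock prints", "Hokusai"],
    "Mathematics": ["chalkboard", "chalk", "classroom", "graph paper", "numbers"],
}


def shared_keywords(families):
    """Return {(topic_a, topic_b): [keywords]} for every pair that overlaps."""
    overlaps = {}
    for (a, kws_a), (b, kws_b) in combinations(families.items(), 2):
        # Compare case-insensitively so "Neon" and "neon" count as the same keyword.
        common = {k.lower() for k in kws_a} & {k.lower() for k in kws_b}
        if common:
            overlaps[(a, b)] = sorted(common)
    return overlaps


# An empty result means every keyword belongs to exactly one family.
print(shared_keywords(AESTHETIC_FAMILIES))
```

If the check reports an overlap, swap the shared keyword out of one family so each aesthetic stays tied to a single topic.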
Recreating Our Images
Now, let’s recreate our two images for “mountain,” using the following prompts. The letters and numbers at the end just indicate Midjourney settings, which you can read more about here.
Spanish: mountain, Miami Vice style, neon, 80s, bright colors, palm trees, beach mood --q 2 --v 5 --s 750
Japanese: mountain in Japanese anime style, cyberpunk motifs --q 2 --v 5 --s 750
Here’s what we get:


Now these are very distinctive and we won’t have any issues telling Spanish and Japanese apart. To get similar results for other words/concepts, just add the same list of keywords after the initial focus word. For example:
dog, Miami Vice style, neon, 80s, bright colors, palm trees, beach mood --q 2 --v 5 --s 750
…will give you this result:
If your images aren’t highlighting the main concept enough (“mountain” in our example), try putting the word “style” after everything that isn’t the specific noun you want.
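If you generate a lot of cards, you can assemble these prompts with a few lines of Python instead of retyping them. This is a minimal sketch, not an official Midjourney tool: `build_prompt`, `SPANISH`, and the `add_style` option are hypothetical names of my own, and the keywords and settings are copied from the Spanish prompt above. The `add_style` branch is just one reading of the “style” tip, appending the word to each keyword that doesn’t already end with it.

```python
# Settings string copied from the prompts above.
DEFAULT_SETTINGS = "--q 2 --v 5 --s 750"

# Keyword list copied from the Spanish prompt above.
SPANISH = ["Miami Vice style", "neon", "80s", "bright colors", "palm trees", "beach mood"]


def build_prompt(subject, keywords, settings=DEFAULT_SETTINGS, add_style=False):
    """Join a focus word, a family's keywords, and the settings into one prompt."""
    if add_style:
        # Assumption: "style" goes after each keyword that lacks it.
        keywords = [k if k.lower().endswith("style") else f"{k} style" for k in keywords]
    return f"{', '.join([subject, *keywords])} {settings}"


# Same family, different focus words.
for word in ["mountain", "dog"]:
    print(build_prompt(word, SPANISH))
```

The output is plain text that you paste into Midjourney yourself; the script does not talk to Midjourney at all.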
Picking One Image
By default, Midjourney creates 4 smaller images from your prompt. If you like one, you can “upscale” it to add more detail. Learn more about how that works in the What is Midjourney? post.
However, I typically don’t use upscale and instead take a screenshot of the particular image I want to use with the Portion screenshot option on my Mac. As I need the image to be memorable – and not extremely detailed – this saves some time and also cuts down on file size.
So, after generating an aesthetic image, pick which of the 4 images is your favorite, then take a portion screenshot of it – or upscale it. Here’s our final result, which we can add to Anki:


And that covers the basic idea of creating an aesthetic family. In a future post, I’ll talk about optimizing this process by using a text-expander to quickly paste your lists of aesthetics.