2026-06-10
By Mohamed Mohamoud

The Mood Index: How Retrievals Finds Art by Feeling

When you search for 'melancholy figure in candlelight,' Retrievals doesn't look for those words in a database. It maps your description into a geometric space where mood has coordinates — and finds the artworks that live nearest to them.

"Melancholy figure in candlelight." No artist. No title. No date. Just a description of a feeling and the quality of light that produces it.

Retrievals returns Rembrandt. It returns Georges de La Tour. It returns the candlelight painters of the Dutch Golden Age, a tradition you may not have known existed until the results appeared.

This isn't fuzzy matching against catalog descriptions. The word "melancholy" doesn't appear in the NGA's metadata for most of those works. What's happening is something geometrically precise: your query and those artworks occupy the same region of a high-dimensional semantic space, and the system finds them by measuring distance.


Mood as Geometry

The embedding model at the core of Retrievals — Qwen3-VL-Embedding-2B — maps both images and text into the same vector space. Every artwork in the collection is represented as a point in 1024-dimensional space. When you type a query, it becomes a point in that same space.

The key property is that meaning determines proximity, not vocabulary. The model was trained to place semantically similar things near each other. An image of a single figure bent over a candle in a dark room and the phrase "solitary candlelight" end up near each other not because of any explicit rule, but because the model learned from vast amounts of image-text data that these things belong together.

flowchart TD
    A["Query: 'melancholy figure in candlelight'"]
    B["Qwen3-VL\ntext embedding"]
    C[("1024-dimensional\nvector space")]
    D["Nearest neighbors\nby cosine distance"]
    E["Rembrandt · de La Tour\nHonthorst · Leyster"]

    A --> B
    B --> C
    C --> D
    D --> E

    classDef query fill:#f5f0e8,stroke:#a86845,color:#2c2926;
    classDef model fill:#ebe5d9,stroke:#6f685f,color:#2c2926;
    classDef space fill:#fcf9f2,stroke:#2c2926,color:#2c2926,stroke-width:2px;
    classDef result fill:#fffaf0,stroke:#a86845,color:#2c2926;
    class A query;
    class B model;
    class C space;
    class D,E result;

This is why the query doesn't need to match any metadata field. The geometry does the work.
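The nearest-neighbor step in the diagram can be sketched in a few lines of Python. This is a minimal illustration, not Retrievals' actual code: the three-dimensional vectors, the artwork labels, and the helper names `cosine_distance` and `nearest` are all invented for the example. A real index would hold 1024-dimensional embeddings produced by the model.

```python
import math

def cosine_distance(a, b):
    """Return 1 - cosine similarity; 0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def nearest(query_vec, index, k=2):
    """Rank (label, vector) pairs by cosine distance to the query, closest first."""
    ranked = sorted(index, key=lambda item: cosine_distance(query_vec, item[1]))
    return [label for label, _ in ranked[:k]]

# Hypothetical toy embeddings; real ones would come from the embedding model.
TOY_INDEX = [
    ("candlelit figure", [0.9, 0.1, 0.2]),
    ("sunlit harbor", [0.1, 0.9, 0.3]),
    ("dark interior with lamp", [0.8, 0.2, 0.1]),
]
TOY_QUERY = [0.85, 0.15, 0.15]  # stand-in for an embedded text query

print(nearest(TOY_QUERY, TOY_INDEX))
```

Nothing in the ranking looks at words or metadata. The two dark, lamp-lit entries come back first only because their vectors point in nearly the same direction as the query's.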


What "Mood" Actually Encodes

When a model embeds an image of a painting, it's encoding a compressed representation of everything the model can perceive about it: color distribution, compositional structure, subject matter, lighting quality, the relationship between figure and ground. Mood isn't a separate field — it's an emergent property of all those signals together.

A Rembrandt self-portrait and the phrase "introspective old man in warm shadow" are close in embedding space because the model has learned that warm-shadow compositions with aged subjects produce a particular emotional register. It doesn't have a lookup table for this. It has geometry.

The same mechanism explains why abstract queries work. "Violent energy," "stillness at the edge of water," "the moment before something breaks" — these aren't catalog terms, but they describe real visual and emotional properties that the embedding space captures.


Three Mood Queries, Examined

"Grief and gold light"

This lands near Byzantine and early Renaissance devotional works — the combination of grief (as subject) and gold (as visual ground) is characteristic of that period. You might not have known the connection; the embedding space makes it visible.

"Vast empty landscape, human figure small"

Classic Romantic-era signals: the sublime, the diminishment of the individual against nature. This query surfaces Friedrich-adjacent works, American Hudson River School paintings, and Northern European landscapes from the late 18th and 19th centuries.

"Intimacy and domestic light"

This is the Vermeer cluster. Also de Hooch. Also Judith Leyster's domestic scenes. The query describes an emotional and compositional property — warm, enclosed, personal — that a specific tradition of Northern European painting developed into a discipline.

In all three cases, the result isn't a coincidence. The embedding space learned the correlation between those descriptive phrases and those visual traditions.


The Limit: What Mood Can't Find

The geometry isn't perfect. A few places where mood-first search falls short:

Iconographic specificity. "The moment Judith raises the sword" is a mood and an action, but if the composition is ambiguous or the model's training data underrepresents that iconographic tradition, you may surface works with similar compositional energy but different subjects.

Counter-intuitive pairings. Some artists produce work where the mood you'd expect and the visual properties diverge: a violent subject might be rendered in serene color and light. The model reads the visual signal, not your cultural interpretation of the subject.

Very large collections. At 68,816 objects, the NGA dataset is big enough that some clusters are densely populated. A mood query meant to surface one specific work may return many similar candidates; browsing the first two pages is usually enough to find it.
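The "first two pages" advice can be pictured as slicing an already-ranked result list. The sketch below is an assumption-laden illustration: the `page_of` helper, the page size of 20, and the placeholder object IDs are hypothetical and say nothing about how Retrievals actually paginates.

```python
def page_of(ranked_ids, page, page_size=20):
    """Return the 1-indexed `page` of a result list already sorted nearest-first."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    start = (page - 1) * page_size
    return ranked_ids[start:start + page_size]

# Placeholder IDs standing in for a dense cluster of similar candidates.
ranked = [f"object-{i}" for i in range(1, 51)]

# Browsing two pages covers the 40 nearest candidates in this toy setup.
first_two_pages = page_of(ranked, 1) + page_of(ranked, 2)
print(len(first_two_pages))
```

In a dense cluster, the work you have in mind may sit a few dozen ranks down. It is still close in the space, just not closest.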

Despite those limits, mood-first search finds things keyword search cannot. The entire point is to reach the works where your only way in is a feeling.


Using It

The most effective mood queries tend to combine two or three descriptors across different registers:

  • Emotional state + lighting quality: "loneliness and gray northern light"
  • Subject + atmosphere: "domestic interior, warm afternoon, no drama"
  • Period signal + mood: "Baroque chiaroscuro, female subject, defiant"
  • Pure sensory description: "deep red, heavy fabric, candlelight catching gold"

The system handles multilingual queries natively — the embedding space is language-agnostic at the semantic level, so Spanish, German, French, and Chinese queries produce coherent results. "Triste et lumineux" finds the same region of the collection as "melancholy and luminous."

The collection is 68,816 objects. Most of them have never been searched by their mood. Start with a feeling →

Tags: semantic search, embeddings, mood, vector space, art discovery