About

Search art the way you think about it

Retrievals is a semantic search engine for the National Gallery of Art's open collection. Instead of matching keywords, it understands meaning — so "melancholy figure in candlelight" finds Rembrandt before you know to look for him.

Why it exists

Museum search is broken. The NGA's own search requires you to already know the artist, the title, or the accession number. If you remember a painting from a childhood visit — warm light, a woman reading, Dutch — you have no way in.

Retrievals fixes that. Describe what you remember, or what you feel, and the index finds it. All 68,816 objects in the collection are embedded into a shared vector space, where a text description and an image that matches it land close together.

How it works

Every artwork in the collection was passed through Qwen3-VL-Embedding-2B, a multimodal vision-language model that produces a single 768-dimensional vector from an image. Those vectors live in a FAISS index on a serverless Modal GPU.

When you search, your query is embedded by the same model and matched against the index of 68,816 vectors using approximate nearest-neighbour search. The top candidates are then re-scored by a cross-encoder reranker, which considers the full query-image pair rather than vector distance alone, before the final results are returned.
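The two stages described above can be sketched in a few lines of plain Python. This is a toy illustration, not the code that runs Retrievals: the three-dimensional vectors, the artwork ids, and the `toy_embed` and `toy_pair_score` functions are all hypothetical stand-ins for the real embedding model and reranker, and stage 1 scans every vector instead of using an approximate index.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query_vec, index, k):
    """Stage 1: rank artwork ids by vector similarity and keep the top k."""
    ranked = sorted(index, key=lambda art_id: cosine(query_vec, index[art_id]), reverse=True)
    return ranked[:k]

def rerank(query, candidates, pair_score):
    """Stage 2: re-score each (query, artwork) pair with a slower, finer scorer."""
    return sorted(candidates, key=lambda art_id: pair_score(query, art_id), reverse=True)

def search(query, embed, index, pair_score, k=2, final=2):
    """Embed the query, fetch k candidates, then return the reranked top results."""
    candidates = retrieve(embed(query), index, k)
    return rerank(query, candidates, pair_score)[:final]

# Hypothetical toy data: tiny vectors in place of real embeddings.
INDEX = {
    "rembrandt-self-portrait": [0.9, 0.1, 0.0],
    "vermeer-woman-reading": [0.7, 0.6, 0.1],
    "monet-water-lilies": [0.0, 0.2, 0.9],
}

def toy_embed(query):
    # Stand-in for the embedding model: returns a fixed vector for any query.
    return [0.9, 0.2, 0.0]

def toy_pair_score(query, art_id):
    # Stand-in for the reranker: counts words shared by the query and the id.
    return len(set(query.lower().split()) & set(art_id.split("-")))

results = search("woman reading by a window", toy_embed, INDEX, toy_pair_score)
```

In this toy run, vector similarity alone puts the Rembrandt first, and the pairwise scorer then promotes the Vermeer. That is the point of the second pass: it is too slow to run over the whole collection, but it can correct the order of a short candidate list.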

Embeddings

Qwen3-VL-Embedding-2B

Multimodal vision-language model — encodes both image and text into a shared vector space

Vector index

FAISS (IVFFlat)

Approximate nearest-neighbour search across 68,000 vectors in milliseconds

Reranker

Qwen3-VL (image mode)

Cross-encoder reranking pass that re-scores each top candidate's image against the query

Compute

Modal L40S GPU

Serverless GPU inference — cold-starts in ~2 s, scales to zero between requests

Collection

National Gallery of Art open data

68,816 objects from the NGA's public-domain dataset, thumbnails served from Cloudflare R2
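For the vector index entry above, it may help to see why an inverted-file (IVF) search is approximate. The sketch below is a dependency-free toy, not FAISS itself: the `ToyIVF` class, its fixed centroids, and the two-dimensional vectors are assumptions made for illustration. In FAISS the number of cells and the number of cells scanned per query correspond to the `nlist` and `nprobe` parameters, and the centroids are learned during training.

```python
def dist2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

class ToyIVF:
    """Toy inverted-file index: each vector is stored in its nearest centroid's cell."""

    def __init__(self, centroids):
        # A real IVF index learns its centroids with k-means; here they are given.
        self.centroids = centroids
        self.cells = [[] for _ in centroids]

    def _nearest_cells(self, vec, n):
        order = sorted(range(len(self.centroids)),
                       key=lambda c: dist2(vec, self.centroids[c]))
        return order[:n]

    def add(self, item_id, vec):
        cell = self._nearest_cells(vec, 1)[0]
        self.cells[cell].append((item_id, vec))

    def search(self, query, k, nprobe=1):
        # Only the nprobe closest cells are scanned. Neighbours that sit in
        # an unprobed cell are missed, which is the source of approximation.
        candidates = []
        for cell in self._nearest_cells(query, nprobe):
            candidates.extend(self.cells[cell])
        candidates.sort(key=lambda item: dist2(query, item[1]))
        return [item_id for item_id, _ in candidates[:k]]

index = ToyIVF(centroids=[[0.0, 0.0], [10.0, 10.0]])
index.add("a", [1.0, 1.0])
index.add("b", [2.0, 0.0])
index.add("c", [9.0, 9.0])
index.add("d", [6.0, 6.0])

fast = index.search([4.5, 4.5], k=1, nprobe=1)   # scans one cell
exact = index.search([4.5, 4.5], k=1, nprobe=2)  # scans every cell
```

Here `fast` returns "a" while `exact` returns "d", the true nearest neighbour, because "d" was filed under the second centroid. Raising the number of probed cells trades speed for recall.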

The collection

The index covers all 68,816 objects in the NGA's open-access dataset, which the gallery releases under a Creative Commons Zero licence. Works range from 13th-century panel paintings to 20th-century photographs. Thumbnails are served from Cloudflare R2; high-resolution images link back to the NGA's own IIIF server.
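Links to an IIIF server follow the URL pattern defined by the IIIF Image API: identifier, region, size, rotation, then quality and format. The helper below is a minimal sketch of building such a link. The base URL and identifier are hypothetical placeholders, not the NGA's actual endpoint or object ids, and the `iiif_image_url` function is illustrative only.

```python
from urllib.parse import quote

def iiif_image_url(base, identifier, width, height, fmt="jpg"):
    """Build an IIIF Image API URL for an image scaled to fit a bounding box.

    Pattern: {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}
    """
    region = "full"               # the whole image, no crop
    size = f"!{width},{height}"   # best fit inside width x height, aspect ratio kept
    rotation = "0"
    quality = "default"
    ident = quote(identifier, safe="")  # identifiers must be URL-encoded
    return f"{base.rstrip('/')}/{ident}/{region}/{size}/{rotation}/{quality}.{fmt}"

# Hypothetical server and identifier, for illustration only.
url = iiif_image_url("https://iiif.example.org/images", "abc-123", 800, 800)
```

Because the size is part of the URL, the same source image can back both a small preview and a full-screen view without storing separate files.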