Exhibition Demo
An exhibition about what happens when the unstable language of diffusion prompting meets the partial sight of a vision-language model.
Wall Text
Diffusion image generation can produce striking results, yet exactly recreating a specific image remains difficult. The interface is a field of words rather than a stable address: even the same prompt does not guarantee the return of the same image. What feels repeatable at the level of language slips at the level of form.
At the same time, vision-language models can understand the broad scene while missing the fine-grained details that make one image precisely itself. This project puts both limits together by staging an anthropomorphized VLM in a closed reconstruction loop. It studies a reference image, writes a prompt, asks the diffusion model to try again, and reacts to the result. Each run is capped at ten generations, and every attempt is preserved: the image, the prompt, the inner thoughts, and the visible affect traces that emerge along the way.
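In code terms, the closed loop might look like the minimal Python sketch below. Everything here is a hypothetical stand-in rather than the piece's actual implementation: `propose_prompt`, `generate_image`, `MAX_GENERATIONS`, and the `Attempt` record are assumed names, and the two callables stand in for whatever VLM and diffusion backends the work uses.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

MAX_GENERATIONS = 10  # each run is capped at ten attempts, per the wall text

@dataclass
class Attempt:
    prompt: str        # the prompt the VLM wrote for this attempt
    image_path: str    # where the diffusion output was saved
    inner_thought: str # the VLM's private commentary on the result
    affect: str        # visible affect trace, e.g. "hopeful" or "frustrated"

def reconstruction_loop(
    reference_path: str,
    # hypothetical: the VLM studies the reference and all prior attempts,
    # then returns (new prompt, inner thought, affect trace)
    propose_prompt: Callable[[str, List[Attempt]], Tuple[str, str, str]],
    # hypothetical: one diffusion generation, returning the saved image's path
    generate_image: Callable[[str], str],
) -> List[Attempt]:
    """Closed loop: study the reference, prompt the diffusion model,
    react to the result, and try again, up to the cap."""
    history: List[Attempt] = []
    for _ in range(MAX_GENERATIONS):
        prompt, thought, affect = propose_prompt(reference_path, history)
        image_path = generate_image(prompt)
        # preserve the image alongside the prompt and the VLM's reactions
        history.append(Attempt(prompt, image_path, thought, affect))
    return history
```

The loop never judges success itself; it simply runs to the cap, which is why the archive of ten attempts, rather than any single image, is the exhibited object.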