
Contra Solitudinem

DEV
#Computer Vision #YOLOv8 #MediaPipe #Three.js #Blender #FastAPI
huggingface.co/spaces/Dariachup/vitrazh-live →

An architecture student from an Austrian university approached me with a diploma idea: an art installation that reacts to people and encourages them to look at art with their eyes — not through a phone screen. We built the software together.

The concept

Motorized stained-glass panels on pivoting rods. A camera watches the room. The panels respond:

  • Nobody around — panels spin continuously in a hypnotic loop
  • One person — panels tease, drifting toward assembly but never completing it. You need company
  • Two or more — 30 seconds of chaotic pursuit, then the panels smoothly converge into a complete image
  • Someone raises a phone to photograph — the image breaks apart instantly

The artwork only reveals itself to groups who are present in the moment. You can’t capture it alone.

Demo version

The demo runs in your browser. A 3D room with stained-glass panels — modeled in Blender, rendered via Three.js.

How to use:

  • Rotate the room — click and drag with your mouse
  • Zoom in/out — scroll wheel
  • Start camera — small button in the top-right corner, “start cam”. Allow webcam access and step back so your full body is visible
  • Watch the panels react — they’ll start assembling when they detect you. Raise your hands — panels converge faster

The webcam feed goes through YOLOv8 (person detection) + MediaPipe (pose estimation) right in the browser. Nothing is sent anywhere.

Try the live demo on Hugging Face Spaces →

Technical pipeline

Camera → YOLOv8 detection → MediaPipe pose → State Machine → Motor control

Five states: SPINNING → TEASING → PURSUIT → ASSEMBLING → ASSEMBLED
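A minimal sketch of how such a five-state machine could look, with the clock passed in so tests can control time. The class name `PanelFSM`, the `update` signature, the 5-second assembling duration, and the choice to fall back to TEASING when a phone is raised are all assumptions made for illustration; only the state names and the 30-second pursuit come from the description above.

```python
import time
from enum import Enum, auto
from typing import Callable


class State(Enum):
    SPINNING = auto()
    TEASING = auto()
    PURSUIT = auto()
    ASSEMBLING = auto()
    ASSEMBLED = auto()


class PanelFSM:
    """Hypothetical FSM; durations and fallback rules are illustrative."""

    def __init__(self, clock: Callable[[], float] = time.monotonic,
                 pursuit_s: float = 30.0, assemble_s: float = 5.0) -> None:
        self._clock = clock          # injectable time source
        self.pursuit_s = pursuit_s   # chaotic pursuit before converging
        self.assemble_s = assemble_s # assumed time to finish converging
        self.state = State.SPINNING
        self._entered = clock()

    def _go(self, state: State) -> None:
        # Reset the state timer only on an actual transition.
        if state is not self.state:
            self.state = state
            self._entered = self._clock()

    def update(self, people: int, phone_raised: bool = False) -> State:
        elapsed = self._clock() - self._entered
        if people == 0:
            self._go(State.SPINNING)
        elif phone_raised or people == 1:
            # Assumption: a raised phone drops a group back to teasing.
            self._go(State.TEASING)
        elif self.state in (State.SPINNING, State.TEASING):
            self._go(State.PURSUIT)
        elif self.state is State.PURSUIT and elapsed >= self.pursuit_s:
            self._go(State.ASSEMBLING)
        elif self.state is State.ASSEMBLING and elapsed >= self.assemble_s:
            self._go(State.ASSEMBLED)
        return self.state
```

In a test, the clock can be a lambda reading a mutable value, so the 30-second pursuit is crossed by changing one number instead of sleeping.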

Pose classification uses pure geometry on MediaPipe’s 33 landmarks — elbow angles, wrist-to-face distances — to detect three behaviors: IDLE, PHOTOGRAPHING, PHONE_VIEWING. No extra ML model, just math.
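To make the geometry concrete, here is a rough sketch of a rule-based classifier over 2D landmarks. The landmark indices follow MediaPipe's 33-point layout; the function names, the thresholds (120°, 100°, and the distances in shoulder widths), and the exact rules are hypothetical stand-ins, not the project's tuned values.

```python
import math
from typing import Dict, Tuple

Point = Tuple[float, float]  # normalized image coords, y grows downward

# MediaPipe Pose landmark indices used here.
NOSE, L_SHOULDER, R_SHOULDER = 0, 11, 12
L_ELBOW, R_ELBOW, L_WRIST, R_WRIST = 13, 14, 15, 16


def angle_deg(a: Point, b: Point, c: Point) -> float:
    """Angle at vertex b, in degrees, between rays b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1, n2 = math.hypot(*v1), math.hypot(*v2)
    if n1 == 0 or n2 == 0:
        return 180.0  # degenerate landmarks: treat the arm as straight
    cos = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))


def classify(lm: Dict[int, Point]) -> str:
    # Normalize distances by shoulder width so the rules are scale-invariant.
    shoulder_w = math.dist(lm[L_SHOULDER], lm[R_SHOULDER]) or 1e-6
    arms = []
    for sh, el, wr in ((L_SHOULDER, L_ELBOW, L_WRIST),
                       (R_SHOULDER, R_ELBOW, R_WRIST)):
        elbow = angle_deg(lm[sh], lm[el], lm[wr])
        face_dist = math.dist(lm[wr], lm[NOSE]) / shoulder_w
        raised = lm[wr][1] < lm[sh][1]  # wrist above shoulder in the image
        arms.append((elbow, face_dist, raised))
    # Illustrative thresholds: bent arm with the wrist held up near the face.
    if any(r and e < 120 and d < 1.0 for e, d, r in arms):
        return "PHOTOGRAPHING"
    # Bent arm with the wrist at chest height, still fairly close to the face.
    if any(not r and e < 100 and d < 1.5 for e, d, r in arms):
        return "PHONE_VIEWING"
    return "IDLE"
```

Dividing by shoulder width is one way to keep the same thresholds usable whether a visitor stands near the camera or far from it.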

Stack

  • YOLOv8n + ByteTrack — real-time person detection and tracking on CPU
  • MediaPipe Pose Landmarker — 33-point skeleton for pose geometry
  • Python 3.11+, OpenCV — camera pipeline (webcam, RTSP, file)
  • State Machine — 5-state FSM with injectable time for testing
  • FastAPI + WebSocket — live dashboard with operator controls
  • Three.js + Blender — full 3D scene: panels, room, lighting, orbit camera
  • Pydantic V2 + YAML — type-safe configuration
  • RPi.GPIO / Serial — hardware motor control (mock by default)

What’s next

The software pipeline is complete and runs as a browser demo. The production version, with real motors, physical panels, and a pavilion, is on the student’s side. The code is ready for it: swap the mock motor driver for the GPIO one, point it at an RTSP camera, and deploy.
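The driver swap mentioned above could be structured roughly like this. Everything here is a hypothetical sketch: the `MotorDriver` protocol, the `set_angle` method, and the `make_driver` factory are illustrative names, and the GPIO branch is deliberately left unimplemented because the real wiring depends on the hardware.

```python
from typing import Dict, List, Protocol, Tuple


class MotorDriver(Protocol):
    """Assumed interface: anything that can rotate a panel to an angle."""

    def set_angle(self, panel: int, degrees: float) -> None: ...


class MockMotorDriver:
    """Records commands instead of moving hardware (default for the demo)."""

    def __init__(self) -> None:
        self.calls: List[Tuple[int, float]] = []

    def set_angle(self, panel: int, degrees: float) -> None:
        # Wrap into [0, 360) so logged commands are easy to compare.
        self.calls.append((panel, degrees % 360.0))


def make_driver(kind: str = "mock") -> MotorDriver:
    if kind == "mock":
        return MockMotorDriver()
    if kind == "gpio":
        # Placeholder: a real driver would wrap RPi.GPIO or a serial link.
        raise NotImplementedError("GPIO driver depends on the actual wiring")
    raise ValueError(f"unknown driver kind: {kind!r}")


def apply_targets(driver: MotorDriver, targets: Dict[int, float]) -> None:
    """Send one target angle per panel to whichever driver is configured."""
    for panel, degrees in targets.items():
        driver.set_angle(panel, degrees)
```

Because the rest of the pipeline only sees the `set_angle` interface, the choice between mock and hardware can live in configuration rather than in code.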