# haunt-fm

Personal music recommendation engine that captures listening history from Music Assistant, discovers similar music via Last.fm, computes audio embeddings with CLAP, and generates playlists mixing known favorites with new discoveries, played back on house speakers via Apple Music.

## How It Works

```
You play music on any speaker
  → HA automation logs the listen event
  → Last.fm discovers similar tracks (~50 per listened track)
  → iTunes Search API finds 30-second audio previews
  → CLAP model computes 512-dim audio embeddings
  → pgvector stores and indexes embeddings (HNSW cosine similarity)
  → Taste profile = weighted average of listened-track embeddings
  → Recommendations = closest unheard tracks by cosine similarity
  → Playlist mixes known favorites + new discoveries
  → Music Assistant plays it on speakers via Apple Music
```

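
The ranking step in the flow above (closest unheard tracks by cosine similarity) can be illustrated with a minimal, dependency-free sketch. In the running service this search is handled by pgvector's HNSW index; the function names `cosine_similarity` and `rank_unheard`, the toy 3-dimensional vectors, and the track IDs are hypothetical and only illustrate the idea.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def rank_unheard(taste, candidates, heard_ids, limit=50):
    """Return (track_id, similarity) pairs for unheard tracks, best first."""
    scored = [
        (track_id, cosine_similarity(taste, embedding))
        for track_id, embedding in candidates.items()
        if track_id not in heard_ids
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


# Toy 3-dim vectors stand in for the 512-dim CLAP embeddings.
taste = [1.0, 0.0, 0.0]
candidates = {
    "track-a": [0.9, 0.1, 0.0],
    "track-b": [0.0, 1.0, 0.0],
    "track-c": [1.0, 0.0, 0.0],  # already heard, so it is excluded
}
ranking = rank_unheard(taste, candidates, heard_ids={"track-c"})
print([track_id for track_id, _ in ranking])  # prints ['track-a', 'track-b']
```

An exact scan like this is linear in the number of candidates; an approximate index such as HNSW trades a little recall for much faster lookups on large pools.
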
## Deployment

Runs on the NAS as two Docker containers:

| Container | Image | Port | Purpose |
|-----------|-------|------|---------|
| `haunt-fm` | Custom build | 8321 → 8000 | FastAPI app + embedding worker |
| `haunt-fm-db` | `pgvector/pgvector:pg17` | internal | PostgreSQL + pgvector |

```bash
# Deploy / rebuild
cd /volume1/homes/antialias/projects/haunt-fm
git pull && docker-compose up -d --build haunt-fm

# Run migrations
docker exec haunt-fm alembic upgrade head
```

**Access:**
- Status page: https://recommend.haunt.house
- Health check: http://192.168.86.51:8321/health
- API status: http://192.168.86.51:8321/api/status
- Source: https://git.dev.abaci.one/antialias/haunt-fm

## API Endpoints

| Method | Path | Purpose |
|--------|------|---------|
| GET | `/health` | Health check (DB connectivity) |
| GET | `/api/status` | Full pipeline status JSON |
| GET | `/` | HTML status dashboard |
| POST | `/api/history/webhook` | Log a listen event (from HA automation) |
| POST | `/api/admin/discover` | Expand listening history via Last.fm |
| POST | `/api/admin/build-taste-profile` | Rebuild taste profile from embeddings |
| GET | `/api/recommendations?limit=50` | Get ranked recommendations |
| POST | `/api/playlists/generate` | Generate and optionally play a playlist |

## Usage

### Generate and play a playlist

```bash
curl -X POST http://192.168.86.51:8321/api/playlists/generate \
  -H "Content-Type: application/json" \
  -d '{
    "total_tracks": 20,
    "known_pct": 30,
    "speaker_entity": "media_player.living_room_speaker_2",
    "auto_play": true
  }'
```

**Parameters:**
- `total_tracks`: number of tracks in the playlist (default 20)
- `known_pct`: percentage of the playlist drawn from known-liked tracks; the remainder are new discoveries (default 30)
- `speaker_entity`: Music Assistant entity ID (must be an entity with the `_2` suffix)
- `auto_play`: `true` to start playback on the speaker immediately

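
For scripted use, the same request can be assembled in Python with only the standard library. This is a minimal sketch, not part of the project: `build_playlist_request` is a hypothetical helper, the client-side checks are an added convenience, and the `auto_play=False` default is an assumption (the server's default for that field is not documented here).

```python
import json
import urllib.request

BASE_URL = "http://192.168.86.51:8321"  # host and port from the curl example


def build_playlist_request(speaker_entity, total_tracks=20, known_pct=30,
                           auto_play=False):
    """Build (but do not send) the POST request for /api/playlists/generate."""
    # Hypothetical client-side checks mirroring the documented constraints.
    if not speaker_entity.endswith("_2"):
        raise ValueError("speaker_entity must be a Music Assistant (_2) entity")
    if not 0 <= known_pct <= 100:
        raise ValueError("known_pct must be between 0 and 100")
    payload = {
        "total_tracks": total_tracks,
        "known_pct": known_pct,
        "speaker_entity": speaker_entity,
        "auto_play": auto_play,
    }
    return urllib.request.Request(
        f"{BASE_URL}/api/playlists/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_playlist_request("media_player.kitchen_stereo_2", auto_play=True)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```

Building the request separately from sending it keeps the validation testable without touching the network.
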
### Speaker entities

The `speaker_entity` **must** be a Music Assistant entity (one with the `_2` suffix) so that text searches resolve through Apple Music. Raw Cast entities cannot resolve search queries.

| Speaker | Entity ID |
|---------|-----------|
| Living Room speaker | `media_player.living_room_speaker_2` |
| Dining Room speaker | `media_player.dining_room_speaker_2` |
| basement mini | `media_player.basement_mini_2` |
| Kitchen stereo | `media_player.kitchen_stereo_2` |
| Study speaker | `media_player.study_speaker_2` |
| Butler's Pantry speaker | `media_player.butlers_pantry_speaker_2` |
| Master bathroom speaker | `media_player.master_bathroom_speaker_2` |
| Kids Room speaker | `media_player.kids_room_speaker_2` |
| Guest bedroom speaker 2 | `media_player.guest_bedroom_speaker_2_2` |
| Garage Wifi | `media_player.garage_wifi_2` |
| Whole House | `media_player.whole_house_2` |
| downstairs | `media_player.downstairs_2` |
| upstairs | `media_player.upstairs_2` |

### Other operations

```bash
# Log a listen event manually
curl -X POST http://192.168.86.51:8321/api/history/webhook \
  -H "Content-Type: application/json" \
  -d '{"title":"Paranoid Android","artist":"Radiohead","album":"OK Computer"}'

# Run Last.fm discovery (expand candidate pool)
curl -X POST http://192.168.86.51:8321/api/admin/discover \
  -H "Content-Type: application/json" \
  -d '{"limit": 50}'

# Rebuild taste profile
curl -X POST http://192.168.86.51:8321/api/admin/build-taste-profile

# Get recommendations (without playing)
# The URL is quoted so the shell does not treat "?" as a glob character
curl "http://192.168.86.51:8321/api/recommendations?limit=20"
```

## Pipeline Stages

1. **Listening History**: an HA automation POSTs to the webhook when music plays on any Music Assistant speaker. Duplicate events within 60 seconds are dropped.
2. **Discovery**: Last.fm `track.getSimilar` expands each listened track to ~50 candidates.
3. **Preview Lookup**: the iTunes Search API finds 30-second AAC preview URLs (rate-limited to ~20 req/min).
4. **Embedding**: a background worker downloads previews, runs the CLAP model (`laion/larger_clap_music`), and stores 512-dim vectors in pgvector with an HNSW index.
5. **Taste Profile**: weighted average of listened-track embeddings (weight = play count × recency decay).
6. **Recommendations**: pgvector cosine similarity against the taste profile, excluding known tracks.
7. **Playlist**: mix known-liked tracks with new recommendations, interleave them, and play via Music Assistant.

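
Stage 7 can be pictured with a small sketch. The source does not say how the known/new split is rounded or how the two lists are interleaved, so `split_counts` and `interleave` below are hypothetical helpers showing one plausible choice (half-up rounding, known tracks spaced evenly through the playlist).

```python
def split_counts(total_tracks, known_pct):
    """Return (known, new) track counts for a playlist, rounding half up."""
    known = (total_tracks * known_pct + 50) // 100
    return known, total_tracks - known


def interleave(known, new):
    """Spread the known tracks roughly evenly among the new ones."""
    total = len(known) + len(new)
    playlist = []
    k = n = 0
    for position in range(total):
        # Number of known tracks that should have appeared by this position.
        target_known = (position + 1) * len(known) // total
        if k < target_known:
            playlist.append(known[k])
            k += 1
        else:
            playlist.append(new[n])
            n += 1
    return playlist


# The defaults from the API (20 tracks, 30% known) give 6 known + 14 new.
print(split_counts(20, 30))  # prints (6, 14)
print(interleave(["K1", "K2"], ["N1", "N2", "N3", "N4"]))
# prints ['N1', 'N2', 'K1', 'N3', 'N4', 'K2']
```
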
## Improving Recommendations Over Time

Recommendations improve as the system accumulates more data:

- **Listen to music**: every track played on any speaker is logged automatically
- **Run discovery periodically**: `POST /api/admin/discover` expands the candidate pool via Last.fm
- **Rebuild the taste profile**: `POST /api/admin/build-taste-profile` after significant new listening activity
- **Embedding worker runs continuously**: new candidates are downloaded and embedded automatically

The taste profile is a weighted average of all listened-track embeddings, so a more diverse listening history yields more nuanced recommendations.

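
To make the weighted average concrete, here is a minimal sketch using the weighting named in the pipeline stages (play count times recency decay). The exact decay function is not documented here; the exponential decay with a 30-day half-life, and the names `recency_decay` and `taste_profile`, are assumptions for illustration only.

```python
def recency_decay(days_since_last_play, half_life_days=30.0):
    """Assumed decay: a listen loses half its weight every half_life_days."""
    return 0.5 ** (days_since_last_play / half_life_days)


def taste_profile(listens):
    """Weighted average of embeddings.

    listens: iterable of (embedding, play_count, days_since_last_play).
    """
    profile = None
    total_weight = 0.0
    for embedding, play_count, days in listens:
        weight = play_count * recency_decay(days)
        if profile is None:
            profile = [0.0] * len(embedding)
        for i, value in enumerate(embedding):
            profile[i] += weight * value
        total_weight += weight
    if profile is None or total_weight == 0:
        raise ValueError("no weighted listens to average")
    return [value / total_weight for value in profile]


# Toy 2-dim vectors stand in for the 512-dim CLAP embeddings.
listens = [
    ([1.0, 0.0], 3, 0),   # played 3 times today:       weight 3 * 1.0 = 3.0
    ([0.0, 1.0], 2, 30),  # played 2 times 30 days ago: weight 2 * 0.5 = 1.0
]
print(taste_profile(listens))  # prints [0.75, 0.25]
```

Under this weighting, recent and frequently played tracks pull the profile toward their region of the embedding space, which is why rebuilding after new listening activity shifts the recommendations.
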
## Tech Stack

| Component | Choice |
|-----------|--------|
| App framework | FastAPI + SQLAlchemy async + Alembic |
| Database | PostgreSQL 17 + pgvector (HNSW cosine similarity) |
| Embedding model | CLAP `laion/larger_clap_music` (512-dim, PyTorch CPU) |
| Audio previews | iTunes Search API (free, no auth, 30s AAC) |
| Discovery | Last.fm `track.getSimilar` API |
| Playback | Music Assistant via Home Assistant REST API |
| Music catalog | Apple Music (via Music Assistant) |
| Reverse proxy | Traefik (`recommend.haunt.house`) |

## Environment Variables

All variables are prefixed with `HAUNTFM_`. The key ones:

| Variable | Purpose |
|----------|---------|
| `HAUNTFM_DATABASE_URL` | PostgreSQL connection string |
| `HAUNTFM_LASTFM_API_KEY` | Last.fm API key for discovery |
| `HAUNTFM_HA_URL` | Home Assistant URL |
| `HAUNTFM_HA_TOKEN` | Home Assistant long-lived access token |
| `HAUNTFM_EMBEDDING_WORKER_ENABLED` | Enable/disable background embedding worker |
| `HAUNTFM_EMBEDDING_BATCH_SIZE` | Tracks per batch (default 10) |
| `HAUNTFM_EMBEDDING_INTERVAL_SECONDS` | Seconds between batch checks (default 30) |
| `HAUNTFM_MODEL_CACHE_DIR` | CLAP model cache directory |
| `HAUNTFM_AUDIO_CACHE_DIR` | Downloaded preview cache directory |

See `.env.example` for the full list.

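
As an illustration of how the prefixed variables and their documented defaults fit together, the sketch below reads the three embedding-worker settings using only the standard library. It is not the project's actual settings code: `WorkerSettings` and `load_worker_settings` are hypothetical names, and treating the worker as enabled when the variable is unset is an assumption.

```python
import os
from dataclasses import dataclass

PREFIX = "HAUNTFM_"


@dataclass(frozen=True)
class WorkerSettings:
    enabled: bool
    batch_size: int
    interval_seconds: int


def load_worker_settings(env=None):
    """Read embedding-worker settings from a mapping (os.environ by default)."""
    env = os.environ if env is None else env
    # Assumption: the worker is on unless explicitly disabled.
    raw_enabled = env.get(PREFIX + "EMBEDDING_WORKER_ENABLED", "true")
    return WorkerSettings(
        enabled=raw_enabled.strip().lower() in {"1", "true", "yes", "on"},
        batch_size=int(env.get(PREFIX + "EMBEDDING_BATCH_SIZE", "10")),
        interval_seconds=int(env.get(PREFIX + "EMBEDDING_INTERVAL_SECONDS", "30")),
    )


# Passing a dict instead of os.environ makes the defaults easy to inspect.
settings = load_worker_settings({"HAUNTFM_EMBEDDING_BATCH_SIZE": "25"})
print(settings)
# prints WorkerSettings(enabled=True, batch_size=25, interval_seconds=30)
```
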
## Integrations

- **Home Assistant**: automation `haunt_fm_log_music_play` captures listening history; the REST API is used for speaker playback
- **Music Assistant**: resolves text search queries to Apple Music tracks and streams to Cast speakers
- **OpenClaw**: has a skill doc (`skills/haunt-fm/SKILL.md`) so you can request playlists via Telegram/iMessage
- **Traefik**: routes `recommend.haunt.house` to the service
- **Porkbun DNS**: CNAME for `recommend.haunt.house`