Music Recommender
A Bay Area-first music discovery engine that starts from upcoming local shows, then helps you figure out what to listen to next.
Current scope
- Ingest local shows from 19hz and Foopee via the structured crawler on `localhost:11235`
- Capture editorial recommendation signals from NPR, The Needle Drop, and KEXP
- Store artists, venues, events, genres, images, and source evidence in SQLite
- Enrich artists with Every Noise and MusicBrainz genre data
- Produce genre-first artist recommendations using simple likes and dislikes
- Import persistent loved-artist assertions from CSV and use them as cautious taste signals
- Record attended shows directly from the CLI or CSV with artist/venue resolution and unresolved review support
- Use stored loved-artist and attended-show history as light, explicit taste signals in recommendation scoring
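
The genre-first scoring described in this list can be pictured with a small sketch. Everything here is an assumption for illustration: the `TasteProfile` type, the `score_artist` and `to_set` functions, and the weight constants are hypothetical and are not taken from the project's source. It only shows the general shape of combining liked and disliked genres with a light loved-artist signal.

```rust
use std::collections::HashSet;

// Hypothetical weights; the real scoring constants are not documented in this README.
const LIKED_GENRE_WEIGHT: f64 = 1.0;
const DISLIKED_GENRE_WEIGHT: f64 = -1.0;
const LOVED_ARTIST_BONUS: f64 = 0.5;

/// Illustrative taste profile built from CLI flags and stored history.
/// All names are stored lowercased so matching is case-insensitive.
pub struct TasteProfile {
    pub liked_genres: HashSet<String>,
    pub disliked_genres: HashSet<String>,
    pub loved_artists: HashSet<String>,
}

/// Lowercases a list of names into a set.
pub fn to_set(items: &[&str]) -> HashSet<String> {
    items.iter().map(|s| s.to_lowercase()).collect()
}

/// Scores one artist: genre overlap dominates, and an exact loved-artist
/// match adds a smaller bonus on top.
pub fn score_artist(name: &str, genres: &[&str], profile: &TasteProfile) -> f64 {
    let mut score = 0.0;
    for genre in genres {
        let key = genre.to_lowercase();
        if profile.liked_genres.contains(&key) {
            score += LIKED_GENRE_WEIGHT;
        }
        if profile.disliked_genres.contains(&key) {
            score += DISLIKED_GENRE_WEIGHT;
        }
    }
    if profile.loved_artists.contains(&name.to_lowercase()) {
        score += LOVED_ARTIST_BONUS;
    }
    score
}
```

Keeping the loved-artist bonus smaller than a single genre match is one way to make stored history a light signal, as the list describes, so it cannot outweigh an explicit `--disliked-genre`.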
Quick start
cargo run -- init-db
cargo run -- doctor
cargo run -- musicbrainz-health
cargo run -- add-loved-artist Broadcast --source manual --note "favorite band"
cargo run -- import-loved-artists ./loved-artists.csv --default-source manual
cargo run -- list-loved-artists
cargo run -- add-attended-show --artist Broadcast --venue "Fox Theater" --source manual
cargo run -- import-attended-shows ./attended-shows.csv --default-source history.csv
cargo run -- list-attended-shows --unresolved-only
cargo run -- completions --all-shells --output-dir ./completions
cargo run -- ingest-all --artist-limit 500
cargo run -- recommend --liked-genre punk --disliked-genre edm
Notes
- The app expects the crawler service at `http://localhost:11235/crawl`
- The app now defaults to XDG locations: `~/.cache/music-recommender`, `~/.config/music-recommender`, and `~/.local/share/music-recommender`
- On first run, the app writes a config template to `~/.config/music-recommender/config.toml`
- Point the app at your own MusicBrainz by setting `musicbrainz_domain = "192.168.50.119:5000"` or `musicbrainz_domain = "https://musicbrainz.example.com"` in config; the app derives the `/ws/2` base URL automatically
- `musicbrainz_base_url` still exists as an explicit full-URL override and takes precedence over `musicbrainz_domain`
- `crawler_url` in config may be a comma-separated fallback list
- Set `searx_search_url` and `crawl4ai_url` in `~/.config/music-recommender/config.toml` or via `SEARX_SEARCH_URL`/`CRAWL4AI_URL`; both accept comma-separated failover lists tried in order, and `crawl4ai_url` can be either the bare domain or the full `/md` endpoint
- The default SQLite database lives at `~/.local/share/music-recommender/music-recommender.sqlite3`
- On first run, the bundled Every Noise seed file is copied into `~/.local/share/music-recommender/everynoise_genres_20260317_135315.json` unless `MUSIC_RECOMMENDER_EVERYNOISE_PATH` is set
- Crawler and MusicBrainz responses are cached under `~/.cache/music-recommender`
- Show crawler responses default to a 24 hour TTL and signal crawler responses default to a 48 hour TTL
- Configure crawler TTLs with `crawler_show_cache_ttl_secs` and `crawler_signal_cache_ttl_secs`
- The crawler also has configurable timeout and retry settings: `crawler_timeout_secs`, `crawler_max_retries`, and `crawler_retry_base_delay_ms`
- Set `musicbrainz_user_agent` in `~/.config/music-recommender/config.toml` or via `MUSIC_RECOMMENDER_MUSICBRAINZ_USER_AGENT`; use a real contactable value like `music-recommender/0.1.0 (you@example.com)`
- MusicBrainz lookups retry transient failures; for public MusicBrainz the default is a small self-throttle with parallelism `1`, while self-hosted MusicBrainz defaults to no throttle and higher `musicbrainz_parallelism`
- Tune MusicBrainz behavior with `musicbrainz_min_interval_ms`, `musicbrainz_max_retries`, `musicbrainz_retry_base_delay_ms`, and `musicbrainz_parallelism`
- Use `cargo run -- doctor` to print resolved config and endpoint reachability
- Use `cargo run -- musicbrainz-health` to run a fast MusicBrainz-specific probe that validates both artist search and artist lookup with usable genre data
- Use `cargo run -- import-loved-artists ./loved-artists.csv --default-source manual` to store explicit loved-artist assertions from CSV and immediately try to enrich them with MusicBrainz genres
- Use `cargo run -- add-loved-artist Broadcast --source manual` for quick one-off loved-artist entry without preparing a CSV first
- Use `cargo run -- list-loved-artists` to inspect the stored per-source loved-artist assertions, including notes, confidence, and any persisted MBID
- Use `cargo run -- add-attended-show --artist "Visages" --venue "Public Works" --source manual` to log attended shows directly from the CLI
- Use `cargo run -- import-attended-shows ./attended-shows.csv --default-source history.csv` to import attended-show history from CSV using an `artists` column split by `|`
- Use `cargo run -- ingest-signals --source kexp --refresh` to ingest a broader recent KEXP review set plus a recent KEXP playlist snapshot
- `add-attended-show` tries exact local resolution first, then MusicBrainz for artists; unresolved venue/artist names are preserved instead of discarded
- `import-attended-shows` uses the same resolution path and also merges likely duplicates so a corrected typo re-import can land on the earlier unresolved row instead of creating a second copy
- Use `cargo run -- list-attended-shows --unresolved-only` to review typo-prone or still-unmatched attended-show rows and see likely local candidates
- KEXP review mentions and KEXP playlist plays are now stored as separate editorial sources, so recommendation buzz/source diversity can treat them comparably to NPR and Needledrop support
- KEXP review and playlist support can also weakly reinforce an artist's already-strong genres during enrichment, so KEXP affects genre overlap as well as editorial buzz
- If the local crawler is down, `ingest-signals --source kexp` now warns and continues with the playlist snapshot instead of failing the whole KEXP ingest
- Use `-v` or `--verbose` on any command to print detailed request/progress output
- Imported loved artists and resolved attended-show artists now feed `recommend` automatically as light exact-match and cautious genre-overlap signals
- `recommend` now defaults to compact output; add `--explain` if you want the full accumulated reason text
- `recommend` now caches results under `~/.cache/music-recommender/recommend` for 15 minutes and invalidates when recommendation-relevant data changes
- Use `cargo run -- recommend --refresh` to force a recompute, or `cargo run -- recommend --no-cache` to bypass the recommendation cache entirely
- Use `cargo run -- recommend --after 10d --before 30d` to limit recommendations to shows happening within a relative day window
- Use `cargo run -- recommend --after 2026-05-01 --before 2026-06-30` to filter by absolute show-day bounds
- `--after` and `--before` work at the local event-day level and combine with `--horizon-days`; invalid ranges where `--after` resolves later than `--before` fail fast
- Use `cargo run -- completions --all-shells --output-dir ./completions` to generate Bash, Zsh, Fish, Elvish, Nushell, and PowerShell completions plus an install guide
- Use `cargo run -- completions --shell zsh --stdout` to print a single shell script directly to stdout
- Use `cargo run -- ingest-all --artist-limit 500` to run `ingest-shows`, `ingest-signals`, and `enrich-artists` in one command
- Add `--check-musicbrainz` to `enrich-artists` or `ingest-all` to run the fast MusicBrainz preflight before any MusicBrainz-backed work starts and fail early if it does not pass
- `ingest-all` fails fast by default; add `--allow-signal-ingest-failures` if you want it to continue into enrichment when editorial signal fetching fails
- Use `--cache-ttl-hours <n>` on `ingest-shows` or `ingest-signals` to override crawler TTLs for that run
- Use `--show-cache-ttl-hours <n>` and `--signal-cache-ttl-hours <n>` on `ingest-all` to override stage-specific crawler TTLs
- Use `--cache-forever` on `ingest-shows` or `ingest-signals` to reuse the last crawler response regardless of age
- Use `--show-cache-forever` and `--signal-cache-forever` on `ingest-all` to reuse stale cache for those stages
- Use `--refresh` to skip cache reads and write fresh responses, or `--no-cache` to skip both reads and writes entirely
- The Rust implementation avoids `unsafe`
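
The notes on `musicbrainz_domain`, `musicbrainz_base_url`, and the comma-separated failover lists describe behavior that is easier to follow with a concrete sketch. The one below is illustrative only and does not come from the project's source: the function names `resolve_musicbrainz_base_url` and `parse_failover_list` are hypothetical, and defaulting a bare `host:port` to plain `http://` is an assumption, since the notes do not say which scheme the app picks.

```rust
/// Resolves the MusicBrainz web-service base URL from two optional config values.
/// `base_url` stands in for `musicbrainz_base_url` and `domain` for
/// `musicbrainz_domain`; both parameter names are illustrative.
pub fn resolve_musicbrainz_base_url(
    base_url: Option<&str>,
    domain: Option<&str>,
) -> Option<String> {
    // The explicit full-URL override takes precedence over the domain setting.
    if let Some(url) = base_url.map(str::trim).filter(|u| !u.is_empty()) {
        return Some(url.trim_end_matches('/').to_string());
    }

    // With neither value set there is nothing to derive.
    let domain = domain.map(str::trim).filter(|d| !d.is_empty())?;
    let domain = domain.trim_end_matches('/');

    // Assumption: a bare host[:port] is treated as plain HTTP.
    let with_scheme = if domain.starts_with("http://") || domain.starts_with("https://") {
        domain.to_string()
    } else {
        format!("http://{domain}")
    };

    // Append the web-service path the notes say is derived automatically.
    Some(format!("{with_scheme}/ws/2"))
}

/// Splits a comma-separated failover list into trimmed, non-empty entries,
/// preserving order so callers can try them first to last.
pub fn parse_failover_list(raw: &str) -> Vec<String> {
    raw.split(',')
        .map(str::trim)
        .filter(|entry| !entry.is_empty())
        .map(str::to_string)
        .collect()
}
```

Under these assumptions, a domain of `192.168.50.119:5000` would resolve to `http://192.168.50.119:5000/ws/2`, while any non-empty `musicbrainz_base_url` would be returned unchanged apart from a trimmed trailing slash.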