antiochIst
I built a system that monitors ~200,000 news RSS feeds in near real-time and clusters related articles to show how stories spread across the web.
It uses Snowflake’s Arctic model for embeddings and HNSW for fast similarity search. Each “story cluster” shows who published first, how fast it propagated, and how the narrative evolved as more outlets picked it up.
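To make the clustering step concrete, here is a minimal sketch of how articles could be grouped by embedding similarity and summarized per cluster. It is not the actual pipeline: the `Article` and `Cluster` types, the `assign` and `summarize` helpers, and the 0.8 cosine threshold are all hypothetical, the embeddings are toy 2-D vectors rather than Arctic outputs, and a brute-force scan over cluster centroids stands in for the HNSW index.

```python
from dataclasses import dataclass, field
from math import sqrt


@dataclass
class Article:
    outlet: str
    published: float          # Unix timestamp in seconds
    embedding: list[float]    # toy vector; a real one would come from the embedding model


@dataclass
class Cluster:
    articles: list[Article] = field(default_factory=list)
    centroid: list[float] = field(default_factory=list)


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def assign(article: Article, clusters: list[Cluster], threshold: float = 0.8) -> Cluster:
    """Attach the article to the most similar cluster, or start a new one.

    The linear scan here is the part an approximate nearest-neighbor
    index such as HNSW would replace at scale. The threshold is an
    assumed value, not one taken from the post.
    """
    best = max(clusters, key=lambda c: cosine(article.embedding, c.centroid), default=None)
    if best is None or cosine(article.embedding, best.centroid) < threshold:
        best = Cluster()
        clusters.append(best)
    best.articles.append(article)
    n = len(best.articles)
    if n == 1:
        best.centroid = list(article.embedding)
    else:
        # Incremental update of the mean of all member embeddings.
        best.centroid = [c + (x - c) / n for c, x in zip(best.centroid, article.embedding)]
    return best


def summarize(cluster: Cluster) -> dict:
    """Report who published first and how long the story took to spread."""
    ordered = sorted(cluster.articles, key=lambda a: a.published)
    first, last = ordered[0], ordered[-1]
    return {
        "first_outlet": first.outlet,
        "outlets": len({a.outlet for a in ordered}),
        "spread_seconds": last.published - first.published,
    }
```

Calling `assign` once per incoming article and `summarize` per cluster gives the "who published first" and "how fast it propagated" fields. Tracking how the narrative evolves would need more than this, for example comparing each new article against the earliest members of its cluster.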
Would love feedback on the architecture, scaling approach, and any ways to make the clusters more accurate or useful.
Live demo: https://yandori.io/news-flow/
Comments URL: https://news.ycombinator.com/item?id=46053076
Points: 80
# Comments: 22