I almost shipped a RAG pipeline that, on certain questions, cited exactly the right document — and...
My eval harness paid for itself on the first run: 0.57 to 0.96, two bugs no unit test could catch
Most phishing detection APIs check URL reputation databases. The problem? Brand new phishing sites...
Notes following a discussion on how memory works in language models - and how it could be improved:...
My AI conversations were scattered across three apps that couldn't remember each other. So I built a...