language processing and math

This blog contains a collection of article summaries, work-in-progress ideas, and personal updates.

I don't believe in empirical science. I only believe in a priori truth.

Kurt Gödel
Top AI Safety Dataset Contains Harmful Data: RLHF and Data Cleaning
Exploring the dataset used to align Anthropic's large language model Claude
February 27, 2023
How to Visualize a Pinecone Vector Database
Visualizing a Pinecone Vector Database
February 24, 2023
How to Visualize OpenAI Embeddings
Visualizing embeddings from the OpenAI API
February 22, 2023
How a Subreddit Made Millions from COVID-19
NLP analysis of a stock trading subreddit during COVID-19-induced volatility.
March 25, 2020
Multi-label Document Classification with BERT
Designing a language-model-powered document classification architecture.
September 14, 2019
Collection o' Links
Links to learning materials, book recommendations
Meta-post