ML

I enjoy ML research; I think it's both fun and important.

I wrote one paper on my own and contributed to a couple more at Cohere, before we largely stopped publishing to focus on building the product.


Neural nets should not spend the same amount of compute on every example: some points are demonstrably easier to classify than others.

https://arxiv.org/abs/2007.13512
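
To make "less compute on easy examples" concrete, here is a minimal sketch of one such mechanism, confidence-based early exiting; the `blocks`/`heads` structure and the 0.9 threshold are illustrative assumptions on my part, not necessarily what the paper does:

```python
import torch

@torch.no_grad()
def early_exit_predict(blocks, heads, x, threshold=0.9):
    """Run the network block by block, stopping as soon as an
    intermediate classifier head is confident enough. Easy examples
    exit early and cheaply; hard ones pay for the full depth.

    blocks: list of nn.Module feature extractors, applied in sequence
    heads:  list of nn.Module classifiers, one per block
    x:      a single example with a batch dimension of 1
    """
    h = x
    for depth, (block, head) in enumerate(zip(blocks, heads)):
        h = block(h)
        probs = torch.softmax(head(h), dim=-1)
        confidence, prediction = probs.max(dim=-1)
        if confidence.item() >= threshold:
            return prediction.item(), depth  # exited early
    return prediction.item(), depth  # needed the whole network
```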


You can use a trained language model to select good datapoints to train future models on.

https://arxiv.org/abs/2107.02565
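
A rough sketch of what likelihood-based selection can look like, using Hugging Face `transformers`; the GPT-2 scorer, the NLL thresholds, and the "keep the middle band" heuristic are my assumptions, not the paper's exact recipe:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

@torch.no_grad()
def nll_per_token(text: str) -> float:
    ids = tok(text, return_tensors="pt").input_ids
    # With labels == input_ids, the model returns the mean token NLL.
    return model(ids, labels=ids).loss.item()

def select(docs, lo=2.0, hi=5.0):
    # Drop documents the model finds trivially easy (likely boilerplate
    # or duplicates) and documents it finds wildly surprising (likely
    # noise); keep the informative middle.
    return [d for d in docs if lo <= nll_per_token(d) <= hi]
```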


You can use the likelihood of a large language model to identify toxic content in a dataset.

https://arxiv.org/abs/2108.07790
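
One plausible instantiation, sketched below: condition the model on each document, then measure how likely it finds a harmful "trigger" phrase. Documents that make the trigger very likely get filtered out. The trigger text and the decision threshold are mine; treat this as an illustration rather than the paper's exact procedure.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

@torch.no_grad()
def trigger_nll(document: str, trigger: str) -> float:
    """Mean NLL of `trigger` given `document` as context. A low value
    means the document makes the harmful phrase more likely."""
    doc_ids = tok(document, return_tensors="pt").input_ids
    trig_ids = tok(trigger, return_tensors="pt").input_ids
    ids = torch.cat([doc_ids, trig_ids], dim=1)
    labels = ids.clone()
    labels[:, : doc_ids.size(1)] = -100  # score only the trigger tokens
    return model(ids, labels=labels).loss.item()

def looks_toxic(document: str, trigger: str, threshold: float = 3.0) -> bool:
    # Hypothetical threshold: flag documents under which the trigger
    # phrase becomes too probable.
    return trigger_nll(document, trigger) < threshold
```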




What follows is a compilation of papers I found particularly helpful or thought-provoking.


It's possible to measure the intrinsic dimension of the loss landscape.

https://arxiv.org/abs/1804.08838
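
The measurement works by training in a random low-dimensional subspace of the full parameter space: reparameterize the D weights as theta = theta0 + P v, where P is a frozen random D x d matrix and only the d-dimensional v is trained. A sketch (the scaling of P and the wiring back into a real model are my choices):

```python
import torch

def make_subspace(theta0: torch.Tensor, d: int):
    """Reparameterize a D-parameter model through a frozen random
    projection: theta = theta0 + P @ v, with only v trainable."""
    D = theta0.numel()
    P = torch.randn(D, d) / d ** 0.5        # fixed random basis
    v = torch.zeros(d, requires_grad=True)  # all learning happens here

    def theta() -> torch.Tensor:
        # Load this flat vector back into the model on each forward
        # pass, e.g. with torch.func.functional_call.
        return theta0 + P @ v

    return theta, v
```

The intrinsic dimension is then the smallest d at which training v alone recovers about 90% of the performance of training all D parameters directly.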


The contextual embeddings from BERT are much less anisotropic than those from GPT-2: GPT-2's vectors crowd into a narrow cone of the embedding space, so even unrelated words have high cosine similarity, while BERT's are more spread out.

https://arxiv.org/abs/1909.00512
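
Anisotropy is straightforward to measure: sample contextual embeddings of unrelated words and compute their average pairwise cosine similarity. If the vectors were spread evenly in all directions it would be near zero. A small sketch:

```python
import torch
import torch.nn.functional as F

def anisotropy(embeddings: torch.Tensor) -> float:
    """Average cosine similarity between embeddings of unrelated
    tokens, shape (n, dim). Near 0: directions are spread evenly;
    near 1: everything sits in a narrow cone."""
    x = F.normalize(embeddings, dim=-1)
    sims = x @ x.T
    n = x.size(0)
    off_diag = sims.sum() - sims.diagonal().sum()  # exclude self-similarity
    return (off_diag / (n * (n - 1))).item()
```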


Big convolutional neural networks are poorly calibrated, but big pre-trained transformers are surprisingly well calibrated.

https://arxiv.org/abs/1706.04599 and https://arxiv.org/abs/2003.07892
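
"Calibrated" means the model's confidence tracks its accuracy: of the predictions made with 80% confidence, about 80% should be correct. The standard metric, used in the first paper, is Expected Calibration Error:

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=15):
    """Bin predictions by confidence and compare each bin's average
    confidence to its accuracy; report the weighted average gap.

    confidences: (n,) max softmax probability of each prediction
    correct:     (n,) 1 if the prediction was right, else 0
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap  # weight by the bin's share of data
    return ece
```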


Imposing fairness constraints on ML systems can backfire in the long run. This paper scooped my Master's thesis!

https://arxiv.org/abs/1803.04383


Predictive policing is a bad idea: it creates runaway feedback loops that keep sending police to the same neighborhoods regardless of the true crime rate.

https://arxiv.org/abs/1706.09847, which builds on Lum and Isaac's fantastic "To predict and serve?"