Machine Learning

Videos, Articles, Books, and Blog Posts

Facial Recognition Auditing
Data Skeptic | 19 June 2020

Podcast about the importance of de-biasing training sets and how thorough that de-biasing must be. It also stresses the need to be transparent about how a model is trained and how it works.

Interpretable AI in Healthcare
Data Skeptic | 17 May 2020

Academic Papers

Here is a list of important academic papers that are also well written and make a clear effort to be accessible (as far as that is possible at the academic research level).

Chipman, H.A., George, E.I., and McCulloch, R.E. "BART: Bayesian additive regression trees". 2010. The Annals of Applied Statistics, 4(1): 266-298.

BART is a really cool decision-tree-based machine learner. The algorithm is one of the best predictive regression algorithms out there, and it has multiple easy-to-use R and Python implementations, including the R packages dbarts and bartMachine. Also, see the paper below for the many uses of BART since its publication, including in causal inference.

Hill, Linero, and Murray "Bayesian Additive Regression Trees: A Review and Look Forward". Oct 2019. Annual Review of Statistics and Its Application.

This paper summarizes where BART has been used, how it works, BART adaptations, and BART for causal inference. See also Hill (2011) for a review of BART and how it can be used in causal inference.

Thiagarajan, Sattigeri, Rajan, Venkatesh "Calibrating Healthcare AI: Towards Reliable and Interpretable Deep Predictive Models." May 2020. arXiv. (preprint)

A new paper that builds on the burgeoning field of "Explainable Machine Learning". It attempts to incorporate counterfactual reasoning (see the causal inference tab) into a variational autoencoder, a type of neural network that can learn without labels (i.e., unsupervised learning). The paper also combines the model output with "expert human insight" and introduces reliability plots, which show how well the model works versus how much influence the human expert has. See the podcast above.
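For readers who have not met reliability plots before, here is a minimal sketch of the generic idea for a binary classifier: group predictions into probability bins and compare the average predicted probability in each bin with the fraction of positives actually observed. This is not the paper's method; the function names, the equal-width binning, and the toy data are all assumptions made for illustration, and predictions are assumed to lie in [0, 1].

```python
def reliability_bins(probs, labels, n_bins=5):
    """Group predictions into equal-width probability bins.

    Returns one (mean_predicted_prob, observed_frequency, count)
    tuple per non-empty bin. Assumes every prob is in [0, 1].
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        # Clamp the index so that p == 1.0 lands in the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))

    rows = []
    for members in bins:
        if not members:
            continue
        mean_p = sum(p for p, _ in members) / len(members)
        freq = sum(y for _, y in members) / len(members)
        rows.append((mean_p, freq, len(members)))
    return rows


def expected_calibration_error(probs, labels, n_bins=5):
    """Count-weighted average gap between predicted and observed rates."""
    rows = reliability_bins(probs, labels, n_bins)
    total = sum(count for _, _, count in rows)
    return sum(count * abs(mean_p - freq) for mean_p, freq, count in rows) / total


if __name__ == "__main__":
    # Hypothetical toy predictions and true labels, for illustration only.
    toy_probs = [0.1, 0.1, 0.9, 0.9]
    toy_labels = [0, 0, 1, 1]
    for mean_p, freq, count in reliability_bins(toy_probs, toy_labels):
        print(f"predicted={mean_p:.2f} observed={freq:.2f} n={count}")
    print("ECE:", expected_calibration_error(toy_probs, toy_labels))
```

Plotting the first two values of each row against each other gives the reliability plot: a well-calibrated model sits close to the diagonal.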

Raji, Gebru, Mitchell, Buolamwini, Lee, Denton "Saving Face: Investigating the Ethical Concerns of Facial Recognition Auditing" Jan 2020. arXiv. (preprint)

The podcast "Facial Recognition Auditing" listed above summarizes this paper.

Radford, Wu, Child, Luan, Amodei, Sutskever "Language Models are Unsupervised Multitask Learners" 2019. OpenAI.

This paper works to generalize NLP (natural language processing) techniques, showing that a single language model can perform many tasks without task-specific training. One tool that can arise from the methods introduced in this paper is a text generator. In particular, I saw someone use the method to try to create a sitcom writer by training on every scene from the entire run of Seinfeld (a whole lot of nonsense to sift through). The result was... well, you can judge for yourself after reading this blog post.