Statistics books for summer 2020
My list of stats books for the rest of summer 2020:
ML:
- Scipy Lectures by https://scipy-lectures.org/preface.html#authors
- AAAMLP(2020) by @abhi1thakur
- HON(2019) by @aureliengeron
Econometrics:
- ROS(2020) by @StatModeling, Jennifer Hill and @avehtari
Links, pics and comments at https://bit.ly/38WSyPg
Two new titles on data analysis were published in summer 2020: "AAAMLP" (machine learning) and "ROS" (econometrics). These two books reminded me of other similar titles and made me think about how to put reading on data analysis into perspective. I soon realised I can sort the learning resources along the following lines:
- Topics covered: econometrics vs machine learning vs deep learning
- Release date, edition: newly released books vs established textbooks
- Programming language: Python vs R vs Julia (also other open source and proprietary)
- Access: free vs paid
- Code: repository with code examples
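The sorting criteria above can be sketched as a small catalogue in Python. This is only an illustration, not a tool used for this post: the `Resource` class, its field names and the two helper functions are hypothetical, and any attribute this post does not state for a title is left as `None` rather than guessed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Resource:
    """One learning resource, described by some of the criteria listed above."""
    short_name: str
    topic: str                    # e.g. "machine learning" or "econometrics"
    year: Optional[int]           # release year; None = not stated in this post
    language: str                 # programming language of the code examples
    access: Optional[str] = None  # "free" or "paid"; None = not stated in this post


# Values are taken from this post only. The Julia notebook is filed under
# machine learning because that is where it sits in the reading path.
CATALOGUE = [
    Resource("Scipy Lectures", "machine learning", None, "Python"),
    Resource("AAAMLP", "machine learning", 2020, "Python", "paid"),
    Resource("HON", "machine learning", 2019, "Python"),
    Resource("Julia for Pythonistas", "machine learning", None, "Julia"),
    Resource("ROS", "econometrics", 2020, "R"),
]


def by_topic(catalogue, topic):
    """Return the short names of all resources covering the given topic."""
    return [r.short_name for r in catalogue if r.topic == topic]


def newly_released(catalogue, year=2020):
    """Return the short names of resources released in the given year."""
    return [r.short_name for r in catalogue if r.year == year]


print(by_topic(CATALOGUE, "econometrics"))
print(newly_released(CATALOGUE))
```

More criteria (edition, code repository) could be added as extra fields in the same way.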
So far I've linked 5 titles, and the reading path among them looks like this.
For machine learning:
- read Scipy Lectures before "AAAMLP"
- "HON" (2019) goes well with "AAAMLP" (2020)
- all of the above use Python
- if you are interested in Julia, the "HON" author has a comprehensive notebook tutorial
For econometrics:
- go through the code examples on the "ROS" website; the code is in R
- (Hm... not a path yet! But I plan to add more books here from https://trics.me/textbook/index.html.)
Please let me know if this book listing makes sense to you.
Scipy Lectures
A great (but underrated) read that starts with Python basics and moves on to machine learning tasks with scikit-learn in the final chapter.
Approaching (Almost) Any Machine Learning Problem by Abhishek Thakur ("AAAMLP")
- Out as a Kindle book in mid-2020, see it on Amazon. It is paid content, but inexpensive.
- Author on Twitter
- Rahul Dave recommends this book
Annotation:
This book is for people who have some theoretical knowledge of machine learning and deep learning and want to dive into applied machine learning. The book doesn't explain the algorithms but is more oriented towards how and what should you use to solve machine learning and deep learning problems. The book is not for you if you are looking for pure basics. The book is for you if you are looking for guidance on approaching machine learning problems. The book is best enjoyed with a cup of coffee and a laptop/workstation where you can code along.
Hands-On Machine Learning with Scikit-Learn, Keras and TensorFlow by Aurélien Géron
- Author on Twitter
- See TOC
From annotation:
You’ll learn a range of techniques, starting with simple linear regression and progressing to deep neural networks.
(You really will.)
Julia for Pythonistas (Colab notebook) by Aurélien Géron
This is a walk-through of the Julia programming language by the author of Hands-On Machine Learning (above).
https://colab.research.google.com/github/ageron/julia_notebooks/blob/master/Julia_for_Pythonistas.ipynb
Regression and Other Stories ("ROS")
- Published by CUP in mid-2020, ebook sold on Amazon
- See TOC
- Has a code repo
- Authors on Twitter: Andrew Gelman, Jennifer Hill, and Aki Vehtari
- Predecessor book: Data Analysis Using Regression and Multilevel/Hierarchical Models. ROS is an updated and expanded version of the first part of this book.
For further review
This section collects items I might add to the learning path later. For now, please ignore it.
Video:
Books:
- @avehtari "BDA" course and book (has links to simpler Bayesian courses)
- @rlmcelreath "Statistical Rethinking" (SR) and video
- Full text of ISLR-7
- Data Science at the Command Line with code
Papers:
- P. Huenermund on "Causal Inference in Machine Learning and AI" (with lessons for econometrics)
- Athey, Imbens. Machine Learning Methods That Economists Should Know About
- Varian. Big Data: New Tricks for Econometrics
R learning resources (in response to DataCamp disaster):
Posts:
- Anatomy of a plot and visualisation cheatsheets:
- ggplot
- matplotlib
Quotes
Some quotes to keep the mind awake and open, found on Twitter, July 2020.
Statistics means never having to say you’re certain.
— Daniela Witten (@daniela_witten) July 10, 2020
There are a lot more people who know how to move data around than who know what to do with it.
— Probability Fact (@ProbFact) July 1, 2020
Everyone has an idea, what sets people apart is execution.
— Shane Parrish (@ShaneAParrish) July 1, 2020
just because you made the mistakes doesn’t mean you learned the lesson
— Rob Rix (@rob_rix) June 23, 2020