Getting into: Data Science

I have been a data scientist for over three years, and I started looking into the field at least five years ago.

These are the references I would recommend to people getting into the field. They are the books and online courses that I have found most useful in the long run.

These books are all tool-specific. I use Python and web technologies to build prototypes. This has been extremely useful because I had to build up a data science team, and a big part of the job was to get people excited about the potential of data science.

  1. Hands-On Machine Learning with Scikit-Learn and TensorFlow. This is the best book I have read on the Python stack for machine learning. It strikes a good balance between theory, practice and code examples.
  2. Effective Computation in Physics: Field Guide to Research with Python. This book covers most of the basics. It teaches you how to use the terminal, Make, and all the basic Python packages for scientific computation. I recommend it as a good all-round introduction to the Python stack for data science.
  3. D3.js in Action and Interactive Data Visualization for the Web. It is possible to get people excited about tables and static graphs if you do the presentation right, but it is so much easier if you are demoing a fully interactive product. You can show how your results could be used.
  4. Flask Web Development: Developing Web Applications with Python. This book is useful because it allows you to build a functioning prototype of a website.
  5. Clean Code, Code Complete or The Pragmatic Programmer. The general idea here is just to read a book about how to write maintainable code. This is especially important because a lot of data scientists have a background in the natural sciences, where the approach to coding is often: when it seems to work, it is done.

I plan to keep adding to this list as more useful references emerge.