Course resources
Articles and books
- Getting started with Python for research, a gentle introduction to Python in data-intensive research.
- An introduction to "Data Science", a collection of Notebooks by BIDS' Stéfan van der Walt.
- Effective Computation in Physics, by Kathryn D. Huff and Anthony Scopatz, with Notebooks to accompany the book.
- A Whirlwind Tour of Python, by Jake VanderPlas.
- The Python Data Science Handbook, by Jake VanderPlas.
- Python for Data Analysis, 2nd Edition, by Wes McKinney, creator of Pandas. Companion Notebooks are available.
- Effective Pandas, a book by Tom Augspurger, core Pandas developer.
- Implementing Reproducible Research, edited by V. Stodden, F. Leisch and R. Peng, contains the week's first reading assignment, as well as others I will suggest through the semester. All the chapters are available at that link.
- The Practice of Reproducible Research: Case Studies and Lessons from the Data-Intensive Sciences is a collection of real-world case studies on the pitfalls and complexities of implementing research reproducibly. It was a collaborative effort edited and coordinated by Kitzes, Turek and Deniz, scientists at the Berkeley Institute for Data Science.
- Reproducibility: a Primer on Semantics and Implications for Research contains an excellent overview of the main concepts in this field.
- Elegant SciPy, a collection of example-oriented lessons on how to best use the scientific Python toolkit, by the creator of Scikit-Image and BIDS researcher Stéfan van der Walt. In addition to the O'Reilly reader version, the full book and all the notebooks are available.
O'Reilly book access note: to access the O'Reilly Safari book library for free while you are physically on campus, you must be connected to WiFi via either AirBears2 or eduroam, not CalVisitor. This page explains the available WiFi options, and on the right provides links for you to set up either AirBears2 or eduroam access on your laptop or phone.
If you want to access this service while off-campus, you can set up a Firefox/Chrome profile configured to use the UC Berkeley Library proxy, which will give you access to all library-related materials (not only Safari Online books but also all academic journals available on campus). This page has details and instructions on the Library Proxy service.
Online tutorials
- The Berkeley Stats department maintains a very nice collection of tutorials on a variety of computing topics, many relevant to this course. In particular, I recommend you take a look at:
  - Using the bash shell. The most convenient way to read this will be to locally clone the repository and open the bash.html file with your computer's web browser.
  - Introduction to Git. This is based on the notes we used in this class, but has a fair amount of additional explanation and detail you may find useful through the semester.
GitHub Student pack
You should apply for a Student Developer Pack, which will give you a lot of free resources on GitHub for as long as you're a student.