Stat 159/259 - Reproducible and Collaborative Data Science
All materials for this course are available on GitHub.
The class syllabus will be updated during the first couple of weeks of class.
Lectures
- An interactive Git Tutorial: the tool you didn’t know you needed
- A quick overview of the Jupyter Notebook and IPython
- Reading discussion - Developing open source scientific practice
- Reading discussion - Scientific Python, IPython, Jupyter
- Class practice: strings, lists & numbers
- Conda and pip - managing environments
- From September 25 reading
- Make: automating tasks
- LIGO, the 2017 Nobel prize in physics, and wrapping up Makefiles
- An Introduction to the Scientific Python Ecosystem
- Motivation: the trapezoidal rule
- NumPy arrays: the right data structure for scientific computing
- High quality data visualization with Matplotlib
- The trapezoidal rule: vectorization and performance
- Matplotlib: Beyond the basics
- Matplotlib: Live plots
- Matplotlib image tutorial
- Multichannel images
- Strings
- NLTK: Natural Language Made Easy
- Data - an introduction to the world of Pandas
- Data Manipulation
- An introduction to Sphinx
- P-values discussion with Prof. Philip B. Stark
- Testing your software in Python
- Simple chaotic behavior in non-linear systems