Please complete the following steps before the workshop on Saturday, January 15. The goal is to set up, and become familiar with, a toolkit suited to machine learning and to extracting knowledge from data sets.

Complete these tasks by the workshop on Saturday, January 15:

  1. Revisit what you learnt about linear and logistic regression in the Workshop on Statistical Methods in November. What is the difference between linear and logistic regression?

  2. Familiarize yourself with scikit-learn by completing the two (small) tasks in the following Jupyter notebook (also accessible here). You can use either Google Colab or your local Python environment. The following class might be helpful:

  3. Read the article 50 Years of Data Science by David Donoho. According to Donoho, what is a crucial (and sometimes under-appreciated) paradigm that led to the development of technologies like smartphone voice recognition or machine translation?
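
If you would like a warm-up before opening the notebook, the sketch below fits both regression models from the first task with scikit-learn. It is not taken from the workshop notebook: the synthetic data, the variable names, and the choice of `LinearRegression` and `LogisticRegression` are assumptions made purely for illustration. It should run in Google Colab or in any local environment with NumPy and scikit-learn installed.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

# Synthetic data (an assumption for illustration, not workshop data).
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))

# Continuous target: a noisy straight line, suited to linear regression.
y_cont = 2.0 * X[:, 0] + 1.0 + rng.normal(0, 0.5, size=200)

# Binary target (0 or 1): a noisy threshold, suited to logistic regression.
y_bin = (X[:, 0] + rng.normal(0, 0.5, size=200) > 0).astype(int)

lin = LinearRegression().fit(X, y_cont)
log = LogisticRegression().fit(X, y_bin)

X_new = np.array([[-2.0], [0.5], [2.0]])

# Linear regression predicts an unbounded real number.
lin_pred = lin.predict(X_new)

# Logistic regression predicts a class probability in [0, 1] ...
log_proba = log.predict_proba(X_new)[:, 1]
# ... and a class label obtained by thresholding that probability.
log_label = log.predict(X_new)

print("linear predictions:   ", lin_pred)
print("logistic P(y = 1):    ", log_proba)
print("logistic class labels:", log_label)
```

Comparing the three printed arrays is a reasonable starting point for the question in the first task: look at what kind of value each model returns for the same inputs.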