This repository contains the slides for the "Introduction to machine learning" course. See also the moodle page.
We are happy about new contributors. If you contribute something, please feel free to add your name to the team.
- Access to the `master` branch is protected. Please create your own issue-/task-specific branch off the `master` branch to work in, and open a pull request once you're done.
- Do many small, focused, single-issue commits with descriptive commit messages: each commit message should refer to the issue it addresses or fixes, i.e. include something like `addresses #<issuenumber>`, `closes #<issuenumber>` or similar, where applicable.
- We generally work based on the feature branch workflow.
- The person who merges the pull request adds a note to the changelog if the changes are substantial.
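The commit message convention can be checked mechanically. The following is a minimal sketch, not something this repository ships: the function name `references_issue` and the example messages are made up for illustration, and the check only looks for a `#` followed by digits.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repo): succeeds if a commit
# message contains an issue reference such as "closes #12".
references_issue() {
  # match "#" followed by at least one digit
  printf '%s\n' "$1" | grep -Eq '#[0-9]+'
}

# Example messages, invented for illustration
good_msg='fix axis labels in demo plot, closes #123'
bad_msg='fix stuff'

for msg in "$good_msg" "$bad_msg"; do
  if references_issue "$msg"; then
    echo "ok:      $msg"
  else
    echo "missing: $msg"
  fi
done
```

A check like this could be called from a local `commit-msg` hook, but whether that is worthwhile is a matter of taste.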
- Notation on the slides uses `latex-math`. Please do read the accompanying ReadMe and clone `latex-math` into this repo (otherwise you will not be able to render the slides).
- Use the commands defined there, don't define your own.
- If you have to introduce new notation/symbols, you should add it to `latex-math`, after double-checking that
  - it is consistent with what we already have
  - you do not overwrite symbols we have already defined differently
- We write slides for beginners: keep it simple, keep it short
- We try to keep slides modular: slidesets should represent about 15-20 minutes of material and be moderately self-contained.
- Don't put code on the slides, the theory is orthogonal to issues of implementation (... in theory..). Code is strictly for exercises/ practice sessions.
- Compiling the slides should be done via the Makefile: just type `make all` in the specific folder and it will render all slidesets in the folder, or `make <SLIDES>.pdf` to render a specific file `<SLIDES>.tex`. `make` will automatically move a copy of the compiled PDFs to the `slides-pdf` directory. From there, files can be copied into the course website repository in case of a new release. If you use Windows, we recommend that you access make via the Ubuntu bash (take a look at the installation tips).
- We try to keep a "dependency graph" between slide sets up to date so that it's easier to keep track of what material needs to be understood before what else. Please do add appropriate `%! includes:`-comments in your slides to keep this up to date, see also `attic/slide-dependencies.R` and `slides/slide-dependencies.pdf`.
- We recommend usage of `{tinytex}` (install via `tinytex::install_tinytex()`).
- Use `make install` in the slides folder to automagically install all the `R` packages you'll need for the slides, demos and exercises. See also `attic/install.R`.
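To get a quick overview of the `%! includes:` comments without running the R script, one can simply search for the prefix. This is a rough sketch only: the helper name `list_includes`, the demo file and the entries after the prefix are invented, and the actual format expected by `attic/slide-dependencies.R` may differ.

```shell
#!/bin/sh
# Sketch: print "<file>: <rest of line>" for every line that starts
# with the "%! includes:" prefix. Only the prefix is taken from the
# README; everything after it is printed as-is, not parsed.
list_includes() {
  awk '/^%! includes:/ {
         sub(/^%! includes:[ \t]*/, "")
         print FILENAME ": " $0
       }' "$@"
}

# Throwaway demo file with made-up content
tmpdir=$(mktemp -d)
cat > "$tmpdir/slides-demo.tex" <<'EOF'
%! includes: basics-data, basics-learners
\documentclass{beamer}
EOF

list_includes "$tmpdir"/slides-*.tex
```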
- Figures not produced by us are added to the `figure-man` folder of the respective chapter.
- R-files which produce figures should be named `fig-*.R`.
- The basic assumption is that you execute the R-files from the rsrc folder.
- These figure-producing R-files should save their respective figures to `../figure/`. From the name of the figure it should be clear which R-file produced it.
- If you create a new plot or change an existing plot, you need to commit your changes to the R-files as well as the corresponding pdf-files. This means that if you create a new plot, you will have to add the pdf-files with `git add -f *.pdf`, since pdf-files are ignored in this repo by default.
- Utility functions used by more than one R-file should be exported to a separate R-file (also located in the respective rsrc folder).
- Heavy simulations should not be done in the figure-producing R-files. Instead, we only load Rdata files which were produced by separate R-files (also located in the rsrc folder).
- If you replace graphics with new files with a different file name, or if you remove slides with graphics in them, then make sure that you remove unused files. To check if there are unused files in a `figure/` or `figure_man/`-folder, do the following:
  - Make sure you are in the folder that contains the `.tex`-files.
  - Run `make most`, which re-compiles all `.pdf`-files while creating a log of what files were used.
  - Run `../../scripts/check_files_used.sh figure unused slides-*.tex` to list all files in the `figure/`-folder that are unused.
  - Do the same for the `figure_man/`-folder: `../../scripts/check_files_used.sh figure_man unused slides-*.tex`.
  - Remove the unused files from git. The easiest way to do this is to use `git rm <file>`, but you can also delete the file first and then "add the deletion": `rm <file>` followed by `git add <the file that was deleted>`. You can then commit.
  - If you find that you deleted a file that should not have been deleted, you can retrieve it from the git history: through the command line or by browsing the GitHub git history.
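The git side of this (force-adding an ignored pdf, removing it, and retrieving it from the history on the command line) can be tried out safely in a throwaway repository. The sketch below assumes `git` is installed; the file name `fig-demo.pdf`, the commit messages and the user identity are made up, and restoring from the parent of the deleting commit is just one common way to do it.

```shell
#!/bin/sh
set -e
# Throwaway repository; nothing here touches the real repo.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

# pdf-files are ignored by default, as in this repo
echo '*.pdf' > .gitignore
mkdir figure
echo 'dummy' > figure/fig-demo.pdf
git add .gitignore
git add -f figure/fig-demo.pdf      # -f overrides the ignore rule
git commit -q -m "add demo figure"

# remove the figure from git and commit the deletion
git rm -q figure/fig-demo.pdf
git commit -q -m "remove unused demo figure"

# find the last commit that touched the file (the deletion) ...
deleted_in=$(git rev-list -n 1 HEAD -- figure/fig-demo.pdf)
# ... and restore the file from that commit's parent
git checkout -q "${deleted_in}^" -- figure/fig-demo.pdf
ls figure
```

The restored file is staged again after the `git checkout`, so it only needs to be committed.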
- Exercises are organized chapter-wise. Each folder will contain
  - a subdirectory `figure` for plots,
  - a subdirectory `ex_rnw` that contains .Rnw files with single exercises (prefixed with `ex_`) and associated solutions (prefixed with `sol_`),
  - one or multiple exercise sheets (prefixed with `ex_`) and associated solutions (prefixed with `sol_`), sourcing the single snippets from `ex_rnw`,
  - a collection file (prefixed with `collection_`) that assembles all exercises for the given topic (those currently used in the exercise sheets, further existing material, ideas, URLs, ...).
- Compiling the exercises should be done via the Makefile: just type `make all` and it will render all exercises, solutions and collection files, or `make <FILE>.pdf` to render a specific file `<FILE>.Rnw`. `make` will automatically move a copy of the compiled `ex_` and `sol_` PDFs (i.e., those that will appear on the website) to the `exercises-pdf` directory. From there, files can be copied into the course website repository in case of a new release.
- When creating new exercise sheets or collection files, please use the setup provided in `style/preamble_ueb.Rnw` and `style/preamble_ueb_coll.Rnw`.
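The `ex_`/`sol_` naming makes it easy to spot exercises that have no solution yet. The sketch below is a hypothetical check, not a script from this repo; it assumes that an exercise `ex_<name>.Rnw` and its solution `sol_<name>.Rnw` share the same `<name>`, which the guidelines above do not state explicitly. The demo file names are invented.

```shell
#!/bin/sh
# Hypothetical check: print every ex_<name>.Rnw in a directory that
# has no sol_<name>.Rnw next to it (shared <name> is an assumption).
missing_solutions() {
  dir=$1
  for ex in "$dir"/ex_*.Rnw; do
    [ -e "$ex" ] || continue          # glob matched nothing
    name=${ex##*/}                    # strip the directory part
    name=${name#ex_}                  # strip the "ex_" prefix
    [ -e "$dir/sol_$name" ] || echo "ex_$name"
  done
}

# Throwaway demo directory with made-up exercise names
demo=$(mktemp -d)
touch "$demo/ex_knn.Rnw" "$demo/sol_knn.Rnw" "$demo/ex_trees.Rnw"
missing_solutions "$demo"
```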
- Please follow this style guide
- We write code that is meant to be read/worked on by beginners:
- simple and legible is better than complex and elegant
- add a lot of explanatory comments
- use base-R as much as possible
- choose variable names and code designs to maximize legibility and comprehension
Code demos are now in `/code-demos`. Originals at this link.
Google Figures are stored in the G-Drive
- Video files should have the same name as the slide set they are narrating.
- Our videos show the lecturer's head in the bottom right corner
- Make sure you minimize background noise, have good lighting, and do remember to switch off your phone and to sedate or expel your pets / spouses / flatmates / office co-inhabitants for distraction-free recording.
- Make sure you record in a resolution that's high enough to easily read the slides (at least 1280 x 760, higher is better).
- We have excellent USB-Microphones to borrow in Bernd's office
- Many possible workflows; Fabian uses `mpv /dev/video0 --framedrop=no --speed=1.01 --window-scale=0.35 --no-border --ontop` for a borderless, low-latency webcam window and `kazam` for screen capture.
  - In `kazam`, don't forget to
    - set preferences to "USB microphone" & set loudness fairly high
    - set the frame rate to 30
The number of slides and length of videos can be found here and should be updated regularly (i.e. if a new video is published)
The website is updated whenever the master branch is pushed, via the Github action Pkgdown.
The website uses `pkgdown` via `_pkgdown.yml`; its pages are in `\vignettes`.
The automatic deployment uses a "secret" (see repository settings on Github), which is a PAT called `DEPLOY_PAT` (created by Fabian Scheipl, Jan 30 2020).