From a7592d3bdcf79c13dd9cae366f3753cb2fc715af Mon Sep 17 00:00:00 2001 From: Julie Hogan Date: Mon, 16 Dec 2024 16:10:14 -0600 Subject: [PATCH] DAS-ize episodes --- episodes/02-cpp-hello-world.md | 40 ++---------------- episodes/03-root-and-cpp-read-and-write.md | 16 ++------ episodes/04-root-and-cpp-fill-a-histogram.md | 22 +++------- episodes/05-using-root-with-python.md | 43 -------------------- episodes/06-uproot.md | 42 +++++-------------- episodes/07-awkward.md | 4 +- episodes/introduction.md | 2 +- learners/setup.md | 19 ++++++++- 8 files changed, 45 insertions(+), 143 deletions(-) diff --git a/episodes/02-cpp-hello-world.md b/episodes/02-cpp-hello-world.md index e31428c..5b64086 100644 --- a/episodes/02-cpp-hello-world.md +++ b/episodes/02-cpp-hello-world.md @@ -16,44 +16,12 @@ exercises: 10 :::::::::::::::::::::::::::::::::::::::::::::::: -## Setting up your working area - -If you completed the [Docker pre-exercises](https://cms-opendata-workshop.github.io/workshopwhepp-lesson-docker/) -you should already have worked through -[this episode](https://cms-opendata-workshop.github.io/workshopwhepp-lesson-docker/03-docker-for-cms-opendata/index.html), under **Download the docker images for ROOT and python tools and start container**, and you will have - -- a working directory `cms_open_data_root` on your local computer -- a docker container with name `my_root` created with the working directory `cms_open_data_root` mounted into the `/code` directory of the container. - -Start your ROOT container with - -```bash -docker start -i my_root -``` - -In the container, you will be in the `/code` directory and it shares the files with your local `cms_open_data_root` directory. - -:::::::::::: callout - -## If you're using apptainer: - -Whenever you see a `docker start` instruction, replace it with `apptainer shell` to open either the ROOT or Python container image. 
-The specific commands in this pre-exercise and during the live workshop will be given for docker, since that is the most common application. -As a general rule, editing of files will be done in the standard terminal (the containers do not have all text editors!) or via the jupyter-lab interface, and then commands will be executed inside the container shell. If you see `Singularity>` on your command line, you are ready to run a ROOT or python script. - -::::::::::::: - ## Your first C/C++ program (Optional Review!) Let's start with writing a simple `hello world` program in C. First we'll edit the -*source* code with an editor of your choice. - -Note that you will -*edit* the file in a local terminal on your computer and then *run* the file -in the Docker environment. This is because we mounted the `cms_open_data_root` directory -on your local disk such that it is visible inside the Docker container. - -Let's create a new file called `hello_world.cc` in the `cms_open_data_root` directory, using your preferred editor. +*source* code with an editor of your choice. +Go to your [working directory for this exercise](../learners/setup.md), and let's create a +new file called `hello_world.cc`, using your preferred editor. The first thing we need to do, is `include` some standard libraries. These libraries allow us to access the C and C++ commands to print to the screen (`stdout` and `stderr`) as @@ -163,7 +131,7 @@ Hello world! This uses the C++ 'iostream' library to direct output to standard o Hello world! This uses the C++ 'iostream' library to direct output to standard error. ``` -When you are working with the Open Data, you will be looping over events +When you are working with CMS data, you might be looping over events and may find yourself making selections based on certain physics criteria. 
To that end, you may want to familiarize yourself with the C++ syntax for [loops](https://www.w3schools.com/cpp/cpp_for_loop.asp) diff --git a/episodes/03-root-and-cpp-read-and-write.md b/episodes/03-root-and-cpp-read-and-write.md index ac8fec5..c26fb6b 100644 --- a/episodes/03-root-and-cpp-read-and-write.md +++ b/episodes/03-root-and-cpp-read-and-write.md @@ -79,14 +79,6 @@ for a separate workshop, but much of the material is relevant for the Open Data to complete the tutorial. * [ROOT tutorial from Nevis Lab (Columbia Univ.)](https://www.nevis.columbia.edu/~seligman/root-class/). Very complete and always up-to-date tutorial from our friends at Columbia. -::::::::::::::::::: callout -## Be in the container! -For this episode, you'll still be running your code from the `my_root` docker container -that you launched in the previous episode. - -As you edit the files though, you may want to do the editing from your *local* environment, -so that you have access to your preferred editors. -::::::::::::::::::::: ## ROOT terminology @@ -434,12 +426,12 @@ of what you just ran. Huzzah! You've successfully written your first ROOT file! :::::::::::::::::::: callout -## Will I have to `make` my Open Data analysis code? +## Will I have to `make` my analysis code? Maybe! - If you prefer to write your end-level analysis code in C++, your `make` setup will be very similar to this exercise -- If you are analyzing AOD or MiniAOD data and using the dedicated CMS software, a configuration and build system called [SCRAM](https://twiki.cern.ch/twiki/bin/view/CMSPublic/SWGuideScram) is used for this purpose of compiling and linking code. +- If you are analyzing AOD or MiniAOD data and using CMSSW software, a configuration and build system called [SCRAM](https://twiki.cern.ch/twiki/bin/view/CMSPublic/SWGuideScram) is used for this purpose of compiling and linking code. 
- If you only analyze NanoAOD samples and do so in python (see the upcoming lesson pages!), then you will not use `make` :::::::::::::::::::::: @@ -639,7 +631,7 @@ clean: rm -f ./*~ ./*.o ./read_ROOT_file ``` -We can now compile and run the code in your `my_root` container shell! +We can now compile and run the code! ```bash make read_ROOT_file @@ -691,7 +683,7 @@ still using the C++ syntax. :::::::::::::::::::::::::::::: keypoints -- ROOT defines the file format in which all of the CMS Open Data is stored. +- ROOT defines the file format in which all of the CMS data is stored. - These files can be accessed quickly using C++ code and the relevant information can be dumped out into other formats. :::::::::::::::::::::::::::::: diff --git a/episodes/04-root-and-cpp-fill-a-histogram.md b/episodes/04-root-and-cpp-fill-a-histogram.md index b0cb049..56701f4 100644 --- a/episodes/04-root-and-cpp-fill-a-histogram.md +++ b/episodes/04-root-and-cpp-fill-a-histogram.md @@ -29,7 +29,7 @@ we have to a new file, `fill_histogram.cc`. cp read_ROOT_file.cc fill_histogram.cc ``` -Into this file, we'll add some lines at some key spots. Again, use your favourite editor on your local computer. For now, we'll go through those lines +Into this file, we'll add some lines at some key spots. For now, we'll go through those lines of code individually, and then show you the completed file at the end to see where they went. First we need to include the header file for the ROOT [TH1F](https://root.cern.ch/doc/master/classTH1F.html) class. @@ -193,7 +193,7 @@ clean: rm -f ./*~ ./*.o ./fill_histogram ``` -And then compile and run it, remember to do it in the container! +And then compile and run it: ```bash make fill_histogram @@ -203,14 +203,6 @@ make fill_histogram The output on the screen should not look different. However, if you list the contents of the directory, you'll see a new file, `output.root`! 
-If you are using a container with VNC, now it is time to start the graphics window with - -```bash -start_vnc -``` - -and connect to it with the default password `cms.cern`. - To inspect this new ROOT file, we'll launch CINT for the first time and create a [`TBrowser` object](https://root.cern.ch/doc/master/classTBrowser.html). @@ -234,7 +226,8 @@ to inspect ROOT files. Inside this CINT environment, type the following root [1] TBrowser b; ``` -You should see the `TBrowser` pop up! +You should see the `TBrowser` pop up! If no browser pops up, check out the [uscms.org page about the LPC](https://uscms.org/uscms_at_work/computing/getstarted/uaf.shtml) +to troubleshoot your X11 connection, or ask for help on Mattermost. :::::::::::::::::::: callout @@ -269,7 +262,7 @@ Open the `tree.root` file with ROOT: root -l tree.root ``` -Now, dump the content of the `t1` tree with the method `Print`. Note that, by opening the file, the ROOT tree in there is automatically loaded. +You can dump the content of the `t1` tree with the method `Print`. Note that, by opening the file, the ROOT tree in there is automatically loaded. ```cpp root [0] @@ -277,8 +270,7 @@ Attaching file tree.root as _file0... root [1] t1->Print() ``` -Please copy the output this statement generates and paste it into the corresponding section in our [assignment form](https://docs.google.com/forms/d/e/1FAIpQLSdxsc-aIWqUyFA0qTsnbfQrA6wROtAxC5Id4sxH08STTl8e5w/viewform); remember you must sign in and click on the submit button in order to save your work. You can go back to edit the form at any time. -Then, quit ROOT. +When you're done, quit ROOT. :::::::::::::::::::::::::::::::: @@ -418,8 +410,6 @@ You'll be popped into the CINT environment and you should see the following plot :::::::::::: -Exit from the container. If you are using a container with VNC, first stop VNC with `stop_vnc`. 
- ::::::::::::::::: keypoints - You can quickly inspect your data using just ROOT diff --git a/episodes/05-using-root-with-python.md b/episodes/05-using-root-with-python.md index ab36eab..0b7ad53 100644 --- a/episodes/05-using-root-with-python.md +++ b/episodes/05-using-root-with-python.md @@ -40,53 +40,10 @@ data processing in python, and the scikit-HEP tools are very important for that You can check out a tutorial for many of their tools [here](https://hsf-training.github.io/hsf-training-scikit-hep-webpage/). -## Using the Python docker container - -The tools in the Python docker container will allow you to can easily open -and analyze ROOT files. This is useful for when you make use of the CMS open data tools to skim -some subset of the open data and then copy it to your local laptop, desktop, or perhaps an -HPC cluster at your home institution. - -If you completed the [Docker pre-exercises](https://cms-opendata-workshop.github.io/workshopwhepp-lesson-docker/) -you should already have worked through -[this episode](https://cms-opendata-workshop.github.io/workshopwhepp-lesson-docker/03-docker-for-cms-opendata/index.html), under **Download the docker images for ROOT and python tools and start container**, and you will have - -- a working directory `cms_open_data_python` on your local computer -- a docker container with name `my_python` created with the working directory `cms_open_data_python` mounted into the `/code` directory of the container. - -Start your python container with - -```bash -docker start -i my_python -``` - -In the container, you will be in the `/code` directory and it shares the files with your local `cms_open_data_python` directory. - -:::::::::::: callout - -## If you're using apptainer: - -Whenever you see a `docker start` instruction, replace it with `apptainer shell` to open either the ROOT or Python container image. 
-The specific commands in this pre-exercise and during the live workshop will be given for docker, since that is the most common application. -As a general rule, editing of files will be done in the standard terminal (the containers do not have all text editors!) or via the jupyter-lab interface, and then commands will be executed inside the container shell. If you see `Singularity>` on your command line, you are ready to run a ROOT or python script. - -::::::::::::: - -If you want to test out the installation, from within Docker you can launch and -interactive python session by typing `python` (in Docker) and then trying - -```python -import uproot -import awkward as ak -``` - -If you don't get any errors then congratulations! You have a working environment and you are ready to -perform some HEP analysis with your new python environment! :::::::::::::: keypoints - PyROOT is a complete interface to the ROOT libraries - Scikit-HEP provides tools to interface between ROOT and global scientific python tools -- We will use `uproot`, `awkward`, and `vector` in our NanoAOD analysis :::::::::::::: diff --git a/episodes/06-uproot.md b/episodes/06-uproot.md index f0ca2d5..51dc9db 100644 --- a/episodes/06-uproot.md +++ b/episodes/06-uproot.md @@ -32,56 +32,35 @@ from Jim Pivarski. ## How to type these commands? -Now that you've installed the necessary python modules in your `my_python` container you can choose to write and execute the code however -you like. There are a number of options, but we will point out two here. - -* [Jupyter notebook](https://jupyter.org/). This provides an editor and an evironment in which to run -your python code. Often you will run the code one *cell* at a time, but you could always put all your -code in one cell if you prefer. There are many, many tutorials out there on using Jupyter notebooks -and if you chose to use Jupyter as your editing/executing environment that you have developed some -familiarity with it. 
- - * In the `my_python` container, you can start `jupyter-lab` with - ```bash - jupyter-lab --ip=0.0.0.0 --no-browser - ``` - and open the link given in the message on your browser. Choose the icon under "Notebook". - -* Python scripts. In this approach, you edit the equivalent of a text file and then pass that text -file into a python interpreter. For example, if you edited a file called `hello_world.py` such that -it contained +There are many options for interacting with python scripts for CMS data analysis, including interactive tools like jupyter notebooks. +In this exercise, we will stick to editing python scripts. For example, if you edited a file called `hello_world.py` such that +it contained: ```python print("Hello world!") ``` -You could save the file and then (perhaps in another Terminal window), execute +You could save the file and then execute: ```bash python hello_world.py ``` -This would interpret your text file as python commands and produce the output +This would interpret your text file as python commands and produce the output: ```output Hello world! ``` -We leave it to you to decide which approach you prefer. +If you would prefer to use a jupyter notebook for these exercises, go to [CERN's SWAN facility](https://swan.web.cern.ch/swan/) and try the new interactive jupyter-lab interface (you can leave the other options to their defaults). From the options page, select a Python3 notebook. ## Open a file Let's open a ROOT file! -If you're writing a python script, let's call it `open_root_file.py` and if you're using -a Jupyter notebook, let's call it `open_root_file.ipynb`. If you are working in the container, you will open and *edit* the python script on your local computer and *run* it in the container, or you will open a notebook on your jupyter-lab window in the browser. +Call your script `open_root_file.py` and open it in your preferred text editor. 
On this webpage we will show small snippets of Python that you can add to your script one after the other, and run to see new output. First we will import the `uproot` library, as well as some other standard -libraries. These can be the first lines of your python script or the first cell of your Jupyter notebook. - -*If this is a script, you may want to run `python open_root_file.py` every few lines or so to see the output. -If this is a Jupyter notebook, you will want to put each snippet of code in its own cell and execute -them as you go to see the output.* - +libraries. These can be the first lines of your python script: ```python import numpy as np @@ -115,8 +94,7 @@ on the CERN Open Data Portal. If you scroll down to the bottom of the page and c the **Download** button. For the remainder of this tutorial you will want the file to be in the same directory/folder -as your python code, whether you are using a Jupyter notebook or a simple python script. So make -sure you move this file to that location after you have downloaded it. +as your python code. So make sure you move this file to that location after you have downloaded it. To read in the file, you'll change one line to define the input file to be ```python @@ -132,7 +110,7 @@ So you've opened the file with `uproot`. What is this `infile` object? Let's add print(type(infile)) ``` -and we get +and upon running the script we get ```output diff --git a/episodes/07-awkward.md b/episodes/07-awkward.md index 8645338..fb43eff 100644 --- a/episodes/07-awkward.md +++ b/episodes/07-awkward.md @@ -37,7 +37,7 @@ of your file. ## Environment -Use the Python environment that you set up in the previous two lesson pages, such as your `my_python` docker container. +Use the Python environment that you set up in the previous two lesson pages. We leave it up to you whether or not you write and execute this code in a script or as a Jupyter notebook. ## Numpy arrays: a review @@ -152,7 +152,7 @@ or by downloading it. 
## *Stop!* If you haven't already, make sure you have run through the -[previous lesson](https://cms-opendata-workshop.github.io/workshop2024-lesson-cpp-root-python/06-uproot/index.html) on working with uproot. +[previous lesson](06-uproot.md) on working with uproot. :::::::::::::::::::::::: Let's open this ROOT file! If you're writing a python script, let's call it `open_root_file_and_analyze_data.py` and if you're using diff --git a/episodes/introduction.md b/episodes/introduction.md index 6a4dbbe..339cf91 100644 --- a/episodes/introduction.md +++ b/episodes/introduction.md @@ -54,7 +54,7 @@ tutorials, for those who want to go further. ::::::::::::::::::::::::::::: callout ## You still have choices! -Just to emphasize, you really only *need* to use ROOT and C++ at the early stages of analyzing CMS Open Data in the AOD (Run 1) or MiniAOD (Run 2) formats. These datasets require using CMS-provided tools that perform much better in C++ than python. However, downstream in your analysis or to analyze Run 2 NanoAOD files, you are welcome to use whatever tools and file formats you choose. +Just to emphasize, you really only *need* to use ROOT and C++ at the early stages of analyzing CMS Open Data in the AOD (Run 1) or MiniAOD (Run 2 or Run 3) formats. These datasets require using CMS-provided tools that perform much better in C++ than python. However, downstream in your analysis or to analyze NanoAOD files, you are welcome to use whatever tools and file formats you choose. ::::::::::::::::::::::::::::: ::::::::::::::::::::::::::::::::::::: keypoints diff --git a/learners/setup.md b/learners/setup.md index 7f282ac..e8569bd 100644 --- a/learners/setup.md +++ b/learners/setup.md @@ -2,5 +2,22 @@ title: Setup --- -This lesson requires a computer with an internet connection and a bash shell (either native Linux, MacOs or Windows WSL2 Linux). 
You should have Docker installed and the [Docker pre-exercises](https://cms-opendata-workshop.github.io/workshop2023-lesson-docker/) finished so that you can access the containers `my_root` and `my_python` created as instructed there. +This lesson requires several software packages common in high-energy physics. This exercise is intended to be used for Fermilab's CMS Data Analysis School 2025. If you have completed the [pre-exercises](https://fnallpc.github.io/cms-das-pre-exercises/), go to the following CMSSW area and set the CMS environment to access these packages. +Connect to the LPC cluster: + +```bash +kinit @FNAL.GOV # if needed +ssh -XY @cmslpc-el8.fnal.gov +``` + +Access your CMSSW environment: + +```bash +cd ~/nobackup/cmsdas/CMSSW_13_0_10 +cmsenv +mkdir root-exercise +cd root-exercise/ +``` + +Now you are ready to do the steps of this exercise in your terminal!
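
Note for anyone testing this patch: the change removes the `import uproot` / `import awkward as ak` sanity check from episode 05, so it may help to confirm that the `cmsenv` environment set up in `learners/setup.md` actually provides the Python packages the later episodes import. Below is a minimal sketch of such a check. The helper name `find_missing` and the package list are illustrative assumptions, not part of the lesson, and whether `CMSSW_13_0_10` ships all three packages is not stated in the patch.

```python
import importlib.util

# Assumed package list: numpy, uproot and awkward are the modules named in
# episodes 06 and 07. Adjust it to match what your own scripts import.
REQUIRED = ["numpy", "uproot", "awkward"]


def find_missing(names):
    """Return the top-level module names in `names` that cannot be located."""
    # find_spec() returns None when a top-level module is not installed,
    # without importing the module itself.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

Saved under a hypothetical name such as `check_env.py` inside `root-exercise/`, this would be run with `python check_env.py` after `cmsenv`. It only checks that each package can be located, not that its version matches what the episodes expect.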