diff --git a/.nojekyll b/.nojekyll new file mode 100644 index 00000000..e69de29b diff --git a/404.html b/404.html new file mode 100644 index 00000000..28b82e35 --- /dev/null +++ b/404.html @@ -0,0 +1,12 @@ +
The National Health Data Science Sandbox project kicked off in 2021 with 5 years of funding via the Data Science Research Infrastructure initiative from the Novo Nordisk Foundation. Health data science experts at five Danish universities are contributing to the Sandbox with coordination from the Center for Health Data Science under lead PI Professor Anders Krogh. Data scientists hosted in the research groups of each PI are building infrastructure and training modules on Computerome and UCloud, the primary academic high performance computing (HPC) platforms in Denmark.
Our computational 'sandbox' allows data scientists to explore datasets, tools and analysis pipelines in the same high performance computing environments where real research projects are conducted. Rather than a single, hefty environment, we're deploying modularized topical environments tailored for independent use on each HPC platform. We aim to support three key user groups based at Danish universities:
Activity developing independent training modules and hosting workshops has centered on UCloud, while collaborative construction of a flexible Course Platform has been completed on Computerome for use by the Sandbox and independent educators. Publicly sourced datasets are being used in training modules on UCloud, while generation of synthetic data is an ongoing project at Computerome. Sandbox resources are under active construction, so check out our other pages for the current status on HPC Access, Datasets, and Modules. We run workshops using completed training modules on a regular basis and provide active support for Sandbox-hosted courses through a Slack workspace. See our Contact page for more information.
We thank the Novo Nordisk Foundation for funding support. If you use the Sandbox for research or reference it in text or presentations, please acknowledge the Health Data Science Sandbox project and its funder the Novo Nordisk Foundation (grant number NNF20OC0063268).
We do not currently support independent use of Sandbox materials on Computerome. Access is supported via courses collaborating with the Sandbox and run on Computerome's Course Platform. Check here for more info.
The below instructions are provided as reference for course participants.
To set up a user account on Computerome, you will need to provide administrators with your name, email address, and phone number for two-factor authentication. Once approved as a user, you will receive your username and server address (URL) by email, and you will receive an initial course-platform password by text.
On Computerome, the Sandbox environment is deployed as a virtual machine with a Linux desktop as user interface. This environment can be accessed through VMware Horizon using two different methods: (A) a desktop client (which you install on your computer) or (B) a web-based client (for those without install privileges on their computer). Please follow the appropriate instructions (A versus B) depending on your access preference.
Sign-In Instructions
(B) access the environment via browser (right: 'VMware Horizon HTML Access'). If this is your chosen method of access, you will always use the server address in your browser to reach this entry point.
Select the cloud icon
(B) which is linked to the Sandbox course. This option appears after you have selected VMware Horizon HTML Access.
Enter your username and your course-platform password.
On the first sign-in, this will be the course-platform password texted to you. You will then be prompted to create your own permanent password, which replaces the texted password and is used for all future sign-ins.
When prompted, enter the one-time password texted to you from DTU (NOT the same password as the course-platform password).
(A) If it is your first login / you logged off at last access, press any key when greeted with the blue time status screen. This will allow you to select your own user account in a dialog box.
Choose the correct language for the environment in the upper right corner of the screen (this is important for the keyboard layout when typing your password), then sign in again using your course-platform password. Danish (the da option) is the default, so those with English keyboards will need to switch to English (the en option) at every login.
Congratulations, you have entered the Sandbox environment. Relevant links for courses should be present on your desktop.
Exit Instructions
To exit the environment, you have two options with different outcomes: you can log off, which kills all running processes, or you can disconnect, which leaves your processes running. "Power off" is disabled for users because it would shut down your virtual machine: local settings and user files may be lost, and the virtual machine would need to be manually restarted for your account.
User accounts on UCloud are enabled by university login credentials using WAYF (Where Are You From). Access the WAYF login portal here, and then find your affiliated Danish university using the search bar. After login, we suggest setting up Two Factor Authentication by clicking on the icon in the top-right corner of the screen. Once you are an approved user of UCloud, you can access the Sandbox environment via different 'Sandbox' apps linked to topical modules that you deploy using your own storage and computing resources - just go to Apps once you have signed into UCloud and search 'Sandbox' to find what we have deployed. Each app page has its own Documentation link that will direct you to Sandbox-based usage guidelines which may be customized to the app's particular tools and scope. Apps will have different 'courses' that you can initially choose which make a personal copy of training materials in your workspace for you to edit.
Each Danish university has its own usage relationship with UCloud as governed by their local front office of DeiC - check with your university IT support / DeiC representatives about requesting computational resources. For example, the University of Copenhagen has previously allotted an initial chunk of free UCloud compute hours to staff (from PhD students to professors as well as non-academic staff). If you have further questions about getting compute resources, please contact Sandbox staff.
Extensive documentation on the general use of UCloud (how to use apps and run jobs, etc.) is available in the UCloud user guide.
Log onto UCloud at the address http://cloud.sdu.dk using university credentials.
When you are logged in, choose the project from the dashboard (highlighted in red) from which you would like to utilize compute resources. Every user has their personal workspace (My workspace). You can also provision your own project (check with your local DeiC office if you're new to UCloud) or you can be invited to someone else's project. If you've previously selected a project, it will be launched by default. If it's your first time, you'll be in your workspace. If you've joined one of our courses or workshops, your instructor will let you know which to choose.
For this example, we select Sandbox_workshop.
On the left side, you can see the structure of the project (content changes when you select a different project):
[!IMPORTANT] Don't forget to accept the invitation to access new projects. Remember to switch projects to access other files and resources. Test switching among projects and observe how the dashboard changes.
In the bottom left corner, you will find your user ID, which you may need to provide once the course starts or for future collaborations, such as being added to other people's projects. You can also find the UCloud docs here.
In the dashboard, you will also find news, your favorite apps, recent runs, resources, and other notifications, including:
- Resource allocations: indicate your currently allocated resources (e.g., KU employees have access to 1000kr in computing).
- Grant applications: apply for more resources (computing or storage, if you run out of them).
Then click on Apps in the left panel to investigate what tools and environments you can use (green circle). The easiest way to find Sandbox resources is to search via the toolbar (red circle). In this example, we'll select the Genomics Sandbox (which will bring you to the submission screen).
[!TIP] Mark them as favorites so they appear on your dashboard.
Click on the app button to get into the settings window. First, we recommend reading the documentation of the app (highlighted in green). Then, you can configure the app as shown below, or be provided with a configuration file made available in a workshop's project folders (import parameters) which will take care of everything for you.
In this example, we configure our session by:
[!IMPORTANT] The first 3 steps set up our computing resources for the period we want to work and can be customized as needed. However, only step 2 can be modified after submitting the job. For some of the Sandbox apps, you might want to select folders (Home and the Notebooks/Data from the module) to avoid downloading them every time you start a new job. If you are in doubt, read the documentation specific to the app you are interested in. Select the version of the app (if in doubt, use the latest one). This allows you to run specific versions of software.
There are different types of apps, and therefore, interfaces. Some, like RStudio or Jupyter Notebooks, have their own graphical user interface, whereas others are command-line interfaces. Lastly, you can also deploy a virtual desktop and virtual machine, which allow you to spin up a virtual computer.
Wait to go through the queue. When the session starts, the timer begins to count down. In a couple of minutes, you should be able to open the interface through the button (green circle) in a new window (refresh the window if needed).
This page will remain open while you work (or you can return to it via 'Runs' in the left panel). You can end your session early by pressing and holding 'Stop application' (pink circle), see how much time you have left (red circle), and add hours to your session as you go (buttons in blue square).
If you are testing the genomic app, your interface should look like the image below. Different apps might use other development environments. In this case, you will be working from JupyterLab. You can open Jupyter Notebooks (yellow square), RStudio (blue square) or a terminal (black square), among others. In this case, #1 and #2 have all the software and packages that you will need pre-installed (this is not the case with Python 3 to the left).
You can navigate through the different folders and start running the Python notebooks (pink arrow).
If you are an advanced user, you can also create your own Python files and select the kernel NGS (python) to use the pre-installed software. Learn how to manage (upload and download new data) and share files that you have created/developed with collaborators here.
[!TIP] Create your own directories to save the output of your jobs. You will be able to access them later in your project folders under the resources you are using. If you haven't created any directories, look for the generated files under a folder with the same name as the job name you used.
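The tip above can be followed from a notebook or a terminal session. The snippet below is a minimal sketch of one way to create a dedicated output directory before writing results. The folder names `results` and `qc_run_01` are hypothetical placeholders, not names used by the Sandbox apps, and the base directory is assumed to be wherever you are currently working.

```python
from pathlib import Path


def make_output_dir(base: Path, run_name: str) -> Path:
    """Create base/results/run_name if it does not exist and return it."""
    # "results" is a hypothetical parent folder name chosen for this sketch.
    out_dir = base / "results" / run_name
    # parents=True creates missing parent folders; exist_ok=True makes reruns safe.
    out_dir.mkdir(parents=True, exist_ok=True)
    return out_dir


if __name__ == "__main__":
    # Hypothetical run name; pick one that matches your own job.
    out = make_output_dir(Path.cwd(), "qc_run_01")
    (out / "summary.txt").write_text("example output\n")
    print(out)
```

Keeping one folder per run makes it easier to find the files later in your project folders.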
You are ready to start using UCloud and the Sandbox tools!
Follow these instructions to start working on the tutorial.
You can also use these instructions to open JupyterLab for the analysis of your own data and for the integration analysis.
Log onto UCloud at the address http://cloud.sdu.dk using your university credentials.
When you are logged in, be sure to choose the project for the NNF course (red circle). Then click on the Apps button (green circle).
Find the app JupyterLab (red circle), which is under the title Featured.
Click on the app button. You will get into the settings window. Load the application settings following the illustrations below.
Submit (red circle).
To work on the tutorial, you need to go to a personal folder, which also contains the dataset you will filter after the tutorial. Each student has their own folder, as in the table below:
Name | Sample folder |
---|---|
Andersen, Albert Lund | Gifu_ctr_1 |
Bagger, Andreas | Gifu_ctr_2 |
Hansen, Mads Würgler | Gifu_ctr_3 |
Milo, Lasse | Gifu_ctr_4 |
Reimick, Sebastian Haunstrup | Gifu_R7A_1 |
Hemmingsen, Jonas Klejs | Gifu_R7A_2 |
Skovmøller, Emma Hvitfeldt | Gifu_R7A_3 |
Sørensen, Emma Frasez | Gifu_R7A_4 |
Agersnap, Simon Nørregaard | HAR1_ctr_1 |
Schmidt, Alina | HAR1_ctr_2 |
Henriksen, Frederik Oskar | HAR1_ctr_3 |
Lundby, Josephine Marie | HAR1_ctr_4 |
Nørholm, Anne | HAR1_R7A_1 |
Odgaard, Louise Nyrup | HAR1_R7A_2 |
Sørensen, Sara Sejer | HAR1_R7A_3 |
Lønskov, Jonas | HAR1_R7A_4 |
Niklassen, Jacob Hansen | Gifu_ctr_1_bis |
Øllgaard, Ann Mai Brøndum Holm | Gifu_ctr_2_bis |
Overgaard, Morten Øgelund | Gifu_R7A_1_bis |
Sørensen, Elisabeth Asta | Gifu_R7A_2_bis |
Rey, Isabel | HAR1_ctr_1_bis |
When you open JupyterLab, use the file browser on your left to go to the folder 426401/Students_analysis/Folder_name, replacing Folder_name with the name of your personal folder (from the table).
Here, you have the notebook tutorial.ipynb. Open that to start working on the tutorial.
R scrna (red circle). If not, click there and choose R scrna from the menu that appears.
When you are finished with the tutorial, you are ready to go on to use the tutorial code for the filtering session.
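The folder layout described above (a base folder, then your sample folder from the table, then tutorial.ipynb) can be written out as a path. The snippet below is a minimal sketch of that construction. The function name `tutorial_notebook` is a hypothetical helper, and the code assumes the layout is exactly as stated in the instructions.

```python
from pathlib import PurePosixPath

# Base folder named in the instructions above.
BASE = PurePosixPath("426401/Students_analysis")


def tutorial_notebook(sample_folder: str) -> PurePosixPath:
    """Return the path to tutorial.ipynb inside one student's sample folder."""
    # Expect a single folder name from the table, e.g. "Gifu_ctr_1".
    if not sample_folder or "/" in sample_folder:
        raise ValueError("expected a single folder name from the table")
    return BASE / sample_folder / "tutorial.ipynb"


print(tutorial_notebook("Gifu_ctr_1"))
# prints 426401/Students_analysis/Gifu_ctr_1/tutorial.ipynb
```

Replace `"Gifu_ctr_1"` with the sample folder listed next to your name.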
The Sandbox is collaborating with the two major academic high performance computing platforms in Denmark. Computerome is located at the Technical University of Denmark (and co-owned by the University of Copenhagen) while UCloud is owned by the University of Southern Denmark. These HPC platforms each have their own strengths which we leverage in the Sandbox in different ways.
Computerome is the home of many sensitive health datasets via collaborations between DTU, KU, Rigshospitalet, and other major health sector players in the Capital Region of Denmark. Computerome has recently launched their secure cloud platform, DELPHI, and in collaboration with the Sandbox has built a Course Platform on the same backbone such that courses and training can be conducted in the same environment as real research would be performed in the secure cloud. The Sandbox is supporting courses in the Course Platform, but it is also available for independent use by educators at Danish universities. Please see their website for more information on independent use and pricing, and contact us if you'd like to collaborate on hosting a course on Computerome. We can help with tool installation, environment testing, and user support (ranging from using the environment to course content if we have Sandbox staff with matching expertise).
Participants in courses co-hosted by the Sandbox can check here for access instructions.
UCloud is a relatively new HPC platform that can be accessed by students at Danish universities (via a WAYF university login). It has a user-friendly graphical user interface that supports straightforward project, user, and resource management. UCloud provides access to many tools via selectable Apps matched with a range of flexible compute resources, and the Sandbox is deploying training modules in this form so that any UCloud user can easily access Sandbox materials independently. The Sandbox is also hosting workshops and training events on UCloud in conjunction with in-person training.
Check out UCloud's extensive user docs here and instructions for how to access Sandbox apps here.