3. Running RECAP

Sept 15, 2023

20 min read

Set-up for Running RECAP

To run RECAP, follow the general workflow below. Note that both EC2 set-up and cluster set-up require coordination with other E3-ers, so please reach out and plan accordingly.

EC2 Set-up (1-2 day lead time)

Pull the recap repo into your project folder + set up the kit environment

  • Follow kit set-up instructions here.

  • Use the following branch for RECAP: recap_main

    • You can check out this branch using your IDE (PyCharm, VSCode), GitHub Desktop, or the command git checkout [BRANCH_NAME].
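For example, from a terminal inside the kit repo (a minimal sketch; the remote is assumed to be named origin):

git fetch origin
git checkout recap_main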

Cluster Set-Up (1-3 day lead time)

These cluster set-up steps assume you already have a working case on an EC2 instance; they walk you through getting your project onto the cluster.

  • Get permissions for AWS, FrontEgg, and Datadog; the next sections will point you to the right places to request each.


AWS

  • If you do not have AWS credentials (if you do, an AWS tile appears on your Okta home page), put in a ticket with WilldanIT to add your Willdan email address to the AD group “E3 Developer”. Do not mention AWS in the request.

  • Once you have AWS permissions, you will need to configure the aws-cli. These instructions are provided in the general cluster guide here.
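As a quick sketch (the cluster guide has E3's actual settings; the profile name below is a placeholder):

aws --version                              # confirm the CLI is installed
aws configure --profile e3                 # placeholder profile name; prompts for keys and region
aws sts get-caller-identity --profile e3   # sanity-check that the credentials work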

FrontEgg

  • Ask Pete Ngai (IT manager) for instructions

Datadog

  • Ask Pete Ngai (IT manager) for instructions

Setting up your project on JupyterHub

By the end of this section, you should have:

  1. Created a project-specific Workspace

  2. Linked your GitHub account to your Workspace through an SSH key

1. Create your Workspace (for those with GitHub write access)

  1. Go to the following link: https://github.com/e3-/enkap-saas-controller

  2. Pull the main branch

  3. Go to the folder /clusters/x-ray/platforms/e3labs within that branch

  4. Copy an existing project .yaml file and create a new .yaml file for your project

  5. Open the .yaml and change the following fields (see the sketch after these steps):

    • metadata/name:

    • e3/gurobi/license:

    • workflow/model:

    • workflow/projectId:


  6. Save the file

  7. Commit the changes and push back to main
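For reference, a hypothetical sketch of the edited fields (all values below are placeholders; mirror the real structure of the existing project file you copied):

# Hypothetical values only; copy the actual structure from an existing project .yaml
metadata:
  name: my-project                # placeholder: your Workspace name
e3:
  gurobi:
    license: my-project-license   # placeholder: Gurobi license reference
workflow:
  model: recap                    # placeholder: model to run
  projectId: my-project-id        # placeholder: project identifier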

2. Link your GitHub Account through an SSH Key

  1. Once you have created a workspace, open a browser and enter this link: https://jupyterhub.e3labs.x-ray.ethree.cloud/

  2. Start the server associated with your workspace / project

  3. Once you start your server, you will see the JupyterHub home page:

jupyterhub_home.png

  4. Open Terminal from your server / workspace

  5. Copy and paste the following code block and press Enter:

ssh-keygen -t ed25519
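# creates an Ed25519 key pair; the public key is saved to ~/.ssh/id_ed25519.pub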
  6. It will prompt you for a file in which to save the key; press Enter instead

  7. It will ask for a passphrase; press Enter instead

  8. It will prompt for the passphrase once more; press Enter again

  9. A total of three Enters will get you to the following result. This is normal

  10. This will create a window that looks similar to the screenshot below:

ssh_key_term.png

  11. Now copy and paste the following code block:

cat .ssh/id_ed25519.pub
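# prints the public key; copy the full line beginning with ssh-ed25519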
    • This will print your public SSH key; see the screenshot below for an example. We will need it later in Step 16.

  12. Open a new tab in a browser and go to https://github.com/e3-

  13. Sign in through the organization page

  14. Once signed in, go to the top right and click the profile icon, then the tab called “Your organizations”

github.png

  15. Once there, click “SSH and GPG keys” in the left bar, then click “New SSH key”

ssh.png

  16. This will prompt you to add the SSH key. Enter the information and add the key; this will trigger a 2FA prompt.

    • Title = [WD username]@jupyter-[WD username]

    • Key type = Authentication Key

    • Key = [SSH key generated in Step 11]

ssh_complete.png

  17. Once you get to this page with a new SSH key entry, click the “Configure SSO” dropdown and click “Authorize” next to the e3- organization

  18. Go through the authorization steps until you return to this page. This completes linking GitHub to your Workspace

ssh_complete_2.png
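To confirm the link works, you can optionally test the connection from the JupyterHub terminal (a standard GitHub SSH check):

ssh -T git@github.com   # should respond with a greeting that includes your GitHub username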

Running RECAP Locally / EC2

EC2 Background Info

  • AWS EC2 instances are remote desktops that enable E3 to (1) Size instances according to project needs and (2) Utilize a shared (Z:/) drive.

  • For a new project, you will have to request to be assigned an EC2 instance; reach out to Pete to request one.

  • EC2 instance assignments are tracked here.

  • An EC2 instance is used to perform initial model set-up and to run preliminary cases to confirm that the system is behaving as expected.

The instructions for running a case are spelled out in detail in the Toy Model guide. The same instructions apply whether the case is run locally or on an EC2 instance; those specific to running the model are re-included below for context.

Running RECAP from the UI

Choose the case and execute the model: click the cell that contains the correct case and press the “Run Recap Cases” button to run the selected case. This will open a separate Command Prompt window showing the progress of the model.

Can I execute the model from the command line?

Recap 3.0 can also be run from the command line! This can be useful if the UI macros are giving you trouble.

To run Recap from the command line, open a Command Prompt (or another terminal app, like PowerShell) in the “./kit/Recap/” folder and activate the environment using conda activate [environment-name] (per the setup example above: conda activate nve-Recap-kit). Then run the case with python run_model.py.
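For example (folder and environment name follow the setup above; substitute your own):

cd kit/Recap                   # the Recap folder inside the kit repo
conda activate nve-Recap-kit   # or your own [environment-name]
python run_model.py            # runs the case selected in cases_to_run.csv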

To specify which case you’re running, update cases_to_run.csv within the “./data/settings/Recap/” folder. Users can currently run only a single case at a time; the developer team is working on speeding up the model so that a batch of cases can run in parallel.

Running RECAP on the Cluster

  • The cluster provides the most scalable way to run RECAP (and RESOLVE).

  • Users should first confirm the model behaves as expected on an EC2 instance before migrating runs onto the cluster.

  • The following sections will guide you to (1) run cases on the cluster and (2) sync results to your folder

    • If you have not yet set up on the cluster, see the set-up instructions above; it takes about a day to get fully set up.

Running on the cluster only works if you have an active branch in kit with all your cases committed to that specific branch.
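A minimal sketch of getting your cases onto a branch (the branch name and staged path below are hypothetical; adjust them to your project):

git checkout -b my-project-cases   # hypothetical branch name
git add data/settings/Recap/       # stage your case settings; adjust the path as needed
git commit -m "Add project cases"
git push -u origin my-project-cases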

Running Cases

  1. Open a browser tab and enter the following link: https://jupyterhub.e3labs.x-ray.ethree.cloud/hub/home

server_start.png

  2. Choose the Workspace for your project and click Start. Please see the set-up instructions above if your project is not present.

Jupyter_homepage.png

  3. If GitHub is correctly linked, your project folder should already be loaded. If not, you must log into GitHub through JupyterHub and generate an SSH key; see “2. Link your GitHub Account through an SSH Key” above.

  4. To ensure that your project is synced, open Terminal (3rd row down, 1st icon on the left) and type the following:

dvc pull
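# pulls the project data files tracked by DVC from remote storage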
  5. Once done, return to the home page, navigate to the notebooks folder, and open cloudrunner.ipynb

  6. Once that’s open, run all the code blocks by selecting each block and hitting Ctrl+Enter

  7. You should see 2 main windows:

    • Cloudrunner >> A window with a Case List of all Cases you’ve generated

    • Cloudwatcher >> A window of case statuses

cloudrunner_window.png

Cloudwatcher_window.png

  8. Cloudrunner allows you to drag and select the cases you want to run in parallel. Choose your cases and move them to the right window using the arrows, then submit them by clicking the Submit button

    • It’s important to configure the right CPU, memory size, and ephemeral storage. Ask Adrian, Ruoshui, or Karl for more information about these.

  9. Once you submit cases, hit the Reload button in Cloudwatcher.

  10. Each case you’ve submitted should show up in Cloudwatcher. At this point, any project member can monitor the cases by running the Cloudwatcher block.

  11. Using Cloudwatcher, you can monitor cases that are running, done, or failed. Use the D and L links to check a case’s memory usage and logs, and the A link if you’re interested in the Argo workflow.

    • Note: it’s good to know what these links do, but if the case already ran successfully on an EC2 instance, you rarely need to check them unless you’re trying to gauge memory usage.

Syncing Results