amyheather committed Oct 3, 2024
2 parents 72e7985 + 4cf312e commit 793bc83
Showing 14 changed files with 180 additions and 45 deletions.
29 changes: 1 addition & 28 deletions logbook/posts/2024_10_01/index.qmd
@@ -50,34 +50,7 @@ Looking at Figure 6 in the article, the number of line managers was always fairl

I decided to go back over the paper, identifying all the model parameters mentioned, and make sure I could find these in the code and confirm they matched the paper.

| Parameter | Code | Location in paper | Location in code |
| --- | --- | --- | --- |
| At screening station, 1% go to med eval and 99% to dispensing. At med eval station, 99% go to dispensing and 1% exit POD | <!----> | 4.1.1 Splits | <!----> |
| Number of forms per designee: 1 31.8%, 2 26.7%, 3 16.8%, 4 12.6%, 5 6.8%, 6 5.6% | <!----> | Table 1 | <!----> |
| Service time **line manager**: triangular, minimum = 0.029, maximum = 0.039, mode = 0.044 |`time = random.triangular(low=5/60.0, high=92/60.0, mode=23/60.0)`. Low 5/60 = 0.0833. High 92/60 = 1.533. Mode 23/60 = 0.3833. ✅? There is one **commented out** which has... Low 1.77/60 = 0.0295. Higher 2.66/60 = 0.044. Mode 2.38/60 = 0.0397. This would match the article, except maximum and mode the other way round. | Table 2 | `visit_greeter()` in `Customer.py` |
| Service time **screening**: weibull, shape = 2.29, scale = 0.142 |`time = random.weibullvariate(alpha=0.142, beta=2.29)` | Table 2 | `visit_screener()` in `Customer.py` |
| Service time **dispensing**: weibull, shape = 1, scale = 0.311 |`time = random.weibullvariate(alpha=0.311, beta=1)` | Table 2 | `visit_dispenser()` in `Customer.py` |
| Service time **medical evaluation**: lognormal, logarithmic mean = 1.024, logarithmic stdev = 0.788 |`time = random.lognormvariate(mu=1.024, sigma=0.788) ` | Table 2 | `visit_medic()` in `Customer.py` |
| Arrival rate 100 designees per minute per POD (following a Poisson distribution) | <!----> | 4.1.4 Arrival rate | <!----> |
| Three simulation runs | <!----> | 4.3 Processing | <!----> |
| Number of staff members per station - mentions examples of where "*each station could have up to thirty staff members*" or "*for example 60*". We know it cannot be 30, but could reasonably assume to be 60 | <!----> | 3 Problem and 4.3 Processing | <!----> |
| Forms per hour from code: forms is throughput x 3.2. **TODO: understand why this is... could check i am right by running simulation from table 4 with fixed parameters and check n forms processed matches...** | <!----> | `plotting_staff_results.R` | <!----> |
| Default crossover rate 1.0 and n=1 | <!----> | 5 Experimental results | <!----> |
| Experiment 1: tri-objective model, population 100, generations 50, pre-screened scenarios 10%, 20%, 30%... 90% | <!----> | 4.1.1 Splits and Figure 5 | <!----> |
| Experiment 2: bi-objective model, population 50, generations 25, pre-screened scenarios 10%, 20%, 30%... 90% | <!----> | 4.1.1 Splits and Figure 7 | <!----> |
| Experiment 3: tri-objective model, pre-screened percentage ??, (a) 100 pop 50 gen (b) 200 pop 100 gen (c) 50 pop 25 gen | <!----> | 5.3 Experiment 3 and Figure 8 | <!----> |
| Experiment 4: tri-objective model, maximum line managers 1, 2 or 3 | <!----> | 5.4 Experiment 4 and Figure 9 | <!----> |
| Experiment 5: 6 dispensing, 6 screening, 4 line manager, one medical evaluator, number of replications 1-7 | <!----> | 5.5 Experiment 5 and Figure 10 | <!----> |

::: {.callout-tip}
## Reflection

Would've been handy if all parameters could have been mentioned in one place.
:::

Reflections:

* Big difference in the line manager service times - this might explain it!
I returned to this the following day, so please refer to subsequent logbook for results.

## Timings

118 changes: 118 additions & 0 deletions logbook/posts/2024_10_02/index.qmd
@@ -0,0 +1,118 @@
---
title: "Day 8"
author: "Amy Heather"
date: "2024-10-02"
categories: [reproduce]
bibliography: ../../../quarto_site/references.bib
---

::: {.callout-note}

Correcting a parameter, adding parallel processing, and experimenting with run settings to try and get similar results more quickly. Total time used: 11h 21m (28.4%)

:::

## 09.14-09.47: Continuing to check parameters

::: {.callout-tip}
## Reflection

Would've been handy if all parameters could have been mentioned in one place.
:::

| Parameter | Code | Location in paper | Location in code |
| --- | --- | --- | --- |
| Length of simulation: 1 hour |`self.maxTime = 60` | 4.3 Processing | `PODSimulation() __init__` in `PodSimulation.py` |
| At screening station, 1% go to med eval and 99% to dispensing. At med eval station, 99% go to dispensing and 1% exit POD |*Can't find* | 4.1.1 Splits | - |
| Number of forms per designee: 1 31.8%, 2 26.7%, 3 16.8%, 4 12.6%, 5 6.8%, 6 5.6% | ❔ throughput x 3.2 - so appears to be averaged to 3.2? | Table 1 | `plotting_staff_results.R` |
| Service time **line manager**: triangular, minimum = 0.029, maximum = 0.039, mode = 0.044 |`time = random.triangular(low=5/60.0, high=92/60.0, mode=23/60.0)`. Low 5/60 = 0.0833. High 92/60 = 1.533. Mode 23/60 = 0.3833. There is one **commented out** which has... Low 1.77/60 = 0.0295. High 2.66/60 = 0.044. Mode 2.38/60 = 0.0397. This would match the article, except with the maximum and mode the other way round. | Table 2 | `visit_greeter()` in `Customer.py` |
| Service time **screening**: weibull, shape = 2.29, scale = 0.142 |`time = random.weibullvariate(alpha=0.142, beta=2.29)` | Table 2 | `visit_screener()` in `Customer.py` |
| Service time **dispensing**: weibull, shape = 1, scale = 0.311 |`time = random.weibullvariate(alpha=0.311, beta=1)` | Table 2 | `visit_dispenser()` in `Customer.py` |
| Service time **medical evaluation**: lognormal, logarithmic mean = 1.024, logarithmic stdev = 0.788 |`time = random.lognormvariate(mu=1.024, sigma=0.788)` | Table 2 | `visit_medic()` in `Customer.py` |
| Arrival rate 100 designees per minute per POD (following a Poisson distribution) |`self.meanTBA = 1/200.0 #1/float(115) #mean time between arrivals, minutes btw entities`. This would mean 200 arrivals per minute, rather than 100. | 4.1.4 Arrival rate | `PODSimulation() __init__` in `PodSimulation.py` |
| Three simulation runs | 🟡 This wasn't the case when I first started, but I have already changed this | 4.3 Processing | `main.py` |
| Number of staff members per station - mentions examples of where "*each station could have up to thirty staff members*" or "*for example 60*". We know it cannot be 30, but could reasonably assume to be 60 | 🟡 This wasn't the case when I first started, but I have already noticed and addressed this, fixing it to 60. | 3 Problem and 4.3 Processing | `StaffAllocationProblem()` in `StaffAllocationProblem.py` |
| Default crossover rate 1.0 and n=1 |`ea.variator = [variators.n_point_crossover]` with `variators` imported from `inspyred.ec`. The [docs](https://pythonhosted.org/inspyred/reference.html) show that the default crossover rate is 1.0 and the default number of crossover points is 1 | 5 Experimental results | `nsga2.py` |
| Experiment 1: tri-objective model, population 100, generations 50, pre-screened scenarios 10%, 20%, 30%... 90% | ✅ As in the input files like `10-prescreened.txt` | 4.1.1 Splits and Figure 5 | As left |
| Experiment 2: bi-objective model, population 50, generations 25, pre-screened scenarios 10%, 20%, 30%... 90% | TBC | 4.1.1 Splits and Figure 7 | - |
| Experiment 3: tri-objective model, pre-screened percentage ??, (a) 100 pop 50 gen (b) 200 pop 100 gen (c) 50 pop 25 gen | TBC<br><br>Note: Where I am unsure of pre-screened percentage here, I presume it might be the default from the code which, `if parameterReader == None`, sets `self.preScreenedPercentage = 0.1` | 5.3 Experiment 3 and Figure 8 | -<br><br>`PODSimulation() __init__` in `PodSimulation.py` |
| Experiment 4: tri-objective model, maximum line managers 1, 2 or 3 | TBC | 5.4 Experiment 4 and Figure 9 | - |
| Experiment 5: 6 dispensing, 6 screening, 4 line manager, one medical evaluator, number of replications 1-7 | TBC | 5.5 Experiment 5 and Figure 10 | - |
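
Two of the rows above can be sanity-checked with quick arithmetic. The snippet below is an illustrative sketch rather than code from the repository, and the variable names are hypothetical. It computes the mean number of forms implied by the Table 1 percentages, and converts the `meanTBA` value from the code into an arrival rate.

```python
# Illustrative check only: names are hypothetical, values are copied
# from the parameter table above.

# Table 1: percentage of designees carrying each number of forms
forms_pct = {1: 31.8, 2: 26.7, 3: 16.8, 4: 12.6, 5: 6.8, 6: 5.6}

# The published percentages need not sum to exactly 100, so normalise
total_pct = sum(forms_pct.values())
mean_forms = sum(n * pct for n, pct in forms_pct.items()) / total_pct

# Arrival rate implied by a mean time between arrivals (in minutes)
mean_tba = 1 / 200.0
arrivals_per_minute = 1 / mean_tba

print("Sum of percentages: %.1f" % total_pct)
print("Mean forms per designee: %.2f" % mean_forms)
print("Arrivals per minute: %.0f" % arrivals_per_minute)
```

On these numbers, the mean comes out at roughly 2.5 forms per designee, so the 3.2 multiplier does not appear to be a simple average of Table 1. Where the 3.2 comes from remains unclear.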

Reflections on discrepancies (❌): There is a big difference in the line manager service times - this might explain it! I will start with this, but could then try some of the others. I'm wary of trying them all at once, as it is always hard to know which is right - the code or the article. I will run 100 pop 5 gen 1 run.
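
To make the line manager discrepancy concrete, here is a minimal sketch (not from the repository) comparing the three parameter sets in the table. The dictionary and function names are hypothetical. It relies on two standard facts: a triangular distribution needs low <= mode <= high, and its mean is (low + mode + high) / 3.

```python
# Illustrative comparison only: names are hypothetical, values are
# copied from the line manager row of the table above.

# As reported in Table 2 of the paper
paper = {"low": 0.029, "high": 0.039, "mode": 0.044}

# Commented-out line in visit_greeter()
commented = {"low": 1.77 / 60.0, "high": 2.66 / 60.0, "mode": 2.38 / 60.0}

# Active line in visit_greeter()
active = {"low": 5 / 60.0, "high": 92 / 60.0, "mode": 23 / 60.0}


def is_ordered(params):
    """Check that low <= mode <= high, as a triangular distribution needs."""
    return params["low"] <= params["mode"] <= params["high"]


def triangular_mean(params):
    """Theoretical mean of a triangular distribution."""
    return (params["low"] + params["mode"] + params["high"]) / 3.0


for label, params in [("paper", paper), ("commented", commented), ("active", active)]:
    print("{}: ordered={}, mean={:.4f} min".format(
        label, is_ordered(params), triangular_mean(params)))
```

The paper's values fail the ordering check, which is consistent with the maximum and mode having been swapped in Table 2. The active parameters give a mean service time of about 0.67 minutes against about 0.038 minutes for the commented-out ones, which is roughly an 18-fold difference.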

## 13.15-13.25: Corrected line manager service times

This took 210 minutes to run (three and a half hours). The y-axis values in Figure 5 are now much higher - in fact, too high!

Figure 5:

![](figure5_fixtimes_100pop_5gen_1run.png)

Figure 6, filtered to just the throughput in the range of those plotted in the article:

![](figure6_fixtimes_100pop_5gen_1run.png)

I tried switching to 10 pop 1 gen 1 run, just to check whether it gives results in a similar range, as that would be much quicker for trying out different changes.

Run time: 6 minutes

These looked quite different though, so I will need to run with a bit more...

Figure 5:

![](figure5_10pop_1gen_1run.png)

Figure 6:

![](figure6_10pop_1gen_1run.png)

## 13.26-13.35, 13.44-14.03, 15.17-15.21: Adding parallel processing

I tried switching the loop in `Experiment1.py` into a parallel loop, to speed up the run time.

First, I just changed it to a function with a loop.

I ran this with the same parameters as above (10 pop 1 gen 1 run) to confirm the results didn't change at all due to the restructuring and, indeed, they remained the same, so this was successfully implemented. The only change was - as expected - to the time files, which are now a single file.

Run time: **6 minutes**

I then adjusted the loop to use parallel processing (with `multiprocessing.Pool`). This required the old syntax (i.e. not the `with` statement).

Run time: **55 seconds**
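
The pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code in `Experiment1.py`: `run_scenario` is a hypothetical stand-in for one scenario, and the per-scenario seeding is an assumption about how the seed control works. The pool is closed and joined explicitly, since `Pool` only supports the `with` statement from Python 3.3 onwards.

```python
from multiprocessing import Pool
import random


def run_scenario(prescreened_pct):
    """Hypothetical stand-in for running one pre-screened scenario.

    Seeding inside the function (an assumption) means each scenario
    gives the same result whether run serially or in a worker process.
    """
    rng = random.Random(prescreened_pct)
    return prescreened_pct, round(rng.random(), 6)


def run_serial(scenarios):
    """Run every scenario one after another in a plain loop."""
    return [run_scenario(s) for s in scenarios]


def run_parallel(scenarios, processes=2):
    """Run scenarios across worker processes without a with statement."""
    pool = Pool(processes=processes)
    try:
        # map() returns results in the same order as the inputs
        return pool.map(run_scenario, scenarios)
    finally:
        pool.close()  # no more tasks will be submitted
        pool.join()   # wait for the workers to exit


if __name__ == "__main__":
    scenarios = list(range(10, 100, 10))  # 10%, 20%, ... 90% pre-screened
    assert run_parallel(scenarios) == run_serial(scenarios)
    print("Serial and parallel results match")
```

Comparing the serial and parallel outputs like this mirrors the check described above, where the results were confirmed to be unchanged before relying on the faster run.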

::: {.callout-tip}
## Reflection

The pre-existing seed control and way the code was structured made it really easy to implement this.
:::

I then tried running with 50 pop 25 gen 1 run, as they used that for Figure 7 (though 3 runs) so it should be similar.

Run time: **57 minutes**

![](figure5_50pop_25gen_1run.png)

![](figure6_50pop_25gen_1run.png)

I then realised that the population and generation settings had just as much impact on the y axis as anything else! Hence, I figured the best plan of action would be to run as is with 100 pop 50 gen 1 run, and see how that looks. Then, if that doesn't look right, I can try tweaking the parameters identified in the table above.

## Timings

```{python}
import sys
sys.path.append('../')
from timings import calculate_times
# Minutes used prior to today
used_to_date = 606
# Times from today
times = [
('09.14', '09.47'),
('13.15', '13.25'),
('13.26', '13.35'),
('13.44', '14.03'),
('15.17', '15.21')]
calculate_times(used_to_date, times)
```
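
For reference, the arithmetic behind the totals can be sketched as below. This is a hypothetical re-implementation, not the `calculate_times` function in `timings.py`, and the 40 hour limit is an assumption inferred from 681 minutes being reported as 28.4%.

```python
def minutes_between(start, end):
    """Minutes between two same-day times written as 'HH.MM'."""
    start_h, start_m = (int(part) for part in start.split("."))
    end_h, end_m = (int(part) for part in end.split("."))
    return (end_h * 60 + end_m) - (start_h * 60 + start_m)


def summarise_time(used_to_date, times, limit_hours=40):
    """Return (total minutes, hours, leftover minutes, % of limit).

    limit_hours=40 is an assumed time limit, not taken from timings.py.
    """
    total = used_to_date + sum(minutes_between(s, e) for s, e in times)
    percent = round(100.0 * total / (limit_hours * 60), 1)
    return total, total // 60, total % 60, percent


times = [
    ('09.14', '09.47'),
    ('13.15', '13.25'),
    ('13.26', '13.35'),
    ('13.44', '14.03'),
    ('15.17', '15.21')]

total, hours, minutes, percent = summarise_time(606, times)
print("Total time used: %dh %dm (%s%%)" % (hours, minutes, percent))
```

The five sessions add up to 75 minutes, which takes the running total from 606 to 681 minutes (11h 21m), matching the summary at the top of this post.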
46 changes: 46 additions & 0 deletions logbook/posts/2024_10_03/index.qmd
Original file line number Diff line number Diff line change
@@ -0,0 +1,46 @@
---
title: "Day 9"
author: "Amy Heather"
date: "2024-10-03"
categories: [reproduce]
bibliography: ../../../quarto_site/references.bib
---

::: {.callout-note}

X. Total time used: Xh Xm (X%)

:::

## 09.05-09.08, 10.36-10.39, 15.23-X: Running all scenarios with 100 pop 50 gen 1 run

I tried running all scenarios with 100 population 50 generations 1 run on the remote machine. However, about 90 minutes later, I noted the process had stopped running with the error:

```
client_loop: send disconnect: Broken pipe
```

Apparently this error indicates the sudden termination of a network connection, or a timeout for the SSH connection due to no activity. I know the latter shouldn't be the case as I've run commands for longer previously, so assuming it might have just been a network issue, I tried again, and this worked.

Run time: 268 minutes = **4 hours 28 minutes** (in parallel on remote machine)

![](figure5_100pop_50gen_1run.png)

![](figure6_100pop_50gen_1run.png)

## Timings

```{python}
import sys
sys.path.append('../')
from timings import calculate_times
# Minutes used prior to today
used_to_date = 681
# Times from today
times = [
('09.05', '09.08')]
calculate_times(used_to_date, times)
```
Binary file modified reproduction/r_outputs/figure5.png
Binary file modified reproduction/r_outputs/figure6.png
32 changes: 15 additions & 17 deletions reproduction/r_scripts/plot_experiment1.Rmd
@@ -16,6 +16,11 @@ base_folder <- "../python_outputs/experiment1"
subfolders <- list.dirs(base_folder, recursive = FALSE, full.names = TRUE)
```

```{r}
# Import time
time <- readLines(file.path(base_folder, "time.txt"), warn=FALSE)
```

```{r}
# Initialise empty lists
result_list <- list()
@@ -30,9 +35,6 @@ for (folder_path in subfolders) {
# Save results.txt, and add the scenario
result_list[[folder_name]] <- read.delim(file.path(folder_path, "results.txt"))
result_list[[folder_name]]$scenario <- folder_name
# Save time.txt
times[[folder_name]] <- readLines(file.path(folder_path, "time.txt"), warn=FALSE)
}
```

@@ -43,8 +45,17 @@ rownames(results) <- NULL
# Add forms
results$forms <- results$throughput*3.2
```

## Show times for experiment 1

results
```{r}
# Convert the times (H:M:S strings) to numeric seconds
time_adj <- as.numeric(strptime(time, format="%H:%M:%OS")) - as.numeric(strptime("0:00:00", format="%H:%M:%OS"))
# Sum the times
total_time <- sum(unlist(time_adj))
print(paste0(round(total_time), " sec, ", round(total_time/60), " min"))
```

## Create Figure 5
@@ -118,16 +129,3 @@ ggsave("../r_outputs/figure6.png", width = 10 , height = 10)
fig6
```

## Show times for experiment 1

```{r}
# Convert the times to difftime objects
times_converted <- lapply(times, function(x) {
as.numeric(strptime(x, format="%H:%M:%OS")) - as.numeric(strptime("0:00:00", format="%H:%M:%OS"))
})
# Sum the times
total_time <- sum(unlist(times_converted))
print(paste0(round(total_time), " sec, ", round(total_time/60), " min"))
```
