Add bias detection #19
Conversation
TODOs for Eljas later
eroell commented on 2024-04-12T13:18:56Z:
I think I don't immediately understand points

VladimirShitov commented on 2024-04-17T12:49:23Z:
This is a bit controversial.
1. If all visits from a clinic are taken, researchers do not have control over the demographics distribution. In this case, the best they can do is report the data biases.
3. Isn't something better than nothing? I don't see how dropping underrepresented groups decreases biases. Maybe a better way would be to use these groups as a validation set. If results obtained on the bigger data reproduce in these groups, that's great. If not, that's a good point for discussion of the study's limitations.
4-5. I second @eroell's point. "Propensity score" is not commonly familiar to technical people, and "inverse probability" might confuse people with a more clinical background.
6. Well, again, it is tricky. You do not expect breast cancer to have a 50/50 sex distribution.

VladimirShitov commented on 2024-04-17T12:51:31Z:
If you've taken it from somewhere, maybe leaving it as is and referencing the source is fine. If not, I would find some conventional approaches and list them here.

eroell commented on 2024-04-17T16:57:24Z:
1. Added "if possible". Agreed.
3. We do have it from somewhere: I added this specifically to this point.
4./5. Removed - if controversial, don't state it this specifically.
6. Removed.

Zethson commented on 2024-04-17T17:00:57Z:
Why don't we define 4/5 properly instead of removing it? We want to add propensity score matching to ehrapy.

eroell commented on 2024-04-17T19:34:59Z:
Keep the propensity score for the clinical people, and explain inverse probability weighting.
True, with propensity scores coming we can list that here:
- Employ propensity score matching to adjust for variables that predict receiving the treatment, thus balancing the data.
- Apply inverse probability weighting (that is, assign more weight to groups infrequently observed) to adjust the influence of observed variables that could lead to selection bias.
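As a reference for the second bullet, a minimal sketch of inverse probability weighting on fabricated data (all column names and numbers are made up for illustration; propensity score matching would additionally model treatment assignment from covariates):

```python
import pandas as pd

# Hypothetical cohort in which group "B" is underrepresented.
df = pd.DataFrame({
    "group": ["A"] * 90 + ["B"] * 10,
    "outcome": [1] * 50 + [0] * 40 + [1] * 8 + [0] * 2,
})

# Weight each record by the inverse of its group's observed frequency,
# so infrequently observed groups contribute more per record.
group_freq = df["group"].value_counts(normalize=True)
df["weight"] = 1.0 / df["group"].map(group_freq)

weighted_rate = (df["outcome"] * df["weight"]).sum() / df["weight"].sum()
print(f"Unweighted outcome rate: {df['outcome'].mean():.2f}")  # dominated by group A
print(f"Weighted outcome rate:   {weighted_rate:.2f}")         # both groups contribute equally
```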
eroell commented on 2024-04-12T13:18:57Z:
Even when the filtering criteria are clearly defined and consistently applied, a non-representative sample of the original population can occur?
E.g. the filtering criterion of being older than 18 years (and wearing glasses), very consistently applied, will give a non-representative sample of the original population.

eroell commented on 2024-04-17T17:00:01Z:
I rewrote it to:
"Filtering bias emerges when the criteria used to include or exclude records in the analysis are unintendedly or unexpectedly filtering for other variables as well."
This keeps the spirit and seems more correct to me.
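To make the rewritten definition concrete, a toy example with fabricated data (hypothetical column names): a cleanly applied age filter also shifts the distribution of insurance type, because the two correlate in this made-up sample:

```python
import pandas as pd

# Fabricated records where insurance type correlates with age.
df = pd.DataFrame({
    "age": [16, 17, 25, 40, 70, 75],
    "insurance": ["public", "public", "private", "private", "public", "public"],
})

# A clearly defined, consistently applied filter ...
adults = df[df["age"] >= 18]

# ... that nevertheless changes the distribution of another variable.
print(df["insurance"].value_counts(normalize=True))      # 4/6 public before filtering
print(adults["insurance"].value_counts(normalize=True))  # 2/4 public after filtering
```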
eroell commented on 2024-04-12T13:18:57Z:
I don't really understand the
Very vague and somewhat obvious, no?
eroell commented on 2024-04-12T13:18:58Z:
I see how this can cause misrepresentations if matching ontologies is hard or not properly doable. Still, I have the feeling this is more of an inconsistency than a "bias", although we use this word a bit flexibly, with its different meanings in different contexts.
eroell commented on 2024-04-12T13:18:59Z:
So the patient's severity does not alter with time? Just the mild ones that stay mild forever are lost, while the severe ones that stay severe forever will come again? Not a common scenario, no? It makes the test a bit hard to understand. Having the test only on the baseline characteristics, as in the last point of the mitigation strategy, is easier to understand, but it lacks taking into account how patients develop.

eroell commented on 2024-04-17T21:21:27Z:
Added explanations to the numbers, and context making it more obvious, I think.
eroell commented on 2024-04-12T13:19:00Z:
The last point is a very nice one indeed.
eroell commented on 2024-04-12T13:19:00Z:
Though people doing clinical trials know this very well, and the TableOne of disease and control group with average disease severity is an integral part of any such clinical trial; not sure if it is stating the obvious a bit. But we can also leave it like that.
eroell commented on 2024-04-12T13:19:01Z:
I think we need to explain what the magic numbers represent (the 50 in line 5, and the others too) - not quite obvious.

eroell commented on 2024-04-17T21:33:16Z:
This is actually a very intricate example, I think; I'm not sure I grasped it before.
This is a scenario modelling "regression toward the mean", where patients that have a good baseline health have a tendency to lose a lot of their extraordinary health towards a mediocre outcome. For these patients, the treatment effect is not fully visible, as it seems they have not profited much starting off from their high baseline; but in fact they did - they would have felt worse without treatment than they do now. This baseline information being hidden in the regression is the bias.
Do we want to have this complex scenario this way in our notebook? I think we'd have to make it clearer.
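To make the scenario concrete, a minimal self-contained simulation of regression toward the mean on fabricated numbers (unrelated to the notebook's actual values):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Stable true health; each measurement adds independent noise.
true_health = rng.normal(50, 5, n)
baseline = true_health + rng.normal(0, 10, n)
followup = true_health + rng.normal(0, 10, n)  # deliberately no treatment effect

# Select patients with extraordinarily good baseline measurements.
selected = baseline > 65
print(f"Selected baseline mean:  {baseline[selected].mean():.1f}")  # far above 50
print(f"Selected follow-up mean: {followup[selected].mean():.1f}")  # regresses back toward 50

# The apparent "decline" is pure measurement noise; ignoring baseline
# selection would make a genuinely helpful treatment look ineffective.
```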
eroell commented on 2024-04-12T13:19:02Z:
noice
eroell commented on 2024-04-12T13:19:03Z:
add conclusion section?
VladimirShitov commented on 2024-04-17T14:02:41Z:
There are two groups of advice:
Maybe it's worth separating those points into separate lists? The same applies to other sections.

eroell commented on 2024-04-17T16:16:54Z:
Agreed - made this separation here, as it is particularly obvious to me in this section; also removed point 4, with my "having fewer but stronger points" attitude here :)

Zethson commented on 2024-04-17T17:04:00Z:
True. I'd start with the people planning/performing the experiment and then the rest. Generally, I think it's useful to say that you should talk to your collaborators and have a say in experiment/cohort design.

> with my "having fewer but stronger points" attitude here

Fine with me!
VladimirShitov commented on 2024-04-17T14:02:42Z:
The example is valid, but I think the opposite is more common. For severe cases you have a lot of tests, sometimes invasive, while for controls and mild cases they are typically not done.

eroell commented on 2024-04-17T16:19:11Z:
Indeed - this is counterintuitive. I now changed it to the common, intuitive one where mild conditions have more missing data. Although the bias, overestimating the severity and impact, does not sound as dramatic. Nonetheless, it is a bias, and could cost resources potentially better invested otherwise.
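A small sketch of that intuitive direction, with fabricated data (all values and rates made up): when milder cases are measured less often, statistics computed on the observed records overestimate severity.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Fabricated severity scores from 0 (mild) to 10 (severe).
severity = rng.integers(0, 11, n)

# Milder cases get fewer lab tests, so their values are missing more often.
p_missing = 0.8 - 0.06 * severity  # 80% missing at severity 0, 20% at severity 10
observed = rng.random(n) >= p_missing

print(f"True mean severity:           {severity.mean():.2f}")
print(f"Mean severity where observed: {severity[observed].mean():.2f}")  # biased upward
```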
VladimirShitov commented on 2024-04-17T14:02:43Z:
I once saw a Python package for pretty formatting of numbers and p-values. For example, it would output "<0.001" for tiny p-values. But I can't find it now...

eroell commented on 2024-04-17T16:32:07Z:
Cool idea, just added this now:
pvalue_string = "< 0.001" if pvalue < 0.001 else f"{pvalue:.2f}"

Zethson commented on 2024-04-17T17:04:57Z:
Does it really matter here? I think in this specific case it just adds noise.

eroell commented on 2024-04-17T19:24:20Z:
haha ok. undo
VladimirShitov commented on 2024-04-17T14:02:43Z:
It is a bit difficult to memorize all the mean, median and std values while reading the page. It would be nice to save them in variables or a data frame, and print them after each step.

eroell commented on 2024-04-17T16:32:39Z:
Amazing - I struggled with that myself even. Added precisely that; it's wonderful.
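One possible shape for that suggestion (a sketch; the helper name and the example values are made up and not the notebook's actual code):

```python
import pandas as pd

# Accumulate summary statistics after each processing step so readers
# can compare them in one table instead of memorizing printouts.
stats_log = []

def log_stats(values: pd.Series, step: str) -> None:
    stats_log.append(
        {"step": step, "mean": values.mean(), "median": values.median(), "std": values.std()}
    )

values = pd.Series([90, 95, 100, 105, 200])  # fabricated measurements
log_stats(values, "raw")
log_stats(values[values < 150], "after outlier removal")

print(pd.DataFrame(stats_log))  # one row per step: step, mean, median, std
```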
VladimirShitov commented on 2024-04-17T14:02:44Z:
"Clusters in embeddings" doesn't sound right. I'd say "clusters in the data".

eroell commented on 2024-04-17T16:38:00Z:
Changed to "clusters in the data".
eroell commented on 2024-04-17T14:51:48Z:
Todo for me:

import warnings
warnings.filterwarnings("ignore")

and check it doesn't mess up things we want.

eroell commented on 2024-04-17T16:38:15Z:
done
Yes, taking care of :)

inserted a note on that

I am also not sure, and vote for removing this line - being restrictive to a few strong points rather than many.
Yeah, we can remove this.
We did have a table where the categoricals were listed before, giving that indication; this quickly gets a bit bloated though for this many features... At the moment I lean towards keeping only that, being aware that some people would look for more detailed information.