Small change to `derive_predictors_and_scores` for speed + normalization #119

PalkaPuri · 2024-09-11T20:31:30Z

I noticed that generating a combined DataFrame of predictors and scores was taking a very long time for large datasets (my kernel kept dying). The operation was computationally expensive because we were using `df.merge`, which searches through column values to find matching rows. However, we can get away with something simpler like `pd.concat`, since all predictor/score DataFrames inherit the grid from `local_windows` and hence the rows match by design. This should help clear the speed bottleneck.
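To make the difference concrete, here is a minimal sketch on toy data, not the repository's implementation. The grid columns (`time`, `x`, `y`) and the predictor/score names (`proximity`, `how_far`) are assumptions chosen for illustration. It shows that a key-based merge and a positional concat agree when both frames share the same rows in the same order.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the shared grid inherited from local_windows.
grid = pd.MultiIndex.from_product(
    [range(3), range(4), range(4)], names=["time", "x", "y"]
).to_frame(index=False)

rng = np.random.default_rng(0)
predictor = grid.assign(proximity=rng.random(len(grid)))
score = grid.assign(how_far=rng.random(len(grid)))

keys = ["time", "x", "y"]

# Key-based join: pandas has to match rows on the key columns.
merged = predictor.merge(score, on=keys, how="inner")

# Index-aligned join: only valid if both frames hold the same rows
# in the same order, so check that assumption explicitly first.
assert predictor[keys].equals(score[keys])
concatenated = pd.concat([predictor, score.drop(columns=keys)], axis=1)

# On aligned inputs the two approaches give the same table.
pd.testing.assert_frame_equal(merged, concatenated)
```

The explicit check on the key columns is the important part: `pd.concat(axis=1)` aligns on the index only, so it silently pairs the wrong rows if one frame was reordered and re-indexed.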

I also updated the function to optionally return eCDF-normalized values.

PalkaPuri · 2024-09-25T21:10:54Z

Added an option to scale values based on the empirical CDF.
(Rescaling by min and max, which we were doing before, has no effect because the predictors are already scaled to lie between 0 and 1.)
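The contrast between the two scalings can be sketched on a toy column. This is an illustration under assumptions, not the code behind `add_scaled_scores`: the helper names `ecdf_scale` and `minmax_scale` are hypothetical, and the eCDF is computed here with a percentile rank.

```python
import pandas as pd


def ecdf_scale(values: pd.Series) -> pd.Series:
    """Map each value to the fraction of non-NaN observations <= that value."""
    return values.rank(method="max", pct=True)


def minmax_scale(values: pd.Series) -> pd.Series:
    """Classic min/max rescaling to the [0, 1] interval."""
    return (values - values.min()) / (values.max() - values.min())


# A skewed toy predictor that already spans [0, 1].
scores = pd.Series([0.0, 0.01, 0.02, 0.05, 1.0], name="proximity")

unchanged = minmax_scale(scores)  # identical to the input: min is 0, max is 1
spread_out = ecdf_scale(scores)   # evenly spaced ranks: 0.2, 0.4, 0.6, 0.8, 1.0
```

Because the eCDF depends only on the ordering of the values, it spreads out a column whose mass is concentrated near 0, which min/max rescaling cannot do once the column already spans the unit interval.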

…uff (collab2 ). (#126)
* Some work on the random_foragers notebook and fixing stuff.
* Linting + completing the random_foragers notebook.
* Finished random_foragers
* Interactive plots now should be displayed in HTML
* make format
* small fixes to random foragers
* Some more tweaks + zero-index fixes.
* Hungry birds simulation updated.
* Minor.
* Completed the follower NB.
* Saves the samples from each one of the R,H,F to disk, for later plotting in a single figure.
* Comparative fig.
* Minor.
* Make lint and format
* Typos
* Improved explanations of the predictors and the scores.
* Updated the model description in the random notebook.
* Minor
* Added option for initial positions. Updated RHF.
* reviewed random
* added toc to followers
* fixed followers
* Small formulas + model updates
* small modification
* small fixes, re-ran
* fixing save and display in follower
* format lint, dilling in hungry
---------
Co-authored-by: rfl-urbaniak <rfl.urbaniak@gmail.com>

rfl-urbaniak

Please pull the current version of staging from origin, resolve all conflicts, and make sure all the tests pass.

PalkaPuri · 2024-12-12T03:59:19Z

@rfl-urbaniak @dimkab Just finished going over this branch. The following changes were implemented:

1. The final derivedDF is now generated by concatenating each predictor/score DF on the index instead of using `df.merge`, which was more computationally intensive.
2. Added time logging for the generation of derivedDF and local_windows.
3. Updated the UserWarning raised in case of NaNs: previously we said that X frames are dropped from derivedDF, but that is incorrect, as the number X corresponds to rows of the DF (which may come from any number of frames of the data).
4. Updated the add_scaled_scores option to scale values according to the empirical CDF. Previously, scaling using min/max did not do anything, as all predictors/scores are already scaled to be between 0 and 1.
5. The output of the corresponding test notebook changed due to these updates.
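The distinction between dropped rows and affected frames in the warning can be sketched as follows. This is a hypothetical helper, not the function in the repository: the name `drop_nan_rows`, the `time` column, and the message wording are all assumptions.

```python
import warnings

import numpy as np
import pandas as pd


def drop_nan_rows(derived_df: pd.DataFrame, time_col: str = "time") -> pd.DataFrame:
    """Drop rows containing NaNs and warn with the row count, not a frame count."""
    nan_mask = derived_df.isna().any(axis=1)
    n_rows = int(nan_mask.sum())
    if n_rows > 0:
        # Several dropped rows can belong to the same frame, so count frames separately.
        n_frames = int(derived_df.loc[nan_mask, time_col].nunique())
        warnings.warn(
            f"Dropped {n_rows} rows with NaN values (from {n_frames} frames).",
            UserWarning,
        )
    return derived_df.loc[~nan_mask].reset_index(drop=True)


# Toy data: three rows contain NaNs, but they come from only two frames.
example = pd.DataFrame(
    {
        "time": [0, 0, 0, 1, 1, 2],
        "velocity": [np.nan, np.nan, 0.3, np.nan, 0.5, 0.6],
    }
)
```

Calling `drop_nan_rows(example)` warns about 3 rows from 2 frames and returns the 3 remaining rows, which is why reporting the row count as a number of frames would be misleading.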

Note: After merging these changes we would need to rerun all the doc notebooks, as 2, 3, and 4 can potentially change the outputs of the cells. I have not done this yet, so as not to overwhelm you both with 100 file changes in one PR.

rfl-urbaniak

I am not convinced that replacing merge with concat has no strange consequences, at least for some of the predictors. As a sanity check, I re-ran the communicators inference with your proposed modification, and the posterior marginals are significantly different from the ones we currently have. I think this modification, if indeed correct, needs additional explicit tests that ensure proper functioning, involving predictors other than velocity too.
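One possible shape for such a test is sketched below, on toy data rather than the real predictors. The key columns in `KEYS`, the helper names, and the `proximity`/`how_far` columns are hypothetical. The check compares the merge-based and concat-based tables, and it fails when one frame's rows are no longer in the shared grid order.

```python
import numpy as np
import pandas as pd

KEYS = ["forager", "time", "x", "y"]  # assumed grid columns, for illustration only


def combine_with_merge(frames):
    """Join frames by matching rows on the key columns."""
    out = frames[0]
    for frame in frames[1:]:
        out = out.merge(frame, on=KEYS, how="inner")
    return out


def combine_with_concat(frames):
    """Join frames side by side on the index, keeping keys from the first frame."""
    rest = [frame.drop(columns=KEYS) for frame in frames[1:]]
    return pd.concat([frames[0]] + rest, axis=1)


def same_result(frames) -> bool:
    """Return True if both strategies produce the same table."""
    merged = combine_with_merge(frames).sort_values(KEYS).reset_index(drop=True)
    concatenated = combine_with_concat(frames).sort_values(KEYS).reset_index(drop=True)
    try:
        pd.testing.assert_frame_equal(merged, concatenated)
    except AssertionError:
        return False
    return True


grid = pd.MultiIndex.from_product(
    [range(2), range(3), range(2), range(2)], names=KEYS
).to_frame(index=False)
rng = np.random.default_rng(1)
proximity = grid.assign(proximity=rng.random(len(grid)))
how_far = grid.assign(how_far=rng.random(len(grid)))

# Same rows, but reversed and re-indexed: concat now pairs the wrong rows.
how_far_reordered = how_far.iloc[::-1].reset_index(drop=True)

aligned_ok = same_result([proximity, how_far])               # True
misaligned_ok = same_result([proximity, how_far_reordered])  # False
```

Running a check like this for every predictor and score, not only velocity, would show whether any of them reorders or drops rows before the final combination step.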

PalkaPuri added 30 commits August 7, 2024 14:32

add_velocity function

8365331

velocity predictor WIP

e2ee500

wip

6a19ae4

format/lint

b5dde8a

specify forager column as int

a923457

add warning

8cc4111

format/lint

892da93

rename f,t for clarity

f675d23

rerun notebook

1ee7caa

add docstrings

fcd34b2

format/lint

dc950c2

use inbuilt function for gaussian pdf

205235f

merge changes from pp-collab2-compute-velocity

b476a17

updated visualization

045e7d0

add handling of nan values in visualization

05b3d90

add _generate_pairwise_copying predictor

5774858

format/lint

41a7787

type hints

ab6ee7e

add pairwise predictor and animation function

47679f3

change velocity to backward looking

fac3f10

Merge branch 'pp-collab2-compute-velocity' of https://github.com/Basi…

ddf4572

…sResearch/collab-creatures into pp-collab2-pairwise-copying-pred

docstrings and type hints

4b242d8

generate function and typehints

82223c0

Merge branch 'staging-collab-2' of https://github.com/BasisResearch/c…

11f5dce

…ollab-creatures into pp-collab2-pairwise-copying-pred

fixed naming, nan handling

0dec71b

refactoring

ea399b2

vicsek wip

9a2ec97

update normalization of velocity predictors

033e565

distance_to_next_move WIP

9c6854b

Merge branch 'pp-collab2-vicsek-predictor' of https://github.com/Basi…

8bfa2d0

…sResearch/collab-creatures into pp-collab2-howfarscore

rfl-urbaniak and others added 4 commits September 19, 2024 18:43

restored init fix

f3a6aa1

bump pyro to 1.9.1

8f61e4e

revert (chirho), remove deterministic from rendering

e0dfbaf

updated warning and added eCDF normalization

1b204de

dimkab and others added 3 commits October 2, 2024 16:01

format lint

2150069

Merge branch 'ru-random-hungry-2' of https://github.com/BasisResearch…

58f526c

…/collab-creatures into pp-collab2-upgrade-derive

rfl-urbaniak requested changes Oct 3, 2024

rfl-urbaniak added status:awaiting response Awaiting response from creator and removed status:WIP Work-in-progress not yet ready for review labels Oct 3, 2024

PalkaPuri changed the title ~~Small change to derive_predictors_and_scores for speed~~ Small change to derive_predictors_and_scores for speed + normalization Oct 6, 2024

Merge branch 'staging-collab-2' into pp-collab2-upgrade-derive

63e3a87

PalkaPuri added status:awaiting review Awaiting response from reviewer and removed status:awaiting response Awaiting response from creator labels Oct 7, 2024

PalkaPuri requested review from dimkab and rfl-urbaniak October 7, 2024 12:02

rfl-urbaniak removed the collab2.0 label Nov 1, 2024

PalkaPuri changed the base branch from staging-collab-2 to main December 12, 2024 02:21

PalkaPuri added 3 commits December 11, 2024 19:13

make copy of changes from collab2 folder

7ef84c0

Merge remote-tracking branch 'origin/staging-velocity-prs' into pp-co…

89574fd

…llab2-upgrade-derive

copied changes from collab2

0aa4980

rfl-urbaniak added 4 commits December 31, 2024 06:13

Merge branch 'main' of https://github.com/BasisResearch/collab-creatures

989aa5e

into pp-collab2-upgrade-derive

small comments in derive_predictors, fix model graph

6e822da

fixing derive_pred notebook for testing

391031c

format, lint

667aa44

rfl-urbaniak requested changes Dec 31, 2024

rfl-urbaniak added blocked do not merge until further discussion and removed status:awaiting review Awaiting response from reviewer labels Dec 31, 2024
