
Neel/spatial map answer extraction #57

Open · wants to merge 11 commits into base: main
Conversation

@neelsj (Collaborator) commented Dec 7, 2024:

This PR is a set of changes to update and simplify the answer extraction for Spatial Map and Maze. It removes model-specific answer extraction.

eureka_ml_insights/data_utils/spatial_utils.py (outdated, resolved)
eureka_ml_insights/metrics/metrics_base.py (outdated, resolved, 2 threads)
@vidhishanair (Collaborator) previously approved these changes Dec 10, 2024 and left a comment:

LGTM

@@ -68,8 +68,7 @@
     ASTEvalTransform,
     PrependStringTransform,
     ExtractAnswerGrid,
-    ExtractAnswerSpatialMap,
-    ExtractAnswerMaze,
+    ExtractAnswerSpatialMapAndMaze,
Collaborator:
Please add "ExtractQuestionOptions" so that when we run the formatters this import does not get removed (considered unused).

Collaborator (author):

Ahh, yes, will do.

Collaborator (author):
Done and checked in.

eureka_ml_insights/metrics/metrics_base.py (outdated, resolved, 3 threads)
eureka_ml_insights/user_configs/vision_language/maze.py (outdated, resolved)
The code is from: https://github.com/alvinmingwisc/spatial_reason_vlm/tree/main/eval,
and included with minimal modifications.
Extracts the answer from the text based on known model output patterns.
Searches for both a letter and whole word answer and returns both as they are not
Collaborator:

Oh, is this why you added the OR metrics? If so, I don't think this justifies adding new metric classes; instead, the answer extractor should return some combination of these two, maybe simply "x or y".

Collaborator (author):

Happy to do this another way, especially if that means less new code. However, if, say, I return "x or y", I still need a metric that knows how to check this; the current ones just look for a case-sensitive or case-insensitive exact match.

One thought was to make one metric that can take a single answer or a list (instead of having two metrics as now), plus an optional parameter for how to combine the matches, i.e. any ("or") or all ("and"). What do you think?
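For readers following the thread, the combined metric proposed above could be sketched roughly as follows (a minimal sketch; the class name ExactMatchN and its parameters are illustrative, not from the eureka_ml_insights codebase):

```python
# Hypothetical sketch of a single metric that accepts one extracted answer or a
# list of candidates, with a mode controlling how per-candidate matches combine.
class ExactMatchN:
    def __init__(self, combine="any", case_sensitive=False):
        # combine="any" -> correct if any candidate matches ("or");
        # combine="all" -> correct only if every candidate matches ("and").
        assert combine in ("any", "all")
        self.combine = combine
        self.case_sensitive = case_sensitive

    def _match(self, answer, target):
        if not self.case_sensitive:
            answer, target = answer.lower(), target.lower()
        return answer == target

    def evaluate(self, extracted, target):
        # Accept either a single string or a list of candidate answers.
        candidates = [extracted] if isinstance(extracted, str) else extracted
        matches = [self._match(a, target) for a in candidates]
        return any(matches) if self.combine == "any" else all(matches)
```

For example, `ExactMatchN(combine="any").evaluate(["B", "blue"], "blue")` would count as correct because the whole-word candidate matches even though the letter does not.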

Collaborator (author):

Now returning the "x or y" string

if answers_match:
    model_output_parsed = answers_match.group(1)

return [model_output_parsed, model_output_parsed_letter]
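Outside the diff context, the return-both pattern in the excerpt above can be sketched in a self-contained way (the regexes, function name, and options format here are assumptions for illustration, not the PR's exact code):

```python
import re

# Illustrative sketch: search the model output for both a letter choice and a
# whole-word answer, returning both since either one may be absent.
def extract_letter_and_word(model_output, options):
    letter = None
    word = None
    # Look for a parenthesized choice like "(B)" or a phrase like "answer is B".
    letter_match = re.search(r"\(([A-D])\)|\banswer is ([A-D])\b", model_output)
    if letter_match:
        letter = letter_match.group(1) or letter_match.group(2)
    # Look for any option's full text as a whole word, case-insensitively.
    for option_text in options.values():
        if re.search(rf"\b{re.escape(option_text)}\b", model_output, re.IGNORECASE):
            word = option_text
            break
    return [word, letter]
```

For instance, `extract_letter_and_word("The answer is (B), southeast.", {"A": "northwest", "B": "southeast"})` returns both forms, while output that mentions only the direction yields a letter of `None`.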
@safooray (Collaborator) commented Dec 11, 2024:

Would you mind adding a test for the extractor to showcase and check all the cases you have covered? Regex is hard to read and review; a test would give me peace of mind.

Collaborator (author):

Added in a new file under tests/data_utils_tests.
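A test of that shape might look like the following sketch (the extractor here is a self-contained stand-in so the example runs on its own; the real tests under tests/data_utils_tests exercise the PR's actual extraction code):

```python
import re
import unittest

# Stand-in extractor for illustration only; patterns are assumptions.
def extract_letter(model_output):
    match = re.search(r"\(([A-D])\)|\banswer is ([A-D])\b", model_output,
                      re.IGNORECASE)
    if match:
        return match.group(1) or match.group(2)
    return None

class TestAnswerExtraction(unittest.TestCase):
    def test_parenthesized_letter(self):
        self.assertEqual(extract_letter("The best route is (C)."), "C")

    def test_answer_is_phrase(self):
        self.assertEqual(extract_letter("So the answer is B, southeast."), "B")

    def test_no_answer(self):
        self.assertIsNone(extract_letter("I cannot determine the direction."))

if __name__ == "__main__":
    unittest.main()
```

Enumerating each supported output pattern as its own test case makes the regex's intended coverage reviewable at a glance, which is exactly what the reviewer asked for.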
