numeric match and topic tag #72

lchen001 · 2024-12-18T17:19:14Z

Add a new metric "numeric match" to compare two numeric values.
Add a pipeline to extract topics from a math question.
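
For readers unfamiliar with the idea, a numeric match metric could look roughly like the sketch below. This is not the code added in this PR: the function name `numeric_match` and its tolerance parameters are assumptions made for illustration. It parses both values as floats and compares them within a tolerance, so that "042" and 42 count as a match while non-numeric output counts as a mismatch.

```python
import math


def numeric_match(prediction, target, rel_tol=1e-9, abs_tol=0.0):
    """Hypothetical sketch: True if both values parse as numbers and are close.

    The name and the tolerance parameters are illustrative assumptions,
    not the interface of the metric introduced in this PR.
    """
    try:
        pred_value = float(prediction)
        target_value = float(target)
    except (TypeError, ValueError):
        # Non-numeric or missing output is treated as a mismatch.
        return False
    if math.isnan(pred_value) or math.isnan(target_value):
        # NaN never matches anything, including itself.
        return False
    return math.isclose(pred_value, target_value, rel_tol=rel_tol, abs_tol=abs_tol)
```

Comparing parsed numbers rather than strings avoids penalizing answers that differ only in formatting, such as leading zeros or a trailing ".0".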

nushib

Prompt questions

eureka_ml_insights/user_configs/aime.py

eureka_ml_insights/prompt_templates/aime_templates/Template_tag1.jinja

nushib · 2024-12-19T06:03:40Z

eureka_ml_insights/user_configs/aime.py

+
+
+class AIME_PIPELINETag(AIME_PIPELINE):
+    """This class specifies the config for running AIME benchmark 5 repeated times"""


Update the docstring so it reflects the functionality of the class.

nushib · 2024-12-19T06:20:01Z

eureka_ml_insights/user_configs/aime.py

+        # Each query is tagged with one or more topics from arithmetic, algebra, counting, geometry, number theory, and probability and other.
+        # These topics follow the description on the official website: https://artofproblemsolving.com/wiki/index.php/American_Invitational_Mathematics_Examination?srsltid=AfmBOooSIQ8ua5aJX00ZtYCKDuOAB4I4c-YE9zr1xYZ86fq8x5RL2sEg.
+        # In their own words, "The AIME tests mathematical problem solving with arithmetic, algebra, counting, geometry, number theory, and probability and other secondary school math topics"
+        return pipeline


Since the class inherits from the original AIME_PIPELINE, it will still run the rest of AIME_PIPELINE, only with the tagging prompt. This means it will also try to extract an answer and generate the report. There are two options here: 1) do not inherit from AIME_PIPELINE, or 2) inherit from AIME_PIPELINE but return only the components you need in the pipeline. For example:

    return PipelineConfig(
        [
            self.data_processing_comp,
            self.inference_comp,
            self.data_post_processing,
        ],
        self.log_dir,
    )

Option 2 also requires changing the answer extractor, since the marker is different here.
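
To make option 2 concrete, here is a self-contained sketch of the pattern with stand-in classes. Everything in it is hypothetical: `PipelineConfig` is a simplified stand-in, `BasePipeline` and `TagPipeline` are placeholder names, the method name `configure_pipeline` is assumed, and the components are plain strings rather than real component configs. It only shows how a subclass can reuse the parent's setup while returning a shorter component list.

```python
from dataclasses import dataclass


@dataclass
class PipelineConfig:
    """Simplified stand-in for the real config class (an assumption)."""
    component_configs: list
    log_dir: str


class BasePipeline:
    """Placeholder for the parent pipeline; components are plain strings here."""

    def configure_pipeline(self, log_dir="logs"):
        self.log_dir = log_dir
        self.data_processing_comp = "data_processing"
        self.inference_comp = "inference"
        self.data_post_processing = "data_post_processing"
        self.answer_extraction = "answer_extraction"
        self.report = "report"
        # The parent runs every stage, including extraction and reporting.
        return PipelineConfig(
            [
                self.data_processing_comp,
                self.inference_comp,
                self.data_post_processing,
                self.answer_extraction,
                self.report,
            ],
            self.log_dir,
        )


class TagPipeline(BasePipeline):
    """Reuses the parent's setup but returns only the tagging stages."""

    def configure_pipeline(self, log_dir="logs"):
        # Let the parent build all components, then discard its full config.
        super().configure_pipeline(log_dir)
        return PipelineConfig(
            [
                self.data_processing_comp,
                self.inference_comp,
                self.data_post_processing,
            ],
            self.log_dir,
        )
```

With this shape, the tagging pipeline stops after post-processing, so no answer extraction or report generation runs on the tagging output.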

Lingjiao Chen added 4 commits December 17, 2024 11:39

add tagging

abf6c03

add new metric

77d1103

remove sampling

2fe7095

back to original model configs

bd72169

nushib reviewed Dec 19, 2024


update the tagging prompt

e4cb820

lchen001 requested a review from nushib December 19, 2024 01:56

nushib reviewed Dec 19, 2024


lchen001 and others added 2 commits January 17, 2025 11:02

add direct run prompt

551341a

split majority vote performance by year

392b559
