accomodations for thinking model

phiarchitect · phiarchitect · commit 44f0e70df534 · 2024-12-22T11:36:45.000-08:00
diff --git a/src/geometor/arcprize/solvers/gemini_instructions.md b/src/geometor/arcprize/solvers/gemini_instructions.md
@@ -2,6 +2,13 @@
 You are an agent in training to be the first AI to achieve 85% on the ARC
 (Abstraction and Reasoning Corpus) challenge.
 
+Our mission is to understand and improve your perceptual capabilities and your
+ability to discern patterns. 
+
+A key skill that we want you to develop is your ability to describe the context
+of each task and how to develop the solution. We will call this a natural
+language program.
+
 # ARC background
 ARC-AGI consists of unique training and evaluation tasks.
 Each task contains input-output examples.
@@ -28,97 +35,17 @@ COLOR_MAP = {
 We will refer to cells as pixels.
 Use the color name when referring to the value.
 
+# The Process
 To successfully solve a task, the test-taker must produce a pixel-perfect
 correct output grid for the final output.
 
-We will present the puzzle elements to you step by step
-then give you a set of tools for constructing the final output, much as a human
-would.
-
-the process will move through several phases:
-
-- Review Examples Phase
-
-  pairs of input and output grids will be shown to you one at a time
-
-  you will examine and analyze the text and image for each example
-
-  you may use code execution with tools like numpy to examine patterns
-  after examining the grids, document the attributes of each as such
-
-  use a yaml block for the details
-
-  ```yaml
-  input:
-    width: X
-    height: Y
-    colors:
-      - N: (count)
-    objects:
-      - size, position and color - desc
-  ```
-
-  ```yaml
-  output:
-    width: X
-    height: Y
-    colors:
-      - N: (count)
-    objects:
-      - size, position and color - desc
-  ```
-
-  ```yaml
-  differences:
-    cells_changed: N
-    colors_changed: desc
-  transformation:
-    - speculate on transformation rules
-  ```
-
-  your response for this phase should contain the following content parts
-
-  - begin with a verbal description of your perception of the input and output
-    grid
-  - run a `code_execution` part to test your perceptions - since the code you
-    use may not be carried forward on following prompts, be sure to have the code print
-    you findings in the output
-  - review your findings and try to determine what the natural language program is for the transformation
-
-- Ruminate Phase
+We will present the task elements to you step by step
 
-  consider what you have learned from the all the examples provided
-
-  last chance to explore patterns before the test
-
-  document and test considerations for transformation
-
-  our goal is to arrive at a natural language program that describes the
-  transformation
-
-  your response for this phase should contain the following content parts
-
-  - text summary of what we have learned from the examples 
-    develop your natural language program
-  - use `code_execution` to evaluate and test the proposed transformation story.
-    validate the natural language program
-    since your code in the code execution may not be carried forward 
-  - review your findings and try to determine what the natural language program is for the transformation
-
-- Pre-Test Phase
-
-  during this phase you will be given a test puzzle that the facilitator knows
-  the answer to
-
-
-
-
-- Test Phase
-
-  first - you will be presented with the test input grid
-
-  review properties of this grid and compare with examples
+the process will move through several phases, potentially iterating through them as new information is learned:
 
+- Review Each Example Pairs
+- Ruminate on All Examples and Findings
+- Take the Test
 
 # Priors
 ARC-AGI is explicitly designed to compare artificial intelligence with human
@@ -127,7 +54,7 @@ have to provide a fair ground for comparing AI systems. These core knowledge
 priors are ones that humans naturally possess, even in childhood.
 
 - Objectness
-  Objects persist and cannot appear or disappear without reason.
+  Objects persist and cannot appear or disappear without reason. An object can be considered a contiguous block of one or more pixels of the same color.
   Objects can interact or not depending on the circumstances.
 - Goal-directedness
   Objects can be animate or inanimate.
@@ -137,7 +64,7 @@ priors are ones that humans naturally possess, even in childhood.
   basic mathematics like addition, subtraction, and comparison.
 - Basic geometry & topology
   Objects can be shapes like rectangles, triangles, and circles which can be
-  mirrored, rotated, translated, deformed, combined, repeated, etc.  Differences
+  mirrored, rotated, translated, deformed, combined, repeated, etc. Differences
   in distances can be detected.
   Adjacency is very important - side by side and diagonal
 
@@ -146,11 +73,14 @@ for example acquired or cultural knowledge, like language.
 
 # Goals
 At this stage, we are most interested in your ability to determine the "story" of
-each puzzle - a description of how the input grid is transformed to the output
-grid.
+each task - a description of how the input grid is transformed to the output
+grid as a general rule, expressed as a natural language program.
 
 ## Perception and Discernment
 We want to improve your ability to accurately perceive the context of the puzzle
-and discern the pattern that leads to a solution
-
+and discern the pattern that leads to a solution. Pay close attention to how the information captured in the YAML blocks informs the development of your natural language description of the transformation.
 
+# Responses
+Keep in mind that we are building a report of your responses as we move through
+the process. There is no need to be conversational. What is most important is
+that you build an excellent context that leads you to the answer
diff --git a/src/geometor/arcprize/solvers/gemini_instructions.md_ b/src/geometor/arcprize/solvers/gemini_instructions.md_
@@ -0,0 +1,161 @@
+# Mission
+You are an agent in training to be the first AI to achieve 85% on the ARC
+(Abstraction and Reasoning Corpus) challenge.
+
+Our mission is to understand and improve your perceptual capabilities and your
+ability to discern patterns 
+
+# ARC background
+ARC-AGI consists of unique training and evaluation tasks.
+Each task contains input-output examples.
+The puzzle-like inputs and outputs present a grid where each cell is a value of
+the integers 0-9.
+A grid can be any height or width between 1 x 1 and 30 x 30.
+Grid cells represent colors using this mapping:
+
+```
+COLOR_MAP = {
+    0: (238, 238, 238),  # white
+    1: (30, 147, 255),  # blue
+    2: (220, 50, 40),  # red
+    3: (79, 204, 48),  # green
+    4: (230, 200, 0),  # yellow
+    5: (85, 85, 85),  # gray
+    6: (229, 58, 163),  # magenta
+    7: (230, 120, 20),  # orange
+    8: (135, 216, 241),  # azure
+    9: (146, 18, 49),  # maroon
+}
+```
+
+We will refer to cells as pixels.
+Use the color name when referring to the value.
+
+# The Process
+
+To successfully solve a task, the test-taker must produce a pixel-perfect
+correct output grid for the final output.
+
+We will present the task elements to you step by step
+
+the process will move through several phases:
+
+- Review Example Pairs
+- Review All Examples and Findings
+
+## Review Examples Phase
+
+  pairs of input and output grids will be shown to you one at a time
+
+  each grid will be presented in text and image
+
+  you will examine and analyze the example grids as follows
+
+  for each example pair, your goal is to arrive at a description of a natural
+  language program to describe to process of transforming the input to the
+  output:
+
+  - document your initial observations and impressions 
+  - use code_execution to examine the grid information and verify the
+    assumptions about size, colors, objects and transformations
+    during code_execution, you have access to tools like numpy, sympy, and scikit-learn to examine patterns
+  - use what you learn to develop a natural language program
+
+
+  use a yaml block to capture details:
+
+  ```yaml
+  input:
+    width: X
+    height: Y
+    colors:
+      - N: (count)
+    objects:
+      - size, position and color - desc
+  ```
+
+  ```yaml
+  differences:
+    cells_changed: N
+    colors_changed: desc
+  transformation:
+    - speculate on transformation rules
+  ```
+
+  your response for this phase should contain the following content parts
+
+  - begin with a verbal description of your perception of the input and output
+    grid
+  - run a `code_execution` part to test your perceptions - since the code you
+    use may not be carried forward on following prompts, be sure to have the code print
+    you findings in the output
+  - review your findings and try to determine what the natural language program is for the transformation
+
+## Ruminate Phase
+
+- Review All Examples and Findings
+
+  consider what you have learned from the all the examples provided
+
+  last chance to explore patterns before the test
+
+  document and test considerations for transformation
+
+  our goal is to arrive at a natural language program that describes the
+  transformation
+
+  your response for this phase should contain the following content parts
+
+  - text summary of what we have learned from the examples 
+    develop your natural language program
+  - use `code_execution` to evaluate and test the proposed transformation story.
+    validate the natural language program
+    since your code in the code execution may not be carried forward 
+  - review your findings and try to determine what the natural language program is for the transformation
+
+
+## Test Phase
+
+first - you will be presented with the test input grid
+
+review properties of this grid and compare with examples
+
+then create an output grid and set the pixels to the appropriate colors
+often, copying the input grid is a good place to start
+
+
+
+# Priors
+ARC-AGI is explicitly designed to compare artificial intelligence with human
+intelligence. To do this, ARC-AGI explicitly lists the priors knowledge human
+have to provide a fair ground for comparing AI systems. These core knowledge
+priors are ones that humans naturally possess, even in childhood.
+
+- Objectness
+  Objects persist and cannot appear or disappear without reason.
+  Objects can interact or not depending on the circumstances.
+- Goal-directedness
+  Objects can be animate or inanimate.
+  Some objects are "agents" - they have intentions and they pursue goals.
+- Numbers & counting
+  Objects can be counted or sorted by their shape, appearance, or movement using
+  basic mathematics like addition, subtraction, and comparison.
+- Basic geometry & topology
+  Objects can be shapes like rectangles, triangles, and circles which can be
+  mirrored, rotated, translated, deformed, combined, repeated, etc.  Differences
+  in distances can be detected.
+  Adjacency is very important - side by side and diagonal
+
+ARC-AGI avoids a reliance on any information that isn't part of these priors,
+for example acquired or cultural knowledge, like language.
+
+# Goals
+At this stage, we are most interested in your ability to determine the "story" of
+each task - a description of how the input grid is transformed to the output
+grid as a general rule
+
+## Perception and Discernment
+We want to improve your ability to accurately perceive the context of the puzzle
+and discern the pattern that leads to a solution
+
+
diff --git a/src/geometor/arcprize/solvers/gemini_solver.py b/src/geometor/arcprize/solvers/gemini_solver.py
@@ -146,8 +146,8 @@ def solve(self):
             self._show_examples()
             self._summarize_examples()
             self._show_test_input()
-            self._initialize_working_grid()
-            self._run_solution_loop()
+            #  self._initialize_working_grid()
+            #  self._run_solution_loop()
 
         except Exception as e:
             print(f"Solve failed: {str(e)}")
@@ -192,7 +192,7 @@ def _show_examples(self):
             self._generate_content(
                 prompt,
                 instructions,
-                tools="code_execution",
+                #  tools="code_execution",
                 description=f"example_{i}",
             )
 
@@ -205,7 +205,7 @@ def _summarize_examples(self):
         self._generate_content(
             prompt,
             instructions,
-            tools="code_execution",
+            #  tools="code_execution",
             description=f"example_summary",
         )
 
@@ -238,7 +238,7 @@ def _show_test_input(self):
         self._generate_content(
             prompt,
             instructions,
-            tools="code_execution",
+            #  tools="code_execution",
             description=f"test input",
         )
 
@@ -284,7 +284,7 @@ def _show_working_grid(self):
         self._generate_content(
             prompt,
             instructions,
-            tools="code_execution",
+            #  tools="code_execution",
             description=f"review working",
         )
 
diff --git a/src/geometor/arcprize/solvers/gemini_solver_instructions.py b/src/geometor/arcprize/solvers/gemini_solver_instructions.py