Merge pull request #210 from Tigul/knowledge

Knowledge
cram2 · Nov 28, 2024 · fd2701e
2 parents 164a0ea + 7403dfd
commit fd2701e
Showing 38 changed files with 2,878 additions and 1,112 deletions.
diff --git a/doc/images/knowledge/knowledge_arch.png b/doc/images/knowledge/knowledge_arch.png
diff --git a/doc/images/knowledge/property_evaluation.png b/doc/images/knowledge/property_evaluation.png
diff --git a/doc/images/knowledge/property_resolution.png b/doc/images/knowledge/property_resolution.png
diff --git a/doc/source/_toc.yml b/doc/source/_toc.yml
@@ -14,6 +14,8 @@ parts:
     - file: new_robot.rst
     - file: notebooks.rst
     - file: designators.rst
+    - file: knowledge.rst
+    - file: knowledge_and_reasoning.rst
     - file: costmap.rst
 
   - caption: Trouble Shooting
@@ -28,6 +30,8 @@ parts:
       - file: notebooks/bullet_world
       - file: notebooks/language
       - file: notebooks/local_transformer
+      - file: notebooks/minimal_task_tree
+      - file: notebooks/improving_actions
       - file: designator_example.rst
         sections:
         - file: notebooks/action_designator
@@ -48,6 +52,10 @@ parts:
         - file: notebooks/interface_examples/giskard.md
         - file: notebooks/interface_examples/robokudo.md
         - file: notebooks/ontology
+      - file: knowledge_examples.rst
+        sections:
+        - file: notebooks/knowledge_source.md
+        - file: notebooks/properties.md
 
   - caption: API
     chapters:

diff --git a/doc/source/knowledge.rst b/doc/source/knowledge.rst
@@ -0,0 +1,91 @@
+=========
+Knowledge
+=========
+
+To perform tasks in unknown environments, a robot needs a way to access and reason about knowledge of the world.
+This knowledge can be represented in different ways and formats. In this
+chapter we will discuss how PyCRAM accesses different kinds of knowledge and integrates them with
+action designators.
+
+-------
+Concept
+-------
+The concept of knowledge in PyCRAM is based on the idea that knowledge can be represented in different ways and provided
+from different sources which can be specialized for different tasks. These different sources of knowledge are implemented
+behind a common interface which provides a set of methods to query the knowledge. This architecture can be seen in the
+following image:
+
+.. image:: ../images/knowledge/knowledge_arch.png
+    :align: center
+    :alt: Knowledge Architecture
+
+The methods provided by the knowledge sources are called "properties" since they are used to reason about the properties
+of entities in the world. Properties can be combined to create more complex expressions which describe conditions
+that have to be true at the time an action designator is executed. Let's look at an example:
+
+.. code-block:: python
+
+    GraspableProperty(ObjectDesignator(...))
+    & ReachableProperty(Pose(....))
+
+In this example, we have two properties, one that checks if an object is graspable and one that checks if a pose is reachable.
+The `&` operator is used to combine the two properties into a single property that checks if both conditions are true at
+the same time. This combined property stems from the PickUpAction, where the object a robot wants to pick up has to be
+reachable for the robot and has to fit into the robot's end effector.
+
+Since knowledge sources are usually specialized for a certain task, they do not need to be able to implement all methods
+of the interface. As a result, there are many knowledge sources which each implement only a subset of the methods, so no
+single knowledge source can answer all questions. To solve this problem, PyCRAM has a central interface for processing
+the combined properties and querying the knowledge sources called the "KnowledgeEngine". The job of the KnowledgeEngine
+is to take a combined property and resolve the individual properties to the available knowledge sources which implement
+the methods needed to answer the question. The resolved properties are then combined in the same way as the input property
+and evaluated.
+
+This image shows the process of resolving the properties through the knowledge engine:
+
+.. image:: ../images/knowledge/property_resolution.png
+    :align: center
+    :alt: Property Resolution
+
+
+
+-----------------
+Knowledge Sources
+-----------------
+Knowledge does not have a unified form or representation: it can be available as an SQL database, a knowledge graph,
+a simple JSON file, etc. To be able to handle a multitude of different representations of knowledge, PyCRAM uses the
+concept of Knowledge Sources. A Knowledge Source is a class that implements a set of methods to access knowledge. Therefore,
+PyCRAM does not care how the knowledge is accessed or where it comes from, as long as the Knowledge Source implements the
+abstract methods.
+
+The methods that a Knowledge Source must implement are some basic methods to manage the connection to the knowledge itself
+and, more importantly, methods to query the knowledge. Each knowledge source decides on its own which query methods it
+provides by using the respective property as a mix-in of the knowledge source. The properties that are currently
+available and that a knowledge source can implement are:
+
+- `GraspableProperty`: Checks if an object is graspable
+- `ReachableProperty`: Checks if a pose is reachable
+- `SpaceIsFreeProperty`: Checks if a space is free for the robot to move to
+- `GripperIsFreeProperty`: Checks if the gripper is free to grasp an object
+- `VisibleProperty`: Checks if an object is visible
+
+
+If you want to learn more about the implementation of a Knowledge Source, you can look at the following example:
+
+:ref:`Knowledge Source example<knowledge_source_header>`
+
+----------------
+Knowledge Engine
+----------------
+The Knowledge Engine is the central component in PyCRAM to reason about the world. It takes a combined property and
+resolves the individual properties to the available knowledge sources which implement the methods needed to answer the
+question.
+
+While the properties are resolved, they also infer parameters which are needed to execute the action but may not be defined
+in the action designator description. For example, the PickUpAction needs an arm to pick up an object with. However, the
+arm does not need to be defined in the action designator and can be inferred from the properties and the state of the
+world.
+
+After the properties are resolved, evaluated and the parameters are inferred, the Knowledge Engine grounds the action
+in the belief state and tests if the found solution is valid and can achieve the goal. If the solution is valid, the
+Knowledge Engine returns the solution and the action designator is performed.
diff --git a/doc/source/knowledge_and_reasoning.rst b/doc/source/knowledge_and_reasoning.rst
@@ -0,0 +1,48 @@
+=======================
+Knowledge and Reasoning
+=======================
+
+The knowledge engine is able to infer parameters of a designator description from the context given by the properties
+attached to its parameters. Since the properties are defined for the parameters of a designator description they add
+semantic information to the designator description parameters. The knowledge engine is able to utilize this information
+to infer the value of a parameter from the context.
+
+Inference works much like the normal reasoning process, where the property function of the designator description
+is first resolved and then evaluated. The difference is that we now look not only at the result (whether the properties are
+satisfied or not) but also at the possible parameter solutions that are generated while reasoning.
+
+We start again by taking the properties of the designator description and resolving them. This was already explained in
+the :ref:`Knowledge <knowledge>` documentation.
+
+.. image:: ../images/knowledge/property_resolution.png
+    :alt: Source Resolve
+    :align: center
+
+We then evaluate the properties and generate the possible parameter solutions.
+
+.. image:: ../images/knowledge/property_evaluation.png
+    :alt: Source Evaluate
+    :align: center
+
+As can be seen in the picture above, the major part of inferring missing parameters is done by the Knowledge Source with
+the semantic context provided by the properties. What remains is to match the inferred parameters from the
+Knowledge Sources with the parameters of the designator description.
+
+-------------------------------
+Matching of inferred parameters
+-------------------------------
+
+The parameters that are inferred by the Knowledge Source during reasoning need to be matched to the designator to be
+usable in the execution of a designator. Matching the inferred parameters is ambiguous, since the parameters are
+provided by the Knowledge Source as a dictionary of names and values. Therefore, the name given in the dictionary
+might not match the name of a designator parameter.
+
+To solve the issue of aligning the inferred parameters from the Knowledge Source with the parameters of the designator
+we employ two methods. The first is to match the names in the dictionary with the names of the parameters of the
+designator. This is most reliable when the dictionary returned by the Knowledge Source adheres to the naming conventions of the
+designator description.
+The second method is to match the type of the inferred parameter with the type annotations in the designator. While this
+seems like the more reliable method, it could happen that a designator has multiple parameters of the same type. In this
+case the matching might not yield the correct result, since the first matching parameter of the designator is matched with
+the parameter of the Knowledge Source.
+
diff --git a/doc/source/knowledge_examples.rst b/doc/source/knowledge_examples.rst
@@ -0,0 +1,3 @@
+=======================
+Knowledge Examples
+=======================
diff --git a/examples/action_designator.md b/examples/action_designator.md
@@ -244,9 +244,9 @@ from pycram.datastructures.enums import Arms
 milk_desig = BelieveObject(names=["milk"])
 
 description = TransportAction(milk_desig,
-                              [Arms.LEFT],
                               [Pose([2.4, 1.8, 1], 
-                                       [0, 0, 0, 1])])
+                                       [0, 0, 0, 1])],
+                              [Arms.LEFT])
 with simulated_robot:
     MoveTorsoAction([0.2]).resolve().perform()
     description.resolve().perform()

diff --git a/examples/improving_actions.md b/examples/improving_actions.md
@@ -75,12 +75,15 @@ session = sqlalchemy.orm.sessionmaker(bind=engine)()
 Now we construct an empty world with just a floating milk, where we can learn about PickUp actions.
 
 ```python
+from pycrap import Robot, Milk
+
 world = BulletWorld(WorldMode.DIRECT)
 print(world.prospection_world)
-robot = Object("pr2", ObjectType.ROBOT, "pr2.urdf")
-milk = Object("milk", ObjectType.MILK, "milk.stl", pose=Pose([1.3, 1, 0.9]))
+robot = Object("pr2", Robot, "pr2.urdf")
+milk = Object("milk", Milk, "milk.stl", pose=Pose([1.3, 1, 0.9]))
+viz_marker_publisher = VizMarkerPublisher()
 viz_marker_publisher = VizMarkerPublisher()
-milk_description = ObjectDesignatorDescription(types=[ObjectType.MILK]).ground()
+milk_description = ObjectDesignatorDescription(types=[Milk]).ground()
 ```
 
 Next, we create a default, probabilistic model that describes how to pick up objects. We visualize the default policy.
@@ -164,10 +167,11 @@ Next, we put the learned model to the test in a complex environment, where the m
 area.
 
 ```python
-kitchen = Object("kitchen", ObjectType.ENVIRONMENT, "apartment.urdf")
+from pycrap import Apartment
+kitchen = Object("apartment", Apartment, "apartment.urdf")
 
 milk.set_pose(Pose([0.5, 3.15, 1.04]))
-milk_description = ObjectDesignatorDescription(types=[ObjectType.MILK]).ground()
+milk_description = ObjectDesignatorDescription(types=[Milk]).ground()
 fpa = MoveAndPickUp(milk_description, arms=[Arms.LEFT, Arms.RIGHT],
                     grasps=[Grasp.FRONT, Grasp.LEFT, Grasp.RIGHT, Grasp.TOP], policy=model)
 fpa.sample_amount = 200

diff --git a/examples/knowledge_source.md b/examples/knowledge_source.md
@@ -0,0 +1,166 @@
+---
+jupyter:
+  jupytext:
+    text_representation:
+      extension: .md
+      format_name: markdown
+      format_version: '1.3'
+      jupytext_version: 1.16.3
+  kernelspec:
+    display_name: Python 3 (ipykernel)
+    language: python
+    name: python3
+---
+
+# How to create a Knowledge Source
+(knowledge_source_header)=
+This notebook will detail what a knowledge source does, how it works and how you can create your own. 
+
+A knowledge source is part of the wider knowledge system in PyCRAM as explained [here](/knowledge). The purpose of a 
+knowledge source is to provide an interface to an external knowledge and reasoning system.
+
+A knowledge source essentially consists of two parts: management methods, which take care of connecting to the knowledge
+system and registering the knowledge source with the knowledge engine, and the implementation of the respective
+reasoning queries which this knowledge source is able to process.
+
+In this example we will walk through the process of creating a simple knowledge source and all steps involved in this process. 
+
+## Creating the Knowledge Source structure
+
+We will start by creating the general structure of the Knowledge Source as well as the management methods. To do this 
+you have to create a new class which inherits from the ```KnowledgeSource``` class.
+
+```python
+from pycram.knowledge.knowledge_source import KnowledgeSource
+
+class ExampleKnowledge(KnowledgeSource):
+
+    def __init__(self):
+        super().__init__(name="example", priority=0)
+        self.parameter = {}
+
+    def is_available(self) -> bool:
+        return True
+
+    def is_connected(self) -> bool:
+        return True
+
+    def connect(self):
+        pass
+
+    def clear_state(self):
+        self.parameter = {}
+```
+
+In the code above we created a class which inherits from the ```KnowledgeSource``` base class. In the
+constructor of this new class we initialize the base class with the name of the new Knowledge Source as well as a
+priority. The priority is used to determine the order of all Knowledge Sources: in case two Knowledge Sources provide
+the same reasoning queries, the one with the higher priority is used.
+
+Furthermore, we define a number of methods that manage the connection to the knowledge system, namely the methods
+```is_available```, ```is_connected``` and ```connect```. The first two methods just return a bool which states if the
+knowledge system is available and connected to this Knowledge Source. The last method is used to create a connection
+to the knowledge system. Since this is an example and we are not connecting to an external knowledge system, the methods
+are fairly trivial.
+
+The last method we defined is ```clear_state```. It is used to clear any state the knowledge source might hold,
+either of the knowledge system or of this Knowledge Source class itself.
+
+
+## Managing the resolvable Properties
+Properties serve two purposes in the management of knowledge in PyCRAM. The first is to define semantic properties of
+the parameters of action designators. The second is to specify which properties or knowledge queries a Knowledge Source can
+answer.
+
+To define which properties a Knowledge Source can handle we simply use the respective properties as mix-in for the class 
+definition. With this let's take another look at our Knowledge Source with the handling of two properties. 
+
+```python
+from pycram.knowledge.knowledge_source import KnowledgeSource
+from pycram.datastructures.property import ReachableProperty, SpaceIsFreeProperty
+from pycram.datastructures.world import World
+from pycram.datastructures.pose import Pose
+from pycram.datastructures.dataclasses import ReasoningResult
+from pycram.costmaps import OccupancyCostmap
+from pycram.ros.logging import loginfo
+import numpy as np
+
+class ExampleKnowledge(KnowledgeSource, ReachableProperty, SpaceIsFreeProperty):
+
+    def __init__(self):
+        super().__init__(name="example", priority=0)
+        self.parameter = {}
+
+    def is_available(self) -> bool:
+        return True
+
+    def is_connected(self) -> bool:
+        return True
+
+    def connect(self):
+        pass
+
+    def clear_state(self):
+        self.parameter = {}
+
+    def reachable(self, pose: Pose) -> ReasoningResult:
+        loginfo(f"Checking reachability for pose {pose}")
+        robot_pose = World.robot.pose
+        distance = pose.dist(robot_pose)
+        return ReasoningResult(distance < 0.6)
+
+    def space_is_free(self, pose: Pose) -> ReasoningResult:
+        loginfo(f"Checking if the space is free around {pose}")
+        om = OccupancyCostmap(0.2, False, 100, 0.02, pose)
+        return ReasoningResult(np.sum(om.map) == 6561)
+```
+
+We have now extended our Knowledge Source with the capability to handle two properties, Reachable and SpaceIsFree. As you can
+see, all we needed to do was to use the respective properties as mix-ins alongside the Knowledge Source and
+implement the method for each property which does the actual reasoning.
+
+In this case the reasoning is kept fairly simple, since this is not the objective of this example. Reachable just 
+checks if a pose is within 60 centimeters of the robot while SpaceIsFree checks if a 2x2 meter square around the given 
+pose has no obstacles. 
+
+The methods doing the reasoning have to return a ```ReasoningResult``` instance, which contains a bool stating if the 
+reasoning succeeded or failed as well as additional parameters which might be inferred during reasoning. The additional 
+parameters are stated as key-value pairs in a dictionary.
+
+
+## Testing the Knowledge Source
+Since we now have a Knowledge Source which also implements two properties we can check if the Knowledge Source is used 
+to resolve the correct properties. 
+
+For this test we need a world as well as a robot.
+
+```python
+from pycram.worlds.bullet_world import BulletWorld
+from pycram.world_concepts.world_object import Object
+from pycram.datastructures.enums import WorldMode, ObjectType
+from pycram.knowledge.knowledge_engine import KnowledgeEngine
+from pycram.datastructures.pose import Pose
+from pycram.datastructures.property import ReachableProperty, SpaceIsFreeProperty
+from pycrap import Robot
+
+world = BulletWorld(WorldMode.GUI)
+pr2 = Object("pr2", Robot, "pr2.urdf")
+
+target_pose = Pose([0.3, 0, 0.2])
+property = ReachableProperty(target_pose) & SpaceIsFreeProperty(target_pose)
+
+ke = KnowledgeEngine()
+resolved_property = ke.resolve_properties(property)
+
+print(f"Result of the property: {resolved_property()}")
+```
+
+As you can see we created a ```ReachableProperty``` as well as a ```SpaceIsFreeProperty``` and resolved them. For more
+details on how properties and their resolution work please refer to the properties example.
+
+Afterwards, we execute the properties. Here we can see the log output from our Knowledge Source as well as the
+confirmation that the implementation for both properties worked correctly.
+
+```python
+world.exit()
+```
diff --git a/examples/orm_example.md b/examples/orm_example.md
@@ -171,7 +171,7 @@ class SayingActionPerformable(ActionAbstract):
     orm_class = ORMSaying
 
     @with_tree
-    def perform(self) -> None:
+    def plan(self) -> None:
         print(self.text)
 ```