-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathconvert-to_openai.txt
95 lines (87 loc) · 6.62 KB
/
convert-to_openai.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
NOTE : NEED TO ENSURE THAT THERE IS A LINE BETWEEN FUNCTION DESCRIPTION, ARGUMENTS AND RETURN VALUE
For converting pure function to openai format
convert_python_function_to_openai_function(fn)
Breakdown:
1. get name -> _get_python_function_name(fn
2. make schema -> create_schema_from_function
args = func_name, <-from step 1
function, <- fn
filter_args=(),
parse_docstring=True,
error_on_invalid_docstring=False,
include_injected=False,
3. Convert schema to openai format -> convert_pydantic_to_openai_function(model Schema? , name, description = model.__doc__)
Step 1 - Breakdown of subfunctions in convert_python_function_to_openai_function
1. get name
function.__name__
Step 2 - Breakdown of subfunctions in make schema
2. make schema
create_schema_from_function (from langchain_core import tools)
2.1 Validate the arguments from the function (validate_arguments)
2.2 Return validate_arguments.model
2.3 if any arguments to be ignored, remove them
2.4 get fn descriptions and argument descriptions from docstring
_infer_arg_descriptions(
func,
parse_docstring=parse_docstring,
error_on_invalid_docstring=error_on_invalid_docstring)
2.5 valid_properties = _get_filtered_args(
inferred_model, func, filter_args=filter_args, include_injected=include_injected) <- not sure what this is
2.6 _create_subset_model
Breakdown
2.1 this is a default pydantic function argument - can be used directly. We need to set this up as a decorator
2.2 Model is just the structured function object. We want model.__fields__ ({'query': ModelField(name='query', type=str, required=True), 'v__duplicate_kwargs': ModelField(name='v__duplicate_kwargs', type=Optional[List[str]], required=False, default=None), 'args': ModelField(name='args', type=Optional[List[Any]], required=False, default=None), 'kwargs': ModelField(name='kwargs', type=Optional[Mapping[Any, Any]], required=False, default=None)}
2.3 Ignore this...never gonna happen
2.4 Breakdown
2.4.1 Get the annotations inspect.get_annotations(pri) #{'query': <class 'str'>, 'return': <class 'dict'>}
2.4.2 _parse_python_function_docstring(
fn, annotations, error_on_invalid_docstring=error_on_invalid_docstring
)
Breakdown:
2.4.2.1 Get the docstring inspect.getdoc(function) <-entire docstring incld args and return values
2.4.2.2 _parse_google_docstring(
docstring,
list(annotations), <- not really required unless the arguments are "run_manager", "callbacks", "return"
error_on_invalid_docstring=error_on_invalid_docstring,
)
Breakdown:
2.4.2.2.1 Basically the doc string is split into three parts: The main description, the arguments section defined as starting with "Args:" and the return value section defined as starting with "Returns:"
2.4.2.2.2 So basically we split the three sections with '\n\n' and trim each section.
2.4.2.2.3 We also add a description list if neither Args: nor Return: which is basically the function description
2.4.2.2.4 create the list of args by taking the args block, spliting with '\n' and removing "Args:"
2.4.2.2.5 Then for each item in args block again split the line with ":" to get args name and args description. Get {args:args_description} dict
return ' '.join(description) from 2.4.2.2.3 and args dict from 2.4.2.2.5
So basically we are returning function_description and {arg:arg_desc} dict for every argument
2.4.3 _validate_docstring_args_against_annotations(arg_descriptions, annotations) - check if the arg descriptions are in annotations. For our example,
annotations = {'query': <class 'str'>, 'return': <class 'dict'>} and args_descriptions = {'query (str)': 'The search query.'}
So basically is 'query' as a key in args_descriptions present in annotations.keys()
2.4.4 _get_annotation_description - This is exception handling only - used if the arg name from the annotations list is not in the args_list and if the annotation class is provided in the arguments during function definition. Not required if we do this well.
2.5 returns something like {'query': {'title': 'Query', 'type': 'string'}} for every key in signature(function).parameters for every key in the signature. This we refer to as field_names
basically returns {k:schema[k]} where k is the key in signature(function).parameters and schema is the validated_model.schema()['properties']. The validated_model.schema() is as follows:
{'title': 'SearchInformation', 'type': 'object', 'properties': {'query': {'title': 'Query', 'type': 'string'}, 'v__duplicate_kwargs': {'title': 'V Duplicate Kwargs', 'type': 'array', 'items': {'type': 'string'}}, 'args': {'title': 'Args', 'type': 'array', 'items': {}}, 'kwargs': {'title': 'Kwargs', 'type': 'object'}}, 'required': ['query'], 'additionalProperties': False}
Hence, the value of the properties key is as: {'query': {'title': 'Query', 'type': 'string'}
2.6 Breakdown:
2.6.1: Get the field_names from 2.5 (which has been got from inspecting the signature of the function)
2.6.2: Get the fields from the validated model (which we have got from the pydantic validation of the function)
2.6.3: From 1, get the function name
2.6.4: Get the arg descriptions from arg_descriptions from 2.4.2.2.5
2.6.5: Get the function description also from 2.4.2.2.5
2.6.6: Create the dict "fields"
2.6.7 for every field_name from 2.6.1: (e.g. 'query')
2.6.7.1 get the ModelField object for that field -> validated_model.__fields__[field_name]
2.6.7.2 From this ModelField, check .required attribute to see if it is a required field and check .allow_none to see if it does NOT accept None as an input
2.6.7.2.1 if neither, set t tothe outer_type_ of the ModelField. (Here it is <class 'str'>)
2.6.7.2.2 if either, set t to Optional[ModelField.outer_type_]
2.6.7.3 if we have our args_description from 2.6.4 and the field is a key in the args_description:
2.6.7.3.1 set the ModelField.field_info.description as the arg_description value i.e. args_description['query']
2.6.7.3.2 update the fields dict from 2.6.6 as fields[field_name] = (t, model_field.field_info) # t - from 2.6.7.2.1/2
2.6.8 Basically a Pydantic function that creates a Pydantic model from the name of the function and the fields_dict from 2.6.7.3.2
Step 3- Breakdown of Convert schema to openai format
3.1 Convert to json schema
3.2 Rereference the json - basically remove $refs in JSON Schema.
Breakdown:
3.2.1
3.3 Pop "definitions" from the schema
3.4 Extract the title
3.5 Extract the description
3.6 return a dict{"name":name from 3.4, "description":description from 3.5, "parameters": the schema but without the title k,v pair (title is extroneous - useless)