convert-to_openai.txt

NOTE : NEED TO ENSURE THAT THERE IS A LINE BETWEEN FUNCTION DESCRIPTION, ARGUMENTS AND RETURN VALUE


For converting pure function to openai format

convert_python_function_to_openai_function(fn)
Breakdown:
1. get name -> _get_python_function_name(fn
2. make schema -> create_schema_from_function 
	args = func_name, <-from step 1
        function, <- fn
        filter_args=(),
        parse_docstring=True,
        error_on_invalid_docstring=False,
        include_injected=False,
		
3. Convert schema to openai format -> convert_pydantic_to_openai_function(model Schema? , name, description = model.__doc__)

Step 1 - Breakdown of subfunctions in convert_python_function_to_openai_function
1. get name
	function.__name__

Step 2 - Breakdown of subfunctions in make schema
2. make schema
	create_schema_from_function (from langchain_core import tools)
	2.1 Validate the arguments from the function (validate_arguments)
	2.2 Return validate_arguments.model
	2.3 if any arguments to be ignored, remove them
	2.4 get fn descriptions and argument descriptions from docstring  
		_infer_arg_descriptions(
			func,
			parse_docstring=parse_docstring,
			error_on_invalid_docstring=error_on_invalid_docstring)
	2.5 valid_properties = _get_filtered_args(
        inferred_model, func, filter_args=filter_args, include_injected=include_injected) <- not sure what this is
	2.6 _create_subset_model
	
	Breakdown
	2.1 this is a default pydantic function argument - can be used directly. We need to set this up as a decorator
	2.2 Model is just the structured function object. We want model.__fields__ ({'query': ModelField(name='query', type=str, required=True), 'v__duplicate_kwargs': ModelField(name='v__duplicate_kwargs', type=Optional[List[str]], required=False, default=None), 'args': ModelField(name='args', type=Optional[List[Any]], required=False, default=None), 'kwargs': ModelField(name='kwargs', type=Optional[Mapping[Any, Any]], required=False, default=None)}
	2.3 Ignore this...never gonna happen
	2.4 Breakdown
		2.4.1 Get the annotations inspect.get_annotations(pri) #{'query': <class 'str'>, 'return': <class 'dict'>}
		2.4.2  _parse_python_function_docstring(
            fn, annotations, error_on_invalid_docstring=error_on_invalid_docstring
        )
			Breakdown:
			2.4.2.1	Get the docstring inspect.getdoc(function) <-entire docstring incld args and return values
			2.4.2.2 _parse_google_docstring(
						docstring,
						list(annotations), <- not really required unless the arguments are "run_manager", "callbacks", "return"
						error_on_invalid_docstring=error_on_invalid_docstring,
						)
				Breakdown:
					2.4.2.2.1 Basically the doc string is split into three parts: The main description, the arguments section defined as starting with "Args:" and the return value section defined as starting with "Returns:"
					2.4.2.2.2 So basically we split the three sections with '\n\n' and trim each section. 
					2.4.2.2.3 We also add a description list if neither Args: nor Return: which is basically the function description
					2.4.2.2.4 create the list of args by taking the args block, spliting with '\n' and removing "Args:"
					2.4.2.2.5 Then for each item in args block again split the line with ":" to get args name and args description. Get {args:args_description} dict
					return  ' '.join(description) from 2.4.2.2.3 and args dict from 2.4.2.2.5
					So basically we are returning function_description and {arg:arg_desc} dict for every argument	
		2.4.3 _validate_docstring_args_against_annotations(arg_descriptions, annotations) - check if the arg descriptions are in annotations. For our example,
				annotations = {'query': <class 'str'>, 'return': <class 'dict'>} and args_descriptions = {'query (str)': 'The search query.'}
				So basically is 'query' as a key in args_descriptions present in annotations.keys()
		2.4.4 _get_annotation_description - This is exception handling only - used if the arg name from the annotations list is not in the args_list and if the annotation class is provided in the arguments during function definition. Not required if we do this well.
	2.5 returns something like {'query': {'title': 'Query', 'type': 'string'}} for every key in signature(function).parameters for every key in the signature. This we refer to as field_names
		basically returns {k:schema[k]} where k is the key in signature(function).parameters and schema is the validated_model.schema()['properties']. The validated_model.schema() is as follows:
		 {'title': 'SearchInformation', 'type': 'object', 'properties': {'query': {'title': 'Query', 'type': 'string'}, 'v__duplicate_kwargs': {'title': 'V  Duplicate Kwargs', 'type': 'array', 'items': {'type': 'string'}}, 'args': {'title': 'Args', 'type': 'array', 'items': {}}, 'kwargs': {'title': 'Kwargs', 'type': 'object'}}, 'required': ['query'], 'additionalProperties': False}
		 Hence, the value of the properties key is as: {'query': {'title': 'Query', 'type': 'string'}
	2.6 Breakdown:
		2.6.1: Get the field_names from 2.5 (which has been got from inspecting the signature of the function)
		2.6.2: Get the fields from the validated model (which we have got from the pydantic validation of the function)
		2.6.3: From 1, get the function name
		2.6.4: Get the arg descriptions from arg_descriptions from 2.4.2.2.5
		2.6.5: Get the function description also from 2.4.2.2.5
		2.6.6: Create the dict "fields"
		2.6.7 for every field_name from 2.6.1: (e.g. 'query')
				2.6.7.1 get the ModelField object for that field -> validated_model.__fields__[field_name]
				2.6.7.2 From this ModelField, check .required attribute to see if it is a required field and check .allow_none to see if it does NOT accept None as an input
					2.6.7.2.1 if neither, set t tothe outer_type_ of the ModelField. (Here it is <class 'str'>)
					2.6.7.2.2 if either, set t to Optional[ModelField.outer_type_]
				2.6.7.3 if we have our args_description from 2.6.4 and the field is a key in the args_description:
					2.6.7.3.1 set the ModelField.field_info.description as the arg_description value i.e. args_description['query']
					2.6.7.3.2 update the fields dict from 2.6.6 as fields[field_name] = (t, model_field.field_info) # t - from 2.6.7.2.1/2
		2.6.8 Basically a Pydantic function that creates a Pydantic model from the name of the function and the fields_dict from  2.6.7.3.2

Step 3- Breakdown of Convert schema to openai format		
	3.1 Convert to json schema
	3.2 Rereference the json - basically remove $refs in JSON Schema.
		Breakdown:
			3.2.1
	3.3 Pop "definitions" from the schema
	3.4 Extract the title
	3.5 Extract the description
	3.6 return a dict{"name":name from 3.4, "description":description from 3.5, "parameters": the schema but without the title k,v pair (title is extroneous - useless)