vectara · pwoznic · Nov 25, 2024 · Nov 22, 2024 · Nov 22, 2024
diff --git a/www/docs/api-reference/search-apis/interpreting-responses/metadata.md b/www/docs/api-reference/search-apis/interpreting-responses/metadata.md
@@ -12,11 +12,12 @@ In <Config v="names.product"/>, when you [index a document](/docs/api-reference/
 document has a `type` parameter that determines the format of the document 
 as `core` or `structured`. The `core` type has `document_parts` and the `structured` 
 type has `sections`. Both can be nested and both can contain separate `metadata`, 
-including some metadata that <Config v="names.product"/> will auto-generate. 
-A good example of this is that you could have a document which has some global 
-attributes like the `URL` or `owner` but individual sections will have a `section` 
-attribute and a `lang`.
+including some metadata that <Config v="names.product"/> will auto-generate.
 
+## Metadata structure
+
+For example, a document might have global attributes such as the `URL` or `owner` 
+but individual sections have a `section` attribute and a `lang`.
 
 Here's an example response with different metadata at these different levels:
 
@@ -29,12 +30,12 @@ Here's an example response with different metadata at these different levels:
       "part_metadata": {
         "speaker": "Deep Thought",
         "lang": "eng",
-        "section": "2",
-        "offset": "316"
+        "section": 2,
+        "offset": 316
       },
       "document_metadata": {
         "author": "Douglas Adams",
-        "publicationyear": "1979"
+        "publicationyear": 1979
       },
       "document_id": "hitchhikers-guide",
       "request_corpora_index": 0
@@ -44,8 +45,8 @@ Here's an example response with different metadata at these different levels:
       "score": 0.13511724770069122,
       "part_metadata": {
         "lang": "eng",
-        "section": "17",
-        "offset": "171"
+        "section": 17,
+        "offset": 171
       },
       "document_metadata": {
         "author": "Dr. Seuss"
@@ -64,29 +65,112 @@ metadata. The reason for this split is that there may be multiple sections
 from the same document in the response, and this allows for deduplication of
 document-level metadata, which can reduce the total time for the response.
 
+## Metadata type consistency
+
+Type conversion applies only to the `part_metadata` and `document_metadata` 
+fields in query responses. Metadata is not converted during document upload, 
+even when using API v2. In query responses:
+
+* **Numbers** are returned as numbers (for example, `section: 2`, `publicationyear: 1979`).
+* **Booleans** are returned as `true` or `false` (case-sensitive).
+* **JSON objects** maintain their native structure.
+
+This behavior differs from API v1, where metadata such as `section` or 
+`publicationyear` might have been returned as strings (`"2"`, `"1979"`). 
+Ensure client applications accept these native types instead of assuming strings.
+
+## Metadata type regex patterns
+
+The following regex examples provide information about how each type is 
+identified and processed. By understanding these patterns, users can account 
+for type conversion in their client applications.
+
+### Numbers regex
+
+This pattern matches valid numeric formats, including integers, decimals, and 
+scientific notation, ensuring they are returned as numbers instead of strings. 
+Examples include `section: 2` or `offset: 316`.
+
+**Pattern:** `-?(?:0|[1-9]\d*)(?:\.\d+)?(?:[eE][+-]?\d+)?`
+
+
+| Input      | Matches | Explanation                                |
+|------------|---------|--------------------------------------------|
+| `123`      | ✅  | Valid integer.                            |
+| `0`        | ✅  | Valid zero.                               |
+| `-456`     | ✅  | Valid negative integer.                   |
+| `3.14`     | ✅  | Valid decimal number.                     |
+| `-0.001`   | ✅  | Valid negative decimal.                   |
+| `2e10`     | ✅  | Valid scientific notation.                |
+| `-1.23E-4` | ✅  | Valid negative number in scientific notation. |
+| `.5`       | ❌  | Invalid (missing leading integer).        |
+| `1e`       | ❌  | Invalid (missing exponent value).         |
+| `1.2.3`    | ❌  | Invalid (multiple decimal points).        |
+| `-`        | ❌  | Invalid (missing digits).                 |
+
+
+### Boolean regex
+
+This pattern matches only the exact lowercase values `true` or `false`, with no 
+extra characters.
+
+**Pattern:** `^(true|false)$`
+
+| Input      | Matches | Explanation                                     |
+|------------|---------|-------------------------------------------------|
+| `true`     | ✅  | Exact match for `true`.                        |
+| `false`    | ✅  | Exact match for `false`.                       |
+| ` true`    | ❌  | Invalid (leading space).                       |
+| `false `   | ❌  | Invalid (trailing space).                      |
+| `True`     | ❌  | Invalid (case-sensitive; must be lowercase).   |
+| `TRUE`     | ❌  | Invalid (case-sensitive; must be lowercase).   |
+| `falsey`   | ❌  | Invalid (extra characters after `false`).      |
+| `truest`   | ❌  | Invalid (extra characters after `true`).       |
+| `tru`      | ❌  | Invalid (partial match; incomplete `true`).    |
+
+### JSON regex
+
+This pattern identifies JSON-like values by checking only the first character, so 
+that objects like `{}` and arrays like `[]` keep their structure. It does not validate the JSON.
+
+**Pattern:** `^[{|\[].*$`
+
+| Input         | Matches | Explanation                                  |
+|---------------|---------|----------------------------------------------|
+| `{example}`   | ✅  | Starts with `{` and has additional content.  |
+| `[data]`      | ✅  | Starts with `[` and has additional content.  |
+| `{`           | ✅  | Matches a single `{` at the start.           |
+| `[`           | ✅  | Matches a single `[` at the start.           |
+| `example`     | ❌  | Does not start with `{` or `[`.              |
+| `something{`  | ❌  | Starts with other characters, not `{`.       |
+| `(empty)`     | ❌  | Empty string does not match.                 |
+
 ## Combining document and section metadata
 
-In order to display metadata for a particular section, you may want to combine
-it with the document-level metadata. To do so, look at the `document_id`
-value. This tells you which document the metadata belongs to.
+To display metadata for a particular section, you may want to combine it with 
+the document-level metadata.
+
+Each result carries a `document_id` value alongside its metadata. Use this 
+value to determine which document the section-level and document-level 
+metadata belong to.
 
-For example, the first result in the `search_results` array ("Answer to the Ultimate
-Question of Life, the Universe, and Everything, is 42.") has a `document_id`
-value of `hitchhikers-guide` and has a `part_metadata` of `speaker:Deep Thought`, `lang:eng`,
-`section:2`, and `offset:316`. These are the section-level metadata for this
+For example, the first result in the `search_results` array ("Answer to the Ultimate 
+Question of Life, the Universe, and Everything, is 42.") has a `document_id` 
+value of `hitchhikers-guide` and has a `part_metadata` of `speaker:Deep Thought`, `lang:eng`, 
+`section:2`, and `offset:316`. These are the section-level metadata for this 
 result.
 
-Because the `document_id` is `hitchhikers-guide`, we look at the first result in the
-`search_results` array to find the document-level metadata and document ID.  In this
-case, the `id` is `hitchhikers-guide` and the document-level metadata is
+Because the `document_id` is `hitchhikers-guide`, we look at the `document_metadata` 
+object of the same result in the `search_results` array to find the document-level 
+metadata. In this case, the `document_id` is `hitchhikers-guide` and the 
+document-level metadata is `author:Douglas Adams` and `publicationyear:1979`.
 
-Depending on your use case, you might want to combine these metadata elements
+Depending on your use case, you might want to combine these metadata elements 
+for display purposes.
 
 ## Filtering
 
-You can also use the `document`- and `section`-level metadata to filter in a
-search operation.  For more information on how to apply filter expressions at
-either the document or section/part level, please see the
+You can also use the `document`- and `section`-level metadata to filter search 
+results. For more information on how to apply filter expressions at 
+either the document or section/part level, please see the 
 [filter expression](/docs/learn/metadata-search-filtering/filter-overview) documentation.
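A minimal sketch of the merge described under "Combining document and section metadata" in the change above, assuming a query response shaped like the example in this diff. The helper name `combine_metadata` and the rule that part-level keys win on a name collision are assumptions for illustration, not documented behavior.

```python
from typing import Any, Dict


def combine_metadata(result: Dict[str, Any]) -> Dict[str, Any]:
    """Merge document-level and part-level metadata for one search result."""
    # Start from the document-level metadata so it acts as the default.
    combined = dict(result.get("document_metadata", {}))
    # Assumption: part-level keys override document-level keys on collision.
    combined.update(result.get("part_metadata", {}))
    # Keep the document ID so the merged record can be traced to its source.
    combined["document_id"] = result.get("document_id")
    return combined


# Sample data copied from the first result in the example response.
response = {
    "search_results": [
        {
            "part_metadata": {
                "speaker": "Deep Thought",
                "lang": "eng",
                "section": 2,
                "offset": 316,
            },
            "document_metadata": {
                "author": "Douglas Adams",
                "publicationyear": 1979,
            },
            "document_id": "hitchhikers-guide",
        }
    ]
}

for result in response["search_results"]:
    print(combine_metadata(result))
```

Copying the document-level dictionary before updating it leaves the original response untouched, which matters if the same result is rendered more than once.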
diff --git a/www/docs/learn/recommendation-systems/overview.md b/www/docs/learn/recommendation-systems/overview.md
@@ -46,7 +46,7 @@ are similar to the one they're looking at or a recently purchased product. These
 use cases can be dealt with by using <Config v="names.product"/> in a
 document-to-document search/recommendation platform.  In order to do this, the
 most important change is that you'll need to use `RESPONSE` similarity measure
-(available to [our Scale plan users](https://vectara.com/pricing/)).
+(available to [our Pro and Enterprise plan users](https://vectara.com/pricing/)).
 It's easier to explain how this is different by first explaining how the `DEFAULT`
 similarity works.
 

diff --git a/www/docs/migration-guide-api-v2.md b/www/docs/migration-guide-api-v2.md
@@ -123,6 +123,19 @@ In addition to the new Corpus Key:
   requests.
 * Remove the `textless` and `encrypted` fields from your requests.
 
+## Metadata type conversions
+
+In API v2, query responses return metadata in its native type: numbers return 
+as numbers, booleans return as booleans, and JSON objects retain their native 
+structure. Metadata is not converted during document upload. This behavior 
+differs from API v1, where metadata such as `section` or `publicationyear` might 
+have been returned as strings. For more details, see [Reading Metadata](/docs/api-reference/search-apis/interpreting-responses/metadata).
+
+**Action item:**
+
+Ensure client applications accept numbers, booleans, and JSON objects in `part_metadata` and `document_metadata` as native types instead of strings.
+
+
 ## Terminology, parameter, and property name changes
 
 * API v1 uses `num_results` for specifying the maximum number of results
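
One way a client might act on the "Metadata type conversions" action item is to normalize values so the same code path accepts both v1-style strings and v2 native types. The sketch below is a client-side illustration only: the three patterns are copied from the metadata.md change in this diff, but the helper name `coerce_metadata_value`, the use of full-string matching, and the fallback to the original string are assumptions, not a description of how the service itself converts values.

```python
import json
import re
from typing import Any

# Patterns copied verbatim from the metadata.md change in this diff.
NUMBER_RE = re.compile(r"-?(?:0|[1-9]\d*)(?:\.\d+)?(?:[eE][+-]?\d+)?")
BOOLEAN_RE = re.compile(r"^(true|false)$")
# Note: the "|" inside the character class is a literal pipe character.
JSON_RE = re.compile(r"^[{|\[].*$")


def coerce_metadata_value(value: Any) -> Any:
    """Return a native Python value for a metadata value of either API style."""
    # v2 responses already carry native types; pass them through unchanged.
    if not isinstance(value, str):
        return value
    # Assumption: each pattern must match the whole string.
    if BOOLEAN_RE.fullmatch(value):
        return value == "true"
    if NUMBER_RE.fullmatch(value):
        is_integer = not any(ch in value for ch in ".eE")
        return int(value) if is_integer else float(value)
    if JSON_RE.fullmatch(value):
        # The pattern only checks the first character, so parsing can fail.
        try:
            return json.loads(value)
        except json.JSONDecodeError:
            return value
    return value


v1_style = {"section": "2", "offset": "316", "lang": "eng"}
v2_style = {"section": 2, "offset": 316, "lang": "eng"}

normalized_v1 = {key: coerce_metadata_value(val) for key, val in v1_style.items()}
normalized_v2 = {key: coerce_metadata_value(val) for key, val in v2_style.items()}
print(normalized_v1 == normalized_v2)
```

Falling back to the original string when JSON parsing fails keeps values such as `{example}` intact instead of raising an error in display code.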