Implementation

Installation

To install Carbon Connect as a pre-built React component, use npm as follows:

npm install carbon-connect

Prerequisites

The following packages will be added as peer dependencies:

@radix-ui/react-checkbox
@radix-ui/react-dialog
@radix-ui/react-dropdown-menu
@radix-ui/react-popover
@radix-ui/react-slot
class-variance-authority
clsx
next-themes
react
react-dom
react-dropzone
react-infinite-scroll-component
react-loader-spinner
tailwind-merge

If you encounter a version mismatch error, check the required versions in package.json.

Component Properties

The CarbonConnect component accepts the following properties:

Property	Type	Required?	Description
`brandIcon`	String	Yes	A URL or a local path to your organization's brand icon.
`orgName`	String	Yes	The name of your organization. This is displayed in the initial announcement modal view.
`tokenFetcher`	Function	Yes	A function that returns a promise which resolves with the access and refresh tokens.
`onSuccess`	Function	No	A callback function that will be called after the file upload is successful.
`onError`	Function	No	A callback function that will be called if there is any error in the file upload.
`children`	React Node(JSX)	No	You can pass any valid React node that will be used as a trigger to open the component.
`entryPoint`	String	No	The initial active step when the component loads. Default entry point is `LOCAL_FILES`.
`maxFileSize`	Number	No	Maximum file size in bytes that is allowed to be uploaded. Defaults to 10 MB.
`tags`	Object	No	Any additional data you want to associate with the component's state, such as an app ID.
`enabledIntegrations`	dict	No	Lets you choose which third-party integrations to show. See below for more details about this prop.
`primaryBackgroundColor`	String	No	The primary background color of the component. Defaults to `#000000`.
`primaryTextColor`	String	No	The primary text color of the component. Defaults to `#FFFFFF`.
`secondaryBackgroundColor`	String	No	The secondary background color of the component. Defaults to `#FFFFFF`.
`secondaryTextColor`	String	No	The secondary text color of the component. Defaults to `#000000`.
`allowMultipleFiles`	Boolean	No	Whether or not to allow multiple files to be uploaded at once. Defaults to `false`.
`chunkSize`	Number	No	The number of tokens per chunk. Defaults to 1500.
`overlapSize`	Number	No	The number of tokens to overlap between chunks. Defaults to 20.
`open`	Boolean	No	Whether or not to open the component. Defaults to `false`.
`setOpen`	Function	No	A function that will be called to set the open state of the component. Defaults to `None`.
`alwaysOpen`	Boolean	No	Whether or not to always keep the component open. Defaults to `false`.
`tosURL`	String	No	A URL to your organization's terms of service. Defaults to `https://carbon.ai/terms`.
`privacyPolicyURL`	String	No	A URL to your organization's privacy policy. Defaults to `https://carbon.ai/privacy`.
`navigateBackURL`	String	No	A URL to your intended destination. Defaults to `None`.
`backButtonText`	String	No	The label that you want to show on the back button. Defaults to `Go back`.
`zIndex`	Number	No	Update the z-index of the Carbon Connect modal.
`embeddingModel`	String	No	Specifies the embedding model used for the integration. The options are `OPENAI`, `AZURE_OPENAI`, or `COHERE_MULTILINGUAL_V3` for text and audio files, and `VERTEX_MULTIMODAL` for image files.
`filePickerMode`	String	No	Specifies whether users can locally upload files, folders, or both. The options are `FILES`, `FOLDERS`, or `BOTH`.
`prependFilenameToChunks`	Boolean	No	Adds the file title to each chunk. Defaults to `false`.
`showFilesTab`	Boolean	No	Shows the synced files tab in Carbon Connect 2.0. Defaults to `true`.
`useRequestIds`	Boolean	No	A `request_id` will be assigned to the uploaded files in that session.
`loadingIconColor`	String	No	Defines the color of the loader icon. This can be specified using standard CSS color names, or directly as either a Hexadecimal (Hex) code or RGB color values.
`sendDeletionWebhooks`	Boolean	No	When set to `true`, enables triggering a `FILE_DELETED` webhook event whenever a user deletes files within Carbon Connect. If set to `false`, deleting files will not generate any webhook notifications.
`fileSyncConfig`	Array	No	Includes data source and file specific configurations.
`splitRows`	Boolean	No	If `splitRows` is set to `true`, CSV rows will be automatically split if they exceed either the specified chunk size or the maximum token limit of the embedding model. For `LOCAL_FILES`, `splitRows` can be set on the integration or extension level. For third-party connectors, this value can be set under the `fileSyncConfig` as `split_rows`. Defaults to `false`.
`incrementalSync`	Boolean	No	When `incrementalSync` is set to `true`, only new or updated files since the last sync will be re-synced. Defaults to `false`.
`filesTabColumns`	Array	No	Specifies which columns are displayed in the file list view. Accepts an array of strings with the values `"name"`, `"status"`, `"created_at"`, and `"external_url"`.
`theme`	String	No	Specifies whether dark or light mode is enabled. The prop can have values `"light"`, `"dark"`, and `"auto"`.
`dataSourcePollingInterval`	Number	No	Specifies how frequently data sources are polled for any updates and events. The value is specified in milliseconds (ms) and the minimum allowed value for this property is 3000 ms. Defaults to 8000 ms.
`openFilesTabTo`	String	No	Specifies which tab (`FILE_PICKER` or `FILES_LIST`) the user is taken to by default when they select an integration. Only applies when the customer has enabled Carbon’s in-house file picker.
`apiURL`	String	No	Defaults to https://api.carbon.ai but can be set to another URL value. For self-hosting customers, this URL value then acts as the base path for all of the requests made through Carbon Connect.
`dataSourceTags`	String	No	Key-value pairs that will be added to all data sources connected through Carbon Connect as custom metadata. Example: `{{"userId": "swapnil@carbon.ai"}}`
`dataSourceTagsFilterQuery`	String	No	This parameter filters for tags when querying data sources. It functions similarly to our documented file filters. If not provided, all data sources will be returned. Example: `{{"key": "userId", "value": "swapnil@carbon.ai"}}`

When you do not pass open or setOpen, Carbon Connect will manage the open state internally. If you pass open and setOpen, you will have to manage the open state yourself.

Usage

This section demonstrates how to integrate the Carbon Connect component within a Next.js project.

Client Side Configuration

1. Import Libraries and Components:

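// Assumption: EmbeddingGenerators and AutoSyncedSourceTypes, which the example in step 3 references, are exported by carbon-connect.
import { CarbonConnect, EmbeddingGenerators, AutoSyncedSourceTypes } from 'carbon-connect';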
import axios from 'axios';

2. Token Retrieval:

The tokenFetcher function requests access tokens from Carbon through your backend:

const tokenFetcher = async () => {
  const response = await axios.get('/api/auth/fetchCarbonTokens', {
    params: { customer_id: 'your_customer_id' },
  });
  return response.data; // Must return data containing access_token
};

In the example above, tokenFetcher is a helper function that retrieves the necessary tokens for authentication. This function should be implemented in your client-side code and is designed to make a request to an API on your backend server. The API then requests tokens from the Carbon token creation endpoint. The Carbon token creation endpoint is a secure endpoint that requires a valid API key and customer ID. The customer ID is a unique identifier for your end-user, and you can pass any string as the customer ID. The API key is a secret key provided to you by Carbon. Please contact us to obtain your API key.

3. Implement Carbon Connect Component:

Here's a concise usage example. Customize according to your requirements:

<CarbonConnect
  orgName="Your Organization"
  brandIcon="path/to/your/brand/icon"
  embeddingModel={EmbeddingGenerators.OPENAI_ADA_LARGE_1024}
  tokenFetcher={tokenFetcher}
  tags={{
    tag1: 'tag1_value',
    tag2: 'tag2_value',
    tag3: 'tag3_value',
  }}
  maxFileSize={10000000}
  enabledIntegrations={[
    {
      id: 'LOCAL_FILES',
      chunkSize: 100,
      overlapSize: 10,
      maxFileSize: 20000000,
      allowMultipleFiles: true,
      maxFilesCount: 5,
      allowedFileTypes: [
        {
          extension: 'csv',
          chunkSize: 1200,
          overlapSize: 120,
          embeddingModel: 'OPENAI',
        },
        {
          extension: 'txt',
          chunkSize: 1599,
          overlapSize: 210,
          embeddingModel: 'AZURE_OPENAI',
        },
        {
          extension: 'pdf',
        },
      ],
    },
    {
      id: 'NOTION',
      chunkSize: 1500,
      overlapSize: 20,
      embeddingModel: 'OPENAI',
    },
    {
      id: 'WEB_SCRAPER',
      chunkSize: 1500,
      overlapSize: 20,
    },
    {
      id: 'GOOGLE_DRIVE',
      chunkSize: 1000,
      overlapSize: 20,
      fileSyncConfig: {
        detect_audio_language: true,
        split_rows: true,
        generate_chunks_only: true,
      },
    },
    {
      id: 'INTERCOM',
      chunkSize: 1000,
      overlapSize: 20,
      fileSyncConfig: {
        auto_synced_source_types: AutoSyncedSourceTypes.TICKET,
      },
    },
  ]}
  onSuccess={(data) => console.log('Data on Success: ', data)}
  onError={(error) => console.log('Data on Error: ', error)}
  primaryBackgroundColor="#F2F2F2"
  primaryTextColor="#555555"
  secondaryBackgroundColor="#f2f2f2"
  secondaryTextColor="#000000"
  allowMultipleFiles={true}
  open={true}
  chunkSize={1500}
  overlapSize={20}
  // entryPoint="LOCAL_FILES"
></CarbonConnect>

4. Specify an Embedding Model (Optional)

If you use Carbon to generate embeddings, you can specify an embedding model (view available models) in the Carbon Connect component at three levels:

Global Level: The embeddingModel prop on the component applies to every connector and file type. It serves as the default unless overridden at another level.

Per Connector Level: This setting applies to a specific connector (e.g., Google Drive) and takes precedence over the global setting for that connector.

Per File Type Level: The most specific setting. It applies to individual file types and supersedes both the connector and global settings. It is available for local file uploads only.

The order of precedence is: File Type Level > Connector Level > Global Level. If an embedding model is defined at the file type level, it takes priority over the connector-level setting, and the connector-level setting takes priority over the global setting. We default to OPENAI if no value is provided.
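
The precedence rule can be sketched as a small lookup. This is only an illustration: `resolveEmbeddingModel` is a hypothetical helper, not part of carbon-connect, and it assumes connector objects shaped like the `enabledIntegrations` entries in the usage example.

```javascript
// Hypothetical helper (not part of carbon-connect). It mirrors the documented
// precedence: file type level > connector level > global level > 'OPENAI'.
function resolveEmbeddingModel(globalModel, connector, extension) {
  const fileTypes = (connector && connector.allowedFileTypes) || [];
  const fileType = fileTypes.find((ft) => ft.extension === extension);
  if (fileType && fileType.embeddingModel) return fileType.embeddingModel;
  if (connector && connector.embeddingModel) return connector.embeddingModel;
  return globalModel || 'OPENAI';
}

// Illustrative connector configuration; the values are made up for the example.
const localFiles = {
  id: 'LOCAL_FILES',
  embeddingModel: 'AZURE_OPENAI',
  allowedFileTypes: [
    { extension: 'csv', embeddingModel: 'COHERE_MULTILINGUAL_V3' },
    { extension: 'pdf' },
  ],
};

console.log(resolveEmbeddingModel('OPENAI', localFiles, 'csv')); // COHERE_MULTILINGUAL_V3
console.log(resolveEmbeddingModel('OPENAI', localFiles, 'pdf')); // AZURE_OPENAI
console.log(resolveEmbeddingModel(undefined, { id: 'NOTION' }, undefined)); // OPENAI
```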

Server Side Configuration

Your backend should handle token requests like this:

const response = await axios.get('https://api.carbon.ai/auth/v1/access_token', {
  headers: {
    'Content-Type': 'application/json',
    'customer-id': '<YOUR_USER_UNIQUE_IDENTIFIER>',
    authorization: 'Bearer <YOUR_API_KEY>',
  },
});
if (response.status === 200 && response.data) {
  res.status(200).json(response.data);
}
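
The snippet above can be placed inside an API route. Below is a minimal sketch of one way to do that, assuming a Next.js-style handler signature `(req, res)` and the global `fetch` available in Node 18+ instead of axios. The names `buildTokenRequest` and `createTokenHandler`, the error messages, and the status codes for failures are illustrative choices, not part of Carbon's API. The endpoint URL and header names are taken from the snippet above.

```javascript
// Build the request for Carbon's token endpoint (URL and headers as documented above).
function buildTokenRequest(customerId, apiKey) {
  return {
    url: 'https://api.carbon.ai/auth/v1/access_token',
    headers: {
      'Content-Type': 'application/json',
      'customer-id': customerId,
      authorization: `Bearer ${apiKey}`,
    },
  };
}

// Returns a handler with a Next.js-style (req, res) signature.
// fetchImpl is injectable so the handler can be exercised without network access.
function createTokenHandler(apiKey, fetchImpl = globalThis.fetch) {
  return async function handler(req, res) {
    const customerId = req.query.customer_id;
    if (!customerId) {
      return res.status(400).json({ error: 'customer_id is required' });
    }
    try {
      const { url, headers } = buildTokenRequest(customerId, apiKey);
      const response = await fetchImpl(url, { headers });
      if (!response.ok) {
        // Always answer the client, even when Carbon rejects the request.
        return res.status(response.status).json({ error: 'Token request failed' });
      }
      return res.status(200).json(await response.json());
    } catch (err) {
      return res.status(500).json({ error: 'Could not reach the token endpoint' });
    }
  };
}
```

In a Next.js project, the route file for `/api/auth/fetchCarbonTokens` could export `createTokenHandler(process.env.CARBON_API_KEY)` as its default export. The environment variable name is an assumption. Keeping the API key on the server is the point of routing the request through your backend.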

Return Value

Ensure that your tokenFetcher returns an object structured as:

{
  access_token: string;
}
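
One way to catch a malformed backend response early is to check the shape before handing it to Carbon Connect. The sketch below is an optional guard, not something the component requires. `assertTokenResponse` and `withTokenCheck` are hypothetical names.

```javascript
// Hypothetical guard: throws if the value lacks a non-empty string access_token.
function assertTokenResponse(data) {
  if (!data || typeof data.access_token !== 'string' || data.access_token.length === 0) {
    throw new Error('tokenFetcher must resolve to an object with a string access_token');
  }
  return data;
}

// Wraps any token fetcher so its result is validated before it is returned.
function withTokenCheck(fetcher) {
  return async () => assertTokenResponse(await fetcher());
}
```

With this in place, `tokenFetcher={withTokenCheck(tokenFetcher)}` would surface a clear error if the backend ever returns an unexpected payload.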

Enabling Connectors

Use the enabledIntegrations property to choose which connectors your users can access. The property also accepts additional configuration for each connector.

Here's the list of connectors available for activation:

`LOCAL_FILES`: This integration lets you upload files from your local machine. You can reference the supported file formats [here](learn/files/text). You can pass the following configuration for this integration:

- `chunkSize`: Number of tokens per chunk. Default is 1500.
- `overlapSize`: Size of overlap in tokens. Default is 20.
- `maxFileSize`: Maximum file size allowed for upload in bytes. Default is 10 MB.
- `allowMultipleFiles`: Determines if multiple files can be uploaded simultaneously. Default is `false`.
- `maxFilesCount`: Maximum number of files allowed for upload at once. Default is 10.
- `skipEmbeddingGeneration`: Toggle to skip embedding generation. Default is `false`.
- `useOcr`: Toggle to enable Optical Character Recognition (OCR) for PDFs. Default is `false`.
- `parsePdfTablesWithOcr`: Enables table parsing when `useOcr` is set to `true`. Default is `false`. (Please set on file extension level.)
- `generateSparseVectors`: Toggle to `true` to generate sparse vectors for hybrid search. Default is `false`.
- `embeddingModel`: Specifies the embedding model used. You can find the model options [here](learn/models/models).
- `prependFilenameToChunks`: Adds the file title to each chunk for the integration. Defaults to `false`.
- `maxItemsPerChunk`: Specifies the number of items to include in a specific chunk. Defaults to `null`.
- `transcriptionService`: Specifies the model being used for audio transcription. Accepts an enum of `ASSEMBLYAI` or `DEEPGRAM`. Defaults to `DEEPGRAM`.
- `includeSpeakerLabels`: Specifies whether speaker diarization will be enabled for the audio transcription services. This allows us to format chunks so that the text is organized by utterances and each utterance will be labeled with the speaker. Defaults to `false`.
- `generateChunksOnly`: When this flag is set to `true`, documents will be chunked without generating embeddings, and the `/list_chunks_and_embeddings` endpoint will list chunks only. Defaults to `false`.
- `allowedFileTypes`: This is an array of objects. Each object represents a file type that is allowed to be uploaded. Each object can have the following properties:
  - `extension`: File extension of the allowed file type (required property).
  - `chunkSize`: Number of tokens per chunk for this file type. Defaults to global setting if not specified.
  - `overlapSize`: Overlap size in tokens for this file type. Defaults to global setting if not specified.
  - `skipEmbeddingGeneration`: Toggle to skip embedding generation for this file type. Defaults to global setting if not specified.
  - `embeddingModel`: Specifies the embedding model for this file type. Options same as global but specific to this file type. Defaults to global setting if not specified.
  - `useOcr`: Toggle to enable OCR for PDF files. Defaults to global setting if not specified.
  - `generateSparseVectors`: Toggle to `true` to generate sparse vectors for hybrid search for this file type. Default is `false`.
  - `prependFilenameToChunks`: Adds the file title to each chunk for the file. Defaults to `false`.
  - `maxItemsPerChunk`: Specifies the number of items to include in a specific chunk. Defaults to `null`.
  - `filesTabColumns`: Specifies which columns are displayed in the file list view and accepts an array of strings which can have values `"name"`, `"status"`, `"created_at"`, `"external_url"`.
  - `transcriptionService`: Specifies the model being used for audio transcription. Accepts an enum of `ASSEMBLYAI` or `DEEPGRAM`. Defaults to `DEEPGRAM`.

`NOTION`: This integration lets you upload files from your Notion account. You can pass the following configuration for this integration:

- `chunkSize`: Number of tokens per chunk. Defaults to 1500.
- `overlapSize`: Size of the overlap in tokens. Defaults to 20.
- `skipEmbeddingGeneration`: Toggle to skip embedding generation. Defaults to `false`.
- `embeddingModel`: Specifies the embedding model used. You can find the model options [here](learn/models/models).
- `generateSparseVectors`: Toggle to `true` to generate sparse vectors for hybrid search. Default is `false`.
- `prependFilenameToChunks`: Adds the file title to each chunk for the integration. Defaults to `false`. 
- `showFilesTab`: Shows the synced files tab in Carbon Connect for this specific integration. Defaults to `true`. (Carbon Connect 2.0 only)
- `syncSourceItems`: Controls whether items from the file directory are synced by default. It defaults to `true`.
- `useCarbonFilePicker`: Controls whether Carbon Connect defaults to Carbon’s file picker instead of the source’s file picker. Defaults to `false`. (Carbon Connect 3.0 only)
- `incrementalSync`: By setting `incremental_sync` to `true`, only new or updated files since the last sync will be re-synced. Defaults to `false`.
- `filesTabColumns`: Specifies which columns are displayed in the file list view and accepts an array of strings which can have values `"name"`, `"status"`, `"created_at"`, `"external_url"`.
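
As an illustration of the options listed above, a `NOTION` entry for `enabledIntegrations` might look like the following. The option names come from the list. The values are arbitrary examples, not recommendations.

```javascript
// Option names are from the NOTION list above; the values are illustrative only.
const notionIntegration = {
  id: 'NOTION',
  chunkSize: 1000,
  overlapSize: 50,
  embeddingModel: 'OPENAI',
  generateSparseVectors: true,
  prependFilenameToChunks: true,
  incrementalSync: true,
  filesTabColumns: ['name', 'status', 'created_at'],
};

// Pass this array as the enabledIntegrations prop of CarbonConnect.
const enabledIntegrations = [notionIntegration];
```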

`WEB_SCRAPER`: This integration lets you scrape URLs. You can pass the following configuration for this integration:

- `chunkSize`: Number of tokens per chunk. Defaults to 1500.
- `overlapSize`: Size of the overlap in tokens. Defaults to 20.
- `sitemapEnabled`: This option enables the sitemap tab to be displayed. Defaults to `true`.
- `recursionDepth`: Depth of recursion for scraping. Defaults to 3. Use 1 to disable recursion and 0 to scrape recursively until reaching the `maxPagesToScrape` limit.
- `maxPagesToScrape`: Maximum number of pages to scrape. Defaults to 100.
- `skipEmbeddingGeneration`: Toggle to skip embedding generation. Defaults to `false`.
- `enableAutoSync`: Toggle to enable scheduled syncs. Defaults to `false`.
- `embeddingModel`: Specifies the embedding model used. You can find the model options [here](learn/models/models).
- `generateSparseVectors`: Toggle to `true` to generate sparse vectors for hybrid search. Default is `false`.
- `htmlTagsToSkip`: Define HTML tags to exclude when converting HTML to plaintext. Defaults to `[]`, an empty list.
- `cssClassesToSkip`: Define CSS Classes to exclude when converting HTML to plaintext. Defaults to `[]`, an empty list.
- `cssSelectorsToSkip`: Define CSS Selectors to exclude when converting HTML to plaintext. Defaults to `[]`, an empty list.
- `prependFilenameToChunks`: Adds the file title to each chunk for the integration. Defaults to `false`.
- `generateChunksOnly`: When this flag is set to `true`, documents will be chunked without generating embeddings, and the `/list_chunks_and_embeddings` endpoint will list chunks only. Defaults to `false`.

`GOOGLE_DRIVE`: This integration lets you upload files from your Google Drive. You can reference the supported file formats [here](learn/files/text). You can pass the following configuration for this integration:

- `chunkSize`: Number of tokens per chunk. Defaults to 1500.
- `overlapSize`: Size of the overlap in tokens. Defaults to 20.
- `skipEmbeddingGeneration`: Toggle to skip embedding generation. Defaults to `false`.
- `embeddingModel`: Specifies the embedding model used. You can find the model options [here](learn/models/models).
- `generateSparseVectors`: Toggle to `true` to generate sparse vectors for hybrid search. Default is `false`.
- `prependFilenameToChunks`: Adds the file title to each chunk for the integration. Defaults to `false`.
- `maxItemsPerChunk`: Specifies the number of items to include in a specific chunk. Defaults to `null`.
- `showFilesTab`: Shows the synced files tab in Carbon Connect for this specific integration. Defaults to `true`. (Carbon Connect 2.0 only)
- `useOcr`: Toggle to enable Optical Character Recognition (OCR) for PDFs. Default is `false`.
- `parsePdfTablesWithOcr`: Enables table parsing when `useOcr` is set to `true`. Default is `false`.
- `syncSourceItems`: Controls whether items from the file directory are synced by default. It defaults to `true`.
- `useCarbonFilePicker`: Controls whether Carbon Connect defaults to Carbon’s file picker instead of the source’s file picker. Defaults to `false`. (Carbon Connect 3.0 only)
- `incrementalSync`: By setting `incremental_sync` to `true`, only new or updated files since the last sync will be re-synced. Defaults to `false`.
- `filesTabColumns`: Specifies which columns are displayed in the file list view and accepts an array of strings which can have values `"name"`, `"status"`, `"created_at"`, `"external_url"`.

`INTERCOM`: This integration lets you select pages from your Intercom. You can pass the following configuration for this integration:

- `chunkSize`: Number of tokens per chunk. Defaults to 1500.
- `overlapSize`: Size of the overlap in tokens. Defaults to 20.
- `skipEmbeddingGeneration`: Toggle to skip embedding generation. Defaults to `false`.
- `embeddingModel`: Specifies the embedding model used. You can find the model options [here](learn/models/models).
- `generateSparseVectors`: Toggle to `true` to generate sparse vectors for hybrid search. Default is `false`.
- `prependFilenameToChunks`: Adds the file title to each chunk for the integration. Defaults to `false`. 
- `syncFilesOnConnection`: Auto-sync all files from a user’s connected account.
- `showFilesTab`: Shows the synced files tab in Carbon Connect for this specific integration. Defaults to `true`. (Carbon Connect 2.0 only)
- `syncSourceItems`: Controls whether items from the file directory are synced by default. It defaults to `true`.
- `useCarbonFilePicker`: Controls whether Carbon Connect defaults to Carbon’s file picker instead of the source’s file picker. Defaults to `false`. (Carbon Connect 3.0 only)
- `incrementalSync`: By setting `incremental_sync` to `true`, only new or updated files since the last sync will be re-synced. Defaults to `false`.
- `filesTabColumns`: Specifies which columns are displayed in the file list view and accepts an array of strings which can have values `"name"`, `"status"`, `"created_at"`, `"external_url"`.
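
Because `filesTabColumns` only accepts a fixed set of values, it can be useful to sanitize a column list before passing it to a connector configuration. The sketch below is a hypothetical pre-check written for this guide. Whether Carbon Connect performs its own validation is not stated here.

```javascript
// The four column names documented for filesTabColumns.
const ALLOWED_FILES_TAB_COLUMNS = ['name', 'status', 'created_at', 'external_url'];

// Hypothetical helper: drops unknown and repeated column names, keeping order.
function sanitizeFilesTabColumns(columns) {
  const seen = new Set();
  return columns.filter((column) => {
    if (!ALLOWED_FILES_TAB_COLUMNS.includes(column) || seen.has(column)) return false;
    seen.add(column);
    return true;
  });
}

// 'size' is not a documented column and the repeated 'created_at' is dropped.
console.log(sanitizeFilesTabColumns(['name', 'created_at', 'created_at', 'size']));
```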

`DROPBOX`: This integration lets you upload files from your Dropbox. You can reference the supported file formats [here](learn/files/text). You can pass the following configuration for this integration:

- `chunkSize`: Number of tokens per chunk. Defaults to 1500.
- `overlapSize`: Size of the overlap in tokens. Defaults to 20.
- `skipEmbeddingGeneration`: Toggle to skip embedding generation. Defaults to `false`.
- `embeddingModel`: Specifies the embedding model used. You can find the model options [here](learn/models/models).
- `generateSparseVectors`: Toggle to `true` to generate sparse vectors for hybrid search. Default is `false`.
- `prependFilenameToChunks`: Adds the file title to each chunk for the integration. Defaults to `false`.
- `maxItemsPerChunk`: Specifies the number of items to include in a specific chunk. Defaults to `null`.
- `showFilesTab`: Shows the synced files tab in Carbon Connect for this specific integration. Defaults to `true`. (Carbon Connect 2.0 only)
- `useOcr`: Toggle to enable Optical Character Recognition (OCR) for PDFs. Default is `false`.
- `parsePdfTablesWithOcr`: Enables table parsing when `useOcr` is set to `true`. Default is `false`.
- `syncSourceItems`: Controls whether items from the file directory are synced by default. It defaults to `true`.
- `useCarbonFilePicker`: Controls whether Carbon Connect defaults to Carbon’s file picker instead of the source’s file picker. Defaults to `false`. (Carbon Connect 3.0 only)
- `incrementalSync`: By setting `incremental_sync` to `true`, only new or updated files since the last sync will be re-synced. Defaults to `false`.
- `filesTabColumns`: Specifies which columns are displayed in the file list view and accepts an array of strings which can have values `"name"`, `"status"`, `"created_at"`, `"external_url"`.

`BOX`: This integration lets you upload files from your Box. You can reference the supported file formats [here](learn/files/text). You can pass the following configuration for this integration:

- `chunkSize`: Number of tokens per chunk. Defaults to 1500.
- `overlapSize`: Size of the overlap in tokens. Defaults to 20.
- `skipEmbeddingGeneration`: Toggle to skip embedding generation. Defaults to `false`.
- `embeddingModel`: Specifies the embedding model used. You can find the model options [here](learn/models/models).
- `generateSparseVectors`: Toggle to `true` to generate sparse vectors for hybrid search. Default is `false`.
- `prependFilenameToChunks`: Adds the file title to each chunk for the integration. Defaults to `false`.
- `maxItemsPerChunk`: Specifies the number of items to include in a specific chunk. Defaults to `null`.
- `showFilesTab`: Shows the synced files tab in Carbon Connect for this specific integration. Defaults to `true`. (Carbon Connect 2.0 only)
- `useOcr`: Toggle to enable Optical Character Recognition (OCR) for PDFs. Default is `false`.
- `parsePdfTablesWithOcr`: Enables table parsing when `useOcr` is set to `true`. Default is `false`.
- `syncSourceItems`: Controls whether items from the file directory are synced by default. It defaults to `true`.
- `useCarbonFilePicker`: Controls whether Carbon Connect defaults to Carbon’s file picker instead of the source’s file picker. Defaults to `false`. (Carbon Connect 3.0 only)
- `incrementalSync`: By setting `incremental_sync` to `true`, only new or updated files since the last sync will be re-synced. Defaults to `false`.
- `filesTabColumns`: Specifies which columns are displayed in the file list view and accepts an array of strings which can have values `"name"`, `"status"`, `"created_at"`, `"external_url"`.

`ONEDRIVE`: This integration lets you upload files from your OneDrive. You can reference the supported file formats [here](learn/files/text). You can pass the following configuration for this integration:

- `chunkSize`: Number of tokens per chunk. Defaults to 1500.
- `overlapSize`: Size of the overlap in tokens. Defaults to 20.
- `skipEmbeddingGeneration`: Toggle to skip embedding generation. Defaults to `false`.
- `embeddingModel`: Specifies the embedding model used. You can find the model options [here](learn/models/models).
- `generateSparseVectors`: Toggle to `true` to generate sparse vectors for hybrid search. Default is `false`.
- `prependFilenameToChunks`: Adds the file title to each chunk for the integration. Defaults to `false`.
- `maxItemsPerChunk`: Specifies the number of items to include in a specific chunk. Defaults to `null`.
- `showFilesTab`: Shows the synced files tab in Carbon Connect for this specific integration. Defaults to `true`. (Carbon Connect 2.0 only)
- `useOcr`: Toggle to enable Optical Character Recognition (OCR) for PDFs. Default is `false`.
- `parsePdfTablesWithOcr`: Enables table parsing when `useOcr` is set to `true`. Default is `false`.
- `syncSourceItems`: Controls whether items from the file directory are synced by default. It defaults to `true`.
- `useCarbonFilePicker`: Controls whether Carbon Connect defaults to Carbon’s file picker instead of the source’s file picker. Defaults to `false`. (Carbon Connect 3.0 only)
- `incrementalSync`: By setting `incremental_sync` to `true`, only new or updated files since the last sync will be re-synced. Defaults to `false`.
- `filesTabColumns`: Specifies which columns are displayed in the file list view and accepts an array of strings which can have values `"name"`, `"status"`, `"created_at"`, `"external_url"`.

`ZOTERO`: This integration lets you upload files from your Zotero. You can reference the supported file formats [here](learn/files/text). You can pass the following configuration for this integration:

- `chunkSize`: Number of tokens per chunk. Defaults to 1500.
- `overlapSize`: Size of the overlap in tokens. Defaults to 20.
- `skipEmbeddingGeneration`: Toggle to skip embedding generation. Defaults to `false`.
- `embeddingModel`: Specifies the embedding model used. You can find the model options [here](learn/models/models).
- `generateSparseVectors`: Toggle to `true` to generate sparse vectors for hybrid search. Default is `false`.
- `prependFilenameToChunks`: Adds the file title to each chunk for the integration. Defaults to `false`.
- `maxItemsPerChunk`: Specifies the number of items to include in a specific chunk. Defaults to `null`.
- `showFilesTab`: Shows the synced files tab in Carbon Connect for this specific integration. Defaults to `true`. (Carbon Connect 2.0 only)
- `useOcr`: Toggle to enable Optical Character Recognition (OCR) for PDFs. Default is `false`.
- `parsePdfTablesWithOcr`: Enables table parsing when `useOcr` is set to `true`. Default is `false`.
- `syncSourceItems`: Controls whether items from the file directory are synced by default. It defaults to `true`.
- `useCarbonFilePicker`: Controls whether Carbon Connect defaults to Carbon’s file picker instead of the source’s file picker. Defaults to `false`. (Carbon Connect 3.0 only)
- `filesTabColumns`: Specifies which columns are displayed in the file list view and accepts an array of strings which can have values `"name"`, `"status"`, `"created_at"`, `"external_url"`.

`SHAREPOINT`: This integration lets you upload files from your SharePoint. You can reference the supported file formats [here](learn/files/text). You can pass the following configuration for this integration:

- `chunkSize`: Number of tokens per chunk. Defaults to 1500.
- `overlapSize`: Size of the overlap in tokens. Defaults to 20.
- `skipEmbeddingGeneration`: Toggle to skip embedding generation. Defaults to `false`.
- `embeddingModel`: Specifies the embedding model used. You can find the model options [here](learn/models/models).
- `generateSparseVectors`: Toggle to `true` to generate sparse vectors for hybrid search. Default is `false`.
- `prependFilenameToChunks`: Adds the file title to each chunk for the integration. Defaults to `false`.
- `maxItemsPerChunk`: Specifies the number of items to include in a specific chunk. Defaults to `null`.
- `showFilesTab`: Shows the synced files tab in Carbon Connect for this specific integration. Defaults to `true`. (Carbon Connect 2.0 only)
- `useOcr`: Toggle to enable Optical Character Recognition (OCR) for PDFs. Default is `false`.
- `parsePdfTablesWithOcr`: Enables table parsing when `useOcr` is set to `true`. Default is `false`.
- `syncSourceItems`: Controls whether items from the file directory are synced by default. It defaults to `true`.
- `useCarbonFilePicker`: Controls whether Carbon Connect defaults to Carbon’s file picker instead of the source’s file picker. Defaults to `false`. (Carbon Connect 3.0 only)
- `incrementalSync`: By setting `incremental_sync` to `true`, only new or updated files since the last sync will be re-synced. Defaults to `false`.
- `filesTabColumns`: Specifies which columns are displayed in the file list view and accepts an array of strings which can have values `"name"`, `"status"`, `"created_at"`, `"external_url"`.

`CONFLUENCE`: This integration lets you upload files from your Confluence. You can pass the following configuration for this integration:

- `chunkSize`: Number of tokens per chunk. Defaults to 1500.
- `overlapSize`: Size of the overlap in tokens. Defaults to 20.
- `skipEmbeddingGeneration`: Toggle to skip embedding generation. Defaults to `false`.
- `embeddingModel`: Specifies the embedding model used. You can find the model options [here](learn/models/models).
- `generateSparseVectors`: Toggle to `true` to generate sparse vectors for hybrid search. Default is `false`.
- `prependFilenameToChunks`: Adds the file title to each chunk for the integration. Defaults to `false`.
- `syncFilesOnConnection`: Auto-sync all files from a user’s connected account.
- `showFilesTab`: Shows the synced files tab in Carbon Connect for this specific integration. Defaults to `true`. (Carbon Connect 2.0 only)
- `syncSourceItems`: Controls whether items from the file directory are synced by default. It defaults to `true`.
- `useCarbonFilePicker`: Controls whether Carbon Connect defaults to Carbon’s file picker instead of the source’s file picker. Defaults to `false`. (Carbon Connect 3.0 only)
- `incrementalSync`: By setting `incremental_sync` to `true`, only new or updated files since the last sync will be re-synced. Defaults to `false`.
- `filesTabColumns`: Specifies which columns are displayed in the file list view and accepts an array of strings which can have values `"name"`, `"status"`, `"created_at"`, `"external_url"`.

`ZENDESK`: This integration lets you upload files from your Zendesk. You can pass the following configuration for this integration:

- `chunkSize`: Number of tokens per chunk. Defaults to 1500.
- `overlapSize`: Size of the overlap in tokens. Defaults to 20.
- `skipEmbeddingGeneration`: Toggle to skip embedding generation. Defaults to `false`.
- `embeddingModel`: Specifies the embedding model used. You can find the model options [here](learn/models/models).
- `generateSparseVectors`: Toggle to `true` to generate sparse vectors for hybrid search. Default is `false`.
- `prependFilenameToChunks`: Adds the file title to each chunk for the integration. Defaults to `false`.
- `syncFilesOnConnection`: Auto-sync all files from a user’s connected account.
- `showFilesTab`: Shows the synced files tab in Carbon Connect for this specific integration. Defaults to `true`. (Carbon Connect 2.0 only)
- `syncSourceItems`: Controls whether items from the file directory are synced by default. It defaults to `true`.
- `useCarbonFilePicker`: Controls whether Carbon Connect defaults to Carbon’s file picker instead of the source’s file picker. Defaults to `false`. (Carbon Connect 3.0 only)
- `incrementalSync`: By setting `incremental_sync` to `true`, only new or updated files since the last sync will be re-synced. Defaults to `false`.
- `filesTabColumns`: Specifies which columns are displayed in the file list view and accepts an array of strings which can have values `"name"`, `"status"`, `"created_at"`, `"external_url"`.

`FRESHDESK`: This integration lets you sync pages from your Freshdesk. You can pass the following configuration for this integration:

- `chunkSize`: Number of tokens per chunk. Defaults to 1500.
- `overlapSize`: Size of the overlap in tokens. Defaults to 20.
- `skipEmbeddingGeneration`: Toggle to skip embedding generation. Defaults to `false`.
- `embeddingModel`: Specifies the embedding model used. You can find the model options [here](learn/models/models).
- `generateSparseVectors`: Toggle to `true` to generate sparse vectors for hybrid search. Default is `false`.
- `prependFilenameToChunks`: Adds the file title to each chunk for the integration. Defaults to `false`.
- `syncFilesOnConnection`: Auto-sync all files from a user’s connected account.
- `showFilesTab`: Shows the synced files tab in Carbon Connect for this specific integration. Defaults to `true`. (Carbon Connect 2.0 only)
- `syncSourceItems`: Controls whether items from the file directory are synced by default. It defaults to `true`.
- `useCarbonFilePicker`: Controls whether Carbon Connect defaults to Carbon’s file picker instead of the source’s file picker. Defaults to `false`. (Carbon Connect 3.0 only)
- `filesTabColumns`: Specifies which columns are displayed in the file list view and accepts an array of strings which can have values `"name"`, `"status"`, `"created_at"`, `"external_url"`.

`GITBOOK`: This integration lets you sync pages from your GitBook. You can pass the following configuration for this integration:

- `chunkSize`: Number of tokens per chunk. Defaults to 1500.
- `overlapSize`: Size of the overlap in tokens. Defaults to 20.
- `skipEmbeddingGeneration`: Toggle to skip embedding generation. Defaults to `false`.
- `embeddingModel`: Specifies the embedding model used. You can find the model options [here](learn/models/models).
- `generateSparseVectors`: Toggle to `true` to generate sparse vectors for hybrid search. Default is `false`.
- `prependFilenameToChunks`: Adds the file title to each chunk for the integration. Defaults to `false`.
- `syncFilesOnConnection`: Auto-sync all files from a user’s connected account.
- `showFilesTab`: Shows the synced files tab in Carbon Connect for this specific integration. Defaults to `true`. (Carbon Connect 2.0 only)
- `syncSourceItems`: Controls whether items from the file directory are synced by default. It defaults to `true`.
- `useCarbonFilePicker`: Controls whether Carbon Connect defaults to Carbon’s file picker instead of the source’s file picker. Defaults to `false`. (Carbon Connect 3.0 only)
- `filesTabColumns`: Specifies which columns are displayed in the file list view and accepts an array of strings which can have values `"name"`, `"status"`, `"created_at"`, `"external_url"`.

`GITBOOK`: This integration lets you sync pages from your GitBook. You can pass the following configuration for this integration:

- `chunkSize`: Number of tokens per chunk. Defaults to 1500.
- `overlapSize`: Size of the overlap in tokens. Defaults to 20.
- `skipEmbeddingGeneration`: Toggle to skip embedding generation. Defaults to `false`.
- `embeddingModel`: Specifies the embedding model used. You can find the model options [here](learn/models/models).
- `generateSparseVectors`: Toggle to `true` to generate sparse vectors for hybrid search. Default is `false`.
- `prependFilenameToChunks`: Adds the file title to each chunk for the integration. Defaults to `false`.
- `syncFilesOnConnection`: Auto-sync all files from a user’s connected account.
- `showFilesTab`: Shows the synced files tab in Carbon Connect for this specific integration. Defaults to `true`. (Carbon Connect 2.0 only)
- `syncSourceItems`: Controls whether items from the file directory are synced by default. It defaults to `false`.
- `useCarbonFilePicker`: Controls whether Carbon Connect defaults to Carbon’s file picker instead of the source’s file picker. Defaults to `false`. (Carbon Connect 3.0 only) - `filesTabColumns`: Specifies which columns are displayed in the file list view and accepts an array of strings which can have values `"name"`, `"status"`, `"created_at"`, `"external_url"`. `SALESFORCE`: This integration lets you sync pages from your Salesforce Knowledge. You can pass the following configuration for this integration: - `chunkSize`: Number of tokens per chunk. Defaults to 1500. - `overlapSize`: Size of the overlap in tokens. Defaults to 20. - `skipEmbeddingGeneration`: Toggle to skip embedding generation. Defaults to `false`. - `embeddingModel`: Specifies the embedding model used. You can find the model options [here](learn/models/models). - `generateSparseVectors`: Toggle to `true` to generate sparse vectors for hybrid search. Default is `false`. - `prependFilenameToChunks`: Adds the file title to each chunk for the integration. Defaults to `false`. - `syncFilesOnConnection`: Auto-sync all files from a user’s connected account. - `showFilesTab`: Shows the synced files tab in Carbon Connect for this specific integration. Defaults to `true`. (Carbon Connect 2.0 only) - `syncSourceItems`: Controls whether items from the file directory are synced by default. It defaults to `true`. - `useCarbonFilePicker`: Controls whether Carbon Connect defaults to Carbon’s file picker instead of the source’s file picker. Defaults to `false`. (Carbon Connect 3.0 only) - `filesTabColumns`: Specifies which columns are displayed in the file list view and accepts an array of strings which can have values `"name"`, `"status"`, `"created_at"`, `"external_url"`. `GURU`: This integration lets you sync content from your Guru workspace. You can pass the following configuration for this integration: - `chunkSize`: Number of tokens per chunk. Defaults to 1500. - `overlapSize`: Size of the overlap in tokens. 
Defaults to 20. - `skipEmbeddingGeneration`: Toggle to skip embedding generation. Defaults to `false`. - `embeddingModel`: Specifies the embedding model used. You can find the model options [here](learn/models/models). - `generateSparseVectors`: Toggle to `true` to generate sparse vectors for hybrid search. Default is `false`. - `prependFilenameToChunks`: Adds the file title to each chunk for the integration. Defaults to `false`. - `syncFilesOnConnection`: Auto-sync all files from a user’s connected account. - `showFilesTab`: Shows the synced files tab in Carbon Connect for this specific integration. Defaults to `true`. (Carbon Connect 2.0 only) - `syncSourceItems`: Controls whether items from the file directory are synced by default. It defaults to `true`. - `useCarbonFilePicker`: Controls whether Carbon Connect defaults to Carbon’s file picker instead of the source’s file picker. Defaults to `false`. (Carbon Connect 3.0 only) - `filesTabColumns`: Specifies which columns are displayed in the file list view and accepts an array of strings which can have values `"name"`, `"status"`, `"created_at"`, `"external_url"`. `GMAIL`: This integration enables you to import emails from Gmail, including file attachments. You can reference the supported file formats [here](learn/files/text). Once a user has connected their Gmail account, you can select which emails to upload via the `/integrations/gmail/sync` endpoint.

You can pass the following configuration for this integration:

- `filesTabColumns`: Specifies which columns are displayed in the file list view and accepts an array of strings which can have values `"name"`, `"status"`, `"created_at"`, `"external_url"`.
- `skipEmbeddingGeneration`: Toggle to skip embedding generation. Defaults to `false`.

`OUTLOOK`: This integration enables you to import emails from Outlook, including file attachments. You can reference the supported file formats [here](learn/files/text). Once a user has connected their Outlook account, you can select which emails to upload via the `/integrations/outlook/sync` endpoint.

You can pass the following configuration for this integration:

- `filesTabColumns`: Specifies which columns are displayed in the file list view and accepts an array of strings which can have values `"name"`, `"status"`, `"created_at"`, `"external_url"`.
- `skipEmbeddingGeneration`: Toggle to skip embedding generation. Defaults to `false`.

`SLACK`: This integration enables you to import conversations from Slack. Once a user has connected their Slack account, you can select which conversations to upload via the `/integrations/slack/conversations` and `/integrations/slack/sync` endpoints. You can pass the following configuration for this integration:

- `skipEmbeddingGeneration`: Toggle to skip embedding generation. Defaults to `false`.

`RSS_FEED`: This integration lets you upload content from an RSS or Atom feed. You can pass the following configuration for this integration:

- `chunkSize`: Number of tokens per chunk. Defaults to 1500.
- `overlapSize`: Size of the overlap in tokens. Defaults to 20.
- `skipEmbeddingGeneration`: Toggle to skip embedding generation. Defaults to `false`.
- `embeddingModel`: Specifies the embedding model used. You can find the model options [here](learn/models/models).
- `generateSparseVectors`: Toggle to `true` to generate sparse vectors for hybrid search. Default is `false`.
- `prependFilenameToChunks`: Adds the file title to each chunk for the integration. Defaults to `false`.

`S3`: This integration lets you upload files from your S3 buckets. You can pass the following configuration for this integration:

- `chunkSize`: Number of tokens per chunk. Defaults to 1500.
- `overlapSize`: Size of the overlap in tokens. Defaults to 20.
- `skipEmbeddingGeneration`: Toggle to skip embedding generation. Defaults to `false`.
- `embeddingModel`: Specifies the embedding model used. You can find the model options [here](learn/models/models).
- `generateSparseVectors`: Toggle to `true` to generate sparse vectors for hybrid search. Default is `false`.
- `prependFilenameToChunks`: Adds the file title to each chunk for the integration. Defaults to `false`.
- `syncFilesOnConnection`: Auto-sync all files from a user’s connected account.
- `showFilesTab`: Shows the synced files tab in Carbon Connect for this specific integration. Defaults to `true`. (Carbon Connect 2.0 only)
- `syncSourceItems`: Controls whether items from the file directory are synced by default. It defaults to `true`.
- `useCarbonFilePicker`: Controls whether Carbon Connect defaults to Carbon’s file picker instead of the source’s file picker. Defaults to `false`. (Carbon Connect 3.0 only)
- `incrementalSync`: When set to `true`, only new or updated files since the last sync are re-synced. Defaults to `false`.
- `enableDigitalOcean`: Specifies whether files from Digital Ocean Spaces can be synced. The default value is `false`.
- `filesTabColumns`: Specifies which columns are displayed in the file list view and accepts an array of strings which can have values `"name"`, `"status"`, `"created_at"`, `"external_url"`.

`AZURE_BLOB_STORAGE`: This integration lets you upload files from your Azure Blob Storage. You can pass the following configuration for this integration:

- `chunkSize`: Number of tokens per chunk. Defaults to 1500.
- `overlapSize`: Size of the overlap in tokens. Defaults to 20.
- `skipEmbeddingGeneration`: Toggle to skip embedding generation. Defaults to `false`.
- `embeddingModel`: Specifies the embedding model used. You can find the model options [here](learn/models/models).
- `generateSparseVectors`: Toggle to `true` to generate sparse vectors for hybrid search. Default is `false`.
- `prependFilenameToChunks`: Adds the file title to each chunk for the integration. Defaults to `false`.
- `syncFilesOnConnection`: Auto-sync all files from a user’s connected account.
- `showFilesTab`: Shows the synced files tab in Carbon Connect for this specific integration. Defaults to `true`. (Carbon Connect 2.0 only)
- `syncSourceItems`: Controls whether items from the file directory are synced by default. It defaults to `true`.
- `useCarbonFilePicker`: Controls whether Carbon Connect defaults to Carbon’s file picker instead of the source’s file picker. Defaults to `false`. (Carbon Connect 3.0 only)
- `incrementalSync`: When set to `true`, only new or updated files since the last sync are re-synced. Defaults to `false`.
- `enableDigitalOcean`: Specifies whether files from Digital Ocean Spaces can be synced. The default value is `false`.
- `filesTabColumns`: Specifies which columns are displayed in the file list view and accepts an array of strings which can have values `"name"`, `"status"`, `"created_at"`, `"external_url"`.

`GCS`: This integration lets you upload files from your Google Cloud Storage (GCS). You can pass the following configuration for this integration:

- `chunkSize`: Number of tokens per chunk. Defaults to 1500.
- `overlapSize`: Size of the overlap in tokens. Defaults to 20.
- `skipEmbeddingGeneration`: Toggle to skip embedding generation. Defaults to `false`.
- `embeddingModel`: Specifies the embedding model used. You can find the model options [here](learn/models/models).
- `generateSparseVectors`: Toggle to `true` to generate sparse vectors for hybrid search. Default is `false`.
- `prependFilenameToChunks`: Adds the file title to each chunk for the integration. Defaults to `false`.
- `syncFilesOnConnection`: Auto-sync all files from a user’s connected account.
- `showFilesTab`: Shows the synced files tab in Carbon Connect for this specific integration. Defaults to `true`. (Carbon Connect 2.0 only)
- `syncSourceItems`: Controls whether items from the file directory are synced by default. It defaults to `true`.
- `useCarbonFilePicker`: Controls whether Carbon Connect defaults to Carbon’s file picker instead of the source’s file picker. Defaults to `false`. (Carbon Connect 3.0 only)
- `incrementalSync`: When set to `true`, only new or updated files since the last sync are re-synced. Defaults to `false`.
- `filesTabColumns`: Specifies which columns are displayed in the file list view and accepts an array of strings which can have values `"name"`, `"status"`, `"created_at"`, `"external_url"`.
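
To illustrate how the per-integration options above fit together, here is a minimal sketch that builds a configuration list for two integrations and fills in the documented chunking defaults. The array-of-objects shape, the `id` key, and the names `IntegrationConfig`, `DOCUMENTED_DEFAULTS`, and `withChunkDefaults` are assumptions made for this example; only the option names and default values come from the lists above.

```typescript
// Hypothetical shape: one object per integration, identified by an assumed `id` key.
type FilesTabColumn = "name" | "status" | "created_at" | "external_url";

interface IntegrationConfig {
  id: string; // assumed key holding the integration name, e.g. "CONFLUENCE"
  chunkSize?: number;
  overlapSize?: number;
  skipEmbeddingGeneration?: boolean;
  generateSparseVectors?: boolean;
  prependFilenameToChunks?: boolean;
  syncFilesOnConnection?: boolean;
  incrementalSync?: boolean;
  filesTabColumns?: FilesTabColumn[];
}

// Chunking defaults as documented in the option lists above.
const DOCUMENTED_DEFAULTS = { chunkSize: 1500, overlapSize: 20 };

// Fill in the documented chunking defaults for any option the caller omitted.
function withChunkDefaults(config: IntegrationConfig): IntegrationConfig {
  return { ...DOCUMENTED_DEFAULTS, ...config };
}

const requested: IntegrationConfig[] = [
  {
    id: "CONFLUENCE",
    chunkSize: 1000,
    incrementalSync: true,
    filesTabColumns: ["name", "status"],
  },
  { id: "S3", generateSparseVectors: true },
];

const enabledIntegrations = requested.map(withChunkDefaults);
```

Carbon Connect applies its own defaults, so the helper is only there to make the effective values visible in your own code, for example when logging the configuration.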

Callback Function Props

onSuccess

Responds to successful events: file upload, 3rd party account connection, file selection, and web scraping initiation.

Event Types

- `INITIATE`: This event type is triggered when a user enters the integration flow (either for auth or file selection).
- `ADD`: This event type is triggered when a user authenticates an account under an integration.
- `UPDATE`: This event type is triggered when a user adds or removes files for an integration.
- `CANCEL`: This event type is triggered when a user exits the integration flow without taking any action.

Callback Response

The data passed to the onSuccess callback prop will be:

LOCAL_FILES:

{
  status: 200,
  data: {
    "data_source_external_id": null, // This field is not applicable for local files
    "sync_status": null, // This field is not applicable for local files
    "files": <Array of objects corresponding to the files uploaded>, // Refer to the file object format below
  },
  action: 'UPDATE',
  event: 'UPDATE',
  integration: 'LOCAL_FILES',
}

WEB_SCRAPER:

{
  status: 200,
  data: {
    "data_source_external_id": null, // This field is not applicable for web scrapers
    "sync_status": null, // This field is not applicable for web scrapers
    "files": <Array of objects corresponding to the parent URLs submitted>, // Refer to the file object format below
  },
  action: 'UPDATE',
  event: 'UPDATE',
  integration: 'WEB_SCRAPER',
}

3rd Party Connectors:

{
  status: 200,
  data: {
    "data_source_external_id": <Unique ID for the data source>,
    "sync_status": <SYNC_STATUS>,
    "files_synced": <`true` or `false`>,
    "request_id": <Unique ID generated for the upload. Can be auto-generated if the `useRequestIds` prop is set to `true`.>
  } or null,
  action: <ACTION_TYPE>, // `ACTION_TYPE` can be one of the following: `INITIATE`, `ADD`, `UPDATE`, `CANCEL`
  event: <EVENT_TYPE>, // `EVENT_TYPE` can be one of the following: `INITIATE`, `ADD`, `UPDATE`, `CANCEL`
  integration: <INTEGRATION_NAME>, // `INTEGRATION_NAME` can be one of the following: `LOCAL_FILES`, `NOTION`, `WEB_SCRAPER`, `GOOGLE_DRIVE`, `INTERCOM`, `DROPBOX`, `ONEDRIVE`, `BOX`
}
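
A handler passed as `onSuccess` typically branches on the `event` field of the payloads shown above. The sketch below is one way to do that; the `SuccessPayload` type and the `describeSuccess` helper are hypothetical names introduced for this example, and the field types are inferred from the response shapes above rather than taken from a published type definition.

```typescript
type CarbonEvent = "INITIATE" | "ADD" | "UPDATE" | "CANCEL";

// Assumed typing of the payload, based on the response shapes shown above.
interface SuccessPayload {
  status: number;
  data: {
    data_source_external_id: string | null;
    sync_status: string | null;
    files?: unknown[]; // present for LOCAL_FILES and WEB_SCRAPER
    files_synced?: boolean; // present for 3rd party connectors
    request_id?: string;
  } | null;
  action: CarbonEvent;
  event: CarbonEvent;
  integration: string;
}

// Hypothetical helper: turn a payload into a short, loggable message.
function describeSuccess(payload: SuccessPayload): string {
  const source = payload.integration;
  switch (payload.event) {
    case "INITIATE":
      return `${source}: user entered the integration flow`;
    case "ADD": {
      const id = payload.data?.data_source_external_id ?? "unknown";
      return `${source}: account connected (data source ${id})`;
    }
    case "UPDATE": {
      const count = payload.data?.files?.length ?? 0;
      return `${source}: files updated (${count} in payload)`;
    }
    case "CANCEL":
      return `${source}: user left without taking action`;
  }
}
```

You could then pass something like `(payload) => console.log(describeSuccess(payload))` as the `onSuccess` prop. Note that `data` may be `null`, which is why the helper uses optional chaining throughout.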

Each object in the `files` array follows this format:

{
    "id": `Unique ID for the file, can be used for resyncing, deleting, updating tags etc.`,
    "source": `<integration_name>`, // One among `LOCAL_FILES`, `NOTION`, `WEB_SCRAPER`, `GOOGLE_DRIVE`, `INTERCOM`, `DROPBOX`, `ONEDRIVE`
    "organization_id": `<organization_id>`, // This is your unique organization id in Carbon
    "organization_supplied_user_id": `<organization_supplied_user_id>`, // This is the unique user id that you pass to Carbon Connect
    "organization_user_data_source_id": `<organization_user_data_source_id>`, // This is the unique user data source id that Carbon Connect creates for each user for each integration
    "external_file_id": `<external_file_id>`, // This is the unique file id in the 3rd party integration
    "external_url": `<external_url>`, // This is the unique url of the file in the 3rd party integration
    "sync_status": `<sync_status>`, // This is the sync status of the file. It can be one of the following: `READY`, `QUEUED_FOR_SYNCING`, `SYNCING`, `SYNC_ERROR`
    "last_sync": `<last_sync>`, // This is the timestamp of the last sync
    "tags": `<tags>`, // These are the tags passed in to Carbon Connect
    "file_statistics": `<file_statistics>`, // This is the file statistics object
    "file_metadata": `<file_metadata>`, // This is the file metadata object
    "chunk_size": `<chunk_size>`, // This is the chunk size used for the file
    "chunk_overlap": `<chunk_overlap>`, // This is the chunk overlap used for the file
    "name": `<name>`, // This is the name of the file
    "enable_auto_sync": `<enable_auto_sync>`, // This is the auto sync status of the file. This is a boolean flag
    "presigned_url": `<presigned_url>`, // This is the presigned url of the file
    "parsed_text_url": `<parsed_text_url>`, // This is the parsed text url of the file
    "skip_embedding_generation": `<skip_embedding_generation>`, // This is the skip embedding generation status of the file. This is a boolean flag
    "created_at": `<created_at>`, // This is the timestamp of the file creation
    "updated_at": `<updated_at>`, // This is the timestamp of the last update to the file
    "action": `<action>`, // This is the action type. It can be one of the following: `ADD`, `UPDATE`, `REMOVE`
}
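
Since each file object carries a `sync_status`, a common follow-up in an `onSuccess` handler is to bucket the returned files by status, for instance to surface the ones in `SYNC_ERROR`. The following is a small sketch of that idea; `CarbonFile` and `groupBySyncStatus` are hypothetical names, the interface lists only the fields this example reads, and the type of `id` is an assumption.

```typescript
type SyncStatus = "READY" | "QUEUED_FOR_SYNCING" | "SYNCING" | "SYNC_ERROR";

// Only the fields this sketch reads; the full file object has many more.
interface CarbonFile {
  id: string | number; // assumed type, the format above does not specify it
  name: string;
  sync_status: SyncStatus;
}

// Group files by their sync status so each bucket can be handled separately.
function groupBySyncStatus(files: CarbonFile[]): Record<SyncStatus, CarbonFile[]> {
  const groups: Record<SyncStatus, CarbonFile[]> = {
    READY: [],
    QUEUED_FOR_SYNCING: [],
    SYNCING: [],
    SYNC_ERROR: [],
  };
  for (const file of files) {
    groups[file.sync_status].push(file);
  }
  return groups;
}

const sampleFiles: CarbonFile[] = [
  { id: 1, name: "handbook.pdf", sync_status: "READY" },
  { id: 2, name: "notes.txt", sync_status: "SYNC_ERROR" },
  { id: 3, name: "roadmap.docx", sync_status: "READY" },
];

const grouped = groupBySyncStatus(sampleFiles);
const failedNames = grouped.SYNC_ERROR.map((file) => file.name);
```

Pre-populating every status key keeps lookups such as `grouped.SYNCING` safe even when no file is in that state.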

onError

Triggered during file upload errors.

Structure:

{
  status: 400,
  action: 'UPDATE',
  event: 'UPDATE',
  integration: `<INTEGRATION_NAME>`, // 'LOCAL_FILES' or 'WEB_SCRAPER',
  data: `<data_object>`, // This field will be present only if the error is related to a file or web scraper
}
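
As a rough sketch of how the error structure above might be consumed, the helper below formats a payload into a single message suitable for logging or a toast. `ErrorPayload` and `formatUploadError` are hypothetical names for this example, and treating `data` as optional follows the note that it is present only for file or web scraper errors.

```typescript
// Assumed typing of the onError payload, based on the structure shown above.
interface ErrorPayload {
  status: number;
  action: string;
  event: string;
  integration: string;
  data?: unknown; // present only for file or web scraper errors
}

// Hypothetical helper: build a one-line description of an upload error.
function formatUploadError(payload: ErrorPayload): string {
  const detail =
    payload.data !== undefined ? ` Details: ${JSON.stringify(payload.data)}` : "";
  return `[${payload.integration}] ${payload.event} failed with status ${payload.status}.${detail}`;
}
```

A handler such as `(payload) => console.error(formatUploadError(payload))` could then be passed as the `onError` prop.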
