Data Sources

Connect and configure the external content systems that feed knowledge into your digital twins.

Introduction

The Data Sources feature in the Viven Admin Portal is the integration hub that connects Viven's digital twin intelligence to your organization's content and communication systems. Each data source represents a distinct content pipeline — from document repositories (Attachments, Confluence, Box) to communication tools (Conversation) and code platforms (GitHub) — feeding knowledge into the AI model that powers your digital twins.

This guide covers the complete lifecycle of managing data sources: understanding the dashboard, enabling and disabling sources, configuring authentication credentials, tuning sync behavior, and customizing how the underlying AI model responds through the Bot Configuration panel.

:::info Audience
This guide is intended for technical administrators responsible for configuring integrations and maintaining data pipelines in the Viven Admin Portal.
:::

What Are Data Sources?

Data sources are the external systems whose content Viven ingests and indexes. Once enabled and configured, Viven continuously syncs content from these sources and makes it queryable through your digital twins. Each source has its own configuration page for authentication and sync settings.

The following source types are supported:

| Data Source | Description |
| --- | --- |
| Attachments | Files and email attachments uploaded directly to Viven or associated with inbound communications. |
| Box | Documents and files stored in your organization's Box cloud storage. |
| Clari | Revenue intelligence data from Clari, including deal and pipeline information. |
| Confluence | Wiki pages, spaces, and documentation from your Atlassian Confluence instance. |
| Conversation | Conversational data and transcripts from integrated messaging or call platforms. |
| Digital Twin Meeting | Meeting summaries and content generated by digital twin interactions. |
| File Upload | Files uploaded manually through the Admin Portal. |
| GitHub | Code repositories, pull requests, issues, and documentation from GitHub. |

The Data Sources dashboard provides a centralized view of all available integrations, their current status, document counts, and quick-action controls.

Figure 1 – Data Sources dashboard listing all configured integrations

Dashboard Overview

The dashboard is organized as a table with the following columns:

| Column | Description |
| --- | --- |
| Data Source | Name and icon of the integration. Click the column header to sort alphabetically. |
| Status | Shows whether the source is Enabled (green) or Not Enabled (orange). Click the column header to sort by status. |
| Documents | The total number of documents currently ingested from this source. Click the column header to sort by count. |
| Actions | Three action buttons per row: Configure (settings icon), Credentials (key icon), and Enable/Disable toggle. |

Global Ingestion Rules

The Global Ingestion Rules button at the top of the dashboard opens organization-wide rules that apply across all data sources. These rules allow you to define universal content filters, blocklists, or ingestion priorities before source-specific settings are applied.

:::note
Global Ingestion Rules take precedence over individual source configurations. Configure global rules before enabling individual sources to ensure consistent ingestion behavior.
:::
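
The precedence described above can be pictured with a small sketch. This is only an illustration of the evaluation order, not Viven's implementation: the `IngestionRules` class, its keyword matching, and the `should_ingest` function are hypothetical names introduced for the example.

```python
from dataclasses import dataclass, field


@dataclass
class IngestionRules:
    """Hypothetical rule set: a document is excluded if it contains a blocked keyword."""
    blocked_keywords: list[str] = field(default_factory=list)

    def blocks(self, text: str) -> bool:
        lowered = text.lower()
        return any(keyword.lower() in lowered for keyword in self.blocked_keywords)


def should_ingest(text: str, global_rules: IngestionRules, source_rules: IngestionRules) -> bool:
    """Check global rules first; source rules only see what the global rules let through."""
    if global_rules.blocks(text):
        return False  # a global rule wins regardless of the source configuration
    return not source_rules.blocks(text)


global_rules = IngestionRules(blocked_keywords=["confidential"])
confluence_rules = IngestionRules(blocked_keywords=["draft"])

print(should_ingest("Confidential roadmap", global_rules, confluence_rules))    # False
print(should_ingest("Draft onboarding guide", global_rules, confluence_rules))  # False
print(should_ingest("Onboarding guide", global_rules, confluence_rules))        # True
```

In this model a source-level setting can narrow what is ingested further, but it cannot re-admit content that a global rule has already excluded.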

Enabling and Disabling a Data Source

Each data source row has an Enable or Disable button on the right. Disabling a source halts all sync activity and prevents the source's documents from being queried, but does not delete any ingested data.

  1. Locate the data source in the dashboard table.
  2. Click Disable to stop sync and remove the source from active queries, or Enable to activate it.
  3. The Status badge updates immediately to reflect the change.

:::warning
Disabling a source that feeds critical data to active digital twins may degrade response quality until the source is re-enabled and re-indexed.
:::

Configuring a Data Source

Each data source has a dedicated Configuration page accessible via the Configure button (gear icon) in the dashboard. The configuration page is divided into three sections: Basic Settings, Sync Settings, and Advanced Settings.

Figure 2 – Attachments Configuration page showing Basic, Sync, and Advanced Settings

Basic Settings

Basic Settings control the fundamental activation state and domain scope of the data source.

| Setting | Description |
| --- | --- |
| Data source is active | Toggle to enable or disable this data source. When off, no sync occurs and no documents are queried. |
| Domain | The domain associated with this data source (e.g., company.com). Used to scope the integration to your organization. |

Sync Settings

Sync Settings determine how and when data is pulled from the source system into Viven.

| Setting | Description |
| --- | --- |
| Continuous sync is enabled | When enabled, Viven monitors the source for new and updated content in near real-time. |
| Use service account for sync | Routes all sync activity through a designated service account rather than individual user credentials. Recommended for production environments. |
| Timezone Name | The timezone of the source system's server (e.g., America/New_York). Used to interpret timestamps accurately during sync. |
| User sync | Toggles syncing of user-level data (individual user content and permissions) from this source. |
| Organization sync | Toggles syncing of organization-level data (shared spaces, team content) from this source. |
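
To see why the Timezone Name matters, consider a source that reports timestamps without an offset. The sketch below uses Python's standard `zoneinfo` module to show how the same wall-clock value maps to different instants depending on the configured zone. The `to_utc` helper is a hypothetical name for illustration, and nothing here describes how Viven performs the conversion internally.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # needs the system tz database or the tzdata package


def to_utc(naive_timestamp: str, timezone_name: str) -> datetime:
    """Interpret an offset-less ISO timestamp in the given zone, then convert to UTC."""
    local = datetime.fromisoformat(naive_timestamp).replace(tzinfo=ZoneInfo(timezone_name))
    return local.astimezone(timezone.utc)


# The same wall-clock time resolves to two different instants.
print(to_utc("2024-01-15T09:30:00", "America/New_York").isoformat())
# 2024-01-15T14:30:00+00:00
print(to_utc("2024-01-15T09:30:00", "Europe/Berlin").isoformat())
# 2024-01-15T08:30:00+00:00
```

A wrong Timezone Name would therefore shift every offset-less timestamp by the difference between the two zones, six hours in this example.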

Advanced Settings

Advanced Settings provide fine-grained control over what is included in or excluded from ingestion.

| Setting | Description |
| --- | --- |
| Entities to Sync | A comma-separated list of entity types to include in the sync (e.g., pages, comments, attachments). Leave empty to sync all supported entities. |
| Blocklist Patterns | Comma-separated patterns to exclude from ingestion. Supports sender addresses, subjects, channel names, and keywords (e.g., from:noreply@service.com, subject:automated, channel). |
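
As a rough mental model of the Blocklist Patterns field, the sketch below splits a comma-separated string into prefixed and unprefixed patterns and tests a message against them. The matching semantics (case-insensitive substring match, unprefixed entries treated as keywords) are assumptions made for the example, as are the names `BlockPattern`, `parse_blocklist`, and `is_blocked`. Viven's actual matching rules may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockPattern:
    scope: str  # "from", "subject", or "keyword" for unprefixed entries
    value: str


def parse_blocklist(raw: str) -> list[BlockPattern]:
    """Split a comma-separated pattern string into scoped patterns."""
    patterns = []
    for entry in raw.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate trailing commas and blank entries
        prefix, sep, rest = entry.partition(":")
        if sep and prefix.lower() in {"from", "subject"} and rest.strip():
            patterns.append(BlockPattern(prefix.lower(), rest.strip().lower()))
        else:
            patterns.append(BlockPattern("keyword", entry.lower()))
    return patterns


def is_blocked(message: dict[str, str], patterns: list[BlockPattern]) -> bool:
    """Return True when any pattern matches (assumed: case-insensitive substring)."""
    for pattern in patterns:
        if pattern.scope == "keyword":
            haystack = " ".join(message.values()).lower()
        else:
            haystack = message.get(pattern.scope, "").lower()
        if pattern.value in haystack:
            return True
    return False


patterns = parse_blocklist("from:noreply@service.com, subject:automated")
print(is_blocked({"from": "noreply@service.com", "subject": "Weekly digest"}, patterns))  # True
print(is_blocked({"from": "alice@company.com", "subject": "Q3 planning"}, patterns))      # False
```

Testing a pattern list against a few representative messages like this, before saving, can help catch patterns that are broader than intended.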

Saving Configuration

  1. Make all desired changes across Basic, Sync, and Advanced Settings sections.
  2. Click Save Configuration in the bottom-right corner to persist changes.
  3. A confirmation indicator appears once the changes are saved.

:::note
Configuration changes take effect on the next sync cycle. If continuous sync is enabled, this typically occurs within minutes. Otherwise, changes apply at the next scheduled sync interval.
:::

Configuring Credentials

Certain data sources (such as Confluence, Box, and GitHub) require OAuth token credentials to authenticate with the external system. The Credentials page is accessible via the Credentials button (key icon) in the dashboard.

Figure 3 – Confluence Token Configuration showing OAuth credential fields and Revision History

OAuth Token Configuration Fields

The required fields vary by data source. The Confluence example shown above requires the following:

| Field | Description |
| --- | --- |
| Connected Email | Read-only display of the email address currently associated with the active token. |
| Email | The email address of the account used to generate the API token. Must match the Confluence account. |
| API Token | The API token generated from the Atlassian account settings. Stored securely and masked after saving. |
| Domain | Your Confluence domain in the format yourcompany.atlassian.net. |

Fields marked with an asterisk (*) are required.
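
If you want to sanity-check an email and API token pair outside of Viven before saving it, Atlassian Cloud's REST API accepts the two values as HTTP Basic credentials. The sketch below only builds the header and makes no network call. The function name and the sample values are placeholders, and this says nothing about how Viven itself stores or sends the credentials.

```python
import base64


def confluence_auth_header(email: str, api_token: str) -> dict[str, str]:
    """Build an HTTP Basic Authorization header from an email and API token."""
    raw = f"{email}:{api_token}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}


# Placeholder values for illustration only; never hard-code a real token.
header = confluence_auth_header("admin@company.com", "example-token")

# Decoding the value again shows what the server will see.
encoded = header["Authorization"].removeprefix("Basic ")
print(base64.b64decode(encoded).decode("utf-8"))  # admin@company.com:example-token
```

Because the header is only an encoding of `email:token`, a mismatch between the Email field and the account that generated the token produces an authentication failure even when the token itself is valid.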

Saving Token Credentials

  1. Navigate to the data source in the dashboard and click Credentials (key icon).
  2. Click View detailed instructions within the page to review the source-specific setup guide (opens in a new tab).
  3. Enter the required credential fields (Email, API Token, Domain).
  4. Click Save Token Configuration to securely store the credentials.
  5. To clear the credentials and start over, click Reset.

Revision History

The Revision History section at the bottom of the Credentials page logs all previous token configurations, including timestamps and the associated account email. This provides an audit trail for credential changes and helps diagnose authentication failures by identifying when credentials were last rotated.

Security Note

API tokens should be rotated regularly in accordance with your organization's security policy. Use the Revision History to track when tokens were last updated.

Bot Configuration

The Bot Configuration page controls how the AI model behind your digital twins interacts with users. It is accessible from the Data Sources section via the breadcrumb Admin Console / Data Sources / Bot Configuration.

Figure 4 – Bot Configuration showing Service Account, Special Instructions, and Disclaimer Text

Service Account

The Service Account section defines the admin-level identity used to run integrations and sync operations across all data sources. This is a shared operational account, not an end-user account.

| Field | Description |
| --- | --- |
| Service Account User ID | The email address of the service account used for running integrations across multiple data sources (e.g., admin@yourdomain.ai). This account must have appropriate read permissions in all connected source systems. |

Best Practice

Use a dedicated service account (not a personal account) for the Service Account User ID. This ensures that integrations remain active even when individual users' credentials change or accounts are deactivated.

Special Instructions

Special Instructions are pre-defined tags that provide the AI model with contextual information about how to handle domain-specific terminology and queries. Each tag maps a short abbreviation or alias to its full meaning, helping the model resolve ambiguous terms accurately.

How Special Instructions Work

Tags appear as dismissible chips in the Special Instructions field. When a user's query contains a term that matches a tag's key, the model uses the associated definition as context. For example, a tag mapping "TDS" to "Twin digital stack" ensures that queries referencing TDS are correctly understood.

Adding a Special Instruction

  1. Click inside the Special Instructions text area at the bottom of the section.
  2. Type one instruction per line in the format: ABBREVIATION means FULL DEFINITION.
  3. Press Enter to add each instruction as a tag chip.
  4. To remove a tag, click the × button on the chip.

:::tip
Keep instructions concise and unambiguous. Use the format "X means Y" consistently so the model can reliably parse the context. Avoid duplicating abbreviations with different definitions.
:::
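
Before pasting a long list of instructions into the field, it can help to check the list for formatting slips and conflicting definitions. The following is a minimal sketch of such a check, assuming the "X means Y" format described above. The `parse_instructions` function is a hypothetical helper, not part of Viven, and treating abbreviations as case-insensitive is an assumption.

```python
def parse_instructions(lines: list[str]) -> tuple[dict[str, str], list[str]]:
    """Parse 'X means Y' lines and return (mapping, problems)."""
    mapping: dict[str, str] = {}
    problems: list[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        key, sep, definition = line.partition(" means ")
        key, definition = key.strip(), definition.strip()
        if not sep or not key or not definition:
            problems.append(f"not in 'X means Y' form: {line!r}")
            continue
        existing = mapping.get(key.upper())
        if existing is not None and existing != definition:
            problems.append(
                f"{key!r} has conflicting definitions: {existing!r} vs {definition!r}"
            )
            continue  # keep the first definition and report the conflict
        mapping[key.upper()] = definition
    return mapping, problems


mapping, problems = parse_instructions([
    "TDS means Twin digital stack",
    "TDS means Tax deducted at source",
    "ARR is annual recurring revenue",
])
print(mapping)  # {'TDS': 'Twin digital stack'}
for problem in problems:
    print(problem)  # one conflict for TDS, one format error for the ARR line
```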

Disclaimer Text

The Disclaimer Text section allows administrators to configure an HTML disclaimer that is displayed to users below the query input field in the digital twin interface. This is typically used to communicate usage policies, data handling notices, or accuracy limitations.

| Field | Description |
| --- | --- |
| Disclaimer Text (input) | A rich text area where you enter the HTML content of the disclaimer. Supports standard HTML tags such as <br>, <strong>, and <a href>. |
| Disclaimer Text (preview) | A read-only preview field showing how the HTML disclaimer will render for end users. |

Editing the Disclaimer

  1. Click inside the Disclaimer Text input area.
  2. Enter or paste HTML content. Use <br /> for line breaks and <strong> for bold text.
  3. Review the rendered output in the preview field below.
  4. The disclaimer is not saved on its own; it is stored when you save the Bot Configuration page.

Example
My digital twin may make mistakes; check important info.<br />
Zero tolerance for abuse; it's not worth your job.<br />
Details are in the acceptable use policy.
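
When drafting a disclaimer outside the portal, a quick check for unexpected tags can catch pasted markup before you rely on the preview. The sketch below uses Python's standard `html.parser` module. The allowed set contains only the tags this guide mentions; treating that set as the complete list is an assumption, since the portal may accept more or fewer tags.

```python
from html.parser import HTMLParser

# Assumption: only the tags named in this guide are treated as expected.
ALLOWED_TAGS = {"br", "strong", "a"}


class TagCollector(HTMLParser):
    """Collect the names of opening tags that are not in ALLOWED_TAGS."""

    def __init__(self) -> None:
        super().__init__()
        self.unexpected: list[str] = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <br /> are also routed through this method.
        if tag not in ALLOWED_TAGS:
            self.unexpected.append(tag)


def unexpected_tags(html: str) -> list[str]:
    collector = TagCollector()
    collector.feed(html)
    collector.close()
    return collector.unexpected


disclaimer = (
    "My digital twin may make mistakes; check important info.<br />"
    "Details are in the <a href='#'>acceptable use policy</a>."
)
print(unexpected_tags(disclaimer))                            # []
print(unexpected_tags("<script>alert(1)</script><strong>ok</strong>"))  # ['script']
```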

Frequently Asked Questions

How long does the initial sync take?

The initial sync duration depends on the volume of content in the source system. Large repositories (e.g., a Confluence instance with thousands of pages) may take several hours. Subsequent incremental syncs are significantly faster. Monitor the Documents count on the dashboard to track ingestion progress.

Can I connect multiple instances of the same data source type?

Currently, each data source type has a single configuration entry in the dashboard. If you need to ingest content from multiple instances of the same platform (e.g., two Confluence instances), contact your Viven account team to discuss multi-instance configuration options.

What happens to ingested data when I disable a data source?

Disabling a data source stops new content from being ingested and removes the source's documents from active queries, but it does not delete previously indexed documents. The indexed data is retained and becomes queryable again once you re-enable the source. Re-enabling triggers an incremental sync from the point of the last successful sync.

Why is the Credentials button inactive for some data sources?

Some data sources (such as Attachments and File Upload) do not require OAuth credentials because they operate within the Viven platform directly. The Credentials button is only active for sources that authenticate with external systems via token-based auth.

My Confluence sync is failing. What should I check?

  • Verify that the API token entered in the Credentials page is valid and has not expired.
  • Confirm that the Connected Email matches the account that owns the token.
  • Ensure the service account has read access to the Confluence spaces you intend to sync.
  • Check that the domain field is in the correct format (e.g., yourcompany.atlassian.net with no trailing slash).
  • Review the Revision History to confirm credentials were saved successfully.
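
The domain format check in the list above is easy to automate when you manage several environments. The sketch below flags the most common slips (a scheme, a trailing slash, stray whitespace). The regular expression assumes a single subdomain label followed by `.atlassian.net`, which matches the format given in this guide but may be stricter than what the portal accepts. The `check_domain` function is a hypothetical helper.

```python
import re

# Assumed shape: one subdomain label followed by .atlassian.net
_DOMAIN_RE = re.compile(r"^[a-z0-9][a-z0-9-]*\.atlassian\.net$", re.IGNORECASE)


def check_domain(value: str) -> list[str]:
    """Return a list of problems; an empty list means the value looks fine."""
    problems = []
    if value != value.strip():
        problems.append("remove leading or trailing whitespace")
    candidate = value.strip()
    if "://" in candidate:
        problems.append("remove the scheme (for example https://)")
        candidate = candidate.split("://", 1)[1]
    if candidate.endswith("/"):
        problems.append("remove the trailing slash")
        candidate = candidate.rstrip("/")
    if not _DOMAIN_RE.match(candidate):
        problems.append("expected the form yourcompany.atlassian.net")
    return problems


print(check_domain("yourcompany.atlassian.net"))           # []
print(check_domain("https://yourcompany.atlassian.net/"))  # scheme and trailing slash reported
```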

Can Special Instructions change how the model behaves?

Special Instructions provide additional context to the model for terminology resolution, but they do not override the model's core reasoning or safety behaviors. They are additive hints, not directives. For deeper model customization, contact your Viven account team about advanced model configuration options.

How do I exclude specific content from ingestion?

Navigate to the Configure page for the relevant data source, scroll to Advanced Settings, and enter your blocklist patterns in the Blocklist Patterns field. Patterns support sender addresses, subject line text, channel names, and keywords, separated by commas. Changes take effect on the next sync cycle after saving.

Are credentials and service account details stored securely?

Yes. Credentials and service account information entered in the Admin Portal are encrypted at rest and in transit. The Service Account User ID is used only for internal integration routing and is never exposed to end users interacting with your digital twins.