Data Sources
Connect and configure the external content systems that feed knowledge into your digital twins.
Introduction
The Data Sources feature in the Viven Admin Portal is the integration hub that connects Viven's digital twin intelligence to your organization's content and communication systems. Each data source represents a distinct content pipeline — from document repositories (Attachments, Confluence, Box) to communication tools (Conversation) and code platforms (GitHub) — feeding knowledge into the AI model that powers your digital twins.
This guide covers the complete lifecycle of managing data sources: understanding the dashboard, enabling and disabling sources, configuring authentication credentials, tuning sync behavior, and customizing how the underlying AI model responds through the Bot Configuration panel.
:::info Audience
This guide is intended for technical administrators responsible for configuring integrations and maintaining data pipelines in the Viven Admin Portal.
:::
What Are Data Sources?
Data sources are the external systems whose content Viven ingests and indexes. Once enabled and configured, Viven continuously syncs content from these sources and makes it queryable through your digital twins. Each source has its own configuration page for authentication and sync settings.
The following source types are supported:
| Data Source | Description |
|---|---|
| Attachments | Files and email attachments uploaded directly to Viven or associated with inbound communications. |
| Box | Documents and files stored in your organization's Box cloud storage. |
| Clari | Revenue intelligence data from Clari, including deal and pipeline information. |
| Confluence | Wiki pages, spaces, and documentation from your Atlassian Confluence instance. |
| Conversation | Conversational data and transcripts from integrated messaging or call platforms. |
| Digital Twin Meeting | Meeting summaries and content generated by digital twin interactions. |
| File Upload | Manually uploaded files directly through the Admin Portal. |
| GitHub | Code repositories, pull requests, issues, and documentation from GitHub. |
Navigating the Data Sources Dashboard
The Data Sources dashboard provides a centralized view of all available integrations, their current status, document counts, and quick-action controls.
Figure 1 – Data Sources dashboard listing all configured integrations
Dashboard Overview
The dashboard is organized as a table with the following columns:
| Column | Description |
|---|---|
| Data Source | Name and icon of the integration. Click the column header to sort alphabetically. |
| Status | Shows whether the source is Enabled (green) or Not Enabled (orange). Click the column header to sort by status. |
| Documents | The total number of documents currently ingested from this source. Click the column header to sort by count. |
| Actions | Three action buttons per row: Configure (gear icon), Credentials (key icon), and an Enable/Disable toggle. |
Global Ingestion Rules
The Global Ingestion Rules button at the top of the dashboard opens organization-wide rules that apply across all data sources. These rules allow you to define universal content filters, blocklists, or ingestion priorities before source-specific settings are applied.
Global Ingestion Rules take precedence over individual source configurations. Configure global rules before enabling individual sources to ensure consistent ingestion behavior.
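One way to picture this precedence is as a two-stage filter: content rejected by a global rule is never ingested, whatever the source's own settings say. The Python sketch below is illustrative only. `IngestionRules`, `should_ingest`, and the keyword-based rule shape are hypothetical names chosen for this example and do not describe Viven's internal implementation.

```python
from dataclasses import dataclass, field


@dataclass
class IngestionRules:
    """Hypothetical rule set: a list of blocked keywords (case-insensitive)."""
    blocked_keywords: list[str] = field(default_factory=list)

    def blocks(self, text: str) -> bool:
        lowered = text.lower()
        return any(keyword.lower() in lowered for keyword in self.blocked_keywords)


def should_ingest(text: str, global_rules: IngestionRules, source_rules: IngestionRules) -> bool:
    # Global rules are checked first and cannot be overridden by a source.
    if global_rules.blocks(text):
        return False
    # Source-specific rules only narrow what the global rules already allow.
    return not source_rules.blocks(text)


global_rules = IngestionRules(blocked_keywords=["confidential"])
confluence_rules = IngestionRules(blocked_keywords=["draft"])

print(should_ingest("Q3 roadmap", global_rules, confluence_rules))
print(should_ingest("Confidential: salary bands", global_rules, confluence_rules))
print(should_ingest("Draft onboarding guide", global_rules, confluence_rules))
```

In this model a source can exclude more content than the global rules do, but it can never re-admit something the global rules have blocked.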
Enabling and Disabling a Data Source
Each data source row has an Enable or Disable button on the right. Disabling a source halts all sync activity and prevents the source's documents from being queried, but does not delete any ingested data.
- Locate the data source in the dashboard table.
- Click Disable to stop sync and remove the source from active queries, or Enable to activate it.
- The Status badge updates immediately to reflect the change.
Disabling a source that feeds critical data to active digital twins may degrade response quality until the source is re-enabled and re-indexed.
Configuring a Data Source
Each data source has a dedicated Configuration page accessible via the Configure button (gear icon) in the dashboard. The configuration page is divided into three sections: Basic Settings, Sync Settings, and Advanced Settings.
Figure 2 – Attachments Configuration page showing Basic, Sync, and Advanced Settings
Basic Settings
Basic Settings control the fundamental activation state and domain scope of the data source.
| Setting | Description |
|---|---|
| Data source is active | Toggle to enable or disable this data source. When off, no sync occurs and no documents are queried. |
| Domain | The domain associated with this data source (e.g., company.com). Used to scope the integration to your organization. |
Sync Settings
Sync Settings determine how and when data is pulled from the source system into Viven.
| Setting | Description |
|---|---|
| Continuous sync is enabled | When enabled, Viven monitors the source for new and updated content in near real-time. |
| Use service account for sync | Routes all sync activity through a designated service account rather than individual user credentials. Recommended for production environments. |
| Timezone Name | The timezone of the source system's server (e.g., America/New_York). Used to interpret timestamps accurately during sync. |
| User sync | Toggles syncing of user-level data (individual user content and permissions) from this source. |
| Organization sync | Toggles syncing of organization-level data (shared spaces, team content) from this source. |
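The Timezone Name example above (America/New_York) follows the IANA time zone naming scheme. A quick local check before saving can catch typos. The sketch below uses Python's standard `zoneinfo` module and assumes, without confirmation from this guide, that Viven accepts the same IANA identifiers. The helper name `is_valid_timezone` is illustrative.

```python
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError


def is_valid_timezone(name: str) -> bool:
    """Return True if `name` resolves to a time zone in the local tz database."""
    try:
        ZoneInfo(name)
    except (ZoneInfoNotFoundError, ValueError):
        # Unknown key, or a key that is not a valid relative path.
        return False
    return True


for candidate in ("America/New_York", "America/NewYork"):
    print(candidate, is_valid_timezone(candidate))
```

The result depends on the time zone database available to Python on the machine running the check (on Windows this usually means installing the `tzdata` package).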
Advanced Settings
Advanced Settings provide fine-grained control over what is included in or excluded from ingestion.
| Setting | Description |
|---|---|
| Entities to Sync | A comma-separated list of entity types to include in the sync (e.g., pages, comments, attachments). Leave empty to sync all supported entities. |
| Blocklist Patterns | Comma-separated patterns to exclude from ingestion. Supports sender addresses, subjects, channel names, and keywords (e.g., from:noreply@service.com, subject:automated, channel). |
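To make the Blocklist Patterns format concrete, the following Python sketch parses a comma-separated pattern string and tests a message against it. It is a rough model under stated assumptions: the `from:` and `subject:` prefixes come from the example above, anything without a recognized prefix is treated as a keyword, and the matching rules (exact sender match, substring match elsewhere) are guesses rather than documented Viven behavior. `Message`, `parse_patterns`, and `is_blocked` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Message:
    """Hypothetical shape of an item being considered for ingestion."""
    sender: str
    subject: str
    body: str


def parse_patterns(raw: str) -> list[tuple[str, str]]:
    """Split a comma-separated blocklist into (kind, value) pairs."""
    patterns = []
    for item in raw.split(","):
        item = item.strip()
        if not item:
            continue
        kind, sep, value = item.partition(":")
        if sep and kind in ("from", "subject") and value.strip():
            patterns.append((kind, value.strip().lower()))
        else:
            # No recognized prefix: treat the whole item as a keyword.
            patterns.append(("keyword", item.lower()))
    return patterns


def is_blocked(message: Message, patterns: list[tuple[str, str]]) -> bool:
    """Return True if any pattern matches the message (assumed semantics)."""
    for kind, value in patterns:
        if kind == "from" and message.sender.lower() == value:
            return True
        if kind == "subject" and value in message.subject.lower():
            return True
        if kind == "keyword" and value in (message.subject + " " + message.body).lower():
            return True
    return False


patterns = parse_patterns("from:noreply@service.com, subject:automated")
msg = Message(sender="noreply@service.com", subject="Weekly digest", body="...")
print(is_blocked(msg, patterns))
```

Testing patterns this way against a few sample messages before saving can reveal overly broad keywords that would exclude more content than intended.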
Saving Configuration
- Make all desired changes across Basic, Sync, and Advanced Settings sections.
- Click Save Configuration in the bottom-right corner to persist changes.
- A confirmation indicator appears once the configuration has been saved.
Configuration changes take effect on the next sync cycle. If continuous sync is enabled, this typically occurs within minutes. Otherwise, changes apply at the next scheduled sync interval.
Configuring Credentials
Certain data sources (such as Confluence, Box, and GitHub) require OAuth token credentials to authenticate with the external system. The Credentials page is accessible via the Credentials button (key icon) in the dashboard.
Figure 3 – Confluence Token Configuration showing OAuth credential fields and Revision History
OAuth Token Configuration Fields
The required fields vary by data source. The Confluence example shown above requires the following:
| Field | Description |
|---|---|
| Connected Email | Read-only display of the email address currently associated with the active token. |
| Email * | The email address of the account used to generate the API token. Must match the Confluence account. |
| API Token * | The API token generated from the Atlassian account settings. Stored securely and masked after saving. |
| Domain * | Your Confluence domain in the format yourcompany.atlassian.net. |
Fields marked with an asterisk (*) are required.
Saving Token Credentials
- Navigate to the data source in the dashboard and click Credentials (key icon).
- Click View detailed instructions within the page to review the source-specific setup guide (opens in a new tab).
- Enter the required credential fields (Email, API Token, Domain).
- Click Save Token Configuration to securely store the credentials.
- To clear the credentials and start over, click Reset.
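Before entering the values, it can help to sanity-check them locally. The Python sketch below normalizes a Domain value to the bare-host form this guide describes (yourcompany.atlassian.net) and builds the HTTP Basic header that Atlassian Cloud API tokens are normally used with. The function names and the domain regular expression are illustrative assumptions, and this guide does not state how Viven itself transmits the credentials.

```python
import base64
import re

# Assumed format: a single subdomain label followed by .atlassian.net
DOMAIN_RE = re.compile(r"^[a-z0-9][a-z0-9-]*\.atlassian\.net$")


def normalize_domain(raw: str) -> str:
    """Strip scheme, path, and trailing slash; raise ValueError if not a bare host."""
    domain = raw.strip().lower()
    domain = re.sub(r"^https?://", "", domain)
    domain = domain.split("/", 1)[0]
    if not DOMAIN_RE.match(domain):
        raise ValueError(f"unexpected Confluence domain: {raw!r}")
    return domain


def basic_auth_header(email: str, api_token: str) -> str:
    """Build the value of an HTTP Basic Authorization header."""
    credentials = f"{email}:{api_token}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


print(normalize_domain("https://yourcompany.atlassian.net/"))
```

A header built with `basic_auth_header` can be used with any HTTP client to confirm that the email and token pair is accepted by Confluence before it is saved in the Admin Portal.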
Revision History
The Revision History section at the bottom of the Credentials page logs all previous token configurations, including timestamps and the associated account email. This provides an audit trail for credential changes and helps diagnose authentication failures by identifying when credentials were last rotated.
API tokens should be rotated regularly in accordance with your organization's security policy. Use the Revision History to track when tokens were last updated.
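As a small illustration of a rotation check, the sketch below compares the most recent timestamp from the Revision History against a maximum token age. The 90-day policy and the `needs_rotation` helper are hypothetical. The timestamp is assumed to be copied from the page by hand, since this guide does not describe an export or API for Revision History.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def needs_rotation(last_saved: datetime, max_age: timedelta, now: Optional[datetime] = None) -> bool:
    """Return True if the token was saved longer ago than the policy allows."""
    now = now or datetime.now(timezone.utc)
    return now - last_saved > max_age


# Hypothetical values: last token save and a 90-day rotation policy.
last_saved = datetime(2024, 1, 1, tzinfo=timezone.utc)
policy = timedelta(days=90)

print(needs_rotation(last_saved, policy, now=datetime(2024, 4, 15, tzinfo=timezone.utc)))
```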
Bot Configuration
The Bot Configuration page controls how the AI model behind your digital twins interacts with users. It is accessible from the Data Sources section via the breadcrumb Admin Console / Data Sources / Bot Configuration.
Figure 4 – Bot Configuration showing Service Account, Special Instructions, and Disclaimer Text
Service Account
The Service Account section defines the admin-level identity used to run integrations and sync operations across all data sources. This is a shared operational account, not an end-user account.
| Field | Description |
|---|---|
| Service Account User ID | The email address of the service account used for running integrations across multiple data sources (e.g., admin@yourdomain.ai). This account must have appropriate read permissions in all connected source systems. |
Use a dedicated service account (not a personal account) for the Service Account User ID. This ensures that integrations remain active even when individual users' credentials change or accounts are deactivated.
Special Instructions
Special Instructions are pre-defined tags that provide the AI model with contextual information about how to handle domain-specific terminology and queries. Each tag maps a short abbreviation or alias to its full meaning, helping the model resolve ambiguous terms accurately.
How Special Instructions Work
Tags appear as dismissible chips in the Special Instructions field. When a user's query contains a term that matches a tag's key, the model uses the associated definition as context. For example, a tag mapping "TDS" to "Twin digital stack" ensures that queries referencing TDS are correctly understood.
Adding a Special Instruction
- Click inside the Special Instructions text area at the bottom of the section.
- Type one instruction per line in the format: ABBREVIATION means FULL DEFINITION.
- Press Enter to add each instruction as a tag chip.
- To remove a tag, click the × button on the chip.
Keep instructions concise and unambiguous. Use the format "X means Y" consistently so the model can reliably parse the context. Avoid duplicating abbreviations with different definitions.
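The "X means Y" convention can be checked mechanically before the instructions are entered. The Python sketch below parses a list of instruction lines, rejects malformed lines and conflicting duplicates, and shows which tags would apply to a given query. It is a simplified model: whole-word, case-insensitive matching is an assumption, and the function names are hypothetical rather than part of Viven.

```python
import re


def parse_instructions(lines: list[str]) -> dict[str, str]:
    """Parse 'ABBREVIATION means FULL DEFINITION' lines into a mapping."""
    glossary: dict[str, str] = {}
    for line in lines:
        key, sep, definition = line.partition(" means ")
        key, definition = key.strip(), definition.strip()
        if not sep or not key or not definition:
            raise ValueError(f"not in 'X means Y' form: {line!r}")
        if key in glossary and glossary[key] != definition:
            raise ValueError(f"conflicting definitions for {key!r}")
        glossary[key] = definition
    return glossary


def matching_context(query: str, glossary: dict[str, str]) -> dict[str, str]:
    """Return the tags whose key appears in the query as a whole word."""
    return {
        key: definition
        for key, definition in glossary.items()
        if re.search(rf"\b{re.escape(key)}\b", query, flags=re.IGNORECASE)
    }


glossary = parse_instructions(["TDS means Twin digital stack"])
print(matching_context("What is the status of TDS rollout?", glossary))
```

Running a draft list through a check like this surfaces duplicate abbreviations with different definitions, which the guidance above recommends avoiding.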
Disclaimer Text
The Disclaimer Text section allows administrators to configure an HTML disclaimer that is displayed to users below the query input field in the digital twin interface. This is typically used to communicate usage policies, data handling notices, or accuracy limitations.
| Field | Description |
|---|---|
| Disclaimer Text (input) | A rich text area where you enter the HTML content of the disclaimer. Supports standard HTML tags such as <br>, <strong>, and <a href>. |
| Disclaimer Text (preview) | A read-only preview field showing how the HTML disclaimer will render for end users. |
Editing the Disclaimer
- Click inside the Disclaimer Text input area.
- Enter or paste HTML content. Use `<br />` for line breaks and `<strong>` for bold text.
- Review the rendered output in the preview field below.
- The disclaimer is saved when you save the Bot Configuration page.
Example disclaimer:
My digital twin may make mistakes; check important info.<br />
Zero tolerance for abuse; it's not worth your job.<br />
Details are in the acceptable use policy.
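Because the disclaimer is rendered as HTML, it can be worth checking a draft for unexpected tags before pasting it in. The Python sketch below uses the standard `html.parser` module to list any tag outside a small allowlist. The allowlist is an assumption: the guide names `<br>`, `<strong>`, and `<a href>` only as examples, and the full set of tags Viven accepts is not documented here.

```python
from html.parser import HTMLParser

# Assumption: only the tags the guide mentions as examples are allowed.
ALLOWED_TAGS = {"br", "strong", "a"}


class TagCollector(HTMLParser):
    """Collect every tag name that is not in ALLOWED_TAGS."""

    def __init__(self) -> None:
        super().__init__()
        self.unexpected: list[str] = []

    def handle_starttag(self, tag, attrs):
        # Also reached for self-closing tags such as <br /> through the
        # default handle_startendtag implementation.
        if tag not in ALLOWED_TAGS:
            self.unexpected.append(tag)


def unexpected_tags(html: str) -> list[str]:
    collector = TagCollector()
    collector.feed(html)
    collector.close()
    return collector.unexpected


disclaimer = (
    "My digital twin may make mistakes; check important info.<br />\n"
    "Details are in the acceptable use policy."
)
print(unexpected_tags(disclaimer))
print(unexpected_tags("<script>alert(1)</script>"))
```

An empty list means the draft uses only tags from the assumed allowlist. The preview field remains the authoritative check of how the disclaimer will actually render.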
Frequently Asked Questions
How long does the initial sync take?
The initial sync duration depends on the volume of content in the source system. Large repositories (e.g., a Confluence instance with thousands of pages) may take several hours. Subsequent incremental syncs are significantly faster. Monitor the Documents count on the dashboard to track ingestion progress.
Can I connect multiple instances of the same data source?
Currently, each data source type has a single configuration entry in the dashboard. If you need to ingest content from multiple instances of the same platform (e.g., two Confluence instances), contact your Viven account team to discuss multi-instance configuration options.
What happens to ingested data when I disable a data source?
Disabling a data source stops new content from being ingested and removes the source's documents from active queries, but it does not delete previously indexed documents. The existing data is retained and becomes queryable again when you re-enable the source. Re-enabling triggers an incremental sync from the point of the last successful sync.
Why is the Credentials button inactive for some data sources?
Some data sources (such as Attachments and File Upload) do not require OAuth credentials because they operate within the Viven platform directly. The Credentials button is only active for sources that authenticate with external systems via token-based auth.
What should I check if my Confluence sync fails?
- Verify that the API token entered in the Credentials page is valid and has not expired.
- Confirm that the Connected Email matches the account that owns the token.
- Ensure the service account has read access to the Confluence spaces you intend to sync.
- Check that the domain field is in the correct format (e.g., `yourcompany.atlassian.net` with no trailing slash).
- Review the Revision History to confirm credentials were saved successfully.
Can Special Instructions change how the model behaves?
Special Instructions provide additional context to the model for terminology resolution, but they do not override the model's core reasoning or safety behaviors. They are additive hints, not directives. For deeper model customization, contact your Viven account team about advanced model configuration options.
How do I exclude specific content from ingestion?
Navigate to the Configure page for the relevant data source, scroll to Advanced Settings, and enter your blocklist patterns in the Blocklist Patterns field. Patterns support sender addresses, subject line text, channel names, and keywords, separated by commas. Changes take effect on the next sync cycle after saving.
Are credentials and service account details stored securely?
Yes. Credentials and service account information entered in the Admin Portal are encrypted at rest and in transit. The Service Account User ID is used only for internal integration routing and is never exposed to end users interacting with your digital twins.