Getting Started with Airbyte Cloud
This page guides you through setting up your Airbyte Cloud account, configuring a source, destination, and connection, verifying the sync, and allowlisting IP addresses.
Set up your Airbyte Cloud account
To use Airbyte Cloud:
If you haven't already, sign up for Airbyte Cloud using your email address, Google login, or GitHub login.
Airbyte Cloud offers a 14-day free trial. For more information, see Pricing.
Note: If you are invited to a workspace, you cannot use your Google login to create a new Airbyte account.
If you signed up using your email address, Airbyte sends you an email with a verification link. Clicking the link takes you to your new workspace.
A workspace lets you collaborate with team members and share resources across your team under a shared billing account.
Set up a source
A source is an API, file, database, or data warehouse that you want to ingest data from.
To set up a source:
On the Airbyte Cloud dashboard, click Sources and then click + New source.
On the Set up the source page, select the source you want to set up from the Source type dropdown.
The fields relevant to your source are displayed. The Setup Guide provides information to help you fill out the fields for your selected source.
Click Set up source.
Set up a destination
A destination is a data warehouse, data lake, database, or an analytics tool where you want to load your extracted data.
To set up a destination:
On the Airbyte Cloud dashboard, click Destinations and then click + New destination.
On the Set up the destination page, select the destination you want to set up from the Destination type dropdown.
The fields relevant to your destination are displayed. The Setup Guide provides information to help you fill out the fields for your selected destination.
Click Set up destination.
Set up a connection
A connection is an automated data pipeline that replicates data from a source to a destination.
Setting up a connection involves configuring the following parameters:
|Parameter|Description|
|---|---|
|Replication frequency|How often should the data sync?|
|Data residency|Where should the data be processed?|
|Destination Namespace and stream names|Where should the replicated data be written?|
|Catalog selection|Which streams and fields should be replicated from the source to the destination?|
|Sync mode|How should the streams be replicated (read and written)?|
|Optional transformations|How should Airbyte protocol messages (raw JSON blobs) be converted into other data representations?|
For more information, see Connections and Sync Modes, and Namespaces.
If you need to use cron scheduling:
- In the Replication Frequency dropdown, click Cron.
- Enter a cron expression and choose a time zone to create a sync schedule.
- Only one sync per connection can run at a time.
- If cron schedules a sync to run before the last one finishes, the scheduled sync will start after the last sync completes.
- Airbyte Cloud does not allow schedules that sync more than once per hour.
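The two scheduling rules above (only one sync per connection at a time, and a late sync starting once the previous one completes) can be illustrated with a small simulation. This is a minimal sketch, not Airbyte's scheduler: the function name `actual_start_times` and the fixed sync durations are assumptions made for illustration.

```python
from datetime import datetime, timedelta


def actual_start_times(scheduled, durations):
    """Return when each sync really starts, given its scheduled time and duration.

    Hypothetical model: a sync cannot start until the previous sync on the
    same connection has finished.
    """
    starts = []
    previous_end = None
    for planned, duration in zip(scheduled, durations):
        # Start on schedule unless the previous sync is still running.
        start = planned if previous_end is None else max(planned, previous_end)
        starts.append(start)
        previous_end = start + duration
    return starts


# Hourly schedule (the most frequent schedule Cloud allows).
scheduled = [datetime(2024, 1, 1, hour) for hour in (0, 1, 2)]
# Assumed durations: the first sync overruns its one-hour slot by 30 minutes.
durations = [timedelta(minutes=90), timedelta(minutes=20), timedelta(minutes=20)]

for planned, start in zip(scheduled, actual_start_times(scheduled, durations)):
    print(planned.time(), "->", start.time())
```

In this example the 01:00 sync is delayed until 01:30, when the first sync finishes, while the 02:00 sync starts on time because the delayed sync has already completed.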
To set up a connection:
On the Airbyte Cloud dashboard, click Connections and then click + New connection.
On the New connection page, select a source:
- To use an existing source, select your desired source from the Source dropdown. Click Use existing source.
- To set up a new source, select the source you want to set up from the Source type dropdown. The fields relevant to your source are displayed. The Setup Guide provides information to help you fill out the fields for your selected source. Click Set up source.
Select a destination:
- To use an existing destination, select your desired destination from the Destination dropdown. Click Use existing destination.
- To set up a new destination, select the destination you want to set up from the Destination type dropdown. The fields relevant to your destination are displayed. The Setup Guide provides information to help you fill out the fields for your selected destination. Click Set up destination.
The Set up the connection page is displayed.
From the Replication frequency dropdown, select how often you want the data to sync from the source to the destination.
Note: The default replication frequency is Every 24 hours.
From the Destination Namespace dropdown, select the format in which you want to store the data in the destination:
Note: The default configuration is Mirror source structure.
|Configuration|Description|
|---|---|
|Mirror source structure|Some sources (for example, databases) provide namespace information for a stream. If a source provides the namespace information, the destination will reproduce the same namespace when this configuration is set. For sources or streams where the source namespace is not known, the behavior will default to the "Destination default" option.|
|Destination default|All streams will be replicated and stored in the default namespace defined on the Destination Settings page. For more information, see Destination Connector Settings.|
|Custom format|All streams will be replicated and stored in a custom format. See Custom format for more details.|
To better understand the destination namespace configurations, see Destination Namespace example.
(Optional) In the Destination Stream Prefix (Optional) field, add a prefix to stream names (for example, adding a prefix
(Optional) Click Refresh schema if you had previously triggered a sync with a subset of tables in the stream and now want to see all the tables in the stream.
Activate the streams you want to sync:
- (Optional) If your source has multiple tables, type the name of the stream you want to enable in the Search stream name search box.
- (Optional) To configure the sync settings for multiple streams, select the checkbox next to the desired streams, configure the settings in the purple box, and click Apply.
Configure the sync settings:
Toggle the Sync button to enable sync for the stream.
Source:
- Namespace: The database schema of your source tables (auto-populated for your source)
- Stream name: The table name in the source (auto-populated for your source)
Sync mode: Select how you want the data to be replicated from the source to the destination:
For the source:
- Select Full Refresh to copy the entire dataset each time you sync
- Select Incremental to replicate only the new or modified data
For the destination:
- Select Overwrite to erase the old data and replace it completely
- Select Append to capture changes to your table. Note: This creates duplicate records.
- Select Deduped + history to mirror your source while keeping records unique
Note: Some sync modes may not yet be available for your source or destination.
Cursor field: Used in Incremental sync mode to determine which records to sync. Airbyte pre-selects the cursor field for you (for example, updated date). If you have multiple cursor fields, select the one you want.
Primary key: Used in Deduped + history sync mode to determine the unique identifier.
Destination:
- Namespace: The database schema of your destination tables.
- Stream name: The final table name in the destination.
Click Set up connection.
Airbyte tests the connection. If the sync is successful, the Connection page is displayed.
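The source and destination sync modes described in the steps above can be sketched with plain Python lists of records. This is an illustration only, not Airbyte's implementation: the helper names, the mode strings, the `id` primary key, and the `updated_at` cursor field are all hypothetical, and Deduped + history is simplified to the deduplicated final table without the history.

```python
def read_source(records, mode, cursor_field=None, last_cursor=None):
    """Full Refresh returns every record; Incremental returns only records
    whose cursor value is greater than the last value already synced."""
    if mode == "full_refresh" or last_cursor is None:
        return list(records)
    return [r for r in records if r[cursor_field] > last_cursor]


def write_destination(existing, incoming, mode, primary_key=None, cursor_field=None):
    """Apply a destination sync mode to the rows already in the destination."""
    if mode == "overwrite":
        return list(incoming)  # old data is erased and replaced
    if mode == "append":
        return existing + list(incoming)  # changed rows appear as duplicates
    if mode == "deduped":
        latest = {}
        for row in existing + list(incoming):
            key = row[primary_key]
            # Keep the row with the highest cursor value for each primary key.
            if key not in latest or row[cursor_field] >= latest[key][cursor_field]:
                latest[key] = row
        return list(latest.values())
    raise ValueError(f"unknown destination mode: {mode}")


# First sync: copy everything (Full Refresh, Overwrite).
source = [
    {"id": 1, "name": "Ada", "updated_at": "2024-01-01"},
    {"id": 2, "name": "Bob", "updated_at": "2024-01-02"},
]
destination = write_destination([], read_source(source, "full_refresh"), "overwrite")

# Record 1 is modified in the source after the first sync.
source[0] = {"id": 1, "name": "Ada L.", "updated_at": "2024-01-03"}
changed = read_source(source, "incremental", "updated_at", last_cursor="2024-01-02")

appended = write_destination(destination, changed, "append")
deduped = write_destination(destination, changed, "deduped", "id", "updated_at")
print(len(appended))  # 3: both versions of record 1 are kept
print(len(deduped))   # 2: one row per primary key
```

The example shows why Append creates duplicate records while the deduplicated mode keeps one row per primary key, using the cursor field to decide which version is the latest.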
Verify the connection
Verify the sync by checking the logs:
- On the Airbyte Cloud dashboard, click Connections. The list of connections is displayed. Click on the connection you just set up.
- The Sync History is displayed. Click on the first log in the sync history.
- Check the data at your destination. If you added a Destination Stream Prefix while setting up the connection, make sure to search for the stream name with the prefix.
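When looking for the replicated data, it can help to work out which namespace and table name the connection was configured to write to. The following sketch models the three Destination Namespace options and the optional stream prefix described earlier on this page. The function name, the option strings, and the choice to use the custom format verbatim as the namespace are hypothetical simplifications; see Custom format for the actual syntax.

```python
def resolve_destination(stream_name, source_namespace, option,
                        destination_default, custom_format=None, prefix=""):
    """Return the (namespace, table name) a stream is expected to land in."""
    if option == "mirror_source":
        # Fall back to the destination default when the source has no namespace.
        namespace = source_namespace or destination_default
    elif option == "destination_default":
        namespace = destination_default
    elif option == "custom_format":
        # Assumption: the custom format is treated as a literal namespace here.
        namespace = custom_format
    else:
        raise ValueError(f"unknown namespace option: {option}")
    # The stream prefix, if any, is prepended to the stream name.
    return namespace, prefix + stream_name


# A database source that reports its schema as the namespace.
print(resolve_destination("users", "public", "mirror_source", "analytics"))

# A source with no namespace, plus a stream prefix.
print(resolve_destination("users", None, "mirror_source", "analytics",
                          prefix="airbyte_"))
```

With these hypothetical values, the first call yields `("public", "users")` and the second yields `("analytics", "airbyte_users")`, which is the name to search for when a prefix was configured.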
Allowlist IP addresses
Depending on your data residency location, you may need to allowlist the following IP addresses to enable access to Airbyte:
United States and Airbyte Default
GCP region: us-west3
Some workflows still run in the US, even when the data residency is in the EU. If you use the EU as a data residency, you must allowlist the following IP addresses from both GCP us-west3 and AWS eu-west-3.
GCP region: us-west3
AWS region: eu-west-3