DVC supports various remote storage backends for storing and sharing data. This guide covers all available storage options and their configuration parameters.

Adding a Remote

Add a remote storage location:
dvc remote add -d myremote s3://mybucket/path
The -d flag sets it as the default remote. Configuration is stored in .dvc/config:
[core]
    remote = myremote
['remote "myremote"']
    url = s3://mybucket/path
Use dvc remote add to configure remotes, or edit .dvc/config directly.
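To make the relationship between `core.remote` and the `remote "<name>"` section concrete, here is a minimal sketch that reads the config text shown above with Python's standard `configparser`. This is not how DVC itself loads its configuration; the helper name `default_remote_url` is hypothetical, and the sketch assumes the file is simple enough for `configparser` to handle.

```python
import configparser

# Same content as the .dvc/config example above.
CONFIG_TEXT = """\
[core]
    remote = myremote
['remote "myremote"']
    url = s3://mybucket/path
"""


def default_remote_url(config_text: str) -> str:
    """Return the URL of the remote named by core.remote (hypothetical helper)."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    name = parser["core"]["remote"]
    # configparser keeps the section header verbatim: 'remote "<name>"'
    section = f"'remote \"{name}\"'"
    return parser[section]["url"]


print(default_remote_url(CONFIG_TEXT))
```

The lookup is two steps: `core.remote` gives the remote's name, and the section carrying that name holds its URL and options.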

Common Configuration

These options apply to all remote types:
url
string
required
Remote storage URL. Format depends on the storage type:
  • S3: s3://bucket/path
  • GCS: gs://bucket/path
  • Azure: azure://container/path
  • SSH: ssh://user@host:/path
  • Local: /path/to/storage or file:///path/to/storage
jobs
integer
default:"4"
Number of parallel jobs for upload/download operations. Higher values speed up transfers but use more resources.
checksum_jobs
integer
default:"4"
Number of parallel jobs for checksum calculation.
version_aware
boolean
default:"false"
Enable version-aware operations for supported cloud storage. Useful for S3, GCS, and Azure with versioning enabled.
worktree
boolean
default:"false"
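Enable worktree mode, in which the remote mirrors the tracked files of the current workspace (relies on cloud versioning) instead of acting as a content-addressed store.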
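The URL formats listed for the `url` option differ mainly in their scheme. The sketch below shows one way to map a URL to a storage type using `urllib.parse`. The function name `backend_for` and the lookup table are assumptions for illustration, cover only the formats listed for `url`, and do not reflect DVC's internal dispatch. Windows paths such as `C:\data` are not handled.

```python
from urllib.parse import urlparse

# Schemes taken from the URL formats listed for the `url` option.
SCHEME_TO_BACKEND = {
    "s3": "S3",
    "gs": "GCS",
    "azure": "Azure",
    "ssh": "SSH",
    "file": "Local",
}


def backend_for(url: str) -> str:
    """Guess the storage type from the URL scheme (hypothetical helper)."""
    scheme = urlparse(url).scheme
    if not scheme:
        # A plain filesystem path has no scheme.
        return "Local"
    return SCHEME_TO_BACKEND.get(scheme, "unknown")


for example in ("s3://bucket/path", "gs://bucket/path", "/path/to/storage"):
    print(example, "->", backend_for(example))
```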

Local & File Remotes

Local filesystem or network-mounted storage.

Configuration

# Local directory
dvc remote add storage /mnt/shared/dvc-storage

# Network share (Linux/Mac)
dvc remote add storage /mnt/nas/dvc-storage

# Windows network share
dvc remote add storage \\server\share\dvc-storage

# Relative path (resolved against the current directory, saved relative to the config file)
dvc remote add storage ../../shared-storage

Parameters

type
string
Cache link type: reflink, hardlink, symlink, or copy.
  • reflink: Copy-on-write (fastest, limited support)
  • hardlink: Hard links (fast, same filesystem required)
  • symlink: Symbolic links
  • copy: Full copy (slowest, most compatible)
Can specify multiple as comma-separated list (tried in order).
shared
string
Set to group to make cache group-writable. Useful for shared storage accessed by multiple users.
slow_link_warning
boolean
Warn when using slow link types (copy).
verify
boolean
default:"false"
Verify checksums after transfer.

Example

['remote "local"']
    url = /mnt/shared/dvc-cache
    type = hardlink,copy
    shared = group
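A setting such as `type = hardlink,copy` means the link types are tried in order until one works. The following sketch illustrates that fallback idea with standard-library calls only. It is not DVC's implementation: the `LINKERS` table and `link_with_fallback` are hypothetical names, and reflink is stubbed out as always unsupported because the standard library has no portable reflink call.

```python
import os
import shutil
import tempfile


def _reflink(src, dst):
    # Stub: always reported as unsupported in this sketch.
    raise OSError("reflink not supported in this sketch")


LINKERS = {
    "reflink": _reflink,
    "hardlink": os.link,
    "symlink": os.symlink,
    "copy": shutil.copyfile,
}


def link_with_fallback(src: str, dst: str, types: str) -> str:
    """Try each comma-separated link type in order; return the one that worked."""
    last_error = None
    for name in (t.strip() for t in types.split(",")):
        try:
            LINKERS[name](src, dst)
            return name
        except OSError as exc:
            last_error = exc  # remember the failure and try the next type
    raise OSError(f"no link type in {types!r} succeeded") from last_error


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "data.bin")
    with open(src, "wb") as f:
        f.write(b"example")
    used = link_with_fallback(src, os.path.join(tmp, "linked.bin"), "reflink,copy")
    print(used)  # copy
```

Listing a fast type first and `copy` last keeps the configuration portable: the fast path is used where the filesystem allows it, and the copy always succeeds as a last resort.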

Amazon S3

Amazon S3 and S3-compatible storage (MinIO, DigitalOcean Spaces, etc.).

Basic Setup

dvc remote add s3remote s3://mybucket/path
dvc remote modify s3remote region us-east-1

Authentication Parameters

access_key_id
string
AWS access key ID. Alternatively, set AWS_ACCESS_KEY_ID environment variable.
secret_access_key
string
AWS secret access key. Alternatively, set AWS_SECRET_ACCESS_KEY environment variable.
Store credentials in .dvc/config.local (git-ignored) or use environment variables.
session_token
string
AWS session token for temporary credentials.
profile
string
AWS profile name from ~/.aws/credentials.
credentialpath
string
Path to custom AWS credentials file.
configpath
string
Path to custom AWS config file.
allow_anonymous_login
boolean
default:"false"
Allow anonymous access to public buckets.

Region & Endpoint

region
string
AWS region (e.g., us-west-2, eu-central-1).
endpointurl
string
Custom S3 endpoint URL for S3-compatible services:
  • MinIO: http://localhost:9000
  • DigitalOcean: https://nyc3.digitaloceanspaces.com

Connection Settings

use_ssl
boolean
default:"true"
Use HTTPS for connections.
ssl_verify
boolean | string
default:"true"
Verify SSL certificates. Set to false to disable or path to CA bundle.
read_timeout
integer
Read timeout in seconds.
connect_timeout
integer
Connection timeout in seconds.

Encryption

sse
string
Server-side encryption algorithm: AES256 or aws:kms.
sse_kms_key_id
string
KMS key ID for KMS encryption.
sse_customer_algorithm
string
Customer-provided encryption algorithm.
sse_customer_key
string
Customer-provided encryption key.

Access Control

acl
string
Canned ACL: private, public-read, public-read-write, authenticated-read, etc.
grant_read
string
Grant read permissions (grantee format: id=xxx or emailAddress=xxx).
grant_read_acp
string
Grant read ACP permissions.
grant_write_acp
string
Grant write ACP permissions.
grant_full_control
string
Grant full control permissions.

Performance

cache_regions
boolean
Cache region information for faster bucket access.

Complete Example

# Add remote
dvc remote add s3storage s3://my-dvc-bucket/project

# Configure region and encryption
dvc remote modify s3storage region us-west-2
dvc remote modify s3storage sse AES256
dvc remote modify s3storage acl bucket-owner-full-control

# Set credentials (local config, git-ignored)
dvc config --local remote.s3storage.access_key_id AKIAIOSFODNN7EXAMPLE
dvc config --local remote.s3storage.secret_access_key wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
Resulting .dvc/config:
['remote "s3storage"']
    url = s3://my-dvc-bucket/project
    region = us-west-2
    sse = AES256
    acl = bucket-owner-full-control
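MinIO (S3-compatible) example, with credentials kept in the git-ignored local config:
dvc remote add minio s3://my-bucket/dvc
dvc remote modify minio endpointurl http://localhost:9000
dvc remote modify --local minio access_key_id minioadmin
dvc remote modify --local minio secret_access_key minioadmin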

Google Cloud Storage (GCS)

Google Cloud Storage backend.

Basic Setup

dvc remote add gsremote gs://mybucket/path

Authentication

credentialpath
string
Path to service account JSON key file. Alternatively, set GOOGLE_APPLICATION_CREDENTIALS environment variable.
projectname
string
Google Cloud project name.
allow_anonymous_login
boolean
default:"false"
Allow anonymous access to public buckets.

Endpoint

endpointurl
string
Custom GCS endpoint URL (for emulators or compatible services).

Example

dvc remote add gcs gs://my-dvc-bucket/data
dvc config --local remote.gcs.credentialpath /path/to/service-account.json
dvc remote modify gcs projectname my-gcp-project

Microsoft Azure Blob Storage

Azure Blob Storage backend.

Basic Setup

dvc remote add azure azure://mycontainer/path

Authentication Methods

connection_string
string
Azure storage connection string. Format: DefaultEndpointsProtocol=https;AccountName=...;AccountKey=...
account_name
string
Storage account name.
account_key
string
Storage account key.
sas_token
string
Shared Access Signature token.
tenant_id
string
Azure AD tenant ID for service principal authentication.
client_id
string
Azure AD client ID.
client_secret
string
Azure AD client secret.
allow_anonymous_login
boolean
default:"false"
Allow anonymous access.

Credential Chain Control

exclude_environment_credential
boolean
Exclude environment variables from credential chain.
exclude_visual_studio_code_credential
boolean
Exclude VS Code credentials.
exclude_shared_token_cache_credential
boolean
Exclude shared token cache.
exclude_managed_identity_credential
boolean
Exclude managed identity credentials.

Connection Settings

timeout
integer
General timeout in seconds.
read_timeout
integer
Read timeout in seconds.
connection_timeout
integer
Connection timeout in seconds.

Example

Using a connection string:
dvc remote add azure azure://mycontainer
dvc config --local remote.azure.connection_string "DefaultEndpointsProtocol=https;AccountName=...;AccountKey=..."
Using a service principal:
dvc remote add azure azure://mycontainer/dvc
dvc remote modify azure account_name mystorageaccount
dvc config --local remote.azure.tenant_id YOUR_TENANT_ID
dvc config --local remote.azure.client_id YOUR_CLIENT_ID
dvc config --local remote.azure.client_secret YOUR_CLIENT_SECRET
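The connection string format shown for `connection_string` is a semicolon-separated list of `Key=Value` pairs. The sketch below splits such a string into a dictionary, which can help when checking that a value is well formed before it is stored. The function name and the sample account name and key are made up for illustration. Splitting on the first `=` only matters because account keys are base64 and may end in `=` padding.

```python
def parse_connection_string(conn: str) -> dict:
    """Split 'Key=Value;Key=Value' into a dict (hypothetical helper)."""
    fields = {}
    for part in conn.split(";"):
        if not part:
            continue  # tolerate a trailing semicolon
        # Split on the first '=' only, so '=' padding in the value survives.
        key, _, value = part.partition("=")
        fields[key] = value
    return fields


# Sample values are placeholders, not real credentials.
sample = "DefaultEndpointsProtocol=https;AccountName=myaccount;AccountKey=abc123=="
print(parse_connection_string(sample)["AccountName"])  # myaccount
```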

SSH / SFTP

Remote storage over SSH/SFTP.

Basic Setup

dvc remote add sshremote ssh://user@example.com:/path/to/storage

Authentication

user
string
SSH username.
password
string
SSH password (not recommended, use key-based auth).
ask_password
boolean
Prompt for password interactively.
keyfile
string
Path to SSH private key file.
passphrase
string
Passphrase for encrypted private key.
ask_passphrase
boolean
Prompt for passphrase interactively.
gss_auth
boolean
Use GSS-API authentication.
allow_agent
boolean
Allow SSH agent for authentication.

Connection Settings

port
integer
default:"22"
SSH port number.
timeout
integer
Connection timeout in seconds.
max_sessions
integer
Maximum number of concurrent SSH sessions.

Cache Settings

type
string
Cache link type on remote: reflink, hardlink, symlink, or copy.

Example

dvc remote add ssh ssh://user@server.com:/data/dvc-storage
dvc remote modify ssh port 2222
dvc remote modify ssh keyfile ~/.ssh/id_rsa_dvc

HDFS

Hadoop Distributed File System.

Basic Setup

dvc remote add hdfsremote hdfs://namenode:8020/path/to/storage

Parameters

user
string
HDFS username.
kerb_ticket
string
Path to Kerberos ticket cache file.
replication
integer
HDFS replication factor.

WebHDFS

Web-based HDFS access.

Basic Setup

dvc remote add webhdfs webhdfs://namenode:50070/path/to/storage

Authentication

user
string
WebHDFS username.
password
string
WebHDFS password.
kerberos
boolean
Use Kerberos authentication.
kerberos_principal
string
Kerberos principal name.
token
string
Delegation token.

Settings

use_https
boolean
Use HTTPS instead of HTTP.
ssl_verify
boolean | string
Verify SSL certificates.
proxy_to
string
Proxy to specific DataNode.
data_proxy_target
string
Target for data proxy operations.

Alibaba OSS

Alibaba Cloud Object Storage Service.

Basic Setup

dvc remote add oss oss://mybucket/path

Authentication

oss_key_id
string
OSS access key ID.
oss_key_secret
string
OSS access key secret.
oss_endpoint
string
OSS endpoint URL (e.g., oss-cn-hangzhou.aliyuncs.com).

Example

dvc remote add oss oss://my-bucket/dvc-storage
dvc remote modify oss oss_endpoint oss-cn-beijing.aliyuncs.com
dvc config --local remote.oss.oss_key_id YOUR_KEY_ID
dvc config --local remote.oss.oss_key_secret YOUR_KEY_SECRET

Google Drive

Google Drive backend (experimental).

Basic Setup

dvc remote add gdrive gdrive://folder-id

Authentication

gdrive_client_id
string
Google OAuth client ID.
gdrive_client_secret
string
Google OAuth client secret.
gdrive_user_credentials_file
string
Path to user credentials file.

Service Account

gdrive_use_service_account
boolean
Use service account for authentication.
gdrive_service_account_json_file_path
string
Path to service account JSON file.
gdrive_service_account_user_email
string
Email for service account impersonation.

Settings

gdrive_trash_only
boolean
default:"false"
Move files to trash instead of permanent deletion.
gdrive_acknowledge_abuse
boolean
default:"false"
Acknowledge abuse risk when downloading flagged files.

HTTP / HTTPS

Remote storage over HTTP(S). Commonly used for read-only access; uploads work only if the server accepts them.

Basic Setup

dvc remote add httpremote https://example.com/data

Authentication

auth
string
Authentication method: basic, digest, or custom.
user
string
Username for basic/digest auth.
password
string
Password for authentication.
ask_password
boolean
Prompt for password interactively.
custom_auth_header
string
Custom authentication header value. Example: Bearer YOUR_TOKEN

Connection Settings

ssl_verify
boolean | string
Verify SSL certificates. Can be true, false, or path to CA bundle.
method
string
HTTP method to use for file uploads, for example PUT (default: POST).
connect_timeout
number
Connection timeout in seconds.
read_timeout
number
Read timeout in seconds.

WebDAV / WebDAVs

WebDAV protocol support.

Basic Setup

dvc remote add webdav webdavs://example.com/dvc-storage

Authentication

user
string
Username.
password
string
Password.
ask_password
boolean
Prompt for password.
token
string
Authentication token.
bearer_token_command
string
Command to generate bearer token dynamically.
custom_auth_header
string
Custom authentication header.

SSL/TLS

cert_path
string
Path to client certificate.
key_path
string
Path to client private key.
ssl_verify
boolean | string
Verify SSL certificates.
timeout
integer
Connection timeout in seconds.

Best Practices

Use .dvc/config.local for sensitive data:
# Add remote (tracked by Git)
dvc remote add storage s3://bucket/path

# Add credentials (git-ignored)
dvc config --local remote.storage.access_key_id YOUR_KEY
dvc config --local remote.storage.secret_access_key YOUR_SECRET
Or use environment variables:
export AWS_ACCESS_KEY_ID=YOUR_KEY
export AWS_SECRET_ACCESS_KEY=YOUR_SECRET
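The split between `.dvc/config` and `.dvc/config.local` works because the local file is layered on top of the repository file. A minimal sketch of that layering, using `configparser` rather than DVC's own loader, is shown below. The helper name `effective_remote` is hypothetical, and the key values are the placeholders from the commands above.

```python
import configparser

# Tracked by Git (.dvc/config)
REPO_CONFIG = """\
['remote "storage"']
    url = s3://bucket/path
"""

# Git-ignored (.dvc/config.local)
LOCAL_CONFIG = """\
['remote "storage"']
    access_key_id = YOUR_KEY
    secret_access_key = YOUR_SECRET
"""


def effective_remote(name: str, *layers: str) -> dict:
    """Merge config layers in order; later layers override earlier ones."""
    parser = configparser.ConfigParser(interpolation=None)
    for text in layers:
        parser.read_string(text)
    return dict(parser[f"'remote \"{name}\"'"])


print(effective_remote("storage", REPO_CONFIG, LOCAL_CONFIG))
```

Only the first layer needs to be committed; the merged view exists only on the machine that holds the local file.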
Configure different remotes for different purposes:
# Production storage
dvc remote add -d production s3://prod-bucket/data

# Backup storage
dvc remote add backup gs://backup-bucket/data

# Local cache for fast access
dvc remote add local /mnt/fast-storage

# Push to specific remote
dvc push -r backup
When running on cloud instances, use IAM roles or managed identities instead of access keys.
AWS EC2:
# No credentials needed - uses instance profile
dvc remote add storage s3://bucket/path
Azure VM:
# Uses managed identity
dvc remote add storage azure://container/path
dvc remote modify storage account_name myaccount
Adjust job counts based on your network and CPU:
# Increase for fast networks
dvc remote modify storage jobs 16
dvc remote modify storage checksum_jobs 16

# Reduce for slow connections or limited CPU
dvc remote modify storage jobs 2
dvc remote modify storage checksum_jobs 2
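Conceptually, `jobs` caps how many transfers run at the same time. The sketch below shows that idea with a thread pool. It is only an illustration of bounded parallelism, not DVC's transfer code; `transfer_all` and the `transfer_one` callback are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def transfer_all(paths, transfer_one, jobs=4):
    """Run transfer_one over paths with at most `jobs` running concurrently."""
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        # map() returns results in the same order as the input paths.
        return list(pool.map(transfer_one, paths))


# Stand-in "transfer" that just upper-cases the path.
print(transfer_all(["a.bin", "b.bin", "c.bin"], str.upper, jobs=2))
```

A larger pool helps when each transfer spends most of its time waiting on the network; it helps less once bandwidth or CPU is saturated.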
For important datasets, enable post-transfer verification:
dvc remote modify storage verify true
Verification adds an extra checksum pass over the transferred data, so transfers take noticeably longer, but it confirms that what arrived matches what was sent.
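The idea behind post-transfer verification can be sketched as: hash the source, transfer, hash the result, and compare. The example below does this for a local copy with MD5. It is an assumption-laden illustration rather than DVC's code path; `md5_of` and `transfer_and_verify` are hypothetical names, and a local file copy stands in for the real transfer.

```python
import hashlib
import os
import shutil
import tempfile


def md5_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def transfer_and_verify(src: str, dst: str) -> str:
    """Copy src to dst, then re-read dst and compare checksums."""
    expected = md5_of(src)
    shutil.copyfile(src, dst)  # stand-in for the real upload or download
    actual = md5_of(dst)
    if actual != expected:
        raise OSError(f"checksum mismatch: {expected} != {actual}")
    return actual


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "src.bin")
    dst = os.path.join(tmp, "dst.bin")
    with open(src, "wb") as f:
        f.write(b"example data")
    digest = transfer_and_verify(src, dst)
    print(digest)
```

The second read of the data is where the extra time goes.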

Remote Management Commands

# List remotes
dvc remote list

# Add remote
dvc remote add myremote s3://bucket/path

# Set as default
dvc remote default myremote

# Modify remote
dvc remote modify myremote region us-west-2

# Rename remote
dvc remote rename myremote newname

# Remove remote
dvc remote remove myremote

# Push to specific remote
dvc push -r myremote

# Pull from specific remote
dvc pull -r myremote

Troubleshooting

Check credentials:
# Verify config
dvc remote list
dvc config remote.storage.url

# Test connection (compares the local cache with the remote)
dvc status -c -r storage
Common issues:
  • Expired credentials
  • Wrong region/endpoint
  • Missing permissions
  • Credentials in wrong config level
Optimize settings:
# Increase parallelism
dvc remote modify storage jobs 16

# Check network
dvc pull -v  # verbose output
Considerations:
  • Network bandwidth
  • Remote storage throughput limits
  • Number of files vs. file sizes
Disable SSL certificate verification (not recommended for production):
dvc remote modify storage ssl_verify false
Or provide CA bundle:
dvc remote modify storage ssl_verify /path/to/ca-bundle.crt
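Because `ssl_verify` accepts either a boolean or a path, a consumer of the value has to tell the two apart. The sketch below shows one plausible interpretation: the strings `true` and `false` become booleans, and anything else is treated as a CA bundle path. This is an assumption for illustration, not DVC's actual parsing logic, and `parse_ssl_verify` is a hypothetical name.

```python
from typing import Union


def parse_ssl_verify(value: str) -> Union[bool, str]:
    """Interpret an ssl_verify setting as a bool or a CA bundle path."""
    lowered = value.strip().lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    # Anything else is assumed to be a filesystem path to a CA bundle.
    return value


for raw in ("true", "false", "/path/to/ca-bundle.crt"):
    print(repr(raw), "->", repr(parse_ssl_verify(raw)))
```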

Next Steps

Configuration Overview

Learn about the DVC configuration system

DVC Files

Understand DVC file formats