Configuration Reference
About Configuration Files
Cosnim’s configuration files control every facet of the software’s operations.
All essential settings for Cosnim software are contained within a single yaml or JSON file. This compact configuration is intentional; it is streamlined for simplicity, efficiency, and resilience, allowing for rapid deployment, easy distribution, and effective disaster recovery across environments.
Location of configuration files
Cosnim uses the first of the following locations to read its configuration:
The file specified in the --config option of the cosnim commands.
The file specified in the COSNIM_CONFIG environment variable.
The file ‘config.yml’ in the current directory.
The file ‘config/config.yml’ under the current directory.
Format of configuration
The standard and recommended format for Cosnim configurations is yaml, but Cosnim will accept
configurations in JSON format. Configurations can be re-exported in another format using the
cosnim show config command.
Splitting up configuration files with !include
It is not necessary to put the entire configuration in a single file. To construct a configuration from multiple files, use the yaml ‘!include’ directive to embed configurations from other files.
Example
continuum:
  ...CONTINUUM_OPTIONS...
gateways:
  !include config/gateways.yml
Standard Value Formats
Cosnim supports special value formats beyond yaml/json to facilitate configuration:
DURATION
Synopsis
NUMERIC_VALUE TIME_UNIT
Description
Specifies a time duration in a human-friendly format, such as ‘30.5 minutes’. The following units are supported. All units are internally converted to seconds. The default is ‘seconds’ if no unit is specified in a DURATION parameter.
| TIME_UNIT | Converted to (seconds) |
|---|---|
| ms, milliseconds | 0.001 |
| us, microseconds | 0.000001 |
| s, sec, secs | 1 |
| m, min, mins | 60 |
| h, hr, hrs, hour, hours | 3,600 |
| d, day, days | 86,400 |
| w, wks, week, weeks | 604,800 |
Examples
| Configuration Parameter | Result in seconds |
|---|---|
| delay: 30 | 30 |
| delay: 30 s | 30 |
| delay: .5 minutes | 30 |
| delay: 50 ms | 0.050 |
| delay: 1.25 hours | 4500 |
SIZE
Synopsis
NUMERIC_VALUE SIZE_UNITS
Description
Specifies a size, such as storage space or memory, in a human-friendly format, such as 10 MB. The following units are supported; the case of the unit is not significant: ‘20 kB’ and ‘20 KB’ are equivalent. All units are internally converted to bytes. The default is ‘bytes’ if no unit is specified in a SIZE parameter.
| SIZE_UNIT | Converted to (bytes) |
|---|---|
| B | 1 |
| bytes | 1 |
| KiB | 1,024 |
| MiB | 1,048,576 |
| GiB | 1,073,741,824 |
| TiB | 1,099,511,627,776 |
| PiB | 1,125,899,906,842,624 |
| kB | 1,000 |
| KB | 1,000 |
| MB | 1,000,000 |
| GB | 1,000,000,000 |
| TB | 1,000,000,000,000 |
| PB | 1,000,000,000,000,000 |
Examples
| Configuration Parameter | Result in bytes |
|---|---|
| size: 256 MB | 256,000,000 |
| size: 256 MiB | 268,435,456 |
| size: 1.5 GB | 1,500,000,000 |
BOOLEAN
Synopsis
[ yes | no | on | off | true | false ]
Description
Boolean values can be specified in yaml files using labels such as ‘yes’, ‘no’, ‘on’, ‘off’, ‘true’, and ‘false’. All boolean variants produce the same result; choosing one label over another is a matter of personal preference.
Examples
| YAML Configuration Parameter | Result |
|---|---|
| active: on | True |
| active: yes | True |
| active: true | True |
| active: off | False |
| active: no | False |
| active: false | False |
Cache Management
caches
This section defines the global RAM caches. Global caches are used primarily by storage hubs and reduce the frequency at which data is pulled from disk and cloud storage, which increases performance and efficiency.
Caches can be shared between all hubs or assigned to individual groups of hubs for more refined tuning.
Synopsis
caches:
  CACHE_NAME:
    type: [ unified ]
    size: SIZE
    reuse_ratio: NUMBER(0-1)
  ⋮
Options
- CACHE_NAME
Name of the cache as referred to by the cache: parameter of the storage hubs.
- type: [ unified ]
Type of cache to use. The default, and now the only option, is
unified, which uses Cosnim’s unified cache manager.
- size: SIZE
Defines the maximum amount of memory that can be used for caching. The more memory is used for caching, the higher the performance will be, provided the system has sufficient RAM.
You should try to allocate at least 1 GB for optimal results if system resources permit. Otherwise, a cache size below 256 MB is not recommended without prior testing. Depending on how the continuum is used, a very low cache size could be detrimental to performance.
- reuse_ratio: NUMBER
Determines the amount of memory cache that should be prioritized for reused items as a ratio of the total cache size in the form of a numerical value between 0 and 1.
For example, if the cache size is 1GB and the reuse_ratio is .75, 750 MB of cache will be prioritized for previously reused items.
The reuse ratio is a fine-tuning parameter of Cosnim’s unified cache algorithm. A higher ratio increases the probability that older historical data remains longer in the cache, improving performance for random access patterns. In contrast, a lower ratio increases the likelihood that recently read data will be reused from the cache. The recommended reuse ratio is .75, but you can freely experiment with other values.
Examples
caches:
  primary:
    type: unified
    size: 768 MB
    reuse_ratio: .75
  data:
    type: unified
    size: 128 MB
    reuse_ratio: .50
This defines two caches: ‘primary’, a self-tuning large cache geared for general use, and ‘data’, a smaller separate cache for data where performance is not a concern. These caches are referred to in the hubs with the cache: option.
Continuum Configuration
continuum
This section defines the global operation of the continuum.
Synopsis
continuum:
  primary_hub: HUB_NAME
  keychain: FILE
  capsule_version: [ 3 ]
  autocreate: [ BOOLEAN ]
  autoexpand: [ BOOLEAN ]
  relay: RELAY_NAME
  use_locks: [ BOOLEAN ]
  auto_retry: DURATION
  signature:
    private_key: KEY_NAME
    hash_algo: HASH_ALGORITHM
    sign:
      - files
Options
- primary_hub: HUB_NAME
Name of the primary storage hub to use. It’s recommended to call this hub ‘primary’ to avoid confusion with other hubs.
- keychain: FILE
File name and path of the keychain file to be used by this continuum for encryption and signatures. The keychain must contain at least an encryption key block named ‘default’.
- capsule_version: [ 3 ]
Capsule version to use in this continuum. For now, only version 3 may be used.
- autocreate: [ BOOLEAN ]
When this option is true, Cosnim will automatically create the continuum if any command is run and all storage hubs are uninitialized. If false, a cosnim create continuum command must be run explicitly to create the continuum.
- autoexpand: [ BOOLEAN ]
When this option is true, Cosnim will automatically expand the continuum if some new storage hubs are uninitialized. This condition may occur if new storage hubs have been added to an existing continuum. When false, a cosnim expand continuum command must be run explicitly to expand the continuum to use new, previously uninitialized storage hubs.
- relay: RELAY_NAME
Name of the primary relay to use to accelerate certain continuum operations. Optional.
- use_locks: BOOLEAN
Use relay locks to add an extra layer of security when sharing continuums. THIS OPTION IS DEPRECATED. Use the share_level option in filesystems instead.
- auto_retry: DURATION
Specifies the maximum time Cosnim will wait to access capsules that are temporarily unavailable. This setting is handy when:
Connecting to typically unshared continuums (e.g., backup destinations)
Working with storage hubs that upload control information or metadata to cloud/shared storage before completing full data transfer
The operation fails if Cosnim cannot access the capsule within the specified duration.
Default: 0 seconds (no waiting)
Recommended: 30 seconds for shared continuums
- signature: ...
Option group related to digital signatures. See below.
- private_key: KEY_NAME
Name of the private key in the keychain used to sign data modifications made by the user. The key name and public key associated with the private key are automatically recorded in the continuum when first used in a continuum.
- hash_algo: ALGO
The hash algorithm to use for data signature digests. The default and recommended algorithm is ‘sha256’. Use the command cosnim test hashes to see other available hash algorithms. Ensure you use only secure hash algorithms to preserve the signatures’ integrity.
- sign: [ files ]
Indicates what type of data should be signed automatically with digital signature cascades for the current user. At the moment, there is only one option:
- files
Automatically sign files that are created or updated by the user. This indirectly spawns a cascade of signatures and countersignatures based on other file updates made by this or other users.
Activating or deactivating signatures for a user does not affect the previous signatures or the cascades produced by other users.
Examples
continuum:
  primary_hub: primary
  keychain: ./credentials/keychain
  autocreate: false
  autoexpand: false
  relay: myrelay_1
  auto_retry: 30 s
  signature:
    private_key: john.doe@cosnim.com
    sign:
      - files
Filesystem Configuration
This section defines the filesystems that are hosted on a continuum.
Note
Currently, operating only one active filesystem per continuum is recommended. If you need
multiple filesystems, the preferred approach is to use different continuums. See the prefix
and namespace options in storage hubs for easy ways of hosting multiple continuums on the
same physical storage.
filesystems
Synopsis
filesystems:
  FILESYSTEM_NAME:
    version: [ 1 | 2 ]
    share_level: [ 0 - 3 ]
    merge_policy: [ drop | replace | rename_old | rename_new ]
    pacing: DURATION
    iosize: SIZE
    stripe_size: SIZE
    inline_size: SIZE
    dir_cache_size: COUNT
    file_cache_size: COUNT
    priority_data:
      - offset: 0
        size: NUMBER
    maxqueue: NUMBER
    maxfiles: NUMBER
    timetravel_marker: [ ~~~ | STRING ]
    trace_api: BOOLEAN
    dedup:
      algo: HASH_ALGO
      minsize: NUMBER
      update: BOOLEAN
      verify: BOOLEAN
      verify_file: BOOLEAN
    security:
      idmap:
        global:
          users:
            USER_NAME: UID
            ⋮
          groups:
            GROUP_NAME: GID
            ⋮
    rindex: BOOLEAN
    metadata:
      ds_store: BOOLEAN
      inode: BOOLEAN
Options
- FILESYSTEM_NAME
Name of the filesystem. The primary (or only) filesystem should be named ‘root’.
- version: [ 1 | 2 ]
Filesystem version. Version 2 is recommended; version 1 exists only for legacy support with older versions of Cosnim.
This parameter only affects how new data is written. The filesystem always operates in hybrid mode, supporting multiple versions simultaneously. All existing data remains in the format in which it was written until it is updated or deleted.
- share_level: [ 0 - 3 ]
Determines how the filesystem will be shared in updates with other users. Use the appropriate level according to your actual needs. Higher sharing levels may be more efficient at sharing with other users but also add processing and storage overhead. You can freely intermix different share levels (except level 0) between users; you should therefore activate higher levels only when needed. Share levels apply only to updates, can be changed at any time, and never prevent simultaneous users from reading the filesystem or continuum.
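To illustrate, a hypothetical minimal filesystem definition might look like the following; the filesystem name and all values shown are placeholders to be tuned to your workload:

```yaml
filesystems:
  root:
    # Version 2 is the recommended format for new data
    version: 2
    # Level 2 is typical for private, single-user continuums
    share_level: 2
    dedup:
      algo: sha256
      minsize: 4096
```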
Global Defaults
defaults
This section defines configuration-wide default parameters for other sections, such as storage hubs. It helps reduce configuration file clutter and normalize parameters.
Synopsis
defaults:
  hubs:
    cache: ...
    encryption: ...
    group_size: ...
    key: ...
    compression: ...
    compression_level: ...
    max_capsule_size: ...
    max_elements: ...
Options
- hubs: [...]
Sets the default parameters that apply to all storage hubs in the current configuration. This allows for better standardization. Please see the primary hubs section for the definition of these parameters.
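As a sketch, a defaults section might standardize hub parameters like this; the values are illustrative, taken from the recommendations given elsewhere in this reference:

```yaml
defaults:
  hubs:
    key: default
    cache: primary
    group_size: 4096
    max_elements: 1000
    compression: zlib
    compression_level: 6
    encryption: aes-256-cbc
```

Individual hub definitions would normally override these defaults where needed.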
Gateways Configuration
gateways
This section defines storage gateways. Gateways are small servers that provide private cloud capsule storage services to continuums by making local or enterprise storage available to many users instead of relying on cloud services. Gateways may also be used to centralize or firewall access to cloud services to protect storage infrastructure further.
Gateways internally leverage their own relays to manage network communications and, therefore, share many of the same parameters. Refer to the ‘relays’ section for details on those parameters.
Synopsis
gateways:
  GATEWAY_NAME:
    client:
      url: URL
      security:
        private_key: KEY_NAME
        ca_cert_file: FILE
    server:
      type: capsule
      url: URL
      security:
        private_key_file: FILE
        ssl_cert_file: FILE
        auth_keychain: [ BOOLEAN | FILE ]
      ...HUBOPTIONS...
      immutable: BOOLEAN
Options
- GATEWAY_NAME:
Name of the gateway. Refer to this name in the storage hubs that use the gateway as a client and in the cosnim start gateway commands used to start the gateway servers.
A gateway definition can have a ‘client’ section, a ‘server’ section, or both. When sharing gateways with multiple users, consider storing the gateway configuration in a separate file, which can then be ‘!include’-ed in the clients’ configurations.
- client:
Begin the definition of a gateway’s client parameters. The following options are identical to those of relays. See the ‘relays’ section for further details:
client:
  url: URL
  security:
    private_key: KEY_NAME
    ca_cert_file: FILE
- server:
Begin the definition of a gateway’s server parameters. The following options are identical to those of relays. See the ‘relays’ section for further details on these:
server:
  url: URL
  security:
    private_key_file: FILE
    ssl_cert_file: FILE
    auth_keychain: [ BOOLEAN | FILE ]
The following options are exclusive to gateways and are described below:
server:
  type: capsule
  immutable: BOOLEAN
  ...HUBOPTIONS...
- type: capsule
Identifies the type of storage hub that this gateway will be leveraging. At the moment, only Capsule Hubs are allowed.
- ...HUBOPTIONS...
Include in this section all the storage hub parameters that define the storage this gateway will serve. The parameters are identical to those of a Capsule Hub.
- immutable: BOOLEAN
When true, the gateway operates purely as an immutable WORM-like (Write-Once-Read-Many) storage system, strictly prohibiting all capsule updates and deletions. This protects the physical storage against encryption, destruction and corruption attacks while providing all services to users. It is recommended to set this option to true.
This option may also protect other types of hubs, such as cloud storage, to create a security firewall between clients and cloud storage providers, further shielding that physical storage against attacks.
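Putting these pieces together, a hypothetical gateway serving immutable local storage to its clients might look like the following; the gateway name, URLs, and file paths are placeholders:

```yaml
gateways:
  mygateway:
    client:
      url: https://gateway.example.com:8443
      security:
        ca_cert_file: ./credentials/gateway_ca.pem
    server:
      type: capsule
      url: https://0.0.0.0:8443
      security:
        private_key_file: ./credentials/gateway_key.pem
        ssl_cert_file: ./credentials/gateway_cert.pem
        auth_keychain: true
      # Recommended: operate as WORM-like storage
      immutable: true
      # ...HUBOPTIONS...: the storage hub parameters served by this gateway
      provider:
        type: file
        path: /srv/cosnim/capsules
```

A definition like this could live in a separate file and be pulled into each client’s configuration with ‘!include’.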
Storage Hubs Configuration
hubs
This section defines storage hubs, which dictate how and where capsules are stored in the continuum. There are multiple types of hubs, each with a specific purpose:
Capsule Hubs
This foundation hub is responsible for assembling, compressing, encrypting, and storing capsules in a specific location, such as local storage, a gateway or a cloud storage service. All other hubs ultimately send their data to capsule hubs for actual storage.
Split Hubs
Redistribute elements to different hubs according to the element types, such as control data, metadata and actual user data. This effectively splits capsules by function. They help to optimize data storage usage and general performance.
Staging Hubs
Upload, download, and cache capsules in stages. Stages are useful to optimize I/O performance, manage local capsule storage caching, and reduce network congestion.
Scatter Hubs
Scatter, replicate and distribute capsules to multiple locations to provide high resiliency to outages and disasters. This is also known as asymmetric replication.
See also the Providers subsection for the definition of providers, which are the last-mile interface to actual storage.
Synopsis
hubs:
  HUB_NAME:
    type: [ capsule, split, staging, scatter ]
    ...HUB_CONFIGURATION...
Options
- HUB_NAME:
Name to give to the hub. The name must be unique among all hubs and providers. One hub has to be designated as the ‘primary_hub’ in the continuum configuration. Ideally, the hubs configuration section should start with this primary hub, often named ‘primary’, followed by the subordinate hubs.
- type: [ capsule, split, staging, scatter ]
Type of hub being defined. The default is ‘capsule’ for Capsule Hubs.
- ...HUB_CONFIGURATION...
All other options depend on the hub’s type, as described below.
Capsule Hubs
These hubs assemble, compress, encrypt, and store data in capsules and reverse the process when retrieving data. Capsule hubs use providers to access physical storage, such as local, network, enterprise or cloud storage.
Synopsis
hubs:
  HUB_NAME:
    type: capsule
    key: KEY_NAME
    group_size: NUMBER
    max_capsule_size: SIZE
    max_elements: NUMBER
    compression: COMPRESSION_ALGO
    compression_level: NUMBER
    encryption: ENCRYPTION_ALGO
    cache: CACHE_NAME
    read_prio: PRIORITY
    write_prio: PRIORITY
    provider:
      ...PROVIDER_CONFIGURATION...
Options
- type: capsule
Identifies this hub as a Capsule Hub. This is the default.
- key: KEY_NAME
Name of the encryption key block in the keychain to encrypt the capsules’ contents.
The default key name is ‘default’.
- group_size: NUMBER
The size of capsule groups. Most providers use a form of name hierarchy similar to directories to organize and retrieve capsules efficiently. The group size determines the maximum number of capsules that can be put in a given group/directory. This, in turn, controls the directory structure and hierarchy. The group size must be a power of 2 and is a permanent parameter – do not change it once capsules have started to be written to storage.
Recommended group_size: 4096
- max_capsule_size: SIZE
Sets the ideal maximum size of a capsule in storage. Beware that capsules may be larger than this size due to mandatory payload.
This is an important tuning parameter that can significantly impact performance and storage efficiency. The recommendation is to first split capsules according to group type (see Split Hubs) and then use smaller sizes for control and metadata capsules and a large size for data capsules.
| Split Hub objtype | Suggested max_capsule_size |
|---|---|
| control | 16 KB |
| metadata | 256 KB |
| data | 10 x the filesystem’s stripe_size or iosize |
- max_elements: NUMBER
Sets the maximum number of data elements (aka “objects”) that can be stored in a given capsule. This tuning parameter limits the size and density of capsules and helps avoid having too many small data elements in a given capsule. In most cases, the recommended values below will provide optimal results.
Recommended max_elements: 1000
- compression: COMPRESSION_ALGO
Activates capsule data compression. The only compression algorithm currently available is ‘zlib’ (other algorithms will be added in the future). To disable compression, omit or set this option to ‘null’.
- compression_level: NUMBER
Sets the compression level to NUMBER. For zlib, the default and recommended compression level is 6, but you can use any level between 1 and 9 according to your particular needs. Do not use compression level 0, as it needlessly increases processing time; instead, omit or nullify the compression option altogether for better performance.
- encryption: ENCRYPTION_ALGO
Activates the encryption of capsules using the given algorithm. Use the command cosnim test encryption to see all available encryption algorithms and their performance on a particular platform.
Recommended value: aes-256-cbc
It is strongly recommended to encrypt capsules with AES-256 (CBC), as it is highly secure, fully vetted, quantum-safe, and gives good performance on all platforms.
- cache: CACHE_NAME
Name of the cache in which capsules can be kept in RAM to improve performance. See the ‘caches’ section for additional information.
- read_prio: PRIORITY
Sets the read priority of this hub relative to other hubs within a scatter group. A read_prio of 1 is the highest priority. See Scatter Hubs for details.
- write_prio: PRIORITY
Sets the write priority of this hub relative to other hubs within a scatter group. A write_prio of 1 is the highest priority. See Scatter Hubs for details.
- provider:
Defines the storage provider that connects the hub to the actual physical storage or cloud service. See the ‘providers’ section for the complete list of providers and their parameters.
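Combining the options above, a hypothetical capsule hub storing encrypted, compressed capsules on local storage might be configured as follows; the hub name, sizes, and path are placeholders, using the recommended values where this reference gives them:

```yaml
hubs:
  primary:
    type: capsule
    key: default
    group_size: 4096        # must be a power of 2; permanent once written
    max_capsule_size: 16 MB
    max_elements: 1000
    compression: zlib
    compression_level: 6
    encryption: aes-256-cbc
    cache: primary
    provider:
      type: file
      path: /srv/cosnim/capsules
```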
Split Hubs
Split hubs redistribute data elements (aka objects) to different capsules and hubs according to the type of those elements. This increases performance and reduces overhead for some data types, such as metadata, by regrouping them together in smaller, more efficient capsules. Using a split hub as the continuum’s primary hub is highly recommended.
Synopsis
hubs:
  HUB_NAME:
    type: split
    interleave: BOOLEAN
    hub_types:
      HUB_NAME:
        objtypes: OBJTYPES
      ⋮
Options
- type: split
Required to identify a hub as a split hub.
- interleave: BOOLEAN
Indicates if elements can interleave between capsules (true) or should be in isolated capsules (false). This option is no longer recommended and should always be omitted or set to false.
- hub_types:
Defines how object types are split and which hubs the elements are redirected to. For each subordinate hub:
- HUB_NAME
Name of the subordinate hub. Only the name of the hub is given here. The hub itself must be defined separately in the ‘hubs:’ section.
- objtypes: OBJTYPES
Defines the types and groups of elements (objects) that should be assigned to this subordinate hub. You should use object type groups whenever possible:
| objtypes group | Description |
|---|---|
| data | User data (file data, streams, …) |
| metadata | User metadata (file info, directories, embedded data, …) |
| dedup | Deduplication information |
| security | Signature cascades, certificates & security info |
| control | Internal continuum controls |
| system | General system management (Time-Travel, Reclaimer, Acceleration, …) |
It’s strongly recommended to use split hubs and redirect at least data elements to a different hub and capsules – this increases efficiency and performance significantly. Large and shared continuums should also consider splitting control data capsules, which have very different storage & retrieval profiles than other capsules.
The recommended split hub layouts are:
Shared continuums
For continuums that are shared with other users (filesys.share_level >= 3), a split hub to redistribute data to three hubs is recommended (one hub for data, one for control, and a default hub for everything else):
hubs:
  primary:
    type: split
    hub_types:
      control:
        objtypes: control
      system:
        objtypes: default
      data:
        objtypes: data
Private continuums
A two-tier setup is more efficient for small and private continuums that are not actively shared, such as backups and single-user mode (filesys.share_level = 2).
hubs:
  primary:
    type: split
    hub_types:
      system:
        objtypes: default
      data:
        objtypes: data
Note
Please see the Capsule Hubs definition for recommended capsule sizes for each hub type.
Staging Hubs
Staging hubs are special Capsule Hubs that transport capsules from one provider to another in stages. They help manage local caches for performance. They also help smooth out temporary network congestion by first writing capsules to local storage and uploading them asynchronously to remote or cloud storage.
Synopsis
hubs:
  HUB_NAME:
    type: staging
    ...CAPSULE_HUB_CONFIGURATION...
    keep_cached: BOOLEAN
    safe_to_purge: FILE
    max_memory_size: SIZE
    stages:
      - provider:
          ...PROVIDER_CONFIGURATION...
      ⋮
Options
- type: staging
Identifies this hub as a Staging Hub.
- keep_cached: BOOLEAN
Specifies if the first stage hub is used for caching (true) or transit (false). When true, the first stage hub will always keep a copy of the capsules it handled, even if they are now in the second stage. When false, capsules are deleted from the first stage hub as soon as they are successfully written to the second stage.
- safe_to_purge: FILE
Name and path of the safe-to-purge file that contains the list of all the capsules stored in the first stage that are safe to purge. This list can be used to safely delete capsules from local storage to free up space.
- max_memory_size: SIZE
The maximum amount of RAM the staging hub can use to keep a copy of capsules waiting to be transmitted to the second stage. When this size is exceeded, the staging hub will purge the pending capsules from RAM and will instead re-read the capsule from the first stage / local storage when the second stage is ready to accept capsules.
- stages: [...]
Defines the capsule storage stages. Each stage must have a provider configuration that defines where the capsules will be physically stored. Currently, only two stages can be defined in a staging hub. The first stage should be faster storage than the second stage, for example, local storage. The second stage should be the ultimate destination of the capsules, for example, cloud or enterprise storage.
The order in which capsules are written to the stages is determined by the capsules’ content, purpose and priority. Most capsules are stored first in the first stage and then uploaded to the second stage as network conditions permit. Some capsules, such as those handling continuum controls, will instead be written to the second stage first to formally broadcast their contents and then written to the first stage for caching purposes.
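As an illustration, a hypothetical staging hub that caches capsules on local disk before uploading them to cloud object storage might look like this; the hub name, paths, bucket name, and credentials file are placeholders:

```yaml
hubs:
  staged_data:
    type: staging
    key: default
    compression: zlib
    encryption: aes-256-cbc
    keep_cached: true                     # first stage doubles as a local cache
    safe_to_purge: ./state/safe_to_purge  # list of locally purgeable capsules
    max_memory_size: 256 MB
    stages:
      - provider:                         # first stage: fast local storage
          type: file
          path: /var/cache/cosnim/stage1
      - provider:                         # second stage: final cloud destination
          type: amazon_s3
          region: us-east-1
          bucket_name: example-capsules
          credentials:
            file: ./credentials/aws.json
```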
Scatter Hubs
Scatter hubs are used to scatter, distribute and replicate capsules asymmetrically to multiple locations to provide high resiliency to outages and disasters.
Synopsis
hubs:
  HUB_NAME:
    type: scatter
    hubs: [ HUB_NAME, ... ]
    read_policy: priority
    write_policy: [ roundrobin | priority | random ]
    read_prio: NUMBER
    write_prio: NUMBER
    min_copies: NUMBER
    min_hubs: NUMBER
    fast_resume: BOOLEAN
    distribute: BOOLEAN
Options
- type: scatter
Identifies this hub as a Scatter Hub.
- hubs: [ HUB_NAME, ... ]
Name of the subordinate hubs to which capsules will be distributed. The order is unimportant. The subordinate hub’s ‘read_prio’ and ‘write_prio’ may impact capsule distribution depending on the scatter hub’s ‘read_policy’ and ‘write_policy’.
- read_policy: priority
Determines how hubs are to be selected when reading capsules. Currently, the only option available is ‘priority’, meaning hubs will be selected in order of priority. Also see ‘read_prio’ in Capsule Hubs.
- write_policy: [ roundrobin | priority | random ]
Determines how hubs are selected when writing capsules. The available options are:
priority
Hubs are selected in order of priority. See ‘write_prio’ in Capsule Hubs. You’ll probably also want to set distribute=false with this policy if you wish all capsules to be written primarily to the higher-priority hubs.
roundrobin
Hubs are selected in a round-robin fashion, one after the other, skipping over hubs that are not available.
random
Hubs are selected randomly among the currently available hubs.
- min_copies: NUMBER
The minimum number of copies to write to the subordinate hubs. This establishes the outage tolerance of the scatter hub. For example, if min_copies=2, at least two copies of capsules will be written to different hubs, meaning that any one hub may go down without affecting data availability.
- min_hubs: NUMBER
The minimum number of hubs that must be online for the scatter hub to be considered online. The default, and the minimum allowed, is the total number of hubs minus ‘min_copies’, plus 1. For example, a scatter hub with three hubs and a ‘min_copies’ of 2 needs a minimum (min_hubs) of 2 online hubs. You may set a min_hubs value higher than this minimum. If the number of available hubs falls below this threshold, the scatter hub’s status changes to ‘fractured’. Depending on the other hubs’ configuration, this may escalate the fractured status up to the continuum, disabling its operation.
- distribute: BOOLEAN
When true, the scatter hub attempts to distribute capsules more or less evenly across all available hubs. In that case, the write_policy only affects the order in which hubs are selected for writing; all online hubs will ultimately receive more or less the same quantity of capsules.
When false, the scatter hub does not attempt to distribute capsules evenly and obeys the write_policy strictly. For example, if write_policy=priority, only the highest-priority hubs will usually receive capsules; all other hubs are ignored except when one or more higher-priority hubs are temporarily unavailable.
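For example, a hypothetical scatter hub replicating capsules across three subordinate hubs (each of which must be defined separately in the ‘hubs:’ section) could be sketched as:

```yaml
hubs:
  scattered:
    type: scatter
    hubs: [ site_a, site_b, site_c ]
    read_policy: priority
    write_policy: roundrobin
    min_copies: 2      # any single hub can fail without losing availability
    distribute: true
```

With three hubs and min_copies=2, the default min_hubs works out to 3 - 2 + 1 = 2 online hubs.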
Providers
Providers are the bridge and last-mile interface between the storage hubs, which assemble and manage capsules, and the actual physical or cloud storage where capsules are stored. They leverage internal adapters that understand how a particular storage system or service should be accessed. Providers are integral to the storage hubs’ configuration and should be included anywhere there is a reference to PROVIDER_CONFIGURATION in the hub’s configuration.
Synopsis
There are multiple classes and types of providers, each with its own set of options:
Local/Enterprise Storage
provider:
  name: PROVIDER_NAME
  type: file
  path: PATH
  prefix: PREFIX
  namespace: NAMESPACE
  autocreate: BOOLEAN
  relay: RELAY_NAME
  remote: BOOLEAN
  maxqueue: NUMBER
Gateways
provider:
  name: PROVIDER_NAME
  type: gateway
  gateway: GATEWAY
  namespace: NAMESPACE
Cloud Object Storage
provider:
  name: PROVIDER_NAME
  type: amazon_s3 | azure_blob | tencent_cos | digitalocean_spaces | google_gcs | backblaze_s3 | ovh_s3 | wasabi
  region: REGION
  bucket_name: BUCKET_NAME
  prefix: PREFIX
  namespace: NAMESPACE
  autocreate: BOOLEAN
  host: HOSTNAME
  credentials: [ key: KEYNAME | file: FILE ]
DynamoDB
provider:
  name: PROVIDER_NAME
  type: amazon_dynamodb
  region: REGION
  bucket_name: BUCKET_NAME
  prefix: PREFIX
  namespace: NAMESPACE
  autocreate: BOOLEAN
  host: HOSTNAME
  credentials: [ key: KEYNAME | file: FILE ]
  read_capacity_units: NUMBER
  write_capacity_units: NUMBER
  exceed_capacity_backoff: NUMBER
Dropbox
provider:
  name: PROVIDER_NAME
  type: dropbox
  prefix: PREFIX
  namespace: NAMESPACE
  autocreate: BOOLEAN
  credentials: [ key: KEYNAME | file: FILE ]
  rate_limit_backoff: NUMBER
Options
- name: PROVIDER_NAME
Assigns a specific name to the provider. If omitted, Cosnim generates a name from the hub’s configuration.
- type: TYPE
Defines the type of provider:
file
Uses local native filesystems to store and retrieve capsules. Capsules are organized in directories to lighten the load on local filesystems.
gateway
Uses a Cosnim gateway to access capsule storage. The gateway acts as a bridge between a Cosnim instance and the actual storage provider. Gateways are often used to build a private cloud storage environment shared among many users as an alternative to cloud storage services.
amazon_s3 | azure_blob | tencent_cos | digitalocean_spaces | google_gcs | backblaze_s3 | ovh_s3 | wasabi
Uses public cloud object storage services to store and retrieve capsules.
amazon_dynamodb
Uses Amazon DynamoDB to store and retrieve capsules. This provider should be used only for the small ‘control’ and ‘system’ capsules that benefit highly from this type of storage. Larger data and metadata capsules should be written to other types of providers.
dropbox
Leverages Dropbox for capsule storage. Although perfectly functional, this provider has performance limitations due to the Dropbox service’s own limits; it should be used mostly for low-priority, low-cost storage.
- autocreate: BOOLEAN
When true, the provider and underlying adapters can automatically create the underlying storage structures, such as the directories, buckets and paths required to store capsules. When false, providers and adapters may do this only during a cosnim create continuum or cosnim expand continuum command.
Applicable to all provider types except gateways.
- path: PATH
Path to the local or enterprise storage filesystem where capsules will be stored.
Applicable to provider types: file
- region: REGION
Identifies the cloud storage provider’s region where the bucket is located.
Applicable to the following providers:
amazon_dynamodb
amazon_s3
backblaze_s3
cloudflare_r2
ovh_s3
tencent_cos
wasabi
- bucket_name: BUCKET_NAME
Name of the cloud object storage bucket in which capsules will be stored.
Applicable to all cloud storage providers.
- prefix: PREFIX
The fixed prefix prepended to all capsule names and paths before they are stored or retrieved. It permanently subdivides the provider’s primary storage, for example, a bucket, for specific uses. The provider never attempts to access data, for example, objects, files or database records, outside of this prefix. See ‘namespace’ below for further description of hub storage subdivision, capsule naming conventions and how they relate to prefixes.
Applicable to all providers.
- namespace: NAMESPACE
Subdivides a provider’s storage for a specific continuum. The namespace is appended to the provider’s bucket name/filesystem path and the prefix to create a complete path to this continuum’s capsules. The full path of capsules then becomes:
For cloud storage: BUCKET_NAME / PREFIX / NAMESPACE / CAPSULE_PATH
For local storage: PATH / PREFIX / NAMESPACE / CAPSULE_PATH
Gateways use namespaces primarily to share a physical storage space with multiple users and continuums. When gateways are not used, a prefix is usually sufficient to subdivide storage.
The recommendation is to:
Use ‘bucket_name’ and ‘path’ parameters to formally delimit the physical boundaries of the provider’s storage.
Use ‘prefix’ to direct the provider to use only a subset of that storage. The provider will not attempt to access storage outside of this boundary. The prefix can be part of the provider’s security profile in the physical storage.
Use ‘namespace’ to identify a particular continuum in the provider’s storage. A provider may access multiple namespaces.
Applicable to all providers.
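As a sketch of how these parameters compose, here is a hypothetical cloud provider entry (the provider name, bucket and values are invented for illustration, and the enclosing hub section is omitted):

```yaml
# Hypothetical S3 provider entry; names and values are examples only.
- name: primary-s3
  type: amazon_s3
  region: us-east-1
  bucket_name: acme-storage   # physical boundary of the provider's storage
  prefix: cosnim/prod         # the provider never accesses data outside this prefix
  namespace: team-a           # identifies this continuum within the prefix
# Resulting capsule paths:
#   acme-storage / cosnim/prod / team-a / CAPSULE_PATH
```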
- credentials: [ key: KEYNAME | file: FILE ]
Supplies the access credentials the provider and its adapter need to connect to the cloud storage service. The credentials can be read from the keychain or a separate file.
The precise format and contents of the credentials depend on the cloud storage provider. Where possible, Cosnim adapters use JSON, as follows:
Provider type | Credentials format
|---|---|
amazon_dynamodb | {"aws_access_key_id": "…", "aws_secret_access_key": "…"}
amazon_s3 | {"aws_access_key_id": "…", "aws_secret_access_key": "…"}
azure_blob | DefaultEndpointsProtocol=https;AccountName=…;AccountKey=…;…
backblaze_s3 | {"access_key_id": "…", "secret_access_key": "…"}
cloudflare_r2 | {"access_key_id": "…", "secret_access_key": "…"}
dropbox | […]
google_gcs | {"type": "service_account", "project_id": "…", "private_key": "…", …}
ovh_s3 | {"access_key_id": "…", "secret_access_key": "…"}
tencent_cos | {"secret_id": "…", "secret_key": "…"}
wasabi | {"access_key_id": "…", "secret_access_key": "…"}
Applicable to all providers.
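The two credential sources can be sketched as follows (the keychain entry name and file path are hypothetical):

```yaml
# Read credentials from the continuum's keychain
# ('aws-prod-key' is a hypothetical keychain entry name)...
credentials:
  key: aws-prod-key
---
# ...or read them from a separate file (the path is an example).
credentials:
  file: /etc/cosnim/aws-credentials.json
```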
- gateway: GATEWAY
Name of the gateway to use. The remainder of the configuration is in the ‘gateways:’ section under ‘client:’.
Applicable to gateway providers.
- host: HOSTNAME
Overrides the hostname of the cloud storage provider. Cosnim adapters set this parameter by default.
Applicable to cloud providers.
- relay: RELAY_NAME
Name of the relay to use to accelerate access to this provider. Storage hubs use relays to cache and exchange capsule rosters (inventories) with other storage hubs and users and between executions; this reduces the number of queries a given provider will need to make to the cloud storage service, increasing performance and reducing cloud access costs. Relays can be used whether or not continuums are shared with other users.
Applicable to all providers.
- remote: BOOLEAN
When true, the provider’s adapter runs in a separate remote process for better isolation. This may also improve some adapters’ performance.
Recommended value: true for ‘azure_blob’, ‘dropbox’ and ‘google_gcs’, false for all others.
- maxqueue: NUMBER
Establishes how many concurrent requests may be queued to a remote adapter. When this number is exceeded, the storage hub will accumulate the additional requests until the adapter catches up.
The recommended value is 10.
Applicable only to providers using ‘remote=true’.
- rate_limit_backoff: NUMBER
Sets the starting backoff delay, in seconds, applied when an adapter exceeds the cloud service’s request rate limit. The adapter slows its request rate starting at this value and keeps doubling it until the cloud service stops returning errors.
The default value is 0.05.
Applicable to Dropbox only.
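Combining the remote-adapter options above, a Dropbox provider might be declared as follows (the provider name and credentials path are hypothetical; the values reflect the recommendations stated above):

```yaml
- name: archive-dropbox       # hypothetical provider name
  type: dropbox
  remote: true                # recommended for dropbox adapters
  maxqueue: 10                # recommended queue depth for remote adapters
  rate_limit_backoff: 0.05    # default starting backoff, in seconds
  credentials:
    file: /etc/cosnim/dropbox-credentials.json   # example path
```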
- read_capacity_units: NUMBER
Sets the rate (per second) at which read requests are provisioned to DynamoDB. The read_capacity_units are set by the adapter before accessing the table. This affects performance and billing. See AWS documentation for details.
The default is 25.
Applicable to Amazon DynamoDB only.
- write_capacity_units: NUMBER
Sets the rate (per second) at which write requests are provisioned to DynamoDB. The write_capacity_units are set by the adapter before accessing the table. This affects performance and billing. See AWS documentation for details.
The default is 25.
Applicable to Amazon DynamoDB only.
- exceed_capacity_backoff: NUMBER
Sets the starting backoff delay, in seconds, applied when the adapter begins to exceed the read or write capacity units. The adapter slows its request rate starting at this value and keeps doubling it until DynamoDB stops returning errors.
The default value is 0.5.
Applicable to Amazon DynamoDB only.
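A DynamoDB provider for the small ‘control’ and ‘system’ capsules might then look like this (the provider name is hypothetical; the numeric values are the stated defaults):

```yaml
- name: control-ddb             # hypothetical provider name
  type: amazon_dynamodb
  region: us-east-1
  read_capacity_units: 25       # default provisioned read rate
  write_capacity_units: 25      # default provisioned write rate
  exceed_capacity_backoff: 0.5  # default starting backoff, in seconds
```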
Licensing Control
license
Identifies the software license to use for this instance.
Synopsis
license: FILE
- license: FILE
Name and path of the license file to use.
Note
The license file may contain multiple licenses and be shared between users and servers. Cosnim will automatically pick the first license that matches the current host and environment.
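A minimal example (the path is illustrative):

```yaml
# The file may contain several licenses; Cosnim picks the first match.
license: /etc/cosnim/licenses.lic
```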
Logging Control
logging
Defines how logging is to be handled.
Synopsis
logging:
- type: [ stdout | stderr | file ]
path: FILE
verbosity: NUMBER
events:
- EVENT_NAME
- logging: [...]
Defines the logging facilities to use. There can be more than one logging facility, each with its particular options.
- type: [ stdout | stderr | file ]
Type and destination of this logging facility. Output can be directed to standard output, standard error or a particular file.
The default is stderr.
- path: FILE
Name and path of the file to receive logging. Use when type=file.
- verbosity: NUMBER
Verbosity of the messages. Use positive numbers to increase the amount and detail of messages, and negative numbers to reduce logging:
Verbosity level | Effect
|---|---|
0 | Normal logging. Includes informational messages (default).
-1 | Reduced logging. Includes notices, warnings, responses and errors.
-2 | Reduced logging. Includes warnings, responses and errors.
-3 | Minimal logging. Includes responses and errors only.
-4 | Suppressed logging. Includes errors only.
-5 | Suppressed logging. Includes fatal/disastrous errors only.
-6 | Disabled logging.
1 | Increased logging. Includes additional informational messages.
2 | Increased logging. Includes all informational messages.
3 | Increased logging. Includes informational messages and some traces.
4 and up | Debugging logging.
- events: [...]
List of internal diagnostic events to include in the logging. Use as directed by Cosnim support.
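As an illustration, the following configures two logging facilities: quieter console output plus a detailed log file (the file path is an example):

```yaml
logging:
  - type: stderr
    verbosity: -1      # notices, warnings, responses and errors only
  - type: file
    path: /var/log/cosnim/cosnim.log   # example path
    verbosity: 2       # all informational messages
```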
Mounts
mount
Defines how the continuum filesystem should be mounted on the local machine to access it as a regular filesystem. This is used by the cosnim mount command.
Synopsis
mount:
mountpoint: PATH
volume: VOLUME_NAME
fsname: FILESYS_NAME
threads: BOOLEAN
readonly: BOOLEAN
default_permissions: BOOLEAN
iosize: SIZE
direct_io: BOOLEAN
allow_root: BOOLEAN
allow_other: BOOLEAN
auto_cache: BOOLEAN
auto_xattr: BOOLEAN
fuse_debug: BOOLEAN
driver:
type: cosnim
filesystem: root
- mount:
Begins a mount section. At the moment, only one filesystem may be mounted. To mount multiple filesystems, run a cosnim mount command separately for each mountpoint.
- mountpoint: PATH
The path where the filesystem will be mounted and where users & applications may access the continuum’s filesystem. The mountpoint directory is automatically created and deleted if the --mkdir option of the cosnim mount command is set or allowed to default.
- volume: VOLUME_NAME
The name of the volume as seen by the host operating system. This is currently used only under MacOS.
- fsname: FILESYS_NAME
The name of the filesystem type that is presented to the OS. This helps to differentiate a Cosnim filesystem from others hosted on the same OS.
The default is ‘cosnim-fs’.
- threads: BOOLEAN
Enables or disables the use of threads when running the mounted filesystem. Threads increase performance and responsiveness, but may make debugging harder.
The recommended and default value is true. Disable threads only when requested by Cosnim support.
- readonly: BOOLEAN
Set to true if the filesystem should be mounted read-only and false if read & write access is to be permitted. The --ro and --rw options of the cosnim mount command may be used to override this configuration parameter.
- iosize: SIZE
Sets the size of I/O operations of the filesystem. This is currently used only under MacOS. It should usually match the iosize of the continuum filesystem this mountpoint is using.
The default is 131072 (128KiB), which provides a good balance of performance vs buffering.
- direct_io: BOOLEAN
When true, this activates direct I/O with the operating system. Direct I/O bypasses OS buffering, which can increase performance, but it can also decrease performance if applications are making small I/Os.
The recommended value is false unless directed by Cosnim support.
- allow_root: BOOLEAN
Allows the root user to access the mounted filesystem.
By default, only the actual user that runs the cosnim mount command can access the filesystem. All other users are locked out for security reasons. This option allows the root user to access the mounted filesystem with full authority.
- allow_other: BOOLEAN
Allow other users to access the mounted filesystem.
By default, only the actual user that runs the cosnim mount command can access the filesystem. This option allows other non-root users to access the mounted filesystem.
Note
You may mount a filesystem using a regular user or root, independently of how the filesystem is shared with other users.
- auto_cache: BOOLEAN
Turns on FUSE automatic file caching. This should always be turned on (default) as it improves performance significantly. You may disable auto_cache if you suspect automatic caching interferes with file contents.
The default and recommended value is true.
- fuse_debug: BOOLEAN
When true, turns on FUSE debugging. Beware: this produces a large amount of output and should be used only for debugging.
- driver:
Defines the backend driver and filesystem to serve on this mountpoint.
The default is to use the Cosnim filesystem named ‘root’. Change this option only if you gave your filesystem a name other than ‘root’ in the ‘filesystems:’ section.
Example:

driver:
  type: cosnim
  filesystem: root
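Putting the options above together, a typical mount section might look like this (the mountpoint and volume name are examples):

```yaml
mount:
  mountpoint: /mnt/continuum   # example path
  volume: Continuum            # volume name shown by MacOS
  fsname: cosnim-fs            # default filesystem type name
  readonly: false              # allow read & write access
  allow_other: true            # let other non-root users access the mount
  driver:
    type: cosnim
    filesystem: root           # default filesystem name
```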
Relays Configuration
relays
Relays are very small servers (servicelets) that cache and exchange information between running Cosnim instances and between Cosnim executions to improve performance and reduce costs. Relays host the following services:
Rosters
Relay rosters collect, cache and share minimal capsule information between running Cosnim instances. This reduces the number of requests storage hubs need to make to the storage adapters, increasing performance and reducing costs, especially in the cloud. Rosters do not store or share any user data, metadata or other data that could leak information about the continuum’s content. Relays operating rosters can be shut down and restarted with no impact.
Distributed Locks
Relays, when present, will sometimes be used to operate and share internal locks with other instances to optimize sharing activities. This reduces the instances’ overhead and increases the general performance of the continuum in a shared environment.
Gateways
Gateways also use internal relays to provide connectivity between clients and gateway servers. These relays are independent of the relays defined here.
Synopsis
relays:
RELAY_NAME:
client:
url: URL
security:
private_key: KEY_NAME
ca_cert_file: FILE
server:
url: URL
security:
private_key_file: FILE
ssl_cert_file: FILE
auth_keychain: [ BOOLEAN | FILE ]
Options
- RELAY_NAME:
Name of the relay. Refer to this name in storage hubs that use the relay as a client and in the cosnim start relay commands used to start the relay servers.
A relay definition can have a ‘client’ section, a ‘server’ section, or both. When sharing relays with multiple users, consider storing the relay configurations in a separate file, which can then be ‘!include’d in the clients’ configurations.
- client:
Begin the definition of a relay’s client parameters.
- url: URL
URL of the relay, in the form tcp[s]://HOST:PORT, that clients should use to connect to the relay server.
- security:
Section to define the security identifiers and certificates to use to connect to the relay server.
- private_key: KEY_NAME
Name of the private key to use to authenticate with the server. The key is read from the continuum’s keychain. It is required if the relay server is configured to authenticate client connections (‘auth_keychain’). You can create a key using the cosnim generate key --private ... command.
- ca_cert_file: FILE
Name and path of the CA certificate file that authenticates the relay server. This parameter is required when connecting with SSL/TLS (protocol ‘tcps’). See the Installation and Configuration Guide for instructions on how to generate a relay CA key and certificate.
- server:
Begin the definition of a relay’s server parameters.
- url: URL
URL that the relay server binds to and listens on for client connections. Specify it in the form tcp[s]://HOST:PORT. Use a HOST of ‘0.0.0.0’ to accept connections on any interface. Two protocols are supported:
tcp
Serves client requests on unencrypted TCP communication channels. This protocol does not disclose any sensitive or valuable information in clear text. As long as storage hubs are configured with encryption, all user data, metadata, and control information are already fully encrypted and quantum-safe in capsules prior to transmission. No other sensitive information is transmitted in clear text, and all client authentication is performed using an SSL-like challenge protocol, fully preserving confidentiality over unencrypted channels.
tcps
Serves client requests over encrypted SSL/TLS communication channels. This is in addition to the quantum-safe encryption of capsules. An SSL/TLS channel can be used to increase the protection against man-in-the-middle attacks and make communication even more opaque. When ‘tcps’ is used, a ‘private_key_file’ and ‘ssl_cert_file’ must be provided to the server, and a ‘ca_cert_file’ must be provided to clients. See the Installation and Configuration Guide for instructions on generating these certificates.
- security:
Defines the security options of the relay server (see below).
- private_key_file: FILE
Name and path of the SSL/TLS private key. Keep this key in a secure location. See the Installation and Configuration Guide for instructions on how to generate this key.
- ssl_cert_file: FILE
Name and path of the SSL/TLS certificate that will be presented to clients that connect to the server. See the Installation and Configuration Guide for instructions on how to generate this certificate.
- auth_keychain: [ BOOLEAN | FILE ]
Determines how clients connecting to the relay are authenticated:
FILE
Name of a keychain that contains the public keys of clients authorized to connect to this server. The relay uses a challenge protocol and public-key authentication to confirm the identity of the clients connecting.
Only users defined in this keychain can connect to the server. See the Installation and Configuration Guide for instructions on how to manage relays and relay security.
true
When set to true, the continuum’s keychain is used to authenticate client connections.
false
When set to false, no connection security protocol is implemented; any client can connect to the relay server. However, this does not create a significant security risk as clients still need to provide additional security information, such as internal UUIDs and storage hub names and keys, before they are allowed to participate in a relay. Without this critical information, which can only be obtained from authorized Cosnim instances, unauthenticated users cannot effectively use relays. Nevertheless, authentication security should be implemented when relays are accessed through the Internet or untrusted networks.
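A sketch of a relay defined with both client and server sections (the relay name, host, port and file paths are all invented for illustration):

```yaml
relays:
  main-relay:                  # referenced by storage hubs and cosnim start relay
    client:
      url: tcps://relay.example.com:7400    # example host and port
      security:
        private_key: relay-client-key       # hypothetical keychain entry
        ca_cert_file: /etc/cosnim/relay-ca.pem
    server:
      url: tcps://0.0.0.0:7400              # listen on all interfaces
      security:
        private_key_file: /etc/cosnim/relay-server.key
        ssl_cert_file: /etc/cosnim/relay-server.pem
        auth_keychain: /etc/cosnim/relay-clients.keychain
```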
Sync Profiles
sync_profiles
Defines the profiles used by the cosnim sync|backup|restore commands. A profile defines many of the commands’ parameters, avoiding the need to supply them on the command line.
There are three default profiles: ‘sync’, ‘backup’ and ‘restore’, which automatically match the cosnim command used. Other profiles may be created and selected with the ‘--profile’ option of the cosnim command.
Synopsis
sync_profiles:
PROFILE_NAME:
override:
...CONFIGURATION_OPTIONS...
⋮
drift_ns: NANOSECONDS
delete_pacing: DURATION
timetravel_tag: STRING
filters:
- include: PATTERN
exclude: PATTERN
ignore: PATTERN
⋮
paths:
- source: SYNC_PATH
destination: SYNC_PATH
⋮
Options
- PROFILE_NAME
Name of the sync profile. It can be ‘sync’, ‘backup’ or ‘restore’, or any other name of your choosing as selected by the --profile option of the cosnim command.
- override:
Overrides other configuration options when this profile is selected. This can be used to customize the operation of the continuum while sharing a standard configuration with other uses, such as mounting live filesystems. The option most frequently overridden during syncs is the filesystem’s pacing value. For example:
sync_profiles:
  backup:
    override:
      filesystems:
        root:
          pacing: 15 s
The above reconfigures the filesystem to pace commits every 15 seconds when running a cosnim backup command. Increasing the pacing value during backups is highly recommended because backup operations don’t need continuous Time-Travel: the objective of Time-Travel points during backups is to capture the state of the continuum at the end of the backup, not intermediary points. A modest pacing value, for example every 15 to 60 seconds, instead of a very high one is still helpful, as it accelerates resumed backups if they are ever interrupted.
- drift_ns: NANOSECONDS
Determines how much file modification times are allowed to drift before files are considered unequal. This helps to compensate for different filesystem time resolutions. See the --drift-ns option of the cosnim sync|backup|restore command for more information about the effects of this parameter.
If this option is specified both in the profile configuration and on the cosnim command options, the latter has precedence.
- delete_pacing: DURATION
Delays file and directory deletions in the destination for an amount of time. This helps to smooth out events where applications continuously delete and recreate the same files and directories as a way of updating them atomically. Delete pacing forces Cosnim to wait a little bit before considering the file or directory as effectively deleted. This helps Time-Travel to reflect the actual net event (update) instead of reporting repeating deletions & re-creations.
Delete pacing may impact the RPO during small and frequent backups as it may force Cosnim to wait a little longer when deleting files.
The default and recommended delete_pacing value is 500 ms.
- timetravel_tag: STRING
Customizes the Time-Travel tag that is created when a sync or equivalent operation completes. Time-Travel tags are sometimes referred to as “snapshots” because they identify a particular Time-Travel point by name. However, whether or not Time-Travel points are tagged does not affect how Time-Travel functions.
The timetravel_tag parameter is suffixed with the current time and the unique Time-Travel identifier to produce the full tag name. For example, if timetravel_tag=’Sync’, each completed cosnim sync command creates a tag whose name begins with ‘Sync’, followed by the time and identifier.
By default, the timetravel_tag is named after the cosnim command run. It can also be overridden with the --timetravel-tag option of the cosnim sync|backup|restore command, which takes precedence over the profile’s value.
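As a sketch, a custom profile combining the options described so far might read (the profile name and values are illustrative):

```yaml
sync_profiles:
  nightly:                     # selected with --profile nightly
    timetravel_tag: 'Nightly'  # tag prefix; time and unique ID are appended
    drift_ns: 1000000          # tolerate up to 1 ms of timestamp drift
    delete_pacing: 500 ms      # default and recommended value
```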
- filters: [ include | exclude | ignore : PATTERN, ... ]
Enumerates the list of filters that determine which files and directories are synchronized, backed up or restored. The filters are tested one by one in the order they appear in this section. The first pattern that matches the current file/directory is used. There are three types of filters:
include
Identifies files and directories that should be synchronized.
exclude
Identifies files and directories that should not be synchronized. If such files or directories appear in the destination, they are removed to keep the destination in sync with the source.
ignore
Identifies files and directories that should be ignored completely, both on the source and destination. Those files or directories are never synchronized nor removed from the destination if they don’t exist on the source.
Patterns are used to determine if a given file or directory matches. The following rules apply:
A pattern with no path delimiter (/) applies to any directory or subdirectory. For example, the pattern ‘tempfile’ would match any file named ‘tempfile’.
A pattern that contains a path delimiter is relative to the source. For example, if a backup is run on the source filesystem directory ‘/data/mydata/’ and the filter pattern is ‘mydir/myfile’, this would match the file ‘/data/mydata/mydir/myfile’ on the local filesystem.
A ‘?’ matches any single character. For example, the pattern ‘mydir/m?file’ would match ‘mydir/myfile’ and ‘mydir/mofile’, but not ‘mydir/mfile’.
A ‘*’ matches any number of characters, including no characters, within a given directory level. For example, the pattern ‘*/myfile’ would match ‘mydocs/myfile’ and ‘mydir/myfile’, but not ‘mydir/mydocs/myfile’.
A ‘**’ matches any number of characters, including no characters, in any number of directory levels. The pattern can be used at a path’s beginning, middle and/or end. For example, the pattern ‘**/tempdir/**’ would match any directory named ‘tempdir’ at any level, including all subdirectories.
Example:
sync_profiles:
backup:
filters:
- exclude: '*.swp'           # Excludes all files that end with .swp
- exclude: '.TemporaryItems' # Excludes all files named '.TemporaryItems'
- exclude: '**/cache/*.tmp'  # Excludes all files ending with .tmp under a directory named 'cache'
- ignore: '.DS_Store'        # Ignores MacOS Finder settings files
- paths: [ source: SOURCE_PATH, destination: DEST_PATH, ... ]
Specifies one or more pairs of source and destination paths that the sync, backup, or restore command will synchronize. For each pair, one path must be on a local filesystem and the other must be in the continuum. Local paths must start with a ‘./’ (current directory) or ‘/’ (absolute path). Paths on the continuum must begin with ‘cosnim:/’.
When running the cosnim backup command, all source paths must be on a local filesystem, and destination paths must be on a continuum. When running the cosnim restore command, all paths must be in the opposite direction. There is no restriction as to the direction when running the cosnim sync command.
Examples
sync_profiles:
backup:
paths:
- source: /home/johndoe
destination: cosnim:/Backups/user/johndoe
- source: /opt/mysoft
destination: cosnim:/Backups/mysoft