URL attributes may be defined using the URL File Dialog.
Unless explicitly stated otherwise, URL attributes of File Operation components accept multiple URLs separated with a semicolon (';').
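As a rough illustration of the multi-URL convention above, the following Python sketch splits an attribute value on the semicolon separator. The function name `split_urls` is hypothetical and not part of CloverDX. It assumes that any literal semicolon inside a single URL has already been percent-encoded as %3B.

```python
from typing import List


def split_urls(attribute_value: str) -> List[str]:
    """Split a URL attribute into individual URLs.

    Assumption: a literal semicolon inside one URL is written as %3B,
    so every raw ';' can be treated as a separator.
    """
    # Drop empty pieces produced by a trailing or doubled separator.
    return [part for part in attribute_value.split(";") if part]


urls = split_urls("/path1/filename1.txt;/path2/filename2.txt")
print(urls)
```

The escaped form stays intact because %3B contains no raw semicolon.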
Important: To ensure graph portability, forward slashes must be used when defining the path in URLs (even on Microsoft Windows).
Most protocols support wildcards: ? (question mark) matches one arbitrary character; * (asterisk) matches any number of arbitrary characters. Note that wildcard support and syntax are protocol-dependent.
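The wildcard semantics described above can be approximated with a short Python sketch. This is not the CloverDX implementation; the helper names `segment_matches` and `path_matches` are hypothetical. The sketch also assumes that a wildcard never matches across a `/` path separator, which may not hold for every protocol.

```python
import re


def segment_matches(mask: str, name: str) -> bool:
    """Match one path segment against a mask using only ? and *."""
    # '?' -> exactly one character, '*' -> any number of characters
    # (including none); every other character is taken literally.
    pattern = "".join(
        "." if ch == "?" else ".*" if ch == "*" else re.escape(ch)
        for ch in mask
    )
    return re.fullmatch(pattern, name) is not None


def path_matches(mask: str, path: str) -> bool:
    """Match a whole path, segment by segment.

    Assumption: wildcards do not span the '/' separator, so the mask
    and the path must have the same number of segments.
    """
    mask_parts = mask.split("/")
    path_parts = path.split("/")
    if len(mask_parts) != len(path_parts):
        return False
    return all(segment_matches(m, p) for m, p in zip(mask_parts, path_parts))


print(path_matches("/path?/*.txt", "/path1/report.txt"))
print(path_matches("/path?/*.txt", "/path12/report.txt"))
```

Matching segment by segment keeps a mask such as /path/* limited to direct children of the directory under this assumption.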
Below are some examples of possible URLs for File Operations:
/path/filename.txt
One specified file.
/path1/filename1.txt;/path2/filename2.txt
Two specified files.
/path/filename?.txt
All files satisfying the mask.
/path/*
All files in the specified directory.
/path?/*.txt
All .txt files in directories that satisfy the path? mask.
ftp://username:password@server/path/filename.txt
Denotes the path/filename.txt file on a remote server connected via the FTP protocol using username and password.
If the initial working directory differs from the server root directory, use absolute FTP paths (see below).
ftp://username:password@server/%2Fpath/filename.txt
Denotes the /path/filename.txt file on a remote server; the initial slash must be escaped as %2F.
The path is absolute with respect to the server root directory.
ftp://username:password@server/dir/*.txt
Denotes all files satisfying the mask on a remote server connected via the FTP protocol using username and password.
sftp://username:password@server/path/filename.txt
Denotes the filename.txt file on a remote server connected via the SFTP protocol using username and password.
sftp://username:password@server/path?/filename.txt
Denotes all filename.txt files in directories satisfying the mask on a remote server connected via the SFTP protocol using username and password.
http://server/path/filename.txt
Denotes the filename.txt file on a remote server connected via the HTTP protocol.
https://server/path/filename.txt
Denotes the filename.txt file on a remote server connected via the HTTPS protocol.
s3://access_key_id:secret_access_key@s3.amazonaws.com/bucketname/path/filename.txt
Denotes the path/filename.txt object located in the Amazon S3 web storage service in the bucket bucketname.
The connection is established using the specified access key ID and secret access key.
hdfs://CONNECTION_ID/path/filename.txt
Denotes the filename.txt file on Hadoop HDFS.
The "CONNECTION_ID" stands for the ID of a Hadoop connection defined in a graph.
smb://domain%3Buser:password@server/path/filename.txt
smb2://domain%3Buser:password@server/path/filename.txt
Denotes a file located in a Windows share (Microsoft SMB/CIFS protocol).
The URL path may contain wildcards (both * and ? are supported). The server part may be a DNS name, an IP address or a NetBIOS name. The Userinfo part of the URL (domain%3Buser:password) is not mandatory, and any URL reserved character it contains should be escaped using %-encoding, as the semicolon ; is escaped with %3B in the example (the semicolon is escaped because it collides with the default CloverDX file URL separator).
The SMB version 1 protocol is implemented in the JCIFS library, which may be configured using Java system properties. See Setting Client Properties in the JCIFS documentation for a list of all configurable properties.
The SMB version 2 and 3 protocols are implemented in the SMBJ library, which depends on the Bouncy Castle library.
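The escaping rules mentioned in the FTP and SMB examples above (%2F for the initial slash of an absolute FTP path, %3B for the semicolon in the SMB Userinfo part) can be sketched in Python using the standard `urllib.parse.quote` function. The helper names `ftp_absolute_url` and `smb_url` are hypothetical and serve only as an illustration; this is not how CloverDX itself builds URLs.

```python
from urllib.parse import quote


def ftp_absolute_url(server, absolute_path, user, password):
    """Build an FTP URL whose path is absolute w.r.t. the server root."""
    # The initial slash of the absolute path is written as %2F.
    rest = absolute_path.lstrip("/")
    userinfo = quote(user, safe="") + ":" + quote(password, safe="")
    return f"ftp://{userinfo}@{server}/%2F{rest}"


def smb_url(server, path, domain=None, user=None, password=None, scheme="smb"):
    """Build an smb:// or smb2:// URL with an optional Userinfo part."""
    userinfo = ""
    if user is not None:
        account = f"{domain};{user}" if domain else user
        # safe="" makes quote() escape every reserved character,
        # so the ';' between domain and user becomes %3B.
        userinfo = quote(account, safe="")
        if password is not None:
            userinfo += ":" + quote(password, safe="")
        userinfo += "@"
    return f"{scheme}://{userinfo}{server}/{path.lstrip('/')}"


print(ftp_absolute_url("server", "/path/filename.txt", "username", "password"))
print(smb_url("server", "path/filename.txt", "domain", "user", "password"))
```

Encoding the user name and password separately keeps the colon between them as a delimiter while any reserved character inside either value is escaped.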
A sandbox resource, whether it is in a shared, local or partitioned sandbox, is specified in a graph under the fileURL attributes as a so-called sandbox URL like this:
sandbox://data/path/to/file/file.dat
where "data" is the sandbox code and "path/to/file/file.dat" is the path to the resource from the sandbox root. A graph does not have to run on the node which has local access to the resource.
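To make the two parts of a sandbox URL concrete, the following Python sketch separates the sandbox code from the resource path using the standard `urllib.parse.urlsplit` function. The function name `parse_sandbox_url` is hypothetical. The sketch only illustrates the URL layout described above and does not resolve the resource on any node.

```python
from urllib.parse import urlsplit


def parse_sandbox_url(url):
    """Return (sandbox_code, path_from_sandbox_root) for a sandbox URL."""
    parts = urlsplit(url)
    if parts.scheme != "sandbox":
        raise ValueError("not a sandbox URL: " + url)
    # The authority component carries the sandbox code; the remainder
    # is the path relative to the sandbox root.
    return parts.netloc, parts.path.lstrip("/")


code, path = parse_sandbox_url("sandbox://data/path/to/file/file.dat")
print(code, path)
```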