How to set arbitrary S3 configuration options (Hadoop S3A, Presto S3) in Flink?

The Flink docs on S3 only present a few configuration examples, for example, how to configure access credentials. How can I set arbitrary configuration parameters like presto.s3.max-retry-time or fs.s3a.attempts.maximum in Flink?

Answer

Note: This section applies to Flink 1.5 or later.

You can configure both S3 file system implementations via flink-conf.yaml. For configuration parameters to be forwarded to their native implementation, they need to match specific prefixes:

  • Hadoop S3A: s3. | s3a. | fs.s3a.
  • Presto S3: s3. | presto.s3.
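
As a minimal sketch, the two parameters from the question could be set in flink-conf.yaml like this. The values are placeholders chosen for illustration, not recommendations, and whether a key is honored depends on the file system version in use.

```yaml
# flink-conf.yaml (illustrative values only)

# Matches the fs.s3a. prefix, so it is forwarded to the Hadoop S3A file system
fs.s3a.attempts.maximum: 20

# Matches the presto.s3. prefix, so it is forwarded to the Presto S3 file system
presto.s3.max-retry-time: 10m
```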

Examples

The following examples show how the mapping is done. Note the special replacement that allows s3.access-key/s3.access.key and s3.secret-key/s3.secret.key to be used with both implementations despite their different native configuration names.

Flink-s3-fs-hadoop

flink-s3-fs-hadoop native     |   flink-conf.yaml variants
==============================================================
fs.s3a.access.key            <->  s3.access.key
                                  s3a.access.key
                                  fs.s3a.access.key
                                  s3.access-key
fs.s3a.secret.key            <->  s3.secret.key
                                  s3a.secret.key
                                  fs.s3a.secret.key
                                  s3.secret-key
fs.s3a.endpoint              <->  s3.endpoint
                                  s3a.endpoint
                                  fs.s3a.endpoint
fs.s3a.proxy.host            <->  s3.proxy.host
                                  s3a.proxy.host
                                  fs.s3a.proxy.host
fs.s3a.proxy.port            <->  s3.proxy.port
                                  s3a.proxy.port
                                  fs.s3a.proxy.port

For more configuration options, please refer to Hadoop's AWS module documentation (the actual supported values may depend on the flink-s3-fs-hadoop version you use).
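
For illustration, a flink-conf.yaml fragment for flink-s3-fs-hadoop using the generic s3. prefix might look like the sketch below. The host names and port are hypothetical placeholders, and each key could equally be written with the s3a. or fs.s3a. prefix.

```yaml
# flink-conf.yaml (hypothetical endpoint and proxy values)

# Forwarded to flink-s3-fs-hadoop as fs.s3a.endpoint
s3.endpoint: s3.example.com

# Forwarded as fs.s3a.proxy.host
s3.proxy.host: proxy.example.com

# Forwarded as fs.s3a.proxy.port
s3.proxy.port: 8080
```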

Flink-s3-fs-presto

flink-s3-fs-presto native     |   flink-conf.yaml variants
==============================================================
presto.s3.access-key         <->  s3.access-key
                                  presto.s3.access-key
                                  s3.access.key
presto.s3.secret-key         <->  s3.secret-key
                                  presto.s3.secret-key
                                  s3.secret.key
presto.s3.endpoint           <->  s3.endpoint
                                  presto.s3.endpoint
presto.s3.max-error-retries  <->  s3.max-error-retries
                                  presto.s3.max-error-retries
presto.s3.max-retry-time     <->  s3.max-retry-time
                                  presto.s3.max-retry-time

For more configuration options, please refer to Presto's S3 configuration docs, replacing the hive.s3. prefix of each documented property with presto.s3. (the actual supported values may depend on the flink-s3-fs-presto version you use).
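
As a sketch of that prefix replacement, assume the Presto docs for your version list a property named hive.s3.max-connections. In flink-conf.yaml it would then be written with the presto.s3. prefix (or the shorter s3. prefix). The value below is only a placeholder.

```yaml
# flink-conf.yaml (illustrative value only)

# Presto docs (assumed to apply to your version): hive.s3.max-connections
# Flink form: replace the hive.s3. prefix with presto.s3.
presto.s3.max-connections: 500
```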
