SFTP and S3 Imports/Exports


Installation

The file storage (SFTP and S3) import/export extension is provided and maintained by Rossum.ai as a webhook. To start using it, follow these steps:

  1. Log in to your Rossum account.

  2. Navigate to Extensions → My extensions.

  3. Click on Create extension.

  4. Fill in the following fields:

    1. Name: SFTP/S3 import/export

    2. Trigger events: Scheduled for imports or Export for exports

    3. Extension type: Webhook

    4. URL (see import and export endpoints below)

    5. In "Advanced settings" select Token owner (should have Admin access)

  5. Click Create the webhook.

Dataset import endpoints

Document import endpoints

Export endpoints

⚠️  S3 Import Limitations

When configuring S3 imports, please be aware of the following technical constraints:

  • 1,000 Object Maximum: The S3 import extension retrieves a maximum of 1,000 objects per configured prefix (folder path).

  • Includes Sub-folders: This limit is aggregate. It includes all files within the target folder and any nested sub-folders.

  • Alphabetical Sorting: Only the first 1,000 objects in alphabetical order are retrieved.

  • Regex Note: Filtering via Regular Expressions occurs after the initial 1,000 objects are fetched. If a file is not among the first 1,000 alphabetical objects, it will not be imported, even if it matches your Regex.
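The interaction between the object limit and the regex filter can be illustrated with a small local simulation. This is only a sketch of the behavior described above, not the extension's actual code: the function name `simulate_s3_import` and the sample keys are hypothetical, and Python's `sorted()` stands in for the alphabetical ordering.

```python
import re

S3_OBJECT_LIMIT = 1000  # the per-prefix limit described above


def simulate_s3_import(keys, prefix, pattern, limit=S3_OBJECT_LIMIT):
    """Return the keys that would be imported (hypothetical helper)."""
    # Every object under the prefix counts, including nested sub-folders.
    under_prefix = sorted(k for k in keys if k.startswith(prefix))
    # Only the first `limit` objects in alphabetical order are fetched.
    fetched = under_prefix[:limit]
    # The regex is applied only to what was fetched.
    regex = re.compile(pattern)
    return [k for k in fetched if regex.search(k)]


# 1,000 archived files sort before "input/new_invoice.pdf" and use up the limit.
keys = [f"input/archive/{i:04d}.pdf" for i in range(1000)]
keys.append("input/new_invoice.pdf")

# The pattern matches only files directly inside input/.
imported = simulate_s3_import(keys, "input/", r"^input/[^/]+\.pdf$")
print(imported)
```

In this scenario the new invoice matches the pattern but is the 1,001st object under the prefix, so it is never fetched and the result is empty.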

🎓  Best Practice

To avoid missing files, ensure your monitored folder stays below the 1,000-object limit. Instead of moving processed files to a sub-folder (e.g., input/archive/), move them to a separate root folder (e.g., archive/) so they are not included in the prefix count.
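To see why the archive location matters, the following sketch counts how many objects fall under a monitored prefix for the two layouts mentioned above. The helper `count_under_prefix` and the key names are hypothetical; in practice the list of keys would come from a listing of your own bucket.

```python
def count_under_prefix(keys, prefix):
    """Count the objects whose key starts with the monitored prefix."""
    return sum(1 for key in keys if key.startswith(prefix))


# Layout A: processed files are archived inside the monitored folder.
nested = ["input/new.pdf"] + [f"input/archive/{i}.pdf" for i in range(1200)]

# Layout B: processed files are archived in a separate root folder.
separate = ["input/new.pdf"] + [f"archive/{i}.pdf" for i in range(1200)]

nested_count = count_under_prefix(nested, "input/")
separate_count = count_under_prefix(separate, "input/")
print(nested_count, separate_count)
```

With the nested layout, the archived files push the prefix count past the limit; with the separate root folder, only the new file counts toward it.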

Basic usage

🚧 WORK IN PROGRESS 🚧

We're still working on this part and would love to hear your thoughts! Feel free to share your feedback or submit a pull request. Thank you! 🙏

Available configuration options

Available configuration options are described in the API documentation:

Logging and observability

Extensions Logs

  • URL https://[org].rossum.app/settings/extensions/logs

  • The import job is not triggered directly but through a scheduler. A successful record (type INFO) in the Extensions Logs therefore does not necessarily mean that the downstream import job succeeded, but it is a good starting point for investigation.

Master Data Hub

  • URL: https://[org].rossum.app/svc/master-data-hub/web/management

  • The MDH has an "Upload Status" screen that shows the status of every upload, regardless of its origin.

  • If an upload fails, the screen also shows a note with more detailed information about the error.

SFTP import notifications

SFTP dataset imports can send you emails in case your import was unsuccessful. To set this up, follow these steps:

  1. Open your SFTP import configuration and extend it with a notifications section, in a format like the following:

    {
      "credentials": { ... },  # remains unchanged
      "import_rules": [ ... ],  # remains unchanged
      "notifications": [  # new configuration section
        {
          "notify_on": "failure",  # required; so far the only supported value is "failure"
          "to": [{"email": "john@doe.com"}],  # required
          "cc": [{"email": "john@doe.com", "name": "John Doe"}],  # optional
          "bcc": [{"email": "john@doe.com"}],  # optional
          "queue_id": 1716063  # required
        }
      ]
    }

  2. Enter the recipient addresses that you would like to notify. You can list up to 10 addresses (as per the Rossum API). At least one address is required in the to field; cc and bcc are optional.

  3. Specify the ID of a queue associated with the extension. The extension uses that queue’s inbox as the outgoing “From” address for the emails.
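Before saving the configuration, the rules from the steps above can be checked locally. The following is a minimal sketch, not part of the extension: the function `validate_notifications` is hypothetical, and it assumes the limit of 10 addresses applies to to, cc and bcc combined, which the text above does not state explicitly.

```python
def validate_notifications(config, max_recipients=10):
    """Return a list of problems found in the "notifications" section."""
    problems = []
    for i, rule in enumerate(config.get("notifications", [])):
        where = f"notifications[{i}]"
        # "failure" is so far the only supported value.
        if rule.get("notify_on") != "failure":
            problems.append(f"{where}: notify_on must be 'failure'")
        # At least one address is required in the "to" field.
        if not rule.get("to"):
            problems.append(f"{where}: at least one 'to' address is required")
        # The queue ID is required; its inbox is used as the sender.
        if not isinstance(rule.get("queue_id"), int):
            problems.append(f"{where}: queue_id must be an integer")
        # Assumption: the 10-address limit covers to, cc and bcc together.
        recipients = [
            r for field in ("to", "cc", "bcc") for r in rule.get(field, [])
        ]
        if len(recipients) > max_recipients:
            problems.append(f"{where}: more than {max_recipients} addresses")
        if any("email" not in r for r in recipients):
            problems.append(f"{where}: every recipient needs an 'email' key")
    return problems


config = {
    "notifications": [
        {
            "notify_on": "failure",
            "to": [{"email": "john@doe.com"}],
            "cc": [{"email": "john@doe.com", "name": "John Doe"}],
            "queue_id": 1716063,
        }
    ]
}
print(validate_notifications(config))
```

An empty list means that none of the checks found a problem; it does not guarantee that the extension itself will accept the configuration.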
