Skip to main content

Standard Data ETLs

Overview

An ETL (Extract → Transform → Load) is a scheduled or triggered flat-file exchange flow that feeds the Actito Standard Data Model.

PhaseResponsibility
ExtractYour system produces one or more CSV files and uploads them to a shared file server.
TransformOptional transformations are applied to the files before ingestion (format conversion, value mapping, etc.).
LoadActito validates and imports the file data into your licence data model.

ETLs can feed Profile and Custom tables.

An ETL is:

  • Standardized — mapping and validation follow a defined schema.
  • Automated — no manual trigger required.
  • Scheduled or triggered — runs at a configured frequency/time, or when a matching file is detected.
  • Monitored — execution results are available via email reports, webhooks, and API.

1. Extract and transfer

For each target table, provide one CSV file where each row maps to a table record and each column maps to a table attribute.

Extract delta data only

Always extract only changed records. Full-table extracts have a significant impact on import performance and should be used only for initial setup. Files are limited to 100 MB.

File server setup

File transfer requires a file transfer server configuration with at least one location (a folder on the server). The ETL definition references that location and looks for the specified file on every execution.

See File Transfers for setup instructions.

File naming

Each file to retrieve is identified by a name pattern. Patterns support date placeholders so the ETL can resolve the correct file at runtime.

Example: the pattern myfile_$yyyyMMdd.csv resolves to myfile_20220831.csv when executed on 2022-08-31. Full placeholder reference: Mass Imports API.

Multi-file ETLs — file retrieval behavior

In a multi-file ETL (one file per table), all required files must be present when the ETL runs. A single missing required file fails the entire execution.

Files marked as optional can be absent without causing a failure — their absence is treated as a normal case.

A single physical file can be loaded into multiple tables within the same ETL, each with independent settings (destination table, attribute mapping, transformations, etc.).

2. Transform

Transformations are applied to file data before loading. They are optional and defined per column in the ETL configuration. Only one transformation can be defined per column.

TransformationDescription
Date/datetime formattingReformats from a declared pattern to the Actito standard.
Value mappingReplaces matched values with configured substitutes.
Country formattingConverts to ISO 3166-1 alpha-2 (2-char code).
Language formattingConverts to ISO 639-1 (2-char code).
Phone number formattingReformats to international format using a provided country code.
String normalizationApplies text transforms (lower case, upper case, etc.).
String croppingKeeps or removes a specified number of left or right characters.
String replacementReplaces all occurrences of a substring with a specified value.
Chars removalRemoves all occurrences of specified characters.
Transformation failure behavior

If a transformation cannot be applied (e.g. the date value does not match the declared pattern), the ETL does not fail — the original value is preserved. Depending on the destination attribute, this may result in a validation reject at load time.

See data formats for Actito's expected input formats.

3. Load

Loading runs in two sequential steps: validation, then write.

Validation

  1. Checks that the file is a valid CSV (per RFC 4180).
  2. Checks data format integrity for each mapped column against its destination attribute (e.g. yyyy-MM-dd for a date attribute, . decimal separator for a numeric attribute).

Write

The data operation applied to each row is configured per ETL:

  • CREATE_ONLY — insert new records only.
  • UPDATE_ONLY — update existing records only.
  • CREATE_OR_UPDATE — upsert.
  • DELETE — delete matching records.

Only valid rows are written. Invalid rows are rejected and written to an error file (see Rejects).

Execution counters track lines read, rejected, and written for each file. Lines with no actual data change count as read but not as updated.

Multi-file ETLs — validation and loading order

Files in a multi-file ETL are validated and loaded independently and sequentially. If your data model has integrity constraints (e.g. a Tickets table referencing a Customers profile table), declare the loadings in dependency order — parent tables first.

If a parent file is invalid, the child file is still processed. Rows referencing already-existing parent records are loaded; rows referencing new parent records present only in the failed file are rejected.

4. Automate and schedule

An activated ETL runs either on a schedule (up to once per day, at a configured time) or as triggered (runs automatically when a file matching the name pattern is detected).

ETLs can be deactivated at any time, suspending all executions until reactivated.

tip

See the ETL use case for a full walkthrough of scheduled vs. triggered ETLs.

5. Execution results

Each run produces an execution record containing:

  • Execution timestamp and global status.
  • Per-file details: retrieve, validation, and load results.
  • Row counters: read / rejected / written.

Failure modes:

StageCause
RetrieveFile server unreachable, required file not found, or file not present in the archive.
ValidationInvalid CSV structure, missing headers, or rows with invalid cell values.
WriteDuplicate key (CREATE_ONLY), record not found (UPDATE_ONLY / DELETE), table row limit reached.

Execution result reports can be sent by email to a configured list of recipients.

ETL Report Success

6. Rejects

When validation or load detects invalid rows, Actito produces a reject file containing those rows with two additional columns: the invalid column name and the reject reason.

Reason codeDescription
INVALID_FIELD_VALUEValue does not match the expected format.
MISSING_FIELD_VALUEMandatory value is absent.
DUPLICATE_OBJECTValue violates a unique constraint in the destination table.
UNKNOWN_DATARecord not found in UPDATE_ONLY mode.
NOT_FOUNDRecord not found in DELETE mode.
DATA_ALREADY_EXISTSRecord already exists in CREATE_ONLY mode.
HARD_LIMIT_EXCEEDEDDestination table has reached its maximum record count.

The reject file output location is configured in the ETL definition. Retrieve, process, and re-integrate corrected rows in a subsequent file (e.g. the next day's extract).

API reference

All ETL operations are available via the Mass Imports API.