Getting started with DataHooks

Flatfile's DataHooks are a data healing element that re-formats, validates and/or corrects data automatically during the import, so the user does not have to correct it manually. Typical uses include reformatting area codes or country codes, removing special characters, validating emails against external data, and really anything else you can code up. DataHooks are the most powerful data healing element to date within Flatfile. Two kinds of hooks are available: field hooks (AKA column hooks) and record hooks (AKA row hooks).

Before beginning, consider the order and event flow of the hooks, since that determines which type of hook to use and how to use it. The diagram below shows the general flow for Flatfile. Note that registerFieldHook() runs first and runs only once, after the 'matching' stage. After the field hooks run, registerRecordHook() runs on all records. By default, this hook then also runs on individual records as they are updated. You might run into a scenario where you don't want the record hooks to run on init, or don't want them to run on change. This is possible, and the section on record hooks below shows how.

Datahooks flow

Field Hooks

FlatfileImporter.registerFieldHook(field: string, values => { // function block })

Field hooks run validation on a particular field (column) of data once the matching stage is complete. They run before record hooks and run only once during the import process, which makes them best suited to bulk-editing a field or validating against an outside data source. For example, to verify that the email addresses in a file are not already in your database, you could collect all the values in the column within a field hook, send them to your server for validation, and return an error message for each address that already exists so that it is displayed to the user.

To use a field hook, call FlatfileImporter.registerFieldHook("field_name", callback => {}). Each field hook callback needs to return an array in which every item is itself a two-item array that corresponds to an individual record: index 0 holds the values and errors, and index 1 holds the original row number of the record. Here's an example of what the output data will look like and what the hook needs to have as a return value:
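The structure described above can be sketched as plain data. This is only an illustration: it assumes the callback receives each cell value paired with its row number, and the email addresses, row numbers and message text are made up.

```javascript
// Hypothetical input passed to the callback: [cellValue, rowNumber] pairs
// (the input shape is an assumption, not taken from the text above).
const values = [
  ["john@doe.com", 0],
  ["steve@something.com", 1],
  ["jane@doe.com", 2],
];

// Return shape: an array of [ { value, info }, rowNumber ] items.
// Index 0 carries the value and any messages, index 1 the original row number.
const fieldHookResult = [
  [
    {
      value: "john@doe.com",
      info: [{ message: "Email already on file", level: "error" }],
    },
    0,
  ],
  [
    {
      value: "steve@something.com",
      info: [{ message: "Email already on file", level: "error" }],
    },
    1,
  ],
];
```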

In the above scenario, let's say we are sending the values to our server and returning an error for any emails that are already in the database. Let's assume that John's and Steve's emails are already in the system.

With all that in mind, let's visualize the above within the context of making a server call. Quick note: while it's not required to use async/await with data hooks, we recommend using it when working with an outside data source/API call. Note: For visual purposes, we included the value key above, but if the actual value of the data isn't changing, you don't need to pass this back.
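A minimal sketch of the server-call scenario follows. The function findExistingEmails is a hypothetical stand-in for a real request to your own server, the "email" field key is illustrative, and the sketch assumes the callback receives [value, rowNumber] pairs and only needs to return entries for rows that get a message.

```javascript
// Hypothetical stand-in for a server call. A real integration would send the
// emails to your own endpoint and read back the ones that already exist.
async function findExistingEmails(emails) {
  const onFile = new Set(["john@doe.com", "steve@something.com"]);
  return emails.filter((email) => onFile.has(email));
}

// Field hook callback. ASSUMPTION: `values` is an array of [value, rowNumber].
async function emailFieldHook(values) {
  const emails = values.map(([email]) => email);
  const existing = new Set(await findExistingEmails(emails));

  // Only rows whose email already exists get an entry with an error message.
  return values
    .filter(([email]) => existing.has(email))
    .map(([email, row]) => [
      {
        value: email,
        info: [{ message: "Email already on file", level: "error" }],
      },
      row,
    ]);
}

// Registration, assuming `importer` is a configured FlatfileImporter instance:
// importer.registerFieldHook("email", emailFieldHook);
```

Keeping the lookup in its own function makes it easy to swap the stand-in for a real request without touching the code that builds the return value.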

Without external data example:

In the below examples, you'll see us add a zero to the beginning of each value. Please notice in this instance that we do not use the async/await syntax.
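As a rough sketch of that case, the callback below prepends a zero to every value synchronously. It assumes the callback receives [value, rowNumber] pairs; the "zipCode" field key and the message text are illustrative.

```javascript
// Synchronous field hook callback, no async/await needed because nothing
// is fetched. ASSUMPTION: `values` is an array of [value, rowNumber] pairs.
function prependZero(values) {
  return values.map(([value, row]) => [
    {
      value: `0${value}`,
      info: [{ message: "Added a leading zero", level: "info" }],
    },
    row,
  ]);
}

// Registration, assuming `importer` is a configured FlatfileImporter instance:
// importer.registerFieldHook("zipCode", prependZero);
```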

Field Hooks additional notes:

  • While we use the value key in each of the above examples, it is not required if you aren't changing the original value, and we recommend not including it in your returned values in that case.
  • We also use the info array with message and level in all of the examples above to provide a custom error message. This is not required. If you choose not to use it, the user still sees a standard "info" level message letting them know that the data was automatically formatted.
  • You can also call multiple field hooks per import. To do this, call the .registerFieldHook() method once for each field that needs a hook.
  • If you have registered the field hook and are not seeing the expected results in the import process, check that the field name in your config matches the field provided in the .registerFieldHook() method, and that the returned data structure is correct.

Record Hooks

FlatfileImporter.registerRecordHook((record, index, mode) => { // function block})

Record hooks run validation on each record (row) of data and return the record with new data and/or error messaging for the user. These hooks run on init (meaning at the beginning of the "review" step) and then also on change (meaning when a record is updated during the "review" step). To use this hook, call the function and pass in a callback that takes the record (at minimum) as a parameter; index and mode are optional additional parameters. The record is a specific row of data, and you can use record.fieldName to work with a specific field on it. You can use these hooks for single-field, multi-field or cross-field validation (examples of each below). Use index to get the record's index within the data, and use mode to tell whether the hook is running on init or on change.
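The callback parameters can be illustrated with a small sketch. It rests on an assumption the text above does not spell out: that the hook returns an object keyed by field name, with each entry using the same value and info structure as the field hooks. The firstName field key is hypothetical.

```javascript
// Record hook callback. ASSUMPTION: the hook returns an object keyed by field
// name, each entry shaped like { value, info } as in the field hook examples.
function trimFirstName(record, index, mode) {
  const original = record.firstName; // `firstName` is a hypothetical field key
  if (typeof original !== "string" || original === original.trim()) {
    return {}; // nothing to change for this record
  }
  return {
    firstName: {
      value: original.trim(),
      info: [
        {
          message: `Trimmed whitespace at index ${index} during ${mode}`,
          level: "info",
        },
      ],
    },
  };
}

// Registration, assuming `importer` is a configured FlatfileImporter instance:
// importer.registerRecordHook(trimFirstName);
```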

These hooks can be used in conjunction with other validators (like regex), or they can replace some of the regex validators and fix formatting errors up front instead of having the user do it. They can be used to reformat, replace and validate data on init and on change of a record during the "review" step.

Here are some examples of using the hooks. For context, here is a configuration we can use for all these hooks with the commented out section being where you would put your hooks.
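The original configuration is not reproduced here, so the following is a stand-in sketch. Every field key and label is hypothetical, and the commented-out constructor call is an assumption about how the importer is set up with your own license key.

```javascript
// Hypothetical import configuration; every key and label is illustrative.
const config = {
  type: "Contact",
  fields: [
    { label: "First Name", key: "firstName" },
    { label: "Last Name", key: "lastName" },
    { label: "Email", key: "email" },
    { label: "City", key: "city" },
    { label: "State", key: "state" },
    { label: "Zip Code", key: "zipCode" },
  ],
};

// Assumed setup; requires the Flatfile library and your own license key:
// const importer = new FlatfileImporter(LICENSE_KEY, config);

// Your hooks would go here, for example:
// importer.registerFieldHook("email", (values) => { /* ... */ });
// importer.registerRecordHook((record, index, mode) => { /* ... */ });
```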

Single field validation example - zip code re-formatting (using the above example)
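One way this could look, as a sketch: pad numeric zip codes shorter than five digits with leading zeros. The zipCode field key, the five-digit rule and the object returned by the hook are all assumptions made for illustration.

```javascript
// Record hook callback for a single field. ASSUMPTION: the hook returns an
// object keyed by field name with { value, info } entries.
function zipCodeHook(record) {
  const out = {};
  const zip = record.zipCode == null ? "" : String(record.zipCode).trim();

  // Pad purely numeric zip codes that are shorter than five digits.
  if (/^\d{1,4}$/.test(zip)) {
    out.zipCode = {
      value: zip.padStart(5, "0"),
      info: [
        { message: "Zip code was padded with leading zeros", level: "info" },
      ],
    };
  }
  return out;
}

// Registration, assuming `importer` is a configured FlatfileImporter instance:
// importer.registerRecordHook(zipCodeHook);
```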

Multi-field validation example
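A sketch of handling several fields in one hook: each field is checked independently and only the fields that changed are returned. The email and state field keys, the normalization rules and the returned object are assumptions made for illustration.

```javascript
// Record hook callback that normalizes two unrelated fields in one pass.
// ASSUMPTION: the hook returns an object keyed by field name.
function normalizeContactHook(record) {
  const out = {};

  if (typeof record.email === "string") {
    const lowered = record.email.toLowerCase();
    if (lowered !== record.email) {
      out.email = {
        value: lowered,
        info: [{ message: "Email was lowercased", level: "info" }],
      };
    }
  }

  if (typeof record.state === "string") {
    const upper = record.state.toUpperCase();
    if (upper !== record.state) {
      out.state = {
        value: upper,
        info: [{ message: "State was uppercased", level: "info" }],
      };
    }
  }

  return out;
}

// Registration, assuming `importer` is a configured FlatfileImporter instance:
// importer.registerRecordHook(normalizeContactHook);
```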

Cross-field validation example - if city and state aren't present, then zip code is required
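A sketch of that rule, reading "aren't present" as both city and state being blank. The field keys and the returned object are assumptions; the value key is left out because the value itself does not change.

```javascript
// Treat null, undefined and whitespace-only strings as blank.
const isBlank = (v) => v == null || String(v).trim() === "";

// Record hook callback that looks at three fields together.
// ASSUMPTION: the hook returns an object keyed by field name.
function zipRequiredHook(record) {
  const out = {};
  const noCityOrState = isBlank(record.city) && isBlank(record.state);

  if (noCityOrState && isBlank(record.zipCode)) {
    out.zipCode = {
      info: [
        {
          message: "Zip code is required when city and state are missing",
          level: "error",
        },
      ],
    };
  }
  return out;
}

// Registration, assuming `importer` is a configured FlatfileImporter instance:
// importer.registerRecordHook(zipRequiredHook);
```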

Filtering events with mode example - check mode to do something on "change" only, and something else on "init" only
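A sketch of branching on mode, using the "init" and "change" values described earlier. The email field key, the two rules and the returned object are assumptions made for illustration.

```javascript
// Record hook callback that behaves differently per mode.
// ASSUMPTION: the hook returns an object keyed by field name.
function modeAwareHook(record, index, mode) {
  const out = {};
  const email = typeof record.email === "string" ? record.email : "";

  // On init only: silently normalize the casing of the imported data.
  if (mode === "init" && email !== email.toLowerCase()) {
    out.email = {
      value: email.toLowerCase(),
      info: [{ message: "Email was lowercased on load", level: "info" }],
    };
  }

  // On change only: validate what the user just typed.
  if (mode === "change" && !email.includes("@")) {
    out.email = {
      info: [{ message: "Email must contain @", level: "error" }],
    };
  }

  return out;
}

// Registration, assuming `importer` is a configured FlatfileImporter instance:
// importer.registerRecordHook(modeAwareHook);
```

Returning an empty object from a branch is how this sketch skips work for the mode it does not care about.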