# univocity-examples
**Repository Path**: mirrors_uniVocity/univocity-examples
## Basic Information
- **Project Name**: univocity-examples
- **Description**: Project to demonstrate the capabilities of uniVocity and to quickly guide you through the main functionalities that can be used to make the development of data integration processes easier.
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 0
- **Created**: 2020-08-18
- **Last Updated**: 2026-05-23
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README

Welcome to uniVocity
====================
**[uniVocity](http://www.univocity.com)** is a data integration framework for Java that provides a fast and flexible foundation for the
implementation of complex data mappings and transformations. uniVocity is free for non-commercial use and gives you much more power and flexibility
than a conventional ETL framework.
This tutorial covers the essential building blocks you can use to develop powerful data integration solutions.
### Table of contents ###
* [Installation](#installation)
* [Maven settings](#maven-settings)
* [Obtaining a license](#obtaining-a-license)
* [Introduction](#introduction)
* [Background](#background)
* [The data input example](#the-data-input-example)
* [Configuring the data stores](#configuring-the-data-stores)
* [uniVocity engine initialization](#univocity-engine-initialization)
* [Essential building blocks](#essential-building-blocks)
* [Reading data from an entity](#reading-data-from-an-entity)
* [Copying data from CSV to Fixed-width entities](#copying-data-from-csv-to-fixed-width-entities)
* [Using row readers to manage rows](#using-row-readers-to-manage-rows)
* [Functions](#functions)
* [Reference mappings](#reference-mappings)
* [Intercepting engine lifecycle activities](#intercepting-engine-lifecycle-activities)
* [Map functions](#map-functions)
* [Objects with functions](#objects-with-functions)
* [Auto-detection and other useful capabilities](#auto-detection-and-other-useful-capabilities)
* [Auto-detection of mappings](#auto-detection-of-mappings)
* [Applying readers and functions to multiple mappings at once](#applying-readers-and-functions-to-multiple-mappings-at-once)
* [Dumping data](#dumping-data)
* [Generating database schemas](#generating-database-schemas)
* [Queries, more functions, and variables](#queries-more-functions-and-variables)
* [Using parameters and variables](#using-parameters-and-variables)
* [Mapping between incompatible schemas](#mapping-between-incompatible-schemas)
* [Input management](#input-management)
* [Input sharing](#input-sharing)
* [Data increments](#data-increments)
* [Update prevention](#update-prevention)
* [Dataset producers](#dataset-producers)
* [Advanced](#advanced)
* [Persistent scope](#persistent-scope)
* [Custom entities](#custom-entities)
* [Advanced settings for JDBC entities](#advanced-settings-for-jdbc-entities)
## Installation ##
To install uniVocity, you need two artifacts:
1. the [public API](http://github.com/uniVocity/univocity-api), which provides the essential interfaces and configuration options to configure data inputs,
outputs, and their mappings. You must write your code against the interfaces provided by this API.
2. our data integration engine implementation, which can be downloaded from our [website](www.univocity.com) or from our [maven](http://maven.apache.org) repository (http://artifacts.univocity.com).
We split the API so your code can be totally isolated from our implementation code. Any new version of uniVocity will support the published API's
so you can update uniVocity transparently without worrying about compilation errors and code rewrites.
### Maven settings ###
If you use [Maven](http://maven.apache.org), you'll need to add an entry for our repository to your `pom.xml` in order to obtain the `univocity-1.0.7` jar.
```xml
univocity-releaseshttp://artifacts.univocity.com/release
```
These are the dependencies you need to include in your `pom.xml`:
```
...
com.univocityunivocity1.0.7jarcom.univocityunivocity-api1.0.6jar
...
```
**Note**: univocity-api is in the [Maven Central Repository](http://search.maven.org)
To get get access to uniVocity *snapshot* releases, add an additional `repository` entry to the `repositories` section of your `pom.xml`:
```xml
...
univocity-releaseshttp://artifacts.univocity.com/snapshot
```
### Obtaining a license ###
uniVocity is free for non-commercial use and can be used without a license. In this case, batch operations are disabled.
To unleash the true power of uniVocity, and experience maximum performance, we suggest you to obtain a license file.
You can get a free 30-day trial license immediately by simply creating a license request for your computer and sending it to us. To create a license request, you can execute one of
the following classes from the `univocity-1.0.7.jar`, as regular java applications:
1. The graphical license request wizard: `com.univocity.LicenseRequestWizard` (if you have a graphical interface).
2. The command-line license request script: `com.univocity.LicenseRequest` (if you want to execute from the command line)
Once the license request process starts, provide your details to generate a license request file.
Send your license request to licenses@univocity.com and you will receive your license file shortly after.
Once you receive your `license.zip` file, all you need to do is to place it in your classpath so `uniVocity` can validate it and start up. You can also place the license file anywhere in your computer and invoke `Univocity.setLicensePath("/path/to/your/license.zip");` before starting the data integration engine.
*You can find more information about licenses [here](http://www.univocity.com/pages/license-request)*
## Introduction ##
### Background ###
uniVocity is essentially a data mapping framework built around the concept of data stores and their entities:
* A *data entity* is an abstraction for any software component that provides data in tabular format, such as: database tables, CSV files, arrays of objects, etc.
It must provide:
* A sequence of fields names that define the information stored for each individual record.
* A means to retrieve records, update and delete them, or write new ones. The entity can support all these operations or
be just read-only or write-only.
* A *data store* is an abstraction for any software component that provides and manages access to one or more data entities: databases, CSV file directories,
sets of files, custom Java objects, etc. It can provide these additional features:
* queries: treated by uniVocity as a read-only data entity that is represented by a string. The string defines some data retrieval operation, and
its behavior is determined by the data store implementation. Queries can accept parameters if required.
* transactional behavior: the data store can rollback/commit any changes made to its data entities in case of errors.
With these components, uniVocity lets you create complex data mappings that define how information should flow from one data entity to another.
That's all you need to know for now, so let's get started.
### The data input example ###
In our example, we are interested in synchronizing data of foods and groups of foods with another data store.
As the source data store, we will be using a stripped-down version of the nutrient database provided by the
[U.S. Department of Agriculture](http://www.ars.usda.gov/ba/bhnrc/ndl).
We provide this data in a couple of files that you can find [here](http://github.com/uniVocity/univocity-examples/tree/master/src/test/resources/examples/source_data/csv).
Some of the original data was modified to make the examples easier to read.
The tables we are interested in are: [FD_GROUP.csv](http://github.com/uniVocity/univocity-examples/tree/master/src/test/resources/examples/source_data/csv/FD_GROUP.csv) for general
groups of food and [FOOD_DES.csv](http://github.com/uniVocity/univocity-examples/tree/master/src/test/resources/examples/source_data/csv/FOOD_DES.csv) for individual food descriptions.
These are stored in 2 CSV files with the following content:
``` FD_GROUP.csv
"FdGrp_CD","FdGrp_Desc"
"0100","Dairy and Egg Products"
"1200","Spices and Herbs"
"0300","Baby Foods"
"1400","Fats and Oils"
"1500","Poultry Products"
```
And
``` FOOD_DES.csv
"NDB_No","FdGrp_Cd","Long_Desc","Shrt_Desc","ComName","ManufacName","Survey","Ref_Desc","Refuse","SciName","N_Factor","Pro_Factor","Fat_Factor","CHO_Factor"
"01001","0100","Butter, salted","BUTTER,WITH SALT",,,"Y",,0,,2.38,4.27,1.89,8.87
"01002","1400","Butter, whipped, with salt","BUTTER,WHIPPED,WITH SALT",,,"Y",,0,,8.38,3.27,2.79,1.87
"01004","0100","Cheese, blue","CHEESE,BLUE",,,"Y",,0,,4.38,9.27,4.59,3.87
"01005","0100","Cheese, brick","CHEESE,BRICK",,,"Y",,0,,4.38,5.27,6.79,8.87
"01007","0100","Cheese, camembert","CHEESE,CAMEMBERT",,,"Y",,0,,5.38,6.27,2.79,3.87
"01008","0100","Cheese, caraway","CHEESE,CARAWAY",,,,,0,,6.36,6.27,8.79,3.89
"01009","0100","Cheese, cheddar","CHEESE,CHEDDAR",,,"Y",,0,,6.33,4.22,8.13,3.12
"01010","0100","Cheese, cheshire","CHEESE,CHESHIRE",,,,,0,,4.78,4.67,0.79,3.13
"01110","0300","Milk shakes, thick chocolate","MILK SHAKES,THICK CHOC",,,,,0,,8.38,0.27,3.79,1.87
"01111","0300","Milk shakes, thick vanilla","MILK SHAKES,THICK VANILLA",,,,,0,,3.98,8.97,3.69,3.87
```
### Configuring the data stores ###
To use these files as data entities in uniVocity, we need to configure a data store. uniVocity comes with its own
CSV data store implementation, so all you need to do is configure it:
```java
//This creates a new data store for CSV files. uniVocity provides a few basic data stores out of the box,
//but you can always create your own (see com.univocity.api.entity.custom.CustomDataStoreFactory)
CsvDataStoreConfiguration csvDataStore = new CsvDataStoreConfiguration("csvDataStore");
//This adds all files under the specified path to the data store. Each file will be
//considered an individual data entity.
csvDataStore.addEntities("examples/source_data/csv");
```
To make the following examples simple and easy to read, we will map the contents of these CSV files to a destination data store
that contains fixed-width entities. We don't have these entities physically on disk so all data will be written to a `String`.
This will require a little bit of extra configuration to provide the essential information required by uniVocity. We will need to define the
entities, their names, their fields and which fields should be used as identifiers.
```java
//This creates a new data store for fixed-width entities. Let's call it "fixedWidthDestination"
FixedWidthDataStoreConfiguration fixedWidthDataStore = new FixedWidthDataStoreConfiguration("fixedWidthDestination");
//Here we create a write-only fixed-with entity named "food_group". All data written to this entity
//will end up in a String. You can also use files, resources in the classpath or custom writers/readers as entities.
foodGroupOutput = new StringWriterProvider();
fixedWidthDataStore.addEntity("food_group", foodGroupOutput);
//We will also need a "food" entity.
foodOutput = new StringWriterProvider();
fixedWidthDataStore.addEntity("food", foodOutput);
//Let's define the default settings that should be used for all entities in this data store
FixedWidthEntityConfiguration defaults = fixedWidthDataStore.getDefaultEntityConfiguration();
//By default, we want to write headers to the output (i.e. the first row that displays the name of each column).
defaults.setHeaderWritingEnabled(true);
//We also want underscore (instead of whitespace) to highlight unwritten spaces in the fixed-width fields.
defaults.getFormat().setPadding('_');
//Use a question mark to denote null values in the output (instead of simply leaving it blank in the output)
defaults.setNullValue("?");
//Let's configure the entities: uniVocity needs to know what the records of each entity look like.
//As these are not files with headers, nor database tables, we need to provide this information manually.
//To configure an entity, simply get its configuration from the data store:
FixedWidthEntityConfiguration foodGroupConfig = fixedWidthDataStore.getEntityConfiguration("food_group");
//Set the names of each field in the entity food_group. These will become the headers in the output.
foodGroupConfig.setHeaders("id", "name");
//A fixed-width entity depends on the length of each field. This configures "id" to use 6 spaces, and "name" to use 35
foodGroupConfig.setFieldLengths(6, 35);
//Marks the "id" field as the identifier of records in food_group.
foodGroupConfig.setIdentifiers("id");
//Now for the "food" entity:
FixedWidthEntityConfiguration foodConfig = fixedWidthDataStore.getEntityConfiguration("food");
foodConfig.setHeaders("id", "group", "description", "scientific_name");
foodConfig.setFieldLengths(8, 6, 40, 20);
foodConfig.setIdentifiers("id");
```
### uniVocity engine initialization ###
Having the data stores configured, we can finally configure and initialize a data integration engine:
```java
//Creates a new engine configuration to map data between entities in CSV and fixed-width data stores
EngineConfiguration engineConfig = new EngineConfiguration(engineName, csvDataStore, fixedWidthDataStore);
//Registers this engine configuration.
Univocity.registerEngine(engineConfig);
```
In order to perform complex mappings and enable different data management strategies, uniVocity relies on a couple of metadata tables.
If you don't provide an explicit configuration for these tables in an [EngineConfiguration](http://github.com/uniVocity/univocity-api/tree/master/src/main/java/com/univocity/api/config/EngineConfiguration.java) object, uniVocity will automatically start
an in-memory database to store this information. All metadata information in this in-memory database will be lost once the engine is stopped.
While the in-memory metadata is very convenient for testing and initial development purposes, it is probably not enough for most real-world applications.
Make sure your [EngineConfiguration](http://github.com/uniVocity/univocity-api/tree/master/src/main/java/com/univocity/api/config/EngineConfiguration.java) provides access to these tables in a database of your preference, unless you just need basic persistence functionalities.
## Essential building blocks ##
The following sections introduce the basic building blocks that allow you to use uniVocity to create powerful data mappings.
### Reading data from an entity ###
Any data entity accessible through a data store can be read using the exact same interface: [Entity](http://github.com/uniVocity/univocity-api/tree/master/src/main/java/com/univocity/api/entity/Entity.java). It doesn't matter if the entity is a file, a database table or any other [CustomDataEntity](http://github.com/uniVocity/univocity-api/tree/master/src/main/java/com/univocity/api/entity/custom/CustomDataEntity.java) you define. Once the `beginReading` method is invoked, a concurrent process will start to read data from your entity into an in-memory buffer. All you need to do is to consume the rows, as demonstrated in the following example:
```java
//Obtains the configured engine instance
DataIntegrationEngine engine = Univocity.getEngine(engineName);
//obtains the entity from our configured CSV data store
Entity entity = engine.getEntity("csvDataStore.FD_GROUP");
try {
//This starts a reading process in parallel. This keeps a limited number
//or rows loaded in memory at any given time.
//Let's select only one column of the entity FD_GROUP.
entity.beginReading("FdGrp_Desc");
//let's read rows iterator-style and print them.
Object[] row = null;
while ((row = entity.readNext()) != null) {
println(row[0]);
}
} finally {
//We need to stop the reading process manually here (unless all data available has been read)
entity.stopReading();
}
```
Here only the `FdGrp_Desc` column is selected from the underlying CSV. The output is printed out as follows:
```
Dairy and Egg Products
Spices and Herbs
Baby Foods
Fats and Oils
Poultry Products
```
Reading data from `Object` arrays can be annoying, but the [Entity](http://github.com/uniVocity/univocity-api/tree/master/src/main/java/com/univocity/api/entity/Entity.java) offers a few useful methods to make the reading process less cumbersome:
```java
//Here we start a reading process as before, but now we handle data differently:
//Field names are case insensitive
entity.beginReading("FdGrp_Desc", "FDGRP_CD");
//We don't need to retain the output array, just check whether it's null.
while (entity.readNext() != null) {
//let's read a field of the current row. The expected type is String.
String description = entity.valueOf("FdGrp_Desc", String.class); //
//field names are case insensitive
String code = entity.valueOf("fdgrp_cd", String.class);
//Let's print the results again
println(code + " - " + description);
}
```
Now, we are reading the description and code of all food groups, which produces the following output:
```
0100 - Dairy and Egg Products
1200 - Spices and Herbs
0300 - Baby Foods
1400 - Fats and Oils
1500 - Poultry Products
```
Reading from the data entities directly is simple and extremely fast. However, we are usually interested in mapping the data returned by the entities into some destination, which usually is very different from the source. Writing code to map from each row to a wildly different destination becomes increasingly error-prone and complicated. The [uniVocity data integration framework](http://www.univocity.com/pages/about-univocity) was carefully designed to mitigate this sort of issue and let you map data easily while being fast, flexible and consistent.
### Copying data from CSV to Fixed-width entities ###
With a data integration engine registered into uniVocity, we can obtain an instance of [DataIntegrationEngine](http://github.com/uniVocity/univocity-api/tree/master/src/main/java/com/univocity/api/engine/DataIntegrationEngine.java) and define data mappings:
```java
//Obtains the configured engine instance
DataIntegrationEngine engine = Univocity.getEngine(engineName);
//Creates a mapping between the csv and fixed-width data stores.
DataStoreMapping mapping = engine.map("csvDataStore", "fixedWidthDestination");
//With a data store mapping, we can define the mapping between their entities.
//We will map csvDataStore.FD_GROUP to fixedWidthDestination.food_group
EntityMapping foodGroupMapping = mapping.map("FD_GROUP", "food_group");
//Here we associate FD_GROUP.FdGrp_CD to food_group.id. Values will be copied directly between source and destination.
//The identity mapping defines that an association should be created for each record mapped between source and destination.
//A new entry will be added to uniVocity's metadata, containing the source and destination values mapped here.
//This linkage enables complex mapping operations that will be demonstrated later.
foodGroupMapping.identity().associate("FdGrp_CD").to("id");
//Copies values from FD_GROUP.FdGrp_Desc to food_group.name
foodGroupMapping.value().copy("FdGrp_Desc").to("name");
//Configures the mapping to delete all rows in the destination to then insert all rows mapped from the source
foodGroupMapping.persistence().usingMetadata().deleteAll().insertNewRows();
//Executes a data mapping cycle with the entity mappings defined so far.
engine.executeCycle();
```
The first step is to map a source data store to a destination data store, and then map their data entities.
When establishing a mapping between entities, it is mandatory to define an identifier mapping. This mapping must ensure that both source and destination values
form a unique representation of each record (similarly to primary keys in a database). The identifier does not need to be a primary key of a database, and uniVocity
won't perform any validation against its contents. If the source data can contain duplicates, you can manage that and discard/transform any anomaly with the features
described in the following sections. The identifier values of both source and destination will have their `String` representation used in uniVocity metadata tables.
Therefore, you must ensure identifiers can be converted from/to `String` consistently.
uniVocity enables the use of transformations that can be used to achieve this if required.
Mappings can be auto-detected by default if field names match in both source in destination (or if you provide a [NameMatcher](http://github.com/uniVocity/univocity-api/tree/master/src/main/java/com/univocity/api/engine/NameMatcher.java)).
As this is not the case in the example, we must create the mappings between values manually. `value().copy("FdGrp_Desc").to("name")` does what it says: it copies
the value in "FdGrp_Desc" to "name".
Then, we can configure the appropriate persistence settings. This configuration determines what should happen to the destination data when a mapping cycle is executed.
In this example, we will store the identity values in uniVocity's metadata, and all values in the source entity will be deleted before inserting.
With the mappings properly defined, we can execute a data mapping cycle and see what data ended up being written in the `food_group` fixed-width entity :
```
id____name_______________________________
0100__Dairy and Egg Products_____________
1200__Spices and Herbs___________________
0300__Baby Foods_________________________
1400__Fats and Oils______________________
1500__Poultry Products___________________
```
In general, the steps taken so far are the first thing you will do when creating your own data mappings. But we expect your input data to
be much, much more intricate than that. uniVocity provides many tools for you to easily handle such intricacies. Let's explore some of them.
### Using row readers to manage rows ###
One of the most powerful tools in uniVocity is the [RowReader](http://github.com/uniVocity/univocity-api/tree/master/src/main/java/com/univocity/api/engine/RowReader.java). With it you can intercept rows as soon as they are loaded from the source,
then modify, log, perform calculations, etc, and then send them over to uniVocity to continue with the mapping process. Once uniVocity has a mapped row, and is ready
to send it to the destination entity, you can use a [RowReader](http://github.com/uniVocity/univocity-api/tree/master/src/main/java/com/univocity/api/engine/RowReader.java) to intercept and manipulate the row before it is written.
Finally, after the row is successfully persisted, uniVocity modifies the
row if needed (typically, when generated values are produced by the destination entity upon insertion of new records), and exposes it to RowReaders again.
In the following example we use the same mapping as before, but now we attach a RowReader to the input for discarding unwanted rows.
We also attach a RowReader to the output for converting all food group names to uppercase before writing to the output:
```java
//Obtains the engine instance already configured
DataIntegrationEngine engine = Univocity.getEngine(engineName);
//Gets the existing data store mapping
DataStoreMapping mapping = engine.getMapping("csvDataStore", "fixedWidthDestination");
//Gets the existing entity mapping
EntityMapping foodGroupMapping = mapping.getMapping("FD_GROUP", "food_group");
//We can easily manipulate the input and output rows before, during and after they are mapped. To do so we use a RowReader.
//Here we add a RowReader to discard the first 3 rows in the input.
foodGroupMapping.addInputRowReader(new RowReader() {
@Override
public void processRow(Object[] inputRow, Object[] outputRow, RowMappingContext context) {
if (context.getCurrentRow() <= 3) {
context.discardRow();
}
}
});
//This rowReader converts descriptions to uppercase. This is an output rowReader, so we have access to
//all data already mapped from the input, and ready to be written to the output.
foodGroupMapping.addOutputRowReader(new RowReader() {
private int name;
@Override
public void initialize(RowMappingContext context) {
//for performance reasons, we use the initialize method to get the index of any fields we are interested in
//before changing the output Rows. Here we get index of the output field "name"
name = context.getOutputIndex("name");
}
@Override
public void processRow(Object[] inputRow, Object[] outputRow, RowMappingContext context) {
//converts the food group name in the destination row to uppercase
outputRow[name] = outputRow[name].toString().toUpperCase();
}
});
//Executes a data mapping cycle again
engine.executeCycle();
```
In this example we simply get the existing mapping, add our [RowReader](http://github.com/uniVocity/univocity-api/tree/master/src/main/java/com/univocity/api/engine/RowReader.java)s and execute the mapping cycle again. The action happens in [RowReader](http://github.com/uniVocity/univocity-api/tree/master/src/main/java/com/univocity/api/engine/RowReader.java)'s `processRow` method:
this method exposes the original input row and the mapped row (if any) that is going to be inserted into a destination entity. The [RowMappingContext](http://github.com/uniVocity/univocity-api/tree/master/src/main/java/com/univocity/api/engine/RowMappingContext.java) is provided by
uniVocity so you can have more control over and information about the mapping in execution. The names of each mapped field, both in the input or output,
can be obtained from this contextual object, as well as their positions in the input/output arrays. If you are manipulating a huge input, we recommend that you
acquire the positions of fields you are interested in manipulating using the `initialize()` method.
After executing a data mapping cycle with these [RowReader](http://github.com/uniVocity/univocity-api/tree/master/src/main/java/com/univocity/api/engine/RowReader.java) implementations, the output will contain 2 rows (the first 3 rows in the input were skipped):
```
id____name_______________________________
1400__FATS AND OILS______________________
1500__POULTRY PRODUCTS___________________
```
Note that you can have multiple [RowReader](http://github.com/uniVocity/univocity-api/tree/master/src/main/java/com/univocity/api/engine/RowReader.java)s defined for each step of the process (input, output, persisted). In this case they will be executed sequentially and
changes made by the first [RowReader](http://github.com/uniVocity/univocity-api/tree/master/src/main/java/com/univocity/api/engine/RowReader.java) will reflect on the next. A row discarded by one [RowReader](http://github.com/uniVocity/univocity-api/tree/master/src/main/java/com/univocity/api/engine/RowReader.java) won't be passed on to the next.
While powerful, [RowReader](http://github.com/uniVocity/univocity-api/tree/master/src/main/java/com/univocity/api/engine/RowReader.java)s are more suitable for mappings that require very specific operations. In short, they are not particularly easy to reuse,
especially when you just want to perform some common data cleanup or transformations on fields processed by different mappings.
### Functions ###
uniVocity engines allow you to provide functions that can be used during the execution of a data mapping cycle.
Functions are especially useful to transform input values, and to provide input to destination fields in data mappings.
As functions can perform operations that are potentially expensive (such as invoking a web service somewhere to get an account number in another system),
they are associated with a scope. The scope in uniVocity is bound to each individual data integration engine instance.
The following scopes, defined in [EngineScope](http://github.com/uniVocity/univocity-api/tree/master/src/main/java/com/univocity/api/engine/EngineScope.java), are supported:
1. PERSISTENT: outlives the data integration engine (depends on user configuration)
2. APPLICATION: values are retained and reused while the engine is active
3. CYCLE: values are retained and reused while a mapping cycle is active
4. MAPPING: values are retained and reused while an individual entity mapping is active
5. STATELESS: no values are retained and reused.
When a function is invoked, its signature, arguments and result are stored into the scope with which it has been associated. A subsequent function invocation within the same scope,
with the same arguments, will not trigger the function. Instead, the result obtained previously will be returned. This will happen until the scope is left.
Getting back to our example, let's see how they can be put to work:
```java
//Adds a function to the engine that converts Strings to Integers.
//The APPLICATION scope makes the engine retain and reuse the result of function calls
//using a given parameter, until the engine is shut down.
engine.addFunction(EngineScope.APPLICATION, "toInteger", new FunctionCall() {
@Override
public Integer execute(String input) {
return Integer.parseInt(input);
}
});
//Adds a function to the engine that converts Strings to lower case.
//With the CYCLE scope, values are held while a cycle is active. Once it completes all
//values in this scope will be discarded.
engine.addFunction(EngineScope.CYCLE, "toLowerCase", new FunctionCall() {
@Override
public String execute(String input) {
return input.toLowerCase();
}
});
//Adds a function to the engine that trims Strings.
//With the STATELESS scope, results of function calls are never reused.
engine.addFunction(EngineScope.STATELESS, "trim", new FunctionCall() {
@Override
public String execute(String input) {
return input.trim();
}
});
//Gets the existing data store mapping
DataStoreMapping mapping = engine.getMapping("csvDataStore", "fixedWidthDestination");
//Removes the existing entity mapping
mapping.removeMapping("FD_GROUP", "food_group");
//We will map csvDataStore.FD_GROUP to fixedWidthDestination.food_group again
EntityMapping foodGroupMapping = mapping.map("FD_GROUP", "food_group");
//Here we associate FD_GROUP.FdGrp_CD to food_group.id, but now FdGrp_CD will be converted to an integer value
//before being associated with the destination id
foodGroupMapping.identity().associate("FdGrp_CD").to("id").readWith("toInteger");
//Copies values from FD_GROUP.FdGrp_Desc to food_group.name. All values read from FdGrp_Desc will be
//trimmed then converted to lower case, as specified in the sequence of functions
foodGroupMapping.value().copy("FdGrp_Desc").to("name").readingWith("trim", "toLowerCase");
//Configures the mapping to keep all rows in the destination and only insert new rows in the source
//Records mapped in a previous cycle won't be discarded nor updated.
foodGroupMapping.persistence().usingMetadata().deleteDisabled().updateDisabled().insertNewRows();
//Executes a data mapping cycle again
engine.executeCycle();
```
As you can see, functions are added to the engine, and referred to in mappings by their name, without arguments.
This mapping uses 2 functions: `copy("FdGrp_Desc").to("name").readingWith("trim", "toLowerCase")`.Here the values in "FdGrp_Desc"
will be transformed by the "trim" function, and the result will then be passed on to "toLowercase"
Also notice we changed the mapping's persistence settings with ` persistence().usingMetadata().deleteDisabled().updateDisabled().insertNewRows()`.
This means we won't be updating existing rows. (Remember what was done in the previous example? We had 2 rows with uppercase letters.) Although this is a new mapping
and the previous one was discarded, the metadata will be used to identify which rows NOT to update.
The result of the execution of this new mapping will be:
```
id____name_______________________________
1400__FATS AND OILS______________________
1500__POULTRY PRODUCTS___________________
100___dairy and egg products_____________
1200__spices and herbs___________________
300___baby foods_________________________
```
The result shows the previous 2 rows in uppercase and 3 new, all in lower case and, as a result of the `toInteger` function, no leading zeros.
### Reference mappings ###
Analogous to foreign keys in a database, uniVocity supports the concept of references to identifiers produced in other mappings.
Now, we want to map the CSV entity `FOOD_DES` to the fixed-width entity `food`. The [FOOD_DES.csv](http://github.com/uniVocity/univocity-examples/tree/master/src/test/resources/examples/source_data/csv/FOOD_DES.csv) file has the field `FdGrp_Cd`, which contains references to [FD_GROUP.csv](http://github.com/uniVocity/univocity-examples/tree/master/src/test/resources/examples/source_data/csv/FD_GROUP.csv).
In the previous example, the identity mapping between `FD_GROUP` and `food_group` was declared as: `identity().associate("FdGrp_CD").to("id")`.
We want to maintain this association on the destination data store, instead of simply copying from whatever is in the input.
To do that, we need to give uniVocity the basic metadata it needs to be able to locate identifiers previously
mapped, and potentially transformed, by previous mappings.
A reference mapping is declared like this: `reference().using("FdGrp_Cd").referTo("FD_GROUP", "food_group").on("group")`
1. The `.using("FdGrp_Cd")` part tells uniVocity to read the value of this field in `FOOD_DES` (the source entity)
2. `.referTo("FD_GROUP", "food_group")` tells uniVocity to read the metadata generated by a mapping from "FD_GROUP" to "food_group".
It will use the values read from "FOOD_DES.FdGrp_Cd" to find the identifier previously persisted to "food_group"
3. `.on("group")` defines what field in the destination entity (in this case "food.group") should receive the identifier restored from the metadata.
References that could not be matched will be set to null by default. You can also choose to discard the row entirely, use a placeholder or abort the mapping cycle entirely.
The following example demonstrates how [FOOD_DES.csv](http://github.com/uniVocity/univocity-examples/tree/master/src/test/resources/examples/source_data/csv/FOOD_DES.csv) can be mapped to `food`:
```java
//Gets the existing data store mapping
DataStoreMapping mapping = engine.getMapping("csvDataStore", "fixedWidthDestination");
//Let's add a mapping between the food description CSV and the in-memory, fixed-with entity
EntityMapping foodMapping = mapping.map("FOOD_DES", "food");
foodMapping.identity().associate("NDB_No").to("id");
foodMapping.value().copy("Long_Desc", "SciName").to("description", "scientific_name");
//In the source data store, FOOD_DES contains a reference to FD_GROUP. In the destination, "food" contains a reference to "food_group".
//The reference mapping uses uniVocity's metadata to restore references to identifiers that were mapped and transformed.
//
//Our mapping from FD_GROUP to "group" transformed the values in "FdGrp_Cd" to integers, using the "toInteger" function.
//Now, when reading the field in FOOD_DES that references FD_GROUP, we need to convert its values using the "toInteger" function.
//The results will be used to query uniVocity's metadata and restore the corresponding values used as identifiers of "food_group"
foodMapping.reference().using("FdGrp_Cd").referTo("FD_GROUP", "food_group").on("group").readingWith("toInteger");
foodMapping.persistence().usingMetadata().deleteAll().insertNewRows();
//Executes a data mapping cycle again
engine.executeCycle();
```
The output of this mapping will be:
```
id____name_______________________________
1400__FATS AND OILS______________________
1500__POULTRY PRODUCTS___________________
100___dairy and egg products_____________
1200__spices and herbs___________________
300___baby foods_________________________
id______group_description_____________________________scientific_name_____
01001___100___Butter, salted__________________________?___________________
01002___1400__Butter, whipped, with salt______________?___________________
01004___100___Cheese, blue____________________________?___________________
01005___100___Cheese, brick___________________________?___________________
01007___100___Cheese, camembert_______________________?___________________
01008___100___Cheese, caraway_________________________?___________________
01009___100___Cheese, cheddar_________________________?___________________
01010___100___Cheese, cheshire________________________?___________________
01110___300___Milk shakes, thick chocolate____________?___________________
01111___300___Milk shakes, thick vanilla______________?___________________
```
### Intercepting engine lifecycle activities ###
Each time the data integration engine is executed, several lifecycle events take place:
- A data mapping cycle is started
- An entity mapping is started within the cycle
- The entity mapping is finalized
- The mapping cycle is finalized
- The engine becomes ready to execute a new data mapping cycle
- An engine shutdown process is started
- The engine stops completely
To obtain notifications about one or more of such events when they happen, you can use an [EngineLifecycleInterceptor](http://github.com/uniVocity/univocity-api/tree/master/src/main/java/com/univocity/api/engine/EngineLifecycleInterceptor.java). This allows you to
prepare for, or to perform cleanup after, each lifecycle event. uniVocity provides an [EngineLifecycleContext](http://github.com/uniVocity/univocity-api/tree/master/src/main/java/com/univocity/api/engine/EngineLifecycleContext.java) object to its interceptors,
so you can obtain more detailed information about the current state of the engine.
The following example prints out a few messages of an [EngineLifecycleInterceptor](http://github.com/uniVocity/univocity-api/tree/master/src/main/java/com/univocity/api/engine/EngineLifecycleInterceptor.java) in action.
```java
engine.addInterceptor(new EngineLifecycleInterceptor() {
//prints the current active scope in the engine.
private String getScope(EngineLifecycleContext context) {
return ". Current scope: " + context.getExecutionContext().getCurrentActiveScope();
}
//prints the current mapping being executed. The --X arrow represents a data removal
//operation, whilst the --> represents a data mapping.
private String getMapping(EngineLifecycleContext context) {
EntityMappingContext entityMapping = context.getCurrentEntityMapping();
String arrow = entityMapping.isExclusionMapping() ? " --X " : " --> ";
return entityMapping.getSourceEntity() + arrow + entityMapping.getDestinationEntity();
}
//Executed when the engine becomes available after a data mapping cycle.
@Override
public void engineReady(EngineLifecycleContext context) {
String name = context.getEngineName();
int cycle = context.getCurrentCycle();
println(out, name + " ready. Cycles executed: " + cycle + getScope(context));
}
//Called when a data mapping cycle has been started
@Override
public void cycleStarted(EngineLifecycleContext context) {
println(out);
println(out, "Starting new cycle: " + context.getCurrentCycle() + getScope(context));
}
//Called when an individual mapping between two entities is about to be executed.
//Note some mappings can be exclusion mappings generated automatically by uniVocity
//when deletions are enabled in the entity mappings.
@Override
public void mappingStarted(EngineLifecycleContext context) {
println(out, " Executing: " + getMapping(context) + getScope(context));
}
//Called when an individual mapping between two entities has been completed
@Override
public void mappingCompleted(EngineLifecycleContext context) {
println(out, " Completed: " + getMapping(context) + getScope(context));
}
//Called when a data mapping cycle has been completed
@Override
public void cycleCompleted(EngineLifecycleContext context) {
println(out, "Completed cycle: " + context.getCurrentCycle() + getScope(context));
println(out);
}
//Called when the engine begins to shut down
@Override
public void engineShuttingDown(EngineLifecycleContext context) {
println(out, "Shutting down " + context.getEngineName());
}
//Called when the shut down process is finalized and the engine is effectively stopped.
@Override
public void engineStopped(EngineLifecycleContext context) {
println(out, context.getEngineName() + " shut down.");
}
});
//executes a mapping cycle to trigger some interceptor methods.
engine.executeCycle();
//shuts down the engine to trigger "engineShuttingDown" and "engineStopped"
Univocity.shutdown(engineName);
```
The execution of this code will produce the following output:
```
Starting new cycle: 5. Current scope: CYCLE
Executing: FOOD_DES --X food. Current scope: MAPPING
Completed: FOOD_DES --X food. Current scope: MAPPING
Executing: FD_GROUP --> food_group. Current scope: MAPPING
Completed: FD_GROUP --> food_group. Current scope: MAPPING
Executing: FOOD_DES --> food. Current scope: MAPPING
Completed: FOOD_DES --> food. Current scope: MAPPING
Completed cycle: 5. Current scope: CYCLE
engine_001 ready. Cycles executed: 5. Current scope: APPLICATION
Shutting down engine_001
engine_001 shut down.
```
### Map functions ###
`Maps` can be used as functions, where the input parameter of the function is used as a key to retrieve a value from the map. For example:
```java
//Maps can be used as functions as well. Here we create a map from
//codes of food groups to descriptions:
Map groupCodeToGroupNames = new HashMap();
groupCodeToGroupNames.put("0100", "Dairy");
groupCodeToGroupNames.put("0300", "Baby");
groupCodeToGroupNames.put("1400", "Fats");
//This will create a function named "getNameOfGroup" that will use the given map to produce values
engine.addMap("getNameOfGroup", groupCodeToGroupNames);
//Let's test this function with a mapping to food descriptions:
EntityMapping foodMapping = mapping.map("FOOD_DES", "food");
foodMapping.identity().associate("NDB_No").to("id");
//Here we invoke the "getNameOfGroup" function using values read from FdGrp_CD
//This will return the description associated with each code in the map.
foodMapping.value().copy("FdGrp_CD").to("group").readingWith("getNameOfGroup");
foodMapping.value().copy("Long_Desc").to("description");
engine.executeCycle();
```
In the above code, we created a `Map` from group codes to their descriptions. By creating a mapping using `copy("FdGrp_CD").to("group").readingWith("getNameOfGroup")`, we
assign the values in `FdGrp_CD` as parameters of the map function `getNameOfGroup`. The function will then return a group name for each code, and this name will be
written to the destination field `group`.
The execution of this code will produce the following output:
```
id______group_description_____________________________scientific_name_____
01001___Dairy_Butter, salted__________________________?___________________
01002___Fats__Butter, whipped, with salt______________?___________________
01004___Dairy_Cheese, blue____________________________?___________________
01005___Dairy_Cheese, brick___________________________?___________________
01007___Dairy_Cheese, camembert_______________________?___________________
01008___Dairy_Cheese, caraway_________________________?___________________
01009___Dairy_Cheese, cheddar_________________________?___________________
01010___Dairy_Cheese, cheshire________________________?___________________
01110___Baby__Milk shakes, thick chocolate____________?___________________
01111___Baby__Milk shakes, thick vanilla______________?___________________
```
### Objects with functions ###
In many situations having functions in isolation is insufficient to handle more complex data mappings. There are cases when information must
processed in multiple steps, or the value of one or more functions must be determined from the state of an object. To handle such cases, uniVocity
lets you annotate multiple methods of your classes with the [FunctionWrapper](http://github.com/uniVocity/univocity-api/tree/master/src/main/java/com/univocity/api/config/annotation/FunctionWrapper.java) annotation.
In the following example, we use an instance of the [NameSplitter](http://github.com/uniVocity/univocity-examples/tree/master/src/test/java/com/univocity/examples/NameSplitter.java) class.
It will be used split `String`s between commas and assign a numeric value to each unique `String`. The numeric values that represent each
component of the input `String` will then be concatenated with the pipe character. The `toCodes` method of this class is annotated with
`@FunctionWrapper(scope = EngineScope.APPLICATION)`, and uniVocity will create a function with it so it can be used as a function in mappings:
```java
//Here we use an object whose class contains methods annotated with @FunctionWrapper.
//The annotated methods in this object will be used to create functions accessible by uniVocity.
//This is useful to manipulate the state of data using multiple functions and to expose
//information from a complex object that can't be easily used in mappings or datasets.
NameSplitter splitter = new NameSplitter();
engine.addFunctions(splitter);
DataStoreMapping mapping = engine.map("csvDataStore", "fixedWidthDestination");
EntityMapping foodMapping = mapping.map("FOOD_DES", "food");
foodMapping.identity().associate("NDB_No").to("id");
//Here we call the "toCodes" function in our splitter object to convert each string after a comma into a code.
//Each description in the destination will have numeric codes concatenated with the pipe character.
foodMapping.value().copy("Long_Desc").to("description").readingWith("toCodes");
engine.executeCycle();
```
In the example above, `engine.addFunctions(splitter)` will create a function named `toCodes` in the engine. It is just a wrapper around the original
method of the *splitter* instance. The mapping declaring `copy("Long_Desc").to("description").readingWith("toCodes")` will send the original
food descriptions to this *toCodes* function and the result will be written to the destination field *description*.
The output will be as follows:
```
1 - butter
2 - salted
3 - whipped
4 - with salt
5 - cheese
6 - blue
7 - brick
8 - camembert
9 - caraway
10 - cheddar
11 - cheshire
12 - milk shakes
13 - thick chocolate
14 - thick vanilla
id______group_description_____________________________scientific_name_____
01001___?_____1|2_____________________________________?___________________
01002___?_____1|3|4___________________________________?___________________
01004___?_____5|6_____________________________________?___________________
01005___?_____5|7_____________________________________?___________________
01007___?_____5|8_____________________________________?___________________
01008___?_____5|9_____________________________________?___________________
01009___?_____5|10____________________________________?___________________
01010___?_____5|11____________________________________?___________________
01110___?_____12|13___________________________________?___________________
01111___?_____12|14___________________________________?___________________
```
## Auto-detection and other useful capabilities ##
uniVocity provides many useful features to make your life much easier when you need to implement common tasks. Let's see a few powerful and extremely convenient features.
### Auto-detection of mappings ###
When mapping data from multiple fields on a given source to a destination, the most common operation one need to execute is to define mappings. uniVocity makes this process much more efficient with its *auto-detection* features. By default, uniVocity will map entities and their fields automatically if their names are similar.
Let's have a look at the following example, where we map the contents of a set of CSV files to database tables with matching names and columns:
```java
//Let's create a mapping between the CSV and JDBC data stores
DataStoreMapping dsMapping = engine.map("csvDataStore", "originalSchema");
//Names of tables and columns match exactly. We can just autodetect everything.
dsMapping.autodetectMappings();
//And map all data from CSV to our database tables.
engine.executeCycle();
//Let's print the data migrated to database tables
print(readFoodGroupTable());
print(readFoodDescriptionTable());
```
That's *3 lines of code!* All we have to do is to create a mapping between two data stores, invoke `autodetectMappings` and execute a mapping cycle. This is the data migrated into our database:
```
===[ FD_GROUP ]===
FdGrp_CD___FdGrp_Desc___________________________________________________________
0100_______Dairy and Egg Products_______________________________________________
0300_______Baby Foods___________________________________________________________
1200_______Spices and Herbs_____________________________________________________
1400_______Fats and Oils________________________________________________________
1500_______Poultry Products_____________________________________________________
===[ FOOD_DES ]===
NDB_No___FdGrp_Cd___Long_Desc____________________________________Shrt_Desc__________________________________________ComName___ManufacName___Survey___Ref_Desc___Refuse___SciName___N_Factor___Pro_Factor___Fat_Factor___CHO_Factor___
01001____0100_______Butter, salted_______________________________BUTTER,WITH SALT___________________________________________________________Y___________________0__________________2.380______4.270________1.890________8.870________
01002____1400_______Butter, whipped, with salt___________________BUTTER,WHIPPED,WITH SALT___________________________________________________Y___________________0__________________8.380______3.270________2.790________1.870________
01004____0100_______Cheese, blue_________________________________CHEESE,BLUE________________________________________________________________Y___________________0__________________4.380______9.270________4.590________3.870________
01005____0100_______Cheese, brick________________________________CHEESE,BRICK_______________________________________________________________Y___________________0__________________4.380______5.270________6.790________8.870________
01007____0100_______Cheese, camembert____________________________CHEESE,CAMEMBERT___________________________________________________________Y___________________0__________________5.380______6.270________2.790________3.870________
01008____0100_______Cheese, caraway______________________________CHEESE,CARAWAY_________________________________________________________________________________0__________________6.360______6.270________8.790________3.890________
01009____0100_______Cheese, cheddar______________________________CHEESE,CHEDDAR_____________________________________________________________Y___________________0__________________6.330______4.220________8.130________3.120________
01010____0100_______Cheese, cheshire_____________________________CHEESE,CHESHIRE________________________________________________________________________________0__________________4.780______4.670________0.790________3.130________
01110____0300_______Milk shakes, thick chocolate_________________MILK SHAKES,THICK CHOC_________________________________________________________________________0__________________8.380______0.270________3.790________1.870________
01111____0300_______Milk shakes, thick vanilla___________________MILK SHAKES,THICK VANILLA______________________________________________________________________0__________________3.980______8.970________3.690________3.870________
```
But if everything is auto-detected, how can you have any flexibility over how the data is processed and transferred? Read on.
### Applying readers and functions to multiple mappings at once ###
uniVocity lets you assign one or more [RowReader](http://github.com/uniVocity/univocity-api/tree/master/src/main/java/com/univocity/api/engine/RowReader.java)'s to the input, output and persisted data of any [EntityMapping](http://github.com/uniVocity/univocity-api/tree/master/src/main/java/com/univocity/api/config/builders/EntityMapping.java) in a [DataStoreMapping](http://github.com/uniVocity/univocity-api/tree/master/src/main/java/com/univocity/api/config/builders/DataStoreMapping.java).
You can also assign functions to any field mapping of an [EntityMapping](http://github.com/uniVocity/univocity-api/tree/master/src/main/java/com/univocity/api/config/builders/EntityMapping.java). Let's have a look at the following example:
```java
//We can manipulate the rows of multiple mappings at once.
//This RowReader lowercases all strings in all rows processed by all entity mappings:
mapping.addInputRowReader(new RowReader() {
@Override
public void processRow(Object[] inputRow, Object[] outputRow, RowMappingContext context) {
for (int i = 0; i < inputRow.length; i++) {
if (inputRow[i] instanceof String) {
inputRow[i] = inputRow[i].toString().toLowerCase();
}
}
}
});
//We can also associate specific functions with fields of source entities in existing mappings.
//This function reverses strings
engine.addFunction(EngineScope.STATELESS, "reverse", new FunctionCall() {
@Override
public String execute(String input) {
return StringUtils.reverse(input);
}
});
//Here, we associate the "reverse" function with 2 fields of source entity "FOOD_DES"
mapping.getMapping("FOOD_DES", "FOOD_DES").transformFields("reverse", "long_desc", "SHRT_DESC");
//Let's execute a mapping cycle. We expect to have all data in lower case, in all tables.
//We also expect to have reversed descriptions in FOOD_DES
engine.executeCycle();
```
Here an anonymous [RowReader](http://github.com/uniVocity/univocity-api/tree/master/src/main/java/com/univocity/api/engine/RowReader.java) is added to the input of all [EntityMapping](http://github.com/uniVocity/univocity-api/tree/master/src/main/java/com/univocity/api/config/builders/EntityMapping.java)s to convert any `String` value to lower case. Following that, we also create a function to reverse `Strings`. This will be applied to the description fields of the source entity, `Long_Desc` and `Shrt_Desc`. After executing this mapping cycle, the data in the database will look like this:
```
===[ FD_GROUP ]===
FdGrp_CD___FdGrp_Desc___________________________________________________________
0100_______dairy and egg products_______________________________________________
0300_______baby foods___________________________________________________________
1200_______spices and herbs_____________________________________________________
1400_______fats and oils________________________________________________________
1500_______poultry products_____________________________________________________
===[ FOOD_DES ]===
NDB_No___FdGrp_Cd___Long_Desc____________________________________Shrt_Desc__________________________________________ComName___ManufacName___Survey___Ref_Desc___Refuse___SciName___N_Factor___Pro_Factor___Fat_Factor___CHO_Factor___
01001____0100_______detlas ,rettub_______________________________tlas htiw,rettub___________________________________________________________y___________________0__________________2.380______4.270________1.890________8.870________
01002____1400_______tlas htiw ,deppihw ,rettub___________________tlas htiw,deppihw,rettub___________________________________________________y___________________0__________________8.380______3.270________2.790________1.870________
01004____0100_______eulb ,eseehc_________________________________eulb,eseehc________________________________________________________________y___________________0__________________4.380______9.270________4.590________3.870________
01005____0100_______kcirb ,eseehc________________________________kcirb,eseehc_______________________________________________________________y___________________0__________________4.380______5.270________6.790________8.870________
01007____0100_______trebmemac ,eseehc____________________________trebmemac,eseehc___________________________________________________________y___________________0__________________5.380______6.270________2.790________3.870________
01008____0100_______yawarac ,eseehc______________________________yawarac,eseehc_________________________________________________________________________________0__________________6.360______6.270________8.790________3.890________
01009____0100_______raddehc ,eseehc______________________________raddehc,eseehc_____________________________________________________________y___________________0__________________6.330______4.220________8.130________3.120________
01010____0100_______erihsehc ,eseehc_____________________________erihsehc,eseehc________________________________________________________________________________0__________________4.780______4.670________0.790________3.130________
01110____0300_______etalocohc kciht ,sekahs klim_________________cohc kciht,sekahs klim_________________________________________________________________________0__________________8.380______0.270________3.790________1.870________
01111____0300_______allinav kciht ,sekahs klim___________________allinav kciht,sekahs klim______________________________________________________________________0__________________3.980______8.970________3.690________3.870________
```
### Dumping data ###
Dumping data from a database can become a cumbersome task, especially if you want to have some control over what information should be extracted. uniVocity allows you to auto-generate mappings from any source of data to a resource that does not exist. What happens is that uniVocity will attempt to create destination entities to accomodate the data of each source entity. For example: if you auto-detect mappings from a JDBC data store to a CSV data store with an output directory, a new CSV file will be created in the output directory. This file will contain all columns of the source table, and the proper mappings will be created automatically to transfer data into the file.
The following example demonstrates how you can dump your entire database to TSV files.:
```java
//Now, let's map from the database tables to TSV files that do not exist anywhere (yet).
//All we need to do is to provide an output directory where the files should be created:
File tsvOutputDir = new File(System.getProperty("user.home") + File.separator + "TSV");
//Let's create a TSV data store configuration with this output directory:
TsvDataStoreConfiguration tsvConfig = new TsvDataStoreConfiguration("tsvOutput");
tsvConfig.setOutputDirectory(tsvOutputDir, "UTF-8");
//We want to print out headers on the TSV files to identify the column names
tsvConfig.getDefaultEntityConfiguration().setHeaderWritingEnabled(true);
//Let's just reuse the dataSource we already have to connect to the database loaded in the previous example
JdbcDataStoreConfiguration jdbcConfig = new JdbcDataStoreConfiguration("database", this.dataSource);
jdbcConfig.setSchema("public");
//And let's configure a new data integration engine for our purposes.
Univocity.registerEngine(new EngineConfiguration("TSV_DUMP", tsvConfig, jdbcConfig));
DataIntegrationEngine engine = Univocity.getEngine("TSV_DUMP");
//Now, we map from our database to the TSV data store:
DataStoreMapping dsMapping = engine.map("database", "tsvOutput");
//dsMapping.configurePersistenceDefaults().notUsingMetadata().deleteAll().insertNewRows();
//Here, the boolean argument means that destination entities should be created automatically.
//The destination entities will be created to match the respective configurations of detected source entities.
//The destination data store must support dynamic creation of entities for this to work.
dsMapping.autodetectMappings(true);
//After executing a mapping cycle, we will have a TSV file for each table in the database.
engine.executeCycle();
```
Once you execute this example, expect to find a folder `TSV` on your home directory, with the following files:
```
==[ FD_GROUP.tsv ]==
FDGRP_CD FDGRP_DESC
0100 dairy and egg products
0300 baby foods
1200 spices and herbs
1400 fats and oils
1500 poultry products
==[ FOOD_DES.tsv ]==
NDB_NO FDGRP_CD LONG_DESC SHRT_DESC COMNAME MANUFACNAME SURVEY REF_DESC REFUSE SCINAME N_FACTOR PRO_FACTOR FAT_FACTOR CHO_FACTOR
01001 0100 detlas ,rettub tlas htiw,rettub y 0 2.380 4.270 1.890 8.870
01002 1400 tlas htiw ,deppihw ,rettub tlas htiw,deppihw,rettub y 0 8.380 3.270 2.790 1.870
01004 0100 eulb ,eseehc eulb,eseehc y 0 4.380 9.270 4.590 3.870
01005 0100 kcirb ,eseehc kcirb,eseehc y 0 4.380 5.270 6.790 8.870
01007 0100 trebmemac ,eseehc trebmemac,eseehc y 0 5.380 6.270 2.790 3.870
01008 0100 yawarac ,eseehc yawarac,eseehc 0 6.360 6.270 8.790 3.890
01009 0100 raddehc ,eseehc raddehc,eseehc y 0 6.330 4.220 8.130 3.120
01010 0100 erihsehc ,eseehc erihsehc,eseehc 0 4.780 4.670 0.790 3.130
01110 0300 etalocohc kciht ,sekahs klim cohc kciht,sekahs klim 0 8.380 0.270 3.790 1.870
01111 0300 allinav kciht ,sekahs klim allinav kciht,sekahs klim 0 3.980 8.970 3.690 3.870
```
You can easily determine what data should be dumped into the generated files using a [RowReader](http://github.com/uniVocity/univocity-api/tree/master/src/main/java/com/univocity/api/engine/RowReader.java). Simply discard the rows with values you don't want. This is very convenient to produce test datasets to reproduce isolated scenarios. But wait, there's more...
### Generating database schemas ###
Based on the configurations of your input data stores, you can easily generate a database schema for your favorite database and store data coming from your input entities.
The following example demonstrates how to generate an equivalent database schema based on the entities and configurations of your data store:
```java
//Here we export each entity from the "database" data store to a SQL "create table" script
String schemaExport = engine.exportEntities("database")
.asCreateTableScript(DatabaseDialect.HSQLDB) //generate the script using HSQLDB's dialect
.toObject(); //returns the export result as a String.
//Let's print the script:
println("--[ HSQLDB script ]--");
println(schemaExport);
//Again, we export the same entities, but this time we want the script generated differently:
schemaExport = engine.exportEntities("database")
.asCreateTableScript(DatabaseDialect.SQLServer_2012) //generate a script compatible with SQL Server 2012 dialect
.noNotNullConstraint() //do not create NOT NULL constraints
.noPrimaryKeyConstraint() // do not create PRIMARY KEY constraints
.toObject();
//Let's see how this one shows up:
println("--[ SQL Server script ]--");
print(schemaExport);
```
You have some degree of control over the ouput result and it's easy to omit some constraints such as NOT NULL or even PRIMARY KEY. This may be desirable especially if you are generating a test database to be loaded with files containing data samples.
```
--[ HSQLDB script ]--
create table FD_GROUP (FDGRP_CD char(4), FDGRP_DESC varchar(128) not null, primary key (FDGRP_CD));
create table FOOD_DES (NDB_NO char(5), FDGRP_CD char(4), LONG_DESC varchar(256) not null, SHRT_DESC varchar(64), COMNAME varchar(128), MANUFACNAME varchar(128), SURVEY char(1), REF_DESC varchar(256), REFUSE integer, SCINAME varchar(128), N_FACTOR decimal(14,3), PRO_FACTOR decimal(14,3), FAT_FACTOR decimal(14,3), CHO_FACTOR decimal(14,3), primary key (NDB_NO));
--[ SQL Server script ]--
create table FD_GROUP (FDGRP_CD char(4), FDGRP_DESC varchar(128)) GO
create table FOOD_DES (NDB_NO char(5), FDGRP_CD char(4), LONG_DESC varchar(256), SHRT_DESC varchar(64), COMNAME varchar(128), MANUFACNAME varchar(128), SURVEY char(1), REF_DESC varchar(256), REFUSE int, SCINAME varchar(128), N_FACTOR decimal(14,3), PRO_FACTOR decimal(14,3), FAT_FACTOR decimal(14,3), CHO_FACTOR decimal(14,3)) GO
```
You can quickly create test scenarios for systems that depend of complicated databases: use uniVocity to export the schema to an in-memory database such as HSQLDB, and dump the contents you need to work with into CSV files. Then, simply use the autodection feature to generate mappings from these files to your in-memory database tables and you are ready to test your application.
There are many more features options in the API that you can explore. For example, the auto-detection mechanism allows you to specific a [NameMatcher](http://github.com/uniVocity/univocity-api/tree/master/src/main/java/com/univocity/api/engine/NameMatcher.java) for example, to control what field names and entity names to match, and how. Check the [uniVocity-API](http://docs.univocity.com/api/1.0.3/index.html) and the [javadocs](http://docs.univocity.com/api/1.0.6/index.html) for more information.
## Queries, more functions, and variables ##
uniVocity strives for convenience. We thought it would be convenient to use SQL to produce data from your entities. Your data store doesn't even need to be a database!
In the following example, we tell uniVocity to enable database-like operations in the CSV entities. In the
example class [Tutorial002QueriesFunctionsAndVariables](http://github.com/uniVocity/univocity-examples/tree/master/src/test/java/com/univocity/examples/Tutorial002QueriesFunctionsAndVariables.java), we did that by configuring the data store width
`enableDatabaseOperationsIn("FD_GROUP", "FOOD_DES")` and providing the length of each field in entities *FD_GROUP* and *FOOD_DES*.
When database operations are enabled for any entity, an in-memory database will be created automatically, and all data in these entities will be dumped
into in-memory tables. Here's an example of how you can take advantage of this feature to easily create data mappings:
```java
//As database-like operations are enabled for FD_GROUP and FOOD_DES, we can create a SQL query with them
engine.addQuery(EngineScope.CYCLE, "groupsInUse") //scope and name of the query. CYCLE scope will make the engine reuse results within a cycle
.onDataStore("csvDataStore") // data store name where the query will be executed.
.fromString(" " +
" SELECT fdgrp_cd AS id, fdgrp_desc AS name " + // *NOTE* we are giving labels to the column names here.
" FROM fd_group g " +
" WHERE fdgrp_cd IN ( SELECT DISTINCT fdgrp_Cd FROM food_des ) " +
" ORDER BY fdgrp_desc ASC "
).returnDataset(); // this query returns a dataset that can be used as the source entity in entity mappings.
//Let's create a mapping between the CSV and fixed-width data stores.
DataStoreMapping mapping = engine.map("csvDataStore", "fixedWidthDestination");
//Here we map the result of the query to the fixed-width entity "food_group"
//Note the query name is declared within curly braces.
EntityMapping queryMapping = mapping.map("{groupsInUse}", "food_group");
//As we labeled the columns in the query to be the same as in the destination entity,
//we can just use auto-detection and all fields with similar names will be associated, including identifiers.
queryMapping.autodetectMappings();
//Let's execute and check the output
engine.executeCycle();
```
In this example, we want to map only the food groups that are actually used. To do that, we add a query to the engine using
`engine.addQuery(EngineScope.CYCLE, "groupsInUse")`. This will create a query with the name *groupsInUse*, and its results will be reused within the mapping cycle:
if multiple mappings depend on the data returned by *groupsInUse*, the SQL query won't be executed again until a new mapping cycle is started.
The `.onDataStore("csvDataStore")` indicates to which data store the query applies.
`.from`String`(" select ... ` is the query itself. You can also use queries from files and resources in your classpath. Notice the select statement has a label for each
selected column, and the labels match the fields names in the destination entity. We did that so we could use auto-detection, instead of mapping each field manually.
Finally `.returnDataset` tells the engine to return the query results as a [Dataset](http://github.com/uniVocity/univocity-api/tree/master/src/main/java/com/univocity/api/data/Dataset.java). [Dataset](http://github.com/uniVocity/univocity-api/tree/master/src/main/java/com/univocity/api/data/Dataset.java)s can be used as regular entities in any mapping.
In this case, the dataset will be produced with the result of the query, and its field names will match the labels of the select statement.
With the *groupsInUse* query properly built and initialized, an entity mapping is created using `map("{groupsInUse}", "food_group")`.
Notice the curly braces around the query name. These indicate the source entity is a function. We are effectively mapping a function to an entity.
Lastly, `queryMapping.autodetectMappings()` will create mappings between fields with the same name. As the dataset produced by our query contains the
field names *id* and *name*, and the destination entity *food_group* contains fields with these names, the mappings will be automatically created. As the destination
entity *food_group* was configured with *id* as its identifier, an identity mapping will be created for this field.
After executing the mapping cycle, the output contains only the food groups that are actually used in [FOOD_DES.csv](http://github.com/uniVocity/univocity-examples/tree/master/src/test/resources/examples/source_data/csv/FOOD_DES.csv):
```
id____name_______________________________
0300__Baby Foods_________________________
0100__Dairy and Egg Products_____________
1400__Fats and Oils______________________
```
### Using parameters and variables ###
Queries can also be parameterized and used as functions anywhere. The following example demonstrates how you can define and use them in conjunction with variables.
```java
//This query uses the description of a food group to find and return its code.
//Note that queries that do not return a dataset are simply functions that can be used everywhere.
//Parameters: groupDescription
engine.addQuery(EngineScope.MAPPING, "findGroupCode")
.onDataStore("csvDataStore")
.fromString("SELECT fdgrp_cd FROM fd_group WHERE fdgrp_desc like :groupDescription")
.returnSingleValue() //expects one single result.
.directly() // just returns the result without applying any function to it.
.onErrorAbort(); //if the query does not return any value, or returns more than one value, abort the data mapping cycle.
//Locates all foods that are part of a given group and that have a given description.
//In SQL queries used by uniVocity's entities, parameters must be set in the order they appear in a query.
//Parameters: foodGroupCode and foodDescription
//Custom entities provided by you can use any other approach to determine how queries and parameters are processed.
engine.addQuery(EngineScope.STATELESS, "foodsOfGroup")
.onDataStore("csvDataStore")
.fromString(" SELECT * FROM food_des WHERE fdgrp_cd = :foodGroupCode AND Shrt_Desc like :foodDescription")
.returnDataset();
DataStoreMapping mapping = engine.getMapping("csvDataStore", "fixedWidthDestination");
//Here we define our mapping using a parameterized query: Parameters can be other functions, variable names, constants or plain strings.
//This expression will be evaluated as follows:
// 1 - The value of the variable "groupName" will be sent to the "findGroupCode" query, producing the code of the given food group.
// This result is then used to set the "foodGroupCode" parameter.
// 2 - The value of variable "foodName" will be used to set the "foodDescription" parameter
// 3 - The "foodsOfGroup" query will be called with these parameter values, and return a dataset that can be used to map data to "food".
EntityMapping queryMapping = mapping.map("{foodsOfGroup(findGroupCode($groupName), $foodName)}", "food");
//Here we map the column names in the dataset produced by the "foodsOfGroup" query to the fields in "food".
queryMapping.identity().associate("NDB_No").to("id");
queryMapping.value().copy("Long_Desc", "SciName").to("description", "scientific_name");
queryMapping.reference().using("{findGroupCode($groupName)}").referTo("{groupsInUse}", "food_group").on("group");
queryMapping.persistence().usingMetadata().deleteDisabled().updateDisabled().insertNewRows();
//Before executing the cycle, let's set the values of each variable we are using as a parameter in "foodsOfGroup"
engine.setVariable("groupName", "Dairy%");
engine.setVariable("foodName", "CHEESE%");
engine.executeCycle();
printDataInEntities(output, "After first cycle:");
//Let's change the variables and execute the cyle again.
engine.setVariable("groupName", "Baby Foods");
engine.setVariable("foodName", "%");
engine.executeCycle();
printDataInEntities(output, "After second cycle:");
```
The first query, *findGroupCode*, contains the named parameter `:groupDescription`. The `.returnSingleValue` option mandates that just one value should be returned.
`.directly` indicates no function will be applied to the result of this query, and `.onErrorAbort` will make the query throw an exception in case an
unexpected result is produced. In this case, an unexpected result would be more than one value, or no value at all, being produced.
The next query, *foodsOfGroup*, contains two named parameters: `:foodGroupCode` and `:foodDescription`. This query returns a [Dataset](http://github.com/uniVocity/univocity-api/tree/master/src/main/java/com/univocity/api/data/Dataset.java).
With the queries defined, a very different entity mapping is created: `map("{foodsOfGroup(findGroupCode($groupName), $foodName)}", "food")`.
This mapping uses the *foodsOfGroup* function to produce a dataset and map it to the destination entity *food*.
However, *foodsOfGroup* takes two parameters: the code of a food group, and the description of a food. Instead of using a literal value, we chose
to use the *findGroupCode* function to fetch food group codes. The $ sign in `$groupName` indicates there is a variable with the name *groupName*.
When executing this mapping, uniVocity will read the value of the variable *groupName* and invoke the function *findGroupCode* with it. The result
will be passed on to the fist parameter of the function *foodsOfGroup*, and its second parameter will receive the value of the variable *foodName*.
In this example, the variables *groupName* and *foodName* are set using the commands `engine.setVariable("groupName", "Dairy%")` and
`engine.setVariable("foodName", "CHEESE%")`, before executing a mapping cycle.
> ** Important: ** when using parameterized functions/queries as source entities, the *signature* is used to generate the metadata entries, not the actual variable values.
> This may have implications when detecting data updates for the destination entity. For example, if your mapping is configured with the `deleteAbsent` persistence setting:
>
> 1. Records will be mapped mapped using a set of variable values
> 2. On the next cycle, if the values of these variables changed, the resulting dataset may be different.
> 3. uniVocity will remove records previously mapped, that are not in the current dataset.
>
> If you don't want such behavior, then you may need to consider creating a duplicate entity mapping using literal values instead of variables.
There's also an interesting reference mapping in this example: `reference().using("{findGroupCode($groupName)}").referTo("{groupsInUse}", "food_group").on("group")`
This reference mapping uses the result of the function call `findGroupCode($groupName)` to obtain the code of a food group. The result is applied to uniVocity's metadata
to discover which identifier was used in the destination of mapping *{groupsInUse} -> food_group* (from the previous example). The identifier restored from uniVocity's
metadata will then be copied to food's *group* field.
Finally, we set the variables *groupName* to "Dairy%" and *foodName* to "CHEESE%", execute a mapping cycle, then print the results.
Next, we set them again to "Baby Foods" and "%" respectively. The result of both mapping cycles is as follows:
```
After first cycle:
-- Food groups --
id____name_______________________________
0300__Baby Foods_________________________
0100__Dairy and Egg Products_____________
1400__Fats and Oils______________________
-- Foods --
id______group_description_____________________________scientific_name_____
01004___0100__Cheese, blue____________________________?___________________
01005___0100__Cheese, brick___________________________?___________________
01007___0100__Cheese, camembert_______________________?___________________
01008___0100__Cheese, caraway_________________________?___________________
01009___0100__Cheese, cheddar_________________________?___________________
01010___0100__Cheese, cheshire________________________?___________________
After second cycle:
-- Food groups --
id____name_______________________________
0300__Baby Foods_________________________
0100__Dairy and Egg Products_____________
1400__Fats and Oils______________________
-- Foods --
id______group_description_____________________________scientific_name_____
01004___0100__Cheese, blue____________________________?___________________
01005___0100__Cheese, brick___________________________?___________________
01007___0100__Cheese, camembert_______________________?___________________
01008___0100__Cheese, caraway_________________________?___________________
01009___0100__Cheese, cheddar_________________________?___________________
01010___0100__Cheese, cheshire________________________?___________________
01110___0300__Milk shakes, thick chocolate____________?___________________
01111___0300__Milk shakes, thick vanilla______________?___________________
```
## Mapping between incompatible schemas ##
One of the most powerful features of uniVocity is probably how easy it is to map data between different schemas, consistently. To demonstrate how this works
we are going to map data in the [FD_GROUP.csv](http://github.com/uniVocity/univocity-examples/tree/master/src/test/resources/examples/source_data/csv/FD_GROUP.csv) and [FOOD_DES.csv](http://github.com/uniVocity/univocity-examples/tree/master/src/test/resources/examples/source_data/csv/FOOD_DES.csv) files to a peculiar database.
This database stores food information and their descriptions in multiple languages
The following entity-relationship diagram shows how its tables are associated:

The database DDL scripts are located [here](http://github.com/uniVocity/univocity-examples/tree/master/src/test/resources/examples/new_schema).
In this example, we will populate the following tables:
* LOCALE - the languages supported in the database.
* FOOD_GROUP - groups of foods (without their descriptions)
* FOOD_GROUP_DETAILS - descriptions and locale dependent information of each FOOD_GROUP
* FOOD_NAME - names of foods (without locale dependent information)
* FOOD_NAME_DETAILS - descriptions and locale dependent information of each FOOD_NAME
* FOOD - food information: name, proteins, energy, etc.
* GROUP_OF_FOOD - join table between foods and different groups.
The data integration engine will now be initialized with a [JdbcDataStoreConfiguration](http://github.com/uniVocity/univocity-api/tree/master/src/main/java/com/univocity/api/entity/jdbc/JdbcDataStoreConfiguration.java). First we need to create a configuration object for the database.
As uniVocity is able to query the JDBC data store to return all information it needs (tables, columns and primary keys),
the configuration effort is fairly minimal:
```java
//Creates a new JDBC data store based on a javax.sql.DataSource, with the name "new_schema".
//Upon initialization, uniVocity will try to auto-detect all available tables, columns and primary keys
//If this doesn't work you still can configure each table manually.
JdbcDataStoreConfiguration newSchemaDataStore = new JdbcDataStoreConfiguration(dataStoreName, dataSource);
//The database contains lots of internal tables in addition to the tables we are interested in
//By setting the schema to "public", these internal database tables won't be made available to uniVocity.
newSchemaDataStore.setSchema("public");
```
The [CsvDataStoreConfiguration](http://github.com/uniVocity/univocity-api/tree/master/src/main/java/com/univocity/api/entity/text/csv/CsvDataStoreConfiguration.java) remains the same as before. We just need to register the engine.
```java
//Creates a new engine configuration to map data between the entities in CSV and JDBC data stores
EngineConfiguration engineConfig = new EngineConfiguration(engineName, csvDataStore, newSchemaDataStore);
//Registers this engine configuration.
Univocity.registerEngine(engineConfig);
```
With the engine ready, all we need to do is to configure the mappings and execute a mapping cycle:
```java
DataIntegrationEngine engine = Univocity.getEngine(engineName);
//Let's create a mapping between the CSV and JDBC data stores
DataStoreMapping dsMapping = engine.map("csvDataStore", "newSchema");
//Every entity mapping will be created with the following defaults for handling data
dsMapping.configurePersistenceDefaults().usingMetadata().deleteDisabled().updateModified().insertNewRows();
//Let's create two variables to represent the current locale information.
engine.setVariable("locale", "en_US");
engine.setVariable("localeDescription", "American English");
EntityMapping mapping;
//Here we create mapping from expressions to the destination entity "locale". No data in the source data store is required
mapping = dsMapping.map("locale");
//The value of the "locale" variable will be associated with the generated ID in the "locale" table.
mapping.identity().associate("{$locale}").toGeneratedId("id");
//The values of variables "locale" and "localeDescription" will be copied to "acronym" and "description" columns
mapping.value().copy("{$locale}", "{$localeDescription}").to("acronym", "description");
//This mapping just creates generated ID's in "food_group". Each generated ID will be associated with a value of "FdGrp_CD"
mapping = dsMapping.map("FD_GROUP", "food_group");
mapping.identity().associate("FdGrp_CD").toGeneratedId("id");
//Next, we map the values in FD_GROUP to the locale-specific "food_group_details"
mapping = dsMapping.map("FD_GROUP", "food_group_details");
//Using the code in "FdGrp_CD" and the current value of the "locale" variable, we define an identifier association to "id" and "loc", respectively.
//The values in "id" and "loc" are references to the identifiers mapped above.
mapping.identity().associate("FdGrp_CD", "{$locale}").to("id", "loc");
//We use the "FdGrp_CD" to obtain the identifiers generated in a mapping from "FD_GROUP" to "food_group".
//The identifier written to "food_group" will be copied to "id".
mapping.reference().using("FdGrp_CD").referTo("FD_GROUP", "food_group").on("id");
//Here we use the current value in the "locale" variable to restore the identifier generated in the mapping to "locale".
//This identifier will be copied to "loc"
mapping.reference().using("{$locale}").referTo(null, "locale").on("loc");
//Just copy the food group description
mapping.value().copy("FdGrp_Desc").to("description");
//Again, we just associate values of "FOOD_DES" to a generated ID in the destination "food_name".
mapping = dsMapping.map("FOOD_DES", "food_name");
mapping.identity().associate("NDB_No").toGeneratedId("id");
//Similarly to the mapping from "FD_GROUP" to "food_group_details", we just map identity fields and references
mapping = dsMapping.map("FOOD_DES", "food_name_details");
mapping.identity().associate("NDB_No", "{$locale}").to("id", "loc");
mapping.reference().using("NDB_No").referTo("FOOD_DES", "food_name").on("id");
mapping.reference().using("{$locale}").referTo(null, "locale").on("loc");
mapping.value().copy("Long_Desc").to("description");
//Here the basic food composition data in "FOOD_DES" is mapped to the "food" entity.
mapping = dsMapping.map("FOOD_DES", "food");
mapping.identity().associate("NDB_No").toGeneratedId("id");
// "food.name_id" is a reference to "food_name". We use "NDB_No" to obtain the ID generated in the mapping from "FOOD_DES" to "food_name"
mapping.reference().using("NDB_No").referTo("FOOD_DES", "food_name").on("name_id");
// let's finally copy the food composition data
mapping.value().copy("CHO_Factor", "Fat_Factor", "Pro_Factor", "N_Factor").to("carbohydrate_factor", "fat_factor", "protein_factor", "nitrogen_protein_factor");
//Lastly, we associate each "food" to a "food_group". Once again, this is easily achieved with the use of reference mappings.
mapping = dsMapping.map("FOOD_DES", "group_of_food");
mapping.identity().associate("NDB_No", "FdGrp_CD").to("food_id", "group_id");
mapping.reference().using("NDB_No").referTo("FOOD_DES", "food").on("food_id");
mapping.reference().using("FdGrp_CD").referTo("FD_GROUP", "food_group").on("group_id");
//In this cycle, the current locale is "en_US".
engine.executeCycle();
//Let's now use another locale. All tables with locale-specific data will have new descriptions associated with this new locale.
engine.setVariable("locale", "en_GB");
engine.setVariable("localeDescription", "British English");
engine.executeCycle();
//Finally, let's print all tables and see how the data looks like:
printAndValidate(printTables());
```
This example is not very different from what we've seen, but some novelties can be highlighted:
**Locale**
* An entity mapping from "nowhere" to the destination entity *locale*: `mapping = dsMapping.map("locale")` maps values from expressions/variables
in the engine context only.
* `identity().associate("{$locale}").toGeneratedId("id")` will associate the current value in the *locale* variable to a generated identifier of the *locale table*
**Mapping to *food_group* and *food_group_details***
* In `map("FD_GROUP", "food_group")`, there is only one mapping: `identity().associate("FdGrp_CD").toGeneratedId("id")`. When a new value for *FdGrp_CD* appears on
the input entity, uniVocity will insert an empty row into *food_group*. This will produce a generated identifier. This identifier is associated with the value in
*FdGrp_Cd* using uniVocity's metadata.
* Next, on `map("FD_GROUP", "food_group_details")`, the destination entity *food_group_details* is a table with a composite primary key: *id* is a foreign key that
references *food_group*, and *loc* is another foreign key that references to *locale*. To map this properly, we need to configure 3 field mappings:
* `identity().associate("FdGrp_CD", "{$locale}").to("id", "loc")`: this will link the values of *FdGrp_CD* to *food_group_details.id*, and the current value
of variable *locale* to *food_group_details.loc*. The values won't be written to the destination, the reference mappings will obtain the correct references.
* `reference().using("FdGrp_CD").referTo("FD_GROUP", "food_group").on("id")`: using the values in *FdGrp_CD*, uniVocity will have to query its metadata to discover
which IDs were generated in the mapping from *FD_GROUP* to *food_group*. It will obtain the correct generated ID's in *food_group* and copy them
to *food_group_details.id*.
* `reference().using("{$locale}").referTo(null, "locale").on("loc")`: using the current value of variable *locale*, uniVocity will once again query its metadata to discover
which ID was generated in the mapping to the *locale* table. As the mapping to *locale* does contain a source entity, we use *null* as the source entity.
This mapping will obtain the correct generated ID in the *locale* table and copy it to *food_group_details.loc*.
The remainder of this code is very similar and we hope it is easier to understand at this point. Here is how the migrated data should look like in the destination database:
```
===[ locale ]===
id___acronym___description________________________________________
0____en_US_____American English___________________________________
1____en_GB_____British English____________________________________
===[ food_group ]===
id___
0____
1____
2____
3____
4____
===[ food_group_details ]===
id___loc___description__________________________________________________________
0____0_____Dairy and Egg Products_______________________________________________
0____1_____Dairy and Egg Products_______________________________________________
1____0_____Spices and Herbs_____________________________________________________
1____1_____Spices and Herbs_____________________________________________________
2____0_____Baby Foods___________________________________________________________
2____1_____Baby Foods___________________________________________________________
3____0_____Fats and Oils________________________________________________________
3____1_____Fats and Oils________________________________________________________
4____0_____Poultry Products_____________________________________________________
4____1_____Poultry Products_____________________________________________________
===[ food_name ]===
id___
0____
1____
2____
3____
4____
5____
6____
7____
8____
9____
===[ food_name_details ]===
id___loc___description__________________________________
0____0_____Butter, salted_______________________________
0____1_____Butter, salted_______________________________
1____0_____Butter, whipped, with salt___________________
1____1_____Butter, whipped, with salt___________________
2____0_____Cheese, blue_________________________________
2____1_____Cheese, blue_________________________________
3____0_____Cheese, brick________________________________
3____1_____Cheese, brick________________________________
4____0_____Cheese, camembert____________________________
4____1_____Cheese, camembert____________________________
5____0_____Cheese, caraway______________________________
5____1_____Cheese, caraway______________________________
6____0_____Cheese, cheddar______________________________
6____1_____Cheese, cheddar______________________________
7____0_____Cheese, cheshire_____________________________
7____1_____Cheese, cheshire_____________________________
8____0_____Milk shakes, thick chocolate_________________
8____1_____Milk shakes, thick chocolate_________________
9____0_____Milk shakes, thick vanilla___________________
9____1_____Milk shakes, thick vanilla___________________
===[ food ]===
id___name_id___carbohydrate_factor___fat_factor___protein_factor___nitrogen_protein_factor___
0____0_________8.87__________________1.89_________4.27_____________2.38______________________
1____1_________1.87__________________2.79_________3.27_____________8.38______________________
2____2_________3.87__________________4.59_________9.27_____________4.38______________________
3____3_________8.87__________________6.79_________5.27_____________4.38______________________
4____4_________3.87__________________2.79_________6.27_____________5.38______________________
5____5_________3.89__________________8.79_________6.27_____________6.36______________________
6____6_________3.12__________________8.13_________4.22_____________6.33______________________
7____7_________3.13__________________0.79_________4.67_____________4.78______________________
8____8_________1.87__________________3.79_________0.27_____________8.38______________________
9____9_________3.87__________________3.69_________8.97_____________3.98______________________
===[ group_of_food ]===
group_id___food_id___
0__________0_________
0__________2_________
0__________3_________
0__________4_________
0__________5_________
0__________6_________
0__________7_________
2__________8_________
2__________9_________
3__________1_________
```
## Input management ##
### Input sharing ###
In the previous example, there are 4 mappings reading from the source entity *FOOD_DES*. Does it mean uniVocity will read from it 4 times?
The answer is: no, *unless you want it to*. uniVocity detects there are 4 *consecutive* mappings reading from the same source. By default, it will create a single reading
process to extract the values required by all 4 mappings, in one go, then reuse the data in each mapping. This may have implications when the number of rows in memory
is limited to be smaller than the input: if the input has 100 rows but 10 rows are to be kept in memory, uniVocity will load 10 rows, then execute each one of the 4 mappings;
Then, the next 10 rows will be read and used to execute each mapping again. This will proceed until all rows from the input are read.
If a row is discarded from the input in one mapping, it will still be available for the other mappings. Data modifications on the row won't be visible on other mappings.
If you need to process all rows before executing the subsequent mappings, disable input sharing with: `entityMapping.setInputSharingEnabled(false)`.
Additionally, the input will only be shared when mappings using data from the same source entity are executed one after the other. If mapping 1 reads from FOOD_DES,
mapping 2 from FD_GROUP, and mapping 3 from FOOD_DES again, then all rows of FOOD_DES will be read twice: first execute mapping 1, then again to execute mapping 3.
### Data increments ###
You probably don't want to execute a data mapping cycle using ALL records of your source entities every time a few changes were made. With uniVocity, you can
provide *data increments* to execute your mappings using limited datasets for each source entity.
This example demonstrates how you can create a data increment and use it to execute your data mappings:
```java
//Data increments can be used to update certain records without re-reading the entire source entity.
//To define a data increment, you must first create a dataset with the updated data.
List