XML Reader

The XML Read transform uses XPath expressions to obtain record and field values from XML files. Therefore it is necessary to be familiar with XPath and XML Document structure before setting up an XML Read transform.

Two good starting places for learning about this are:

Microsoft MSDN - http://msdn.microsoft.com/en-us/library/ms256086.aspx
W3Schools - http://www.w3schools.com/xpath/

Reader > File Layout

Used to specify the file path and name.

Transform Id

The unique user-defined name for the transform.

Data Source

The controller type is defined here. This is the source from which the XML data will be read, which can be Email, File or Http.
See Input/Output Controllers (IO Controller) for more information.

Controller Options

The expandable options section beneath the Data Source drop-down. These options change according to the data source selected.

See Input/Output Controllers (IO Controller) for more information.

Namespace

XML namespaces are used to uniquely identify identically named elements and attributes in an XML document.

Namespaces used within the source document(s) are maintained using the Add, Edit, Delete controls.

Namespaces follow the format:

xmlns:<prefix>="<uri>"
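To check which declarations a source document actually contains before adding them to the list, the file can be inspected outside IMan. The following is a minimal sketch using Python's standard xml.etree.ElementTree module, not an IMan feature; the sample document and the ord prefix with its URI are assumptions made up for illustration.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical source document declaring two prefixed namespaces.
DOC = b"""<doc xmlns:xhtml="http://www.w3.org/1999/xhtml"
     xmlns:ord="urn:example:orders">
  <xhtml:p>text</xhtml:p>
  <ord:order/>
</doc>"""

# "start-ns" events yield a (prefix, uri) pair for each namespace declaration.
declared = [ns for _event, ns in ET.iterparse(io.BytesIO(DOC), events=("start-ns",))]

# Print each declaration in the xmlns:<prefix>="<uri>" format.
for prefix, uri in declared:
    print(f'xmlns:{prefix}="{uri}"')
```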

Example

Or to declare a namespace with no prefix use:

XPath Statements

When declaring XPath statements, as opposed to namespaces, it is important to include the prefix on every element in the path. Otherwise the path will not match any nodes and an empty dataset is returned.

Example

In this case, the prefix declaration 'xhtml:' must be included.

xhtml:path1/xhtml:subpath2/xhtml:subpath3
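The effect of omitting the prefix can be reproduced outside IMan. The sketch below uses Python's standard xml.etree.ElementTree module rather than IMan itself; the sample document and the mapping of the xhtml prefix to the XHTML namespace URI are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: every element lives in the XHTML namespace.
DOC = """<xhtml:path1 xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <xhtml:subpath2>
    <xhtml:subpath3>value</xhtml:subpath3>
  </xhtml:subpath2>
</xhtml:path1>"""

# Prefix-to-URI map, playing the role of the Namespace list.
NAMESPACES = {"xhtml": "http://www.w3.org/1999/xhtml"}

root = ET.fromstring(DOC)  # root is the path1 element

# Prefixed path (relative to path1): matches the namespaced element.
with_prefix = root.findall("xhtml:subpath2/xhtml:subpath3", NAMESPACES)

# Same path without prefixes: searches for un-namespaced elements, finds none.
without_prefix = root.findall("subpath2/subpath3", NAMESPACES)

print(len(with_prefix), len(without_prefix))  # 1 0
```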

Building the XML Read Transform

XML is by nature a nested data format. This means that the data is intrinsically structured, with parent-child relationships already within it. As a result, it is necessary to specify the parents and any children through the Transaction XPath in this setup. Otherwise, the IMan dataset could be imported with far more, or far fewer, relationships than required.

The example below is an output data file from a purchasing system containing order details:

As seen here, the data is highly structured. The top parent is sales_order, while its child is order_details, yet both are on the same level in the document's hierarchy. This is very often the case, and makes it necessary to define each and every field with a transaction XPath into the document.

This setup is absolutely critical to importing this data correctly.

Reader > Field Mapping

XML Entry Point

The first step is to define the Entry Point XPath expression. The Entry Point is where the topmost parent transaction type begins. In the example here, this is fb_sales. This is the starting position from where the transactions begin to recur.

New Transaction Id

To define a transaction type, enter its name into this field and press the Add (>) button.

Add Button

The Add button creates a new transaction type of the name provided in the ‘New Transaction Id’ field.

Transaction Type Drop Down

Each transaction type that has been created is listed in this drop down.

Selecting a type from here saves any changes to the currently open transaction type and opens the newly selected transaction type for editing.

Parent Id

Select the parent node of the currently open transaction type.

Edit Button

The Edit Button displays the transaction type setup page.

Remove Button

Deletes the transaction type from the hierarchy.

Reader > Field Mapping > Edit

XPath

The XPath of the transaction type.

This is the query that is sent to the XML file to retrieve the relevant data to fill this transaction type's fields.

The XPath for a parent transaction will append to the Entry Point XPath, e.g.:

fb_sales/sales_order

A child's XPath will append to its parent's XPath, e.g.:

sales_order/order_details
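How the Entry Point, parent and child XPaths chain together can be sketched outside IMan. The example below uses Python's standard xml.etree.ElementTree module and is an illustration only, not IMan's implementation. The sample file, with order_details nested inside sales_order, and the order_no and item field names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical source file; element names follow the XPaths in the text.
DOC = """<fb_sales>
  <sales_order>
    <order_no>SO1</order_no>
    <order_details><item>A</item></order_details>
    <order_details><item>B</item></order_details>
  </sales_order>
  <sales_order>
    <order_no>SO2</order_no>
    <order_details><item>C</item></order_details>
  </sales_order>
</fb_sales>"""

entry_point = ET.fromstring(DOC)  # the fb_sales element (Entry Point)

rows = []
# Parent transaction: "sales_order", evaluated from the Entry Point.
for order in entry_point.findall("sales_order"):
    # Child transaction: "order_details", evaluated from its parent record.
    for detail in order.findall("order_details"):
        rows.append((order.findtext("order_no"), detail.findtext("item")))

print(rows)  # [('SO1', 'A'), ('SO1', 'B'), ('SO2', 'C')]
```

Because each child query starts from its own parent record, every detail line is attached to exactly one order.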

Reader > Field Mapping > Edit > Edit Record

When the Edit button is pressed again:

Field Name

The name to identify the field.

Relative

Indicates whether the field should use a relative or an absolute path.

If the field is marked as relative, its XPath expression is evaluated relative to the XPath of the transaction type.

If it is not relative, an absolute path is used. This allows nodes which don’t fit into the child-parent node relationship to be pulled into the dataset.

For example, a Transfer Date field could use an absolute path to read the value from the header section of the document.
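The difference between the two modes can be sketched with Python's standard xml.etree.ElementTree module. This is an illustration under assumptions, not IMan's implementation: the sample document, the header/transfer_date element and the read_field helper are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with a header section outside the order records.
DOC = """<fb_sales>
  <header><transfer_date>2020-01-31</transfer_date></header>
  <sales_order><order_no>SO1</order_no></sales_order>
  <sales_order><order_no>SO2</order_no></sales_order>
</fb_sales>"""

root = ET.fromstring(DOC)


def read_field(record, xpath, relative):
    """Relative fields start at the current record; others start at the document root."""
    start = record if relative else root
    return start.findtext(xpath)


rows = [
    {
        # Relative: resolved inside the current sales_order record.
        "order_no": read_field(order, "order_no", relative=True),
        # Absolute: ElementTree does not accept a leading "/" on an element, so
        # /fb_sales/header/transfer_date is written from the root element.
        "transfer_date": read_field(order, "header/transfer_date", relative=False),
    }
    for order in root.findall("sales_order")
]

print(rows)
```

Each record receives its own order_no, while the same header value is repeated on every record.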

XPath

The XPath of the field.

This is the query that is sent to the XML file to retrieve the relevant data to fill this transaction type's field.

The XPath for a parent transaction will append to the Entry Point XPath, e.g.:

fb_sales/sales_order

A child's XPath will append to its parent's XPath, e.g.:

sales_order/order_details

Type

The data type of the field.

Delete Button

Deletes the field from the transaction type.

SYS.INPUTFILE

Create a field called 'SYS.INPUTFILE', with any XPath, to capture the filename. See Streamline processing for more information.