XML Reader
The XML Read transform uses XPath expressions to obtain record and field values from XML files. Familiarity with XPath and XML document structure is therefore required before setting up an XML Read transform.
Two good starting places for learning about this are:
- Microsoft MSDN - http://msdn.microsoft.com/en-us/library/ms256086.aspx
- W3Schools - http://www.w3schools.com/xpath/
Reader > File Layout
Used to specify the file path and name.
Transform Id
The unique user-defined name for the transform.
Data Source
The controller type is defined here. This is the source from which the XML data will be read, which can be Email, File or Http.
See Input/Output Controllers (IO Controller) for more information.
Controller Options
The expandable options section beneath the Data Source drop-down. These options change according to the data source selected.
See Input/Output Controllers (IO Controller) for more information.
Namespace
XML namespaces are used to uniquely identify identically named elements and attributes in an XML document.
Namespaces used within the source document(s) are maintained using the Add, Edit, Delete controls.
Namespaces follow the format:
xmlns:<prefix>="<uri>"
Example
Or to declare a namespace with no prefix use:
xmlns="<uri>"
XPath Statements
Example
In this case, the prefix declaration 'xhtml:' must be included.
xhtml:path1/xhtml:subpath2/xhtml:subpath3
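The effect of the prefix can be illustrated outside IMan with a minimal Python sketch using the standard library's xml.etree.ElementTree. The sample document and the mapping of the 'xhtml' prefix to the XHTML namespace URI are assumptions made for illustration only; this shows general XPath behaviour, not how IMan evaluates expressions internally.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: every element sits in a default (unprefixed) namespace.
XML = """<root xmlns="http://www.w3.org/1999/xhtml">
  <path1><subpath2><subpath3>value</subpath3></subpath2></path1>
</root>"""

# Prefix-to-URI mapping, the equivalent of xmlns:xhtml="<uri>".
NAMESPACES = {"xhtml": "http://www.w3.org/1999/xhtml"}

root = ET.fromstring(XML)

# Without prefixes the path looks for elements in no namespace and finds nothing.
unprefixed = root.find("path1/subpath2/subpath3")

# With the prefix on every step, the path resolves to the namespaced elements.
prefixed = root.find("xhtml:path1/xhtml:subpath2/xhtml:subpath3", NAMESPACES)

print(unprefixed)
print(prefixed.text)
```

Note that the prefix has to appear on each step of the path, not only on the first one.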
Building the XML Read Transform
XML is by nature a nested data format: the data is intrinsically structured, with parent-child relationships already built in. As a result, it is necessary to specify the parents and any children through the Transaction XPath in this setup. Otherwise, the IMan dataset could be imported with far more, or far fewer, relationships than required.
The example below is an output data file from a purchasing system containing order details:
As seen here, the data is highly structured. The top parent is sales_order and its child is order_details, yet both sit on the same level in the document's hierarchy. This is very often the case, and it makes it necessary to define every field with a transaction XPath into the document.
Reader > Field Mapping
XML Entry Point
The first step is to define the Entry Point XPath expression. The Entry Point is where the topmost parent transaction type begins. In the example here, this is fb_sales. This is the position from which the transactions begin to recur.
New Transaction Id
To define a transaction type, enter its name into this field and press the Add (>) button.
Add Button
The Add button creates a new transaction type of the name provided in the ‘New Transaction Id’ field.
Transaction Type Drop Down
Each transaction type that has been created is listed in this drop down.
Selecting a type from here saves any changes to the currently open transaction type and opens the newly selected transaction type for editing.
Parent Id
Select the parent node of the currently open transaction type.
Edit Button
The Edit Button displays the transaction type setup page.
Remove Button
Deletes the transaction type from the hierarchy.
Reader > Field Mapping > Edit
XPath
The XPath of the transaction type.
This is the query that is sent to the XML file to retrieve the relevant data to fill this transaction type's fields.
The XPath for a parent transaction will append to the Entry Point XPath, e.g.:
fb_sales/sales_order
A child's XPath will append to its parent's XPath, e.g.:
sales_order/order_details
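The way these XPaths chain together can be sketched in Python with the standard library's xml.etree.ElementTree. Only the element names fb_sales, sales_order and order_details come from this page; the sample document, the order_no and item fields, and the read_transactions helper are hypothetical. This is a sketch of the general idea, not IMan's implementation.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; only fb_sales, sales_order and order_details
# are names taken from the documentation, the rest is invented.
XML = """<fb_sales>
  <sales_order>
    <order_no>SO-1</order_no>
    <order_details><item>A</item></order_details>
    <order_details><item>B</item></order_details>
  </sales_order>
  <sales_order>
    <order_no>SO-2</order_no>
    <order_details><item>C</item></order_details>
  </sales_order>
</fb_sales>"""


def read_transactions(xml_text):
    """Return one dict per sales_order, each holding its own order_details."""
    root = ET.fromstring(xml_text)  # the root element is the entry point, fb_sales
    orders = []
    # Parent XPath: evaluated from the entry point (fb_sales/sales_order).
    for order in root.findall("sales_order"):
        # Child XPath: evaluated from each parent node (sales_order/order_details),
        # so every detail row stays attached to the order it belongs to.
        details = [d.findtext("item") for d in order.findall("order_details")]
        orders.append({"order_no": order.findtext("order_no"), "details": details})
    return orders


orders = read_transactions(XML)
for order in orders:
    print(order["order_no"], order["details"])
```

Because each child query starts from its parent node rather than from the top of the document, the detail rows are not multiplied across unrelated orders.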
Reader > Field Mapping > Edit > Edit Record
When the Edit button is pressed again:
Field Name
The name to identify the field.
Relative
Indicates whether the field should use a relative or an absolute path.
If the field is marked as relative, the field's XPath expression is evaluated relative to the node of its transaction type.
If it is not relative, an absolute path is used. This allows nodes which don’t fit into the child-parent node relationship to be pulled into the dataset.
For example, a Transfer Date field could use an absolute path to read the value from the header section of the document.
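The difference between the two modes can be sketched in Python with xml.etree.ElementTree. The header and transfer_date elements, the order_no field, the date value and the read_field helper are all hypothetical names chosen for illustration; the sketch assumes that a relative path is resolved from the current record's node and an absolute path from the document's top-level element.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with a header section outside the repeating records.
XML = """<fb_sales>
  <header><transfer_date>2024-01-31</transfer_date></header>
  <sales_order><order_no>SO-1</order_no></sales_order>
  <sales_order><order_no>SO-2</order_no></sales_order>
</fb_sales>"""


def read_field(root, record, xpath, relative):
    """Resolve xpath from the record node (relative) or from the document root."""
    context = record if relative else root
    return context.findtext(xpath)


root = ET.fromstring(XML)
order_rows = [
    {
        # Relative: looked up inside the current sales_order node.
        "order_no": read_field(root, order, "order_no", relative=True),
        # Not relative: looked up from the top of the document. ElementTree does
        # not accept a leading "/" on an element, so the path starts at the root.
        "transfer_date": read_field(root, order, "header/transfer_date", relative=False),
    }
    for order in root.findall("sales_order")
]

for row in order_rows:
    print(row)
```

Every record receives the same header value, which is what allows a document-level field to be carried onto each row of the dataset.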
XPath
The XPath of the field.
This is the query that is sent to the XML file to retrieve the value for this field.
Type
The data type of the field.
Delete Button
Deletes the field from the transaction type.
SYS.INPUTFILE
Create a field called 'SYS.INPUTFILE', with any XPath, to capture the filename. See Streamline processing for more information.
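As a rough illustration of the resulting dataset, the Python sketch below stamps each record with the name of the file it was read from. The file names, the in-memory FILES mapping, the order_no field and the read_with_filename helper are hypothetical, and how IMan populates the field internally is not described here; the sketch only assumes that the field ends up holding the input filename regardless of its XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical input files, held in memory to keep the sketch self-contained.
FILES = {
    "orders_a.xml": "<fb_sales><sales_order><order_no>SO-1</order_no></sales_order></fb_sales>",
    "orders_b.xml": "<fb_sales><sales_order><order_no>SO-2</order_no></sales_order></fb_sales>",
}


def read_with_filename(files):
    """Read every sales_order and record which input file it came from."""
    rows = []
    for filename, xml_text in files.items():
        root = ET.fromstring(xml_text)
        for order in root.findall("sales_order"):
            rows.append(
                {
                    "order_no": order.findtext("order_no"),
                    # Assumption: the field carries the filename, not an XPath result.
                    "SYS.INPUTFILE": filename,
                }
            )
    return rows


file_rows = read_with_filename(FILES)
for row in file_rows:
    print(row)
```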