The Seven Hidden Challenges of Application Integration
By Peter R. Chase

About the Author
Peter R. Chase is Executive Vice President and founder of Scribe
Software Corporation, a developer of rapid application integration
solutions for application software vendors and their system
integrators, value-added resellers and end-user customers. With
thousands of customers, Scribe is a leading provider in the market
for application integration solutions. In his capacity at Scribe,
Mr. Chase has advised numerous enterprise application vendors as
they mapped out their integration strategies. He has also worked
with many of Scribe’s customers to ensure a successful rollout of
their application integration solutions.

The market for packaged enterprise applications has exploded in
recent years. Companies are deploying “Acronym Applications” (ERP,
CRM, ERM, SCM, etc.) at an unprecedented rate. As varied as these
applications may be, they each have one major thing in common. It is
imperative that they work together. When deploying an enterprise
business application, a critical question is how rapidly and
effectively the application can be integrated into the company’s
overall computing environment.

How do we enable the rapid integration of an application with other
applications and data stores? How do we develop a framework for
simplifying the tasks of integrating data with these applications?
Before an integration framework can be developed, we need to answer
the following questions:

1. What are the complexities introduced by these business
   applications? and
2. What can be done to simplify these complexities when users [1]
   develop integration processes?

By understanding the first question and providing answers to the
second, we can dramatically reduce the time and effort it takes to
perform and maintain our integration processes without compromising
the quality and integrity of the data. Think of this paper as a
requirements document for some key elements of application
integration. It outlines those requirements that are driven by seven
hidden challenges of integrating data between complex business
applications.

The challenges are referred to as hidden because they are tightly
intertwined with the application’s architecture and capabilities and
are easily and often overlooked. They are complex problems that are
not readily apparent during the planning phase of a project. Planning
for these challenges early is key to keeping your integration project
on track.

[1] For the purposes of this paper, users are defined as the
individuals responsible for designing and maintaining integration
processes.

Challenge #1: An application has varied and multiple integration
points.

What integration points exist?

Aside from the user interface, applications have a number of points
from which data can be introduced. Typical points of integration
include:

i) the application’s database (either through the application’s base
   tables, interface tables, or stored procedures),
ii) flat file (ASCII) import and export routines,
iii) an API (application programming interface),
iv) and document style interfaces (which can be based on a
   proprietary format, such as SAP’s IDOCs, or a more standardized
   approach, such as XML).

Many application vendors have developed proprietary code intended to
address some of the complexities of data integration. This code may
exist within the application itself or in stored procedure code
within the database. As we will see in this paper, this vendor code
does not eliminate the complexities inherent in many integration
processes. Figure 1 below illustrates these varied integration
points.

Figure 1: [Diagram of integration points. Labels: User Interface;
Application Logic; Vendor Code; Vendor Code (Stored Procedures);
APIs; ASCII Import; Documents; Interface Tables; Direct to DB; To
Interface Tables; To Stored Procedures; Database]

Which point to use, when?

The first step in determining which integration point to use is to
identify what the vendor currently supports. If you surveyed vendors
of the leading “best of breed” business applications you would be
hard pressed to find two that support the same set of integration
points.

The second step in determining which integration point to use is to
identify the requirements of the integration task being performed.
This is a world of trade-offs, the trade-off between performance and
control being the most common. At one end of the spectrum of this
trade-off is a high volume data load that requires minimal control
over data integrity and validation. In this case, writing directly to
a database table, perhaps through a bulk loading facility, and
avoiding the overhead of validation layers is typically most
efficient.
On the other end of the spectrum, performing a real-time integration
of transactions between applications demands a high level of
validation and business rules processing. In this case it is not
advisable to undercut these other layers.

When designing an integration approach make sure to consider the
broad set of integration points that users will encounter across the
applications in the enterprise. Provide the flexibility to operate at
many points along the performance-control spectrum.

Challenge #2: An application has many complex data relationships that
need to be established and maintained.

Today’s complex business applications are built over highly
normalized data structures.
Database schemas have hundreds (or thousands) of tables that are
replete with defined key relationships between records and fields.
Maintaining referential integrity (i.e. enforcing the defined primary
key/foreign key relationships) between records and related tables is
essential to avoid the creation of orphaned records. The application
may also link certain records to domain tables of possible values for
a given field or fields. An example of a domain table would be a
table that contains the possible values for state that is referred to
by an address table.

What role does sequence play?

Because of these dependencies, the order in which you process data is
important. For example, an application could have a table for a
company that has a parent table of possible addresses. In order to
insert the company record, a user would need to insert the address
record first and then insert the company record (using the primary
key from the address table as the foreign key to the address table in
the company insert). A unique primary key for each table needs to be
generated for any new record inserts. Providing a simple mechanism to
automatically generate these keys is critical. The address insert may
also require that the state field be populated by a valid state value
found in the state domain table.
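
The address-then-company sequence described above can be sketched in
a few lines. This is a minimal illustration under stated assumptions,
not any vendor's method: it assumes a SQLite database and
hypothetical `state`, `address`, and `company` tables, and the
function name `insert_company` is invented for the example. It checks
the state against the domain table, inserts the parent address first,
and reuses the generated primary key as the foreign key on the
company insert.

```python
import sqlite3


def insert_company(conn, name, street, state):
    """Insert an address, then a company that references it."""
    cur = conn.cursor()
    # Enforce the domain-table relationship before writing anything.
    known = cur.execute(
        "SELECT 1 FROM state WHERE code = ?", (state,)
    ).fetchone()
    if known is None:
        raise ValueError(f"unknown state code: {state!r}")
    with conn:  # commit both inserts together, or roll both back
        cur.execute(
            "INSERT INTO address (street, state) VALUES (?, ?)",
            (street, state),
        )
        address_id = cur.lastrowid  # generated primary key of the parent
        cur.execute(
            "INSERT INTO company (name, address_id) VALUES (?, ?)",
            (name, address_id),
        )
        return cur.lastrowid


# Hypothetical schema, used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE state (code TEXT PRIMARY KEY);
CREATE TABLE address (
    id INTEGER PRIMARY KEY,
    street TEXT,
    state TEXT NOT NULL REFERENCES state(code)
);
CREATE TABLE company (
    id INTEGER PRIMARY KEY,
    name TEXT,
    address_id INTEGER NOT NULL REFERENCES address(id)
);
INSERT INTO state VALUES ('MA'), ('NH');
""")

company_id = insert_company(conn, "Acme Corp", "1 Main St", "MA")
```

The caller supplies one logical record; the ordering of the inserts,
the key hand-off, and the domain check stay inside the function.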

This simple example serves to illustrate the importance of multi-step
processing and sequence. The relationships and dependencies in
enterprise applications are dramatically more complex. It is typical
to have multiple child tables associated with one parent table along
with multiple levels of parent-child relationships. Many applications
even have “link tables” that serve as a parent for two related child
tables to create many-to-many relationships between tables [2].
Attempting to insert a record that spans a number of target tables
requires sophisticated processing logic.

Design an approach that automatically manages the complexities of the
critical keys, relationships, and dependencies of the tables within
the application. This includes setting primary and foreign keys,
setting the proper order of processing, and enforcing the domain
value relationships.

[2] This could allow a company to have many address records
associated with it, some or all of which could also be associated
with other company records.

Challenge #3: An integration process has workflow requirements.

What occurrences affect workflow?

When integrating information into an application, it is important to
support workflow requirements. Typically, three types of occurrences
impact the flow of processing for a given transaction or record:

1. the existence of a record in the target,
2. the value of certain fields within the source record, and
3. error conditions that occur within the process.

We will discuss the first two scenarios now and the third later.

The processing of a transaction or record can be affected by the
existence of a record in the target. Take the following example:

When loading a lead record into a sales system, update the lead if it
is already in the target database. If it is not in the target
database, insert the lead and also call another process within the
application that assigns the account to a sales rep and a territory
based on some algorithm.

Our first step would be to perform a “seek” operation against the
target account table based on a key field or fields, e.g. the first
five characters of the company name and the last five digits of the
zip code.
If a match were found, an update would be performed on that account
record. If a match were not found, the record would need to be
inserted and a process to assign the rep and territory would need to
be invoked (perhaps by calling an API from the target application).

The processing of a transaction or record can also be affected by the
value of certain fields in the source record. Using the same example,
the lead data may be coming from an outside telemarketing firm and
each lead may be designated as “hot” or “not hot” based upon the
results of the telemarketer’s qualification. If the lead is hot, it
is necessary to assign the record to a sales rep following the same
process outlined above.

If the lead is “not hot”, the sales rep and territory should not be
assigned, but instead a fulfillment order should be created and a
follow-up by a telesales representative should be scheduled. In this
case, the rep assignment algorithm would not be invoked, but new
records would be inserted in the “fulfillment” table (to request the
distribution of some information to the prospect) and the “task”
table (to establish a tickler for the follow-up call).
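
The lead workflow above can be sketched as a short routine. This is
one possible reading of the text, not a definitive implementation:
the in-memory dictionaries and lists stand in for the target tables,
and the names `match_key`, `process_lead`, and `assign_rep` are
hypothetical. The text does not say what happens to a “not hot” lead
that already exists; the sketch assumes it is updated and still gets
a fulfillment record and a task.

```python
def match_key(company, zip_code):
    """Seek key: first five characters of the company name plus the
    last five digits of the zip code."""
    digits = "".join(ch for ch in zip_code if ch.isdigit())
    return company.strip().upper()[:5] + digits[-5:]


def process_lead(lead, accounts, fulfillment, tasks, assign_rep):
    """Route one source lead and return the steps that were taken."""
    steps = []
    key = match_key(lead["company"], lead["zip"])
    existing = accounts.get(key)
    if existing is not None:
        # Record found by the seek: update it in place.
        existing.update(lead)
        steps.append("update")
    else:
        # Record not found: insert it.
        accounts[key] = dict(lead)
        steps.append("insert")
        if lead["status"] == "hot":
            # Only new, hot leads invoke the rep assignment process.
            rep, territory = assign_rep(lead)
            accounts[key].update(rep=rep, territory=territory)
            steps.append("assign")
    if lead["status"] != "hot":
        # "Not hot" leads get a fulfillment order and a follow-up task.
        fulfillment.append({"account": key, "item": "information pack"})
        tasks.append({"account": key, "action": "follow-up call"})
        steps.extend(["fulfillment", "task"])
    return steps


def assign_rep(lead):
    """Placeholder for the application's own assignment algorithm."""
    return ("rep-1", "northeast")


accounts, fulfillment, tasks = {}, [], []
hot = {"company": "Acme Corporation", "zip": "03101", "status": "hot"}
cold = {"company": "Globex Inc", "zip": "10001", "status": "not hot"}

first = process_lead(hot, accounts, fulfillment, tasks, assign_rep)
second = process_lead(hot, accounts, fulfillment, tasks, assign_rep)
third = process_lead(cold, accounts, fulfillment, tasks, assign_rep)
```

Each branch corresponds to one decision point in the workflow, which
is the kind of structure a graphical design environment would expose.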

Figure 3 represents this entire lead-processing workflow:

Figure 3: [diagram not reproduced]

Provide a graphical design environment that allows users to quickly
and easily establish workflow for the processing of target records
based on the existence of records in the target and the value of
certain fields in the source.

Challenge #4: An application relies on the validation and
standardization of input data.

An enterprise application needs to maintain control over its data in
order to deliver meaningful solutions to its users. Accounting
applications represent a vivid example. It would be difficult to
produce meaningful financial statements if the accounting system
allowed users to enter journal entries where the debits did not equal
the credits, or to complete the entry of an invoice where the detail
of items purchased does not equal the total of the invoice.
It would also be difficult to provide an aging of accounts receivable
if invoices were loaded into the application without valid date
values.

For these reasons, the user interface of the application enforces
these validation and standardization rules. Unfortunately, the user
interface is not a practical or efficient means to integrate data
from other applications. An alternative mechanism is required to
enforce validation and standardization rules.

What is needed to address validation issues?

Many application vendors provide code designed to perform these
validations and enforce standardization rules when bringing in data
from other applications. The challenge is that in most cases, this
code doesn’t go far enough. It is typically designed to reject
records that do not conform to the application’s requirements. It
keeps you out of trouble but is not very helpful. It does not provide
a simple way for users to calculate (or locate) appropriate values as
records are being processed. It does not “fix” the data.
These tasks are left to the user and are very burdensome.

Take the invoice scenario outlined above. The invoice requires that
we have valid dates for the “invoice date” and “due date” fields.
Without valid dates in these fields, the record will be rejected. We
could add additional validation processing that would provide default
values (e.g. “today’s date” for invoice date and “invoice date plus
30 days” for due date) only in the absence of valid dates. Figure 4
illustrates an invoice record being inserted into the invoice header
API of a target application utilizing the process described above.
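
The defaulting rule just described can be written as a small
pre-processing step that runs before the record reaches the target
API. The sketch below is illustrative only: the field names
`invoice_date` and `due_date`, the ISO `YYYY-MM-DD` input format, and
the function names are assumptions, not part of any particular
application.

```python
from datetime import date, timedelta


def parse_date(value):
    """Return a date for an ISO 'YYYY-MM-DD' string, else None."""
    try:
        return date.fromisoformat(value)
    except (TypeError, ValueError):
        return None


def standardize_invoice_dates(record, today=None):
    """Fill in default dates only where a valid date is absent."""
    today = today or date.today()
    fixed = dict(record)
    # Default the invoice date to today's date.
    invoice_date = parse_date(record.get("invoice_date")) or today
    # Default the due date to the invoice date plus 30 days.
    due_date = parse_date(record.get("due_date")) or (
        invoice_date + timedelta(days=30)
    )
    fixed["invoice_date"] = invoice_date
    fixed["due_date"] = due_date
    return fixed


record = {"invoice_no": "INV-1", "invoice_date": "2024-02-30",
          "due_date": None}
fixed = standardize_invoice_dates(record, today=date(2024, 1, 15))
```

Valid source dates pass through untouched, so the step repairs
records rather than merely rejecting them.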

In some cases, user intervention or dynamic processing at run-time
may be required for validation and correction. In the journal entry
scenario, a user may need to define the entry type for the journal
entry based on the source of the journal entries being processed. In
another scenario, default values may need to be generated dynamically
outside of the application.
For example, an indicator within the source record(s) could define
the entry type.

Provide a simple way to design validation that can be performed
dynamically during the integration process. In the case where
validations are being enforced by vendor code, provide a way to
define valid values at run time, with or without user involvement.

Challenge #5: Managing application integration requires robust error
reporting and error management.

An integration process introduces a new set of interactions with the
application and ideally accomplishes these new tasks without error.
It is unlikely in today’s complex IT environments that a process will
be completely error free over time. In cases where errors have
occurred, it is essential to quickly identify and isolate the error
and then efficiently correct the error.

What makes good error detection and reporting?

Error correction is a very complex area that drives three important
requirements:

1) Report errors in a way that makes it easy for users to identify
the nature of the problem. Standard vendor database and API level
error detection and reporting are typically not robust enough. The
trick to useful reporting lies in matching all errors to the specific
transaction(s) responsible for the problem. Once the error has been
resolved, the user can reprocess just the source records that caused
the error instead of an entire set of records. Allowing the user to
define error conditions and messages can also provide additional
diagnostic information.

2) Provide a granular “rollback” capability to preserve the integrity
of records and transactions in the target application. For example,
when inserting an invoice transaction, it may be necessary to verify
that the total of the detail records is equal to the total
transaction value in the header record.
If an “out of balance” error occurred, it would be necessary to roll
back all of the header and detail inserts for that transaction. Since
each integration process is different it is necessary to provide
control over the rollback process to the user.

3) Tightly integrate error management functionality with workflow.
Taking our lead example that was discussed in the workflow section,
the user may want to define rules at each point of the workflow for
dealing with error conditions. If an error occurred while inserting
the fulfillment record, a user could have the choice of rolling back
the entire transaction (which includes the lead insert or update) or
just the fulfillment entry itself.
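
The choice between rolling back the whole transaction and rolling
back only the fulfillment step maps naturally onto database
savepoints. The sketch below is a hedged illustration using SQLite:
the `lead`, `fulfillment`, and `integration_error` tables, the
`load_lead` function, and the `keep_lead_on_failure` flag are all
hypothetical names chosen for the example.

```python
import sqlite3


def load_lead(conn, lead_name, fulfillment_item,
              keep_lead_on_failure=True):
    """Insert a lead and its fulfillment record in one transaction."""
    conn.execute("BEGIN")
    try:
        lead_id = conn.execute(
            "INSERT INTO lead (name) VALUES (?)", (lead_name,)
        ).lastrowid
        # Mark the point we can return to without losing the lead.
        conn.execute("SAVEPOINT fulfillment_step")
        try:
            conn.execute(
                "INSERT INTO fulfillment (lead_id, item) VALUES (?, ?)",
                (lead_id, fulfillment_item),
            )
        except sqlite3.IntegrityError as exc:
            if not keep_lead_on_failure:
                raise  # user rule: undo the entire transaction
            # User rule: undo only the fulfillment step and log it.
            conn.execute("ROLLBACK TO fulfillment_step")
            conn.execute(
                "INSERT INTO integration_error (lead_id, message) "
                "VALUES (?, ?)",
                (lead_id, str(exc)),
            )
        conn.execute("RELEASE fulfillment_step")
        conn.execute("COMMIT")
        return lead_id
    except Exception:
        conn.execute("ROLLBACK")  # removes the lead insert as well
        raise


# isolation_level=None lets the code issue BEGIN/COMMIT explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.executescript("""
CREATE TABLE lead (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE fulfillment (
    id INTEGER PRIMARY KEY,
    lead_id INTEGER NOT NULL,
    item TEXT NOT NULL
);
CREATE TABLE integration_error (lead_id INTEGER, message TEXT);
""")

# A missing item violates NOT NULL, so the fulfillment step fails,
# but the lead is kept and the failure is recorded.
kept_lead_id = load_lead(conn, "Acme", None)
```

Because the error row carries the lead's key, the failure can be
matched to the source record that caused it and reprocessed alone.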
An insert into another table could indicate that the fulfillment
failed without removing the lead record.

Integration processes are extremely difficult to set up and maintain
without robust error reporting and management capabilities. When
designing an error reporting capability, start with the error
reporting resources available within the application and database,
provide for additional reporting specific to the integration task
being performed, and provide a set of reports and logs that can be
matched easily with the errors themselves. Provide error management
capabilities that can be configured by the user and integrated into
the process flow.

Challenge #6: The integration process needs to interact with the
application (or a surrogate to the application) at run-time.

Many of the requirements outlined in the earlier challenges are
created by the application. Some exist within the application itself.
If the application vendor has incorporated them into the integration
process (represented by the vendor code sections of Figure 1), then
there will be few issues. If they have not, then it is critical to
create a mechanism to interact with the application at run-time. This
run-time interaction must be capable of invoking conversations at
different points in the integration process.

For example, in the case of the invoice date requirement, the
validation conversation is performed at the point that the entry is
made for the invoice date and due date fields. If a valid value is
not entered, the processing of that record is not allowed to
continue. In the case of the journal entry or invoice detail items,
the validation can only occur after all values have been entered, and
a posting has been attempted.

Take the invoice example described earlier. In this example, we will
insert the invoice header and detail records into available APIs of
the application.
It is important that the header record has a valid invoice date and
due date before the header insert is processed. It is also important
that, after inserting all of the detail invoice lines, the total
dollar value of the detail lines is equal to the invoice total in the
header record. It is therefore important to have a mechanism to
“call out” to the application to invoke its validation logic at these
different points in the integration process. Figure 5 is an
illustration of this process.

Figure 5: [diagram not reproduced]

There is one difficult obstacle to this application interaction,
however.
The code that performs these validations may not be encapsulated in
discrete components within the application. It may be buried in the
code of the application and not accessible. In this case, a mechanism
to encapsulate the validation logic within accessible components is
required. These components contain code that essentially acts as the
application’s surrogate, performing the validations that the
application cannot.

Challenge #7: An application is highly configurable and dynamic.

Business applications are customized to meet the demands of the
business. Field names and definitions are changed, tables may be
added, relationships and interdependencies between tables and fields
are redefined, and business rules are modified.
It is critical to deliver integration capabilities that are as
flexible and dynamic as the source and target application(s). This
challenge may be the most difficult of the seven since it permeates
the other six.

What capabilities are important?

Two key capabilities are critical when dealing with configurable
applications:

1) The current state of the application must be represented to the
user. The schema definition, whether presented to the user at the
table level, in a file format, or within an API, should include these
customizations. [3]

2) The second key capability is the ability to quickly adapt the
integration process to configuration changes in the application. This
is best accomplished using a graphical environment that externalizes
the definition of the integration process. It is not practical to
sift through lines and lines of code to make a limited number of
modifications to the process. Modifications to the application should
be isolated and their effect on the integration process should be
presented to the user.
The user can then quickly and efficiently make the necessary
modifications to the integration process.

Implement an integration framework that is flexible to customizations
in the application. Utilizing multiple integration points into the
application can provide for a more dynamic integration process
environment.
Make sure the integration process can quickly adapt to changes in the
application by employing a graphical design environment that
externalizes the integration process and isolates the effect of any
changes in the application environment.

[3] The integration points provided by the application vendor
typically do not address this requirement well. They tend not to be
very dynamic to changes. An approach to addressing this limitation is
to utilize one integration point in the application for schema
definition and another for the processing of transactions.
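
The idea of reading the schema from one integration point and
processing transactions through another can be illustrated with a
small sketch. It assumes a SQLite database whose catalog serves as
the schema source; the direct insert stands in for whatever
transaction interface (for example, an API) the application actually
exposes. The `account` table and the names `current_columns` and
`insert_mapped` are hypothetical.

```python
import sqlite3


def current_columns(conn, table):
    """Read the column names from the live schema at run time."""
    # PRAGMA table_info returns one row per column; index 1 is the
    # column name. The table name must be a trusted identifier.
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return [row[1] for row in rows]


def insert_mapped(conn, table, source_record):
    """Insert the source fields the target currently defines.

    Returns the source fields that had no matching target column."""
    columns = [c for c in current_columns(conn, table)
               if c in source_record]
    if not columns:
        raise ValueError("no source field matches the target schema")
    skipped = sorted(set(source_record) - set(columns))
    placeholders = ", ".join("?" for _ in columns)
    sql = (f"INSERT INTO {table} ({', '.join(columns)}) "
           f"VALUES ({placeholders})")
    with conn:
        conn.execute(sql, [source_record[c] for c in columns])
    return skipped


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, name TEXT)")
record = {"name": "Acme", "region": "East"}

skipped_before = insert_mapped(conn, "account", record)

# The application is customized: a new field is added to the table.
conn.execute("ALTER TABLE account ADD COLUMN region TEXT")
skipped_after = insert_mapped(conn, "account", record)
```

Because the mapping is derived from the current schema on every run,
the customization is picked up without editing the integration code,
and the list of skipped fields shows the user what the change
affected.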

Some final thoughts…

Can we really hide complexity from the user?

The answer is yes and no. Complexities that are tied more closely to
the structure of the application (Challenges 1, 2, 7, and part of 5)
are great candidates to be hidden from the user, for example the
generation of unique primary keys on tables. Conversely, complexities
that are too tightly tied to the integration process are not good
candidates to hide. Workflow requirements require configuration and
control directly from the user. That’s not to say that a tool with
intelligent workflow architecture cannot simplify the process, but it
cannot be completely hidden from the user.

The key to hiding these challenges is to provide a simplified,
higher-level integration object to the user. The user deals with the
object, while the object takes over tasks like data normalization,
primary key generation, enforcing default values, etc. This
integration object provides an abstraction layer that dramatically
simplifies the integration process for the user.

Build or buy?

An option when developing an integration framework is to license
technology from an integration tool vendor. This strategy enables the
company to focus on its own core competence while ensuring
adaptability to market changes. The major impediment to the success
of this strategy is the difficulty in integrating 3rd party tools
effectively. The nature of this difficulty is embodied in the seven
hidden challenges.

When evaluating an integration tool vendor, it is critical to
understand how their product will handle the seven hidden challenges.
Does the tool’s architecture address all seven challenges? Can the
tool, with minimal customization and effort, satisfy these
requirements?

Beware of any product that is marketed as an application “Adapter” or
“Connector”. These products typically promise seamless integration
with the application. If not designed properly, however, they end up
as a repository for custom code to address some or all of the seven
challenges.

Finally, will the tool be flexible enough to respond to future
customizations and architectural changes in the application? The
answers to these questions will ultimately determine the success or
failure of a “buy” strategy for application integration.