Chapter 14

Know Your Integrations

Summary

A website will often need to communicate with some external system. These situations can be fraught with peril. The key is to know what they are early, and to plan for the unknown.

Does this Apply?

If all of the content and functionality for your site will be delivered from inside the walls of your CMS, then you don’t need to worry about integrating with anything. And this can be true of smaller, content-based sites. However, as your site grows, you’ll likely run into an integration need, sooner or later.

Narrative

If there’s one thing that is universally reviled in all of academia, it’s this: the group project.

Usually, throughout school, you’re just barely managing to get your own stuff done. But then, inevitably, your professor assigns a project to be done with a partner, or – even worse – in teams of three or four, and your heart just sinks.

Why do we all hate group projects? A few reasons:

  • We have to coordinate with other people. There are communication issues to worry about – when can we meet, how will we divide the work up, how will we keep in sync?

  • We’re limited by the worst-performing member. If Joe doesn’t step up, then he brings the entire group down, and our grade is affected by him.

  • The final product runs the risk of being a mish-mash of styles, instead of an integrated whole. It looks weird when Sarah delivers her part of the presentation as an interpretive dance.

In working as a group, we’re now at the mercy of group dynamics, and we start to lose control of things. Unless we’re the weak link ourselves, this is not something we want to concede.

But if you think you hated group projects in school, just wait until you deal with them in a development project.


What is an Integration?

An “integration” is when two systems – meaning, two different pieces of software – have to talk to each other in some way. The two systems have to be ... wait for it ... integrated.

When building a website, one of those systems will be your content management system (CMS). The other system could be anything.

In general terms, this is done because some content or functionality is not provided from your CMS. Often, content in your CMS needs to be augmented with content from some other source, or some functionality is provided by some other system and you’d like to make it appear as though that other system works as an integrated (ahem) part of your website.

Some examples:

  • A university usually has a course and faculty management system which maintains all the courses and faculty assigned to teach them. They’d like to display this on the website, but rather than enter this information twice and keep it incrementally updated, it’s more manageable to retrieve that information directly from the source system.

  • A mortgage company wants to provide a calculator for potential customers to estimate their monthly payment with all service fees. The website provides a form to collect financial information, then it contacts an internal system to retrieve the calculated result and display it to the user.

  • A company would like to use the same user credentials for its website as it does for other systems. The website can often be connected to a centralized authentication system which checks usernames and passwords and informs the calling system who this user is and what permissions they possess.

In each of these cases, you’re “gluing together” two systems to provide some consolidated experience through integrated content, functionality, or authentication. You’re going to need to reach beyond the CMS and communicate with an external system, then somehow make sense of how it responds and incorporate that response into the flow of your website.

This is where the fun starts.


The Challenges of Integration

Much like your group project at school, integrations can be fraught with peril. Many organizations have problems keeping one system running, and now we’re asking them to keep more than one system in good working order, and to keep those systems talking to each other.

There are two types of risk:

  1. Development risk, which is risk incurred during the development and launch of the project. There is a risk of the two systems being incompatible, or of the work to connect them together taking too long and consuming too many resources. This risk, thankfully, is one-time. However, once you get past this, you run headlong into ...

  2. Runtime risk, which is the ongoing risk of keeping the two systems working together over the long-term. They both need to stay running, and they need to be able to communicate without unreasonable delay. This risk is ongoing – you might overcome development risk and get the systems working together, only to have the partnership break down in production when they lose connection with each other. And if you’re really unlucky, this might happen randomly, multiple times a week ...
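
One common way to soften runtime risk is to remember the last good answer from the other system and serve it when that system fails to respond. What follows is a minimal Python sketch of that idea, not a recipe for any particular CMS. The class name and the `fetch` function are hypothetical; `fetch` stands in for whatever call actually reaches the external system.

```python
from typing import Callable, Dict, Optional


class LastKnownGood:
    """Wraps a call to an external system and remembers the last good answer."""

    def __init__(self, fetch: Callable[[str], str]):
        # `fetch` is a hypothetical function that contacts the external system
        # and raises an exception when it cannot get an answer.
        self._fetch = fetch
        self._cache: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        try:
            value = self._fetch(key)
        except Exception:
            # The external system is down or unreachable: serve the last value
            # we saw, or None if we never had one.
            return self._cache.get(key)
        self._cache[key] = value
        return value
```

The trade-off is that visitors might see slightly old information during an outage, which is usually better than seeing an error page.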

Thankfully, some integrations are so common as to be productized, meaning you’re not the first organization to attempt to link these two particular systems, so the communication between them is known and already developed. In fact, several products are specifically designed to be connected.

For example, many content management systems have only minimal functionality in the area of media file management – the handling of pictures, video, and other rich media. Some editors want more control and features around media, so they turn to one of many Digital Asset Management (DAM) systems. These systems can operate as standalone software, but they become more helpful when they’re integrated with a content management system, so an editor can search for media to insert in content without ever leaving the CMS.

Since DAM systems work with CMS systems all the time, this integration is quite common and is even offered as a supported product. A DAM might advertise that they integrate with CMS X and offer the tools to make that happen at additional cost, or, often, even for free as an incentive to purchase their system.

That’s an easy one.

However, in other cases, the integration you need might be the only time anyone has ever tried to glue together those two systems. Either because the system you want to integrate is relatively rare, or it’s a one-off custom system that your organization built internally.

For example, a university might have a system that shows the status of washing machines and dryers in the laundry area of a residence hall. The university might want to display this information on the page for each residence hall so students know when there are free machines available.

This integration is likely quite rare. It’s not likely to be productized for even one CMS, much less multiple CMSs that happen to include the CMS you happen to be using.

Not to overdo the group project metaphor, but if you have to do a group project, clearly you want to do it with people you know. If you already have established friendships and communication patterns, it makes things easier. The person you have to work with is already in the contacts in your phone, you know their schedule, you know how they relate to other people, and you may have known them long enough that your relationship has already had its ups and downs, so you’ve fought, talked it out, and made up multiple times.

Other times, you’ve never met this person. In these cases, you cross your fingers and hope for the best.

Real Time vs. Scheduled

A large percentage of integrations are intended to combine content from two sources – usually your CMS and some external system. When doing this, a key question is: how close to real time does this connection have to be?

Content is naturally WORM – Write Once, Read Many. We might publish an article one time, and it is read 10,000 times. Once published, the content doesn’t change. And if it hasn’t changed since the last time our CMS retrieved it, do we need to retrieve it again?

When discussing issues of timing, we’re juggling the inter-connected factors of:

  1. Communication – can the systems speak to each other?

  2. Velocity – how fast does the content change?

  3. Latency – how quickly do we need those changes reflected?

Clearly, communication can break down. If your systems have to talk to each other, then that line of communication has to remain open. This can get into some deep issues of network topology and architecture, but understand that just because you can log into both systems, doesn’t necessarily mean they can talk to each other.
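
A quick way to test this is to try opening a connection from the machine that hosts your website, rather than from your own laptop. Here is a small Python sketch using the standard `socket` module. The function name is made up for illustration, and the host and port you pass in would be whatever your external system actually listens on.

```python
import socket


def can_reach(host: str, port: int, timeout_seconds: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened from here."""
    try:
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name lookup failures.
        return False
```

A `True` result only proves the network path is open. It says nothing about whether the other system will accept your credentials or answer your requests.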

Additionally, if you’re integrating systems to combine content from multiple sources, then the velocity of that content matters, as does your requirement for latency.

How often does the external content change? And how quickly do you need those changes reflected in your content exposed to the public?

Let’s return to our university example from above. Say we have a page for each faculty member. That page has some very subjective, marketing-ish copy like their biography and credentials. This is maintained by the marketing staff, directly inside the CMS.

However, also displayed on that page is more fundamental information like the professor’s email, phone number, and name. This is not information that needs to be maintained in the CMS, because it’s stored elsewhere. To also put the phone number in the CMS just to display it would mean entering and maintaining this information in yet another place (the “double entry” problem).

This is a classic case for an integration. Ideally, the professor’s page on the website would be a hybrid of content coming from both the CMS, and the internal course management system.

The delivery of a faculty page to the requesting visitor is an internal mixing of content from the CMS repository and an external system.
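
That mixing can be sketched in a few lines of Python. This is only an illustration under assumed names: the function and the field names (`name`, `email`, `phone`) are hypothetical, and a real CMS would do this inside its own templating or content model.

```python
from typing import Dict


def build_faculty_page(cms_content: Dict[str, str],
                       directory_record: Dict[str, str]) -> Dict[str, str]:
    """Combine marketing copy from the CMS with contact data from the other system."""
    page = dict(cms_content)  # biography, credentials, and so on
    # Contact fields always come from the external record, so nobody has to
    # maintain them twice.
    for field in ("name", "email", "phone"):
        page[field] = directory_record.get(field, "")
    return page
```

Note the design choice: if both sources supply an email address, the external system wins, because it is the system of record for that field.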

To connect these two systems, let’s consider a combination of technical and logical questions:

  1. Can the CMS and the course management system communicate with each other?

  2. How fast and stable is this line of communication?

  3. How often does the information in the course management system change?

  4. How quickly do we need those changes reflected on the website?

  5. What is the risk of this content becoming outdated?

For the purposes of this example, let’s assume that yes, the web server and course management system are located on the same network and can communicate with each other. But, for fun, let’s say this connection is unstable. The course management system is on an old server, and gets overloaded from time to time when students are registering for classes. The server admin has said – quite gruffly, while stroking his neckbeard – that you new-fangled kids cannot have on-demand access to this server.

Okay, that’s questions #1 and #2 answered.

For question #3, let’s consider how often this information changes. Professors are normally hired during the summer – not a lot of faculty start at other times of the year. Also, once a faculty member starts at the university, their phone numbers and email addresses rarely change. The courses they teach do change, but maybe once a year, at most. For all intents and purposes, the information in our course management system is static – it has a low velocity.

Highly related is question #4. What are our requirements for immediacy? If a professor did change their email address, how quickly do we need that change reflected on the website? Right away? In a perfect world, sure. But in the real world, a 24-hour delay isn’t going to kill anyone. This means we can tolerate high latency in our content changes.

Finally, what is the risk of incorrect content? Of course, we want our content to be up-to-date all the time, but at the end of the day, is anyone going to die if an email address is temporarily wrong? Probably not. There are likely multiple other ways to contact this professor, and the sender would get a returned email to notify them that their email wasn’t received and they should use other means. Clearly, our risk is low.

In this case, we could likely use an “import and update” pattern. We can configure our CMS to import content from the course management system, then keep it updated on a schedule. Once every 24 hours, let’s say, in the wee hours of the morning when load is low, a scheduled job would check for new or changed information and update the content in the CMS.

If a professor changed their email address sometime in the afternoon on a Tuesday, it would be wrong for the rest of that day, but correct itself overnight. No harm, no foul.
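
The heart of an import and update job is a comparison: for each record in the source, is it new or different from what the CMS already holds? Here is a minimal Python sketch of that step. It assumes both systems can be represented as dictionaries keyed by a faculty ID, which is a simplification, and the function name is hypothetical.

```python
from typing import Dict, List

Record = Dict[str, str]


def import_and_update(source: Dict[str, Record], cms: Dict[str, Record]) -> List[str]:
    """Copy new or changed records from the source into the CMS.

    Both arguments map a faculty ID to that person's fields. Returns the IDs
    that were added or changed, which is handy for logging.
    """
    touched = []
    for faculty_id, record in source.items():
        if cms.get(faculty_id) != record:
            cms[faculty_id] = dict(record)
            touched.append(faculty_id)
    return touched
```

A scheduler such as cron would run this once a night. This sketch does not handle records that were deleted from the source, so a real job would need a rule for those too.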

Our university example might seem contrived, but this set of circumstances is fairly common. External content sources often move slowly, and the content is low-risk.

However, there are exceptions.

Consider if you provide stock quotes on your website.

  • This information is highly volatile – it has a high velocity

  • We need to see changes immediately – we require low latency

  • People will be making financial decisions based on this information – it has high risk

In this case, you can’t do an import and update. Doing that even once per minute would likely be more latency and therefore more risk than people are willing to tolerate. This type of content simply demands a real-time connection between systems.
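
The reasoning behind these two examples can be boiled down to a rough rule of thumb. The Python sketch below is one way to write it; the names and the three yes/no questions are a simplification of my own for illustration, and real decisions will involve more nuance than this.

```python
from dataclasses import dataclass


@dataclass
class ContentProfile:
    high_velocity: bool      # does the source content change often?
    needs_low_latency: bool  # must changes show up right away?
    high_risk: bool          # is stale content costly or dangerous?


def suggest_pattern(profile: ContentProfile) -> str:
    """A rough rule of thumb, not a substitute for judgment."""
    if profile.needs_low_latency or profile.high_risk:
        return "real-time connection"
    if profile.high_velocity:
        return "import and update, on a short schedule"
    return "import and update, on a daily schedule"
```

Under these assumptions, the faculty directory (all three answers “no”) lands on a daily import, and the stock quotes (all three “yes”) land on a real-time connection.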


Determining if Integration is Necessary

Let’s back up a second –

Are we sure that integration is even the right choice? Because there are times when it’s not.

In our university example, an integration is probably the right way to go, because what is the alternative? The content we want – phone numbers, emails, etc. – is locked away in our course management system, and that probably doesn’t have a public viewing option, so we have to get that content into our CMS where we can display it.

However, consider the stock quote example. If we want to display stock quotes on our website, we open up a Pandora’s Box of issues, as we noted. An argument could be made that it’s more trouble than it’s worth.

Removing a Link in the Chain

One option would be to not display stock quotes, of course. We could decide there are too many issues and just throw in the towel on the idea entirely. In this case, we could simply link users to another site to get the quote.

However, another option would be to remove a link in the chain by moving the integration to the client, rather than the server.

So far, we’ve been integrating at our website. So, when someone requests our home page with the stock quote, they’re requesting a page from our website, and then we’re turning around and requesting a stock quote from another service, bringing that back and mixing it with our content, and then sending it back to the requestor. (And, it’s worth noting, the service from where we got the stock quote probably even got it from somewhere else.)

This can end up like the proverbial game of telephone. With digital communication, the message won’t get garbled, but every time we pass it down the line, we introduce delay and risk that something could go wrong.
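
A little arithmetic shows how this adds up. In the Python sketch below, each link in the chain has a delay and a chance of succeeding; delays add together, and chances multiply. The function name is hypothetical, and the calculation assumes each link fails independently of the others, which is a simplification.

```python
from typing import List, Tuple


def chain_totals(hops: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Each hop is (delay_in_ms, chance_of_success).

    Returns (total_delay_in_ms, chance_the_whole_chain_succeeds).
    """
    total_delay = 0.0
    overall_chance = 1.0
    for delay_ms, chance in hops:
        total_delay += delay_ms   # delays add up
        overall_chance *= chance  # every hop has to succeed
    return total_delay, overall_chance
```

With made-up numbers, two hops of 50 and 200 milliseconds that each succeed 99% of the time give a total delay of 250 milliseconds and an overall success rate of roughly 98%. Every extra link makes both numbers worse.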

We can take a link out of that chain by instead delivering the code to request the stock quote, and have that request come out of the user’s browser, directly to the source. We can do this a couple of ways:

  • An inline frame (an IFRAME) could contain a “window” into a source’s website that would display the stock quote. Many services provide URLs specifically for this.

  • Some JavaScript could execute in the user’s browser and request content directly from the source.

Technical details aside, both of these methods have done the same thing: they’ve removed one link in the chain. Now the user is bypassing our website and getting content from the source, which is always going to be more direct and involve less risk.

Showing the Seam

Additionally, sometimes the system you want to integrate with is just too feature-rich and constantly evolving to “proxy” all its functionality into your website.

For example, there are services to which banks subscribe that manage the process of applying for a mortgage. These systems allow users to fill out a lengthy, multi-step application, upload documents, authorize credit checks, attest to facts, etc. They’re so complicated that an ecosystem of vendors has sprung up to build these systems. Banks don’t build them internally because they’re complex and other people have solved the problems.

Occasionally, a banking client will say, “But I don’t want visitors to have to go somewhere else! I want them to do all of that on our site!”

What the client is saying, basically, is “Rebuild all that functionality on our site, then connect to the vendor’s system, and exchange data in both directions, in real-time.”

While noble and ambitious, this is just not a realistic approach. First, the initial development would be staggering – remember, the reason this external vendor stays in business is because they’ve built something that’s difficult to replicate. Second, the vendor’s system likely evolves over time. They release new updates and features every few months, and some of these might interfere with communication patterns to your website (a so-called “breaking change”). You would spend a lot of time (and a lot of downtime) chasing these bugs.

It would quickly become apparent that the benefit of seamlessness you were seeking just wasn’t worth the pain.

Sometimes, it’s better to just show the seam. If the customer wants to apply for a mortgage, let them know you use an external service for this, and send them over there. These services will often let you “skin” their tools with your logo and colors, so it has some branding connection.

Would it be perfect if this was on-brand and on-site? Sure. But it’s often not worth the trouble.


The Costs of Integration

Given the volatile nature of integrations, they can be hard to scope. Often, you’ll be trying to integrate one system with another in a combination that you or your implementation partner has never done before.

Thus, the answer to the inevitable question of “How much will it cost?” might very well be “… we have no idea.”

Integrations can often be a whirlwind trip into the unknown. Even if it seems straightforward, these things can turn on the smallest of issues – some minor thing pops up that makes attaining the goal impossible, due to a quirk of how that factor intersects with your particular requirements.

If you’re using an internal development group, this can mess with your timeline. If you’re using an external development partner, both your timeline and your budget are at risk. Problems with unknown integrations have run wild, soaking up tens of thousands of dollars that were intended for something else.

When an external partner is asked to budget for one of these, they’ll generally do one of two things:

  1. Leave the budget open-ended – they might provide a good-faith estimate, but allow for overages

  2. If a firm, fixed number is demanded, they’ll pad that number like a bad living room sofa from the 70s

Actually, there’s a third option: they might give you a lean, fixed number. But if they do, be very skeptical of it, and don’t be shocked if they come up with an excuse for why they can’t complete the integration for the number they quoted.

Practical vs. Perfection

Remember, everything is a trade-off, and until a CMS is created that does everything you could ever want to do, we’re going to be stuck integrating with other systems. The trick is to be wise about when to do this, and when to just show the seam and have your users get their content or functionality directly from other systems.

Every answer to an integration question is a combination of budgetary, schedule, and technical factors. There’s no blanket approach that works in all cases, so be prepared to consider all the factors when making a decision.

Inputs and Outputs

The input to this phase is an understanding of where everything on your website is coming from. Consider every hope and dream you have for your website – are you going to create all that content (or have it created)? If not, then where is it coming from? You need to be able to answer that question in every case, and the answers to those questions are the output of this phase. You need to be able to articulate those data-sharing requirements to whoever is scoping and developing your website.

The Big Picture

This phase can sort of be lumped into scoping and budget, but it needs to happen just before that – you need to go into that process with an understanding of where all your content is coming from. To scope anything, a developer needs to know the technical factors they’re going to need to juggle. So, the project can’t be scoped for schedule or budget until this is complete.

Staffing

The logical component of this can be done by a content strategist. As part of the planning process for the website, whoever is in charge of content should naturally know where this content is coming from, whether internally to the CMS, or externally from somewhere else. The scoping component of this is technical by nature, and needs to be done by a developer responsible for figuring out development schedule and budget.
