This sponsored post is produced in association with XIM.
When a global platform like Facebook simply stops working, even for less than an hour, the whole world takes notice. That’s what happened recently when Facebook’s services, including its popular photo-sharing app Instagram, ceased to function properly, leaving over a billion users on both the web-based platform and the mobile apps without access for about 40 minutes.
In the daily lives of modern consumers, the reliability of the technologies and services they use is serious business. This is especially true in the mobile space, where a chaotic mess of fragmented devices, operating systems, and cloud services can render software unreliable.
One example, Matt Halligan says, is Java: “A lot of people use the latest versions of Java, but the Java systems are built on a Java machine that developers have no control over, and that Java machine intermittently does its own memory management. That memory management can create different timing behaviors, which can generate different behaviors in your software — which could then make your software unreliable in terms of how it responds to inputs and outputs.”
Matt Halligan is the CTO and VP of Engineering at Openwave Mobility, a mobile solutions provider that serves operators worldwide, such as Vodafone, Sprint, and Orange. Previously part of Openwave Systems, the company behind the development of the Wireless Application Protocol (WAP) in the 1990s, Openwave Mobility is now a private company specializing in mobile software.
“In our world,” Matt says in an interview with VB, “where you’re supporting millions and millions of customers, the first thing around reliability from our perspective is ensuring consistent behavior.” Matt explains that you want to know how your product will behave during its operation so you can maintain proper behavior and diagnose potential issues, ensuring that in the end, “it does what the customer requires, and not necessarily what the engineering team thinks it requires.”
The complexity in meeting demand for “always-on”
When it comes to mobile software, reliability goes beyond just stability, especially for a modern consumer base that craves on-demand mobile solutions they can access with a swipe of a finger. A reliable mobile app, for instance, needs to be stable, secure, and consistent in how it behaves. Ideally, when an instance crashes, a secondary instance runs in its place, errors are logged and monitored, and the issues are fixed without disruption to the user’s service.
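The failover behavior described above can be sketched in a few lines of Python. This is only an illustration, not how any particular vendor implements it: the function `call_with_failover` and the `primary` and `secondary` handlers are hypothetical names, and a real deployment would typically rely on a load balancer or process supervisor rather than in-process code.

```python
import logging

logger = logging.getLogger("failover")


def call_with_failover(instances, request):
    """Try each (name, handler) pair in order; log failures and move on."""
    last_error = None
    for name, handler in instances:
        try:
            return handler(request)
        except Exception as exc:
            # The error is recorded for later diagnosis, and the caller
            # is served by the next instance instead of seeing the crash.
            logger.error("instance %s failed: %s", name, exc)
            last_error = exc
    raise RuntimeError("all instances failed") from last_error


def primary(request):
    # Hypothetical primary instance that has crashed.
    raise ConnectionError("primary is down")


def secondary(request):
    # Hypothetical standby instance that takes over.
    return f"handled {request}"


result = call_with_failover(
    [("primary", primary), ("secondary", secondary)], "ping"
)
```

Here the caller still gets a response even though the first instance failed, and the failure is logged rather than silently lost.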
This is just the surface of the mobile industry, however. Beneath the apps is the mobile OS, beneath that is the network, and beneath that is a multitude of interconnected layers of solutions all relying on one another to make the grand scheme work. Every time you bring up an app on your smartphone, dozens of companies are involved.
And all the apps on all your mobile devices are dependent on the reliability that companies like Openwave Mobility ensure.
In the space where Matt’s company operates, things are a lot more complex than at the app level. Their solutions don’t keep just a single app’s uptime consistent or its responsiveness quick; they support millions of mobile devices and the millions of apps installed on them. At this level, reliability becomes tricky, to say the least.
Matt shares a particular challenge the company had to face in delivering a portfolio of components to a tier-one mobile operator as a single, combined solution: “Each of those components were built independently and were startups in their own right. The individual parts met all their individual requirements, but as you combined them together, they didn’t meet the reliability requirements across the solution.” While Matt says they didn’t have to change any code implementations, they did perform a “complete deployment architecture change” to resolve the issue that resulted in an outage for their client.
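One common way for parts to pass individually and fail together (not necessarily what happened in Openwave Mobility’s case) is simple arithmetic: when a request has to pass through every component and failures are independent, the availabilities multiply. The sketch below uses made-up numbers, five components at 99.9% each, and a hypothetical helper named `combined_availability`.

```python
import math


def combined_availability(availabilities):
    """Availability of a chain in which every component must be up.

    Assumes failures are independent, which is a simplification.
    """
    return math.prod(availabilities)


# Illustrative figures only: five components, each meeting a 99.9% target.
components = [0.999] * 5
combined = combined_availability(components)

print(f"each component: {components[0]:.3%}")
print(f"combined chain: {combined:.3%}")
```

Each part meets the 99.9% target on its own, but the chain as a whole comes out at roughly 99.5%, which is below that same target.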
Facebook says their recent outage happened when they “introduced a change that affected [their] configuration systems.” In the meantime, the whole world panicked on Twitter. This is the scale of the significance of reliability today.
“A lot of times,” Matt says, “it’s assumed your software is reliable. When you deliver a software or service that’s not, then effectively, you lose the trust of the customer, you lose the commercial agreement, and you probably quickly lose your market share within the industry.” You can imagine Facebook’s engineers going crazy trying to fix the issue during those 40 tense minutes.
Getting to A-level reliability
So we asked Matt: what does it take to be reliable?
Matt says, “From my perspective, reliability is designed into a product, it’s not implemented into a product.”
He talked about Openwave Mobility’s strategic partners and how they help the company maintain its commitment to reliability. Having worked with several potential candidates, the company partnered with XIM in 2006 and has been working with them since. XIM is a provider of outsourced quality assurance and software development services.
According to Matt, having a strategic partner with the technical background to uphold the standards of reliability is paramount. He requires engineers who are not just technically strong but also come with a background in mathematics and electronics in order to bring “strong problem-solving technical analysis capabilities” to the table. Matt highlights the ability to code beyond what he refers to as the “Happy Path,” the singular coding scenario where nothing goes wrong.
A lot of startups are only focused on the Happy Path, Matt says, “because they don’t have experience on scenario development — developing for scenarios aside from the Happy Path,” such as error and breakage scenarios.
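To make the distinction concrete, here is a small Python sketch. The function `parse_port` and the inputs are invented for illustration. The first check is the Happy Path; the loop that follows covers the error and breakage scenarios that are easy to skip.

```python
def parse_port(text):
    """Parse a TCP port number, rejecting anything outside 1..65535."""
    try:
        port = int(text.strip())
    except (ValueError, AttributeError):
        # Covers non-numeric strings and non-string inputs such as None.
        raise ValueError(f"not a port number: {text!r}") from None
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port


# Happy Path: well-formed input, nothing goes wrong.
assert parse_port("8080") == 8080

# Beyond the Happy Path: malformed, empty, out-of-range, and missing input.
bad_inputs = ["", "http", "-1", "0", "70000", None]
rejected = []
for value in bad_inputs:
    try:
        parse_port(value)
    except ValueError:
        rejected.append(value)
```

A test suite that stopped after the first assertion would pass, yet it would say nothing about how the code behaves when the input is wrong.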
“We put a lot of emphasis on performance and capability testing between XIM and ourselves,” Matt adds, explaining that they run aggressive, automated tests on their products and services on a regular basis to ensure reliability.
Bottom line: reliability trumps high availability and maintainability
Reliability is hardly a new concept, but it certainly has been taking center stage a lot lately. What does it mean to be reliable? How important is it in our day and age?
“The way I look at it is if your software is not reliable when it’s in service, then it’s not really a game of percentages,” Matt says. “It’s a binary. And you’re not really providing a service at all because there’s no consistency in what you’re doing. There’s no trust from your customer to you for the service you’re offering. It’s not an optional capability, it’s a mandatory capability.”
Matt adds: “It’s up there with high availability, reliability, and maintainability — of those three, it’s the most important.”
Evidently, in the multi-faceted mobile space, reliability isn’t just one of the factors that are nice to have. “It’s a foundational item when it comes to building and providing a service. It’s a must-have,” Matt says.
Think back on when you couldn’t access your favorite app via your mobile and you’ll definitely agree.
Sponsored posts are content that has been produced by a company that is either paying for the post or has a business relationship with VentureBeat, and they’re always clearly marked. The content of news stories produced by our editorial team is never influenced by advertisers or sponsors in any way. For more information, contact sales@venturebeat.com.