Guest

Why bots won’t replace apps anytime soon

Image Credit: Profit_Image / Shutterstock
TRANSLATIONS: 中文 (品玩)

Lately, everyone’s talking about “conversational UI” [user interface]. It’s the next big thing. But the more articles I read on the topic, the more annoyed I get. It’s taken me so long to figure out why!

Conversations, writes WIRED, can do things traditional GUIs can’t. Matt Hartman equates the surge in text-driven apps to a kind of “hidden homescreen”. TechCrunch says “Forget apps, now bots take over.” The creator of Fin thinks it’s a new paradigm all apps will move to. Dharmesh Shah wonders whether the rise of conversational UI will be the downfall of designers. Design, says Emmet Connolly at Intercom, is a conversation.

Benedict Evans prophesied that the new lay of the land will be that “all messaging expands until it includes software.”

“People don’t want apps for every single business that you interact with,” says David Marcus, head of Facebook Messenger, “…just have a message within a nicely designed bubble … [that’s a] much nicer experience than an app.” Under his charge, Facebook Messenger has tested this approach, building integrations with high profile partners as well as opening up a bot API.

We’ve even seen avant-garde attempts at taking this idea to its extreme, like Quartz’s latest app, which presents the news as a conversation, or the game Lifeline. Apps like MailTime even promise to save us from our emails by turning them into chats.

Well!

I guess I might be partially to blame for this, as a few people have cited a section in a 2014 piece of mine titled “Chats as Universal UI.”

This recent “bot-mania” is at the confluence of two separate trends. One is that agent AIs are steadily getting better, as evidenced by Siri and Alexa being something people actually use, rather than gimmicks. The other is that the U.S. somehow still hasn’t gotten a dominant messaging app, and Silicon Valley is scrambling to learn from the success of Asia’s most popular messenger apps. This involves a peculiar fixation on how these apps, particularly WeChat, incorporate all sorts of functionality seemingly unrelated to messaging. U.S. competitors come away surprised by just how many differently shaped pegs fit into this seemingly oddly shaped hole. The thesis, then, is that users will engage more frequently, deeply, and efficiently with third-party services if they’re presented in a conversational UI instead of a separate native app.

It’s that part which, as I’ve spent the past two years in my current job eating and breathing messaging, seems a major misattribution of what makes chat apps work and which problems they’re best at solving.

As I’ll explain, messenger apps’ apparent success in fulfilling such a surprising array of tasks is not owing to the triumph of “conversational UI.” What they’ve achieved can be much more instructively framed as an adept exploitation of Silicon Valley phone OS makers’ growing failure to fully serve users’ needs, particularly in other parts of the world. Chat apps have responded by evolving into “meta-platforms.” They’ve taken on many platform-like aspects to plaster over gaps in the OS that actually have little to do with the core chat functionality. Not only is “conversational UI” a red herring, but as we look more closely, we’ll even see places where conversational UI has breached its limits and broken down.

But first, let’s retrace how this state of affairs really came about in the first place.

Note: The opinions expressed here are purely my own and do not reflect those of my employer.

A brief history of the chat bubble

We’ll begin by taking a closer look at the apparent atomic unit of the “conversational UI,” our friend the message bubble. To do that, we’re going to go back in time a bit. Let’s take a stroll to, oh, about 2003.

In those days, sending a quick text meant dealing with a UI that looked like this:

In many phones’ UIs, SMSes were treated like mini-emails, often complete with an inbox, outbox, and drafts. So fussy!

Later, some time in the last decade, perhaps owing to a prototype by Jens Alfke, our IMs began taking on their familiar appearance as cartoon dialog bubbles. When smartphones took off later, it was a natural fit for the system SMS apps on the first versions of iOS and Android.

Soon after smartphones launched, those default SMS apps were eclipsed instantly by the third-party messaging apps emerging in Europe and Asia (in the U.S., we have somehow still clung to SMS). These had started as direct clones of the system SMS apps — the only difference being that messages were counted against one’s data quota instead of against the stingy and arbitrary SMS allotment given by carriers.

These apps, which initially came along to replace SMS, have styled the message bubble every way imaginable: round and square, flat and puffy, green and blue. Free from the constraints of a 20-year-old protocol, the apps evolved, taking on more features. The bubbles displayed in these apps developed a number of affordances for new features like read receipts, names in group chats, and more. New kinds of bubbles emerged to accommodate the new types of content these apps supported:

The app I’ve been working on has truly maximized the opportunities available. WeChat’s got bubbles for text, voice messages, big videos, l’il “Sight” videos, full-width cards with hero shots for news headlines, bubbles for payments, files, links, locations, and contact cards. Mucking through some code once, I saw definitions for nearly 100 types of supported messages, most I’d never seen in actual use.

Aside from supporting so many different types of messages, another advance WeChat made was realizing that a messaging app needed different types of accounts, as well. The company had seen brands and celebrities registering personal accounts and making a series of giant group chats to invite their fans into. There had to be a better way! Thus was born Official Accounts.

Here’s what one of the first accounts, China Southern Airlines, looked like when the feature launched in 2012:

Yeah…this bot ain’t exactly HAL 9000.

Here’s what the account for my city’s subway system looked like:

Why was the user asked to enter numbers, as if on an IVR system? Were the creators of these accounts so unable to imagine the possibilities of a new medium that they were compelled to replicate their old-school hotline?

Actually, no! In fact, keywords could be defined, and messages could even be routed through the third party’s server to formulate a response using whatever method it pleased. Yet, in this case, entering keywords or more complex queries in Chinese (or god forbid, formulating a complete sentence) would be even worse. At the time, typing in numbers really was the best UI choice, given the constraints.
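
To make the numbered-menu idea concrete, here is a minimal sketch in Python of how such an account might route incoming messages. This is not WeChat's actual API: `MENU`, `menu_text`, and `route` are hypothetical names, the menu entries are invented, and the `fallback` parameter merely stands in for whatever logic a third party's server might apply.

```python
from typing import Callable, Dict

# Hypothetical menu for a transit account; the entries are illustrative only.
MENU: Dict[str, str] = {
    "1": "First and last train times",
    "2": "Fare lookup",
    "3": "Service notices",
}


def menu_text() -> str:
    """Render the numbered prompt sent when the input matches nothing."""
    lines = [f"{key}. {label}" for key, label in MENU.items()]
    return "Reply with a number:\n" + "\n".join(lines)


def route(message: str,
          fallback: Callable[[str], str] = lambda _: menu_text()) -> str:
    """Map an incoming text message to a reply.

    A known menu number selects an entry. Anything else is handed to the
    fallback, which stands in for a third party's own server-side logic.
    """
    key = message.strip()
    if key in MENU:
        return f"You chose: {MENU[key]}"
    return fallback(key)
```

Calling `route("2")` picks the fare lookup, while any unrecognized text simply gets the menu again. One keystroke per choice is hard to beat when the alternative is composing a query on a phone keyboard.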

Critically, these experiences were still often preferable to downloading a separate app on a data plan or spotty Wi-Fi connection, or having to call someone’s customer service hotline and wait on hold. The Official Account platform was a rousing success; there are over 8 million of these accounts today. As it took off, the APIs offered to third parties to build their accounts expanded to accommodate a growing array of use cases and demands.

Some of these new APIs deepened and enabled new possibilities within the “conversational” nature of these interactions. Voice messages were transcribed via speech recognition before being sent to the OA’s server. Objects could be recognized in pictures. Advanced natural language processing could even extract named entities and certain types of queries from text sent by users. These users could be patched in to agents at service centers to carry on a conversation — exactly as they would with a friend. There was even a special integration whereby a user could select a message in a chat and forward it to Evernote’s Official Account (as they would to a friend) to save it to a note. Cute, right?

On the other hand, enhancements that ran counter or orthogonal to the idea of conversational UI were far greater and more successful.

One affordance added right off the bat was the three-tabbed fixed menu. Now accounts could offer fast access to all their features without having to send a prompt or depend on state information. Here’s what the menu on the Guangzhou Metro’s main official account looks like today:

Not only can those tabs send keywords, they can open up webpages as well. Web apps invoked in this way can identify the user (using OAuth). They have an extensive JavaScript API at their disposal to integrate with all sorts of features elsewhere in the app, even reacting to Bluetooth beacons.

OAs gained the ability to send and receive money. The accounts could have QR codes — for the account itself, as well as parametric ones that can send along extra data (like what product I’ve picked up in a store or what table I’m sitting at). They gained the ability to authenticate me on their owners’ Wi-Fi hotspots. (This development emerged, no doubt, from merchants with OAs made for their shop writing messages to tell customers their router’s Wi-Fi password). Official accounts could not only send out headline news to users, but, if they wished, host the linked articles on WeChat itself, letting users add comments and even send cash tips via the app. None of these things have anything at all to do with chat, but they’re darn nifty!
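
The parametric QR code idea can be sketched in a few lines of Python. Everything here is an assumption for illustration: the `example.com` URL is a placeholder, the function names are hypothetical, and real platforms use their own payload formats. The point is only that a code can carry extra data (a table number, a store ID) alongside the account it points to.

```python
from urllib.parse import parse_qs, urlencode, urlparse

BASE_URL = "https://example.com/oa/pizza-shop"  # placeholder, not a real endpoint


def make_qr_payload(**params: str) -> str:
    """Build the string to encode into a QR code, with optional extra data."""
    if not params:
        return BASE_URL
    # Sort the parameters so the same data always yields the same payload.
    return f"{BASE_URL}?{urlencode(sorted(params.items()))}"


def read_qr_payload(payload: str) -> dict:
    """Recover the account URL and the extra data from a scanned payload."""
    parsed = urlparse(payload)
    extras = {key: values[0] for key, values in parse_qs(parsed.query).items()}
    account = f"{parsed.scheme}://{parsed.netloc}{parsed.path}"
    return {"account": account, "extras": extras}
```

A restaurant could print one code per table with `make_qr_payload(table="12")`, and the account would know where to send the food without asking.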

While this craziness was flying around out here, what sort of vision did those disruptors back on the West Coast begin conjuring for our future bot overlords? Let’s ponder this example from the homepage of Microsoft’s recently launched Bot Framework. Here’s how they think we’ll be ordering pizzas in the future:

Good gravy, that’s over 73 taps[1] to tell Pizza Bot what I want. And this is when he already knows me on a first-name basis! I’d hate to see him when he’s just warming up to someone.

Man, counting those taps sure has made me hungry! We haven’t quite got pizza here, but there’s Pizza Hut, which is almost the same. Let me open their official account…

I have, in 16 taps, ordered a pizza. That includes 1 for choosing ‘medium’, 1 for dismissing their coach marks, and 6 for entering my PIN. For some reason, it’s not set up to use my TouchID. Afterwards, Pizza Hut’s account even sent me a special transaction message with a link to let me track it:

Well, it isn’t exactly Ray’s, that’s for sure, but it’s pizza. And I didn’t have to leave my chat app to get it.

The key wins for WeChat in the above interaction (compared to a native app) largely came from streamlining away app installation, login, payment, and notifications, optimizations having nothing to do with the conversational metaphor in its UI.

It shouldn’t require any detailed analysis, then, to point out the patent inanity of these other recent examples of bots and conversational UI proffered by companies at the vanguard of the trend:

This notion of a bot handling the above sorts of tasks is a curious kind of skeuomorphism. In the same way that a contact book app (before the flat UI fashion began) may have presented contacts as little cards with drop shadows and ring holes to suggest a Rolodex, conversational UI, too, has applied an analog metaphor to a digital task and brought along details that, in this form, no longer serve any purpose. These range from things like the small pleasantries in the above exchange (“please” and “thank you”) to asking for various pizza-related choices sequentially and separately (rather than all at once). These vestiges of human conversation no longer provide utility (if anything, they impede the task). I am no more really holding a conversation than my contact book app really is a little Rolodex. At the end, a single call to some ordering interface will be made.

Designing the UI for a given task around a purely conversational metaphor makes us surrender the full gamut of choices we’d otherwise have in representing each facet of the task in the UI and in how they are arranged spatially and temporally. Consider those made in Pizza Hut’s account: I can see exactly how many slices a medium is, how much corn is inexplicably sprinkled on top of a “Tianfu Beef” pizza, what address it thinks it’s delivering to, and exactly how much the pizza will cost.

So let’s take these past few years in China as “The Great Conversational UI Experiment.” Here, you have a messaging platform that achieved total saturation among both users and businesses (to an extent that Facebook, Kik, and Telegram would die for). It boldly and earnestly carried the “Make every interaction a conversation” torch as far as it could. It added countless features to its APIs — and yet those that actually succeeded in bringing value to users were the ones that peeled back conventions of “conversational” UI. Most instructively, these successes were born out of watching how users and brands actually used the app and seeking to optimize those cases.

You can see from Facebook and others’ early forays into bots that they’re already beginning to have the same hunch. Telegram’s take is true to its inspiration in IRC-style slash commands.

To be fair, it’s still surprising the range of apps and services that can be shoehorned into a chat-style UI. No doubt it can be expanded with great AI and little UI affordances here and there.

I should concede, too, that performing certain tasks in a chat brings along some useful side benefits. It can be, compared to apps, a low-bandwidth, snappy, and consistent way to get a task done. I’m even left with a handy, time-stamped, offline-viewable record of everything that’s transpired. I can search it and quickly jump to media and links. I can clip parts of it and forward it to friends within the app or save it to an archive.

Because I’m interacting with certain services via a messaging app instead of via independent apps, when things happen that might deserve my attention, the thread gets bumped up in my inbox instead of the message getting lost in a sea of push notifications and emails.

And though it’s clear that pure “conversational UI” is ultimately a failed conceit, that last piece may be more important than it first seems…

The inbox is the new home screen

The inbox is where it’s really at. I am, of course, heavily biased, but I feel WeChat’s is the best in class. I’d even go as far as to say it’s an overlooked piece of genius in the app. Some key improvements (compared to the inboxes we’re used to in email and SMS apps) include:

  • Stickyability: If I want to stay on top of a particular thread in the inbox (whether it represents a person, a group, an official account, or another feature exposed here), I can “sticky” the thread to the top of the inbox.
  • Mutability: I can mute notifications from any thread, but the thread will still pop up in the inbox, as any thread does, only with an indeterminate red badge instead of a numbered one.
  • Killability: If I don’t want to receive messages from something anymore, it’s two taps to kill it.
  • Hierarchy: News and promotions can be pushed to me through official accounts, but when they arrive, they just make the “Subscriptions” category pop up and show me the latest headline without interfering with other messages. When a service has a real reason to send me, personally, a message, it can pop out and appear in the main inbox. I find this approach superior to Gmail’s “sidelining” of messages into separate inboxes.[2]
  • Status Items: Persistent processes/statuses can be displayed in a special cell at the top. These include things like being logged into a web/desktop client, using WeChat to authenticate on Wi-Fi, playing a song, or migrating data between phones.
  • Searchability: The search bar on the main screen searches not only my contacts but my groups, chat history, favorited content, articles on the Web, my news feed, and names of features in the app.
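
A rough sketch of what such an inbox might look like as a data model, in Python. This is an illustration under my own assumptions, not WeChat's implementation: the `Thread` fields and the `inbox_order` and `collapse` helpers are hypothetical names chosen to mirror the sticky, mute, and hierarchy behaviors listed above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Thread:
    name: str
    last_activity: int      # e.g. a Unix timestamp of the latest update
    unread: int = 0
    sticky: bool = False
    muted: bool = False
    category: str = ""      # e.g. "Subscriptions"; empty means top level

    def badge(self) -> str:
        """Numbered badge normally; a plain dot for muted threads."""
        if self.unread == 0:
            return ""
        return "•" if self.muted else str(self.unread)


def inbox_order(threads: List[Thread]) -> List[Thread]:
    """Sticky threads first, then most recent activity first."""
    return sorted(threads, key=lambda t: (not t.sticky, -t.last_activity))


def collapse(threads: List[Thread], category: str) -> Thread:
    """Fold every thread in a category into one summary row."""
    members = [t for t in threads if t.category == category]
    if not members:
        raise ValueError(f"no threads in category {category!r}")
    latest = max(members, key=lambda t: t.last_activity)
    return Thread(name=category,
                  last_activity=latest.last_activity,
                  unread=sum(t.unread for t in members))
```

Note that nothing in `Thread` says what opens when you tap it. The controls apply equally to a person, a group, a brand, or a background process.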

It is telling, then, that in all localizations, the name of the first tab in the app is not “Chats” or “Inbox” (as in other messengers), but rather just the name of the app.

Indeed, the cornerstone of the whole experience is effectively a common, semi-hierarchical stream of messages, notifications, and news with a consistent set of controls for handling them. It’s no stretch to see WeChat and its ilk not as SMS replacements, but as nascent visions of a mobile OS whose UI paradigm is, rather than rigidly app-centric, thread-centric (and not, strictly speaking, conversation-centric).

When you think about it this way, the things listed there in my inbox don’t need to be conversations per se. But everything there, most abstractly, is something that can send me updates and notifications, will change in position when it does so, retains a read/unread status, and, most essentially, allows me, the user, the aforementioned modes of control.

And if we really run with this idea to its extreme, what actually might appear when I tap on a cell in the inbox doesn’t matter — I could see a conversation, a song or video, news headlines, a map showing me my route, a timer, or a sub-group of other such threads. Anything, really. Though I guess it’d be best when it’s at least something dynamic or based on a service (I certainly wouldn’t want to access my calculator or camera this way).

Rise of the tortilla chip app

This term “app” is rather old, yet it only entered common parlance with the proliferation of smartphones. This is no coincidence. The app paradigm introduced on smartphone OSes circa 2007 was a radical improvement over what we’d had on the desktop. For the first time, software was easy to install, even easier to delete, and was guaranteed to not totally screw with your system (due to sandboxing/permissions models).

At the time, smartphone apps were envisioned as baby brothers to desktop apps. On iOS, apps like Mail and Calendar were designed to evoke their Mac versions. Apple came out with pocket-sized editions of apps like Pages, GarageBand, and iMovie. For the first few years, setting up an iPhone even required plugging it into a desktop and syncing with that monstrosity known as iTunes.

Though some apps are indeed mini-desktop apps that make full use of the supercomputer I carry in my pocket, well over half fall into another category. These apps are just a vessel for a steady stream of news, notifications, messages, and other timely info ultimately residing in a backend service somewhere. They don’t really do much on their own. It’s much like how a tortilla chip’s main value is not so much in its appeal as a chip, but as a cheese and chili delivery mechanism.

The smartphone OS we use is still largely based on the idea of my phone being a mini-desktop, rather than, well, an information nacho, if you will. Consequently, if you’re making one of these apps, your app must give me something new daily (or even more frequently), or else it has no reason to live. Its information would be better shown to me via another app I do check often, like a social news feed or a messaging app. The only recourse the OS affords these apps in avoiding such a fate is the rather blunt instrument of push notifications (and things like Today widgets or Android home screen gadgets).[3]

The other ways smartphone OSes are failing us

Once you come to rely on WeChat in China, it can seem a bit like its own separate environment. After all, within it are not only my chats, but my social news feed, my news and blog subscriptions (many only available via the app), my digital wallet, and my reading list. It even directly reads my step count from the various Bluetooth devices my friends and I use. It can scan QR codes, something my OS should do, but doesn’t (more on this later). It can recognize songs being played, even books and other objects from photos. And you can pretty easily sling all types of data between these different areas of the app, in the ways you’d expect.

Sometimes it reminds me of those awkward transitional days in the early 1990s when one might launch Windows or other shell environments from DOS and then occasionally drop back out to do other stuff. That’s what switching out of WeChat to my homescreen and into other apps is slowly heading toward.

It should be no surprise, then, when I say it feels like my OS just isn’t doing much for me lately. How is that? These days, a smartphone OS’s job, aside from the low-level drudgery we take for granted (managing memory and thread pools and the like), is to provide some common infrastructure and higher-level services that apps can rely on so that apps can focus on doing what they do best. And, well, it seems like there are so many areas where the OS is just not making much of a difference.

Each item below seems like a petty, inconsequential annoyance — to the point that I feel like some kind of strange, cranky, millennial version of Andy Rooney for even writing it — but they quickly add up!

Notifications — When I glance at my homescreen, there are red dots splattered everywhere. My eyes dart first toward a few I can interpret. WeChat, naturally, then Mail. My inboxes have 8,108 unread messages, but I surely would notice if it changed to 8,109.

My “Social” folder has four, one from Facebook, which I will check, and three from other stray apps displaying “1”s. I’m not sure what those apps are telling me, or what I’ll need to do after opening the app to clear the dot. I think one might be from when my friend checked me in on Foursquare at a bar a few weeks ago on a trip back to SF, a fact I was aware of because I was standing next to him when he did it, and because the notification already appeared on my phone then. Another might be Instagram, which just throws up a red badge from time to time when it feels lonely. But I mainly know that if my “Social” folder is displaying a 3, there’s probably nothing to see, and a 4 or a 5 may deserve checking.

The system Messages app, which I still keep on my home screen, is showing 39 unreads, mostly one-time passwords, transaction notifications from my bank, and spam. Messages, for most here, serves no other purpose. My News folder displays the sum of a few apps that are trying to tell me something. Airpocalypse is displaying the current AQI of 93 for Guangzhou in its badge.

Starbucks has a ‘1’. What’s that? Have I got a free coffee credit to redeem? Possibly a scone? Let’s see. No, it’s an unread message within the app’s own inbox saying, “Welcome to the Starbucks App!” from 43 days ago. Christ on a crutch.

Even worse than those notifications gazing at me longingly from my homescreen are those that interrupt me. When I install a new app, I’m usually prompted right off the bat to enable notifications for it. I’m taking a risk in doing so, not knowing how often or for what they’ll be sent. When I’m interrupted by a superfluous notification on my iPhone (or worse, on my Apple Watch), there’s no quick way to tell it, “Shut up, and never bother me with this sort of thing again.” I must fish through Settings, find the app, and tweak it there. It is often easier to delete the app entirely. MIUI and some other flavors of Android at least allow me to mute a given app’s notifications right after seeing one. Many apps offer settings to specify what sort of things merit notifications, but they’re often located in different places and not worth the trouble.

On iOS, if I miss a critical notification on the lock screen because I actually wanted to unlock my phone to make a call or look something up, until recently, there was no way to quickly go back and find what it was. However, iOS 9’s notifications drawer, like Android’s, now defaults to sorting notifications reverse-chronologically, instead of grouped by app — an advance five years in the making.

Lastly, things become even more clunky across multiple devices. When I get home from work and crack open my personal laptop, I am notified a second time of all the Facebook messages I received during the past couple days, all of the LinkedIn invitations I already saw (because they sent me an email and another push on my phone), and all of my friends’ birthdays.

QR codes — When I left the U.S., QR codes were a joke. Putting them on things was a way to tell people you’re a douche, like using lots of hashtags or wearing a Bluetooth headset. They were once this way in China, too, until WeChat doubled down on them. Now, they’re used for people, group chats, brands, payments, login, and more. They’re in plenty of other apps, as well. In a place where everyone has adopted them and knows how to scan them, they’ve become a wonderful, fast way to link the offline and online worlds that saves untold amounts of time. But they have a few downsides. One is that they look like robot barf. The other is that, at least here, if you scan a code in the wrong app, you’ll get a webpage telling you to go install the right app (if not something totally inscrutable). Something that was once defined as an open standard is now non-interoperable. I predict great things for Facebook and Snapchat’s de-uglified take on QR codes. Still, I wish my phone’s OS could scan any such code (or detect them in photos) and do the right thing, but it seems the window of opportunity has passed for this.

App distribution — Aside from the obvious gripes — the app store’s poor discovery mechanisms and inconsistent approval process — I’d written an aside in my last piece about the ways iOS’s App Store misses the mark in China. In short, it’s dog slow and doesn’t support QR codes (which appear in every app advertisement here).

Apps are too big — Not to mention, apps are just too darn big these days. Twitter, an app that displays 140-character messages, weighs in at 72MB. Bigger apps are less likely to be downloaded on data plans, or even on bad Wi-Fi connections. And they’re much more likely to be deleted, forcing users to go through the setup process again every time they re-install them. Apple’s tried to solve this problem via app thinning and on-demand resources, but it hasn’t seemed to make a difference yet. David Smith astutely summed up the issue in his post “16GB is a bad experience”, and, I would add, this experience is one disproportionately had by mobile users in the developing world.

Contacts and social graph — The idea behind the Contacts app (beyond giving me a way to tag phone numbers with names) is to act as a central repository where a single entry for a person can be linked to every kind of phone number, address, or ID I know for them. The iOS version has roots in the Address Book in OS X and NeXTStep. In theory, I should be able to invoke it in an app to store or retrieve a person’s info for the task at hand, rather than maintaining the same contacts in a bunch of separate app-specific databases. In practice, well, it doesn’t really work that way. The concept of a person as they exist in Facebook or WeChat is rather disjointed from their profile elsewhere.

Not only this, but adding people could be far better. Something clicked in my mind the first time I met a cute girl and she asked to scan my QR code (rather than type in my phone number or search for me on Facebook). Once I got into the habit of adding just-met friends and colleagues via QR code (or Bluetooth) I never wanted to add someone any other way. Why can’t I pull out my phone and, with a swipe from the lock screen, add someone I’ve just met to my phone’s contacts, with whatever phone numbers, websites, or messaging app usernames they’ve chosen to expose to me?

Connectivity — I wrote before about how apps here get around people’s reluctance to use their data plans. I’d mentioned WeChat, Alipay, and Xiaomi’s attempts to make their Wi-Fi-dependent users’ lives easier. This is as big a problem in China as it is in many other developing countries. It’s an issue the OS could address more directly, whether it’s improving the process for authenticating on public hotspots or giving me better ways to monitor my usage.

Authentication — When I open most apps for the first time, they either make me sign up for a new account with my email, use Facebook or other third-party services to log in, or, as is increasingly common, use my phone number to send me a one-time password. These are super clunky. Apps should be already logged in the first time I open them. There should be some flexible concept of identity that the OS can provide to apps immediately without asking, and then, with permission, supplement with further details. If users must switch identities, maybe a Mozilla Persona-like system could be adopted. Anything would be better than the mess that is app login now.

Data interop — My apps are terrible at sharing data. Lots of friends send me screenshots of articles, chats, tweets, even other apps as a way to share the underlying information. It’s particularly annoying when compression artifacts make the text illegible, or I want to go read the rest of the article or engage with the thing in the screenshot somehow. If I open a page in Facebook and want to share it in Twitter, I have to choose “Open in Safari,” reload the page, and do it from there (though Facebook clearly knows exactly what it’s doing in that instance.) I wish the data in my apps was more atomic and could be freely shared, persisted offline, and searched in a consistent way. But this sort of thing has been a pipe dream since OpenDoc and OLE, so maybe it’s just one of those things you should never do.

Offline storage and storage management — As a consequence of people being so reluctant to use their data plans, apps here are big on offline storage. All the music and movie apps do it, as do news apps and the third-party browsers popular here. Some give users detailed interfaces to manage their storage, even showing little pie graphs. I like this level of control, and I wish all my apps had it. I’d prefer not to think about storage, but if I have to clear data, I’d rather do it from a central UI, rather than going into each individual app to manage the things it has saved (or deleting the app out of frustration).

Payments — I wrote before about how nifty online payments are in China. Any website or app that takes my money pretty much uses Alipay or WeChat Wallet. In the U.S., I have to type in and update credit card and address info for every new app I install. We have OS-provided solutions in Apple Pay and Android Pay, but these seem to be accepted in few places and strictly NFC-based, limiting potential network effects. The nice thing about the solutions here is just how many combinations of scenarios and hardware they’ve covered, from expensive POS equipment that just requires me to hold my phone up to scanning a pre-generated QR code the merchant has printed on a vinyl mat to web payments to third-party app payments to peer-to-peer payments between normal users who aren’t connected. Whether you’re an app startup or a mom-and-pop convenience store, you have no excuse to not accept one of these solutions. And, as a user, there’s no place where it’s more frictionless to part with your money. When will blowing my hard-earned dough in U.S. apps be this easy?

The coming meta-platform war

So the meta platforms — WeChat, Facebook, LINE, and the like — have come and addressed many of the pain points above. They’ve delivered solutions neither the open Web nor those behind the closed app store model were coordinated enough, thoughtful enough, or perhaps incentivized enough to produce.

Originally, the whole tradeoff we were promised with locked-down devices and app stores was that things would be much nicer inside the “walled garden.” But over the years, as so many weeds sprang up there, others came and built another wall with another garden inside of it, with yet another gatekeeper to deal with.

In the 1990s, OS makers shook in their boots over the prospect of web browsers disintermediating them, but somehow it’s taken more than another decade for the next challenger to emerge in the peculiar form of messaging apps. And though they’re still quite far from wholly replacing the high-level features OSes offer to users and app developers, we can clearly see the beginning of this encroachment.

So here we are. What do we do?

A little less conversation, a little more action

I don’t know about you, but here’s what I want to see happen.

I want the first tab of my OS’s home screen to be a central inbox half as good as my chat app’s inbox. I want it to incorporate all of my messengers, emails, news subscriptions, and notifications and give me just as great a degree of control in managing them. No more red dots spattered everywhere, no swiping up to see missed notifications. Make them a bit richer and better-integrated with their originating apps. Make them expire and sync between my devices, as appropriate. Just fan it all out in front of me and give me a few simple ways to tame it. I’ll spend most of my day on that page, and when I need to go launch Calculator or Infinity Blade, I’ll swipe over. Serve me a tasty info burrito as my main course instead of a series of nachos.

The next time I’m back stateside, I want my phone to support something like Chrome Apps, while retaining a few useful properties of apps instead of giving me big, weird icons that just link to websites. I want to sit down at TGI Fridays[4] and scan a QR code at my restaurant table and be able to connect to their Wi-Fi, order, and pay without having to download a big app over my data plan, set up an account, and link a card once it’s installed. Imagine if I could also register at the hospital or DMV in this fashion. Or buy a movie ticket. Or check in for a flight.

As a user, I want my apps — whether they’re native or web-based pseudo-apps — to have some consistent concept of identity, payments, offline storage, and data-sharing. I want to be able to quickly add someone to my contacts in-person or from their website. The next time I do a startup, I want to spend my time specializing in solving a specific problem for my users, not getting them over the above general hurdles.

I don’t actually care how it happens. Maybe the OS makers will up their game. Maybe Facebook, Telegram, or Snapchat can solve these problems for me by bolting solutions onto their messaging products. Hell, maybe Chrome or UC Browser will do it. Or maybe it’ll be delivered in some magic, blockchain-distributed, GNU-licensed, neckbeard-encrusted solution that the masses, in a sudden epiphany, repent to. As they say at Pizza Hut, there’s more than one way to skin a cat.

But more than anything, rather than screwing around with bots, I want the tech industry to focus on solving these major annoyances and handling some of the common use cases that my phone ought to do better with by now.

SEE ALSO

If this is the first essay you’ve read that exposed you to the wacky and wonderful world of Chinese software, you may enjoy my 2014 summary on the topic and its 2016 followup.

ACKNOWLEDGEMENTS

Thanks to Kevin Shimota, Jeff Dlouhy, Andrew Badr, Jon Russel, Muzzammil Zaveri, Sonya Mann, Stephen Wang, Hank Horkoff, Mark Evans, Michael Belfrage, and Jake Rozin for reviewing drafts of this essay.

FOOTNOTES
  1. One might object to the tap-counting approach above with “But what about speech recognition? Why can’t it be like Jarvis in Iron Man?” First, you are not Tony Stark. Second, speech recognition UIs are only economical for a given task when describing the task orally is faster than the equivalent tapping. I’ve only ever had one use-case for Siri: when I’m leaving the laundromat and tell her “Set a timer for 35 minutes” so that I can come back to put my clothes in the dryer. That is to say, it takes longer to set a timer by tapping than it takes to utter the words “Set a timer.” Performing complex, multi-choice tasks like ordering a pizza with only a speech UI would take several multiples of the time it takes to do them using a well-optimized conventional UI, as we’ve seen above (particularly if I’m waiting for a synthesized voice to rattle off the response each time). In conversations longer than a single command, using such UIs can feel less like being Iron Man and more like speaking to the sloths in Zootopia. The only case where I see them being useful is when I’m not able to use my hands.
  2. I’ve been wanting to pitch a feature that lets users put any thread into folders. This would let users tame their growing number of group chats, decide which ones should have priority in their stream, and make it harder to lose track of them. I fear it would be too complex, though.
  3. There are actually a few decent choices for presenting timely info snippets from disparate sources/apps. You could choose a conventional inbox, a modern chat app-style inbox, as described here, dashboard widgets/tiles (as in Windows’ Metro-style UI), Facebook-style filtered news feeds, unfiltered Twitter-style feeds, or Google Now-style cards. But I think the chat-style inbox as detailed here is the most versatile.
  4. Shut up. It’s exactly the kind of food you want to eat when you’ve been gone for a year.

A version of this post appeared on DanGrover.com.

Dan Grover is a product manager at WeChat. The opinions expressed here are his own and do not reflect those of his employer.
