Voice assistance and privacy

Voice assistant technologies are heavily hyped right now. However, one of the most frequently voiced concerns is privacy: devices listen to us all the time and document everything. For example, Google keeps every voice search its users make. It uses these recordings to improve its voice recognition and to provide better results. Google also provides the option to delete them from your account.

A few questions come to mind: how many times do companies go over your voice messages? How often do they compare them with other samples? How often does the service actually improve thanks to them? I will try to guess at answers to these questions and suggest solutions.

A good example of a privacy-conscious approach is Snapchat. Messages in Snapchat are controlled by the user, and they also disappear from the company’s servers. Considering the age group it targeted, this was a brilliant decision: teenagers don’t want their parents to know what they do, and generally they want to “erase their sins”. Having things erased is closer to a real conversation than a chat messenger is.

Now imagine this privacy solution in a voice assistant context. Even though users want the AI to know them well, do they want it to know them better than they know themselves?

What do I mean by that? Some users wouldn’t want their technology to frown upon them and criticize them. Users also prefer data that isn’t used to punish them for driving fast or being unhealthy, which is a model now led by insurance companies.

Having spent a lot of time in South Korea, I have experienced a lot of joy rides with taxi drivers. The way their car navigation works is quite grotesque. Imagine a 15-inch screen displaying a map that goes blood red, with obnoxious sound effects, whenever the driver passes the speed limit.

Instead, users might prefer a supportive system that can differentiate between public information that can be shared with the family and private information that is more comfortably consumed alone. When driving a car, situations like this are quite common. Here is an example: a user is driving with a friend in the car. Someone calls, and because the call will be answered over the car’s sound system, the driver has to announce that someone else is with them. The announcement defines the context of the conversation and so prevents content or behavior that might be private.

The voice assistant will need contextual information so that it can figure out exactly what scenario the user is in, and how and when to address them. But we will probably need to tell it about our scenario in some way too. Your wife can hear that you are with someone in the car but can’t quite work out who. So she might ask “are you with the kids?”.

Voice = social

Talking is a social experience; most people don’t do it when they are alone. Remember the initial release of the Bluetooth headset? People in the street thought that you were speaking to them when you were actually on the phone. Another example is the hands-free car system. Some people thought that the guy sitting in the car was crazy because he was talking to himself.

Because talking is a social experience, we need to be wary of who we speak to and where, and so does the voice assistant. I know a lot of parents who have embarrassing stories of their kids blabbing things they shouldn’t say next to a stranger. Many times it’s something that the parent said about a person or some social group. How would you educate your voice assistant? By creating a scenario where you actively choose what to share with it.

Companies might aspire to get as much data as possible, but I doubt that they really know how to use it. In addition, it doesn’t match what consumers expect. From the user’s perspective, they probably want their voice assistant to be more of a dog than a human or a computer. People want a positive experience with a system that helps them remember what they don’t remember, and that forgets what they don’t want to remember. A system that remembers that you wanted to buy a ring for your wife but doesn’t say it out loud next to her, and reminds you in a more personal way. A system that remembers that your favorite show is back but doesn’t say it next to the kid because it’s not appropriate for their age.

A voice assistant that has tact.
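To make the idea of tact a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `Reminder` type, the `hide_from` field and the `split_by_tact` function are hypothetical names, and the sketch assumes the assistant already knows who is in the room, which is a hard problem in its own right.

```python
from dataclasses import dataclass, field


@dataclass
class Reminder:
    text: str
    # Hypothetical field: people who should not hear this reminder.
    hide_from: set = field(default_factory=set)


def split_by_tact(reminders, present, owner):
    """Split reminders into those safe to say aloud and those to hold back.

    `present` is the set of people in the room and `owner` is the user.
    A reminder is held back if anyone it should be hidden from is present.
    """
    others = set(present) - {owner}
    aloud, held = [], []
    for reminder in reminders:
        if reminder.hide_from & others:
            held.append(reminder)   # deliver later, in a more personal way
        else:
            aloud.append(reminder)
    return aloud, held


reminders = [
    Reminder("Buy the ring", hide_from={"wife"}),
    Reminder("Your favorite show is back", hide_from={"kid"}),
    Reminder("Take out the trash"),
]
aloud, held = split_by_tact(reminders, present={"me", "wife"}, owner="me")
```

With the wife in the room, the ring reminder is held back while the other two can be spoken. The held reminders are not forgotten; they simply wait for a better moment.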

Being a dog is probably the most a voice assistant can be nowadays. It will progress, but in the meantime users will settle for something cute like Jibo, which has enough charm to be forgiven when it makes a mistake and can at least learn not to repeat it. If a mistake does happen, for example if it says something to someone else, users will expect a report about the things that were told to other people in the house. The voice assistant should take some responsibility.

Mistakes can happen in privacy, but we need to know about them before it is too late.

Using Big Data

The big promise of big data is that it could heal the world by using our behavior. A growing number of systems are built to cope with the abundance of information. Whether they cope or not is still a question. It seems like many of these companies are in the business of collecting for the sake of selling. They don’t really know what to do with the data; they just want to have it in case someone else knows what to do with it. Therefore I am not convinced that the voice assistant needs all the information that is being collected.

What if it saved just one day of your data, or a week? Would that be contextual enough?

Last year I was fascinated by a device called Kapture. It records everything around you at any given moment. But if you notice that something important happened, you can tap it and it will save the previous 2 minutes. Saving things retrospectively, capturing moments that are magical before you even realized they were so: that’s incredible. You effortlessly collect data and you curate it, while all the rest is gone. Leaving voice messages to yourself, writing notes, sending them to others, having a summary of your notes, what you cared about, what interested you, when you save most: all of these scenarios could be the future. The problem it solved for me was how to capture something that is already gone whilst keeping my privacy intact.
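The save-the-last-two-minutes idea can be sketched as a rolling buffer. This is my own illustration in Python, not how Kapture is actually built: the `RetroBuffer` class, its method names and the one-second text "chunks" standing in for audio are all assumptions.

```python
from collections import deque


class RetroBuffer:
    """Keep only the most recent `window_s` seconds of audio chunks in memory."""

    def __init__(self, window_s=120.0):
        self.window_s = window_s
        self._chunks = deque()  # (timestamp, chunk) pairs, oldest first
        self.saved = []         # clips the user explicitly chose to keep

    def record(self, timestamp, chunk):
        """Add a chunk and drop everything older than the window."""
        self._chunks.append((timestamp, chunk))
        cutoff = timestamp - self.window_s
        while self._chunks and self._chunks[0][0] < cutoff:
            self._chunks.popleft()

    def tap(self):
        """Save a copy of whatever is currently inside the window."""
        clip = [chunk for _, chunk in self._chunks]
        self.saved.append(clip)
        return clip


buf = RetroBuffer(window_s=120.0)
for second in range(300):  # five minutes of one-second chunks
    buf.record(float(second), f"chunk-{second}")
clip = buf.tap()           # only the last two minutes survive
```

Nothing outside the window exists anymore by the time you tap, so there is nothing left to leak or sell. The same pattern would work with a window of a day or a week instead of two minutes.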

Social privacy

People are obsessed with looking at their own information in the same way they are obsessed with looking in the mirror. It’s addictive, especially when it comes as a positive experience.

In a social context the rule of “the more you give, the more you get” works, but it breaks down in software. Maybe at some point in the future that will change, but nowadays software just doesn’t have the variability and personalization required to actually make life better for people who are more “online”. Overall the experience is more or less the same whether you have 10 friends on Facebook or 1000. To be honest, it’s probably worse if you have 1000. The same applies to Twitter or Instagram. Imagine what Selena Gomez’s Instagram looks like. Do you think that someone at Instagram thought of that scenario, or gave her more tools to deal with it? Nope. It seems like companies talk about it but rarely do anything about it, and that definitely applies to voice data collection.

It seems clear that the amount users reveal isn’t justified by, and doesn’t power, the result they get. One of the worst user experiences is, for example, signing into an app with Facebook. The user is led to a screen that asks them to grant access to everything… and in return they are promised that they can write down notes with their voice. Does that have anything to do with their address or their online friends? No. Information is too cheap nowadays, and users have gotten used to just pressing “agree” without reading. I hope we can standardize value in return for data, while breaking information down in the right way.

Why do we have to be listened to every day and be documented if we can’t use it? Permissions should be flexible, and we should have a way to make the voice assistant stop listening when we don’t want it to listen. Leaving a room makes sense when we don’t want another person to listen to us, but what will that look like in a scenario in which the voice assistant is always with us? Should we tell it “stop listening for five minutes”?
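A “stop listening for five minutes” command could be as simple as a timed gate in front of the microphone. The sketch below is a hypothetical Python illustration: `ListeningGate` and its methods are names I made up, and the current time is passed in explicitly so the behavior is easy to follow (a real device would more likely read a monotonic clock).

```python
class ListeningGate:
    """Discard audio while the user has asked the assistant not to listen."""

    def __init__(self):
        self._paused_until = 0.0  # time (in seconds) at which listening resumes

    def pause(self, now, minutes):
        """Stop listening for the given number of minutes, starting at `now`."""
        self._paused_until = now + minutes * 60.0

    def resume(self):
        """Start listening again immediately."""
        self._paused_until = 0.0

    def is_listening(self, now):
        return now >= self._paused_until

    def handle_audio(self, now, chunk):
        """Return the chunk for processing, or None if it must be dropped."""
        return chunk if self.is_listening(now) else None


gate = ListeningGate()
gate.pause(now=1000.0, minutes=5)                  # "stop listening for five minutes"
during = gate.handle_audio(1100.0, "private talk")  # dropped, never stored
after = gate.handle_audio(1300.0, "hello again")    # five minutes are up
```

The important design choice is that audio arriving during the pause is dropped rather than stored and ignored, which is the software equivalent of actually leaving the room.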

Artificial intelligence is, by its very name, related to a brain, but maybe we should consider its use or creation to be more related to a heart. Artificial Emotional Intelligence (A.E.I.) could help us think of the assistant differently.

Use or be used?

How does it improve our lives, and what is the price we need to pay for it? In “Things I would like to do with my Voice Assistant” I talked about how useful some capabilities would be compared with how much data each would need to become a reality.

So how far is the voice assistant from reading emotions, having tact and syncing with everything? Can this happen with privacy in mind? Does your assistant snitch on you, or does it tell you when someone was sniffing around and asking weird questions? It’s not enough to choose methods like differential privacy to protect users. Companies should really consider the value of loyalty and of creating a stronger bond between the machine and the human rather than between the machine and the company that created it.

Further into the future, we could get to scenarios like these:

There could also be some sort of behavioral understanding mechanism that mimics a new person who just met you in a pub. If you behave in a specific way, that person will probably know how to react to you in a supportive way even though they didn’t know you before. In the same way, a computer that knows these kinds of behaviors can react to you, even more so assuming there are sensors that tell it your physical status and recognize facial patterns and tone of voice.

Another good example is doctors, who can often diagnose a patient’s disease without looking at their full health history. Of course it’s easier to look at everything, but they would usually do that only when they need to figure out something that is not simple. When things are simple it should be faster and, in tech’s case, more private.

Summary

There are many ways to make voice assistants more private whilst helping people trust them. It seems like no company has adopted this strategy yet. It might require that the company not rely on a business model driven by advertising: a company that releases into the wild a machine that becomes a friend, one that has a double duty to the company and to the user, but that is at least truthful and open about what it shares.