Can we trust AI if we don’t know how it works?

Topics
AI
Author
Dennis Timmers
Publication Date
9 November 2020


Thanks to some impressive advances in AI, we now live in a futuristic world. The first self-driving cars are on the road, voice-based assistants such as Amazon Alexa and Google Home have become ordinary, and you can even turn yourself into an anime character. Great strides have been made, yet a basic issue has been overlooked. In particular, I mean one show-stopper: no one really understands how AI algorithms make their decisions. This makes AI vulnerable to attacks and raises the question of whether we can trust it.

Meet AI's worst enemy: duct tape


One example of a possible attack targets the AI in self-driving cars that identifies traffic signs on the road. McAfee’s threat research division showed that Tesla’s Advanced Driver Assist Systems (ADAS) can be tricked with duct tape. By simply adding a small piece of tape to a 35 mph speed limit sign, they got ADAS to read the sign as 85 mph and speed up! Luckily, the Tesla attack was performed on 2016 software, which has since been patched.

Other AI researchers showed that adding a few pieces of duct tape to a stop sign can make the AI believe it is a 45 mph speed limit sign. Their research is a bit scarier, because they created a general attack algorithm called Robust Physical Perturbations which has the potential to attack any AI model.
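To give a feel for how such attacks work, here is a minimal sketch of the Fast Gradient Sign Method (FGSM), a much simpler, digital-only cousin of Robust Physical Perturbations. The model, image and label below are placeholders, and the real attack additionally optimises a printable sticker pattern that survives changes in distance and viewing angle; this is only an illustration of the general idea.

```python
# Minimal FGSM sketch (assumes PyTorch); `model`, `image` and `true_label`
# are placeholders for any image classifier and one of its inputs.
import torch
import torch.nn.functional as F

def fgsm_attack(model, image, true_label, epsilon=0.03):
    """Return a perturbed copy of `image` that the model is more likely to misread.

    image:      tensor of shape (1, C, H, W) with values in [0, 1]
    true_label: tensor of shape (1,) holding the correct class index
    epsilon:    maximum change allowed per pixel
    """
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), true_label)
    loss.backward()

    # Nudge every pixel in the direction that increases the loss, so the
    # change stays tiny to our eyes but the prediction can flip to another class.
    perturbed = image + epsilon * image.grad.sign()
    return perturbed.clamp(0, 1).detach()
```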

A new way of working?

The reason these attacks exist, and can be successful, is the standard way of working in AI. You start by designing a new AI model with a huge number of parameters; the latest models have millions upon millions of them. Then a computer is programmed to learn the optimal set of parameters for a specific task, such as identifying traffic signs. Once the computer is done, the millions of parameters are tuned to the task and the AI solution is ready for use. This means it is no longer possible for humans to understand why the AI model chooses to identify a traffic sign as a 45 mph sign instead of a stop sign. We only know that it works with an accuracy of 99.5%.
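As a rough illustration of that workflow, the sketch below defines a small traffic-sign classifier in PyTorch and lets the computer tune its parameters. The dataset loader, the architecture and the hyperparameters are placeholders, not taken from any of the systems mentioned above.

```python
# A minimal sketch of the standard workflow, assuming a labelled traffic-sign
# dataset exposed as `train_loader` yielding (N, 3, 64, 64) image batches.
import torch
import torch.nn as nn

# A small convolutional classifier; real sign classifiers are far larger,
# but even this toy model already has a few million parameters.
model = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(64 * 16 * 16, 256), nn.ReLU(),
    nn.Linear(256, 43),            # e.g. 43 sign classes, as in the GTSRB dataset
)
print(sum(p.numel() for p in model.parameters()), "parameters")

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(10):
    for images, labels in train_loader:       # placeholder data loader
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()                       # the computer tunes the parameters
        optimizer.step()

# After training, the millions of tuned weights carry no human-readable
# explanation of *why* a given image is classified as a stop sign.
```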

The good news is that these vulnerabilities have caused a surge of interest in new fields of AI such as Explainability and Adversarial Machine Learning. The first promising techniques to patch these vulnerabilities, like Shapley Additive Explanations (SHAP) and XRAI, are appearing. However, these solutions still adhere to the standard way of working, so I am not sure how far they can take us in defending against all sorts of AI attacks.
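To make that concrete, here is a minimal sketch of how SHAP is typically used. A small tabular model stands in for the traffic-sign network, since applying SHAP to a deep image model needs a heavier setup; the dataset and model below are illustrative placeholders only.

```python
# Minimal SHAP sketch on a toy tabular model (assumes the `shap` and
# `scikit-learn` packages); the model and data are placeholders.
import shap
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

X, y = load_diabetes(return_X_y=True, as_frame=True)
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

explainer = shap.TreeExplainer(model)            # exact Shapley values for tree models
shap_values = explainer.shap_values(X.iloc[:100])

# Each value estimates how much a feature pushed one prediction up or down,
# which is the kind of post-hoc explanation these new techniques provide.
shap.summary_plot(shap_values, X.iloc[:100])
```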

Do we need a whole new methodology for developing AI models to build more robust AI applications? That’s a question left for you to answer. You can leave your comment on my LinkedIn page.
