To AI or not to AI
Something in the air reminds me of 1999/2000, when everything suddenly became “.com.” It feels uncannily similar today, when everything is magically AI-driven. Are these companies truly AI-led?
Over the last six months, at Stellaris Venture Partners, my colleagues and I have seen more than 100 companies that claim to be “AI-driven.” They come in all shapes and forms - recommending courses to a student, cross-sell/ up-sell opportunities for a retail consumer, credit scoring in a lending company, lead scoring for a B2B business...and the list is unending. Phrases like chatbots, NLP, neural networks are becoming common lingo, and it is just a matter of time before your cab driver gives you gyan on these topics.
The challenge is that many companies that profess to be AI-driven are anything but.
So why do they do it? Sometimes it is willful mispositioning. As per Gartner, 2016 was the year when “machine learning” was at its peak of inflated expectations. Many companies - both large and small - want to benefit from the hype, and hence the earlier comparison with the dot-com era.
This trend is dangerous for potential customers, entrepreneurs and investors. Customers may end up buying products with inflated expectations, only to be disillusioned later, leading to high churn. Investors are prone to hype cycles - they do apply their rational minds, but when you hear the same noise a thousand times, you begin to believe it. Finally, it is the entrepreneurs who will pay the real price if they position their offering as more than what it is - life does catch up.
But not everyone is mispositioning wilfully. The terms machine learning and AI (used interchangeably here) are not well understood, and people often apply them to problems or businesses inappropriately.
The rest of this article is an attempt to classify different problem-solving approaches or algorithms in layperson terms (I ask for forgiveness from computer scientists in advance!). Of the following five approaches, in my view, only the fifth truly constitutes machine learning. The first four are not so .ai.
For lack of a better term, I am calling the first category “classical algorithms”. This includes algorithms that solve a problem optimally and can do so in tractable time (i.e. the time to solve a problem does not scale exponentially with its size).
Think of any ride-sharing app, where you order a cab and put in your destination.
A city can be thought of as a graph with intersections as the nodes and roads as the edges, and what the ride-sharing company needs to do is find the shortest path from point “A” to point “B.” Such a problem can be solved using the shortest path algorithm by Edsger W. Dijkstra. It solves the problem optimally, i.e. it tells you the shortest path, and does so in a time that grows roughly in proportion to the number of roads and intersections (up to a logarithmic factor with a good implementation) rather than exponentially.
The algorithm here is “fixed” and is not “improving” with time.
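For readers who like to see code, here is a minimal Python sketch of Dijkstra's algorithm. The road network `ROADS`, its intersection names and its distances are made up for illustration, and a real ride-sharing system would work with far larger maps and live traffic data.

```python
import heapq

def shortest_path(graph, source, target):
    """Return (distance, path) from source to target using Dijkstra's algorithm.

    graph maps each node to a dict of {neighbor: non-negative edge length}.
    """
    dist = {source: 0.0}          # best known distance to each node
    prev = {}                     # previous node on the best known path
    heap = [(0.0, source)]        # priority queue ordered by distance
    visited = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue              # stale queue entry, already settled
        visited.add(node)
        if node == target:
            break
        for neighbor, length in graph.get(node, {}).items():
            nd = d + length
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                prev[neighbor] = node
                heapq.heappush(heap, (nd, neighbor))
    if target not in dist:
        return float("inf"), []   # target cannot be reached
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return dist[target], path

# Hypothetical road network: intersections A-E, distances in km.
ROADS = {
    "A": {"B": 4, "C": 2},
    "B": {"D": 5},
    "C": {"B": 1, "D": 8},
    "D": {"E": 3},
    "E": {},
}
```

Calling `shortest_path(ROADS, "A", "E")` on this toy map returns a distance of 11 along the route A, C, B, D, E. The procedure gives the same answer on the thousandth call as on the first; nothing in it learns.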
Rule Based or Decision Tree Systems:
Many applications have predefined “states”, “inputs” and “outputs”; such systems are called rule-based systems or finite-state automata.
For example, consider a chatbot deployed by an electronics company for customer support.
When a user opens the application because a TV is not working, the chatbot can guide the user through different issues to diagnose the root cause and recommend what the user can do. Based on “yes”/“no” answers or specific keywords in the user’s response, new questions can be thrown up for the user to deepen the diagnosis. Here is an example of such a chatbot (http://www.sfs.uni-tuebingen.de/~vhenrich/ss12/java/homework/hw7/decisionTrees.html).
As in the previous example, note that the decision tree is static: it has been predefined, and should a new issue arise with the TV, this approach cannot “learn” it.
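A rule-based chatbot of this kind can be sketched in a few lines of Python. The questions, answers and the names `TV_TREE` and `diagnose` below are hypothetical and exist only to show the idea.

```python
# Hypothetical troubleshooting tree for a TV.
# Every question and every piece of advice is written by hand in advance.
TV_TREE = {
    "question": "Is the power light on?",
    "yes": {
        "question": "Is there a picture on the screen?",
        "yes": "Check the volume and mute settings.",
        "no": "Check that the correct input source is selected.",
    },
    "no": {
        "question": "Is the power cable plugged in?",
        "yes": "Try a different wall socket, then contact support.",
        "no": "Plug in the power cable and switch the TV on.",
    },
}

def diagnose(tree, answers):
    """Follow a sequence of 'yes'/'no' answers through the tree.

    Returns the advice at a leaf, or the next question to ask if the
    answers run out before a leaf is reached.
    """
    node = tree
    for answer in answers:
        if isinstance(node, str):
            break                      # already at a leaf, ignore extra answers
        node = node[answer.strip().lower()]
    if isinstance(node, str):
        return node
    return node["question"]
```

For instance, `diagnose(TV_TREE, ["yes", "no"])` returns the advice about the input source. A fault that is not in the tree can only be handled by a person editing `TV_TREE`; the program itself never adds a branch.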
Heuristics are one of the most misunderstood classes of algorithms. Many problems have exponential complexity and cannot be solved in a reasonable time at large scale. For such problems, we use heuristics: “practical” methods or “rules of thumb” which may not be optimal but are good enough.
Imagine an out-of-home (OOH) advertising solution at a bus stop, where the screen also has a camera.
Typically, such screens show advertisements that are not personalized. One can make this system smarter by having the camera take periodic snapshots to see if there is someone in front, and if there is a face, personalizing the ad based on whether the person is male or female. While this may seem relatively easy to humans, it is not such a simple problem for a machine. A potential heuristic could be to classify the picture as that of a woman if the hair is long and as that of a man if the hair is short. Please note that identifying hair and its length is by itself a non-trivial problem!
Again, you will notice that the ability to classify the person as a man or a woman does not improve with time and that the “intelligence level” of the approach remains the same no matter how many times it has been applied.
Approximation algorithms are again a special class of algorithms. They are used when problems have exponential complexity, but you do not have the luxury of relying on a heuristic whose answer can sometimes be way off from the optimal one. In such scenarios, there may be algorithms that guarantee the outcome to be within a fixed factor (or multiple) of the optimal answer.
For example, consider an eCommerce company where a delivery person has to deliver 15 different orders on a particular day.
The person knows the address of each place he (or she) needs to deliver to but does not know the order of delivery that minimizes the overall distance or time. This is the classical Traveling Salesman Problem (TSP) in computer science jargon, i.e. you have to visit n different points on a graph and you are trying to figure out the shortest route that covers all the points. While no algorithm is known that finds an optimal solution to TSP in polynomial time, there are approximation algorithms that guarantee the outcome to be within twice the time (or distance) of the optimal solution, provided that going directly between two points is never longer than a detour through a third. I used such an algorithm with a few other PhD students 25 years back to create a path for a disk head to fetch high-quality media streams (www.cs.utexas.edu/~dmcl/papers/ps/CompComm95.ps).
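One well-known way to get such a factor-of-two guarantee is to build a minimum spanning tree over the delivery points and visit them in the order a walk of that tree first reaches them. The Python sketch below illustrates that textbook technique, assuming straight-line distances between points on a plane; it is not necessarily the algorithm used in the paper cited above, and the function names are chosen here for illustration.

```python
import math

def tour_length(points, order):
    """Length of the closed tour that visits points in the given order."""
    return sum(
        math.dist(points[order[i]], points[order[(i + 1) % len(order)]])
        for i in range(len(order))
    )

def approx_tsp_tour(points):
    """MST-based 2-approximation for TSP on points in the plane.

    Returns a visiting order (list of indices into points) starting at 0.
    """
    n = len(points)
    if n == 0:
        return []
    # Prim's algorithm: grow a minimum spanning tree from point 0.
    in_tree = [False] * n
    best = [math.inf] * n       # cheapest known link from the tree to each point
    parent = [None] * n
    children = [[] for _ in range(n)]
    best[0] = 0.0
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] is not None:
            children[parent[u]].append(u)
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    # Preorder walk of the tree: each point is listed the first time it is reached.
    order, stack = [], [0]
    while stack:
        u = stack.pop()
        order.append(u)
        stack.extend(reversed(children[u]))
    return order
```

Given a list of `(x, y)` delivery coordinates, `approx_tsp_tour` returns an order in which to visit them, and `tour_length` reports how long that round trip is. The route may not be the best one, but with straight-line distances it is never more than twice as long as the best one, which is the kind of guarantee a plain heuristic cannot offer.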
This is the category of classical “AI” algorithms. The attempt here is to mimic how a human brain works. Humans do not explicitly run algorithms most of the time, but “learn” over a period of time. Computer scientists have figured out mathematical models that can mimic that behavior to a large extent. The most common of these are neural networks. At a simplistic level, a neural network is made up of layers of interconnected nodes.
Each node represents an input or a calculation, and the output is passed along to the next receiving layer with a “weight” or “importance” attached to it. While the calculation is static, the system can re-adjust its weights based on whether or not the output is correct. As such, these systems require training data. The more difficult the problem, the more data is required to train these systems. The beauty of these systems is that they can improve over time when there is a feedback loop - not very different from humans! There are a large number of problems where people are using these algorithms. Companies in the domains of optical character recognition (OCR), credit card fraud detection, image recognition, and speech recognition are all examples where such algorithms are applied.
My description of AI algorithms is a simplistic one. In reality, there are many kinds of AI algorithms, e.g. Genetic Algorithms, Hidden Markov Models, etc., and careful selection or a combination of these is required to solve complex real-world problems. So before you jump on the bandwagon - think hard, think deep, think intelligent!
Even though the core concepts of AI have existed for many decades, we now have the computing power, network bandwidth and vast volumes of training data to make AI come to life. Every year, the art of what is possible with AI is changing, and changing rapidly. It is a powerful tool; use it wisely. More importantly, use AI only where you should and do not label your business as an AI business when it is not!
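The weight-adjustment idea can be shown with the smallest possible network, a single artificial neuron trained with the classic perceptron rule. This Python sketch is a toy under stated assumptions: the training data (the logical AND of two inputs), the learning rate and the number of passes are all chosen here for illustration, and real systems use many layers and far more data.

```python
def predict(weights, bias, inputs):
    """Fixed calculation: weighted sum of the inputs, then a yes/no threshold."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0

def train_perceptron(samples, epochs=20, learning_rate=0.1):
    """Train a single neuron with the perceptron rule.

    samples is a list of (inputs, label) pairs, where label is 0 or 1.
    """
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, label in samples:
            # Feedback: 0 if the output was right, +1 or -1 if it was wrong.
            error = label - predict(weights, bias, inputs)
            # Nudge each weight in the direction that reduces the error.
            for i, x in enumerate(inputs):
                weights[i] += learning_rate * error * x
            bias += learning_rate * error
    return weights, bias

# Hypothetical training data: output 1 only when both inputs are 1.
AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
```

Before training, with all weights at zero, the neuron answers 0 for every input and so gets `(1, 1)` wrong. After `train_perceptron(AND_SAMPLES)` the returned weights classify all four cases correctly. The calculation in `predict` never changed; only the weights did, and they changed because of feedback on the data.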