Artificial Intelligence (AI) is a general term that describes an automated decision-making system built from predefined rules. A basic AI system need not learn from experience; a simple if-else statement with predefined conditions is enough. As I have already described in my previous articles, using a light sensor to automate switching a street light on and off is a simple AI system.
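To make this concrete, here is a minimal sketch of such a rule-based system in Python; the sensor reading and threshold are made-up values, used purely for illustration.

```python
# A rule-based "AI" in its simplest form: a street light driven by a light sensor.
# LIGHT_THRESHOLD and the sensor readings below are hypothetical values.

LIGHT_THRESHOLD = 30  # assumed ambient-light level (arbitrary units)

def street_light_controller(sensor_reading: float) -> str:
    """Switch the street light on or off using a predefined rule."""
    if sensor_reading < LIGHT_THRESHOLD:  # dark enough -> switch on
        return "ON"
    return "OFF"

print(street_light_controller(12))  # ON  (dark)
print(street_light_controller(75))  # OFF (bright)
```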
Machine Learning (ML) is a general term that describes an automated decision-making system built from simple, human-guided but machine-optimized rules. It can improve with experience but reaches saturation quickly. Examples: Decision Trees, Random Forest, SVM, etc.
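As a minimal sketch of a classic ML model, assuming scikit-learn is available (the tiny dataset below is synthetic and purely illustrative):

```python
# Classic ML: a human picks the model family (a decision tree), the machine
# optimizes its rules from data. Dataset is synthetic, for illustration only.
from sklearn.tree import DecisionTreeClassifier

# Features: [hours_studied, hours_slept]; label: 1 = passed the exam, 0 = failed
X = [[1, 4], [2, 8], [6, 7], [8, 5], [3, 6], [9, 8]]
y = [0, 0, 1, 1, 0, 1]

model = DecisionTreeClassifier(max_depth=2)  # the human chooses the function family
model.fit(X, y)                              # the machine finds the optimal rules

print(model.predict([[7, 6]]))  # likely [1]
```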
Neural Networks (NN) and Deep Learning (DL) are more specific terms used to describe an automated decision-making system built from complex, human-independent rules that the machine learns using neural networks. Examples: Multilayer Perceptron, Convolutional Neural Networks, etc.
Since AI in this basic sense is the simplest form of decision-making, we will not go into its details here; instead, we will discuss the key differences between ML and DL.
My previous articles have also described a generic AI system as a transformation function. Be it ML or DL, the system is ultimately trying to find an optimal transformation function, which is usually described by its independent variables X and coefficients W:
f(W,X).
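As a tiny worked example (in Python, with arbitrary numbers), a linear transformation is one possible family for f(W, X); learning then means finding good values for the coefficients:

```python
# One possible family for f(W, X): a linear function W·X + b.
# The coefficient values below are arbitrary; learning would adjust them.
import numpy as np

W = np.array([0.5, -1.2])  # coefficients (what "learning" optimizes)
b = 0.3                    # bias term
X = np.array([2.0, 1.0])   # independent variables (the inputs)

def f(W, X, b):
    return W @ X + b

print(f(W, X, b))  # 0.5*2.0 + (-1.2)*1.0 + 0.3 = 0.1
```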
ML algorithms require a human to decide which family of functions to use for transforming inputs into outputs. Once the function family is chosen, computers are used to find the optimal coefficients within this family that best map inputs to outputs. Finding the optimal coefficients from inputs is called learning, and since the machine does this by following an algorithm, the process is called Machine Learning. The inputs are generally not taken as is: a domain expert extracts what are called features from the input data to reduce redundancy and transform it into a more abstract but compact form before passing it to the optimization process. Since humans are known for their imagination and creativity, and that is reflected in the way they design features and choose functions, the accuracy of the solution is also human-dependent.
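The sketch below illustrates this human role, again assuming scikit-learn; the hand-crafted features (text length and exclamation count) and the toy data are hypothetical, chosen only to show a domain expert compressing raw input into features before the optimization step:

```python
# Human-designed feature extraction followed by machine-optimized coefficients.
from sklearn.svm import SVC

def extract_features(text: str) -> list:
    """A 'domain expert' reduces raw text to a compact feature vector."""
    return [len(text), text.count("!")]

raw_texts = ["great product!!!", "ok", "terrible!", "love it!!", "meh"]
labels    = [1, 0, 0, 1, 0]   # 1 = enthusiastic, 0 = not

X = [extract_features(t) for t in raw_texts]
model = SVC(kernel="linear")  # the human also chooses the function family
model.fit(X, labels)

print(model.predict([extract_features("amazing!!")]))
```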
ML solutions can also sometimes work with very little data, since they depend heavily on the engineer's ability to encode the solution. This is exactly why ML requires the solution to be precisely describable. The amount of input data does not play a significant role in continuously improving the algorithm's accuracy, as accuracy is mostly limited by the quality of the features and the capacity of the function chosen for training. ML can be used to automate several tasks and can sometimes be computationally cheap. An ML system's dependency on a human can also introduce human bias into the solution.
DL algorithms, on the other hand, have the ability to construct features and come up with transformation functions on their own, given only the input-output mapping. It is for this reason that NNs are called universal function approximators. All that heavy lifting comes at a cost, and that cost is data. DL systems learn the solution on their own at the expense of massive amounts of data and of humans having to define the problem precisely. Posing the problem the wrong way produces the wrong solution.
This is exactly why DL systems require the problem to be precisely describable. To approximate a transformation function, NN or DL systems use multiple scaled and shifted versions of predefined functions, each one produced as the output of a computation node. The word deep comes from the fact that these nodes can be cascaded in several layers. With this kind of arrangement, they not only learn the coefficients of the function but learn the function itself.
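The sketch below (NumPy, with random rather than learned weights) shows this arrangement: each node outputs a scaled and shifted copy of a predefined function (tanh here), and cascading layers of such nodes is what makes the network deep:

```python
# A tiny two-layer network: the output is built from scaled (W) and shifted (b)
# copies of tanh. Weights are random for illustration; training would learn them.
import numpy as np

rng = np.random.default_rng(0)

def layer(x, W, b):
    return np.tanh(x @ W + b)   # each node: a scaled, shifted tanh

x = np.linspace(-1, 1, 5).reshape(-1, 1)              # toy 1-D inputs
W1, b1 = rng.normal(size=(1, 8)), rng.normal(size=8)  # hidden layer: 8 nodes
W2, b2 = rng.normal(size=(8, 1)), rng.normal(size=1)  # output layer

hidden = layer(x, W1, b1)   # first layer of computation nodes
output = hidden @ W2 + b2   # cascading layers is what makes it "deep"

print(output.ravel())
```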
Unlike ML systems, which are human-dependent, DL systems are almost human-independent once the architecture and data are given. The architecture here is basically the arrangement of the neurons (nodes) and their behavior. Though DL systems can exceed human-level performance in several specialized tasks, without data there is no solution. They also struggle to generalize with little data but improve significantly with more and more data. DL systems arrive at a solution entirely from the data and can therefore carry a bias coming from the quality of the data used. Given the right kind of data and a precise definition of the problem, a DL system is always optimal in the features and the function it comes up with.