Researchers have developed a new method called ADEV, which automates the mathematics needed to maximise the expected value of actions in an uncertain world.
Optimising the expected values of probabilistic processes is a central problem in computer science and its applications, arising in fields such as AI, operations research, and statistical computing. Unfortunately, the automatic differentiation techniques designed for deterministic algorithms do not, in general, compute the correct gradients needed by widely used gradient-based optimisation solvers.
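To see the problem concretely, consider a small illustrative programme (our own example, not taken from the paper) that flips a coin with bias θ and returns 0 on heads and −θ/2 on tails. Its expected value is L(θ) = θ·0 + (1 − θ)·(−θ/2) = (θ² − θ)/2, so the true gradient is L′(θ) = θ − 1/2. Naively differentiating only the branch that happened to be sampled, as deterministic AD would, returns 0 on heads and −1/2 on tails, and the average of those estimates is (θ − 1)/2, which is the wrong answer.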
It has never been easier to specify and solve optimisation problems, thanks in large part to the advancement of programming languages and libraries that support automatic differentiation (AD). With AD, users specify their objective functions as programmes, and the system automatically generates programmes that compute their derivatives. These derivatives can then be fed into optimisation algorithms such as gradient descent or Adam to locate local minima or maxima of the original objective function.
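As a rough illustration of that workflow, here is a minimal sketch of forward-mode AD with dual numbers in Haskell (the language of the paper's prototype), driving a plain gradient-descent loop. The `Dual` type, the `diff` helper, and the example objective are names invented for this sketch, not part of any particular library.

```haskell
-- A dual number carries a value together with the derivative of that value.
data Dual = Dual { primal :: Double, tangent :: Double }

instance Num Dual where
  Dual x dx + Dual y dy = Dual (x + y) (dx + dy)
  Dual x dx * Dual y dy = Dual (x * y) (dx * y + x * dy)
  negate (Dual x dx)    = Dual (negate x) (negate dx)
  abs    (Dual x dx)    = Dual (abs x) (dx * signum x)
  signum (Dual x _)     = Dual (signum x) 0
  fromInteger n         = Dual (fromInteger n) 0

-- Derivative of an objective written as an ordinary numeric programme.
diff :: (Dual -> Dual) -> Double -> Double
diff f x = tangent (f (Dual x 1))

-- Example objective: f(x) = (x - 3)^2 + 1, minimised at x = 3.
objective :: Num a => a -> a
objective x = (x - 3) * (x - 3) + 1

-- Gradient descent driven by the automatically computed derivative.
gradientDescent :: Double -> Double -> Int -> Double
gradientDescent rate x0 steps =
  iterate (\x -> x - rate * diff objective x) x0 !! steps

main :: IO ()
main = print (gradientDescent 0.1 0.0 100)   -- prints a value close to 3.0
```

The important caveat, and the gap ADEV addresses, is that this recipe assumes the objective is a deterministic programme; once the programme makes random choices, mechanically differentiating a single execution can give biased gradients, as in the coin-flip example above.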
Features
ADEV is a new AD algorithm that automates the correct differentiation of the expectations of expressive probabilistic programmes. Its strengths are discussed below.
Advantage
Deep learning has grown enormously over the last decade, in part because programming languages were developed that could automate the college-level calculus needed to train each new model. Neural networks are trained by tuning their parameters to optimise a score that can be computed quickly from the training data. Previously, the equations for adjusting the parameters at each tuning step had to be derived painstakingly by hand. Deep learning platforms instead use automatic differentiation to compute these derivatives automatically, so researchers could rapidly try out a huge number of models and find the ones that worked, without needing to work through the underlying mathematics.
Conclusion
Optimising the expected values of random processes is one of the most important problems in computer science and its applications, arising in fields such as artificial intelligence, operations research, and statistical computing. Unfortunately, the automatic differentiation techniques developed for deterministic programmes do not, in general, produce the correct gradients required by widely used gradient-based optimisation solvers.
In their paper, the researchers present ADEV, an extension of forward-mode AD that correctly differentiates the expectations of probabilistic processes represented as programmes that make random choices. Their algorithm is a source-to-source programme transformation on a higher-order probabilistic language with both discrete and continuous probability distributions. The transformation yields a new probabilistic programme whose expected return value is the derivative of the original programme's expected return value. This output programme can be run to obtain unbiased Monte Carlo estimates of the desired gradient, which can then be used in the inner loop of stochastic gradient descent.
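The sketch below is a hedged, hand-written illustration of that idea, not the actual output of ADEV's transformation. For the coin-flip example from earlier, it pairs the original sampler with a second programme whose single-sample return value is an unbiased estimate of the derivative of the expected loss, built here from the standard score-function (REINFORCE) identity plus the pathwise derivative of the branch that was taken. The function names are invented for this sketch, and it assumes the `random` package is available.

```haskell
import System.Random (randomRIO)
import Control.Monad (foldM, replicateM)

-- Original probabilistic programme: one sample of the loss at parameter theta,
-- i.e. loss(theta) = E[ if b then 0 else -theta/2 ] with b ~ Bernoulli(theta).
lossSample :: Double -> IO Double
lossSample theta = do
  u <- randomRIO (0.0, 1.0)
  let b = u < theta
  pure (if b then 0 else -theta / 2)

-- Derived programme: one unbiased sample of d loss / d theta.
--   score-function term:  f(b, theta) * d/dtheta log p(b; theta)
--   pathwise term:        d/dtheta f(b, theta) with the sampled b held fixed
lossGradSample :: Double -> IO Double
lossGradSample theta = do
  u <- randomRIO (0.0, 1.0)
  let b        = u < theta
      f        = if b then 0 else -theta / 2
      dlogp    = if b then 1 / theta else -1 / (1 - theta)
      pathwise = if b then 0 else -1 / 2
  pure (f * dlogp + pathwise)

main :: IO ()
main = do
  -- Averaging samples of the original programme approximates the loss itself.
  losses <- replicateM 100000 (lossSample 0.3)
  print (sum losses / 100000)                      -- roughly -0.105
  -- Averaging the derived samples approximates the true derivative, theta - 1/2.
  grads <- replicateM 100000 (lossGradSample 0.3)
  print (sum grads / 100000)                       -- roughly -0.2
  -- Single samples drive stochastic gradient descent on the expected loss,
  -- which is minimised at theta = 0.5.
  let sgdStep theta _ = do
        g <- lossGradSample theta
        pure (theta - 0.05 * g)
  thetaFinal <- foldM sgdStep 0.9 [1 .. (2000 :: Int)]
  print thetaFinal                                 -- close to 0.5
```

Averaged over many runs the estimator approaches the true derivative θ − 1/2, and single samples can be fed straight into a stochastic-gradient-descent loop, which is how the paper proposes using ADEV's output.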
Furthermore, the researchers prove ADEV correct by establishing logical relations between the semantics of the source and target probabilistic programmes. Because the algorithm builds on forward-mode AD in a modular way, it is easy to implement: the researchers used it to build a prototype in just a few dozen lines of Haskell.