It’s Only Natural: An Excessively Deep Dive Into Natural Gradient Optimization
https://towardsdatascience.com/its-only-natural-an-excessively-deep-dive-into-natural-gradient-optimization-75d464b89dbb [towardsdatascience.com]
2019-03-10 05:02
The short answer is: practically speaking, it doesn’t provide compelling enough value to be in common use for most deep learning applications. There is evidence that natural gradient converges in fewer steps, but, as I’ll discuss later, that’s a bit of a complicated comparison. The idea of natural gradient is elegant and satisfying to people frustrated by the arbitrariness of scaling update steps in parameter space. But, beyond that elegance, it’s not clear to me that it provides value that couldn’t be obtained through more heuristic means.
source: HN