https://openreview.net/forum?id=HJNMhuWdWr&referrer=%5Bthe%20profile%20of%20Albert%20S%20Berahas%5D(%2Fprofile%3Fid%3D~Albert_S_Berahas1)
The question of how to parallelize the stochastic gradient descent (SGD) method has received much attention in the literature. In this paper, we focus instead...
bfgs method, machine learning, multibatch, openreview