https://distill.pub/2019/memorization-in-rnns/
Inspecting gradient magnitudes in context can be a powerful tool to see when recurrent units use short-term or long-term contextual understanding.
memorizationrnns
https://openreview.net/forum?id=amOpepqmSl&referrer=%5Bthe%20profile%20of%20Franck%20Mamalet%5D(%2Fprofile%3Fid%3D~Franck_Mamalet2)
Binary and sparse ternary weights in neural networks enable faster computations and lighter representations, facilitating their use on edge devices with...
binary andsparseternaryorthogonalrnns