https://joecooper.me/blog/softmaxastar/
Using softmax, A* search can generate diverse candidates for use with test-time compute.
joseph cooper, pathfinding, softmax
https://www.tensorflow.org/versions/r2.15/api_docs/python/tf/nn/softmax_cross_entropy_with_logits?authuser=3
Computes softmax cross entropy between logits and labels.
cross entropy, tf, nn, softmax, tensorflow
https://towardsdatascience.com/learning-triton-one-kernel-at-a-time-softmax/
Nov 27, 2025 - All you need to know about a fast, readable and PyTorch-ready softmax kernel!
a time, learning, triton, one, kernel
https://openreview.net/forum?id=VCJ8NfVrcO&referrer=%5Bthe%20profile%20of%20Sharan%20Vaswani%5D(%2Fprofile%3Fid%3D~Sharan_Vaswani1)
Natural policy gradient (NPG) is a common policy optimization algorithm and can be viewed as mirror ascent in the space of probabilities. Recently, Vaswani et...
fast convergence, softmax, policy, mirror, ascent
https://www.tensorflow.org/versions/r2.15/api_docs/python/tf/sparse/softmax?authuser=2
Applies softmax to a batched N-D SparseTensor.
tf, sparse, softmax, tensorflow