"Just when you thought it was over... we’re introducing Gemini 2.0 Flash Thinking, a new experimental model that unlocks stronger reasoning capabilities and shows its thoughts.
The model plans (with thoughts visible), can solve complex problems with Flash speeds, and more ..."
Simran Arora: "Join us for a livestream this Thursday, Halloween/Diwali, and join our channel on the GPU Mode Discord server to hang out with us/get involved:"
Candle dev here, we also support training/backprop! We certainly focus on optimizing inference performance, but hopefully that should improve training efficiency too.
Inference means using the neural net, as opposed to training it.
During inference you feed an input into the NN and it passes through in the "forwards" direction (i.e. from input to output), being transformed according to the "weights" that were learnt during training, to produce the output.
During training, each training sample is first fed forwards through the NN, just as in inference. The model's output (which at the start of training will be essentially random) is then compared to the correct/desired output for that sample, and the resulting error is fed backwards (from output to input) through the NN via the "backpropagation" mechanism to update the weights.
Training is a lot more involved than inference since it involves this backpropagation step.
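The forward-then-backward loop described above can be sketched in a few lines. This is a minimal illustration with a single linear "neuron" y = w*x + b trained by plain gradient descent; all names here are illustrative and this is not the Candle API:

```python
def forward(w, b, x):
    # Inference: feed the input forwards through the "network".
    return w * x + b

def train_step(w, b, x, y_true, lr=0.01):
    # Forward pass, exactly the same computation as inference.
    y_pred = forward(w, b, x)
    # Compare the output to the desired output for this sample.
    error = y_pred - y_true
    # Backpropagation: gradients of the loss 0.5 * error**2
    # with respect to the weight and bias.
    grad_w = error * x
    grad_b = error
    # Update the parameters against the gradient.
    return w - lr * grad_w, b - lr * grad_b

# Training data: the target mapping is y = 2x.
data = [(float(x), 2.0 * x) for x in range(-5, 6)]
w, b = 0.0, 0.0
for _ in range(200):
    for x, y_true in data:
        w, b = train_step(w, b, x, y_true)

# After training, inference (the forward pass alone) recovers the mapping.
print(round(w, 2), round(b, 2))  # w should end up close to 2.0, b near 0.0
```

Note how inference is just the `forward` call, while each training step runs that same forward pass plus the extra gradient computation and weight update, which is why training costs more per sample than inference.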
https://x.com/OfficialLoganK/status/1869789822384255300