Hacker News: convexstrictly's comments


"Just when you thought it was over... we’re introducing Gemini 2.0 Flash Thinking, a new experimental model that unlocks stronger reasoning capabilities and shows its thoughts.

The model plans (with thoughts visible), can solve complex problems with Flash speeds, and more ..."

- Logan Kilpatrick

https://x.com/OfficialLoganK/status/1869789822384255300


GB per second


Simran Arora: "Join us for a livestream this Thursday, Halloween/Diwali, and join our channel on the GPU Mode Discord server to hang out with us/get involved:"

https://discord.com/login?redirect_to=%2Fchannels%2F11894982...


Livestream link: https://youtube.com/live/IAwLzkldxUk?feature=share (come ask questions!)


Thanks!


CUDA + ThunderKittens 4.5 hour tutorial

https://www.youtube.com/watch?v=xcpEl0cGCC4



Aider uses Tree-sitter to improve code generation: https://aider.chat/2023/10/22/repomap.html

Aider: https://github.com/paul-gauthier/aider

It is state of the art on SWE-Bench and SWE-Bench Lite. https://aider.chat/2024/06/02/main-swe-bench.html
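The repo-map idea is to hand the LLM a compact outline of the codebase (file names plus top-level definitions) rather than whole files. Aider builds this with Tree-sitter; as a rough illustration of the same idea, here is a minimal sketch using Python's built-in ast module instead (the function name and output format are my own, not Aider's):

```python
import ast

def repo_map_sketch(source: str, filename: str) -> list[str]:
    """List top-level function/class signatures, like a tiny repo map."""
    entries = []
    tree = ast.parse(source, filename=filename)
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            args = ", ".join(a.arg for a in node.args.args)
            entries.append(f"{filename}: def {node.name}({args})")
        elif isinstance(node, ast.ClassDef):
            entries.append(f"{filename}: class {node.name}")
    return entries

code = "def add(a, b):\n    return a + b\n\nclass Point:\n    pass\n"
print(repo_map_sketch(code, "example.py"))
```

A real repo map would run this over every file in the repository and rank the results by relevance; Tree-sitter lets Aider do the same parse across many languages, not just Python.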


Candle is a minimalist ML framework for Rust with a focus on performance (including GPU support) and ease of use.

https://github.com/huggingface/candle


Love Candle! I actually ported Karpathy's previous GPT tutorial to candle, including training [0]

[0] https://www.perceptivebits.com/building-gpt-from-scratch-in-...


You can also target WASM. Depending on the model size, it can be really good. Here are some examples of quantized models.

Vision model: https://huggingface.co/spaces/radames/Candle-Moondream-2

BLIP image captioning: https://huggingface.co/spaces/radames/Candle-BLIP-Image-Capt...

Microsoft Phi-2: https://huggingface.co/spaces/radames/Candle-phi1-phi2-wasm-...
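Quantization is what makes these WASM demos practical: weights are stored as small integers plus a scale factor instead of full floats. A minimal sketch of symmetric per-tensor int8 quantization, one common scheme (not necessarily the exact one these demos use):

```python
# Symmetric int8 quantization: store a scale and small integers
# instead of full-precision floats.

def quantize_int8(weights):
    """Map floats to the int8 range with a single per-tensor scale."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate floats from the stored integers."""
    return [x * scale for x in q]

w = [0.5, -1.27, 0.02, 1.0]
q, s = quantize_int8(w)
w_hat = dequantize_int8(q, s)
print(q)      # integers in [-127, 127]
print(w_hat)  # approximate reconstruction of w
```

The payoff is 4x smaller downloads (int8 vs float32) at the cost of a small reconstruction error, which is why model size matters so much for in-browser inference.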


Not nearly as minimal as Karpathy's implementation.


I wouldn't call it "minimalist" after seeing Karpathy's code.


Candle focuses on inference though.


Candle dev here; we also support training/backprop! We certainly focus on optimizing inference performance, but hopefully that should improve training efficiency too.


What is inferencing?


Inference means using the neural net, as opposed to training it.

During inference you feed an input into the NN and it passes through in the "forwards" direction (i.e. from input to output), being transformed according to the "weights" that were learned during training, to produce the output.

During training, each training sample is first fed forwards through the NN, the same way as for inference. The output of the model (which at the beginning of training will be random/wrong) is then compared to the correct/desired output for that sample, and the resulting error value is fed backwards (from output to input) through the NN, using the "backpropagation" mechanism, to update the weights.

Training is a lot more involved than inference since it involves this backpropagation step.
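The forward/backward distinction above can be sketched on a single-weight "network" with squared error (a deliberately minimal example, not Candle code):

```python
# One-weight network: forward() alone is inference; train_step() runs the
# same forward pass, then backpropagates the error to update the weight.

def forward(w, x):
    return w * x  # inference: input flows forward through the weight

def train_step(w, x, target, lr=0.1):
    y = forward(w, x)        # forward pass, same as inference
    error = y - target       # compare output to desired output
    grad = error * x         # backprop: d(0.5 * error**2) / dw
    return w - lr * grad     # gradient-descent weight update

w = 0.0
for _ in range(50):
    w = train_step(w, x=2.0, target=6.0)
print(forward(w, 2.0))  # approaches the target 6.0
```

Note that train_step does everything forward does plus the gradient and update steps, which is the sense in which training is strictly more work than inference.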


The author claims better performance than LoRA in half the training time.

https://twitter.com/Rui45898440/status/1772996453557997924
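For context, LoRA freezes the base weight matrix W and trains only a low-rank update B @ A, so far fewer parameters change. A hedged sketch of the arithmetic with illustrative shapes (pure-Python matrices, not the paper's code):

```python
import random

def matmul(A, B):
    """Plain list-of-lists matrix multiply."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def matadd(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

random.seed(0)
# Frozen base weight W (4x4) and a rank-1 update: B (4x1) @ A (1x4).
W = [[random.random() for _ in range(4)] for _ in range(4)]
B = [[random.random()] for _ in range(4)]
A = [[random.random() for _ in range(4)]]

# Effective weight used at inference time: W' = W + B @ A.
W_eff = matadd(W, matmul(B, A))

# Only the 4 + 4 = 8 numbers in B and A are trained, vs 16 in W;
# the savings grow quadratically with the layer size.
trainable = len(B) * len(B[0]) + len(A[0])
print(trainable)
```

Methods claiming to beat LoRA typically change how this low-rank (or otherwise restricted) update is parameterized or optimized; the linked thread has the specifics.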


The federal government requests comments on regulation of AI models with openly available weights. The deadline is March 27, 2024.

Earlier thread: https://news.ycombinator.com/item?id=39494760

