| | 10Gb/s Ethernet: using mini-heatsinks with a 10GBASE-T SFP+ module (gilesthomas.com) |
| 1 point by ibobev 4 days ago | past | discuss |
|
| | 10Gb/s Ethernet: using mini-heatsinks with a 10GBASE-T SFP+ module (gilesthomas.com) |
| 3 points by gpjt 5 days ago | past | discuss |
|
| | 10Gb/s Ethernet: what I did to get it working in my home (gilesthomas.com) |
| 232 points by gpjt 24 days ago | past | 177 comments |
|
| | 10Gb Ethernet: what I had to (re)learn (gilesthomas.com) |
| 2 points by ibobev 24 days ago | past |
|
| | 10Gb Ethernet: what I had to (re)learn (gilesthomas.com) |
| 1 point by gpjt 25 days ago | past | 1 comment |
|
| | LLM from scratch, part 33 – what I learned from the appendices (gilesthomas.com) |
| 5 points by gpjt 31 days ago | past |
|
| | LLM from scratch (32l) – Interventions: updated instruction fine-tuning results (gilesthomas.com) |
| 1 point by gpjt 33 days ago | past |
|
| | An LLM becomes more coherent as we train it (gilesthomas.com) |
| 1 point by ibobev 35 days ago | past |
|
| | How an LLM becomes more coherent as we train it (gilesthomas.com) |
| 3 points by gpjt 36 days ago | past |
|
| | LLM from scratch, part 32k – Interventions: gradient accumulation (gilesthomas.com) |
| 2 points by gpjt 38 days ago | past |
|
| | Interventions: Trying to train a better model in the cloud (gilesthomas.com) |
| 1 point by ibobev 43 days ago | past |
|
| | LLM from scratch, part 32j – trying to train a better model in the cloud (gilesthomas.com) |
| 2 points by gpjt 44 days ago | past |
|
| | Writing an LLM from scratch, part 32i – Interventions: what is in the noise? (gilesthomas.com) |
| 2 points by ibobev 45 days ago | past |
|
| | Writing an LLM from scratch, part 32i – Interventions: what is in the noise? (gilesthomas.com) |
| 1 point by gpjt 46 days ago | past |
|
| | Writing an LLM from scratch, part 32h – Interventions: full fat float32 (gilesthomas.com) |
| 2 points by ibobev 46 days ago | past |
|
| | Writing an LLM from scratch, part 32h – Interventions: full fat float32 (gilesthomas.com) |
| 7 points by gpjt 50 days ago | past |
|
| | Automating starting Lambda Labs instances (gilesthomas.com) |
| 2 points by ibobev 50 days ago | past |
|
| | Writing an LLM from scratch, part 32g – Interventions: weight tying (gilesthomas.com) |
| 2 points by ibobev 59 days ago | past |
|
| | Writing an LLM from scratch, part 32g – Interventions: weight tying (gilesthomas.com) |
| 2 points by gpjt 60 days ago | past |
|
| | Writing an LLM from scratch, part 32f – Interventions: weight decay (gilesthomas.com) |
| 6 points by gpjt 61 days ago | past |
|
| | Writing an LLM from scratch, part 32e – Interventions: the learning rate (gilesthomas.com) |
| 3 points by ibobev 68 days ago | past |
|
| | Writing an LLM from scratch, part 32e – Interventions: the learning rate (gilesthomas.com) |
| 3 points by gpjt 74 days ago | past |
|
| | Writing an LLM from scratch, part 32a – Interventions: training a baseline model (gilesthomas.com) |
| 3 points by ibobev 3 months ago | past |
|
| | Writing an LLM from scratch, part 32B – Interventions: gradient clipping (gilesthomas.com) |
| 1 point by ibobev 3 months ago | past |
|
| | Writing an LLM from scratch, part 32c – Interventions: removing dropout (gilesthomas.com) |
| 1 point by ibobev 3 months ago | past |
|
| | Writing an LLM from scratch, part 32d – Interventions: adding attention bias (gilesthomas.com) |
| 1 point by ibobev 3 months ago | past |
|
| | Writing an LLM from scratch, part 32d – Interventions: adding attention bias (gilesthomas.com) |
| 6 points by gpjt 3 months ago | past |
|
| | Writing an LLM from scratch, part 32c – Interventions: removing dropout (gilesthomas.com) |
| 1 point by gpjt 3 months ago | past |
|
| | Writing an LLM from scratch, part 32B – Interventions: gradient clipping (gilesthomas.com) |
| 2 points by gpjt 3 months ago | past |
|
| | Writing an LLM from scratch, part 32a – Interventions: training a baseline model (gilesthomas.com) |
| 1 point by gpjt 3 months ago | past |
|