
Fortunately, costs for training superlarge models are coming down rapidly thanks to TPUs (which were used to train GPT-J 6B) and DeepSpeed improvements.


Are there any TPUs that can be purchased off-the-shelf and then owned, like you can do with a CPU or GPU? Or are you just limited to paying rent to cloud providers and ultimately being at their mercy when it comes to pricing, ToS, etc?


No, but you probably aren't going to buy an A100 either, so it's a moot point.


An A100 looks to be about $12k or so. That's a bit out of reach for individuals, but not so bad as a business expense. Maybe you could even use it to mine Bitcoin when you're not training models, to help it pay for itself or something.



I don't think these are good for training though, unfortunately.



