
We used Flax, of course.

We thought Haiku was cool, but none of us knew it and learning resources were scarce. We're very happy with Flax, though.
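
For anyone curious, spinning up a fresh GPT-2 in Flax through transformers is only a few lines. This is a rough sketch, not our actual training script; the vocab size is a placeholder for whatever a Bengali tokenizer produces:

    import jax.numpy as jnp
    from transformers import GPT2Config, FlaxGPT2LMHeadModel

    # Hypothetical config: 50k is a placeholder vocab size, not ours.
    config = GPT2Config(vocab_size=50_000)

    # Randomly initialized Flax GPT-2, ready for training from scratch.
    model = FlaxGPT2LMHeadModel(config, seed=0, dtype=jnp.float32)

    # model.params is a plain pytree of jnp arrays, so it plugs straight
    # into any optax optimizer.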

We used the Bengali subset of the mC4 dataset [0] for training the GPT-2 model.

[0]: https://huggingface.co/datasets/mc4
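
If you want to poke at the data yourself, the Bengali subset streams straight from the Hub. A quick sketch (streaming avoids downloading the full corpus up front):

    from itertools import islice
    from datasets import load_dataset

    # "bn" is the Bengali config of mC4; streaming=True iterates lazily.
    bn_mc4 = load_dataset("mc4", "bn", split="train", streaming=True)

    # Peek at the first document; each record has "text", "timestamp", "url".
    for example in islice(bn_mc4, 1):
        print(example["text"][:200])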


