
We used Flax, of course.

We thought Haiku was cool, but none of us knew it and learning resources were scarce. We're very happy with Flax, though.
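
For anyone curious, spinning up a fresh GPT-2 in Flax through transformers is only a few lines. This is a rough sketch, not our actual training script; the vocab size is a placeholder for whatever a Bengali tokenizer produces:

    import jax.numpy as jnp
    from transformers import GPT2Config, FlaxGPT2LMHeadModel

    # Hypothetical config: 50k is a placeholder vocab size, not ours.
    config = GPT2Config(vocab_size=50_000)

    # Randomly initialized Flax GPT-2, ready for training from scratch.
    model = FlaxGPT2LMHeadModel(config, seed=0, dtype=jnp.float32)

    # model.params is a plain pytree of jnp arrays, so it plugs straight
    # into any optax optimizer.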

We used the Bengali subset of the mC4 dataset [0] for training the GPT-2 model.

[0]: https://huggingface.co/datasets/mc4
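
If you want to poke at the data yourself, the Bengali subset streams straight from the Hub. A quick sketch (streaming avoids downloading the full corpus up front):

    from itertools import islice
    from datasets import load_dataset

    # "bn" is the Bengali config of mC4; streaming=True iterates lazily.
    bn_mc4 = load_dataset("mc4", "bn", split="train", streaming=True)

    # Peek at the first document; each record has "text", "timestamp", "url".
    for example in islice(bn_mc4, 1):
        print(example["text"][:200])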


