self.w2(F.silu(self.w1(x)) * self.w3(x))
https://github.com/meta-llama/llama3/blob/14aab0428d3ec3a959...
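For context, the expression above is a gated feed-forward computation: `w1(x)` passes through SiLU and acts as a gate, it is multiplied elementwise with `w3(x)`, and `w2` projects the result back to the model dimension. Below is a minimal sketch of a module that could hold that line. It assumes plain `nn.Linear` layers without bias and uses made-up sizes (`dim=8`, `hidden_dim=16`); the class name and constructor arguments are illustrative and not taken from the linked repository, which may build its layers differently.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class FeedForward(nn.Module):
    """Illustrative gated feed-forward block wrapping the expression above."""

    def __init__(self, dim: int, hidden_dim: int):
        super().__init__()
        # Assumption: plain bias-free linear layers; sizes are hypothetical.
        self.w1 = nn.Linear(dim, hidden_dim, bias=False)  # gate projection
        self.w3 = nn.Linear(dim, hidden_dim, bias=False)  # value projection
        self.w2 = nn.Linear(hidden_dim, dim, bias=False)  # output projection

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # silu(w1(x)) gates w3(x) elementwise; w2 maps back to `dim`.
        return self.w2(F.silu(self.w1(x)) * self.w3(x))


torch.manual_seed(0)
ffn = FeedForward(dim=8, hidden_dim=16)
x = torch.randn(2, 4, 8)  # (batch, sequence, dim)
y = ffn(x)
print(y.shape)  # the output keeps the input shape
```

Because `w1` and `w3` both map to `hidden_dim`, their outputs can be multiplied elementwise, and `w2` restores the last dimension so the block can sit inside a residual connection.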