*_| Forwarded from: Loss function porn (Vladimir Ivashkin) |_*
GPT-3 sets a new SOTA in translation and question answering, and it doesn't require finetuning. The largest version has 175 billion parameters (~700 GB of weights, over 100 times bigger than GPT-2!)
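A quick back-of-the-envelope check of those numbers (a sketch, assuming weights stored as 32-bit floats and GPT-2's largest version at 1.5B parameters):

```python
# Sanity-check the "~700 GB" and "100x GPT-2" figures.
params = 175e9           # GPT-3 parameter count
bytes_per_param = 4      # assumes fp32 storage
size_gb = params * bytes_per_param / 1e9
print(f"{size_gb:.0f} GB")  # -> 700 GB

gpt2_params = 1.5e9      # largest GPT-2 variant
print(f"{params / gpt2_params:.0f}x GPT-2")  # -> 117x
```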
_*📝*_ [arxiv.org/abs/2005.14165](http://arxiv.org/abs/2005.14165)
_*📉*_ [@loss_function_porn](https://t.me/loss_function_porn)