tl;dr: A short reading list for understanding transformer language models: three foundational papers and an interactive visual explainer.
Papers
- Attention Is All You Need, by Vaswani et al., 2017
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, by Devlin et al., 2018
- Language Models are Unsupervised Multitask Learners, by Radford et al., 2019
Learning resources
- Transformer Explainer, by Polo Club of Data Science, Georgia Tech