BERT, Transformer

배워서 남주나

Broca & Wernicke 2020. 9. 1. 23:12

They say even seq2seq is a thing of the past now. GPT-3, BERT, Transformer. There is so much to learn.

http://incredible.ai/nlp/2020/02/29/Transformer/

Transformer

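1. Introduction: This document was written after reading the Attention Is All You Need paper. It aims to explain things as concisely and quickly as possible. 2. Transformer, 2.1 Architecture: Before going into detail, it starts with a rough sketch of the big picture...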

incredible.ai

http://docs.likejazz.com/bert/#position-wise-feed-forward-network

BERT 톺아보기 (A Close Look at BERT) · The Missing Papers

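BERT 톺아보기, 17 Dec 2018. One day an unfamiliar model appeared on the SQuAD leaderboard. The model, named BERT, took first place, easily beating the ensemble models that had been state-of-the-art until then, and it did so as a single model. As if ELMo...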

docs.likejazz.com

http://jalammar.github.io/illustrated-transformer/

The Illustrated Transformer

Discussions: Hacker News (65 points, 4 comments), Reddit r/MachineLearning (29 points, 3 comments) Translations: Chinese (Simplified), Japanese, Korean, Russian Watch: MIT’s Deep Learning State of the Art lecture referencing this post In the previous pos

jalammar.github.io

http://jalammar.github.io/how-gpt3-works-visualizations-animations/

How GPT3 Works - Visualizations and Animations

Discussions: Hacker News (397 points, 97 comments), Reddit r/MachineLearning (247 points, 27 comments) Translations: German, Chinese (Simplified) The tech world is abuzz with GPT3 hype. Massive language models (like GPT3) are starting to surprise us with t

jalammar.github.io

https://ratsgo.github.io/natural%20language%20processing/2019/09/11/xlnet/

XLNet · ratsgo's blog

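XLNet is a method published by a Google research team (Yang et al., 2019), an architecture that recorded the best performance on 20 natural language processing datasets at the time of its release. On some datasets it far outperformed BERT, the previous leader...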

ratsgo.github.io