PyTorch Transformer Language Model Clarified
Yang Xu
Apr 25
PyTorch provides a fairly thorough tutorial for building a complete pipeline for training and evaluating a Transformer-based language model (link). It offers a sufficient amount of detail for beginners, but there are several places I found that can be further clarified.

1. The batch_size in section "Run the model" should be seq_len

First, let's look at the for loop in train(), where a batch is taken from train_data and processed into data and targets:

for batch, i in enumerate(...
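To see why the per-step length is a sequence length rather than a batch size, here is a minimal sketch of the tutorial-style get_batch logic, using plain Python lists instead of torch tensors so the shapes are easy to inspect. The bptt value and the list-based source are assumptions for illustration, not the tutorial's exact code.

```python
bptt = 4  # assumed maximum sequence length per training step

def get_batch(source, i):
    # source: a flat list of token ids; i: starting index into source.
    # The slice length can be shorter than bptt near the end of the data,
    # so the quantity varying inside the train() loop is a seq_len,
    # not a fixed batch_size.
    seq_len = min(bptt, len(source) - 1 - i)
    data = source[i:i + seq_len]
    # targets are the same sequence shifted by one position
    target = source[i + 1:i + 1 + seq_len]
    return data, target

source = list(range(10))
data, target = get_batch(source, 0)
print(data)    # [0, 1, 2, 3]
print(target)  # [1, 2, 3, 4]
```

In the real tutorial, data and target are 2-D tensors of shape [seq_len, batch_size] and a flattened [seq_len * batch_size] respectively; the sketch above keeps only the one-step-shift relationship that matters for this point.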
Written by
Yang Xu

I am an Assistant Professor of Computer Science at San Diego State University. I do research in machine learning and NLP.