Concepts You Should Know Before Getting Into Transformers

John Vastola
3 min read · Feb 14, 2023

Let me introduce you to the Transformer. Yes, you read that right, the Transformer. If you’re looking to get into the latest and greatest advancements in ML, then this is a must-read article.

I’m here to help you get up to speed with the basics and some important concepts that will make your journey into the world of Transformer models a breeze.

Trust me, by the end of this article, you’ll be a pro at explaining the Transformer to your colleagues and friends.

So, why all the fuss about the Transformer?

Well, because it has taken the world of NLP (Natural Language Processing) by storm and has set the bar for other models to follow. Google’s award-winning BERT (Bidirectional Encoder Representations from Transformers) model is a prime example: it was pre-trained on massive amounts of text and set state-of-the-art results on many NLP tasks when it was released.

Basic Terminology and Concepts

Before diving into the nitty-gritty details of a Transformer, let’s get a grip on some basic terminology and concepts:

  • Attention Mechanism: This is a method that allows the model to focus on specific parts of the input while making predictions.
  • Multi-Head Attention: A…
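
To make the Attention Mechanism idea a bit more concrete, here is a minimal sketch in plain Python. It assumes scaled dot-product attention, the form used in the original Transformer paper, which is only one way to build an attention mechanism. The function names (softmax, attention) and the toy vectors are made up for illustration and do not come from any particular library.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of vectors (illustrative only).

    Each query is compared with every key; the resulting weights decide
    how much of each value ends up in the output for that query.
    """
    d_k = len(keys[0])  # key dimension, used to scale the dot products
    outputs, all_weights = [], []
    for q in queries:
        # Similarity between this query and every key.
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Toy example: one query that lines up with the first key.
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0]]
queries = [[4.0, 0.0]]

outputs, weights = attention(queries, keys, values)
print(weights[0])  # the first weight is the larger one
print(outputs[0])  # so the output sits close to the first value vector
```

The query points in the same direction as the first key, so most of the weight lands on the first value. That is the "focus on specific parts of the input" from the bullet above, expressed as a weighted average.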

Written by John Vastola

Data scientist, AI enthusiast, and self-help writer sharing insights on using data science and AI for good. johnvastola.medium.com/membership