Filling in the Gaps
Dropout
In this post I want to look at the regularization technique called dropout. This technique was introduced by Srivastava, et al. in the paper Dropout: A Simple Way to Prevent…
Jan 26, 2025
Mark Cassar
Understanding Byte Pair Encoding: Part 4: Nuances
With the basics of the byte pair encoding (BPE) algorithm sorted out in my last post, I want to delve into some of the nuances of its application in GPT-2.
Jan 11, 2025
Mark Cassar
Understanding Byte Pair Encoding: Part 3: the Algorithm
I wrote about encodings and the basics of tokenization in my two earlier posts, so in this post I will dig into the actual algorithm of byte pair encoding (BPE). In the…
Jan 7, 2025
Mark Cassar
Understanding Byte Pair Encoding: Part 2: Tokenization
In my last post, I discussed encoding text, specifically using UTF-8. As I noted there, this encoding uses 1 to 4 bytes to represent all the characters in the Unicode…
Dec 23, 2024
Mark Cassar
Understanding Byte Pair Encoding: Part 1: Encodings
My goal is to get a deeper understanding of tokenization as it relates to the preprocessing of text for input into a large language model (LLM). I had heard of byte pair…
Dec 18, 2024
Mark Cassar