1. Introduction to Attention Mechanisms
2. Query, Key, and Value Matrices
3. Getting Started with Our Step-by-Step Attention Calculation
4. Calculating Key Vectors
5. Query Matrix Introduction
6. Calculating Raw Attention Scores
7. Understanding the Mathematics Behind Dot Products and Vector Alignment
8. Visualising Raw Attention Scores in Two Dimensions
9. Converting Raw Attention Scores to Probability Distributions with Softmax
10. Normalisation and Scaling
11. Understanding the Value Matrix and Value Vector
12. Calculating the Final Context-Aware Rich Representation for the Word "river"
13. Understanding the Output
14. Understanding Multi-Head Attention
15. Multi-Head Attention Example and Subsequent Layers
16. Masked Language Modeling