Deep Learning With PyTorch

This eBook not only explains theoretical knowledge but also emphasizes engineering practice. Through a large number of practical cases, especially on how to train, optimize, and deploy models, readers will learn to use PyTorch to complete a wide range of deep learning tasks.

Textbook Cover

You need basic knowledge of Python to study this course. You can gauge your Python level by reading the file download.py. If you don't know Python yet, it is recommended to read the textbook Introduction to Python Programming.


All of the source code, including this website, lives in the GitHub artinte/deep-learning repository. You can follow my YouTube channel and support it with likes and subscriptions; the more support, the faster the updates!

Preface

Learning machine learning is difficult, especially if you are coming from another major. I saw the trend toward artificial intelligence about six years ago (2019) and wanted to move into the AI industry, but I didn't have a good entry point.

Since 2024, I have been learning deep learning systematically. At first I wrote in Google Docs, then a Machine Learning Series, and now Deep Learning with PyTorch. Along the way, I have found that the most important things are a clear goal and persistence; gradually you will discover the fun of learning.

Deep Learning with PyTorch draws on a large number of excellent articles, including published papers, and organizes them into an e-book. Blue links point to the relevant references. I would like to thank everyone here for their selfless dedication, and wish you peace and happiness!

Chapters 1 - 6: Fundamentals of Deep Learning

In these chapters, we'll establish a solid foundation in deep learning by exploring the fundamental concepts of various neural network architectures. We'll cover:

  • Dense Neural Networks (DNNs): The building blocks of deep learning.

  • Convolutional Neural Networks (CNNs): Essential for image processing and computer vision.

  • Transformer Models: Crucial for natural language processing (NLP) and sequence-to-sequence tasks.

  • Diffusion Models: A newer class of generative models used for tasks like image generation.

Chapters 7 - 9: Practical Applications of Deep Learning

This section focuses on applying the concepts from the first six chapters to real-world problems. We'll explore advanced techniques and their applications in various domains, including:

  • Text and Audio Processing

  • Image and Video Analysis

We'll also introduce advanced techniques like Variational Autoencoders (VAEs), which are powerful generative models.

Chapter 10: Reinforcement Learning

This chapter provides an introduction to Reinforcement Learning (RL), a critical component of modern AI. Our primary references will be the textbook "Reinforcement Learning: An Introduction" and the official PyTorch tutorials.

We will specifically examine how RL is utilized in developing Large Language Models (LLMs), for example, through techniques like Reinforcement Learning from Human Feedback (RLHF) to enhance model performance.

Chapters 11 - 14: Advanced Topics and Optimization

These chapters are dedicated to practical, advanced topics essential for working with large-scale deep learning models. We will focus on:

  • Extending PyTorch: Customizing the framework for specific needs.

  • Model Deployment: Running models on different hardware devices.

  • Optimization Techniques: Improving model efficiency and performance.

  • Distributed Training: Methods for training models that are too large for a single device.

These topics are crucial because modern deep learning models are often too large to be handled with basic methods.

Chapter 15: Graph Neural Networks (GNNs)

The final chapter introduces Graph Neural Networks (GNNs). Because of their more complex structure, we'll focus on the basics and explore how to implement them using the PyG (PyTorch Geometric) library.


My goal is to find a high-paying, AI-related job, or perhaps become a YouTuber. I haven't succeeded yet :) and as of now (2025.08) I may need more of Forrest Gump's spirit. Here is my profile; with your support I can spend more time writing these tutorials and have more room to grow.

01 Tensor and Gradient Basics

This chapter introduces two core concepts of deep learning, tensors and gradients, and lays the foundation for the chapters that follow.

One Layer Model

1.1 Install PyTorch

pip3 install torch torchvision torchaudio

Select preferences and run the command to install PyTorch locally
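
After installation, a quick check (a minimal sketch, assuming either a CPU-only or CUDA build) confirms that PyTorch can be imported and whether a GPU is visible:

import torch

print(torch.__version__)           # installed PyTorch version
print(torch.cuda.is_available())   # True if a CUDA GPU can be used
x = torch.rand(5, 3)               # create a random tensor as a smoke test
print(x)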

1.2 Introduction to Tensors

A torch.Tensor is a multi-dimensional matrix containing elements of a single data type.

Introduction to PyTorch Tensors

Indexing on ndarrays — NumPy v2.2 Manual

Tensor Views - PyTorch 2.7 Documentation
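
As a small illustration of the ideas in the references above, here is how tensor creation, indexing, and views behave in practice:

import torch

# Create tensors from Python data and with factory functions.
a = torch.tensor([[1., 2.], [3., 4.]])      # 2x2, dtype inferred as float32
b = torch.zeros(2, 3, dtype=torch.int64)    # explicit dtype

# Every tensor has a shape, a dtype, and a device.
print(a.shape, a.dtype, a.device)

# Indexing and slicing work like NumPy.
print(a[0, 1])    # element at row 0, column 1
print(a[:, 0])    # first column

# Views share the same underlying storage; no data is copied.
c = a.view(4)     # flatten to shape (4,)
c[0] = 10.0
print(a[0, 0])    # also 10.0, because c is a view of a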

1.3 Data Representation

Explains common data categories used in machine learning and data science, focusing on how they are represented as tensors (multi-dimensional arrays).

MNIST Handwritten Digit Database

A hands-on introduction to video technology: image, video, codec (av1, vp9, h265) and more (ffmpeg encoding).

1.4 Principles of Deep Learning

Introduces what deep learning is, its relationship to neural networks, and the components of a neural network and how they work together.

Artificial Intelligence, Machine Learning, and Deep Learning - Deep Learning with Python

1.5 Calculus

Calculus is designed for the typical two- or three-semester general calculus course, incorporating innovative features to enhance student learning. The book guides students through the core concepts of calculus and helps them understand how those concepts apply to their lives and the world around them. Due to the comprehensive nature of the material, we are offering the book in three volumes for flexibility and efficiency.

Calculus Volume 1 - OpenStax

Calculus Volume 2 - OpenStax

Calculus Volume 3 - OpenStax

1.6 Gradient Descent

Chain rule interpretation, real-valued circuits, patterns in gradient flow.

Backpropagation, Intuitions
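
To connect the chain rule to code, here is a minimal gradient-descent sketch that uses autograd to fit y = 2x; the data and learning rate are arbitrary choices for illustration:

import torch

# A single weight and bias; the target function is y = 2x.
w = torch.randn(1, requires_grad=True)
b = torch.zeros(1, requires_grad=True)
x = torch.tensor([1.0, 2.0, 3.0, 4.0])
target = 2.0 * x

lr = 0.05
for step in range(200):
    y = w * x + b
    loss = ((y - target) ** 2).mean()   # mean squared error
    loss.backward()                     # autograd applies the chain rule
    with torch.no_grad():               # manual gradient-descent update
        w -= lr * w.grad
        b -= lr * b.grad
        w.grad.zero_()
        b.grad.zero_()

print(w.item(), b.item())   # w should approach 2 and b should approach 0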

1.7 Neural Network from Scratch

A simple explanation of how they work and how to implement one from scratch in Python.

Machine Learning for Beginners: An Introduction to Neural Networks

02 Fully Connected Network

Fully connected neural networks (FCNNs) are a type of artificial neural network where the architecture is such that all the nodes, or neurons, in one layer are connected to the neurons in the next layer.

Simple DNN Graph

2.1 Linear Algebra

This sixth edition of Professor Strang's most popular book, Introduction to Linear Algebra, introduces the ideas of independent columns and the rank and column space of a matrix early on for a more active start. Then the book moves directly to the classical topics of linear equations, fundamental subspaces, least squares, eigenvalues and singular values – in each case expressing the key idea as a matrix factorization. The final chapters of this edition treat optimization and learning from data: the most active application of linear algebra today.

Introduction to Linear Algebra, Sixth Edition

LibreTexts - Mathematics

2.2 Points Classification

In this post we will implement a simple 3-layer neural network from scratch.

Implementing a Neural Network from Scratch in Python

2.3 PyTorch Basics

Most machine learning workflows involve working with data, creating models, optimizing model parameters, and saving the trained models. This tutorial introduces you to a complete ML workflow implemented in PyTorch, with links to learn more about each of these concepts.

Learn the Basics - PyTorch
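
As a condensed sketch of that workflow (using random tensors in place of a real dataset such as FashionMNIST), the loop of data, model, loss, optimizer, training, and saving looks roughly like this:

import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Toy data standing in for a real dataset.
X = torch.randn(512, 28 * 28)
y = torch.randint(0, 10, (512,))
loader = DataLoader(TensorDataset(X, y), batch_size=64, shuffle=True)

model = nn.Sequential(nn.Linear(28 * 28, 128), nn.ReLU(), nn.Linear(128, 10))
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)

for epoch in range(3):
    for xb, yb in loader:
        pred = model(xb)
        loss = loss_fn(pred, yb)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

torch.save(model.state_dict(), "model.pth")   # save the trained weights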

2.4 Activation Function

The activation function of a node in an artificial neural network is a function that calculates the output of the node based on its individual inputs and their weights. Nontrivial problems can be solved using only a few nodes if the activation function is nonlinear.

A Beginner’s Guide to the Rectified Linear Unit (ReLU)
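
A tiny sketch comparing ReLU with two other common nonlinearities:

import torch
from torch import nn

x = torch.tensor([-2.0, -0.5, 0.0, 1.5])
relu = nn.ReLU()
print(relu(x))             # tensor([0.0, 0.0, 0.0, 1.5]): negatives are clipped to zero
print(torch.sigmoid(x))    # squashes inputs into (0, 1)
print(torch.tanh(x))       # squashes inputs into (-1, 1)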

2.5 Loss Function

A loss function is a crucial component in machine learning that quantifies the difference between a model's predicted output and the actual target values.

What is a loss function?

2.6 Optimizer

An optimizer in machine learning, particularly in deep learning, is a function or algorithm that adjusts the model's parameters (like weights and biases) to minimize the loss function, thereby improving the model's performance.

Optimizer Implementations

torch.optim - PyTorch

Neural Network Optimizers from Scratch in Python
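
A minimal sketch of the standard zero_grad / backward / step cycle; Adam and an MSE loss are arbitrary choices here:

import torch
from torch import nn

model = nn.Linear(4, 1)
loss_fn = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

x = torch.randn(16, 4)
target = torch.randn(16, 1)

optimizer.zero_grad()              # clear gradients from the previous step
loss = loss_fn(model(x), target)
loss.backward()                    # compute gradients of the loss w.r.t. parameters
optimizer.step()                   # update parameters using the Adam rule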

03 Convolutional Network

A convolutional neural network (CNN) is a type of feedforward neural network that learns features via filter (or kernel) optimization.

Simple CNN Architecture
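
Before building a CNN from scratch, it may help to see the same idea expressed with PyTorch modules; this is a rough sketch for 28x28 grayscale inputs, not a reference implementation:

import torch
from torch import nn

class SimpleCNN(nn.Module):
    """Two conv blocks followed by a linear classifier."""
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 7 * 7, num_classes)   # 28 -> 14 -> 7 after two poolings

    def forward(self, x):
        x = self.features(x)
        return self.classifier(torch.flatten(x, 1))

out = SimpleCNN()(torch.randn(8, 1, 28, 28))
print(out.shape)   # torch.Size([8, 10])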

3.1 CNN from Scratch

CNNs, Part 1: An Introduction to Convolutional Neural Networks

CNNs, Part 2: Training a Convolutional Neural Network

3.2 AlexNet

We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images in the LSVRC-2010 ImageNet training set into the 1000 different classes.

ImageNet Classification with Deep Convolutional Neural Networks

3.3 ResNet

Deeper neural networks are more difficult to train. We present a residual learning framework to ease the training of networks that are substantially deeper than those used previously. We explicitly reformulate the layers as learning residual functions with reference to the layer inputs, instead of learning unreferenced functions.

Deep Residual Learning for Image Recognition

3.4 U-Net

In this paper, we present a network and training strategy that relies on the strong use of data augmentation to use the available annotated samples more efficiently. The architecture consists of a contracting path to capture context and a symmetric expanding path that enables precise localization.

U-Net: Convolutional Networks for Biomedical Image Segmentation

3.5 DenseNet

In this paper, we embrace this observation and introduce the Dense Convolutional Network (DenseNet), which connects each layer to every other layer in a feed-forward fashion. Whereas traditional convolutional networks with L layers have L connections - one between each layer and its subsequent layer - our network has L(L+1)/2 direct connections.

Densely Connected Convolutional Networks

04 Recurrent Network

Recurrent neural networks (RNNs) are a class of artificial neural networks designed for processing sequential data, such as text, speech, and time series, where the order of elements is important.

Many to Many
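
A quick sketch of how PyTorch's built-in recurrent layers consume a batch of sequences (the shapes below are arbitrary illustrative choices):

import torch
from torch import nn

# A batch of 4 sequences, 10 time steps each, 8-dimensional inputs.
rnn = nn.RNN(input_size=8, hidden_size=16, batch_first=True)
x = torch.randn(4, 10, 8)
output, h_n = rnn(x)

print(output.shape)   # (4, 10, 16): hidden state at every time step
print(h_n.shape)      # (1, 4, 16): final hidden state of the single layer

# nn.LSTM and nn.GRU are drop-in replacements with gating (LSTM also returns a cell state).
lstm = nn.LSTM(input_size=8, hidden_size=16, batch_first=True)
output, (h_n, c_n) = lstm(x)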

4.1 RNN from Scratch

A simple walkthrough of what RNNs are, how they work, and how to build one from scratch in Python.

An Introduction to Recurrent Neural Networks for Beginners

4.2 Word Embeddings

How to represent words as dense vectors (embeddings) so that similar words have similar representations — useful for NLP tasks.

Word embeddings - TensorFlow

4.3 Word2Vec

word2vec is not a singular algorithm, rather, it is a family of model architectures and optimizations that can be used to learn word embeddings from large datasets.

Word2Vec - TensorFlow

4.4 Text Generation With RNN

Recurrent Neural Networks Tutorial, Part 1 – Introduction to RNNs

Recurrent Neural Networks Tutorial, Part 2 – Implementing a RNN with Python, Numpy and Theano

Recurrent Neural Networks Tutorial, Part 3 – Backpropagation Through Time and Vanishing Gradients

Recurrent Neural Network Tutorial, Part 4 – Implementing a GRU and LSTM RNN with Python and Theano

Learning to store information over extended time intervals via recurrent backpropagation takes a very long time, mostly due to insufficient, decaying error back flow. We briefly review Hochreiter's 1991 analysis of this problem, then address it by introducing a novel, efficient, gradient-based method called "Long Short-Term Memory" (LSTM).

Long Short-term Memory

In this paper we compare different types of recurrent units in recurrent neural networks (RNNs). Especially, we focus on more sophisticated units that implement a gating mechanism, such as a long short-term memory (LSTM) unit and a recently proposed gated recurrent unit (GRU).

Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling

4.5 Neural Machine Translation

In this paper, we conjecture that the use of a fixed-length vector is a bottleneck in improving the performance of this basic encoder-decoder architecture, and propose to extend this by allowing a model to automatically (soft-)search for parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly.

Neural Machine Translation by Jointly Learning to Align and Translate

4.6 Attention-based NMT

This paper examines two simple and effective classes of attentional mechanism: a global approach which always attends to all source words and a local one that only looks at a subset of source words at a time.

Effective Approaches to Attention-based Neural Machine Translation

05 Transformer

The transformer is a deep learning architecture that was developed by researchers at Google and is based on the multi-head attention mechanism, which was proposed in the 2017 paper Attention Is All You Need.

Transformer Architecture

This chapter is the most important one in this tutorial. We will start by learning what the attention mechanism is, then read the paper "Attention Is All You Need", and work through some practical examples. Finally, we will look at BERT and ViT, two variants of the Transformer.

The Transformer is a relatively large deep learning architecture with a wide range of applications. Numerous optimizations build on it, and it is difficult to cover thoroughly in a single article. Learning it requires patience, and hands-on practice is extremely helpful for understanding how the data is computed at each step.

5.1 Attention Mechanism

Mathematically speaking, an attention mechanism computes attention weights that reflect the relative importance of each part of an input sequence to the task at hand.

We will learn what the attention mechanism is, understand how to compute it using queries, keys, and values, and look at how PyTorch implements it, laying a solid foundation for the subsequent study of the Transformer.

What is an attention mechanism?

Attention Mechanisms and Transformers
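
To make the query/key/value computation concrete, here is a minimal sketch of scaled dot-product attention alongside PyTorch's built-in equivalent; the shapes are arbitrary:

import math
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, mask=None):
    """Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V."""
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = F.softmax(scores, dim=-1)   # attention weights sum to 1 over the keys
    return weights @ v, weights

q = torch.randn(2, 5, 64)   # (batch, queries, d_k)
k = torch.randn(2, 7, 64)   # (batch, keys, d_k)
v = torch.randn(2, 7, 64)   # (batch, keys, d_v)
out, w = scaled_dot_product_attention(q, k, v)
print(out.shape, w.shape)   # (2, 5, 64) (2, 5, 7)

# PyTorch 2.x ships an optimized equivalent:
out2 = F.scaled_dot_product_attention(q, k, v)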

5.2 Attention Is All You Need

We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train.

Regardless of whether you fully understand it or not, go through the paper first to get a general impression. Later, when you work on specific example implementations and look back at this paper, you will gain more insights.

Attention Is All You Need

5.3 nn.Transformer

When studying this chapter, you will read the paper "Attention Is All You Need," whose English-to-German translation example can be reproduced on a single computer.

For a beginner, completing such an example is not easy, even with the help of AI. During my experiments, I found that the tokenizer lacked a start marker and that torch.nn.Transformer takes many parameters; with so much code to read at once, even the AI model got a bit confused.

After repeated analysis, I split the fairly large example into several files, each focusing on its own task. Once the modules were divided up this way, the code became very concise and easy to understand.

Attention Is All You Need

Attention Mechanisms and Transformers

torch.nn.Transformer - PyTorch
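
A rough sketch of how torch.nn.Transformer is called; embeddings, positional encoding, and the final output projection are deliberately left out, which is exactly why a full translation example needs several extra modules:

import torch
from torch import nn

# Shapes assume batch_first=True: (batch, sequence, d_model).
model = nn.Transformer(d_model=512, nhead=8,
                       num_encoder_layers=6, num_decoder_layers=6,
                       batch_first=True)

src = torch.randn(2, 10, 512)   # already-embedded source tokens
tgt = torch.randn(2, 9, 512)    # already-embedded (shifted) target tokens

# Causal mask so each target position only attends to earlier positions.
tgt_mask = nn.Transformer.generate_square_subsequent_mask(9)

out = model(src, tgt, tgt_mask=tgt_mask)
print(out.shape)                # torch.Size([2, 9, 512])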

5.4 Transformer from Scratch

The Transformer from "Attention is All You Need" has been on a lot of people’s minds over the last year. Besides producing major improvements in translation quality, it provides a new architecture for many other NLP tasks. The paper itself is very clearly written, but the conventional wisdom has been that it is quite difficult to implement correctly.

This chapter will implement the Transformer architecture from scratch, module by module, to give you a clear look at the model's details. To avoid any data-related distractions, we'll only use numbers. Our goal is for the model to perform a copy task: if we input the sequence 0, 1, 2, ..., 9, we expect the model to output the exact same sequence. This may seem strange, as a simple function could do this with no effort. However, the remarkable part is that we can achieve this operation after passing the data through a massive and complex network. Isn't that incredible?

The main implementation is based on the article below. The article is very well-written, but it's quite long and may be difficult to understand. This guide will take a more accessible approach, with each module's output explained in detail to clarify its inner workings.

The Annotated Transformer

5.5 nanoGPT

The simplest, fastest repository for training/finetuning medium-sized GPTs.

nanoGPT - GitHub

5.6 BERT

We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers. Unlike recent language representation models, BERT is designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers.

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

5.7 Vision Transformer

In vision, attention is either applied in conjunction with convolutional networks, or used to replace certain components of convolutional networks while keeping their overall structure in place. We show that this reliance on CNNs is not necessary and a pure transformer applied directly to sequences of image patches can perform very well on image classification tasks.

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

06 Diffusion Model

This chapter provides a comprehensive overview of the theoretical foundations and practical applications of diffusion models, breaking the topic down into seven sub-sections.

6.1 Probability Theory

This sub-section emphasizes the diverse applications of probability theory across fields such as business, healthcare, the sciences, sociology, political science, and computing. It links to resources on introductory statistics and explains fundamental concepts such as standard deviation and variance, which are crucial for understanding data distributions.

Introductory Statistics 2e - OpenStax

Standard Deviation and Variance

6.2 Gaussian Processes

This part delves into Gaussian processes, a powerful tool in machine learning for modeling functions and making predictions. The linked chapter from "Dive into Deep Learning" explores the topic in the context of deep learning.

Gaussian Processes - Dive into Deep Learning

6.3 Mathematical Foundation

This section focuses on the mathematical underpinnings of diffusion generative models. It highlights the core theoretical concepts necessary to understand how these models function at a fundamental level.

Mathematical Foundation of Diffusion Generative Models

6.4 Diffusion from Scratch

This sub-section aims to provide a practical understanding of diffusion models by explaining Stable Diffusion from a foundational perspective, allowing users to grasp its mechanisms from the ground up.

Understanding Stable Diffusion from "Scratch"

6.5 Estimating Gradients

We introduce a new generative model where samples are produced via Langevin dynamics using gradients of the data distribution estimated with score matching.

Generative Modeling by Estimating Gradients of the Data Distribution

6.6 Diffusion Probability Model

We present high quality image synthesis results using diffusion probabilistic models, a class of latent variable models inspired by considerations from nonequilibrium thermodynamics.

Denoising Diffusion Probabilistic Models

6.7 Latent Diffusion

To enable DM training on limited computational resources while retaining their quality and flexibility, we apply them in the latent space of powerful pretrained autoencoders.

High-Resolution Image Synthesis with Latent Diffusion Models

07 Text

This chapter outlines four tutorials related to natural language processing: machine translation, OCR, large language models, and chatbots.

Input and Output

7.1 Translate text with Transformer

This tutorial demonstrates how to create and train a sequence-to-sequence Transformer model to translate Portuguese into English.

Neural machine translation with a Transformer and Keras

7.2 Easy OCR

This section introduces EasyOCR, a ready-to-use Optical Character Recognition (OCR) tool. It highlights EasyOCR's broad language support, covering over 80 languages, making it versatile for extracting text from images.

EasyOCR: Ready-to-use OCR with 80+ supported languages

7.3 Language Modeling

This part discusses advancements in large language models (LLMs), specifically mentioning Llama by Meta and DeepSeek-V3.

Llama: The most intelligent, scalable, and convenient generation of Llama is here: natively multimodal, mixture-of-experts models, advanced reasoning, and industry-leading context windows. Build your greatest ideas and seamlessly deploy in minutes with Llama API and Llama Stack.

We present DeepSeek-V3, a strong Mixture-of-Experts (MoE) language model with 671B total parameters with 37B activated for each token. To achieve efficient inference and cost-effective training, DeepSeek-V3 adopts Multi-head Latent Attention (MLA) and DeepSeekMoE architectures, which were thoroughly validated in DeepSeek-V2.

Industry Leading, Open-Source AI | Llama by Meta

DeepSeek-V3 Technical Report

7.4 Chatbots

A chatbot is a computer program that simulates human conversation with an end user. This section points to a PyTorch tutorial for learning how to build one.

Chatbot Tutorial

08 Audio

Fundamentals of Music Processing (FMP)

8.1 Speech Feature Extraction

Sound is a mechanical wave that transmits energy through the vibration of a medium, such as air, water, or solids. Understanding its fundamental properties is crucial for converting it into a format that deep learning models can effectively process.

Sound Properties

WAVE PCM soundfile format

torchaudio.transforms.MelSpectrogram

Audio Feature Extractions
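
A minimal sketch of computing a mel spectrogram with torchaudio; the sine wave simply stands in for a real recording, and the STFT parameters are arbitrary choices:

import math
import torch
import torchaudio

# One second of a 440 Hz sine wave at 16 kHz.
sample_rate = 16000
t = torch.arange(sample_rate) / sample_rate
waveform = torch.sin(2 * math.pi * 440 * t).unsqueeze(0)   # shape (1, 16000)

mel = torchaudio.transforms.MelSpectrogram(
    sample_rate=sample_rate, n_fft=1024, hop_length=256, n_mels=80)
spec = mel(waveform)
print(spec.shape)   # (1, 80, time_frames)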

8.2 Automatic Speech Recognition

whisper: Robust Speech Recognition via Large-Scale Weak Supervision

Introducing Whisper

Robust Speech Recognition via Large-Scale Weak Supervision

8.3 Text-to-Speech

An Open Source text-to-speech system built by inverting Whisper.

HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis

8.4 Music Transcription

Automatic Music Transcription (AMT) is the task of extracting symbolic representations of music from raw audio.

Music Transcription with Transformers

8.5 Music Synthesis

Python Notebooks for Fundamentals of Music Processing

Performance RNN

09 Image and Video

9.1 Object Detection

TorchVision Object Detection Finetuning Tutorial

9.2 Transfer Learning

Transfer learning is a machine learning technique where a model, trained on one task, is reused as a starting point for a different but related task.

Transfer Learning for Computer Vision Tutorial
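
A minimal transfer-learning sketch in the spirit of the tutorial above: freeze a pretrained ResNet-18 backbone and train only a new classification head (the two-class head is an arbitrary example):

import torch
from torch import nn
from torchvision import models

# Start from ImageNet-pretrained ResNet-18.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for param in model.parameters():
    param.requires_grad = False                   # freeze the pretrained backbone

model.fc = nn.Linear(model.fc.in_features, 2)     # new head, e.g. ants vs. bees

optimizer = torch.optim.SGD(model.fc.parameters(), lr=1e-3, momentum=0.9)
# ...train as usual; only model.fc receives gradient updates.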

9.3 FGSM Attack

Adversarial Example Generation

9.4 Spatial Transformer

Spatial Transformer Networks Tutorial

9.5 DeepFaceLab

DeepFaceLab is the leading software for creating deepfakes.

DeepFaceLab: Integrated, flexible and extensible face-swapping framework

9.6 DeepFaceLive

9.7 Segment Anything

segment-anything: code for running inference with the Segment Anything Model (SAM)

9.8 Intro to Autoencoders

An autoencoder is a special type of neural network that is trained to copy its input to its output.

Intro to Autoencoders

Convolutional Variational Autoencoder
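
A minimal (non-variational) autoencoder sketch for flattened 28x28 images; the layer sizes are arbitrary:

import torch
from torch import nn

class AutoEncoder(nn.Module):
    """Compress 784-dimensional inputs to a small code and reconstruct them."""
    def __init__(self, latent_dim: int = 32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(784, 128), nn.ReLU(),
                                     nn.Linear(128, latent_dim))
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(),
                                     nn.Linear(128, 784), nn.Sigmoid())

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = AutoEncoder()
x = torch.rand(16, 784)                                  # e.g. flattened images in [0, 1]
loss = nn.functional.mse_loss(model(x), x)               # reconstruction loss against the input itself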

10 Reinforcement Learning

Reinforcement Learning: An Introduction

Implementation of Reinforcement Learning Algorithms

David Silver's Reinforcement Learning

10.1 Introduction to RL

Chapter 1: Introduction - Reinforcement Learning

10.2 Markov Decision Processes

Chapter 3: Markov Decision Processes - Reinforcement Learning

10.3 Dynamic Programming

Chapter 4: Dynamic Programming - Reinforcement Learning

10.4 DQN

This tutorial shows how to use PyTorch to train a Deep Q Learning (DQN) agent on the CartPole-v1 task from Gymnasium.

Reinforcement Learning (DQN) Tutorial

10.5 PPO

This tutorial demonstrates how to use PyTorch and torchrl to train a parametric policy network to solve the Inverted Pendulum task from the OpenAI-Gym/Farama-Gymnasium control library.

Reinforcement Learning (PPO) with TorchRL Tutorial

10.6 Function Approximation

11 Extending PyTorch

This chapter provides insights into extending PyTorch's capabilities. It covers custom operations, frontend APIs, and advanced topics like C++ extensions and dispatcher usage.

Extending PyTorch

11.1 Custom Python Operators

Custom Python Operators

11.2 Custom C++ and CUDA Operators

Custom C++ and CUDA Operators

11.3 Double Backward

Double Backward with Custom Functions

11.4 Fusing Conv and Batch Norm

Fusing Convolution and Batch Norm using Custom Function

12 Deploying Models

12.1 ONNX

ONNX is an open format built to represent machine learning models. ONNX defines a common set of operators - the building blocks of machine learning and deep learning models - and a common file format to enable AI developers to use models with a variety of frameworks, tools, runtimes, and compilers.
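
A minimal sketch of exporting a PyTorch model to ONNX; the model, shapes, and file name are arbitrary examples:

import torch
from torch import nn

model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2)).eval()
dummy_input = torch.randn(1, 10)   # an example input fixes the exported graph's shapes

torch.onnx.export(model, dummy_input, "model.onnx",
                  input_names=["input"], output_names=["output"])

# The file can then be loaded by ONNX Runtime or other ONNX-compatible tools:
#   import onnxruntime as ort
#   session = ort.InferenceSession("model.onnx")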

12.2 ExecuTorch

ExecuTorch is PyTorch’s solution to training and inference on the Edge.

Getting Started with ExecuTorch

12.3 LiteRT

LiteRT (short for Lite Runtime), formerly known as TensorFlow Lite, is Google's high-performance runtime for on-device AI.

LiteRT overview

12.4 TensorFlow.js

TensorFlow.js is a library for machine learning in JavaScript

TensorFlow.js — Handwritten digit recognition with CNNs

Demos - TensorFlow.js

13 Model Optimization

This chapter covers four key techniques used to improve the efficiency and performance of machine learning models.

13.1 LoRA

We propose Low-Rank Adaptation, or LoRA, which freezes the pre-trained model weights and injects trainable rank decomposition matrices into each layer of the Transformer architecture, greatly reducing the number of trainable parameters for downstream tasks.

LoRA: Low-Rank Adaptation of Large Language Models
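
A rough sketch of the idea behind LoRA: wrap a frozen nn.Linear with a trainable low-rank update. The LoRALinear class, rank, and scaling below are illustrative assumptions, not the reference implementation:

import torch
from torch import nn

class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update W x + (alpha/r) * B A x."""
    def __init__(self, linear: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.linear = linear
        for p in self.linear.parameters():
            p.requires_grad_(False)                   # freeze the pretrained weights
        self.lora_A = nn.Parameter(torch.randn(r, linear.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(linear.out_features, r))
        self.scaling = alpha / r                      # B starts at zero, so the update starts at zero

    def forward(self, x):
        return self.linear(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)

layer = LoRALinear(nn.Linear(512, 512))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)   # only the small A and B matrices are trained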

13.2 Pruning

In this tutorial, we will learn how to use torch.nn.utils.prune to sparsify your neural networks, and how to extend it to implement your own custom pruning technique.

Pruning Tutorial
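
A small sketch of torch.nn.utils.prune on a single linear layer:

import torch
from torch import nn
import torch.nn.utils.prune as prune

module = nn.Linear(16, 8)

# Zero out the 30% of weights with the smallest absolute value.
prune.l1_unstructured(module, name="weight", amount=0.3)
print(module.weight_mask.sum() / module.weight_mask.numel())   # fraction of weights kept (~0.7)

# Make the pruning permanent: remove the mask and reparametrization.
prune.remove(module, "weight")
print((module.weight == 0).float().mean())   # ~0.3 of the weights are now exactly zero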

13.3 Quantization

We'll lay a (quick) foundation of quantization in deep learning, and then take a look at how each technique looks in practice.

Practical Quantization in PyTorch
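
A minimal sketch of post-training dynamic quantization, which stores the Linear weights as int8 and quantizes activations on the fly:

import torch
from torch import nn

model = nn.Sequential(nn.Linear(256, 256), nn.ReLU(), nn.Linear(256, 10)).eval()

# Only the Linear modules are replaced by dynamically quantized versions.
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

x = torch.randn(1, 256)
print(quantized(x).shape)   # same interface, smaller model, often faster on CPU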

13.4 Distillation

Knowledge distillation is a technique that enables knowledge transfer from large, computationally expensive models to smaller ones without losing validity. This allows for deployment on less powerful hardware, making evaluation faster and more efficient.

Knowledge Distillation Tutorial

Distilling the Knowledge in a Neural Network

FitNets: Hints for Thin Deep Nets
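
As a sketch of the distillation idea from the references above, here is a hypothetical loss that blends hard-label cross-entropy with a temperature-softened KL term matching the teacher's outputs:

import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, targets, T=2.0, alpha=0.5):
    """Blend the usual cross-entropy with a KL term against the teacher's softened outputs."""
    hard = F.cross_entropy(student_logits, targets)
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                    F.softmax(teacher_logits / T, dim=-1),
                    reduction="batchmean") * (T * T)   # T^2 rescales the soft-target gradients
    return alpha * hard + (1 - alpha) * soft

student_logits = torch.randn(8, 10)
teacher_logits = torch.randn(8, 10)
targets = torch.randint(0, 10, (8,))
print(distillation_loss(student_logits, teacher_logits, targets))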

14 Distributed Training

Distributed training is a model training paradigm that spreads the training workload across multiple worker nodes, significantly improving training speed and model accuracy. While distributed training can be used for any type of ML model, it is most beneficial for large models and compute-demanding tasks such as deep learning.

Distributed - PyTorch

14.1 Distributed Data Parallel

DistributedDataParallel (DDP) is a powerful module in PyTorch that allows you to parallelize your model across multiple machines, making it perfect for large-scale deep learning applications.

Getting Started with Distributed Data Parallel
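
A rough single-node DDP sketch, assuming it is launched with torchrun (the file name, model, and hyperparameters are arbitrary):

# Launch with:  torchrun --nproc_per_node=2 train_ddp.py
import torch
import torch.distributed as dist
from torch import nn
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    # torchrun sets RANK, WORLD_SIZE and MASTER_ADDR/PORT for us.
    dist.init_process_group("nccl" if torch.cuda.is_available() else "gloo")
    rank = dist.get_rank()

    if torch.cuda.is_available():
        device = rank % torch.cuda.device_count()
        torch.cuda.set_device(device)
        ddp_model = DDP(nn.Linear(10, 1).to(device), device_ids=[device])
    else:
        device = "cpu"
        ddp_model = DDP(nn.Linear(10, 1))

    optimizer = torch.optim.SGD(ddp_model.parameters(), lr=1e-2)
    x, y = torch.randn(32, 10, device=device), torch.randn(32, 1, device=device)
    loss = nn.functional.mse_loss(ddp_model(x), y)
    loss.backward()           # gradients are all-reduced across workers here
    optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()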

14.2 Fully Sharded Data Parallel

PyTorch FSDP2 provides a fully sharded data parallelism (FSDP) implementation targeting performant eager-mode while using per-parameter sharding for improved usability.

Getting Started with Fully Sharded Data Parallel (FSDP2)

FullyShardedDataParallel

14.3 Tensor Parallel

This tutorial demonstrates how to train a large Transformer-like model across hundreds to thousands of GPUs using Tensor Parallel and Fully Sharded Data Parallel.

Large Scale Transformer model training with Tensor Parallel (TP)

Tensor Parallelism - torch.distributed.tensor.parallel

14.4 Device Mesh

DeviceMesh is a higher level abstraction that manages ProcessGroup. It allows users to effortlessly create inter-node and intra-node process groups without worrying about how to set up ranks correctly for different sub process groups.

Getting Started with DeviceMesh

14.5 Remote Procedure Call

This tutorial uses two simple examples to demonstrate how to build distributed training with the torch.distributed.rpc package.

Getting Started with Distributed RPC Framework

15 Graph Neural Network

PyG (PyTorch Geometric) is a library built upon PyTorch to easily write and train Graph Neural Networks (GNNs) for a wide range of applications related to structured data.
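
A minimal PyG sketch, assuming torch_geometric is installed: a tiny graph and a two-layer GCN for node classification (the sizes are arbitrary):

import torch
from torch import nn
from torch_geometric.data import Data
from torch_geometric.nn import GCNConv

# A tiny graph: 3 nodes with 4 features each; edges 0->1, 1->0, 1->2, 2->1.
edge_index = torch.tensor([[0, 1, 1, 2],
                           [1, 0, 2, 1]], dtype=torch.long)
x = torch.randn(3, 4)
data = Data(x=x, edge_index=edge_index)

class GCN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = GCNConv(4, 16)
        self.conv2 = GCNConv(16, 2)   # e.g. two node classes

    def forward(self, data):
        h = torch.relu(self.conv1(data.x, data.edge_index))
        return self.conv2(h, data.edge_index)

print(GCN()(data).shape)   # torch.Size([3, 2]): one score per class for each node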

15.1 Graph Foundation

Graph - Hello Algo

15.2 Core Ideas

A Gentle Introduction to Graph Neural Networks

15.3 Design of GNN

Design of Graph Neural Networks

15.4 Use-Cases & Applications

Use-Cases & Applications

15.5 Advanced Concepts

Advanced Mini-Batching

Memory-Efficient Aggregations