== Introduction to Neural Networks ==
== Parallelization Methods ==

=== Data Parallelism ===

This section details one way to parallelize your neural network. As image recognition is graphical in nature, multiple GPUs are a natural way to parallelize dataset training. <code>DataParallel</code> is a single-machine parallel model that uses multiple GPUs [https://pytorch.org/tutorials/beginner/blitz/data_parallel_tutorial.html]. It is more convenient than a multi-machine, distributed training model.

You can easily put your model on a GPU by writing:

<syntaxhighlight lang="python">
device = torch.device("cuda:0")
model.to(device)
</syntaxhighlight>

Then, you can copy all your tensors to the GPU:

<syntaxhighlight lang="python">
mytensor = my_tensor.to(device)
</syntaxhighlight>

However, PyTorch will only use one GPU by default. In order to run on multiple GPUs you need to use <code>DataParallel</code>:

<syntaxhighlight lang="python">
model = nn.DataParallel(model)
</syntaxhighlight>

==== Imports and Parameters ====

Import the following modules and define your parameters:

<syntaxhighlight lang="python">
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader

# Parameters and DataLoaders
input_size = 5
output_size = 2
batch_size = 30
data_size = 100
</syntaxhighlight>

Device:

<syntaxhighlight lang="python">
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
</syntaxhighlight>

==== Dummy DataSet ====

You can make a random dummy dataset:

<syntaxhighlight lang="python">
class RandomDataset(Dataset):

    def __init__(self, size, length):
        self.len = length
        self.data = torch.randn(length, size)

    def __getitem__(self, index):
        return self.data[index]

    def __len__(self):
        return self.len

rand_loader = DataLoader(dataset=RandomDataset(input_size, data_size),
                         batch_size=batch_size, shuffle=True)
</syntaxhighlight>

==== Simple Model ====

Here is a simple linear model definition, but <code>DataParallel</code> can be used with any model (CNN, RNN, etc.).
<syntaxhighlight lang="python">
class Model(nn.Module):
    # Our model

    def __init__(self, input_size, output_size):
        super(Model, self).__init__()
        self.fc = nn.Linear(input_size, output_size)

    def forward(self, input):
        output = self.fc(input)
        print("\tIn Model: input size", input.size(),
              "output size", output.size())
        return output
</syntaxhighlight>

==== Create Model and DataParallel ====

Now that everything is defined, we need to create an instance of the model and check if we have multiple GPUs.

<syntaxhighlight lang="python">
model = Model(input_size, output_size)
if torch.cuda.device_count() > 1:
    print("Let's use", torch.cuda.device_count(), "GPUs!")
    # dim = 0 [30, xxx] -> [10, ...], [10, ...], [10, ...] on 3 GPUs
    model = nn.DataParallel(model)  # wrap model using nn.DataParallel

model.to(device)
</syntaxhighlight>

==== Run the Model ====

The print statement will let us see the input and output sizes.
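The run loop itself is not shown in this section. Below is a minimal, self-contained sketch of it, following the linked beginner tutorial; as an assumption to keep the example short, a plain <code>nn.Linear</code> layer stands in for the <code>Model</code> class defined above. On a machine with multiple GPUs, <code>DataParallel</code> splits each batch across them; on a CPU-only machine the same loop still runs on one device.

```python
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader

# Same parameters as in "Imports and Parameters" above
input_size = 5
output_size = 2
batch_size = 30
data_size = 100
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Same dummy dataset as in "Dummy DataSet" above
class RandomDataset(Dataset):
    def __init__(self, size, length):
        self.len = length
        self.data = torch.randn(length, size)
    def __getitem__(self, index):
        return self.data[index]
    def __len__(self):
        return self.len

rand_loader = DataLoader(dataset=RandomDataset(input_size, data_size),
                         batch_size=batch_size, shuffle=True)

# A plain linear layer stands in for the Model class (assumption)
model = nn.Linear(input_size, output_size)
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)  # split each batch across the GPUs
model.to(device)

# Run the model: move each batch to the device and pass it through
for data in rand_loader:
    input = data.to(device)
    output = model(input)
    print("Outside: input size", input.size(),
          "output size", output.size())
```

With <code>data_size = 100</code> and <code>batch_size = 30</code>, the loop yields three batches of 30 and a final batch of 10, which is exactly the splitting behavior the print statements make visible.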
=== A More Intermediate Example ===

<code>DataParallel</code> copies the whole model to every GPU and splits the data. Model parallelism instead places different layers of a single model on different GPUs, moving activations between devices inside <code>forward()</code>. The toy model from the PyTorch model parallel tutorial puts <code>net1</code> on <code>cuda:0</code> and <code>net2</code> on <code>cuda:1</code>:

<syntaxhighlight lang="python">
class ToyModel(nn.Module):

    def __init__(self):
        super(ToyModel, self).__init__()
        self.net1 = torch.nn.Linear(10, 10).to('cuda:0')
        self.relu = torch.nn.ReLU()
        self.net2 = torch.nn.Linear(10, 5).to('cuda:1')

    def forward(self, x):
        x = self.relu(self.net1(x.to('cuda:0')))
        return self.net2(x.to('cuda:1'))
</syntaxhighlight>
Training then looks just like training on a single GPU, except that the labels must be placed on the same device as the final layer:

<syntaxhighlight lang="python">
import torch.optim as optim

model = ToyModel()
loss_fn = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.001)

optimizer.zero_grad()
outputs = model(torch.randn(20, 10))
labels = torch.randn(20, 5).to('cuda:1')
loss_fn(outputs, labels).backward()
optimizer.step()
</syntaxhighlight>
<code>backward()</code> and <code>torch.optim</code> will automatically take care of gradients as if the model were on one GPU. You only need to make sure that the labels are on the same device as the outputs when calling the loss function [https://pytorch.org/tutorials/intermediate/model_parallel_tutorial.html].
== Getting Started With Jupyter ==
To use PyTorch from Jupyter, first install it with pip (this is the version-pinned command from the PyTorch "Get Started" page):

<syntaxhighlight lang="shell">
pip install torch===1.7.0 torchvision===0.8.1 torchaudio===0.7.0 -f https://download.pytorch.org/whl/torch_stable.html
</syntaxhighlight>
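Once the install finishes, a quick sanity check from a Jupyter notebook cell (this snippet is an addition, not part of the install command) confirms that the kernel can import the library and reports whether CUDA is available:

```python
import torch

# Report the installed PyTorch version and CUDA availability
version = torch.__version__
cuda_ok = torch.cuda.is_available()
print("PyTorch", version, "| CUDA available:", cuda_ok)
```

If <code>cuda_ok</code> is <code>False</code> on a machine with a GPU, the installed wheel is likely CPU-only and should be reinstalled from the CUDA-specific index on the same Get Started page.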
== Progress Report ==
*Update 1: Friday, November 27, 2020 - Started on Introduction to Neural Networks Section
*Update 2: Friday, November 27, 2020 - Installation and Configuration of Jupyter Lab
*Update 3: Saturday, November 28, 2020 - Practiced Working With and Learning About Jupyter Lab
*Update 4: Saturday, November 28, 2020 - Created a 4 layer ANN on Jupyter Lab
*Update 5: Saturday, November 28, 2020 - Initiated Training of the ANN and Verified Digit Recognition Capabilities
*Update 6: Sunday, November 29, 2020 - Finished Introduction to Neural Networks
*Update 7: Sunday, November 29, 2020 - Implemented a basic CNN Based on Previous Implementation of ANN
*Update 8: Monday, November 30, 2020 - Added Section on Data Parallel
==References==
*[1] Khan, Faisa. “Infographics Digest - Vol. 3.” Medium, [https://medium.com/datadriveninvestor/infographics-digest-vol-3-da67e69d71ce]
*[2][5][8] “ANN vs CNN vs RNN | Types of Neural Networks.” Analytics Vidhya, 17 Feb. 2020, [https://www.analyticsvidhya.com/blog/2020/02/cnn-vs-rnn-vs-mlp-analyzing-3-types-of-neural-networks-in-deep-learning/]
*[3][4][6] 3Blue1Brown. But What Is a Neural Network? | Deep Learning, Chapter 1. 2017. YouTube, [https://www.youtube.com/watch?v=aircAruvnKk]
*[7] What Is Backpropagation Really Doing? | Deep Learning, Chapter 3. 2017. YouTube, [https://www.youtube.com/watch?v=Ilg3gGewQ5U&list=PLZHQObOWTQDNU6R1_67000Dx_ZCJB-3pi&index=3]
*PyTorch Tutorials: Beginner Data Parallel, [https://pytorch.org/tutorials/beginner/blitz/data_parallel_tutorial.html]
*PyTorch Tutorials: Intermediate Model Parallel, [https://pytorch.org/tutorials/intermediate/model_parallel_tutorial.html]
*Building Our Neural Network - Deep Learning and Neural Networks with Python and Pytorch p.3, YouTube, [https://www.youtube.com/watch?v=ixathu7U-LQ]
*Training Model - Deep Learning and Neural Networks with Python and Pytorch p.4, YouTube, [https://www.youtube.com/watch?v=9j-_dOze4IM]
*Deep Learning with PyTorch: Building a Simple Neural Network | packtpub.com, YouTube, [https://www.youtube.com/watch?v=VZyTt1FvmfU&list=LL&index=4]
*GitHub: PyTorch Neural Network Module Implementation, [https://github.com/pytorch/pytorch/tree/master/torch/nn/modules]
*Project Jupyter Official Documentation, [https://jupyter.org/documentation]
*PyTorch: Get Started, [https://pytorch.org/get-started/locally/]