Parallelization Methods
Here is a toy model that contains two linear layers. Each linear layer is designed to run on a separate GPU.
import torch

def forward(self, x):
    # net1 and relu run on GPU 0; net2 runs on GPU 1
    x = self.relu(self.net1(x.to('cuda:0')))
    return self.net2(x.to('cuda:1'))
The code is very similar to a single-GPU implementation, except for the ''.to('cuda:x')'' calls, where ''cuda:0'' and ''cuda:1'' refer to two different GPUs: each intermediate output must be copied to the device that hosts the next layer.
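For completeness, here is a minimal runnable sketch of such a model, including a single training step. The class name ''ToyModel'', the layer sizes (10 → 10 → 5), the batch size, and the optimizer settings are illustrative assumptions, not part of the original; the essential point is that each layer lives on its own device and the labels must be placed on the same GPU as the model's output (''cuda:1'' here).

import torch
import torch.nn as nn
import torch.optim as optim

class ToyModel(nn.Module):  # name and layer sizes are assumptions
    def __init__(self):
        super().__init__()
        self.net1 = nn.Linear(10, 10).to('cuda:0')  # first layer on GPU 0
        self.relu = nn.ReLU()
        self.net2 = nn.Linear(10, 5).to('cuda:1')   # second layer on GPU 1

    def forward(self, x):
        x = self.relu(self.net1(x.to('cuda:0')))  # move input to GPU 0
        return self.net2(x.to('cuda:1'))          # move activation to GPU 1

model = ToyModel()
loss_fn = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.001)

optimizer.zero_grad()
outputs = model(torch.randn(20, 10))      # input starts on the CPU
labels = torch.randn(20, 5).to('cuda:1')  # labels on the output's device
loss_fn(outputs, labels).backward()       # gradients flow across both GPUs
optimizer.step()

Note that a single ''backward()'' call is enough: autograd records the cross-device copies, so gradients flow back from ''cuda:1'' to ''cuda:0'' automatically.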