=== Data Parallelism ===
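Data parallelism is available through <code>DataParallel</code>, a single-machine approach that uses multiple GPUs: each input batch is split across the available GPUs, the model is replicated on each of them, and the outputs are gathered back on one device. It is more convenient to set up than multi-machine, distributed training. You can put your model on a GPU by writing: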
device = torch.device("cuda:0")
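To show how the device selection fits together with <code>DataParallel</code>, here is a minimal sketch rather than a definitive recipe. The toy <code>nn.Linear</code> model, the batch size of 32, and the CPU fallback are assumptions made for illustration and are not part of the original text.

```python
import torch
import torch.nn as nn

# Hypothetical toy model, used only for illustration.
model = nn.Linear(10, 5)

# Fall back to the CPU so the sketch also runs on a machine without CUDA.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Wrap the model only when more than one GPU is visible; with a single
# device the plain model is used unchanged.
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)

# Move the parameters to the chosen device.
model.to(device)

# Hypothetical batch of 32 samples with 10 features each. When the model is
# wrapped, DataParallel splits the batch along dimension 0 across the GPUs.
inputs = torch.randn(32, 10).to(device)
outputs = model(inputs)
print(outputs.shape)
```

The input tensor has to be moved to the same device as the model. <code>tensor.to(device)</code> returns a new tensor instead of modifying the original in place, which is why the result is assigned back to <code>inputs</code>.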