I tried to load a training dataset with multiple worker processes in PyTorch; the code was as follows:
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4, shuffle=True, num_workers=3)
It failed with this error:
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
    if __name__ == '__main__':
        freeze_support()
        ...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
The error message says that an attempt was made to start a new process before the current process had finished its bootstrapping phase, which usually means the child processes were not started with fork and the proper idiom was missing from the main module.
After some searching I found the cause: on Windows, multiprocessing uses the spawn start method by default. With spawn, every child process re-imports the main module; if the process-launching code is not protected by the __main__ guard, each new child runs the same top-level code as its parent and therefore spawns yet another process, and so on until the program crashes.
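The re-import behaviour can be reproduced with the standard-library multiprocessing module alone, without PyTorch. The sketch below (function and variable names are illustrative, not from the original post) forces the spawn start method that Windows uses by default:

```python
import multiprocessing as mp

def square(x):
    # Executed in a child process. With the spawn start method the child
    # re-imports this module before calling the function, so any
    # unguarded top-level code would run again in every child.
    return x * x

if __name__ == '__main__':
    # Force spawn so the behaviour matches the Windows default even on
    # Linux, where fork is the default.
    ctx = mp.get_context('spawn')
    with ctx.Pool(2) as pool:
        print(pool.map(square, [1, 2, 3]))  # [1, 4, 9]
```

Deleting the if __name__ == '__main__': guard from a script like this reproduces the RuntimeError above, because each spawned child hits the pool-creation code again while it is still bootstrapping.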
The fix is simple: move the code that creates the DataLoader (and anything else that launches worker processes) under the __main__ guard.
import torch
import torchvision
import torchvision.transforms as transforms

if __name__ == '__main__':
    transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])
    trainset = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)
    trainloader = torch.utils.data.DataLoader(trainset, batch_size=4, shuffle=True, num_workers=3)
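Why does this bite mainly on Windows? The default start method differs by platform, and you can inspect it directly with the standard library (a quick check, nothing PyTorch-specific):

```python
import multiprocessing as mp
import sys

# 'fork' is the default on Linux; 'spawn' is the default on Windows and,
# since Python 3.8, on macOS. Only spawn re-imports the main module,
# which is why the bootstrapping error is typically seen on Windows.
print(sys.platform, mp.get_start_method())
```

On Linux, where fork is the default, child processes inherit the parent's memory instead of re-importing the script, so the same unguarded code runs without error.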
Supplement: multi-process DataLoader errors in PyTorch
When using a DataLoader with multiple worker processes to feed training, the same multiprocessing problem can occur:

dataloader = DataLoader(transformed_dataset, batch_size=4, shuffle=True, num_workers=4)

The num_workers argument sets the number of processes used to load data; whenever it is nonzero, unguarded code raises the same RuntimeError shown above.
Adding if __name__ == '__main__': before the code that consumes the data solves the problem:

if __name__ == '__main__':  # this guard fixes the multiprocessing problem
    for i_batch, sample_batched in enumerate(dataloader):
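An equivalent and arguably cleaner idiom is to put the whole script body in a main() function and call it from the guard. The sketch below uses stdlib multiprocessing as a stand-in for DataLoader workers; the function names are illustrative:

```python
import multiprocessing as mp

def load_item(i):
    # Stand-in for the per-worker loading that DataLoader performs.
    return i * 2

def main():
    # Everything that starts worker processes lives inside main(), so a
    # spawned child that re-imports this module executes nothing extra
    # at import time.
    ctx = mp.get_context('spawn')
    with ctx.Pool(2) as pool:
        for item in pool.imap(load_item, range(3)):
            print(item)

if __name__ == '__main__':
    main()
```

This keeps the top level of the module free of side effects, which is exactly what the spawn start method requires.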
The above is based on my personal experience; I hope it serves as a useful reference, and I hope you will continue to support 腳本之家.