StopIteration: Caught StopIteration in replica 0 on device 0. #18

@shuaiqiqi1227

With a single GPU the code runs normally. Why does it raise an error when I use two GPUs?
Error message:
File "/home/amax/data/cjq/Secondpose-SD/GeoAware/model_utils/extractor_sd.py", line 247, in process_features_and_mask
features = get_features(model, aug, input_image, vocab, label_list, caption, pca=raw)
File "/home/amax/data/cjq/Secondpose-SD/GeoAware/model_utils/extractor_sd.py", line 226, in get_features
features = demo.get_features(np.array(image), pca=pca)
File "/home/amax/data/cjq/Secondpose-SD/GeoAware/model_utils/extractor_sd.py", line 90, in get_features
features = self.model.get_features([inputs],pca=pca)
File "/home/amax/.conda/envs/secondpose2/lib/python3.9/site-packages/odise/modeling/wrapper/pano_wrapper.py", line 63, in get_features
results = self.model.get_features(batched_inputs, caption, pca=pca) if caption is not None else self.model.get_features(batched_inputs, pca=pca)
File "/home/amax/.conda/envs/secondpose2/lib/python3.9/site-packages/odise/modeling/meta_arch/odise.py", line 217, in get_features
features = self.backbone(images.tensor, raw=pca)
File "/home/amax/.conda/envs/secondpose2/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/amax/.conda/envs/secondpose2/lib/python3.9/site-packages/odise/modeling/backbone/feature_extractor.py", line 265, in forward
return self.slide_forward(img,caption,raw)
File "/home/amax/.conda/envs/secondpose2/lib/python3.9/site-packages/odise/modeling/backbone/feature_extractor.py", line 236, in slide_forward
crop_features = self.single_forward(crop_img,caption=caption,raw=raw)
File "/home/amax/.conda/envs/secondpose2/lib/python3.9/site-packages/odise/modeling/backbone/feature_extractor.py", line 150, in single_forward
features = self.feature_extractor(dict(img=img))
File "/home/amax/.conda/envs/secondpose2/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/amax/.conda/envs/secondpose2/lib/python3.9/site-packages/odise/modeling/meta_arch/ldm.py", line 1016, in forward
return self.ldm_extractor(batched_inputs)
File "/home/amax/.conda/envs/secondpose2/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/amax/.conda/envs/secondpose2/lib/python3.9/site-packages/odise/modeling/meta_arch/ldm.py", line 862, in forward
cond_inputs = batched_inputs.get("cond_inputs", self.ldm.embed_text(captions))
File "/home/amax/.conda/envs/secondpose2/lib/python3.9/site-packages/odise/modeling/meta_arch/ldm.py", line 323, in embed_text
return self.ldm.get_learned_conditioning(text)
File "/home/amax/.conda/envs/secondpose2/lib/python3.9/site-packages/ldm/models/diffusion/ddpm.py", line 665, in get_learned_conditioning
c = self.cond_stage_model.encode(c)
File "/home/amax/.conda/envs/secondpose2/lib/python3.9/site-packages/ldm/modules/encoders/modules.py", line 131, in encode
return self(text)
File "/home/amax/.conda/envs/secondpose2/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/amax/.conda/envs/secondpose2/lib/python3.9/site-packages/ldm/modules/encoders/modules.py", line 120, in forward
tokens = batch_encoding["input_ids"].to(self.device)
File "/home/amax/.conda/envs/secondpose2/lib/python3.9/site-packages/odise/modeling/meta_arch/ldm.py", line 230, in <lambda>
TempClass.device = property(lambda m: next(m.parameters()).device)
StopIteration

File "/home/amax/.conda/envs/secondpose2/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/amax/.conda/envs/secondpose2/lib/python3.9/site-packages/torch/nn/parallel/data_parallel.py", line 171, in forward
outputs = self.parallel_apply(replicas, inputs, kwargs)
File "/home/amax/.conda/envs/secondpose2/lib/python3.9/site-packages/torch/nn/parallel/data_parallel.py", line 181, in parallel_apply
return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
File "/home/amax/.conda/envs/secondpose2/lib/python3.9/site-packages/torch/nn/parallel/parallel_apply.py", line 89, in parallel_apply
output.reraise()
File "/home/amax/.conda/envs/secondpose2/lib/python3.9/site-packages/torch/_utils.py", line 543, in reraise
raise exception
StopIteration: Caught StopIteration in replica 0 on device 0.
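
The innermost frame above is `next(m.parameters()).device`, and the outer frames go through `torch/nn/parallel/data_parallel.py`. One plausible reading, assuming PyTorch 1.5 or later: the replicas that `nn.DataParallel` builds no longer expose their weights through `parameters()`, so the iterator is empty and `next()` raises `StopIteration`. The sketch below reproduces that pattern in plain Python without importing torch. `FakeParam`, `FakeReplica`, `fallback_device`, `device_strict`, and `device_safe` are hypothetical names for illustration only, not part of odise, ldm, or PyTorch.

```python
from typing import Iterator, List


class FakeParam:
    """Hypothetical stand-in for a tensor parameter that knows its device."""

    def __init__(self, device: str) -> None:
        self.device = device


class FakeReplica:
    """Hypothetical stand-in for a module; params may be empty, as in a replica."""

    def __init__(self, params: List[FakeParam], fallback_device: str) -> None:
        self._params = list(params)
        # Assumption: the module records a device it can fall back to.
        self.fallback_device = fallback_device

    def parameters(self) -> Iterator[FakeParam]:
        return iter(self._params)


def device_strict(m: FakeReplica) -> str:
    # Same pattern as the lambda in the traceback: raises StopIteration
    # when m.parameters() yields nothing.
    return next(m.parameters()).device


def device_safe(m: FakeReplica) -> str:
    # next() with a default never raises; use the fallback when no
    # parameter is visible.
    first = next(m.parameters(), None)
    return first.device if first is not None else m.fallback_device


if __name__ == "__main__":
    normal = FakeReplica([FakeParam("cuda:0")], fallback_device="cpu")
    replica = FakeReplica([], fallback_device="cuda:1")
    print(device_safe(normal))
    print(device_safe(replica))
    try:
        device_strict(replica)
    except StopIteration:
        print("device_strict raised StopIteration on the empty replica")
```

If this reading is right, the single-GPU run works because `DataParallel` does not replicate the module there, so `parameters()` still yields the weights. Whether a fallback device, a lookup through buffers, or avoiding `DataParallel` is the better fix depends on how odise and ldm are used in this project.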
