What’s the OOM really for?


  • #8065
    Audric
    Participant

      Hello,

      I’ve tried a lot of different configurations to see what my computer can and cannot handle.
      Before, I had an RTX 2060 (6 GB) with an Intel i5, and I found that I couldn’t go above a batch size of 8 at a resolution of 256; higher resolutions crashed with OOM tensor errors.

      I recently changed my setup to a Ryzen 7 5700X and a Radeon RX 6800 with 16 GB. But even when I try a higher resolution with this clearly better hardware, I get the same errors again and again: “OOM when allocating tensor with shape[8,204,194,194]”
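
      For reference, here is the rough back-of-envelope math I did on that tensor shape (just a sketch, assuming float32 at 4 bytes per element, which I believe is what the trainer uses by default):

      import math

      # One tensor of shape [8, 204, 194, 194], assuming float32 (4 bytes per element).
      # This is only a single buffer; training also keeps gradients and many other
      # intermediate tensors at the same time, so the real total is far larger.
      shape = (8, 204, 194, 194)
      elements = math.prod(shape)            # 61,421,952 elements
      size_mib = elements * 4 / 1024**2      # ~234 MiB for this one tensor
      print(f"{elements:,} elements -> {size_mib:.0f} MiB")

      So even a single tensor of that shape is already around 234 MiB, and I assume the training graph needs many of them at once, which would explain why it adds up so fast.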

      My question is: which memory is running out? Is it the graphics card’s VRAM, the CPU cache, or the system RAM? The only thing that didn’t change between the two setups is my 16 GB of RAM. When I train, I choose the GPU instead of the CPU, so why can’t it handle a higher resolution than before? Same for the dimensions, which I already reduced a lot from the original values.

      The only configuration that works well is the one below, exactly the same as with my RTX 2060. I don’t get why I can’t do more:

      ================ Model Summary ================
      == ==
      == Model name: 256wf_SAEHD ==
      == ==
      == Current iteration: 57269 ==
      == ==
      ==-------------- Model Options --------------==
      == ==
      == resolution: 256 ==
      == face_type: wf ==
      == models_opt_on_gpu: True ==
      == archi: df-ud ==
      == ae_dims: 256 ==
      == e_dims: 64 ==
      == d_dims: 64 ==
      == d_mask_dims: 22 ==
      == masked_training: True ==
      == uniform_yaw: True ==
      == lr_dropout: n ==
      == random_warp: True ==
      == gan_power: 0.0 ==
      == true_face_power: 0.0 ==
      == face_style_power: 0.0 ==
      == bg_style_power: 0.0 ==
      == ct_mode: none ==
      == clipgrad: False ==
      == pretrain: False ==
      == autobackup_hour: 0 ==
      == write_preview_history: False ==
      == target_iter: 1000000 ==
      == random_flip: True ==
      == batch_size: 12 ==
      == eyes_mouth_prio: False ==
      == blur_out_mask: False ==
      == adabelief: True ==
      == random_hsv_power: 0.0 ==
      == random_src_flip: False ==
      == random_dst_flip: True ==
      == gan_patch_size: 16 ==
      == gan_dims: 16 ==
      == ==
      ==---------------- Running On ----------------==
      == ==
      == Device index: 0 ==
      == Name: AMD Radeon RX 6800 ==
      == VRAM: 14.45GB ==
      == ==
      ===============================================

      Which memory is it running out of? And why, once it finally runs with a lower configuration, is everything during training easily under the limit?

      Thanks
