Unusually long time in "mpi_col_den"
In the attached example, the routine "mpi_col_den" took nearly 4h of the 6h total runtime on JURECA (60 nodes with 2 MPI ranks each, and likewise for 4 nodes with 32 MPI ranks each). Also attached is the output of a run that failed by hitting the runtime limit, produced with "-debugtime".
```
Total execution time: 18014sec, minimal timing printed: 360sec
0 Total Run                        18014.40sec= 5h  0min 14sec -> 100.0%
0  measured in submodules:             3.74sec= 0h  0min  3sec
1  Iteration                       18010.52sec= 5h  0min 10sec -> 100.0%
1   measured in submodules:         3357.20sec= 0h 55min 57sec
2   gen. of hamil. and diag. (tota  3297.83sec= 0h 54min 57sec ->  18.3%
2    measured in submodules:        3296.81sec= 0h 54min 56sec
3    eigen                          3157.98sec= 0h 52min 37sec ->  17.5%
3     measured in submodules:       3157.36sec= 0h 52min 37sec
4     Diagonalization               3002.70sec= 0h 50min  2sec ->  16.7% (2calls: 1500.694sec - 1502.010sec)
4      measured in submodules:         0.06sec= 0h  0min  0sec
2   generation of new charge densi 14653.26sec= 4h  4min 13sec ->  81.3%
2    measured in submodules:       14653.12sec= 4h  4min 13sec
3    cdnval                        14648.35sec= 4h  4min  8sec ->  81.3% (2calls: 7323.044sec - 7325.301sec)
3     measured in submodules:      14648.23sec= 4h  4min  8sec
4     pwden                          628.27sec= 0h 10min 28sec ->   3.5% (2calls: 312.810sec - 315.459sec)
4     mpi_col_den                  14009.55sec= 3h 53min 29sec ->  77.8% (2calls: 7002.314sec - 7007.235sec)

Program used 240 PE
```
The "-debugtime" output is attached as mpi-9532283.out.
This issue was discussed with Daniel, but it might be that this is normal behaviour. I am adding it here to keep track of it.