Bug 17152 - With Intel optimization level >=2, doParallelMC foreach loops leave orphaned processes
Status: UNCONFIRMED
Alias: None
Product: R
Classification: Unclassified
Component: Installation
Version: R 3.3.*
Hardware: x86_64/x64/amd64 (64-bit) Linux-RHEL
Importance: P5 normal
Assignee: R-core
Reported: 2016-09-16 12:52 UTC by juergen.salk
Modified: 2016-09-16 13:09 UTC

Description juergen.salk 2016-09-16 12:52:30 UTC
Dear all,

I am not sure whether this has already been addressed, but I could not find
it in Bugzilla.

Problem description:
 
When R is compiled with the Intel compiler at optimization levels >= 2, foreach
loops with the doParallelMC backend may leave orphaned processes behind
(potentially tying up significant system resources).
 
Steps to reproduce:
 
$ # Build R
$ wget http://cran.r-project.org/src/base/R-3/R-3.3.1.tar.gz
$ tar xzvf R-3.3.1.tar.gz
$ cd R-3.3.1
$ module load compiler/intel/15.0 # or whatever you need to get icc
$ ./configure --prefix=$HOME/sw/R/3.3.1 | tee -a R-3.3.1-build.log
$ make | tee -a R-3.3.1-build.log
$ make install | tee -a R-3.3.1-build.log
$ # Install doParallel package
$ $HOME/sw/R/3.3.1/bin/R -q
> install.packages("doParallel")
> q();
Save workspace image? [y/n/c]: n
$ # Run script Example_debug.R
$ cat Example_debug.R
# Example script
# Essentially, this R-File should not produce anything new, but nevertheless
# the RAM fills up round after round

library(doParallel)
 
iter <-  320    # Number of iterations 
nodes <-  16    # How many processors should be used for the calculations?
search <- 64    # How many rounds should the model search run?
 
registerDoParallel(cores=nodes)
 
# --------------------------------------------------------------------------------                                       
 
for (s in 1:search) {
    results <- foreach(i=icount(iter)) %dopar% {
        # calculate something or generate a matrix
        mat1 <- m <- matrix(0,1000,1000)
    }
    gc()
}
 
$ $HOME/sw/R/3.3.1/bin/R CMD BATCH --no-save --no-restore Example_debug.R

Result:
 
Depending on the amount of RAM in your machine, this job may run to its end
while leaving several hundred orphaned processes behind, or it may run out of
memory with the following error message in Example_debug.Rout:

Error in mcfork() :
  unable to fork, possible reason: Cannot allocate memory
Calls: %dopar% -> <Anonymous> -> mclapply -> lapply -> FUN -> mcfork
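
A rough way to count the leftover processes once the batch job has finished is sketched below. It assumes that orphaned workers are re-parented to PID 1 and show up under the command name "R"; on systems with a different reaper process the PPID test would need adjusting. The function name count_orphaned_r is only an illustrative helper, not part of R or doParallel.

```shell
# Count processes named "R" whose parent is PID 1 (assumed to be orphans).
# Reads "PID PPID COMMAND" lines on stdin, so it can be fed from ps or a file.
count_orphaned_r() {
    awk '$2 == 1 && $3 == "R" { n++ } END { print n + 0 }'
}

# Usage on a live system:
#   ps -e -o pid= -o ppid= -o comm= | count_orphaned_r
```

A count that keeps growing from one run of Example_debug.R to the next would match the behaviour described above.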
 
This has been tested with Intel compiler versions 14 and 15 on RHEL6
and RHEL7.
 
I have worked around this issue for now by placing a `#pragma optimize("", off)`
at the top of src/library/parallel/src/fork.c.
Although this issue is probably caused by the Intel compiler rather than by R itself,
I thought I'd let you know. Maybe there is a clean way to make
the relevant code more robust against overly aggressive icc optimizations.
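
For anyone who wants to try the same workaround, the pragma could be prepended before running make with something like the following sketch. The helper name disable_icc_opt is hypothetical, and the sketch only adds the line if it is not already the first line of the file.

```shell
# Prepend the pragma to a C source file unless it is already the first line.
disable_icc_opt() {
    file=$1
    pragma='#pragma optimize("", off)'
    if [ "$(head -n 1 "$file")" != "$pragma" ]; then
        tmp=$(mktemp) || return 1
        { printf '%s\n' "$pragma"; cat "$file"; } > "$tmp" || return 1
        # Write back in place so the file keeps its original permissions.
        cat "$tmp" > "$file" && rm -f "$tmp"
    fi
}

# Usage from the top of the unpacked R-3.3.1 source tree, before `make`:
#   disable_icc_opt src/library/parallel/src/fork.c
```

Running it a second time leaves the file unchanged, so it is safe to keep in a build script.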
 
Best regards,
 
Juergen