Error Message: Bad File Descriptor
Error Message
[mpiexec@node00] HYD_sock_write (../../../../../src/pm/i_hydra/libhydra/sock/hydra_sock_intel.c:353): write error (Bad file descriptor)
[mpiexec@node00] cmd_bcast_root (../../../../../src/pm/i_hydra/mpiexec/mpiexec.c:147): error sending cwd cmd to proxy
[mpiexec@node00] stdin_cb (../../../../../src/pm/i_hydra/mpiexec/mpiexec.c:324): unable to send response downstream
[mpiexec@node00] HYDI_dmx_poll_wait_for_event (../../../../../src/pm/i_hydra/libhydra/demux/hydra_demux_poll.c:79): callback returned error status
[mpiexec@node00] main (../../../../../src/pm/i_hydra/mpiexec/mpiexec.c:2064): error waiting for event
or:
[mpiexec@host1] wait_proxies_to_terminate (../../../../../src/pm/i_hydra/mpiexec/intel/i_mpiexec.c:389): downstream from host host2 exited with status 255
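The second form of the message names the host whose proxy exited, which tells you where to look first. As a minimal sketch, assuming the message wording matches the sample above (it may differ between versions), the host and exit status can be pulled out of captured mpiexec output like this. The function name failed_proxies is hypothetical and used only for illustration.

```python
import re

# Pattern for the second message form shown above. The exact wording is an
# assumption based on that one sample and may vary between versions.
PROXY_EXIT_RE = re.compile(
    r"downstream from host (?P<host>\S+) exited with status (?P<status>\d+)"
)


def failed_proxies(mpiexec_output):
    """Return (host, status) pairs for every proxy-exit message in the text."""
    return [
        (m.group("host"), int(m.group("status")))
        for m in PROXY_EXIT_RE.finditer(mpiexec_output)
    ]


# The sample line is copied from the error message above.
sample = (
    "[mpiexec@host1] wait_proxies_to_terminate "
    "(../../../../../src/pm/i_hydra/mpiexec/intel/i_mpiexec.c:389): "
    "downstream from host host2 exited with status 255"
)
print(failed_proxies(sample))
```

Here the reported host is host2, so that node's system logs are the ones to inspect.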
Cause
The remote hydra_pmi_proxy process is unavailable. Possible reasons:
- the host rebooted;
- the process received an unexpected signal;
- out-of-memory (OOM) manager errors occurred;
- the job scheduler (PBS Pro*, SLURM*) terminated the job because a resource limit was exceeded (for example, a walltime or cputime limit).
Solution
- Check the system log files on the affected host.
- Identify why the hydra_pmi_proxy process terminated and fix the underlying issue.
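The log check can be partly automated. The sketch below groups log lines by the causes listed above. The function classify_log_lines and every pattern in CAUSE_PATTERNS are hypothetical; the sample lines are synthetic and simplified, not real system output. Actual log wording depends on the kernel, the distribution, and the scheduler version, so adjust the patterns before relying on them.

```python
import re

# Hypothetical patterns, one per cause listed above. Real log wording varies
# by kernel, distribution, and scheduler version; adjust before use.
CAUSE_PATTERNS = {
    "oom": re.compile(r"out of memory|oom-killer", re.IGNORECASE),
    "scheduler_limit": re.compile(r"time limit|walltime|cput", re.IGNORECASE),
    "signal": re.compile(r"segfault|killed by signal|core dumped", re.IGNORECASE),
    "reboot": re.compile(r"reboot|system is going down", re.IGNORECASE),
}


def classify_log_lines(lines):
    """Group log lines under the first cause pattern each one matches."""
    hits = {cause: [] for cause in CAUSE_PATTERNS}
    for line in lines:
        for cause, pattern in CAUSE_PATTERNS.items():
            if pattern.search(line):
                hits[cause].append(line.rstrip("\n"))
                break  # count each line once
    return hits


# Synthetic, simplified lines for illustration only (not real log output).
sample_log = [
    "kernel: Out of memory: Killed process 4242 (hydra_pmi_proxy)",
    "scheduler: job 17 cancelled due to time limit",
    "sshd[811]: Accepted publickey for user",
]
for cause, matched in classify_log_lines(sample_log).items():
    if matched:
        print(cause, "->", matched[0])
```

To run it against a real file, pass the open file object, for example classify_log_lines(open(path)), where path points to whichever log your system uses.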