Error Message: Bad Termination
NOTE: The values in the tables below may not reflect the exact node or MPI process where a failure can occur.
Case 1
Error Message
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= RANK 1 PID 27494 RUNNING AT node1
= KILLED BY SIGNAL: 11 (Segmentation fault)
===================================================================================
or:
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= RANK 1 PID 27494 RUNNING AT node1
= KILLED BY SIGNAL: 8 (Floating point exception)
===================================================================================
Cause
One of the MPI processes was terminated by a signal (for example, Segmentation fault or Floating point exception) on node1.
Solution
Find the reason for the MPI process termination. For example, a Segmentation fault can be caused by an out-of-memory issue, and a Floating point exception can be caused by division by zero.
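When many jobs need to be triaged, it can help to pull the rank, PID, host, and signal out of the banner automatically. The following is a minimal sketch, not part of the MPI library: the function name `parse_bad_termination` and the regular expression are illustrative assumptions based only on the banner layout shown above, and the mapping from signal number to name relies on the numbering of the system where the script runs.

```python
import re
import signal

# Assumed banner layout, taken from the examples above. \s+ and DOTALL let
# the pattern match whether the banner is wrapped across lines or not.
BANNER_RE = re.compile(
    r"RANK\s+(?P<rank>\d+)\s+PID\s+(?P<pid>\d+)\s+RUNNING AT\s+(?P<host>\S+)"
    r".*?KILLED BY SIGNAL:\s+(?P<signum>\d+)\s+\((?P<desc>[^)]*)\)",
    re.DOTALL,
)


def parse_bad_termination(text):
    """Return a dict describing the first banner found in text, or None."""
    match = BANNER_RE.search(text)
    if match is None:
        return None
    signum = int(match.group("signum"))
    try:
        # Translate the number to a symbolic name on the local system.
        name = signal.Signals(signum).name
    except ValueError:
        name = None  # number is not a known signal on this platform
    return {
        "rank": int(match.group("rank")),
        "pid": int(match.group("pid")),
        "host": match.group("host"),
        "signal": signum,
        "signal_name": name,
        "description": match.group("desc"),
    }


if __name__ == "__main__":
    banner = (
        "= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES\n"
        "= RANK 1 PID 27494 RUNNING AT node1\n"
        "= KILLED BY SIGNAL: 11 (Segmentation fault)\n"
    )
    print(parse_bad_termination(banner))
```

As the note at the top of this topic says, the reported rank and node may not be the exact place where the failure originated, so treat the parsed values as a starting point for the investigation.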
Case 2
Error Message
================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= RANK 1 PID 20066 RUNNING AT node01
= KILLED BY SIGNAL: 9 (Killed)
================================================================================
Cause
One of the MPI processes was terminated by a signal (for example, SIGTERM or SIGKILL) on node01 due to one of the following:
- a host reboot;
- an unexpected signal received;
- out-of-memory (OOM) manager errors;
- killing by the process manager (if another process was terminated before the current process);
- job termination by the Job Scheduler (PBS Pro*, SLURM*) when a resource limit is exceeded (for example, walltime or cputime).
 
Solution
- Check the system log files.
- Try to find the reason for the MPI process termination and fix the issue.
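Checking the system log for an out-of-memory kill can be sketched as follows. This is a hedged illustration only: the helper `find_oom_kills` is hypothetical, and the message wording it searches for ("Out of memory: Killed process <pid> (<name>)") is an assumption about the kernel log format, which differs between kernel versions and distributions. The sample log lines are invented for the demonstration.

```python
import re

# Assumed message shape; adjust the pattern to the wording your kernel uses.
OOM_RE = re.compile(r"Out of memory: Kill(?:ed)? process (\d+) \(([^)]+)\)")


def find_oom_kills(log_lines, pid=None):
    """Return (pid, process_name, line) for each matching OOM log entry.

    If pid is given, only entries for that process ID are returned.
    """
    hits = []
    for line in log_lines:
        match = OOM_RE.search(line)
        if match is None:
            continue
        killed_pid = int(match.group(1))
        if pid is None or killed_pid == pid:
            hits.append((killed_pid, match.group(2), line.rstrip()))
    return hits


if __name__ == "__main__":
    # Invented sample lines; real input would come from the system log.
    sample_log = [
        "kernel: Out of memory: Killed process 20066 (a.out) total-vm:1024kB",
        "sshd[123]: session opened for user test",
    ]
    for killed_pid, name, line in find_oom_kills(sample_log, pid=20066):
        print(f"PID {killed_pid} ({name}) was killed: {line}")
```

If no entry matches the PID from the banner, consider the other causes listed above, such as a scheduler limit or a host reboot.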
 