Sunday, June 22, 2014

checkjob / showq: job is deferred ('Execution server rejected request MSG=cannot send job to mom, state=PRERUN')

I'm using Torque 4.2.6 and Maui 3.3.1.

When I submitted a job, it ended up in the Deferred state with the following reason:

job is deferred.  Reason:  RMFailure  (cannot start job - RM failure, rc: 15043, 
msg: 'Execution server rejected request MSG=cannot send job to mom, state=PRERUN')

To find out more, check /var/log/messages or the log files under /var/spool/torque/server_logs, for example with tail -f. In my case the log showed:
LOG_ERROR::No route to host (113) in send_job_work, send_job failed to host 
comp-node-1, c0a832a7 port 15002

This gave a hint: "No route to host" pointed at a network or firewall problem rather than at Torque itself. I checked iptables, found that the firewall was running, and shut it down. After that the issue cleared.
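
As a side note, the c0a832a7 in the log line looks like the compute node's IPv4 address written in hexadecimal. Here is a minimal sketch for decoding it, so you can confirm which address the server was trying to reach. The helper name hex_to_ip is made up for this example and is not part of Torque.

```shell
# hex_to_ip: hypothetical helper, not a Torque tool.
# Converts an 8-digit hex string (as seen in the log) to a dotted-quad IPv4 address.
hex_to_ip() {
    hex=$1
    # Split the string into four 2-digit octets.
    a=$(printf '%s' "$hex" | cut -c1-2)
    b=$(printf '%s' "$hex" | cut -c3-4)
    c=$(printf '%s' "$hex" | cut -c5-6)
    d=$(printf '%s' "$hex" | cut -c7-8)
    # printf accepts 0x-prefixed hex constants for %d.
    printf '%d.%d.%d.%d\n' "0x$a" "0x$b" "0x$c" "0x$d"
}

hex_to_ip c0a832a7
```

With the address and the port from the log (15002 here) in hand, listing the firewall rules on the machines involved, for example with iptables -L -n, shows whether that traffic is being blocked.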

For more information:
  1. Troubleshooting deferred jobs, episode 80
