Condor-CE: Do not hold running jobs with expired proxy

Description

The Condor-CE currently puts any job with an expired proxy on hold, whether or not the job is running.  We should only hold non-running jobs.
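As a rough illustration of the intended behavior (not the actual Condor-CE configuration or code), the hold decision could be modeled as below. The function name and the dict-based job ad are hypothetical; `JobStatus` and `x509UserProxyExpiration` are standard HTCondor job attributes, and the handling of jobs with no expiration attribute is an assumption made only for this sketch.

```python
import time

# Standard HTCondor JobStatus codes: 1 = Idle, 2 = Running, 5 = Held.
IDLE, RUNNING, HELD = 1, 2, 5


def should_hold_for_expired_proxy(job_ad, now=None):
    """Hypothetical helper: decide whether an expired proxy warrants a hold.

    job_ad is a plain dict standing in for a job ClassAd.
    """
    now = time.time() if now is None else now
    expiration = job_ad.get("x509UserProxyExpiration")
    if expiration is None:
        # Assumption for this sketch only: jobs without an expiration
        # attribute are left alone. The real CE policy may differ.
        return False
    if job_ad.get("JobStatus") == RUNNING:
        # Proposed change: never hold a running job for an expired proxy.
        return False
    return now > expiration
```

With this rule, an idle job whose proxy has expired is still held, while a running pilot with the same expired proxy keeps running.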

Why is this OK?

  • The proxy in the spool directory on the CE is not necessarily the one used to communicate between the factory and the CE.  An expired proxy can be subsequently updated.

  • The glideinWMS pilot only needs GSI authentication to establish its initial connection to the collector; afterward, it uses the established HTCondor security session between the startd and the collector.

This recently bit us at Nebraska, where a bunch of otherwise-healthy CMS pilots were preempted.

Freshdesk Tickets

None

Assignee

Brian Lin

Reporter

Brian Bockelman

Priority

Major

Labels

Components

Fix versions
