Calls to scontrol time out for busy Slurm queues


JLab is having issues with their new CE because it times out on the calls to scontrol [1] that are used to fill the BLAHP's Slurm job cache.

This is because scontrol doesn't (and can't) limit its query to a particular user, so it grabs all jobs. In JLab's case that is 20-30k jobs at any given time, and the query takes upwards of 4 minutes. We should consider replacing calls to scontrol throughout with calls to sacct or squeue.
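
For illustration, here is a rough sketch of what a per-user query with squeue could look like. It is not the BLAHP's actual code: the function names, the `|` separator, and the 60-second timeout are assumptions made up for this example. It relies only on squeue's `--user`, `--states`, `--noheader`, and `--format` options.

```python
import subprocess


def build_squeue_command(user):
    """Build an squeue command restricted to one user's jobs (hypothetical helper)."""
    return [
        "squeue",
        "--noheader",         # no header line, easier to parse
        "--user=" + user,     # only this user's jobs, not the whole queue
        "--states=all",       # include jobs in any state squeue still knows about
        "--format=%i|%T",     # job ID and extended state, separated by '|'
    ]


def parse_squeue_output(text):
    """Turn 'jobid|STATE' lines into a {job_id: state} dictionary."""
    jobs = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        job_id, _, state = line.partition("|")
        jobs[job_id] = state
    return jobs


def query_user_jobs(user, timeout=60):
    """Run squeue for one user; the timeout value is an assumption."""
    result = subprocess.run(
        build_squeue_command(user),
        capture_output=True,
        text=True,
        check=True,
        timeout=timeout,
    )
    return parse_squeue_output(result.stdout)
```

Calling `query_user_jobs("someuser")` on a submit host would return only that user's jobs, so the size of the result no longer grows with the total number of jobs in the queue.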



Mat Selmeci
March 30, 2021, 9:33 PM

Simple fix. Review passed

Brian Lin
March 30, 2021, 9:22 PM

I caught an issue while reviewing HTCONDOR-333. Adding Mat as a reviewer.

Jaime Frey
February 26, 2021, 6:59 PM

The only changelog I see is for the debian packaging, which has a single “Initial release” item.

Tim Theisen
January 7, 2021, 4:18 PM

You could add a one line entry in the changelog(s).

Jaime Frey
January 6, 2021, 8:27 PM

That is a good question.

Brian Lin