Syracuse has been running a Docker container on their worker nodes in lieu of accepting pilot jobs through the GlideinWMS factory -> CE -> batch system workflow. These containers spin up and act like pilot jobs, reporting back to the respective VO pool, with one catch: we're not producing pilot records for the contributions of these containers!

Unfortunately, these containers can be stopped by the admin at any time, so it's tough to capture all of the usage, but we can try to capture at least some of it by having the container upload a "pilot" (BatchPilot?) record every ~4 hrs.
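
A minimal sketch of what the periodic upload could look like, assuming a plain loop inside the container. Everything here is hypothetical and for illustration only: `make_record`, `report_loop`, the `send` callback, and every record field other than `SiteName` are made-up names, not part of any existing probe. Because the container can be killed at any moment, each record only covers the interval since the previous upload, so at most one interval of usage is lost.

```python
import time

REPORT_INTERVAL = 4 * 60 * 60  # seconds; the "~4 hrs" proposed above


def make_record(start, end, site_name):
    """Build a hypothetical usage record covering the interval [start, end)."""
    return {
        "SiteName": site_name,
        "StartTime": start,
        "EndTime": end,
        "WallDuration": end - start,
    }


def report_loop(send, site_name, interval=REPORT_INTERVAL,
                clock=time.time, sleep=time.sleep, max_reports=None):
    """Call send(record) once per interval.

    Runs forever when max_reports is None (the container case);
    max_reports, clock, and sleep exist only to make the loop testable.
    Returns the number of records sent.
    """
    last = clock()
    sent = 0
    while max_reports is None or sent < max_reports:
        sleep(interval)
        now = clock()
        # Each record starts where the previous one ended, so records
        # never overlap and a sudden stop loses at most one interval.
        send(make_record(last, now, site_name))
        last = now
        sent += 1
    return sent
```

In a real container, `send` would be whatever actually uploads the record, and the loop would be started alongside the pilot with something like `report_loop(upload, "ExampleSite")`.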

Brian Lin
April 14, 2021, 3:40 PM

Pushing this through to RFR along with the other tickets

Carl Edquist
March 11, 2021, 7:52 PM

PR merged.

Promoted gratia-probe-1.23.1-1 to osg-3.5-el*-testing

Carl Edquist
November 20, 2020, 7:11 PM

For now I did make a small fix to report SiteName properly (previously it had been broken and commented out, since it was trying to send "Site" instead of "SiteName"). I see at least distinct SiteNames are starting to show up now (e.g. "ISI" instead of the generic "OSG Pilot Container Probe").

It's not clear to me what we want for the fqdn part of the ProbeName, or if we even care.

Carl Edquist
November 20, 2020, 6:28 PM

I've worked through a number of issues and finally got this going in the tiger osg dev instance; records have begun showing up in GRACC.

Notably, the SiteName is generic (OSG Pilot Container Probe) and the fqdn part of the ProbeName is a fixed random string (the randomly generated hostname for the container).

would like to see a multi-site variant, with the Probe/SiteName set separately for each site. He pointed to an example in the condor probe, where the probe re-initialized in a forked process for each alternate site:
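
The condor probe example referred to above is not included in this thread. As a rough illustration of the general fork-per-site pattern only (this is not the condor probe's actual code), the sketch below forks one child per site, lets the child set its own SiteName and do its work, and keeps the parent's state untouched. `PROBE_CONFIG`, `init_and_send`, and `run_for_each_site` are hypothetical names, and the upload step is left as a placeholder.

```python
import os

# Hypothetical process-wide probe state; the default is the generic
# SiteName mentioned in this ticket.
PROBE_CONFIG = {"SiteName": "OSG Pilot Container Probe"}


def init_and_send(site):
    """Placeholder for per-site work: re-initialize, then send records."""
    PROBE_CONFIG["SiteName"] = site  # only changes this process's copy
    # ... a real probe would build and upload this site's records here ...


def run_for_each_site(sites, work=init_and_send):
    """Run work(site) in a separate forked process for each site.

    Returns a dict mapping site -> child exit code (0 on success,
    1 if work raised, -1 if the child did not exit normally).
    Uses os.fork, so this is POSIX-only.
    """
    results = {}
    for site in sites:
        pid = os.fork()
        if pid == 0:
            # Child: do the per-site work, then exit without returning
            # into the parent's loop.
            status = 0
            try:
                work(site)
            except Exception:
                status = 1
            os._exit(status)
        # Parent: wait for this child before starting the next site.
        _, wait_status = os.waitpid(pid, 0)
        if os.WIFEXITED(wait_status):
            results[site] = os.WEXITSTATUS(wait_status)
        else:
            results[site] = -1
    return results
```

Because each child gets its own copy of the process state, setting SiteName for one site cannot leak into the next, which is the reason for forking rather than re-initializing in place.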

Carl Edquist
October 22, 2020, 2:26 PM


