Discussion:
[Supervisor-users] infinite startup retries?
Paul Fox
2016-12-02 16:09:22 UTC
is there a way to get supervisor to attempt restarting a process
forever, at some low rate?

i have some services that rely on USB hardware. when they start, they
detect the hardware and continue, else they exit. i'd like them to retry
once in a while, forever, in case the hardware has been inserted.

as far as i can tell from the man page, once "startretries" (which is
undefined in the man page, but appears to be "4") have been attempted,
the process is never tried again.

is my only recourse to run a cron job to do a "supervisorctl start jobname"
every minute or two?

paul
=----------------------
paul fox, ***@foxharp.boston.ma.us (arlington, ma, where it's 44.6 degrees)
rod
2016-12-02 17:28:03 UTC
Afaik infinite retries is not supported, but using a very high startretries
(eg. 999999999999) gives you a decent amount of infinite...
Paul Fox
2016-12-02 18:08:50 UTC
rod wrote:
> Afaik infinite retries is not supported, but using a very high startretries
> (eg. 999999999999) gives you a decent amount of infinite...

oh -- startretries is something i can set in the conf file? i thought
it was an internal parameter of some sort. where can i find a full
list of the settable values? i would have expected it to be in the
man page. the backoff sequence is also not specified -- will the
delay interval eventually stop growing?

thanks,
paul


Andres Reyes Monge
2016-12-02 18:12:08 UTC
Btw, I think the cron job is a better solution for this case.

Regards
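
For illustration, here is a minimal Python sketch of what such a cron job could run instead of a bare "supervisorctl start jobname". It only issues the start when "supervisorctl status" reports the job in a state where supervisord has stopped trying. The function names, the choice of states, and the job name "usbjob" are assumptions made for this example, not anything taken from supervisor itself.

```python
import subprocess

# States in which supervisord is no longer retrying on its own.
# Which states to include is a judgment call made for this sketch.
RESTARTABLE_STATES = {"FATAL", "EXITED"}


def parse_state(status_line):
    """Return the state column from one line of `supervisorctl status` output."""
    fields = status_line.split()
    return fields[1] if len(fields) >= 2 else None


def ensure_started(name, run=subprocess.run):
    """Start `name` if supervisord has given up on it; return True if a start was issued."""
    result = run(["supervisorctl", "status", name], capture_output=True, text=True)
    lines = result.stdout.splitlines()
    state = parse_state(lines[0]) if lines else None
    if state in RESTARTABLE_STATES:
        run(["supervisorctl", "start", name], capture_output=True, text=True)
        return True
    return False
```

A crontab entry could then invoke a small script that calls ensure_started("usbjob") every minute or two. The `run` parameter exists only so the logic can be exercised without a live supervisord.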

s***@skeeved.org
2016-12-02 19:28:03 UTC

You might also want to look at the event subsystem available in supervisord.

http://supervisord.org/events.html

You could then have your code respond to, for instance, TICK_60 events
and decide whether it's the appropriate time to do whatever work they
are designed to do.
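
As a rough sketch of that idea, the Python below implements the listener side of the event protocol described on the events page: print READY, read a header line of key:value tokens, read `len` characters of payload, then acknowledge with a RESULT line. The function names and the `on_tick` callback are made up for this example. What the callback does (for instance, checking for the USB device and asking supervisord to start the job) is left open. It assumes the listener is configured in an [eventlistener:x] section with events=TICK_60.

```python
import sys


def parse_header(line):
    """Turn a 'key:value key:value ...' header line into a dict."""
    return dict(token.split(":", 1) for token in line.split())


def listen(on_tick, stdin=sys.stdin, stdout=sys.stdout):
    """Run the listener loop until stdin is closed, calling on_tick for TICK_60 events."""
    while True:
        # Tell supervisord we are ready for the next event.
        stdout.write("READY\n")
        stdout.flush()
        line = stdin.readline()
        if not line:
            break
        header = parse_header(line)
        # Text-mode read: assumes an ASCII payload, so characters == bytes.
        payload = stdin.read(int(header["len"]))
        if header["eventname"] == "TICK_60":
            on_tick(payload)
        # Acknowledge the event: "RESULT <length>\n" followed by the body "OK".
        stdout.write("RESULT 2\nOK")
        stdout.flush()
```

Because a listener's stdout is the protocol channel, any diagnostics from `on_tick` would need to go to stderr.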
Paul Lockaby
2016-12-02 19:31:58 UTC
If you go this route, I wrote a program in Perl that does this, which you could work from:

https://github.com/plockaby/supercron


Paul Fox
2016-12-02 20:24:55 UTC
thanks to you both. i've clearly been missing out on most of the
documentation for supervisord.

perhaps even _mentioning_ http://supervisord.org in the man page would
help! honestly -- it's bad enough that the man page doesn't offer
complete documentation, but to not even mention the site where the
documentation does exist is a huge omission.

paul

Paul Fox
2017-02-15 17:08:47 UTC
On Dec 2, 2016, rod wrote:
> Afaik infinite retries is not supported, but using a very high startretries
> (eg. 999999999999) gives you a decent amount of infinite...
>

i'm reviving an old thread here.

i'm interested in enhancing the algorithm supervisord uses when starting
processes that experience errors.

the documentation (http://supervisord.org/subprocess.html?highlight=retry#process-states)
says:
"When an autorestarting process is in the BACKOFF state, it will
be automatically restarted by supervisord. It will switch between
STARTING and BACKOFF states until it becomes evident that it
cannot be started because the number of startretries has exceeded
the maximum, at which point it will transition to the FATAL state.
Each start retry will take progressively more time."

looking at the source, it seems that the time delay for each backoff
is equal to the number of backoffs --- so the first backoff is for 1
second, the second for 2, the third for 3, etc. (this should really
be documented.)

this means that setting the startretries to a high value will likely
not be very useful, since at some point the retry latency will
potentially be too long to be practical.
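
to put numbers on that, here is a small Python sketch of the arithmetic. it assumes the behaviour described above (the n-th backoff lasts n seconds) and ignores the time the process itself takes to start and fail. the function names are made up for the example.

```python
def backoff_delays(retries):
    """Delay in seconds before each retry, assuming the n-th backoff lasts n seconds."""
    return list(range(1, retries + 1))


def total_wait(retries):
    """Total seconds spent in BACKOFF after `retries` retries: 1 + 2 + ... + retries."""
    return retries * (retries + 1) // 2


# After 1000 retries the most recent delay alone is 1000 seconds (about 17 minutes),
# and the cumulative backoff time is 500500 seconds (a little under 6 days).
print(backoff_delays(1000)[-1], total_wait(1000))
```

so a process that has been failing for a few days would be retried only about every quarter of an hour, and the gap keeps growing.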

it seems like having a configurable cap on the backoff delay would
make the startretries parameter much more useful -- small values would
continue to be useful for catching quick unforeseen startup failures,
while high values, along with a cap on the retry delay, would be
useful for processes that might sometimes be expected to fail to
start, which should be retried forever, and which should recover
relatively quickly when their failure conditions are fixed.

would anyone else find such a configuration option useful? i'm picturing
a new "maxbackoffsecs" parameter to specify the maximum retry backoff. at
the same time, it might also be useful to allow specifying a "startretries"
value of -1, to signify "try forever", rather than having to rely on
enough digits in 9999999.
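
as a sketch of the proposed behaviour only (not supervisord's code, and not the patch itself), the delay sequence could look like this in Python. "maxbackoffsecs" and the -1 convention are the hypothetical options from this proposal, and the function name is made up.

```python
import itertools


def capped_delays(startretries, maxbackoffsecs):
    """Yield the delay before each retry; startretries == -1 means retry forever."""
    if startretries == -1:
        attempts = itertools.count(1)
    else:
        attempts = range(1, startretries + 1)
    for n in attempts:
        # Grow by one second per backoff as before, but never beyond the cap.
        yield min(n, maxbackoffsecs)


# Five retries with a 3-second cap.
print(list(capped_delays(5, 3)))
```

with startretries = -1 the generator never ends, so a caller would take delays one at a time rather than collecting them into a list.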

(i have an initial patch, which works.)

paul
