Raspberry Pi recovery timer

Thread Starter

bkcarter333

Joined Aug 18, 2018
7
I have a project using RPi devices acting as time lapse cameras in remote places.
I have been able to develop them so they are fairly reliable and stable. Several mechanisms are in place for multi-level battery backup, as well as error recovery.

The one problem I have not solved yet is the seemingly inevitable RPi freeze-up.

The solution I have thought of is as follows:
1. The RPi board has a set of pads that will initiate a restart when shorted.
2. A timer circuit (555 or other) that counts to a set time interval (e.g. 5 minutes).
3. A small background program on the RPi that starts at boot and toggles a GPIO pin high and low every so many seconds (say 10). That pin is tied to the reset pins on the counters.
4. As long as the RPi is functioning normally, the timer would then be reset every 10 seconds.
5. If the RPi freezes or has other issues, the background program will halt and stop sending the reset signal.
6. Therefore, if the timer reaches 5 minutes without being reset, it triggers the RPi reset.
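
A minimal sketch of the background program from step 3, in Python. The names `run_heartbeat` and `set_pin` are made up for illustration, and the 10 second period and pulse width are just the example numbers from the list. The pin write is passed in as a function so the loop can be tried on a desk without any GPIO hardware.

```python
import time
from typing import Callable, Optional


def run_heartbeat(set_pin: Callable[[bool], None],
                  period_s: float = 10.0,
                  pulse_s: float = 0.1,
                  cycles: Optional[int] = None,
                  sleep: Callable[[float], None] = time.sleep) -> None:
    """Pulse the heartbeat line high for pulse_s once every period_s seconds.

    set_pin is whatever actually drives the GPIO pin (hypothetical here).
    cycles=None means run forever, which is what a boot-time service would do.
    """
    done = 0
    while cycles is None or done < cycles:
        set_pin(True)               # rising edge clears the external counters
        sleep(pulse_s)
        set_pin(False)
        sleep(period_s - pulse_s)   # wait out the rest of the period
        done += 1
```

On a real Pi, `set_pin` could wrap a GPIO library call such as RPi.GPIO's `GPIO.output(pin, state)`; the library and pin number are assumptions, not something stated in the post. If the program hangs or the OS freezes, the loop stops and the external timer is left to run out.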

What I have tried so far:
1. Built a 555 timer circuit that drives 3 decimal counters. I used a 555 because the accuracy is not that important for the use case. I do NOT want to use a single 5 minute pulse from the 555. Three hundred 1 second pulses are far more dependable than one 5 minute pulse in my opinion. I only lose 1 second of accuracy if there is a misfire, which is okay.
2. Right now I have the RPi reset wired to the carry-out pin on the 3rd counter, via a transistor that shorts the reset pads. It works but has some issues.

What I need help with:
1. Is there a better way to do this altogether? I have looked at several small timer boards, but still struggle to see how they would implement the reset.
2. What resistor and capacitor values will generate a 1 second pulse on the 555? I know there is a formula, but math was never my strong suit.
3. What would be the best way to create some sort of one-shot event when the count reaches a certain number (300)? The problem with wiring the reset to the carry-out is that it stays high until the next pulse comes in to the counter. Since it is the last counter in the chain, that could be a while, and it would hold the reset high for far longer than desired.

Clear as mud I'm sure, but any help would be greatly appreciated!

Bryan
 

MrChips

Joined Oct 2, 2009
34,812
Welcome to AAC!

I am not aware of a freeze-up problem with the rPi. Every modern embedded systems MCU has a watch-dog timer designed to do exactly what you are attempting to do. I would first investigate that watch-dog and see if a recovery mechanism is already in place in the rPi OS.

While your scheme is doable, you can also look into using an internal timer to implement a watch-dog timer. Failing all of the above, we can get you the appropriate R and C values and the remaining steps to implement your idea.
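
For reference, a rough sketch of what using the built-in watchdog looks like from user space on Linux, in Python. It assumes the kernel exposes the standard `/dev/watchdog` device; whether that is enabled on a given RPi image, and what the timeout is, would need checking. The function name and parameters are hypothetical.

```python
import time


def feed_watchdog(dev_path: str = "/dev/watchdog",
                  interval_s: float = 5.0,
                  feeds: int = 3,
                  sleep=time.sleep) -> None:
    """Open the watchdog device, keep it fed for a while, then disarm it.

    Opening the device arms the hardware timer. Any write restarts the
    countdown; if the writes stop, the hardware resets the board.
    """
    with open(dev_path, "wb", buffering=0) as wd:
        for _ in range(feeds):
            wd.write(b"\0")   # any byte counts as a keep-alive
            sleep(interval_s)
        # Writing 'V' just before close asks the driver to disarm
        # ("magic close"); drivers configured as nowayout ignore it.
        wd.write(b"V")
```

A real service would loop forever and only feed the watchdog while the camera software is demonstrably healthy, rather than feeding a fixed number of times as this sketch does.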
 

Thread Starter

bkcarter333

Joined Aug 18, 2018
7
hi Bryan,
Welcome to AAC.
Why do you wait for 5 minutes before you do an RPi reset?
E
Thanks for your reply Eric.
Right now the number is pretty arbitrary. I want to allow enough time for any programmatic problems to be worked out before I kick the knees out from under it. Plus 5 minutes gives the RPi enough time, even with an SD card cleanup, to reboot and become functional again. In this case it is less important that it reboots quickly than that it reboots at all :)

That number may change as I get deeper into the logic. It could get shorter or longer.
 

Thread Starter

bkcarter333

Joined Aug 18, 2018
7
Thanks!

Yes, I am aware of the watchdog timers, but I’ve never trusted a computer to watch itself fail, and then take the appropriate action after it has already failed. In addition there will also be some additional external components that will need to be restarted along with the RPi, including a secondary controller RPi Zero.

Yes, I have had many MCUs hang on me over the years. I tend to push them hard.

Anyways, with all of the safety mechanisms already built in, I still want this as a last resort. Some of these cameras will be in very remote locations, including Alaska. I want to have every possible means of self preservation taken before I fly 3500 miles to see what is wrong.
 

Thread Starter

bkcarter333

Joined Aug 18, 2018
7
hi,
You could pulse a counter with the 555 output and use the summing counter output, capacitor coupled to give the final Reset pulse.
E
I think that is what I’m trying to do. Just not sure on the counter output at a certain number, and generating the pulse. I understand digital logic much better than the analog stuff.
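
Since the digital side is the comfortable part, here is a small behavioral model in Python of the counter logic being discussed. It is not a circuit: it only shows the idea of decoding the terminal count, emitting a pulse for one tick, and clearing the counter on that same event so the output cannot sit high. The function name and the tick-based timing are assumptions for illustration.

```python
def simulate(ticks: int, heartbeat_at: set, limit: int = 300) -> list:
    """Return the tick numbers at which a one-tick reset pulse fires.

    ticks        -- how many 1 second clock ticks to simulate
    heartbeat_at -- tick numbers at which the RPi heartbeat arrives
    limit        -- count at which the reset should fire (300 = 5 minutes)
    """
    count = 0
    fired = []
    for t in range(1, ticks + 1):
        if t in heartbeat_at:
            count = 0          # heartbeat clears the counters
            continue
        count += 1
        if count == limit:     # terminal count decoded
            fired.append(t)    # reset pulse lasts for this tick only
            count = 0          # self-clear, so the output drops again
    return fired


# Heartbeats every 10 s until t=100, then the RPi "freezes".
frozen = simulate(1000, set(range(10, 101, 10)))
```

In this model a frozen RPi gets a fresh reset pulse every 300 ticks until the heartbeat comes back, which matches the goal of giving the board 5 minutes to reboot before trying again.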
 

ericgibbs

Joined Jan 29, 2010
21,442
hi,
Is the RPi running at 3.3 V or 5 V?
If the project is using 3.3 V you will need a CMOS version of the 555, say a TLC555.
If the RPi retrigger pulse is set for, say, 10 s and you want 5 min, that's 300 s / 10 s = 30 counts.

Do you want to display the delay count?
E

EDIT:
I see the RPi is a nominal 5 V; I thought it may have the same spec as an Arduino, i.e. 3.3 V or 5 V.
So a 'regular' bipolar 555 will be OK.
 

danadak

Joined Mar 10, 2018
4,057
... and what if the ATtiny85 fails?
Many options, dual processor for example used in medical injection pumps,
each watching each other looking for code synchronicity, behaviors. No
doubt there is a ton of NASA papers on this general topic.

But back to reality. What is user fault tolerance ? Will this kill someone if
not super redundant ? Or is it a 10% tolerable problem. Poster must tell us.

Regards, Dana.
 

Thread Starter

bkcarter333

Joined Aug 18, 2018
7
Dana, please see post #6 which defines risk tolerance level. No, it will not kill anyone, and redundancy is not really the issue, but self recovery is. I want to keep the solution as simple as possible.

Thanks!
I really like this solution and will look closer at it! It would be simple, and solves the single problem that I need solved!

Still curious about the R and C values for the 555 timer option, as I may have other needs for that later.

Thanks!
 

danadak

Joined Mar 10, 2018
4,057
If you are in a hot or cold environment, look carefully at any bulk
cap you use, whether for 555 timing or bulk bypass.

I have repaired a lot of equipment; electrolytics are beasts.

And then there are runs of tant caps in the past that had major issues,
hopefully none left in market. Even ceramics cracking....

Regards, Dana.
 

Thread Starter

bkcarter333

Joined Aug 18, 2018
7
Ok, any you would suggest?
 

danadak

Joined Mar 10, 2018
4,057
My cap knowledge is a tad dated. I would contact the manufacturers' field
engineers for recommendations. I think polymer tants are the latest, but again
my info is dated. Also MLC, but I know as recently as this decade cracking
was an issue; I assume that's behind us now.

Recently I worked on PC startup in cold temps. Built a controller to force a start
request (when BIOS no longer would do it), and look for power up. About to add
a heater because I cannot get below ~ 20F, and I think I need to look at caps in
ATX supply for issues. Newer industrial PC mobos have eliminated tants and
electrolytics in favor of MLC.

Regards, Dana.
 

blue_coder

Joined May 7, 2016
37
This is just a random suggestion, but for a simple solution you could have the 10 second pulse directly charge a cap, then use a single comparator to pull the reset to ground when the voltage falls below a certain level. It would need testing in a freezer to check how the cap performance changes with temperature.
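
To get a feel for the sizing, here is a small Python sketch of the standard RC discharge formula, t = R·C·ln(V_start / V_threshold). It assumes an ideal capacitor discharging through a single resistor, with no leakage and no comparator input current, which is optimistic at temperature extremes. The function name and the example values are hypothetical.

```python
import math


def time_to_threshold(r_ohms: float, c_farads: float,
                      v_start: float, v_threshold: float) -> float:
    """Seconds for an RC discharge to fall from v_start to v_threshold."""
    return r_ohms * c_farads * math.log(v_start / v_threshold)


# Hypothetical example: 10 Mohm, 22 uF, charged to 3.3 V, comparator at 1.0 V.
example_s = time_to_threshold(10e6, 22e-6, 3.3, 1.0)
```

With values like these the timeout lands in the region of a few minutes, but leakage in a capacitor that large can be comparable to the discharge current, so the freezer test matters.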

As regards your 555 timer, this page may help: https://www.allaboutcircuits.com/tools/555-timer-monostable-circuit/ Values of 330 nF and 2.7 MΩ seem to be pretty close to 1 second, which would keep the cap ceramic.
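
For anyone who would rather let a script do the math, here is a short Python sketch of the usual textbook 555 timing equations: t = ln(3)·R·C (about 1.1·R·C) for a monostable pulse, and T = ln(2)·(R1 + 2·R2)·C (about 0.693·(R1 + 2·R2)·C) for the astable period. The linked calculator covers the monostable case; a free-running 1 second clock for the counters would normally use the astable one. The function names and the astable example values are assumptions, and real parts will drift with tolerance and temperature.

```python
import math


def monostable_pulse_s(r_ohms: float, c_farads: float) -> float:
    """Output pulse width of a 555 one-shot: ln(3) * R * C."""
    return math.log(3) * r_ohms * c_farads


def astable_period_s(r1_ohms: float, r2_ohms: float, c_farads: float) -> float:
    """Period of a free-running 555: ln(2) * (R1 + 2*R2) * C."""
    return math.log(2) * (r1_ohms + 2 * r2_ohms) * c_farads


# Values suggested in the post: 2.7 Mohm and 330 nF (monostable).
one_shot = monostable_pulse_s(2.7e6, 330e-9)

# Hypothetical astable values using the same 330 nF ceramic cap.
clock = astable_period_s(100e3, 2.0e6, 330e-9)
```

Both examples come out a little under 1 second, which fits the "accuracy is not that important" requirement from the first post.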
 