otmonitor uses 100% cpu after a while

This Forum is about the Opentherm gateway (OTGW) from Schelte

Moderator: hvxl

aapje
Starting Member
Posts: 3
Joined: Mon Jun 01, 2015 2:49 pm

otmonitor uses 100% cpu after a while

Post by aapje »

Hi,

I'm running the latest prebuilt otmonitor armhf build from the OTGW site (4.2.2) on a Raspberry Pi, with the latest gateway firmware (4.2.3) on a gateway prebuilt by kiwi.
It's hooked up to a Honeywell Chronotherm Round and a Remeha Avanta boiler.

The whole setup works beautifully, except that the otmonitor binary's CPU usage increases by about 10% every day until it constantly uses 100% of the Pi's CPU (and about 50% of memory) and no longer responds to commands. If I manually stop the binary and run it again, it works fine until the problem recurs a few days later.
It's running headless with only the web interface enabled (X and the like aren't even installed, running on minibian) and does not write logs to disk. I can provide the full configs if needed.

I noticed that the full command log history from the moment of starting is available in the web gui; to me it smells like the log ends up getting too big for the pi to handle, but I haven't looked at the code. Could this be the case?
If not, what might be causing this problem?
Independent of this issue, is there a log truncation or round-robin option in otmonitor that I've missed (for the live in-memory log)? It does not seem very useful to push the entire log to web clients.

Thanks
yoja
Starting Member
Posts: 44
Joined: Wed Feb 24, 2010 12:00 pm

Re: otmonitor uses 100% cpu after a while

Post by yoja »

Hi,

I may have the same problem:

http://www.domoticaforum.eu/viewtopic.p ... 462#p74731

For now I reboot the Pi every night (crontab)

Yoja
hvxl
Senior Member
Posts: 1965
Joined: Sat Jun 05, 2010 11:59 am

Re: otmonitor uses 100% cpu after a while

Post by hvxl »

The program is supposed to only keep 1000 lines of history. Do you get more than that?

But even if that feature doesn't work as intended, it would only account for the increased memory use. It should not have much impact on the cpu use.
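
To illustrate the kind of cap described above, here is a minimal sketch in Python (not otmonitor's actual Tcl code). The names `MessageLog` and `MAX_LINES` are invented for this example; only the figure of 1000 lines comes from the post. A bounded deque drops the oldest entry on every append once it is full, so memory stays flat however long the program runs.

```python
from collections import deque

MAX_LINES = 1000  # history limit mentioned in the post; the name is hypothetical


class MessageLog:
    """In-memory log that keeps only the most recent max_lines entries."""

    def __init__(self, max_lines=MAX_LINES):
        # A deque with maxlen discards the oldest item automatically when full.
        self._lines = deque(maxlen=max_lines)

    def append(self, line):
        self._lines.append(line)

    def snapshot(self):
        """Return a copy of the retained lines, oldest first."""
        return list(self._lines)


log = MessageLog()
for i in range(5000):
    log.append(f"message {i}")

print(len(log.snapshot()))   # 1000
print(log.snapshot()[0])     # message 4000
```

If a cap like this silently stops working, the log grows without bound, which would match the memory growth reported at the start of the thread.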

Rebooting the Pi just to restart a program is a bit like killing a fly with a sledgehammer.
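
As a lighter stopgap than a nightly reboot, a small watchdog could restart only the program once its memory use gets out of hand. The sketch below is an assumption-laden illustration, not part of otmonitor. It reads the `VmRSS` line that Linux exposes in `/proc/<pid>/status`. The 100 MiB threshold and all function names are made up for the example, and the actual restart command is left to the reader.

```python
import re
from pathlib import Path

RSS_LIMIT_KB = 100 * 1024  # hypothetical threshold (100 MiB); tune for your Pi


def parse_vmrss_kb(status_text):
    """Return the VmRSS value in kB from /proc/<pid>/status text, or None."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def needs_restart(status_text, limit_kb=RSS_LIMIT_KB):
    """True when the resident set size exceeds the limit."""
    rss = parse_vmrss_kb(status_text)
    return rss is not None and rss > limit_kb


def check_pid(pid):
    """Read the live status of a running process (Linux only)."""
    return needs_restart(Path(f"/proc/{pid}/status").read_text())


# Example with canned status text instead of a live process:
sample = "Name:\totmonitor-ahf\nVmRSS:\t  250000 kB\n"
print(needs_restart(sample))  # True
```

A cron job could call `check_pid` with the otmonitor process ID and restart only that process when it returns true, leaving the rest of the system running.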
Schelte
aapje
Starting Member
Posts: 3
Joined: Mon Jun 01, 2015 2:49 pm

Re: otmonitor uses 100% cpu after a while

Post by aapje »

Hi Schelte,

I get way, way more log entries ;) Not sure how many exactly, but it's definitely way more than 1000. After getting the reply notification for this topic I checked otmonitor again and it had died with:

# ./otmonitor-ahf --daemon -f etc/otmonitor.conf
unable to alloc 90443790 bytes
Aborted

I had also previously grabbed an strace of an unresponsive otmonitor, and it was looping over the following syscall sequence:

gettimeofday({1433240040, 792521}, NULL) = 0
futex(0x64dfd8, FUTEX_WAKE_PRIVATE, 1) = 1
clock_gettime(CLOCK_REALTIME, {1433240040, 797202716}) = 0
futex(0x87a9cc, FUTEX_WAIT_PRIVATE, 1295, {0, 95318284}) = -1 ETIMEDOUT (Connection timed out)
write(4, "\0", 1) = 1
futex(0x64dfd8, FUTEX_WAKE_PRIVATE, 1) = 1
write(4, "\0", 1) = 1

I'm not sure whether that is useful, or what else might be if it isn't. I haven't had time to look at the source yet (or to grok Tcl, for that matter).
Here's the otmonitor.conf for completeness:

web {
    enable true
    port 8080
    nopass true
}
connection {
    device /dev/ttyUSB0
    type serial
    enable true
}

Also, the Pi's CPU is pretty dismal; I would not be surprised if something like this could bring it to its knees.
Any more hints?

Thanks
hvxl
Senior Member
Posts: 1965
Joined: Sat Jun 05, 2010 11:59 am

Re: otmonitor uses 100% cpu after a while

Post by hvxl »

You're right. A bug was introduced in Tcl that interferes with the code that was supposed to limit the number of saved messages. Once the correct fix is known, I'll build a new version that should not exhibit the ever-increasing memory consumption. Then let's also see what impact that has on CPU use.
Schelte
hvxl
Senior Member
Posts: 1965
Joined: Sat Jun 05, 2010 11:59 am

Re: otmonitor uses 100% cpu after a while

Post by hvxl »

I have posted otmonitor version 4.2.3 on my web site. In this version I avoid the Tcl bug by using a slightly different method to limit the number of saved messages. Please try it and let me know if that improves things.
Schelte
mike7
Member
Posts: 102
Joined: Mon Dec 02, 2013 8:45 am

Re: otmonitor uses 100% cpu after a while

Post by mike7 »

Schelte,

I get errors when starting otmonitor on the Raspberry Pi:


./otmonitor-ahf --daemon --system -w 8080
Not connected
    while executing
"dbus method / com.tclcode.debug.Eval {apply {{info str} {uplevel #0 $str}}}"
    (file "/home/pi/otmonitor-ahf/dbus.tcl" line 153)
    invoked from within
"source /home/pi/otmonitor-ahf/dbus.tcl"
    ("uplevel" body line 1)
    invoked from within
"uplevel #0 [list source [file join /home/pi/otmonitor-ahf $file]]"
    (procedure "include" line 2)
    invoked from within
"include dbus.tcl"
    (file "/home/pi/otmonitor-ahf/otmonitor.tcl" line 1863)
    invoked from within
"source /home/pi/otmonitor-ahf/otmonitor.tcl"
    ("uplevel" body line 1)
    invoked from within
"uplevel #0 [list source [file join /home/pi/otmonitor-ahf $file]]"
    (procedure "include" line 2)
    invoked from within
"include otmonitor.tcl"
    (file "/home/pi/otmonitor-ahf/main.tcl" line 13)
Is it related to changes made for domoticaforum.eu/viewtopic.php?f=75& ... 383#p75380 ?
hvxl
Senior Member
Posts: 1965
Joined: Sat Jun 05, 2010 11:59 am

Re: otmonitor uses 100% cpu after a while

Post by hvxl »

Oops, my mistake. Sorry about that. Try 4.2.3.1.
Schelte
yoja
Starting Member
Posts: 44
Joined: Wed Feb 24, 2010 12:00 pm

Re: otmonitor uses 100% cpu after a while

Post by yoja »

Hi,

I get this error after running otmonitor for a couple of minutes, using telnet to monitor:

invalid command name "nu"
    while executing
"$arg $val"
    (procedure "otdecode" line 19)
    invoked from within
"otdecode data $type $id"
    (procedure "otmessage" line 6)
    invoked from within
"otmessage [clock microseconds] $line [expr {$type & 7}] $id $data"
    (procedure "process" line 12)
    invoked from within
"process [append data $line]"
    (procedure "receive" line 6)
    invoked from within
"receive"
hvxl
Senior Member
Posts: 1965
Joined: Sat Jun 05, 2010 11:59 am

Re: otmonitor uses 100% cpu after a while

Post by hvxl »

Should be fixed in 4.2.3.2. Thanks for reporting.
Schelte
aapje
Starting Member
Posts: 3
Joined: Mon Jun 01, 2015 2:49 pm

Re: otmonitor uses 100% cpu after a while

Post by aapje »

I've been running 4.2.3 since Saturday afternoon.
Log size limiting works fine now. Memory use is stable at around 32M virtual / 10M resident. CPU use hovers between 2 and 5% on the Pi. I'm not using dbus, so I didn't hit that snag.
Seems to be working fine!
Thanks for patching, Schelte!
hvxl
Senior Member
Posts: 1965
Joined: Sat Jun 05, 2010 11:59 am

Re: otmonitor uses 100% cpu after a while

Post by hvxl »

That's good. Hopefully it stays that way. Thanks for reporting back.
Schelte