Re: Strange behaviour of otgw
Posted: Sun Mar 22, 2020 10:56 pm
by bartgv
First a question: In the logfile each "Data" line is followed by its "Ack" line immediately. Except for "Remote override room setpoint", where "Data" and "Ack" are always separated by "Burner starts" or "Burner operation hours". Is that normal?
PR=G and PR=S returned exactly what you would expect in a normal situation:
So I downloaded gateway.hex (md5sum 6abdbcbf6dbffcc5e5d7f18bab20eacf), connected the otgw to my laptop and started otmonitor with the GUI. I chose File -> Firmware upgrade ..., selected gateway.hex, kept "Transfer old EEPROM settings to the new firmware" checked (not sure if I should uncheck this one?) and clicked "Program". The procedure took about 30 seconds with no retries and no errors, and finished with "firmware download succeeded - 0 retries". I closed otmonitor, unplugged the power, connected the otgw to the rpi and started otmonitor. It looks like everything is working.
Will let it run for a while and see what happens. But when I leave home for longer than a few hours I'll disable the otgw. Will let you know if my troubles are now gone.
Re: Strange behaviour of otgw
Posted: Tue Mar 24, 2020 11:06 pm
by hvxl
Those setback settings can't explain why the OTGW suddenly requested a setpoint of 67.22 degrees. Even more reason to believe that the problem is in the firmware.
Please read my story about alternatives to understand why there is another message between the Data and Ack messages for the "Remote override room setpoint". The "Remote override room setpoint" message is typically a message that a boiler does not support. The OTGW learns this after it has forwarded a few of these messages after a reset. Any time after that, it sends an alternative message to the boiler instead. So you see a 'T' message from the thermostat to the OTGW, then an 'R' message where the OTGW replaces that message with an alternative to the boiler. Then a 'B' message with the response from the boiler. And finally an 'A' message with the answer from the OTGW to the initial 'T' message. If the OTGW does not use an alternative message, it doesn't report the 'R' and 'A' messages, because they would be exactly the same as the 'T' and 'B' messages. So yes, it's normal.
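As an illustration of the T/R/B/A sequence described above, here is a minimal Python sketch that decodes message lines of the form seen in the logfile (a source letter followed by 8 hex digits). The bit layout (message type in bits 4-6 of the first byte, data ID in the second byte, value in the last two bytes) follows the OpenTherm frame format; the function names `decode` and `used_alternative` are made up for this example and are not part of otmonitor.

```python
import re

# OpenTherm message types, stored in bits 4-6 of the first byte of a frame.
MSG_TYPES = {
    0: "Read-Data", 1: "Write-Data", 2: "Invalid-Data", 3: "Reserved",
    4: "Read-Ack", 5: "Write-Ack", 6: "Data-Invalid", 7: "Unknown-DataId",
}

# Source letter followed by a 32-bit frame written as 8 hex digits.
FRAME = re.compile(r"^([TRBA])([0-9A-Fa-f]{8})$")


def decode(message):
    """Decode one message such as 'T00110000'; return None if it is not a frame."""
    match = FRAME.match(message.strip())
    if match is None:
        return None
    source, hexword = match.groups()
    word = int(hexword, 16)
    return {
        "source": source,                         # T, R, B or A
        "type": MSG_TYPES[(word >> 28) & 0x7],    # skip the parity bit (bit 31)
        "data_id": (word >> 16) & 0xFF,
        "value": word & 0xFFFF,
    }


def used_alternative(messages):
    """True if the gateway replaced the request, i.e. an 'R' message is present."""
    decoded = [decode(m) for m in messages]
    return any(d is not None and d["source"] == "R" for d in decoded)


if __name__ == "__main__":
    # Two frames taken from the logfile later in this thread.
    for line in ("T00110000", "BC0110000"):
        print(line, decode(line))
```

For an exchange that only contains 'T' and 'B' messages, `used_alternative` returns False, which matches the case where the OTGW forwards the thermostat's request unchanged.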
Most of the time, you want to keep the "Transfer old EEPROM settings to the new firmware" option checked. That way any settings you have made are restored after the firmware update.
Re: Strange behaviour of otgw
Posted: Mon Mar 30, 2020 9:10 pm
by bartgv
After I reinstalled the firmware the same issues occurred as before. Each time it fails after a few hours. Sometimes after less than 2 hours, sometimes it runs for about a day.
Today I reinstalled the firmware again, but unchecked the "Transfer old EEPROM settings" option. Just a few hours later I lost connection while the red led was on. Next time I took it off after it set the room temperature to 35 degrees. So that didn't change anything either.
Would it be possible and useful to run the diagnostic software? Or do you have any other ideas?
Re: Strange behaviour of otgw
Posted: Wed Apr 01, 2020 4:54 pm
by hvxl
One last check before we rule out the firmware: Did you download a fresh copy from the site, or did you reuse the old version you used to flash the gateway initially? The MD5 checksum of gateway.hex should be 6abdbcbf6dbffcc5e5d7f18bab20eacf.
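For anyone without an md5sum command at hand, the same check can be sketched in a few lines of Python using the standard hashlib module. The expected value is the checksum quoted above; the function names and the file path passed in are assumptions for this example.

```python
import hashlib

# Checksum of gateway.hex as quoted in this thread.
EXPECTED_MD5 = "6abdbcbf6dbffcc5e5d7f18bab20eacf"


def md5_of_file(path, chunk_size=65536):
    """Return the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def firmware_matches(path, expected=EXPECTED_MD5):
    """True if the file at `path` has the expected MD5 checksum."""
    return md5_of_file(path) == expected.lower()
```

Calling `firmware_matches("gateway.hex")` in the download directory should return True for an intact copy of that firmware version.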
When the firmware is completely exonerated, the only reason I can imagine for the symptoms you describe is hardware. Possibly temperature related, because it happens after some hours of operation. It could be the power supply, or a flaky PIC. So, check that the 5V is still OK when you experience the problems. The next step would be to try another PIC.
The diagnostic firmware focuses on checking the communication. That doesn't seem to be the area where you have problems. So I don't expect to gain any new insights from using the diagnostic firmware.
Re: Strange behaviour of otgw
Posted: Sat Apr 04, 2020 2:09 pm
by bartgv
I did download the firmware from http://otgw.tclcode.com/download.html and the md5sum was correct.
To measure temperature I used an IR thermometer. I did a background measurement when the otgw had been turned off for many hours: 21°C, which is pretty close to the temperature in the boiler room. Then I turned the otgw on and did multiple measurements after some time.
Code: Select all
time          t PIC   t Optocoupler
* off for many hours *
2 apr 22:20   21°C    -
* 2 apr 22:29 turn on *
2 apr 22:30   23°C    -
2 apr 22:31   22°C    -
2 apr 22:40   23°C    -
2 apr 22:50   22°C    25°C
2 apr 23:30   24°C    26°C
3 apr 11:30   24°C    27°C
3 apr 14:00   25°C    27°C
3 apr 20:00   24°C    27°C
4 apr 01:00   25°C    28°C
* 4 apr 02:18:47: last measurement in logfile *
4 apr 11:30   23°C    26°C
So there are ~75 minutes between my last temperature measurement and the moment of failure. I will try to repeat this next week to get closer.
How do I check the 5V? Can I simply use the 5V and GND connections?
Re: Strange behaviour of otgw
Posted: Tue Apr 07, 2020 10:01 pm
by hvxl
So it looks like the PIC and opto-coupler are not heating up to alarming levels. But I was actually thinking the power circuitry may be getting hot. I believe the Nodo shop has done different types of power converters. I don't know which one you have.
My intention was indeed for you to measure between any GND and 5V connections you can reach. Measure both DC and AC. With a good multimeter, you should get a stable 5V when measuring DC, and 0V on AC. Cheap multimeters may also show 5V on AC. Using an oscilloscope would be even better, but somehow that's not part of everyone's toolbox.
Re: Strange behaviour of otgw
Posted: Thu Apr 09, 2020 1:53 pm
by bartgv
None of the components are getting warmer than about 28°C at maximum.
Both before and after failure the 5V measures 4.98V DC and < 0.01V AC. Sounds pretty much okay to me.
Sorry, no oscilloscope in my toolbox.
Re: Strange behaviour of otgw
Posted: Sat Apr 11, 2020 11:08 am
by hvxl
Agreed, that looks fine. Then the next thing to try would be a different PIC.
Re: Strange behaviour of otgw
Posted: Thu Apr 23, 2020 7:13 pm
by bartgv
I've got a new PIC from Nodo. So I replaced the old PIC and tried to upload the firmware.
After hitting the "Program" button, the status bar says "Switching gateway to self-programming mode", and after a few seconds it says "Please manually reset the OpenTherm Gateway".
But how do I reset it?

- should I short the RST pins?
Re: Strange behaviour of otgw
Posted: Thu Apr 23, 2020 8:15 pm
by hvxl
Doesn't the PIC already contain the firmware? It needs to have at least the bootloader. But then just briefly shorting the reset pins when otmonitor asks for a reset should work.
Re: Strange behaviour of otgw
Posted: Thu Apr 23, 2020 10:00 pm
by bartgv
Actually, I didn't check that. Since I received it in a sealed package, I assumed it came without firmware.
But I've connected it and it seems to contain the firmware already.
Everything is running fine right now.
Let's see if it keeps running for more than a few hours or days...
Re: Strange behaviour of otgw
Posted: Fri May 01, 2020 10:57 pm
by bartgv
It looks like the new PIC solved my issues. The gateway has been running for more than a week now, without any problems.
Thank you very much for all your time!
Re: Strange behaviour of otgw
Posted: Sat May 02, 2020 1:18 pm
by hvxl
That's a relief! I would hate to have told you to spend money on another PIC and it didn't fix the problem.
Re: Strange behaviour of otgw
Posted: Sat May 02, 2020 1:26 pm
by bartgv
Nodo-shop sent it for free.
Re: Strange behaviour of otgw
Posted: Tue Jun 09, 2020 2:04 pm
by bartgv
Bad News.
The otgw was working well with the new PIC for about a month. Since last week I've got the same issues as before: no new messages in the logfile while the red led is turned on. And once again it overrode the room setpoint to 67 degrees and the room thermostat indicated an OT error.
The last lines in the logfile:
Code: Select all
19:50:31.362262 WDT reset!
19:50:31.364953 0
19:50:31.374704 0010A00
19:50:31.383017 0010A00
19:50:31.392691 0000200
19:50:31.402432 0000204
22:54:56.499200 090000enT
06:36:01.507110 herm Gateway 4.2.5
06:36:01.519571 WDT reset!
06:36:01.522431 0
06:36:01.532049 0010A00
06:36:01.540286 0010A00
06:36:01.550041 0000200
06:36:01.559686 0000204
06:36:01.590190 090000enTherm Gateway 4.2.5
06:36:01.602686 WDT reset!
06:36:01.605412 0
06:36:01.615049 0010A00
06:36:01.623416 0010A00
06:36:01.633158 0000200
06:36:01.642782 0000204
06:36:01.673308 090000enTherm Gateway 4.2.5
06:36:01.685688 WDT reset!
06:36:01.688556 0
06:36:01.698176 0010A00
06:36:01.706537 0010A00
06:36:01.716159 0000200
06:36:01.725936 0000204
06:36:01.902741 090000enTherm Gateway 4.2
... repeated many times, sometimes with interval of some hours
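To get a quick overview of a log like this one, the number of "WDT reset!" lines and the long silent gaps between entries can be counted with a short Python script. This is only a sketch under a few assumptions: every line starts with a HH:MM:SS.ffffff timestamp as in the excerpt above, the log carries no date (so a timestamp that jumps backwards is treated as a single midnight rollover), and the 5-minute gap threshold and the function names are arbitrary choices for this example.

```python
from datetime import datetime, timedelta


def parse_time(line):
    """Return the leading HH:MM:SS.ffffff timestamp of a log line, or None."""
    stamp = line.split(" ", 1)[0]
    try:
        return datetime.strptime(stamp, "%H:%M:%S.%f")
    except ValueError:
        return None


def summarize(lines, gap=timedelta(minutes=5)):
    """Count 'WDT reset!' lines and collect silent periods of at least `gap`."""
    resets = 0
    gaps = []
    previous = None
    for line in lines:
        now = parse_time(line)
        if now is None:
            continue  # skip lines without a timestamp
        if "WDT reset!" in line:
            resets += 1
        if previous is not None:
            delta = now - previous
            if delta < timedelta(0):
                # No date in the log: assume the clock passed midnight once.
                delta += timedelta(days=1)
            if delta >= gap:
                gaps.append((previous.time(), now.time(), delta))
        previous = now
    return resets, gaps


if __name__ == "__main__":
    # "otmonitor.log" is a placeholder file name.
    with open("otmonitor.log", encoding="utf-8", errors="replace") as log:
        count, silent = summarize(log)
    print("WDT resets:", count)
    for start, end, length in silent:
        print("silent from", start, "to", end, "for", length)
```

Applied to the excerpt above, this would report the silent periods after 19:50:31 and after 22:54:56, which are the moments where the gateway stopped sending messages.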
Dbus info:
Code: Select all
dbus-send --dest=com.tclcode.otmonitor --print-reply / com.tclcode.debug.Eval ':set errorInfo'
method return time=1591607792.719618 sender=:1.0 -> destination=:1.1 serial=8 reply_serial=2
string "can not find channel named "sockbfc9a0"
while executing
"chan close sockbfc9a0"
invoked from within
"catch $script""
dbus-send --dest=com.tclcode.otmonitor --print-reply / com.tclcode.debug.Eval ':fconfigure $dev'
method return time=1591607813.828990 sender=:1.0 -> destination=:1.2 serial=9 reply_serial=2
string "-blocking 0 -buffering line -buffersize 4096 -encoding utf-8 -eofchar {{} {}} -translation {crlf cr} -mode 9600,n,8,1 -xchar { }"
dbus-send --dest=com.tclcode.otmonitor --print-reply / com.tclcode.debug.Eval ':fileevent $dev readable'
method return time=1591607825.451036 sender=:1.0 -> destination=:1.3 serial=10 reply_serial=2
string "receive"
That was when the room setpoint went to 67 degrees.
Today there was a slightly different logfile:
Code: Select all
12:16:21.083926 T00110000 Read-Data Relative modulation level: 0.00
12:16:21.208919 BC0110000 Read-Ack Relative modulation level: 0.00
12:16:21.997443 192000
12:16:22.007204 0010A00
12:16:22.015426 0010A00
12:16:22.025160 0000200
12:16:22.034780 0000204
12:16:22.044554 0000200
12:16:22.052786 0000204
12:16:22.062542 0110000
12:16:22.072174 0110000
12:16:22.080548 192000
12:16:22.090180 0010A00
12:16:22.098583 0010A00
12:16:22.108188 0000200
12:16:22.117964 0000204
12:16:22.127538 0000200
12:16:22.135912 0000204
12:16:22.145654 0110000
12:16:22.155290 0110000
12:16:22.163537 192000
12:16:22.173295 0010A00
12:16:22.181734 0010A00
12:16:22.191287 0000200
12:16:22.201036 0000204
12:16:22.210668 0000200
12:16:22.218941 0000204
12:16:22.228692 0110000
12:16:22.238426 0110000
12:16:22.246658 192000
12:16:22.256417 0010A00
12:16:22.264653 0010A00
12:16:22.376183 0T80000200
12:17:26.364559 OpenTherm Gateway 4.2.5
12:17:26.374135 WDT reset!
After the "WDT reset!" there were no other messages and the red led was turned on.
The temperatures of the components on the otgw are all within reasonable ranges.
Any suggestions?