Opentherm Gateway communication hangs
Moderator: hvxl
For a few weeks now I have had an OpenTherm gateway, which I use to control my boiler in co-operation with a heat pump (an inverter airco) to heat my house.
The OpenTherm gateway works fine. It is controlled by an Arduino UNO that uses serial communication to overrule the heating setpoint of the boiler, while at the same time controlling the inverter airco via an infrared LED using the appropriate infrared codes.
The problem with my OpenTherm gateway is that every few days (not regularly; sometimes after one day, sometimes after two weeks) the communication with the boiler hangs completely. It is then no longer possible to send serial control commands to the OpenTherm gateway; they are simply ignored. Sometimes the communication between the room thermostat and the boiler is also gone, and the room thermostat indicates an "F" (for "Fault") on its LCD screen.
When the serial communication hangs, I sometimes connect my old laptop (with an RS232 connection) to see what the problem is. Usually I can see the T... and B... messages rolling over the terminal screen, as expected. However, the gateway simply ignores the commands that I type.
The remedy is to power-cycle the OpenTherm gateway. After the power-cycle, the gateway accepts my serial commands again. However, this remedy is not beneficial for the "wife acceptance factor".
At one stage, the gateway did not pass through any communication, neither from my RS232 laptop nor from the room thermostat to the boiler, even after a power-cycle. I then (as a last resort) uploaded the "gateway.hex" file again into the OpenTherm gateway, and things started working again. Only to fail again after a few days...
Can anybody give me hints or tips on what the problem could be? Why is my OpenTherm gateway hanging? Is the PIC controller perhaps broken?
Thanks!
Re: Opentherm Gateway communication hangs
Just upgraded to gateway.hex version 4.2.3 (Feb 12, 2015). Maybe that helps. Keeping fingers crossed!
Re: Opentherm Gateway communication hangs
I don't want to crush your hope, but the symptoms you describe are not something that I have fixed in any recent firmware version. It sounds like the problems start with the serial interface and the last change I made in that area was in one of the 4.0 beta versions. So, unless you were using very old firmware before, there is little chance that 4.2.3 will improve the situation.
In fact, since no one else has reported similar problems, I expect the issue is caused by something unique in your setup. Could it be that the Arduino is doing something weird on the serial interface? Sending strange characters, sending lots of data, pulling the line low, something like that? I tried to make the gateway resilient to such things, but that hasn't been extensively tested. Do you get the same problems if you run the gateway without the Arduino attached?
Schelte
Re: Opentherm Gateway communication hangs
Thanks hvxl! Although your comment did not help me directly, it seems to have pointed me in the right direction.
Indeed the upgrade to version 4.2.3 did not help. The communication was hanging again Saturday evening...
I'm sorry if my post is long, but I'd like to share my findings of yesterday and today.
Some information about the used "hardware":
- Boiler is a Remeha Avanta 28c
- Room thermostat is a Honeywell Round Modulation T87M1003, model 2001-2014
- Serial level shifter between Arduino and OpenTherm gateway is a MAX3232 module
It looks like the OpenTherm gateway is very picky about the moment at which a command (or any form of serial communication) is being sent to it. If it receives characters while it is itself sending characters, the effect is undetermined, ranging from e.g. a 'SE', 'OE' or 'NG' response to a reboot or hang-up.
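To make the different kinds of gateway output easier to tell apart when reading a log, here is a minimal Python sketch that classifies a single received line. The patterns are only inferred from the logs shown in this thread (they are not an official grammar of the gateway firmware), and the function name `classify` and the category labels are made up for illustration.

```python
import re

# Patterns inferred from the logs in this thread; NOT an official grammar.
MESSAGE = re.compile(r"^[TBRA][0-9A-F]{8}$")   # e.g. "T80190000"
RESPONSE = re.compile(r"^([A-Z]{2}): (.+)$")   # e.g. "CS: 30.00", "CX: NG"
ERROR = re.compile(r"^Error \d{2}$")           # e.g. "Error 03"
BANNER = "OpenTherm Gateway"                   # printed when the gateway (re)starts


def classify(line: str) -> str:
    """Return a rough category for one line of gateway output."""
    line = line.strip()
    if BANNER in line:
        # Also catches truncated reports such as "R80OpenTherm Gateway 4.2.3".
        return "banner"
    if MESSAGE.match(line):
        return "message"
    if line in ("SE", "OE") or ERROR.match(line):
        return "error"
    match = RESPONSE.match(line)
    if match:
        # "NG" as the value means the command was not accepted.
        return "rejected" if match.group(2) == "NG" else "response"
    return "other"
```

A line counted as "banner" in the middle of normal traffic is the sign of a reboot, which is what the logs below show.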
I hooked up an old laptop (with RS232 port and Tera Term) to try and monitor what is happening. I observed the following:
- When I disconnect the serial port to the OpenTherm gateway (necessary when uploading a new binary into my Arduino UNO; Arduino Uno USB uses the same hardware as the serial port, and uploading a binary from within the Arduino IDE requires the serial port to be disconnected), I can see the following happen on the serial line:
Code: Select all
R80OpenTherm Gateway 4.2.3
T0OpenTherm Gateway 4.2.3
T0OpenTherm Gateway 4.2.3
T0OpenTherm Gateway 4.2.3
T0OpenTherm Gateway 4.2.3
T00OpenTherm Gateway 4.2.3
T0OpenTherm Gateway 4.2.3
T00OpenTherm Gateway 4.2.3
[...]
I think the OpenTherm gateway is trying to transmit a report ('Rxxxxxxxx', 'Txxxxxxxx', ...), but then gets interrupted by the "chatter" on the serial port, and then reboots.
Note that the serial port is disconnected at the side of the Arduino; there is about 12 meters of cable between the OpenTherm gateway (near the boiler) and the Arduino (near the heat pump).
- Here is an example where a command ('CS') is received incorrectly by the OpenTherm gateway (as 'CX'):
Code: Select all
[...]
T80000200
BC0000208
T900E6400
B500E6400
PS: 1
00000010/00001000,10.00,00000011/00000011,100.00,0/0,17.00,0.00,0.00,21.11,29.30,0.00,0.00,0.00,55/40,40/20,0.00,40.00,65535,65535,51112,53704,6592,8556,526,685
PS: 0
T80190000
OpenTherm Gateway 4.2.3
SE
CX: NG
T80190000
B4019203D
T10010A00
BD0010A00
[...]
- The following is a "normal" communication between the Arduino and the gateway (only showing the transmissions from the gateway, but that says enough):
Code: Select all
[...]
AC0000300
T00110000
BC0110000
T80190000
PS: 1
00000010/00000000,36.00,00000011/00000011,100.00,0/0,21.00,0.00,0.00,20.57,25.20,0.00,0.00,0.00,55/40,40/20,0.00,40.00,65535,65535,51128,53720,6592,8556,526,685
CS: 30.00
CH: 0
PS: 0
T00000300
R80000200
B40000200
AC0000300
[...]
Explanation: the Arduino requests a report (PS=1) from the OpenTherm gateway, then decides (based on the reported values) to overrule the boiler settings (CS=30, CH=0), and then enables message reporting again (PS=0).
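As a rough illustration of how the PS=1 report can be checked before acting on it, here is a small Python sketch. It assumes the report is a single comma-separated line of 25 fields (counted from the samples in this thread) and that the 6th and 9th fields are the set room temperature and the actual room temperature (inferred from the values in the logs further down, not taken from the firmware documentation). The names `parse_summary`, `room_setpoint` and `room_temperature` are made up for this example.

```python
from typing import Optional

# Number of comma-separated fields, counted from the sample reports in this thread.
EXPECTED_FIELDS = 25


def parse_summary(line: str) -> Optional[dict]:
    """Parse a PS=1 report line; return None if it looks incomplete or garbled."""
    fields = line.strip().split(",")
    if len(fields) != EXPECTED_FIELDS:
        return None  # no response, or response too short
    try:
        return {
            # Field positions are an assumption based on the logs in this thread.
            "room_setpoint": float(fields[5]),
            "room_temperature": float(fields[8]),
        }
    except ValueError:
        return None  # a field was corrupted in transit
```

Returning `None` for anything that is not a complete report makes it easy to refuse to send CS/CH commands when the report did not arrive intact.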
Now this is what happens when it goes wrong:
Code: Select all
[...]
BD0011E00
A50012300
T00000300
R80000200
SE
B40000200
AC0000300
CS: 30.00
CH: 0
PS: 0
T90101500
B50101500
[...]
Explanation: the Arduino requests a report (PS=1) from the OpenTherm gateway, but somehow the transmission of that command failed, resulting in an 'SE'. The Arduino could not determine any values, but decides to send the boiler overrule settings anyway. That is very dangerous, since the transmission of CS=xx and CH=x might now coincide with a report sent by the OpenTherm gateway, potentially causing instability on the gateway.
I changed the Arduino software to *only* transmit the boiler overrule commands if it is sure that the PS=1 command was successful (i.e. the OpenTherm gateway is now sure to be 'silent'). If *not* successful (no response or response too short), the Arduino will wait for the next gateway report. If no report ('Rxxxxxxxx', 'Txxxxxxxx', ...) is received within X seconds, then apparently the PS=1 command was received by the gateway, but the response did not make it to the Arduino. In that case, it just tries again (re-sends the PS=1 command and waits for the response).
- Here is a spontaneous reboot:
Code: Select all
[...]
T10012800
R10010A00
BD0010A00
AD0012800
T00000300
R80OpenTherm Gateway 4.2.3
T00000300
BC0000300
T90101500
B50101500
T80190000
B40191580
T10012800
BD0012800
T00000300
B40000302
T101814AB
BD01814AB
T80190000
BC0191599
T10012800
BD0012800
T00000300
BC000030A
OE
SE
T00110000
B40110400
T80190000
BC01918C7
[...]
These spontaneous reboots force me to re-send the boiler overrule settings to the gateway every XX seconds, not just when they change. Spontaneous reboots may occur every so many minutes.
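The decision logic I described (only send the overrule commands after a successful PS=1, otherwise wait or retry, and re-send the settings periodically) can be sketched roughly as follows in Python. This is not my actual Arduino code; the function names and the default values of 5 and 60 seconds are placeholders standing in for the "X" and "XX" seconds mentioned in the text.

```python
def next_action(summary_ok: bool, seconds_since_last_report: float,
                retry_after: float = 5.0) -> str:
    """Decide what to do after a PS=1 request (retry_after is a placeholder for X)."""
    if summary_ok:
        # The gateway answered PS=1, so it is silent: safe to send CS/CH now.
        return "send_overrides"
    if seconds_since_last_report >= retry_after:
        # No reports arrive any more: PS=1 probably got through, reply was lost.
        return "retry_ps1"
    # Reports are still coming in: wait for the next one before trying again.
    return "wait"


def overrides_due(now: float, last_sent: float,
                  resend_interval: float = 60.0) -> bool:
    """True when the settings should be re-sent (resend_interval stands in for XX)."""
    return now - last_sent >= resend_interval
```

The periodic re-send is what covers the spontaneous reboots: after a reboot the gateway no longer has the overrule settings, so they are repeated even when they have not changed.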
- Below is what happened after having the serial port to the gateway disconnected for quite a long time (~10 minutes):
Code: Select all
[...] T10OpenTherm Gateway 4.2.3 T1OpenTherm Gateway 4.2.3 T10OpenTherm Gateway 4.2.3 T10OpenTherm Gateway 4.2.3 T1OpenTherm Gateway 4.2.3 T10OpenTherm Gateway 4.2.3 T10OpenTherm Gateway 4.2.3 T10OpenTherm Gateway 4.2.3 T10OpenTherm Gateway 4.2.3 T10OpenTherm Gateway 4.2.3 T1OpenTherm Gateway 4.2.3 T10OpenTherm Gateway 4.2.3 T10OpenTherm Gateway 4.2.3 T10OpenTherm Gateway 4.2.3 T10OpenTherm Gateway 4.2.3 T10OpenTherm Gateway 4.2.3 T10OpenTherm Gateway 4.2.3 T10OpenTherm Gateway 4.2.3 T10OpenTherm Gateway 4.2.3 T10OpenTherm Gateway 4.2.3 T10OpenTherm Gateway 4.2.3 T10OpenTherm Gateway 4.2.3 OE T1OpenTherm Gateway 4.2.3 T10OpenTherm Gateway 4.2.3 T10OpenTherm Gateway 4.2.3 T10OpenTherm Gateway 4.2.3 [--> Here the gateway was connected again to the Arduino; lots of 'Error 03' messages and an occasional reboot: ] OpenTherm Gateway 4.2.3 T00000300 Error 03 T00000300 Error 03 T00000300 Error 03 T00000300 Error 03 T00000300 Error 03 T00000300 Error 03 PS: 1 00000000/00000000,0.00,00000000/00000000,0.00,0/0,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0/0,0/0,0.00,0.00,0,0,0,0,0,0,0,0 ErrOpenTherm Gateway 4.2.3 T00000300 Error 03 T00000300 Error 03 T00000300 Error 03 T00000300 Error 03 T00000300 Error 03 PS: 1 00000000/00000000,0.00,00000000/00000000,0.00,0/0,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0/0,0/0,0.00,0.00,0,0,0,0,0,0,0,0 Error 03 PS: 0 T00000300 Error 03 T00000300 Error 03 T00000300 Error 03 T00000300 Error 03 T00000300 Error 03 T00000300 Error 03 PS: 1 00000000/00000000,0.00,00000000/00000000,0.00,0/0,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0/0,0/0,0.00,0.00,0,0,0,0,0,0,0,0 PS: 0 T00000300 Error 03 T00000300 Error 03 T00000300 BC000030A T90101500 Error 03 T90101500 Error 03 T90101500 Error 03 T90101500 Error 03 T90101500 B50101500 T80190000 Error 03 PS: 1 00000011/00001010,0.00,00000000/00000000,0.00,0/0,21.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0/0,0/0,0.00,0.00,0,0,0,0,0,0,0,0 PS: 0 T80190000 Error 03 T80190000 B4019264C T10012800 Error 03 T10012800 Error 03 
T10012800 Error 03 T10012800 Error 03 T10012800 Error 03 T10012800 Error 03 PS: 1 00000011/00001010,40.00,00000000/00000000,0.00,0/0,21.00,0.00,0.00,0.00,38.30,0.00,0.00,0.00,0/0,0/0,0.00,0.00,0,0,0,0,0,0,0,0 Error 03 PS: 0 T10012800 Error 03 T10012800 Error 03 T10012800 BD0012800 T00000300 Error 03 T00000300 Error 03 T00000300 Error 03 [--> A reboot now and then: ] OpenTherm Gateway 4.2.3 T80190000 Error 03 T80190000 Error 03 T80190000 B4019264C T10015A00 Error 03 T10015A00 Error 03 T10015A00 BD0015A00 T00000300 BC0000300 T00OpenTherm Gateway 4.2.3 T00030000 B4003410B T80190000 Error 03 T80190000 Error 03 T80190000 Error 03 T80190000 B401920E6 T10015A00 Error 03 T10015A00 BD0015A00 T00000300 BC0000300 T807F0000 PS: 1 00000011/00000000,90.00,00000000/00000000,0.00,0/0,0.00,0.00,0.00,0.00,32.90,0.00,0.00,0.00,0/0,0/0,0.00,0.00,0,0,0,0,0,0,0,0 PS: 0 T80190000 Error 03 T80190000 Error 03 T80190000 Error 03 T80190000 B4019215C T10015A00 Error 03 T10015A00 Error 03 T10015A00 Error 03 T10015A00 Error 03 PS: 1 00000011/00000000,90.00,00000000/00000000,0.00,0/0,0.00,0.00,0.00,0.00,33.36,0.00,0.00,0.00,0/0,0/0,0.00,0.00,0,0,0,0,0,0,0,0 Error 03 PS: 0 T10015A00 BD0015A00 T00000300 BC0000300 T1002010D BD002010D T80190000 Error 03 T80190000 Error 03 T80190000 BC01921E6 T10015A00 BD0015A00 PS: 1 00000011/00000000,90.00,00000000/00000000,0.00,0/0,0.00,0.00,0.00,0.00,33.90,0.00,0.00,0.00,0/0,0/0,0.00,0.00,0,0,0,0,0,0,0,0 PS: 0 T907E0D16 BF07E0D16 T80190000 B401921D7 T10015A00 Error 03 T10015A00 Error 03 T10015A00 Error 03 T10015A00 BD0015A00 T00000300 BC0000300 T00090000 BF0090000 AC0090000 T80190000 B40192266 PS: 1 00000011/00000000,90.00,00000000/00000000,0.00,0/0,0.00,0.00,0.00,0.00,34.40,0.00,0.00,0.00,0/0,0/0,0.00,0.00,0,0,0,0,0,0,0,0 PS: 0 T10015A00 BD0015A00 T00000300 BC0000300 T900E6400 B500E6400 T80190000 BC0192257 T10015A00 BD0015A00 T00000300 Error 03 T00000300 BC0000300 T90101500 B50101500 T80190000 BC01921E3 PS: 1 [--> Slowly the gateway is recovering, the set room 
temperature is now correct (21.00 degr C): ] 00000011/00000000,90.00,00000000/00000000,100.00,0/0,21.00,0.00,0.00,0.00,33.89,0.00,0.00,0.00,0/0,0/0,0.00,0.00,0,0,0,0,0,0,0,0 PS: 0 T00000300 BC0000300 T901814D7 B501814D7 T80190000 B401920B0 T10015A00 BD0015A00 T00000300 BC0000300 T00110000 BC0110000 T80190000 B40191FB5 T90011500 B50011500 T00000300 BC0000300 PS: 1 [--> And now also the actual room temperature is correct (20.84 degr C): ] 00000011/00000000,21.00,00000000/00000000,100.00,0/0,21.00,0.00,0.00,20.84,31.71,0.00,0.00,0.00,0/0,0/0,0.00,0.00,0,0,0,0,0,0,0,0 [--> Based on these readings (and on other data like outdoor temperature), the Arduino decides to overrule the boiler as follows:] CS: 26.00 CH: 0 PS: 0 T90011500 R90011A00 B50011A00 A50011500 T00000300 R80000200 [--> From here onwards, all is fine again]
I think it might be a good idea to at least:
- describe the/some timing constraints on the sending of serial commands on http://otgw.tclcode.com/firmware.html#configuration : advise that the best "window of opportunity" in which to send a command is just after the reception of an OpenTherm message.
- advise powering off the OpenTherm gateway before disconnecting the serial port at the other end of a long cable, or tampering with the serial connection.
Re: Opentherm Gateway communication hangs
I think that I might have solved the problem. I changed the software in the Arduino to send its "PS=1" command ONLY after it has seen a full 'T-B' or 'T-R-B-A' OpenTherm message sequence, so I am making sure the command is NOT sent within the middle of such a sequence.
Looking at the serial output of the OpenTherm gateway, I noticed that the available time slot (in which the gateway is silent) is largest just after such a sequence has been run through. So I thought, why not just wait until the sequence is finished and then send "PS=1".
Since that change, I did not see ANY spontaneous reboots any more, nor any "SE", "OE" or "NG" indications. Communication is (for this evening) completely stable!
I will be watching this for the coming days and report if this has actually solved my problem.
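For anyone who wants to try the same approach, here is a rough Python sketch of the idea (not my actual Arduino code). It assumes every report arrives as one line of a letter (T, B, R or A) plus 8 hex digits, and it only recognises the two sequences mentioned above; other patterns that show up in the logs (for example an 'A' directly after a 'B' without an 'R') are not handled. The class name `SequenceTracker` is made up for this example.

```python
import re

# One report per line: T/B/R/A followed by 8 hex digits (inferred from the logs).
MESSAGE = re.compile(r"^[TBRA][0-9A-F]{8}$")


class SequenceTracker:
    """Tracks gateway reports and signals the end of a T-B or T-R-B-A sequence."""

    def __init__(self) -> None:
        self.state = "idle"

    def feed(self, line: str) -> bool:
        """Feed one received line; return True if a full sequence just ended."""
        line = line.strip()
        if not MESSAGE.match(line):
            self.state = "idle"  # errors, banners, responses: start over
            return False
        kind = line[0]
        if kind == "T":
            self.state = "T"
        elif kind == "R" and self.state == "T":
            self.state = "TR"
        elif kind == "B" and self.state == "T":
            self.state = "idle"
            return True          # plain T-B exchange is complete
        elif kind == "B" and self.state == "TR":
            self.state = "TRB"
        elif kind == "A" and self.state == "TRB":
            self.state = "idle"
            return True          # T-R-B-A exchange is complete
        else:
            self.state = "idle"  # unexpected order: start over
        return False
```

The "PS=1" command would then be sent only right after `feed()` returns `True`, which is the moment the gateway has just finished a sequence and the silent slot is at its longest.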
Re: Opentherm Gateway communication hangs
kikker wrote: I think it might be a good idea to at least:
describe the/some timing constraints on the sending of serial commands on http://otgw.tclcode.com/firmware.html#configuration : advise that the best "window of opportunity" in which to send a command is just after the reception of an OpenTherm message.
If I believed there was such a constraint, I would describe it. But I don't think there is. I have been sending serial commands to my gateway for years without paying attention to a "window of opportunity" and haven't had a problem. This includes sending the SC command every minute, so over the course of more than 5 years, that must have been over 2.5 million commands.
I do believe that a long serial cable may cause trouble, as I've mentioned on the troubleshooting page. If you found the problems coincide with simultaneous serial transmission and reception, it may be due to cross-talk in the serial cable.
It would be interesting to find the cause of the spontaneous reboots. Can you issue the PR=Q command after that has happened? My guess is that the result will be PR: Q=S, i.e. a reset due to a break condition on the serial line.
Schelte
Re: Opentherm Gateway communication hangs
hvxl wrote: If you found the problems coincide with simultaneous serial transmission and reception, it may be due to cross-talk in the serial cable.
I cannot check this. I can only check the characters as they are received by the laptop/Arduino, and I've never seen any garbage, nor noted any missing characters. But you may be very right that cross-talk is causing the OTGW to receive garbage, or even receive a break condition.
The troubleshooting section on serial communication (http://otgw.tclcode.com/debugging.html#serial) says:
"If you see only some of the characters (like "OpenThery 4.2"), your serial cable could be too long."
As said, I've never noted that characters were missing.
hvxl wrote: It would be interesting to find the cause of the spontaneous reboots. Can you issue the PR=Q command after that has happened? My guess is that the result will be PR: Q=S, i.e. a reset due to a break condition on the serial line.
I tried it, and sure enough, PR: Q=S is reported.
Anyway, the good News is that the communication is still solid as a rock. Keeping all fingers crossed again!
(By the way, if I type the word "News" with all lowercase letters, this appears in the preview with capital "N"! Now isn't that weird? Some Biblical influence in the software of this forum??)
Re: Opentherm Gateway communication hangs
This topic can be closed and marked [SOLVED]. The communication between the OpenTherm gateway and the Arduino has now been stable for over a week. By choosing the right time slot for the transmissions from the Arduino, the communication is not affected, even if the cable is pretty long as in my case.
Re: Opentherm Gateway communication hangs
I just googled into the causes of crosstalk in serial data connections, and found:
http://osd.com.au/serial-data-communications/
Interesting statement: "As an example of really poor practice, OSD once had a customer using a single twisted pair to carry transmit and receive data at about 9.6kbps. The crosstalk made this unworkable after just a few meters!"
Bingo. I am using a twisted pair of a CAT5 ethernet cable as the Tx/Rx pair.
Re: Opentherm Gateway communication hangs
Indeed, when using CAT5 for a serial connection you should pick a wire from two different twisted pairs for transmit and receive.
Schelte