Opentherm Gateway communication hangs

This Forum is about the Opentherm gateway (OTGW) from Schelte

Moderator: hvxl

kikker
Starting Member
Posts: 7
Joined: Thu Feb 19, 2015 12:08 pm

Opentherm Gateway communication hangs

Post by kikker »

For a few weeks now I have had an Opentherm gateway, which I use to control my boiler in cooperation with a heat pump inverter airco to heat my house.

The Opentherm gateway works fine. It is controlled by an Arduino UNO that uses serial communication to overrule the heating setpoint of the boiler, while at the same time controlling the heat pump inverter airco via an infrared LED using the appropriate infrared codes.

The problem with my Opentherm gateway is that every few days (at irregular intervals: sometimes after one day, sometimes after two weeks) the communication with the boiler hangs completely. It is then no longer possible to send serial control commands to the Opentherm gateway; they are simply ignored. Sometimes the communication between the room thermostat and the boiler is also gone, and the room thermostat indicates an "F" (for "Fault") on its LCD screen.

When the serial communication hangs, I sometimes connect my old laptop (with an RS232 connection) to see what the problem is. Usually I can see the T... and B... messages scrolling over the terminal screen, as expected. However, the gateway simply ignores the commands that I type.
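For readers unfamiliar with the T... and B... lines: each one is a source letter followed by a 32-bit OpenTherm frame in hexadecimal. The sketch below decodes such a line on a PC. The frame layout (parity bit, 3-bit message type, 8-bit data ID, 16-bit value) and the f8.8 temperature format come from the OpenTherm protocol, not from this thread, and the names `Frame` and `decode` are made up for illustration.

```python
import re
from dataclasses import dataclass

# Message type names from the OpenTherm protocol (bits 28..30 of the frame).
MSG_TYPES = {
    0: "READ-DATA", 1: "WRITE-DATA", 2: "INVALID-DATA", 3: "RESERVED",
    4: "READ-ACK", 5: "WRITE-ACK", 6: "DATA-INVALID", 7: "UNKNOWN-DATAID",
}

# One source letter (T, B, R or A) followed by exactly eight hex digits.
LINE = re.compile(r"([TBRA])([0-9A-F]{8})")


@dataclass
class Frame:
    source: str      # T = thermostat, B = boiler, R/A = sent by the gateway
    msg_type: str
    data_id: int
    value: int       # raw 16-bit data value
    parity_ok: bool  # a valid frame has an even number of 1 bits

    def as_f88(self) -> float:
        """Interpret the value as signed fixed point with 8 fractional bits."""
        raw = self.value - 0x10000 if self.value & 0x8000 else self.value
        return raw / 256.0


def decode(line: str) -> Frame:
    match = LINE.fullmatch(line.strip())
    if match is None:
        raise ValueError(f"not an OpenTherm message line: {line!r}")
    word = int(match.group(2), 16)
    return Frame(
        source=match.group(1),
        msg_type=MSG_TYPES[(word >> 28) & 0x7],
        data_id=(word >> 16) & 0xFF,
        value=word & 0xFFFF,
        parity_ok=bin(word).count("1") % 2 == 0,
    )
```

Applied to a line from the logs further down, `decode("B4019203D")` yields a READ-ACK from the boiler for data ID 25 with a value of about 32.24, which looks like a boiler water temperature in degrees C.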

The remedy is to power-cycle the Opentherm gateway. After the power-cycle, the gateway accepts my serial commands again. However, this remedy is not beneficial for the "wife acceptance factor" :wink:

At one stage, the gateway did not pass through any communication at all, neither from my RS232 laptop nor from the room thermostat to the boiler, even after a power-cycle. As a last resort, I then uploaded the "gateway.hex" file into the Opentherm gateway again, and things started working again. Only to fail again after a few days...

Can anybody give me hints or tips on what the problem could be? Why is my Opentherm Gateway hanging? Is the PIC controller perhaps broken?

Thanks!
kikker
Starting Member
Posts: 7
Joined: Thu Feb 19, 2015 12:08 pm

Re: Opentherm Gateway communication hangs

Post by kikker »

Just upgraded to gateway.hex version 4.2.3 (Feb 12, 2015). Maybe that helps. Keeping fingers crossed! :)
hvxl
Senior Member
Posts: 1966
Joined: Sat Jun 05, 2010 11:59 am

Re: Opentherm Gateway communication hangs

Post by hvxl »

I don't want to crush your hope, but the symptoms you describe are not something that I have fixed in any recent firmware version. It sounds like the problems start with the serial interface and the last change I made in that area was in one of the 4.0 beta versions. So, unless you were using very old firmware before, there is little chance that 4.2.3 will improve the situation.

In fact, since no one else has reported similar problems, I expect the issue is caused by something unique in your setup. Could it be that the Arduino is doing something weird on the serial interface? Sending strange characters, sending lots of data, pulling the line low, something like that? I tried to make the gateway resilient to such things, but that hasn't been extensively tested. Do you get the same problems if you run the gateway without the Arduino attached?
Schelte
kikker
Starting Member
Posts: 7
Joined: Thu Feb 19, 2015 12:08 pm

Re: Opentherm Gateway communication hangs

Post by kikker »

Thanks hvxl! Although your comment did not help me directly, it seems to have pointed me in the right direction.

Indeed, the upgrade to version 4.2.3 did not help. The communication hung again on Saturday evening...

I'm sorry if my post is long, but I'd like to share my findings of yesterday and today.

Some information about the "hardware" used:
- Boiler is a Remeha Avanta 28c
- Room thermostat is a Honeywell Round Modulation T87M1003, model 2001-2014
- Serial level shifter between Arduino and OpenTherm gateway is a MAX3232 module

It looks like the OpenTherm gateway is very picky about the moment at which a command (or any form of serial communication) is sent to it. If it receives characters while it is itself sending characters, the effect is unpredictable, ranging from an 'SE', 'OE' or 'NG' response to a reboot or a hang.
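To spot these symptoms automatically in a captured log, each line from the gateway can be sorted into a few rough categories. The following is only a sketch based on the line shapes visible in the logs in this thread; the category names and the `classify` function are invented here and are not part of the gateway firmware.

```python
import re

MESSAGE = re.compile(r"[TBRA][0-9A-F]{8}")   # e.g. T80190000
REPLY = re.compile(r"([A-Z]{2}): (.*)")      # e.g. "CS: 30.00" or "CX: NG"
ERROR_CODES = {"SE", "OE", "NG"}             # the codes mentioned in this post
BANNER = "OpenTherm Gateway"                 # printed when the gateway starts


def classify(line: str) -> str:
    line = line.strip()
    if BANNER in line:
        # A banner that starts mid-line means a report was cut off by a reboot.
        return "reboot" if line.startswith(BANNER) else "reboot-mid-line"
    if MESSAGE.fullmatch(line):
        return "message"
    if line in ERROR_CODES:
        return "error"
    reply = REPLY.fullmatch(line)
    if reply:
        # A reply such as "CX: NG" reports a command that was rejected.
        return "error" if reply.group(2) in ERROR_CODES else "reply"
    return "other"  # summary lines, "Error 03", garbage, ...
```

Counting the "reboot", "reboot-mid-line" and "error" lines per hour gives a simple measure of whether a change to the setup actually improved things.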

I hooked up an old laptop (with RS232 port and Tera Term) to try and monitor what is happening. I observed the following:
  • When I disconnect the serial port to the OpenTherm gateway (necessary when uploading a new binary into my Arduino UNO; Arduino Uno USB uses the same hardware as the serial port, and uploading a binary from within the Arduino IDE requires the serial port to be disconnected), I can see the following happen on the serial line:

    Code: Select all

      R80OpenTherm Gateway 4.2.3
      T0OpenTherm Gateway 4.2.3
      T0OpenTherm Gateway 4.2.3
      T0OpenTherm Gateway 4.2.3
      T0OpenTherm Gateway 4.2.3
      T00OpenTherm Gateway 4.2.3
      T0OpenTherm Gateway 4.2.3
      T00OpenTherm Gateway 4.2.3
      [...]
    
    I think the OpenTherm gateway is trying to transmit a report ('Rxxxxxxxx', 'Txxxxxxxx', ...), but then gets interrupted by the "chatter" on the serial port, and then reboots.

    Note that the serial port is disconnected at the Arduino side; there is about 12 meters of cable between the OpenTherm gateway (near the boiler) and the Arduino (near the heat pump).
  • Here is an example where a command ('CS') is received incorrectly by the OpenTherm gateway (as 'CX'):

    Code: Select all

      [...]
      T80000200
      BC0000208
      T900E6400
      B500E6400
      PS: 1
      00000010/00001000,10.00,00000011/00000011,100.00,0/0,17.00,0.00,0.00,21.11,29.30,0.00,0.00,0.00,55/40,40/20,0.00,40.00,65535,65535,51112,53704,6592,8556,526,685
      PS: 0
      T80190000
      OpenTherm Gateway 4.2.3
      SE
      CX: NG
      T80190000
      B4019203D
      T10010A00
      BD0010A00
      [...]
    
  • The following is a "normal" communication between the Arduino and the gateway (only showing the transmissions from the gateway, but that says enough):

    Code: Select all

      [...]
      AC0000300
      T00110000
      BC0110000
      T80190000
      PS: 1
      00000010/00000000,36.00,00000011/00000011,100.00,0/0,21.00,0.00,0.00,20.57,25.20,0.00,0.00,0.00,55/40,40/20,0.00,40.00,65535,65535,51128,53720,6592,8556,526,685
      CS: 30.00
      CH: 0
      PS: 0
      T00000300
      R80000200
      B40000200
      AC0000300
      [...]
    
    Explanation: the Arduino requests a report (PS=1) from the OpenTherm gateway, then decides (based on the reported values) to overrule the boiler settings (CS=30, CH=0), and then enables message reporting again (PS=0).

    Now this is what happens when it goes wrong:

    Code: Select all

      [...]
      BD0011E00
      A50012300
      T00000300
      R80000200
      SE
      B40000200
      AC0000300
      CS: 30.00
      CH: 0
      PS: 0
      T90101500
      B50101500
      [...]
    
    Explanation: the Arduino requests a report (PS=1) from the OpenTherm gateway, but somehow the transmission of that command failed, resulting in an 'SE'. The Arduino could not determine any values, but decided to send the boiler overrule settings anyway. That is very dangerous, since the transmission of CS=xx and CH=x might now coincide with a report sent by the OpenTherm gateway, potentially causing instability on the gateway.

    I changed the Arduino software to *only* transmit the boiler overrule commands if it is sure that the PS=1 command was successful (i.e. the OpenTherm gateway is now sure to be 'silent'). If it was *not* successful (no response, or a response that is too short), the Arduino waits for the next gateway report. If no report ('Rxxxxxxxx', 'Txxxxxxxx', ...) is received within X seconds, then apparently the PS=1 command was received by the gateway, but the response did not make it to the Arduino. In that case, it just tries again (re-sends the PS=1 command and waits for the response).
  • Here is a spontaneous reboot:

    Code: Select all

      [...]
      T10012800
      R10010A00
      BD0010A00
      AD0012800
      T00000300
      R80OpenTherm Gateway 4.2.3
      T00000300
      BC0000300
      T90101500
      B50101500
      T80190000
      B40191580
      T10012800
      BD0012800
      T00000300
      B40000302
      T101814AB
      BD01814AB
      T80190000
      BC0191599
      T10012800
      BD0012800
      T00000300
      BC000030A
      OE
      SE
      T00110000
      B40110400
      T80190000
      BC01918C7
      [...]
    
    These spontaneous reboots force me to re-send the boiler overrule settings to the gateway every XX seconds, not just when they change. Spontaneous reboots may occur every few minutes.
  • Below is what happened after having the serial port to the gateway disconnected for quite a long time (~10 minutes):

    Code: Select all

      [...]
      T10OpenTherm Gateway 4.2.3
      T1OpenTherm Gateway 4.2.3
      T10OpenTherm Gateway 4.2.3
      T10OpenTherm Gateway 4.2.3
      T1OpenTherm Gateway 4.2.3
      T10OpenTherm Gateway 4.2.3
      T10OpenTherm Gateway 4.2.3
      T10OpenTherm Gateway 4.2.3
      T10OpenTherm Gateway 4.2.3
      T10OpenTherm Gateway 4.2.3
      T1OpenTherm Gateway 4.2.3
      T10OpenTherm Gateway 4.2.3
      T10OpenTherm Gateway 4.2.3
      T10OpenTherm Gateway 4.2.3
      T10OpenTherm Gateway 4.2.3
      T10OpenTherm Gateway 4.2.3
      T10OpenTherm Gateway 4.2.3
      T10OpenTherm Gateway 4.2.3
      T10OpenTherm Gateway 4.2.3
      T10OpenTherm Gateway 4.2.3
      T10OpenTherm Gateway 4.2.3
      T10OpenTherm Gateway 4.2.3
      OE
      T1OpenTherm Gateway 4.2.3
      T10OpenTherm Gateway 4.2.3
      T10OpenTherm Gateway 4.2.3
      T10OpenTherm Gateway 4.2.3
      [--> Here the gateway was connected again to the Arduino; lots of 'Error 03' messages and an occasional reboot: ]
      OpenTherm Gateway 4.2.3
      T00000300
      Error 03
      T00000300
      Error 03
      T00000300
      Error 03
      T00000300
      Error 03
      T00000300
      Error 03
      T00000300
      Error 03
      PS: 1
      00000000/00000000,0.00,00000000/00000000,0.00,0/0,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0/0,0/0,0.00,0.00,0,0,0,0,0,0,0,0
      ErrOpenTherm Gateway 4.2.3
      T00000300
      Error 03
      T00000300
      Error 03
      T00000300
      Error 03
      T00000300
      Error 03
      T00000300
      Error 03
      PS: 1
      00000000/00000000,0.00,00000000/00000000,0.00,0/0,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0/0,0/0,0.00,0.00,0,0,0,0,0,0,0,0
      Error 03
      PS: 0
      T00000300
      Error 03
      T00000300
      Error 03
      T00000300
      Error 03
      T00000300
      Error 03
      T00000300
      Error 03
      T00000300
      Error 03
      PS: 1
      00000000/00000000,0.00,00000000/00000000,0.00,0/0,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0/0,0/0,0.00,0.00,0,0,0,0,0,0,0,0
      PS: 0
      T00000300
      Error 03
      T00000300
      Error 03
      T00000300
      BC000030A
      T90101500
      Error 03
      T90101500
      Error 03
      T90101500
      Error 03
      T90101500
      Error 03
      T90101500
      B50101500
      T80190000
      Error 03
      PS: 1
      00000011/00001010,0.00,00000000/00000000,0.00,0/0,21.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0/0,0/0,0.00,0.00,0,0,0,0,0,0,0,0
      PS: 0
      T80190000
      Error 03
      T80190000
      B4019264C
      T10012800
      Error 03
      T10012800
      Error 03
      T10012800
      Error 03
      T10012800
      Error 03
      T10012800
      Error 03
      T10012800
      Error 03
      PS: 1
      00000011/00001010,40.00,00000000/00000000,0.00,0/0,21.00,0.00,0.00,0.00,38.30,0.00,0.00,0.00,0/0,0/0,0.00,0.00,0,0,0,0,0,0,0,0
      Error 03
      PS: 0
      T10012800
      Error 03
      T10012800
      Error 03
      T10012800
      BD0012800
      T00000300
      Error 03
      T00000300
      Error 03
      T00000300
      Error 03
      [--> A reboot now and then: ]
      OpenTherm Gateway 4.2.3
      T80190000
      Error 03
      T80190000
      Error 03
      T80190000
      B4019264C
      T10015A00
      Error 03
      T10015A00
      Error 03
      T10015A00
      BD0015A00
      T00000300
      BC0000300
      T00OpenTherm Gateway 4.2.3
      T00030000
      B4003410B
      T80190000
      Error 03
      T80190000
      Error 03
      T80190000
      Error 03
      T80190000
      B401920E6
      T10015A00
      Error 03
      T10015A00
      BD0015A00
      T00000300
      BC0000300
      T807F0000
      PS: 1
      00000011/00000000,90.00,00000000/00000000,0.00,0/0,0.00,0.00,0.00,0.00,32.90,0.00,0.00,0.00,0/0,0/0,0.00,0.00,0,0,0,0,0,0,0,0
      PS: 0
      T80190000
      Error 03
      T80190000
      Error 03
      T80190000
      Error 03
      T80190000
      B4019215C
      T10015A00
      Error 03
      T10015A00
      Error 03
      T10015A00
      Error 03
      T10015A00
      Error 03
      PS: 1
      00000011/00000000,90.00,00000000/00000000,0.00,0/0,0.00,0.00,0.00,0.00,33.36,0.00,0.00,0.00,0/0,0/0,0.00,0.00,0,0,0,0,0,0,0,0
      Error 03
      PS: 0
      T10015A00
      BD0015A00
      T00000300
      BC0000300
      T1002010D
      BD002010D
      T80190000
      Error 03
      T80190000
      Error 03
      T80190000
      BC01921E6
      T10015A00
      BD0015A00
      PS: 1
      00000011/00000000,90.00,00000000/00000000,0.00,0/0,0.00,0.00,0.00,0.00,33.90,0.00,0.00,0.00,0/0,0/0,0.00,0.00,0,0,0,0,0,0,0,0
      PS: 0
      T907E0D16
      BF07E0D16
      T80190000
      B401921D7
      T10015A00
      Error 03
      T10015A00
      Error 03
      T10015A00
      Error 03
      T10015A00
      BD0015A00
      T00000300
      BC0000300
      T00090000
      BF0090000
      AC0090000
      T80190000
      B40192266
      PS: 1
      00000011/00000000,90.00,00000000/00000000,0.00,0/0,0.00,0.00,0.00,0.00,34.40,0.00,0.00,0.00,0/0,0/0,0.00,0.00,0,0,0,0,0,0,0,0
      PS: 0
      T10015A00
      BD0015A00
      T00000300
      BC0000300
      T900E6400
      B500E6400
      T80190000
      BC0192257
      T10015A00
      BD0015A00
      T00000300
      Error 03
      T00000300
      BC0000300
      T90101500
      B50101500
      T80190000
      BC01921E3
      PS: 1
      [--> Slowly the gateway is recovering, the set room temperature is now correct (21.00 degr C): ]
      00000011/00000000,90.00,00000000/00000000,100.00,0/0,21.00,0.00,0.00,0.00,33.89,0.00,0.00,0.00,0/0,0/0,0.00,0.00,0,0,0,0,0,0,0,0
      PS: 0
      T00000300
      BC0000300
      T901814D7
      B501814D7
      T80190000
      B401920B0
      T10015A00
      BD0015A00
      T00000300
      BC0000300
      T00110000
      BC0110000
      T80190000
      B40191FB5
      T90011500
      B50011500
      T00000300
      BC0000300
      PS: 1
      [--> And now also the actual room temperature is correct (20.84 degr C): ]
      00000011/00000000,21.00,00000000/00000000,100.00,0/0,21.00,0.00,0.00,20.84,31.71,0.00,0.00,0.00,0/0,0/0,0.00,0.00,0,0,0,0,0,0,0,0
      [--> Based on these readings (and on other data like outdoor temperature),
           the Arduino decides to overrule the boiler as follows:]
      CS: 26.00
      CH: 0
      PS: 0
      T90011500
      R90011A00
      B50011A00
      A50011500
      T00000300
      R80000200 
      [--> From here onwards, all is fine again]
    
I think it might be a good idea to at least:
  • describe the/some timing constraints on the sending of serial commands on http://otgw.tclcode.com/firmware.html#configuration : advise that the best "window of opportunity" in which to send a command is just after the reception of an OpenTherm message.
  • advise powering-off the OpenTherm gateway before disconnecting the serial port at the other end of a long cable, or tampering with the serial connection.
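The rule described earlier in this post (send CS/CH only after PS=1 has been confirmed, otherwise try again) can be sketched in Python as follows. This is a simplified model, not the actual Arduino code: `send` and `read_line` are hypothetical callables standing in for the serial port (`read_line` returns None on timeout), the function names are invented, and the count of 25 summary fields is simply what the PS=1 summaries in this thread show for firmware 4.2.3.

```python
SUMMARY_FIELDS = 25  # comma-separated fields in the summaries logged above


def looks_like_summary(line):
    """A response that is too short has fewer fields than a full summary."""
    return line.count(",") == SUMMARY_FIELDS - 1


def request_summary(send, read_line, max_attempts=3, max_lines=4):
    """Send PS=1 until the gateway confirms it; return the summary or None."""
    for _ in range(max_attempts):
        send("PS=1")
        acknowledged = False
        for _ in range(max_lines):
            line = read_line()
            if line is None:          # timeout: nothing more to read
                break
            line = line.strip()
            if line == "PS: 1":
                acknowledged = True   # reporting is off, gateway is silent
            elif acknowledged and looks_like_summary(line):
                return line
    return None


def overrule_boiler(send, read_line, decide):
    """Transmit CS/CH only when the PS=1 handshake has succeeded."""
    summary = request_summary(send, read_line)
    if summary is None:
        return False                  # never send the overrule blind
    setpoint, ch_enable = decide(summary)
    send(f"CS={setpoint}")
    send(f"CH={ch_enable}")
    send("PS=0")                      # switch message reporting back on
    return True
```

Passing the serial port in as two callables keeps the logic testable on a PC without a gateway attached. Note that the sketch gives up without sending PS=0, so a real implementation would still have to decide how to recover if reporting was left switched off.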
kikker
Starting Member
Posts: 7
Joined: Thu Feb 19, 2015 12:08 pm

Re: Opentherm Gateway communication hangs

Post by kikker »

I think that I might have solved the problem. I changed the software in the Arduino to send its "PS=1" command ONLY after it has seen a full 'T-B' or 'T-R-B-A' OpenTherm message sequence, so I am making sure the command is NOT sent in the middle of such a sequence.

Looking at the serial output of the OpenTherm gateway, I noticed that the available time slot (in which the gateway is silent) is largest just after such a sequence has been run through. So I thought, why not just wait until the sequence is finished and then send "PS=1".

Since that change, I did not see ANY spontaneous reboots any more, nor any "SE", "OE" or "NG" indications. Communication is (for this evening) completely stable!

I will be watching this for the coming days and report if this has actually solved my problem.
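The "wait for a complete sequence" rule from this post could look roughly like the sketch below. It is an illustration in Python rather than the Arduino code itself, and the class name `SequenceTracker` is invented. The assumption is that an exchange is finished after 'B' when the gateway passed the thermostat request on unchanged, and after 'A' when the gateway sent its own request ('R') to the boiler.

```python
import re

MESSAGE = re.compile(r"([TBRA])[0-9A-F]{8}")


class SequenceTracker:
    """Tracks T-B and T-R-B-A exchanges in the gateway's serial output."""

    def __init__(self):
        self.in_sequence = False
        self.saw_request = False  # True once an 'R' line was seen

    def feed(self, line):
        """Return True if this line completes an exchange."""
        match = MESSAGE.fullmatch(line.strip())
        if match is None:
            self.in_sequence = False  # anything unexpected resets the state
            return False
        source = match.group(1)
        if source == "T":
            self.in_sequence, self.saw_request = True, False
            return False
        if not self.in_sequence:
            return False
        if source == "R":
            self.saw_request = True
            return False
        if source == "A" or (source == "B" and not self.saw_request):
            self.in_sequence = False
            return True
        return False              # 'B' after 'R': an 'A' is still to come
```

The command would then be sent as soon as `feed()` returns True. One caveat: the long log earlier in the thread also contains a T-B-A exchange without an 'R', in which case this sketch already reports completion after the 'B' line.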
hvxl
Senior Member
Posts: 1966
Joined: Sat Jun 05, 2010 11:59 am

Re: Opentherm Gateway communication hangs

Post by hvxl »

kikker wrote: I think it might be a good idea to at least:

describe the/some timing constraints on the sending of serial commands on http://otgw.tclcode.com/firmware.html#configuration : advise that the best "window of opportunity" in which to send a command is just after the reception of an OpenTherm message.
If I believed there was such a constraint, I would describe it. But I don't think there is. I have been sending serial commands to my gateway for years without paying attention to a "window of opportunity" and haven't had a problem. This includes sending the SC command every minute, so over the course of more than 5 years, that must have been over 2.5 million commands.
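As a quick check of that figure, one command per minute over five years works out as follows (leap days ignored):

```python
MINUTES_PER_YEAR = 365 * 24 * 60     # 525,600 minutes, ignoring leap days
commands = 5 * MINUTES_PER_YEAR      # one SC command per minute for 5 years
print(f"{commands:,}")               # prints 2,628,000
```

So "over 2.5 million" is consistent with a bit more than five years of once-a-minute commands.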

I do believe that a long serial cable may cause trouble, as I've mentioned on the troubleshooting page. If you found the problems coincide with simultaneous serial transmission and reception, it may be due to cross-talk in the serial cable.

It would be interesting to find the cause of the spontaneous reboots. Can you issue the PR=Q command after that has happened? My guess is that the result will be PR: Q=S, i.e. a reset due to a break condition on the serial line.
Schelte
kikker
Starting Member
Posts: 7
Joined: Thu Feb 19, 2015 12:08 pm

Re: Opentherm Gateway communication hangs

Post by kikker »

hvxl wrote: If you found the problems coincide with simultaneous serial transmission and reception, it may be due to cross-talk in the serial cable.
I cannot check this. I can only check the characters as they are received by the laptop/Arduino, and I've never seen any garbage, nor noticed any missing characters. But you may well be right that cross-talk is causing the OTGW to receive garbage, or even a break condition.

The troubleshooting section on serial communication (http://otgw.tclcode.com/debugging.html#serial) says:

"If you see only some of the characters (like "OpenThery 4.2"), your serial cable could be too long."

As said, I've never noticed that characters were missing.
hvxl wrote: It would be interesting to find the cause of the spontaneous reboots. Can you issue the PR=Q command after that has happened? My guess is that the result will be PR: Q=S, i.e. a reset due to a break condition on the serial line.
I tried it, and sure enough, PR: Q=S is reported.
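A controller could run this check itself after every detected reboot. The sketch below parses the reply to PR=Q; the function name is invented, and the table deliberately contains only the one cause code confirmed in this thread (S). Any other code is reported as unknown rather than guessed at.

```python
import re

# Only "S" is confirmed in this thread; look up other codes in the
# firmware documentation before adding them here.
RESET_CAUSES = {"S": "break condition on the serial line"}


def parse_reset_cause(reply):
    """Return (code, description) for a 'PR: Q=x' reply, else None."""
    match = re.fullmatch(r"PR: Q=(\w)", reply.strip())
    if match is None:
        return None
    code = match.group(1)
    return code, RESET_CAUSES.get(code, "unknown (see firmware documentation)")
```

For the reply reported here, `parse_reset_cause("PR: Q=S")` returns the code "S" together with the break-condition description.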

Anyway, the good News is that the communication is still solid as a rock :D . Keeping all fingers crossed again!

(By the way, if I type the word "News" with all lowercase letters, this appears in the preview with a capital "N"! Now isn't that weird? :lol: Some Biblical influence in the software of this forum??)
kikker
Starting Member
Posts: 7
Joined: Thu Feb 19, 2015 12:08 pm

Re: Opentherm Gateway communication hangs

Post by kikker »

This topic can be closed and marked [SOLVED]. The communication between the Opentherm Gateway and the Arduino has now been stable for over a week :). By choosing the right time slot for the transmissions from the Arduino, the communication is not affected, even if the cable is pretty long as in my case.
kikker
Starting Member
Posts: 7
Joined: Thu Feb 19, 2015 12:08 pm

Re: Opentherm Gateway communication hangs

Post by kikker »

I just googled the causes of crosstalk in serial data connections, and found:

http://osd.com.au/serial-data-communications/

Interesting statement: "As an example of really poor practice, OSD once had a customer using a single twisted pair to carry transmit and receive data at about 9.6kbps. The crosstalk made this unworkable after just a few meters!"

Bingo. I am using a twisted pair of a CAT5 ethernet cable as the Tx/Rx pair.
hvxl
Senior Member
Posts: 1966
Joined: Sat Jun 05, 2010 11:59 am

Re: Opentherm Gateway communication hangs

Post by hvxl »

Indeed, when using CAT5 for a serial connection you should pick wires from two different twisted pairs for transmit and receive.
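A wiring plan can be checked against this rule with a few lines of code. The sketch below assumes the standard CAT5 colour scheme, in which each solid-coloured wire is twisted together with its white-striped partner. The wire names and the `share_pair` function are illustrative only.

```python
def pair_of(wire):
    """Return the pair colour: 'white-orange' and 'orange' are one pair."""
    return wire.replace("white-", "")


def share_pair(wire_a, wire_b):
    """True if both wires belong to the same twisted pair."""
    return pair_of(wire_a) == pair_of(wire_b)


# Tx and Rx on the same pair reproduces the crosstalk problem:
print(share_pair("orange", "white-orange"))   # True
# Tx and Rx on different pairs follows the advice above:
print(share_pair("orange", "green"))          # False
```

A common practice, not something stated in this thread, is to connect the partner wire of each signal's pair to ground.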
Schelte