Interoperability between extended MS/TP frame capable devices and legacy devices

The original BACnet MS/TP data link specification supports an NPDU length of at most 501 bytes, much shorter than the 1497 bytes of the Ethernet and IP data links. This limits transmission performance and adds complexity at the application layer, especially when two IP/Ethernet networks are joined by an MS/TP network.

The extended frame was designed to solve this problem. The details can be found here. Briefly, the addendum (135-2012an) added two new frame types:

  • 32: BACnet Extended Data Expecting Reply
  • 33: BACnet Extended Data Not Expecting Reply

Frame type 32 extends frame type 5 (BACnet Data Expecting Reply). It differs in that it is COBS-encoded and carries an NPDU of 502 to 1497 bytes.

In the same way, frame type 33 extends frame type 6 (BACnet Data Not Expecting Reply).
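As a rough illustration of the length rule above, the sketch below picks a frame type from the NPDU length. The function name `select_frame_type` and the `extended_enabled` switch are hypothetical names chosen for this example; they are not taken from the standard or from BACRouter.

```python
# Frame type codes listed above.
DATA_EXPECTING_REPLY = 5
DATA_NOT_EXPECTING_REPLY = 6
EXT_DATA_EXPECTING_REPLY = 32
EXT_DATA_NOT_EXPECTING_REPLY = 33

LEGACY_MAX_NPDU = 501     # limit of the original MS/TP data frames
EXTENDED_MAX_NPDU = 1497  # limit of the extended data frames


def select_frame_type(npdu_len, expecting_reply, extended_enabled=True):
    """Pick the MS/TP frame type for an NPDU (hypothetical helper)."""
    if not 0 < npdu_len <= EXTENDED_MAX_NPDU:
        raise ValueError("NPDU length out of range")
    if npdu_len <= LEGACY_MAX_NPDU:
        # Short NPDUs still travel in the original frame types.
        return DATA_EXPECTING_REPLY if expecting_reply else DATA_NOT_EXPECTING_REPLY
    if not extended_enabled:
        # A sender without extended frames cannot carry this NPDU at all.
        raise ValueError("NPDU too long for a non-extended frame")
    return EXT_DATA_EXPECTING_REPLY if expecting_reply else EXT_DATA_NOT_EXPECTING_REPLY


print(select_frame_type(480, True))    # short NPDU, reply expected
print(select_frame_type(1200, False))  # long NPDU, no reply expected
```

Only NPDUs longer than 501 bytes ever need types 32 and 33, which is why a legacy device that never receives such NPDUs is unaffected.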

Extended frame support was added to the BACnet standard in revision 16. Many devices already installed or still on the market do not support it. The interoperability between extended frame capable devices and legacy devices is discussed below.

  • Non-router legacy device and extended frame capable device: All messages sent to the legacy device are application layer messages, so the sender must respect the “Max APDU Length Accepted” taken from the Device object property or from the confirmed service request primitive, and the NPDU length will not exceed 501 bytes. This configuration causes no problem.
  • Legacy router and extended frame capable device: An NPDU longer than 501 bytes that should be relayed to another network through the legacy router will be discarded, and no Reject-Message-To-Network with reason “Message Too Long” will be returned. Moreover, the “Max APDU Length Accepted” of the legacy router may be determined by another port whose NPDU length limit is larger than 501 bytes (the standard allows this), so an NPDU addressed to the router’s local application layer may still be carried in an extended frame and discarded. This configuration may cause problems in the field.

BACRouter has supported extended frames since very early versions. Firmware version 3.18 introduced an “Extended frame” option in BACRouter’s MS/TP configuration. If a legacy router that does not support extended frames is on the bus, this option should be disabled to avoid interoperability issues.

It’s worth noting that even with the “Extended frame” option disabled, BACRouter, unlike a legacy router, remains interoperable with extended frame capable devices.

(Screenshot updated on Aug 5, 2021. Because extended frame support is mandatory from standard revision 16, this option has been moved to extended configure mode from firmware 4.13.)

 

 

MSTP message delay guarantee

There are two types of BACnet service: unconfirmed and confirmed. The sender (client) of a confirmed service request waits for the reply until a timeout expires.

Excess message delay usually has no side effect on unconfirmed services. For a confirmed service request or reply, excess delay results in poor network performance, because the client drops the reply once the timeout has expired.

Furthermore, a reply to a confirmed service that arrives too late can break application logic. Each confirmed service request has an invokeID, and the reply carries the same invokeID. The invokeID ranges from 0 to 255, so on a busy client the values are soon exhausted and reused. If the invokeID of a delayed reply has been reused by another service, the reply is taken as the answer to the later request. For example:

  • The client sends WriteProperty service A to device X, object Y, property Z with invokeID 0. The request or the reply is delayed.
  • The client waits until the timeout expires. Service A fails, so invokeID 0 is reclaimed.
  • The client sends WriteProperty service B to device X, object U, property V. InvokeID 0 is chosen.
  • The reply to service A arrives. Because its invokeID matches that of service B, the client believes that service B succeeded.
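The four steps above can be replayed with a toy client that matches replies by invokeID only. The `Client` class and its method names are made up for this illustration; a real BACnet stack tracks more state per transaction.

```python
class Client:
    """Toy confirmed-service client that matches replies by invokeID only."""

    def __init__(self):
        self.pending = {}  # invokeID -> description of the outstanding request

    def send(self, invoke_id, description):
        self.pending[invoke_id] = description

    def timeout(self, invoke_id):
        # The request failed, so its invokeID returns to the free pool.
        self.pending.pop(invoke_id, None)

    def receive_reply(self, invoke_id):
        # Returns the request the client believes was answered,
        # or None if no request is waiting on this invokeID.
        return self.pending.pop(invoke_id, None)


client = Client()
client.send(0, "B-is-not-sent-yet: service A, WriteProperty X/Y/Z")
client.timeout(0)                                   # service A times out
client.send(0, "service B, WriteProperty X/U/V")    # invokeID 0 is reused
matched = client.receive_reply(0)                   # late reply to service A
print(matched)
```

The late reply to service A is credited to service B, which is exactly the mismatch described above.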

For high-speed data links such as Ethernet or IP, the message delay is negligible, but on MSTP there are several possible reasons for excess message delay:

  1. Signal noise.
  2. Incorrect or improper device configuration (baudrate, max-master, max_info_frames).
  3. Excess traffic.
  4. Slow devices.

To avoid invokeID conflicts and improve network performance, BACRouter with firmware version >=2.0 implements a message delay guarantee of 10 seconds (changed to 6 seconds from v4.17 to respect the BACnet default APDU_Timeout). Messages that cannot be sent within this delay are dropped by BACRouter.
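A minimal sketch of such a delay bound is shown below, assuming a simple FIFO send queue in which each message is stamped on arrival. `DelayBoundedQueue` is a hypothetical name; BACRouter’s internal implementation is not described in this article.

```python
from collections import deque

MAX_DELAY_S = 6.0  # delay guarantee quoted above for firmware v4.17 and later


class DelayBoundedQueue:
    """FIFO send queue that drops messages older than max_delay (illustrative)."""

    def __init__(self, max_delay=MAX_DELAY_S):
        self.max_delay = max_delay
        self.dropped = 0
        self._queue = deque()  # entries are (enqueue_time, message)

    def put(self, message, now):
        self._queue.append((now, message))

    def get(self, now):
        """Return the oldest message still within the delay budget, else None."""
        while self._queue:
            queued_at, message = self._queue.popleft()
            if now - queued_at <= self.max_delay:
                return message
            self.dropped += 1  # too old: sending it now could confuse the client
        return None


q = DelayBoundedQueue()
q.put("request A", now=0.0)
q.put("request B", now=5.0)
print(q.get(now=7.0), q.dropped)
```

Times are passed in explicitly so the behaviour can be checked without waiting; a real device would read a monotonic clock instead.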

Fixed/Auto/Forced baudrate for MSTP

Update on 2020-11-20 (Appended info. about JCI module)

The MSTP baudrate is always painful for field technicians. If the baudrate is wrong, a device can’t join the MSTP bus.

Most devices have a fixed baudrate. To modify the setting, a technician has to physically access the device and change DIP switches. Some devices support changing the baudrate through a BACnet service, but they must already have the correct baudrate setting before a BACnet service can reach them.

Some vendors implement auto baudrate, but it introduces more problems than it solves. There are two types of auto baudrate mechanism:

  • Starting detection: The device detects and adopts the baudrate on the bus when it starts, then never changes it.
  • Dynamic detection: The device behaves like the starting detection type when it starts, but if it sees errors on the bus for a predefined time, it assumes the baudrate has changed and detects it again.

With both types, it is difficult to change the baudrate while devices are working. Simply changing the baudrate on all fixed baudrate devices does not work, because the auto baudrate devices keep working at the old baudrate. The solution is to power off all auto baudrate devices, then power them all on together (don’t power cycle auto baudrate devices one by one).

Our new firmware (>=2.0) introduces a new baudrate management mechanism (patent pending). BACRouter has three baudrate modes: Fixed, Auto and Forced:

  • Fixed baudrate mode works like most traditional devices.
  • Auto baudrate mode is the same as the dynamic detection described above. Detection is re-initiated after 10 error frames, which usually takes several seconds.
  • Forced baudrate mode is the same as auto mode, except that when the device gets the token, it changes the baudrate to the predefined value.

When a device in forced baudrate mode is present, the baudrate on the bus is forced to the predefined value. Devices in auto baudrate mode automatically synchronize to it. Devices in fixed baudrate mode whose setting differs from the predefined value will not be seen on the bus (this is easy to check in the “Recent active devices” field of BACRouter’s runtime info). Devices of the starting detection type may run at the wrong baudrate and will not be seen on the bus either, but power cycling them one by one will synchronize them to the forced value.

More than one device in forced baudrate mode can coexist on a bus, but they must all use the same baudrate value.
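The three modes can be pictured with the small state holder below. It is only a sketch of the behaviour described above: the class `BaudrateManager`, its method names, and the choice to count consecutive error frames are all assumptions made for illustration, and the actual detection procedure is not shown.

```python
class BaudrateManager:
    """Illustrative model of the fixed / auto / forced baudrate modes."""

    def __init__(self, mode, configured, error_threshold=10):
        assert mode in ("fixed", "auto", "forced")
        self.mode = mode
        self.configured = configured        # predefined baudrate
        self.current = configured           # baudrate currently in use
        self.error_threshold = error_threshold
        self.errors = 0                     # consecutive error frames (assumption)

    def on_valid_frame(self):
        self.errors = 0

    def on_error_frame(self):
        """Return True when baudrate detection should be started again."""
        if self.mode == "fixed":
            return False                    # fixed mode never re-detects
        self.errors += 1
        if self.errors >= self.error_threshold:
            self.errors = 0
            return True
        return False

    def on_detected(self, baudrate):
        # Result of a (not shown) detection pass; ignored in fixed mode.
        if self.mode != "fixed":
            self.current = baudrate

    def on_token_received(self):
        # Forced mode pulls the bus back to the predefined value.
        if self.mode == "forced":
            self.current = self.configured
        return self.current


forced = BaudrateManager("forced", configured=76800)
forced.on_detected(9600)           # the bus was found running at 9600
print(forced.on_token_received())  # baudrate used after the token arrives
```

In this model a forced-mode device follows the bus like an auto-mode device until it holds the token, and only then imposes its own value.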

JCI FEC/IOM modules implement the dynamic baudrate detection mechanism; in our test the re-detect interval was about 150 seconds.

On a test bus, BACRouter cooperated perfectly with FEC2611 and IOM3731. The baudrate was dynamically controlled by BACRouter from 9.6k to 76.8k, and FEC2611 and IOM3731 caught up after 2.5 minutes.

Max_info_frames by token occupy time

From firmware ver2.0, BACRouter introduces a new “Max_info_frames by token occupy time” feature.

In the BACnet MSTP standard, a master device may hold the token until it has sent up to Max_info_frames frames. The default value of Max_info_frames is 1, but for a router this value may be increased to improve bandwidth between networks. The value usually suggested for a router is between 5 and 20.

MSTP works as a field bus for controllers, sensors and actuators. The latency of data exchange between devices usually needs to be guaranteed. We recommend that every device gets the token at least once per second.

APDUs passing through a router are usually 10~50 bytes long, but can be up to 480 or 1476 bytes (extended frame). A larger APDU needs more time to send or receive.

For an APDU that needs a reply from the target device, the router has to wait for the reply. The target device usually needs more time to handle or generate a larger APDU, so the router has to wait longer.

So the time a router holds the token can vary a lot, which affects the latency guarantee of the MSTP bus. To avoid this problem, the “Max_info_frames by token occupy time” feature limits the router’s token holding time.

The limit is calculated as:

byte_time * 32 * Max_info_frames

For example, with Max_info_frames set to 10 and a baud rate of 76800 bps, the byte_time (10 bits per byte on the wire) is about 0.13 milliseconds:

0.13 * 32 * 10 = 41.6 milliseconds.

When the router finds it has held the token for 41.6 milliseconds, it passes the token to the next station, even though it may have sent fewer than 10 frames.

This feature can be enabled or disabled by the user from the WebUI.
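The calculation can be written out as a short sketch. The function names are illustrative only, and the byte time assumes 10 bits per byte on the wire (start bit, eight data bits, stop bit).

```python
def byte_time_ms(baud_rate, bits_per_byte=10):
    """Time to transmit one byte, in milliseconds."""
    return bits_per_byte * 1000.0 / baud_rate


def token_hold_limit_ms(baud_rate, max_info_frames):
    """Token holding limit: byte_time * 32 * Max_info_frames."""
    return byte_time_ms(baud_rate) * 32 * max_info_frames


def must_pass_token(held_ms, frames_sent, baud_rate, max_info_frames):
    """True once either the frame count or the time limit is reached."""
    return (frames_sent >= max_info_frames
            or held_ms >= token_hold_limit_ms(baud_rate, max_info_frames))


limit = token_hold_limit_ms(76800, 10)
print(round(limit, 1))  # about 41.7 ms; the text rounds byte_time to 0.13 and gets 41.6
print(must_pass_token(held_ms=42.0, frames_sent=3, baud_rate=76800, max_info_frames=10))
```

Because the limit scales with the byte time, the same Max_info_frames setting allows a longer hold at lower baud rates.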

BACRouter benchmark for routing between BIP and Ethernet

The intent of this benchmark is to investigate the capability of BACRouter. Because of the low baudrate of MSTP, routing packets to or from an MSTP network never becomes a bottleneck for the router.

BACRouter supports a 10/100M Ethernet interface, so it can be challenged by flooding. The testing machine is a notebook with a 4-core i7 2.8G CPU and a 1000M Ethernet card, connected directly to BACRouter with a CAT5+ cable. The results are:

Path | APDU size (bytes) | Max routing rate without packet drop (per second) | Routing rate in packet flooding (per second) | Packet flood rate (per second)
BIP->Ethernet | 4 | 13200 | 354 | 79087
BIP->Ethernet | 750 | 8980 | 6111 | 13500
BIP->Ethernet | 1476 | 5520 | 3800 | 6832
Ethernet->BIP | 4 | 10100 | 1138 | 113000
Ethernet->BIP | 750 | 8090 | 6030 | 11895
Ethernet->BIP | 1476 | 6110 | 5542 | 6526

When BACRouter is flooded with small packets, its handling capability decreases dramatically, especially on the BIP port.

On 2019-04-16, we ran a new benchmark on firmware version 2.18 with a new testing machine (4-core i5 CPU and 1000M Ethernet card). The results are:

Path | APDU size (bytes) | Max routing rate without packet drop (per second) | Routing rate in packet flooding (per second) | Packet flood rate (per second)
BIP->Ethernet | 4 | 15300 | 1887 | 68705
BIP->Ethernet | 750 | 10300 | 8645 | 13005
BIP->Ethernet | 1476 | 6300 | 5424 | 7179
Ethernet->BIP | 4 | 12750 | 1937 | 111358
Ethernet->BIP | 750 | 10300 | 8640 | 12281
Ethernet->BIP | 1476 | 7300 | 7310 | 7453

Performance is much improved with the new firmware.

BACnet MSTP auto addressing

Updated on 2020.3.25 for firmware version 3.x

Because it cannot be guaranteed that all devices are online at the same time, no auto addressing solution can avoid MAC conflicts, so we removed this feature in firmware 3.x. To help determine max_master and unused MACs on the bus, “Sniffer mode” can be enabled; “Current max master” can then be read from the runtime info. An unused MAC can also be chosen by referring to “Recently active devices”.

Every device on an MSTP bus must have a unique MAC address. The available address range is 0~127 for master devices and 128~254 for slave devices.

Usually the MAC address is set by DIP switch, jumper, LCD screen, or firmware downloaded by a configuration tool. Some devices support changing the MAC address through a BACnet object/property, but the device must already have a valid MAC address to join the BACnet network before that can be done.

If a unique MAC address could be obtained automatically, the way we get an IP address just by plugging a notebook into a home or office network, it would save a lot of time in commissioning.

Several solutions have been discussed. The committee now seems to prefer “Zero-Config” (addendum 135-2012bb).

BACnet stack has implemented “Zero-Config”.

“Zero-Config” only works in a fixed configuration where Max-master is 127 and the automatically assigned address range is 64~127. Otherwise it may cause a mess.

To avoid this limitation, BACRouter implements a proprietary auto addressing solution that stays compatible with “Zero-Config”. It has some attractive features:

  1. Learning Max-master from bus traffic.
  2. Assigning MAC address from highest unused one.

So users have more freedom in their MAC address scheme. For example, reserve addresses 0~30 for fixed address devices and set Max-master to 40, so automatic addressing devices will use 31~40.
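The “highest unused address first” rule can be sketched as below. `pick_mac` is a hypothetical helper, and the set of active MAC addresses is assumed to have been learned from bus traffic beforehand; how BACRouter learned it is not shown here.

```python
def pick_mac(active_macs, max_master):
    """Return the highest master address <= max_master that is not in use."""
    for mac in range(max_master, -1, -1):
        if mac not in active_macs:
            return mac
    return None  # every address up to max_master is taken


# Addresses 0~30 are used by fixed-address devices, Max-master is 40.
active = set(range(0, 31))
first = pick_mac(active, max_master=40)
active.add(first)
second = pick_mac(active, max_master=40)
print(first, second)  # 40 39
```

Assigning from the top keeps the automatic range (31~40 in this example) clear of the fixed addresses at the bottom, but it also makes the next assigned address predictable.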

Both Zero-Config and BACRouter’s solution have trouble when an automatic addressing device is pulled off the bus and plugged in again without a reboot, because a newly attached automatic addressing device may have occupied the same address in the meantime. (BACRouter is weaker in this situation because its address assignment is predictable.) So:

ALWAYS power on an automatic addressing device after attaching it to the bus.

Solution to MSTP frame desynchronization

Updated on 2021.7.13

We discussed the weakness of BACnet MSTP to frame desynchronization in the articles below:

BACnet MSTP frame lost synchronization

Attack BACnet MSTP by frame desynchronization

So what is BACRouter’s solution to this problem? Let’s look for clues in the standard.

9.5.2 Variables

SilenceTimer: A timer with nominal 5 millisecond resolution used to measure and generate silence on the medium between octets. It is incremented by a timer process and is cleared by the Receive State Machine when activity is detected and by the SendFrame procedure as each octet is transmitted.

9.5.3 Parameters:

Tframe_gap: The maximum idle time a sending node may allow to elapse between octets of a frame the node is transmitting: 20 bit times.

Tturnaround: The minimum time after the end of the stop bit of the final octet of a received frame before a node may enable its EIA-485 driver: 40 bit times.

Tpostdrive: The maximum time after the end of the stop bit of the final octet of a transmitted frame before a node must disable its EIA-485 driver: 15 bit times.

9.5.5 The SendFrame Procedure

If SilenceTimer is less than Tturnaround, wait (Tturnaround – SilenceTimer).

9.2.3 Timing

Transmitter disable: The node shall disable its EIA-485 driver within Tpostdrive after the beginning of the stop bit of the final octet of a frame in order that it not interfere with any subsequent frame transmitted by another node. This specification allows, but does not encourage, the use of a “padding” octet after the final octet of a frame in order to facilitate the use of common UART transmit interrupts for driver disable control. If a “padding” octet is used, its value shall be X’FF’. The “padding” octet is not considered part of the frame, that is, it shall be included within Tpostdrive.

(It’s unclear whether Tturnaround includes the “padding” octet, but the 135.1 testing standard, clause 12.1.3.4 “Verify T turnaround”, states: If the reference master employs a “padding” octet of X’FF’ as the last octet of every frame, then the time shall be measured starting from the trailing edge of the stop bit of the octet that precedes the X’FF’ “pad” octet in the frame transmitted by the reference master)

So within a valid frame, the maximum bus idle time is Tframe_gap plus the trailing “1” bits of the previous octet: 29 bit times (assuming the previous octet is X’FF’).

Taking the “padding” octet into account, the minimum bus idle time between two frames is Tturnaround – Tpostdrive + 9 (the trailing “1” bits of the “padding” octet): 34 bit times.
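The two bounds can be checked with a few lines of arithmetic. The constant names below mirror the parameters quoted from the standard; `TRAILING_ONES` is a name introduced here for the nine “1” bits (eight data bits plus the stop bit) of an X’FF’ octet.

```python
T_FRAME_GAP = 20   # bit times, max idle between octets inside a frame
T_TURNAROUND = 40  # bit times, min wait before enabling the driver
T_POSTDRIVE = 15   # bit times, max time to disable the driver
TRAILING_ONES = 9  # 8 data bits + stop bit of an X'FF' octet look like idle line

# Longest idle period that can still belong to one valid frame.
max_idle_inside_frame = T_FRAME_GAP + TRAILING_ONES

# Shortest idle period between two frames when a padding octet is used.
min_idle_between_frames = T_TURNAROUND - T_POSTDRIVE + TRAILING_ONES


def bit_time_us(baud_rate):
    """Duration of one bit in microseconds."""
    return 1_000_000 / baud_rate


print(max_idle_inside_frame, min_idle_between_frames)  # 29 34
print(round(bit_time_us(115200), 2))                   # 8.68
```

The gap between 29 and 34 bit times is what leaves room for a threshold that separates “still inside a frame” from “a new frame has started”.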

BACRouter uses a revised RSM to implement the logic above:

  1. When the time between two consecutive received bytes is longer than 20 bit times, the frame being received is aborted.
  2. Idle time on the bus greater than or equal to 33 bit times means a new frame begins.
  3. To be compatible with devices that do not respect Tturnaround, any data following a valid frame is regarded as a new frame.
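The three rules can be expressed as a small decision function. This is a sketch only: the state names, the function `on_byte`, and the handling of bytes that arrive after an aborted frame (ignored until a long idle period) are assumptions for illustration and are not taken from BACRouter’s firmware.

```python
ABORT_GAP_BITS = 20       # rule 1: longer gaps inside a frame abort it
NEW_FRAME_IDLE_BITS = 33  # rule 2: this much idle time starts a new frame


def on_byte(state, idle_bits):
    """Decide what to do with a byte that arrives after idle_bits of silence.

    state is one of:
      "in_frame"   - part of a frame has been received
      "frame_done" - a complete valid frame was just received
      "aborted"    - the previous frame was aborted
    """
    if idle_bits >= NEW_FRAME_IDLE_BITS:
        return "start_new_frame"                      # rule 2
    if state == "frame_done":
        return "start_new_frame"                      # rule 3
    if state == "in_frame":
        if idle_bits > ABORT_GAP_BITS:
            return "abort_frame"                      # rule 1
        return "continue_frame"
    return "ignore"  # assumption: wait for a long idle period after an abort


print(on_byte("in_frame", 5))
print(on_byte("in_frame", 25))
print(on_byte("frame_done", 2))
```

Tying frame boundaries to measured idle time, rather than to the preamble alone, is what lets the receiver resynchronize at the next real frame.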

At 115200 bps, one bit time is only 8.7 us. To measure the duration of an idle line precisely, the timer granularity of BACRouter is set to just 5 us. This helps resist frame desynchronization, and reaches 98.8% bandwidth utilization at 115.2 kbps because BACRouter no longer wastes time once the 40-bit Tturnaround is over.

Attack BACnet MSTP by frame desynchronization

As pointed out in the previous article “BACnet MSTP frame lost synchronization”, BACnet MSTP has a design flaw in frame synchronization. But how can it be exploited to perform an attack while strictly obeying the standard?

We make some assumptions here:

  1. There are at least 3 devices on the bus, with MAC addresses 1, 8 and 10. Device 1 is carefully designed to perform the attack. Devices 8 and 10 are innocent.
  2. Device 1 supports extended frames; devices 8 and 10 do not.
  3. The timers of all 3 devices are precise enough.

The workflow of device 1 is:

  1. When it gets the token, send out frame A.
  2. Pass the token to MAC address 2.
  3. When it gets the token again, send out frame B.
  4. Pass the token to MAC address 2.
  5. Go to step 1 again.

Frame A is a valid proprietary frame (hexadecimal):

55 ff 80 ff 01 00 1d a3 02 2b 72 fe 55 ff 03 08 01 00 11 a0 ff 55 ff 21 01 08 00 09 ce d4 f3 55 ff 00 01 08 00 00 bf

Frame B is also a valid proprietary frame:

55 ff 80 ff 01 00 1d a3 02 2b fe dc 55 ff 03 0a 01 00 11 b1 ff 55 ff 21 01 0a 00 09 fd 8a 51 55 ff 00 01 0a 00 00 8c

Everything goes well as long as there is no frame desynchronization. But after hours of running, if device 8 loses synchronization on the frame A header (the effect is the same if device 10 loses synchronization when device 1 sends frame B), device 8 finds another frame when it scans Frame A’s data portion:

55 ff 03 08 01 00 11 a0 ff 55 ff 21 01 08 00 09 ce d4 f3 55 ff 00 01 08 00 00 bf

It’s a Test-Request frame sent to device 8, so device 8 tries to reply after Tturnaround with a Test-Response frame:

55 ff 04 01 08 00 11 ae ff 55 ff 21 01 08 00 09 ce d4 f3 55 ff 00 01 08 00 00 bf

But at the same time, device 1 passes the token by sending:

55 ff 00 02 01 00 00 73

The first 8 bytes of the two frames collide, so device 10 drops the invalid header and finds the following data:

55 ff 21 01 08 00 09 ce d4 f3 55 ff 00 01 08 00 00 bf

When device 1 finishes sending, it starts receiving data and gets the same:

55 ff 21 01 08 00 09 ce d4 f3 55 ff 00 01 08 00 00 bf

Device 10 gets a valid Not-For-Us frame header, so it enters the SKIP-DATA state. There is not enough data to skip, so device 10 waits until Tframe_abort.

For device 1, it’s a BACnet-Extended-Not-Expecting-Reply frame header. Because device 1 supports extended frames, it validates the header by the procedure described in Addendum 135-2012an. Because the data length is too short, it aborts the frame, enters the IDLE state again, and then finds another frame:

55 ff 00 01 08 00 00 bf

It’s a token frame passing the token to device 1, so device 1 takes the token and sends Frame B just after Tturnaround:

55 ff 80 ff 01 00 1d a3 02 2b fe dc 55 ff 03 0a 01 00 11 b1 ff 55 ff 21 01 0a 00 09 fd 8a 51 55 ff 00 01 0a 00 00 8c

As mentioned above, device 10 is still waiting for 1 more byte to skip the previous frame (because Tturnaround < Tframe_abort), so it misses this frame header and gets a wrong frame:

55 ff 03 0a 01 00 11 b1 ff 55 ff 21 01 0a 00 09 fd 8a 51 55 ff 00 01 0a 00 00 8c

It’s another Test-Request frame, this time sent to device 10, and the sequence repeats.

As shown above, every device strictly obeys the standard, but once frame desynchronization occurs, the whole MSTP bus is stalled forever.
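The headers hidden inside Frame A can be found with a short script. It scans for the 0x55 0xff preamble and checks the header CRC with the shift-and-XOR routine commonly used for the MS/TP header CRC-8. The function names are illustrative, and only the header CRC is checked here, not the data CRC.

```python
def header_crc(header):
    """Header CRC-8 as transmitted (ones complement of the CRC register)."""
    crc = 0xFF
    for octet in header:
        x = crc ^ octet
        x = (x ^ (x << 1) ^ (x << 2) ^ (x << 3)
             ^ (x << 4) ^ (x << 5) ^ (x << 6) ^ (x << 7))
        crc = (x & 0xFE) ^ ((x >> 8) & 1)
    return ~crc & 0xFF


def find_headers(data):
    """Return (offset, frame_type) for every preamble followed by a valid header."""
    hits = []
    for i in range(len(data) - 7):
        if (data[i] == 0x55 and data[i + 1] == 0xFF
                and header_crc(data[i + 2:i + 7]) == data[i + 7]):
            hits.append((i, data[i + 2]))
    return hits


FRAME_A = bytes.fromhex(
    "55 ff 80 ff 01 00 1d a3 02 2b 72 fe 55 ff 03 08 01 00 11 a0 ff "
    "55 ff 21 01 08 00 09 ce d4 f3 55 ff 00 01 08 00 00 bf"
)

for offset, frame_type in find_headers(FRAME_A):
    print(offset, hex(frame_type))
```

Besides the real header at offset 0 (type 0x80), the scan reports valid headers at offsets 12 (0x03, Test-Request), 21 (0x21, extended data not expecting reply) and 31 (0x00, token), which are the frames the desynchronized devices act on in the walkthrough above.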

Read more on Solution to mstp frame desynchronization

BACnet MSTP frame lost synchronization

There are two concepts of frame:

  1. The BACnet MSTP data link layer frame. It has at least 8 octets: 2 preamble bytes (0x55 and 0xff), frame_type, destination_mac, source_mac, 2 data_len bytes and crc, followed by an optional data portion. We call this type of frame an “MSTP frame”.
  2. The EIA-485 frame. It consists of bits: start bit, data bits, parity bit and stop bit. BACnet MSTP uses non-return to zero (NRZ) encoding with one start bit, eight data bits, no parity, and one stop bit. The start bit shall have a value of zero, while the stop bit shall have a value of one. The data bits shall be transmitted with the least significant bit first. We call this type of frame a “byte frame”.
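To make the byte frame concrete, the sketch below lists the ten bits sent on the wire for one octet. `byte_frame_bits` is an illustrative helper name.

```python
def byte_frame_bits(octet):
    """Bits on the wire for one octet: start bit, 8 data bits LSB first, stop bit."""
    data_bits = [(octet >> i) & 1 for i in range(8)]  # least significant bit first
    return [0] + data_bits + [1]


print(byte_frame_bits(0x55))  # [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
print(byte_frame_bits(0xFF))  # [0, 1, 1, 1, 1, 1, 1, 1, 1, 1]
```

An idle line also reads as “1”, so after the start bit of 0xFF the remaining nine bits cannot be told apart from idle time.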

The BACnet MSTP Receive Frame Finite State Machine (hereafter RSM) recognizes the start of a frame by the preamble bytes. If there is no DataAvailable or ReceiveError within a frame for Tframe_abort (60 bit times; implementations may use larger values for this timeout, not to exceed 100 milliseconds), the frame is aborted and the RSM searches for the next frame again.

Because the preamble bytes are allowed in other portions of an MSTP frame (the extended frame introduced by Addendum 135-2012an is an exception: it uses COBS encoding to keep 0x55 out of the data portion), the RSM may start parsing an MSTP frame from the data portion of the previous MSTP frame.

The minimum time gap between MSTP frames is Tturnaround (40 bit times), which is less than Tframe_abort. So if the RSM loses synchronization with the MSTP frame, it may parse a wrong MSTP frame that spans actual MSTP frames.

There are several causes of losing synchronization:

  1. Program defects on the sending or receiving device. These can be eliminated by code review and testing.
  2. Timer precision. The BACnet MSTP standard only requires a 1% precise timer with a resolution of 5 ms or less, so it is hard to measure Tframe_abort. Because there is only 5 ms of margin between delay and timeout (Tusage_delay to Tusage_timeout, Treply_delay to Treply_timeout), collisions are very likely with slow-responding devices.
  3. Noise on the bus line. Noise causes byte frame errors and crc errors, and the RSM aborts the previous MSTP frame in both situations.

Some may argue that MSTP frame integrity is protected by the header crc8 and the data crc16. Even setting aside malicious devices (here is an attack example) and random data collisions, there are still applications that may send a valid frame in the data portion.

(Updated 25 June 2021) For example, some products implement a packet capture function on the MSTP bus, and the captured data may then be transferred by a BACnet service (PrivateTransfer or AtomicReadFile?). What would happen if there is electrical noise on the bus while that data is passing through an MSTP network?

It could not only block the MSTP bus, but also cause device malfunction if the wrong frame is an APDU, or break the whole BACnet inter-network if the wrong frame is an I-Am-Router-To-Network.


Byte Frame Desynchronization

For the byte frame, loss of synchronization is usually caused by noise, missing termination or absence of biasing. The symptoms include taking a data bit of 1 as idle line, or taking a data bit of 0 as the start bit of a new byte. In addition to data errors, losing synchronization on the last byte of an MSTP frame introduces a measurement error in the idle time between MSTP frames.

Read more on Solution to mstp frame desynchronization

How the fast device feature of BACRouter improves the MSTP token pass rate

The test setup is: 115.2kbps, 80m cable, 1 router and 10 MSTP master devices, max_master=127, no NPDU traffic on the bus, with unused MACs defined as fast devices.

Router mac | Fast device timeout config | Other devices’ mac | Token pass rate (round/min) | Percentage (baseline: no fast device)
127 | none | 117~126 | 2301 | 100%
127 | 0ms | 117~126 | 4856 | 211%
127 | 1ms | 117~126 | 4734 | 206%
127 | 2ms | 117~126 | 4473 | 194%
116 | 0ms | 117~126 | 4690 | 204%
116 | 1ms | 117~126 | 4577 | 199%
116 | 2ms | 117~126 | 4343 | 189%
58 | none | 117~126 | 1777 | 100%
58 | 0ms | 117~126 | 4563 | 257%
58 | 1ms | 117~126 | 4405 | 248%
58 | 2ms | 117~126 | 4071 | 229%
110 | none | 0,11,22,33,44,55,66,77,88,99 | 1153 | 100%
110 | 0ms | 0,11,22,33,44,55,66,77,88,99 | 4037 | 350%
110 | 1ms | 0,11,22,33,44,55,66,77,88,99 | 3813 | 331%
110 | 2ms | 0,11,22,33,44,55,66,77,88,99 | 3393 | 294%