Last modified by Mark Kohlmann on 2024/04/05 22:02

ShowRunnerCLC™ itself does not support a high availability/clustering capability at this time. To understand why, you have to understand the capabilities of the hardware that ShowRunnerCLC™ controls. Crestron processors run ShowRunnerCLC™ the same as any other program built to run on a Crestron system. Crestron hardware communicates with processors using one of the following methods: the processor's Cresnet port, the Ethernet network, the RF gateway built into the processor (if supported), or any I/O port attached to the processor. Typical designs for current-generation Crestron hardware use an Ethernet connection. This includes Zum Wired, Zum Wireless (via RF gateway[s]), and any Cresnet devices that communicate over a Cresnet bridge (DIN-CENCN-2/CAEN-BLOCK-CENCN/ZUMNET/etc.).

Crestron Ethernet devices have their communication settings defined in their IP Table. Some devices allow multiple IP Table entries (usually a maximum of 2) while others allow only 1. When multiple entries are allowed, typically the processor will claim specific hardware once the connection is established and will own that hardware until the connection is disconnected. Any device attempting to claim hardware that is already claimed will be denied. Crestron firmware, to our knowledge, does not support configuration of primary/backup IP Table entries. Devices placed on the Control Subnet of a primary processor increase the complexity because the processor's internal router controls the IP allocations on the Control Subnet.

There is no ability to have an immediate failover for anything other than Cresnet. All Ethernet-connected devices will have failover times measured in tens of seconds to minutes, depending on the approach.

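The claim behavior described above can be illustrated with a toy model. This is only a sketch of the "first connection owns the hardware" rule as stated in the text; the class and method names are hypothetical and do not correspond to any Crestron API or firmware interface.

```python
from typing import Optional


class ClaimableDevice:
    """Toy model of first-come ownership of a device by a processor."""

    def __init__(self) -> None:
        self._owner: Optional[str] = None

    def claim(self, processor: str) -> bool:
        """Grant the claim only if no other processor currently owns the device."""
        if self._owner is None or self._owner == processor:
            self._owner = processor
            return True
        return False

    def disconnect(self, processor: str) -> None:
        """Release ownership when the owning processor's connection drops."""
        if self._owner == processor:
            self._owner = None
```

In this model a backup cannot take over a device until the primary's connection has actually been released, which is why a simple pair of primary/backup IP Table entries does not give automatic failover.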
==== Facts ====

* ShowRunnerCLC™ does not presently have the ability to synchronize system state to a backup processor
* ShowRunnerCLC™ does not presently support automatically reconfiguring hardware or the processor should it be activated as a backup
* Crestron hardware devices do not support redundant physical or software-based connections. Any activation of a backup will require configuration changes, which may be done automatically or may require manual intervention.
* Zum Wired supports a fallback to local app mode if the hardware's connection to the processor fails, provided the room is configured in CNET mode. Please see our Zum [[guide>>doc:Design Guide.Hardware Design Guide.Crestron Zum Wired.Design Considerations.WebHome||anchor="HSystemOverviewandHardwareModes"]] for more information.
* The failure rate of a power supply powering the processor is typically higher than that of the processor itself.

==== How can I detect if ShowRunnerCLC™ failed? ====

* SNMP traps on program status from the processor
* ShowRunnerCLC™'s REST API becomes unavailable or returns an invalid status
* Configure ShowRunnerCLC™ to close a relay or trigger a digital output when it starts up. A failure of the program would open/release the output.

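As an illustration of the REST API approach, the following is a minimal sketch of a poller that declares a failure only after several consecutive bad responses, so a single dropped request does not trigger a failover. The REST endpoint and its payload are not described here, so the HTTP call is injected as a hypothetical `fetch_status` callable; the class name and the default threshold of 3 are also assumptions.

```python
from typing import Callable, Optional

# Hypothetical: the caller supplies a function that performs the actual HTTP
# request and returns the status code, or None if the API is unreachable.
Fetcher = Callable[[], Optional[int]]


class FailureDetector:
    """Declare the primary failed after N consecutive bad polls."""

    def __init__(self, fetch_status: Fetcher, threshold: int = 3) -> None:
        if threshold < 1:
            raise ValueError("threshold must be at least 1")
        self._fetch_status = fetch_status
        self._threshold = threshold
        self._consecutive_failures = 0

    def poll(self) -> bool:
        """Poll once; return True once the primary should be considered failed."""
        try:
            status = self._fetch_status()
        except OSError:
            status = None  # timeouts and refused connections count as failures
        if status == 200:
            self._consecutive_failures = 0
        else:
            self._consecutive_failures += 1
        return self._consecutive_failures >= self._threshold
```

A supervisory program would call `poll()` on a fixed interval. The interval multiplied by the threshold sets the minimum detection time, which adds to the failover times discussed elsewhere on this page.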
==== How do I make my processor redundant? ====

* Cold Spare - Manual Recovery
** Have an identically configured Crestron processor ready to be connected in place of the primary processor.
** When the primary fails, all connections will need to be moved to the backup.
** Power up the backup
** Caveats:
*** Any devices paired to the internal RF gateway of the processor (if MC3/DIN-AP3MEX/MC4) cannot be moved.
*** The backup's configuration will only be as current as the last time it was copied from the primary to the backup. While not presently a feature, it could be possible to have the primary back itself up to a third device. The backup could then pull the configuration from the third device at startup to run with an up-to-date configuration.
*** The system will attempt to rebuild its current state at startup as devices come online, but not all pieces of hardware support transmitting current state.
* Hot Spare - Automated Recovery
** Option A - Reconfigure processor only:
*** Have an identically configured Crestron processor connected to the network at a different IP address.
*** A supervisory program will need to monitor the primary (contact Chief Integrations to discuss). If the primary goes down, the supervisory program will need to physically disconnect the primary processor's Ethernet so there are no conflicts.
*** If processor Cresnet is used, an A/B physical RS-485 switch will be needed to switch the Cresnet connection from the primary to the backup.
*** The backup processor will change its IP address to the primary's and reboot.
*** Caveats:
**** Devices will come back online as their ARP caches renew and identify the new processor's MAC address.
** Option B - Reconfigure devices:
*** Have an identically configured Crestron processor connected to the network at a different IP address.
*** A supervisory program will need to monitor the primary (contact Chief Integrations to discuss).
*** If processor Cresnet is used, an A/B physical RS-485 switch will be needed to switch the Cresnet connection from the primary to the backup.
*** The supervisory program will connect to each Ethernet device and modify the device's IP table entry to point to the new
*** Caveats:
**** Most Crestron Ethernet devices only accept a single console connection. If this connection is in use, or the last connection failed to exit gracefully, the console may not be available. In this case, reconfiguration would fail and the device would be orphaned until the primary came back online.
** Caveats:
*** Using the Control Subnet complicates this significantly. When deploying either of these two approaches, it is recommended that a proper network be used rather than the Control Subnet.
*** Any devices paired to the internal RF gateway of the processor (if MC3/DIN-AP3MEX/MC4) cannot be moved (trust center backup/restore still needs investigation, but it is better to use an external RF gateway).
*** The backup's configuration will only be as current as the last time it was copied from the primary to the backup. While not presently a feature, it could be possible to have the primary push its configuration to the backup, since the backup is online and reachable.
*** The system will attempt to rebuild its current state at startup as devices come online, but not all pieces of hardware support transmitting current state.
* VC-4 Virtual Machine High Availability
** VC-4 supports high availability at the hypervisor or operating system level
** Option A - Hypervisor HA:
*** Run VMware ESXi with vMotion or a similar product
*** The VM infrastructure maintains the state of VC-4 across multiple hosts. If a host fails, the system automatically fails over to a different host.
*** Caveats:
**** Doesn't protect against a failure of the operating system, VC-4, or ShowRunnerCLC™
** Option B - OS Level HA:
*** Implement high availability using Linux-level capabilities.
*** Caveats:
**** Will not maintain state; it only provides the ability to start an identical VC-4 instance when the primary fails
**** Difficult to configure
**** Doesn't protect against a failure of the operating system, VC-4, or ShowRunnerCLC™
** Caveats:
*** Expensive
*** Heavy IT requirement

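The two hot spare options above differ mainly in the order and target of the actions the supervisory program takes. The following is a minimal sketch of that sequencing only. The `Strategy` enum, the `failover_steps` function, and the step descriptions are hypothetical and stand in for whatever mechanism actually performs each action.

```python
from enum import Enum
from typing import List


class Strategy(Enum):
    RECONFIGURE_PROCESSOR = "A"  # backup takes over the primary's IP address
    RECONFIGURE_DEVICES = "B"    # each device's IP table is repointed


def failover_steps(strategy: Strategy, uses_processor_cresnet: bool) -> List[str]:
    """Return the ordered actions a supervisory program would take."""
    steps: List[str] = []
    if strategy is Strategy.RECONFIGURE_PROCESSOR:
        # Avoid an IP conflict before the backup assumes the primary's address.
        steps.append("disconnect primary Ethernet")
    if uses_processor_cresnet:
        steps.append("switch RS-485 A/B switch to backup")
    if strategy is Strategy.RECONFIGURE_PROCESSOR:
        steps.append("set backup IP address to primary's and reboot backup")
    else:
        steps.append("repoint each Ethernet device's IP table entry to backup")
    return steps
```

For example, `failover_steps(Strategy.RECONFIGURE_DEVICES, False)` yields a single step, which reflects that Option B leaves the processors untouched and moves all of the risk to the per-device reconfiguration.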
==== How do I make Cresnet/Zumlink redundant? ====

* Processor Cresnet Port:
** Must use an RS-485 A/B switch to physically switch between the primary and the backup. Cresnet does not support multiple masters active simultaneously. Only one master may poll the Cresnet network.
** Once the line is cut over, if the backup is online, the devices will be polled and online within a few seconds or less.
* Cresnet Bridge
** Cresnet devices may be redirected to a new processor via a simple IP Table change and reboot
** Bridges generally support fewer devices and provide isolation between NETs, with alternate power options offering better protection than a processor-connected Cresnet network.
** Caveats:
*** Typically the bridge must be rebooted after the change. Downtime is typically 30 seconds to a minute before everything comes back online.
* Caveats:
** Cresnet cannot be run in a loop
** Any physical damage, loose connections, shorts, or power faults will cause a failure that cannot be recovered from

==== How do I make Ethernet redundant? ====

* Touchpanels
** Connect to the touchpanel, remove the failed processor's entry, and add the now active processor's entry
** No reboot required
* Cresnet Bridges/RF Gateways/Other Ethernet Devices
** Connect to the device, remove the failed processor's entry, and add the now active processor's entry
** Reboot required, downtime about 30 seconds to a minute
** Caveats:
*** If the device console has an active connection, or the old connection failed and blocked the port, this will fail.

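The Ethernet reconfiguration rules above can be sketched as a planning pass that a supervisory program might run before touching any device. This is a hedged illustration: the `Device` fields and `plan_repoint` function are hypothetical, and the action strings are placeholders rather than real console commands.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Device:
    name: str
    is_touchpanel: bool
    console_available: bool  # False if the console session is in use or stuck


def plan_repoint(devices: List[Device]) -> Tuple[List[str], List[str]]:
    """Split devices into planned actions and devices that would be orphaned."""
    actions: List[str] = []
    orphaned: List[str] = []
    for device in devices:
        if not device.console_available:
            # Reconfiguration cannot start, so the device stays pointed
            # at the failed processor.
            orphaned.append(device.name)
            continue
        actions.append(f"{device.name}: replace IP table entry")
        if not device.is_touchpanel:
            # Per the list above, only non-touchpanel devices need a reboot.
            actions.append(f"{device.name}: reboot")
    return actions, orphaned
```

Reporting the orphaned devices separately makes it clear which parts of the system will stay offline until the primary returns or someone intervenes manually.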
==== How do I make RF redundant? ====

* Facts:
** RF devices are paired to a gateway and require a complete handshake
** Crestron RF gateways support backing up the trust center (the certificates and pairing data)
** The backup gateway would need to be reachable on the network
** The primary and backup gateways would need to be the same model
** The supervisory program would need to have a backup copy of the trust center from the primary gateway
** The steps below are theoretical and have not been tested
* Connect to the backup gateway and load the trust center from the primary gateway
* Configure the IP table on the backup to point to the correct processor
* Connect to the primary gateway and wipe its trust center and IP table if reachable
* Devices should join the now active gateway. How long this takes would need to be tested.
* It is not known whether the trust center backup/restore process must be done through Toolbox or whether it could be embedded in a program without the need for Toolbox. This will require significant development effort.

==== Best Practices ====

As you can see, there are many things to consider when trying to build a redundant lighting control system. The key points are:

* Avoid processor Cresnet except in specific scenarios
* Use Cresnet Bridges to allow software-based reconfiguration of connections
* Use IT-grade infrastructure with its own redundant capabilities
* Provide high-quality power to the processor and infrastructure with UPS capabilities
* Leverage Zum Wired's app mode to control the rooms if it is compatible with the site's sequence of operations. This way a failure of the processor does not impact local lighting control.