Design and installation of measurement and control systems from a security point of view
Cybersecurity is playing an increasingly important role in building management systems. Unfortunately, it is usually addressed only during implementation, when the end customer (operator) and the control system supplier first meet. The basic security features should, however, already be defined in the design phase, or at the latest in the implementation project. Designers often argue that they have no partner with whom to resolve these requirements: the construction company is not interested and the operator (or the responsible employee) is not yet known. This does not mean that designers can ignore the security aspects; they should take them into account on their own initiative, at least in some basic form.
Let’s look at the classic control system topology from a security point of view and derive some rules for designing and installing the control system so that it can be handed over to the operator without cybersecurity objections, while remaining comfortable and efficient to operate and service.
The first standard we encounter is ČSN ISO/IEC 27001 and its related standards (Information technology – Security techniques – Information security management systems). However, this group of standards deals mainly with the organization of processes and the evaluation of measures; it offers little for design. More interesting is ČSN EN IEC 62443, which in part 3-3 defines security levels (SL) and the requirements for them. A closer look reveals that virtually no building management system on the market (not to mention industrial systems) offers even the features required by the basic level SL 1. The aim here is therefore to find measures that increase the investment and operating costs of the control system only minimally, or not at all.
Peripheral devices such as sensors, valves, damper actuators and other field components can usually be protected only by their placement. They should be positioned to minimize the chance of tampering or damage. Peripherals installed in plant rooms are not a problem; the situation is worse for room controllers in publicly accessible areas (corridors, offices) and for outdoor sensors mounted on the facade. Fortunately, if a single room sensor, controller or thermostat is damaged, only one room or zone is affected, and the loss of communication or signal interruption should appear at the control panel as an alarm. Damage to the outdoor sensor, on the other hand, can have a major effect on the behavior of the heating system. Signal values outside reasonable limits (caused by an open or short circuit) should therefore not only be reported as an alarm but also handled in the program, for example by substituting a “safe” value (e.g. 5 °C), which guarantees emergency operation of the plant at least in winter, even if it is not energy-optimal.
For thermostats, especially safety thermostats (water heating, emergency thermostat on the heat exchanger, etc.), prefer types whose setpoint is adjusted internally rather than with a knob on the cover.
Cabinets located in public areas should be lockable (with a key, not just a standard handle), and their front panels should carry as few control elements as possible. Keep in mind that the switchboard usually contains a PLC and a network switch, which give access to the entire technological network or even to the corporate network.
Buses with zone controllers are sometimes counted among the peripheral devices. A short circuit on the bus disables the entire line, while a break means loss of communication for the part of the bus behind the fault. A break can, however, also disable the entire line if the bus is sensitive to a missing terminator or termination resistor (e.g. LON or RS485 buses). When designing, we therefore try to connect devices outside the switchboard to a separate communication port (usually mark4X COM4) and the I/O modules located in the switchboard to a second line (COM3). In the event of a bus failure, at least communication with the I/O modules in the control cabinet is then maintained.
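The substitution of a “safe” value for a faulty outdoor sensor could be sketched roughly as follows. This is a minimal Python illustration rather than PLC code; the plausibility limits (-50 °C and 60 °C) and all names are assumptions made for the example, and only the 5 °C substitute value comes from the text.

```python
import math
from typing import Optional, Tuple

SAFE_OUTDOOR_TEMP_C = 5.0  # "safe" substitute value mentioned in the text
VALID_MIN_C = -50.0        # assumed lower plausibility limit (hypothetical)
VALID_MAX_C = 60.0         # assumed upper plausibility limit (hypothetical)


def outdoor_temperature(raw_c: Optional[float]) -> Tuple[float, bool]:
    """Return (value the control program should use, alarm flag).

    A missing reading, NaN, or a value outside the plausibility limits
    (typical for an open or short circuit) raises the alarm and is
    replaced by the safe value.
    """
    if raw_c is None or math.isnan(raw_c):
        return SAFE_OUTDOOR_TEMP_C, True
    if not (VALID_MIN_C <= raw_c <= VALID_MAX_C):
        return SAFE_OUTDOOR_TEMP_C, True
    return raw_c, False
```

A plausible reading passes through unchanged, while a reading such as 850 °C from a broken wire produces the alarm and the heating curve continues to work with 5 °C.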
Level of substations
Here the situation is more critical, because the process controllers (PLCs) have an Ethernet interface and are therefore connected to the technological network or the building’s internal network, which is the most common path for an attack. At the same time, they contain the control algorithms, and damage to or an attack on these has a fundamental effect on the function of the building, including physical effects on the control of motors, dampers, valves, etc., and especially on related technologies.
On the other hand, we can rely on organizational rules and technical means familiar from the IT environment, which simplifies our work. Usually, one of the following situations occurs:
1. PLCs are connected in a separate (technological) network, which is considered “protected”, and security is addressed only at the point where this technological network connects to the intranet or the Internet. The technological network is separated from the outside world by a router, ideally a security router. This approach also allows devices with very weak or no security to be connected to the network. These are mainly air handling units with autonomous control, various “XY to Ethernet” converters, sensors, frequency converters and the like. It also means that the entire technological network should be physically secured against the connection of third-party devices and other attacks, i.e. all active elements should be locked away and the infrastructure (cables, patch panels, etc.) separated from other IT equipment. There should be no wireless access points in the network; if they are really necessary, they should be protected not only by sufficiently strong encryption but also by additional measures, for example MAC address filtering.
2. PLCs are part of the intranet and may share the same logical network with devices that are not part of the control system (e.g. printers, client stations, servers, routers). Security must therefore be addressed directly at each PLC or other control system device with network communication. This can be a problem, because simpler components do not have enough computing power for the task, or their communication protocols do not support security well (e.g. Modbus TCP). On some devices the web configuration interface can at least be disabled with a switch, but this does not prevent, for example, attacks that overload the network interface with extremely heavy traffic.
1. Building management system in a separate technological network
Separate technological network
With the topology according to point 1, i.e. when the PLCs and possibly other IP devices have a separate technological network, we must solve the connection to the building intranet. It should be enough to specify a router in the project (e.g. a Mikrotik RB2011 series, or even a hEX will suffice) or, in agreement with the customer’s IT technician, another device. The administrator password of the router must be changed from the default. It can also be agreed that the router will be supplied by the customer, so that all active elements remain under one unified management system.
2. Building management system components as part of the customer’s network
Control systems as part of the intranet
If we plan the system according to point 2, we must take into account possible problems with the coexistence of the customer’s IT equipment and the control system components. Several times we have encountered faults in which a PLC froze “at random”, communication between PLCs was lost, and the like. One cause was, for example, the spanning tree protocol enabled on the customer’s switches, which the Ethernet interface in the PLC apparently could not handle well. Diagnosis is complicated because the fault occurs only occasionally. Can we, as designers, avoid similar problems?
From a design point of view, it is best to contact the future operator, find out their approach to interconnecting the technological network with the intranet, and reflect the conditions in the technical report or in the topology of the control system. For some projects this is an absolutely crucial step. Typically, these are chains of branches (shops, banks) with their own IT infrastructure and IT department, which follow the rules of a multinational owner. In these cases, reaching agreement takes a long time, but we are dealing with a competent partner with whom we share a common interest. A solution can therefore usually be found, but we should be prepared for the control system having to meet several conditions:
- IT infrastructure (sockets, cabling, patch panels, racks, active elements, etc.) is supplied and managed by the customer. For us, this means that we only have to specify the required number and location of connection points (network sockets). This is basically not bad news.
- Devices connected to the network have IP addresses assigned by DHCP, not fixed ones. The IT technician configures the DHCP server to assign IP addresses according to the MAC addresses of the devices (these are factory-bound to a specific piece of hardware). The switches can thus monitor what is connected to which socket, and if the hardware in the network changes, a security alert is raised. A problem may occur when a PLC is replaced due to a fault: after connecting to the network, it will be assigned a different IP address, or the network will reject the connection completely. The service technician must therefore know that the replacement of any Ethernet device has to be coordinated with the customer’s IT department.
- The rules for communication between networks (so-called open ports) are periodically reviewed and purged so that no old, unused entries remain in the system as a security risk. Sometimes a new network administrator joins the company, deletes all the entries, waits to see who complains, and then redefines the rules that are needed. While this ensures that the tables are cleaned up, it may happen that, for example, temperature records from refrigerated boxes for the hygiene authority are not stored for several months before the problem is discovered by accident.
- Outgoing communication is restricted or completely blocked. This can affect cloud services, which are becoming increasingly popular with control system vendors. Therefore, if the project envisages transferring data to a remote database (e.g. Merbon ContPort, various service and diagnostic services, proxy services for remote access without incoming network connections), we must check in advance whether such a service will be feasible at all. It may be possible to have outgoing connections permitted only to a specific IP address, but then any change of that address (for example, when the service hosting moves to another provider) must be monitored and the customer’s IT department informed in time to update the rules. This is a problem especially in long-standing relationships, when after several years of trouble-free operation hardly anyone remembers how everything actually works.
- Incoming communication is completely blocked. This should not surprise us; it is one of the basic security measures. For remote service access, we try to arrange a connection via VPN.
- VPN is often used for service access, and we consider it a highly secure tool for remote management. However, this holds only if other security rules are observed, such as not storing the password in readable form on the remote computer, changing passwords and certificates regularly, allowing access only from defined IP addresses, etc. The operator should be informed that service access will take this form, and the conditions should be agreed between the operator and the measurement and control supplier. Remote service really does speed up problem solving, tuning and operator training significantly, and it has therefore become a matter of course for larger installations (over 1000 data points).
- No other devices with Internet connectivity (GPRS/LTE routers, etc.) may be connected to the network. This would be a gross breach of the security rules, and the offender would face the displeasure of the network administrators with all its consequences. It is therefore not possible to “help ourselves” in this way if the IT administrator has restricted some services for whatever reason.
A common point of contention is the sending of alarm e-mails. To send them, the PLC or the central station must have access to a mail server. The operator must therefore set up a user account on its mail server for sending these messages and pass the access data to the control system supplier so that alarm sending can be configured. The relevant central station or PLC must also have access to the Internet. It is again advisable to state these requirements in the technical report of the project; the control system supplier is then protected to some extent. Above all, we try to avoid using freemail services.
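The condition on outgoing connections permitted only to a specific IP address implies that someone has to notice when the address of the remote service changes. A minimal Python sketch of such a check is shown here; the function names are hypothetical, the addresses are documentation examples, and a real check would be scheduled and would notify the customer’s IT department rather than just return a value.

```python
import socket
from typing import Iterable, Set


def resolve_ipv4(hostname: str) -> Set[str]:
    """Return the IPv4 addresses that the hostname currently resolves to."""
    infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET)
    return {info[4][0] for info in infos}


def addresses_not_allowed(resolved: Iterable[str],
                          allowed: Iterable[str]) -> Set[str]:
    """Return resolved addresses that the firewall rule does not cover.

    An empty result means the agreed outgoing rule still matches the
    service; anything else should be reported before the data transfer
    silently stops.
    """
    return set(resolved) - set(allowed)
```

In use, the result of `resolve_ipv4()` for the (assumed) cloud service hostname would be compared with the addresses agreed with the IT department, for example once a day.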
Here are some tips to increase security at the PLC level:
- we do not use default passwords, especially for users with Engineering and Full Control rights;
- we will block the web servers in the PLC if they are not used;
- for converters, terminals, etc., we will block FTP access and the web interface, if possible;
- for HMI panels, we do not use a user named Admin for administrator access, but a “less conspicuous” name;
- if there are devices in the network that cannot be secured (Modbus, BACnet servers, etc.), we prefer a separate technological network;
- for remote access we use VPN, not direct mapping of PLC ports to a public IP address;
- we do not leave project backups on a PC with visualization.
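Some of these tips can be turned into a simple checklist that is run over the device list before handover. The following Python sketch only illustrates the idea; the `Device` record and its fields are hypothetical and would have to be filled in by hand or from whatever inventory the supplier keeps.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Device:
    """Hypothetical inventory record for one networked device."""
    name: str
    default_password: bool     # factory password still active
    web_server_enabled: bool
    web_server_used: bool      # web access is actually required
    ftp_enabled: bool


def findings(dev: Device) -> List[str]:
    """Return the checklist items from the tips above that the device violates."""
    issues = []
    if dev.default_password:
        issues.append("default password still set")
    if dev.web_server_enabled and not dev.web_server_used:
        issues.append("unused web server enabled")
    if dev.ftp_enabled:
        issues.append("FTP access enabled")
    return issues
```

An empty list for every device does not prove that the installation is secure, but it documents that the basic points were checked.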
The project must not end up in a state where a function that has been contracted, sold and paid for cannot be activated and handed over to the customer. The supplier is then unable to hand over the equipment, and the customer has an excellent reason to withhold payments (“you should have found this out and arranged it in advance”). Especially with public contracts or subsidy programmes, where the specification must be strictly adhered to, we can get into a very difficult situation.
Level of visualization
Here we find service computers – SCADA client stations, servers for data storage, web servers for visualization, etc. These are mostly hardware based on personal computers. The basic protection consists of a reasonably set user policy and regular maintenance.
The biggest risk is usually on the user’s side; probably only operating rules, thorough training and the threat of sanctions will help. As for backups, there are never enough of them, but they must be kept tidy so that the disk is not full of directories named “latest version”, “do not delete”, “old”, “backup_not_working”, etc. This, however, is more a matter of operation and service. The designer should, in agreement with the control system supplier, specify a PC with the necessary hardware parameters (especially RAM and disk size) and determine where the hardware will be located. It is a mistake when a computer holding a database of ten years of stored values from a system with several thousand data points is found gathering dust somewhere under a desk. For reliability and durability as well, an air-conditioned room is suitable for servers, and it also provides physical security; today’s buildings usually have a dedicated server room.
Some tips for configuring the OS and Merbon SCADA:
Installation, updates, backups
- in normal operation, protect the hardware by blocking reading and writing from CD drives, USB ports, etc.;
- block unnecessary services, uninstall unnecessary programs (games, etc.);
- check for regular updates of OS, antivirus programs and applications (SCADA);
- regularly backup and test recoverability from backups (!).
- do not use default usernames and passwords for services;
- do not save passwords in the browser;
- use port 443 (https://) to access the SCADA server;
- if the SCADA server is accessible from the Internet (and not only from the internal network), restrict access only from certain IP addresses;
- allow remote access only via VPN; never leave the remote desktop (RDP) port or other services open to the Internet.
- regularly (roughly every 6 months) review the user lists and delete unnecessary accounts (also with regard to GDPR);
- give users only the rights they need for their work;
- if the computer is also used for work on the Internet, observe the rules of reasonable behavior;
- use the Windows user policy, use users with Administrator rights only for system configuration, not for normal work.
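Restricting access to the SCADA server to certain IP addresses is normally configured on the router or firewall, not in application code. The logic of such an allow-list can nevertheless be illustrated with a short Python sketch; the networks listed are documentation examples and purely hypothetical.

```python
import ipaddress

# Hypothetical allow-list: one service company subnet and one single host.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.17/32"),
]


def is_allowed(client_ip: str) -> bool:
    """Return True if the client address falls inside any allowed network."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in ALLOWED_NETWORKS)
```

Any address outside the listed networks is rejected, which is the default-deny behavior that the firewall rule should also have.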
Some services are now fully virtualized. This means that they run on resources (PCs, storage) whose physical location we know nothing about and which we access exclusively over the Internet. Examples are storage of historical data, portals for service and diagnostics, web portals for remote operation, etc. This may conflict with the customer’s security policy, so when designing we must find out whether the system under consideration uses virtual resources and whether this is acceptable to the customer.
The operating costs of a cloud solution are related to this. They have little to do with security, but as the issue arises more and more often, I consider it necessary to at least mention them in the project. In the bill of quantities, the designer should state both the one-time installation costs and the monthly or annual costs of operating the service. If the specification explicitly states a period for which the service is to be guaranteed, we list these costs as a separate item (e.g. “access to the MyCloudAccess web portal for 1000 data points for a period of two years”). The general contractor then bears the costs for this period as part of the delivery; afterwards the obligation to pay passes to the customer (operator).
A very interesting situation arises when security certificates and keys are used for communication. These have limited validity. For example, domain certificates (used for secure access, without which some browsers now refuse to display data from a web server) cannot be issued for more than 27 months. In practice, this means that shortly after the warranty expires, visualization or data exchange between PLCs may stop working. The customer must either know in advance that the certificates need to be renewed and order the renewal, or have a service contract with the system supplier. Although systems for automatic certificate renewal exist, they also need to be managed. This creates a new business model in which the customer may feel like a hostage condemned to permanent payments. It must be explained that this is necessary maintenance resulting from security principles. It places new demands on the control system supplier and on the designer.
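Whether a certificate is approaching the end of its validity can be checked automatically. The sketch below, in Python, assumes the expiry date is available as the `notAfter` text that Python’s `ssl` module reports for a certificate; the 60-day warning threshold and the function names are assumptions for illustration only.

```python
import ssl
from datetime import datetime, timezone


def parse_not_after(not_after: str) -> datetime:
    """Convert a certificate 'notAfter' string to an aware UTC datetime."""
    seconds = ssl.cert_time_to_seconds(not_after)
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def days_until_expiry(not_after: str, now: datetime) -> int:
    """Whole days remaining until the certificate expires (negative if expired)."""
    return (parse_not_after(not_after) - now).days


def needs_renewal(not_after: str, now: datetime, warn_days: int = 60) -> bool:
    """True when the certificate expires within the warning period."""
    return days_until_expiry(not_after, now) <= warn_days
```

A check of this kind, run periodically by the operator or the service company, gives enough notice to order the renewal before visualization or data exchange stops working.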
So let’s summarize the basic rules for designing secure systems in several points:
- we place switchboards where there is no risk of unauthorized manipulation of control elements;
- in publicly accessible areas, the cabinet should have a key lock, not just a handle;
- serial buses leading outside the switchboard are connected to other ports than those on which the I/O modules are in the switchboard;
- we connect the technological network to the intranet or the Internet at one point and through a security router, whose administrator is aware of his role;
- active elements, servers, data storages are placed in sufficiently secure areas with a suitable environment (temperature, humidity, dust);
- if the system requires external resources (cloud services, renewal of certificates, etc.), we point this out in the project and include the costs for a defined period in the bill of quantities;
- we have the specifications of the hardware of workstations and servers approved by the supplier of control system;
- we always address adequate physical security together with cybersecurity (there is no point in installing disk storage with encrypted access in a switchboard that stands in a publicly accessible corridor and can be opened without tools);
- we solve all things in advance – communication always takes a long time, because the construction company usually does not understand the issue and has completely different worries;
- we try to communicate directly with the end user, if known.
Technology is changing rapidly, and if we want to be able to do quality design work in the future, we must maintain our professional competence. We will see that the time spent on “improving qualifications” will pay off many times over, because we will not have to deal with unpleasant disputes during implementation.