Multitasking Real-Time Operating Systems

Multitasking and real-time, in the field of operating systems, are independent properties. General-purpose operating systems, such as Windows and Mac OS, are normally multitasking and non real-time, whereas in the embedded systems market operating systems exist that use any combination of multitasking and real-time. The computers used for real-time operating systems (RTOS), compared to most general-purpose computers, need to be more reliable, more tolerant of changing environmental conditions and able to cope with varying computation loads.

In order to analyse the techniques used to achieve a multitasking RTOS it is first important to define the key terms of this type of operating system (OS). Multitasking, as the name suggests, is a technique used to handle the execution of multiple tasks. It can be defined as, “the execution of multiple software routines in pseudo-parallel. Each routine represents a separate thread of execution” (Ganssle and Barr, 2003). The OS simulates parallelism by dividing the CPU processing time between the individual threads. The second term to define is real-time. A real-time system is a system based on stringent time constraints. For a system to be real-time, there must be a guaranteed time frame within which a task must execute. In a multitasking RTOS the task scheduling, switching and execution elements are key to the effectiveness of the OS in achieving its real-time requirements.

1.1. Multitasking

Multitasking can be divided into a number of operations. These operations vary between operating systems; however, they are usually present in one form or another. Wolf (2005) states that there are two main types of multitasking: the commonly used preemptive multitasking and the decreasingly used cooperative multitasking. The key mechanism in all multitasking operating systems is the context switch. The context switch is the process of switching from one executing task to another; it involves saving the state of the running task and restoring the state of another. In a single-CPU computer only one task can be running at any time; however, when context switches occur frequently enough, the appearance of task parallelism is achieved.

To understand the value of preemptive multitasking, it is first important to discuss the earlier type and describe its restrictions. Cooperative multitasking has a number of limitations, which stop it being used in RTOSs. In this type of multitasking only the active task may initiate a context switch allowing another task to run. A context switch occurs when the task has completed processing, gives up the CPU after using its allotted time slice or becomes blocked on a shared resource. This allows for an equal distribution of task execution times as long as every task cooperates; however, the system is only as strong as its weakest task. If an individual task is not cooperative, due to either bad design or a bug, then the entire system can stop responding. The overriding reason why cooperative multitasking is not suitable for RTOSs is that there is no concept of task priority and preemptibility.

For RTOSs preemptive multitasking is usually used. Preemptive multitasking uses priority-based scheduling. This means a task can be stopped or suspended in order to allow a task of higher priority to run. In the case of two tasks of the same high priority, each is given an equal time slice. A task is said to ‘preempt’ another task when that task is interrupted and the context is switched to the higher-priority task.
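
The behaviour of preemptive, priority-based multitasking can be illustrated with a small tick-by-tick simulation. This is only a sketch under stated assumptions: the Task fields, the simulate function, the two-tick time slice and the task names are all hypothetical, larger numbers are taken to mean higher priority, and a real RTOS performs the switch by saving and restoring CPU registers rather than by moving objects between queues.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    priority: int   # assumption of this sketch: larger number = higher priority
    release: int    # tick at which the task becomes ready
    burst: int      # ticks of CPU time the task needs in total


def simulate(tasks, time_slice=2):
    """Return the name of the task that runs in each tick."""
    pending = sorted(tasks, key=lambda t: t.release)
    remaining = {t.name: t.burst for t in tasks}
    ready = {}                 # priority -> deque of ready tasks
    timeline = []
    current, used, tick = None, 0, 0
    while pending or current is not None or any(ready.values()):
        # Release every task whose start time has arrived.
        while pending and pending[0].release <= tick:
            task = pending.pop(0)
            ready.setdefault(task.priority, deque()).append(task)
        top = max((p for p, q in ready.items() if q), default=None)
        if current is not None:
            if top is not None and top > current.priority:
                # Preemption: the running task goes back to the front of its queue.
                ready.setdefault(current.priority, deque()).appendleft(current)
                current = None
            elif used >= time_slice and top == current.priority:
                # Time slice used up: rotate among tasks of equal priority.
                ready[current.priority].append(current)
                current = None
        if current is None and top is not None:
            current, used = ready[top].popleft(), 0   # the "context switch"
        if current is None:
            timeline.append("idle")
        else:
            timeline.append(current.name)
            remaining[current.name] -= 1
            used += 1
            if remaining[current.name] == 0:
                current = None
        tick += 1
    return timeline


demo = [Task("logger", 1, 0, 4), Task("sensorA", 3, 1, 3), Task("sensorB", 3, 1, 3)]
print(simulate(demo))
```

In this run the low-priority logger starts first, is preempted at tick 1 when the two sensor tasks become ready, and only resumes once both have finished; the two sensor tasks, having equal priority, alternate every two ticks.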

1.1. Scheduling

The scheduler is a key component in any RTOS. Many scheduling algorithms have been developed for different operating systems, each with its relative merits, such as speed of execution, use of memory and the ability to prioritise and preempt tasks. For RTOSs the two most widely used scheduling algorithms are Rate-Monotonic (RM) and Earliest-Deadline First (EDF). In RM scheduling, priorities are assigned to tasks based on their period: the shorter the period, the higher the priority. In EDF scheduling the task whose deadline is the earliest is always the first to be executed. RM task priorities are static and do not require any additional services from the OS. In contrast, EDF assigns priorities dynamically, as the priority depends on how close the task is to its deadline. This requires the OS to support dynamic priorities; however, EDF scheduling is capable of achieving higher processor utilisation than static-priority algorithms. The main issue with EDF is that an overload can result in an unpredictable system. “Unfortunately, EDF scheduling degrades poorly. If the system experiences a transient overload, it is impossible to predict which threads will miss their deadlines.” (Ganssle and Barr, 2003). For this reason, RM is more often used in ‘hard’ real-time systems, although more processing power is required compared to EDF, because static priorities cannot guarantee deadlines at as high a processor utilisation.
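
The difference in achievable processor utilisation can be seen in a small simulation. The sketch below is illustrative only: the function names, the two-task set and the tick-based model are assumptions made here, each task's deadline is assumed equal to its period, and a job that misses its deadline is simply dropped. The utilisation bound used for RM is the well-known Liu and Layland result, which is not taken from the sources cited in this document.

```python
from functools import reduce
from math import gcd


def rm_utilisation_bound(n):
    """Liu and Layland bound: n periodic tasks are RM-schedulable if U <= n(2^(1/n) - 1)."""
    return n * (2 ** (1 / n) - 1)


def missed_deadlines(tasks, policy):
    """Simulate one hyperperiod, one tick at a time.

    tasks:  list of (name, execution_time, period); deadline = period (assumption).
    policy: "RM" (shortest period first) or "EDF" (earliest deadline first).
    Returns a list of (task name, absolute deadline) for every missed job.
    """
    horizon = reduce(lambda a, b: a * b // gcd(a, b), (p for _, _, p in tasks))
    jobs = []      # each job: [name, period, absolute deadline, remaining ticks]
    missed = []
    for tick in range(horizon + 1):
        # Any unfinished job whose deadline has arrived has missed it.
        for job in [j for j in jobs if j[2] <= tick]:
            missed.append((job[0], job[2]))
            jobs.remove(job)          # simplification: late jobs are abandoned
        if tick == horizon:
            break
        # Release a new job of every task whose period starts now.
        for name, wcet, period in tasks:
            if tick % period == 0:
                jobs.append([name, period, tick + period, wcet])
        if jobs:
            key = (lambda j: j[1]) if policy == "RM" else (lambda j: j[2])
            job = min(jobs, key=key)  # highest-priority job runs for this tick
            job[3] -= 1
            if job[3] == 0:
                jobs.remove(job)
    return missed


task_set = [("T1", 2, 5), ("T2", 4, 7)]
utilisation = sum(c / p for _, c, p in task_set)
print(f"utilisation {utilisation:.3f}, RM bound {rm_utilisation_bound(2):.3f}")
print("RM misses: ", missed_deadlines(task_set, "RM"))
print("EDF misses:", missed_deadlines(task_set, "EDF"))
```

With a utilisation of about 0.97 this task set exceeds the RM bound for two tasks (about 0.83). Under RM the longer-period task T2 misses its first deadline at tick 7, whereas EDF meets every deadline over the hyperperiod.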

Real-time systems are used when meeting an execution deadline is either desirable or critical. The majority of RTOSs are both multitasking and run on embedded computer systems, and this discussion of real-time will assume both. Real-time operating systems differ in only a few ways from general-purpose operating systems such as Windows. These differences are based around the absolute requirement of predictability in how task execution will occur. This means that in a RTOS the importance of predictability far outweighs the overall performance of the system as a whole; thus a general-purpose OS will often outperform a RTOS in terms of throughput. It is also important to note that many real-time systems have deadlines that are days apart, meaning that tasks do not have to be continually switched, which reduces the need for very fast processors.

Real-time systems can be divided into two classes: ‘soft’ and ‘hard’. In ‘soft’ real-time systems predictability and deadlines are important but not safety critical. An example of a ‘soft’ real-time system is a videophone. In this system it is desirable for deadlines to be met in order to maintain audio and video synchronisation; however, in such non-critical systems more throughput can often be achieved by sacrificing some quality, meaning that a larger video scene could be viewed, as less processing time would be devoted to maintaining absolute deadlines. This is often an important consideration in real-time systems, because if the system can be designed not to be reliant on deadlines then much cost can be saved. However, in many cases it is essential that certain tasks run exactly as predicted and within strict deadlines; for this a ‘hard’ real-time system is used. A ‘hard’ real-time system is concerned with predictability, in terms of knowing how long it will take for a deadline to be met. The designers of real-time systems aim to meet deadlines with zero latency. The latency of a task is the difference between the time the task actually started or finished and the time the task was expected to start or finish.
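
The definition of latency given above amounts to a simple subtraction, and the ‘hard’ and ‘soft’ views of a deadline can be expressed as two different checks on the result. The following is a minimal sketch; the function names and the sample timings (in milliseconds) are hypothetical.

```python
def latencies(expected, actual):
    """Latency of each job: actual time minus expected time, in the same unit."""
    return [a - e for e, a in zip(expected, actual)]


def meets_hard_deadlines(expected, actual):
    """'Hard' view: a single late job counts as a failure of the system."""
    return all(lat <= 0 for lat in latencies(expected, actual))


def miss_ratio(expected, actual):
    """'Soft' view: the fraction of late jobs is a measure of lost quality."""
    lats = latencies(expected, actual)
    return sum(1 for lat in lats if lat > 0) / len(lats)


expected_finish_ms = [10, 20, 30, 40]
actual_finish_ms = [10, 21, 29, 40]

print(latencies(expected_finish_ms, actual_finish_ms))             # [0, 1, -1, 0]
print(meets_hard_deadlines(expected_finish_ms, actual_finish_ms))  # False
print(miss_ratio(expected_finish_ms, actual_finish_ms))            # 0.25
```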

1.2. Failure Rates

It is crucial that the failure rates of real-time computers are extremely small. This is even more important in safety-critical applications, where fault tolerance is essential. These systems must be able to continue operating despite failures of both hardware and software. A method often used in these systems is graceful degradation. This method allows the system to continue executing safely under faults, usually with reduced functionality and quality. “The less critical tasks are shed, and the system is still able to carry out the critical core tasks that are vital to the survival of the controlled process.” (Krishna and Shin, 1997). Failures in real-time systems are usually handled differently over time. There may be some routines to handle immediate failures and other routines to handle failures in the long term. Krishna and Shin (1997) describe these responses: “The short-term response consists of quickly correcting for a failure to allow immediate deadlines to be met. The long-term response consists of locating the failure, determining the best response to it, and initiating a recovery and reconfiguration procedure”. In real-time systems it is often more difficult to debug the system if failures occur. This is often due to the difficulty of tracing missed deadlines, which arise from timing issues with I/O devices. A number of techniques are used specifically to debug real-time systems; these techniques are often used to determine if deadlines are being met. “In-circuit emulators, logic analysers, and even LEDs can be useful tools in checking the execution time of real-time code to determine whether it in fact meets its deadlines.” (Wolf, 2005).
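
Graceful degradation by task shedding can be sketched as follows. This is an illustration rather than a method taken from Krishna and Shin: the shed_tasks function, the criticality numbers and the CPU loads (expressed as whole percentages) are all hypothetical, and the policy assumed here is simply to drop the least critical task repeatedly until the remaining load fits the capacity left after a fault.

```python
def shed_tasks(tasks, capacity_percent):
    """Drop the least critical tasks until the remaining load fits.

    tasks: list of (name, criticality, load_percent); higher criticality
           means more important (an assumption of this sketch).
    Returns (kept_names, shed_names).
    """
    kept = sorted(tasks, key=lambda t: t[1])   # least critical first
    shed = []
    while kept and sum(t[2] for t in kept) > capacity_percent:
        shed.append(kept.pop(0)[0])            # shed the least critical task
    return [t[0] for t in kept], shed


workload = [
    ("control_loop", 10, 40),
    ("telemetry", 5, 30),
    ("logging", 2, 20),
    ("user_interface", 1, 30),
]

# Example: a fault (say, the loss of one processor) leaves 60% of the
# capacity that the full workload was designed for.
kept, shed = shed_tasks(workload, 60)
print("kept:", kept)
print("shed:", shed)
```

With only 60% capacity available, everything except the control loop is shed; with 100% available, only the user interface would be dropped.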

1.3. Developing a RTOS

An API standard named POSIX has been produced for portable operating systems. Initially created for UNIX systems, the standard allows conforming applications to be run on any platform that also supports the standard. “The real-time and thread extensions of POSIX 1003.1 define a subset of POSIX interface functions that are particularly suitable for multi-threaded real-time applications.” (Liu, 2000). These standards include API specifications for the creation of threads and the management of their execution. In addition there are real-time extensions, which include prioritised scheduling.
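
As an illustration of the prioritised scheduling interface, Python's standard os module exposes the POSIX scheduling calls on platforms that provide them. The sketch below assumes a Linux system; the helper names fifo_priority_range and try_make_realtime are hypothetical. Switching a process to the fixed-priority SCHED_FIFO policy normally requires elevated privileges, so the attempt is expected to fail for an ordinary user and that failure is reported rather than treated as an error.

```python
import os


def fifo_priority_range():
    """Return the (lowest, highest) priority allowed for the SCHED_FIFO policy."""
    return (os.sched_get_priority_min(os.SCHED_FIFO),
            os.sched_get_priority_max(os.SCHED_FIFO))


def try_make_realtime(priority):
    """Try to run the calling process under SCHED_FIFO at the given priority.

    Returns True on success and False if the OS refuses the request,
    which is the usual outcome without the necessary privileges.
    """
    try:
        # pid 0 means "the calling process".
        os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(priority))
        return True
    except OSError:
        return False


low, high = fifo_priority_range()
print(f"SCHED_FIFO priorities range from {low} to {high}")
print("real-time policy granted:", try_make_realtime(low))
```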

Embedded computer systems are by far the largest user of multitasking RTOSs. The majority of technical digital devices, equipment and vehicles use embedded computers to control different operations. Embedded computers have many design considerations not associated with general-purpose computers, such as PCs, workstations and servers. The key design considerations, as discussed by Wolf (2005), are power consumption, hardware required, upgrading, reliability, size and cost. In addition to these, for real-time systems, it is important to consider the way in which deadlines are met, predictability and fault tolerance.

Designers of real-time systems employ many techniques when developing a product. These techniques are used to cut costs, speed up development time, improve reliability, increase performance, maintain fault tolerance and simplify both the system operations and the coding of the applications. Common design choices include: Which OS to use? Is multitasking needed? Does the system need to be real-time? Is memory management needed? What programming language to use? The majority of these design decisions are based around cost. This can be cost at any stage of the product's life, including indirect costs such as loss of confidence if the product fails after deployment. It is important to note that in safety-critical systems, the safety aspect usually drives the design. These systems raise further questions, such as whether the system needs to be fast performing, with the majority of time spent on task execution, or whether it needs to meet deadlines and be reliable and fault tolerant. With safety-critical systems, the choice of OS and its functionality becomes more important. For example, a safety-critical system may need to be not only ‘hard’ real-time but also extremely fault tolerant, for which ‘hard’ real-time operating systems provide APIs to implement the necessary levels of failure prevention and recovery.

Meeting the Requirements of ‘Hard’ Real-time

The commercially available VxWorks, supplied by Wind River Systems, is an example of a ‘hard’ real-time operating system. This OS is used mainly in industry, where the need for a ‘hard’ RTOS is crucial. In contrast to Linux, VxWorks has been designed as a multitasking RTOS from its first implementation, whereas Linux was designed as a general-purpose OS and has gradually had real-time features added.

2.1. Choosing a RTOS: Commercial Vs Open Source

The Linux 2.6 kernel has been developed with a number of significant enhancements over the previous kernel versions. Many of these enhancements have been made in an effort to make the standard Linux kernel a more real-time environment. These changes include preemptive scheduling and reduced context switch latencies. The Linux 2.6 kernel itself is not capable of achieving ‘hard’ real-time, due to latency times and predictability issues, although ‘soft’ real-time can be achieved. There are, however, a number of patches and modified kernels that are able to achieve ‘hard’ real-time with performance comparable to that of RTOSs designed solely for real-time operations.

VxWorks, as well as many other commercially available RTOSs, has numerous configurable options, which make it a good option for real-time applications. For different systems, the decision of how the memory is addressed can be vital. For embedded systems that need to be very fast and efficient, often only a single address space is needed, whereas in other cases it is more desirable to have separate kernel and user spaces with memory protection. The Linux 2.6 kernel provides kernel and user space with memory protection, which is appropriate for a large number of systems. VxWorks, on the other hand, provides a number of configurable settings, which make it suitable for the majority of memory design requirements. “In VxWorks, we can choose to have only virtual address mapping, to have text segments and exception vector tables write protected, and to give each task a private virtual memory when the task requests for it.” (Liu, 2000).

The Linux 2.6 kernel release, as stated previously, does not have ‘hard’ real-time capabilities. However, many advances have been made since the 2.4 kernel version to improve the predictability of task execution. The majority of changes that affect real-time operations have been to the scheduler. According to the Linux Online Journal (2006), the scheduler algorithm has been redesigned since the 2.4 release. Processes are now handled more efficiently and task priority and preemptibility have been added. The scheduler's execution time is no longer affected by the number of tasks being scheduled, so its performance remains predictable however many tasks are running. The Input/Output (I/O) subsystem of the Linux 2.6 kernel has also had major changes made, which have allowed for higher responsiveness under varying workloads. The I/O scheduler has been optimised to ensure processes do not spend too much time waiting for resources. These improvements to the Linux 2.6 kernel go some way to providing real-time capabilities, but as a number of critical tasks used by Linux cannot be preempted, the kernel itself cannot be used in ‘hard’ real-time applications.
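
One well-known way for a scheduler's decision time to be independent of the number of tasks, and the approach taken by the scheduler introduced with the 2.6 kernel, is to keep one queue per priority level together with a bitmap recording which queues are non-empty. The sketch below illustrates that idea only; it is not kernel code, and the class and method names are hypothetical. The figure of 140 priority levels and the convention that a lower number means a higher priority follow that scheduler's design rather than the sources cited in this document.

```python
from collections import deque


class PriorityArray:
    """Per-priority run queues plus a bitmap of the non-empty ones."""

    def __init__(self, levels=140):
        self.queues = [deque() for _ in range(levels)]
        self.bitmap = 0          # bit p is set when queues[p] holds a task

    def enqueue(self, task, priority):
        self.queues[priority].append(task)
        self.bitmap |= 1 << priority

    def pick_next(self):
        """Return the next task to run, or None if nothing is ready.

        The cost depends on the fixed number of priority levels, not on
        how many tasks are queued.
        """
        if self.bitmap == 0:
            return None
        # Isolate the lowest set bit; a kernel would use a single
        # find-first-set instruction for this step.
        priority = (self.bitmap & -self.bitmap).bit_length() - 1
        task = self.queues[priority].popleft()
        if not self.queues[priority]:
            self.bitmap &= ~(1 << priority)
        return task


run_queue = PriorityArray()
run_queue.enqueue("editor", 120)
run_queue.enqueue("motor_control", 10)
run_queue.enqueue("compiler", 120)
print([run_queue.pick_next() for _ in range(4)])
```

The task at priority 10 is chosen first regardless of insertion order, and the two tasks at priority 120 are then served in the order they were queued.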

Although the Linux 2.6 kernel is not ‘hard’ real-time, a number of projects have been undertaken to produce ‘hard’ real-time versions of Linux. Two popular real-time releases are RTLinux and RTAI. RTLinux (Real-time Linux) is made up of two parts, a real-time kernel and Linux. An application has to be divided over both parts, the real-time part running on the real-time kernel and the non real-time part running on Linux. RTLinux replaces the hardware interrupts, usually handled by the Linux kernel, with interrupts emulated by software. “Rather than letting Linux interface interrupt control hardware directly, the RT kernel sits between the hardware and the Linux kernel.” (Liu, 2000). In this way the real-time kernel can inspect all of the interrupts that occur and pick out those that will cause real-time tasks to run. The real-time kernel is able to preempt Linux, if it is running, and allow the real-time task to run. RTAI (Real Time Application Interface), unlike RTLinux, is an extension for the Linux kernel. RTAI can be described as, “An open source kernel for running real-time tasks alongside Linux and non real-time Linux processes” (Ganssle and Barr, 2003). RTAI is a ‘hard’ real-time operating system, which provides predictable responses to interrupts and is POSIX compliant.
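
The interrupt arrangement described for RTLinux can be sketched as a thin layer that sees every interrupt first. The model below is a simplified illustration and not RTLinux code: the RTKernel class, its method names and the interrupt numbers are hypothetical. It assumes that when Linux asks for interrupts to be disabled only a software flag is cleared, so that real-time interrupts are still serviced immediately while interrupts meant for Linux are held back until it enables them again.

```python
class RTKernel:
    """Toy model of a real-time layer sitting between hardware and Linux."""

    def __init__(self, rt_handlers, linux_handler):
        self.rt_handlers = rt_handlers        # irq number -> real-time handler
        self.linux_handler = linux_handler    # handler for all other interrupts
        self.linux_irqs_enabled = True        # software flag, not the hardware one
        self.pending = []                     # interrupts held back for Linux

    def hardware_interrupt(self, irq):
        if irq in self.rt_handlers:
            # Real-time work runs at once, even if Linux has "disabled" interrupts.
            self.rt_handlers[irq](irq)
        elif self.linux_irqs_enabled:
            self.linux_handler(irq)
        else:
            self.pending.append(irq)          # emulate the interrupt later

    def linux_disable_irqs(self):
        self.linux_irqs_enabled = False

    def linux_enable_irqs(self):
        self.linux_irqs_enabled = True
        while self.pending:
            self.linux_handler(self.pending.pop(0))


log = []
kernel = RTKernel(
    rt_handlers={7: lambda irq: log.append(("rt", irq))},
    linux_handler=lambda irq: log.append(("linux", irq)),
)
kernel.linux_disable_irqs()
kernel.hardware_interrupt(3)    # destined for Linux: held back
kernel.hardware_interrupt(7)    # real-time: handled immediately
kernel.linux_enable_irqs()      # the held interrupt is now delivered
print(log)
```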

It has been shown that there are both commercial and freely available RTOSs. These two classes of OS are used widely in industry and both have positive and negative factors, which promote their use in different applications and industry sectors. With commercial operating systems there are usually many costs, in initial purchases, licence fees per product and consultation fees. These costs do not apply to freely available operating systems, as they can be openly used. In addition, operating systems such as Linux have a large community of online developers, most of whom offer free advice. The decision of which operating system to use will invariably be down to the individual requirements of the system and, more importantly, the resources, of both time and budget, available to the developer.

References

Ganssle, J., Barr, M. (2003), Embedded Systems Dictionary. CMP Books.

Wolf, W. (2005), Computers as Components: Principles of Embedded Computing System Design. Morgan Kaufmann Publishers.

Krishna, C. M., Shin, K. G. (1997), Real-Time Systems. McGraw-Hill.

Liu, J. W. (2000), Real-Time Systems. Prentice Hall.

Linux Online Journal (2006), viewed: 12 December 2006, http://www.linuxjournal.com/article/8041