High-Resolution Timing, Revisited
By Jeff Bush <jeff@be.com>

In an earlier newsletter, Ficus wrote an interesting article
about how to get high-resolution timing in the BeOS using a
clever hack and a device driver ("Outsmarting the Scheduler,"
<http://www.be.com/aboutbe/benewsletter/volume_II/Issue27.html>).
Thanks to a number of changes in Genki, this sort of black
magic is no longer necessary. The kernel now offers several
ways to get fine-grained, precise timing both from device
drivers and user space.

The sample code <ftp://ftp.be.com/pub/samples/drivers/pcspeaker.zip>
demonstrates just how fine-grained you can get. The code is a
driver for the built-in PC speaker. If you have a file in mono
8-bit unsigned 20k raw format, you can cat it directly to
/dev/audio/pcspeaker to play it. Failing that, I've included
a 'sine' program that writes a sine wave to stdout, which can be
redirected to this device. This was tested on a 333 MHz Pentium II.
On slower machines, you may have to tweak the values to keep the
machine usable.

The PC speaker is a primitive piece of hardware. It basically
has two states: on and off. You can drive it from a programmable
rate generator, or you can directly modify the state by setting
a bit. The classic way to mimic analog signals with this type of
speaker is to pulse it very quickly -- faster than the speaker
cone can physically move. The amount of time that current is
applied to the output vs. the amount of time that it is not
applied is controlled by the amplitude of the input signal,
causing the speaker position to approximate the actual waveform.
The sample driver will use this technique if it is compiled with
the PLAY_IN_BACKGROUND macro set to 0. The downside to this
approach is that it requires constant attention from the CPU.
In order to get any kind of sound quality, you have to shut off
interrupts, rendering the machine unusable. This is clearly
undesirable.
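
As a rough illustration of the pulsing technique, the sketch
below splits one sample period into "on" and "off" time in
proportion to an 8-bit unsigned sample. The names split_period
and pulse_times are made up for this example and do not come
from the sample driver, which may compute its timing differently.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of the sample driver: holds the
   two halves of one pulse. */
typedef struct {
	uint32_t on_us;   /* time with current applied */
	uint32_t off_us;  /* time with current removed */
} pulse_times;

static pulse_times split_period(uint8_t sample, uint32_t period_us)
{
	pulse_times p;

	/* Keeps the multiplication below within 32 bits. */
	assert(period_us <= 1000000);

	/* An 8-bit unsigned sample spans 0..255, so 255 keeps the
	   speaker on for the whole period and 0 keeps it off. */
	p.on_us = (uint32_t)sample * period_us / 255;
	p.off_us = period_us - p.on_us;
	return p;
}
```

With an assumed 50us period, a mid-scale sample of 128 gives
roughly equal on and off times, while 0 and 255 leave the speaker
fully off or fully on for the period.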

The sample code takes a rather unusual approach to this problem.
It programs a timer to wake it up periodically so it can move
the speaker cone. Other threads continue to run normally, but at
defined intervals, an interrupt occurs and the speaker driver
code is executed (at interrupt level) to control the speaker.

To get any quality, the interrupts need to occur frequently:
around every 30-60us. Note that, although the machine is still
usable, handling interrupts this often cuts performance quite a bit.
This is a rather extreme case, but it does illustrate how fine-
grained you can get with kernel timers. You should also note that
the standard user space synchronization primitives use this
underlying mechanism, and you can now get very accurate delays
using system calls such as snooze, read_port_etc, or
acquire_sem_etc. You don't need to write a device driver to
get accurate timing.
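
One way to see this from user space is to measure how late
snooze() returns. The sketch below does that with system_time().
The helper name snooze_overshoot is made up, and the #else branch
holds stand-ins (a fake clock) so that the code also compiles
outside BeOS; it assumes the compiler defines __BEOS__ on BeOS.

```c
#include <assert.h>
#include <stdint.h>

#ifdef __BEOS__
#include <OS.h>
#else
/* Stand-ins so the sketch compiles outside BeOS. These are NOT the
   real definitions: the fake clock just advances by the amount
   requested, so the overshoot is always zero here. */
typedef int64_t bigtime_t;
typedef int32_t status_t;
static bigtime_t fake_clock = 0;
static bigtime_t system_time(void) { return fake_clock; }
static status_t snooze(bigtime_t microseconds)
{
	fake_clock += microseconds;
	return 0;
}
#endif

/* Hypothetical helper: returns how many microseconds later than
   requested snooze() came back. */
static bigtime_t snooze_overshoot(bigtime_t requested_us)
{
	bigtime_t start, elapsed;

	assert(requested_us >= 0);

	start = system_time();
	snooze(requested_us);
	elapsed = system_time() - start;
	return elapsed - requested_us;
}
```

Calling snooze_overshoot(50) a few thousand times and looking at
the largest result gives a feel for the delay accuracy on a given
machine; the article makes no claim about specific numbers.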

From the driver level, the timers are programmed by using a new
API added for Genki. The declarations for these functions are in
KernelExport.h. Programming a timer in a device driver basically
involves calling the add_timer function, like so:

	ret = add_timer((timer*) &my_timer, handler_function, time,
		B_ONE_SHOT_ABSOLUTE_TIMER);

The first parameter is a pointer to a timer structure. You can
add your own parameters to this by defining your own structure
and making the kernel's timer struct the first element, for example:

	struct my_timer {
		timer _kernel_timer;
		long my_variable;
			.
			.
			.
	};


The second parameter to the add_timer function is a hook function
that should be called when the timer expires. It has the form:

	int32 timer_hook(timer *t);

Note that the timer struct that you originally passed to add_timer
is also passed into this function, so you can access elements that
you've added to that struct from the interrupt handler.
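
A minimal sketch of such a hook follows. It casts the timer
pointer back to the enclosing struct, which only works because
the kernel's timer is the first member. The hook name and the
choice to return B_HANDLED_INTERRUPT are assumptions, and the
#else branch holds stand-ins so the code compiles outside BeOS
(it assumes the compiler defines __BEOS__ on BeOS).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#ifdef __BEOS__
#include <KernelExport.h>
#else
/* Stand-ins so the sketch compiles outside BeOS. These are NOT the
   real definitions from KernelExport.h. */
typedef int32_t int32;
typedef struct timer { int placeholder; } timer;
#define B_HANDLED_INTERRUPT 1
#endif

/* Driver-private timer: the kernel's timer must come first so that
   a pointer to it is also a pointer to the whole struct. */
struct my_timer {
	timer _kernel_timer;
	long my_variable;
};

/* Hypothetical hook: recovers the enclosing struct and touches the
   field the driver added. */
static int32 my_timer_hook(timer *t)
{
	struct my_timer *mine = (struct my_timer *)t;

	/* The cast is only valid while the timer is the first member. */
	assert(offsetof(struct my_timer, _kernel_timer) == 0);

	mine->my_variable++;
	return B_HANDLED_INTERRUPT;
}
```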

The third parameter is the time that this interrupt should occur.
How this is interpreted is determined by the fourth parameter.
There are three basic modes that a timer understands:

* B_ONE_SHOT_ABSOLUTE_TIMER, as the name implies, is a single
  interrupt that occurs at a specific system time.

* B_ONE_SHOT_RELATIVE_TIMER is similar, but allows you to
  specify an interval rather than an actual time.

* B_PERIODIC_TIMER will cause the callback to be called
  repeatedly at a given interval.
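
To make the difference concrete, the sketch below works out when
the first callback would be due in each mode. It is plain
arithmetic, not kernel code: the enum labels and first_expiry are
made up, and it assumes a periodic timer first fires one interval
after it is added.

```c
#include <assert.h>
#include <stdint.h>

typedef int64_t usec_t;  /* microseconds; stands in for bigtime_t */

/* Hypothetical labels for the three modes. The real flags are the
   B_* constants named in the list above. */
typedef enum { MODE_ABSOLUTE, MODE_RELATIVE, MODE_PERIODIC } timer_mode;

/* Returns the system time at which the first callback is due,
   given the current time and the value passed as the third
   parameter. */
static usec_t first_expiry(timer_mode mode, usec_t now, usec_t time)
{
	assert(time >= 0);

	switch (mode) {
	case MODE_ABSOLUTE:
		return time;        /* already a system time */
	case MODE_RELATIVE:
	case MODE_PERIODIC:
		return now + time;  /* an interval counted from now */
	}
	return time;  /* not reached for valid modes */
}
```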

When playing in background mode, the speaker driver determines
the next time the speaker cone has to move and programs a timer
for that time. If it's under 20us, it doesn't actually program
the timer. This is because there is some overhead to handling
an interrupt. If you program a timer that's too short, you may
end up wasting time and be late for the next event you want to
handle. Likewise, when it sets up the interrupt, it will set it
a little early to compensate for the interrupt latency. A macro
called SPIN determines whether the code will spin in a loop when
it arrives early for an event, or just move the speaker cone early.
In the case of the speaker driver, which is obviously not a
high-fidelity component, spinning isn't really necessary. In fact,
since these interrupts occur so frequently, machine performance is
degraded significantly when the driver spins. In drivers where
timing is critical, spinning like this is a way to be accurate.
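
The decision described above can be sketched as a small planning
function. The 20us threshold is the figure quoted in the text; the
10us latency allowance and all of the names are assumptions made
for this example, not values from the sample driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef int64_t usec_t;  /* microseconds; stands in for bigtime_t */

/* 20us is the threshold quoted in the text. The latency allowance
   is a made-up figure for illustration. */
#define MIN_TIMER_INTERVAL_US 20
#define LATENCY_ALLOWANCE_US  10

typedef struct {
	bool use_timer;     /* false: too close, don't program a timer */
	usec_t timer_time;  /* absolute time to program, if use_timer */
} timer_plan;

static timer_plan plan_next_event(usec_t now, usec_t event_time)
{
	timer_plan plan = { false, 0 };

	assert(event_time >= now);

	/* Interrupt overhead would make us late for an event this
	   close, so handle it without a timer. */
	if (event_time - now < MIN_TIMER_INTERVAL_US)
		return plan;

	/* Ask for the interrupt slightly early to absorb latency. */
	plan.use_timer = true;
	plan.timer_time = event_time - LATENCY_ALLOWANCE_US;
	return plan;
}
```

When the handler then runs ahead of event_time, it can either
spin until the event time arrives or act immediately, which is
the choice the SPIN macro makes.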

A quick note about the implementation of timers. You may be
familiar with the way timing is implemented on many operating
systems (including BeOS before Genki). Normally, an operating
system sets up a periodic interrupt that occurs every couple of
milliseconds. At that time, the timeout queue is checked to see
if there are expired timeouts. As a consequence, you can't get
a dependable resolution of less than the interrupt period. In
Genki, the kernel dynamically programs a hardware timer for the
next thread to wake or sleep, allowing much finer resolutions.
This, coupled with the preemptive nature of the kernel, is what
allows the driver to accurately set timers for 30-60us.
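
The difference between the two schemes comes down to simple
arithmetic, sketched below. This is a model, not kernel code: the
function names are made up, ticks are assumed to fall on exact
multiples of the tick period, and interrupt latency is ignored.

```c
#include <assert.h>
#include <stdint.h>

typedef int64_t usec_t;  /* microseconds; stands in for bigtime_t */

/* Tick-driven kernel: a timeout is only noticed at the first
   periodic tick at or after its deadline. */
static usec_t tick_wake_time(usec_t deadline, usec_t tick_period)
{
	assert(deadline >= 0 && tick_period > 0);

	/* Round the deadline up to the next multiple of the period. */
	return (deadline + tick_period - 1) / tick_period * tick_period;
}

/* Dynamically programmed timer: the hardware is set to fire at the
   deadline itself. */
static usec_t one_shot_wake_time(usec_t deadline)
{
	return deadline;
}
```

With an assumed 1000us tick, a deadline of 1050us is not serviced
until 2000us in the first model, which is 950us late. In the
second model it is serviced at 1050us.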

When it finishes playing a chunk of data, the interrupt handler
releases a semaphore that the calling thread was waiting for.
Normally, when a semaphore is released, the scheduler is invoked
to wake up threads that may have been waiting for the semaphore.
As you're probably aware, rescheduling from within an interrupt
handler is very bad. Because of this, driver writers had to use the
flag B_DO_NOT_RESCHEDULE when releasing a semaphore from an interrupt
handler. This was a limitation, however, because it meant that a
thread waiting for some device driver call would have to wait
until the scheduler was invoked again (for some unrelated reason)
before it would be run. This could be several milliseconds later.
Generally, you'd want it to run as soon as possible.

In Genki, an interrupt handler can now return B_INVOKE_SCHEDULER,
which calls the scheduler immediately after your interrupt handler
returns. If you have released semaphores from an interrupt handler,
you can return this to make sure that waiting threads are woken up
soon. This is especially useful for drivers where low-latency
access to hardware is important.
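
A minimal sketch of this pattern follows. The handler still
releases the semaphore with B_DO_NOT_RESCHEDULE, then asks for a
reschedule through its return value; keeping the flag is my
reading of the text rather than something taken from the sample
driver. The state struct and handler name are made up, and the
#else branch holds stand-ins so the code compiles outside BeOS
(it assumes the compiler defines __BEOS__ on BeOS).

```c
#include <assert.h>
#include <stdint.h>

#ifdef __BEOS__
#include <KernelExport.h>
#else
/* Stand-ins so the sketch compiles outside BeOS. These are NOT the
   real definitions; the fake release_sem_etc only counts calls. */
typedef int32_t int32;
typedef uint32_t uint32;
typedef int32 sem_id;
typedef int32 status_t;
#define B_HANDLED_INTERRUPT 1
#define B_INVOKE_SCHEDULER  2
#define B_DO_NOT_RESCHEDULE 1
static int32 fake_release_count = 0;
static status_t release_sem_etc(sem_id sem, int32 count, uint32 flags)
{
	(void)sem;
	(void)flags;
	fake_release_count += count;
	return 0;
}
#endif

/* Hypothetical per-device state. */
struct chunk_state {
	sem_id done_sem;     /* the writing thread blocks on this */
	int32 samples_left;  /* samples remaining in this chunk */
};

/* Hypothetical handler, called once per sample. */
static int32 chunk_interrupt(void *data)
{
	struct chunk_state *s = (struct chunk_state *)data;

	/* Still in the middle of the chunk: nothing to wake up. */
	if (--s->samples_left > 0)
		return B_HANDLED_INTERRUPT;

	/* Chunk finished: release the waiter without rescheduling
	   here, then ask the kernel to reschedule once we return. */
	release_sem_etc(s->done_sem, 1, B_DO_NOT_RESCHEDULE);
	return B_INVOKE_SCHEDULER;
}
```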

As you can see, there are a number of very useful facilities for
low-latency device access and high-precision timing. In addition,
many of the generic user-level synchronization primitives have
become much more accurate.
