Introduction

The performance of four lock routines is measured on two machines: mill3, a quad-processor 200 MHz Pentium Pro (PPro), and clump3, an eight-way 167 MHz Sun Enterprise.

How it Works

A number of sender threads each insert 500 messages (each message is an array of 8 integers) into a concurrent queue that can hold about 25 messages. Throughout the test, a receiving thread continuously removes messages from the queue.
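
The setup described above can be sketched in C roughly as follows. This is a minimal illustration, not the benchmark's actual code: the names (queue_t, queue_put, queue_get, run_trial), the exact capacity of 25, and the MAX_SENDERS limit are assumptions made for this sketch. It guards the queue with a POSIX mutex and condition variables for simplicity, whereas the measured tests substitute each of the four lock routines in turn.

```c
#include <assert.h>
#include <pthread.h>

#define MSG_WORDS       8    /* each message is an array of 8 integers */
#define MSGS_PER_SENDER 500  /* each sender inserts 500 messages */
#define QUEUE_CAP       25   /* "about 25 messages"; exact value is assumed */
#define MAX_SENDERS     16   /* hypothetical upper bound for this sketch */

typedef struct { int words[MSG_WORDS]; } msg_t;

typedef struct {
    msg_t slots[QUEUE_CAP];
    int head, tail, count;
    pthread_mutex_t lock;
    pthread_cond_t not_full, not_empty;
} queue_t;

static void queue_init(queue_t *q) {
    q->head = q->tail = q->count = 0;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_full, NULL);
    pthread_cond_init(&q->not_empty, NULL);
}

/* Insert one message, blocking while the queue is full. */
static void queue_put(queue_t *q, const msg_t *m) {
    pthread_mutex_lock(&q->lock);
    while (q->count == QUEUE_CAP)
        pthread_cond_wait(&q->not_full, &q->lock);
    q->slots[q->tail] = *m;
    q->tail = (q->tail + 1) % QUEUE_CAP;
    q->count++;
    assert(q->count <= QUEUE_CAP);
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
}

/* Remove one message, blocking while the queue is empty. */
static void queue_get(queue_t *q, msg_t *out) {
    pthread_mutex_lock(&q->lock);
    while (q->count == 0)
        pthread_cond_wait(&q->not_empty, &q->lock);
    *out = q->slots[q->head];
    q->head = (q->head + 1) % QUEUE_CAP;
    q->count--;
    pthread_cond_signal(&q->not_full);
    pthread_mutex_unlock(&q->lock);
}

typedef struct { queue_t *q; int id; } sender_arg_t;
typedef struct { queue_t *q; int expected; int received; } receiver_arg_t;

static void *sender(void *p) {
    sender_arg_t *a = p;
    msg_t m;
    for (int i = 0; i < MSGS_PER_SENDER; i++) {
        for (int w = 0; w < MSG_WORDS; w++) m.words[w] = a->id;
        queue_put(a->q, &m);
    }
    return NULL;
}

static void *receiver(void *p) {
    receiver_arg_t *a = p;
    msg_t m;
    while (a->received < a->expected) {
        queue_get(a->q, &m);
        a->received++;
    }
    return NULL;
}

/* Run one trial; returns the number of messages received, or -1 on bad input. */
int run_trial(int num_senders) {
    if (num_senders < 1 || num_senders > MAX_SENDERS) return -1;
    queue_t q;
    pthread_t rx, tx[MAX_SENDERS];
    sender_arg_t sargs[MAX_SENDERS];
    receiver_arg_t rarg = { &q, num_senders * MSGS_PER_SENDER, 0 };

    queue_init(&q);
    pthread_create(&rx, NULL, receiver, &rarg);
    for (int i = 0; i < num_senders; i++) {
        sargs[i].q = &q;
        sargs[i].id = i;
        pthread_create(&tx[i], NULL, sender, &sargs[i]);
    }
    for (int i = 0; i < num_senders; i++) pthread_join(tx[i], NULL);
    pthread_join(rx, NULL);
    return rarg.received;
}
```

Calling run_trial(3) starts three senders and one receiver and returns once all 1500 messages have passed through the queue. Timing code is left out here to keep the sketch short.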

The time each thread takes to insert a single message is then averaged. Details of the locking algorithms can be found at http://HTTP.CS.Berkeley.EDU/clumps/ipps98.ps
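
As a worked illustration of that averaging, the helper below divides each thread's elapsed time by the number of messages it inserted and then takes the mean across threads. The function name mean_insert_us and its parameters are hypothetical, and the exact averaging procedure used for the published figures is an assumption here.

```c
#include <stddef.h>

/* Mean per-insertion time in microseconds.
 * elapsed_us[i] is the total time thread i spent inserting its messages
 * (hypothetical input; how the original harness collected it is not stated). */
double mean_insert_us(const double *elapsed_us, size_t nthreads,
                      size_t msgs_per_thread) {
    if (nthreads == 0 || msgs_per_thread == 0) return 0.0;
    double sum = 0.0;
    for (size_t i = 0; i < nthreads; i++)
        sum += elapsed_us[i] / (double)msgs_per_thread;  /* per-insert time */
    return sum / (double)nthreads;                       /* mean over threads */
}
```

For example, two threads that need 1000 and 1500 microseconds for 500 insertions each average 2.0 and 3.0 microseconds per insertion, giving a mean of 2.5 microseconds.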

Nuts and Bolts

Each test is an average over twenty runs. The tests fall into two categories: with and without exponential backoff. With exponential backoff, a thread that fails to acquire the lock waits for an exponentially increasing interval before attempting to acquire it again.
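
To make the backoff idea concrete, here is a minimal sketch of a test-and-set spin lock with exponential backoff, written with C11 atomics. It is not the code that was measured: the names (tas_lock_t, tas_acquire, tas_release, backoff_demo) and the BACKOFF_MIN and BACKOFF_MAX constants are assumptions chosen for illustration. The paper linked above describes the actual algorithms.

```c
#include <pthread.h>
#include <stdatomic.h>

#define BACKOFF_MIN 4      /* initial delay in spin iterations (assumed value) */
#define BACKOFF_MAX 1024   /* cap on the delay (assumed value) */
#define MAX_THREADS 16     /* hypothetical limit for the demo below */

typedef struct { atomic_flag held; } tas_lock_t;

/* Spin on test-and-set; after each failed attempt, wait and double the delay. */
static void tas_acquire(tas_lock_t *l) {
    unsigned delay = BACKOFF_MIN;
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire)) {
        for (volatile unsigned i = 0; i < delay; i++) { /* busy-wait */ }
        if (delay < BACKOFF_MAX) delay *= 2;
    }
}

static void tas_release(tas_lock_t *l) {
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

typedef struct { tas_lock_t *lock; long *counter; int iters; } worker_arg_t;

/* Each worker increments the shared counter under the lock. */
static void *worker(void *p) {
    worker_arg_t *a = p;
    for (int i = 0; i < a->iters; i++) {
        tas_acquire(a->lock);
        (*a->counter)++;
        tas_release(a->lock);
    }
    return NULL;
}

/* Returns the final counter value, or -1 if nthreads is out of range. */
long backoff_demo(int nthreads, int iters) {
    if (nthreads < 1 || nthreads > MAX_THREADS) return -1;
    tas_lock_t lock = { ATOMIC_FLAG_INIT };
    long counter = 0;
    pthread_t tids[MAX_THREADS];
    worker_arg_t arg = { &lock, &counter, iters };

    for (int i = 0; i < nthreads; i++)
        pthread_create(&tids[i], NULL, worker, &arg);
    for (int i = 0; i < nthreads; i++)
        pthread_join(tids[i], NULL);
    return counter;
}
```

Dropping the inner delay loop gives the plain variant without backoff. The delay is counted in loop iterations rather than wall-clock time, which keeps the sketch portable but makes the actual wait machine-dependent.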

Conclusion

In most cases, the figures suggest that the PPro and the SPARC perform within 10% of each other.

Average time of one insertion (500 messages of 32 bytes each), over 20 runs

Legend:
NumThreads: Number of threads concurrently sending messages.
Posix mutex: Solaris pthread mutex

Note: All figures are in microseconds.

Pentium
Tests taken on mill3 (quad 200 MHz Pentium Pro)

WITHOUT exponential backoff
NumThreads           1           2           3
Test & Set           2.467200    3.250350    4.016100
Test & Test & Set    2.495800    3.411050    4.595834
Ticket               2.484800    3.204400    3.735900
Posix mutex          3.601400    6.648050    7.410434

WITH exponential backoff
NumThreads           1           2           3
Test & Set           2.644300    2.993100    3.211567
Test & Test & Set    2.712000    3.397550    3.909367
Ticket               2.723600    3.180550    5.288367
Posix mutex          3.564200    6.729900    7.714700

SPARC

Tests taken on clump3 (eight-way 167 MHz Sun Enterprise)

WITHOUT exponential backoff
NumThreads           1           2           3
Test & Set           2.639800    3.398750    3.522134
Test & Test & Set    2.673200    3.735400    3.931433
Ticket               2.819200    3.962900    4.095533
Posix mutex          3.697600    11.443000** 11.547366