IA-64 : System Calls

I (=Arun) was examining the consequences of using ar.k5 as the system call entry point.

Key findings:

  1. Reading ar.k5 has a latency of 12 cycles, as opposed to 1 cycle for a general register (GR).
  2. Supplying the entry point as an immediate value, or loading it from memory, instead of reading ar.k5 saves about 11 cycles.


<pre>
#include <stdio.h>

#define COUNT 10000
typedef unsigned long uint64_t;

static inline uint64_t ia64_get_itc() {


static inline uint64_t ia64_get_k5() {


static inline uint64_t ia64_get_r1() {


int main(void) {

} </pre>
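The function bodies of test.c are not preserved above, so the following is only a rough sketch of how such a measurement could be set up, not the original code. It assumes the usual approach of reading the interval time counter (ar.itc) before and after a loop of COUNT iterations and dividing by COUNT. The names `read_counter`, `read_source` and `average_cost` are made up for this illustration, and the non-IA-64 fallbacks exist only so that the sketch builds on other machines.

```c
#include <stdint.h>
#include <time.h>

#define COUNT 10000

/* Read a free-running counter: ar.itc on IA-64. The clock() fallback
   is an assumption added so the sketch compiles elsewhere. */
static inline uint64_t read_counter(void)
{
#if defined(__ia64__)
    uint64_t t;
    __asm__ __volatile__("mov %0=ar.itc" : "=r"(t));
    return t;
#else
    return (uint64_t)clock();
#endif
}

/* The operation being timed: a read of ar.k5 on IA-64.
   On other machines this is a placeholder that returns 0. */
static uint64_t read_source(void)
{
#if defined(__ia64__)
    uint64_t v;
    __asm__ __volatile__("mov %0=ar.k5" : "=r"(v));
    return v;
#else
    return 0;
#endif
}

/* Average counter ticks per call of fn over count iterations.
   The volatile sink keeps the compiler from dropping the calls. */
static uint64_t average_cost(uint64_t (*fn)(void), unsigned count)
{
    volatile uint64_t sink = 0;
    uint64_t start = read_counter();
    for (unsigned i = 0; i < count; i++)
        sink += fn();
    uint64_t end = read_counter();
    return (end - start) / count;
}
```

Calling `average_cost(read_source, COUNT)` gives the per-iteration cost. Note that the call through a function pointer and the loop itself add overhead that an inlined version would not have, so absolute numbers from this sketch will be higher than the 12 and 1 cycles reported on this page.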


<pre>
#include <machine/asm.h>

entry: data8 0xa000000000000000



ENTRY(getpid1, 8)

(p6) br.cond.sptk.few _error;


ENTRY(getpid2, 8)

(p6) br.cond.sptk.few _error


ENTRY(getpid3, 8)

(p6) br.cond.sptk.few _error

.endp </pre>


<pre>
$ gcc -O2 test.c t.S -o t
$ ./t
12 1 433 422 422
</pre>
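The "about 11 cycles" figure can be checked directly against this output. The small sketch below does the subtraction. The labels attached to each number are an assumption based on the order in which the functions appear in test.c and t.S (in particular, that getpid1 is the variant that reads ar.k5), and `cycles_saved` is a helper name made up for this example.

```c
#include <stddef.h>

/* Cycle counts copied from the run above. The labels are assumed
   from the order of the functions in test.c and t.S. */
struct sample {
    const char *label;
    long cycles;
};

static const struct sample samples[] = {
    { "read ar.k5", 12 },
    { "read r1",     1 },
    { "getpid1",   433 },
    { "getpid2",   422 },
    { "getpid3",   422 },
};

/* Difference in cycles between a baseline and an alternative. */
static long cycles_saved(long baseline, long alternative)
{
    return baseline - alternative;
}
```

With these numbers, `cycles_saved(433, 422)` and `cycles_saved(12, 1)` both give 11, which suggests that the difference between the getpid variants is accounted for by the extra latency of the ar.k5 read.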

ia64/SystemCalls (last edited 2008-06-17 21:37:16 by localhost)