IA-64 : System Calls

I (=Arun) was examining the consequences of using ar.k5 as the system call entry point.

Key findings:

  1. Reading ar.k5 (an application register) has a latency of 12 cycles, as opposed to 1 cycle for a general register (GR).
  2. Using an immediate value for the entry point, or loading it from memory, saves about 11 cycles per system call.
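
The per-operation numbers above come from timing a loop and dividing by the iteration count. Below is a minimal, portable sketch of that kind of harness. It is not the original test.c: `read_counter()` is a hypothetical stand-in for `ia64_get_itc()` that uses `clock_gettime()` and counts nanoseconds rather than cycles, and `op_under_test()` and `average_cost()` are illustrative names.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define COUNT 10000  /* same iteration count as test.c */

/* Hypothetical stand-in for ia64_get_itc(): a monotonic counter in
 * nanoseconds. The IA-64 version would read the cycle counter instead. */
static uint64_t read_counter(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Hypothetical operation under test. In the measurement described
 * above this would be a register read or a system call stub. */
static uint64_t op_under_test(uint64_t x)
{
    return x + 1;
}

/* Average cost of one call to op, in counter ticks (integer division,
 * so sub-tick costs round down to 0). */
static uint64_t average_cost(uint64_t (*op)(uint64_t), unsigned count)
{
    assert(count > 0);
    volatile uint64_t sink = 0;   /* keeps the loop from being optimized away */
    uint64_t start = read_counter();
    for (unsigned i = 0; i < count; i++)
        sink = op(sink);
    uint64_t end = read_counter();
    return (end - start) / count;
}
```

Calling `average_cost(op_under_test, COUNT)` returns the mean cost of one call. The loop overhead is included in the figure, so only differences between two measurements taken the same way are meaningful.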

test.c

<pre> #include <stdio.h>

#define COUNT 10000
typedef unsigned long uint64_t;

static inline uint64_t ia64_get_itc() {

}

static inline uint64_t ia64_get_k5() {

}

static inline uint64_t ia64_get_r1() {

}

main() {

} </pre>

t.S

<pre> #include <machine/asm.h>

entry: data8 0xa000000000000000

.text

_error:

ENTRY(getpid1, 8)

(p6) br.cond.sptk.few _error

.endp

ENTRY(getpid2, 8)

(p6) br.cond.sptk.few _error

.endp

ENTRY(getpid3, 8)

(p6) br.cond.sptk.few _error

.endp </pre>

Results

<pre>
$ gcc -O2 test.c t.S -o t
$ ./t
12 1 433 422 422
</pre>
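
The two key findings can be cross-checked against each other with a little arithmetic. The sketch below assumes the five output values are, in order, the ar.k5 read latency, the r1 read latency, and the cycle counts for getpid1, getpid2 and getpid3, and that getpid1 is the variant that reads ar.k5. The output is not labelled, so that mapping is an assumption, and `cycles_saved()` is an illustrative helper.

```c
#include <assert.h>

/* Measured values, under the ASSUMED column order described above. */
struct results {
    unsigned k5_read;   /* cycles to read ar.k5            */
    unsigned r1_read;   /* cycles to read r1               */
    unsigned getpid1;   /* syscall via ar.k5 (assumed)     */
    unsigned getpid2;   /* syscall via other source        */
    unsigned getpid3;   /* syscall via other source        */
};

static const struct results measured = { 12, 1, 433, 422, 422 };

/* Cycles saved by the faster variant over the slower one. */
static unsigned cycles_saved(unsigned slow, unsigned fast)
{
    assert(slow >= fast);
    return slow - fast;
}
```

Under that reading, `cycles_saved(measured.getpid1, measured.getpid2)` is 11, the same as `cycles_saved(measured.k5_read, measured.r1_read)`. This suggests the whole difference between the system call variants comes from the ar.k5 read.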

ia64/SystemCalls (last edited 2008-06-17 21:37:16 by localhost)