Thread Local Storage

From http://people.freebsd.org/~marcel/tls.html



This document tries to collect all kinds of information related to TLS and serves as a design document and implementation guide. Nothing fancy, just something to help us flesh out the details.


The problem space

We seem to have a three-dimensional problem space:

* complete vs shared executable -- Complete means statically linked; it is the term used in certain environments. Using complete in this context leaves static free to mean something else. The big difference between complete and shared is the presence (or absence) of a runtime linker.

* static vs dynamic TLS -- This refers to the TLS model in use. A process can have both models in use at the same time, but certain technical restrictions apply. The big difference is that dynamic TLS calls the __tls_get_addr() function to get the virtual address of a thread local variable, whereas static TLS does not.

* with pthread vs without pthread -- This refers to whether a threads library (libthr or libkse) is present and/or in use. The existence of the __thread keyword does not imply that the process will be multi-threaded, so we have to deal with TLS accesses outside the context of a threaded application. The big difference between with pthread and without pthread is the ability to actually have multiple threads.

Current platform support

Of the current tier 1 and tier 2 platforms, only i386 and ia64 have full toolchain support. This is with GCC 3.3. GCC 3.4 claims to have support for TLS on alpha and sparc64, but this has not been tested or verified. The remaining details per platform:

* ia64 -- The current version of binutils (2.13.2) is buggy with respect to TLS. This seems to affect dynamic TLS relocations.

* alpha -- The TLS access sequences are not generated at all. The __thread keyword seems to be ignored.

* sparc64 -- The compiler emits an error when the __thread keyword is used.

* amd64 -- The assembler (binutils 2.13.2) does not support thread-local access relocations in 64-bit mode. When generating 32-bit (ILP32) code on amd64, the assembler supports TLS. This, however, has no practical value.

Below are typical TLS access sequences, both static and dynamic, for the platforms that do support TLS. The C code from which the access sequences are generated is:

        int __thread i = 3;
        int x() { return i; }

i386

static TLS access sequence

        movl    %gs:0, %eax                 # %eax = thread pointer
        movl    i@NTPOFF(%eax), %eax        # load i at its (negative) offset from the thread pointer

dynamic TLS access sequence

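        addl    $_GLOBAL_OFFSET_TABLE_+[.-.L2], %ebx    # finish computing the GOT address in %ebx (the call/popl pair that seeds %ebx is not shown)
        leal    i@TLSGD(,%ebx,1), %eax                  # %eax = address of the GOT entry describing i
        call    ___tls_get_addr@PLT                     # returns the address of this thread's i in %eax
        movl    (%eax), %eax                            # load i
        popl    %ebx                                    # restore the caller's %ebx (the matching pushl is not shown)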

ia64

static TLS access sequence

        addl    r14 = @tprel(i), tp         // r14 = thread pointer + offset of i
        ;;
        ld4     r8 = [r14]                  // load i into the return value register

dynamic TLS access sequence

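        addl    r14 = @ltoff(@dtpmod(i)), gp    // r14 = address of the GOT entry holding i's module ID
        addl    r15 = @ltoff(@dtprel(i)), gp    // r15 = address of the GOT entry holding i's offset in that module
        ;;
        ld8     out0 = [r14]                    // first argument: module ID
        ld8     out1 = [r15]                    // second argument: offset
        br.call.sptk    b0 = __tls_get_addr     // returns the address of this thread's i in r8
        ld4     r8 = [r8]                       // load i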

ThreadLocalStorage (last edited 2008-06-17T21:37:48+0000 by anonymous)