NOTE: This information is largely obsolete and is maintained here for historical purposes. Current FreeBSD releases support only 1:1 threading.

Threading support in GDB

In GDB we have at least two targets to worry about:

  1. inferior -- the inferior target operates on live processes and uses ptrace(2) to fetch and store registers and to read and write memory
  2. core -- the core target operates on core files and uses BFD (binutils) to fetch registers and read from the memory image

Both targets are basically unthreaded.

GDB uses the PTID structure to keep track of threads. The PTID structure holds the following 3 IDs:

  1. PID -- the process ID
  2. LWPID -- the light-weight process ID or kernel thread ID
  3. TID -- the user thread ID

In the following description a PTID structure is denoted as (X,Y,Z), where X is the PID, Y is the LWPID and Z is the TID.
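
As a rough illustration of the (X,Y,Z) notation, the triple can be modelled as a small C structure. This is only a sketch: the names struct ptid, ptid_make and ptid_same are hypothetical stand-ins chosen for this page and are not claimed to match GDB's own definitions.

```c
#include <stdbool.h>

/* Hypothetical stand-in for GDB's PTID; the names are assumptions. */
struct ptid {
    int  pid;   /* X: the process ID */
    long lwp;   /* Y: the LWPID (kernel thread ID), 0 if none */
    long tid;   /* Z: the user thread ID, 0 if none */
};

/* Build a PTID from its three components, i.e. (pid,lwp,tid). */
static struct ptid ptid_make(int pid, long lwp, long tid)
{
    struct ptid p = { pid, lwp, tid };
    return p;
}

/* Two PTIDs denote the same thread only if all three IDs match. */
static bool ptid_same(struct ptid a, struct ptid b)
{
    return a.pid == b.pid && a.lwp == b.lwp && a.tid == b.tid;
}
```

With this sketch, ptid_make(100, 0, 0) corresponds to (100,0,0) in the notation used here.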

Non-threaded processes only need to know about the PID. This is exactly what the inferior target and the core target need. The PTID structure will only contain (PID,0,0). The non-threaded case is not discussed any further.

For 1xN threading, as implemented by libc_r, there's only 1 LWPID. All TIDs are scheduled onto that one LWPID in turn. Consequently, only the currently running thread has both a TID and a LWPID. All other threads only have a TID (i.e. the LWPID is zero). The PID is basically unimportant, because it's constant for the process.

For 1x1 threading, as implemented by libthr, there's a 1-to-1 relation between the TID and the LWPID. Every TID has its own LWPID and it is always associated with that LWPID. Again, the PID is unimportant.

For MxN threading, as implemented by libkse/libpthread, we have system scope threads (bound threads) and process scope threads (unbound threads). For system scope threads there's a 1-to-1 relation between TID and LWPID, just like for 1x1 threading. For process scope threads there may be multiple LWPIDs onto which TIDs can be scheduled. This is just like 1xN threading, except that multiple TIDs can each have a LWPID associated with them at any one time and each TID can be associated with different LWPIDs over time. Here too the PID is constant and unimportant.

The core target has a nice "feature" in that it uses the PID to fetch the registers. It basically looks up the .reg/$PID section. Since we dump the registers of the LWPs in the core file using the LWPID, we can fool the core target: before asking it to fetch registers, we rewrite the PTID structure so that the PID is replaced with the LWPID. This of course applies only to threads that are associated with a LWP. Consequently, we have an existing way to get the registers of the various LWPs out of a core file without changing any existing code.
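
The rewrite trick can be sketched roughly as follows. This is an illustration under assumptions, not GDB's code: struct ptid, ptid_rewrite_for_target and reg_section_name are hypothetical names, and the section name format simply follows the .reg/$PID convention mentioned above.

```c
#include <stdio.h>

/* Hypothetical stand-in for GDB's PTID; the names are assumptions. */
struct ptid {
    int  pid;
    long lwp;
    long tid;
};

/*
 * Replace the PID with the LWPID so that an unthreaded target, which
 * only looks at the PID, operates on the LWP instead.  Returns 0 on
 * success, or -1 if the thread is not associated with a LWP, in which
 * case *out is left untouched.
 */
static int ptid_rewrite_for_target(struct ptid in, struct ptid *out)
{
    if (in.lwp == 0)
        return -1;
    out->pid = (int)in.lwp;
    out->lwp = 0;
    out->tid = 0;
    return 0;
}

/* Build the name of the register section the core target would look up. */
static int reg_section_name(struct ptid p, char *buf, size_t len)
{
    return snprintf(buf, len, ".reg/%d", p.pid);
}
```

For a thread (100,100123,2), the rewritten PTID is (100123,0,0), so the lookup goes to the section named after the LWPID rather than the PID.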

The inferior target is pretty much like the core target in that it uses the PID as the argument to ptrace(2). Now, if we can extend ptrace(2) to accept LWPIDs as well as PIDs, we can use the same trick for the inferior target as we use for the core target: we can fetch and store the registers of the individual LWPs by rewriting the PTID structure and replacing the PID with the LWPID. Interestingly enough, this can also work to continue or step a single LWP or to wait for it to finish.

The current core dump code preserves the behaviour that the PID field in the NT_PRSTATUS note actually means the process ID. It does this by dumping the registers of the first LWP using the PID instead of the LWPID. This interferes with the scheme explained above. The simplest solution is to stop doing that, for three reasons. First, the PID is mostly unimportant to begin with. Second, nothing breaks if we stop doing it. Last but not least, preserving the behaviour means that we need to change the existing notes or even add new notes to the core file. Exploring this option made it clear that such an approach would make matters worse: we would then need to add support for these new or changed notes to BFD and also change GDB to work better with BFD to actually use the information in the right way. In the future, when BFD and GDB agree on the semantics of the various notes, we may be in a better position for this. So, the simplest solution for us is to dump the registers of all the kernel threads, including the first one, using their LWPIDs, and not to preserve the old behaviour of having the PID in the note.

Non-threaded processes or 1xN threaded processes don't really need to care about the LWPID. Using the PID yields the same behaviour, because the one LWP *is* the process. So, one can use LWPID and PID interchangeably, at least in theory. Nonetheless, it opens up the ability for backward compatibility in a different way. If the core file contains the PID instead of the LWPID and we don't rewrite the PTID structure, we still get the registers. Likewise for the inferior target: simply keep the PTID structure untouched and we'll pass the PID to ptrace(2). So, if we have a LWPID value that signals that we don't want the PTID structure to be rewritten, we can operate on the PID. This allows us to debug core files created on 4.x (with a 4.x libc_r) or even port the debugger to 4.x without having to change ptrace(2) and so on.
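
One way to picture the "LWPID that signals no rewriting" idea is a reserved sentinel value. The text does not say which value would be used, so LWP_USE_PID below is purely an assumption for illustration, as are the names struct ptid and target_id.

```c
/* Hypothetical sentinel: "operate on the PID, do not rewrite". */
#define LWP_USE_PID (-1L)

/* Hypothetical stand-in for GDB's PTID; the names are assumptions. */
struct ptid {
    int  pid;
    long lwp;
    long tid;
};

/*
 * Pick the ID to hand to the core target or to ptrace(2).
 * - Sentinel LWPID: keep using the PID (e.g. a 4.x core file).
 * - Otherwise: use the LWPID; 0 means the thread has no LWP.
 */
static long target_id(struct ptid p)
{
    if (p.lwp == LWP_USE_PID)
        return (long)p.pid;
    return p.lwp;
}
```

In this sketch a caller would treat a return value of 0 as "no LWP to operate on" and obtain the registers some other way.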

A threaded process goes through multiple stages. It first starts off as a non-threaded process with PTID (PID,0,0). At this stage libthread_db may also be involved in debugging. The next stage is after the thread library has been initialized. This should give a PTID of (PID,0,TID), (PID,LWP,0) or (PID,LWP,TID). Hence, the presence of LWP or TID indicates that the thread library is up and running. Typically both LWP and TID are non-zero. The next stage is a stage in which multiple threads have been created, possibly with multiple LWPs.
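
The three stages can be told apart from the thread list alone, as the following sketch suggests. The enum and function names are hypothetical, and the rule "more than one entry means multi-threaded" is a simplifying assumption made for this example.

```c
#include <stddef.h>

/* Hypothetical stand-in for GDB's PTID; the names are assumptions. */
struct ptid {
    int  pid;
    long lwp;
    long tid;
};

enum thr_stage {
    STAGE_NON_THREADED,     /* only (PID,0,0) is known */
    STAGE_LIB_INITIALIZED,  /* one thread, with a LWP and/or a TID */
    STAGE_MULTI_THREADED    /* several threads have been created */
};

/* Classify the process from its current list of n known PTIDs. */
static enum thr_stage classify_stage(const struct ptid *threads, size_t n)
{
    if (n > 1)
        return STAGE_MULTI_THREADED;
    if (n == 1 && (threads[0].lwp != 0 || threads[0].tid != 0))
        return STAGE_LIB_INITIALIZED;
    return STAGE_NON_THREADED;
}
```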

The debugger typically won't know about the associations between LWPIDs and TIDs after the inferior has run, except of course for 1x1 threading, where the association is fixed. For 1xN there may have been a context switch from, say, T0 to T1. This means that the PTID (PID,LWPID,T0) has become invalid and should be replaced with (PID,0,T0). At the same time the PTID (PID,0,T1) has become invalid and should be replaced with (PID,LWPID,T1). Keeping up with all this is time consuming and tedious for GDB, especially with KSE. So, it makes sense for GDB to simply ignore the LWPID part and only operate on PTIDs of the kind (PID,0,TID). This however means that GDB won't be able to side-step libthread_db for some or most of the operations. It simply doesn't have enough information. With only the TID, GDB needs to ask libthread_db about registers. Since libthread_db has the visibility to figure out up-to-date associations between LWPIDs and TIDs, it can then use proc services to ask GDB to fetch registers from the inferior using the LWPID associated with the TID, or return the registers itself when no LWPID is associated with the TID.
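
The decision that libthread_db makes on GDB's behalf can be sketched as a lookup from TID to the current LWPID. The table below is a hypothetical stand-in for whatever the thread library actually tracks; none of the names are real libthread_db or proc services interfaces.

```c
#include <stddef.h>

/* Where the registers of a thread have to come from. */
enum reg_source {
    REGS_FROM_LWP,            /* thread is on a LWP: ask the target */
    REGS_FROM_SAVED_CONTEXT,  /* thread is not running: use saved state */
    REGS_UNKNOWN_THREAD       /* no such TID */
};

/* Hypothetical view of the library's TID to LWPID association. */
struct thread_map_entry {
    long tid;
    long lwp;   /* 0 when the thread is not currently on a LWP */
};

/*
 * Given only a TID, as in a PTID of the kind (PID,0,TID), decide how
 * to get the registers.  On REGS_FROM_LWP, *lwp_out holds the LWPID
 * to pass down to the target; otherwise *lwp_out is left untouched.
 */
static enum reg_source locate_regs(const struct thread_map_entry *map,
                                   size_t n, long tid, long *lwp_out)
{
    for (size_t i = 0; i < n; i++) {
        if (map[i].tid != tid)
            continue;
        if (map[i].lwp != 0) {
            *lwp_out = map[i].lwp;
            return REGS_FROM_LWP;
        }
        return REGS_FROM_SAVED_CONTEXT;
    }
    return REGS_UNKNOWN_THREAD;
}
```

Because the lookup happens at the time of the request, a context switch from T0 to T1 only changes the table contents. The PTIDs that GDB holds, of the kind (PID,0,TID), stay valid.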

The previous point also implies that it's difficult for GDB to cache register information at the target level. The question is whether you actually need to do caching there. In GDB registers are already cached above the target level and GDB typically asks the target to fetch registers only when the regcache may be stale or is invalid. In that sense, we can assume that we're being called only when necessary and simply fetch registers in the most straightforward way, without any optimizations. The only command that may possibly benefit from caching at the target level is "info threads". This command switches to all threads in turn and needs to fetch the registers in order to know what the PC is. Caching at the target level may avoid going through libthread_db and proc services. However, the "info threads" command is hardly performance critical, so why bother?

MarcelMoolenaar/GdbAndThreading (last edited 2021-04-25T04:56:09+0000 by KubilayKocak)