[MIT6.S081] Lab7: Multithreading (Detailed Solution Version)

Lab content URL: https://xv6.dgs.zone/labs/requirements/lab7.html

Code branch for this lab: https://gitee.com/dragonlalala/xv6-labs-2020/tree/thread2/

Uthread: switching between threads

Key points: thread switching, swtch

Thought Process:

The task of this lab is to design a context-switching mechanism for a user-level thread system. Before starting, you should carefully read these two documents: 11.3 XV6 Thread Switching (Part 1) | MIT6.S081 and 11.4 XV6 Thread Switching (Part 2) | MIT6.S081. Switching from one thread to another requires saving the CPU registers of the old thread and restoring the previously saved registers of the new thread; because the stack pointer and program counter are among the registers saved and restored, the CPU switches both its stack and the code it is executing. User registers are stored in the trapframe, while a kernel thread's registers are stored in its context. For this lab, the registers that need to be saved and restored are the ones in the context struct.

Why are there 32 registers in RISC-V, but only 14 registers are saved and restored in the swtch function?

Answer: Because swtch is called as an ordinary C function. Under the RISC-V calling convention, the caller must assume that any called function may overwrite the caller-saved registers (t0-t6, a0-a7, etc.), so the C compiler already spills any live values in those registers onto the caller's stack before the call, and restores them after the call returns. swtch therefore only needs to preserve the registers the compiler will not save for it: the twelve callee-saved registers s0-s11, plus ra and sp, which it must save so the new thread knows where to resume and which stack to use. That is 14 registers in total, which is why swtch saves and restores 14 rather than all 32.

Steps & Code:

  1. First, define a tcontext struct with the same content as the context struct in the kernel. Why not use the pre-defined context struct from the kernel? Because this is a user program, and it is not straightforward to use the data structures used by the kernel. The ra (Return Address) register stores the return address of the current function, and the sp (stack pointer) register represents the stack top pointer.
struct tcontext {
  uint64 ra;
  uint64 sp;

  // callee-saved
  uint64 s0;
  uint64 s1;
  uint64 s2;
  uint64 s3;
  uint64 s4;
  uint64 s5;
  uint64 s6;
  uint64 s7;
  uint64 s8;
  uint64 s9;
  uint64 s10;
  uint64 s11;
};

struct thread {
  char stack[STACK_SIZE];      /* the thread's stack */
  int state;                   /* FREE, RUNNING, RUNNABLE */
  struct tcontext my_context;  // registers to save during thread switching
};

  1. Initialize the ra and sp registers when the thread is created. Setting ra to func means that the first switch into this thread will "return" into func and start executing it. Set sp to the top of the thread's stack. Why add STACK_SIZE? Because the RISC-V stack grows downward, toward lower addresses: the initial stack pointer must be the highest address of the stack array, i.e. t->stack + STACK_SIZE, so that pushes move down into the array rather than below it.
void
thread_create(void (*func)())
{
  struct thread *t;

  for (t = all_thread; t < all_thread + MAX_THREAD; t++) {
    if (t->state == FREE) break;
  }
  t->state = RUNNABLE;
  // YOUR CODE HERE
  t->my_context.ra = (uint64)func;                  // first switch "returns" into func
  t->my_context.sp = (uint64)t->stack + STACK_SIZE; // stack grows down: sp starts at the top
}

  1. Call the thread-switching function thread_switch in thread_schedule(). thread_switch saves the old thread's registers and restores the new thread's registers; it is written in assembly.
void
thread_schedule(void)
{
  struct thread *t, *next_thread;

  /* Find another runnable thread. */
  next_thread = 0;
  t = current_thread + 1;
  for(int i = 0; i < MAX_THREAD; i++){
    if(t >= all_thread + MAX_THREAD)
      t = all_thread;
    if(t->state == RUNNABLE) {
      next_thread = t;
      break;
    }
    t = t + 1;
  }

  if (next_thread == 0) {
    printf("thread_schedule: no runnable threads\n");
    exit(-1);
  }

  if (current_thread != next_thread) {  /* switch threads? */
    next_thread->state = RUNNING;
    t = current_thread;
    current_thread = next_thread;
    /* YOUR CODE HERE
     * Invoke thread_switch to switch from t to next_thread:
     * thread_switch(??, ??);
     */
    thread_switch((uint64)&t->my_context, (uint64)&next_thread->my_context);
  } else
    next_thread = 0;
}

  1. Refer to the swtch function in the kernel's swtch.S file, add the following code to the uthread_switch.S file:
    .text

    /*
     * save the old thread's registers,
     * restore the new thread's registers.
     */

    .globl thread_switch
thread_switch:
    /* YOUR CODE HERE */
    sd ra, 0(a0)
    sd sp, 8(a0)
    sd s0, 16(a0)
    sd s1, 24(a0)
    sd s2, 32(a0)
    sd s3, 40(a0)
    sd s4, 48(a0)
    sd s5, 56(a0)
    sd s6, 64(a0)
    sd s7, 72(a0)
    sd s8, 80(a0)
    sd s9, 88(a0)
    sd s10, 96(a0)
    sd s11, 104(a0)

    ld ra, 0(a1)
    ld sp, 8(a1)
    ld s0, 16(a1)
    ld s1, 24(a1)
    ld s2, 32(a1)
    ld s3, 40(a1)
    ld s4, 48(a1)
    ld s5, 56(a1)
    ld s6, 64(a1)
    ld s7, 72(a1)
    ld s8, 80(a1)
    ld s9, 88(a1)
    ld s10, 96(a1)
    ld s11, 104(a1)

    ret    /* return to ra */

The first instruction stores the ra register at the address held in a0. a0 holds the first argument of thread_switch, which as shown earlier is the address of the current thread's context; a1 holds the second argument, the address of the context of the thread we are switching to. So the first half of the function saves the current CPU registers into the old thread's context, and the second half loads the new thread's saved registers into the CPU. The final ret then jumps to the address now in ra: for a newly created thread that is func (set in thread_create), and for a thread that previously called thread_switch, it is the point inside thread_schedule where that call was made.

Using threads

Key points: locks

Thought Process:

When running with two threads, keys get lost because both threads insert into or modify the same hash-table bucket at the same time, so one thread's write clobbers the other's and an insertion or update is silently dropped. The fix is to add locking to put(). The hint notes that concurrent put()s which touch disjoint memory need no lock to protect each other; since the table has NBUCKET (5) buckets, give each bucket its own mutex. put()s on different buckets then run in parallel, while put()s on the same bucket are serialized, and no keys are lost.

Steps & Code:

  1. Define a locks array with NBUCKET elements.
pthread_mutex_t locks[NBUCKET]; // one mutex per bucket
  1. Initialize the mutex locks in the main() function.
  if (argc < 2) {
    fprintf(stderr, "Usage: %s nthreads\n", argv[0]);
    exit(-1);
  }
  for (int i = 0; i < NBUCKET; i++)
  {
    pthread_mutex_init(&locks[i], NULL);
  }

  nthread = atoi(argv[1]);
  tha = malloc(sizeof(pthread_t) * nthread);

  1. Acquire the lock at the start of the put() function, and release the lock after completing the insertion or modification.
static
void put(int key, int value)
{
  int i = key % NBUCKET;
  pthread_mutex_lock(&locks[i]); // acquire lock
  // is the key already present?
  struct entry *e = 0;
  for (e = table[i]; e != 0; e = e->next) {
    if (e->key == key)
      break;
  }
  if(e){
    // update the existing key.
    e->value = value;
  } else {
    // the key is new.
    insert(key, value, &table[i], table[i]);
  }
  pthread_mutex_unlock(&locks[i]); // release lock
}

Barrier

Thought Process:

The goal of this exercise is to let threads enter the next loop iteration only after every thread has reached the barrier. Suppose there are N threads. Each of the first N-1 threads to arrive acquires the lock and then sleeps on the condition variable; pthread_cond_wait releases the mutex while sleeping, so later threads can acquire the lock and enter the barrier too. When the Nth (last) thread arrives and acquires the lock, the other N-1 threads are all asleep on the cond. The last thread advances the round, resets the arrival count, and wakes every sleeper with pthread_cond_broadcast. Because it still holds the lock at that point, no woken thread can leave the barrier early: each must re-acquire the mutex inside pthread_cond_wait first. The last thread then releases the lock, and all threads proceed to the next round.

Code:

static void
barrier()
{
  // YOUR CODE HERE
  // Block until all threads have called barrier() and
  // then increment bstate.round.
  // Acquire lock
  pthread_mutex_lock(&bstate.barrier_mutex); // acquire lock
  // Number of threads that have reached the barrier +1
  bstate.nthread++;
  if(bstate.nthread == nthread){
    // All threads have reached the barrier
    // Round number +1
    bstate.round++;
    // Reset bstate.nthread to 0 to start a new round
    bstate.nthread = 0;
    // Wake up all threads sleeping on cond
    pthread_cond_broadcast(&bstate.barrier_cond);

  } else {
    // Go to sleep on cond; pthread_cond_wait releases the mutex and
    // re-acquires it when the thread wakes up
    pthread_cond_wait(&bstate.barrier_cond, &bstate.barrier_mutex);
  }
  // Release lock
  pthread_mutex_unlock(&bstate.barrier_mutex);
}


This is a discussion topic separated from the original topic at https://juejin.cn/post/7368294440547745830