4. Multi-threading

Threads may be created locally or remotely. Remote thread creation executes a service on a given node, and the mapping of a thread onto a node must be explicit. All threads of a node cooperate via shared data and synchronization objects such as mutexes and semaphores. Threads placed on different nodes cooperate by exchanging messages.

There are several forms of thread creation in Athapascan-0. Threads can be created on the local node or on a remote node. Threads created locally share memory with their creator. Remotely created threads, which do not share memory with their creator, can receive a block of data at startup.

4.1. Local Threads

Threads created locally are called slave threads. They can be created by any existing thread, even another slave thread. When a thread is created, the function that it executes must be specified (the main thread runs the main() function). The function executed by a slave has a special prototype, but no service needs to be declared to start it.

a0tError slave1(void *arg)
{
    ... compute ...
}

  a0tThread ThrSlave;
  ...
  a0NewSlave(&ThrSlave, ..., slave1, arg);

A thread can create a slave when it needs to wait for some event while some computation can proceed at the same time. Creating a slave and later waiting for its termination to get its result is a limited form of parallel function call. However, if the two threads depend on each other, they must be synchronized explicitly.

A daemon thread can also be created when some background task must run while the other threads execute. A daemon thread is detached from its creator (a0DetachSlave()), and its termination cannot be waited for.

Any thread can terminate itself by returning from its main function or by calling a0ExitThread(). In the latter case, a result can be passed to the thread that waits for its termination. The result has type void *, so a small integer can be passed directly with a cast. To return larger structures, the data must not be in automatic (stack) storage, because that storage disappears as soon as the thread returns.

  /* slave function definition*/
  a0tError Func(void *argument)
  {
    /* malloc to return 10 floats in it. */
    float *MyResult = malloc(10 * sizeof(float));
    ...
    a0ExitThread(MyResult);
  }
  ...
  float Data[100];
  float *Result;   /* receives the pointer passed to a0ExitThread() */
  a0tThread Slave;
  ...
  a0NewSlave(&Slave, SchedulingRule, Priority, StackSize,
             Func, (void *)&Data);
  ...
  a0JoinSlave(&Slave, &Result);
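The same create, exit and join pattern can be sketched with plain POSIX threads, on which Athapascan-0 is built. This is only an illustration under stated assumptions: it does not use the a0 calls, and the names block_sums and run_block_sums are hypothetical. It shows why the result lives in heap storage: the pointer survives the thread, and the joining thread becomes responsible for freeing it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define N_INPUT  100
#define N_RESULT 10

/* Thread body: sums the input in blocks of 10 and returns the 10
   partial sums in heap storage, which outlives the thread's stack. */
static void *block_sums(void *argument)
{
    const float *data = argument;
    float *result = malloc(N_RESULT * sizeof *result);

    if (result == NULL)
        return NULL;
    for (int i = 0; i < N_RESULT; i++) {
        result[i] = 0.0f;
        for (int j = 0; j < N_INPUT / N_RESULT; j++)
            result[i] += data[i * (N_INPUT / N_RESULT) + j];
    }
    return result;   /* equivalent to pthread_exit(result) */
}

/* Creates the worker, waits for it, and hands its result to the
   caller, who must free() it. Returns NULL on failure. */
float *run_block_sums(const float *data)
{
    pthread_t worker;
    void *result = NULL;

    assert(data != NULL);
    if (pthread_create(&worker, NULL, block_sums, (void *)data) != 0)
        return NULL;
    if (pthread_join(worker, &result) != 0)
        return NULL;
    return result;
}
```

The caller passes an array of 100 floats to run_block_sums() and receives a pointer to 10 floats. Returning the address of a local array from block_sums() instead would leave the caller with a dangling pointer.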

4.2. Synchronization

Any thread may create other slave threads with the function a0NewSlave(), to help compute some values or to communicate with other threads. The synchronization operators that Athapascan-0 provides for local threads are mutexes, semaphores, and condition variables.

Mutexes are used to build critical sections in which at most one thread executes at a time. A semaphore can be seen as a bag of execution rights into which threads insert a ticket (V function) or from which they try to take a ticket (P function). If a thread does a P when there are no tickets in the bag, it waits until some other thread inserts one.
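The bag-of-tickets picture can be sketched with POSIX semaphores, where sem_post() plays the role of V and sem_wait() the role of P. This is an analogy rather than the Athapascan-0 semaphore interface, which is not shown in this section; the names bag, insert_tickets and take_all_tickets are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>

#define TICKETS 5

static sem_t bag;   /* the bag of tickets; it starts empty */

/* Inserts TICKETS tickets into the bag, one V operation per ticket. */
static void *insert_tickets(void *unused)
{
    (void)unused;
    for (int i = 0; i < TICKETS; i++)
        sem_post(&bag);
    return NULL;
}

/* Takes tickets with P operations, blocking whenever the bag is
   empty. Returns the number of tickets taken. */
int take_all_tickets(void)
{
    pthread_t producer;
    int taken = 0;
    int rc;

    rc = sem_init(&bag, 0, 0);   /* not shared between processes, 0 tickets */
    assert(rc == 0);
    rc = pthread_create(&producer, NULL, insert_tickets, NULL);
    assert(rc == 0);

    while (taken < TICKETS) {
        if (sem_wait(&bag) == 0)
            taken++;
        else if (errno != EINTR)   /* retry only if interrupted by a signal */
            break;
    }

    rc = pthread_join(producer, NULL);
    assert(rc == 0);
    sem_destroy(&bag);
    (void)rc;
    return taken;
}
```

Unlike a condition variable, the semaphore remembers tickets: it does not matter whether the producer posts before or after the consumer starts waiting.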

Condition variables are used in conjunction with mutexes to allow a thread to wait until an arbitrary condition has occurred. There are two basic operations on a condition variable: signalling it and waiting for it to be signalled. First, one or more threads wait on a condition variable. Then, when the condition variable is signalled, one or all of the waiting threads (as specified by the signaller) are allowed to proceed. A signal on a condition variable that has no waiting threads is not remembered: the next thread that waits on the condition variable will block until it is signalled again. In other words, condition variables are "stateless".
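Because a signal is not remembered, the condition itself is usually kept in an ordinary variable protected by the mutex, and the waiter re-checks it in a loop. The sketch below shows this idiom with POSIX condition variables; it is not the Athapascan-0 interface, and the names ready, waiter and signal_and_collect are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t lock       = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  ready_cond = PTHREAD_COND_INITIALIZER;
static bool ready = false;   /* the actual condition, protected by lock */
static int  value = 0;       /* data published together with the condition */

/* Waits until ready is true, then copies value to *out. */
static void *waiter(void *out)
{
    pthread_mutex_lock(&lock);
    while (!ready)   /* test the state: an earlier signal would be lost */
        pthread_cond_wait(&ready_cond, &lock);
    *(int *)out = value;
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Starts a waiter, publishes v, signals, and returns what the waiter saw. */
int signal_and_collect(int v)
{
    pthread_t thread;
    int seen = -1;
    int rc;

    pthread_mutex_lock(&lock);
    ready = false;
    pthread_mutex_unlock(&lock);

    rc = pthread_create(&thread, NULL, waiter, &seen);
    assert(rc == 0);

    pthread_mutex_lock(&lock);
    value = v;
    ready = true;
    pthread_cond_signal(&ready_cond);   /* may run before the waiter waits */
    pthread_mutex_unlock(&lock);

    rc = pthread_join(thread, NULL);
    assert(rc == 0);
    (void)rc;
    return seen;
}
```

If the signal is sent before the waiter reaches pthread_cond_wait(), the waiter still finds ready set and never blocks. Without the ready flag it would wait forever for a signal that has already passed.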

In the example below, a mutex is created while only one thread is running. Later on, several threads need to store values at the end of a table. Incrementing the table counter and storing the new element at the end of the table must be done atomically. The mutex guarantees this: it is locked before the table is modified and unlocked afterwards.

  a0tMutex CounterMutex;
  int Table[MAX];
  int Count = 0;
  ...
  /* only one thread running, create mutex */
  a0NewMutex(&CounterMutex);
  ...
  /* several threads run, lock and unlock mutex */
  a0LockMutex(&CounterMutex);
  Table[Count++] = NewValue;
  a0UnlockMutex(&CounterMutex);

4.3. Scheduling Rules and Priority

The processor allocation to ready threads is controlled by the scheduling rules inherited from POSIX. The scheduling rules are the following, in decreasing order of priority:

First-In-First-Out (FIFO): under this rule, one of the highest-priority threads runs until it completes or blocks (on a mutex, a semaphore, a communication, etc.). Threads may starve if a higher-priority thread runs continuously.
Round-robin: under this rule, the highest-priority threads run on a time-sharing basis when there are no FIFO threads to execute. There is no starvation within a priority level, but threads starve if a thread at a higher priority level or under a higher-priority rule runs continuously.
Other: this rule can be anything. Most of the POSIX threads libraries used to compile Athapascan-0 provide an "other" scheduling rule, under which the implementors are free to do whatever they want. In general, it means round-robin with no priorities, i.e., all threads run on a time-sharing basis.

The scheduling rule and priority of a thread are established at its creation time. They can be given as default values or be inherited from the creating thread. They can also be changed during the existence of the thread, by the thread itself or by any other thread.

Not every implementation of Athapascan-0 supports all scheduling rules and distinct priorities. However, all the rules and the full priority range provided by the underlying POSIX threads library are made available to the user. The POSIX standard requires implementors to provide at least the "other" scheduling rule, and so does Athapascan-0.
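Since the available rules and priority ranges depend on the underlying POSIX implementation, it can help to query them directly. The sketch below uses only standard POSIX calls, not Athapascan-0 ones; the helper names print_priority_range and make_sched_attr are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdio.h>

/* Prints the priority range offered for one scheduling policy.
   Returns 0 if the policy is available, -1 otherwise (POSIX reports
   an unsupported policy by returning -1 from these calls). */
int print_priority_range(const char *name, int policy)
{
    int lowest, highest;

    assert(name != NULL);
    lowest = sched_get_priority_min(policy);
    highest = sched_get_priority_max(policy);
    if (lowest == -1 || highest == -1) {
        printf("%-12s not available\n", name);
        return -1;
    }
    printf("%-12s priorities %d..%d\n", name, lowest, highest);
    return 0;
}

/* Fills attr so that a thread created with it requests the given
   policy at that policy's lowest priority instead of inheriting the
   creator's settings. Returns 0 or a POSIX error number; on success
   the caller must pthread_attr_destroy() the attribute object. */
int make_sched_attr(pthread_attr_t *attr, int policy)
{
    struct sched_param param = {0};
    int rc = pthread_attr_init(attr);

    if (rc != 0)
        return rc;
    param.sched_priority = sched_get_priority_min(policy);
    rc = pthread_attr_setinheritsched(attr, PTHREAD_EXPLICIT_SCHED);
    if (rc == 0)
        rc = pthread_attr_setschedpolicy(attr, policy);
    if (rc == 0)
        rc = pthread_attr_setschedparam(attr, &param);
    if (rc != 0)
        pthread_attr_destroy(attr);
    return rc;
}
```

Calling print_priority_range() for SCHED_FIFO, SCHED_RR and SCHED_OTHER shows what a given system offers. Preparing the attributes normally succeeds, but on many systems actually creating a FIFO or round-robin thread requires special privileges.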

4.4. Remote Threads

Before a thread can be executed remotely, a service must be declared with the function a0NewService(). The service identifier is a number that denotes a function, whether it is started locally or remotely. All service declarations must be made in the initialization phase, in order to guarantee that they are executed in a deterministic order. The declaration of a new service must give a scheduling rule, a priority and a stack size. Section 4.3 explains how scheduling works.

  int Service;

  /* service function definition */
  a0tError ServiceFunction(a0tBuffer *Input)
  {
    a0Unpack(Input, ... );
    ...
  }
  ...
  a0Init(&argc, &argv);
  a0NewService(&Service, ServiceFunction,
               SchedulingRule, Priority, StackSize);
  a0InitCommit();
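One way to picture a service identifier is as an index into a table of functions that is filled during initialization and then frozen. The sketch below is a single-process model of that idea, not the Athapascan-0 implementation: every name in it (register_service, commit_services, call_service and the two sample handlers) is hypothetical. It suggests why a fixed declaration order matters: if each node fills its table in the same order, the same number denotes the same function everywhere.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SERVICES 16

typedef int (*service_fn)(const void *input);   /* hypothetical handler type */

static service_fn table[MAX_SERVICES];
static int n_services = 0;
static int committed = 0;

/* Registers a handler and returns its identifier, or -1 if the table
   is full or initialization has already been committed. */
int register_service(service_fn fn)
{
    if (committed || fn == NULL || n_services == MAX_SERVICES)
        return -1;
    table[n_services] = fn;
    return n_services++;
}

/* Ends the initialization phase: no further services may be declared. */
void commit_services(void)
{
    committed = 1;
}

/* Runs the handler denoted by id and stores its result in *output.
   Returns 0 on success, -1 for an unknown id or an uncommitted table. */
int call_service(int id, const void *input, int *output)
{
    assert(output != NULL);
    if (!committed || id < 0 || id >= n_services)
        return -1;
    *output = table[id](input);
    return 0;
}

/* Two sample handlers that read one int from their input. */
int service_double(const void *input) { return 2 * *(const int *)input; }
int service_negate(const void *input) { return -*(const int *)input; }
```

In this model, a node that registered service_negate before service_double would run the wrong function for a given identifier, which is the kind of mismatch a deterministic declaration order avoids.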

To create a thread on a remote node, the function a0StartRemoteThread() must be used, giving a service number, a scheduling rule, a priority and a stack size. The remotely started thread executes the procedure declared under the given service number, with the given scheduling rule, priority and stack size. To use the scheduling rule, priority and stack size specified in the service declaration instead, pass A0ServiceScheduling, A0ServicePriority and A0ServiceStack respectively.

  a0tBuffer Buffer;
  a0tRequest Request;
  ...
  a0Pack(&Buffer, ... );
  a0StartRemoteThread(Node, Service, 
                       SchedulingRule, Priority, StackSize,
                       &Request, &Buffer);

Active messages are also implemented, as a slightly modified form of remote service call named urgent services. An urgent service handler is declared with a0NewService() and activated with a0StartRemoteUrgent(). Instead of creating a new thread on the destination node, the procedure is executed directly by the urgent daemon thread of Athapascan-0. The scheduling, priority and stack information passed to a0NewService() is ignored, since no new thread is started. This allows the program to perform a quick operation remotely, but the service must not call any blocking function.