From: Calin A. <cal...@gm...> - 2013-04-18 20:49:06
Hi,

I'll get the article in the next few days. However, right now I don't have a
working OpenMP build; for a fair comparison I should run both on the same
machine, and I'll only have time for that after I release my app (maybe one
month from now?). In any case, the two options (OMP and PT) co-exist nicely
in my code.

What I did:

1. I don't save Rhs separately and copy it all at the end; I just protect
   the critical part with a mutex.
2. The worker threads compete for data (fully dynamic work allocation):
   when a thread finishes an instance, it grabs the next available one.
   The main thread just waits for them.
3. The worker threads are created once and live forever, waiting for work.
4. The changes are smaller than the ones for OpenMP (fewer files touched),
   but there is an extra file with utilities.
5. It would look much nicer with pthread_barrier_wait(), but apparently
   that is optional and does not exist on all systems.

My numbers, on 4 cores (slow machine) running a BSIM4 4-bit adder, no
bypass option:

  threads               | cpu seconds | elapsed seconds
  0 (no MP compiled in) |     318     |      370
  1                     |     317     |      376
  2                     |     356     |      248
  3                     |     360     |      191
  4                     |     370     |      166
  6                     |     383     |      166
  8                     |     396     |      166

Conclusions:

1. Almost no penalty for having it compiled in, thanks to the separate loop
   for the 1-thread case. Without that trick the numbers are 334/400 for one
   thread. The small penalty that remains comes from the mutex calls at the
   end of b3ld.c (I didn't "if()" those out, only #ifdef them).
2. For threads > cores the cpu time increases with no gain in elapsed time.
   Quite predictable. It would be interesting to test on a machine with more
   cpus; the circuit probably has to be quite big (many BSIM instances) to
   see improvement above 4 threads even with more cores.
3. More testing is needed (all this is in beta state); maybe some regression
   tests to make sure I'm not doing something very wrong.

Below is the bulk of the code. BTW, "good = BSIM3LoadOMP(here, ckt);" is NOK
in the OMP code: it overwrites the error flag, so only the error from the
last instance is returned.
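Regarding point 5: a barrier is easy to build portably from a mutex and a
condition variable, which do exist everywhere. A minimal sketch (the
my_barrier_* names are made up for illustration, not part of ngspice or
POSIX):

```c
#include <pthread.h>

typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    int      count;   /* threads still expected in the current cycle */
    int      total;   /* threads per cycle */
    unsigned cycle;   /* generation counter, lets the barrier be reused */
} my_barrier_t;

int my_barrier_init(my_barrier_t *b, int nthreads)
{
    b->count = b->total = nthreads;
    b->cycle = 0;
    pthread_mutex_init(&b->mutex, NULL);
    pthread_cond_init(&b->cond, NULL);
    return 0;
}

/* Returns 1 to exactly one caller (the last to arrive), 0 to the others,
 * mimicking PTHREAD_BARRIER_SERIAL_THREAD. */
int my_barrier_wait(my_barrier_t *b)
{
    pthread_mutex_lock(&b->mutex);
    unsigned cycle = b->cycle;
    if (--b->count == 0) {            /* last thread: release everybody */
        b->cycle++;
        b->count = b->total;
        pthread_cond_broadcast(&b->cond);
        pthread_mutex_unlock(&b->mutex);
        return 1;
    }
    while (cycle == b->cycle)         /* loop guards against spurious wakeups */
        pthread_cond_wait(&b->cond, &b->mutex);
    pthread_mutex_unlock(&b->mutex);
    return 0;
}
```

The generation counter is what makes reuse safe: a thread that arrives for
the next cycle cannot be confused with one still leaving the previous cycle.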
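About the BTW: one way to avoid the overwritten error flag is to combine the
per-instance codes instead of assigning them; with OpenMP a max reduction
does this without a critical section. A sketch only (load_all/load_one/
fake_load are hypothetical names, not the ngspice code, and it assumes error
codes are positive ints):

```c
/* Keep one nonzero error code from the parallel loop instead of letting
 * the last iteration overwrite it.  With OpenMP disabled the pragma is
 * ignored and the loop runs serially with the same result. */
int load_all(int (*load_one)(int idx), int n)
{
    int error = 0;
    int idx;

#pragma omp parallel for reduction(max:error)
    for (idx = 0; idx < n; idx++) {
        int err = load_one(idx);
        if (err > error)
            error = err;   /* per-thread max, combined by the reduction */
    }
    return error;
}

/* Hypothetical load function for demonstration: index 2 fails with code 7. */
int fake_load(int idx)
{
    return (idx == 2) ? 7 : 0;
}
```

If error codes are not ordered, the same effect can be had by writing the
flag under a short critical section instead of a reduction.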
Best regards,
Calin

***** Beginning of b3ld.c *****

#ifdef USE_OMP
int BSIM3LoadOMP(BSIM3instance *here, CKTcircuit *ckt);
void BSIM3LoadRhsMat(GENmodel *inModel, CKTcircuit *ckt);
#endif

#ifdef USE_PTHREAD
#include "../PThreads.h"
void *BSIM3getInstPT();
int BSIM3loadPT(BSIM3instance *here, CKTcircuit *ckt);
#endif

int BSIM3load(GENmodel *inModel, CKTcircuit *ckt)
{
#if defined(USE_OMP) || defined(USE_PTHREAD)
#ifdef USE_OMP
    int idx;
    BSIM3model *model = (BSIM3model*)inModel;
    int good = 0;
    BSIM3instance *here;
    BSIM3instance **InstArray;
    InstArray = model->BSIM3InstanceArray;

#pragma omp parallel for private(here)
    for (idx = 0; idx < model->BSIM3InstCount; idx++) {
        here = InstArray[idx];
        good = BSIM3LoadOMP(here, ckt);
    }

    BSIM3LoadRhsMat(inModel, ckt);
    return good;
}

int BSIM3LoadOMP(BSIM3instance *here, CKTcircuit *ckt)
{
    BSIM3model *model;
#endif

#ifdef USE_PTHREAD
    // Initialize PThere; PTmodel is initialized to inModel in PTrun
    PThere = ((BSIM3model *) inModel)->BSIM3instances;
    return PTrun(inModel, ckt, (void *(*)()) &BSIM3getInstPT, &BSIM3loadPT);
}

// Returns current instance (or first non-null) and advances pointers.
void *BSIM3getInstPT()
{
    void *here;

    if (PTmodel == NULL)
        return NULL;                // We're at the end of the list
    do {
        here = PThere;
        if (PThere != NULL)
            PThere = ((BSIM3instance *) PThere)->BSIM3nextInstance;
        while (PThere == NULL) {    // WHILE not IF, to catch also models with no instances
            PTmodel = ((BSIM3model *) PTmodel)->BSIM3nextModel;
            if (PTmodel == NULL)
                return here;        // This is NULL or next will be NULL
            PThere = ((BSIM3model *) PTmodel)->BSIM3instances;
        }
    } while (here == NULL);         // Also to catch models with no instances
    return here;
}

// Original load function
int BSIM3loadPT(BSIM3instance *here, CKTcircuit *ckt)
{
    BSIM3model *model;
#endif
#else
    BSIM3model *model = (BSIM3model*)inModel;
    BSIM3instance *here;
#endif

****** PThreads.c *********

/* PThreads
 *
 * Functions for multi-threading using pthread library
 */

#include "ngspice/config.h"

#ifdef USE_PTHREAD

#include "ngspice/iferrmsg.h"
#include <pthread.h>

extern int nthreads;

#define MAX_PTHREADS 8
//#define PT_DEBUG 1

void *PTworker(void *p);

void *(*PTgetInst)();
int (*PTload)(void *here, void *ckt);

pthread_t PTid[MAX_PTHREADS];
int PTindex[MAX_PTHREADS];
int PTnumber = 0;

pthread_mutex_t PTmutexNext = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t PTmutexData = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t PTmutexStart = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t PTcondStart = PTHREAD_COND_INITIALIZER;
int PTstart[MAX_PTHREADS];
pthread_mutex_t PTmutexDone[MAX_PTHREADS];
pthread_cond_t PTcondDone[MAX_PTHREADS];
int PTdone[MAX_PTHREADS];

#ifdef PT_DEBUG
int PTdebugInst;
char PTdebugThr[1000];
#endif

void *PTckt;
void *PTmodel = NULL;
void *PThere = NULL;
int PTerror;

// Main thread
int PTrun(void *model, void *ckt, void *(*getInst)(), int (*load)())
{
    int i;

    PTerror = 0;
    PTckt = ckt;
    PTmodel = model;
    PTgetInst = getInst;
    PTload = load;

    if (nthreads == 1) {            // No multi-threading
        void *here;
        while ((here = (*PTgetInst)()) != NULL) {
            int err = (*PTload)(here, PTckt);       // Actual work
            if (err)
                PTerror = err;
        }
        return PTerror;
    }

#ifdef PT_DEBUG
    PTdebugInst = 0;
#endif

    pthread_mutex_lock(&PTmutexStart);              // Initialize the list
    if (PTnumber == 0) {                            // No threads, create them
        PTnumber = nthreads;
        if (PTnumber < 1) PTnumber = 1;
        if (PTnumber > MAX_PTHREADS) PTnumber = MAX_PTHREADS;
        for (i=0; i<PTnumber; i++) {
            pthread_mutex_init(&PTmutexDone[i], NULL);
            pthread_cond_init(&PTcondDone[i], NULL);
            PTdone[i] = 0;
            PTindex[i] = i;
            if (pthread_create(&PTid[i], NULL, PTworker, &PTindex[i]))
                PTerror = E_PANIC;
        }
        if (PTerror)
            return PTerror;
    }
    for (i=0; i<PTnumber; i++) {                    // Start flags
        PTstart[i] = 1;
    }
    pthread_cond_broadcast(&PTcondStart);           // List ready to start
    pthread_mutex_unlock(&PTmutexStart);

    for (i=0; i<PTnumber; i++) {
        pthread_mutex_lock(&PTmutexDone[i]);
        while (!PTdone[i]) {
            pthread_cond_wait(&PTcondDone[i], &PTmutexDone[i]);  // Wait for the threads to finish
        }
        PTdone[i] = 0;
        pthread_mutex_unlock(&PTmutexDone[i]);
    }

#ifdef PT_DEBUG
    PTdebugThr[PTdebugInst] = 0;
    LOGD(PTdebugThr);
#endif

    return PTerror;
}

// Worker thread
void *PTworker(void *ixp)
{
    int index = *(int *)ixp;
    void *here;

    while (1) {
        pthread_mutex_lock(&PTmutexStart);
        while (!PTstart[index]) {
            pthread_cond_wait(&PTcondStart, &PTmutexStart);  // Wait for green light
        }
        PTstart[index] = 0;
        pthread_mutex_unlock(&PTmutexStart);

        while (1) {
            pthread_mutex_lock(&PTmutexNext);       // Get another instance
            here = (*PTgetInst)();
            pthread_mutex_unlock(&PTmutexNext);
            if (here == NULL)
                break;
#ifdef PT_DEBUG
            PTdebugThr[PTdebugInst++] = '0' + index;
#endif
            int err = (*PTload)(here, PTckt);       // Actual work
            if (err)
                PTerror = err;
        }

        pthread_mutex_lock(&PTmutexDone[index]);    // Flag done
        PTdone[index] = 1;
        pthread_cond_signal(&PTcondDone[index]);
        pthread_mutex_unlock(&PTmutexDone[index]);
    }
}

#endif

-----Original Message-----
From: Dietmar Warning [mailto:die...@ar...]
Sent: Thursday, 18 April, 2013 21:15
To: Ngspice developers mailing list.
Subject: Re: [Ngspice-devel] Multithreading with pthread

Hi,

only for information: there is a paper "On performance enhancement of
circuit simulation using multithreaded techniques" from Perng/Weng/Li.
Calin, can you agree with these results?

How large is the change in model code?

BR
Dietmar

Am 18.04.2013 19:59, schrieb Francesco Lannutti:
> I think we are very interested in this, but prior to move the existing
> implementation from OpenMP to Pthreads, you should measure the improvement
> between OpenMP and Pthread implementations.
> Since OpenMP is a PRAGMA style parallelization, the Pthread one should be
> better, but I don't know how better it is :) .
>
> Thank you,
> Fra
>
> Il giorno 18/apr/2013, alle ore 16:29, Calin Andrian <cal...@gm...> ha scritto:
>
>> Hi,
>>
>> I am working on a design suite that will use ngspice as the simulation engine.
>> Since the target platform has no OpenMP, I solved multi-threading with
>> pthread. Is there interest to move this into the public code?
>>
>> The same models are benefiting (BSIM3, BSIM4, BSIMSOI). I ran experiments
>> on others too, but there is no gain...
>> Results: 4-core 4-thread time is 2.2 times faster than 1 thread.
>>
>> Best regards,
>> Calin Andrian
>>
>> ------------------------------------------------------------------------------
>> Precog is a next-generation analytics platform capable of
>> advanced analytics on semi-structured data. The platform includes
>> APIs for building apps and a phenomenal toolset for data science.
>> Developers can use our toolset for easy data analysis &
>> visualization. Get a free account!
>> http://www2.precog.com/precogplatform/slashdotnewsletter
>> _______________________________________________
>> Ngspice-devel mailing list
>> Ngs...@li...
>> https://lists.sourceforge.net/lists/listinfo/ngspice-devel