From: Brian D H. <bdh...@c4...> - 2001-07-18 06:35:25
|
Chuck, I guess my thinking is that we don't want to filter too tightly based on the current spec. As you report, there are a lot of technically non-compliant packets out there. Since aprsd is serving in a transport function of the network it shouldn't (IMHO) be the primary protocol policing method. If we filter too tightly on the spec I think we limit the experimentation that can be done with the network and cause a lot of extra work as the spec evolves.

I think we've done a good job of filtering in the current code. We make sure that the packet looks/smells like the proper general format, that it isn't long enough to overflow the buffers of a program without strenuous buffer checking, and we've made sure that the ax25Source call appears reasonable. If we go much further I think we create a continuing mess as we try to keep up with changes and new applications.

Would it be reasonable (and/or a "good idea") to operate under the concept that it's up to the software generating a packet to ensure it is compliant with the spec, up to the software parser on the client end to decide what it can deal with, and up to the transport implementation (aprsd) to filter out garbage and do very basic sanity checking? I'm thinking along the lines of TCP/IP transport server implementations (INN and Sendmail come to mind). As long as the protocol looks right at the outer layers they don't sniff into the payload to see if it's clean.

I've got an image in my head of someone coming up with the next neat'o - super cool - whiter whites - brighter brights implementation. Unfortunately in some cases it can be non-spec compliant. The mob with torches ends up at our doors (virtual of course) demanding a new aprsd release to allow the new implementation, and they want it yesterday. Of course they don't want to tweak the code themselves. <g>

These are just my thoughts. Whatever the group decides I'm happy to help implement. FWIW - I finally got my offer letter from the company in DFW. 
I'll be moving this weekend so I may be off-line for a few days. 73/N5VFF -- ============================================================ Brian D Heaton | I fear that we have awakened Principal Consultant | a sleeping giant and instilled C4I2.com System Consultants | in him a terrible resolve. bdh...@c4... | -- Admiral Isoruku Yamamoto USA (719) 623-0381 | -- Imperial Japanese Navy UK +44 (0)845 127-5400 | -- December 7, 1941

On 2001.07.17 12:25 Chuck Byam wrote: SNIP - SNIP - SNIP
> Mmmk, I've been working in this part of the code over the past couple of
> days. What I need discussion on is how strict should we make our
> checks. For example, what I've done with message packets is tested for a
> length of 69 bytes, checked for illegal chars (|~{), preserved the id ({xxx),
> and truncated the rest of the message. I've done similar chops in other
> areas as well, eg, position reports. What I'm finding is a lot of packets
> are being truncated. I can see where some folks may get upset about
> this... but on the other hand, it's spec.
>
> Chuck
>
> _______________________________________________
> Aprsd-devel mailing list
> Apr...@li...
> http://lists.sourceforge.net/lists/listinfo/aprsd-devel
|
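Brian's "very basic sanity checking" layer — check the general shape, bound the length, sanity-check the source call, and leave the payload alone — might look roughly like the sketch below. This is an illustrative stand-in, not aprsd's actual filter: the function name, the 256-byte cap, and the callsign character rule are all assumptions.

```cpp
#include <cctype>
#include <string>

// Hypothetical transport-layer sanity check in the spirit of the mail above.
// It verifies SOURCE>PATH:payload shape, bounds the length, and checks that
// the source call uses plausible characters -- the payload is not inspected.
bool looksLikeAprsPacket(const std::string& pkt) {
    const std::size_t kMaxLen = 256;                 // assumed buffer bound
    if (pkt.empty() || pkt.size() > kMaxLen)
        return false;
    std::string::size_type gt = pkt.find('>');
    if (gt == std::string::npos || gt == 0)
        return false;                                // no source call
    if (pkt.find(':', gt) == std::string::npos)
        return false;                                // no payload separator
    for (std::string::size_type i = 0; i < gt; ++i) {
        unsigned char c = pkt[i];
        if (!std::isalnum(c) && c != '-')            // call plus optional -SSID
            return false;
    }
    return true;
}
```

Anything failing such a check would be dropped as garbage; everything else passes through untouched, leaving spec enforcement to the generating and parsing software as proposed.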
From: Hamish M. <ha...@cl...> - 2001-07-18 04:19:04
|
On Mon, Jul 16, 2001 at 10:31:18PM -0600, Brian D Heaton wrote: > We will ensure that we only delete the "*". I'm testing this now, > but wanted to make sure I understand the implementation. Can you give me an example of a packet which will be discarded by this check? I'm still on holiday in NZ, and not really thinking about APRS at all :-) Hamish -- Hamish Moffatt VK3SB <ha...@de...> <ha...@cl...> |
From: Brian D H. <bdh...@c4...> - 2001-07-18 04:14:39
|
Chuck, This email is even longer... <g> On 2001.07.17 13:30 Chuck Byam wrote:
> I've seen this happen a couple of times and the outcome is always the same.
> In other words I think I've seen the symptom of our problem. Please excuse
> the long post:
>
> Server Up Time = 1.1 hours
> Total TNC packets = 0
> TNC stream rate = 0 bytes/sec
> Msgs gated to RF = 0
> Connect count = 129
> Users = 71
> Peak Users = 78
> APRS Stream rate = 1.1 Kbps
> Server load = 31.2 Kbps
> History Items = 3348
> TAprsString Objs = 3348
> Items in InetQ = 0
> InetQ overflows = 0
> TncQ overflows = 0
> conQ overflows = 0
> charQ overflow = 0
> Hist. dump aborts = 02
>
> ....
>
> Session overrun (w5ks)
> Session overrun (KB2QHA-2)
> SNIP - SNIP - SNIP
> ... This goes on for nearly all connections
>
> ... lots of session throttles
>
> ... more overruns and disconnects

I don't suppose you've got any way of looking at traffic on the campus backbone you are connected to? MRTG might be interesting, but the default 5-minute averaging might skew things a bit. Alternatively it might be interesting to keep continuous pings (say 5 secs apart) running to each of the IGATEs that you create outbound connections to. The goal would be to discover whether it's a problem on the box (possibly kernel or TCP/IP stack related), in the network between the hosts, or on the distant host.

Since it appears to happen to all hosts at once I think we can rule out the distant host. Also since it affects all hosts simultaneously we can likely rule out Internet difficulties beyond the interface router of the provider which supplies the IP bandwidth to the university. If the university is multi-homed then we can step back even further into the network. In general since it's affecting all hosts simultaneously I would start looking towards first.aprs.net from where the network becomes highly redundant and/or multi-homed.

FOR THE NETWORK CASE: I don't recall if the backbone you are connected to is L2 switched. 
If not (or at least if there are a decent number of hosts sharing your segment) then if you've got another host available to run EtherApe then it might be interesting. Even more interesting would be some sniffer traces of the network activity at the time of the event.

FOR THE HOST (first.aprs.net) CASE: Does the Ethernet interface have anything interesting in its stats? I'm primarily thinking blocked packets and/or overruns. I'm wondering if we may have the same type of situation as Dale found with setting the socket to non-blocking, but manifesting itself on the primary stream connections in this case. Beyond what you can get from netstat and ifconfig I think ntop would have the most interesting output for looking at this.

Before we go bonkers running through the code I think we should eliminate the network as a possible cause. I've got some more notes below on the queue overflows as I've seen this in my stress testing.

> Server Up Time = 1.2 hours
> Total TNC packets = 0
> TNC stream rate = 0 bytes/sec
> Msgs gated to RF = 0
> Connect count = 158
> Users = 21
> Peak Users = 78
> APRS Stream rate = 1.7 Kbps
> Server load = 0.0 Bps
> History Items = 3156
> TAprsString Objs = 4181
> Items in InetQ = 1024
> InetQ overflows = 1797
> TncQ overflows = 0
> conQ overflows = 0
> charQ overflow = 0
> Hist. dump aborts = 0
>
> Now note the server load (0) and connections (this happens to be the number of
> igates + 1)

Even more interesting (to me at least) here is that the difference between the History Items (3156) and the TAprsString Objs (4181) is exactly the size of the InetQ. I've created the same situation on my test box and once they start diverging you can watch the InetQ fill up slowly and the History/TAprsString counts diverge at exactly the same rate. Since items are both pulled off the queue and added to the history list in the "DeQueue" thread that makes me think it's a likely place to look for suspects. 
As I read it the flow looks something like this:

1 - Loop awaiting sendqueue.ready
2 - Pop an item off the sendqueue
3 - dupcheck the item
4 - Test to see if the item should go in the history list
5 - If it should, place the item in the history list
6 - Send it out via SendToAllClients

I'm guessing that either the DeQueue thread is dying (need to figure out which it is and check for the pid in this scenario); the thread is deadlocking (possibly on a mutex lock); the socket is overflowing; or we are getting a non-reentrant case from another thread calling one of the involved functions.

Functions noted are:
SendToAllClients - Only called in the DeQueue thread
sendQueue.ready - Only called in the DeQueue thread
sendQueue.read - Only called in the DeQueue thread
AddHistoryItem - Only called in the DeQueue thread
dupFilter.check - Called in DeQueue and DeQueueTNC

Interesting, I just took a core dump on my test box and it looks like it was unable to unlock pmtxHistory at the "getPositAndUpdate" tag. I guess the short version is that I'm suspicious of both the history routines (especially how the pmtxHistory lock is handled) and the non-blocking socket.

> ...
> 24.23.210.235 has connected to port 23
> 24.23.210.235 has connected to port 23
> 199.227.86.221 has connected to port 23
> 24.23.210.235 has connected to port 23
> 24.177.214.61 has connected to port 23
> 206.159.119.88 has connected to port 10151
> 24.23.210.235 has connected to port 23
> 24.23.210.235 has connected to port 23
> 24.177.214.61 has connected to port 23
> 24.23.210.235 has connected to port 23
> ...
>
> This continues until the maxclient limit is reached and I start getting the
> "error creating new client thread"
>
> Note the multiple connects from the same host.
>
> Now it's off to see why this is happening...

Are the multiple connects from a subset of the total host table at the time of the event? I would be curious to figure out if they might all be the same type of IGATE/Client software. 
Are you still on the 2.4.2SMP kernel you started with? I'd be curious if there is any change under a newer release. Also, I don't recall if you ever told me what Ethernet board you were running. There has been some traffic on LKML lately about some problems with SMP and specific Ethernet boards. Probably enough babble. We now return you to your regularly scheduled head scratching and staring at code.. <g> 73/N5VFF -- ============================================================ Brian D Heaton | I fear that we have awakened Principal Consultant | a sleeping giant and instilled C4I2.com System Consultants | in him a terrible resolve. bdh...@c4... | -- Admiral Isoruku Yamamoto USA (719) 623-0381 | -- Imperial Japanese Navy UK +44 (0)845 127-5400 | -- December 7, 1941 |
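The six-step DeQueue flow Brian sketches above can be modeled single-threaded to make the data flow visible. Everything here is a simplified stand-in (the dup filter is a plain set, the history rule is an invented substring test), not aprsd's code:

```cpp
#include <deque>
#include <set>
#include <string>
#include <vector>

// Toy model of the DeQueue loop: pop, dup-check, maybe add to history, send.
// In aprsd the real loop blocks on sendQueue.ready and everything is guarded
// by mutexes; both are omitted here to keep the flow itself visible.
struct DeQueueModel {
    std::deque<std::string>  sendQueue;   // steps 1/2: pending items
    std::set<std::string>    dupFilter;   // step 3: stand-in for dupFilter.check
    std::vector<std::string> history;     // steps 4/5: retained items
    int sentCount = 0;                    // step 6: stand-in for SendToAllClients

    void pump() {
        while (!sendQueue.empty()) {
            std::string item = sendQueue.front();
            sendQueue.pop_front();
            if (!dupFilter.insert(item).second)
                continue;                              // duplicate: dropped
            if (item.find("posit") != std::string::npos)
                history.push_back(item);               // assumed history rule
            ++sentCount;
        }
    }
};
```

Because steps 2 through 6 all happen in this one loop, a stall anywhere in it would let the input queue fill while the history count stops growing — consistent with the diverging counters in Chuck's stats.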
From: Chuck B. <cb...@vi...> - 2001-07-17 19:29:55
|
I've seen this happen a couple of times and the outcome is always the same. In other words I think I've seen the symptom of our problem. Please excuse the long post:

Server Up Time = 1.1 hours
Total TNC packets = 0
TNC stream rate = 0 bytes/sec
Msgs gated to RF = 0
Connect count = 129
Users = 71
Peak Users = 78
APRS Stream rate = 1.1 Kbps
Server load = 31.2 Kbps
History Items = 3348
TAprsString Objs = 3348
Items in InetQ = 0
InetQ overflows = 0
TncQ overflows = 0
conQ overflows = 0
charQ overflow = 0
Hist. dump aborts = 02

....

Session overrun (W0IBM)
Session overrun (W0IBM)
Session overrun (W0IBM)
Session overrun (w5ks)
Session overrun (W0IBM)
Session overrun (LC3VAT)
Session overrun (w5ks)
Session overrun (W8MSU-10)
Session overrun (w9da)
Session overrun (VE3ZRD)
Session overrun (W0IBM)
Session overrun (N2UTH)
Session overrun (ON4AWV-12)
Session overrun (LC3VAT)
Session overrun (w5ks)
Session overrun (KB2QHA-2)
Session overrun (W8MSU-10)
Session overrun (KF3DY-2)
Session overrun (w9da)
Session overrun (VE3ZRD)
Session overrun (W0IBM)
Session overrun (N2UTH)
Session overrun (ON4AWV-12)
Session overrun (LC3VAT)
Session overrun (w5ks)
Session overrun (KB2QHA-2)

... This goes on for nearly all connections
... lots of session throttles
... more overruns and disconnects

Server Up Time = 1.2 hours
Total TNC packets = 0
TNC stream rate = 0 bytes/sec
Msgs gated to RF = 0
Connect count = 158
Users = 21
Peak Users = 78
APRS Stream rate = 1.7 Kbps
Server load = 0.0 Bps
History Items = 3156
TAprsString Objs = 4181
Items in InetQ = 1024
InetQ overflows = 1797
TncQ overflows = 0
conQ overflows = 0
charQ overflow = 0
Hist. dump aborts = 0

Now note the server load (0) and connections (this happens to be the number of igates + 1) ... 
24.23.210.235 has connected to port 23
24.23.210.235 has connected to port 23
199.227.86.221 has connected to port 23
24.23.210.235 has connected to port 23
24.177.214.61 has connected to port 23
206.159.119.88 has connected to port 10151
24.23.210.235 has connected to port 23
24.23.210.235 has connected to port 23
24.177.214.61 has connected to port 23
24.23.210.235 has connected to port 23
...

This continues until the maxclient limit is reached and I start getting the "error creating new client thread"

Note the multiple connects from the same host.

Now it's off to see why this is happening... Chuck |
From: Chuck B. <cb...@vi...> - 2001-07-17 18:28:03
|
On Tuesday 17 July 2001 00:31, Brian D Heaton wrote:
:: As currently implemented aprsd doesn't check or comply with
:: "NOGATE" in the ax25Path. Based on the current thread on aprssig is
:: this something we want to do? I think we could implement it (with
:: the other filtering code in aprsString.cpp):
:: --------------
:: if (ax25Path.find("NOGATE") != npos) {
::     aprsType = APRSERROR;
::     return;
:: }
:: --------------

That's doable. We could add it as an option in the conf.

:: A second concern is the current way we are stripping "*" from the
:: ax25Source field of packets. At present it appears to be erasing the
:: full ax25Source of the packet and thus the other filtering code is
:: marking it as an error packet. I've got badpacket logging turned on and
:: I see everything with a "*" in the ax25Source field (mostly digi ID's)
:: being dropped. Thus we are dropping any packets directly heard by the
:: IGATE.
::
:: As currently implemented it looks like this:
::
:: -------------------
:: if (int nfind = ax25Source.find_first_of('*') <= ax25Source.length()) {
::     //cerr << "Found * in source at position: " << nfind << endl;
::     ax25Source.erase(nfind);
:: }
:: -------------------
::
:: I think if we change to:
::
:: --------------------
:: if (int nfind = ax25Source.find_first_of('*') <= ax25Source.length()) {
::     //cerr << "Found * in source at position: " << nfind << endl;
::     ax25Source.erase(nfind,1);
:: }
:: --------------------
::
:: We will ensure that we only delete the "*". I'm testing this now,
:: but wanted to make sure I understand the implementation.

Mmmk, I've been working in this part of the code over the past couple of days. What I need discussion on is how strict should we make our checks. For example, what I've done with message packets is tested for a length of 69 bytes, checked for illegal chars (|~{), preserved the id ({xxx), and truncated the rest of the message. I've done similar chops in other areas as well, eg, position reports. 
What I'm finding is a lot of packets are being truncated. I can see where some folks may get upset about this... but on the other hand, it's spec. Chuck |
From: Brian D H. <bdh...@c4...> - 2001-07-17 05:05:41
|
Actually my first thought on cleaning the "*"s or "}"s from the ax25Source didn't work either. The following does:

--------------
if (ax25Source.find_first_of("*") <= ax25Source.length()) {
    int nfind = ax25Source.find_first_of("*");
    //cerr << "Found * in source at position: " << nfind << endl;
    ax25Source.erase(nfind,1);
}
--------------

Something didn't like the initial "int nfind" portion of the IF conditional. Every time it matched, it was returning a position of "1" for either "*" or "}". I've got this running on the test machine now and it's simply deleting the character without any other adverse effects. I'll run some more and if it looks clean I'll commit it. 73/N5VFF -- ============================================================ Brian D Heaton | I fear that we have awakened Principal Consultant | a sleeping giant and instilled C4I2.com System Consultants | in him a terrible resolve. bdh...@c4... | -- Admiral Isoruku Yamamoto USA (719) 623-0381 | -- Imperial Japanese Navy UK +44 (0)845 127-5400 | -- December 7, 1941

On 2001.07.16 22:31 Brian D Heaton wrote:
> A second concern is the current way we are stripping "*" from the
> ax25Source field of packets. At present it appears to be erasing the full
> ax25Source of the packet and thus the other filtering code is marking it as
> an error packet. I've got badpacket logging turned on and I see everything
> with a "*" in the ax25Source field (mostly digi ID's) being dropped. Thus
> we are dropping any packets directly heard by the IGATE.
---- SNIP SNIP------
> I think if we change to:
>
> --------------------
> if (int nfind = ax25Source.find_first_of('*') <= ax25Source.length()) {
>     //cerr << "Found * in source at position: " << nfind << endl;
>     ax25Source.erase(nfind,1);
> }
> --------------------
>
> We will ensure that we only delete the "*". I'm testing this now,
> but wanted to make sure I understand the implementation. 
> > 73/N5VFF > > > > -- > ============================================================ > Brian D Heaton | I fear that we have awakened > Principal Consultant | a sleeping giant and instilled > C4I2.com System Consultants | in him a terrible resolve. > bdh...@c4... | -- Admiral Isoruku Yamamoto > USA (719) 623-0381 | -- Imperial Japanese Navy > UK +44 (0)845 127-5400 | -- December 7, 1941 > > _______________________________________________ > Aprsd-devel mailing list > Apr...@li... > http://lists.sourceforge.net/lists/listinfo/aprsd-devel > |
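For the record, the reason the original one-line form misbehaved is C++ operator precedence: in `int nfind = s.find_first_of('*') <= s.length()`, the comparison runs first, so `nfind` receives the boolean result (0 or 1), never the match position. A small demonstration (the function names are mine, not aprsd's):

```cpp
#include <string>

// The declaration-in-condition form binds as: nfind = (find(...) <= length()),
// so nfind is 0 or 1. erase(nfind) with nfind == 1 then wipes the call from
// position 1 onward, which is the "erasing the full ax25Source" symptom.
int starPosBroken(const std::string& s) {
    int nfind = s.find_first_of('*') <= s.length();  // boolean, not position
    return nfind;
}

// The two-step fix from the mail above: take the position first, then test it.
int starPosFixed(const std::string& s) {
    std::string::size_type nfind = s.find_first_of('*');
    if (nfind == std::string::npos)
        return -1;                                   // no '*' present
    return static_cast<int>(nfind);
}
```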
From: Brian D H. <bdh...@c4...> - 2001-07-17 04:22:02
|
As currently implemented aprsd doesn't check or comply with "NOGATE" in the ax25Path. Based on the current thread on aprssig is this something we want to do? I think we could implement it (with the other filtering code in aprsString.cpp):

--------------
if (ax25Path.find("NOGATE") != npos) {
    aprsType = APRSERROR;
    return;
}
--------------

A second concern is the current way we are stripping "*" from the ax25Source field of packets. At present it appears to be erasing the full ax25Source of the packet and thus the other filtering code is marking it as an error packet. I've got badpacket logging turned on and I see everything with a "*" in the ax25Source field (mostly digi ID's) being dropped. Thus we are dropping any packets directly heard by the IGATE.

As currently implemented it looks like this:

-------------------
if (int nfind = ax25Source.find_first_of('*') <= ax25Source.length()) {
    //cerr << "Found * in source at position: " << nfind << endl;
    ax25Source.erase(nfind);
}
-------------------

I think if we change to:

--------------------
if (int nfind = ax25Source.find_first_of('*') <= ax25Source.length()) {
    //cerr << "Found * in source at position: " << nfind << endl;
    ax25Source.erase(nfind,1);
}
--------------------

We will ensure that we only delete the "*". I'm testing this now, but wanted to make sure I understand the implementation. 73/N5VFF -- ============================================================ Brian D Heaton | I fear that we have awakened Principal Consultant | a sleeping giant and instilled C4I2.com System Consultants | in him a terrible resolve. bdh...@c4... | -- Admiral Isoruku Yamamoto USA (719) 623-0381 | -- Imperial Japanese Navy UK +44 (0)845 127-5400 | -- December 7, 1941 |
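A compilable form of the NOGATE sketch above, for reference: outside the aprsString class, `npos` needs qualifying as `std::string::npos`, and the type codes here are stand-in constants rather than aprsd's own.

```cpp
#include <string>

const int APRSOK = 0;
const int APRSERROR = -1;   // stand-ins for aprsd's packet-type codes

// Drop anything whose digi path carries NOGATE, per the aprssig convention
// discussed above. In aprsd this would sit with the other filtering code in
// aprsString.cpp and set aprsType rather than return a value.
int classifyPath(const std::string& ax25Path) {
    if (ax25Path.find("NOGATE") != std::string::npos)
        return APRSERROR;
    return APRSOK;
}
```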
From: Brian D H. <bdh...@c4...> - 2001-07-16 22:48:21
|
All, I've committed an update that contains updates to change "delete" to "delete[]" where required and to set variables to NULL after the delete. Let me know if I've missed any (I think there are still a "delete posit" and "delete telemetry" that I didn't get the NULLs on). I'll wait until the HTTPStats settles down a bit to try fiddling with status reporting for queue high-water marks. 73/N5VFF -- ============================================================ Brian D Heaton | I fear that we have awakened Principal Consultant | a sleeping giant and instilled C4I2.com System Consultants | in him a terrible resolve. bdh...@c4... | -- Admiral Isoruku Yamamoto USA (719) 623-0381 | -- Imperial Japanese Navy UK +44 (0)845 127-5400 | -- December 7, 1941 |
From: Dale H. <da...@wa...> - 2001-07-16 17:51:12
|
Well, the first attempt to fix the user list truncation didn't quite work. It drops some html in a couple of places but completed the page. That's progress :-) I've put yet another aprsd.cpp up which changes the socket to BLOCKING from non-blocking mode. Since there are no locked mutexes, blocking shouldn't matter. I don't understand why the non-blocking mode didn't work. I also cleaned up the HTML generation so it now passes the w3c validator. http://validator.w3.org/ As usual the ultimate test is first.aprs.net. -- Dale Heatherington da...@wa... Web Page http://www.wa4dsy.net Sent by KMail for Linux |
From: Dale H. <da...@wa...> - 2001-07-15 19:10:32
|
I put a new version of aprsd.cpp on sourceforge. It has the changes to the html server to hopefully fix the truncated user status problem. It needs to be tested on first.aprs.net. -- Dale Heatherington da...@wa... Web Page http://www.wa4dsy.net Sent by KMail for Linux |
From: Chuck B. <cb...@vi...> - 2001-07-15 17:59:59
|
On Sunday 15 July 2001 11:47, Dale Heatherington wrote: :: Chuck, :: I dunno if two threads can change data at the same time in cpQueue. I :: think that's why the mutex is in there. But, I don't think there :: should be a wait mutex to cause the caller to pause if the queue is :: full. This will block the caller. He may have other resources tied up :: at the time. Who knows what sort of deadlocks might occur. :: << snip, snip >> Just reaching here. Deadlocks are the problem though. When I run first from the console eventually I'll see messages of not being able to create a new client thread. This occurs in TCPSessionThread and results from rc being != 0 from a call to pthread_create (either tcp server or http thread). << snip, snip >> :: Speaking of "delete[]"..... :: Actually I'm still a bit confused about delete[]. :: A char* is an array and needs the []. :: A string object is what? Internally it's an array :: but it was not declared an array so I assume :: it gets a plain "delete" ? aprsString would also get :: a plain "delete"? :: In C++ there are two kinds of pointers, pointers to a single object and pointers to an array of objects. The important thing is if your call to new uses [], so should your call to delete. Chuck |
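Chuck's rule in miniature — the form of `new` dictates the form of `delete`. A `std::string` allocated with plain `new` gets plain `delete`; its destructor frees the internal character array itself. (Toy function, for illustration only.)

```cpp
#include <string>

// Pairing rule: new[] <-> delete[], new <-> delete. Mixing them (for
// example `delete buf;` on an array) is undefined behavior.
std::size_t makeAndFree() {
    char* buf = new char[128];                // array new -> delete[]
    buf[0] = 'x';
    std::size_t n = 128;
    delete[] buf;

    std::string* s = new std::string("abc");  // single object -> plain delete
    n += s->size();                           // destructor frees internals
    delete s;
    return n;
}
```

By the same rule, an aprsString allocated with plain `new` would likewise get plain `delete`, regardless of what it holds internally.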
From: Dale H. <da...@wa...> - 2001-07-15 15:47:29
|
Chuck, I dunno if two threads can change data at the same time in cpQueue. I think that's why the mutex is in there. But, I don't think there should be a wait mutex to cause the caller to pause if the queue is full. This will block the caller. He may have other resources tied up at the time. Who knows what sort of deadlocks might occur. The basic plan was to throw away data that could not be put in the queue, not wait until space was available.

What happens to the data depends on the state of the "dyn" variable. If TRUE then the memory containing the data is freed. I should have also set the pointer to NULL but didn't. If "dyn" is false the data is simply ignored. The dyn setting is set at the time the queue is created.

Potential pitfall... If the caller puts an item on the queue then uses the data afterwards he will be in trouble if the queue was full and the memory was freed. Once data is put on the queue it should be considered gone forever by the caller if the queue has the "dyn" flag set. Only the queue reader should access it. The queue reader must free the memory or pass it to another function that does. Only one queue reader is allowed.

Stuff that needs work:
delete needs []
Pointer should be set to NULL after delete.

Speaking of "delete[]"..... Actually I'm still a bit confused about delete[]. A char* is an array and needs the []. A string object is what? Internally it's an array but it was not declared an array so I assume it gets a plain "delete" ? aprsString would also get a plain "delete"?

On Saturday 14 July 2001 15:49, Chuck Byam wrote:
> On Friday 13 July 2001 19:54, you wrote:
> :: The killer bug that will not die. (sigh)
> :: I had high hopes that fixing the
> :: unterminated string problem was gonna really help.
> ::
> :: On Friday 13 July 2001 17:33, Chuck Byam wrote:
> :: > On Friday 13 July 2001 15:55, you wrote:
> :: > :: I see it's been running 1.1 hours now. good start.
> :: >
> :: > Well... 
after about 2.5 hours first is chewing up the CPU cycles.
> :: > It's still accepting connections and handing out data, but one of the
> :: > threads is hogging the CPU (88% with top running).
>
> Your changes may very well have fixed an issue, that being the segfaults. What this is, I think, is a race condition that occurs between two (or more) threads. I've been looking at the cpqueue code and trying to figure out if it's possible for two threads to change data there at the same time. Look at the attached and let me know if it makes any sense to you. It essentially provides wait variables that have the caller wait until a condition is true, in this case whether the queue is full or empty.
>
> Chuck
>
> int cpQueue::write(char *cp, int n)
> {
>     int rc=0;
>
>     if (lock)
>         return -2;  // Lock is only set true in the destructor
>
>     if(pthread_mutex_lock(mut) != 0)
>         cerr << "Unable to lock mut - cpQueue:Write-char *cp.\n" << flush;
>
>     inWrite = 1;
>     int idx = write_p;
>
>     while (base_p[idx].full) {
>         cerr << "Queue is full... waiting" << endl;
>         pthread_cond_wait(base_p[idx].notFull, mut);
>     }
>
>     if (base_p[idx].rdy == false) {  // Be sure not to overwrite old stuff
>         base_p[idx].qcp = (void*)cp;  // put char* on queue
>         base_p[idx].qcmd = n;  // put int (cmd) on queue
>         base_p[idx].rdy = true;  // Set the ready flag
>         base_p[idx].empty = false;
>         idx++;
>         itemsQueued++;
>         if (itemsQueued > HWitemsQueued)
>             HWitemsQueued = itemsQueued;
>
>         if (idx >= size)
>             idx = 0;
>
>         write_p = idx;
>     } else {
>         overrun++;
>
>         if (dyn)
>             delete cp;
>
>         rc = -1;
>     }
>
>     inWrite = 0;
>
>     if(pthread_mutex_unlock(mut) != 0)
>         cerr << "Unable to unlock mut - cpQueue:Write - char *cp.\n" << flush;
>
>     pthread_cond_signal(base_p[idx].notEmpty);
>     return(rc);
> }
>
> void* cpQueue::read(int *ip)
> {
>     if(pthread_mutex_lock(mut) != 0)
>         cerr << "Unable to lock mut - cpQueue:read - int.\n" << flush;
>
>     while (base_p[read_p].empty) {  // wait here if the queue is empty
>         cerr << "Queue empty... waiting." << endl;
>         pthread_cond_wait(base_p[read_p].notEmpty, mut);
>     }
>
>     inRead = 1;
>     void* cp = base_p[read_p].qcp;  // Read the TAprsString*
>
>     if (ip)
>         *ip = base_p[read_p].qcmd;  // read the optional integer command
>
>     base_p[read_p].qcp = NULL;  // Set the data pointer to NULL
>     base_p[read_p].rdy = false;  // Clear ready flag
>     read_p++;
>     itemsQueued--;
>
>     if (read_p >= size)
>         read_p = 0;
>
>     inRead = 0;
>
>     if (pthread_mutex_unlock(mut) != 0)
>         cerr << "Unable to unlock mut - cpQueue:read - int.\n" << flush;
>
>     pthread_cond_signal(base_p[read_p].notFull);
>
>     return(cp);
> }
-- Dale Heatherington da...@wa... Web Page http://www.wa4dsy.net Sent by KMail for Linux |
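The wait-until-not-full / wait-until-not-empty pattern in Chuck's patch can be stated more compactly with one mutex and two queue-wide condition variables (the patch uses per-slot ones). This is a generic C++11 sketch, not a drop-in for cpQueue, and it deliberately blocks writers — the behavior Dale argues against above:

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>

// Minimal bounded queue: write() blocks while full, read() blocks while
// empty. Each side signals the opposite condition after changing the queue.
template <typename T>
class BoundedQueue {
    std::queue<T> q_;
    std::size_t cap_;
    std::mutex m_;
    std::condition_variable notFull_, notEmpty_;
public:
    explicit BoundedQueue(std::size_t cap) : cap_(cap) {}

    void write(T item) {
        std::unique_lock<std::mutex> lk(m_);
        notFull_.wait(lk, [this] { return q_.size() < cap_; });
        q_.push(std::move(item));
        notEmpty_.notify_one();          // wake one blocked reader
    }

    T read() {
        std::unique_lock<std::mutex> lk(m_);
        notEmpty_.wait(lk, [this] { return !q_.empty(); });
        T item = std::move(q_.front());
        q_.pop();
        notFull_.notify_one();           // wake one blocked writer
        return item;
    }
};
```

The trade-off both mails circle around is visible here: blocking writers preserves data but can deadlock a caller that holds other locks, while cpQueue's original overrun-and-drop policy loses data but never blocks.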
From: Brian D H. <bdh...@c4...> - 2001-07-15 01:58:26
|
Chuck, I think Dale's on a much better track than I am here. FWIW, pmtxSend locks at the top of the loop and unlocks before the send in my current trial. It doesn't actually stay locked long at all. With my ancient P166 (on the same Enet segment unfortunately) which throttles SendHistory all the way down to 2400 bps, it doesn't even cause a blip in the pace of data transmission. I may have to map a couple ports through my NAT and see what it does from the outside.

I don't really have any good clues on the static variables/class definitions. I'll do a little more research and let you know if I find anything definitive. I've been looking at the Xastir code for some ideas. At least for a test I borrowed the idea of error-checking mutexes (debug path only). I'm still playing with it, but it looks promising. THX/BDH

On 2001.07.14 18:14 Chuck Byam wrote:
> While I haven't tried it yet, Dale's earlier comment about the socket being
> non-blocking may be the key to this problem. I wonder though, about
> locking a function as busy as the http thread at the beginning and leaving
> it locked for so long (relatively).
>
> Speaking of thread-safe and re-entrant code... As I understand it, a
> re-entrant function shouldn't have any variables declared static. Does
> this apply to class definitions as well? And speaking of classes, what are
> your thoughts on creating a couple more classes to replace the structures
> declared at the top of aprsd.cpp? For example, one to handle sessions and
> history.
-- ============================================================ Brian D Heaton | I fear that we have awakened Principal Consultant | a sleeping giant and instilled C4I2.com System Consultants | in him a terrible resolve. bdh...@c4... | -- Admiral Isoruku Yamamoto USA (719) 623-0381 | -- Imperial Japanese Navy UK +44 (0)845 127-5400 | -- December 7, 1941 |
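On Chuck's re-entrancy question quoted above: a function-local `static` (and likewise a `static` class data member) is a single object shared by every call and every thread, which is exactly what breaks re-entrancy. A deliberately artificial demo, with invented names:

```cpp
#include <string>

// Non-re-entrant: one shared buffer for all callers. A second call (from
// any thread, or re-entrantly) overwrites the result of the first, and the
// returned pointer can dangle once buf is reassigned.
const char* formatCallUnsafe(const std::string& call) {
    static std::string buf;
    buf = call + "*";
    return buf.c_str();
}

// Re-entrant: all state lives in the caller's own copy.
std::string formatCallSafe(const std::string& call) {
    return call + "*";
}
```

Static member *functions* are not the problem; it is static *data* — whether a local or a class member — that must either go away or be guarded by a mutex.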
From: Brian D H. <bdh...@c4...> - 2001-07-15 01:58:16
|
Dale, Doh! I think you're much closer to the root cause than I am. For the transmit/wait loop would it make sense to reuse the SendHistory throttle up/down/sleep code from history.cpp? The HTTPstats won't be as long, but it might be a ready-made drop-in.

I've been testing it at up to 50 users on my development machines, but I've of course got much lower latency than across the net. I'm still able to cause a core dump. To do it I've got 50 sessions open and running a continuous loop:

connect
wait for end of history dump
wait another 1-3 minutes
disconnect
loop to beginning

After about 200-250 of these cycles I get a segfault and dump core. If I limit this same continuous cycle to 8-10 of the 50 sessions I can run without problems. THX/BDH

On 2001.07.14 18:26 Dale Heatherington wrote:
> Well Brian, I too developed a theory about the html user status
> truncation problem. Since it only seems to show up on first.aprs.net which
> has lots of users and igate connections plus heavy cpu and network load, I
> think that the send() function is returning an errno code such as EAGAIN
> due to running out of buffers somewhere down in the tcpip stack. Since the
> code does not check for errors - it just keeps feeding html ascii strings
> to send() (which, in theory, is not taking any) until it's done then closes
> the socket and exits the thread.
>
> I reworked the code so it builds the complete html page in memory (actually
> a cpQueue object) before sending anything. At this point all the mutex
> locks can be unlocked. No more latency probs! I then start sending the data
> and check for errors while doing so. If an error happens I wait 1 second
> and retry. After 5 seconds I give up and close the socket and exit. It
> works here. I'm testing to make sure there are no side effects or memory
> leaks. If all is well I'll let Chuck test it on first.
>
> If you guys see any flaws with this theory or the fix pls let me know. 
-- ============================================================ Brian D Heaton | I fear that we have awakened Principal Consultant | a sleeping giant and instilled C4I2.com System Consultants | in him a terrible resolve. bdh...@c4... | -- Admiral Isoruku Yamamoto USA (719) 623-0381 | -- Imperial Japanese Navy UK +44 (0)845 127-5400 | -- December 7, 1941 |
From: Dale H. <da...@wa...> - 2001-07-15 00:26:15
|
On Saturday 14 July 2001 17:47, Brian D Heaton wrote:
> Dale,
>
> I would think that the result of the "unpredictable" behavior would
> most likely be a memory leak. IIRC, Hamish went through and corrected a
> bunch of them, but probably didn't hit all of them. I think it's
> definitely in the "can't hurt" category and may well help.

Ok. It doesn't sound too serious. I'll continue with what I was doing...

> I've also been looking at the truncation issue in the HTTPstats.
> I'm currently playing with pmtxSend locks inside the loop for both the
> IGATEs and USERs. I'm working from the theory that it may not be
> thread-safe to be assigning variables to the output stream while they may
> be modified by another thread. I initially started with the locks outside
> the loop, but I'm concerned about the latency involved in the real world.
>
> 73/N5VFF

Well Brian, I too developed a theory about the html user status truncation problem. Since it only seems to show up on first.aprs.net, which has lots of users and igate connections plus heavy cpu and network load, I think that the send() function is returning an errno code such as EAGAIN due to running out of buffers somewhere down in the tcpip stack. Since the code does not check for errors - it just keeps feeding html ascii strings to send() (which, in theory, is not taking any) until it's done, then closes the socket and exits the thread.

I reworked the code so it builds the complete html page in memory (actually a cpQueue object) before sending anything. At this point all the mutex locks can be unlocked. No more latency probs! I then start sending the data and check for errors while doing so. If an error happens I wait 1 second and retry. After 5 seconds I give up and close the socket and exit. It works here. I'm testing to make sure there are no side effects or memory leaks. If all is well I'll let Chuck test it on first.

If you guys see any flaws with this theory or the fix pls let me know. 
-- Dale Heatherington da...@wa... Web Page http://www.wa4dsy.net Sent by KMail for Linux |
From: Chuck B. <cb...@vi...> - 2001-07-15 00:16:43
|
On Saturday 14 July 2001 17:47, Brian D Heaton wrote:
:: Dale,
<< snip, snip >>
:: I've also been looking at the truncation issue in the HTTPstats.
:: I'm currently playing with pmtxSend locks inside the loop for both the
:: IGATEs and USERs. I'm working from the theory that it may not be
:: thread-safe to be assigning variables to the output stream while they
:: may be modified by another thread. I initially started with the locks
:: outside the loop, but I'm concerned about the latency involved in the
:: real world.

While I haven't tried it yet, Dale's earlier comment about the socket being non-blocking may be the key to this problem. I wonder, though, about locking a function as busy as the http thread at the beginning and leaving it locked for so long (relatively).

Speaking of thread-safe and re-entrant code... As I understand it, a re-entrant function shouldn't have any variables declared static. Does this apply to class definitions as well?

And speaking of classes, what are your thoughts on creating a couple more classes to replace the structures declared at the top of aprsd.cpp? For example, one to handle sessions and history.

Chuck |
From: Chuck B. <cb...@vi...> - 2001-07-15 00:05:16
|
On Saturday 14 July 2001 16:49, Dale Heatherington wrote:
:: Greetings.
:: I have just been granted access to the sourceforge aprsd project.
::
:: While working on the html status page generation code to fixup the
:: truncation problem I learned that the delete operator needs a "[]"
:: after it to work on arrays. My documentation says results are
:: unpredictable if just delete alone is used. So instead of "delete cp"
:: it should be "delete [] cp". As you would expect, aprsd is full of cases
:: where "delete" is used to free character arrays. Is this a real
:: problem? Could this be the cause of some of the stability problems?

I've been changing these as I've come across them. Strictly speaking, delete[] is required for anything allocated with new[]: plain delete has no way of knowing that cp points to an array, so it won't destroy (or correctly free) every element, and some classes need special handling in their destructors. Using the '[]' syntax consistently is good practice, as is setting the pointer to NULL afterwards.

Chuck |
From: Brian D H. <bdh...@c4...> - 2001-07-14 21:38:24
|
Dale,

I would think that the result of the "unpredictable" behavior would most likely be a memory leak. IIRC, Hamish went through and corrected a bunch of them, but probably didn't hit all of them. I think it's definitely in the "can't hurt" category and may well help.

I've also been looking at the truncation issue in the HTTPstats. I'm currently playing with pmtxSend locks inside the loop for both the IGATEs and USERs. I'm working from the theory that it may not be thread-safe to be assigning variables to the output stream while they may be modified by another thread. I initially started with the locks outside the loop, but I'm concerned about the latency involved in the real world.

73/N5VFF

-- 
============================================================
Brian D Heaton | I fear that we have awakened
Principal Consultant | a sleeping giant and instilled
C4I2.com System Consultants | in him a terrible resolve.
bdh...@c4... | -- Admiral Isoruku Yamamoto
USA (719) 623-0381 | -- Imperial Japanese Navy
UK +44 (0)845 127-5400 | -- December 7, 1941

On 2001.07.14 14:49 Dale Heatherington wrote:
> Greetings.
> I have just been granted access to the sourceforge aprsd project.
>
> While working on the html status page generation code to fixup the
> truncation problem I learned that the delete operator needs a "[]"
> after it to work on arrays. My documentation says results are
> unpredictable if just delete alone is used. So instead of "delete cp"
> it should be "delete [] cp".
> As you would expect, aprsd is full of cases where "delete" is used
> to free character arrays. Is this a real problem? Could this be the
> cause of some of the stability problems?
>
> -- 
> Dale Heatherington
> da...@wa...
> Web Page http://www.wa4dsy.net
> Sent by KMail for Linux
>
> _______________________________________________
> Aprsd-devel mailing list
> Apr...@li...
> http://lists.sourceforge.net/lists/listinfo/aprsd-devel
> |
From: Dale H. <da...@wa...> - 2001-07-14 20:50:03
|
Greetings.
I have just been granted access to the sourceforge aprsd project.

While working on the html status page generation code to fixup the truncation problem I learned that the delete operator needs a "[]" after it to work on arrays. My documentation says results are unpredictable if just delete alone is used. So instead of "delete cp" it should be "delete [] cp". As you would expect, aprsd is full of cases where "delete" is used to free character arrays. Is this a real problem? Could this be the cause of some of the stability problems?

-- 
Dale Heatherington
da...@wa...
Web Page http://www.wa4dsy.net
Sent by KMail for Linux |
From: Brian D H. <bdh...@c4...> - 2001-07-14 18:33:19
|
All, Here with random morning musings...

Hamish's thoughts:

1 - New config file - Sounds good to me.

2 - Multiple RF port support - Yes, Yes, and Yes. I think the RF portion should definitely be a separate program. It could connect like any other user. This would make multiple band IGATEs much easier to implement.

3 - Internal Restructuring - Yep, I'm also for more C++ (even though that means I have to learn more C++). I'd like to see aprsd.cpp broken up a bit too.

4 - Better Filtering - I think our best bet here would perhaps be to get "smarter filtering", but not overall more restrictive filtering. We could screw things down very tightly for compliance with aprs-spec, but I think we would greatly reduce the utility of the network for experimentation and development of new applications. I think we could take an approach of:
A - Is the packet reasonable: (contains ">" AND ":", ">" occurs before ":", origin callsign < 11 chars, destination call < 10 chars, has something in the data field)
B - "Do No Harm": length less than 256 bytes, no unprintable characters (other than MIC-E required) - Do we want to get 8-bit clean at some point??

My thoughts:

1 - Single TCP port: Desired connection type could be specified on the command line. Alpha tags would identify known defined types. Numeric values could directly specify the ECHOMASK (for experimenters). We could then register that port (14439 comes to mind) with IANA. Since this would require client software changes, the current port structure would be maintained for at least 1-2 years. In the long term I think this would give us much greater flexibility for new uses and advanced client software.

2 - Active loop detection and quenching: Yes, path headers. The current method relies very heavily on configuration and dup detection. I'll even volunteer to write some kind of general spec and implement the proof-of-concept code. Especially if we go to a single TCP port we have an easy route to implement this. 
I'm still noodling it around.

3 - Individual queues for each connection (say 50 entries). This would reduce the time spent in the DeQueue thread and in SendToAllClients. Once this is done we can also do a bunch of other nifty things in each thread.

4 - Geographic limited feeds - Give the client a means to specify that it only wants data from a specific geographic area. If we have the individual queues from above we can check each packet in the queue against the specified geographic area and send it to the client or dump it. This is going to add some processor load (perhaps a lot). I haven't thought about this one in depth yet. Perhaps the client could specify the area in an APRS object format. The thread would then pull a packet off its queue, check it against the area, send it if it matches, and dump it if not. This might be a big win for NWS and/or EOC type stations that need to serve areas outside RF range (i.e., the TX state EOC for the Houston floods was in Austin), but don't want the worldwide feed knocking at their doors. At the moment I'm thinking of just squares and circles for area limits. If we only allow one bounding box/circle then most applications will have some out-of-area traffic (the state of Texas is a good example), but it will be orders of magnitude less than it is now. This is something we probably would want to give the IGATE operator the choice of disabling in the config file since it could conceivably overload some boxes.

5 - Strong authentication support - This would need to be very optional. I think this needs to be kicked around until a good method is decided upon. It's probably something we should have on our todo list.

6 - "Contrib" files - While not directly tied to a 3.0.0 feature, it would be nice to host a "Contrib" area where scripts that do things with the UDP port could be hosted in a single place. We've got a very powerful feature here.

Probably enough babble for the moment... 
73/N5VFF -- ============================================================ Brian D Heaton | I fear that we have awakened Principal Consultant | a sleeping giant and instilled C4I2.com System Consultants | in him a terrible resolve. bdh...@c4... | -- Admiral Isoruku Yamamoto USA (719) 623-0381 | -- Imperial Japanese Navy UK +44 (0)845 127-5400 | -- December 7, 1941 On 2001.07.08 00:55 Hamish Moffatt wrote: > On Sat, Jul 07, 2001 at 04:36:58PM -0600, Brian D Heaton wrote: > > Hamish, you raise a couple interesting points. Once 2.2.0 gets > out > > the door do we want to compile a 3.0 (I'm guessing changes this major > would > > rate a new major release) wishlist and/or roadmap? The 2.2.x series > could > > continue with incremental improvements/bugfixes. The 3.0.0 release > could > > be the one with all the major rework. It would essentially become a > new > > branch under CVS. > > Sounds like a good plan. > > My biggest wish list for future versions would be.. > > 1. New configuration file format. Painful for users to upgrade, but > a perl script could do the conversion. I could write such a perl > script if needed. > > 2. Multiple RF ports. Not essential, but nice to have. I already started > work on it, but it's a major restructure of some of the code (new > classes > etc), so it won't be easy to take the other changes that have been > made > in CVS for 2.2.0 and apply them. > > Actually, it might be worth separating the RF interface into a separate > program which connects to the main APRS server like any other client.. > possibly on a special port if necessary. Then APRSD could be just a hub > program, and a separate program could handle a port. Multiple instances > of the port program gives you multiple ports. Bruno Quesnel VA2BMG > suggested this a while ago and I never gave it much thought until > recently. > > 3. Internal restructuring? > > Bruno did some work on APRSD for a university project.. 
his aim was > to remove all of the C++ code and convert it to plain C.. he wanted > to get it running on an SGI system he had which had poor C++ support. > > However I am of the opinion that we should be going for more C++ code, > not less. For the multiport version, I put all of the RF code into a > new class (with subclasses for an AX.25 interface or a serial TNC), > then had a linked list of port objects to handle multiple ports. > The history buffer also recorded what port a packet came in on. > > Similarly I reckon there's probably a lot in common between the > various TCP sockets available, so this might benefit from some > new classes. I haven't looked at this much, so I could be wrong. > > aprsd.cpp is too long. Lots of that code belongs in separate > source files, ideally (but not necessarily) as separate classes. > > 4. More/better filtering? > > > Hamish > -- > Hamish Moffatt VK3SB <ha...@de...> <ha...@cl...> > > _______________________________________________ > Aprsd-devel mailing list > Apr...@li... > http://lists.sourceforge.net/lists/listinfo/aprsd-devel > |
From: Chuck B. <cb...@vi...> - 2001-07-13 19:38:05
|
I'd like to welcome Dale Heatherington (WA4DSY) to the sourceforge project. Dale's expertise and insight in aprsd is a major asset to what we are trying to accomplish.

Chuck |
From: Brian D H. <bdh...@c4...> - 2001-07-09 05:45:17
|
Chuck, just sent my latest mods up to CVS. They include:

1 - error checking on (almost) all pthread_mutex_lock/unlocks
2 - change back to a 1ms delay in deQueue
3 - comment out the nice(-10) in deQueue
4 - comment out the lastpacket = abuff in deQueue
5 - change the name of the mutex in cpqueue.cpp/h to "pmtxQ" to jibe with the naming convention in all the other files
6 - lock the pmtxCount mutex before AddHistory and DeleteOldItems calls
7 - wrap some if(tncPresent)'s around the TNC checking and tncQueue.ready calls
8 - add the beginnings of some high-water variables to cpqueue.cpp/h. Next step will be to add reporting of these variables into the HTTPStats. This way not only can we see how many are in the queue now, but also the maximum number of items the queue has held since server start.

Items 2/3 above are the main cause of the very high CPU utilization in the deQueue thread. It's running in a continuous loop with a 1ms sleep, checking the queue and calling SendToAllClients, so it should have the highest process utilization of all the aprsd threads. Based on the way the queue is locked it won't try to read from an empty queue. The only thing reading the queue is the deQueue thread, and it uses the sendQueue.ready call to check before reading. sendQueue.ready won't be set unless there are items in the queue. Continuing to write to a queue in an overflow state is another matter. That's part of what I'm hoping the high-water marks will help us scope out.

I have found that the "lastpacket = abuff" code appears to be a part of the lock-up scenario. I've tried the current code with and without it. Without it I don't get the lockups. In the current debug context, since I'm not getting segfaults, the last packet isn't germane to the failures I'm seeing. I've left it in, but commented out for the time being.

I also highly recommend you update your kernel on first. 
In testing I ran the current aprsd code (as sent to CVS) under 2.4.5-ac24 (equivalent to a late 2.4.6pre) and was able to lock it up in the previously described manner. Based on the changelogs in 2.4.6ac1/2 I upgraded to 2.4.6ac2. Since that upgrade, *running the same aprsd code*, I've run twice as many heavy connect/disconnect cycles and haven't managed to lock it up yet. The VM changes from 2.4.2 up to 2.4.6ac2 are very significant. The latest 2.4.7pre may also work for you. I've been running the AC kernels as they are more friendly to AX.25 sockets and have some important fixes for my VIA VP6 SMP motherboard that I'm not sure have fully made it into Linus's tree.

My next steps are:

1 - flesh out the high-water marks and get the reporting into the HTTPStats. I'm going to experiment with a new layout for the top section of the report. I'll post here when it's checked in and ya'll can tell me what you think.
2 - add pid reporting to the startup stats as each server/deQueue thread is started. I'd also like to dump this information (on startup) to aprsd.log. It will be an additional 13-15 lines per start, but I think the additional data will be a big win.
3 - finish adding the error checking to the mutexes in dupCheck.cpp, rf.cpp, and utils.cpp

Guess there were some more things I could do with the code. <g>

THX/BDH

-- 
============================================================
Brian D Heaton | I fear that we have awakened
Principal Consultant | a sleeping giant and instilled
C4I2.com System Consultants | in him a terrible resolve.
bdh...@c4... 
| -- Admiral Isoruku Yamamoto
USA (719) 623-0381 | -- Imperial Japanese Navy
UK +44 (0)845 127-5400 | -- December 7, 1941

On 2001.07.08 21:31 Chuck Byam wrote:
> Current server stats:
>
> uptime: 24.0 hours
> current connections: 120
> Peak users: 127
>
> top output:
> 10:32pm up 17 days, 6:43, 7 users, load average: 2.42, 2.29, 2.13
>
> (note load)
>
> 27479 root 17 0 11492 11M 1132 R 42.7 9.1 1073m aprsd
> 1392 root 18 0 11492 11M 1132 R 42.7 9.1 162:54 aprsd
>
> These 2 threads are in a data race (my guess). Just not sure which
> threads they are. Based on their time, I'd guess the first is one of
> the initial threads created on startup.
>
> As usual nearly all segfaults occur in SendToAllClients, which we know
> is called from dequeue. Just thinking out loud here... but my guess is
> there is a problem with cpqueue where a contention is set up between
> two (or more) threads trying to access the queue. While I'm far from
> being proficient in threaded programming, I wonder if a [pthread]
> conditional variable would be in order in the queue class so that a
> thread would not try to write to a full queue or another thread
> wouldn't try to delete from an empty queue. While reading up on this
> it sounds like a classic "producer/consumer data race."
>
> In pseudocode:
>
> (thread writing data)
>
> pthread_mutex_lock(mutex);
> while (queue->full) {
>     pthread_cond_wait(queue->notfull, queue->mutex)
>     // waits until condition met - notfull
> }
> queueAdd(data)
> pthread_mutex_unlock(mutex)
> pthread_cond_signal(queue->notempty)
>
> (thread removing data)
>
> pthread_mutex_lock(mutex)
> while (queue->empty) {
>     pthread_cond_wait(queue->notempty, queue->mutex)
>     // waits until condition met - not empty
> }
> queueDel(data)
> pthread_mutex_unlock(mutex)
> pthread_cond_signal(queue->notfull)
>
> Thoughts?
>
> Chuck
>
> _______________________________________________
> Aprsd-devel mailing list
> Apr...@li... 
> http://lists.sourceforge.net/lists/listinfo/aprsd-devel > |
From: Chuck B. <cb...@vi...> - 2001-07-09 03:32:22
|
On Sunday 08 July 2001 10:47, Brian D Heaton wrote:
:: Below
:: have observed the following symptoms on all my recent lockups:
::     1 - no segfault
::     2 - TAprsString/History object counters diverge by X (TAS > History)
::     3 - stats shows X (from above) in InetQ.
::     4 - although web stats are unreachable, console stats show items in
:: InetQ continue to grow until queue overflow.
::     5 - In 90% of the cases the console interpreter is still active and
:: I can stop with "q"
::     Does this jibe with what anyone else is seeing? I can cause this

Current server stats:

uptime: 24.0 hours
current connections: 120
Peak users: 127

top output:
10:32pm up 17 days, 6:43, 7 users, load average: 2.42, 2.29, 2.13

(note load)

27479 root 17 0 11492 11M 1132 R 42.7 9.1 1073m aprsd
1392 root 18 0 11492 11M 1132 R 42.7 9.1 162:54 aprsd

These 2 threads are in a data race (my guess). Just not sure which threads they are. Based on their time, I'd guess the first is one of the initial threads created on startup.

As usual nearly all segfaults occur in SendToAllClients, which we know is called from dequeue. Just thinking out loud here... but my guess is there is a problem with cpqueue where a contention is set up between two (or more) threads trying to access the queue. While I'm far from being proficient in threaded programming, I wonder if a [pthread] conditional variable would be in order in the queue class so that a thread would not try to write to a full queue or another thread wouldn't try to delete from an empty queue. While reading up on this it sounds like a classic "producer/consumer data race." 
In pseudocode:

(thread writing data)

pthread_mutex_lock(mutex);
while (queue->full) {
    pthread_cond_wait(queue->notfull, queue->mutex)
    // waits until condition met - notfull
}
queueAdd(data)
pthread_mutex_unlock(mutex)
pthread_cond_signal(queue->notempty)

(thread removing data)

pthread_mutex_lock(mutex)
while (queue->empty) {
    pthread_cond_wait(queue->notempty, queue->mutex)
    // waits until condition met - not empty
}
queueDel(data)
pthread_mutex_unlock(mutex)
pthread_cond_signal(queue->notfull)

Thoughts?

Chuck |
From: Brian D H. <bdh...@c4...> - 2001-07-08 14:38:53
|
Below

-- 
============================================================
Brian D Heaton | I fear that we have awakened
Principal Consultant | a sleeping giant and instilled
C4I2.com System Consultants | in him a terrible resolve.
bdh...@c4... | -- Admiral Isoruku Yamamoto
USA (719) 623-0381 | -- Imperial Japanese Navy
UK +44 (0)845 127-5400 | -- December 7, 1941

On 2001.07.08 08:24 Chuck Byam wrote:
> I don't have any problem if this finds its way into 2.2. In fact I've
> just added 2 more config options; respondToIgateQueries and
> respondToAprsdQueries. What I really want to see in this release tree
> is those pesky sig 11's and threading issues resolved (if possible) ;-)
>
> :: By the way, I'm away on holidays from this Wednesday 11th until
> :: Sunday 21st.
> ::
> Have fun, I'll have to wait till September :/ with the exception of a
> couple mini getaways.
>
> Chuck

Chuck, When you get back into work Monday give the latest CVS a shot. I've been plowing through some more over the long weekend and it's getting better. I can still lock it up, but I have to really be trying hard. I have observed the following symptoms on all my recent lockups:

1 - no segfault
2 - TAprsString/History object counters diverge by X (TAS > History)
3 - stats shows X (from above) in InetQ.
4 - although web stats are unreachable, console stats show items in InetQ continue to grow until queue overflow.
5 - In 90% of the cases the console interpreter is still active and I can stop with "q"

Does this jibe with what anyone else is seeing? I can cause this by keeping 1-2 client sessions running (netcat) and running 6-8 more in continual history dumps. After about 100-120 connects it stops sending data out to the clients. It is still receiving and processing from the IGATEs though. It just stacks it up in the sendQueue. I'm thinking that it's the InetdeQueue thread that gets lost in lala-land. I'm working my way back through and adding return code checking to every mutex lock/unlock. 
I probably should have done it earlier, but it seems like one of the last things to hit. If you are still getting segfaults can you post the snips from segfault.log to the list? I'd like to see if they match up with where it was faulting earlier on my test system.

73/N5VFF |
From: Chuck B. <cb...@vi...> - 2001-07-08 14:34:04
|
On Sunday 08 July 2001 03:00, Hamish Moffatt wrote:
:: On Sat, Jul 07, 2001 at 09:20:49PM -0400, Chuck Byam wrote:
:: > I've already started on a fresh codebase that uses the nb++ class
:: > library (another sourceforge project). This is a threadsafe library
:: > that I hope will handle all of the socket issues. I've got a basic
:: > server working (that handles the client connections) and am rewriting
:: > TAprsString and the queues (including the history buffer). Once this
:: > is done I'll start working on the client side so I can test the input
:: > queues.
::
:: OK.. will this be a different CVS module?
::
:: Do you plan to get the whole lot working and implementing the same
:: features as now? Or get something basic working then bring it up
:: gradually to the same feature set?
::

New major version number a la new branch. I'd like to start by getting the basic client/server internet service working and add in the remaining functionality over time.

Chuck |