What you need is RoboRealm (http://www.roborealm.com/). It can do everything you outlined above, interfaces with multiple cameras *and* will also talk to the NXT. The documentation and tutorials on the RoboRealm site are very thorough, and almost constitute a machine-vision course in their own right. I played with RoboRealm a year or two ago; it was really simple to use and made it very easy to quickly test out algorithms and filters.
Unfortunately, it is not free (as in no-cost): it now costs $89!
I've worked with the Mindsensors NXTcam, which is a modified CMUcam that does basic blob detection by colour. You train the camera on which colour constitutes a blob, and it then returns an array of up to 8 blobs with their x,y locations. Xander Soldaat has a library to do blob-merging, but given the resolution of the NXTcam there's not much else you can do. You can also switch it into line-tracking mode to do more advanced path-following. An NXTcam will not be much use for object recognition by shape, unless your objects are radically different in size and posture. The advantage of an NXTcam is that it connects directly to the NXT and all processing is done in the camera and in the NXT - no PC is required.
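To give a rough idea of what blob-merging involves, here is a minimal sketch in Python. It is not Xander Soldaat's library and it is not code that runs on the NXT; it only illustrates the general idea of combining nearby blobs into one. I'm assuming each blob comes back as a bounding box `(left, top, right, bottom)`, and the `gap` parameter and all function names are made up for this example.

```python
from typing import List, Tuple

# Hypothetical blob layout: (left, top, right, bottom) in pixels.
Box = Tuple[int, int, int, int]


def boxes_touch(a: Box, b: Box, gap: int = 0) -> bool:
    """True if the boxes overlap or lie within `gap` pixels of each other."""
    return (a[0] <= b[2] + gap and b[0] <= a[2] + gap and
            a[1] <= b[3] + gap and b[1] <= a[3] + gap)


def union(a: Box, b: Box) -> Box:
    """Smallest box that contains both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]),
            max(a[2], b[2]), max(a[3], b[3]))


def merge_blobs(blobs: List[Box], gap: int = 0) -> List[Box]:
    """Repeatedly merge any two touching boxes until none are left to merge."""
    merged = list(blobs)
    changed = True
    while changed:
        changed = False
        for i in range(len(merged)):
            for j in range(i + 1, len(merged)):
                if boxes_touch(merged[i], merged[j], gap):
                    merged[i] = union(merged[i], merged[j])
                    del merged[j]
                    changed = True
                    break
            if changed:
                break  # restart the scan, since the merged box has grown
    return merged


if __name__ == "__main__":
    blobs = [(10, 10, 20, 20), (18, 15, 30, 25), (100, 100, 110, 110)]
    print(merge_blobs(blobs, gap=2))
    # prints [(10, 10, 30, 25), (100, 100, 110, 110)]
```

With at most 8 blobs per frame the brute-force pairwise scan is cheap, so there is little reason to do anything cleverer.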
Good luck - image processing on mobile robots is not for the faint of heart!