Thread: rawiso update & documentation
From: Dan M. <dm...@ma...> - 2002-10-26 03:34:48
Hi all - I'm making very good progress on rawiso. I need to tweak a few minor things with the implementation, but I think the API is basically done now.

I have looked at the PCILynx documentation and I don't see any problems with supporting the API. (I don't immediately see how to do start-on-cycle, but I'm sure it's possible given the enormous flexibility of the chipset...)

I have changed from an event-loop API to a callback API, to fit in better with the other parts of libraw1394.

I've written a chapter for the libraw1394 documentation on the new API. You can see it on my webserver here:

http://dcine.dyndns.org/stuff/libraw1394-doc/

(see chapter 4: Isochronous Transmission and Reception)

To help with reading the docs, here is the complete rawiso API:

typedef int (*raw1394_iso_xmit_handler_t)(raw1394handle_t,
        unsigned char *data, unsigned int *len,
        unsigned char *tag, unsigned char *sy,
        unsigned int cycle, unsigned int dropped);

typedef int (*raw1394_iso_recv_handler_t)(raw1394handle_t,
        unsigned char *data, unsigned int len,
        unsigned char channel, unsigned char tag,
        unsigned char sy, unsigned int cycle,
        unsigned int dropped);

int raw1394_iso_xmit_init(raw1394handle_t handle,
        raw1394_iso_xmit_handler_t handler,
        unsigned int buf_packets,
        unsigned int max_packet_size,
        int channel,
        enum raw1394_iso_speed speed,
        int irq_interval);

int raw1394_iso_recv_init(raw1394handle_t handle,
        raw1394_iso_recv_handler_t handler,
        unsigned int buf_packets,
        unsigned int max_packet_size,
        int channel,
        enum raw1394_iso_speed speed,
        int irq_interval);

int raw1394_iso_xmit_start(raw1394handle_t handle,
        int start_on_cycle,
        unsigned int prebuffer_packets);

int raw1394_iso_recv_start(raw1394handle_t handle, int start_on_cycle);

void raw1394_iso_stop(raw1394handle_t handle);

void raw1394_iso_shutdown(raw1394handle_t handle);

Regards,
Dan
From: Manfred W. <we...@ic...> - 2002-10-26 18:31:00
Dan Maas wrote:
> Hi all - I'm making very good progress on rawiso.
>
> I need to tweak a few minor things with the implementation, but I
> think the API is basically done now.

Fine. I just had a look at your documentation. What I found interesting is that everything (except the cycle parameter in the handlers) could be implemented using the existing libraw1394 API.

I only have one comment: in the function raw1394_iso_recv_init we do not need the speed argument. We only need the speed for transmission, to control the speed we use when sending; on the receiving side we cannot control the speed, we can only take and handle the packets...

Manfred
From: Dan M. <dm...@ma...> - 2002-10-26 18:45:52
* Manfred Weihs (we...@ic...) wrote:
> Fine. I just had a look at your documentation. What I found interesting
> is that everything (except the cycle parameter in the handlers) could
> be implemented using the existing libraw1394 API.

If you can find a way to use the existing API, let me know... Keep in mind the underlying implementation of rawiso is *very* different. It might be possible if some of the handler parameters were offered as separate API calls (raw1394_iso_get_current_cycle() etc...). But I'd rather make a clean break than complicate the API more than necessary. (In the case of get_current_cycle(), the return value would ONLY be correct inside a packet handler, so I prefer to pass it as a parameter.)

One issue that has come up is re-entrancy of libraw1394 from the user packet handler. For example, it may make sense to allow calling raw1394_iso_stop() and raw1394_iso_start() inside the packet handler, e.g. in case of a buffer overflow. This would complicate the libraw1394 code quite a bit. I would prefer to disallow calls into raw1394 inside of a packet handler; the handler could request stop/start or signal an error through its return value. What do you think? Is it worth the trouble to make libraw1394 re-entrant?

Also, I have had to change the kernel raw1394 poll() method a bit. The isochronous operation is a separate stream of events from the standard read() interface. So I have changed poll() to return POLLIN only for standard read events and POLLOUT only for isochronous updates. As long as applications check for both POLLIN and POLLOUT, we should be fine and fully backwards-compatible, but this will need some testing.

> I only have one comment: in the function raw1394_iso_recv_init we do
> not need the speed argument. We only need the speed for transmission,
> to control the speed we use when sending; on the receiving side we
> cannot control the speed, we can only take and handle the packets...

OK, got that; I will delete the speed argument.

Dan
From: Dan M. <dm...@ma...> - 2002-10-26 19:24:46
* Dan Maas (dm...@ma...) wrote:
> If you can find a way to use the existing API, let me know... Keep in
> mind the underlying implementation of rawiso is *very* different.

Just had an idea - what if I implemented the old ISO API in terms of the new API, and offered both? I think this is possible.

My intention was never to get rid of the old API; of course we need to keep it for backwards compatibility. In rawiso the old API still exists, side by side with the new code.

I could shoehorn all the new functionality into the old API, but the more I think about it, the more I dislike the idea. The new API makes it easy to write a correctly-functioning isochronous application; I have carefully structured it to encourage good practices. There are too many opportunities for making mistakes with the old API (not checking for buffer overflows, not getting the cycle number correct, etc...).

Anyway, none of this affects the kernel-level stuff, only libraw1394. I will get the kernel code ready to merge.

It occurred to me that we might be able to eliminate a ton of kernel code by switching dv1394 and video1394 to use the new hpsb_iso interface (the kernel-level counterpart of rawiso). I'll look into this as an option later.

Regards,
Dan
From: Manfred W. <we...@ic...> - 2002-10-27 13:44:03
Dan Maas wrote:
> Just had an idea - what if I implemented the old ISO API in terms of
> the new API, and offered both? I think this is possible.
>
> My intention was never to get rid of the old API; of course we need to
> keep it for backwards compatibility. In rawiso the old API still
> exists, side by side with the new code.

At least in the short term the old API should stay working, so that old applications keep working. We can mark the functions as "deprecated" in raw1394.h, but they should remain available for at least some time. In the long term (after application developers have had time to switch to the new API), we might consider dropping that support.

> It occurred to me that we might be able to eliminate a ton of kernel
> code by switching dv1394 and video1394 to use the new hpsb_iso
> interface (the kernel-level counterpart of rawiso). I'll look into
> this as an option later.

That is definitely a good idea. I always disliked that OHCI-dependent stuff. If we have an appropriate interface in the core driver, then no module other than the lowlevel drivers should directly access the hardware.

Manfred
From: Manfred W. <we...@ic...> - 2002-10-27 13:35:53
Dan Maas wrote:
> * Manfred Weihs (we...@ic...) wrote:
>
>> Fine. I just had a look at your documentation. What I found interesting
>> is that everything (except the cycle parameter in the handlers) could
>> be implemented using the existing libraw1394 API.
>
> If you can find a way to use the existing API, let me know... Keep in
> mind the underlying implementation of rawiso is *very* different.

And that is why I found it interesting that at the higher layers they are so similar. Except for the cycle number feature, you could implement a libraw1394 that offers your API while using the old drivers.

The iso reception mechanism is almost the same. Both the new and the old APIs have a function to start and to stop isochronous reception, and a handler that will be called by raw1394_loop_iterate. One minor difference is that in the new API you can control the size of the kernel's buffer for iso packets, while in the old one that was fixed.

The iso transmission side differs a bit more, but you could still make libraw1394 keep track of the number of queued (i.e. unsent) isochronous packets in the kernel's buffer and let raw1394_loop_iterate call the transmission handler when necessary.

> It might be possible if some of the handler parameters were offered as
> separate API calls (raw1394_iso_get_current_cycle() etc...). But I'd
> rather make a clean break than complicate the API more than
> necessary. (In the case of get_current_cycle(), the return value would
> ONLY be correct inside a packet handler, so I prefer to pass it as a
> parameter.)

As I wrote, the cycle count is the only feature that could not easily be implemented (because the lowlevel drivers do [did] not report it). You can read the local cycle count register, but since the packets are reported to the higher layers with a delay, that will not help...

> One issue that has come up is re-entrancy of libraw1394 from the user
> packet handler. For example, it may make sense to allow calling
> raw1394_iso_stop() and raw1394_iso_start() inside the packet handler,
> e.g. in case of a buffer overflow. This would complicate the
> libraw1394 code quite a bit. I would prefer to disallow calls into
> raw1394 inside of a packet handler; the handler could request
> stop/start or signal an error through its return value. What do you
> think? Is it worth the trouble to make libraw1394 re-entrant?

I do not see why it really causes trouble. The only problem I see is that the handler would call itself, because raw1394_iso_stop will internally call raw1394_loop_iterate to get the result of stopping. It would probably be better if the write used to send the stop request to the device file returned the result immediately, rather than queuing a request that must then be read (using raw1394_loop_iterate); my config ROM functions do that, for example. Many raw1394 functions could return the result immediately; only asynchronous transactions must queue the result, because it is not available immediately after the request. But maybe it would be cleaner for these functions (that immediately return the result) to switch from write over to ioctl...

> Also, I have had to change the kernel raw1394 poll() method a bit. The
> isochronous operation is a separate stream of events from the standard
> read() interface. So I have changed poll() to return POLLIN only for
> standard read events and POLLOUT only for isochronous updates. As long
> as applications check for both POLLIN and POLLOUT, we should be fine
> and fully backwards-compatible, but this will need some testing.

That is not a good idea. If I get POLLOUT, that indicates that I should be able to write to the device file without blocking, and POLLIN indicates that I can read without blocking. If you change that behaviour, it will be very misleading.

Manfred
From: Dan M. <dm...@ma...> - 2002-10-27 20:45:18
Hi Manfred, thanks for taking the time to look at rawiso!

> As I wrote, the cycle count is the only feature that could not
> easily be implemented (because the lowlevel drivers do [did] not
> report it). You can read the local cycle count register, but since
> the packets are reported to the higher layers with a delay, that
> will not help...

Right. In rawiso the cycle number is very carefully managed by libraw. It is not simply the current value of the cycle count register; it is the exact value the cycle count had, or will have, when the current packet was received or will be transmitted.

Question - should the cycle number simply count mod 8000, or should it also include some of the 'seconds' bits from the CYCLE_COUNT register? (It may be helpful to include the 'seconds' bits for start-on-cycle purposes, since OHCI at least requires some of the seconds bits for setting when to start/stop DMA.)

> I do not see why it really causes trouble. The only problem I see is
> that the handler would call itself, because raw1394_iso_stop will
> internally call raw1394_loop_iterate to get the result of stopping.

Here is the issue - to avoid massive numbers of system calls and context switches, packets are processed in large batches (by default 1000 packets at a time). So internally raw1394_loop_iterate() does this:

for (int i = 0; i < 1000; i++)
        call_iso_packet_handler();
tell_kernel_packets_are_handled();

Now assume that on one call to the iso packet handler, the user calls raw1394_iso_stop(). Instead of looping through the rest of the packets, we need to break out of the loop and skip the 'tell_kernel' step. This would be rather messy to implement in libraw, though it could be done.

I have not worked with the asynchronous side of libraw very much. What is the policy on calling libraw from a signal handler? I would prefer to signal actions with a return value from the handler - like RAW1394_ISO_OK, RAW1394_ISO_STOP, etc...

> ... rather than queuing a request that must then be read (using
> raw1394_loop_iterate); my config ROM functions do that, for example.
> Many raw1394 functions could return the result immediately; only
> asynchronous transactions must queue the result, because it is not
> available immediately after the request. But maybe it would be cleaner
> for these functions (that immediately return the result) to switch
> from write over to ioctl...

rawiso works as you describe here. It does not use the queued event mechanism like other parts of libraw1394.

> That is not a good idea. If I get POLLOUT, that indicates that I
> should be able to write to the device file without blocking, and
> POLLIN indicates that I can read without blocking. If you change
> that behaviour, it will be very misleading.

Well, there must be *some* way to distinguish between pending read() events and pending isochronous activity. If raw1394_loop_iterate() cannot tell the difference, then it may block unexpectedly. (Remember, applications are not allowed to read or write on the fd themselves.)

The only alternative I can think of is returning POLLIN plus some other flag, like POLLIN | POLLPRI. Then raw1394_loop_iterate() can tell the difference when it polls the fd. Or, raw1394_loop_iterate() could *always* check for isochronous activity when transmitting or receiving; this would result in some extra syscalls, but it wouldn't affect poll().

>> One more issue to discuss - should the raw1394_iso API contact the
>> IRM and perform bandwidth/channel allocation automatically? Or
>> should this be dealt with at a higher level?
>
> That definitely must be done by higher layers. Raw1394 may not do that
> automatically. Please keep in mind that the device that performs
> isochronous transmission is usually different from the device that has
> to allocate the resources (at least for IEC 61883 compliant
> transmission).

OK, I see. That makes things easier for me anyway =).

Regards,
Dan
From: Kristian H. <hog...@us...> - 2002-10-27 21:21:16
Dan Maas <dm...@ma...> writes:
[...]
> Question - should the cycle number simply count mod 8000, or should it
> also include some of the 'seconds' bits from the CYCLE_COUNT register?
> (It may be helpful to include the 'seconds' bits for start-on-cycle
> purposes, since OHCI at least requires some of the seconds bits for
> setting when to start/stop DMA.)

I would say mod 16 is enough - you only use four bits of cycle count in the timestamp anyway. Start-on-cycle should just program the DMA to start at

        specified-cycle-count + (current-cycle-count & ~15) + 32

The 32 cycles provide some margin, so that we have time to look up the current cycle count, compute the start time, and program the DMA before the start cycle occurs. I really don't see a need for programming the DMA to start in 1.23 seconds.

Or even simpler, we could always start DMA in a cycle where cycle-count mod 16 == 0, and the timestamping logic would just start from 0. Also, see stream_start_dma() in amdtp.c.

Kristian
From: Dan M. <dm...@ma...> - 2002-10-27 21:44:42
* Kristian Hogsberg (hog...@us...) wrote:
> I would say mod 16 is enough - you only use four bits of cycle count
> in the timestamp anyway. Start-on-cycle should just program the DMA
> to start at
>
>         specified-cycle-count + (current-cycle-count & ~15) + 32

MPEG-2 TS streams need the entire cycle number mod 8000 for timestamps... (though maybe they just use the low-order bits, I'm not sure).

raw1394_iso_xmit/recv_start() will pick a starting cycle if you don't specify one. The algorithm works as you describe - start at the current cycle number, plus a "few" cycles to allow time for the DMA to start up.

The ability to start on a specific cycle might be useful for writing PC-to-PC file transfer programs; it could make synchronization easier.

Regards,
Dan
From: Kristian H. <hog...@us...> - 2002-10-27 22:07:44
Dan Maas <dm...@ma...> writes:
> * Kristian Hogsberg (hog...@us...) wrote:
>> I would say mod 16 is enough - you only use four bits of cycle count
>> in the timestamp anyway. Start-on-cycle should just program the DMA
>> to start at
>>
>>         specified-cycle-count + (current-cycle-count & ~15) + 32
>
> MPEG-2 TS streams need the entire cycle number mod 8000 for
> timestamps... (though maybe they just use the low-order bits, I'm not
> sure).

Okay, but I guess this is not in the CIP header then?

> raw1394_iso_xmit/recv_start() will pick a starting cycle if you don't
> specify one. The algorithm works as you describe - start at the
> current cycle number, plus a "few" cycles to allow time for the DMA
> to start up.

Great - this scheme works quite nicely in amdtp.c.

Kristian
From: Manfred W. <we...@ic...> - 2002-10-28 13:12:28
Dan Maas wrote:
> MPEG-2 TS streams need the entire cycle number mod 8000 for
> timestamps... (though maybe they just use the low-order bits, I'm not
> sure).

To be exact (just for clarification): the timestamps in MPEG-2 TS are stored in the source packet header (not the CIP header) and consist of the lower 25 bits of the cycle time. That is the complete cycle offset and cycle count, but no part of the second count.

Manfred
From: Manfred W. <we...@ic...> - 2002-10-28 13:09:01
Dan Maas wrote:
> Question - should the cycle number simply count mod 8000, or should it
> also include some of the 'seconds' bits from the CYCLE_COUNT register?

What does the hardware provide? I think if we get the complete cycle time (i.e. cycle offset, cycle count and second count) of a packet we receive, we should report it (maybe it is useful for some applications). And if the hardware allows starting transmission at a certain second count and cycle count, we should also provide that feature in libraw1394.

> Here is the issue - to avoid massive numbers of system calls and
> context switches, packets are processed in large batches (by default
> 1000 packets at a time). So internally raw1394_loop_iterate() does
> this:
>
> for (int i = 0; i < 1000; i++)
>         call_iso_packet_handler();
> tell_kernel_packets_are_handled();
>
> Now assume that on one call to the iso packet handler, the user calls
> raw1394_iso_stop(). Instead of looping through the rest of the
> packets, we need to break out of the loop and skip the 'tell_kernel'
> step. This would be rather messy to implement in libraw, though it
> could be done.

OK. Then we should discuss whether raw1394_iso_stop immediately stops the handling of isochronous packets. It would be possible to just stop the reception of new packets, but continue handling already queued ones. That is the current behaviour of libraw: if you stop isochronous reception, there are usually already queued iso packets in the kernel buffer, and they will be handled by subsequent calls to raw1394_loop_iterate. If we keep that behaviour, we will not have the problem you mention.

> I have not worked with the asynchronous side of libraw very much. What
> is the policy on calling libraw from a signal handler?

Signal handlers are executed asynchronously. Since libraw1394 allows one handle to be used by only one thread at a time, the signal handler may create a new raw1394 handle and use it, but it should not use a raw1394 handle that is in use by another thread of the program.

>> That is not a good idea. If I get POLLOUT, that indicates that I
>> should be able to write to the device file without blocking, and
>> POLLIN indicates that I can read without blocking. If you change
>> that behaviour, it will be very misleading.
>
> Well, there must be *some* way to distinguish between pending read()
> events and pending isochronous activity. If raw1394_loop_iterate()
> cannot tell the difference, then it may block unexpectedly. (Remember,
> applications are not allowed to read or write on the fd themselves.)

I do not understand. raw1394_loop_iterate just reads one message (request) from the device file (which might be a response to an asynchronous transaction, isochronous packets, etc.) and handles it. So if we know that read does not block, we also know that raw1394_loop_iterate will not block.

> The only alternative I can think of is returning POLLIN plus some
> other flag, like POLLIN | POLLPRI. Then raw1394_loop_iterate() can
> tell the difference when it polls the fd.

Not a good idea. All those POLL* constants have a defined meaning; we should not misuse them.

> Or, raw1394_loop_iterate() could *always* check for isochronous
> activity when transmitting or receiving; this would result in some
> extra syscalls, but it wouldn't affect poll().

How do we get isochronous data? I thought we got it via queued requests that we read from raw1394, don't we? (I should start looking at your code... ;-))

Manfred
From: Dan M. <dm...@ma...> - 2002-10-28 14:41:00
* Manfred Weihs (we...@ic...) wrote:
> What does the hardware provide? I think if we get the complete cycle
> time (i.e. cycle offset, cycle count and second count) of a packet we
> receive, we should report it (maybe it is useful for some
> applications). And if the hardware allows starting transmission at a
> certain second count and cycle count, we should also provide that
> feature in libraw1394.

OHCI gives you the cycle count plus a few bits of seconds (2 bits, I think). I think I may need to provide some macros for accessing the cycle number and seconds fields, e.g.

int raw1394_iso_cycle_count(int cycle)
{
        return cycle & 0x1FFF;
}

> OK. Then we should discuss whether raw1394_iso_stop immediately stops
> the handling of isochronous packets. It would be possible to just stop
> the reception of new packets, but continue handling already queued
> ones. That is the current behaviour of libraw: if you stop isochronous
> reception, there are usually already queued iso packets in the kernel
> buffer, and they will be handled by subsequent calls to
> raw1394_loop_iterate. If we keep that behaviour, we will not have the
> problem you mention.

Right, raw1394_iso_stop is defined to leave queued packets queued.

The re-entrancy issue remains, however. The packet handler is called multiple times from raw1394_loop_iterate() (and currently it is also called from within raw1394_iso_start(), although I think it would be a good idea to change that). What should happen if the packet handler calls some other function in libraw1394? This could get messy - e.g. if it calls raw1394_destroy_handle(), raw1394_loop_iterate() must notice that and stop trying to queue packets!

I think perhaps it would be best to allow libraw1394 function calls, and set up some mechanism (e.g. an "I'm in the packet handler" flag) to prevent unwanted behavior.

>> I have not worked with the asynchronous side of libraw very much.
>> What is the policy on calling libraw from a signal handler?

Sorry, that was a typo - I meant "packet handler", not "UNIX signal handler"...

> I do not understand. raw1394_loop_iterate just reads one message
> (request) from the device file (which might be a response to an
> asynchronous transaction, isochronous packets, etc.) and handles
> it. So if we know that read does not block, we also know that
> raw1394_loop_iterate will not block.

rawiso does not use the queued read() interface. Isochronous transmission status needs to be updated with as little latency as possible, so it uses a special ioctl() that returns the very latest status of the ringbuffer.

The problem is that raw1394_loop_iterate() must choose to perform either one read() or one isochronous ioctl(). If isochronous activity is pending but no queued event is pending, the read() will block! poll() must provide some way to distinguish between a pending read() event and isochronous (ioctl) activity.

I suppose I could add *another* ioctl() that tells you whether a read() event or isochronous activity is pending, but that seems much worse than using an extra POLL* flag to give the same information!

Regards,
Dan
From: Manfred W. <we...@ic...> - 2002-10-29 09:37:27
Dan Maas wrote:
> The re-entrancy issue remains, however. The packet handler is called
> multiple times from raw1394_loop_iterate() (and currently it is also
> called from within raw1394_iso_start(), although I think it would be a
> good idea to change that). What should happen if the packet handler
> calls some other function in libraw1394? This could get messy -
> e.g. if it calls raw1394_destroy_handle(), raw1394_loop_iterate() must
> notice that and stop trying to queue packets!

Must it? It could also continue to queue the packets; it is just a matter of definition what happens in that case... It will not be able to get new packets from the kernel (because the file descriptor is destroyed), but it could continue to handle the already queued ones.

Another (probably cleaner) solution would be for one call to raw1394_loop_iterate to process just one isochronous packet. You would then have to remember the current position in the iso receive buffer somewhere in the handle data structure. raw1394_destroy_handle would empty that buffer, and the next call to raw1394_loop_iterate would not cause a packet handler to be called. That would reestablish the old policy that one call to raw1394_loop_iterate causes at most one handler to be called. On the other hand, that approach has the major drawback that we could no longer use poll to check whether raw1394_loop_iterate would block. We could therefore add a new function that returns whether raw1394_loop_iterate would block (i.e. there is iso data pending, or poll reports pending data).

> I think perhaps it would be best to allow libraw1394 function calls,
> and set up some mechanism (e.g. an "I'm in the packet handler" flag)
> to prevent unwanted behavior.

The old policy was that calls from within a handler were allowed, and the application programmer has to take care if that could cause problems in his application. The root of this policy is probably that raw1394_loop_iterate is allowed to be called from within one thread; and since the handler is executed in the same thread, it might call raw1394_loop_iterate. But of course we could change that policy.

> rawiso does not use the queued read() interface. Isochronous
> transmission status needs to be updated with as little latency as
> possible, so it uses a special ioctl() that returns the very latest
> status of the ringbuffer.

You could still use the queued read() interface. Just queue a request similar to the old RAW1394_REQ_ISO_RECEIVE or RAW1394_REQ_BUS_RESET that notifies you about changes in the iso transmission status. I would not expect latency problems: if you call raw1394_loop_iterate on a regular basis (in a loop, as it is supposed to be used), the queue will usually be empty (and if there are requests in the queue, they may need low latency as well...). I think it would be clean design to use that read mechanism for all notifications from kernel to userspace; for the other direction (userspace to kernel) we should use either write or ioctl.

> The problem is that raw1394_loop_iterate() must choose to perform
> either one read() or one isochronous ioctl(). If isochronous activity
> is pending but no queued event is pending, the read() will block!
> poll() must provide some way to distinguish between a pending read()
> event and isochronous (ioctl) activity.

As I proposed above, you could also report the isochronous activity using the read() syscall. Then poll would work as before.

> I suppose I could add *another* ioctl() that tells you whether a
> read() event or isochronous activity is pending

In any case it would not be correct for poll to report that there is data to read and then have read block (because only an ioctl works).

Manfred
From: Kristian H. <hog...@us...> - 2002-10-29 12:07:40
Manfred Weihs <we...@ic...> writes:
[...]
> As I proposed above, you could also report the isochronous activity
> using the read() syscall. Then poll would work as before.
>
>> I suppose I could add *another* ioctl() that tells you whether a
>> read() event or isochronous activity is pending
>
> In any case it would not be correct for poll to report that there is
> data to read and then have read block (because only an ioctl works).

What about just using two file descriptors? One for the old raw1394 stuff minus the old isochronous raw1394 API, and a new file descriptor for isochronous traffic. I haven't followed your discussion closely, so the suggestion may be irrelevant, but it would be a nice way to separate the two fundamentally different transmission modes.

Just a thought,
Kristian
From: Dan M. <dm...@ma...> - 2002-10-29 21:17:03
These are all good ideas.

I would be happy to have raw1394_loop_iterate() queue/release only one packet. This can be done without additional system calls, as long as isochronous packets take priority over read() events, i.e.

int raw1394_loop_iterate()
{
        if (iso packets waiting)
                return handle_one_iso_packet();
        else
                read(...); /* normal read() event processing */
}

Then re-entrancy is not a problem, since libraw is just jumping to the packet handler. I really want to make sure we can handle thousands of packets per syscall instead of doing an ioctl() or poll() for each one individually; this cuts down on CPU usage a lot. Hmm - actually, I'm not sure this is going to work. In a GUI application, poll() will probably be called once per packet/raw1394_loop_iterate() anyway...

I think I will simply say that isochronous packet handlers are "special", and may not call any libraw functions. Packet handlers can still communicate errors or start/stop instructions with their return value.

> We could therefore add a new function that returns whether
> raw1394_loop_iterate would block (i.e. there is iso data pending, or
> poll reports pending data).

I have already made an addition in the rawiso branch:

raw1394_loop_iterate_timeout(raw1394handle_t handle, int timeout);

This is the same as raw1394_loop_iterate, except that it takes a 'timeout' argument like poll(). You can pass 0 for the timeout to do a non-blocking check, or -1 to block indefinitely. So if you don't want to block, just say raw1394_loop_iterate_timeout(handle, 0).

I will think about queueing isochronous activity together with read() events. Isochronous activity is not really event-based; it's more yes-or-no - are there packets waiting or not? I could queue an event when the first packet comes in, and then when later packets arrive, only append a new event if there are no isochronous events in the queue. This would prevent the queue from growing too long.

I'd prefer not to add another file descriptor. Then I'd have to duplicate all of the setup / port selection logic from raw1394, and it would require another device node. IMHO that is unnecessary.

Regards,
Dan
From: Kurt K. <konolige@AI.SRI.COM> - 2002-10-28 18:09:16
Having looked at the discussion of the new rawiso implementation, and without having looked at the code, I was wondering if anyone could say whether the video1394 functionality is present:

1. The ability to queue multiple buffers to receive frames of video
2. The ability to specify frame-per-buffer reception
3. The ability to mmap buffers to user space

In particular, I'm concerned with getting a "no-copy" interface to video frames, with low CPU overhead (one frame per buffer). The message about the iso loop handling 1000 packets before returning is a little scary...

Cheers --Kurt
From: Dan M. <dm...@ma...> - 2002-10-28 22:07:33
|
* Kurt Konolige (konolige@AI.SRI.COM) wrote:
> Having looked at the discussion of the new rawiso implementations, and
> without having looked at the code, I was wondering if anyone could say
> whether the video1394 functionality is present:

I have not worked with video1394 very much, but I am pretty sure rawiso provides a superset of its capabilities.

I would be able to give a better answer if you could define what you mean by a "frame" of video. I am used to thinking in terms of isochronous packets.

rawiso provides a ringbuffer of packets which are mmap'ed to user space. The libraw1394 API is actually very simple; all you see is a constant stream of packets. (It is "zero copy" in the sense that you get direct access to the DMA buffer.)

> Especially, I'm concerned with getting a "no-copy" interface to
> video frames, with low-cpu overhead (one frame per buffer). The
> message about the iso loop handling 1000 packets before returning is
> a little scary...

Batching ISO packets should not be scary - it allows much better performance than an API that would make one syscall per packet!! It is much more efficient to batch up a few hundred packets and send them to the driver all in one go.

I was just recording and playing back a lot of DV video today with rawiso. CPU usage was steady at around 10% or so (Athlon 1.3GHz).

Regards,
Dan |
From: Kurt K. <konolige@AI.SRI.COM> - 2002-10-28 22:15:43
|
Dan, a frame of video can contain a lot of info -- for example, 640x480 YUV422 has about 600KB. That's a lot of packets. There's an OHCI mode that allows you to specify all of the data from one frame's worth of packets be deposited into a single buffer, by DMA, with zero CPU intervention. The frames are delimited by a sync bit in the packet headers.

If you give me the 600 or so packets that make up a frame, then I have to copy their contents into a buffer to make a contiguous frame, which is prohibitively expensive. DV has a compression of about 6:1 over an uncompressed stream, so it's not so bad to deal with individual packets.

Does the proposed raw iso API support the frame-per-buffer handling of the current video1394/ohci1394 drivers?

Cheers --Kurt

> -----Original Message-----
> From: lin...@li...
> [mailto:lin...@li...] On Behalf Of Dan Maas
> Sent: Monday, October 28, 2002 2:07 PM
> To: Kurt Konolige
> Cc: lin...@li...
> Subject: Re: rawiso update & documentation
>
> I have not worked with video1394 very much, but I am pretty sure
> rawiso provides a superset of its capabilities.
>
> I would be able to give a better answer if you could define what you
> mean by a "frame" of video. I am used to thinking in terms of
> isochronous packets.
>
> rawiso provides a ringbuffer of packets which are mmap'ed to user
> space. The libraw1394 API is actually very simple; all you see is a
> constant stream of packets. (it is "zero copy", in the sense that you
> get direct access to the DMA buffer)
>
> Batching ISO packets should not be scary - it allows much better
> performance than an API that would make one syscall per packet!! It is
> much more efficient to batch up a few hundred packets and send them to
> the driver all in one go.
>
> I was just recording and playing back a lot of DV video today with
> rawiso. CPU usage was steady at around 10% or so (Athlon 1.3GHz).
>
> Regards,
> Dan |
From: Dan M. <dm...@ma...> - 2002-10-28 22:28:41
|
* Kurt Konolige (konolige@AI.SRI.COM) wrote:
> Dan, a frame of video can contain a lot of info -- for example, 640x480
> YUV422 has about 600KB. That's a lot of packets. There's an OHCI mode that
> allows you to specify all of the data from one frame's worth of packets be
> deposited into a single buffer, by DMA, with zero CPU intervention. The
> frames are delimited by a sync bit in the packet headers.

Aha, I see. I am not very familiar with this form of reception. (rawiso at this point only uses packet-per-buffer mode.) I think you could accomplish what you need with the rawiso API, though it might not be as straightforward as the single-buffer method.

I could offer an alternative to raw1394_iso_recv_start() that takes a sync tag rather than a cycle number to start on.

Or, it may make sense to just separate this kind of batch-receive functionality into a totally separate API - raw1394_iso_recv_begin_batch() or something...

> If you give me the 600 or so packets that make up a frame, then I
> have to copy their contents into a buffer to make a contiguous
> frame, which is prohibitively expensive.

What exactly are you doing with the data? If you are saving it to a file, or displaying it on screen, you'll need to copy it once anyway. Two copies is only slightly more expensive than one copy due to the CPU cache.

It is good to hear comments from other users of isochronous transmission. I am mostly focused on IEC 61883 streaming, so you can see that the API is biased in that direction...

(People have been talking about deprecating video1394, but I don't really think that's appropriate anytime soon. Obviously it provides good features that people are using intensively. The reason I am introducing rawiso is that I find video1394 to be pretty much useless for IEC 61883 applications, which have specific requirements for latency and packet timestamping.)

Regards,
Dan |
From: Kurt K. <konolige@AI.SRI.COM> - 2002-10-28 23:02:27
|
The vision research community is very interested in preserving the frame-per-buffer functionality of ohci1394/video1394; a very similar interface in the new raw iso API would help. I think this new API has a much better structure for dealing with channel assignment and the other bookkeeping involved with ISO streams.

Again, pardon my ignorance of the raw iso API. Basically, outside of the bookkeeping stuff, the functionality required is:

1. Queue up buffers of frame size, to be filled with a frame of packets, delimited by the sync tag
2. Check for buffer status (idle, queued, filled) -- with a polling and waiting interface

There's a lot of code in video1394.c for setting up the OHCI DMA programs to deposit packets into the frame buffer. I don't know if this is duplicated in the raw iso API.

Cheers --Kurt

> -----Original Message-----
> From: Dan Maas [mailto:dm...@ma...]
> Sent: Monday, October 28, 2002 2:29 PM
> To: Kurt Konolige
> Cc: lin...@li...
> Subject: Re: rawiso update & documentation
>
> Aha, I see. I am not very familiar with this form of
> reception. (rawiso at this point only uses packet-per-buffer mode). I
> think you could accomplish what you need with the rawiso API, though
> it might not be as straightforward as the single buffer method.
>
> I could offer an alternative to raw1394_iso_recv_start() that takes a
> sync tag rather than a cycle number to start on.
>
> Or, it may make sense to just separate this kind of batch-receive
> functionality into a totally separate API.
>
> raw1394_iso_recv_begin_batch() or something... |
From: Dan D. <da...@de...> - 2002-10-29 00:04:32
|
On Mon, 2002-10-28 at 17:28, Dan Maas wrote:
> Aha, I see. I am not very familiar with this form of
> reception. (rawiso at this point only uses packet-per-buffer mode). I
> think you could accomplish what you need with the rawiso API, though
> it might not be as straightforward as the single buffer method.
>
> I could offer an alternative to raw1394_iso_recv_start() that takes a
> sync tag rather than a cycle number to start on.

Yes, that is exactly what is needed. Digital Camera uses the sync field of the iso packet header to indicate start-of-frame. This works great in buffer-fill mode because there are no empty packets. The rate is adjusted by varying the packet size depending upon the format/resolution of the video. So, you can program exactly the number of descriptor blocks needed to fill a single buffer, but you then need to tell the DMA to wait for that sync field (the w field of INPUT_MORE).

It is still possible to use a packet-oriented approach. It is possible using the existing raw1394 to capture from Digital Camera because it retains the iso header for applications to check for the sync tag. However, many in this field need the lowest latency possible. Therefore, the mmap feature of video1394 has been very critical, in conjunction with its ability to indicate the number of ready buffers (frames). Using this, an application (libdc1394) is able to skip the intermediate ready frames and deliver just the latest one. (libdc1394 has another mode that allows an application to receive every frame, if that is more desirable for the application.) I am not sure if we can achieve the same level of low latency using the new rawiso API.

> What exactly are you doing with the data? If you are saving it to a
> file, or displaying it on screen, you'll need to copy it once
> anyway. Two copies is only slightly more expensive than one copy due
> to the CPU cache.

I think you would be blown away by the applications! People use it for stereo imaging, digital microscopy, robotics, machine vision, etc. In any case, I don't know the details of the low latency applications.

> It is good to hear comments from other users of isochronous
> transmission. I am mostly focused on IEC 61883 streaming, so you can
> see that the API is biased in that direction...

Unfortunately, I can not keep up to spot these issues! But it's definitely not too late.

> (people have been talking about deprecating video1394 but I don't
> really think that's appropriate anytime soon. Obviously it provides
> good features that people are using intensively. The reason I am
> introducing rawiso is that I find video1394 to be pretty much useless
> for IEC 61883 applications, which have specific requirements for
> latency and packet timestamping)

We are talking about deprecating it for kernel 2.6, which is not that soon. It will be maintained for kernel 2.4, however. I understand there are pci_dma issues in the current video1394 that prevent it from being used in kernel 2.5+ anyway. |
From: Dan M. <dm...@ma...> - 2002-10-29 01:20:31
|
* Dan Dennedy (da...@de...) wrote:
> Yes, that is exactly what is needed. Digital Camera uses the sync
> field of the iso packet header to indicate start-of-frame. This works
> great in buffer fill mode because there are no empty packets. The
> rate is adjusted by varying the packet size depending upon the
> format/resolution of the video. So, you can program exactly the
> number of descriptor blocks needed to fill a single buffer, but you
> then need to tell DMA to wait for that sync field (the w field of
> INPUT_MORE).

I see. I think rawiso could handle this. It would have to be a separate API, though, since today's code requires a non-contiguous packet buffer.

I can see that we might end up with two APIs - a "streaming" API suitable for IEC 61883 stuff, and a "batch" API for this digital camera stuff...

> the iso header for applications to check for the sync tag. However,
> many in this field need the lowest latency possible. Therefore, the
> mmap feature of video1394 has been very critical in conjunction with
> its ability to indicate the number of ready buffers (frames). Using
> this, an application (libdc1394) is able to skip the intermediate
> ready frames, and deliver just the latest one. (libdc1394 has
> another mode that allows an application to receive every frame if
> that is more desirable for the application.) I am not sure if we can
> achieve the same level of low latency using the new rawiso API.

I see. This is a fundamentally different approach from the current rawiso code. My focus has been "zero dropped packets at any cost," whereas this seems to be "just give me the latest frame."

I will have to think about this some more. I'm pretty sure this is going to require a separate API at both the kernel hpsb level and the libraw1394 user level.

> I think you would be blown away by the applications! People use it
> for stereo imaging, digital microscopy, robotics, machine vision,
> etc. In any case, I don't know the details of the low latency
> applications.

My point was that you are doing *something* with the data. Even if you are simply scanning the data without recording it, everything *will* end up in the CPU cache. And once the data is cached, performing an extra copy or two is very cheap.

The biggest benefit of "zero copy" only applies when the CPU never touches the data at all, e.g. sending data directly from kernel filesystem buffers with sendfile().

> Unfortunately, I can not keep up to spot these issues! But it's
> definitely not too late.

My current focus is just getting the streaming rawiso work into the ieee1394 and libraw1394 trunk for DV/MPEG-2/AMDTP... We can worry about other uses later...

I'll be out of town Wednesday on business. I'm hoping to get the kernel side of rawiso merged before I leave. libraw1394 will have to wait until I get back later this week.

Regards,
Dan |
From: Kurt K. <konolige@AI.SRI.COM> - 2002-10-29 01:27:30
|
I get the Dans confused... Anyhow, thanks to Dan D for a cogent explanation of the video apps and their need for buffer-per-frame.

Dan M, you might take a look at video1394 to see how it handles setting up the chaining DMA program to fill a buffer-per-frame using the sync bit.

Also, while theoretically copying from cached data can be cheap, in actual practice many things can interfere. A typical vision app will access the data many times, but it may concentrate on certain parts of the image, or use other data that flushes image data from the cache. In my own experience handling IEC-type formats, where I have to look at every packet, the app can require 50% more CPU time than a buffer-per-frame scheme.

Cheers --Kurt

> -----Original Message-----
> From: Dan Maas [mailto:dm...@ma...]
> Sent: Monday, October 28, 2002 5:20 PM
> To: Dan Dennedy
> Cc: Kurt Konolige; linux1394-devel
> Subject: Re: rawiso update & documentation
>
> I see. I think rawiso could handle this. It would have to be a
> separate API though, since today's code requires a non-contiguous
> packet buffer.
>
> I can see that we might end up with two APIs - a "streaming" API
> suitable for IEC 61883 stuff, and a "batch" API for this digital
> camera stuff...
>
> I see. This is a fundamentally different approach from the current
> rawiso code. My focus has been "zero dropped packets at any cost,"
> whereas this seems to be "just give me the latest frame."
>
> I will have to think about this some more. I'm pretty sure this is
> going to require a separate API at both the kernel hpsb level and
> the libraw1394 user level.
>
> My point was that you are doing *something* with the data. Even if you
> are simply scanning the data without recording it, everything *will*
> end up in the CPU cache. And once the data is cached, performing an
> extra copy or two is very cheap.
>
> The biggest benefit of "zero copy" only applies when the CPU never
> touches the data at all. e.g. sending data directly from kernel
> filesystem buffers with sendfile()
>
> My current focus is just getting the streaming rawiso work into
> ieee1394 and libraw1394 trunk for DV/MPEG-2/AMDTP... We can worry
> about other uses later...
>
> I'll be out of town Wednesday on business. I'm hoping to get the
> kernel side of rawiso merged before I leave. libraw1394 will have to
> wait until I get back later this week.
>
> Regards,
> Dan |
From: Dan M. <dm...@ma...> - 2002-10-29 01:33:01
|
* Kurt Konolige (konolige@AI.SRI.COM) wrote:
> I get the Dans confused...

So do I...

> Dan M, you might take a look at video1394 to see how it handles
> setting up the chaining DMA program to fill a buffer-per-frame using
> the sync bit. ... the app can require 50% more CPU time than a
> buffer-per-frame scheme.

Understood, and noted. This is definitely going to require a separate API =). Now I understand why the OHCI architecture needs to be so flexible...

Anyway, please sit tight. Let me get the IEC 61883 streaming stuff working right, then I'll take a look at the specific needs of buffer-per-frame mode.

Regards,
Dan |