Re: [Ocf-linux-users] OCF userland support
From: David M. <Dav...@se...> - 2007-10-11 23:45:08
Jivin Egor N. Martovetsky lays it down ...
> David,
>
> I was able to go to unlocked_ioctl. Things improved quite
> a bit. I suppose this should be included in the next ocf release,

Definitely

> although there
> is a chance that it will expose bugs in some drivers - like it did in mine.

The drivers are supposed to be able to deal with it. I would rather get
that sorted out than have a kernel lock anywhere near it :-)

> The problem that I saw with ocf-bench, turned out to be my driver
> bug after all - I accidentally had one of the variables declared static
> in *_process(), and it was being modified by other threads. Actually,
> it seems the bkl was in ioctl for awhile, I looked in 2.4 tree, it is
> there.

There you go. I would never have thought that something like that would
be there. Makes no sense to me :-) I'll work out a patch.

> Another issue is that I think the throughput calculation is not correct
> in cryptotest.c, at least not for SMP environment. To calculate the
> throughput we want to have total data size processed divided by real
> time it took to process it. So the nops should always be multiplied by
> the number of threads. Also, to calculate time we are interested in
> real time, not the time that processes spent running divided by the
> number of threads. This gives incorrect results for SMP systems with
> more than 1 thread. Ideally we would synchronize threads, but for now
> I take delta between first process start time and last process stop
> time as execution time.

Yeah, multi threaded cryptotest has always needed its results multiplied
by the number of threads. I have never bothered to fix it, preferring
to keep it as original as possible. Happy to put in a patch though,

Cheers,
Davidm

> >Jivin Egor N. Martovetsky lays it down ...
> >
> >>David - thanks for a quick response. I have comments in line.
> >>
> >>David McCullough wrote:
> >>
> >>>Jivin Egor N. Martovetsky lays it down ...
> >>>
> >>>>David,
> >>>>
> >>>>I noticed that the throughput I get when using cryptotest or OpenSSL
> >>>>speed is much worse than what I get using ocf-bench. I also don't get
> >>>>much improvement, if at all, when running multiple threads of the
> >>>>above programs.
> >>>>
> >>>ocf-bench runs in kernel mode, the data does not need to be
> >>>copied from user space to kernel space and back. This makes a massive
> >>>difference to performance.
> >>>
> >>>All user apps need to pass their data through to the kernel and back.
> >>>Unfortunately we don't have a zero copy API for OCF (yet ;-)
> >>>
> >>>Basically, for OCF accelerated user space, you need to be using larger
> >>>packets to help overcome the overheads of the user-kernel-user copies,
> >>>but it will never be as good as in-kernel crypto with a zero copy
> >>>interface.
> >>>
> >>Yes, I was aware of the copying, and it explains some of the performance
> >>degradation that I see with a single thread user space program vs.
> >>kernel mode. As you point out, the performance of user space program
> >>gets better relative to kernel mode, as the packet size is increased.
> >>However, in ocf-bench I can keep cpu 100% utilized submitting and
> >>processing done packets, while a single thread of cryptotest is unable
> >>to do that, so I tried to run a few threads, and saw the throughput
> >>get worse.
> >>
> >
> >All that makes sense, except that it got worse, see below.
> >
> >>>>It seems that this is a result of using ioctl (vs unlocked_ioctl) to
> >>>>access /dev/crypto, which would only allow one process doing crypto
> >>>>at a given time. Is that a known problem and are there plans to fix
> >>>>it?
> >>>>
> >>>I wasn't aware that ioctl would prevent multiple processes from working
> >>>in parallel. I have seen performance improvements with multiple threads
> >>>on 2.4 systems.
> >>>Haven't checked on 2.6
> >>>
> >>In 2.6 kernel the do_ioctl() function in fs/ioctl.c does a kernel lock
> >>before calling device's ioctl.
> >>
> >
> >Ok, that is just plain ugly :-( This used to be ok and I obviously missed
> >the addition of unlocked_ioctl and the BKL.
> >
> >It should be safe to switch cryptodev across to unlocked_ioctl since
> >that is what the code expects (and gets on other kernels/systems).
> >
> >>Since cryptodev ioctl submits a packet and waits for completion before
> >>returning, effectively only one request can be processed at a given
> >>time, and I am not able to take advantage of multiple crypto channels
> >>executing in parallel.
> >>
> >>>>Also, is ocf-bench SMP safe? I had to set CRYPTO_F_BATCH in the
> >>>>crp_flags to make it work, otherwise with the CRYPTO_F_CBIMM it would
> >>>>not work in the SMP mode.
> >>>>
> >>>I have never thought nor checked that ocf-bench is SMP safe. Which OCF
> >>>driver are you using when doing your tests? It could explain a few
> >>>things,
> >>>
> >>It's my own driver, for a new PA Semi chip, and since it is still under
> >>development - yes, it can explain a few things. :)
> >>
> >
> >I was more interested in whether it was cryptosoft or one of the HW
> >drivers. Generally the HW drivers work better with immediate callbacks
> >as there is still a "gap" between the callin and callback.
> >
> >When your completion call is run before you have returned from the
> >initial request, your code needs to be a lot more careful ;-)
> >
> >Unfortunately OCF hasn't had a huge amount of SMP focus. I have run it
> >on SMP machines using hifn drivers, but not that often. So you may hit
> >some other SMP issues.
> >
> >>But in this case, I don't think so, because in general it works fine,
> >>and ocf-bench works fine in nosmp mode, or with CRYPTO_F_BATCH mode,
> >>which makes the completions go through a callback queue that is
> >>protected by spinlocks, as opposed to immediate callbacks.
> >>
> >
> >If you get a handle on what is happening let me know, it would be nice
> >to get it fixed.
> >
> >Cheers,
> >Davidm
> >
>
> --
> Egor N. Martovetsky
>

--
David McCullough, dav...@se..., Ph:+61 734352815
Secure Computing - SnapGear
http://www.uCdot.org http://www.cyberguard.com
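
For reference, here is a minimal sketch of the throughput calculation discussed in this thread: total bytes processed by all threads, divided by the real time between the first thread's start and the last thread's stop. It is not taken from cryptotest.c. The struct thread_run and the function aggregate_throughput are hypothetical names used only for illustration, and the timestamps are assumed to be wall-clock seconds that each thread has already recorded.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-thread record; not a structure from cryptotest.c. */
struct thread_run {
    double start_sec;   /* wall-clock time at which the thread started */
    double stop_sec;    /* wall-clock time at which the thread finished */
    unsigned long nops; /* operations completed by this thread */
};

/*
 * Aggregate throughput in bytes per second.
 *
 * Bytes are summed over every thread, and the elapsed time is the real
 * time from the earliest start to the latest stop, rather than the
 * per-thread running time divided by the thread count.
 */
double aggregate_throughput(const struct thread_run *runs, size_t nthreads,
                            size_t bytes_per_op)
{
    assert(runs != NULL && nthreads > 0);

    double first_start = runs[0].start_sec;
    double last_stop = runs[0].stop_sec;
    double total_bytes = 0.0;

    for (size_t i = 0; i < nthreads; i++) {
        if (runs[i].start_sec < first_start)
            first_start = runs[i].start_sec;
        if (runs[i].stop_sec > last_stop)
            last_stop = runs[i].stop_sec;
        total_bytes += (double)runs[i].nops * (double)bytes_per_op;
    }

    double elapsed = last_stop - first_start;

    /* Guard against a zero or negative interval from bad timestamps. */
    return elapsed > 0.0 ? total_bytes / elapsed : 0.0;
}
```

For two threads that each complete 1000 operations on 1024-byte buffers, with the first starting at 0.0 s and the last stopping at 2.5 s, this gives 2048000 / 2.5 = 819200 bytes per second. Because the threads are not synchronized, the interval includes any time during which only some of them were running, so the figure is a lower bound on the steady-state rate.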