Re: [Ocf-linux-users] talitos driver should be preemtive
From: David M. <dav...@mc...> - 2010-05-25 23:54:12
Jivin Kim Phillips lays it down ...
> On Wed, 26 May 2010 07:55:51 +1000
> David McCullough <dav...@Mc...> wrote:
>
> >
> > Jivin Kim Phillips lays it down ...
> > > On Mon, 24 May 2010 23:14:55 +0200
> > > " ALEXANDRU IONUT GRAMA" <ai....@al...> wrote:
> > >
> > > > Hello. My name is Alexandru and I'm doing my final degree project about
> > > > porting Linux to an embedded device (a router) that uses an 8272 from
> > > > Freescale. My issue to solve is to provide IPsec to the router. The
> > > > software that I used is Linux 2.6.19 + Quagga + Openssl + ipsec-tools.
> > > > The point was that the processor (8272) came with a crypto-processor
> > > > embedded in it that should help in the encryption process. I've found
> > > > this excellent project that provides support for hardware encryption
> > > > with Cryptodev + Talitos driver. Also, the patch for Openssl works
> > > > perfectly and I've obtained the feature that I need.
> > > >
> > > > After making some benchmarks I've discovered that talitos is not
> > > > preemptive. The crypto-processor (SEC) should do the
> > > > encryption/decryption and leave the processor idle; the scheduler should
> > > > be called and let another process enter as "active process". After
> > > > the crypto-processor finishes the job, it should say "I'm done!" with an
> > > > IRQ signal, and the encryption process that needs the encrypted data
> > > > should be activated and continue by getting the encrypted data from the
> > > > memory address where the crypto-processor wrote it.
> > > >
> > > > Well, the behaviour of talitos seems not to be like that. It looks like
> > > > the processor is waiting for the crypto-processor to finish, and after
> > > > that it gets the encrypted data. That wastes the processor's time
> > > > while it is waiting for the crypto-processor to finish. Maybe I'm wrong,
> > > > but the benchmarks look like this (2 processes means 1 encrypting and
> > > > another doing an infinite loop a=1+1):
> > > >
> > > >                      no CD(R)  no CD(U)  no CD(S)
> > > > 1 process crypting     0.36      0.13      0.20
> > > > 2 processes(1+1*)      0.72      0.13      0.20
> > > >
> > > > It looks normal that without Cryptodev the user and system times are the
> > > > same but the real time is double, because there's another process
> > > > requiring the CPU.
> > > >
> > > > Benchmarking the system with Cryptodev I've obtained more or less the
> > > > same times (much more system time than without it), but it's not
> > > > exactly double, it's 0.02 less (0.36*2 - 0.02). So it is
> > > > improving the time, but not by as much as I expected. And that is
> > > > because the crypto-processor doesn't leave the processor free.
> >
> > Ok, I think the problem may be how you are benchmarking it. What commands
> > are you running to benchmark it ? How are you measuring the CPU usage ?
> >
> > OCF has no busy waits and I am fairly confident that the talitos driver
> > doesn't busy wait for anything, but Kim would know best.
>
> it doesn't.
>
> > > > I want to add preemption to talitos, is anyone already working on it?
> > > > May I help?
> > >
> > > I believe this is due to the wait_event_interruptible call in cryptodev.
> > >
> > > Also note that there are other pre-emption issues due to openssl having
> > > a synchronous crypto api (at least last I checked) - that tends to not
> > > jive well with asynchronous crypto h/w, such as what you are using.
> >
> > Can you recall any details as to how a synchronous userspace API was causing
> > kernel preemption issues ?
>
> not a kernel pre-emption issue per se; I just wanted to mention it
> makes it harder to overcome serializing the overhead of sending the
> request to h/w and back. Also, newer talitos h/w can perform ciphers
> and hashes simultaneously (I'm not sure if the 8272 can do that though).

But the 8272 still has a queue for crypto requests, right? Which means you
can have several outstanding requests to the HW at any point?

As long as the HW can queue requests and doesn't busy wait, OCF will scale
over multiple processes/threads/CPUs, at least to a point where it can be
explained by bus bandwidth, userspace copy overhead or something :-)

We'll just have to wait and see how Alexandru is testing it,

Cheers,
Davidm

--
David McCullough, dav...@mc..., Ph:+61 734352815
McAfee - SnapGear  http://www.mcafee.com  http://www.uCdot.org
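
As a rough illustration of the measurement question raised in this thread, here is a minimal user-space sketch in Python. It is not the OCF, cryptodev or talitos code; `sleeping_wait`, `busy_wait` and `measure` are hypothetical names made up for this example. It assumes that a waiter which sleeps until it is woken (the pattern `wait_event_interruptible` provides in the kernel) accumulates almost no CPU time, while a busy-waiting one accumulates CPU time for the whole wait. Comparing real time against user+system time is what tells the two apart.

```python
import threading
import time


def measure(wait_fn):
    """Return (real, cpu) seconds spent inside wait_fn()."""
    real0 = time.perf_counter()
    cpu0 = time.process_time()  # user + system CPU time of this process
    wait_fn()
    return time.perf_counter() - real0, time.process_time() - cpu0


def sleeping_wait(delay=0.2):
    """Block on an event, standing in for sleeping until a completion IRQ."""
    done = threading.Event()
    # Hypothetical "hardware": a timer that signals completion after `delay`.
    threading.Timer(delay, done.set).start()
    done.wait()


def busy_wait(cpu_budget=0.1):
    """Spin until cpu_budget seconds of CPU time have been burned."""
    start = time.process_time()
    while time.process_time() - start < cpu_budget:
        pass


sleep_real, sleep_cpu = measure(sleeping_wait)
busy_real, busy_cpu = measure(busy_wait)
print(f"sleeping wait: real={sleep_real:.2f}s cpu={sleep_cpu:.2f}s")
print(f"busy wait:     real={busy_real:.2f}s cpu={busy_cpu:.2f}s")
```

If the driver path really sleeps, a second CPU-bound process should get the cycles the sleeping waiter leaves free, so its presence need not double the real time of the crypto process. On the target itself the same comparison can be made from the real, user and system columns already quoted in the benchmark.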