ocf-linux-users Mailing List for Open Cryptographic Framework for Linux (Page 9)
Brought to you by:
david-m
From: David M. <dav...@mc...> - 2010-05-30 04:40:59

Jivin ALEXANDRU IONUT GRAMA lays it down ...
> On 27/5/2010, "Kim Phillips" <kim...@fr...> wrote:
>
> >On Thu, 27 May 2010 02:10:12 +0200
> >" ALEXANDRU IONUT GRAMA" <ai....@al...> wrote:
> >
> >> On 26/5/2010, "Kim Phillips" <kim...@fr...> wrote:
> >
> >> >It should be straightforward to convert talitos to support SEC 1.x h/w;
> >> >it has a different ring buffer mechanism (which, if I knew more about,
> >> >I'd be able to tell you whether it allowed simultaneous ciphers and
> >> >hashes)...
> >>
> >> Thank you Kim, I didn't know I shouldn't load the cryptosoft module. In
> >> this guide (
> >> http://www.docunext.com/wiki/My_Notes_on_Patching_2.6.22_with_OCF#The_Results
> >> ), the author uses cryptosoft, and I thought that I should load it. I've
> >> found a guide to SEC 1.x at this website:
> >> http://cache.freescale.com/files/32bit/doc/user_guide/SEC1SWUG.pdf?fpsp=1&WT_TYPE=Users%20Guides&WT_VENDOR=FREESCALE&WT_FILE_FORMAT=pdf&WT_ASSET=Documentation
> >
> >that's a guide for a different, standalone driver.
> >
> Ok, I thought that the other driver performs the correct tasks for
> kernel 2.4, and that the values given as triggers for the different
> operations are the same.
> >> This one gives me the values that I should send to the
> >> crypto-processor to perform the proper operation, but I don't know the
> >> meaning of the symbols in the talitos source code. I want to
> >> adapt talitos to be fully compatible with the SEC 1.x architecture and,
> >> if my code changes are appropriate for the project, contribute them
> >> to the OCF project to provide integration with the SEC 1.x branch. So,
> >> David and Kim, when you have some time, could you please give me some
> >> explanation of the meaning of the symbols and functions that you use
> >> in talitos? It would be very much appreciated!!!
> >
> >even though it was written based on a SEC v2 manual, most of the
> >references in the talitos driver should coincide with nomenclature used
> >in the MPC8272 PowerQUICC II Family Reference Manual, Rev. 2, Chapter
> >38: Security Engine (SEC).
> >
> >Keep in mind that since mainline linux added support for asynchronous
> >crypto, a new talitos driver has since been merged:
> >
> >http://git.kernel.org/?p=linux/kernel/git/herbert/cryptodev-2.6.git;a=blob;f=drivers/crypto/talitos.c;h=637c105f53d262f904230c77b5bc5a5a5234fda7;hb=master
> >
> >OCF can also utilize this driver through its cryptosoft interface
> >module, so depending on your license preference (OCF is Dual BSD/GPL,
> >kernel.org is GPL), it might be worth checking out.
> >
>
> I've already checked the new module, but I thought that there's no way
> to make it work with cryptodev because it doesn't include any
> interface to cryptodev (I've looked into it a bit, I didn't check it
> deeply). What I cannot understand is how cryptosoft will know about the
> "new" talitos if talitos only has an interface to cryptoAPI. Is there
> any patch that you have to apply to make cryptodev compatible with
> cryptoAPI?
>
> Today I spoke with my tutor about the current state of the
> project, and he told me that we need to submit some benchmarks of
> IPsec working now, in June.
>
> The point is that I'm not working with the base 2.6.21 kernel; it's a
> patched version that includes a backport of UBI and other features that
> our system needs. So I'm planning to switch to a 2.6.34 kernel, which
> means a lot of work to adapt the current patch to the 2.6.34 version,
> and that task is not trivial.
>
> In summary, I'll return to the cryptodev/kernel work in July; now I have
> exams and some IPsec benchmarks to do. When I finish those tasks,
> I'll have benchmarks without cryptodev, with cryptodev + cryptodev's
> talitos, and with cryptodev + cryptosoft + talitos. I'll tell you the
> address where you can find those benchmarks and get an idea of the
> performance of cryptodev.

You can use the "in-kernel" version of the talitos driver with cryptodev
by using cryptosoft. cryptosoft isn't just software: it uses the kernel's
crypto API, so anything that is available there then becomes available to
OCF. The latest OCF release has a version of cryptosoft that can use the
kernel's newer ASYNC operations under cryptoAPI and thus also use the
talitos HW driver provided by the kernel.

The other alternative is to use the OCF talitos driver instead of
cryptosoft. Which one you choose depends on your requirements, both time
and performance.

Cheers,
Davidm

> >> >> On 25/5/2010, "David McCullough" <dav...@mc...> wrote:
> >> >>
> >> >> >Jivin Kim Phillips lays it down ...
> >> >> >> On Wed, 26 May 2010 07:55:51 +1000
> >> >> >> David McCullough <dav...@Mc...> wrote:
> >> >> >>
> >> >> >> > Jivin Kim Phillips lays it down ...
> >
> >> >> >> not a kernel pre-emption issue per se; I just wanted to mention it
> >> >> >> makes it harder to overcome serializing the overhead of sending the
> >> >> >> request to h/w and back. Also, newer talitos h/w can perform ciphers
> >> >> >> and hashes simultaneously (I'm not sure if the 8272 can do that though).
> >> >> >
> >> >> >But the 8272 still has a queue for crypto requests right? Which means you
> >> >> >can have several outstanding requests to the HW at any point?
> >
> >turns out the sec1.0 in the 8272 has four dma channels, so yes, it can
> >do a cipher and a hash at the same time.
> >
> >Cheers,
> >
> >Kim

--
David McCullough, dav...@mc..., Ph:+61 734352815
McAfee - SnapGear http://www.mcafee.com http://www.uCdot.org

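David's point is that cryptosoft hands OCF whatever the kernel crypto API already provides. One way to see whether the kernel is offering a talitos-backed algorithm at all is to look at /proc/crypto. The following is a minimal sketch, assuming the usual /proc/crypto layout of blank-line-separated "key : value" records. The function names are hypothetical, and the sample text (including the driver name cbc-aes-talitos) is illustrative rather than output captured from the hardware discussed in this thread.

```python
def parse_proc_crypto(text):
    """Split /proc/crypto-style text into one dict per algorithm record."""
    entries = []
    for block in text.strip().split("\n\n"):
        entry = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:  # ignore lines without a "key : value" shape
                entry[key.strip()] = value.strip()
        if entry:
            entries.append(entry)
    return entries


def talitos_algorithms(text):
    """Names of algorithms whose driver string mentions talitos."""
    return [
        e["name"]
        for e in parse_proc_crypto(text)
        if "talitos" in e.get("driver", "") and "name" in e
    ]


# Illustrative sample only (assumed shape, not taken from the thread).
SAMPLE = """\
name         : cbc(aes)
driver       : cbc-aes-talitos
module       : talitos
priority     : 3000

name         : sha1
driver       : sha1-generic
module       : kernel
priority     : 0
"""

if __name__ == "__main__":
    print(talitos_algorithms(SAMPLE))
```

On a real system the text would come from reading /proc/crypto instead of SAMPLE. An empty result would suggest that cryptosoft has no hardware-backed implementation to pick up, although it says nothing about the separate OCF talitos driver.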
From: sedat a. <sed...@gm...> - 2010-05-29 14:39:19

Hi list,

I am new to this list. I have a problem with the Freescale Security
Engine (Talitos) on the MPC8572DS platform. I am using Linux kernel 2.6.33
patched with ocf-linux-26-20100325.patch.

After successfully building the kernel and the ocf, cryptodev and talitos
modules I did the following:

modprobe ocf
modprobe cryptodev
modprobe talitos

Every step is OK. Then I ran the cryptotest application:

cryptotest -d talitos -a aes

As a note, I am using the talitos driver under crypto/ocf/talitos instead
of drivers/crypto/talitos.

Then I got the following error:

Unable to handle kernel paging request for instruction fetch
Faulting instruction address: 0x61363334
Oops: Kernel access of bad area, sig: 11 [#70]
MPC8572 DS
Modules linked in: talitos cryptodev(P) ocf(P) [last unloaded: cryptosoft]
NIP: 61363334 LR: f2449994 CTR: 61363335
REGS: ef2c3ab0 TRAP: 0400 Tainted: P D (2.6.33.3)
MSR: 00029000 <EE,ME,CE> CR: 22088428 XER: 20000000
TASK = ef872ee0[7200] 'cryptotest' THREAD: ef2c2000
GPR00: c0c04000 ef2c3b60 ef872ee0 ef14f6c0 c11f8160 00000a10 00000010 00000001
GPR08: 00000000 c0400000 00000000 61363335 20088422 1001da1c 00000005 00000002
GPR16: 00000001 00000000 00000018 ef292b40 f2450000 00000000 00000004 ef27cf80
GPR24: f2450000 efb70954 ef14f6c0 ef14f6c0 00000003 efb70978 ef27cdc0 ef245cd8
NIP [61363334] 0x61363334
LR [f2449994] talitos_process+0x3ec/0xec0 [talitos]
Call Trace:
[ef2c3b60] [f244992c] talitos_process+0x384/0xec0 [talitos] (unreliable)
[ef2c3bc0] [f2396d28] crypto_invoke+0x98/0x214 [ocf]
[ef2c3bf0] [f2397120] crypto_dispatch+0x27c/0x3a4 [ocf]
[ef2c3c30] [f24164ac] cryptodev_op+0x4ac/0x94c [cryptodev]
[ef2c3ca0] [f2417468] cryptodev_ioctl+0x6d8/0x1a84 [cryptodev]
[ef2c3e90] [c00ba6dc] vfs_ioctl+0x88/0x9c
[ef2c3ea0] [c00bae78] do_vfs_ioctl+0xc4/0x760
[ef2c3f10] [c00bb554] sys_ioctl+0x40/0x74
[ef2c3f40] [c00102e8] ret_from_syscall+0x0/0x3c
Instruction dump:
XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
---[ end trace d969f1c27b62fd31 ]---
Segmentation fault

I couldn't solve the problem. I found that the error occurs at the
dma_map_single() line in the talitos_process function.

Any help will be appreciated.

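One cautious observation about the oops above: the faulting address 0x61363334 and the CTR value 0x61363335 are made up entirely of printable ASCII bytes. That is sometimes a hint that a function pointer was overwritten with string data, but it is only a hint. The sketch below just performs the decoding, reading the value big-endian because the MPC8572 is a PowerPC part. The helper name is hypothetical, and nothing here identifies which pointer in talitos_process was affected.

```python
from typing import Optional


def address_as_ascii(addr: int, width: int = 4) -> Optional[str]:
    """Return the bytes of addr as text if every byte is printable ASCII."""
    raw = addr.to_bytes(width, "big")  # big-endian, as on PowerPC
    if all(0x20 <= b < 0x7F for b in raw):
        return raw.decode("ascii")
    return None  # at least one byte is not printable, so likely a real address


if __name__ == "__main__":
    # Register values copied from the oops in the message above.
    for label, value in (("NIP", 0x61363334),
                         ("CTR", 0x61363335),
                         ("LR", 0xF2449994)):
        print(label, hex(value), address_as_ascii(value))
```

The LR value, which the oops resolves to talitos_process+0x3ec, does not decode as text, which is what a normal module address would be expected to look like.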
From: A. I. G. <ai....@al...> - 2010-05-28 21:32:09

On 27/5/2010, "Kim Phillips" <kim...@fr...> wrote:

>On Thu, 27 May 2010 02:10:12 +0200
>" ALEXANDRU IONUT GRAMA" <ai....@al...> wrote:
>
>> On 26/5/2010, "Kim Phillips" <kim...@fr...> wrote:
>
>> >It should be straightforward to convert talitos to support SEC 1.x h/w;
>> >it has a different ring buffer mechanism (which, if I knew more about,
>> >I'd be able to tell you whether it allowed simultaneous ciphers and
>> >hashes)...
>>
>> Thank you Kim, I didn't know I shouldn't load the cryptosoft module. In
>> this guide (
>> http://www.docunext.com/wiki/My_Notes_on_Patching_2.6.22_with_OCF#The_Results
>> ), the author uses cryptosoft, and I thought that I should load it. I've
>> found a guide to SEC 1.x at this website:
>> http://cache.freescale.com/files/32bit/doc/user_guide/SEC1SWUG.pdf?fpsp=1&WT_TYPE=Users%20Guides&WT_VENDOR=FREESCALE&WT_FILE_FORMAT=pdf&WT_ASSET=Documentation
>
>that's a guide for a different, standalone driver.
>
Ok, I thought that the other driver performs the correct tasks for
kernel 2.4, and that the values given as triggers for the different
operations are the same.

>> This one gives me the values that I should send to the
>> crypto-processor to perform the proper operation, but I don't know the
>> meaning of the symbols in the talitos source code. I want to
>> adapt talitos to be fully compatible with the SEC 1.x architecture and,
>> if my code changes are appropriate for the project, contribute them
>> to the OCF project to provide integration with the SEC 1.x branch. So,
>> David and Kim, when you have some time, could you please give me some
>> explanation of the meaning of the symbols and functions that you use
>> in talitos? It would be very much appreciated!!!
>
>even though it was written based on a SEC v2 manual, most of the
>references in the talitos driver should coincide with nomenclature used
>in the MPC8272 PowerQUICC II Family Reference Manual, Rev. 2, Chapter
>38: Security Engine (SEC).
>
>Keep in mind that since mainline linux added support for asynchronous
>crypto, a new talitos driver has since been merged:
>
>http://git.kernel.org/?p=linux/kernel/git/herbert/cryptodev-2.6.git;a=blob;f=drivers/crypto/talitos.c;h=637c105f53d262f904230c77b5bc5a5a5234fda7;hb=master
>
>OCF can also utilize this driver through its cryptosoft interface
>module, so depending on your license preference (OCF is Dual BSD/GPL,
>kernel.org is GPL), it might be worth checking out.
>

I've already checked the new module, but I thought that there's no way
to make it work with cryptodev because it doesn't include any
interface to cryptodev (I've looked into it a bit, I didn't check it
deeply). What I cannot understand is how cryptosoft will know about the
"new" talitos if talitos only has an interface to cryptoAPI. Is there
any patch that you have to apply to make cryptodev compatible with
cryptoAPI?

Today I spoke with my tutor about the current state of the
project, and he told me that we need to submit some benchmarks of
IPsec working now, in June.

The point is that I'm not working with the base 2.6.21 kernel; it's a
patched version that includes a backport of UBI and other features that
our system needs. So I'm planning to switch to a 2.6.34 kernel, which
means a lot of work to adapt the current patch to the 2.6.34 version,
and that task is not trivial.

In summary, I'll return to the cryptodev/kernel work in July; now I have
exams and some IPsec benchmarks to do. When I finish those tasks,
I'll have benchmarks without cryptodev, with cryptodev + cryptodev's
talitos, and with cryptodev + cryptosoft + talitos. I'll tell you the
address where you can find those benchmarks and get an idea of the
performance of cryptodev.

Thank you again for all your help, it was really useful for me. See you
back in July!

Kind regards,
Alexandru.

>> >> On 25/5/2010, "David McCullough" <dav...@mc...> wrote:
>> >>
>> >> >Jivin Kim Phillips lays it down ...
>> >> >> On Wed, 26 May 2010 07:55:51 +1000
>> >> >> David McCullough <dav...@Mc...> wrote:
>> >> >>
>> >> >> > Jivin Kim Phillips lays it down ...
>
>> >> >> not a kernel pre-emption issue per se; I just wanted to mention it
>> >> >> makes it harder to overcome serializing the overhead of sending the
>> >> >> request to h/w and back. Also, newer talitos h/w can perform ciphers
>> >> >> and hashes simultaneously (I'm not sure if the 8272 can do that though).
>> >> >
>> >> >But the 8272 still has a queue for crypto requests right? Which means you
>> >> >can have several outstanding requests to the HW at any point?
>
>turns out the sec1.0 in the 8272 has four dma channels, so yes, it can
>do a cipher and a hash at the same time.
>
>Cheers,
>
>Kim

From: Kim P. <kim...@fr...> - 2010-05-27 19:53:24

On Thu, 27 May 2010 02:10:12 +0200
" ALEXANDRU IONUT GRAMA" <ai....@al...> wrote:

> On 26/5/2010, "Kim Phillips" <kim...@fr...> wrote:

> >It should be straightforward to convert talitos to support SEC 1.x h/w;
> >it has a different ring buffer mechanism (which, if I knew more about,
> >I'd be able to tell you whether it allowed simultaneous ciphers and
> >hashes)...
>
> Thank you Kim, I didn't know I shouldn't load the cryptosoft module. In
> this guide (
> http://www.docunext.com/wiki/My_Notes_on_Patching_2.6.22_with_OCF#The_Results
> ), the author uses cryptosoft, and I thought that I should load it. I've
> found a guide to SEC 1.x at this website:
> http://cache.freescale.com/files/32bit/doc/user_guide/SEC1SWUG.pdf?fpsp=1&WT_TYPE=Users%20Guides&WT_VENDOR=FREESCALE&WT_FILE_FORMAT=pdf&WT_ASSET=Documentation

that's a guide for a different, standalone driver.

> This one gives me the values that I should send to the
> crypto-processor to perform the proper operation, but I don't know the
> meaning of the symbols in the talitos source code. I want to
> adapt talitos to be fully compatible with the SEC 1.x architecture and,
> if my code changes are appropriate for the project, contribute them
> to the OCF project to provide integration with the SEC 1.x branch. So,
> David and Kim, when you have some time, could you please give me some
> explanation of the meaning of the symbols and functions that you use
> in talitos? It would be very much appreciated!!!

even though it was written based on a SEC v2 manual, most of the
references in the talitos driver should coincide with nomenclature used
in the MPC8272 PowerQUICC II Family Reference Manual, Rev. 2, Chapter
38: Security Engine (SEC).

Keep in mind that since mainline linux added support for asynchronous
crypto, a new talitos driver has since been merged:

http://git.kernel.org/?p=linux/kernel/git/herbert/cryptodev-2.6.git;a=blob;f=drivers/crypto/talitos.c;h=637c105f53d262f904230c77b5bc5a5a5234fda7;hb=master

OCF can also utilize this driver through its cryptosoft interface
module, so depending on your license preference (OCF is Dual BSD/GPL,
kernel.org is GPL), it might be worth checking out.

> >> On 25/5/2010, "David McCullough" <dav...@mc...> wrote:
> >>
> >> >Jivin Kim Phillips lays it down ...
> >> >> On Wed, 26 May 2010 07:55:51 +1000
> >> >> David McCullough <dav...@Mc...> wrote:
> >> >>
> >> >> > Jivin Kim Phillips lays it down ...

> >> >> not a kernel pre-emption issue per se; I just wanted to mention it
> >> >> makes it harder to overcome serializing the overhead of sending the
> >> >> request to h/w and back. Also, newer talitos h/w can perform ciphers
> >> >> and hashes simultaneously (I'm not sure if the 8272 can do that though).
> >> >
> >> >But the 8272 still has a queue for crypto requests right? Which means you
> >> >can have several outstanding requests to the HW at any point?

turns out the sec1.0 in the 8272 has four dma channels, so yes, it can
do a cipher and a hash at the same time.

Cheers,

Kim

From: Kim P. <kim...@fr...> - 2010-05-27 07:00:18

On Wed, 26 May 2010 20:32:49 +0200
" ALEXANDRU IONUT GRAMA" <ai....@al...> wrote:

> Hello sirs! I really appreciate your fast answers, thank you very much.
>
> First, I think you should know some characteristics of my system and
> software layer.
>
> I use a 2.6.21-rc2 kernel with the following options:
> Kernel options --->
>   Timer frequency (300 HZ) --->
>   Preemption Model (Preemptible Kernel (Low-Latency Desktop)) --->
>   [*] Preempt The Big Kernel Lock
>   [*] Kernel support for ELF binaries
> As I understand it, those options give the kernel the preemption
> feature. OCF has been built as modules, so:
>
> Loadable module support --->
>   [*] Enable loadable module support
>   [*] Module unloading
>   [*] Automatic kernel module loading
> Cryptographic options --->
>   OCF Configuration --->
>     <M> OCF (Open Cryptograhic Framework)
>     <M> cryptodev (user space support)
>     <M> cryptosoft (software crypto engine)
>     <M> talitos (HW crypto engine)
> (The other options are disabled)
>
> After applying the patch to Openssl-0.9.8n, I made some changes in
> cryptodev, uncommenting the parts related to
> --with-cryptodev-digest. (I understood from looking at the code that
> the cryptodev-digest feature doesn't work, but I activated it to
> test it.) After that, I compiled OpenSSL with these options:
> powerpc-linux-gcc -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT
> -DDSO_ -DHAVE_CRYPTODEV -DUSE_CRYPTODEV_DIGESTS -DDSO_DLFCN
> -DHAVE_DLFCN_H -DENGINE_DYNAMIC_SUPPORT -Os
>
> For loading and unloading modules, I use insmod (or rmmod) with the
> modules ocf, cryptodev, cryptosoft and talitos.
>
> After that, depending on whether I benchmark with or without talitos, I
> do the insmod or rmmod of ocf, cryptodev, cryptosoft and talitos. The
> benchmarking is done with the time command, so for each execution we
> obtain the time consumed in user mode (U), in system mode (S), and the
> total elapsed time as real time (R):
> $ time openssl enc -e -aes-128-cbc -salt -in kkkk -out kkkk.enc -pass
> pass:'micontrasenalarguisima' -engine cryptodev
>
> kkkk is a file previously generated with dd from /dev/urandom; it
> contains 10 megabytes of random data. The measurements I'm going to show
> you are the minimum of the results produced by a sequence of 50
> executions of each time command. We have checked that the average is
> very close to the minimum, so the minimum is a very good representation
> of the best possible performance.
>
> Command/Times            R(secs)  U(secs)  S(secs)  R-(U+S)*
> 1) with crypto engine      3.4     0.13     3.08    ~=0.026
> 2) crypto by software      3.59    2.29     1.09    ~=0.056
>
> The first command (1) is executed with "-engine cryptodev" and the
> modules loaded; the second one (2) is executed with the modules removed
> and without "-engine cryptodev", so in software.
>
> * The R-(U+S) given is the average of that computation for each
> individual measurement, so it denotes some minimum and constant
> background activity in the system that consistently enlarges the
> measured elapsed time.
>
> Our first surprise is that the total elapsed time with and without the
> engine is very similar (case 1 ~= case 2). We can deduce that the
> performance of the main processor and of the crypto-processor is very
> similar. It's surprising to have a crypto-processor that is not faster
> than the main CPU, but it could be understood if the main CPU could
> perform other tasks in parallel. So we repeated those benchmarks with a
> CPU-consuming process (an infinite loop) running in the background, in
> order to check whether the CPU can really work in parallel. The results
> follow:
>
> Command/Times            R(secs)  U(secs)  S(secs)  R-(U+S)*
> 3) with crypto engine      6.7     0.12     3.09    ~=3.436
> 4) crypto by software      6.25    2.31     0.69    ~=3.262
> +100% CPU in background.
>
> Those figures point to something amazing: it's much faster, cheaper, and
> simpler to have the CPU without the crypto-processor!!!!!!

you're not utilizing the crypto h/w at all - the talitos driver needs
to be modified to support the 8272's SEC 1.x. The -engine cryptodev
results are the results from using the kernel's built-in software
crypto algorithm implementations, because you have loaded the
cryptosoft module.

It should be straightforward to convert talitos to support SEC 1.x h/w;
it has a different ring buffer mechanism (which, if I knew more about,
I'd be able to tell you whether it allowed simultaneous ciphers and
hashes)...

Kim

> We can see that the time used to encrypt the data without cryptodev
> (case 4) is 6.25. This is expected because the CPU is shared (50/50)
> between the openssl process and the background process, and the openssl
> process needs 2.31+0.69=3.00 secs to run, and 6.25 is ~= 2*3.00
> secs. Also, the time spent encrypting the data (U+S) is more or less
> the same independently of the existence of the background process (case
> 1 ~= case 3 and case 2 ~= case 4). And here comes what is strange:
> processing the data with a background process should take a little more
> real time than doing it without the bg process, but not double! (case 3:
> 6.7 > 2*(0.12+3.09)). Suppose that, for example, of those 3.09 secs,
> 1.00 is the CPU loading data into the crypto-processor and the other
> 2.09 secs is waiting for it to finish; then we would expect something
> more similar to 2*(0.12+1.00)+2.09 secs = 4.33 secs of real time. But it
> looks like there is no parallelism between the CPU and the
> crypto-processor, and the CPU waits for the crypto-processor to
> finish without freeing the CPU, as explained in the two following time
> graphs:
>
> The time graph with parallel execution should be like this:
>
> openssl in CPU       | = = = = = = =
> other proc in CPU    |= = = = ====== ====== ======
>                      -------------------------------------------
> openssl in Crypto-PU | ====== ====== ======
>
>
> Instead of the previous graph, I'm thinking that the time graph is
> something like this:
>
> openssl in CPU       | = = = = = = =
> other proc in CPU    |= = = = = = = = = = = = = = = = = = = = =
>                      -------------------------------------------
> openssl in Crypto-PU | = = = = = = = = = = = = = = = = = =
>
> In summary, our questions are:
> Why does case 3 give 6.7 secs instead of much less, as expected?
> Is the first time graph schema correct?
> What can we do to fix it?
>
>
> Best regards,
> Alexandru.
>
>
> On 25/5/2010, "David McCullough" <dav...@mc...> wrote:
>
> >
> >Jivin Kim Phillips lays it down ...
> >> On Wed, 26 May 2010 07:55:51 +1000
> >> David McCullough <dav...@Mc...> wrote:
> >>
> >> >
> >> > Jivin Kim Phillips lays it down ...
> >> > > On Mon, 24 May 2010 23:14:55 +0200
> >> > > " ALEXANDRU IONUT GRAMA" <ai....@al...> wrote:
> >> > >
> >> > > > Hello. My name is Alexandru and I'm doing my final degree project about
> >> > > > porting Linux to an embedded device (a router) that uses a Freescale
> >> > > > 8272. The issue I have to solve is providing IPsec on the router. The
> >> > > > software that I use is Linux 2.6.19 + Quagga + Openssl + ipsec-tools.
> >> > > > The point is that the processor (8272) comes with an embedded
> >> > > > crypto-processor that should help in the encryption process. I've found
> >> > > > this excellent project that provides support for hardware encryption
> >> > > > with Cryptodev + the Talitos driver. Also, the patch for Openssl works
> >> > > > perfectly and I've obtained the feature that I need.
> >> > > >
> >> > > > After making some benchmarks I've discovered that talitos is not
> >> > > > preemptive. The crypto-processor (SEC) should perform the
> >> > > > encryption/decryption operations and leave the processor idle; the
> >> > > > scheduler should be called and let another process become the "active
> >> > > > process". After the crypto-processor finishes the job, it should say
> >> > > > "I'm done!" with an IRQ signal, and the encryption process that needs
> >> > > > the encrypted data should be woken up and continue, getting the
> >> > > > encrypted data from the memory address where the crypto-processor
> >> > > > wrote it.
> >> > > >
> >> > > > Well, the behaviour of talitos seems not to be like that. It looks like
> >> > > > the processor waits for the crypto-processor to finish, and after
> >> > > > that it gets the encrypted data. That wastes processor time
> >> > > > while it is waiting for the crypto-processor to finish. Maybe I'm wrong,
> >> > > > but the benchmarks look like this (2 processes means 1 encrypting and
> >> > > > another doing an infinite loop a=1+1):
> >> > > >                       no CD(R)  no CD(U)  no CD(S)
> >> > > > 1 process crypting      0.36      0.13      0.20
> >> > > > 2 processes (1+1*)      0.72      0.13      0.20
> >> > > > Without Cryptodev it looks normal that the user and system times are
> >> > > > the same but the real time doubles, because there's another process
> >> > > > requiring the CPU.
> >> > > >
> >> > > > Benchmarking the system with Cryptodev I've obtained more or less the
> >> > > > same times (much more system time than without it), but it's not
> >> > > > exactly double, it's 0.02 less (0.36*2 - 0.02). That is, it
> >> > > > improves the time, but not nearly as much as I expected. And that is
> >> > > > because the crypto-processor doesn't leave the processor free.
> >> >
> >> > Ok, I think the problem may be how you are benchmarking it. What commands
> >> > are you running to benchmark it? How are you measuring the CPU usage?
> >> >
> >> > OCF has no busy waits and I am fairly confident that the talitos driver
> >> > doesn't busy wait for anything, but Kim would know best.
> >>
> >> it doesn't.
> >>
> >> > > > I want to add preemption to talitos. Is anyone already working on it?
> >> > > > May I help?
> >> > >
> >> > > I believe this is due to the wait_event_interruptible call in cryptodev.
> >> > >
> >> > > Also note that there are other pre-emption issues due to openssl having
> >> > > a synchronous crypto api (at least last I checked) - that tends to not
> >> > > jive well with asynchronous crypto h/w, such as what you are using.
> >> >
> >> > Can you recall any details as to how a synchronous userspace API was causing
> >> > kernel preemption issues?
> >>
> >> not a kernel pre-emption issue per se; I just wanted to mention it
> >> makes it harder to overcome serializing the overhead of sending the
> >> request to h/w and back. Also, newer talitos h/w can perform ciphers
> >> and hashes simultaneously (I'm not sure if the 8272 can do that though).
> >
> >But the 8272 still has a queue for crypto requests right? Which means you
> >can have several outstanding requests to the HW at any point?
> >
> >As long as the HW can queue requests and doesn't busy wait, OCF will
> >scale over multiple processes/threads/CPU's, at least to a point where
> >it can be explained by bus bandwidth, userspace copy overhead or something :-)
> >
> >We'll just have to wait and see how Alexandru is testing it,
> >
> >Cheers,
> >Davidm
> >
> >--
> >David McCullough, dav...@mc..., Ph:+61 734352815
> >McAfee - SnapGear http://www.mcafee.com http://www.uCdot.org
>

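The measurement procedure described in the quoted message (run the command under time, repeat it, keep the minimum) could be scripted roughly as follows. This is a sketch under stated assumptions, not the script used in the thread: the names Timing, time_once and best_of are hypothetical, the user and system figures are taken from the change in child-process resource usage, and "the minimum" is interpreted as the run with the smallest real time, which the message does not spell out. It relies on the Unix-only resource module.

```python
import resource
import subprocess
import time
from typing import List, NamedTuple


class Timing(NamedTuple):
    real: float    # elapsed wall-clock seconds (R)
    user: float    # CPU seconds in user mode (U)
    system: float  # CPU seconds in system mode (S)


def time_once(argv: List[str]) -> Timing:
    """Run argv once and report R/U/S much like the shell's time command."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)
    real = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    return Timing(real,
                  after.ru_utime - before.ru_utime,
                  after.ru_stime - before.ru_stime)


def best_of(argv: List[str], runs: int = 50) -> Timing:
    """Repeat the measurement and keep the run with the smallest real time."""
    return min((time_once(argv) for _ in range(runs)), key=lambda t: t.real)


# The command line quoted in the thread; it is defined here but not executed.
OPENSSL_CMD = [
    "openssl", "enc", "-e", "-aes-128-cbc", "-salt",
    "-in", "kkkk", "-out", "kkkk.enc",
    "-pass", "pass:micontrasenalarguisima",
    "-engine", "cryptodev",
]
```

Calling best_of(OPENSSL_CMD) on the target board would give one R/U/S triple per configuration. Reporting the mean alongside the minimum, as the message says was checked by hand, would show how noisy the runs are.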
From: A. I. G. <ai....@al...> - 2010-05-27 00:10:23
|
Con fecha 26/5/2010, "Kim Phillips" <kim...@fr...> escribió: >On Wed, 26 May 2010 20:32:49 +0200 >" ALEXANDRU IONUT GRAMA" <ai....@al...> wrote: > >> Hello sirs! I really apreciate your fast answer, thank you very much for >> the answers. >> >> At first, I think you should know some characteristics of my system and >> software layer. >> >> At the first, I use a kernel 2.6.21-rc2, with the next options: >> Kernel options ---> >> Timer frequency (300 HZ) ---> >> Preemption Model (Preemptible Kernel (Low-Latency Desktop)) ---> >> [*] Preempt The Big Kernel Lock >> [*] Kernel support for ELF binaries >> As I understand, those options give to the kernel the preemption >> feature.Ocf have been builded as modules, so: >> >> Loadable module support ---> >> [*] Enable loadable module support >> [*] Module unloading >> [*] Automatic kernel module loading >> Cryptographic options ---> >> OCF Configuration ---> >> <M> OCF (Open Cryptograhic Framework) >> <M> cryptodev (user space support) >> <M> cryptosoft (software crypto engine) >> <M> talitos (HW crypto engine) >> (The other options are disabled) >> >> After applying the patch to Openssl-0.9.8n, I've make some changes in >> cryptodev uncommenting the parts relationated with >> --with-cryptodev-digest. (I've understood looking at the code that >> cryptodev-digest feature doesn't work, but i've activated it for >> testing it). After doing it, I've compile Openssl with those options: >> powerpc-linux-gcc -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT >> -DDSO_ -DHAVE_CRYPTODEV -DUSE_CRYPTODEV_DIGESTS -DDSO_DLFCN >> -DHAVE_DLFCN_H -DENGINE_DYNAMIC_SUPPORT -Os >> >> For loading and unloading modules, I've use insmod (or rmmod) with the >> modules ocf,cryptodev,cryptosoft and talitos. >> >> After that, depending if I benchmark with talitos or without talitos I do >> the insmod or rmmod with ocf,cryptodev,cryptosoft and talitos. 
The >> benchmarking is done with the time command, so for each execution we >> obtain time consumed in user mode (U), system mode (S) and also, total >> elapsed time as Real time (R): >> $ time openssl enc -e -aes-128-cbc -salt -in kkkk -out kkkk.enc -pass >> pass:'micontrasenalarguisima' -engine cryptodev >> >> kkkk is a file was previously generated with dd from /dev/urandom and >> contains 10Megabytes of random data. The measures I'm going to show you >> are the minimum of the results produced by a sequence of 50 executions >> of each time command.We have checked the average is very close to the >> minimum, so the minimum is a very good representation of the best >> possible performance. >> >> Command/Times R(secs) U(secs) S(secs) R-(U+S)* >> 1)with crypto engine 3.4 0.13 3.08 ~=0.026 >> 2)crypto by software 3.59 2.29 1.09 ~=0.056 >> >> The first command (1) is executing with "-engine cryptodev" and the >> modules loaded and the second one (2) is executing with modules removed >> and without "-engine cryptodev", so by software. >> >> * The R-(U+S) given is the average of that computation for each >> individual measurement, so it denotes some minimum and constant >> background activity in the system that constantly enlarges the elapsed >> time measured. >> >> Our first surprise is that total elapsed time with (case 1 ~= case 2) and >> without engine is very similar. We can deduce that the performance of >> the main processor and of the crypto-processor is very similar. It's >> surprising to have a crypto-processor not faster than the main CPU, but >> it could be understood if the main CPU could perform other tasks in >> parallel. So we have repeated those benchmarks with a CPU consuming >> process (an infinite loop) running in background, in order to prove if >> the CPU can perform in parallel really. 
The results follows:
>>
>> Command/Times            R(secs)  U(secs)  S(secs)  R-(U+S)*
>> 3)with crypto engine     6.7      0.12     3.09     ~=3.436
>> 4)crypto by software     6.25     2.31     0.69     ~=3.262
>> +100% CPU in background.
>>
>> Those figures point us something amazing: It's much faster, cheaper, and
>> simple having the CPU without the crypto-processor!!!!!!
>
>you're not utilizing the crypto h/w at all - the talitos driver needs
>to be modified to support the 8272's SEC 1.x. The -engine cryptodev
>results are the results from using the kernel's built-in software
>crypto algorithm implementations, because you have loaded the
>cryptosoft module.
>
>It should be straightforward to convert talitos to support SEC 1.x h/w;
>it has a different ring buffer mechanism (which, if I knew more about,
>I'd be able to tell you whether it allowed simultaneous ciphers and
>hashes)...
>
>Kim

Thank you Kim, I didn't know I shouldn't load the cryptosoft module. In this guide ( http://www.docunext.com/wiki/My_Notes_on_Patching_2.6.22_with_OCF#The_Results ), the author uses cryptosoft, so I thought that I should load it. I've found a guide to SEC 1.x at this website: http://cache.freescale.com/files/32bit/doc/user_guide/SEC1SWUG.pdf?fpsp=1&WT_TYPE=Users%20Guides&WT_VENDOR=FREESCALE&WT_FILE_FORMAT=pdf&WT_ASSET=Documentation

This one gives me the values that I should send to the crypto-processor to perform each operation, but I don't know the meaning of the symbols in the talitos source code. I want to adapt talitos to be fully compatible with the SEC 1.x architecture and, if my changes are appropriate for the project, contribute them to the OCF project to provide integration with the SEC 1.x branch. So, David and Kim, when you have some time, could you please explain the meaning of the symbols and functions that you use in talitos? It would be very much appreciated!

Thank you all, you are shedding light on my project!

Alexandru.
¿David > >> We can see that the time used to crypt the data without cryptodev (case >> 4) is 6.25. This is expected because the CPU is shared (50/50) between >> openssl process and the other background process, and the openssl >> process needs 2.31+0.69=3.00 secs to perform, and 6.25 is ~= 2*3.00 >> secs. Also, the elapsed time for crypt the data (U+S) it's more or less >> the same independly of the existence of the background process, (case 1 >> ~= case 3) and (case 2 ~=case 4). And here comes what it's strange: >> Processing the data with a background process should take few more real >> time that doing it without the bg process, but not the double! (case 3: >> 6.7 > 2*(0.12+3.09)). Supose that, for example, from those 3.09 secs, >> 1.00 is the CPU loadding data to the crypto-processor and the other 2.09 >> secs is waiting for it to finish, then we would expect something more >> simmilar to 2*(0.12+1.00)+2.09 secs = 4.33 secs of real time. But it >> looks like doesn't exist parallelism between the CPU and the >> crypto-processor, and the CPU is waiting for the cryptoprocessor to >> finish without freeing the CPU, as explained in the two following time >> graphs: >> >> The time graph with parallel execution should be like this: >> >> openssl in CPU | = = = = = = = >> other proc in CPU |= = = = ====== ====== ====== >> ------------------------------------------- >> openssl in Crypto-PU | ====== ====== ====== >> >> >> Instead of the previous graph, I'm thinking that the time graph is >> something like this: >> >> openssl in CPU | = = = = = = = >> other proc in CPU |= = = = = = = = = = = = = = = = = = = = = >> ------------------------------------------- >> openssl in Crypto-PU | = = = = = = = = = = = = = = = = = = >> >> In summary, our questions are: >> Why do the case 3 gives 6.7 secs instead of much less as expected? >> Is the first time graph schema correct? >> What can we do for fixing it? >> >> >> Best regards, >> Alexandru. 
>> >> >> Con fecha 25/5/2010, "David McCullough" <dav...@mc...> >> escribió: >> >> > >> >Jivin Kim Phillips lays it down ... >> >> On Wed, 26 May 2010 07:55:51 +1000 >> >> David McCullough <dav...@Mc...> wrote: >> >> >> >> > >> >> > Jivin Kim Phillips lays it down ... >> >> > > On Mon, 24 May 2010 23:14:55 +0200 >> >> > > " ALEXANDRU IONUT GRAMA" <ai....@al...> wrote: >> >> > > >> >> > > > Hello. My name is Alexandru and I'm doing my final degree project about >> >> > > > porting Linux to a embedded device (a router), that uses a 8272 of >> >> > > > Freescale. My issue to solve is provide IPsec to the router. The >> >> > > > software that I used is a Linux 2.6.19 + Quagga + Openssl + ipsec-tools. >> >> > > > The point was that the processor (8272) came with a crypto-processor >> >> > > > embedded in it that should help in the encryption process. I've found >> >> > > > this excelent project that provide the support of hardware encryption >> >> > > > with Cryptodev + Talitos driver. Also, the patch for Openssl works >> >> > > > perfectly and I've obtained the feature that I need. >> >> > > > >> >> > > > After making some benchmarks I've discovered that talitos is not >> >> > > > preemtive. The crypto-processor (SEC) should make the operations of >> >> > > > encryption/decryption and let the processor idle; the scheduler should >> >> > > > be called and let another process to enter as "active process". After >> >> > > > the crypto-processor finish the job, it should say "I'm done!" by a >> >> > > > IRQ signal and the other encryption process that need the encrypted data >> >> > > > should be activated and continue getting the crypted data from the >> >> > > > address of memory where the crypto-processor writted it. >> >> > > > >> >> > > > Well, the behaviour of talitos seems not to be like that. 
It's look like >> >> > > > the processor is waiting for the crypto-processor to finish, and after >> >> > > > that it gets the crypted data.That wastes the time of the processor >> >> > > > while is waiting for the crypto-processor to finish. Maybe I'm wrong, >> >> > > > but the benchmarks looks like (2 processes means 1 crypting and another >> >> > > > doing an infinite loop a=1+1): >> >> > > > no CD(R) no CD(U) no CD (S) >> >> > > > 1 process crypting 0.36 0.13 0.20 >> >> > > > 2 processes(1+1*) 0.72 0.13 0.20 >> >> > > > It looks normal without Cryptodev that the user and system time be the >> >> > > > same, but the the real be double, because there's another process >> >> > > > requiring the CPU. >> >> > > > >> >> > > > Benchmarking the system with Cryptodev I've obtain the more or less the >> >> > > > same times (much more system time that without it), but it's not >> >> > > > exactly the double, it's 0,02 less (0.36*2 - 0.02). That's it >> >> > > > improving the time, but not really how much I've expected. And it is >> >> > > > because crypto-processor doesn't leave free the processor. >> >> > >> >> > Ok, I think the problem may be how you are benchmarking it. What commands >> >> > are you running to benchmark it ? How are you measuring the CPU usage ? >> >> > >> >> > OCF has no busy waits and I am fairly confident that the talitos driver >> >> > doesn't busy wait for anything, but Kim would know best. >> >> >> >> it doesn't. >> >> >> >> > > > I want to add preemtion to talitos, does anyone is working already on it? >> >> > > > May I help? >> >> > > >> >> > > I believe this is due to the wait_event_interruptible call in cryptodev. >> >> > > >> >> > > Also note that there are other pre-emption issues due to openssl having >> >> > > a synchronous crypto api (at least last I checked) - that tends to not >> >> > > jive well with asynchronous crypto h/w, such as what you are using. 
>> >> > >> >> > Can you recall and details as to how a synchronous userspace API was causing >> >> > kernel preemption issues ? >> >> >> >> not a kernel pre-emption issue per se; I just wanted to mention it >> >> makes it harder to overcome serializing the overhead of sending the >> >> request to h/w and back. Also, newer talitos h/w can perform ciphers >> >> and hashes simultaneously (I'm not sure if the 8272 can do that though). >> > >> >But the 8272 still has a queue for crypto requests right ? Which means you >> >can have several outstanding requests to the HW at any point ? >> > >> >As long as the HW can queue requests and doesn't busy wait, OCF will >> >scale over multiple processes/threads/CPU's, at least to a point where >> >it can be explained by bus bandwidth, userspace copy overhead or something :-) >> > >> >We'll just have to wait and see how Alexandru is testing it, >> > >> >Cheers, >> >Davidm >> > >> >-- >> >David McCullough, dav...@mc..., Ph:+61 734352815 >> >McAfee - SnapGear http://www.mcafee.com http://www.uCdot.org >> |
From: A. I. G. <ai....@al...> - 2010-05-26 18:32:59
|
Hello sirs! I really appreciate your fast answers, thank you very much.

First, I think you should know some characteristics of my system and software layer.

I use a 2.6.21-rc2 kernel with the following options:

Kernel options --->
    Timer frequency (300 HZ) --->
    Preemption Model (Preemptible Kernel (Low-Latency Desktop)) --->
    [*] Preempt The Big Kernel Lock
    [*] Kernel support for ELF binaries

As I understand it, those options make the kernel preemptible. OCF has been built as modules:

Loadable module support --->
    [*] Enable loadable module support
    [*] Module unloading
    [*] Automatic kernel module loading
Cryptographic options --->
    OCF Configuration --->
        <M> OCF (Open Cryptograhic Framework)
        <M> cryptodev (user space support)
        <M> cryptosoft (software crypto engine)
        <M> talitos (HW crypto engine)
(The other options are disabled)

After applying the patch to OpenSSL 0.9.8n, I made some changes in cryptodev, uncommenting the parts related to --with-cryptodev-digest. (From looking at the code I understood that the cryptodev-digest feature doesn't work, but I activated it for testing.) After that, I compiled OpenSSL with these options:

powerpc-linux-gcc -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_ -DHAVE_CRYPTODEV -DUSE_CRYPTODEV_DIGESTS -DDSO_DLFCN -DHAVE_DLFCN_H -DENGINE_DYNAMIC_SUPPORT -Os

To load and unload modules I use insmod (or rmmod) with the modules ocf, cryptodev, cryptosoft and talitos. Depending on whether I benchmark with or without talitos, I insmod or rmmod ocf, cryptodev, cryptosoft and talitos.
The benchmarking is done with the time command, so for each execution we obtain the time consumed in user mode (U), in system mode (S), and the total elapsed (real) time (R):

$ time openssl enc -e -aes-128-cbc -salt -in kkkk -out kkkk.enc -pass pass:'micontrasenalarguisima' -engine cryptodev

kkkk is a file previously generated with dd from /dev/urandom; it contains 10 megabytes of random data. The measurements below are the minimum over a sequence of 50 executions of each time command. We have checked that the average is very close to the minimum, so the minimum is a good representation of the best possible performance.

Command/Times            R(secs)  U(secs)  S(secs)  R-(U+S)*
1)with crypto engine     3.4      0.13     3.08     ~=0.026
2)crypto by software     3.59     2.29     1.09     ~=0.056

Command (1) is executed with "-engine cryptodev" and the modules loaded; command (2) is executed with the modules removed and without "-engine cryptodev", so in software.

* The R-(U+S) given is the average of that computation over the individual measurements, so it denotes a minimum, constant background activity in the system that lengthens the measured elapsed time.

Our first surprise is that the total elapsed time with and without the engine is very similar (case 1 ~= case 2). We can deduce that the performance of the main processor and of the crypto-processor is very similar. It's surprising to have a crypto-processor that is not faster than the main CPU, but it would make sense if the main CPU could perform other tasks in parallel. So we repeated the benchmarks with a CPU-consuming process (an infinite loop) running in the background, to check whether the CPU really can work in parallel. The results follow:

Command/Times            R(secs)  U(secs)  S(secs)  R-(U+S)*
3)with crypto engine     6.7      0.12     3.09     ~=3.436
4)crypto by software     6.25     2.31     0.69     ~=3.262
+100% CPU in background.
Those figures show us something amazing: it's much faster, cheaper, and simpler to use the CPU without the crypto-processor!

We can see that the time used to encrypt the data without cryptodev (case 4) is 6.25. This is expected because the CPU is shared (50/50) between the openssl process and the background process: the openssl process needs 2.31+0.69=3.00 secs of CPU, and 6.25 ~= 2*3.00 secs. Also, the time spent encrypting the data (U+S) is more or less the same regardless of the background process (case 1 ~= case 3, and case 2 ~= case 4).

And here comes the strange part: processing the data with a background process should take a little more real time than without it, but not double! (case 3: 6.7 > 2*(0.12+3.09)). Suppose, for example, that of those 3.09 secs, 1.00 is the CPU loading data into the crypto-processor and the other 2.09 secs is waiting for it to finish; then we would expect something closer to 2*(0.12+1.00)+2.09 secs = 4.33 secs of real time. But it looks like there is no parallelism between the CPU and the crypto-processor: the CPU waits for the crypto-processor to finish without being freed, as shown in the two following time graphs.

The time graph with parallel execution should be like this:

openssl in CPU       | = = = = = = =
other proc in CPU    |= = = = ====== ====== ======
-------------------------------------------
openssl in Crypto-PU | ====== ====== ======

Instead, I think the time graph is something like this:

openssl in CPU       | = = = = = = =
other proc in CPU    |= = = = = = = = = = = = = = = = = = = = =
-------------------------------------------
openssl in Crypto-PU | = = = = = = = = = = = = = = = = = =

In summary, our questions are:
Why does case 3 give 6.7 secs instead of much less, as expected?
Is the first time graph correct?
What can we do to fix it?

Best regards,
Alexandru.
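The measurement procedure described in this message (repeat the command under `time`, keep the minimum, and average R-(U+S)) can be sketched as a small script. This is only an illustration, not the script used for the figures above: the function names are hypothetical, it assumes a Unix system where Python's `resource` module is available, and "minimum" is interpreted here as the run with the smallest real time.

```python
import resource
import statistics
import subprocess
import sys
import time


def time_command(argv):
    """Run argv once and return (real, user, system) seconds, like `time`."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    subprocess.run(argv, check=True,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    real = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    # RUSAGE_CHILDREN is cumulative, so the difference belongs to this run.
    return (real,
            after.ru_utime - before.ru_utime,
            after.ru_stime - before.ru_stime)


def benchmark(argv, runs=50):
    """Repeat the command and summarise the samples."""
    samples = [time_command(argv) for _ in range(runs)]
    best = min(samples, key=lambda s: s[0])  # run with the smallest real time
    overhead = statistics.mean(r - (u + s) for r, u, s in samples)
    return {"real": best[0], "user": best[1], "sys": best[2],
            "mean_real_minus_cpu": overhead}


if __name__ == "__main__":
    # Placeholder workload; substitute the openssl command from the message.
    print(benchmark([sys.executable, "-c", "sum(range(100000))"], runs=5))
```

Keeping the per-run samples rather than only the minimum makes it possible to report the spread as well, which helps when judging whether a difference such as 3.4 versus 3.59 secs is meaningful.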
Con fecha 25/5/2010, "David McCullough" <dav...@mc...> escribió: > >Jivin Kim Phillips lays it down ... >> On Wed, 26 May 2010 07:55:51 +1000 >> David McCullough <dav...@Mc...> wrote: >> >> > >> > Jivin Kim Phillips lays it down ... >> > > On Mon, 24 May 2010 23:14:55 +0200 >> > > " ALEXANDRU IONUT GRAMA" <ai....@al...> wrote: >> > > >> > > > Hello. My name is Alexandru and I'm doing my final degree project about >> > > > porting Linux to a embedded device (a router), that uses a 8272 of >> > > > Freescale. My issue to solve is provide IPsec to the router. The >> > > > software that I used is a Linux 2.6.19 + Quagga + Openssl + ipsec-tools. >> > > > The point was that the processor (8272) came with a crypto-processor >> > > > embedded in it that should help in the encryption process. I've found >> > > > this excelent project that provide the support of hardware encryption >> > > > with Cryptodev + Talitos driver. Also, the patch for Openssl works >> > > > perfectly and I've obtained the feature that I need. >> > > > >> > > > After making some benchmarks I've discovered that talitos is not >> > > > preemtive. The crypto-processor (SEC) should make the operations of >> > > > encryption/decryption and let the processor idle; the scheduler should >> > > > be called and let another process to enter as "active process". After >> > > > the crypto-processor finish the job, it should say "I'm done!" by a >> > > > IRQ signal and the other encryption process that need the encrypted data >> > > > should be activated and continue getting the crypted data from the >> > > > address of memory where the crypto-processor writted it. >> > > > >> > > > Well, the behaviour of talitos seems not to be like that. It's look like >> > > > the processor is waiting for the crypto-processor to finish, and after >> > > > that it gets the crypted data.That wastes the time of the processor >> > > > while is waiting for the crypto-processor to finish. 
Maybe I'm wrong, >> > > > but the benchmarks looks like (2 processes means 1 crypting and another >> > > > doing an infinite loop a=1+1): >> > > > no CD(R) no CD(U) no CD (S) >> > > > 1 process crypting 0.36 0.13 0.20 >> > > > 2 processes(1+1*) 0.72 0.13 0.20 >> > > > It looks normal without Cryptodev that the user and system time be the >> > > > same, but the the real be double, because there's another process >> > > > requiring the CPU. >> > > > >> > > > Benchmarking the system with Cryptodev I've obtain the more or less the >> > > > same times (much more system time that without it), but it's not >> > > > exactly the double, it's 0,02 less (0.36*2 - 0.02). That's it >> > > > improving the time, but not really how much I've expected. And it is >> > > > because crypto-processor doesn't leave free the processor. >> > >> > Ok, I think the problem may be how you are benchmarking it. What commands >> > are you running to benchmark it ? How are you measuring the CPU usage ? >> > >> > OCF has no busy waits and I am fairly confident that the talitos driver >> > doesn't busy wait for anything, but Kim would know best. >> >> it doesn't. >> >> > > > I want to add preemtion to talitos, does anyone is working already on it? >> > > > May I help? >> > > >> > > I believe this is due to the wait_event_interruptible call in cryptodev. >> > > >> > > Also note that there are other pre-emption issues due to openssl having >> > > a synchronous crypto api (at least last I checked) - that tends to not >> > > jive well with asynchronous crypto h/w, such as what you are using. >> > >> > Can you recall and details as to how a synchronous userspace API was causing >> > kernel preemption issues ? >> >> not a kernel pre-emption issue per se; I just wanted to mention it >> makes it harder to overcome serializing the overhead of sending the >> request to h/w and back. Also, newer talitos h/w can perform ciphers >> and hashes simultaneously (I'm not sure if the 8272 can do that though). 
> >But the 8272 still has a queue for crypto requests right ? Which means you >can have several outstanding requests to the HW at any point ? > >As long as the HW can queue requests and doesn't busy wait, OCF will >scale over multiple processes/threads/CPU's, at least to a point where >it can be explained by bus bandwidth, userspace copy overhead or something :-) > >We'll just have to wait and see how Alexandru is testing it, > >Cheers, >Davidm > >-- >David McCullough, dav...@mc..., Ph:+61 734352815 >McAfee - SnapGear http://www.mcafee.com http://www.uCdot.org |
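The expected-time arithmetic in Alexandru's message above can be written down as a tiny model. This is a sketch under his stated assumptions only: the split of the 3.09 secs of system time into 1.00 sec of submission and 2.09 secs of waiting is his hypothetical example, the 50/50 CPU share with the background loop is assumed, and the function name is made up for illustration.

```python
def expected_real_time(user, submit, wait, cpu_share=0.5, overlap=True):
    """Expected elapsed seconds for the openssl run next to a CPU hog.

    user      -- user-mode CPU seconds
    submit    -- system seconds the CPU spends feeding the crypto engine
    wait      -- system seconds attributed to waiting for the engine
    cpu_share -- fraction of the CPU openssl gets while competing with the hog
    overlap   -- True if the CPU is released to the other process while waiting
    """
    if overlap:
        # Only the CPU-bound part is stretched by sharing the CPU.
        return (user + submit) / cpu_share + wait
    # Without overlap, the whole U+S time competes for the CPU.
    return (user + submit + wait) / cpu_share


if __name__ == "__main__":
    print(f"{expected_real_time(0.12, 1.00, 2.09):.2f}")                 # 4.33
    print(f"{expected_real_time(0.12, 1.00, 2.09, overlap=False):.2f}")  # 6.42
```

The measured 6.7 secs for case 3 sits close to the no-overlap figure, which is the observation behind the second time graph in the message.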
From: David M. <dav...@mc...> - 2010-05-25 23:54:12
|
Jivin Kim Phillips lays it down ... > On Wed, 26 May 2010 07:55:51 +1000 > David McCullough <dav...@Mc...> wrote: > > > > > Jivin Kim Phillips lays it down ... > > > On Mon, 24 May 2010 23:14:55 +0200 > > > " ALEXANDRU IONUT GRAMA" <ai....@al...> wrote: > > > > > > > Hello. My name is Alexandru and I'm doing my final degree project about > > > > porting Linux to a embedded device (a router), that uses a 8272 of > > > > Freescale. My issue to solve is provide IPsec to the router. The > > > > software that I used is a Linux 2.6.19 + Quagga + Openssl + ipsec-tools. > > > > The point was that the processor (8272) came with a crypto-processor > > > > embedded in it that should help in the encryption process. I've found > > > > this excelent project that provide the support of hardware encryption > > > > with Cryptodev + Talitos driver. Also, the patch for Openssl works > > > > perfectly and I've obtained the feature that I need. > > > > > > > > After making some benchmarks I've discovered that talitos is not > > > > preemtive. The crypto-processor (SEC) should make the operations of > > > > encryption/decryption and let the processor idle; the scheduler should > > > > be called and let another process to enter as "active process". After > > > > the crypto-processor finish the job, it should say "I'm done!" by a > > > > IRQ signal and the other encryption process that need the encrypted data > > > > should be activated and continue getting the crypted data from the > > > > address of memory where the crypto-processor writted it. > > > > > > > > Well, the behaviour of talitos seems not to be like that. It's look like > > > > the processor is waiting for the crypto-processor to finish, and after > > > > that it gets the crypted data.That wastes the time of the processor > > > > while is waiting for the crypto-processor to finish. 
Maybe I'm wrong, > > > > but the benchmarks looks like (2 processes means 1 crypting and another > > > > doing an infinite loop a=1+1): > > > > no CD(R) no CD(U) no CD (S) > > > > 1 process crypting 0.36 0.13 0.20 > > > > 2 processes(1+1*) 0.72 0.13 0.20 > > > > It looks normal without Cryptodev that the user and system time be the > > > > same, but the the real be double, because there's another process > > > > requiring the CPU. > > > > > > > > Benchmarking the system with Cryptodev I've obtain the more or less the > > > > same times (much more system time that without it), but it's not > > > > exactly the double, it's 0,02 less (0.36*2 - 0.02). That's it > > > > improving the time, but not really how much I've expected. And it is > > > > because crypto-processor doesn't leave free the processor. > > > > Ok, I think the problem may be how you are benchmarking it. What commands > > are you running to benchmark it ? How are you measuring the CPU usage ? > > > > OCF has no busy waits and I am fairly confident that the talitos driver > > doesn't busy wait for anything, but Kim would know best. > > it doesn't. > > > > > I want to add preemtion to talitos, does anyone is working already on it? > > > > May I help? > > > > > > I believe this is due to the wait_event_interruptible call in cryptodev. > > > > > > Also note that there are other pre-emption issues due to openssl having > > > a synchronous crypto api (at least last I checked) - that tends to not > > > jive well with asynchronous crypto h/w, such as what you are using. > > > > Can you recall and details as to how a synchronous userspace API was causing > > kernel preemption issues ? > > not a kernel pre-emption issue per se; I just wanted to mention it > makes it harder to overcome serializing the overhead of sending the > request to h/w and back. Also, newer talitos h/w can perform ciphers > and hashes simultaneously (I'm not sure if the 8272 can do that though). 
But the 8272 still has a queue for crypto requests right ? Which means you can have several outstanding requests to the HW at any point ? As long as the HW can queue requests and doesn't busy wait, OCF will scale over multiple processes/threads/CPU's, at least to a point where it can be explained by bus bandwidth, userspace copy overhead or something :-) We'll just have to wait and see how Alexandru is testing it, Cheers, Davidm -- David McCullough, dav...@mc..., Ph:+61 734352815 McAfee - SnapGear http://www.mcafee.com http://www.uCdot.org |
From: Kim P. <kim...@fr...> - 2010-05-25 23:06:05
|
On Wed, 26 May 2010 07:55:51 +1000 David McCullough <dav...@Mc...> wrote: > > Jivin Kim Phillips lays it down ... > > On Mon, 24 May 2010 23:14:55 +0200 > > " ALEXANDRU IONUT GRAMA" <ai....@al...> wrote: > > > > > Hello. My name is Alexandru and I'm doing my final degree project about > > > porting Linux to a embedded device (a router), that uses a 8272 of > > > Freescale. My issue to solve is provide IPsec to the router. The > > > software that I used is a Linux 2.6.19 + Quagga + Openssl + ipsec-tools. > > > The point was that the processor (8272) came with a crypto-processor > > > embedded in it that should help in the encryption process. I've found > > > this excelent project that provide the support of hardware encryption > > > with Cryptodev + Talitos driver. Also, the patch for Openssl works > > > perfectly and I've obtained the feature that I need. > > > > > > After making some benchmarks I've discovered that talitos is not > > > preemtive. The crypto-processor (SEC) should make the operations of > > > encryption/decryption and let the processor idle; the scheduler should > > > be called and let another process to enter as "active process". After > > > the crypto-processor finish the job, it should say "I'm done!" by a > > > IRQ signal and the other encryption process that need the encrypted data > > > should be activated and continue getting the crypted data from the > > > address of memory where the crypto-processor writted it. > > > > > > Well, the behaviour of talitos seems not to be like that. It's look like > > > the processor is waiting for the crypto-processor to finish, and after > > > that it gets the crypted data.That wastes the time of the processor > > > while is waiting for the crypto-processor to finish. 
Maybe I'm wrong, > > > but the benchmarks looks like (2 processes means 1 crypting and another > > > doing an infinite loop a=1+1): > > > no CD(R) no CD(U) no CD (S) > > > 1 process crypting 0.36 0.13 0.20 > > > 2 processes(1+1*) 0.72 0.13 0.20 > > > It looks normal without Cryptodev that the user and system time be the > > > same, but the the real be double, because there's another process > > > requiring the CPU. > > > > > > Benchmarking the system with Cryptodev I've obtain the more or less the > > > same times (much more system time that without it), but it's not > > > exactly the double, it's 0,02 less (0.36*2 - 0.02). That's it > > > improving the time, but not really how much I've expected. And it is > > > because crypto-processor doesn't leave free the processor. > > Ok, I think the problem may be how you are benchmarking it. What commands > are you running to benchmark it ? How are you measuring the CPU usage ? > > OCF has no busy waits and I am fairly confident that the talitos driver > doesn't busy wait for anything, but Kim would know best. it doesn't. > > > I want to add preemtion to talitos, does anyone is working already on it? > > > May I help? > > > > I believe this is due to the wait_event_interruptible call in cryptodev. > > > > Also note that there are other pre-emption issues due to openssl having > > a synchronous crypto api (at least last I checked) - that tends to not > > jive well with asynchronous crypto h/w, such as what you are using. > > Can you recall and details as to how a synchronous userspace API was causing > kernel preemption issues ? not a kernel pre-emption issue per se; I just wanted to mention it makes it harder to overcome serializing the overhead of sending the request to h/w and back. Also, newer talitos h/w can perform ciphers and hashes simultaneously (I'm not sure if the 8272 can do that though). Kim |
From: David M. <dav...@mc...> - 2010-05-25 21:57:09
|
Jivin Kim Phillips lays it down ... > On Mon, 24 May 2010 23:14:55 +0200 > " ALEXANDRU IONUT GRAMA" <ai....@al...> wrote: > > > Hello. My name is Alexandru and I'm doing my final degree project about > > porting Linux to a embedded device (a router), that uses a 8272 of > > Freescale. My issue to solve is provide IPsec to the router. The > > software that I used is a Linux 2.6.19 + Quagga + Openssl + ipsec-tools. > > The point was that the processor (8272) came with a crypto-processor > > embedded in it that should help in the encryption process. I've found > > this excelent project that provide the support of hardware encryption > > with Cryptodev + Talitos driver. Also, the patch for Openssl works > > perfectly and I've obtained the feature that I need. > > > > After making some benchmarks I've discovered that talitos is not > > preemtive. The crypto-processor (SEC) should make the operations of > > encryption/decryption and let the processor idle; the scheduler should > > be called and let another process to enter as "active process". After > > the crypto-processor finish the job, it should say "I'm done!" by a > > IRQ signal and the other encryption process that need the encrypted data > > should be activated and continue getting the crypted data from the > > address of memory where the crypto-processor writted it. > > > > Well, the behaviour of talitos seems not to be like that. It's look like > > the processor is waiting for the crypto-processor to finish, and after > > that it gets the crypted data.That wastes the time of the processor > > while is waiting for the crypto-processor to finish. 
Maybe I'm wrong, > > but the benchmarks looks like (2 processes means 1 crypting and another > > doing an infinite loop a=1+1): > > no CD(R) no CD(U) no CD (S) > > 1 process crypting 0.36 0.13 0.20 > > 2 processes(1+1*) 0.72 0.13 0.20 > > It looks normal without Cryptodev that the user and system time be the > > same, but the the real be double, because there's another process > > requiring the CPU. > > > > Benchmarking the system with Cryptodev I've obtain the more or less the > > same times (much more system time that without it), but it's not > > exactly the double, it's 0,02 less (0.36*2 - 0.02). That's it > > improving the time, but not really how much I've expected. And it is > > because crypto-processor doesn't leave free the processor. Ok, I think the problem may be how you are benchmarking it. What commands are you running to benchmark it ? How are you measuring the CPU usage ? OCF has no busy waits and I am fairly confident that the talitos driver doesn't busy wait for anything, but Kim would know best. > > I want to add preemtion to talitos, does anyone is working already on it? > > May I help? > > I believe this is due to the wait_event_interruptible call in cryptodev. > > Also note that there are other pre-emption issues due to openssl having > a synchronous crypto api (at least last I checked) - that tends to not > jive well with asynchronous crypto h/w, such as what you are using. Can you recall and details as to how a synchronous userspace API was causing kernel preemption issues ? Cheers, Davidm -- David McCullough, dav...@mc..., Ph:+61 734352815 McAfee - SnapGear http://www.mcafee.com http://www.uCdot.org |
From: Kim P. <kim...@fr...> - 2010-05-25 20:58:00
|
On Mon, 24 May 2010 23:14:55 +0200
" ALEXANDRU IONUT GRAMA" <ai....@al...> wrote:

> Hello. My name is Alexandru and I'm doing my final degree project about
> porting Linux to a embedded device (a router), that uses a 8272 of
> Freescale. My issue to solve is provide IPsec to the router. The
> software that I used is a Linux 2.6.19 + Quagga + Openssl + ipsec-tools.
> The point was that the processor (8272) came with a crypto-processor
> embedded in it that should help in the encryption process. I've found
> this excelent project that provide the support of hardware encryption
> with Cryptodev + Talitos driver. Also, the patch for Openssl works
> perfectly and I've obtained the feature that I need.
>
> After making some benchmarks I've discovered that talitos is not
> preemtive. The crypto-processor (SEC) should make the operations of
> encryption/decryption and let the processor idle; the scheduler should
> be called and let another process to enter as "active process". After
> the crypto-processor finish the job, it should say "I'm done!" by a
> IRQ signal and the other encryption process that need the encrypted data
> should be activated and continue getting the crypted data from the
> address of memory where the crypto-processor writted it.
>
> Well, the behaviour of talitos seems not to be like that. It's look like
> the processor is waiting for the crypto-processor to finish, and after
> that it gets the crypted data.That wastes the time of the processor
> while is waiting for the crypto-processor to finish. Maybe I'm wrong,
> but the benchmarks looks like (2 processes means 1 crypting and another
> doing an infinite loop a=1+1):
>                       no CD(R)  no CD(U)  no CD (S)
> 1 process crypting    0.36      0.13      0.20
> 2 processes(1+1*)     0.72      0.13      0.20
> It looks normal without Cryptodev that the user and system time be the
> same, but the the real be double, because there's another process
> requiring the CPU.
>
> Benchmarking the system with Cryptodev I've obtain the more or less the
> same times (much more system time that without it), but it's not
> exactly the double, it's 0,02 less (0.36*2 - 0.02). That's it
> improving the time, but not really how much I've expected. And it is
> because crypto-processor doesn't leave free the processor.
>
> I want to add preemtion to talitos, does anyone is working already on it?
> May I help?

I believe this is due to the wait_event_interruptible call in cryptodev.

Also note that there are other pre-emption issues due to openssl having a synchronous crypto api (at least last I checked) - that tends to not jive well with asynchronous crypto h/w, such as what you are using.

Kim
|
From: A. I. G. <ai....@al...> - 2010-05-24 21:37:25
|
Hello. My name is Alexandru and I'm doing my final degree project on porting Linux to an embedded device (a router) that uses a Freescale 8272. The problem I need to solve is providing IPsec on the router. The software I use is Linux 2.6.19 + Quagga + Openssl + ipsec-tools. The point is that the processor (8272) comes with an embedded crypto-processor that should help in the encryption process. I've found this excellent project, which provides hardware encryption support with Cryptodev + the Talitos driver. Also, the patch for Openssl works perfectly and I've obtained the feature that I need.

After running some benchmarks I've discovered that talitos is not preemptive. The crypto-processor (SEC) should perform the encryption/decryption and leave the processor idle; the scheduler should be called and let another process become the "active process". After the crypto-processor finishes the job, it should say "I'm done!" with an IRQ signal, and the encryption process that needs the encrypted data should be activated and continue by reading the encrypted data from the memory address where the crypto-processor wrote it.

Well, the behaviour of talitos doesn't seem to be like that. It looks like the processor is waiting for the crypto-processor to finish, and only after that does it get the encrypted data. That wastes processor time while it waits for the crypto-processor to finish. Maybe I'm wrong, but the benchmarks look like this (2 processes means 1 encrypting and another doing an infinite loop a=1+1):

                       no CD(R)  no CD(U)  no CD(S)
1 process crypting       0.36      0.13      0.20
2 processes (1+1*)       0.72      0.13      0.20

It looks normal that without Cryptodev the user and system times are the same but the real time is double, because there's another process requiring the CPU.

Benchmarking the system with Cryptodev I've obtained more or less the same times (much more system time than without it), but it's not exactly double, it's 0.02 less (0.36*2 - 0.02). So it is improving the time, but not by as much as I expected. And that is because the crypto-processor doesn't leave the processor free.

I want to add preemption to talitos; is anyone already working on it? May I help?

Thank you.

Best regards,
Alexandru |
From: David M. <dav...@mc...> - 2010-05-13 05:32:31
|
Jivin mix.kao lays it down ...
> Hi,
>
> I compiled the ocf and ixp4xx kernel modules with Intel access-library-with-crypto-3.0 <http://10.2.3.243/cvsweb/fruitfarm/packages/access-library-with-crypto-3.0/?hideattic=0;f=h#dirlist>
> and ran the cryptotest tool on the target to test.
>
> * When I load ixp4xx.ko with ocf, cryptotest runs into trouble (the process hangs with no output for a long time).
> * When I load cryptosoft.ko with ocf, cryptotest works fine.
>
> Any advice will be appreciated.

The ixp driver has never been tested with Access lib 3.0; the latest version
it is known to work with is 2.4 (they dropped support for the ixp425 so we
never took it further :-)

You may need to do some debugging :-(

> cryptotest code from http://fxr.googlebit.com/source/tools/tools/crypto/cryptotest.c

Why don't you use the version of cryptotest included with OCF ?

Cheers,
Davidm

-- 
David McCullough, dav...@mc..., Ph:+61 734352815
McAfee - SnapGear http://www.mcafee.com http://www.uCdot.org |
From: mix.kao <mi...@ci...> - 2010-05-12 08:47:12
|
Hi,

I compiled the ocf and ixp4xx kernel modules with Intel access-library-with-crypto-3.0 <http://10.2.3.243/cvsweb/fruitfarm/packages/access-library-with-crypto-3.0/?hideattic=0;f=h#dirlist>
and ran the cryptotest tool on the target to test.

* When I load ixp4xx.ko with ocf, cryptotest runs into trouble (the process hangs with no output for a long time).
* When I load cryptosoft.ko with ocf, cryptotest works fine.

Any advice will be appreciated.

cryptotest code from http://fxr.googlebit.com/source/tools/tools/crypto/cryptotest.c |
From: David M. <dav...@mc...> - 2010-05-12 00:38:29
|
Jivin Kim Phillips lays it down ...
> On Tue, 11 May 2010 13:22:38 +0300
> avital sela <avi...@gm...> wrote:
>
> > Hi Kim,
>
> Hi Avital,
>
> > I'm working on an ocf driver for a custom made hw that does AES and SHA, and
> > uses DMA.
> > I've noticed that in your talitos driver you also used DMA but you never
> > checked for alignment, whereas in David's Safenet code he did check for
> > alignments.
> > My questions are:
> > 1. are you using a 1-byte aligned DMA, therefore you don't need alignment?
>
> correct, talitos drives h/w that doesn't have alignment restrictions.

Yep, it's purely a HW limitation, some crypto devices can handle the
unaligned data and some can't. Same thing in network driver land.
The solution is usually "bounce buffers" of some kind.

> > 2. or alternatively, is there some way to guarantee that buffers are 32bit
> > aligned beforehand?
>
> In OCF-linux, both safe and hifn drivers look like they copy data to
> ensure alignment.

For most uses you will receive aligned data. Some of the OCF drivers handle
unaligned data, and some just fail the request.

> In the upstream kernel's crypto API, it looks like a crypto
> driver's .cra_alignmask setting is used for this purpose.
>
> [cc:ing ocf list in case there's something I'm missing.]

Nothing missing that I can see ;-)

Cheers,
Davidm

-- 
David McCullough, dav...@mc..., Ph:+61 734352815
McAfee - SnapGear http://www.mcafee.com http://www.uCdot.org |
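To make the "bounce buffer" idea concrete, here is a small userspace sketch in C. It is only an illustration and is not taken from the safe or hifn drivers: `DMA_ALIGN`, `struct dma_buf`, `dma_buf_prepare` and `dma_buf_finish` are hypothetical names, the 32-bit requirement is assumed from the question above, and plain malloc() stands in for whatever allocator a kernel driver would use.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DMA_ALIGN 4u /* assumed: the device needs 32-bit aligned buffers */

/* Non-zero if p is a multiple of align (align must be a power of two). */
static int is_aligned(const void *p, size_t align)
{
    return ((uintptr_t)p & (align - 1)) == 0;
}

struct dma_buf {
    void *ptr;    /* pointer that is safe to hand to the device */
    void *bounce; /* aligned temporary copy, or NULL if none was needed */
    void *orig;   /* caller's original buffer */
    size_t len;
};

/* Use the caller's buffer directly when aligned, else copy to a bounce buffer. */
static int dma_buf_prepare(struct dma_buf *b, void *data, size_t len)
{
    assert(b != NULL && data != NULL);
    b->orig = data;
    b->len = len;
    b->bounce = NULL;
    b->ptr = data;
    if (len == 0 || is_aligned(data, DMA_ALIGN))
        return 0;
    b->bounce = malloc(len); /* malloc memory is aligned for any basic type */
    if (b->bounce == NULL)
        return -1;
    memcpy(b->bounce, data, len);
    b->ptr = b->bounce;
    return 0;
}

/* After the device is done: copy results back and release the bounce buffer. */
static void dma_buf_finish(struct dma_buf *b)
{
    if (b->bounce != NULL) {
        memcpy(b->orig, b->bounce, b->len);
        free(b->bounce);
        b->bounce = NULL;
    }
}
```

The cost is one extra copy in each direction for unaligned requests, which is why drivers for hardware without alignment restrictions (talitos, per Kim's reply) skip it, and why some drivers simply fail the request instead.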
From: Kim P. <kim...@fr...> - 2010-05-12 00:11:38
|
On Tue, 11 May 2010 13:22:38 +0300
avital sela <avi...@gm...> wrote:

> Hi Kim,

Hi Avital,

> I'm working on an ocf driver for a custom made hw that does AES and SHA, and
> uses DMA.
> I've noticed that in your talitos driver you also used DMA but you never
> checked for alignment, whereas in David's Safenet code he did check for
> alignments.
> My questions are:
> 1. are you using a 1-byte aligned DMA, therefore you don't need alignment?

correct, talitos drives h/w that doesn't have alignment restrictions.

> 2. or alternatively, is there some way to guarantee that buffers are 32bit
> aligned beforehand?

In OCF-linux, both safe and hifn drivers look like they copy data to
ensure alignment.

In the upstream kernel's crypto API, it looks like a crypto
driver's .cra_alignmask setting is used for this purpose.

[cc:ing ocf list in case there's something I'm missing.]

Kim |
From: Stefan R. <ry...@lu...> - 2010-05-10 14:18:16
|
Hi,

I wonder if it is necessary to install a new patched kernel, or if it would be enough to just patch the source of the currently running kernel, build all the ocf-linux stuff as modules, and install only the modules. That is, does the patching always touch something in the kernel image itself?

The reason I am asking is that I own a BUBBA 2 home server (http://www.excito.com/), which has an MPC8313E processor with SEC 2.2, and I would like to test whether I could speed up ssh connections in particular with ocf-linux. The problem is that the kernel is not installed on the hard drive but in flash, and if you put a non-functional kernel in the flash you are in deep trouble. It is fixable with some soldering to add a serial interface to the BUBBA 2.

-- 
Stefan Rystedt |
From: David M. <dav...@mc...> - 2010-03-26 10:20:18
|
Jivin Peter Fry lays it down ...
> I just tried the new release of ocf-linux, I had to remove the section of the openssl patch that added makefile to the root directory, but after that openssl compiled fine.

Thanks, that's supposed to be renamed as it's only for use with SGlinux and
not for general use. I'll fix it up for the next one,

Thanks,
Davidm

-- 
David McCullough, dav...@mc..., Ph:+61 734352815
McAfee - SnapGear http://www.mcafee.com http://www.uCdot.org |
From: Peter F. <pe...@we...> - 2010-03-26 07:39:25
|
I just tried the new release of ocf-linux, I had to remove the section of the openssl patch that added makefile to the root directory, but after that openssl compiled fine. Peter |
From: David M. <dav...@mc...> - 2010-03-25 12:36:42
|
Hi all,

A new ocf-linux release.

Everything is updated to new versions: linux-2.6.33, openssl-0.9.8n.
Some new drivers (Kirkwood, Micronas 7108).
Some rework, especially with cryptosoft now being able to use the kernel's
asynchronous crypto API to take advantage of native linux HW drivers.
Some patches from users.

Full changelog since the last release in ocf-linux-20100325.txt,

http://downloads.sourceforge.net/project/ocf-linux/ocf-linux/20100325

Cheers,
Davidm

-- 
David McCullough, dav...@mc..., Ph:+61 734352815
McAfee - SnapGear http://www.mcafee.com http://www.uCdot.org |
From: David M. <dav...@mc...> - 2010-03-24 10:50:26
|
Jivin Kennedy, Brendan lays it down ...
>
>
> >-----Original Message-----
> >From: David McCullough [mailto:dav...@mc...]
> >
> >Jivin Kennedy, Brendan lays it down ...
> >> Hi Dave,
> >>
> >> Sorry for the slow response! I like the use of errno, but a few questions:
> >>
> >> 1. Are similar changes going to be made for DSA sign/verify functionality?
> >
> >I can do that if you want.
> >
> Yep, we would like those changes to be added, but the changes from my originally submitted patch are just suggestions at the end of the day ;)

Ok, think I got them all in the attached patch.

> >> 2. It seems if the failure is in the driver (as opposed to OCF) for non
> >> algorithmic reasons, this will still show an algorithm fail. I think the
> >> patch code is meant to do that, but:
> >
> >The "trace" from openssl should show whether it was hardware or OCF
> >complaining.
> >
> >> 3. If the issue was because of algorithm fail (not the hardware), then the
> >> algorithm should not be run again in software (where essentially it should
> >> fail there for the same reason). It just slows down response time..
> >
> >The reason it falls back to SW is that, at least in my experience, the HW
> >has limitations that the SW versions often don't.
> >
> >So drivers regularly fail these calls because there are too many bits in this
> >or that operand. If that is the case I would prefer that it was handled in
> >SW and still worked myself. Esp. since if I was using openssl without OCF
> >it would work.
> >
> >Also, for the failure case, I don't mind a true failure to be slower. I
> >don't know of many cases where the performance of the failure case is
> >critical, but I am more than happy to consider it if you have a good
> >case :-)
> >
>
> Hmm, denial of service attack based on many bogus connection attempts utilizing large keys? :)
> I can only guess, however I suppose some tuning of the code is required depending on driver use cases.
>
> I agree that in the case where the hardware cannot handle the request, the operation should be run in software.
> However, I think the driver for that hardware should be able to detect any problems with input buffers etc. Also, detecting different error types can be useful in the above scenarios.
>
> Of course, the final code changes are all up to you. If something doesn't suit enough OCF users to warrant a change, that is fair enough :)

You guys are using it, that almost makes you the one who should decide.

I have cleaned up the error reporting and done a workover to make it all
predictable at least.

I would like to do an OCF release, so I will run with this for now and if it
isn't enough we can look at that again when the issues are better
known/understood.

Thanks,
Davidm

-- 
David McCullough, dav...@mc..., Ph:+61 734352815
McAfee - SnapGear http://www.mcafee.com http://www.uCdot.org |
From: Kennedy, B. <bre...@in...> - 2010-03-22 13:34:53
|
>-----Original Message-----
>From: David McCullough [mailto:dav...@mc...]
>
>Jivin Kennedy, Brendan lays it down ...
>> Hi Dave,
>>
>> Sorry for the slow response! I like the use of errno, but a few questions:
>>
>> 1. Are similar changes going to be made for DSA sign/verify functionality?
>
>I can do that if you want.
>
Yep, we would like those changes to be added, but the changes from my originally submitted patch are just suggestions at the end of the day ;)

>> 2. It seems if the failure is in the driver (as opposed to OCF) for non
>> algorithmic reasons, this will still show an algorithm fail. I think the
>> patch code is meant to do that, but:
>
>The "trace" from openssl should show whether it was hardware or OCF
>complaining.
>
>> 3. If the issue was because of algorithm fail (not the hardware), then the
>> algorithm should not be run again in software (where essentially it should
>> fail there for the same reason). It just slows down response time..
>
>The reason it falls back to SW is that, at least in my experience, the HW
>has limitations that the SW versions often don't.
>
>So drivers regularly fail these calls because there are too many bits in this
>or that operand. If that is the case I would prefer that it was handled in
>SW and still worked myself. Esp. since if I was using openssl without OCF
>it would work.
>
>Also, for the failure case, I don't mind a true failure to be slower. I
>don't know of many cases where the performance of the failure case is
>critical, but I am more than happy to consider it if you have a good
>case :-)
>

Hmm, denial of service attack based on many bogus connection attempts utilizing large keys? :)
I can only guess, however I suppose some tuning of the code is required depending on driver use cases.

I agree that in the case where the hardware cannot handle the request, the operation should be run in software.
However, I think the driver for that hardware should be able to detect any problems with input buffers etc. Also, detecting different error types can be useful in the above scenarios.

Of course, the final code changes are all up to you. If something doesn't suit enough OCF users to warrant a change, that is fair enough :)

Regards,
Brendan

--------------------------------------------------------------
Intel Shannon Limited
Registered in Ireland
Registered Office: Collinstown Industrial Park, Leixlip, County Kildare
Registered Number: 308263
Business address: Dromore House, East Park, Shannon, Co. Clare

This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). Any review or distribution by others is strictly prohibited. If you are not the intended recipient, please contact the sender and delete all copies. |
From: David M. <dav...@mc...> - 2010-03-22 01:55:19
|
Jivin V Jyothi-B22245 lays it down ...
> Hi,
>
> I am trying to use the OCF openssl interface with openssl1.0beta versions
> for DH operations.
> This framework is very useful for making use of HW acceleration
> through openssl.
>
> There are two minor issues we found in the cryptodev_dh_compute_key()
> function definition:
> 1) keylen assignment to the crypt_kop structure:
>    keylen is computed using BN_num_bits(), which returns the key size
>    in number of bits.
>
>    When keylen is assigned to kop.crk_param[3].crp_nbits it is again
>    multiplied by 8, which is not required:
>        kop.crk_param[3].crp_nbits = keylen * 8;
>    The keylen variable already contains the length of the key in bits, so
>    this statement should be:
>        kop.crk_param[3].crp_nbits = keylen; /* keylen already contains
>        the number of bits, so we should not multiply by 8 again */
>
> 2) When "ioctl(fd, CIOCKEY, &kop)" returns success, the 'dhret'
>    value is not filled in, but the caller expects the return value to be
>    the length of the key in bytes:
>    a "dhret = (keylen+7)/8;" statement may be required after the ioctl
>    condition.
>
>        if (ioctl(fd, CIOCKEY, &kop) == -1) {
>            const DH_METHOD *meth = DH_OpenSSL();
>
>            dhret = (meth->compute_key)(key, pub_key, dh);
>        }
>        else
>            dhret = (keylen+7)/8;
>
> I hope these issues get fixed in the next version.

Both problems and your fixes look good to me. I have made them in my local
tree and they will be in the next release,

Thanks,
Davidm

-- 
David McCullough, dav...@mc..., Ph:+61 734352815
McAfee - SnapGear http://www.mcafee.com http://www.uCdot.org |
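The two fixes above come down to keeping bits and bytes apart. A minimal standalone sketch of the arithmetic follows; it is not the actual engine code. `struct demo_param` is a hypothetical stand-in for the real parameter structure and `dh_fill_lengths` is an invented helper, used only so the example compiles without the cryptodev headers.

```c
#include <assert.h>
#include <stddef.h>

/* Round a bit count up to whole bytes, as in "dhret = (keylen+7)/8". */
static size_t bits_to_bytes(size_t nbits)
{
    return (nbits + 7) / 8;
}

/* Hypothetical stand-in for one output parameter of a key operation. */
struct demo_param {
    size_t nbits; /* size of the buffer in bits */
};

/*
 * keylen_bits is what BN_num_bits() would report. The parameter takes
 * the bit count unchanged (no "* 8"), and the function returns the
 * value a caller expects back: the key length in bytes.
 */
static size_t dh_fill_lengths(size_t keylen_bits, struct demo_param *out)
{
    assert(out != NULL);
    out->nbits = keylen_bits;
    return bits_to_bytes(keylen_bits);
}
```

For a 1024-bit modulus this gives nbits = 1024 and a return value of 128; the original code would have passed 8192 bits down to the driver. The `+ 7` matters for sizes that are not a multiple of 8, for example 1023 bits still needs 128 bytes.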
From: David M. <dav...@mc...> - 2010-03-22 00:52:41
|
Jivin Kennedy, Brendan lays it down ...
> >-----Original Message-----
> >From: David McCullough [mailto:dav...@mc...]
> >> >
> >> >Did you just want to be able to identify the HW condition that causes
> >> >this separate to an error ?
> >>
> >> Yes, we wanted to identify that the call was failing because of lack of driver support rather than due to bad inputs or some such thing.
> >
> >Ok, I backed out the change in cryptodev so that a fail is still a fail
> >at the syscall level, but we can detect this when the errno is the same
> >as the kop_status. Saves changing the driver API in any way which is kind of
> >nice to have.
> >
> >Have a look at the attached patch and see if it does what you need, if so
> >then I'll commit that.
> >
>
> Hi Dave,
>
> Sorry for the slow response! I like the use of errno, but a few questions:
>
> 1. Are similar changes going to be made for DSA sign/verify functionality?

I can do that if you want.

> 2. It seems if the failure is in the driver (as opposed to OCF) for non
> algorithmic reasons, this will still show an algorithm fail. I think the
> patch code is meant to do that, but:

The "trace" from openssl should show whether it was hardware or OCF
complaining.

> 3. If the issue was because of algorithm fail (not the hardware), then the
> algorithm should not be run again in software (where essentially it should
> fail there for the same reason). It just slows down response time..

The reason it falls back to SW is that, at least in my experience, the HW
has limitations that the SW versions often don't.

So drivers regularly fail these calls because there are too many bits in this
or that operand. If that is the case I would prefer that it was handled in
SW and still worked myself. Esp. since if I was using openssl without OCF
it would work.

Also, for the failure case, I don't mind a true failure to be slower. I
don't know of many cases where the performance of the failure case is
critical, but I am more than happy to consider it if you have a good case :-)

Cheers,
Davidm

-- 
David McCullough, dav...@mc..., Ph:+61 734352815
McAfee - SnapGear http://www.mcafee.com http://www.uCdot.org |
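The fallback policy discussed in this thread can be sketched in a few lines of C. This is an illustration only, not the openssl engine code: `run_with_fallback`, `struct op_result` and the two `demo_*` operations are invented for the example, and the 4-byte hardware limit is an arbitrary assumption standing in for "too many bits in this or that operand".

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* An operation returns 0 on success or a negative errno value. */
typedef int (*crypto_op_fn)(const unsigned char *in, size_t len,
                            unsigned char *out);

struct op_result {
    int rc;      /* final result seen by the caller */
    int hw_err;  /* what the hardware path returned (0 if it succeeded) */
    int used_sw; /* 1 if the software path had to run */
};

/* Try the hardware first; on any hardware failure retry in software. */
static struct op_result run_with_fallback(crypto_op_fn hw, crypto_op_fn sw,
                                          const unsigned char *in, size_t len,
                                          unsigned char *out)
{
    struct op_result r = {0, 0, 0};

    assert(sw != NULL);
    r.hw_err = hw ? hw(in, len, out) : -ENODEV;
    if (r.hw_err == 0)
        return r;
    r.used_sw = 1; /* hw_err is kept so a trace can say why */
    r.rc = sw(in, len, out);
    return r;
}

/* Hypothetical hardware op: refuses operands larger than 4 bytes. */
#define DEMO_HW_MAX_LEN 4
static int demo_hw_op(const unsigned char *in, size_t len, unsigned char *out)
{
    if (len > DEMO_HW_MAX_LEN)
        return -E2BIG;
    memcpy(out, in, len);
    return 0;
}

/* Hypothetical software op: same result, no size limit. */
static int demo_sw_op(const unsigned char *in, size_t len, unsigned char *out)
{
    memcpy(out, in, len);
    return 0;
}
```

Keeping `hw_err` alongside the final `rc` is one way to address Brendan's point about telling "driver cannot do this" apart from "bad input": the caller still gets a working result from software, and the recorded hardware error is available for logging. A genuinely bad input fails twice under this policy, which is the slower failure path David says he can live with.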
From: Kennedy, B. <bre...@in...> - 2010-03-19 15:51:01
|
>-----Original Message-----
>From: David McCullough [mailto:dav...@mc...]
>> >
>> >Did you just want to be able to identify the HW condition that causes
>> >this separate to an error ?
>>
>> Yes, we wanted to identify that the call was failing because of lack of driver support rather than due to bad inputs or some such thing.
>
>Ok, I backed out the change in cryptodev so that a fail is still a fail
>at the syscall level, but we can detect this when the errno is the same
>as the kop_status. Saves changing the driver API in any way which is kind of
>nice to have.
>
>Have a look at the attached patch and see if it does what you need, if so
>then I'll commit that.
>

Hi Dave,

Sorry for the slow response! I like the use of errno, but a few questions:

1. Are similar changes going to be made for DSA sign/verify functionality?
2. It seems if the failure is in the driver (as opposed to OCF) for non algorithmic reasons, this will still show an algorithm fail. I think the patch code is meant to do that, but:
3. If the issue was because of algorithm fail (not the hardware), then the algorithm should not be run again in software (where essentially it should fail there for the same reason). It just slows down response time..

What do you think?

Regards,
Brendan |