Name                                      Modified    Size      Downloads / Week
CUDA Libs                                 2014-10-17            13
2.05 Beta                                 2014-04-29            14
README                                    2014-04-28  35.3 kB   7
CUDALucas-2.03-cuda4.2-sm_30-x86-64.exe   2012-06-07  176.6 kB  5
CUDALucas-2.03-cuda4.0-sm_20-x86-64.exe   2012-06-07  177.7 kB  2
CUDALucas-2.03-cuda4.1-sm_21-x86-64.exe   2012-06-07  174.6 kB  10
CUDALucas-2.03-cuda3.2-sm_13-x86-64.exe   2012-06-07  180.7 kB  2
CUDALucas-2.03-Linux-x86-64               2012-06-07  86.8 kB   3
CUDALucas.ini                             2012-06-07  3.8 kB    3
Totals: 9 Items                                       835.5 kB  32
Q. What files should I download from SourceForge?
A. There are various executables for both Windows and Linux, some text files,
   and some necessary library files. Pick whichever executable is best for
   you. Win32 is slightly faster than x64 (for most FFT lengths), and CUDA 5.5
   is faster than previous versions. You will need:
   - An executable that is compatible with your CUDA driver version
   - The library files for your CUDA driver version and operating system
   - The CUDALucas.ini file
   - This README
   Place all these files into a CUDALucas folder, and then read the rest of
   this README.

Q. I get errors about "lib cudart not found" or "lib cufft not found". What
   can I do?
A. Download the library files from SourceForge; see the question above.

Q. What's new in 2.05 from 2.03?
A. - RCB
   - On-the-fly FFT selection. Keyboard driven, or automatic if the error
     level exceeds the threshold
   - Included GPU memtest and tools to automate fine-tuning of FFT and thread
     selection
   - Bit shift to prevent errors from producing similar results

####################
# CUDALucas README #
####################

CUDALucas v2.05

Content

0    What is CUDALucas?
1    Supported Hardware
2    Compilation
2.1  Compilation (Linux)
2.2  Compilation (Windows)
3    Running CUDALucas
3.1  Running CUDALucas (Windows)
3.2  Command line options
4    How to get work and report results from/to GIMPS
5    Known issues
6    Tuning
7    FAQ
8    To do list

#######################
# 0 What is CUDALucas #
#######################

(See https://sourceforge.net/p/cudalucas/wiki/Home/ for a version of this
information that includes links.)

CUDALucas is a program implementing the Lucas-Lehmer primality test for
Mersenne numbers using the Fast Fourier Transform implemented by nVidia's
cuFFT library. You need a CUDA-capable nVidia card with compute capability
>= 1.3.

Mersenne numbers are numbers of the form 2^p - 1; it is possible that some of
these numbers are prime. For instance, 2^7-1 is prime, 2^127-1 is prime, and
2^43,112,609-1 is prime (and was the largest known prime number from 2008
until 2013). For various reasons explained in the Wikipedia article,
throughout almost all known history, the largest known prime number has been
a Mersenne prime.

Most CUDALucas users that the developers are aware of use this program to
help search for Mersenne primes in coordination with the Great Internet
Mersenne Prime Search (GIMPS). It is one of the internet's first distributed
computing projects, started in 1996, and since that year it has found every
new largest known prime number. GIMPS searches for primes by first doing some
"trial factoring" to look for a small factor of a Mersenne number; if no
factor is found, GIMPS performs the Lucas-Lehmer test to determine once and
for all whether the Mersenne number is prime. You can participate without
needing to be aware of the mathematics involved; all you need to do is
download and run the free program GIMPS provides, called Prime95.

However, Prime95 is optimized for CPUs; in the last few years, volunteer
developers from the GIMPS community have ported various parts of Prime95's
functionality to GPUs. Shoichiro Yamada took a CPU-based Lucas-Lehmer testing
program (written in generic C, as opposed to the x86-specific assembly of
Prime95) and ported it to CUDA; this is now known as CUDALucas. Mr. Yamada
remains the primary developer of the mathematics code; his username on
MersenneForum.org is 'msft' (and I hope to add him as an admin of this
SourceForge project as soon as possible).
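
As background for the Lucas-Lehmer test named at the top of this section, here
is a minimal sketch of the test in plain Python. It is only an illustration of
the mathematics, not CUDALucas' code: it relies on Python's built-in big
integers, whereas CUDALucas does the squaring with cuFFT on the GPU. The
function name lucas_lehmer is made up for this example, and the sketch assumes
p is an odd prime.

```python
def lucas_lehmer(p: int) -> bool:
    """Return True if the Mersenne number 2**p - 1 is prime.

    Illustrative sketch only; assumes p is an odd prime.
    """
    m = (1 << p) - 1          # the Mersenne number 2^p - 1
    s = 4                     # starting value of the Lucas-Lehmer sequence
    for _ in range(p - 2):    # p - 2 squarings in total
        s = (s * s - 2) % m   # s -> s^2 - 2 (mod 2^p - 1)
    return s == 0             # prime exactly when the final residue is 0


if __name__ == "__main__":
    for p in (7, 11, 127):
        print(p, lucas_lehmer(p))
```

Each test needs p - 2 squarings of a p-bit number, which is why exponents in
the tens of millions take days even on a fast GPU.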

The other GPU programs are mfaktc, a program using CUDA GPUs to perform the
"trial factoring" mentioned above, and mfakto, which is a port of mfaktc to
OpenCL, supporting AMD/ATI graphics cards. (mfakto's developer maintains a
GitHub page for it; both programs are free-and-open-source software under the
GPL. Source and executables for mfakto and mfaktc are available from here and
here respectively.)

To participate in GIMPS yourself using CUDALucas (assuming you have the
necessary CUDA hardware listed above), all you need to do is read section 4.

########################
# 1 Supported Hardware #
########################

CUDALucas should run on all CUDA-capable nVidia GPUs with compute capability
>= 1.3. Unfortunately this excludes AMD/ATI GPUs, but there are other programs
for such cards that you can use to help GIMPS (see section 0).

#################
# 2 Compilation #
#################

You must have the CUDA Toolkit installed, as well as a C compiler. gcc or MSVC
will do in a pinch. The CUDA Toolkit includes nVidia's CUDA compiler, as well
as some necessary library and include files. (You do not need the full CUDA
SDK.) You can get the Toolkit from <http://developer.nvidia.com/cuda-toolkit>.

We are still trying to determine which version of the CUDA libraries and which
architectures produce the fastest executable. In the meantime, use the latest
toolkit plus the defaults in the Makefiles. (Please tell us, either on
SourceForge or on MersenneForum, if you figure out which combinations work
better.)

There are some different make commands you can run, but at the moment none of
them does anything particularly interesting.

###########################
# 2.1 Compilation (Linux) #
###########################

You should be able to run 'make' without modifications. If, when you installed
the toolkit, you did not follow the nVidia defaults, you might need to specify
a CUDA Toolkit location other than the default '/usr/local/cuda' used in the
Makefile.

#############################
# 2.2 Compilation (Windows) #
#############################

MSVS:
MSVS can make debug and non-debug versions from CUDA 4.0 and up. To compile
for CUDA 4.0 through 5.0 you need to have the applicable CUDA toolkit
installed and MSVS 2010. CUDA 5.5 requires the toolkit and MSVS 2012.

Detailed instructions for CUDA 4.0 to 5.0 are around on the internet (or PM
flashjh on mersenneforum for info).

How to create an MSVS 2012 solution from the latest CUDALucas 2.05, CUDA 5.5:

** You must have MSVS 2012, the CUDA 5.5 Toolkit and current drivers installed
   first **

1.  Make a new CUDA project using the project wizard
2.  Delete the kernel.cu that was created by default

Copy the following files to the project folder, then add them to the project
(drag/drop into the MSVS GUI solution explorer):

3.  Add cuda_safecalls.cu to project
4.  Add cudalucas.cu to project
5.  Add parse.c to project
6.  Add parse.h to project

Project properties (these steps must be done for debug and release, Win32 and
x64 options, as applicable!):

7.  Linker|Input|Additional Dependencies, add cufft.lib after cudart.lib, if
    not already there
9.  Linker|Debugging|Generate Debug Info, 'No' for Release, 'Yes' for Debug
10. CUDA C/C++|Common, change target machine platform to 64-bit or 32-bit
11. CUDA C/C++|Device, change code generation to:
    compute_13,sm_13;compute_20,sm_20;compute_30,sm_30;compute_35,sm_35
12. C/C++|Code Generation|Runtime Library, change to Multi-threaded (/MT)
    (release) or Multi-threaded Debug (/MTd) (debug)
13. Build Events|Post-Build Events|Command Line, add:
    echo copy "$(CudaToolkitBinDir)\cudart*.dll" "$(OutDir)"
    copy "$(CudaToolkitBinDir)\cudart*.dll" "$(OutDir)"
    echo copy "$(CudaToolkitBinDir)\cufft32*.dll" "$(OutDir)"
    copy "$(CudaToolkitBinDir)\cufft32*.dll" "$(OutDir)"
    echo copy "$(CudaToolkitBinDir)\cufft64*.dll" "$(OutDir)"
    copy "$(CudaToolkitBinDir)\cufft64*.dll" "$(OutDir)"
14. Configuration Properties|General (as desired, these are just examples)
    OUTPUT DIRECTORY: ..\..\..\Test\
    DEBUG NAME:   debug_$(ProjectName)-$(CudaToolkitVersion)-$(Platform)_r##
    RELEASE NAME: $(ProjectName)-$(CudaToolkitVersion)-$(Platform)_r##

WINDOWS MAKE:
This can make non-debug versions from CUDA 4.0 and up. To compile for CUDA 4.0
through 5.0 you need to have the applicable CUDA toolkit installed and MSVS
2010. CUDA 5.5 requires the toolkit and MSVS 2012.

** You must have the correct MSVS, the applicable CUDA Toolkit and current
   drivers installed first **

1. Obtain make.exe from http://www.equation.com/servlet/equation.cmd?fa=make
   (or from http://gnuwin32.sourceforge.net/packages/make.htm)
2. Place make.exe into the folder with the source files and makefile.win
3. You MUST use the 'command shortcut' included with the appropriate version
   of MSVS. If you want x86, use the x86 shortcut, and if you want x64, use
   the x64 shortcut. Use the shortcut from MSVS 2010 for CUDA 4.0 to CUDA 5.0
   and from MSVS 2012 for CUDA 5.5
4. Open makefile.win and set your desired bit level, CUDA version and
   CUDALucas version, then save. If, when you installed the toolkit, you did
   not follow the nVidia defaults, you might also need to specify a different
   CUDA Toolkit location in makefile.win
5. Type: make -f makefile.win
6. When complete, type: make -f makefile.win clean
7. The executable is placed one directory up from your source files

#######################
# 3 Running CUDALucas #
#######################

CUDALucas is designed to be primarily driven with the information in
CUDALucas.ini. (If you don't have a copy of that file, go to
https://sourceforge.net/projects/cudalucas/files/ to get the latest version.)
You can run CUDALucas from the command line without any arguments, and it
should read CUDALucas.ini (it should be in the same directory) and start
crunching.

CUDALucas reads what numbers to test from a "work file". The default work file
is "worktodo.txt", however you can change the name of that file in
CUDALucas.ini. The information in the work file should look something like:

Test=25613431
or
DoubleCheck=25613431

CUDALucas will interpret this to mean "Test 2^25613431-1 to see if it's a
prime number." See section 4 on how to get numbers to test that haven't been
tested before. (This is done through GIMPS.) CUDALucas will keep crunching
numbers as long as there are assignments in your work file; it will terminate
if the file is empty. Alternatively, you can just pass in a single exponent as
a command line argument, and CUDALucas will then test 2^arg-1 and exit.

When it's done testing a number, CUDALucas will output the results to a
"results file", which defaults to "results.txt"; again, however, you can
change that using the .ini file. We highly encourage you to report your
results to GIMPS (see section 4). You can keep track of your results if you
create an account with GIMPS.

You can modify a number of options that change how CUDALucas behaves; again,
see CUDALucas.ini. You can also specify any of those options from the command
line; try running "./CUDALucas -h" from a terminal to see what options you can
use. Also note that there is a self test mode and a benchmark mode that can
only be specified from the command line.

Note that you need library files to run CUDALucas; these are "cudart.dll" and
"cufft.dll" for Windows, and "cudart.so" and "cufft.so" for Linux. In Windows,
it's sufficient to put the .dll files into the same directory as the
executable; in Linux, you have to set the LD_LIBRARY_PATH environment variable
to include the directory where the .so's are located. These files can be
downloaded from SourceForge.

It is safe to kill CUDALucas with a Ctrl+C (or by most any other method) at
any time. It will write a save file and exit. When you next run CUDALucas, it
will detect that there is a save file and resume.

Please feel free to ask for help on SourceForge or MersenneForum.org if this
isn't clear.

###################################
# 3.1 Running CUDALucas (Windows) #
###################################

Read the section above first. Though CUDALucas is called from the command
line, you can modify its behavior with CUDALucas.ini, which means you don't
need to pass arguments on the command line. What this means for Windows users
is that you can right click on the executable and create a shortcut. Double
clicking on the shortcut should launch CUDALucas in a terminal where you can
watch it crunch. (The drawback to this is that if CUDALucas exits with an
error, the terminal will automatically close and you won't see the error
message.)

############################
# 3.2 Command line options #
############################

-h
    prints a help message listing most of the command line options and exits.
-v
    prints the program version number and exits.
-info
    causes current device info to be printed to the screen at the beginning
    of the first test.
-k
    enables keyboard input during test, see ini file description.
-polite n
    sets the polite iteration interval to n, or disables the polite option if
    n = 0.
-d n
    sets CUDALucas to run on device n.
-c n
    sets checkpoint iteration value. Checkpoints will be written every n
    iterations.
-x n
    sets report iteration value. Screen reports will be written every n
    iterations.
-f n<k|K|m|M>
    sets fft length to n, n * 1024 (if k or K specified), or n * 1048576 (if
    m or M specified). Values of n that are not multiples of 1024 and do not
    end in k, K, m, or M will be rejected.
-threads m s
    sets thread values for the multiplication and splicing kernels. m and s
    should be powers of two between 32 and 1024.
-i filename
    sets the name of the file that the initialization information is to be
    obtained from. Default is CUDALucas.ini.
-s <folder>
    saves all checkpoint files to the subdirectory specified by "folder".
    Default folder is "savefiles".
-r n
    runs the short (n = 0) or long (n = 1) version of the selftest.
-cufftbench s e i
    times i repetitions of a 50 ll iteration loop, for all reasonable fft
    lengths between s * 1024 and e * 1024, then writes the fastest fft
    lengths in the file <gpu> fft.txt. Reasonable lengths are n * 1024 where
    the largest prime factor of n is at most 7.
-threadbench s e i m
    times i repetitions of a 50 ll iteration loop, for certain fft lengths
    between s * 1024 and e * 1024. Each tested fft length gets combined with
    different thread values for the multiplication and splicing kernels. The
    fastest thread values for each fft are written in the file
    <gpu> threads.txt. The parameter m gives some control over which fft
    lengths are tested, which thread values are tested, and screen output:
    bit 0: if set, only fft values from <gpu> fft.txt will be tested,
           otherwise all reasonable fft lengths will be tested.
    bit 1: if set, skips thread value 32.
    bit 2: if set, skips thread value 1024.
    bit 3: if set, suppresses intermediate output: only the optimal thread
           values for each fft will be printed to the screen.
-memtest s i
    tests s 25MB chunks of memory, doing i repetitions of a 100000 iteration
    loop on each of 5 different ll test related sets of data. Each iteration
    consists of copying a 25MB chunk of data, then re-reading and comparing
    that copy to the original.

#################################################################
# 4 How to get work and report results from/to the GIMPS server #
#################################################################

You can get numbers to test from the GIMPS server, which is called PrimeNet.
It is located at http://www.mersenne.org/. You can get and report work
anonymously, however to track what numbers you've tested and track your
credit, you must create an account. You don't even need to enter your email
for an account, though of course it's good to have if you ever lose your login
information :)

Getting work:
    Step 1) Go to http://www.mersenne.org/ and (optionally) log in with your
            username and password
    Step 2) On the menu on the left click "Manual Testing" and then
            "Assignments"
    Step 3) Choose the number of assignments you want. Note that even the
            smallest assignment will take a few days to complete, so we
            recommend you start with just one and come back for more when you
            know how fast you can complete work.
    Step 4) Choose your preferred work type. There are a variety of choices
            here; you can choose the default "World record tests", which means
            if your number is prime, it would be a world record. "Smallest
            available first time tests" might be not-World-record, though
            practically it's exactly the same as "World record tests". "100
            million digits" is way beyond what's currently feasible, and will
            take months or years to complete one test. This isn't recommended.
            Finally, "Double Check tests" is where you get numbers that have
            been tested once, but haven't been double checked. Though Double
            Checking sounds less glamorous, we currently recommend this work
            type. Not only are the assignments much shorter, but at the
            moment, two matching CUDALucas tests will not mark a number as
            "Double Checked". (This is for safety reasons, and hopefully we'll
            add more functionality in the future to remove this restriction.)
            What this means is that it's safer for CUDALucas to test numbers
            that have been tested once with Prime95, though some people do
            first time tests anyway.
    Step 5) Click the button "Get Assignments"
    Step 6) Copy and paste the "Test=..." (or "DoubleCheck=...") lines
            directly into your work file (default "worktodo.txt") in your
            CUDALucas directory.

Now you're all set. :) Just launch CUDALucas and watch it crunch :)

Once CUDALucas has finished a test, report the result to PrimeNet:
    Step 1) Go to http://www.mersenne.org/ and (optionally) log in with your
            username and password. (Again, if you want to track your credit,
            logging in is necessary.)
    Step 2) On the menu on the left click "Manual Testing" and then "Results"
    Step 3) Upload the results file (default "results.txt") generated by
            CUDALucas by using the "Search" and "Upload" buttons.
    Step 4) Once PrimeNet responds with a verification message, you can either
            delete your results file or move the data to a different file.

Advanced usage (set the FFT length):
Since version 2.04 you can specify the FFT length by adding a field to the
"Test=..." assignment line in the work file. To use (e.g.) a 1440K length for
a test, the line should look like "Test=<assignment key>,<exponent>,1440K".
Note that no space is allowed between the number (1440) and the K. You must
have a K or M (e.g. "...,<exponent>,3M" for a 3M length) for the program to
recognize the field as an FFT length. This feature should render the FFTLength
ini option and the -f command line option obsolete.

##################
# 5 Known issues #
##################

- The user interface isn't hardened against malformed input. There are some
  checks, but if you really try you should be able to break it.
- The GUI of your OS might be very laggy while running CUDALucas. (Newer GPUs
  with compute capability 2.0 or higher can handle this _MUCH_ better.) If
  you're experiencing this problem, try setting "Polite" to 1 in
  CUDALucas.ini.
- The very first checkpoint after a restart appears to have very quick
  iteration times, but the iterations aren't actually that fast. The second
  checkpoint and beyond appear normal.
  ** This last point can be removed.
  ** Put information about ffts hanging here?

############
# 6 Tuning #
############

Read CUDALucas.ini (you should have already read it in any case). Some options
to look at are "Polite", "Threads", and "CheckRoundOffAllIterations". You can
also activate an option to save all checkpoint files instead of just the most
recent ones.

A new card should have some integrity checks run on it. Options -r 0 or -r 1
will run self tests which check residues after 10000 iterations for various
exponents. -r 0 runs a short test, testing only a few known Mersenne primes.
-r 1 is a more thorough test. Any residue mismatches in these tests usually
indicate memory problems with the card. For a more complete memory test, use
-memtest n i. Choose n and i so that the test runs for at least a few hours.
If memory errors are detected, decrease the memory clock until the errors go
away. An additional 1-2% decrease from the initial stability point is
recommended.

cufftbench
To optimize fft selection for your card, run -cufftbench s e i. All reasonable
fft lengths between s * 1024 and e * 1024 will be tested i times, where
reasonable is defined as 7-smooth multiples of 1024. Cards driving a display
usually require higher values of i. The results of the test are written in
<gpu> fft.txt. Any old version of the fft.txt file is saved with a time stamp
added to the file name. The fft.txt file consists of the fastest fft lengths
for the particular card, listed in increasing order. Each line starts with the
fft length (as a multiple of 1024) and also includes an estimate of the
largest exponent that fft length can be used with and the iteration time. The
fft.txt file can be edited, but the fft length must be the first entry on each
line, and the ffts must be listed in increasing order. Anything except an
initial numerical entry on any line is ignored.

threadbench
The option -threadbench s e i m times different threads settings for two
kernels: the kernel that does the pointwise squaring and the carry splicing
kernel. fft lengths from s * 1024 to e * 1024 are tested, using values from 32
to 1024 for the threads settings. Just as with the cufftbench option, i
iterations are done at each setting. Larger values of i are needed for cards
driving a display. The fastest times and associated threads values are
appended to <gpu> threads.txt. Only the most recent results are used. This
file can also be edited manually. Each line should start with an fft length
(as a multiple of 1024), followed by the threads value for the squaring kernel
and then the threads value for the splicing kernel. The threads values must be
powers of 2 between 32 and 1024.

Error check interval
If ErrorIterations is set to n, the roundoff error is checked once every n
iterations. ErrorIterations=1 is slowest, but most accurate. With any larger
value, the reported roundoff error is most likely smaller than the largest
roundoff error. Error thresholds are accordingly reduced for such values. For
example, if the error threshold is set to 45 and ErrorIterations is 1, then
any roundoff error <= .45 is ignored. Any roundoff error > .45 triggers the
error handling routines. But for ErrorIterations set to 100, any roundoff
error > .35 will trigger the error handling routines.

Screen report interval
Screen report iterations involve extra memory writing on the device, as well
as a memory transfer from device to host, together with some minimal host
processing. Very frequent screen reports (once every few seconds) result in a
noticeable slowdown and increased cpu utilization.

Checkpoint interval
Checkpoint iterations involve a significant amount of cpu processing to
prepare the checkpoint file, besides writing the checkpoint file to the disk
and backing up the old checkpoint file. Because of this, checkpoint iterations
take significantly longer than non-checkpoint iterations. Less frequent
checkpoints mitigate this delay, but risk losing time in case of power outage
or other unexpected termination of the program.

Error Reset
At each screen report, a roundoff error is computed. This roundoff error is
either the largest roundoff error encountered since the last report or a
percentage of the last reported roundoff error, whichever is larger. The
percentage is given by the variable ErrorReset. Recording a new roundoff error
is a slow process. Larger values of the error reset variable skip recording a
new value more often, speeding up the iteration times. Small values report
smaller roundoff errors that larger values ignore.

Polite
The polite option can be used to introduce some idle time to the gpu. If
Polite=1 and PoliteValue=n, then once every n iterations the gpu is
synchronized, preventing any new work from being scheduled until all
previously assigned work is completed.
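
To make the "reasonable fft lengths" used by -cufftbench and -threadbench
concrete, here is a small Python sketch that lists the 7-smooth multiples of
1024 in a range. It only illustrates the definition given above; the function
names are hypothetical, and CUDALucas itself may apply further filtering that
this README does not describe.

```python
def is_7_smooth(n: int) -> bool:
    """Return True if n has no prime factor larger than 7."""
    if n < 1:
        return False
    for q in (2, 3, 5, 7):
        while n % q == 0:
            n //= q
    return n == 1


def reasonable_fft_lengths(s: int, e: int) -> list:
    """List multipliers n with s <= n <= e such that n * 1024 is 'reasonable'.

    s and e are in units of 1024, like the s and e arguments of -cufftbench.
    """
    return [n for n in range(max(s, 1), e + 1) if is_7_smooth(n)]


if __name__ == "__main__":
    # Candidate lengths (in K) that a run over the range 1400..1500 would cover.
    print(reasonable_fft_lengths(1400, 1500))
```

Because the candidates thin out quickly as lengths grow, a benchmark range can
be fairly wide without testing very many lengths.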

#########
# 7 FAQ #
#########

Q Does CUDALucas support multiple GPUs?
A Yes, with the exception that a single instance of CUDALucas can only use one
  GPU. For each GPU you want to run CUDALucas on, you need (at least) one
  instance of CUDALucas. For each instance you can use the command line option
  "-d <GPU number>" to specify which GPU that instance should use. Please read
  the next question, too.

Q Can I run multiple instances of CUDALucas on the same computer?
A Yes! You can even run more than one instance from the same directory. Use
  the "CUDALucas -i <ini filename>" command line option to specify an INI file
  other than the default "CUDALucas.ini". Each instance must have its own work
  file, however it is safe for all instances to print to the same results
  file. It is NOT safe for two instances to test the same exponent -- they
  will clobber each other's save files.

Q Can I continue (load a checkpoint) from a 32bit version of CUDALucas with a
  64bit version of CUDALucas (and vice versa)?
A For versions 1.6x -- 2.04: yes, checkpoint files should be interchangeable.
  Version 2.05, however, cannot continue from older checkpoints. It uses a
  different checkpoint file format from previous versions, as well as a CRC to
  ensure integrity of the data in the checkpoint file. Running CUDALucas 2.05
  in a directory with an old version of the checkpoint file will cause the
  test to restart at iteration 0, eventually replacing the old checkpoint
  files with new ones.

Q What do the version numbers mean?
A Release numbers are X.XX, where the first number has now reached 2, and the
  other two just go up by one with each release. If you get a version with
  "Alpha" or "Beta" in it, then it probably doesn't work right and you
  shouldn't use it for "production" work.

###########
# 8 To do #
###########

2.05:
- Add log file support.
- Much of the code is placed in the wrong functions, so the interface between
  functions is often extremely awkward. I'd like to fix this, especially since
  it will go a long way toward adding log functionality for 2.05.

TBA:
- Automatic PrimeNet interaction (Eric Christenson is working on this for
  mfaktc)
  ^ specification draft exists
  For now, use MISFIT-CULU by Scott Lemieux:
  http://www.mersenneforum.org/misfit/
  - This will mean users don't have to manually get and report work from
    PrimeNet
  - The security module that would be used would be closed source, to maintain
    integrity of PrimeNet's data. GPL v3 does not allow parts of the program
    to be closed source. Solution: we'll re-release under another license.
    This is NOT the end of the GPL v3 version! We'll release future versions
    of CUDALucas under GPL v3! We want CUDALucas to be open source! The only
    differences in the closed version will be the security module and the
    license information.
Source: README, updated 2014-04-28