Suppose each node can perform 1e6 divisions per second in single thread, lets ignore time for additions.
I can start 10 Amazon EC2 nodes running 16 threads each, and ....
Starting the servers and managing them will take much more time than doing actual job.
Now we have 10 GPU nodes capable 16 threads each, we need a simple job split (by range) and job result merge, and task can be done in a second or two.
What do you think?
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Yes, GPU should implement a MapReduce over a random network topology...
We probably should start with a list of Use Cases of what we want that the GPU does...
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Suppose I need to calculate sum
1/i where i between 1 and 1e8
Suppose each node can perform 1e6 divisions per second in single thread, lets ignore time for additions.
I can start 10 Amazon EC2 nodes running 16 threads each, and ....
Starting the servers and managing them will take much more time than doing actual job.
Now we have 10 GPU nodes capable 16 threads each, we need a simple job split (by range) and job result merge, and task can be done in a second or two.
What do you think?
Yes, GPU should implement a MapReduce over a random network topology...
We probably should start with a list of Use Cases of what we want that the GPU does...