numa: in-kernel profiling: use cpu_to_mem() for per-cpu allocations

In-kernel profiling needs to allocate "local" memory for each cpu.
Use cpu_to_mem() instead of cpu_to_node() so that, when a cpu sits on
a memoryless node, the allocation targets the nearest node that has
memory.
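
As a rough illustration of the difference, here is a userspace sketch
(not kernel code). The topology tables and the sketch_*() helpers are
hypothetical stand-ins for the real cpu_to_node() and cpu_to_mem(); the
fallback table is assumed, not derived from any real machine.

```c
#define NR_CPUS  4
#define NR_NODES 2

/* Hypothetical topology: cpus 2 and 3 sit on node 1, which has no memory. */
static const int cpu_node[NR_CPUS] = { 0, 0, 1, 1 };
static const int node_has_memory[NR_NODES] = { 1, 0 };

/* Assumed fallback table: nearest node with memory for each node. */
static const int node_nearest_mem[NR_NODES] = { 0, 0 };

/* Stand-in for cpu_to_node(): the node the cpu belongs to. */
static int sketch_cpu_to_node(int cpu)
{
	return cpu_node[cpu];
}

/* Stand-in for cpu_to_mem(): the nearest node that actually has memory. */
static int sketch_cpu_to_mem(int cpu)
{
	return node_nearest_mem[cpu_node[cpu]];
}
```

In this sketch, cpu 2 maps to node 1 via sketch_cpu_to_node(), which
has no memory to allocate from, while sketch_cpu_to_mem() maps it to
node 0.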

Depends on the "numa_mem_id()" patch.

