sampling: reuse token data buffer in llama_sampler_sample (#18365)
* sampling: reuse token data buffer in llama_sampler_sample
* move cur buffer before timing section, after samplers
* minor : fix build
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
macOS/iOS: - macOS Apple Silicon (arm64) - macOS Intel (x64) - iOS XCFramework
Linux: - Ubuntu x64 (CPU) - Ubuntu x64 (Vulkan) - Ubuntu s390x (CPU)
Windows: - Windows x64 (CPU) - Windows arm64 (CPU) - Windows x64 (CUDA 12) - CUDA 12.4 DLLs - Windows x64 (CUDA 13) - CUDA 13.1 DLLs - Windows x64 (Vulkan) - Windows x64 (SYCL) - Windows x64 (HIP)
openEuler: - openEuler x86 (310p) - openEuler x86 (910b) - openEuler aarch64 (310p) - openEuler aarch64 (910b)