Hello,
I am using the FLAC library to compress our voice corpus.
We then extract certain samples to create voice synthesis.
Since, according to the documentation, a seek table is supposed to speed up seeking to sample indexes, I added one to the FLAC-compressed corpus file.
The result is that at every compression level, using a seek table makes the seek-and-decompress phase roughly 10 to 15 times slower.
Here is a table of results:
Compression level | Seek table | Size | Time |
--------- | -------- | -------- | ------- |
0 | Yes | 335.36 MB | 1 sec 322 ms |
0 | No | 333.91 MB | 87 ms |
1 | Yes | 335.36 MB | 1 sec 199 ms |
1 | No | 333.91 MB | 90 ms |
2 | Yes | 335.36 MB | 1 sec 249 ms |
2 | No | 333.91 MB | 87 ms |
3 | Yes | 321.14 MB | 1 sec 183 ms |
3 | No | 319.68 MB | 100 ms |
4 | Yes | 319.96 MB | 1 sec 229 ms |
4 | No | 318.50 MB | 106 ms |
5 | Yes | 319.43 MB | 1 sec 275 ms |
5 | No | 317.98 MB | 106 ms |
6 | Yes | 316.66 MB | 1 sec 342 ms |
6 | No | 315.21 MB | 104 ms |
7 | Yes | 315.95 MB | 1 sec 189 ms |
7 | No | 314.50 MB | 104 ms |
8 | Yes | 315.47 MB | 1 sec 365 ms |
8 | No | 314.01 MB | 113 ms |
The original file is 779 MB in size.
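
For anyone who wants to check the slowdown per level, here is a small sketch that recomputes the ratio from the timings in the table above. The dictionary and function names are made up for illustration, and the numbers are simply transcribed from the table (converted to milliseconds), not new measurements.

```python
# Timings in milliseconds, transcribed from the results table:
# compression level -> (time with seek table, time without seek table)
timings_ms = {
    0: (1322, 87),
    1: (1199, 90),
    2: (1249, 87),
    3: (1183, 100),
    4: (1229, 106),
    5: (1275, 106),
    6: (1342, 104),
    7: (1189, 104),
    8: (1365, 113),
}


def slowdown_factors(timings):
    """Return, per level, how many times slower the run with a seek table was."""
    return {
        level: with_table / without_table
        for level, (with_table, without_table) in timings.items()
    }


factors = slowdown_factors(timings_ms)
for level, factor in sorted(factors.items()):
    print(f"level {level}: {factor:.1f}x slower with seek table")
```

The ratio stays above 11 at every level, so the slowdown does not appear to depend on the compression setting.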
There would appear to be a serious bug with seektable handling.
I am unable to reproduce this. I see a few strange things as well: it seems the seek table is about 1.4 MB in size. This is unusually large for a seek table. How did you create it? Perhaps this large seek table is the reason for the slowdown?
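
To put that size in perspective, here is a rough estimate of how many seek points such a table would hold. It is only a sketch under two assumptions: that the size difference between the "with" and "without" files is entirely seek table, and that the sizes are in units of 10^6 bytes (with 2^20 bytes the count comes out a few percent higher). The 18 bytes per seek point comes from the FLAC format (8-byte sample number, 8-byte stream offset, 2-byte frame sample count); the function name is made up for illustration.

```python
# Size of one seek point in a FLAC SEEKTABLE metadata block:
# 8-byte sample number + 8-byte stream offset + 2-byte frame sample count.
SEEKPOINT_BYTES = 18


def estimated_seekpoints(size_with, size_without, bytes_per_unit=1_000_000):
    """Estimate the seek point count from the file size difference.

    Assumes the whole difference is seek table data and ignores the
    4-byte metadata block header, which is negligible at this scale.
    """
    extra_bytes = (size_with - size_without) * bytes_per_unit
    return round(extra_bytes / SEEKPOINT_BYTES)


# Level 0 sizes reported in the original post (with / without seek table).
points = estimated_seekpoints(335.36, 333.91)
print(f"approximately {points} seek points")
```

Under those assumptions this works out to roughly 80,000 seek points.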