From: Thomas R. <t....@gs...> - 2026-02-07 08:12:58
Seems the reported number of idle threads is not so worrisome after all:
My Robinhood has now caught up with the changelogs on the MDS.
For the record, here is one of the last stats dumps from while it was still busy:
2026/02/07 06:09:59 [19186/1] STATS | ==== EntryProcessor Pipeline Stats ===
2026/02/07 06:09:59 [19186/1] STATS | Idle threads: 19
2026/02/07 06:09:59 [19186/1] STATS | Id constraints count: 1831 (hash min=0/max=9/avg=0.1)
2026/02/07 06:09:59 [19186/1] STATS | Name constraints count: 1824 (hash min=0/max=3/avg=0.1)
2026/02/07 06:09:59 [19186/1] STATS | Stage | Wait | Curr | Done | Total | ms/op |
2026/02/07 06:09:59 [19186/1] STATS | 0: GET_FID | 0 | 0 | 0 | 0 | 0.00 |
2026/02/07 06:09:59 [19186/1] STATS | 1: GET_INFO_DB | 1214 | 0 | 616 | 51625 | 0.33 |
2026/02/07 06:09:59 [19186/1] STATS | 2: GET_INFO_FS | 0 | 0 | 0 | 34542 | 0.39 |
2026/02/07 06:09:59 [19186/1] STATS | 3: PRE_APPLY | 0 | 0 | 0 | 51693 | 0.00 |
2026/02/07 06:09:59 [19186/1] STATS | 4: DB_APPLY | 0 | 1 | 0 | 51693 | 0.90 |
2026/02/07 06:09:59 [19186/1] STATS | 5: CHGLOG_CLR | 0 | 0 | 0 | 51693 | 0.01 |
2026/02/07 06:09:59 [19186/1] STATS | 6: RM_OLD_ENTRIES | 0 | 0 | 0 | 0 | 0.00 |
Note the "Idle threads: 19" here.
If every thread were idle, this number would be 32, the current nb_threads, so 13 threads were busy when the stats were dumped.
So, as Thomas stated before, the STATS table does not show all the working threads: the "Curr" column suggests there is only one, doing DB_APPLY.
But evidently 12 more threads were doing something.
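
To make that accounting reproducible, here is a minimal Python sketch that compares nb_threads minus the idle count against the sum of the "Curr" column. The helper name thread_accounting and the trimmed sample text (timestamp and PID prefix stripped from each log line) are only for illustration; nothing here is part of Robinhood itself.

```python
import re

# Sample taken from the stats dump above, with the
# "date time [pid/tid] STATS |" prefix stripped for readability.
STATS = """\
Idle threads: 19
Stage | Wait | Curr | Done | Total | ms/op |
0: GET_FID | 0 | 0 | 0 | 0 | 0.00 |
1: GET_INFO_DB | 1214 | 0 | 616 | 51625 | 0.33 |
2: GET_INFO_FS | 0 | 0 | 0 | 34542 | 0.39 |
3: PRE_APPLY | 0 | 0 | 0 | 51693 | 0.00 |
4: DB_APPLY | 0 | 1 | 0 | 51693 | 0.90 |
5: CHGLOG_CLR | 0 | 0 | 0 | 51693 | 0.01 |
6: RM_OLD_ENTRIES | 0 | 0 | 0 | 0 | 0.00 |
"""

NB_THREADS = 32  # nb_threads as currently set in robinhood.conf


def thread_accounting(stats_text, nb_threads):
    """Compare busy threads (nb_threads - idle) with the sum of the Curr column."""
    idle = int(re.search(r"Idle threads:\s*(\d+)", stats_text).group(1))
    visible = 0
    for line in stats_text.splitlines():
        # Stage rows look like "4: DB_APPLY | 0 | 1 | 0 | 51693 | 0.90 |"
        m = re.match(r"\s*\d+:\s*\w+\s*\|(.*)", line)
        if m:
            fields = [f.strip() for f in m.group(1).split("|")]
            visible += int(fields[1])  # column order: Wait, Curr, Done, Total, ms/op
    busy = nb_threads - idle
    return {"idle": idle, "busy": busy, "visible": visible,
            "unaccounted": busy - visible}


print(thread_accounting(STATS, NB_THREADS))
# {'idle': 19, 'busy': 13, 'visible': 1, 'unaccounted': 12}
```

Since the stats display is lockless, a non-zero "unaccounted" value in a single dump is not alarming by itself; it would be more telling if it stayed large across many dumps.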
Regards,
Thomas
On 2/4/26 5:41 PM, Thomas Roth wrote:
> Dear Thomas,
>
> indeed the stat dumps always look like that. I have never seen more than 1 or 2 in the "Curr" column of the GET_INFO_DB stage.
> I have reduced the overall number of threads to 32 and the stage-specific number for the FS operations to 24.
>
> Does the number of constraints, or rather max_pending_operations, have an effect?
> The documentation states that the default value of max_pending_operations is 10000. To test that, I put this value explicitly into robinhood.conf.
> This simply blows up the "Wait" values for GET_INFO_DB, as it probably must if 10k operations can be pending.
> The rest of the stats look the same, e.g.:
>
>
> 2026/02/04 17:35:43 [5311/1] STATS | Idle threads: 28
> 2026/02/04 17:35:43 [5311/1] STATS | Id constraints count: 9976 (hash min=0/max=12/avg=0.7)
> 2026/02/04 17:35:43 [5311/1] STATS | Name constraints count: 9849 (hash min=0/max=6/avg=0.6)
> 2026/02/04 17:35:43 [5311/1] STATS | Stage | Wait | Curr | Done | Total | ms/op |
> 2026/02/04 17:35:43 [5311/1] STATS | 0: GET_FID | 0 | 0 | 0 | 0 | 0.00 |
> 2026/02/04 17:35:43 [5311/1] STATS | 1: GET_INFO_DB | 6375 | 0 | 3599 | 6965 | 0.44 |
> 2026/02/04 17:35:43 [5311/1] STATS | 2: GET_INFO_FS | 0 | 1 | 1 | 5060 | 9.92 |
> 2026/02/04 17:35:43 [5311/1] STATS | 3: PRE_APPLY | 0 | 0 | 0 | 7147 | 0.00 |
> 2026/02/04 17:35:43 [5311/1] STATS | 4: DB_APPLY | 0 | 0 | 0 | 7147 | 1.21 |
> 2026/02/04 17:35:43 [5311/1] STATS | 5: CHGLOG_CLR | 0 | 0 | 0 | 7233 | 0.01 |
> 2026/02/04 17:35:43 [5311/1] STATS | 6: RM_OLD_ENTRIES | 0 | 0 | 0 | 0 | 0.00 |
>
>
> Best regards
> Thomas
>
>
> On 2/4/26 16:00, Tho...@CE... wrote:
>> Dear Thomas,
>>
>> Indeed the stats show 1 single active thread, but as this display is lockless for performance reasons, the current operations may be moving while
>> the stats are displayed, and thus give a false view of what's really going on.
>> Do all other stat dumps look the same?
>>
>> Otherwise, I'm concerned about the 10.08 ms for "GET_INFO_FS" (basically stat()+getstripe()). That is quite a high latency: if operations were
>> sequential, it would allow only about 100 stat() calls per second.
>> I wonder whether having too many threads querying the Lustre client might be
>> counterproductive.
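
The "100 stat per sec" estimate can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the function name max_ops_per_second is made up for illustration, perfect scaling with the thread count is an assumption that real systems rarely reach, and the backlog figure assumes that every one of the ~100M pending changelog records needs one GET_INFO_FS call, which the stats suggest is an overestimate.

```python
def max_ops_per_second(ms_per_op, threads=1):
    """Upper bound on throughput if each of `threads` workers needs ms_per_op per call.

    Assumes perfect scaling, i.e. no contention in the Lustre client or on the MDS.
    """
    return threads * 1000.0 / ms_per_op


sequential = max_ops_per_second(10.08)        # one call at a time
print(f"{sequential:.1f} ops/s")              # 99.2 ops/s

# Hypothetical worst case: ~100M records, one sequential GET_INFO_FS call each.
backlog = 100_000_000
days = backlog / sequential / 86400
print(f"{days:.1f} days")                     # 11.7 days
```

At 10 ms per call the FS stage only keeps up if several calls really run in parallel, which is why the per-op latency matters more than the raw thread count.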
>>
>> There is a way to fine tune the number of threads allowed by pipeline stage, to have a high parallelism on some operations (e.g. DB) while
>> restricting the number of simultaneous calls to the FS.
>>
>> Regards,
>> Thomas
>>
>>
>> -----Original Message-----
>> From: Thomas Roth <t....@gs...>
>> Sent: Monday, February 2, 2026 21:58
>> To: 'rob...@li...' <rob...@li...>
>> Subject: [robinhood-support] idle threads
>>
>> Hi all,
>>
>> I have Robinhood v3.2 running on a Lustre 2.15, and might have misconfigured / misunderstood the thread count.
>>
>> The Robinhood box has 96 cores, so I have set the number of threads to 96 (EntryProcessor {nb_threads = 96;})
>>
>> When checking the robinhood.log, most cores do nothing:
>>
>> 2026/02/02 21:46:58 [4479/1] STATS | ==== EntryProcessor Pipeline Stats ===
>> 2026/02/02 21:46:58 [4479/1] STATS | Idle threads: 95
>> 2026/02/02 21:46:58 [4479/1] STATS | Id constraints count: 100 (hash min=0/max=3/avg=0.0)
>> 2026/02/02 21:46:58 [4479/1] STATS | Name constraints count: 98 (hash min=0/max=2/avg=0.0)
>> 2026/02/02 21:46:58 [4479/1] STATS | Stage | Wait | Curr | Done | Total | ms/op |
>> 2026/02/02 21:46:58 [4479/1] STATS | 0: GET_FID | 0 | 0 | 0 | 0 | 0.00 |
>> 2026/02/02 21:46:58 [4479/1] STATS | 1: GET_INFO_DB | 57 | 0 | 41 | 38819 | 0.35 |
>> 2026/02/02 21:46:58 [4479/1] STATS | 2: GET_INFO_FS | 0 | 0 | 0 | 26390 | 10.08 |
>> 2026/02/02 21:46:58 [4479/1] STATS | 3: PRE_APPLY | 0 | 0 | 0 | 37960 | 0.00 |
>> 2026/02/02 21:46:58 [4479/1] STATS | 4: DB_APPLY | 1 | 1 | 0 | 37958 | 1.17 | 2.64% batched (avg batch size: 3.8)
>> 2026/02/02 21:46:58 [4479/1] STATS | 5: CHGLOG_CLR | 0 | 0 | 0 | 38823 | 0.01 |
>> 2026/02/02 21:46:58 [4479/1] STATS | 6: RM_OLD_ENTRIES | 0 | 0 | 0 | 0 | 0.00 |
>>
>> This is quite single-threaded ;-(
>>
>> Right now, the file system is not in production and quite idle, so Robinhood is working to close a gap of ~100M changelog entries.
>> But I am afraid that with this configuration, once in production, the changelogs will run away and fill up the MDS disk.
>>
>> Regards,
>> Thomas
>>
>>
>> --
>> --------------------------------------------------------------------
>> Thomas Roth
>> Department: IT
>> Location: SB3 2.291
>> Phone: +49-6159-71 1453 Fax: +49-6159-71 2986
>>
>> GSI Helmholtzzentrum für Schwerionenforschung GmbH Planckstraße 1, 64291 Darmstadt, Germany, www.gsi.de
>>
>>
>>
>>
>