Dear Professor Chen,
I would like to understand the exact algorithm or heuristic that CiteSpace uses to implement the modified g-index. Specifically, how does CiteSpace determine the cutoff g when many nodes have identical citation counts, for example when every paper in a time slice has exactly 1 citation? More concretely, for a time slice with 29 papers each cited once and k = 25, the sum of the top g citation counts is simply g, so the inequality reduces to:
g² ≤ 25 × g ⟹ g ≤ 25
so mathematically the modified g-index would be 25. However, I noticed that CiteSpace suggests including all 29 nodes for further processing in this case. Could you please clarify the exact approach CiteSpace uses to determine the node inclusion cutoff in such scenarios? Is there an additional rule or adjustment, beyond the strict application of the formula, that leads it to include nodes beyond the theoretical g?
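To make the calculation concrete, here is a minimal Python sketch of the strict formula. It assumes the modified g-index is the largest g such that g² ≤ k × (sum of the g highest citation counts). The function name modified_g_index is hypothetical, and this is only an illustration of the inequality, not CiteSpace's actual code:

```python
def modified_g_index(citations, k):
    """Largest g with g^2 <= k * (sum of the g highest citation counts).

    Assumption: this is a literal reading of the inequality, with g
    capped at the number of papers in the slice.
    """
    ranked = sorted(citations, reverse=True)  # most-cited first
    g = 0
    running_sum = 0
    for rank, count in enumerate(ranked, start=1):
        running_sum += count
        if rank * rank <= k * running_sum:
            g = rank  # the inequality still holds at this rank
    return g


# The case from the question: 29 papers, each cited once, k = 25
print(modified_g_index([1] * 29, k=25))  # 25
```

Under this reading the cutoff is 25, so including all 29 nodes would need some extra rule, such as keeping every node tied with the one at the cutoff rank. That tie-handling rule is only my guess.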
Thank you.
Regards,
Andrej