Zero values of cluster uncertainty

2021-09-23
2021-10-04
  • Glib Voloskyi

    Glib Voloskyi - 2021-09-23

    Dear professor Chen,

    I have encountered another problem with uncertainty calculation. As per your previous reply, I have successfully downloaded the MAS dataset, retrieved contexts, and calculated uncertainties. When I attempt to compute cluster uncertainty (screenshot 1), CiteSpace retrieves 4484 nodes with (non-zero) uncertainty scores but still calculates the uncertainty of all the clusters as 0 (screenshot 2). Could you please suggest what the problem might be and how we can tackle it?

    Best regards,

    Glib Voloskyi

     
    • Chaomei Chen

      Chaomei Chen - 2021-10-04

      I have checked this. As it turns out, this is not a problem with the software, but a problem with the data or the lack of data. In your example, Retrieve non-zero uncertainty scores didn't return anything because the query didn't find any citation contexts with non-zero scores in the database.
      Below is an example in which there are some non-zero uncertainty scores: out of the 968 nodes of the underlying network, 360 involve non-zero uncertainties; thus, the subsequent calculations of cluster uncertainties yield non-zero values.
      In my experience, MAG's coverage is uneven from topic to topic. You may consider expanding the scope of your data, which may turn up more sentences with uncertainties.

      CitationContextUncertainty(317): Retrieve (non-zero) uncertainty scores of 968 nodes:
      10.20.30.40.50.60.70.80.90.100.
      110.120.130.140.150.160.170.180.190.200.
      210.220.230.240.250.260.270.280.290.300.
      310.320.330.340.350.360.!
      GraphPanel(12300): Uncertainties of Clusters (E, H, T):
      0 2.17 .75 .04
      1 2.23 5.21 .0
      2 3.52 5.72 .0
      3 5.66 3.88 .52
      4 3.77 4.66 .8
      5 5.39 7.12 .98
      6 7.86 14.71 1.18
      7 .0 .0 .0
      8 .0 .0 .0
      9 1.39 13.55 6.4
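
A log like the one above can be scanned for clusters whose three values (E, H, T) are all zero. The following is only a minimal sketch, assuming each data line has the form `<cluster> <E> <H> <T>` as shown in the pasted output; the function names are hypothetical and are not part of CiteSpace.

```python
# Sketch: parse the pasted "Uncertainties of Clusters (E, H, T)" lines and
# list the clusters whose three values are all zero.
# Assumption: each data line is "<cluster> <E> <H> <T>", as in the log above.

LOG = """\
GraphPanel(12300): Uncertainties of Clusters (E, H, T):
0 2.17 .75 .04
1 2.23 5.21 .0
2 3.52 5.72 .0
3 5.66 3.88 .52
4 3.77 4.66 .8
5 5.39 7.12 .98
6 7.86 14.71 1.18
7 .0 .0 .0
8 .0 .0 .0
9 1.39 13.55 6.4
"""


def parse_cluster_uncertainties(log_text):
    """Return {cluster_id: (E, H, T)} for every line that has four numeric fields."""
    clusters = {}
    for line in log_text.splitlines():
        parts = line.split()
        if len(parts) != 4:
            continue  # skip headers and progress lines
        try:
            cluster_id = int(parts[0])
            e, h, t = (float(p) for p in parts[1:])
        except ValueError:
            continue  # not a data line
        clusters[cluster_id] = (e, h, t)
    return clusters


def all_zero_clusters(clusters):
    """Return the sorted IDs of clusters whose E, H, and T are all 0."""
    return sorted(cid for cid, scores in clusters.items()
                  if all(s == 0.0 for s in scores))


clusters = parse_cluster_uncertainties(LOG)
zero_ids = all_zero_clusters(clusters)
print(zero_ids)
```

If every cluster ends up in the all-zero list, as in the original question, the data most likely supplied no scored citation contexts for the nodes in the network.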

       
      • Glib Voloskyi

        Glib Voloskyi - 2021-10-04

        Thank you for your reply. I have checked the contexts for the dataset, and if I interpret it correctly, there are 705 records with citation context (screen 1). Could it simply be that this is too few records to calculate uncertainties, considering that the dataset contains roughly 13,500 records?

         
        • Chaomei Chen

          Chaomei Chen - 2021-10-04

          Two possible reasons for zero uncertainty scores: 1. none of the 705 citation contexts has a positive uncertainty score, for example, because they don't contain any of the uncertainty cue words (see our paper/book on that for more details); 2. some of the 705 contexts have positive uncertainty scores, but none of them is included in the visualized network. This boils down to MAG's coverage, which seems to vary substantially across topics. I will reach out to MAG and see if they are aware of any more info on this.
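
The two cases above can be told apart if the context scores and the network's node IDs are exported by some means. The following is only a rough sketch under that assumption: the data layout (a list of `(reference_id, score)` pairs and a set of node IDs) and the function name are hypothetical and do not reflect CiteSpace's internal database or API.

```python
# Sketch: decide which of the two reasons explains all-zero cluster
# uncertainties.
# Assumptions (hypothetical layout, not CiteSpace's own format):
#   context_scores: list of (reference_id, uncertainty_score) pairs,
#                   one pair per citation context
#   network_nodes:  set of reference IDs present in the visualized network

def diagnose_zero_uncertainty(context_scores, network_nodes):
    # References with at least one context whose score is positive.
    positive_refs = {ref for ref, score in context_scores if score > 0}
    if not positive_refs:
        return "reason 1: no citation context has a positive uncertainty score"
    if not (positive_refs & set(network_nodes)):
        return "reason 2: scored contexts exist, but none is in the network"
    return "neither: some scored contexts belong to nodes in the network"


# Toy data for illustration only.
no_scores = [("A", 0.0), ("B", 0.0)]
some_scores = [("A", 0.0), ("B", 1.5)]

case_1 = diagnose_zero_uncertainty(no_scores, {"A", "B"})
case_2 = diagnose_zero_uncertainty(some_scores, {"A", "C"})
case_3 = diagnose_zero_uncertainty(some_scores, {"B", "C"})
print(case_1)
print(case_2)
print(case_3)
```

In the first case, widening the dataset may bring in contexts that contain cue words; in the second, the network selection criteria would need to include the scored references.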

           
