Scaffold Hunter / Feature Requests / #73 Infobar and categorical string properties

Till Schäfer - 2015-12-18

Solution 1 is clean, but not very flexible. One must select this during import, which is not very practical from a user's point of view. It adds another layer of complexity. Solution 2 seems somewhat opaque and might be interpreted as a bug by the user (Why is property XY not shown?). Furthermore, there might be a few scaffold subtrees that contain only a few distinct category values, so a mapping still makes sense for them even if the total number is too high.

=> I would go for solution 3 or solution 4 (see below)

Solution 4: Show a dynamic categorical infobar that puts all categories below a frequency threshold into a category "the rest". This category might have a special visual appearance to distinguish it from the other categories (e.g. a striped filling or something like that).
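A minimal sketch of the binning step behind solution 4, to make the idea concrete. The class `RestCategoryBinning`, the method `bin`, and the `minCount` parameter are hypothetical names for illustration; none of them exist in Scaffold Hunter.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Scaffold Hunter.
final class RestCategoryBinning {
    static final String REST = "the rest";

    /**
     * Keeps every category whose count reaches minCount and sums the counts
     * of all other categories into a single REST entry. The REST entry is
     * appended last and only if at least one category fell below minCount.
     */
    static Map<String, Integer> bin(Map<String, Integer> counts, int minCount) {
        Map<String, Integer> result = new LinkedHashMap<>();
        int rest = 0;
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() >= minCount) {
                result.put(e.getKey(), e.getValue());
            } else {
                rest += e.getValue();
            }
        }
        if (rest > 0) {
            result.put(REST, rest);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("kinase", 40);
        counts.put("protease", 25);
        counts.put("gpcr", 2);
        counts.put("other", 1);
        // "gpcr" and "other" fall below the threshold and are merged.
        System.out.println(bin(counts, 5));
    }
}
```

The infobar could then render the `REST` entry with a different fill, since it is the only key that does not correspond to a real property value.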

Nils Kriege - 2016-11-10

assigned_to: Philipp Mewes

Philipp Mewes - 2017-01-13

status: open --> in-progress

Philipp Mewes - 2017-01-13

Some code in the IntervalPanel already implements a mapping for string properties. It could be used by editing only a few lines of code. However, there still seem to be some issues: the interval combo boxes may not be updated when another string property is selected. [6feefc] enables the support for string properties so far.

Related

Commit: [6feefc]

Philipp Mewes - 2017-01-19

Fixed the remaining problems with [841d1e] and [4a395d]. I think the feature is implemented now. Please have a look and check whether it looks okay.

Related

Commit: [4a395d]
Commit: [841d1e]

Philipp Mewes - 2017-01-19

status: in-progress --> needs-review

Till Schäfer - 2017-01-24

Subtree accumulation does not work (see screenshot: the two-ring node in the center of the image should have some blue in it).

Screenshot_20170124_135923.png

Till Schäfer - 2017-01-24

For some string properties the numerical interval panel is shown. Example: Tutorial Dataset / PUBCHEM_MOLECULAR_FORMULA (see screenshots).

Screenshot_20170124_140703.png

Till Schäfer - 2017-01-24

Screenshot: PUBCHEM_MOLECULAR_FORMULA in the table view

Screenshot_20170124_140330.png

Till Schäfer - 2017-01-24

The last bug fails silently. Can you please look through the code and check whether exceptions are thrown on errors?

Till Schäfer - 2017-01-24

status: needs-review --> re-opened

Till Schäfer - 2017-01-24

labels: --> scaffold tree view, property mapping

Philipp Mewes - 2017-01-26

status: re-opened --> in-progress

Philipp Mewes - 2017-01-31

The number of distinct string values is limited to 10 (see SinglePropertyPanel, l.427), but PUBCHEM_MOLECULAR_FORMULA contains more than 500 distinct values. I think this limitation is reasonable, but how should we handle properties that exceed it? Should we just remove them from the drop-down list?

- Till Schäfer - 2017-02-17
  
  I think that it should always be possible to select every value, but the default color mapping should include only the most frequent values (let's say with a limit of 10) and only values that have a frequency larger than some X (e.g. no singleton values).
  
  Last edit: Till Schäfer 2017-02-17
  
  - Philipp Mewes - 2017-02-20
    
    I misunderstood your proposal at first, sorry. Sounds good. How should we handle the case where a property consists only of singleton values? There must be at least one string interval to select.
    
    Last edit: Philipp Mewes 2017-03-01
    
    - Nils Kriege - 2017-03-02
      
      I think there is no need to handle this case differently. I would propose to sort the strings with the same frequency (one in case of singletons) lexicographically and just show the top ten.
      

Philipp Mewes - 2017-02-09

After some work I was able to implement the subtree accumulation. See [985f41] for details.
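For readers without access to the commit, the following is a rough sketch of what accumulating categorical values over a subtree means: a node's bar should reflect its own category counts plus those of all descendants. `CategoryNode` and its methods are hypothetical stand-ins, not the Scaffold Hunter classes, and this is not necessarily how [985f41] does it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a scaffold tree node; not the Scaffold Hunter class.
final class CategoryNode {
    final Map<String, Integer> ownCounts = new HashMap<>();
    final List<CategoryNode> children = new ArrayList<>();

    /** Adds count occurrences of a category to this node's own molecules. */
    CategoryNode add(String category, int count) {
        ownCounts.merge(category, count, Integer::sum);
        return this;
    }

    /** Attaches a child node. */
    CategoryNode child(CategoryNode c) {
        children.add(c);
        return this;
    }

    /** Returns the category counts of this node plus those of all descendants. */
    Map<String, Integer> accumulated() {
        Map<String, Integer> total = new HashMap<>(ownCounts);
        for (CategoryNode c : children) {
            c.accumulated().forEach((k, v) -> total.merge(k, v, Integer::sum));
        }
        return total;
    }
}
```

With this scheme, an inner node without molecules of its own still shows a category (such as the blue one in the screenshot above) as soon as any node below it contains that category.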

Related

Commit: [985f41]

Philipp Mewes - 2017-03-06

The DbManager now supports requesting the frequency of distinct string values ([836e4a]).

Implemented the feature in the view ([c85279]). At most 10 (non-singleton) values are displayed now. This also improves performance for properties with many distinct values, since at most 10 intervals have to be added to the respective panel (see PUBCHEM_MOLECULAR_FORMULA).

Handled the special case of singleton-values as proposed ([2d0092] and [2579df]).
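The selection rule discussed above could be sketched roughly as follows. This is one reading of the proposal, not the code from the commits: values are ordered by descending frequency with lexicographic tie-breaking, singletons are skipped unless the property consists only of singletons, and the result is cut off at a limit. `DefaultValueSelection` and `select` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not the code from the commits.
final class DefaultValueSelection {

    /**
     * Picks up to limit values for the default mapping: most frequent first,
     * ties broken lexicographically. Singleton values are skipped unless
     * every value is a singleton, so at least one value is always selectable
     * for a non-empty property.
     */
    static List<String> select(Map<String, Integer> frequencies, int limit) {
        boolean onlySingletons = true;
        for (int f : frequencies.values()) {
            if (f > 1) {
                onlySingletons = false;
                break;
            }
        }

        List<Map.Entry<String, Integer>> entries = new ArrayList<>(frequencies.entrySet());
        entries.sort((a, b) -> {
            int byFrequency = Integer.compare(b.getValue(), a.getValue());
            return byFrequency != 0 ? byFrequency : a.getKey().compareTo(b.getKey());
        });

        List<String> selected = new ArrayList<>();
        for (Map.Entry<String, Integer> e : entries) {
            if (selected.size() >= limit) {
                break;
            }
            if (onlySingletons || e.getValue() > 1) {
                selected.add(e.getKey());
            }
        }
        return selected;
    }
}
```

Because the frequencies are passed in as a plain map, the same rule would apply unchanged whether the counts come from the whole dataset or from a subset.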

Related

Commit: [2579df]
Commit: [2d0092]
Commit: [836e4a]
Commit: [c85279]

Philipp Mewes - 2017-03-06

Should the manual also be checked for updates related to this feature?

- Nils Kriege - 2017-03-06
  
  Yes, please.
  

Philipp Mewes - 2017-03-08

Updated the manual ([bf9f33]).

Related

Commit: [bf9f33]

Philipp Mewes - 2017-03-08

status: in-progress --> needs-review

Nils Kriege - 2017-03-08

status: needs-review --> re-opened

Nils Kriege - 2017-03-08

Thank you for implementing this feature. Some minor issues still need to be fixed:

The size of the combo boxes depends on the length of the longest possible string value. Could you try to use edu.udo.scaffoldhunter.gui.util.SteppedComboBox instead?

It would be nice if the dialog could be resized.

The checkbox 'Fit interval borders to current subset' does not make sense for string properties. Could you replace it with 'Restrict to values of current subset' when a string property is selected and implement the functionality (Show only values of the current subset and determine frequencies based on the current subset)?

Loading the panel for string properties takes a long time, which I suppose is caused by the database queries that find the distinct string values and count their frequencies. Could you check whether this can be sped up by adding a database index? If so, the index should be added to the Hibernate XML files.

The limitation to ten entries does not work (after removing an element, it is possible to add an arbitrary number of new elements). Actually, there is no need for this restriction. Just show up to ten strings by default and let the user decide whether to add more. This should also be clarified in the manual.

Last edit: Nils Kriege 2017-03-08