
#15 related (e.g. inverse) questions skew scheduling and ANN

open
D.C.
core (3)
5
2012-12-11
2004-10-02
No

MemAid currently treats inverse questions and
sub-problems/sub-questions as independent of each other.
This skews their gradings, especially when both versions
are asked in the same session, because having seen one
version makes it a *lot* easier to remember the other
one(s). That, in turn, results in badly scheduled questions
and an incorrectly trained ANN.

The attached patch implements a naive but useful fix for
the inverse questions problem in pyqt_memaid: if the
queue for some session contains both a question and its
inverse, the program removes the one that was scheduled
later. Inverses are assumed to be *exact* inverses, i.e.
they are detected through character-by-character
comparison, which, of course, doesn't always work
(Finnish<->English example):

Q: linja-auto
A: bus

Q: bus (the car, not the computer part)
A: linja-auto
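
The naive fix described above can be sketched roughly as follows. This is not the attached patch, only an illustration: the `Item` class, its `next_rep` field and the function names are hypothetical and are not taken from pyqt_memaid.

```python
from dataclasses import dataclass


@dataclass
class Item:
    q: str         # question text
    a: str         # answer text
    next_rep: int  # hypothetical field: day on which the item is scheduled


def is_exact_inverse(x, y):
    # Character-by-character comparison, as in the naive fix.
    return x.q == y.a and x.a == y.q


def drop_later_inverses(queue):
    """Return the queue without the later-scheduled item of each inverse pair."""
    dropped = set()
    for i, x in enumerate(queue):
        for j, y in enumerate(queue):
            if i < j and is_exact_inverse(x, y):
                # Drop whichever of the two was scheduled later.
                dropped.add(j if y.next_rep >= x.next_rep else i)
    return [item for k, item in enumerate(queue) if k not in dropped]


queue = [
    Item("bus", "linja-auto", 5),
    Item("linja-auto", "bus", 3),
    Item("auto", "car", 4),
]
filtered = drop_later_inverses(queue)

# The annotated inverse from the example above is not detected.
annotated = [
    Item("linja-auto", "bus", 3),
    Item("bus (the car, not the computer part)", "linja-auto", 5),
]
undetected = drop_later_inverses(annotated)
```

With the first queue, the "bus" item scheduled later is dropped; with the second, both items survive because the strings no longer match exactly.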

I think the correct (but perhaps overkill) fix would be to let
the user identify inverse questions (or rather save the
information automatically when adding the questions) and
also feed the last time and grading of the inverse question
to the ANN. I don't know what to do with questions that
don't have an inverse, though. Leaving inverses out of the
ANN and instead just forcing them not to appear
in the same session probably works OK.

The same problem applies, perhaps to a somewhat lesser
extent, to sub-questions as well. For example:

Q: linja-auto
A: bus

Q: linja
A: line (as in "transportation line")

Q: auto
A: car

I usually add such split questions whenever I find it hard to
memorize some large compound problem. This technique is
also recommended on SuperMemo's site, but I doubt SM's
scheduling algorithm contains any explicit support for it.
The above "linja-auto" example doesn't make the problem
look serious, but the SM site's example, a long poem with
parts of it covered, does. If you saw one part first, the
others would then be way too easy during the same
session.

Discussion

  • Jarno Elonen

    Jarno Elonen - 2004-10-05

    Logged In: YES
    user_id=312071

    I'm adding a new version of the pyqt_memaid patch. The last
    one didn't take into account that rebuild_revision_queue() can
    be called multiple times per session. It now remembers
    "banned" items for 3 hours before allowing them again.

     
  • Jarno Elonen

    Jarno Elonen - 2004-10-05

    Simplistic but still useful fix for the inverse question problem in pyqt_memaid (version 2)

     
  • Jarno Elonen

    Jarno Elonen - 2004-10-05

    Logged In: YES
    user_id=312071

    Please not the the 3-hour-ban implementation is a quick hack
    using a global dictionary. It would be better to save that
    information in the question database (on the disk) but I dare not
    touch the format.

     
  • Jarno Elonen

    Jarno Elonen - 2004-10-05

    Logged In: YES
    user_id=312071

    - Please not the the
    + Please note that the
    (argh)

     
  • Peter Bienstman

    Peter Bienstman - 2004-10-17

    Logged In: YES
    user_id=275016

    Thanks, I've added your patch to pyqt_memaid.

    I've not yet closed this bugreport, as you have assigned it to Dave,
    and the C version of MemAid doesn't have this feature.

     
  • radical tyro

    radical tyro - 2005-01-20

    Logged In: YES
    user_id=741857

    You have a very interesting point. Often in learning a foreign language,
    inverse questions are useful. Indeed, seeing one of these items is almost
    as good for your memory as seeing the inverse. As you said, this is a
    large problem when you break an item up into several related items. I
    think the scheduling algorithm needs to take this into account. The
    proposed naive fix to me seems far from ideal. For instance, if it only
    delays the quizzing of the inverse card to the next day, your memory
    was still refreshed the previous day, thus making the inverse card easier
    than the ANN thinks.

    Here is one idea: allow the user to group related cards. The ANN will
    treat these cards as a single item. Then, when the item is quizzed, the
    program picks one item in the group, sequentially. For instance, in the
    example discussed above, the following cards would be grouped:
    Q1: linja-auto
    A1: bus
    Q2: linja
    A2: line (as in "transportation line")
    Q3: auto
    A3: car
    First, card 1 would be quizzed and rescheduled accordingly. Then card 2
    would be shown when this grouped item is quizzed next. Then card 3,
    then card 1, etc.
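
    A minimal sketch of this sequential pick, assuming a hypothetical
    `CardGroup` class (nothing like it exists in MemAid as far as this
    thread shows). The group would be scheduled as one item, and each time
    it comes up the next card in the cycle is shown:

```python
from dataclasses import dataclass


@dataclass
class CardGroup:
    cards: list          # (question, answer) pairs scheduled as one item
    next_index: int = 0  # position of the card to show next

    def next_card(self):
        """Return the card to quiz now and advance the cycle."""
        card = self.cards[self.next_index]
        self.next_index = (self.next_index + 1) % len(self.cards)
        return card


group = CardGroup([
    ("linja-auto", "bus"),
    ("linja", 'line (as in "transportation line")'),
    ("auto", "car"),
])
shown = [group.next_card()[0] for _ in range(4)]
```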

    The shortcoming of this idea shows up when the difficulty of the grouped
    cards varies. Then, for instance, card 3 may be obvious but card 1 is
    tough, yet they are shown with the same frequency. A second problem is
    that the ANN will likely overestimate the quizzing interval since the item
    is not a single piece of information.

    I would like some feedback on this idea, as I do not think it is optimal. I
    would be interested in other people's thoughts on this problem. It is
    surprising that the SuperMemo page suggests this sort of breaking down
    of one long item into many related items, yet does not take this into
    account in the algorithm.

     
  • Jarno Elonen

    Jarno Elonen - 2005-01-22

    Logged In: YES
    user_id=312071

    Explicit grouping is certainly more powerful than the simplistic
    automatic detection of inverses, but I think it's still too far from
    optimal for the benefits to outweigh the added UI complexity.

    The main problem is that the exclusion groups can grow too wide.
    For example, "linja" and "auto" should not exclude each other but
    they both should be excluded by "linja-auto". Not to mention when
    you add, for example, "autoilija = driver (of a car)". This could, in
    principle, be fixed by letting the user specify finer relations
    ("subitem of" and "inverse of"). Coming up with a nice UI for the
    task is more challenging than plain grouping but hopefully not
    impossible.
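
    As a rough illustration of such finer relations, the sketch below
    stores explicit "subitem of" and "inverse of" pairs and keeps colliding
    items out of the same session. The table layout and the function names
    are hypothetical, and items are identified by their question text only
    for brevity.

```python
# Hypothetical relation table: (item, related item) -> relation type.
RELATIONS = {
    ("linja", "linja-auto"): "subitem of",
    ("auto", "linja-auto"): "subitem of",
    ("bus", "linja-auto"): "inverse of",
}


def collide(a, b, relations=RELATIONS):
    """Two items collide only if a relation between them was recorded."""
    return (a, b) in relations or (b, a) in relations


def filter_session(queue, relations=RELATIONS):
    """Keep each item unless it collides with one already kept."""
    kept = []
    for item in queue:
        if not any(collide(item, k, relations) for k in kept):
            kept.append(item)
    return kept


session = filter_session(["linja", "auto", "linja-auto"])
```

    Here "linja" and "auto" stay together while "linja-auto" is left out.
    The sketch does not decide how long the excluded item should be delayed.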

    Feeding any of this stuff to the ANN might also give rise to the
    "curse of dimensionality" problem; there are already quite a few
    input variables if I've understood correctly. On the other hand, is
    it possible to design a good fixed scheduling algorithm that
    doesn't spoil the ANN's results?

    Showing the previously "forbidden" item next day or postponing it
    <group size> x <scheduled delay> are both bad choices. For
    example, superitems (e.g. "linja-auto") should clearly be delayed
    less than subitems (e.g. "auto") when they collide. Even inverses
    are not necessarily symmetrical -- remembering "linja-auto =>
    bus" might be much easier than "bus => linja-auto".

    The last point makes the simple "tomorrow" re-scheduling
    algorithm slightly better than the "round robin" (cycle-through),
    IMO: it allows the scheduling of the grouped items to naturally
    drift apart, which gradually compensates for the too-early scheduling.

     
