[Memaid-devel] pyqt_memaid's future

Hi,

These are some thoughts on the successor to pyqt_memaid which I am
currently planning:

The original memaid didn't allow for multiple repetitions of the same
item on a single day if it had a low grade, so that's why I added the
'drill badly known' learning mode to pyqt_memaid. However, this still
feels like an ugly afterthought.

Another worry I have is that the descriptions of the grades are fairly
general, which can mean that the user sometimes hesitates between two
adjacent grades. This obviously has an influence on the performance of
the scheduling algorithm.

So, in order to solve both problems at once, I'm thinking of switching
to the following grade definitions:

0: "Wrong answer, you haven't memorised this item yet, or you forgot it"
1: "Wrong answer, but the item is more familiar than in grade 0"

Items with grades 0 or 1 are always reviewed until they get grade 2 or
higher, and items with grade 0 are reviewed twice as often as those with
grade 1.
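
One possible way to get the "twice as often" behaviour is sketched below
in Python. This is only an illustration, not the pyqt_memaid code: the
function names, the `ask` callback and the idea of building one "pass"
at a time are all assumptions made for the example.

```python
def build_drill_pass(grades):
    """Return the review order for one pass over the unlearned items.

    `grades` maps item id -> current grade. Items with grade 0 appear
    twice per pass, items with grade 1 once, and items with grade 2 or
    higher are left out.
    """
    zeros = [item for item, g in grades.items() if g == 0]
    ones = [item for item, g in grades.items() if g == 1]
    # Sweep the grade-0 items twice, with the grade-1 items in between,
    # so that repeats of the same item are spread out a little.
    return zeros + ones + zeros


def drill(grades, ask):
    """Keep running passes until every item has grade 2 or higher.

    `ask(item)` is a hypothetical callback that shows the item to the
    user and returns the new grade (0-5).
    """
    grades = dict(grades)
    while any(g < 2 for g in grades.values()):
        for item in build_drill_pass(grades):
            # Skip items that were already promoted earlier in this pass.
            if grades[item] < 2:
                grades[item] = ask(item)
    return grades
```

A weighted random choice (weight 2 for grade 0, weight 1 for grade 1)
would do the job as well; the fixed order above is just easier to follow.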

2: "You've now memorised this item, and will probably remember it for
one or two days"

This grade signals the transition from the acquisition phase to the
retention phase, where the items get scheduled with ever-increasing
intervals.

3: "Correct answer, but after great difficulty. The interval was
probably too long"
4: "Correct answer, with some effort. The interval was just about right"
5: "Correct answer, but without any effort whatsoever. The interval was
probably too short"

Notice how we tell the user what effects these grades will have on the
interval scheduling.

Now on to the scheduling algorithm. Memaid and SuperMemo use fairly
complicated algorithms. However, I am a bit sceptical whether the added
complexity has a statistically significant benefit. Therefore, I plan to
use an algorithm of the same complexity as the (very) old SM2 in
SuperMemo (http://www.supermemo.com/english/ol/sm2.htm), but with some
small randomness and extra heuristics for early and late revisions. Note
that I don't plan extra hurdles for early or late revisions, such as
having to use a special Mercy option as in SuperMemo.
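
To give an idea of the complexity involved, here is a rough Python
sketch of the published SM2 rules (1 day, then 6 days, then the previous
interval times the easiness factor) with a small random perturbation
added. The size of the perturbation (`noise`, 5% here) is purely an
assumption, the function names are made up, and the heuristics for early
and late revisions are not shown.

```python
import random


def sm2_easiness(ef, grade):
    """SM2 easiness-factor update for a grade in 0..5, floored at 1.3."""
    q = grade
    ef = ef + (0.1 - (5 - q) * (0.08 + (5 - q) * 0.02))
    return max(ef, 1.3)


def next_interval(prev_interval, repetition, ef, noise=0.05, rng=random):
    """Next interval in days for a 1-based repetition number.

    Follows SM2 and then shifts the result by up to +/- `noise`
    (an assumed fraction) so that items learned together drift apart.
    """
    if repetition == 1:
        base = 1.0
    elif repetition == 2:
        base = 6.0
    else:
        base = prev_interval * ef
    jitter = rng.uniform(-noise, noise)
    # Never schedule less than one day ahead.
    return max(1, round(base * (1 + jitter)))
```

With the grade definitions above, grade 4 leaves the easiness factor
unchanged under the SM2 formula, which fits nicely with "the interval
was just about right".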

I would base any refinements to the scheduling algorithm solely on
extensive feedback from a large number of users. Therefore, in order to
collect this data as easily as possible, a detailed history of the
revision process will be kept, and this data will be anonymously and
transparently uploaded to a central server (with the user's permission
of course).
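
As a sketch of what such a history could look like, the Python below
keeps one record per revision and only produces upload data if the user
has agreed. Every name and field here is hypothetical, the actual upload
is left out entirely, and the random token is just one assumed way of
keeping the data anonymous.

```python
import json
import uuid
from dataclasses import dataclass, asdict


@dataclass
class RevisionRecord:
    item_id: str             # opaque id, not the question text
    repetition: int          # 1-based repetition number
    scheduled_interval: int  # days the scheduler asked for
    actual_interval: int     # days that really passed (early/late revisions)
    easiness: float          # easiness factor before this revision
    grade: int               # grade 0-5 given by the user


class RevisionLog:
    def __init__(self, upload_allowed=False):
        # Random token, not derived from any personal data.
        self.user_token = uuid.uuid4().hex
        self.upload_allowed = upload_allowed
        self.records = []

    def add(self, record):
        self.records.append(record)

    def export(self):
        """Return JSON lines for upload, or None without permission."""
        if not self.upload_allowed:
            return None
        return "\n".join(
            json.dumps({"user": self.user_token, **asdict(r)})
            for r in self.records
        )
```

Writing one plain JSON object per line keeps the data easy to inspect,
which helps with the "transparently" part.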

The first thing I'd like to find out with these statistics is how large
the spread is on intervals which the user considers optimal (grade 4)
for a given repetition number and difficulty. If this spread turns out
to be fairly large, I don't think it makes sense to make the algorithm
more sophisticated. Of course, since I don't have any commercial
interests with the software (i.e. no need for "Buy the upgrade to
algorithm X, which really is a lot better"), the statistics will be
completely open for public scrutiny.
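
The kind of analysis I have in mind could look roughly like the Python
below. It is only a sketch: the tuple layout of the records, the use of
the easiness factor rounded to one decimal as the "difficulty", and the
standard deviation divided by the mean as the measure of spread are all
assumptions.

```python
from collections import defaultdict
from statistics import mean, stdev


def grade4_spread(records):
    """Spread of grade-4 intervals per (repetition, difficulty) group.

    `records` is an iterable of (repetition, easiness, interval, grade)
    tuples. Returns {(repetition, easiness): (count, mean, spread)},
    where spread is the sample standard deviation divided by the mean.
    Groups with fewer than two samples are skipped.
    """
    groups = defaultdict(list)
    for repetition, easiness, interval, grade in records:
        if grade == 4:
            groups[(repetition, round(easiness, 1))].append(interval)

    result = {}
    for key, intervals in groups.items():
        if len(intervals) >= 2:
            m = mean(intervals)
            result[key] = (len(intervals), m, stdev(intervals) / m)
    return result
```

If the spread within a group is of the same order as the differences
between neighbouring groups, a finer-grained algorithm is unlikely to
buy much.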

On the website, I'd give equal attention to both parts of the project.
On the one hand, we have a free tool to improve your memorising process,
but on the other hand there is a research project on long-term memory in
which people can participate and which can result in better algorithms
(which will always be free).

I'd love to hear any feedback on this. Also, people can already take a
look at the current implementation of the scheduling algorithm if they
want to.

Cheers,

Peter