Title text reuse
 SuperMemo 2006
 problem with SuperMemo 2004
Milena: I searched for a string and got to an element where I could not find this string (although it was the same correct subject). Is the search fuzzy or does it also include similar elements?
No. The search looks for exact matches. What you are experiencing is a design problem. Call it the worst design decision in SuperMemo history. Or at least a good candidate.
SuperMemo 2004, unlike its predecessor, does not store automatically generated titles in the text registry. To save space, it reuses long source texts. What your search finds is the text in the title. As you can only see a portion of the title (because of the limited space in the caption or the knowledge tree), you will often find elements that do not seem to contain the searched string.
This alone would be a minor nuisance. The real problem is that title text reuse statistically INCREASES the space taken by titles, i.e. it works precisely against the goal it was meant to accomplish.
This is how it works:
- import a long topic article (e.g. 500KB)
- SuperMemo reuses the text of the article to generate the element's title (without increasing the size of the collection)
- extract a new topic from the beginning of the article
- again SuperMemo reuses the text of the article to generate the title (without increasing the size of the collection)
- after you process the main topic, you execute Done
- Done deletes the article, but SuperMemo still keeps its text, which is registered as the element title in two elements
- now if you execute Done on the child topic, its text and title would normally be deleted, but the original long text is still registered as a title with two elements, so it cannot be deleted after Done (unless SuperMemo checked every use of a text for its actual registration type, which is not feasible, as some texts might be used in hundreds of elements)
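The steps above can be replayed in a small simulation. This is only a sketch under stated assumptions: the `TextRegistry` class, the `simulate` function, the 2,000-character extract and the 60-character title length are all hypothetical, and the real SuperMemo registry format is certainly different. The only point is to show how a text that is still registered as a title survives Done.

```python
TITLE_LEN = 60  # assumed length of an auto-generated title (hypothetical)


class TextRegistry:
    """Toy text store with usage tracking (not SuperMemo's real format)."""

    def __init__(self):
        self.texts = {}   # text id -> stored string
        self.users = {}   # text id -> set of (element, role) pairs
        self._next_id = 0

    def store(self, text, element, role):
        """Store a new text and register its first user; return the text id."""
        text_id = self._next_id
        self._next_id += 1
        self.texts[text_id] = text
        self.users[text_id] = {(element, role)}
        return text_id

    def reuse(self, text_id, element, role):
        """Register another user of an existing text without copying it."""
        self.users[text_id].add((element, role))

    def release(self, text_id, element, role):
        """Drop one user; free the text only when nobody uses it any more."""
        self.users[text_id].discard((element, role))
        if not self.users[text_id]:
            del self.users[text_id]
            del self.texts[text_id]

    def size(self):
        """Total number of characters kept in the registry."""
        return sum(len(t) for t in self.texts.values())


def simulate(reuse_titles):
    """Replay import, extract and Done; return the registry size afterwards."""
    reg = TextRegistry()
    article = "x" * 500_000      # the imported 500KB article
    extract = article[:2_000]    # extract from the beginning (assumed size)

    # Import the article and generate the title of its element.
    art_id = reg.store(article, "parent", "content")
    if reuse_titles:
        reg.reuse(art_id, "parent", "title")
    else:
        reg.store(article[:TITLE_LEN], "parent", "title")

    # Extract a child topic and generate its title.
    ext_id = reg.store(extract, "child", "content")
    if reuse_titles:
        reg.reuse(art_id, "child", "title")
    else:
        reg.store(extract[:TITLE_LEN], "child", "title")

    # Done on both elements: contents are released, titles stay registered.
    reg.release(art_id, "parent", "content")
    reg.release(ext_id, "child", "content")
    return reg.size()


print(simulate(reuse_titles=True))   # 500000: the whole article is still stored
print(simulate(reuse_titles=False))  # 120: only two short titles remain
```

In this toy model, Done releases only the content registration. With title reuse, the article text keeps its two title registrations, so it is never freed.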
The net result is that instead of keeping a short title, you store the entire original article in SuperMemo. It is enough to produce a single case like this to destroy the savings made by reusing titles in hundreds of articles. In other words, for a larger collection, the new method actually wastes disk space (statistically).
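To get a feel for the scale, here is a back-of-the-envelope calculation. The 60-byte title size is an assumption made only for illustration, and `break_even_titles` is a hypothetical helper. The 500KB figure is the article size from the example above.

```python
ARTICLE_BYTES = 500_000  # the 500KB article from the example above
TITLE_BYTES = 60         # assumed size of one separately stored title


def break_even_titles(article_bytes, title_bytes):
    """Number of reused titles whose combined savings equal one stranded article."""
    return article_bytes // title_bytes


print(break_even_titles(ARTICLE_BYTES, TITLE_BYTES))  # 8333
```

Under these assumptions, a single stranded article cancels the savings from thousands of reused titles, which is why one such case easily outweighs the gain made in hundreds of articles.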