[Zeitgeist] [Bug 498878] [NEW] non-clear API: get_*_most_used*

Mikkel Kamstrup Erlandsen Sun, 20 Dec 2009 13:30:44 -0800

Public bug reported:

I just reviewed the latest API additions with the
ZeitgeistClient.get_uris_most_used_with* and corresponding methods
inside the engine. Sorry to be a party-spoiler, but I think there is
some stuff that needs cleanup before we can roll 0.3.1.


The technical issues I found off the bat:

 1. Using the prefix get_* for all of these most-used queries seems a
bit off since you are not retrieving a well-known thing, but are
querying and getting an unpredictable result. I'd prefer if we used
find_* everywhere instead.

 2. The method names also use the *_with_* word where we have used
*_for_* everywhere else in the code hitherto.

 3. Why do these new methods return URIs and not Subject instances?

 4. The DBus method GetMostUsedWithSubjects has a misleading name. It
doesn't take a list of subjects in the args.

 5. Why is the time_range arg. not the first arg like it is for
FindEventIds()?


Then there's something about this API that just tells me it is wrong:

 I. Why is there no paging? The API allows me to do very broad queries,
like "give me anything related to subjects of interpretation Document"
which would presumably have huge result sets?

 II. Assuming there can be many results, why doesn't it return a list of
event ids, like FindEventIds()?

 III. Using "MostUsed" in the name more or less implies which
algorithm will be used. Do we want that? Is it better for forwards
compatibility to use the more generic term "Related"?


All of these questions lead me to think that we really need an API like:

  FindRelatedEventIds(in (xx) time_range,
                      in aE event_templates,
                      in aE related_event_templates,
                      in u storage_state,
                      in u num_events,
                      in u result_type,
                      out au related_event_ids)

The result_type could change the way results are ordered, much like our current
FindEventIds(), but not exactly the same switches. Maybe result_type could be:

  0 : Results ordered by relevancy (where the exact measure of relevancy is up
      to the engine)
  1 : Results ordered by recency
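
To make the proposal concrete, here is a minimal Python sketch of how such a
call might behave against an in-memory list of events. It is not the engine's
implementation. The Event class, the dict-based templates, the window
parameter, and the relevancy measure (number of seed events within a time
window) are all assumptions made up for illustration; the proposal leaves the
measure of relevancy to the engine. storage_state is accepted only to mirror
the proposed signature and is ignored here.

```python
from dataclasses import dataclass
from enum import IntEnum


class ResultType(IntEnum):
    RELEVANCY = 0  # ordering measure is up to the engine
    RECENCY = 1    # newest first


@dataclass(frozen=True)
class Event:
    # Hypothetical, simplified event record (not the real Zeitgeist Event).
    id: int
    timestamp: int
    interpretation: str
    subject_uri: str


def matches(event, templates):
    """True if the event matches any template (a dict of field -> value).

    An empty template list matches every event.
    """
    if not templates:
        return True
    return any(
        all(getattr(event, field) == value for field, value in t.items())
        for t in templates
    )


def find_related_event_ids(store, time_range, event_templates,
                           related_event_templates, storage_state,
                           num_events, result_type, window=1000):
    """Toy model of the proposed FindRelatedEventIds.

    Assumption: an event is "related" if it lies within `window` time units
    of at least one seed event; its score is the number of such seeds.
    """
    begin, end = time_range
    in_range = [e for e in store if begin <= e.timestamp <= end]
    seeds = [e for e in in_range if matches(e, event_templates)]
    seed_ids = {e.id for e in seeds}

    scored = []
    for e in in_range:
        if e.id in seed_ids or not matches(e, related_event_templates):
            continue
        score = sum(1 for s in seeds if abs(s.timestamp - e.timestamp) <= window)
        if score:
            scored.append((score, e))

    if result_type == ResultType.RELEVANCY:
        scored.sort(key=lambda pair: (-pair[0], -pair[1].timestamp))
    else:
        scored.sort(key=lambda pair: -pair[1].timestamp)

    # num_events caps the result size, which addresses the paging concern.
    return [e.id for _, e in scored[:num_events]]


events = [
    Event(1, 1000, "Document", "file:///a.txt"),
    Event(2, 1200, "Image", "file:///b.png"),
    Event(3, 1900, "Document", "file:///a.txt"),
    Event(4, 2500, "Image", "file:///c.png"),
    Event(5, 9000, "Image", "file:///d.png"),
]
seed_templates = [{"subject_uri": "file:///a.txt"}]
related_templates = [{"interpretation": "Image"}]

print(find_related_event_ids(events, (0, 10000), seed_templates,
                             related_templates, 0, 10,
                             ResultType.RELEVANCY))  # [2, 4]
print(find_related_event_ids(events, (0, 10000), seed_templates,
                             related_templates, 0, 10,
                             ResultType.RECENCY))    # [4, 2]
```

Returning ids rather than URIs keeps the call symmetric with FindEventIds(),
and switching result_type changes only the ordering, not the set of matches.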

** Affects: zeitgeist
     Importance: Critical
         Status: New

** Changed in: zeitgeist
    Milestone: None => 0.3.1

** Changed in: zeitgeist
   Importance: Undecided => Critical

-- 
non-clear API: get_*_most_used*
https://bugs.launchpad.net/bugs/498878
You received this bug notification because you are a member of Zeitgeist
Developers, which is the registrant for Zeitgeist Framework.

Status in Zeitgeist Framework: New

