On Tuesday 18 September 2007, Alexander Larsson wrote:
> On Tue, 2007-09-18 at 11:18 +0200, Patryk Zawadzki wrote:
> > On 9/18/07, Alexander Larsson <[EMAIL PROTECTED]> wrote:
> > > On Tue, 2007-09-18 at 00:51 +0200, David Faure wrote:
> > > > On Tuesday 28 August 2007, Alexander Larsson wrote:
> > > > > If several globs matches, and sniffing fails, or doesn't help:
> > > > > fall back to the first glob match
> > > > > (maybe we should do something better here?)
> > > >
> > > > Hmm, I just found the case of "README.txt", which could either be
> > > > "text/plain" due to *.txt or "text/x-readme" due to README*.
> > > > Which one should we pick? The second pattern "looks" more specific
> > > > to my eyes so it should probably win, but how should we quantify that?
> > > > Should we take the longest pattern?
> > >
> > > Yeah, this is tricky. I think the longest pattern is the traditional way
> > > to solve things like that. It will probably work good enought for us.
> >
> > Isn't just enough to check if either of them is the subclass of the
> > second? If so, pick the more specific one.
>
> That only works in the case of subclasses though, which might not always
> be the case. Seems right to use that when its possible though.

Agreed.

Do we also agree that this handling of multiple glob matches can be done
right away inside glob-matching? There is no need to delay it to the
"If several globs matches" resolution (after sniffing), IMHO.

So the new algorithm would be the one described by Alexander, with something
like this prepended:

  Glob-matching should prefer a derived mimetype over a base mimetype, and
  longer matches over shorter ones. However, if two globs of the same length
  match the file, and the two matches are not related in the inheritance
  tree, then we have a "glob conflict", which will be resolved below.

"If several globs matches" in Alexander's algorithm then really becomes
"In case of a glob conflict", i.e. two or more mimetypes with the same glob
(like *.doc or *.ogg).
[Well, technically you could invent patterns like foo.* and *.doc, so that
foo.doc matches both and there is no "longer match", but this is really a
border case (and would simply be handled as a "glob conflict" too).]

My problem is that I can't test the subclass case: README* is the only glob
that has a * but not as the first character, so it's the only one that can
give conflicts. So after implementing "take longest match", I see no way of
testing "take subclass", since in the case of README.txt it is the longest
match anyway. I could use canned data, but it also means that we might not
have a use case for it at the moment :)

OK. Does anyone know which other implementations of shared-mime-info are
around? xdgmime was mentioned; I don't know who knows its code well enough to
modify it once we all agree on the spec changes, but the first question is:
who else needs to approve those spec changes? Rox, I assume?
Thomas Leonard CC'ed (see the thread for more info; we covered "preferring
globs over contents" before talking about glob conflicts).

Thanks for the feedback.
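
To make the proposed rule concrete, here is a minimal Python sketch of it.
This is an illustration only, not the KDE or xdgmime code. The glob table and
the inheritance map are made-up sample data (the "text/x-example-doc" type is
hypothetical), and the order of the two preferences (subclass first, then
pattern length) is an assumption, since the thread does not pin it down.

```python
from fnmatch import fnmatchcase

# Sample data for illustration only; a real implementation would load
# these from the shared-mime-info database.
GLOBS = [
    ("*.txt", "text/plain"),
    ("README*", "text/x-readme"),
    ("*.doc", "application/msword"),
    ("*.doc", "text/x-example-doc"),  # hypothetical type sharing the *.doc glob
]

# child mimetype -> list of direct parents (assumed inheritance data)
PARENTS = {
    "text/x-readme": ["text/plain"],
}


def is_subclass(child, ancestor):
    """Return True if `child` derives, directly or indirectly, from `ancestor`."""
    stack = list(PARENTS.get(child, ()))
    seen = set()
    while stack:
        current = stack.pop()
        if current == ancestor:
            return True
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS.get(current, ()))
    return False


def match_globs(filename):
    """Return the mimetypes that survive glob matching.

    One element: a clear winner. Several elements: a "glob conflict",
    left to the later steps (sniffing etc.). Empty list: no glob matched.
    """
    matches = [(pat, mime) for pat, mime in GLOBS if fnmatchcase(filename, pat)]
    mimes = {mime for _, mime in matches}

    # Prefer derived mimetypes: drop any type that another match derives from.
    derived = {
        m for m in mimes
        if not any(other != m and is_subclass(other, m) for other in mimes)
    }
    candidates = [(pat, mime) for pat, mime in matches if mime in derived]
    if not candidates:
        return []

    # Prefer longer patterns: keep only matches with the longest glob.
    longest = max(len(pat) for pat, _ in candidates)
    return sorted({mime for pat, mime in candidates if len(pat) == longest})


if __name__ == "__main__":
    for name in ("README.txt", "notes.txt", "report.doc"):
        print(name, "->", match_globs(name))
```

With this sample data README.txt resolves to text/x-readme under either
preference, which is exactly the testing problem described above: the two
rules cannot be told apart without canned data.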

Now the KDE glob-matching code takes the longest match :)

-- 
David Faure, [EMAIL PROTECTED], sponsored by Trolltech to work on KDE,
Konqueror (http://www.konqueror.org), and KOffice (http://www.koffice.org).
_______________________________________________
xdg mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/xdg
