https://bugzilla.wikimedia.org/show_bug.cgi?id=10867


Philippe Verdy <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[email protected]




--- Comment #5 from Philippe Verdy <[email protected]>  2009-11-20 17:00:38 
UTC ---
Interestingly, the various number types that can be mapped to the EAN
international standard (which encompasses them all) can also be parsed by
recognizing a list of known prefixes, from which we could automatically
determine the product type, or the geographical area (or linguistic area).

If such an application for automatic determination of these types of
information is developed, the table of prefixes should include:
- the exact standard identifier (such as "ISBN", "ASIN", "ISMN", possibly
"UPC", but NOT "EAN", which identifies nothing by itself because it includes
all the other standards in their longest number form...)
- the total length of the identifier (including its prefix below and the check
digit)
- the prefix (possibly a regular expression): it may be empty if all numbers
with the above length are part of the standard.
- an optional qualifier (such as country or language zone in ISBN)
- a comment containing a verifiable reference for this prefix.
- an optional field containing a bigger length on which the shorter 

The table should be editable somewhere (possibly restricted to admins), and
there may exist several lines for the same standard identifier (because there
are alternate forms using other lengths, or different mappings from a shorter
legacy number to the longer number, or because one wants to subdivide the
standard space for easier identification of products of the same type, for
building specific search pages per product type and/or per optional
subdivision such as the geographic area).
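
To make the idea concrete, here is a minimal sketch of such a table and of a
lookup over it. It is written in Python only for illustration; the names
PrefixRule, RULES and classify are hypothetical, the rows are a tiny
illustrative sample, and the reference strings are placeholders rather than
real citations. More specific prefixes are listed first so that they win over
the generic ones.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PrefixRule:
    standard: str             # e.g. "ISBN" or "ISMN"; never the catch-all "EAN"
    length: int               # total digits, prefix and check digit included
    prefix: str               # regular expression; "" accepts any number of that length
    qualifier: Optional[str]  # e.g. a language zone for ISBN
    reference: str            # placeholder for a verifiable reference


# Illustrative rows only (assumption: a real table needs many more lines).
# Order matters: the first matching rule is returned.
RULES = [
    PrefixRule("ISMN", 13, r"9790", None, "placeholder: ISMN agency"),
    PrefixRule("ISBN", 13, r"978[01]", "English-language zone", "placeholder: ISBN agency"),
    PrefixRule("ISBN", 13, r"9782", "French-language zone", "placeholder: ISBN agency"),
    PrefixRule("ISBN", 13, r"97[89]", None, "placeholder: ISBN agency"),
    PrefixRule("ISSN", 13, r"977", None, "placeholder: ISSN centre"),
    PrefixRule("UPC", 12, r"", None, "placeholder: GS1"),
]


def classify(number: str) -> Optional[PrefixRule]:
    """Return the first rule whose length and prefix match, or None."""
    digits = re.sub(r"[\s-]", "", number)
    for rule in RULES:
        # re.match anchors at the start, so the pattern acts as a prefix test.
        if len(digits) == rule.length and re.match(rule.prefix, digits):
            return rule
    return None
```

With these sample rows, classify("978-2-07-036822-8") returns the
French-language ISBN line, and a 12-digit number falls through to the UPC line
because its prefix is empty.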

Note that the geographic area can be complex to determine (it may require lots
of lines, notably for ISBN), and some legacy geographical areas remain in the
standards even though they are now split across distinct countries: if you
want to search for a book with vendors for a specific geographic zone, they
may need to support the legacy numbers that were allocated to larger zones.

(and these legacy zones are still in use for newer products, because the
geographic subdivisions are then subdivided by producer/publisher company, and
those companies still have unallocated space in their current number blocks,
or because the vendors have disappeared, have been split or merged, and their
existing number blocks became shared, so they no longer map exactly to the
geographic area in which the block prefixes were allocated).

Another table could contain optional equivalent identifiers, or could contain a
list of regular expressions used to recognize them, and another one used to
parse the acceptable number formats (because there may be letters sometimes),
and an identifier for the method used to verify the number validity (by its
check digit): the legacy ISBN-10 check digit is computed differently from the
newer ISBN-13 identifier that uses the EAN method.
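
As a rough sketch of what "an identifier for the method" could look like, the
two check computations can be registered under a method name and selected from
the table. The algorithms themselves are the standard ones (ISBN-10: weights
10 down to 2, modulo 11, with "X" for 10; EAN-13: alternating weights 1 and 3,
modulo 10). The names CHECK_METHODS, is_valid and isbn10_to_isbn13 are
hypothetical, and Python is used only for illustration.

```python
import re


def isbn10_check(body: str) -> str:
    """Check character for the first 9 digits of an ISBN-10."""
    total = sum((10 - i) * int(d) for i, d in enumerate(body))
    value = (11 - total % 11) % 11
    return "X" if value == 10 else str(value)


def ean13_check(body: str) -> str:
    """Check digit for the first 12 digits of an EAN-13 (and so of an ISBN-13)."""
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(body))
    return str((10 - total % 10) % 10)


# Method identifier -> (number of digits before the check character, function).
CHECK_METHODS = {
    "isbn10": (9, isbn10_check),
    "ean13": (12, ean13_check),
}


def is_valid(number: str, method: str) -> bool:
    """Validate a number against the check method named in the table."""
    chars = re.sub(r"[\s-]", "", number).upper()
    body_len, compute = CHECK_METHODS[method]
    if not re.fullmatch(r"[0-9]{%d}[0-9X]" % body_len, chars):
        return False
    return compute(chars[:body_len]) == chars[-1]


def isbn10_to_isbn13(isbn10: str) -> str:
    """Map a legacy ISBN-10 to its longer form: prefix 978, new EAN check digit."""
    body = "978" + re.sub(r"[\s-]", "", isbn10)[:9]
    return body + ean13_check(body)
```

For example, 0-306-40615-2 validates with the "isbn10" method, and its
remapped form 9780306406157 validates with the "ean13" method; the old check
digit is dropped rather than carried over.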

These tables do not necessarily have to be on a Wiki page or in the database.
They could just as well reside in a PHP source file, because the number
validation or remapping methods will frequently require specific PHP code, and
also because it will probably be more efficient there.

