Hi everyone.

As the developer of this Minimum Viable Product (to be honest, it's already 
production-ready imo), I’d like to provide some background about the project.

Due to the intrinsic characteristics of the Chinese language, its character set 
is extremely large: Unicode 17.0 includes 101,996 Han characters. As a result, 
Chinese font files are typically tens of MiB in size, and in some cases even 
exceed OpenType’s technical limits (a single font file can hold at most 65,535 
glyphs), requiring distribution via multiple (two or three) separate font 
files, each around ten MiB in size. This creates several challenges:

1. Storage: MediaWiki namespaces impose a 20 KiB size limit on JavaScript and 
CSS pages, making it impossible to embed a Chinese font via base64 encoding 
directly on a single page. Additionally, TemplateStyles currently doesn’t allow 
inline base64-encoded files. Moreover, the security team is actively promoting 
the implementation of Content Security Policy (CSP), which will technically 
prohibit unauthenticated readers from directly requesting fonts hosted on 
Toolforge. One possible solution might be allowing font files to be hosted on 
Wikimedia Commons (all fonts used in my project are freely licensed); 
otherwise, this issue is nearly unsolvable.

2. Distribution: It’s impractical to expect users to download the entire font 
file at once, so fine-grained subsetting and on-demand delivery are essential. 
The `unicode-range` CSS feature is promising, but it currently suffers from 
unexpected issues in Chinese contexts (e.g., request timeouts). Ideally, the 
server should dynamically serve tailored font subsets based on user 
requests—this is precisely the approach my backend currently implements. 
Commonly used Chinese characters (<10k typically) are already renderable on 
most user devices, so our focus is on rare or obscure characters (e.g., new 
Hanzi added in Unicode updates), which may appear only once or twice on 
specific pages. Modifying ULS or developing a new MediaWiki extension could 
address this in a secure and compliant manner. However, this use case would 
arise only in a very limited number of wikis—likely zhwp, zhsource, and various 
Wiktionaries across languages. Therefore, I would prefer a unified hosting and 
delivery platform, such as a dedicated WMCS project or a Commons API interface.

3. Security: I shouldn’t need to elaborate much here. However, I’d like to note 
that hosting static font files on Toolforge (or another WMCS project) should 
pose no greater security risk than hosting images, audio, or other media files 
on Commons—it’s essentially the same type of static asset.
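
To make point 2 more concrete, here is a minimal sketch of how a client or 
server might pick out only the rare characters on a page and describe them in 
`unicode-range` style. This is not the code my backend runs; the function 
names, and the choice to treat everything in planes 2 and 3 as "rare", are 
assumptions made purely for illustration.

```python
# Hypothetical sketch: none of these names come from the actual backend.

# Code point ranges treated as "rare" here. Using the whole Supplementary and
# Tertiary Ideographic Planes as the cutoff is an assumption for illustration.
RARE_RANGES = [
    (0x20000, 0x2FFFF),  # Supplementary Ideographic Plane
    (0x30000, 0x3FFFF),  # Tertiary Ideographic Plane
]


def rare_code_points(text: str) -> list[int]:
    """Return the sorted, de-duplicated code points of text inside RARE_RANGES."""
    found = {
        ord(ch)
        for ch in text
        if any(lo <= ord(ch) <= hi for lo, hi in RARE_RANGES)
    }
    return sorted(found)


def to_unicode_range(code_points: list[int]) -> str:
    """Collapse sorted code points into a CSS unicode-range style string."""
    parts = []
    i = 0
    while i < len(code_points):
        start = end = code_points[i]
        # Extend the current run while the next code point is consecutive.
        while i + 1 < len(code_points) and code_points[i + 1] == end + 1:
            i += 1
            end = code_points[i]
        parts.append(f"U+{start:X}" if start == end else f"U+{start:X}-{end:X}")
        i += 1
    return ", ".join(parts)


if __name__ == "__main__":
    page_text = "\u4e2d\U00020000\U00020001\U0002A700\U00020000"
    print(to_unicode_range(rare_code_points(page_text)))
```

The resulting list is small for a typical page, so a subset font covering only 
those code points would be far smaller than the full font file.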
_______________________________________________
Wikitech-l mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://lists.wikimedia.org/postorius/lists/wikitech-l.lists.wikimedia.org/