https://bugzilla.wikimedia.org/show_bug.cgi?id=52593

--- Comment #3 from Faidon Liambotis <[email protected]> ---
TL;DR: increasing the limit to at least 1GiB is fine from an ops perspective.

We're currently at 63.6T out of 96T (* 3 replicas * 2 for pmtpa/eqiad = 576T
raw). Individual disks show as much as 70% full. About 5.5T of this is
temp data that hasn't been pruned because of #56401 and friends, so we'll
regain some capacity from there. The thumb RFC can potentially shave off as
much as 15.5T of thumbs (perhaps some number in between, depending on the
solution we'll end up choosing).
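For reference, the raw-capacity arithmetic above works out as follows (numbers taken straight from this comment; the TiB/TB distinction is glossed over, as in the original):

```python
# Capacity figures quoted above (terabytes, approximate).
usable_tb = 96      # usable pool per cluster
used_tb = 63.6      # currently used
replicas = 3        # Swift replica count per site
sites = 2           # pmtpa + eqiad

raw_tb = usable_tb * replicas * sites   # 96 * 3 * 2 = 576T raw
utilization = used_tb / usable_tb       # roughly 66% of the pool in use
```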

Even at the current trend, estimates place us at 75-80% (max of our comfort
zone, to be able to provide redundancy + lead time to procure hardware) by
April/May:
http://ganglia.wikimedia.org/latest/graph.php?r=year&z=xlarge&c=Swift+pmtpa&h=Swift+pmtpa+prod&jr=&js=&v=63597099281113&m=swift_bytes_count&vl=total+bytes&trend=1

There are some ideas of increasing the capacity much earlier than that by
moving pmtpa hardware to eqiad at the end of the year but nothing's decided
yet. I can say with certainty that we're not going to keep 6 replicas of
everything with the new datacenter, but will instead use Swift 1.9's
georeplication features to lower this to, likely, 4.

Varnish's caches are much smaller, obviously, but they're LRU, so unless we
have tons of very popular large files, it shouldn't affect them much.

Large files aren't a big deal with Swift or its underlying filesystem (XFS) --
at least up to (the default of) 5G; after that, we'd need to explore segmented
files in Swift itself ("large object support"). Large files are actually *much*
more efficient to handle than really small files (filesystem overheads etc.)
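To make the 5G threshold concrete, here's a minimal sketch of when a file would need Swift's segmented "large object" handling (the helper is hypothetical, mine, not part of any Swift API; the 5GiB default is Swift's standard max single-object size):

```python
import math

# Swift's default maximum size for a single, unsegmented object: 5GiB.
DEFAULT_MAX_OBJECT = 5 * 1024**3

def segments_needed(size_bytes, segment_size=DEFAULT_MAX_OBJECT):
    """Number of segments Swift's large-object support would need for a file.

    Returns 1 when the file fits as a plain object. Hypothetical helper for
    illustration only.
    """
    return max(1, math.ceil(size_bytes / segment_size))
```

At a 1GiB upload limit every file still fits in a single object (`segments_needed(1024**3)` is 1), so segmentation wouldn't even come into play.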

Now, a large number of large files could throw our planning off, especially if
you account for a multiplication factor due to transcoding: we keep multiple
versions of the same video file in Swift, in different formats & resolutions.

However, I don't think it's even remotely plausible this would happen. All of
our transcoded files account for a mere 1.1T. Additionally, the 21,251,977
objects in Commons (originals, does *not* include thumbs/transcoded) are
distributed in size as follows:

0 bytes  - 4.0KiB   = 368841
4.0KiB   - 8.0KiB   = 275486
8.0KiB   - 16.0KiB  = 596394
16.0KiB  - 32.0KiB  = 972185
32.0KiB  - 64.0KiB  = 1528037
64.0KiB  - 128.0KiB = 2466817
128.0KiB - 256.0KiB = 2294701
256.0KiB - 512.0KiB = 2247147
512.0KiB - 1.0MiB   = 2453605
1.0MiB   - 2.0MiB   = 2746332
2.0MiB   - 4.0MiB   = 2931704
4.0MiB   - 8.0MiB   = 1832701
8.0MiB   - 16.0MiB  = 410738
16.0MiB  - 32.0MiB  = 88009
32.0MiB  - 64.0MiB  = 24599
64.0MiB  - 128.0MiB = 13504
128.0MiB - 256.0MiB = 933
256.0MiB - 512.0MiB = 192
512.0MiB - 1.0GiB   = 52
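As a sanity check, the percentages below can be recomputed from this table with a few lines of Python (bucket counts copied verbatim, smallest bucket first):

```python
# Object counts per size bucket, 0-4KiB up through 512MiB-1GiB.
counts = [368841, 275486, 596394, 972185, 1528037, 2466817, 2294701,
          2247147, 2453605, 2746332, 2931704, 1832701, 410738, 88009,
          24599, 13504, 933, 192, 52]

total = sum(counts)            # 21,251,977 Commons originals
over_64mib = sum(counts[-4:])  # the four buckets above 64MiB: 14,681 files
share = 100 * over_64mib / total
```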

Files over 64MiB are a mere ~0.07% of the total file count and account for
under 2T in total. Files over 128MiB number less than one tenth of the files
between 64MiB and 128MiB. I think it's safe to assume that the aggregate size
of files in the 512MiB-1.0GiB range will stay well below 1TiB in the mid-term,
which is more than fine given our current media storage pool.

Finally, a factor that should be considered is the resources needed from the
videoscaler (TMH) infrastructure. Jan Gerber is the expert here, but I don't
think going to 1GiB is going to make any big difference. Maybe silly things
such as cgroup limits would need to be adjusted but it's not a pressing matter
anyway as the process is asynchronous and we can course-correct as we go
forward.
