-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

On Fri, Nov 28, 2008 at 11:28 PM, Lemmih  wrote:
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)

iEYEAREKAAYFAkkw1BsACgkQvpDo5Pfl1oIYagCgjhmSV7MUbcaY6DbzgtUWEesn
770An3qpJAf+rhiBJ+3t+a/pN5/E6bB6
=gJk7
-----END PGP SIGNATURE-----

>
> Hiya,
>
> I noticed something weird while editing this 987 lines long file
> http://code.haskell.org/lhc/src/Main.hs.
>
> The memory usage is collected via +RTS -sstderr.
> Just opening the file takes 3 MB. So far so good.
> Scrolling to the end of the page raises the memory usage to 52 MB (1
> MB lost due to fragmentation).
> Inserting a single space at the end of the file raises memory usage to
> 129 MB (2 MB lost due to fragmentation).
> Inserting a bunch of spaces increases the memory usage to 192 MB (3 MB lost).
>
> This is with the darcs version of yi (with 2797 patches) compiled with
> GHC-6.10.1. I'm using the emacs keymap if that matters.

I noticed something else while looking at the profiling:

Here is the output just for opening Main.hs and scrolling to the
bottom and inserting some spaces in random places:

./yi-i386-linux ../Main.hs +RTS -sfile
   1,093,956,880 bytes allocated in the heap
     501,879,016 bytes copied during GC
      26,564,856 bytes maximum residency (31 sample(s))
         429,588 bytes maximum slop
              63 MB total memory in use (0 MB lost due to fragmentation)

  Generation 0:  2065 collections,     0 parallel,  0.76s,  0.84s elapsed
  Generation 1:    31 collections,     0 parallel,  1.44s,  1.51s elapsed

  Task  0 (worker) :  MUT time:   0.00s  (  0.00s elapsed)
                      GC  time:   0.00s  (  0.00s elapsed)

  Task  1 (worker) :  MUT time:   3.38s  ( 18.91s elapsed)
                      GC  time:   0.00s  (  0.00s elapsed)

  Task  2 (worker) :  MUT time:   1.86s  ( 18.91s elapsed)
                      GC  time:   1.49s  (  1.57s elapsed)

  Task  3 (worker) :  MUT time:   3.35s  ( 18.90s elapsed)
                      GC  time:   0.00s  (  0.00s elapsed)

  Task  4 (bound)  :  MUT time:   0.00s  (  0.00s elapsed)
                      GC  time:   0.70s  (  0.77s elapsed)

  INIT  time    0.00s  (  0.04s elapsed)
  MUT   time    1.16s  ( 18.90s elapsed)
  GC    time    2.20s  (  2.34s elapsed)
  EXIT  time    0.03s  (  0.01s elapsed)
  Total time    3.38s  ( 21.29s elapsed)

  %GC time      64.9%  (11.0% elapsed)

  Alloc rate    920,787,194 bytes per MUT second

  Productivity  35.1% of total user, 5.6% of total elapsed

recordMutableGen_sync: 0
gc_alloc_block_sync: 0
whitehole_spin: 0
gen[0].steps[0].sync_todo: 0
gen[0].steps[0].sync_large_objects: 0
gen[0].steps[1].sync_todo: 0
gen[0].steps[1].sync_large_objects: 0
gen[1].steps[0].sync_todo: 0
gen[1].steps[0].sync_large_objects: 0

63MB total memory doesn't seem *too* terrible to me. But suppose I go
into the file, select the entire buffer, and paste 5 times (so the
buffer now contains 6x the file content)? Here's the output from that:
./yi-i386-linux ../Main.hs +RTS -sfile
  37,205,805,072 bytes allocated in the heap
   6,417,077,404 bytes copied during GC
     369,854,576 bytes maximum residency (26 sample(s))
       2,581,916 bytes maximum slop
            1077 MB total memory in use (8 MB lost due to fragmentation)

  Generation 0: 70935 collections,     0 parallel, 14.13s, 14.23s elapsed
  Generation 1:    26 collections,     0 parallel, 10.60s, 13.80s elapsed

  Task  0 (worker) :  MUT time:   0.00s  (  0.00s elapsed)
                      GC  time:   0.00s  (  0.00s elapsed)

  Task  1 (worker) :  MUT time:  56.74s  ( 90.55s elapsed)
                      GC  time:   0.00s  (  0.00s elapsed)

  Task  2 (worker) :  MUT time:  43.70s  ( 90.55s elapsed)
                      GC  time:  13.03s  ( 13.60s elapsed)

  Task  3 (worker) :  MUT time:  56.74s  ( 90.55s elapsed)
                      GC  time:   0.00s  (  0.00s elapsed)

  Task  4 (bound)  :  MUT time:   0.00s  (  0.00s elapsed)
                      GC  time:  11.70s  ( 14.43s elapsed)

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time   32.00s  ( 90.55s elapsed)
  GC    time   24.73s  ( 28.03s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time   56.48s  (118.58s elapsed)

  %GC time      43.8%  (23.6% elapsed)

  Alloc rate    1,171,984,954 bytes per MUT second

  Productivity  56.2% of total user, 26.8% of total elapsed

recordMutableGen_sync: 0
gc_alloc_block_sync: 0
whitehole_spin: 0
gen[0].steps[0].sync_todo: 0
gen[0].steps[0].sync_large_objects: 0
gen[0].steps[1].sync_todo: 0
gen[0].steps[1].sync_large_objects: 0
gen[1].steps[0].sync_todo: 0
gen[1].steps[0].sync_large_objects: 0

Now it claims to be using total memory of nearly 1.1 gigabytes. How
did a 6x increase in length lead to a 17x increase in memory usage?
Well, I mentioned to Lemmih that I remembered syntax highlighting had
been a problem in the past.

So I took an e-text of _The Scarlet Pimpernil_, which at 480K is more
than 10x the size of Main.hs, and loaded it up. Scrolling through it,
up and down, and added random spaces. Here's the usage:

./yi-i386-linux ../60.txt +RTS -sfile
   1,544,508,552 bytes allocated in the heap
     245,372,368 bytes copied during GC
       1,134,728 bytes maximum residency (161 sample(s))
         149,824 bytes maximum slop
               4 MB total memory in use (0 MB lost due to fragmentation)

  Generation 0:  2791 collections,     0 parallel,  1.01s,  1.04s elapsed
  Generation 1:   161 collections,     0 parallel,  0.42s,  0.41s elapsed

  Task  0 (worker) :  MUT time:   0.00s  (  0.00s elapsed)
                      GC  time:   0.00s  (  0.00s elapsed)

  Task  1 (worker) :  MUT time:   3.14s  ( 20.60s elapsed)
                      GC  time:   0.00s  (  0.00s elapsed)

  Task  2 (worker) :  MUT time:   2.51s  ( 20.60s elapsed)
                      GC  time:   0.63s  (  0.61s elapsed)

  Task  3 (worker) :  MUT time:   3.14s  ( 20.60s elapsed)
                      GC  time:   0.00s  (  0.00s elapsed)

  Task  4 (bound)  :  MUT time:   0.00s  (  0.00s elapsed)
                      GC  time:   0.80s  (  0.84s elapsed)

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time    1.72s  ( 20.60s elapsed)
  GC    time    1.43s  (  1.45s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time    3.12s  ( 22.06s elapsed)

  %GC time      45.8%  (6.6% elapsed)

  Alloc rate    914,931,969 bytes per MUT second

  Productivity  54.0% of total user, 7.6% of total elapsed

recordMutableGen_sync: 0
gc_alloc_block_sync: 0
whitehole_spin: 0
gen[0].steps[0].sync_todo: 0
gen[0].steps[0].sync_large_objects: 0
gen[0].steps[1].sync_todo: 0
gen[0].steps[1].sync_large_objects: 0
gen[1].steps[0].sync_todo: 0
gen[1].steps[0].sync_large_objects: 0

4mb in total usage. Let's try the 6x test:

./yi-i386-linux ../60.txt +RTS -sfile
   7,266,926,716 bytes allocated in the heap
   1,639,234,772 bytes copied during GC
       9,764,592 bytes maximum residency (105 sample(s))
       1,128,960 bytes maximum slop
              28 MB total memory in use (0 MB lost due to fragmentation)

  Generation 0: 13762 collections,     0 parallel,  5.01s,  5.13s elapsed
  Generation 1:   105 collections,     0 parallel,  2.02s,  2.06s elapsed

  Task  0 (worker) :  MUT time:   0.00s  (  0.00s elapsed)
                      GC  time:   0.00s  (  0.00s elapsed)

  Task  1 (worker) :  MUT time:  16.71s  ( 60.28s elapsed)
                      GC  time:   0.00s  (  0.00s elapsed)

  Task  2 (worker) :  MUT time:  13.86s  ( 60.28s elapsed)
                      GC  time:   2.85s  (  3.00s elapsed)

  Task  3 (worker) :  MUT time:  16.73s  ( 60.26s elapsed)
                      GC  time:   0.00s  (  0.00s elapsed)

  Task  4 (bound)  :  MUT time:   0.00s  (  0.00s elapsed)
                      GC  time:   4.18s  (  4.19s elapsed)

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time    9.70s  ( 60.26s elapsed)
  GC    time    7.02s  (  7.20s elapsed)
  EXIT  time    0.04s  (  0.02s elapsed)
  Total time   16.77s  ( 67.47s elapsed)

  %GC time      41.9%  (10.7% elapsed)

  Alloc rate    746,044,544 bytes per MUT second

  Productivity  58.1% of total user, 14.4% of total elapsed

recordMutableGen_sync: 0
gc_alloc_block_sync: 0
whitehole_spin: 0
gen[0].steps[0].sync_todo: 0
gen[0].steps[0].sync_large_objects: 0
gen[0].steps[1].sync_todo: 0
gen[0].steps[1].sync_large_objects: 0
gen[1].steps[0].sync_todo: 0
gen[1].steps[0].sync_large_objects: 0

So there's some bad scaling here too - 6*4 = 24, not 28, but nowhere
what we saw earlier.

Moral of the story: syntax highlighting is expensive!

--
gwern

--~--~---------~--~----~------------~-------~--~----~
Yi development mailing list
[email protected]
http://groups.google.com/group/yi-devel
-~----------~----~----~----~------~----~------~--~---

Reply via email to