Hi there,

I'm running a custom image built with Yocto on my Akita, currently with the
3.2 kernel. I suspect that the ads7846 driver takes quite a bit more CPU time
than the older, simpler corgi_ts driver. For instance, simply running "top"
and keeping constant pressure on the touch panel starts up ksoftirqd, and
the irq/107-ads7846 thread takes up to 3% of CPU time. This does not happen
on the older kernels with the corgi_ts driver. I know this is greatly
oversimplified, but I do have oprofile stats to make a better point.

I use kdrive 1.7.99 and gtk 2.16.6 in my image. Here's another simple test
I did:

I open a text file (~1000 lines) in leafpad, then scroll to the bottom and
back up again two times, using the touchscreen to drag the scrollbar. This
is done in portrait mode to factor out the fb rotation performance penalty.
Scrolling is mostly blitting, so most of the action should be in Xfbdev and
Pixman. Here are some abbreviated oprofile stats on a 3.2 kernel for this
simple test:

System wide
==================
CPU: ARM/XScale PMU2, speed 416000 MHz (estimated)
Counted CPU_CYCLES events (clock cycles counter) with a unit mask of 0x00
(No unit mask) count 100000
CPU_CYCLES:100000|
  samples|      %|
------------------
    80225 30.6922 vmlinux
    35017 13.3967 Xfbdev
    25602  9.7947 libpixman-1.so.0.23.6
    24123  9.2289 libc-2.13.so
    18613  7.1209 libglib-2.0.so.0.3000.0
    18328  7.0119 libpango-1.0.so.0.2800.4
    10324  3.9497 libgobject-2.0.so.0.3000.0
     8641  3.3058 libgdk-x11-2.0.so.0.1200.7
     6890  2.6359 libgtk-x11-2.0.so.0.1200.7
     6163  2.3578 oprofiled


Kernel - 3.2
===================
CPU: ARM/XScale PMU2, speed 416000 MHz (estimated)
Counted CPU_CYCLES events (clock cycles counter) with a unit mask of 0x00
(No unit mask) count 100000
samples  %        symbol name
12809    15.9663  __do_softirq
9602     11.9688  __sched_text_start
3620      4.5123  cpu_xscale_switch_mm
3569      4.4487  giveback
2900      3.6148  spitz_ads7846_wait_for_hsync
2657      3.3119  rcu_bh_qs
1793      2.2350  tasklet_action
1720      2.1440  pump_messages
1631      2.0330  ring_buffer_consume
1587      1.9782  spi_async_locked
1510      1.8822  pump_transfers
1287      1.6042  input_event
1118      1.3936  run_ksoftirqd
1073      1.3375  worker_thread
982       1.2241  process_one_work
972       1.2116  get_signal_to_deliver
959       1.1954  rcu_sched_qs
783       0.9760  complete

Xfbdev
==================
CPU: ARM/XScale PMU2, speed 416000 MHz (estimated)
Counted CPU_CYCLES events (clock cycles counter) with a unit mask of 0x00
(No unit mask) count 100000
samples  %        symbol name
2135      6.2480  dixLookupPrivate
1292      3.7810  miGlyphs
1280      3.7459  Dispatch
1090      3.1898  CompositePicture
1087      3.1811  ReadRequestFromClient
968       2.8328  WaitForSomething


Pixman
==================
CPU: ARM/XScale PMU2, speed 416000 MHz (estimated)
Counted CPU_CYCLES events (clock cycles counter) with a unit mask of 0x00
(No unit mask) count 100000
samples  %        symbol name
7537     30.1854  pixman_blt_mmx.part.1
3411     13.6609  mmx_composite_over_n_8_0565
1177      4.7138  pixman_fill_mmx
1128      4.5176  pixman_image_composite32
1074      4.3013  mmx_composite_add_8_8
1039      4.1612  _pixman_bits_image_setup_accessors
832       3.3321  _pixman_image_validate
765       3.0638  pixman_region32_selfcheck
654       2.6192  pixman_region32_init
652       2.6112  _pixman_lookup_composite_function
561       2.2468  pixman_compute_composite_region32
531       2.1266  bits_image_property_changed



It seems that most of the time is spent in the kernel, processing touchscreen
interrupts. This visibly impacts performance: you can clearly see the text
redrawing as you scroll. Here's the oprofile output of that same little test,
only now I use a USB mouse to drag the scrollbar:


System wide
==================
CPU: ARM/XScale PMU2, speed 416000 MHz (estimated)
Counted CPU_CYCLES events (clock cycles counter) with a unit mask of 0x00
(No unit mask) count 100000
CPU_CYCLES:100000|
  samples|      %|
------------------
    39003 19.8228 vmlinux
    35678 18.1329 libpixman-1.so.0.23.6
    25286 12.8513 Xfbdev
    22998 11.6885 libc-2.13.so
    13126  6.6711 libglib-2.0.so.0.3000.0
    10189  5.1784 libpango-1.0.so.0.2800.4
     9336  4.7449 libgobject-2.0.so.0.3000.0
     7392  3.7569 libgtk-x11-2.0.so.0.1200.7
     6290  3.1968 libgdk-x11-2.0.so.0.1200.7
     4160  2.1143 oprofiled

Kernel
=================
CPU: ARM/XScale PMU2, speed 416000 MHz (estimated)
Counted CPU_CYCLES events (clock cycles counter) with a unit mask of 0x00
(No unit mask) count 100000
samples  %        symbol name
3672      9.4147  __sched_text_start
3474      8.9070  cpu_xscale_switch_mm
3143      8.0584  __do_softirq
1132      2.9023  ring_buffer_consume
1069      2.7408  __hrtimer_start_range_ns
927       2.3767  do_select
588       1.5076  __copy_to_user_std
581       1.4896  __wake_up_sync_key
556       1.4255  unix_stream_recvmsg
556       1.4255  vector_swi
455       1.1666  default_idle
455       1.1666  mc_copy_user_page
432       1.1076  ktime_get_ts
430       1.1025  core_sys_select

Xfbdev
=================
CPU: ARM/XScale PMU2, speed 416000 MHz (estimated)
Counted CPU_CYCLES events (clock cycles counter) with a unit mask of 0x00
(No unit mask) count 100000
samples  %        symbol name
1696      6.8197  dixLookupPrivate
1124      4.5197  Dispatch
827       3.3254  ReadRequestFromClient
752       3.0238  miGlyphs
638       2.5654  CompositePicture
617       2.4810  dixLookupResourceByType
582       2.3403  XYToWindow
542       2.1794  WaitForSomething
509       2.0467  image_from_pict
487       1.9583  XaceHook
460       1.8497  damageComposite
379       1.5240  dixLookupPrivateAddr
359       1.4436  getDrawableDamageRef
333       1.3390  fbValidateGC
325       1.3068  fbBltOne

Pixman
=================
CPU: ARM/XScale PMU2, speed 416000 MHz (estimated)
Counted CPU_CYCLES events (clock cycles counter) with a unit mask of 0x00
(No unit mask) count 100000
samples  %        symbol name
23715    67.0029  pixman_blt_mmx.part.1
2656      7.5041  mmx_composite_over_n_8_0565
1099      3.1050  pixman_fill_mmx
812       2.2942  mmx_composite_add_8_8
712       2.0116  _pixman_bits_image_setup_accessors
619       1.7489  pixman_image_composite32
487       1.3759  pixman_region32_selfcheck
455       1.2855  pixman_region32_init
397       1.1217  _pixman_lookup_composite_function
353       0.9973  _pixman_image_validate

Scrolling with a mouse is much faster, and the oprofile data shows it. It is
also visually faster, and screen redraws are much less noticeable.
Finally, here's that same test, on that same image, but this time with a
2.6.26-RP kernel, which uses the old corgi_ts driver, again dragging the
scrollbar with the touchscreen.

System wide
==================
CPU: ARM/XScale PMU2, speed 622 MHz (estimated)
Counted CPU_CYCLES events (clock cycles counter) with a unit mask of 0x00
(No unit mask) count 100000
CPU_CYCLES:100000|
  samples|      %|
------------------
   131815 25.0331 libpixman-1.so.0.23.6
    84564 16.0596 libc-2.13.so
    82888 15.7414 vmlinux
    51252  9.7333 Xfbdev
    30044  5.7057 libglib-2.0.so.0.3000.2
    25637  4.8688 libgobject-2.0.so.0.3000.2
    19355  3.6757 libpango-1.0.so.0.2800.4
    18839  3.5777 libgtk-x11-2.0.so.0.1600.6
    17266  3.2790 libgdk-x11-2.0.so.0.1600.6
    17112  3.2498 libcairo.so.2.11000.2
    10855  2.0615 libpthread-2.13.so
     9703  1.8427 oprofiled

vmlinux - 2.6.26-RP
==================
CPU: ARM/XScale PMU2, speed 622 MHz (estimated)
Counted CPU_CYCLES events (clock cycles counter) with a unit mask of 0x00
(No unit mask) count 100000
samples  %        symbol name
11787    14.2204  ts_interrupt_main.isra.0
7505      9.0544  cpu_xscale_switch_mm
6219      7.5029  __do_softirq
4832      5.8296  ide_intr
3778      4.5580  schedule
2174      2.6228  get_signal_to_deliver
1483      1.7892  __wake_up_sync
1431      1.7264  __copy_to_user
1413      1.7047  tty_ldisc_deref
1243      1.4996  __switch_to
1158      1.3971  do_select

Xfbdev
=================
CPU: ARM/XScale PMU2, speed 622 MHz (estimated)
Counted CPU_CYCLES events (clock cycles counter) with a unit mask of 0x00
(No unit mask) count 100000
samples  %        symbol name
3159      6.2847  dixLookupPrivate
2472      4.9179  Dispatch
2217      4.4106  miGlyphs
1947      3.8735  CompositePicture
1817      3.6148  ReadRequestFromClient
1275      2.5366  WaitForSomething
1178      2.3436  dixLookupPrivateAddr
1135      2.2580  image_from_pict
992       1.9735  getDrawableDamageRef
977       1.9437  damageRegionProcessPending
885       1.7607  dixLookupResourceByType
869       1.7288  damageComposite
818       1.6274  miSpriteSourceValidate
767       1.5259  ValidatePicture
724       1.4404  FlushIfCriticalOutputPending
621       1.2355  fbBltOne

Pixman
==================
CPU: ARM/XScale PMU2, speed 622 MHz (estimated)
Counted CPU_CYCLES events (clock cycles counter) with a unit mask of 0x00
(No unit mask) count 100000
samples  %        symbol name
106356   81.0838  pixman_blt_mmx.part.1
5485      4.1817  mmx_composite_over_n_8_0565
2327      1.7741  pixman_fill_mmx
1613      1.2297  _pixman_bits_image_setup_accessors
1519      1.1581  mmx_composite_add_8_8
1325      1.0102  pixman_image_composite32
1273      0.9705  pixman_region32_selfcheck
1155      0.8806  pixman_region32_init


Once again, the oprofile output for 2.6.26-RP suggests that most of the
time is spent blitting, as it should be. The CPU speed in the oprofile
header is not accurate; the clock is fixed at 416 MHz, as reported by
/proc/cpuinfo. Scrolling is also visibly faster than on the 3.2 kernel.
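
To put the three runs on a common footing, here is a rough sketch that
multiplies Pixman's share of all samples by the share of pixman_blt_mmx.part.1
within Pixman, giving the blit's share of the whole profile. The numbers are
copied from the reports above; the helper name blit_share is only for
illustration, and since the runs differ in length and sample count, only the
percentages (not the raw counts) are comparable.

```python
# (Pixman % of system-wide samples, pixman_blt_mmx.part.1 % within Pixman),
# copied from the oprofile reports for each run.
runs = {
    "3.2, touchscreen": (9.7947, 30.1854),
    "3.2, usb mouse": (18.1329, 67.0029),
    "2.6.26-RP, touchscreen": (25.0331, 81.0838),
}


def blit_share(lib_pct, sym_pct):
    """Return the symbol's share of all samples, in percent."""
    return lib_pct * sym_pct / 100.0


for name, (lib_pct, sym_pct) in runs.items():
    print(f"{name:24s} {blit_share(lib_pct, sym_pct):6.2f}% of all samples")
```

This works out to roughly 3%, 12% and 20% of all samples respectively, which
matches the impression that the 3.2 touchscreen run spends far less of its
time actually blitting.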

So what's going on here? Is the ads7846 driver really such a hog? Or is
this due to the pxa spi driver?  Corgi_ts used a light ssp microwire
interface.
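
As a rough first look at that question, the sketch below adds up the samples
of the symbols in the 3.2 touchscreen kernel profile that look like they
belong to the SPI and touchscreen path. The grouping is an assumption based
only on the symbol names (giveback, pump_messages and pump_transfers appear
to come from the pxa SPI driver, spi_async_locked from the SPI core); it has
not been checked against call graphs, and the symbol list above is
abbreviated, so treat the result as a lower bound at best.

```python
# Total vmlinux samples in the 3.2 touchscreen run (system-wide report).
VMLINUX_TOTAL = 80225

# Symbols assumed (from their names only) to be on the SPI/touchscreen path.
spi_path = {
    "giveback": 3569,
    "spitz_ads7846_wait_for_hsync": 2900,
    "pump_messages": 1720,
    "spi_async_locked": 1587,
    "pump_transfers": 1510,
}

spi_samples = sum(spi_path.values())
spi_share = 100.0 * spi_samples / VMLINUX_TOTAL

print(f"SPI-path samples: {spi_samples} ({spi_share:.1f}% of vmlinux)")
```

That is about 14% of kernel samples in symbols that do not show up at all in
the mouse profile, before counting whatever part of __do_softirq,
tasklet_action and the scheduler activity is caused by the same path.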

I tried to reintroduce the old corgi_ts driver into the current kernel but
it doesn't work yet.
_______________________________________________
Zaurus-devel mailing list
Zaurus-devel@lists.linuxtogo.org
http://lists.linuxtogo.org/cgi-bin/mailman/listinfo/zaurus-devel
