Hi Abraham, I'm quite interested in this topic. I'll have your tests performed and report back the results. But it will probably take a month or so.
Meanwhile, let me recap the possible optimization vectors for video on RDP systems:

1. terminal CPU usage
2. server CPU usage
3. network bandwidth

Each of these factors can become a performance bottleneck, depending on your hardware. You seem to be addressing 2 and 3 with your optimizations.

I would like to share with you (see attachment) that if you have a Gigabit network and a fast server, the limit that is hit is usually 1, on a single-terminal video test. At least this is what we found. In the attached document you can see the test results that support this conclusion.

You will see in the document that the only way for the server CPU usage to be the bottleneck is having an extremely fast client (client 5, for example). In test number 3 you have almost smooth fullscreen YouTube using 130-180 Mbps and 90% CPU of a 3 GHz physical RDP server machine. This was only possible because the client (client 5) was a very powerful Toshiba laptop. With a regular terminal we wouldn't be able to fully use the server CPU and network, because the terminal CPU limit would be hit first and the video quality would be degraded.

This leads me to think that lowering CPU usage on the terminal (which is usually a weak machine) should be the priority if we want to have smooth video.

Some ideas worth evaluating:

- is the code optimized on the RDP client (rdesktop, xfreerdp) in terms of GCC opts and algorithms?
- could the client use hardware accel to scale video and/or convert colorspaces (Xvideo, VAAPI, VDPAU, ...)?
- could the x11rdp server implement a virtual Xorg extension (Xvideo, VAAPI, ...) so that the decoding can be partially done on the client by hardware (see above)? Note that this would lower CPU usage on the server as well.

Regarding the is_rle_worth idea, perhaps someone on this list can help you. Maybe sampling a small percentage of the image is enough to reach conclusions, and thus less CPU expensive?
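To make the sampling idea a bit more concrete, here is a rough sketch in C of a test that looks only at the two diagonals of a frame and stops as soon as it has seen more than a given number of distinct colors. The function name looks_like_video, the MAX_TRACKED_COLORS cap, the 32-bit pixel layout and the threshold parameter are all assumptions made for illustration; none of them come from the xrdp sources, and a sensible threshold would have to be found by measurement.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cap on how many distinct colors we bother to remember. */
#define MAX_TRACKED_COLORS 64

/*
 * Sample the two diagonals of a width x height frame of 32-bit pixels.
 * Returns 1 ("video") as soon as more than `threshold` distinct colors
 * have been seen, 0 ("cartoon") otherwise.
 * Assumption: threshold is smaller than MAX_TRACKED_COLORS.
 */
int looks_like_video(const uint32_t *pixels, int width, int height,
                     int threshold)
{
    uint32_t seen[MAX_TRACKED_COLORS];
    int count = 0;

    assert(pixels != NULL && width > 0 && height > 0);
    assert(threshold >= 0 && threshold < MAX_TRACKED_COLORS);

    for (int x = 0; x < width; x++) {
        /* Map the column to a row so non-square frames are covered too. */
        int y = (int)((int64_t)x * height / width);
        const uint32_t *row = pixels + (size_t)y * width;
        /* One sample on the main diagonal, one on the anti-diagonal. */
        uint32_t samples[2] = { row[x], row[width - 1 - x] };

        for (int s = 0; s < 2; s++) {
            int i = 0;
            while (i < count && seen[i] != samples[s])
                i++;
            if (i == count) {            /* color not seen before */
                seen[count++] = samples[s];
                if (count > threshold)
                    return 1;            /* too many colors: stop early */
            }
        }
    }
    return 0;
}
```

A caller could skip RLE and send the frame raw whenever this returns 1. The linear search over seen[] is only reasonable because the cap is small; whether the whole test is cheap enough would still need to be measured.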
Sampling across two diagonals would probably be enough to count the number of colors, and less expensive than sampling the whole frame. If nr_colors > certain_reasonable_threshold you could consider the frame as "video" instead of "cartoon". This is speculation, for now.

Best regards, gracias
Gustavo

----- Original Message -----
> From: "Abraham Macías Paredes" <amac...@solutia-it.es>
> To: "Gustavo Homem" <gust...@angulosolido.pt>, "xrdp-devel" <xrdp-devel@lists.sourceforge.net>
> Sent: Friday, May 24, 2013 7:46:34 AM
> Subject: RE: [Xrdp-devel] Improving XRDP performance
>
> Hi Gustavo,
>
> The tests that I've made are really simple tests with a controlled
> development environment (simply opening Firefox and watching the
> same video http://www.youtube.com/watch?v=tgpc8c2xoZQ). And I ran
> the test twice for every test case. The tests are run in a VMware
> virtual machine with 1 VCPU and 1 GB of RAM.
>
> Nevertheless your doubts are logical. Since I don't have a proper
> benchmark I can't say "this is 15% faster".
> But enabling the "-O3" parameter in GCC (a compiler that doesn't
> optimize code by default) usually means around a 15% performance
> gain. If I were using Visual C++ (a compiler that optimizes code by
> default) there would probably be no performance gain at all after
> enabling the optimization parameters.
>
> If we talk about bandwidth, the "bitmap RLE" compression method is
> only a good compression method when you use drawings (images with
> very few colors); when you use real images (like photos) it is a very
> bad compression method. And the bandwidth to watch a YouTube video on
> a LAN can be up to 80 Mbit/s (measured in my tests with "iptraf").
> So if bandwidth is a problem you can't watch a video with XRDP.
>
> Obviously, the "is_rle_compression_worthy" method that I'm proposing
> has a CPU cost; that's why I'm asking if somebody knows a good
> and fast algorithm to do this.
> I think that counting the number of
> different colors of the image would tell me if it can be compressed
> or not, but it would be very slow. So I'm hoping that someone
> can tell me about something better.
>
> And finally, it would be nice if you ran your own test and we shared
> the results, so I attach the modified "xrdp_bitmap_compress.c" file.
>
> If you compare it with the original, you will find that I've changed
> some operations:
> * "%" is replaced by "&" where possible.
> * "/" is replaced by ">>" where possible.
> * "*" is replaced by "<<" where possible.
>
> And I've reordered the "for" and "if" structures to simplify the "if"
> conditions. Let me explain that.
>
> The original code was:
>
> for () {
>     // Code
>     if ((last_line == 0 && pixel == 0) || (last_line != 0 && pixel == ypixel)) {
>         //code
>     }
>     // Code
> }
>
> And that means that for every pixel you are checking if you are in
> the first line (last_line == 0) or not. So I've reordered the code
> to:
>
> if (last_line == 0) {
>
>     for () {
>         if (pixel == 0) {
>             //code
>         }
>     }
>
> } else {
>
>     for () {
>         if (pixel == ypixel) {
>             //code
>         }
>     }
>
> }
>
> And that's all. (This is a very long mail, isn't it?)
>
> So please, run your own tests, think about improvements and give me
> any advice.
>
> Thank you very much, or "muito obrigado" my Portuguese friend.
>
>
> -----Original Message-----
> From: Gustavo Homem [mailto:gust...@angulosolido.pt]
> Sent: Thursday, May 23, 2013 18:33
> To: Abraham Macías Paredes; xrdp-devel
> Subject: Re: [Xrdp-devel] Improving XRDP performance
>
> Hi Abraham,
>
> > Hi everybody,
> >
> > I'm trying to improve the performance of XRDP.
> >
> > Let me explain the problem. When a user is writing in an OpenOffice
> > document and things like that, the performance is OK, but if a user
> > tries to watch a YouTube video the performance could be a problem.
> >
> > With the "bitmap_compression=yes" configuration parameter set in
> > "xrdp.ini", and monitoring with the "top" Linux command, the CPU
> > used by the XRDP process was up to 60%.
> >
> > First of all, I set the "-O3" GCC option to get the code optimized,
> > and with the same test (and the same video) the CPU used by xrdp
> > was up to 45%.
>
> Do you get this consistently in a reproducible manner? Are you sure
> you are playing the exact same video without any other load
> variations on the host?
>
> Can you guarantee that the client and network were not at 100% load
> when you were testing the second time?
>
> Sorry for the overchecking :-) A potential 15% performance gain is
> worth being well understood.
>
> We did some performance testing here and it is quite easy to have the
> test disturbed by network or client CPU usage variations, if one is
> not careful enough.
>
> > Then I've made some small changes to the "xrdp_bitmap_compress"
> > function to improve the performance of the RLE compression
> > algorithm. When I repeated the test the CPU used was up to 36%.
>
> Again, is this the same video and under the same conditions? Have
> your changes increased network usage, or reduced CPU usage without
> a bandwidth penalty?
>
> > Now I realize that using RLE compression with real video frames is
> > not a good idea, because it consumes a lot of CPU and doesn't
> > compress very much. So I'm thinking of modifying
>
> Did you quantify the bandwidth differences?
>
> > the "libxrdp_send_bitmap" function to test the image before
> > compressing it.
> >
> > I mean, set a test like:
> >
> > if (is_rle_compression_worthy()) {
> >     /* Performs RLE compression */
> > } else {
> >     /* Send as RAW image */
> > }
>
> Interesting idea. I wonder how much CPU the test would cost.
>
> It would be nice if we could understand your tests better, at first.
>
> Best regards
> Gustavo
>
> --
> Angulo Sólido - Tecnologias de Informação
> http://angulosolido.pt

--
Angulo Sólido - Tecnologias de Informação
http://angulosolido.pt
dimensionamento-VDI.ods
Description: application/vnd.oasis.opendocument.spreadsheet
_______________________________________________ xrdp-devel mailing list xrdp-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/xrdp-devel