Performance of 5.6 & 5.8
Jeremy White
2005-10-14 10:43:20 UTC
Hi,

I've just completed some formal timings of one of my Perl/XS/C modules
running under Perl 5.6 and Perl 5.8. Both modules were built with gcc
version 3.4.2 (mingw-special) against Activestate on win32.

Perl 5.6.1 (build 638 ActiveState) = 94676 iterations per second.
Perl 5.8.7 (build 813 ActiveState) = 68677 iterations per second.

Quite a difference! No idea why...

Cheers,

jez.
Jeremy White
2005-10-14 12:41:25 UTC
Thanks for the reply.
Post by Jeremy White
Quite a difference! No idea why...
Unicode would be my guess.
Perhaps a stupid question - can 5.8.x be built without unicode support to
get some of that speed back?

Cheers,

jez.
Tels
2005-10-14 14:16:42 UTC

Moin,
Post by Jeremy White
Thanks for the reply.
Post by Jeremy White
Quite a difference! No idea why...
Unicode would be my guess.
Perhaps a stupid question - can 5.8.x be built without unicode support
to get some of that speed back?
Thank god, no. ASCII is so 1995...

However, if you can, please post the code, so that we can run benchmarks
ourselves, and figure out what the difference is exactly and whether one
can do something about it.

5.8.x fixes quite a few bugs in 5.6, and in some cases these fixes make the
code slower (due to more checks/conditions).

Best wishes,

Tels

Gisle Aas
2005-10-14 12:16:52 UTC
Post by Jeremy White
I've just completed some formal timings of one of my Perl/XS/C modules
running under Perl 5.6 and Perl 5.8. Both modules were built with gcc
version 3.4.2 (mingw-special) against Activestate on win32.
Perl 5.6.1 (build 638 ActiveState) = 94676 iterations per second.
Perl 5.8.7 (build 813 ActiveState) = 68677 iterations per second.
Quite a difference! No idea why...
Unicode would be my guess.

Regards,
Gisle
Jeremy White
2005-10-14 14:38:37 UTC
Post by Jeremy White
Perhaps a stupid question - can 5.8.x be built without unicode support
to get some of that speed back?
Thank god, no. ASCII is so 1995...
:)
However, if you can, please post the code, so that we can run benchmarks
ourselves, and figure out what the difference is exactly and whether one
can do something about it.
Unfortunately the code is large and complicated, although I'm prepared to
create test cases and post those.
5.8.x fixes quite a few bugs in 5.6, and in some cases these fixes make the
code slower (due to more checks/conditions).
Indeed - the reason why I'm upgrading to 5.8.x is because of bug fixes - was
just a shock to see such a drop in performance.

Cheers,

jez.
Tels
2005-10-14 18:30:06 UTC

Moin Jeremy,
Post by Jeremy White
Post by Jeremy White
Perhaps a stupid question - can 5.8.x be built without unicode
support to get some of that speed back?
Thank god, no. ASCII is so 1995...
:)
However, if you can, please post the code, so that we can run
benchmarks ourselves, and figure out what the difference is exactly
and whether one can do something about it.
Unfortunately the code is large and complicated - although I'm prepared
to find and create test cases and post those.
5.8.x fixes quite a few bugs in 5.6, and in some cases these fixes make
the code slower (due to more checks/conditions).
Indeed - the reason why I'm upgrading to 5.8.x is because of bug fixes
- was just a shock to see such a drop in performance.
I did have the idea last week to look at how Perl processes unicode data
and see if I can streamline that. However, good benchmarks/profiling must
come first.

Just some background:

In ASCII, all you have is one byte per character. With unicode, there are
far more characters (the code space has over a million code points), so
one byte is no longer enough. This basically boils down to two choices:

* utf8 - each character takes 1, 2, 3 or 4 bytes, depending on its code
point.
* utf32 - each character takes exactly 4 bytes.

utf8 has the nice property of saving space, because ASCII characters use 1
byte and everything else 2, 3 or 4 bytes. utf32 has the advantage that
each char is exactly the same size, but for plain ASCII text it uses 4
times the space.
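
To make the size trade-off concrete, here is a minimal C sketch that counts how many bytes a sequence of code points needs in each encoding. The function names (utf8_encoded_len, utf8_total_len, utf32_total_len) are made up for illustration and are not part of Perl's API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes needed to encode one code point in UTF-8.
 * Returns 0 for values that cannot be encoded (surrogates, > U+10FFFF).
 * Hypothetical helper, not a Perl internal. */
size_t utf8_encoded_len(uint32_t cp)
{
    if (cp < 0x80)
        return 1;                       /* plain ASCII */
    if (cp < 0x800)
        return 2;
    if (cp < 0x10000)
        return (cp >= 0xD800 && cp <= 0xDFFF) ? 0 : 3;
    if (cp <= 0x10FFFF)
        return 4;
    return 0;
}

/* Total UTF-8 size of n code points; unencodable values count as 0. */
size_t utf8_total_len(const uint32_t *cps, size_t n)
{
    size_t total = 0;
    size_t i;

    assert(cps != NULL || n == 0);
    for (i = 0; i < n; i++)
        total += utf8_encoded_len(cps[i]);
    return total;
}

/* UTF-32 is fixed width: always 4 bytes per code point. */
size_t utf32_total_len(size_t n)
{
    return n * 4;
}
```

For the three code points 'H', 'i', U+20AC this gives 5 bytes in utf8 against 12 bytes in utf32.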

In Perl 5.8, strings can be stored internally as utf8. Whenever possible
they are kept as plain single-byte strings, to speed things up. So whether
string operations are the sole cause of the slowdown you see still has to
be determined, of course :)

One potential slowdown is string length and substrings. For instance, if
you want to do substr($string, $x, $y), then Perl needs to do:

* for ASCII: calculate str_start + $x, str_start + $x + $y: O(1)
* for utf8: start at the beginning, advance $x characters (that can be a
variable number of bytes!), then advance another $y characters. This is
unfortunately O(N) - i.e. it depends on $x and $y.
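
The difference between the two cases can be sketched in C as follows. This is only an illustration of the idea, assuming valid, NUL-terminated UTF-8 input; ascii_offset and utf8_offset are hypothetical names, not the functions Perl actually uses.

```c
#include <assert.h>
#include <stddef.h>

/* Single-byte strings: character index == byte index, so this is O(1). */
const char *ascii_offset(const char *s, size_t nchars)
{
    assert(s != NULL);
    return s + nchars;
}

/* UTF-8: the only way to find character number nchars is to walk the
 * string from the start, so the cost grows with nchars.
 * Stops early at the terminating NUL if the string is too short. */
const char *utf8_offset(const char *s, size_t nchars)
{
    assert(s != NULL);
    while (nchars > 0 && *s != '\0') {
        s++;                                    /* step over the lead byte */
        while (((unsigned char)*s & 0xC0) == 0x80)
            s++;                                /* skip 10xxxxxx continuation bytes */
        nchars--;
    }
    return s;
}
```

A substr($string, $x, $y) on utf8 data then needs one such walk for $x and a second one for $y, while the single-byte case is two pointer additions.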

For large strings, repeated operations etc, the time difference can grow
very large.

There are basically three things one can do:

* try to make utf8 operations faster,
* cache as much info as possible
* use utf32 instead of utf8 (maybe a compile time option?)

The first two are medium work, and probably already done to quite some
extent. The last one is actually a very big task. I am pretty sure that
there are many places where the code assumes utf8 encoding - but it would
be worth a try after we found reliable benchmarks.
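
As an illustration of the "cache as much info as possible" option, here is a small C sketch that remembers the last character/byte position pair, so that a forward seek resumes from there instead of from the start of the string. The utf8_cursor type and its functions are invented for this example and say nothing about how Perl caches positions internally; valid, NUL-terminated UTF-8 is assumed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cache: last resolved character index and its byte offset. */
typedef struct {
    const char *str;      /* valid, NUL-terminated UTF-8 (assumption) */
    size_t      char_pos; /* character index of the cached position */
    size_t      byte_pos; /* byte offset of the cached position */
} utf8_cursor;

void utf8_cursor_init(utf8_cursor *c, const char *s)
{
    assert(c != NULL && s != NULL);
    c->str = s;
    c->char_pos = 0;
    c->byte_pos = 0;
}

/* Return the byte offset of character number target.
 * Forward seeks continue from the cached position; a backward seek
 * falls back to scanning from the start. Stops at the end of string. */
size_t utf8_cursor_seek(utf8_cursor *c, size_t target)
{
    const char *p;

    assert(c != NULL);
    if (target < c->char_pos) {         /* cache is past the target: restart */
        c->char_pos = 0;
        c->byte_pos = 0;
    }
    p = c->str + c->byte_pos;
    while (c->char_pos < target && *p != '\0') {
        p++;                                    /* lead byte */
        while (((unsigned char)*p & 0xC0) == 0x80)
            p++;                                /* continuation bytes */
        c->char_pos++;
    }
    c->byte_pos = (size_t)(p - c->str);
    return c->byte_pos;
}
```

With such a cache, a loop that walks through a string with increasing offsets touches each byte only once overall, instead of rescanning from the beginning on every call.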

Best wishes,

Tels

Jeremy White
2005-10-21 16:29:52 UTC
Hi,
Post by Tels
I did have the idea last week to look at how Perl processes unicode data
and see if I can streamline that. However, good benchmarks/profiling must
come first.
For large strings, repeated operations etc, the time difference can grow
very large.
I started to put some test scripts together, and came across some
interesting results. I started with basic parameter passing (to and from XS)
and for the most part there is very little difference between 5.6.1 and
5.8.7. For a few things 5.8.7 was slightly faster.

As soon as you use SVs as strings, things start to get bad... For example:

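void SVManipulation()
    CODE:
        SV* SVstring;
        /* create a 4-byte string SV, then append to it three times */
        SVstring = newSVpvn("Some",4);
        sv_catpvn(SVstring,"AA",2);
        sv_catpvn(SVstring,"BBB",3);
        sv_catpvn(SVstring,"CCCC",4);
        /* drop the only reference so the SV is freed again */
        SvREFCNT_dec(SVstring);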

Gave me the following results:

5.8.7
Benchmark: timing 10000000 iterations of SVManipulation...
SVManipulation: 22 wallclock secs (22.17 usr + 0.02 sys = 22.19 CPU) @ 450694.07/s (n=10000000)

5.6.1
Benchmark: timing 10000000 iterations of SVManipulation...
SVManipulation: 10 wallclock secs ( 9.81 usr + 0.00 sys = 9.81 CPU) @ 1019160.21/s (n=10000000)

Next I was going to test the performance of hash access in xs, but before I
do, I just want to confirm that I'm being "fair" in my testing. I'm using
ActiveState for both versions, same version of GCC and using Benchmark for
timings. All test code is the same for both versions.

Cheers,

jez.
