[plug] Perl OO/Data Optimisation
Matthew Robinson
matt at zensunni.org
Wed Nov 27 20:09:18 WST 2002
Trevor Phillips wrote:
> Any Perl Guts Gurus here? ^_^
>
> I have an OO app which does nasty things to a large multi-depth hash of data
> (of which some of it is structures of objects), and I'd really like to
> optimise it, but I'd rather not spend time doing optimisations that Perl may
> do internally.
I am assuming that you want to optimise for execution speed, rather than
memory usage or readability.
>
> So, in general: Is it worth me compiling my hash into arrays with integer
> references? ie; Instead of $data->{prop}, convert it to $data->[num], and
> things that reference prop to reference it as num?
Using array lookups rather than hash lookups is always quicker, as Perl
doesn't have to perform the hashing function and then walk the buckets.
If you define constants for your 'hash' keys the code will look almost
identical. Example:
use strict;
use warnings;
use constant foo => 1;
my $array = [];
my $hash = {};
$array->[foo] = 'bar'; # foo here is the constant, so this is element 1
$hash->{foo} = 'bar';  # note: a bareword is autoquoted inside {}, so this
                       # is the literal string key 'foo', not the constant
__END__
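If you want to see the difference on your own build of perl, the same
Benchmark module mentioned below will time the two lookup styles side by
side. A small sketch (the constant name and the data are made up for the
example):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(timethese);

# Hypothetical key: an array index standing in for the hash key 'prop'
use constant prop => 0;

my @array = ('bar');
my %hash  = ( prop => 'bar' );

# Time a fixed number of iterations of each lookup style
timethese( 500_000, {
    Array => sub { my $v = $array[prop] },
    Hash  => sub { my $v = $hash{prop} },
} );
```

On most machines the Array case comes out ahead, though the absolute
difference per lookup is tiny.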
>
> Is it worth dumping the object structure (HTML::Element), and using a
> simplified hash or array structure, and walking it manually?
You'll almost certainly be able to get a speed increase by doing this
(although it depends how HTML::Element is written). However, this will
almost certainly come at the expense of readability and maintainability.
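To give a feel for the trade-off, a hand-rolled tree can be as simple as
nested array refs plus a short recursive walker. This is only a sketch of
one possible layout, not how HTML::Element actually stores its data:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# A hypothetical simplified node: [tag, \%attrs, @children], where each
# child is either another node or a plain text string
my $tree = [ 'p', {}, 'Hello ', [ 'b', {}, 'world' ], '!' ];

# Recursively collect the text content of the tree
sub text_of {
    my ($node) = @_;
    return $node unless ref $node;            # plain text leaf
    my (undef, undef, @children) = @$node;    # skip tag and attributes
    return join '', map { text_of($_) } @children;
}

print text_of($tree), "\n";    # prints "Hello world!"
```

Walking this is cheap because there is no method dispatch, but every
piece of behaviour HTML::Element gave you for free now has to be written
and maintained by hand.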
> If I reference a hash several times in a row, is it faster to assign it to a
> temporary variable, or does Perl cache/optimise that internally?
This would be a good time to pull out the Benchmark module[0] and test
this for ourselves. Example:
#!/usr/bin/perl
use strict;
use Benchmark;
our %data;
my $data_ref = \$data{prop}; # Create a scalar reference to the key
#
# $$data_ref = 'foo';
# is equivalent to
# $data{prop} = 'foo';
# Run each closure (Lookup and Reference) for a minimum of 1 CPU second
timethese(-1, {
Lookup => sub {
$data{prop} = rand;
},
Reference => sub {
$$data_ref = rand;
},
});
__END__
Which produces the output:
Benchmark: running Lookup, Reference, each for at least 1 CPU seconds...
    Lookup:  2 wallclock secs ( 1.08 usr +  0.01 sys =  1.09 CPU) @ 876818.35/s (n=955732)
 Reference:  3 wallclock secs ( 1.10 usr +  0.00 sys =  1.10 CPU) @ 1429875.45/s (n=1572863)
This shows that it is more efficient to use the reference to the data
rather than performing the lookup each time: the Lookup managed 876,818
iterations per second, whereas the Reference managed 1,429,875 iterations
per second.
However, it should be remembered that it took around a million iterations
to produce a noticeable difference.
> When building text by shuffling chunks of text around, is there a more
> efficient way than appending the text as you go, and storing the chunks in a
> hash?
I think I would want to have a better idea of what you are trying to do
with this one. However, I doubt there is much to be saved either
speed-wise or memory-wise here.
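For what it's worth, the two usual patterns are appending as you go and
collecting the chunks to join once at the end; they produce the same
string and rarely differ by much. A minimal sketch (the chunk names and
contents are made up):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical chunks keyed by name, assembled in a chosen order
my %chunk = (
    header => "<html><body>\n",
    body   => "<p>content</p>\n",
    footer => "</body></html>\n",
);

# Append as you go
my $page = '';
$page .= $chunk{$_} for qw(header body footer);

# Or take a hash slice and join once at the end
my $page2 = join '', @chunk{ qw(header body footer) };

print $page;
```

If either version turns out to be a bottleneck, Benchmark it as above
rather than guessing.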
> Any other hints on how I can get performance increases by the way data is
> handled?
>
I am of the opinion that you probably want to keep the code readable and
maintainable rather than trying to squeeze every optimisation out of
perl. As you can see with the hash keys and references, it takes a large
number of iterations to produce a noticeable difference, and even then we
are only talking half a second or so.
If you have a serious speed issue I would look at your algorithms first
and make sure they are optimised for the problem, rather than trying to
optimise the implementation of the algorithm.
Anyway, hope this helps,
Matt
[0] perldoc Benchmark
--
print map{s&^(.)&\U\1&&&$_}split$,=$",(`$^Xdoc$,-qj`=~m&"(.*)"&)[0].$/