Cliff -

Has your client given you some sample data so that you can try to
reproduce the error on your own machine? If so, a collection of warnings
dumped to a logfile might at least tell you which line of code is croaking.
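As a rough sketch of the logfile idea, something like the following near
the top of the script would record every warning (and any uncaught die)
with a stack trace. The log file name and the log_note helper are names I
made up for illustration. A hard "Out of memory!" abort may bypass these
handlers entirely, which is why the sketch also has a way to write
progress notes.

```perl
use strict;
use warnings;
use Carp ();
use IO::Handle;

# Hypothetical log path; use whatever the client can easily send back.
my $logfile = 'pdl_debug.log';
open my $log, '>>', $logfile or die "Cannot open $logfile: $!";
$log->autoflush(1);    # flush each entry so it survives a crash

# Log every warning with a stack trace, then pass it on to STDERR.
# (Perl disables the __WARN__ hook while the handler itself is running.)
$SIG{__WARN__} = sub {
    print {$log} scalar(localtime), " WARN: ", Carp::longmess($_[0]);
    warn $_[0];
};

# Log fatal errors too, but leave exceptions inside an eval alone.
$SIG{__DIE__} = sub {
    return if !defined $^S || $^S;
    print {$log} scalar(localtime), " DIE: ", Carp::longmess($_[0]);
};

# Hypothetical helper: progress notes show how far the script got
# before an abort that the handlers above could not catch.
sub log_note {
    my ($msg) = @_;
    print {$log} scalar(localtime), " NOTE: $msg\n";
}

log_note("about to convert the Perl array to a piddle");
```

The last note in the log before a crash narrows down which step is
failing, even if no warning was ever issued.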

Allocation of large piddles (many hundreds of megabytes) has been reported
to be a problem elsewhere. One thing I have done on Linux to work around
this problem is to build a FastRaw file piece by piece and then memory-map
the file. Memory mapping is not a possibility on Windows (PDL has no support
for it there yet), but the same idea might still lead to a solution.
You could build a piddle into a FastRaw file with one script, then have a
different script readfraw that file. If you pull in this file early
in your (second) Perl process, you have a better chance of getting the
contiguous block of memory that PDL needs for the large data array.
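Here is a minimal sketch of that two-step approach, written from memory
and not taken from my working code. The file name and the toy chunks are
placeholders, and both steps are shown in one file only to keep the
example short. It assumes the FastRaw header is three text lines (type id,
number of dimensions, dimension sizes), which is how the PDL::IO::FastRaw
documentation describes it.

```perl
use strict;
use warnings;
use PDL;
use PDL::IO::FastRaw;

my $file = 'big_data.fraw';    # hypothetical file name

# --- Script 1: append the data to the raw file one chunk at a time ---
open my $out, '>:raw', $file or die "Cannot open $file: $!";
my $total = 0;
my $type;
for my $i (0 .. 3) {                           # stand-in for "each parsed section"
    my $chunk = sequence(double, 5) + 5 * $i;  # toy data: 0..4, 5..9, ...
    $type = $chunk->get_datatype unless defined $type;  # PDL's numeric type id
    print {$out} ${ $chunk->get_dataref };     # raw bytes in native format
    $total += $chunk->nelem;
}
close $out or die "Cannot close $file: $!";

# Write the header once the final size is known:
# type id, number of dims, then the dim sizes.
open my $hdr, '>', "$file.hdr" or die "Cannot open $file.hdr: $!";
print {$hdr} "$type\n1\n$total\n";
close $hdr or die "Cannot close $file.hdr: $!";

# --- Script 2: load the whole array early in a fresh process ---
my $data = readfraw($file);
# Where memory mapping is supported (e.g. Linux), this could instead be:
# my $data = mapfraw($file, { ReadOnly => 1 });
print "Read ", $data->nelem, " elements of type ", $data->type, "\n";
```

Only one chunk is ever held in memory while writing, so the first script
never needs a single large allocation. All chunks must share the same
data type for the header to describe the file correctly.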

I know, it's not ideal, but I hope that helps. I should probably try to
figure out how to add memory mapping support to Windows and then document
this technique so that others can use it.

For building the FastRaw file, I can dig up some sample code and send it
along if that would help, but I won't be able to get to it until tonight at
the earliest (and I make no guarantees as it's Valentine's day :-)


On Tue, Feb 14, 2012 at 9:26 AM, Clifford Sobchuk <
clifford.sobchuk at ericsson.com> wrote:

> Hi Folks,
> I am running into a problem where I am putting in a large amount of data
> (variable depending on log size). The data is being pushed into a Perl
> array, and then converted into a piddle. I think that it might be the
> conversion from Perl array to piddle, but am not sure. How can I find out
> where the issue exists and correct it? The end user's computer (laptop) will
> often be in this situation apparently. Since the data is intermixed with
> text that needs to be used to hash each specific attribute, I can't simply
> use an rgrep or rcols import. I can use rcols for each section, but this
> would result in using glue to build up the piddle slowly (groups of 20 to
> 100, depending on the datum for that attribute).
> Example pseudocode:
> foreach line {
>        $index1 = $1 if (/index1:\s(\d+)\w+/);
>        $index2 ...
>        if ($datastart && !$dataend) {
>                push @{$myhash{$index1}{$index2}{datum1}}, $1 if (/mydata/);
>                $dataend = 1 if (/$eod/);
>        }
> }
> foreach sort(keys %myhash) {
>        ....for each index
>                $data1 = pdl(@{$myhash{$index1}{$index2}{datum1}});
> }
> The raw text files are on the order of 0.5 to 14 GB and are being run on
> Win32 (Vista, which I know has a 2 GB limit for applications). Hope that
> this provides enough information to scope the issue.
> Thanks,
 "Debugging is twice as hard as writing the code in the first place.
  Therefore, if you write the code as cleverly as possible, you are,
  by definition, not smart enough to debug it." -- Brian Kernighan
