[Perldl] How to find out cause of out of memory

Clifford Sobchuk clifford.sobchuk at ericsson.com
Wed Feb 15 06:52:12 HST 2012


Thanks Ingo. Unfortunately the chunks are not all the same, which is why I was using a hash to keep track of everything. I could run through the file twice: once to collect the sizes, and once to fill pre-allocated piddles. For the smaller files this is OK, but for the ones > 5 GB it is going to be a pretty big increase in processing time.

In either case I am not sure how to assign the data from rgrep or rcols to a hash when a text column in that same data is needed as the hash key. Maybe I have to build a temporary hash table and then use it to push into a pre-allocated piddle. My concern with pre-allocated piddles is that I will end up either under or, worse, over the required size.

Thanks,

CLIFF SOBCHUK
Core RF Engineering
Phone 613-667-1974   ecn: 8109-71974
mobile 403-819-9233
yahoo: sobchuk
www.ericsson.com 

"The author works for Telefonaktiebolaget L M Ericsson ("Ericsson"), who is solely responsible for this email and its contents. All inquiries regarding this email should be addressed to Ericsson. The web site for Ericsson is www.ericsson.com."

This Communication is Confidential. We only send and receive email on the basis of the terms set out at www.ericsson.com/email_disclaimer


-----Original Message-----
From: Ingo Schmid [mailto:ingosch at gmx.at] 
Sent: Wednesday, February 15, 2012 6:05 AM
To: perldl at jach.hawaii.edu
Subject: Re: [Perldl] How to find out cause of out of memory


I think you should skip using arrays and assign data to intermediate piddles. Or, if you can, figure out the data size, then create a big piddle and use slicing in assignments. Do the chunks always look the same?
Then an array of piddles may be a good choice.
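
A minimal sketch of the "big piddle plus slice assignment" idea, adapted to the case where the final size is not known in advance. The helper names (append_values, finished_piddle), the %buf layout, and the starting size of 1024 are assumptions made for illustration and are not from this thread. Each hash key gets a piddle that is doubled when it fills up and trimmed at the end, so no long Perl array is ever held per key.

```perl
use strict;
use warnings;
use PDL;

my %buf;    # key => { data => piddle, used => number of filled elements }

# Hypothetical helper: append values for one key, doubling the piddle when full.
sub append_values {
    my ($key, @values) = @_;
    return unless @values;
    # zeroes() creates a double piddle by default; 1024 is an arbitrary start size.
    my $slot = $buf{$key} //= { data => zeroes(1024), used => 0 };
    my $used = $slot->{used};
    my $need = $used + @values;
    if ($need > $slot->{data}->nelem) {
        my $size = $slot->{data}->nelem;
        $size *= 2 while $size < $need;
        my $bigger = zeroes($size);
        # Copy the filled part of the old piddle into the new one.
        $bigger->slice('0:' . ($used - 1)) .= $slot->{data}->slice('0:' . ($used - 1))
            if $used > 0;
        $slot->{data} = $bigger;
    }
    # Slice assignment writes the new batch in place.
    $slot->{data}->slice("$used:" . ($need - 1)) .= pdl(\@values);
    $slot->{used} = $need;
    return;
}

# Hypothetical helper: return a copy trimmed to the filled part, or nothing.
sub finished_piddle {
    my ($key) = @_;
    my $slot = $buf{$key};
    return unless $slot && $slot->{used} > 0;
    return $slot->{data}->slice('0:' . ($slot->{used} - 1))->copy;
}
```

In a parser, append_values would be called with the key built from the text column and the values collected from one data section. Doubling keeps the number of reallocations small, at the cost of up to roughly twice the needed memory per key while growing.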

When processing functional images, I am forced to do something similar due to a) huge memory demands and b) constraints on piddle elements. What I do is read in the data one time-point at a time (or whatever outer index you can come up with), populate a piddle, then append it to an open file handle (using writeflex). I don't know if that works in your case, since it depends heavily on how sliceable your data is.
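
The append-to-a-file-handle approach might look roughly like the sketch below. It assumes 1-D chunks of doubles and uses writeflex on an open handle from PDL::IO::FlexRaw; the helper names (spool_chunks, read_spooled) and the hand-built header passed to readflex are illustrative assumptions, not code from this thread.

```perl
use strict;
use warnings;
use PDL;
use PDL::IO::FlexRaw;

# Hypothetical helper: append each chunk's raw data to one file.
# Returns the total number of elements written.
sub spool_chunks {
    my ($file, @chunks) = @_;
    open my $fh, '>', $file or die "Cannot open $file: $!";
    binmode $fh;
    my $total = 0;
    for my $chunk (@chunks) {
        my $p = double($chunk);    # force one element type for the whole file
        writeflex($fh, $p);        # writes only the data; no header file
        $total += $p->nelem;
    }
    close $fh or die "Cannot close $file: $!";
    return $total;
}

# Hypothetical helper: read the spooled data back as a single 1-D piddle.
sub read_spooled {
    my ($file, $total) = @_;
    # Header describing the whole file as one double vector (an assumption
    # that holds only because every chunk was written as 1-D doubles).
    my $hdr = [ { Type => 'double', NDims => 1, Dims => [$total] } ];
    my ($data) = readflex($file, $hdr);
    return $data;
}
```

In a real parser the writeflex call would sit inside the read loop, so only one chunk is in memory at a time; with one spool file per hash key, the key-to-file mapping would have to be tracked separately.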

Ingo

On 02/14/2012 04:26 PM, Clifford Sobchuk wrote:
> Hi Folks,
>
> I am running into a problem when loading a large amount of data (the amount varies with log size). The data is pushed into a Perl array and then converted into a piddle. I think the problem might be the conversion from Perl array to piddle, but I am not sure. How can I find out where the issue is and correct it? Apparently the end user's computer (a laptop) will often be in this situation. Since the data is intermixed with text that is needed to hash each specific attribute, I can't simply use an rgrep or rcols import. I could use rcols for each section, but this would mean using glue to build up the piddle slowly (in groups of 20 to 100, depending on the datum for that attribute).
>
> Example pseudo code.
> foreach line {
>          $index1 = $1 if (/index1:\s(\d+)\w+/);
>          $index2 ...
>          if ($datastart && !$dataend) {
>                  push @{$myhash{$index1}{$index2}{datum1}}, $1 if (/mydata/);
>                  $dataend = 1 if (/$eod/);
>          }
> }
> foreach sort(keys %myhash) {
>          ....for each index
>                  $data1 = pdl(@{$myhash{$index1}{$index2}{datum1}});
> }
>
> The raw text files are on the order of 0.5 to 14 GB and are being processed on Win32 (Vista, which I know has a 2 GB limit for applications). I hope this provides enough information to scope the issue.
>
> Thanks,
>
>
> CLIFF SOBCHUK
> Ericsson
> Core RF Engineering
> Calgary, AB, Canada
> Phone 613-667-1974  ECN 8109 x71974
> Mobile 403-819-9233
> clifford.sobchuk at ericsson.com<mailto:clifford.sobchuk at ericsson.com>
> yahoo: sobchuk
> http://www.ericsson.com/
>
>
> _______________________________________________
> Perldl mailing list
> Perldl at jach.hawaii.edu
> http://mailman.jach.hawaii.edu/mailman/listinfo/perldl
>

