[Perldl] How to find out cause of out of memory

Clifford Sobchuk clifford.sobchuk at ericsson.com
Tue Feb 14 15:42:13 HST 2012


Mark,

Why are there blocks around the simple lines, like:
 {open(DATA, "large_file_path_here");}

Trying to figure out what exactly is happening here.
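
For reference, a minimal sketch of what a bare block does to a `my` declaration, assuming the quoted script runs without `use strict` (the variable names `$count_after_my_block` and `$count_after_push` are illustrative only, not from the thread). A statement without `my` behaves the same with or without the braces; a `my` inside the braces creates a variable that disappears at the closing brace.

```perl
use strict;
use warnings;

our @offset;                          # package variable, like the undeclared @offset in the quoted script

{ my @offset = (10, 20); }            # a new lexical array, discarded at the closing brace

my $count_after_my_block = @offset;   # the package array was never touched

{ push @offset, 42; }                 # braces around a statement without 'my' change nothing

my $count_after_push = @offset;       # the push landed in the package array
```

So in the quoted script, `{my @offset;}` and `{my $line_num = 2;}` would not affect the variables used on the following lines.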

Thanks, 

CLIFF SOBCHUK
Core RF Engineering
Phone 613-667-1974   ecn: 8109-71974
mobile 403-819-9233
yahoo: sobchuk
www.ericsson.com 

"The author works for Telefonaktiebolaget L M Ericsson ("Ericsson"), who is solely responsible for this email and its contents. All inquiries regarding this email should be addressed to Ericsson. The web site for Ericsson is www.ericsson.com."

This Communication is Confidential. We only send and receive email on the basis of the terms set out at www.ericsson.com/email_disclaimer


-----Original Message-----
From: Joel Berger [mailto:joel.a.berger at gmail.com] 
Sent: Tuesday, February 14, 2012 3:03 PM
To: MARK BAKER
Cc: Clifford Sobchuk; "Perldl at jach.hawaii.edu"
Subject: Re: [Perldl] How to find out cause of out of memory

Just a quick note as I leave work. Please don't use DATA as a handle to a file! If you must use a bareword handle please choose any other one. DATA is magical.
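
As an illustration of the point about DATA (Perl reserves that handle for text following `__DATA__` or `__END__` in the source file), here is a hedged sketch of the quoted line-offset idea using a lexical filehandle instead. The subroutine names `build_line_index` and `fetch_line` are made up for this example and are not part of PDL or of the quoted script.

```perl
use strict;
use warnings;

# Build an index of byte offsets, one per line, without holding the file in memory.
sub build_line_index {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "can't open $path: $!\n";
    my @offset = (0);             # $offset[0] is the start of line 1
    while (<$fh>) {
        push @offset, tell($fh);  # start of the following line
    }
    pop @offset;                  # the last entry points at end of file, not at a line
    return ($fh, \@offset);
}

# Fetch one line by its 1-based number using the index.
sub fetch_line {
    my ($fh, $offset, $n) = @_;
    return if $n < 1 || $n > @$offset;
    seek($fh, $offset->[$n - 1], 0) or die "seek failed: $!\n";
    return scalar <$fh>;
}
```

Only the offsets stay in memory, not the lines themselves, though the offset array still grows with the number of lines in the file.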

Cheers,
Joel

On Tue, Feb 14, 2012 at 3:36 PM, MARK BAKER <mrbaker_mark at yahoo.com> wrote:
> Hey Clifford
> I was talking with David along the same lines as your problem here,
> and I would like to share something with you that I think might make
> its way into the PDL core in one way or another
>
> I found a way to load 3 gigabytes of data in only 40 megabytes of ram 
> like this ...
> ##################################################
> {open(DATA, "large_file_path_here");}
>   {my @offset;}
>   {$offset[1] = tell DATA;}
>  {my $line_num = 2;}
>  {while (<DATA>) {
>  {$offset[$line_num++] = tell DATA;}
>    }}
> print "DONE please enter a line number";
>     { while (my $entered = <STDIN>) {
>      {  seek DATA, $offset[$entered], 0;
>       $line = <DATA>;
>     print $line,"\n";
>     }
>      }}
> ###################################################
>
> try this out with a large file
> enter a line number and it will bring up the information on that
> line very fast.
>
> the trick here is to put your file data into decimal, then to pack the
> file data like this
> ##################################################
> $data = pack "w*", $_;
> #################################################
> then just use unpack to view the numerical data. Instead of reading
> line by line, this reads each paragraph of numerical data at a time,
> which will save a lot of RAM by pulling in the information in chunks
> instead of by line
>
> hope that helps
>
> -Mark
>
>
>
> ________________________________
> From: Clifford Sobchuk <clifford.sobchuk at ericsson.com>
> To: Chris Marshall <devel.chm.01 at gmail.com>
> Cc: "Perldl at jach.hawaii.edu" <perldl at jach.hawaii.edu>
> Sent: Tuesday, February 14, 2012 12:36 PM
>
> Subject: Re: [Perldl] How to find out cause of out of memory
>
> Thanks all. Pre-allocating isn't obvious (to me) as the file and hence 
> data are highly variable with no easy way to determine the size.
> I do think that it is the conversion from perl array to pdl as I am 
> guessing that the entire perl array would have to be loaded - which 
> likely causes the out of memory. In the "Whirlwind Tour" book there 
> was an example showing how to assign an image to a hash with the elements being pdls.
>
> I am unsure how to do this with rgrep or rcols. I have tried:
>
> open ($in, "<$ARGV[0]") or die "can't open ARGV[0] $!\n";
> $fwdGain = rgrep {/\s\d\s+\d+\s+\d\d\d\s+\w+\s+(\d+)\s+\-\d+/} $in;
> open ($in, "<$ARGV[0]") or die "can't open ARGV[0] $!\n";
> %snr = rgrep {/\s\d\s+\d+\s+\d\d\d\s+(\w+)\s+\d+\s+(\-\d+)/} $in;
>
> To import the data - but it doesn't work as the $1 is a word. I made a
> map for it as my %rate = ( "Full"=>1, "Half"=>0.5, "Quarter"=>0.25,
> "Eighth"=>0.125 );
>
> And tried to use:
> $snr{$rate{$1}} = rgrep {/\s\d\s+\d+\s+\d\d\d\s+(\w+)\s+\d+\s+(\-\d+)/} $in;
>
> But this doesn't work either as it seems that rgrep is looking for a 
> numeric value.
> Argument "Eighth" isn't numeric in multiplication (*) at 
> C:\strawberry\perl\site\...
>
> Is there a way to use rgrep to put the mapped numeric and the data in 
> to a hash?
>
> Thanks,
>
> CLIFF SOBCHUK
> Core RF Engineering
> Phone 613-667-1974  ecn: 8109-71974
> mobile 403-819-9233
> yahoo: sobchuk
> www.ericsson.com
>
>
> -----Original Message-----
> From: Chris Marshall [mailto:devel.chm.01 at gmail.com]
> Sent: Tuesday, February 14, 2012 10:42 AM
> To: Clifford Sobchuk
> Cc: David Mertens; Perldl at jach.hawaii.edu
> Subject: Re: [Perldl] How to find out cause of out of memory
>
> Another angle, I can't tell how much of the data you collect in the 
> perl hash structures but they are *much* more memory intensive than 
> the pdl data arrays.
>
> Your best chance would be to allocate the destination pdl and then use 
> slice assignments to put the hash data into its correct place.
>
> Beware, one issue with perl is that it dies if it runs out of memory 
> which is a pain.  If you preallocate the big piddle, then maybe you'll 
> get the crash in the perl code which could give you an idea where the memory use is.
>
> --Chris
>
> On Tue, Feb 14, 2012 at 11:22 AM, David Mertens <dcmertens.perl at gmail.com> wrote:
>> Cliff -
>>
>> Has your client given you some sample data so that you can try 
>> to reproduce the error on your own machine? If so, a collection of 
>> warnings dumped to a logfile might at least tell you which line of 
>> code is croaking.
>>
>> Allocation of large piddles (many hundreds of megabytes) has been 
>> reported to be a problem elsewhere. One thing I have done on Linux to 
>> work around this problem is to build a FastRaw file piece-by-piece,
>> then memory-map the file. Although this is not a possibility on
>> Windows (no PDL support for memory mapping on windows yet), it might 
>> provide a means for a solution. You could build a piddle into a 
>> FastRaw file with one script, then have a different script try to 
>> readfraw that file. If you pull in this file early in your (second) 
>> Perl process, you have a higher likelihood of getting the contiguous 
>> memory request that PDL needs for the large data array.
>>
>> I know, it's not ideal, but I hope that helps. I should probably try 
>> to figure out how to add memory mapping support to Windows and then 
>> document this technique so that others can use it.
>>
>> For building the FastRaw file, I can dig up some sample code and send 
>> it along if that would help, but I won't be able to get to it until 
>> tonight at the earliest (and I make no guarantees as it's Valentine's 
>> day :-)
>>
>> David
>>
>>
>> On Tue, Feb 14, 2012 at 9:26 AM, Clifford Sobchuk 
>> <clifford.sobchuk at ericsson.com> wrote:
>>>
>>> Hi Folks,
>>>
>>> I am running into a problem where I am putting in a large amount of
>>> data (variable depending on log size). The data is being pushed into
>>> a perl array, and then converted into a piddle. I think that it
>>> might be the conversion from perl array to piddle, but am not sure.
>>> How can I find out where the issue exists and correct it? The end
>>> user's computer (laptop) will often be in this situation apparently.
>>> Since the data is intermixed with text that needs to be used to hash 
>>> each specific attribute, I can't simply use an rgrep or rcols import.
>>> I can use rcols for each section, this would result in using glue to 
>>> build up the piddle slowly (groups of 20 to 100 - depending on the 
>>> datum for that attribute).
>>>
>>> Example pseudo code.
>>> Foreach line {
>>>        $index1 = $1 if (/index1:\s(\d+)\w+/);
>>>        $index2 ...
>>>        if $datastart && ! $dataend {
>>>                push @{$myhash{$index1}{$index2}{datum1}},$1 if (/mydata/);
>>>                $dataend = 1 if (/$eod/);
>>>        }
>>> Foreach sort(keys %myhash) {
>>>        ....for each index
>>>                $data1=pdl(@{$myhash{$index1}{$index2}{datum1}});
>>>        }
>>> }
>>>
>>> The raw text files are on the order of 0.5 to 14 GB and are being
>>> run on win32 (vista - which I know has a 2GB limit for applications).
>>> Hope that this provides enough information to scope the issue.
>>>
>>> Thanks,
>>>
>>>
>>> CLIFF SOBCHUK
>>> Ericsson
>>> Core RF Engineering
>>> Calgary, AB, Canada
>>> Phone 613-667-1974  ECN 8109 x71974
>>> Mobile 403-819-9233
>>> clifford.sobchuk at ericsson.com
>>> yahoo: sobchuk
>>> http://www.ericsson.com/
>>>
>>>
>>> _______________________________________________
>>> Perldl mailing list
>>> Perldl at jach.hawaii.edu
>>> http://mailman.jach.hawaii.edu/mailman/listinfo/perldl
>>
>>
>>
>>
>> --
>>  "Debugging is twice as hard as writing the code in the first place.
>>   Therefore, if you write the code as cleverly as possible, you are,
>>   by definition, not smart enough to debug it." -- Brian Kernighan
>>
>>
