[Perldl] extending PDL data type support

David Mertens dcmertens.perl at gmail.com
Tue Feb 14 06:29:26 HST 2012


Having JIT support integrated deeply into PDL would be fantastically
amazing. I wonder if this would make for a good GSoC project? The student
could join Craig, me, and whoever else is planning on spelunking the core
this summer, and perhaps by August we could have something.

David

On Tue, Feb 14, 2012 at 10:18 AM, Chris Marshall <devel.chm.01 at gmail.com> wrote:

> David-
>
> Interesting that you should bring up libtcc and dynamic
> compilation.  I've been thinking for some time that the
> next evolution of the PDL code generation would be to
> have some sort of JIT compilation process.  That would
> give a number of advantages:
>
> * replace big, static structure loops by specific code
>
> * no need to pre-build every possible data type
>
> * allows arbitrary datatypes to work *much* better
>  (there would have to be special handling but it
>   could be at the inline C performance and not
>   up and down through C<->Perl stacks and control
>   flow---think complex number support)
>
> * would allow us to invert the memory-loop order for performance
>
> * this strategy seems good for GPU code
>
> * we could have a C-accelerator for simple, C-like
>  code sections
>
> * we could co-compile chained PDL method calls
>
> ...and much more.  This is the kind of thing that could
> really make a nice PDL-2.5 (or even PDL-3.x... :-)
>
> --Chris
>
> On Tue, Feb 14, 2012 at 11:05 AM, David Mertens
> <dcmertens.perl at gmail.com> wrote:
> > On Tue, Feb 14, 2012 at 9:38 AM, Judd Taylor <judd.t at orbitalsystems.com>
> > wrote:
> >>
> >> This reminds me of libraries like Boost C++. Making this fast is what
> >> will be the trick here. I don't think fast can be done at run time, so
> >> any type support may have to be built in at compile time (like Boost).
> >>
> >> Support for any arbitrary type would be excellent, sort of like a
> >> template library does in C++. It would also be a very Perl-ish thing to
> >> do. Maybe this sort of thing can be done quickly at runtime via some new
> >> feature of Perl6?
> >>
> >> -Judd
> >>
> >> ____________________________
> >> Judd Taylor
> >> Software Engineer
> >>
> >> Orbital Systems, Ltd.
> >> 3807 Carbon Rd.
> >> Irving, TX 75038-3415
> >>
> >> judd.t at orbitalsystems.com
> >> (972) 915-3669 x127
> >>
> >> ________________________________________
> >> From: Craig DeForest [deforest at boulder.swri.edu]
> >> Sent: Sunday, February 12, 2012 1:02 PM
> >> To: chm
> >> Cc: perldl at jach.hawaii.edu
> >> Subject: Re: [Perldl] extending PDL data type support
> >>
> >> That is a very interesting idea, Chris.  Hmmm, I wonder if something
> >> like that would make ranges easier/faster?
> >>
> >>
> >> On Feb 12, 2012, at 11:55 AM, chm wrote:
> >>
> >> > [Changing the topic in reply...]
> >> >
> >> > Adding support for arbitrary data types
> >> > is something I would like to see.  It should
> >> > be possible to have a piddle of "something"
> >> > as a regular array of that "something".
> >> >
> >> > This is something that would require an
> >> > update (at least) to the PDL::PP code generation.
> >> > A specific case of interest would be piddles
> >> > of pointers that would allow for indirection
> >> > in pdl data sets.
> >> >
> >> > The trick would be to implement these in
> >> > simple, efficient, and fast code.
> >> >
> >> > --Chris
> >> >
> >> > On 2/10/2012 6:23 PM, David Mertens wrote:
> >> >> On Fri, Feb 10, 2012 at 3:08 PM, Judd Taylor
> >> >> <judd.t at orbitalsystems.com> wrote:
> >> >>
> >> >>> I'd also like to chime in here and say that I think PDL's support
> >> >>> of data types is too limited right now. It should at least support
> >> >>> long double formats. It would be more than awesome if PDL would work
> >> >>> on the full range of numeric data types commonly used in scientific
> >> >>> software and data formats, but it doesn't even come close currently.
> >> >>>
> >> >>> Some relevant lists:
> >> >>> HDF5:
> >> >>> http://www.hdfgroup.org/HDF5/doc/UG/11_Datatypes.html
> >> >>>
> >> >>> HDF:
> >> >>> http://www.hdfgroup.org/training/HDFtraining/UsersGuide/Fundmtls.fm3.html
> >> >>>
> >> >>> NetCDF:
> >> >>> http://www.unidata.ucar.edu/software/netcdf/docs/netcdf/CDL-Data-Types.html
> >> >>>
> >> >>> It would make interfacing to these very common formats stupid easy
> >> >>> without any additional memory or data storage expense that you get
> >> >>> from using the current PDL interfaces to these formats...
> >> >>>
> >> >>> -Judd
> >> >>>
> >> >>>
> >> >>
> >> >> Adding new C data types (like long double) to the core is relatively
> >> >> easy. At the moment there are some silly holes, such as unsigned
> >> >> chars: only signed bytes are supported. I know of no reason for this.
> >> >> The same is true for long doubles.
> >> >>
> >> >> The problem with adding new data types is that every single
> >> >> threadloop that doesn't explicitly state GenericTypes will have a copy
> >> >> of the code generated and compiled for each data type. We have seven
> >> >> data types at the moment, so adding unsigned chars and long doubles
> >> >> wouldn't have a huge impact on the code size. However, we might also
> >> >> consider adding signed long (32 bit ints) and signed long-long (64 bit
> >> >> ints). That takes us from seven to 11. We should add the types and see
> >> >> how much this increases the code size. It may not be unreasonable.
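
To get a rough feel for the "one copy per data type" growth described above, here is a minimal sketch in Python (standing in for the Perl-side generator). Everything in it is an assumption for illustration: the loop template is hypothetical and is not what PDL::PP really emits, and the type names are placeholders. Only the counts, seven types now and 11 after the additions, come from the message. It measures generated source text, not compiled object size, which would still have to be checked by actually building.

```python
# Hypothetical template for one generated loop; this is not real PDL::PP output.
LOOP_TEMPLATE = """\
void add_{t}({t} *out, const {t} *a, const {t} *b, long n)
{{
    for (long i = 0; i < n; i++)
        out[i] = a[i] + b[i];
}}

"""


def generate(type_names):
    """Return C source containing one copy of the loop per type name."""
    return "".join(LOOP_TEMPLATE.format(t=name) for name in type_names)


# Placeholder type names; only the counts (seven and 11) come from the thread.
current = [f"type{i:02d}" for i in range(1, 8)]
proposed = [f"type{i:02d}" for i in range(1, 12)]

before = generate(current)
after = generate(proposed)

print(before.count("for ("), "loops,", len(before), "characters of source")
print(after.count("for ("), "loops,", len(after), "characters of source")
print("growth factor:", round(len(after) / len(before), 2))
```

Because every function that does not restrict its types gets the same treatment, the generated source grows in proportion to the number of types: going from seven to 11 is a factor of 11/7, roughly 1.57.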
> >> >>
> >> >> As for adding additional types, like complex numbers or Large
> >> >> numbers, those are more difficult to accommodate. Craig and I will be
> >> >> going through the core for a cleanup leading up to v2.5 (hopefully
> >> >> we'll get started some time this summer), so maybe after that we can
> >> >> address non-native types. However, adding anything that's not known
> >> >> to C will be Very Difficult with PDL, as I understand it.
> >> >>
> >> >> One possible workaround, which I've thought about but have no code
> >> >> for, is a sort of PDL::Pointer type. But that would require a fair
> >> >> amount of core hacking before we have it working.
> >> >>
> >> >> David
> >> >
> >> > _______________________________________________
> >> > Perldl mailing list
> >> > Perldl at jach.hawaii.edu
> >> > http://mailman.jach.hawaii.edu/mailman/listinfo/perldl
> >> >
> >>
> >>
> >
> >
> > @Judd, No, Perl6 won't help us here unless we re-implement PDL for Perl6,
> > which seems to me like it would be a big project. :-)
> >
> > A different approach to solving this problem is to glue together and
> > compile C code at runtime using libtcc (http://bellard.org/tcc/). Perl is
> > good at assembling and manipulating strings, of course; libtcc is good at
> > quickly compiling a string of C code into machine code for x86 and ARM
> > processors. It doesn't do a ton of optimizations, but the combination of
> > highly customizable yet compiled code could be a big win.
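
The string-assembly half of that idea can be sketched as follows. This is only an illustration under stated assumptions: Python stands in for Perl, and the template, the function name build_kernel, and the kernel signature are all made up for the example. The compile step is deliberately left out; in the approach described above, the resulting string is what would be handed to libtcc (its tcc_compile_string function takes C source as a string).

```python
# Hypothetical kernel template; the names and signature are illustrative only.
KERNEL_TEMPLATE = """\
void kernel({ctype} *out, const {ctype} *a, const {ctype} *b, long n)
{{
    for (long i = 0; i < n; i++)
        out[i] = a[i] {op} b[i];
}}
"""

ALLOWED_OPS = {"+", "-", "*", "/"}


def build_kernel(ctype, op):
    """Assemble C source for one element-wise loop, specialized at runtime."""
    if op not in ALLOWED_OPS:
        raise ValueError(f"unsupported operator: {op!r}")
    return KERNEL_TEMPLATE.format(ctype=ctype, op=op)


# Only the type actually requested gets generated; nothing is pre-built.
source = build_kernel("long double", "*")
print(source)
```

The point of the design is that the element type and the operation are chosen when the string is built, so no copy of the loop has to exist ahead of time for types that are never used.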
> >
> > @Craig, libtcc could make range *extraordinarily* fast yet flexible. I've
> > played with it a bit and would be happy to chat about it with you
> > (porters list, private email, #irc, whatever).
> >
> > David
> >
> > P.S. I've been kicking around ideas about using libtcc and Perl or PDL
> > for many months now and hadn't found a good time to introduce it. This
> > seems to be about as good a time as any. :-)
> >
> >
> > --
> >  "Debugging is twice as hard as writing the code in the first place.
> >   Therefore, if you write the code as cleverly as possible, you are,
> >   by definition, not smart enough to debug it." -- Brian Kernighan
> >
>



-- 
 "Debugging is twice as hard as writing the code in the first place.
  Therefore, if you write the code as cleverly as possible, you are,
  by definition, not smart enough to debug it." -- Brian Kernighan