
PDL way forward draft


Revision as of 15:45, 14 July 2010 by Marshallch (Talk | contribs)

General Thoughts

  • Changes should be evolutionary not revolutionary
    • Maintains back compatibility for users and package maintainers
    • Incremental fixes can go a long way towards a radically better PDL
  • Need to test PDL build from scratch
    • All major platforms (unix/linux, mac os x, windows)
    • Document how to set up a local sandbox
      • PREFIX and LIB for ExtUtils::MakeMaker
      • Build with CPAN shell
  • We need developers on Mac OS X and Windows platforms
    • Wider testing of PDL for releases
    • Implementing platform specific fixes
  • Improve PDL usability for first time users
    • Work out of the box
    • Easy install via 1-click or package managers
  • Implement "1-click" installs by platform and document in wiki
    • Mac OS X: SciKarl (now on PDL sf.net download page)
    • Windows: Strawberry Perl Professional (TBD)
    • Unix OSes: Various package managers, once PDL is packaged
      • Debian (ok)
      • What about RPM?
  • Add interactive mode to CPAN builds
    • Interactive option for perl Makefile.PL
    • Use to update perldl.conf options
    • Allow perldl.conf options to be specified on the command line
    • An interactive GUI to download and install PDL might be nice
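
One way to allow perldl.conf options on the command line is to accept KEY=VALUE arguments and merge them over the defaults before the build starts. The following is only a minimal sketch under stated assumptions: the `apply_overrides` helper is hypothetical, and the three option names merely stand in for whatever keys perldl.conf actually defines.

```perl
use strict;
use warnings;

# Illustrative defaults only; the real perldl.conf defines many more keys.
my %defaults = (
    WITH_3D     => undef,    # undef is taken to mean "probe at build time"
    WITH_PLPLOT => undef,
    WITH_SLATEC => undef,
);

# Hypothetical helper: merge KEY=VALUE arguments into a copy of the defaults.
# Returns the merged config and the arguments that were not recognized,
# so that PREFIX=... and LIB=... can still be passed on to MakeMaker.
sub apply_overrides {
    my ($config, @args) = @_;
    my %merged = %$config;
    my @rest;
    for my $arg (@args) {
        if ($arg =~ /^(\w+)=(.*)$/ and exists $merged{$1}) {
            $merged{$1} = $2 eq 'undef' ? undef : $2;
        }
        else {
            push @rest, $arg;
        }
    }
    return (\%merged, \@rest);
}

my ($config, $rest) = apply_overrides(\%defaults, @ARGV);
for my $key (sort keys %$config) {
    my $value = defined $config->{$key} ? $config->{$key} : 'undef';
    printf "%-12s => %s\n", $key, $value;
}
print "passed through: @$rest\n" if @$rest;
```

Run with arguments such as `WITH_3D=0 PREFIX=$HOME/sandbox`, the sketch would override the first option and leave the PREFIX argument untouched for the normal build machinery.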

Modularize PDL

There seems to be a growing consensus that PDL should be split into a solid, installable-everywhere core and a handful of supporting modules. The supporting modules would be easy to install together through the CPAN Bundle namespace: a person wanting to use PDL for their own work would install Bundle::PDL instead of just PDL, since the latter would contain only the core.

As first steps in this process, we're going to try doing this with PLplot:

  1. Move PLplot out of the core
  2. Package PLplot
  3. Create something of a migration guide
    • Git steps to extract a PDL submodule
    • Modifications required to build alone or in PDL
  4. Repeat process for packages deemed 'not necessary for core functionality'
    • PGPLOT
    • OpenGL
    • Others
Monolithic distribution is a real problem when the failure of one
component breaks the whole build/compile/test/install.
  • Standardize use of modules inside and outside main PDL distribution
    • Minimal changes
    • Outstanding issues
      • How can we maintain PDL quality and consistency?
      • What is needed to upgrade a PDL (sub)module?
      • Bundles for distributing multiple configurations
        • Maybe a PDL::Install::Easy to interactively select and install
        • Maybe a PDL::Install::GUI that wraps it all up
  • Create git repository per split off module
    • Don't use case sensitive names for the repository
    • pdl-graphics-pgplot our first test case
    • Hosted via the sourceforge.net git
I don't want a PDL that is split apart.  I want a PDL that is
separable in the sense that internal and external modules in
PDL are equal and equally easy to develop, integrate, and use.
  • Settle on a default base distribution of PDL
    • Start with perldl.conf options off
    • Using only C and Perl to build
    • Builds and installs wherever perl does
    • OS packagers have observed that breaking PDL into components should be coarse-grained, not fine-grained
      • Minimizes version skew issues
      • File IO is something that could go either way, but should probably be in the core.
    • What about users who want the whole works
      • Use runtime detection to allow for "soft fails"
      • Have stubs in case of missing dependencies
      • Add missing dependencies after main PDL install
      • Would allow 2D and 3D graphics as part of base PDL
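
The "soft fail" idea with stubs for missing dependencies could be sketched roughly as follows. This is an illustration, not PDL's actual mechanism: `load_or_stub` and the module name `My::Missing::Backend` are hypothetical, and the sketch only covers the failure path (a real version would also import the functions when the module loads).

```perl
use strict;
use warnings;
use Carp qw(croak);

# Hypothetical helper: try to load an optional backend. On failure, install
# stubs in the caller's package that report the missing dependency only
# when one of the functions is actually called.
sub load_or_stub {
    my ($module, @functions) = @_;
    return 1 if eval "require $module; 1";
    my $caller = caller;
    for my $name (@functions) {
        no strict 'refs';
        *{"${caller}::$name"} = sub {
            croak "$name() is unavailable: optional module $module is not installed";
        };
    }
    return 0;
}

# My::Missing::Backend is a made-up name chosen so that the load fails.
my $have_backend = load_or_stub('My::Missing::Backend', qw(plot_line));
print "backend available: ", ($have_backend ? "yes" : "no"), "\n";

# The program itself still loads and runs; only the call fails.
eval { plot_line(1, 2, 3); 1 } or print "soft fail: $@";
```

The point of the design is that a missing graphics library costs the user one clear error message at the moment of use, rather than a failed build or install of the whole distribution.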

Better Handling of External Dependencies

  • Review external dependency handling
    • Use Devel::CheckLib to directly detect headers and libraries
    • Remove dependencies on external commands (e.g. clear in Demos)
    • Make missing functionality detectable at runtime (a la PGPLOT)
    • Could use warning/error stubs if functionality not available
    • Verify dependency detection across all major PDL platforms
      • Unix/Linux
      • Windows
      • Mac OS X
  • Fix FORTRAN code requirements
    • Add a USE_FORTRAN option to perldl.conf
    • Convert PDL::Slatec to use C with f2c of source
    • Convert PDL::Minuit to use C with f2c of source
[David/run4flat] I think we should also write a rock-solid Alien::PLplot
package that will interface with various OS's package managers to ensure
that PLplot is installed using the native package management, or download
and compile the source if the native package management does not include
PLplot or is unknown.  Similar Alien packages can and should be written
for OpenGL, PGPLOT, and whatever else gets split off the core.
  • Make PDL::Graphics::PLplot an external module to answer these questions:
    • What is needed to make an external PDL::Module work the same as a PDL::Module configured and built in the main distribution?
    • What changes need to be made to the PDL build process?
    • How does the documentation get integrated in with the perldl help and pdldoc?
    • How do we add/remove/update on-line docs?
    • What is the significant difference between an in-tree PDL module and an external one?
    • What are the reasons that a module must be built in the default tree?
    • Which PDL (sub)modules must be built in the main distribution and why?
    • If the split of PDL::Graphics::PLplot is successful, then that module could depend on Alien::PLplot to get/install/configure the required PLplot library configuration.
    • For external dependencies, we need to sort out a PDL-specific set of install locations for "our" dependencies. We don't want a PDL::Alien::PLplot to scatter files all over the main system directories.
    • How is testing done?
      • Do test suites merge?
      • Is there some sort of API or contract that the external+internal modules would need to meet?
      • How are tests, regression tests, and compatibility handled?
    • How are versions and consistency between different releases to be maintained?
      • What are the standards for consistency and quality?
      • What should we have as a spec?
      • How do the current PDL internal modules stack up (varying in implementations)?
  • Modify PDL::Graphics::OpenGLQ to build dynamically
    • Would enable PDL install then TriD upgrade
    • Implement perl version of OpenGLQ routines
    • Use Inline::Pdlpp to build if need to compile
    • Prompt at runtime to compile
  • Alien modules for the hard dependencies
    • Add "no croak" and "no install" options
    • More generally useful than just for PDL
    • Implement Alien modules needed by PDL
      • Inventory what libraries/commands/functions are needed
      • Some possible candidates for Alien-ization
        • Alien::Gnu::readline
          • Not required, Term::ReadLine::Perl works if no GNU readline
          • GNU readline doesn't build easily on pure MinGW win32
        • Alien::Curses for IO::Browser
        • Alien::GD for PDL::IO::GD
        • Alien::HDF4, Alien::HDF5
        • Alien::LibJPEG
        • Alien::LibPNG
        • Alien::LibTIFF
        • Alien::NetPBM
        • Alien::OpenGL, Alien::GLUT, Alien::GLX, Alien::GLUI ...
        • Alien::PGPLOT
        • Alien::PLplot
        • Alien::PROJ4
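
The Devel::CheckLib probe proposed at the top of this section might look roughly like the sketch below. The `probe_library` wrapper is hypothetical, and the library and header names are assumptions for illustration only; the real names differ by platform and PLplot version. The wrapper reports 'unknown' instead of dying when Devel::CheckLib itself is not installed.

```perl
use strict;
use warnings;

# Hypothetical wrapper around Devel::CheckLib::check_lib.
# Returns 'found', 'missing', or 'unknown' (Devel::CheckLib not installed).
sub probe_library {
    my (%args) = @_;
    return 'unknown' unless eval { require Devel::CheckLib; 1 };
    return Devel::CheckLib::check_lib(%args) ? 'found' : 'missing';
}

# Assumed names: adjust lib and header for the platform being probed.
my $status = probe_library(lib => 'plplotd', header => 'plplot.h');
print "PLplot: $status\n";
```

Because the probe compiles and links a tiny test program, it checks for the header and the library directly rather than relying on external commands being present.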

Provide a Baseline Default 2D and 3D Graphics Capability

  • Perl OpenGL improvements supporting PDL development
    • Not in critical path for PDL-2.4.7
    • Move to Devel::CheckLib for OpenGL probes
    • Does an Alien::OpenGL module make sense?
    • How about a GLEW or GLEE based interface?
In order to guarantee that PLplot works on any machine, it was
suggested that somebody write an OpenGL driver for PLplot.  In
this way, ensuring that PLplot has a consistent, reliable interface
boils down to ensuring that OpenGL properly installs.
  • Add a portable, generic graphics capability to PDL
    • Graphics should definitely be in a base PDL distribution
      • BUT, no hard dependencies for graphics for a base PDL install
      • If dependencies are there, graphics will work
      • If not, you should be able to install the dependency afterward
    • Propose PLplot for 2D default capability
      • Need PLplot and PDL::Graphics::PLplot to build on all platforms
        • MS Windows
        • Cygwin/XP
        • Linux/unix
        • Mac OS X
      • Need a portable interactive display driver
        • Propose OpenGL driver solution
        • Implement first cut driver a la existing mem/memcairo
        • Full driver would support anti-aliasing
        • Can tie in with existing PDL::Graphics::TriD
    • Propose TriD with new POGL build for 3D
      • Add support for improved 2D image display using TriD
        • imag2d() works ok
        • Next step: move into the PDL distribution as a module with tests
      • Enable REPL and GUI simultaneous operations and event loops
        • perldl via ReadLine and FreeGLUT
        • REPL via ReadLine and FreeGLUT
      • Need more portable OpenGL + GUI library support
        • PDL::Graphics::TriD::Tk is only X11
        • Need alternatives with Perl and OpenGL support
        • GLUI is small and simple enough to bundle with PDL
        • Gtk2 and wxWidget both have OpenGL and Perl support

Documentation and Usability Fixes

  • Update PDL web site pages
  • Review current documentation (underway)
    • Verify correctness and consistency with PDL-2.4.6_014+
    • Update documentation where needed.
    • Add web searchable versions of the docs.
    • Maybe a wiki format could be used to improve docs
  • Add platform install notes to PDL wiki
    • Complete build-from-scratch installs
    • Build issues and their fixes
    • How to get needed dependencies
    • Other platform details
      • Mac OS X
      • Linux (by distribution and version)
      • BSD
      • Solaris
      • Windows
      • Other
  • Update HDF docs
    • HDF5 and HDF4 are entirely different beasts (and totally incompatible)
    • The PDL::IO::HDF library uses HDF4.
    • Portability issues, especially regarding installs
  • Add docs/support for users of other software
    • IDL, Matlab, NumPy
    • PDL::Matlab helper module
      • Implement the main basic Matlab routines
      • Add help for equivalent PDL constructs
      • New pdl() constructor with matlab [] syntax for string arguments
        • Implemented in PDL git
        • Needs more testing and verification
        • Does it work for MATLAB and PDL users' purposes?
  • Add CPAN Testers startup info to PDL wiki
    • What you need to start
    • How to configure CPAN
    • How to install the reporting packages
      • Test::Reporter and others
      • How to use CPAN Testers version 2
      • Other options
  • Update PDL Book (in progress)
    • Needs updating to match PDL-2.4.6_xxx from PDL-2.4.3
    • Make a web version available on-line if possible
    • Make PDF of draft available on sourceforge

Build Process Fixes

  • Fix the test problems with preexisting PDL installs
    • Maybe a function in PDL::Core::Devel to check @INC
    • Clean up build output from make process to reduce cruft
      • May not be possible with EU::MM
      • Module::Build offers some hope to fix
  • Improved build process
    • Little fixes to current build would help
    • Aim to build out of box with Perl+C only
    • Building with Perl+C only implies we need Module::Build
    • Benefits/Features of a switch to Module::Build
      • Would affect the entire PDL tree build process
      • Better diagnostics
      • Clearer build output
      • Only need Perl+C
  • Bundle::PDL implementation
    • What if not all elements are available on a platform
    • Can Bundles have Makefile.PL's to sort these things out?
    • The ability to use Bundles would be a direct consequence of modularized PDL components
  • Is there a way of setting perldl.conf before PDL install attempt?
  • Set up paths for external dependency libraries for PDL
    • Local to PDL not system wide
    • Can skip if system wide is available
    • Relocatable with package-config or Alien dependencies information
  • List and obtain module owners and developers for existing PDL modules
    • Get them to revisit their modules in light of portability fixes
    • Improved intro documentation
    • Update documentation to reflect current PDL implementation
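
The @INC check suggested above for the preexisting-install test problems could start from something like the following. It is a sketch only: `find_installed_pdl` is a hypothetical name, not an existing PDL::Core::Devel function, and looking for a top-level PDL.pm is an assumed heuristic.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: return every directory in the given search path that
# already holds a PDL.pm. Code references in @INC (import hooks) are skipped.
sub find_installed_pdl {
    my (@dirs) = @_;
    return grep { !ref($_) && -f File::Spec->catfile($_, 'PDL.pm') } @dirs;
}

my @found = find_installed_pdl(@INC);
if (@found) {
    warn "Existing PDL found in: @found\n",
         "Tests may pick it up instead of the freshly built tree.\n";
}
else {
    print "No preexisting PDL in \@INC.\n";
}
```

A build script could call such a check before running the test suite and either warn, as here, or adjust the search path so the new build is tested.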

Computational Improvements

  • Improved support for external computational libraries
    • HDF5 IO
    • OpenCV
    • FFTW3
      • Depends on Alien::FFTW3
    • Clean and unify the FFT and FFTW in PDL
      • FFT is the default implementation
      • Add FFTW acceleration for the default FFT
      • Add cleaner use of complex values to FFT routines
      • Document FFT/FFTW
        • The computational algorithms
        • The input/output locations
      • Make non-inplace versions of FFT
      • Make default FFT library match FFTW calling conventions
        • But, use symmetric scaling (no factors of N)
  • Clean up/refactor core PDL array and computation functionality
    • Document current implementation
    • What improvements would be useful for PDL::PP?
    • Minimal, clean datatype support
      • Structures
      • Objects
      • Pointers
  • Improved performance metrics for PDL
    • Benchmark current PDL performance
      • Speed
      • Memory usage
      • Latency for piddle operations
      • Support for profiling
    • Any easy improvements relating to PP refactoring?
  • Full GSL bindings for PDL
    • Alien::GSL to install and something else to check
    • Prioritize GSL functionality to add
    • Convert GSL documentation to PDL usable form
    • Automate code conversion
      • Can we use Math::GSL as a starting point?
      • What about a PDL::PP version using GSL configure info for thread support?
    • Slatec compatibility wrappers
  • Support for GPGPU computing via PDL
    • The quick start would be an Inline::PDL::CUDA module
    • Look at PyCUDA for some implementation ideas
  • Improved Complex support for PDL
    • Data representation (Re,Im) vice (*,2) or separate Re and Im
    • Complex functions (GSL can provide)
    • Better support for complex data in PDL
      • How to control
      • Things should not break because of real<->complex issues

Padre Development for Strawberry Perl Pro Release

  • Evaluate a way forward to use PDL with Padre
    • Try out and determine what works already
      • Padre runs on strawberry
      • Padre::Plugin::REPL crashes on win32 and cygwin
        • Another interface will be needed for PDL support in Padre
        • There is a new Task API that may be useful
        • Maybe a separate process
      • Padre does not build/install with cygwin
      • Padre build/install difficult
    • List what is required and what would be nice to have
    • Make sure development on windows, unix/linux, mac os x is covered
  • Refactor perldl to support operation with Padre
    • Remove explicit use of STDIN and STDOUT for perldl (or other REPL)
    • Extend the help functionality to work on any set of modules, not just PDL
    • How can we tie in Padre help with perldl help and apropos?
    • Investigate moving to Devel::REPL as a basic framework (completed)
      • A rough Perldl2 shell has been implemented:
        • Based on Devel::REPL (and hence, Moose)
        • Supports lexical variables
        • Preservation of $_ across lines
        • Filename, method, variable completion
        • Completion works with both Term::ReadLine::Gnu and Term::ReadLine::Perl
        • Called as pdl2 (if Devel::REPL is not available, calls perldl)
  • Need better documentation of the PDL help system
    • Is there a way to fetch all the functions/keywords the PDL help system supports?
  • Add GLUT event support
    • To original perldl
      • Interleave event loops via Term::ReadLine::Gnu
    • To Devel::REPL perldl for Padre
      • Interleave event loops directly with wxWidgets
      • Use Term::ReadLine::Gnu and same as with perldl
    • Other?
  • Add record of installation information to PDL config
    • What was built
    • Why
    • Where it went (was installed)
    • Other key configuration parameters

Coordinate PDL Plans with Perl 6

  • Begin tracking of Perl 6 progress
    • Who is the point of contact (POC) for communication?
    • What type of feedback is needed?
    • Thought: the PDL computation core may translate more readily to Parrot than to Perl 6