Re: [Module::Build] M::B actions design

SourceForge Headquarters 1320 Columbia Street Suite 310 San Diego, CA 92101 +1 (858) 422-6466

On Friday, October 17, 2003, at 03:57  am, Randy W. Sims wrote:
>
> Thanks for the discussion; It has caused me think this through a=20
> little more thoroughly.

Likewise - I'm glad I'm not the only one thinking about this, and it's=20=

good to talk it over.

>  But the thoughts below should still be considered idealizations and=20=

> may differ from the possible or practical. ;)
>
> On 10/16/2003 6:50 AM, Steve Purkis wrote:
>> On Thursday, October 16, 2003, at 12:06  am, Randy W. Sims wrote:
>>> There is another issue here in that the list of actions is growing=20=

>>> quite large, and M::B is (IMHO) turning into a large monolithic=20
>>> module. Would it be worth the effort to re-architect actions, so=20
>>> that they are plug-ins (M::B::ACTION::html, M::B::ACTION::dist).=20
>>> This would keep M::B from becoming monolithic and would allow others=20=

>>> to develop plug-ins to expand functionality without requiring=20
>>> invasive code changes.
>> Yes, M::B: is becoming quite monolithic.  So was MakeMaker.  The main=20=

>> difference I find between the two is that M::B is much easier to=20
>> read, understand and extend.
>
> I didn't realize until now that my original choice of words may have=20=

> been misconstrued as an insulting remark--That was not my intent. I=20
> meant monolithic in the sense that most of the code is in a single=20
> file/class (M::B::Base) that it is growing quite large, and it is not=20=

> likely to get smaller. I didn't intend it as an insult or as any kind=20=

> of judgemental statement, just as an observation.

I don't think anyone interpreted it in an insulting way :).  Also, the=20=

above was more a blaz=E9 remark than anything else.

>> Splitting out the actions is not a bad idea - I bounced a similar one=20=

>> around with some friends in the past:
>>     package My::Action;
>>     use base qw( Module::Build::Action );
>>     use constant depends_on =3D> qw( foo, bar );
>>     sub builder { }    # the parent M::B obj
>>     sub check_dependencies {
>>         ...
>>     }
>>     sub execute {
>>         ...
>>     }
>> There are a few problems with these ideas: how do you write platform=20=

>> specific code?  One answer is to sub-class each action, or write a=20
>> method for each platform in the action's class.  That could work, but=20=

>> it could also get a bit hairy.
>
> The idea(l) is that M::B provide as much as possible an API for the=20
> action plugins that is platform independent. If there are still OS=20
> differences and they are small, they can be put inline. Otherwise, you=20=

> would create subclasses that handled the differences and use the=20
> 'base' class as a proxy, forwarding the calls to the appropriate=20
> 'subclass'.
>
> i.e. if you write an action, say for reporting build status to=20
> cpan-testers called M::B::Action::Reports. If there are differnces=20
> between *nix systems & Windows, you'd create classes=20
> M::B::Action::Reports::Unix & M::B::Action::Reports::Windows. Now, in=20=

> M::B::Action::Reports you'd write stub functions that determinse the=20=

> appropriate subclass to call based on OS and forward the call to the=20=

> real function in the determined subclass.

Yeah, that would do the trick.

>> Then, how do you access commonly used methods?  A solution might be=20=

>> to bung all these methods in a common place that everybody inherits=20=

>> from, say M::B::Common:
>>     M::B::Common
>>         |
>>         +- M::B::Base
>>         |
>>         +- M::B::Action
>>               |
>>               +- M::B::Action::Foo
>
> Again, the common methods would become part of M::B's API for actions.=20=

> (Nit: It's a bad idea to subclass just to share code.)

I disagree - sharing code is one of the best reasons to subclass.  But=20=

I don't want to be pedantic here, because I think having a API for M::B=20=

actions is an excellent idea, and I imagine it would be a bit of mix &=20=

match as to what makes more sense in the sub-class, and what makes more=20=

sense in the API.

>> And, how do you subclass actions?  If I want to write my own=20
>> 'install' action, how do I do it?  At the moment, I sub-class M::B=20
>> and do it there.  Sub-classing each action is easy, but how do you=20
>> get M::B to use them?  One way is to say M::B::Base must have stub=20
>> methods for each action so that they can be overridden.
>
> As an Build.PL author, you would do one of two things.
>
> 1) You'd use a helper class, something like
>
> my $action =3D new M::B::Action::Builder(
>   name =3D> 'install',
>   code =3D> sub {...},
>   help =3D> 'A help message to display for users.'
> );
>
> M::B::Action::register( $action );

Cool idea - fits in well with Module::Build->subclass().  The=20
registration could be done automatically by the builder too.

As an aside, I'd move the Builder somewhere else and reserve the=20
M::B::Action:: namespace for actions only.

> or 2) You'd create an inline class and register it:
>
> package MY::Action::install;
> use Module::Build::Action::Install;
> use vars qw(@ISA);
> @ISA =3D qw(Module::Build::Action::Install);
> sub new { return Module::Build::Action::register( bless \my $arse ) }
> sub ACTION_install { ... }
>
> package main
> new MY::Action::install;
> ...

If you're gonna use a register() method, I'd say do it at a class=20
level, and keep with Perl's naming convention (lowercase reserved for=20
pragmas):

	package My::Action::Install
	use base qw( Module::Build::Action::Install );
	Module::Build::Action->register( install =3D> __PACKAGE__ );

	sub dispatch { ... } # or execute() or whatever()

(This is still assuming you have one action per package - I'll get to=20
that below).

>> I think they all stem from the fact that M::B was not written with=20
>> this design in mind, so it would take a lot of effort to refactor=20
>> things.  It certainly would involve a *lot* of changes, and little=20
>> (if any) backwards compatibility.  There is potentially a huge gain=20=

>> as far as simplicity is concerned, but it may be more work than it's=20=

>> worth.
>
> Yeah, I have no doubt it would be difficult to change, but M::B is not=20=

> going to get smaller. So, any refactoring is best done sooner than=20
> later.

Good point.

>> That said...
>>> This is only half thought-out, so I'm not sure if it's practical.=20
>>> For example, I don't know how easy it would be to come up with an=20
>>> interface so that actions could get info from M::B without digging=20=

>>> into the internals.
>> Making them plugins is a good idea, but it's even more complicated.
>>> <thinking out loud>
>>>
>>> Each module would contain one or more actions.
>> I'd limit it to one a package.  That way you can get the action name=20=

>> from the package name itself:
>>     'build' =3D> M::B::Action::Build
>
> It would be more convenient to allow multiple related actions to share=20=

> a file so that they can share common functions/data. Also, it's best=20=

> if they do not share the name because the most appropriate name for an=20=

> action might be a perl keyword. The module should have a descriptive=20=

> name appropriate for the actions it contains. The actions=20
> (packages/classes) themselves should probably follow the current=20
> standard of prefacing them with 'ACTION_' to avoid conflicts.

I see where you're coming from, but I still disagree.  Here's why:

1. Simplicity.
This is the key, really.  A class that does one thing and does it well=20=

is easier to write, maintain, understand and sub-class.  There's also=20
less to go wrong, and it will make dispatching the actions a lot=20
easier, seeing as each action would be dispatched in the same way:

	$action->dispatch();	# thank you Pipeline.pm :)

2. Common code can be inherited or called out.
We talked about inheritance above; with dependencies between actions,=20
common code can be easily called out.  The M::B API for actions would=20
also fit in here.

3. Highly configurable.
If each action is implemented as a small unit of code, you could easily=20=

change the behaviour of the system by replacing one of the units.

I don't think having an action that has the same name as a perl keyword=20=

(say 'if') would be a problem here.

>>>  M::B would search for plugin actions at startup, querying each=20
>>> module for a list of supported actions.
>> So M::B builds up a list of actions and their classes?  And I assume=20=

>> it would search particular sub-classes, starting with=20
>> Module::Build::Action?  Maybe that's how you could do user level=20
>> sub-classing: push on your own 'search' package:
>>     @M::B::SearchPackages =3D qw( My::Action Module::Build::Action );
>
> This is where I still have uncertainties. One way that seems kind of=20=

> hackish to me is to scan for '*.pm' under M::B::Action. Then try to=20
> execute a register function for each module found. If an author=20
> provides an action that is intended to subclass/override another=20
> action, then the register function for that action would have to=20
> report that in some way. I think this kind of usage would be rare in=20=

> practice. ???

I've done some work along these lines in the past - right off the bat=20
I'll tell ya that looking for *.pm doesn't work as well as you might=20
like :-/.

Here are a couple of algorithms I've used in the past (I'm sure there's=20=

more ;)...

1. A type map & a register method.
The idea is you have an hash like the 'ActionClassMap' I mentioned=20
before (pls forgive the studley caps :).  That hash is populated=20
internally by the module author:

	package Module::Build::Action;
	our %ActionClassMap =3D
	  (
	   Build =3D> 'My::Action::Build',
	   ...
	  );

Externally, that hash is populated by a 'register' method like the one=20=

you had above:

	package My::Module::Action::Build;
	use base qw( Modle::Build::Action::Build );
	Module::Build::Action->register( build =3D> __PACKAGE__ );
	# or alternatively:
	__PACKAGE__->register( 'build' );

2. Search packages.
This option is potentially less work, but not as flexible in naming=20
convention - the idea is you have a rule for naming action classes=20
under a namespace, and an ordered list of packages to search.  For=20
example:

	# assuming a 'ucfirst' rule:
	$class       =3D ucfirst( $action ); # 'build' =3D> 'Build'
	@search_pkgs =3D qw( Module::Build::Action );

	foreach my $pkg ( @search_pkgs ) {
		$pkg .=3D "::$class";
		return $pkg if $pkg->can( 'new' );
		eval "require $pkg";
		return $pkg unless $@;
	}

	die "couldn't find action $action\n";

>>>  Each action knows what parameters it accepts and can display its=20
>>> own help info. There could be a helper class for generating actions=20=

>>> and linking them by dependencies, i.e.
>>>
>>> my $action =3D new M::B::ACTION::Builder(
>>>   name =3D> 'My_lib',
>>>   code =3D> sub {},
>>>   ...
>>> );
>> Dunno if that's needed - M::B already does it.
>
> Do you mean the subclass() method? In the way I'm describing, you=20
> would subclass on an action by action basis rather than subclass M::B;=20=

> in fact, you would probably rarely if ever need to subclass M::B=20
> itself.

Yeah, I think I missed the point last time :).  Gotcha now.

>>> my $top =3D new M::B::ACTION::Builtin( name =3D> 'all' );
>>> $top->add_dependency($action);
>> I'd go with 'M::B::Action::All', it's more consistent.
>
> Actually, right now I'm thinking more along the lines of
>
> my $top =3D new M::B::Action::find_action( name =3D> 'all' );
> $top->add_dependency($action);
>
> to get a reference to an existing action as it seems more flexible.

Mmm..  so you'd like to be able to compile a list of dependencies=20
external to the Actions themselves, eh?  Well, it's not what I was=20
thinking but it could work.

Still, after a bit of thought I'd say keep it simple and sub-class if=20
ever you want to change the dependency tree.  Which is not as flexible=20=

when you're writing Build.PL's, but it does mean less work for the=20
person implementing/maintaining M::B's action dispatching code.

-Steve