Re: [Transdecoder-users] Announcement: Transdecoder release r20140704
Extracting likely coding regions from transcript sequences
From: Martin M. <mmo...@gm...> - 2015-01-09 16:54:07
Hi Brian,

Brian Haas wrote:
> Hi all,
>
> In general, I prefer for the system to be retained as a fully
> self-contained unit. I really would *not* want any of the scripts to
> show up in /usr/bin, or for the perl libraries to contaminate (for
> lack of a better word) any site-perl libraries.

I do not mind if all of them, maybe except the TransDecoder main script, appear in /usr/share/TransDecoder/util and we point PATH to it. Or we can use an environment variable TRANSDECODER to point to /usr/share/TransDecoder. But the perl modules I would install under /usr/lib64/perl5/vendor_perl/5.16.3/TransDecoder, like most apps do.

>
> Keeping everything self-contained within the one package allows one
> to easily move it around or delete it in its entirety. In this case,

The packaging system enables the user to uninstall a package seamlessly as well. I infer you want the unpacked directory to be functional. That can be done by telling users to execute the following (or by shipping it as a shell wrapper script, TransDecoder.sh):

    export PATH=$PATH:`pwd`:`pwd`/3rd_party/TransDecoder_r20140704:`pwd`/3rd_party/TransDecoder_r20140704/3rd_party/ffindex-0.9.9.3/src:`pwd`/3rd_party/TransDecoder_r20140704/3rd_party/parafly-r2013-01-21
    export TRANSDECODER=`pwd`
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:`pwd`/3rd_party/TransDecoder_r20140704/3rd_party/ffindex-0.9.9.3/src
    ./TransDecoder $*

> they simply need to have Transdecoder in their PATH, and ideally the
> whole thing just works.

I attach two patches I needed because the installed apps looked for /usr/bin/util. With the above 3 commands the unpacked distribution should still be functional and match your intent.

> The one key issue that I'm happy to consider is what dependencies to
> bundle with the system vs. what dependencies require users to supply
> externally. In this case, my personal opinion is that the very
> commonly installed bioinformatics tools and very large software
> packages (ie. samtools, hmmer, bowtie, etc.)
> need not be bundled, but
> for other ones that may be less commonly installed and dependent
> within the software package (ie. ffindex, parafly, and possibly
> cdhit) or those that have specific version compatibility issues (ie.
> rsem in Trinity), I'm inclined to simply bundle them into the package

I do not agree, especially because a distro will NOT allow a user to fetch the bundle unless he/she agrees with *all* the LICENSEs in it. So you only complicate the situation, and the person who made the package for a distro will be blamed if he did not realize there are multiple licenses in the bundle. This is especially the case for Trinity (because there is some code some people will not be comfortable with). I don't remember the details off the top of my head, but it is in the email archives from 2014.

Provided the user agreed with a licensing scheme, the package maintainers can easily do something similar to what I did: just zap the 3rd_party directory contents and install "manually" or via a future Makefile doing the *install* step. The only thing that is necessary for us is that the tools do not look into "/usr/bin/util", which is currently derived as '/usr/bin' + '/util'.

> to ease installation and better ensure overall system integrity. As
> long as those bundled packages remain inside the self-contained
> software package, and don't contaminate or replace other
> already-installed software tools, and are specifically leveraged by
> the driving software within the self-contained package (by internally
> adjusting PATH env var, or looking for tools via relative paths to
> the package installation directory), I think it should be both
> acceptable and non-disruptive. I should also mention that we don't
> and won't plan to bundle any software that has restrictive licensing
> issues nor that is not open source.

I think the above TransDecoder.sh trick should meet your criteria, although I did not test it.
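To make the TransDecoder.sh idea a bit more concrete, here is a rough and equally untested sketch. It assumes the wrapper sits in the top directory of the unpacked distribution and that the helper scripts live in its util subdirectory; the function names td_root and td_setup_env are made up for illustration and are not part of TransDecoder. Unlike the `pwd`-based commands, it derives the directory from the wrapper's own path, so it should also work when called from another working directory.

```shell
#!/bin/sh
# Untested sketch of a TransDecoder.sh wrapper. td_root and td_setup_env
# are hypothetical helper names, not part of TransDecoder.

# Print the absolute directory that contains the given script path.
td_root() {
    (cd "$(dirname "$1")" && pwd)
}

# Export TRANSDECODER and append the distribution directories to PATH.
# $1 is the root of the unpacked distribution (assumed layout: root/util).
td_setup_env() {
    TRANSDECODER=$1
    PATH=$PATH:$1:$1/util
    export TRANSDECODER PATH
}

# Intended use at the end of the wrapper (left commented out in this sketch):
#   td_setup_env "$(td_root "$0")"
#   exec "$TRANSDECODER/TransDecoder" "$@"
```

Using "$@" rather than $* in the last step keeps arguments that contain spaces intact. The bundled 3rd_party directories could be appended to PATH and LD_LIBRARY_PATH in the same way.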
>
> Now - with that said - TransDecoder does need to be seriously
> overhauled with respect to its makefile, build mechanism, etc., and

I think the only "pressing" issue is either acceptance of the attached patches or of something functionally similar, plus deciding where to place the *.pm files and ensuring that they get found via the perl @INC path. I don't care about the Makefile anymore as I now know what layout is needed.

> separately from that, it needs some additional enhancements to make
> it even more useful to users in the next release. So, after this next
> generation of the Trinity software goes out, we'll tackle
> TransDecoder and whip it into shape. But note that, unless there's
> some major shift in my method of operation with respect to software
> packaging issues, it'll fit the general description that I outline
> above. Alexie and I will work on it together, and we'll try to do
> what we can to address any lingering issues that Martin might raise.
>
> cheers,

The Makefile should check whether a user has hmmpress; there is currently no check for the binary. I think it could also tell users whether they can stay with a version earlier than hmmer-3.0 (that is, whether there is something like hmmpress in the older versions of hmmer). I do not know myself. I only know that hmmer-3.0 does not yet have all the functionality of the hmmer-2 series, so some users just cannot "upgrade". Basically, you could document what is needed and when.

I suspect that there is no real requirement for openmpi and that the perl scripts could call mpiexec (if discovered via $PATH) to launch children. The requirement is probably coming from parafly; I am not sure what in ffindex needs it. And is anything in the perl scripts using the MPI API?

I also do not like that 2 jobs are forked by default. The default should be CPU=1.
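For the missing hmmpress check, something along these lines would probably be enough. This is only a sketch: have_tool and check_tools are names I made up, and it only tests whether a binary can be found via $PATH, not which hmmer version it belongs to. The last line shows one way to default to a single job unless the user asks for more.

```shell
#!/bin/sh
# Sketch only: have_tool, check_tools and the CPU variable are illustrative
# names, not something TransDecoder currently provides.

# Return 0 if the named program can be found via $PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Report every missing tool from the argument list on stderr;
# return non-zero if at least one is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! have_tool "$tool"; then
            echo "missing required tool: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Default to a single job unless the caller sets CPU explicitly.
CPU=${CPU:-1}
```

A Makefile target or the main script could then call `check_tools hmmpress` and stop with a clear message, and the same test (`have_tool mpiexec`) could decide whether mpiexec is available at all.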
Finally, it is defined somewhere that only ORFs longer than 900nt are retained in the results. I think that default is too strict for current NGS-based assemblies, which are full of sequencing errors causing frameshifts. BTW, TransDecoder does not include "ORFs" broken by a frameshift in the results, right? [Note: it is easier to ask than to test that myself. ;-)]

Martin