dart-announce Mailing List for DART: DNA, amino acid and RNA tests
Brought to you by:
ihh
You can subscribe to this list here.
| Year | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2004 | | | (2) | | | | (1) | | | | | |
| 2005 | | | | | | | | | (2) | | | |
| 2008 | (1) | | | | | | (1) | | (1) | | | |
| 2009 | | | (1) | | | | | | | | | |
From: Ian H. <ih...@be...> - 2009-03-27 15:43:36

Hi all,

As of this morning, dart is moved to github:

http://github.com/ihh/dart/tree/master

For CVS users, this will mean a small amount of pain since you may have to install git (unlike CVS it's not usually bundled as a default tool), and you'll have to re-download the repository. In return, you'll get a proper 21st-century source code management tool. If you haven't used it, git is very nice.

For most people, it will be a boon because you can now download dart as a tarball, without me having to go through Sourceforge's excruciating release mechanism. ;)

New release announcement coming soon on this list...

Ian
From: Ian H. <ih...@be...> - 2008-09-29 17:53:18

Hi all

DART CVS services will be down for 8 hours from 17:00 UTC on Tuesday 2008-09-30, while Sourceforge migrate their CVS servers to a new datacenter (see below).

Cheers
Ian

SourceForge.net Team wrote:
> Hello,
>
> This message is being sent to all project administrators for
> SourceForge.net-hosted projects since the majority of projects on the site
> are opted-in for CVS service use. If your project is not actually using
> CVS, you can turn off this feature via the CVS page under the Admin menu
> option on your project pages on the SourceForge.net site (after
> logging-in).
>
> On 2008-09-30 (Tuesday), project CVS service will be taken offline at 17:00
> UTC for not more than 8 hours to permit the migration of service to the new
> Chicago datacenter. Servers have been prepared and locally tested already;
> the migration event will be used to copy data deltas from the old
> datacenter to the new, and cutover CVS service DNS to point at the new
> servers. During this migration window read-only pserver-based CVS access,
> SSH-based developer CVS access, ViewVC web-based repository access for CVS,
> and other CVS-related facilities will be completely offline. CVS is the
> final major SourceForge.net service left to be migrated to our new
> Chicago datacenter.
>
> If you have any follow-up questions or concerns, please submit a new
> Support Request at: https://sourceforge.net/projects/alexandria/support/
>
> Thank you,
>
> Jacob Moorman
> Director of Operations, SourceForge.net
>
>
> ----------------------------------------------------------------------
>
> This message was sent on behalf of SourceForge.net based on
> the existence of your user account on our site.
>
> To unsubscribe from future mailings, login to the SourceForge.net site
> and request account removal at:
> http://sourceforge.net/account/remove_account.php
>
> Or contact us by postal mail at:
> Attn: SourceForge.net Legal Services - Account Removal
> SourceForge, Inc.
> 650 Castro Street, Suite 450
> Mountain View, CA 94041
>
> Unsubscribe requests will be processed within 10 days of receipt.
From: Ian H. <ih...@be...> - 2008-07-19 04:49:47

A quick software update: xrate now does maximum a-posteriori ancestral phylogenetic sequence reconstructions, using the "-ma" option.

Sketchy tests have been done with the following models (the results look plausible and it doesn't crash):

-- general reversible point substitution model (DNA) -- reconstructed dummy example
-- 3-state Thorne-Goldman-Jones phylo-HMM (protein) -- reconstructed Tc1 transposase
-- PFOLD (RNA) -- reconstructed hammerhead ribozyme, and a couple others
From: Ian H. <ih...@be...> - 2008-01-31 22:24:06

Hi all,

I've created a new mailing list for getting help/support for dart (including xrate, stemloc, Handel etc). This should make it a bit easier to field questions as the pool of dart developers grows, particularly as the distribution now includes various scripts from different developers.

You can subscribe here:

https://lists.sourceforge.net/lists/listinfo/dart-help

I'm CCing a few of the people who've used dart (mostly xrate) substantially in the past few years. You don't have to be a subscriber to post questions (non-subscriber posts are moderated).

I'm also CC'ing dart-announce on this email. Whether or not you subscribe to dart-help, *please* consider subscribing to dart-announce if you use (or might use) these tools. This is a VERY low-volume, moderated mailing list, intended for major announcements/updates ONLY. The last message was in 2005! (Yes, I know, another update is overdue. It will be along shortly.) My point: it won't hurt your inbox, but it might just keep you updated.

Subscribe to the dart-announce list here:

https://lists.sourceforge.net/lists/listinfo/dart-announce

And check the DART website:

http://biowiki.org/dart

Best wishes..

Ian Holmes
From: Ian H. <ih...@bd...> - 2005-09-08 19:39:40

Quick addendum to my last email: Ewan informs me that Exonerate is the one true successor to GeneWise, that GeneWise (and presumably Exonerate) rarely segfault and that champagne is no longer offered as a prize. I suggest direct lobbying of Guy Slater (lead developer on Exonerate) if a revivification of this tradition is sought.

Exonerate is an amazing program, btw, and a close cousin of Dart.

http://www.ebi.ac.uk/~guy/exonerate/
From: Ian H. <ih...@bd...> - 2005-09-08 01:13:56

....DART news....
....Berkeley....
....September 2005....

(Scroll down for the DART website at the foot of this email.)

Hello there, fellow evolutionary hackers, RNA tinkerers and phylo-enthusiasts!

This is the latest missive on dart-announce, the low-volume mailing list describing developments with the DART package, your FAVOURITE software for evolutionary bioinformatics... er, OK, my British upbringing demands more disclaimers in that sentence. How about: your favourite software for FAST evolutionary bioinformatics... counting only open source programs WRITTEN IN C++...... in Berkeley........ by me. yeah, that shouldn't offend anyone. (WE RULE!!!!) (*ahem*)

If nothing else, dart SURELY must be your favourite GPL'd software that comes with an opportunity to SCORE some FREE BOOZE[*]! (Let's see if that makes it past the spam filters.)

That's right:-- the 2004 bug hunt, despite being prolonged well into 2005, has finally reached its inevitable conclusion. Carolin Kosiol, working with Nick Goldman's group, found some cool sparse data glitches while experimenting with "xrate" on 61-state codon models. Yes, that means rate matrices with an almighty 61*61 parameters! (Give or take a factor of 2 due to reversibility constraints.)

Hopefully the problems disclosed by Carolin et al have now been fixed: both in the sense that we have patched the code (adding pseudocounts and various checks for robustness), and in the sense that we have effectively SILENCED Carolin and crew by sending them a crate of 12 bottles of some of the finest wines produced in my old stomping ground, the Yarra Valley in Victoria, Australia. (Remember that old Monty Python sketch about Australian wines: "they really open up the sluice-gates at both ends".) So that should be the last we hear from Carolin and Nick for a while (well, that's the theory anyway).

Anyone else who reckons they're hard enough, come and have a go! Best bug found by summer 2006 wins the discoverer a similarly punishing alcoholic stupor.

(A shout out is due to Sam Griffiths-Jones and Alex Bateman, of RFAM, who came a close second with their heavy-duty testing of the StemLoc program; much valuable feedback, thanks guys. Maybe next year the booze will be yours...)

[*] I believe that Ewan Birney has stopped doling out champagne as a bug prize for his GeneWise programs, though I'm prepared to be corrected. Ewan?

BTW, lest it escape anyone's attention, the point of doing this kind of bughunting prize (aka "pandering to beta testers") is that, as a result, we have VERY FEW BUGS. For example: to my knowledge, dart programs never segfault. Ever. Some of the more bloated RNA alignment algorithms have been known to abort when they run out of memory, but that's par for the course. Dart is _very_ robust software and we intend to keep it that way (and bribe off any dissenters with fine wines, muahahahaa).

Moving on: XRate, the abovementioned software for estimating instantaneous rate matrices, continues to be our most popular tool (but closely followed by StemLoc; see below). This year, Pete Klosterman has worked to adapt XRate to estimate *irreversible* rate matrices, using a generalisation of the eigenspace-EM algorithm that powers the reversible version. (Much credit should also go to Gerton Lunter, Robert Davies and others who were kind enough to contribute asymmetric linear algebra codes.)

XRate has also been adapted to incorporate arbitrary stochastic grammars (including algorithms like Thorne-Goldman-Jones for predicting secondary structure of proteins, and Knudsen-Hein for that of RNA). We have also implemented the basic Siepel-Haussler technique for context-sensitive substitutions (e.g. incorporating CpG effects, or basepair stacking in RNA; irreversibility can be an important consideration in this sort of model). As a result, we now have spin-off programs for working with evolutionary or "phylo"-HMMs/SCFGs (XFold/XProt, for RNA/protein, respectively).

XRate (and its spinoffs XFold and XProt) have been applied in a number of "big genomics" projects this year, including ENCODE and analysis of the 12 fruitfly genomes. We hope to expand this.

In other developments, the Handel program (MCMC sampling of multiple sequence alignments) can now be used together with elaborate phylo-grammar programs (like XFold and XProt) to sample from the likelihood function of such phylo-grammars, using a Metropolis-like accept/reject scheme. That is, you can pipe the sampled alignments from Handel directly into XFold, and thereby explore XFold's alignment space using the Handel sampling engine. We're still testing this capability but expect it to be very powerful.

As always, you can download the code from SourceForge (advance warning: we may move the CVS repository to our own machines in the near future; anonymous CVS access will still be offered).

Our RNA-oriented tools are also still going strong. Development continues on "evoldoer" (our RNA evolutionary aligner) and "StemLoc" (for RNA multiple alignment). Our current favoured way to proceed with RNA multiple alignment is an MCMC-type approach. We're about to invest in a machine with 32GB of RAM (currently we're limited to 8GB) specifically so we can play around with some of these high-memory RNA algorithms. Keep watching this space.

Anyway, enough rambling from me. Keep hacking away. See you all at ISMB in Brazil, or Benasque'06, or some such cool compbio venue.

Ian Holmes, ihh at berkeley dot edu
UC Berkeley, Dept of Bioengineering
September 7, 2005

DART website: http://biowiki.org/dart
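
The "Metropolis-like accept/reject scheme" mentioned in the message above can be illustrated with a generic independence sampler. This is only a sketch of the general technique, not DART code: the message does not say exactly how Handel and XFold are coupled, and every name below (`metropolis_independence`, `propose`, `log_proposal`, `log_target`, the toy alignments) is hypothetical. The assumption is that one program supplies candidate alignments together with their log-probability under the proposal, and another supplies the log-likelihood under the target model.

```python
import math
import random

def metropolis_independence(propose, log_proposal, log_target, n_steps, rng):
    """Sample from exp(log_target) using proposals drawn independently of
    the current state (a Metropolis-Hastings independence sampler)."""
    current = propose(rng)
    # Importance weight (target / proposal) of the current state, in log space.
    current_weight = log_target(current) - log_proposal(current)
    samples = []
    for _ in range(n_steps):
        candidate = propose(rng)
        candidate_weight = log_target(candidate) - log_proposal(candidate)
        # Accept with probability min(1, candidate weight / current weight).
        log_accept = candidate_weight - current_weight
        if log_accept >= 0 or rng.random() < math.exp(log_accept):
            current, current_weight = candidate, candidate_weight
        samples.append(current)
    return samples

# Toy stand-ins (assumptions): two candidate "alignments", a uniform
# proposal, and a target model that prefers the first alignment 4:1.
ALIGNMENTS = ("aln1", "aln2")
TARGET = {"aln1": 0.8, "aln2": 0.2}

def propose(rng):
    return rng.choice(ALIGNMENTS)

def log_proposal(alignment):
    return math.log(1.0 / len(ALIGNMENTS))

def log_target(alignment):
    return math.log(TARGET[alignment])

if __name__ == "__main__":
    draws = metropolis_independence(propose, log_proposal, log_target,
                                    20000, random.Random(1))
    print(draws.count("aln1") / len(draws))
```

In the toy example the long-run fraction of `aln1` should approach the target probability of 0.8, since the acceptance rule corrects for the mismatch between proposal and target.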
From: Ian H. <ih...@bd...> - 2004-07-24 22:46:02

Hi all,

I promised this would be a low-volume mailing list, but two messages per year might be a bit slack... so to compensate, here is a newsletter.

--------------------------------------------------------------------------
GCC 3.4 compatibility

Release 0.2 of Dart is on its way, but I haven't yet found time to make a tarball (or update the tutorial), and I keep adding little things. The code currently in the repository works with the latest version of gcc (3.4.1) and also is backward-compatible with gcc 3.3.* (as used in the pseudo-parallel universe of Mac OS X).

--------------------------------------------------------------------------
RNA multiple alignment

More excitingly, stemloc does RNA multiple alignment! how cool is that? As you, discerning User, have come to expect, the code is blindingly fast. It errs on the side of speed (& low memory) versus sensitivity. You can make it more sensitive by playing with the "-nf" switch.

HOWEVER: It's quite easy to max out the memory on your machine. This means you need MORE RAM! There may be cooler hackier ways to constrain the Sankoff algorithm than the ones DART currently knows about. But, ultimately, none of us can shirk the duty of buying a 64-bit box ;-)

--------------------------------------------------------------------------
Today's DART top tip - logging

One quick-and-dirty way to get more info about what's going on inside DART is to access the built-in logging diagnostics -- pretty much every debugging output method I've ever written is accessible by using the right "-log ..." directive. Type "xrate -logtags" or "xrate -logtaginfo" to get a list (substitute e.g. "stemloc" for "xrate" depending on what program you're using).

As an example, the relevant logtags for "xrate" begin with "RATE_EM" & you can grep for them in the file dart/src/hsm/em_matrix.cc e.g. "-log RATE_EM_STATS" should give you the alignment log-likelihood, and also the statistics $u_{ij}$ and $w_i$ estimated during the E-step. The output is to stderr or a logfile, in a sort of dismal apology for XML.

Just "-log N" where N is a number will give you a bunch of log messages about everything. The default N is 9; set N to lower for more info (9 is the most Unix-y, i.e. minimal output; I usually run at about 6 for standard usage, or 3 or lower for debugging.)

--------------------------------------------------------------------------
Current top 5 on the DART wishlist

* Telegraph (machine-readable format for stochastic grammars, plus compiler)
* Irreversible substitution matrices in "xrate" (and other programs)
* Importance-sampling for "tkfalign" (i.e. using TKF as proposal function)
* Generic "Evolutionary HMMs" (as described in Holmes, ISMB 2003)
* Simple "Evolutionary SCFGs" (as described by e.g. Knudsen & Hein)

--------------------------------------------------------------------------
DART bug hunt

The 2004 DART bug hunt is a late starter but it is ON! The best bug mailed to me before 12:01am on January 1, 2005 will win a DART-related prize that will *astound your friends*! It could be a dart rifle. It could be a Dart Blaster pen. It could be my 1966 Dodge Dart, resurrected from its dusty junkshop grave in Coalinga, halfway down the California Interstate 5. What is certain is that it will KNOCK YOUR SOCKS OFF.

Current best contender is the weird "xrate" normalization behaviour observed by Carolin Kosiol at the EBI. But YOU CAN FIND SOMETHING WEIRDER! Come on, do you really think *I* would write something bug-free? Have you SEEN how long Dart takes to compile? There must be bugs the size of Microsoft Word in that thing! Entries on a postcard or email.

(The excellent third-party software "newmat" (Robert Davies) & "weighbor" (Bill Bruno), distributed with DART, is defined to be bug-free for the purposes of this contest.)

--------------------------------------------------------------------------
Ian Holmes
Dept of Bioengineering
465 Evans Hall, UC Berkeley
(510) 643-2393
ih...@be...
--------------------------------------------------------------------------
Contributions to DART always welcome. Email me for CVS developer access...
--------------------------------------------------------------------------
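
For readers wondering what the E-step statistics $u_{ij}$ and $w_i$ in the logging tip above are typically used for, here is a minimal sketch. It assumes the usual convention for EM on continuous-time Markov chains, which the message itself does not spell out: $u_{ij}$ is the expected number of i-to-j substitutions and $w_i$ is the expected time spent in state i. Under that assumption the unconstrained M-step sets each rate to $u_{ij} / w_i$. The function name `m_step_rates` is hypothetical, and xrate's actual update (for example under reversibility constraints) may differ.

```python
def m_step_rates(u, w):
    """Unconstrained M-step for a continuous-time Markov chain.

    u[i][j] -- expected number of i->j substitutions (assumed meaning)
    w[i]    -- expected time spent in state i (assumed meaning)
    Returns a rate matrix whose rows sum to zero.
    """
    n = len(w)
    rates = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Leave the rate at zero if state i was never occupied.
            if i != j and w[i] > 0:
                rates[i][j] = u[i][j] / w[i]
        # Diagonal entry is minus the total rate of leaving state i.
        rates[i][i] = -sum(rates[i][j] for j in range(n) if j != i)
    return rates

if __name__ == "__main__":
    # Toy two-state example with made-up statistics.
    print(m_step_rates([[0.0, 2.0], [1.0, 0.0]], [4.0, 2.0]))
```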
From: Ian H. <ih...@bd...> - 2004-03-23 10:52:30

Dear Rodrigo,

as mentioned in a previous mail I'm in the middle of a move this week & so sorry if my answers are terse...

> have more than one letter). That means that interclass rates have to be
> zero. That should be enforced by the --nointer option. Using this option
> lowers the rate, without setting it to zero, which still results in
> several mixed-class columns. So:
>
> a) How do I make it be zero? - I have tried providing an initial rate
> matrix with zeros but it won't converge. Small values always end up with
> mixed-class columns.

This is a bug that seems to result from the way xrate treats mixture models (with no inter-class substitution) as a special case of dynamic class models (where the class can be substituted). The long-term fix for this will be in the next version of xrate, which will include much more flexible mixture models. Until then, it should certainly be possible to patch the code so that the inter-class rates are forced to zero if "--nointer" is specified. I can't do this until next week, but if you want to try hacking the code yourself, the relevant file is "dart/src/hsm/em_matrix.cc"

> b) How can I access the likelihoods of the models? For instance, to
> choose between a 3 hidden class model and 2 hidden class model?

Set the log level to 7 or lower (see my previous email), pending a more user-friendly way of doing this...

> c) What do the capital and lower-case class assignments mean?

(stalling for time) have you checked that your input sequence data aren't mixed-case? Most dart sequence I/O code preserves the case.

Best wishes,

Ian
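
The idea of forcing inter-class rates to zero, as discussed in the reply above, can be illustrated outside of DART. The sketch below is not the patch to em_matrix.cc; it assumes a hypothetical state layout in which each state is a (hidden class, residue) pair indexed as `class * n_residues + residue`, and the function name `zero_interclass_rates` is invented for illustration.

```python
def zero_interclass_rates(rate, n_classes, n_residues):
    """Return a copy of `rate` with every class-changing rate set to zero.

    States are assumed (hypothetically) to be ordered so that state
    index = class * n_residues + residue. Diagonals are recomputed so
    that each row still sums to zero.
    """
    n = n_classes * n_residues
    patched = [row[:] for row in rate]
    for src in range(n):
        for dst in range(n):
            # Integer division by n_residues recovers the hidden class.
            if src != dst and src // n_residues != dst // n_residues:
                patched[src][dst] = 0.0
        patched[src][src] = -sum(patched[src][d] for d in range(n) if d != src)
    return patched

if __name__ == "__main__":
    # Toy model: 2 hidden classes x 2 residues, all off-diagonal rates 0.1.
    toy = [[-0.3 if i == j else 0.1 for j in range(4)] for i in range(4)]
    for row in zero_interclass_rates(toy, 2, 2):
        print(row)
```

After patching, the matrix is block-diagonal: each class keeps its own within-class substitution rates, and no state can move to a different class.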
From: Rodrigo Gouveia-O. <ro...@cb...> - 2004-03-20 19:55:30

Dear Users,

I am using xrate to estimate the substitution rate matrices with several hidden classes. Then I use tkfemit to simulate data from these matrices. However, I need each site in the alignment to stick to one class (that is, no column in the alignment given by the inferclass option should have more than one letter). That means that interclass rates have to be zero. That should be enforced by the --nointer option. Using this option lowers the rate, without setting it to zero, which still results in several mixed-class columns. So:

a) How do I make it be zero? - I have tried providing an initial rate matrix with zeros but it won't converge. Small values always end up with mixed-class columns.

b) How can I access the likelihoods of the models? For instance, to choose between a 3 hidden class model and 2 hidden class model?

c) What do the capital and lower-case class assignments mean?

If you can help me with at least one of these questions, please do!

Thank you all,

Rodrigo
PhD student at www.cbs.dtu.dk