denovoassembler-devel Mailing List for Ray: scalable assembly
Ray -- Parallel genome assemblies for parallel DNA sequencing
Brought to you by:
sebhtml
From: Boisvert, S. <boi...@an...> - 2014-08-04 21:49:23
|
Dear de novo assembler users:

Just to be open, I wish to inform you that I have started working on a new
distributed assembler focusing on metagenomics. It's called spate [spate].

spate runs on top of the thorium distributed actor engine and is built using
reusable actor components from biosal (the biological sequence actor library).

Here are the reasons:

1. After 4 years working on Ray and RayPlatform, I was beginning to be bored.
   Ray uses MPI as its only programming model, which can be hard to maintain
   in the long run. The design of RayPlatform is somewhat limited.

2. Rick Stevens (my new supervisor) is very interested in better ways of
   expressing distributed algorithms. I am also interested in this. I think
   that the actor model does exactly that and is just superior to MPI in
   every way.

3. C (C 1999) is a technical requirement for the SAL project at Argonne for
   "vendor-friendliness" so that they can eventually provide better
   implementations for specific kernels (in biosal, these kernels are actors).
   Ray and RayPlatform were implemented in C++. spate, thorium, and biosal
   genomics are all in C 1999.

4. Ray, ABySS, Kiki, PASHA, YAGA, and SWAP (those are the distributed
   assemblers I know about) are not scalable enough to tackle the DOE soil
   Grand Challenge [GrandChallenge].

The good news is that spate takes the same arguments, so the learning curve
should be minimal for those who wish to try it.

To get started:

    git clone https://github.com/sebhtml/biosal.git
    cd biosal
    make
    ./applications/spate_metagenome_assembler/spate -help

Thank you!

---

[spate] https://github.com/sebhtml/biosal
[GrandChallenge] http://dskernel.blogspot.com/2014/08/the-public-datasets-from-doejgi-great.html
|
From: Boisvert, S. <boi...@an...> - 2014-06-25 16:01:54
|
> ________________________________________
> From: Maxime Déraspe [max...@gm...]
> Sent: Wednesday, June 25, 2014 6:57 AM
> To: den...@li...
> Subject: Re: [Denovoassembler-devel] Ray app in BaseSpace
>
> Hi Sebastien,
>
> thank you for providing me this information.
>
> The RayCommand goes like this :
>
> mpiexec -n 32 Ray \
> -k \
> 31 \
> -p \
> /data/input/samples/37052/data/intensities/basecalls/s_G1_L001_R1_001.fastq.1.gz.fastq.gz \
> /data/input/samples/37052/data/intensities/basecalls/s_G1_L001_R2_001.fastq.1.gz.fastq.gz \
> -p \
> /data/input/samples/37052/data/intensities/basecalls/s_G1_L001_R1_002.fastq.1.gz.fastq.gz \
> /data/input/samples/37052/data/intensities/basecalls/s_G1_L001_R2_002.fastq.1.gz.fastq.gz \
> -o \
> /data/output/appresults/11507497/37052
>
> I will deactivate the SM btl, would it be better to use MPICH in case
> this doesn't work ?

No. That won't change anything I think. Both are great MPI implementations.

> Maxime
>
> On 06/25/2014 03:49 PM, Boisvert, Sebastien wrote:
> >
> > [I CC'ed devel mailing lists]
> >
> > Can you provide the full command (cat RayCommand) ?
> >
> > I have 2 ideas to troubleshoot (waiting 17 hours is not a good approach
> > to troubleshooting).
> >
> > # Idea 1
> >
> > Disabling the SM btl in Open-MPI's MCA (assuming you are using Open-MPI and not
> > MPICH).
> >
> > Before:
> > mpiexec -n <ranks> Ray -k <k> -p <file1> <file2>
> >
> > After:
> > mpiexec -n <ranks> --mca btl ^sm Ray -k <k> -p <file1> <file2>
> >
> >
> > # Idea 2
> >
> > RayPlatform has built-in debugger (it is mentioned in the doc).
> >
> > Before:
> > mpiexec -n <ranks> Ray -k <k> -p <file1> <file2>
> >
> > After:
> > mpiexec -n <ranks> Ray -debug -k <k> -p <file1> <file2>
> >
> > This will give you plenty of information in the standard output.
> >
> >
> >
> > My bet: SM in the cloud is slow / buggy and there is a bug in the runtime there
> > (below RayPlatform).
|
From: Maxime D. <max...@gm...> - 2014-06-25 15:59:13
|
Hi Sebastien,

thank you for providing me this information.

The RayCommand goes like this :

mpiexec -n 32 Ray \
-k \
31 \
-p \
/data/input/samples/37052/data/intensities/basecalls/s_G1_L001_R1_001.fastq.1.gz.fastq.gz \
/data/input/samples/37052/data/intensities/basecalls/s_G1_L001_R2_001.fastq.1.gz.fastq.gz \
-p \
/data/input/samples/37052/data/intensities/basecalls/s_G1_L001_R1_002.fastq.1.gz.fastq.gz \
/data/input/samples/37052/data/intensities/basecalls/s_G1_L001_R2_002.fastq.1.gz.fastq.gz \
-o \
/data/output/appresults/11507497/37052

I will deactivate the SM btl, would it be better to use MPICH in case
this doesn't work ?

Maxime

On 06/25/2014 03:49 PM, Boisvert, Sebastien wrote:
>
> [I CC'ed devel mailing lists]
>
> Can you provide the full command (cat RayCommand) ?
>
> I have 2 ideas to troubleshoot (waiting 17 hours is not a good approach
> to troubleshooting).
>
> # Idea 1
>
> Disabling the SM btl in Open-MPI's MCA (assuming you are using Open-MPI and not
> MPICH).
>
> Before:
> mpiexec -n <ranks> Ray -k <k> -p <file1> <file2>
>
> After:
> mpiexec -n <ranks> --mca btl ^sm Ray -k <k> -p <file1> <file2>
>
>
> # Idea 2
>
> RayPlatform has built-in debugger (it is mentioned in the doc).
>
> Before:
> mpiexec -n <ranks> Ray -k <k> -p <file1> <file2>
>
> After:
> mpiexec -n <ranks> Ray -debug -k <k> -p <file1> <file2>
>
> This will give you plenty of information in the standard output.
>
>
>
> My bet: SM in the cloud is slow / buggy and there is a bug in the runtime there
> (below RayPlatform).
|
From: Boisvert, S. <boi...@an...> - 2014-06-25 15:49:34
|
[I CC'ed devel mailing lists]

Can you provide the full command (cat RayCommand) ?

I have 2 ideas to troubleshoot (waiting 17 hours is not a good approach
to troubleshooting).

# Idea 1

Disabling the SM btl in Open-MPI's MCA (assuming you are using Open-MPI and not
MPICH).

Before:
mpiexec -n <ranks> Ray -k <k> -p <file1> <file2>

After:
mpiexec -n <ranks> --mca btl ^sm Ray -k <k> -p <file1> <file2>

# Idea 2

RayPlatform has built-in debugger (it is mentioned in the doc).

Before:
mpiexec -n <ranks> Ray -k <k> -p <file1> <file2>

After:
mpiexec -n <ranks> Ray -debug -k <k> -p <file1> <file2>

This will give you plenty of information in the standard output.

My bet: SM in the cloud is slow / buggy and there is a bug in the runtime there
(below RayPlatform).
|
From: Sébastien B. <seb...@ul...> - 2014-05-16 14:13:43
|
________________________________
> From: cp...@cr...
> To: seb...@ul...
> CC: jac...@cr...
> Date: Fri, 16 May 2014 08:05:55 -0400
> Subject: Quick update
>
> Hi Sebastien,
>
> OK, thanks to your help I have Ray running on the XC30 here at Cray.
> As I mentioned yesterday I submitted HiSeq2500 with a limit of 12 hr
> and using slightly more than 300 cores. This morning it ran out of time.
> Now, while I start becoming familiar with Titan I’ll continue running
> on the XC30 here. Do you have any feeling for how long it is going to
> take?

What I found while experimenting on Titan was that the I/O code that loads
sequences from the file system into memory is a bottleneck. Another bottleneck
is the code that merges the assembled sequences.

> Should we try to run with the restart option? I need to read the
> Ray manual to see if you have any info about that option.

The option is

    -read-write-checkpoints Checkpoints

It saves the progress of the calculation at key steps.

>
> In addition, I built the code with a profiler to start familiarizing
> myself with the code and try to identify where the bottlenecks are.

OK

You can also send the signal SIGUSR1 to a Ray process to activate RayPlatform
debug mode. This option is available by default, but I don't know if you can
actually send a signal to a running process on Cray compute nodes.

>
> Cheers
>
> carlos
|
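For readers who have not used signals this way, here is a minimal sketch of how a process can flip a debug flag when it receives SIGUSR1. It is not RayPlatform's code: the names g_debugMode, onUserSignal, installDebugToggle and debugToggleWorks are made up for illustration, and the sketch assumes a POSIX system (it uses sigaction).

```cpp
#include <cassert>
#include <csignal>
#include <signal.h>  // sigaction, sigemptyset (POSIX)

// Hypothetical debug flag; the thread does not show RayPlatform's real switch.
volatile std::sig_atomic_t g_debugMode = 0;

// A signal handler should only do async-signal-safe work,
// so this one just flips a flag that a main loop could poll.
extern "C" void onUserSignal(int) {
	g_debugMode = g_debugMode ? 0 : 1;
}

// Register the handler for SIGUSR1. Returns true on success.
bool installDebugToggle() {
	struct sigaction action = {};
	action.sa_handler = onUserSignal;
	sigemptyset(&action.sa_mask);
	action.sa_flags = SA_RESTART;
	return sigaction(SIGUSR1, &action, nullptr) == 0;
}

// Self-check: raise SIGUSR1 from inside the process (standing in for
// `kill -USR1 <pid>` typed in another shell) and report whether the flag flipped.
bool debugToggleWorks() {
	const bool installed = installDebugToggle();
	assert(installed);
	(void) installed;
	const int before = g_debugMode;
	std::raise(SIGUSR1);
	return g_debugMode != before;
}
```

On a cluster, the open question from the message above remains: whether the batch system lets you deliver a signal to a process on a compute node at all.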
From: Sébastien B. <seb...@ul...> - 2014-03-28 15:57:55
|
Hi,

On 28 March 2014 10:47, Gunning, Don [don...@in...] wrote:
> To: Sébastien Boisvert
> Subject: Ray
>
> Sebastien
>
> Can you provide the status of Ray development?

Ray is a community project, although I am the main developer / maintainer.
I am defending my thesis in April and starting an appointment at Argonne, also in April 2014.

Obviously Ray is a never-ending project. You can find "issues" on GitHub at
https://github.com/sebhtml/ray/issues.

>
> Intel is doing quite a bit in life sciences and we would like to understand the status of your project.

The project has some relations with the industry:

So far, Cray, Inc. has worked with us for testing Ray on Cray hardware.

=> http://www.cray.com/Assets/PDF/products/xe/XE6-ULAVAL-Ray-0413.pdf

I was also a beta user on SOSCIP's IBM Blue Gene/Q, testing Ray by assembling a tree genome (white spruce).

SOSCIP -> http://soscip.org/
SciNet -> http://www.scinethpc.ca/

>
> In addition, could you let me know the status of Intel MPI and Ray?

I have never tested it myself, but people on the mailing list use it.

Ray is compliant with the MPI standard ( http://www.mpi-forum.org/ ).
Since Intel MPI is derived from MPICH, and MPICH is compliant with the MPI standard, Ray should work with Intel MPI.

>
> Regards
>
> Don
>
> Don Gunning
> Software Program Manager
> Technical computing group
> Developer Product Division
> Intel Corporation
> 1906 Fox Dr
> Champaign Il 61820
> 217 403 4213
>
|
From: Sébastien B. <se...@bo...> - 2014-03-13 17:52:10
|
Hey Maxime,

I made some changes:

Surveyor: add documentation in manual for the kmer matrix
Surveyor: fix some copyrights
Surveyor: fix some indentation
Surveyor: fix indentation in printMatrixHeader
|
From: Sébastien B. <seb...@ul...> - 2014-03-13 17:14:40
|
Done!

On 13 March 2014 07:01, Maxime Déraspe [max...@gm...] wrote:
> To: Sébastien Boisvert; Maxime Deraspe
> Cc: den...@li...
> Subject: Re: [Denovoassembler-devel] RE: surveyor negative loaded sequence
>
> Hi Seb,
>
> I fixed an endless loop that happened in some cases in
> GenomeAssemblyReader when scaffolds were given as input instead of contigs.
>
> I also renamed the option that writes the kmer matrix to a file: it is now
> "-write-kmer-matrix" (it used to be "-run-kmer-matrix").
>
> Please pull from :
>
> https://github.com/Zorino/ray.git
>
> fix-assembly-reader
>
> Cheers,
>
> Maxime
>
>
> On 03/12/2014 04:13 PM, Sébastien Boisvert wrote:
>> On 11 March 2014 05:58, Maxime Deraspe [ma...@de...] wrote:
>>> To: Sébastien Boisvert
>>> Subject: surveyor negative loaded sequence
>>>
>>> Hi Seb,
>>>
>>> I was wondering how the storekeeper could load a
>>> negative number of sequences.
>> The StoreKeeper actor is designed to only manage kmers in the distributed de Bruijn graph.
>> It does not load sequences from files directly. In fact, StoreKeeper actors
>> receive their payload from the aggregator (called Coa
>>
>>> /actors/1668 -> loaded -1246000000 sequences
>>> /actors/1768 -> loaded 1739000000 sequences
>>> /actors/1756 -> loaded -1138000000 sequences
>>> /actors/1807 -> loaded -2084000000 sequences
>>
>> If you look in the code, it is not a StoreKeeper actor that loads sequences from files and
>> that prints the "loaded XXX sequences" lines.
>>
>> [boiseb01@ls30 Surveyor]$ grep loaded *.cpp|grep sequences
>> GenomeAssemblyReader.cpp: cout << " loaded " << m_loaded << " sequences" << endl;
>> GenomeGraphReader.cpp: cout << " loaded " << m_loaded << " sequences" << endl;
>>
>>>
>>> Do you have any idea what the problem could be?
>> It is in GenomeAssemblyReader.cpp or in GenomeGraphReader.cpp. GenomeGraphReader.cpp loads a graph file
>> and I tested it thoroughly.
>>
>> My guess would be that it is GenomeAssemblyReader.cpp, or code being used by this class.
>> More specifically, maybe you should look in SequenceKmerReader.cpp.
>>
>> In that file, SequenceKmerReader::hasAnotherKmer returns the value of m_hasKmerLeft.
>> The problem is presumably around that.
>>
>> What happens when your buffer has k symbols 'N' and you reached eof ?
>>
>> m_hasKmerLeft changes value on lines 67 and 118 in your code.
>>
>>
>> In your case, there is one thing to do to go forward:
>>
>>
>> Write a unit test for SequenceKmerReader.cpp.
>>
>>
>> You can add your unit test in https://github.com/sebhtml/Ray-TestSuite/tree/master/unit-tests if you want to.
>>
>>
>>
>>> Also, I think I compiled Ray incorrectly for this run; I get these
>>> errors on stderr:
>>>
>>> Ray: /opt/pgi/linux86-64/11.8/libso/libnuma.so.1: no version information
>>> available (required by /opt/mpi/gcc/openmpi-1.6.4/lib64/libmpi.so.1)
>>>
>>> Maxime
|
From: Maxime D. <max...@gm...> - 2014-03-13 15:02:23
|
Hi Seb,

I fixed an endless loop that happened in some cases in
GenomeAssemblyReader when scaffolds were given as input instead of contigs.

I also renamed the option that writes the kmer matrix to a file: it is now
"-write-kmer-matrix" (it used to be "-run-kmer-matrix").

Please pull from :

https://github.com/Zorino/ray.git

fix-assembly-reader

Cheers,

Maxime


On 03/12/2014 04:13 PM, Sébastien Boisvert wrote:
> On 11 March 2014 05:58, Maxime Deraspe [ma...@de...] wrote:
>> To: Sébastien Boisvert
>> Subject: surveyor negative loaded sequence
>>
>> Hi Seb,
>>
>> I was wondering how the storekeeper could load a
>> negative number of sequences.
> The StoreKeeper actor is designed to only manage kmers in the distributed de Bruijn graph.
> It does not load sequences from files directly. In fact, StoreKeeper actors
> receive their payload from the aggregator (called Coa
>
>> /actors/1668 -> loaded -1246000000 sequences
>> /actors/1768 -> loaded 1739000000 sequences
>> /actors/1756 -> loaded -1138000000 sequences
>> /actors/1807 -> loaded -2084000000 sequences
>
> If you look in the code, it is not a StoreKeeper actor that loads sequences from files and
> that prints the "loaded XXX sequences" lines.
>
> [boiseb01@ls30 Surveyor]$ grep loaded *.cpp|grep sequences
> GenomeAssemblyReader.cpp: cout << " loaded " << m_loaded << " sequences" << endl;
> GenomeGraphReader.cpp: cout << " loaded " << m_loaded << " sequences" << endl;
>
>>
>> Do you have any idea what the problem could be?
> It is in GenomeAssemblyReader.cpp or in GenomeGraphReader.cpp. GenomeGraphReader.cpp loads a graph file
> and I tested it thoroughly.
>
> My guess would be that it is GenomeAssemblyReader.cpp, or code being used by this class.
> More specifically, maybe you should look in SequenceKmerReader.cpp.
>
> In that file, SequenceKmerReader::hasAnotherKmer returns the value of m_hasKmerLeft.
> The problem is presumably around that.
>
> What happens when your buffer has k symbols 'N' and you reached eof ?
>
> m_hasKmerLeft changes value on lines 67 and 118 in your code.
>
>
> In your case, there is one thing to do to go forward:
>
>
> Write a unit test for SequenceKmerReader.cpp.
>
>
> You can add your unit test in https://github.com/sebhtml/Ray-TestSuite/tree/master/unit-tests if you want to.
>
>
>
>> Also, I think I compiled Ray incorrectly for this run; I get these
>> errors on stderr:
>>
>> Ray: /opt/pgi/linux86-64/11.8/libso/libnuma.so.1: no version information
>> available (required by /opt/mpi/gcc/openmpi-1.6.4/lib64/libmpi.so.1)
>>
>> Maxime
|
From: Sébastien B. <seb...@ul...> - 2014-03-12 16:13:11
|
On 11 March 2014 05:58, Maxime Deraspe [ma...@de...] wrote:
> To: Sébastien Boisvert
> Subject: surveyor negative loaded sequence
>
> Hi Seb,
>
> I was wondering how the storekeeper could load a
> negative number of sequences.

The StoreKeeper actor is designed to only manage kmers in the distributed de Bruijn graph.
It does not load sequences from files directly. In fact, StoreKeeper actors
receive their payload from the aggregator (called Coa

>
> /actors/1668 -> loaded -1246000000 sequences
> /actors/1768 -> loaded 1739000000 sequences
> /actors/1756 -> loaded -1138000000 sequences
> /actors/1807 -> loaded -2084000000 sequences

If you look in the code, it is not a StoreKeeper actor that loads sequences from files and
that prints the "loaded XXX sequences" lines.

[boiseb01@ls30 Surveyor]$ grep loaded *.cpp|grep sequences
GenomeAssemblyReader.cpp: cout << " loaded " << m_loaded << " sequences" << endl;
GenomeGraphReader.cpp: cout << " loaded " << m_loaded << " sequences" << endl;

>
>
> Do you have any idea what the problem could be?

It is in GenomeAssemblyReader.cpp or in GenomeGraphReader.cpp. GenomeGraphReader.cpp loads a graph file
and I tested it thoroughly.

My guess would be that it is GenomeAssemblyReader.cpp, or code being used by this class.
More specifically, maybe you should look in SequenceKmerReader.cpp.

In that file, SequenceKmerReader::hasAnotherKmer returns the value of m_hasKmerLeft.
The problem is presumably around that.

What happens when your buffer has k symbols 'N' and you reached eof ?

m_hasKmerLeft changes value on lines 67 and 118 in your code.

In your case, there is one thing to do to go forward:

Write a unit test for SequenceKmerReader.cpp.

You can add your unit test in https://github.com/sebhtml/Ray-TestSuite/tree/master/unit-tests if you want to.

>
> Also, I think I compiled Ray incorrectly for this run; I get these
> errors on stderr:
>
> Ray: /opt/pgi/linux86-64/11.8/libso/libnuma.so.1: no version information
> available (required by /opt/mpi/gcc/openmpi-1.6.4/lib64/libmpi.so.1)
>
> Maxime
|
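To make the "k symbols 'N' at eof" question concrete, here is a small sketch of the kind of unit test suggested above. It does not use Ray's SequenceKmerReader: KmerScanner, countKmers and asSigned32 are hypothetical stand-ins, and the reader works on an in-memory string rather than a file. The asSigned32 helper rests on a further assumption that the thread does not confirm, namely that the negative "loaded" values come from a very large count being stored in a signed 32-bit integer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical stand-in for a k-mer reader; this is not Ray's SequenceKmerReader.
class KmerScanner {
public:
	KmerScanner(const std::string & sequence, std::size_t k)
		: m_sequence(sequence), m_k(k), m_position(0), m_hasKmerLeft(false) {
		assert(k > 0);
		advance();
	}

	bool hasAnotherKmer() const {
		return m_hasKmerLeft;
	}

	std::string nextKmer() {
		assert(m_hasKmerLeft);
		std::string kmer = m_sequence.substr(m_position, m_k);
		m_position++;
		advance();
		return kmer;
	}

private:
	static bool isNucleotide(char symbol) {
		return symbol == 'A' || symbol == 'C' || symbol == 'G' || symbol == 'T';
	}

	// Move to the next window of k valid symbols. If the input ends first
	// (for instance because it ends with a run of 'N'), clear m_hasKmerLeft
	// so that the caller's loop terminates.
	void advance() {
		m_hasKmerLeft = false;
		while (m_position + m_k <= m_sequence.size()) {
			std::size_t lastBad = m_k; // m_k means "no invalid symbol in this window"
			for (std::size_t i = 0; i < m_k; i++) {
				if (!isNucleotide(m_sequence[m_position + i])) {
					lastBad = i;
				}
			}
			if (lastBad == m_k) {
				m_hasKmerLeft = true;
				return;
			}
			m_position += lastBad + 1; // skip past the invalid symbol
		}
	}

	std::string m_sequence;
	std::size_t m_k;
	std::size_t m_position;
	bool m_hasKmerLeft;
};

// Count k-mers with a 64-bit unsigned counter, which cannot go negative.
std::uint64_t countKmers(const std::string & sequence, std::size_t k) {
	KmerScanner scanner(sequence, k);
	std::uint64_t loaded = 0;
	while (scanner.hasAnotherKmer()) {
		scanner.nextKmer();
		loaded++;
	}
	return loaded;
}

// What a count looks like once it is squeezed into a signed 32-bit integer
// (assumes the usual two's-complement wrap-around).
std::int32_t asSigned32(std::uint64_t count) {
	return static_cast<std::int32_t>(static_cast<std::uint32_t>(count));
}
```

A test for the case raised in the message would then check that countKmers("ACGTNNN", 3) stops after the two valid k-mers and that countKmers("NNN", 3) returns 0 instead of looping. Under the 32-bit assumption, asSigned32(3048967296) gives -1246000000, the first value in the log above.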
From: Sébastien B. <se...@bo...> - 2014-03-07 19:35:24
|
Merge commit: https://github.com/sebhtml/ray/commit/be653216a634b28693071ec8b7c39073324510b6

I found a minor issue: You are calling createDirectory in KmerMatrixOwner, but the directory already exists.
I'll fix that up.

Good work !

----------------------------------------
> Date: Fri, 7 Mar 2014 10:57:14 +0000
> From: max...@gm...
> To: se...@bo...; den...@li...
> Subject: Re: [Denovoassembler-devel] git diff kmersmatrix branch
>
> Hi Seb,
>
> You can find these modifications in the same branch as before :
>
> https://github.com/Zorino/ray.git
>
> patch-kmermatrix
>
>
> On 03/07/2014 02:18 AM, Sébastien Boisvert wrote:
>> Hi Maxime,
>>
>> I can't merge your code in master without making some changes first.
>>
>> As the maintainer, here are the changes that I will need to do to increase the plus-value of your work (on
>> your next pull request, you can think about some of these points if you want):
>>
>> Keep in mind that Coding style is very important for readability.
>> (you can check https://github.com/sebhtml/ray/blob/master/Documentation/CodingStyle.txt )
>>
>> Here are the changes that I will make tomorrow:
>>
>> - (C0) As discussed, we don't want to generate Surveyor/KmerMatrix.tsv by default because this is
>> quite large file. There needs to be an option.
>>
>> - (C1) Obviously, if you want people to use your Surveyor workflow, you need to add documentation.
>> You just need to add some code in code/Mock/Parameters.cpp. After that you build Ray,
>> then you write ./Ray -help > MANUAL_PAGE.txt
>
> Yes there was an option needed to launch it so it wasn't by default but
> I forgot to add it in the Parameter file (now -run-kmer-matrix). Done.
>
>> - (C2) KmerMatrixOwner.h/cpp was done by you. So the copyright in the header belongs to you, not
>> me.
>
> done.
>
>>
>> - (C2) there is some red showing up in 'git diff --color', nothing important though
>> relevant commits ce6c272 & 4e4949
>>
>> - (C3) Copyright for code/Surveyor/SequenceKmerReader.cpp/.h should be in 2014
>
> done.
>
>> - (C4) In code/Surveyor/StoreKeeper.h you must use tabulations and no spaces for indentation.
>> for example, the line with MERGE_KMER_MATRIX uses spaces in StoreKeeper.h
>
> Yes, my indent-mode in emacs suddenly decided to put 8 spaces instead of
> tabs I corrected this and re-indented it all.
>
>> - (C5) code/Surveyor/KmerMatrixOwner.cpp was using spaces instead of tabulations.
>>
>> - (C6) the method name KmerMatrixOwner::printLocalKmersMatrix is meaningless as it does not
>> write a matrix, it writes a kmer.
>
> now called dumpKmerMatrixBuffer since it is not writing a single kmer
> but a couple of it with a utility function of yours called
> flushFileOperationBuffer
>
>> - (C7) you hard-coded the kmer value (31) !!! m_kmerMatrix << kmer.idToWord(31,0);
>> you must avoid that ! [KmerMatrixOwner.cpp]
>
> big mistake here fixed it with m_parameters->getWordSize()
>
>> - (C8) The FIRST_TAG for your reader is the same that is used by the graph reader. I think we said it
>> was OK as long as you added a comment about the reason.
>> code/Surveyor/GenomeAssemblyReader.h: FIRST_TAG = 10200,
>> code/Surveyor/GenomeGraphReader.h: FIRST_TAG = 10200,
>
> added this comment :
> // Using the same tag as GenomeGraphReader
> // because we can mix an assembly reader with a graph reader
>
>>
>> Keep up the great work, but next time, I think you can make progress on respecting the coding style,
>> among other things.
>
> thanks, I understand the importance you give to coding style.
>
>> I will do a bunch of commits in my branch patch-kmermatrix (fetched from
>> remotes/zorino/patch-kmermatrix) to address the comments above.
>>
>>
>> [boiseb01@ls30 ray]$ git diff master..remotes/zorino/patch-kmermatrix --stat
>> code/Surveyor/GenomeAssemblyReader.cpp | 3 +-
>> code/Surveyor/KmerMatrixOwner.cpp | 157 ++++++++++++++++++++
>> code/Surveyor/KmerMatrixOwner.h | 72 +++++++++
>> code/Surveyor/Makefile | 1 +
>> code/Surveyor/MatrixOwner.cpp | 19 +--
>> code/Surveyor/MatrixOwner.h | 3 +-
>> code/Surveyor/Mother.cpp | 253 +++++++++++++++++++++----------
>> code/Surveyor/Mother.h | 17 ++-
>> code/Surveyor/SequenceKmerReader.cpp | 53 ++++++-
>> code/Surveyor/SequenceKmerReader.h | 2 +
>> code/Surveyor/StoreKeeper.cpp | 117 +++++++++++----
>> code/Surveyor/StoreKeeper.h | 20 +++-
>> 12 files changed, 580 insertions(+), 137 deletions(-)
>>
>> Link: http://sourceforge.net/p/denovoassembler/mailman/denovoassembler-devel/thread/017C19ECDB98F64F99200D83876C44520111A7FC4FE7%40EXCH-MBX-B.ulaval.ca/#msg31999514
>> Link: http://sourceforge.net/p/denovoassembler/mailman/denovoassembler-devel/thread/COL131-W760CDDC5071A415AB9C5EDAC870%40phx.gbl/#msg32015211
>> Link: http://sourceforge.net/p/denovoassembler/mailman/denovoassembler-devel/thread/f2d8f53ef133555e36529047f4e95084%40boisvert.info/#msg31993547
>> Link: http://sourceforge.net/p/denovoassembler/mailman/denovoassembler-devel/thread/5301F42B.6060904%40gmail.com/#msg31988453
>>
>> ----------------------------------------
>>> Date: Wed, 5 Mar 2014 13:41:39 +0000
>>> From: max...@gm...
>>> To: se...@bo...
>>> Subject: Re: [Denovoassembler-devel] git diff kmersmatrix branch
>>>
>>> Hi Sebastien,
>>>
>>> I have fix the issue when scaffolds would be given in entry instead of
>>> contigs in the SequenceKmerReader class.
>>>
>>> I also made the edition according to this review.
>>> >>> Please now pull from : >>> >>> https://github.com/Zorino/ray.git >>> >>> patch-kmermatrix >>> >>> Cheers, >>> >>> Maxime >>> >>> >>> >>> On 02/23/2014 12:16 PM, Sébastien Boisvert wrote: >>>> Hey Maxime, >>>> >>>> You did not provide KmersMatrixOwner.h, KmersMatrixOwner.cpp, and changes >>>> to the Surveyor Makefile. >>>> >>>> >>>> OTher comments are below. >>>> >>>> ---------------------------------------- >>>>> Date: Sat, 22 Feb 2014 10:03:30 +0000 >>>>> From: ma...@de... >>>>> To: se...@bo... >>>>> Subject: git diff kmersmatrix branch >>>>> >>>>> diff --git a/code/Surveyor/MatrixOwner.cpp b/code/Surveyor/MatrixOwner.cpp >>>>> index ffaae00..47cf84a 100644 >>>>> --- a/code/Surveyor/MatrixOwner.cpp >>>>> +++ b/code/Surveyor/MatrixOwner.cpp >>>>> @@ -65,9 +65,12 @@ void MatrixOwner::receive(Message & message) { >>>>> assert(m_parameters != NULL); >>>>> assert(m_sampleNames != NULL); >>>>> #endif >>>>> - >>>>> m_mother = source; >>>>> >>>>> + //open the buffer of the file >>>>> + // createKmersMatrixOutputFile(); >>>>> + >>>>> + >>>>> } else if(tag == PUSH_PAYLOAD) { >>>>> >>>>> SampleIdentifier sample1 = -1; >>>>> @@ -89,10 +92,10 @@ void MatrixOwner::receive(Message & message) { >>>>> assert(count>= 0); >>>>> #endif >>>>> >>>>> - /* >>>>> + >>>>> printName(); >>>>> - cout << "DEBUG add " << sample1 << " " << sample2 << " " >>>>> << count << endl; >>>>> -*/ >>>>> + // cout << "DEBUG add " << sample1 << " " << sample2 << >>>> Commented lines should be removed. >>>> >>>>> " " << count << endl; >>>>> + >>>>> m_receivedPayloads ++; >>>>> >>>>> m_localGramMatrix[sample1][sample2] += count; >>>>> @@ -100,14 +103,14 @@ void MatrixOwner::receive(Message & message) { >>>>> Message response; >>>>> response.setTag(PUSH_PAYLOAD_OK); >>>>> send(source, response); >>>>> + } >>>>> + else if(tag == PUSH_PAYLOAD_END) { >>>> Use '} else if (' and not '} >>>> else if' >>>> >>>> >>>> This is the coding style of the project. 
>>>> see https://github.com/sebhtml/ray/blob/master/Documentation/CodingStyle.txt >>>> >>>> * Kernighan and Ritchie style, variant "The One True Brace Style" (1TBS) >>>> http://en.wikipedia.org/wiki/Indent_style#Variant:_1TBS >>>> >>>> >>>> (you used K&R Variant: Stroustrup). >>>> >>>>> - } else if(tag == PUSH_PAYLOAD_END) { >>>>> - >>>>> + cout << "PUSH_PAYLOAD_END" <<endl; >>>> Remove this debug message. >>>> >>>>> m_completedStoreActors++; >>>>> >>>>> if(m_completedStoreActors == getSize()) { >>>>> >>>>> - >>>>> printName(); >>>>> cout << "MatrixOwner received " << >>>>> m_receivedPayloads << " payloads" << endl; >>>>> >>>>> @@ -151,10 +154,9 @@ void MatrixOwner::receive(Message & message) { >>>>> >>>>> >>>>> // tell Mother that the matrix is ready now. >>>>> - >>>>> - Message coolMessage; >>>>> - coolMessage.setTag(MATRIX_IS_READY); >>>>> - send(m_mother, coolMessage); >>>>> + Message coolMessage; >>>>> + coolMessage.setTag(GRAM_MATRIX_IS_READY); >>>>> + send(m_mother, coolMessage); >>>>> >>>>> >>>>> // clear matrices >>>>> @@ -275,3 +277,4 @@ void >>>>> MatrixOwner::printLocalGramMatrixWithHash(ostream & stream, map<SampleIdent >>>>> stream << endl; >>>>> } >>>>> } >>>>> + >>>>> diff --git a/code/Surveyor/MatrixOwner.h b/code/Surveyor/MatrixOwner.h >>>>> index ceb17e2..afa9278 100644 >>>>> --- a/code/Surveyor/MatrixOwner.h >>>>> +++ b/code/Surveyor/MatrixOwner.h >>>>> @@ -28,6 +28,7 @@ >>>>> >>>>> #include <map> >>>>> #include <iostream> >>>>> +#include <sstream> >>>>> using namespace std; >>>>> >>>>> class MatrixOwner : public Actor { >>>>> @@ -62,7 +63,7 @@ public: >>>>> PUSH_PAYLOAD, >>>>> PUSH_PAYLOAD_OK, >>>>> PUSH_PAYLOAD_END, >>>>> - MATRIX_IS_READY, >>>>> + GRAM_MATRIX_IS_READY, >>>>> LAST_TAG >>>>> }; >>>>> >>>>> diff --git a/code/Surveyor/Mother.cpp b/code/Surveyor/Mother.cpp >>>>> index 4d2ef9c..103a583 100644 >>>>> --- a/code/Surveyor/Mother.cpp >>>>> +++ b/code/Surveyor/Mother.cpp >>>>> @@ -27,6 +27,7 @@ >>>>> #include "GenomeGraphReader.h" >>>>> 
#include "GenomeAssemblyReader.h" >>>>> #include "MatrixOwner.h" >>>>> +#include "KmersMatrixOwner.h" >>>>> >>>>> #include <RayPlatform/cryptography/crypto.h> >>>>> >>>>> @@ -39,11 +40,13 @@ using namespace std; >>>>> #define INPUT_TYPE_GRAPH 0 >>>>> #define INPUT_TYPE_ASSEMBLY 1 >>>>> >>>>> - >>>>> Mother::Mother() { >>>>> >>>>> m_coalescenceManager = -1; >>>>> m_matrixOwner = -1; >>>>> + m_kmersMatrixOwner = -1; >>>>> + >>>>> + // m_matricesAreReady = true; >>>> Remove this commented line. >>>> >>>>> m_parameters = NULL; >>>>> m_bigMother = -1; >>>>> @@ -91,7 +94,7 @@ void Mother::receive(Message & message) { >>>>> notifyController(); >>>>> } >>>>> >>>>> - } else if(tag == MERGE) { >>>>> + } else if(tag == MERGE_GRAM_MATRIX) { >>>>> >>>>> int matrixOwner = -1; >>>>> memcpy(&matrixOwner, buffer, sizeof(matrixOwner)); >>>>> @@ -102,7 +105,7 @@ void Mother::receive(Message & message) { >>>>> #endif >>>>> >>>>> Message theMessage; >>>>> - theMessage.setTag(StoreKeeper::MERGE); >>>>> + theMessage.setTag(StoreKeeper::MERGE_GRAM_MATRIX); >>>>> theMessage.setBuffer(&matrixOwner); >>>>> theMessage.setNumberOfBytes(sizeof(matrixOwner)); >>>>> >>>>> @@ -111,10 +114,33 @@ void Mother::receive(Message & message) { >>>>> send(destination, theMessage); >>>>> >>>>> Message response; >>>>> - response.setTag(MERGE_OK); >>>>> + response.setTag(MERGE_GRAM_MATRIX_OK); >>>>> + send(source, response); >>>>> + >>>>> + } else if (tag == MERGE_KMERS_MATRIX) { >>>>> + >>>>> + int kmersMatrixOwner = -1; >>>>> + memcpy(&kmersMatrixOwner, buffer, sizeof(kmersMatrixOwner)); >>>>> + >>>>> +#ifdef CONFIG_ASSERT >>>>> + assert(kmersMatrixOwner>= 0); >>>>> + assert(m_storeKeepers.size() == 1); >>>>> +#endif >>>>> + >>>>> + Message theMessage; >>>>> + theMessage.setTag(StoreKeeper::MERGE_KMERS_MATRIX); >>>> The name should be MERGE_KMER_MATRIX and not MERGE_KMERS_MATRIX. 
>>>> >>>>> + theMessage.setBuffer(&kmersMatrixOwner); >>>>> + theMessage.setNumberOfBytes(sizeof(kmersMatrixOwner)); >>>>> + >>>>> + int destination = m_storeKeepers[0]; >>>>> + >>>>> + send(destination, theMessage); >>>>> + >>>>> + Message response; >>>>> + response.setTag(MERGE_KMERS_MATRIX_OK); >>>>> send(source, response); >>>>> >>>>> - } else if(tag == SHUTDOWN) { >>>>> + } else if(tag == SHUTDOWN) { >>>>> >>>>> Message response; >>>>> response.setTag(SHUTDOWN_OK); >>>>> @@ -122,18 +148,16 @@ void Mother::receive(Message & message) { >>>>> >>>>> stop(); >>>>> >>>>> - } else if(tag == StoreKeeper::MERGE_OK) { >>>>> + } else if(tag == StoreKeeper::MERGE_GRAM_MATRIX_OK) { >>>>> >>>>> // TODO: the bug https://github.com/sebhtml/ray/issues/216 >>>>> // is caused by the fact that this message is not >>>>> // received . >>>>> >>>>> - /* >>>>> - Message newMessage; >>>>> - newMessage.setTag(MERGE_OK); >>>>> + // Message newMessage; >>>>> + // newMessage.setTag(MERGE_OK); >>>>> >>>>> - send(m_bigMother, newMessage); >>>>> - */ >>>>> + // send(m_bigMother, newMessage); >>>> Remove these commented lines. >>>> >>>>> } else if(tag == FINISH_JOB) { >>>>> >>>>> @@ -153,6 +177,7 @@ void Mother::receive(Message & message) { >>>>> >>>>> sendToFirstMother(FLUSH_AGGREGATOR, >>>>> FLUSH_AGGREGATOR_RETURN); >>>>> } >>>>> + >>>>> } else if(tag == FLUSH_AGGREGATOR) { >>>>> >>>>> /* >>>>> @@ -188,64 +213,52 @@ void Mother::receive(Message & message) { >>>>> cout << "DEBUG sending FLUSH_AGGREGATOR_OK to >>>>> m_bigMother" << endl; >>>>> */ >>>>> >>>>> - } else if(tag == MatrixOwner::MATRIX_IS_READY) { >>>>> + } else if(tag == MatrixOwner::GRAM_MATRIX_IS_READY) { >>>>> + >>>>> + //TODO : check if all matrices are ready >>>>> + if(m_matricesAreReady){ >>>>> + sendToFirstMother(SHUTDOWN, SHUTDOWN_OK); >>>>> + }else { >>>>> + cout << "GRAM_MATRIX_IS_READY" << endl; >>>> When an actor speak, you must print its name too in stdout. >>>> (with printName()). 
>>>> >>>>> + m_matricesAreReady = true; >>>>> + } >>>>> >>>>> - sendToFirstMother(SHUTDOWN, SHUTDOWN_OK); >>>>> + } >>>>> + else if(tag == KmersMatrixOwner::KMERS_MATRIX_IS_READY) { >>>>> + >>>>> + cout << "KMERS_MATRIX_IS_READY" << endl; >>>>> + if(m_matricesAreReady){ >>>>> + sendToFirstMother(SHUTDOWN, SHUTDOWN_OK); >>>>> + }else { >>>>> + cout << "KMERS_MATRIX_IS_READY" << endl; >>>>> + m_matricesAreReady = true; >>>> In one comment above, I saw that m_matricesAreReady = false >>>> was commented. Check that out. >>>> >>>>> + } >>>>> >>>>> } else if(tag == FLUSH_AGGREGATOR_OK) { >>>>> >>>>> - /* >>>>> printName(); >>>>> cout << "DEBUG received FLUSH_AGGREGATOR_OK" << endl; >>>>> - */ >>>>> >>>>> m_flushedMothers++; >>>>> >>>>> if(m_flushedMothers < getSize()) >>>>> return; >>>>> >>>>> - // spawn the MatrixOwner here ! >>>>> - >>>>> - MatrixOwner * matrixOwner = new MatrixOwner(); >>>>> - spawn(matrixOwner); >>>>> - >>>>> - m_matrixOwner = matrixOwner->getName(); >>>>> - >>>>> - printName(); >>>>> - cout << "Spawned MatrixOwner actor !" << endl; >>>>> - >>>>> - // tell the StoreKeeper actors to send their stuff to the >>>>> - // MatrixOwner actor >>>>> - // The Mother of Mother will wait for a signal from >>>>> MatrixOwner >>>>> - >>>>> - Message greetingMessage; >>>>> - >>>>> - vector<string> * names = & m_sampleNames; >>>>> - >>>>> - char buffer[32]; >>>>> - int offset = 0; >>>>> - memcpy(buffer + offset, &m_parameters, >>>>> sizeof(m_parameters)); >>>>> - offset += sizeof(m_parameters); >>>>> - memcpy(buffer + offset, &names, sizeof(names)); >>>>> - offset += sizeof(names); >>>>> - >>>>> - greetingMessage.setBuffer(&buffer); >>>>> - greetingMessage.setNumberOfBytes(offset); >>>>> - >>>>> - greetingMessage.setTag(MatrixOwner::GREETINGS); >>>>> - send(m_matrixOwner, greetingMessage); >>>>> - >>>>> - sendToFirstMother(MERGE, MERGE_OK); >>>>> - >>>>> + spawnMatrixOwner(); >>>> I like that. A method to spawn an actor. Good ! 
>>>> >>>>> } else if(tag == m_responseTag) { >>>>> >>>>> - >>>>> if(m_responseTag == SHUTDOWN_OK) { >>>>> >>>>> - } else if(m_responseTag == MERGE_OK) { >>>>> - >>>>> - } else if(m_responseTag == FLUSH_AGGREGATOR_RETURN) { >>>>> + } else if(m_responseTag == MERGE_GRAM_MATRIX_OK) { >>>>> + // All mothers merged their GRAM MATRIX >>>>> + // Spawn KmersMatrixOwner to print >>>>> + if(m_motherToKill < getSize() && >>>>> m_printKmersMatrix){ >>>>> + spawnKmersMatrixOwner(); >>>>> + } >>>>> + } else if(m_responseTag == MERGE_KMERS_MATRIX_OK) { >>>>> + } >>>>> + else if(m_responseTag == FLUSH_AGGREGATOR_RETURN) { >>>> Again, put closing brace on same line (} else if ( ...) { >>>> >>>>> /* >>>>> printName(); >>>>> @@ -254,9 +267,8 @@ void Mother::receive(Message & message) { >>>>> */ >>>>> } >>>>> >>>>> - // every mother was informed. >>>>> + // every mother was not informed. >>>> Good catch ! >>>> >>>>> if(m_motherToKill>= getSize()) { >>>>> - >>>>> sendMessageWithReply(m_motherToKill, m_forwardTag); >>>>> m_motherToKill--; >>>>> } >>>>> @@ -284,11 +296,15 @@ void Mother::sendMessageWithReply(int & actor, int >>>>> tag) { >>>>> Message message; >>>>> message.setTag(tag); >>>>> >>>>> - if(tag == MERGE) { >>>>> + if(tag == MERGE_GRAM_MATRIX) { >>>>> message.setBuffer(&m_matrixOwner); >>>>> message.setNumberOfBytes(sizeof(m_matrixOwner)); >>>>> - >>>>> - } else if(tag == FLUSH_AGGREGATOR) { >>>>> + } >>>>> + else if(tag == MERGE_KMERS_MATRIX) { >>>>> + message.setBuffer(&m_kmersMatrixOwner); >>>>> + message.setNumberOfBytes(sizeof(m_kmersMatrixOwner)); >>>>> + } >>>>> + else if(tag == FLUSH_AGGREGATOR) { >>>>> >>>>> /* >>>>> printName(); >>>>> @@ -328,6 +344,10 @@ void Mother::stop() { >>>>> m_matrixOwner = -1; >>>>> } >>>>> >>>>> + if(m_kmersMatrixOwner>= 0) { >>>>> + send(m_kmersMatrixOwner, kill); >>>>> + m_kmersMatrixOwner = -1; >>>>> + } >>>>> >>>>> die(); >>>>> >>>>> @@ -410,39 +430,44 @@ void Mother::startSurveyor() { >>>>> >>>>> bool isRoot = (getName() % getSize()) 
== 0; >>>>> >>>>> - //cout << "DEBUG startSurveyor isRoot" << isRoot << endl; >>>>> - >>>>> - // get a list of files. >>>>> + // Set matricesAreReady to true in case user doesn't want >>>>> + // to print out kmers matrix. >>>>> + m_matricesAreReady = true; >>>>> >>>>> vector<string> * commands = m_parameters->getCommands(); >>>>> >>>>> - >>>>> for(int i = 0 ; i < (int) commands->size() ; ++i) { >>>>> >>>>> string & element = commands->at(i); >>>>> >>>>> - // DONE: Check bounds for file names >>>>> + if (element != "-print-kmers-matrix") { >>>> The name should be kmer-matrix, not kmers-matrix. >>>> >>>> It is like groceries store vs grocery store. >>>> >>>>> + // DONE: Check bounds for file names >>>>> >>>>> - map<string,int> fastTable; >>>>> + map<string,int> fastTable; >>>>> >>>>> - fastTable["-read-sample-graph"] = INPUT_TYPE_GRAPH; >>>>> - fastTable["-read-sample-assembly"] = INPUT_TYPE_ASSEMBLY; >>>>> + fastTable["-read-sample-graph"] = INPUT_TYPE_GRAPH; >>>>> + fastTable["-read-sample-assembly"] = >>>>> INPUT_TYPE_ASSEMBLY; >>>>> >>>>> - // Unsupported option >>>>> - if(fastTable.count(element) == 0 || i+2> (int) >>>>> commands->size()) >>>>> - continue; >>>>> + // Unsupported option >>>>> + if(fastTable.count(element) == 0 || i+2> (int) >>>>> commands->size()) >>>>> + continue; >>>>> >>>>> - string sampleName = commands->at(++i); >>>>> - string fileName = commands->at(++i); >>>>> + string sampleName = commands->at(++i); >>>>> + string fileName = commands->at(++i); >>>>> >>>>> - m_sampleNames.push_back(sampleName); >>>>> + m_sampleNames.push_back(sampleName); >>>>> >>>>> - // DONE implement this m_assemblyFileNames + type >>>>> - m_inputFileNames.push_back(fileName); >>>>> + // DONE implement this m_assemblyFileNames + type >>>>> + m_inputFileNames.push_back(fileName); >>>>> >>>>> - int type = fastTable[element]; >>>>> + int type = fastTable[element]; >>>>> >>>>> - m_sampleInputTypes.push_back(type); >>>>> + m_sampleInputTypes.push_back(type); >>>>> + >>>>> + 
} else { >>>>> + m_matricesAreReady = false; >>>>> + m_printKmersMatrix = true; >>>> Question: if m_printKmersMatrix is false, I suppose the code >>>> follows the usual path of printing just one matrix, right ? >>>> >>>>> + } >>>>> >>>>> } >>>>> >>>>> @@ -468,6 +493,9 @@ void Mother::startSurveyor() { >>>>> >>>>> m_storeKeepers.push_back(actor->getName()); >>>>> >>>>> + actor->setOutputKmersMatrixPath(m_parameters->getPrefix()); >>>> The path should be prefix/Surveyor/<whatever the kmer matrix's name is> >>>> >>>>> + actor->setSamplesSize(m_sampleNames.size()); >>>> sample size, not samples size. >>>> >>>>> + >>>>> // tell the CoalescenceManager about the local StoreKeeper >>>>> Message dummyMessage; >>>>> int localStore = actor->getName(); >>>>> @@ -568,6 +596,80 @@ void Mother::spawnReader() { >>>>> } >>>>> } >>>>> >>>>> + >>>>> +void Mother::spawnMatrixOwner() { >>>>> + >>>>> + // spawn the MatrixOwner here ! >>>>> + MatrixOwner * matrixOwner = new MatrixOwner(); >>>>> + spawn(matrixOwner); >>>>> + >>>>> + m_matrixOwner = matrixOwner->getName(); >>>>> + >>>>> + printName(); >>>>> + cout << "Spawned MatrixOwner actor !" 
<< m_matrixOwner << endl; >>>>> + >>>>> + // tell the StoreKeeper actors to send their stuff to the >>>>> + // MatrixOwner actor >>>>> + // The Mother of Mother will wait for a signal from MatrixOwner >>>>> + >>>>> + Message greetingMessage; >>>>> + >>>>> + vector<string> * names = & m_sampleNames; >>>>> + >>>>> + char buffer[32]; >>>>> + int offset = 0; >>>>> + memcpy(buffer + offset, &m_parameters, sizeof(m_parameters)); >>>>> + offset += sizeof(m_parameters); >>>>> + memcpy(buffer + offset, &names, sizeof(names)); >>>>> + offset += sizeof(names); >>>>> + >>>>> + greetingMessage.setBuffer(&buffer); >>>>> + greetingMessage.setNumberOfBytes(offset); >>>>> + >>>>> + greetingMessage.setTag(MatrixOwner::GREETINGS); >>>>> + send(m_matrixOwner, greetingMessage); >>>>> + >>>>> + sendToFirstMother(MERGE_GRAM_MATRIX, MERGE_GRAM_MATRIX_OK); >>>>> +} >>>>> + >>>>> +void Mother::spawnKmersMatrixOwner() { >>>>> + >>>>> + // spawn the MatrixOwner here ! >>>>> + KmersMatrixOwner * kmersMatrixOwner = new KmersMatrixOwner(); >>>>> + spawn(kmersMatrixOwner); >>>>> + >>>>> + m_kmersMatrixOwner = kmersMatrixOwner->getName(); >>>>> + >>>>> + printName(); >>>>> + cout << "Spawned KmersMatrixOwner actor !" 
<< >>>>> m_kmersMatrixOwner << endl; >>>>> + >>>>> + // tell the StoreKeeper actors to send their stuff to the >>>>> + // KmersMatrixOwner actor >>>>> + // The Mother of Mother will wait for a signal from MatrixOwner >>>>> + >>>>> + Message greetingMessage; >>>>> + >>>>> + vector<string> * names = & m_sampleNames; >>>>> + >>>>> + char buffer[32]; >>>>> + int offset = 0; >>>>> + memcpy(buffer + offset, &m_parameters, sizeof(m_parameters)); >>>>> + offset += sizeof(m_parameters); >>>>> + memcpy(buffer + offset, &names, sizeof(names)); >>>>> + offset += sizeof(names); >>>>> + >>>>> + greetingMessage.setBuffer(&buffer); >>>>> + greetingMessage.setNumberOfBytes(offset); >>>>> + >>>>> + greetingMessage.setTag(KmersMatrixOwner::GREETINGS); >>>>> + send(m_kmersMatrixOwner, greetingMessage); >>>>> + >>>>> + sendToFirstMother(MERGE_KMERS_MATRIX, MERGE_KMERS_MATRIX_OK); >>>>> + >>>>> +} >>>>> + >>>>> + >>>>> void Mother::setParameters(Parameters * parameters) { >>>>> m_parameters = parameters; >>>>> } >>>>> + >>>>> diff --git a/code/Surveyor/Mother.h b/code/Surveyor/Mother.h >>>>> index 092920f..9774c4b 100644 >>>>> --- a/code/Surveyor/Mother.h >>>>> +++ b/code/Surveyor/Mother.h >>>>> @@ -28,6 +28,7 @@ >>>>> >>>>> #include <vector> >>>>> #include <string> >>>>> +#include <iostream> >>>>> using namespace std; >>>>> >>>>> /** >>>>> @@ -55,9 +56,12 @@ class Mother: public Actor { >>>>> private: >>>>> >>>>> int m_matrixOwner; >>>>> + int m_kmersMatrixOwner; >>>>> >>>>> int m_flushedMothers; >>>>> int m_finishedMothers; >>>>> + bool m_matricesAreReady; >>>>> + bool m_printKmersMatrix; >>>>> >>>>> Parameters * m_parameters; >>>>> >>>>> @@ -93,6 +97,13 @@ private: >>>>> */ >>>>> void sendToFirstMother(int forwardTag, int responseTag); >>>>> >>>>> + /* int m_kmersMatrixBlocNumber; */ >>>>> + void printLocalKmersMatrix(string & kmer, string & >>>>> samples_kmers, bool force); >>>>> + void createKmersMatrixOutputFile(); >>>>> + >>>>> + void spawnMatrixOwner(); >>>>> + void 
spawnKmersMatrixOwner(); >>>>> + >>>> That's a good design -- private methods for private uses. >>>> >>>>> public: >>>>> >>>>> Mother(); >>>>> @@ -109,8 +120,10 @@ public: >>>>> FLUSH_AGGREGATOR, >>>>> FLUSH_AGGREGATOR_OK, >>>>> FLUSH_AGGREGATOR_RETURN, >>>>> - MERGE, >>>>> - MERGE_OK, >>>>> + MERGE_GRAM_MATRIX, >>>>> + MERGE_GRAM_MATRIX_OK, >>>>> + MERGE_KMERS_MATRIX, >>>>> + MERGE_KMERS_MATRIX_OK, >>>> kmer matrix, not kmers matrix. >>>> >>>>> LAST_TAG, >>>>> }; >>>>> >>>>> diff --git a/code/Surveyor/StoreKeeper.cpp b/code/Surveyor/StoreKeeper.cpp >>>>> index 84eef34..492208c 100644 >>>>> --- a/code/Surveyor/StoreKeeper.cpp >>>>> +++ b/code/Surveyor/StoreKeeper.cpp >>>>> @@ -22,10 +22,16 @@ >>>>> #include "StoreKeeper.h" >>>>> #include "CoalescenceManager.h" >>>>> #include "MatrixOwner.h" >>>>> +#include "KmersMatrixOwner.h" >>>>> >>>>> #include <code/VerticesExtractor/Vertex.h> >>>>> +#include <RayPlatform/structures/MyHashTableIterator.h> >>>>> +#include <RayPlatform/core/OperatingSystem.h> >>>>> >>>>> #include <iostream> >>>>> +#include <sstream> >>>>> +#include <iomanip> >>>>> +#include <fstream> >>>>> using namespace std; >>>>> >>>>> #include <string.h> >>>>> @@ -83,15 +89,21 @@ void StoreKeeper::receive(Message & message) { >>>>> >>>>> die(); >>>>> >>>>> - } else if(tag == MERGE) { >>>>> + } else if(tag == MERGE_GRAM_MATRIX) { >>>>> >>>>> >>>>> - printName(); >>>>> - cout << "DEBUG at MERGE message reception "; >>>>> - cout << "(StoreKeeper) received " << m_receivedObjects >>>>> << " objects in total"; >>>>> - cout << " with " << m_receivedPushes << " push >>>>> operations" << endl; >>>>> + // printName(); >>>>> + // cout << "DEBUG at MERGE_GRAM_MATRIX message reception "; >>>>> + // cout << "(StoreKeeper) received " << >>>>> m_receivedObjects << " objects in total"; >>>>> + // cout << " with " << m_receivedPushes << " push >>>> You can remove commented lines. 
>>>> >>>>> operations" << endl; >>>>> computeLocalGramMatrix(); >>>>> >>>>> + >>>>> + // TODEL Print matrix bloc >>>>> + // m_kmersMatrixBlocNumber = 0; >>>>> + // printLocalKmersMatrix(); >>>>> + >>>> You can remove commented lines. >>>> >>>>> + >>>>> m_mother = source; >>>>> >>>>> memcpy(&m_matrixOwner, buffer, sizeof(m_matrixOwner)); >>>>> @@ -108,19 +120,32 @@ void StoreKeeper::receive(Message & message) { >>>>> m_iterator2 = m_iterator1->second.begin(); >>>>> } >>>>> >>>>> - /* >>>>> - printName(); >>>>> - cout << "DEBUG printLocalGramMatrix before first >>>>> sendMatrixCell" << endl; >>>>> - printLocalGramMatrix(); >>>>> - */ >>>>> - >>>>> + // printName(); >>>>> + // cout << "DEBUG printLocalGramMatrix before first >>>>> sendMatrixCell" << endl; >>>>> + // printLocalGramMatrix(); >>>> You can remove commented lines. >>>> >>>>> sendMatrixCell(); >>>>> >>>>> } else if(tag == MatrixOwner::PUSH_PAYLOAD_OK) { >>>>> - >>>>> sendMatrixCell(); >>>>> + } else if(tag == MERGE_KMERS_MATRIX) { >>>>> + // cout << "DEBUG at MERGE_GRAM_MATRIX message reception "; >>>>> + // cout << "(StoreKeeper) received " << >>>>> m_receivedObjects << " objects in total"; >>>>> + // cout << " with " << m_receivedPushes << " push >>>>> operations" << endl; >>>> You can remove commented lines. >>>> >>>> Otherwise, add a "#ifdef DEBUG_SOMETHING_SOMETHING / #endif around that lines". 
>>>> >>>> >>>>> + >>>>> + m_mother = source; >>>>> >>>>> - } else if(tag == CoalescenceManager::SET_KMER_LENGTH) { >>>>> + memcpy(&m_kmersMatrixOwner, buffer, >>>>> sizeof(m_kmersMatrixOwner)); >>>>> + >>>>> + m_hashTableIterator.constructor(&m_hashTable); >>>>> + >>>>> + sendKmersSamples(); >>>>> + } >>>>> + else if (tag == KmersMatrixOwner::PUSH_KMER_SAMPLES_END) { >>>>> + } >>>>> + else if(tag == KmersMatrixOwner::PUSH_KMER_SAMPLES_OK) { >>>>> + sendKmersSamples(); >>>>> + } >>>>> + else if(tag == CoalescenceManager::SET_KMER_LENGTH) { >>>>> >>>>> int kmerLength = 0; >>>>> int position = 0; >>>>> @@ -181,8 +206,6 @@ void StoreKeeper::sendMatrixCell() { >>>>> message.setNumberOfBytes(offset); >>>>> message.setTag(MatrixOwner::PUSH_PAYLOAD); >>>>> >>>>> - //cout << " DEBUG send PUSH_PAYLOAD to " << >>>>> m_matrixOwner << endl; >>>>> - >>>>> send(m_matrixOwner, message); >>>>> >>>>> m_iterator2++; >>>>> @@ -207,10 +230,7 @@ void StoreKeeper::sendMatrixCell() { >>>>> // free memory. >>>>> m_localGramMatrix.clear(); >>>>> >>>>> - /* >>>>> printName(); >>>>> - cout << "DEBUG send PUSH_PAYLOAD_END to " << m_matrixOwner << endl; >>>>> - */ >>>>> >>>>> Message response; >>>>> response.setTag(MatrixOwner::PUSH_PAYLOAD_END); >>>>> @@ -236,6 +256,7 @@ void StoreKeeper::configureHashTable() { >>>>> ); >>>>> >>>>> m_configured = true; >>>>> + >>>>> } >>>>> >>>>> void StoreKeeper::printColorReport() { >>>>> @@ -375,6 +396,7 @@ void StoreKeeper::computeLocalGramMatrix() { >>>>> //printLocalGramMatrix(); >>>>> } >>>>> >>>>> + >>>>> void StoreKeeper::printLocalGramMatrix() { >>>>> >>>>> printName(); >>>>> @@ -623,3 +645,73 @@ void StoreKeeper::storeData(Vertex & vertex, int & >>>>> sample) { >>>>> >>>>> */ >>>>> } >>>>> + >>>>> + >>>>> +void StoreKeeper::setSamplesSize(int sampleSize) { >>>>> + m_sampleSize = sampleSize; >>>>> +} >>>>> + >>>>> +void StoreKeeper::setOutputKmersMatrixPath(string pathPrefix) { >>>>> + // m_outputKmersMatrixPath = pathPrefix; >>>>> + // 
m_outputKmersMatrixPath += "/KmersMatrixDump/"; >>>>> + // createDirectory(m_outputKmersMatrixPath.c_str()); >>>> You can remove commented lines. >>>> >>>> >>>> This file could be in prefix/Surveyor/<...> >>>> >>>>> +} >>>>> + >>>>> + >>>>> +void StoreKeeper::sendKmersSamples() { >>>>> + >>>>> + char buffer[4000]; >>>> For portability, use MAXIMUM_MESSAGE_SIZE_IN_BYTES instead of 4000. >>>> >>>>> + int bytes = 0; >>>>> + >>>>> + ExperimentVertex * currentVertex = NULL; >>>>> + VirtualKmerColorHandle currentVirtualColor = NULL_VIRTUAL_COLOR; >>>>> + >>>>> + vector<bool> samplesVector (m_sampleSize, false); >>>>> + >>>>> + if(m_hashTableIterator.hasNext()){ >>>>> + >>>>> + // fill(samplesVector.begin(),samplesVector.end(),false); >>>>> + >>>> You can remove commented lines. >>>> >>>>> + currentVertex = m_hashTableIterator.next(); >>>>> + Kmer kmer = currentVertex->getKey(); >>>>> + >>>>> + bytes += kmer.dump(buffer); >>>>> + >>>>> + currentVirtualColor = currentVertex->getVirtualColor(); >>>>> + set<PhysicalKmerColor> * samples = >>>>> m_colorSet.getPhysicalColors(currentVirtualColor); >>>>> + >>>>> + for(set<PhysicalKmerColor>:: iterator sampleIterator = >>>>> samples->begin(); >>>>> + sampleIterator != samples->end(); ++sampleIterator) { >>>>> + PhysicalKmerColor value = *sampleIterator; >>>>> + samplesVector[value] = true; >>>>> + // cout << " " << value; >>>>> + } >>>>> + >>>>> + for (std::vector<bool>::iterator it = >>>>> samplesVector.begin(); >>>>> + it != samplesVector.end(); ++it) { >>>>> + buffer[bytes] = *it; >>>>> + bytes++; >>>>> + } >>>>> + // buffer[bytes] = '\0'; >>>> You can remove commented lines. 
>>>> >>>>> + } >>>>> + >>>>> + >>>>> + Message message; >>>>> + message.setNumberOfBytes(bytes); >>>>> + message.setBuffer(buffer); >>>>> + >>>>> + // message.setTag(MatrixOwner::PUSH_KMERS_SAMPLES); >>>>> + if(m_hashTableIterator.hasNext()){ >>>>> + message.setTag(KmersMatrixOwner::PUSH_KMER_SAMPLES); >>>>> + }else{ >>>>> + message.setTag(KmersMatrixOwner::PUSH_KMER_SAMPLES_END); >>>>> + } >>>>> + >>>>> + // Message response; >>>>> + // response.setTag(MatrixOwner::PUSH_PAYLOAD_END); >>>>> + // send(m_matrixOwner, response); >>>>> + >>>>> + send(m_kmersMatrixOwner, message); >>>>> + >>>>> +} >>>>> + >>>>> diff --git a/code/Surveyor/StoreKeeper.h b/code/Surveyor/StoreKeeper.h >>>>> index e44cf98..36adf77 100644 >>>>> --- a/code/Surveyor/StoreKeeper.h >>>>> +++ b/code/Surveyor/StoreKeeper.h >>>>> @@ -34,6 +34,10 @@ >>>>> >>>>> #include <RayPlatform/actors/Actor.h> >>>>> #include <RayPlatform/structures/MyHashTable.h> >>>>> +#include <RayPlatform/structures/MyHashTableIterator.h> >>>>> + >>>>> +#include <iostream> >>>>> +#include <sstream> >>>>> >>>>> /** >>>>> * Provides genomic storage. 
>>>>> @@ -55,6 +59,7 @@ private: >>>>> >>>>> int m_mother; >>>>> int m_matrixOwner; >>>>> + int m_kmersMatrixOwner; >>>>> >>>>> bool m_configured; >>>>> >>>>> @@ -64,6 +69,8 @@ private: >>>>> */ >>>>> MyHashTable<Kmer,ExperimentVertex> m_hashTable; >>>>> >>>>> + MyHashTableIterator<Kmer,ExperimentVertex> m_hashTableIterator; >>>>> + >>>>> int m_kmerLength; >>>>> bool m_colorSpaceMode; >>>>> >>>>> @@ -79,6 +86,13 @@ private: >>>>> void printLocalGramMatrix(); >>>>> void printColorReport(); >>>>> >>>>> + /* ostringstream m_currentKmer; */ >>>>> + /* ostringstream m_currentSamplesKmers; */ >>>>> + int m_sampleSize; >>>>> + string m_outputKmersMatrixPath; >>>>> + void printLocalKmersMatrix(string & m_kmer, string & >>>>> m_samplesKmers); >>>>> + void sendKmersSamples(); >>>>> + >>>>> void sendMatrixCell(); >>>>> >>>>> public: >>>>> @@ -86,14 +100,19 @@ public: >>>>> StoreKeeper(); >>>>> ~StoreKeeper(); >>>>> >>>>> + void setOutputKmersMatrixPath(string pathPrefix); >>>>> + void setSamplesSize(int sampleSize); >>>>> + >>>>> void receive(Message & message); >>>>> >>>>> enum { >>>>> FIRST_TAG = 10250, >>>>> PUSH_SAMPLE_VERTEX, >>>>> PUSH_SAMPLE_VERTEX_OK, >>>>> - MERGE, >>>>> - MERGE_OK, >>>>> + MERGE_GRAM_MATRIX, >>>>> + MERGE_GRAM_MATRIX_OK, >>>>> + MERGE_KMERS_MATRIX, >>>>> + MERGE_KMERS_MATRIX_OK, >>>>> LAST_TAG >>>>> }; >>>>> }; >>> > |
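The review above asks that the 4000-byte buffer in StoreKeeper::sendKmersSamples be sized from MAXIMUM_MESSAGE_SIZE_IN_BYTES. Here is a minimal sketch of that idea, assuming a simplified record layout: one 64-bit encoded k-mer followed by one byte per sample. The names kMaxMessageBytes and packKmerSamples are hypothetical and are not part of Ray or RayPlatform; the real limit and the encoding written by Kmer::dump() may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for the platform limit. The review asks for
// MAXIMUM_MESSAGE_SIZE_IN_BYTES; its real value is not shown in this thread.
const std::size_t kMaxMessageBytes = 4000;

// Packs one encoded k-mer followed by one byte per sample
// (1 = the k-mer was seen in that sample, 0 = it was not).
// Returns the number of bytes written, or 0 if the record does not fit.
std::size_t packKmerSamples(char *buffer, std::size_t capacity,
		std::uint64_t encodedKmer, const std::vector<bool> &samples) {
	std::size_t needed = sizeof(encodedKmer) + samples.size();
	if (needed > capacity) {
		return 0;
	}

	std::size_t bytes = 0;
	std::memcpy(buffer + bytes, &encodedKmer, sizeof(encodedKmer));
	bytes += sizeof(encodedKmer);

	for (std::size_t i = 0; i < samples.size(); ++i) {
		buffer[bytes++] = samples[i] ? 1 : 0;
	}

	assert(bytes == needed);
	return bytes;
}
```

Checking the size before writing means a run with many samples fails visibly (return value 0) instead of writing past the end of the buffer. The caller would declare the buffer as char buffer[kMaxMessageBytes] and pass sizeof(buffer) as the capacity.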
From: Sébastien B. <se...@bo...> - 2014-03-07 19:20:51
|
I merged your work.

Cheers.

----------------------------------------
> Date: Fri, 7 Mar 2014 10:57:14 +0000
> From: max...@gm...
> To: se...@bo...; den...@li...
> Subject: Re: [Denovoassembler-devel] git diff kmersmatrix branch
>
> Hi Seb,
>
> You can find these modifications in the same branch as before:
>
> https://github.com/Zorino/ray.git
>
> patch-kmermatrix
>
>
> On 03/07/2014 02:18 AM, Sébastien Boisvert wrote:
>> Hi Maxime,
>>
>> I can't merge your code in master without making some changes first.
>>
>> As the maintainer, here are the changes that I will need to make to increase the added value of your work (on
>> your next pull request, you can think about some of these points if you want):
>>
>> Keep in mind that coding style is very important for readability.
>> (you can check https://github.com/sebhtml/ray/blob/master/Documentation/CodingStyle.txt )
>>
>> Here are the changes that I will make tomorrow:
>>
>> - (C0) As discussed, we don't want to generate Surveyor/KmerMatrix.tsv by default because this is
>> quite a large file. There needs to be an option.
>>
>> - (C1) Obviously, if you want people to use your Surveyor workflow, you need to add documentation.
>> You just need to add some code in code/Mock/Parameters.cpp. After that you build Ray,
>> then you write ./Ray -help > MANUAL_PAGE.txt
>
> Yes, an option was needed to launch it, so it wasn't generated by default, but
> I forgot to add it in the Parameter file (now -run-kmer-matrix). Done.
>
>> - (C2) KmerMatrixOwner.h/cpp was done by you. So the copyright in the header belongs to you, not
>> me.
>
> Done.
>
>>
>> - (C2) there is some red showing up in 'git diff --color', nothing important though
>> relevant commits ce6c272 & 4e4949
>>
>> - (C3) Copyright for code/Surveyor/SequenceKmerReader.cpp/.h should be in 2014
>
> Done.
>
>> - (C4) In code/Surveyor/StoreKeeper.h you must use tabulations and no spaces for indentation.
>> for example, the line with MERGE_KMER_MATRIX uses spaces in StoreKeeper.h
>
> Yes, my indent-mode in emacs suddenly decided to put 8 spaces instead of
> tabs. I corrected this and re-indented it all.
>
>> - (C5) code/Surveyor/KmerMatrixOwner.cpp was using spaces instead of tabulations.
>>
>> - (C6) the method name KmerMatrixOwner::printLocalKmersMatrix is meaningless as it does not
>> write a matrix, it writes a kmer.
>
> It is now called dumpKmerMatrixBuffer, since it does not write a single kmer
> but several of them, using a utility function of yours called
> flushFileOperationBuffer.
>
>> - (C7) you hard-coded the kmer value (31) !!! m_kmerMatrix << kmer.idToWord(31,0);
>> you must avoid that ! [KmerMatrixOwner.cpp]
>
> Big mistake here; I fixed it with m_parameters->getWordSize().
>
>> - (C8) The FIRST_TAG for your reader is the same as the one used by the graph reader. I think we said it
>> was OK as long as you added a comment about the reason.
>> code/Surveyor/GenomeAssemblyReader.h: FIRST_TAG = 10200,
>> code/Surveyor/GenomeGraphReader.h: FIRST_TAG = 10200,
>
> I added this comment:
> // Using the same tag as GenomeGraphReader
> // because we can mix an assembly reader with a graph reader
>
>>
>> Keep up the great work, but next time, I think you can make progress on respecting the coding style,
>> among other things.
>
> Thanks, I understand the importance you give to coding style.
>
>> I will do a bunch of commits in my branch patch-kmermatrix (fetched from
>> remotes/zorino/patch-kmermatrix) to address the comments above.
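Point (C7) above is about passing the configured k-mer length instead of the literal 31. A minimal sketch of why the length has to be a parameter follows, assuming a simple 2-bit encoding (A=0, C=1, G=2, T=3) with the first nucleotide stored in the lowest bits. The function decodeKmer is hypothetical and is not Ray's Kmer::idToWord; it only illustrates that the same bits decode to different words for different lengths.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical decoder, not Ray's Kmer class: turns a 2-bit-per-nucleotide
// encoding into text. wordSize is the k-mer length chosen at run time.
std::string decodeKmer(std::uint64_t bits, int wordSize) {
	// A 64-bit word holds at most 32 nucleotides at 2 bits each.
	assert(wordSize > 0 && wordSize <= 32);

	static const char alphabet[] = "ACGT";
	std::string word(wordSize, 'A');

	for (int i = 0; i < wordSize; ++i) {
		word[i] = alphabet[(bits >> (2 * i)) & 3];
	}

	return word;
}
```

With this layout, decoding a value stored with k = 4 while assuming k = 31 would append 27 spurious 'A' characters, which is the kind of silent error a hard-coded length invites. In the thread, the fix is to read the length from m_parameters->getWordSize().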
>>
>>
>> [boiseb01@ls30 ray]$ git diff master..remotes/zorino/patch-kmermatrix --stat
>>  code/Surveyor/GenomeAssemblyReader.cpp |   3 +-
>>  code/Surveyor/KmerMatrixOwner.cpp      | 157 ++++++++++++++++++++
>>  code/Surveyor/KmerMatrixOwner.h        |  72 +++++++++
>>  code/Surveyor/Makefile                 |   1 +
>>  code/Surveyor/MatrixOwner.cpp          |  19 +--
>>  code/Surveyor/MatrixOwner.h            |   3 +-
>>  code/Surveyor/Mother.cpp               | 253 +++++++++++++++++++++----------
>>  code/Surveyor/Mother.h                 |  17 ++-
>>  code/Surveyor/SequenceKmerReader.cpp   |  53 ++++++-
>>  code/Surveyor/SequenceKmerReader.h     |   2 +
>>  code/Surveyor/StoreKeeper.cpp          | 117 +++++++++++---
>>  code/Surveyor/StoreKeeper.h            |  20 +++-
>>  12 files changed, 580 insertions(+), 137 deletions(-)
>>
>> Link: http://sourceforge.net/p/denovoassembler/mailman/denovoassembler-devel/thread/017C19ECDB98F64F99200D83876C44520111A7FC4FE7%40EXCH-MBX-B.ulaval.ca/#msg31999514
>> Link: http://sourceforge.net/p/denovoassembler/mailman/denovoassembler-devel/thread/COL131-W760CDDC5071A415AB9C5EDAC870%40phx.gbl/#msg32015211
>> Link: http://sourceforge.net/p/denovoassembler/mailman/denovoassembler-devel/thread/f2d8f53ef133555e36529047f4e95084%40boisvert.info/#msg31993547
>> Link: http://sourceforge.net/p/denovoassembler/mailman/denovoassembler-devel/thread/5301F42B.6060904%40gmail.com/#msg31988453
>>
>> ----------------------------------------
>>> Date: Wed, 5 Mar 2014 13:41:39 +0000
>>> From: max...@gm...
>>> To: se...@bo...
>>> Subject: Re: [Denovoassembler-devel] git diff kmersmatrix branch
>>>
>>> Hi Sebastien,
>>>
>>> I have fixed the issue that occurred when scaffolds were given as input instead of
>>> contigs in the SequenceKmerReader class.
>>>
>>> I also made the edits requested in this review.
>>>
>>> Please now pull from:
>>>
>>> https://github.com/Zorino/ray.git
>>>
>>> patch-kmermatrix
>>>
>>> Cheers,
>>>
>>> Maxime
>>>
>>>
>>>
>>> On 02/23/2014 12:16 PM, Sébastien Boisvert wrote:
>>>> Hey Maxime,
>>>>
>>>> You did not provide KmersMatrixOwner.h, KmersMatrixOwner.cpp, and changes
>>>> to the Surveyor Makefile.
>>>>
>>>>
>>>> Other comments are below.
>>>>
>>>> ----------------------------------------
>>>>> Date: Sat, 22 Feb 2014 10:03:30 +0000
>>>>> From: ma...@de...
>>>>> To: se...@bo...
>>>>> Subject: git diff kmersmatrix branch
>>>>>
>>>>> diff --git a/code/Surveyor/MatrixOwner.cpp b/code/Surveyor/MatrixOwner.cpp
>>>>> index ffaae00..47cf84a 100644
>>>>> --- a/code/Surveyor/MatrixOwner.cpp
>>>>> +++ b/code/Surveyor/MatrixOwner.cpp
>>>>> @@ -65,9 +65,12 @@ void MatrixOwner::receive(Message & message) {
>>>>> assert(m_parameters != NULL);
>>>>> assert(m_sampleNames != NULL);
>>>>> #endif
>>>>> -
>>>>> m_mother = source;
>>>>>
>>>>> + //open the buffer of the file
>>>>> + // createKmersMatrixOutputFile();
>>>>> +
>>>>> +
>>>>> } else if(tag == PUSH_PAYLOAD) {
>>>>>
>>>>> SampleIdentifier sample1 = -1;
>>>>> @@ -89,10 +92,10 @@ void MatrixOwner::receive(Message & message) {
>>>>> assert(count>= 0);
>>>>> #endif
>>>>>
>>>>> - /*
>>>>> +
>>>>> printName();
>>>>> - cout << "DEBUG add " << sample1 << " " << sample2 << " "
>>>>> << count << endl;
>>>>> -*/
>>>>> + // cout << "DEBUG add " << sample1 << " " << sample2 <<
>>>> Commented lines should be removed.
>>>>
>>>>> " " << count << endl;
>>>>> +
>>>>> m_receivedPayloads ++;
>>>>>
>>>>> m_localGramMatrix[sample1][sample2] += count;
>>>>> @@ -100,14 +103,14 @@ void MatrixOwner::receive(Message & message) {
>>>>> Message response;
>>>>> response.setTag(PUSH_PAYLOAD_OK);
>>>>> send(source, response);
>>>>> + }
>>>>> + else if(tag == PUSH_PAYLOAD_END) {
>>>> Use '} else if (' and not '}
>>>> else if'
>>>>
>>>>
>>>> This is the coding style of the project.
>>>> see https://github.com/sebhtml/ray/blob/master/Documentation/CodingStyle.txt >>>> >>>> * Kernighan and Ritchie style, variant "The One True Brace Style" (1TBS) >>>> http://en.wikipedia.org/wiki/Indent_style#Variant:_1TBS >>>> >>>> >>>> (you used K&R Variant: Stroustrup). >>>> >>>>> - } else if(tag == PUSH_PAYLOAD_END) { >>>>> - >>>>> + cout << "PUSH_PAYLOAD_END" <<endl; >>>> Remove this debug message. >>>> >>>>> m_completedStoreActors++; >>>>> >>>>> if(m_completedStoreActors == getSize()) { >>>>> >>>>> - >>>>> printName(); >>>>> cout << "MatrixOwner received " << >>>>> m_receivedPayloads << " payloads" << endl; >>>>> >>>>> @@ -151,10 +154,9 @@ void MatrixOwner::receive(Message & message) { >>>>> >>>>> >>>>> // tell Mother that the matrix is ready now. >>>>> - >>>>> - Message coolMessage; >>>>> - coolMessage.setTag(MATRIX_IS_READY); >>>>> - send(m_mother, coolMessage); >>>>> + Message coolMessage; >>>>> + coolMessage.setTag(GRAM_MATRIX_IS_READY); >>>>> + send(m_mother, coolMessage); >>>>> >>>>> >>>>> // clear matrices >>>>> @@ -275,3 +277,4 @@ void >>>>> MatrixOwner::printLocalGramMatrixWithHash(ostream & stream, map<SampleIdent >>>>> stream << endl; >>>>> } >>>>> } >>>>> + >>>>> diff --git a/code/Surveyor/MatrixOwner.h b/code/Surveyor/MatrixOwner.h >>>>> index ceb17e2..afa9278 100644 >>>>> --- a/code/Surveyor/MatrixOwner.h >>>>> +++ b/code/Surveyor/MatrixOwner.h >>>>> @@ -28,6 +28,7 @@ >>>>> >>>>> #include <map> >>>>> #include <iostream> >>>>> +#include <sstream> >>>>> using namespace std; >>>>> >>>>> class MatrixOwner : public Actor { >>>>> @@ -62,7 +63,7 @@ public: >>>>> PUSH_PAYLOAD, >>>>> PUSH_PAYLOAD_OK, >>>>> PUSH_PAYLOAD_END, >>>>> - MATRIX_IS_READY, >>>>> + GRAM_MATRIX_IS_READY, >>>>> LAST_TAG >>>>> }; >>>>> >>>>> diff --git a/code/Surveyor/Mother.cpp b/code/Surveyor/Mother.cpp >>>>> index 4d2ef9c..103a583 100644 >>>>> --- a/code/Surveyor/Mother.cpp >>>>> +++ b/code/Surveyor/Mother.cpp >>>>> @@ -27,6 +27,7 @@ >>>>> #include "GenomeGraphReader.h" >>>>> 
#include "GenomeAssemblyReader.h" >>>>> #include "MatrixOwner.h" >>>>> +#include "KmersMatrixOwner.h" >>>>> >>>>> #include <RayPlatform/cryptography/crypto.h> >>>>> >>>>> @@ -39,11 +40,13 @@ using namespace std; >>>>> #define INPUT_TYPE_GRAPH 0 >>>>> #define INPUT_TYPE_ASSEMBLY 1 >>>>> >>>>> - >>>>> Mother::Mother() { >>>>> >>>>> m_coalescenceManager = -1; >>>>> m_matrixOwner = -1; >>>>> + m_kmersMatrixOwner = -1; >>>>> + >>>>> + // m_matricesAreReady = true; >>>> Remove this commented line. >>>> >>>>> m_parameters = NULL; >>>>> m_bigMother = -1; >>>>> @@ -91,7 +94,7 @@ void Mother::receive(Message & message) { >>>>> notifyController(); >>>>> } >>>>> >>>>> - } else if(tag == MERGE) { >>>>> + } else if(tag == MERGE_GRAM_MATRIX) { >>>>> >>>>> int matrixOwner = -1; >>>>> memcpy(&matrixOwner, buffer, sizeof(matrixOwner)); >>>>> @@ -102,7 +105,7 @@ void Mother::receive(Message & message) { >>>>> #endif >>>>> >>>>> Message theMessage; >>>>> - theMessage.setTag(StoreKeeper::MERGE); >>>>> + theMessage.setTag(StoreKeeper::MERGE_GRAM_MATRIX); >>>>> theMessage.setBuffer(&matrixOwner); >>>>> theMessage.setNumberOfBytes(sizeof(matrixOwner)); >>>>> >>>>> @@ -111,10 +114,33 @@ void Mother::receive(Message & message) { >>>>> send(destination, theMessage); >>>>> >>>>> Message response; >>>>> - response.setTag(MERGE_OK); >>>>> + response.setTag(MERGE_GRAM_MATRIX_OK); >>>>> + send(source, response); >>>>> + >>>>> + } else if (tag == MERGE_KMERS_MATRIX) { >>>>> + >>>>> + int kmersMatrixOwner = -1; >>>>> + memcpy(&kmersMatrixOwner, buffer, sizeof(kmersMatrixOwner)); >>>>> + >>>>> +#ifdef CONFIG_ASSERT >>>>> + assert(kmersMatrixOwner>= 0); >>>>> + assert(m_storeKeepers.size() == 1); >>>>> +#endif >>>>> + >>>>> + Message theMessage; >>>>> + theMessage.setTag(StoreKeeper::MERGE_KMERS_MATRIX); >>>> The name should be MERGE_KMER_MATRIX and not MERGE_KMERS_MATRIX. 
>>>> >>>>> + theMessage.setBuffer(&kmersMatrixOwner); >>>>> + theMessage.setNumberOfBytes(sizeof(kmersMatrixOwner)); >>>>> + >>>>> + int destination = m_storeKeepers[0]; >>>>> + >>>>> + send(destination, theMessage); >>>>> + >>>>> + Message response; >>>>> + response.setTag(MERGE_KMERS_MATRIX_OK); >>>>> send(source, response); >>>>> >>>>> - } else if(tag == SHUTDOWN) { >>>>> + } else if(tag == SHUTDOWN) { >>>>> >>>>> Message response; >>>>> response.setTag(SHUTDOWN_OK); >>>>> @@ -122,18 +148,16 @@ void Mother::receive(Message & message) { >>>>> >>>>> stop(); >>>>> >>>>> - } else if(tag == StoreKeeper::MERGE_OK) { >>>>> + } else if(tag == StoreKeeper::MERGE_GRAM_MATRIX_OK) { >>>>> >>>>> // TODO: the bug https://github.com/sebhtml/ray/issues/216 >>>>> // is caused by the fact that this message is not >>>>> // received . >>>>> >>>>> - /* >>>>> - Message newMessage; >>>>> - newMessage.setTag(MERGE_OK); >>>>> + // Message newMessage; >>>>> + // newMessage.setTag(MERGE_OK); >>>>> >>>>> - send(m_bigMother, newMessage); >>>>> - */ >>>>> + // send(m_bigMother, newMessage); >>>> Remove these commented lines. >>>> >>>>> } else if(tag == FINISH_JOB) { >>>>> >>>>> @@ -153,6 +177,7 @@ void Mother::receive(Message & message) { >>>>> >>>>> sendToFirstMother(FLUSH_AGGREGATOR, >>>>> FLUSH_AGGREGATOR_RETURN); >>>>> } >>>>> + >>>>> } else if(tag == FLUSH_AGGREGATOR) { >>>>> >>>>> /* >>>>> @@ -188,64 +213,52 @@ void Mother::receive(Message & message) { >>>>> cout << "DEBUG sending FLUSH_AGGREGATOR_OK to >>>>> m_bigMother" << endl; >>>>> */ >>>>> >>>>> - } else if(tag == MatrixOwner::MATRIX_IS_READY) { >>>>> + } else if(tag == MatrixOwner::GRAM_MATRIX_IS_READY) { >>>>> + >>>>> + //TODO : check if all matrices are ready >>>>> + if(m_matricesAreReady){ >>>>> + sendToFirstMother(SHUTDOWN, SHUTDOWN_OK); >>>>> + }else { >>>>> + cout << "GRAM_MATRIX_IS_READY" << endl; >>>> When an actor speak, you must print its name too in stdout. >>>> (with printName()). 
>>>> >>>>> + m_matricesAreReady = true; >>>>> + } >>>>> >>>>> - sendToFirstMother(SHUTDOWN, SHUTDOWN_OK); >>>>> + } >>>>> + else if(tag == KmersMatrixOwner::KMERS_MATRIX_IS_READY) { >>>>> + >>>>> + cout << "KMERS_MATRIX_IS_READY" << endl; >>>>> + if(m_matricesAreReady){ >>>>> + sendToFirstMother(SHUTDOWN, SHUTDOWN_OK); >>>>> + }else { >>>>> + cout << "KMERS_MATRIX_IS_READY" << endl; >>>>> + m_matricesAreReady = true; >>>> In one comment above, I saw that m_matricesAreReady = false >>>> was commented. Check that out. >>>> >>>>> + } >>>>> >>>>> } else if(tag == FLUSH_AGGREGATOR_OK) { >>>>> >>>>> - /* >>>>> printName(); >>>>> cout << "DEBUG received FLUSH_AGGREGATOR_OK" << endl; >>>>> - */ >>>>> >>>>> m_flushedMothers++; >>>>> >>>>> if(m_flushedMothers < getSize()) >>>>> return; >>>>> >>>>> - // spawn the MatrixOwner here ! >>>>> - >>>>> - MatrixOwner * matrixOwner = new MatrixOwner(); >>>>> - spawn(matrixOwner); >>>>> - >>>>> - m_matrixOwner = matrixOwner->getName(); >>>>> - >>>>> - printName(); >>>>> - cout << "Spawned MatrixOwner actor !" << endl; >>>>> - >>>>> - // tell the StoreKeeper actors to send their stuff to the >>>>> - // MatrixOwner actor >>>>> - // The Mother of Mother will wait for a signal from >>>>> MatrixOwner >>>>> - >>>>> - Message greetingMessage; >>>>> - >>>>> - vector<string> * names = & m_sampleNames; >>>>> - >>>>> - char buffer[32]; >>>>> - int offset = 0; >>>>> - memcpy(buffer + offset, &m_parameters, >>>>> sizeof(m_parameters)); >>>>> - offset += sizeof(m_parameters); >>>>> - memcpy(buffer + offset, &names, sizeof(names)); >>>>> - offset += sizeof(names); >>>>> - >>>>> - greetingMessage.setBuffer(&buffer); >>>>> - greetingMessage.setNumberOfBytes(offset); >>>>> - >>>>> - greetingMessage.setTag(MatrixOwner::GREETINGS); >>>>> - send(m_matrixOwner, greetingMessage); >>>>> - >>>>> - sendToFirstMother(MERGE, MERGE_OK); >>>>> - >>>>> + spawnMatrixOwner(); >>>> I like that. A method to spawn an actor. Good ! 
>>>> >>>>> } else if(tag == m_responseTag) { >>>>> >>>>> - >>>>> if(m_responseTag == SHUTDOWN_OK) { >>>>> >>>>> - } else if(m_responseTag == MERGE_OK) { >>>>> - >>>>> - } else if(m_responseTag == FLUSH_AGGREGATOR_RETURN) { >>>>> + } else if(m_responseTag == MERGE_GRAM_MATRIX_OK) { >>>>> + // All mothers merged their GRAM MATRIX >>>>> + // Spawn KmersMatrixOwner to print >>>>> + if(m_motherToKill < getSize() && >>>>> m_printKmersMatrix){ >>>>> + spawnKmersMatrixOwner(); >>>>> + } >>>>> + } else if(m_responseTag == MERGE_KMERS_MATRIX_OK) { >>>>> + } >>>>> + else if(m_responseTag == FLUSH_AGGREGATOR_RETURN) { >>>> Again, put closing brace on same line (} else if ( ...) { >>>> >>>>> /* >>>>> printName(); >>>>> @@ -254,9 +267,8 @@ void Mother::receive(Message & message) { >>>>> */ >>>>> } >>>>> >>>>> - // every mother was informed. >>>>> + // every mother was not informed. >>>> Good catch ! >>>> >>>>> if(m_motherToKill>= getSize()) { >>>>> - >>>>> sendMessageWithReply(m_motherToKill, m_forwardTag); >>>>> m_motherToKill--; >>>>> } >>>>> @@ -284,11 +296,15 @@ void Mother::sendMessageWithReply(int & actor, int >>>>> tag) { >>>>> Message message; >>>>> message.setTag(tag); >>>>> >>>>> - if(tag == MERGE) { >>>>> + if(tag == MERGE_GRAM_MATRIX) { >>>>> message.setBuffer(&m_matrixOwner); >>>>> message.setNumberOfBytes(sizeof(m_matrixOwner)); >>>>> - >>>>> - } else if(tag == FLUSH_AGGREGATOR) { >>>>> + } >>>>> + else if(tag == MERGE_KMERS_MATRIX) { >>>>> + message.setBuffer(&m_kmersMatrixOwner); >>>>> + message.setNumberOfBytes(sizeof(m_kmersMatrixOwner)); >>>>> + } >>>>> + else if(tag == FLUSH_AGGREGATOR) { >>>>> >>>>> /* >>>>> printName(); >>>>> @@ -328,6 +344,10 @@ void Mother::stop() { >>>>> m_matrixOwner = -1; >>>>> } >>>>> >>>>> + if(m_kmersMatrixOwner>= 0) { >>>>> + send(m_kmersMatrixOwner, kill); >>>>> + m_kmersMatrixOwner = -1; >>>>> + } >>>>> >>>>> die(); >>>>> >>>>> @@ -410,39 +430,44 @@ void Mother::startSurveyor() { >>>>> >>>>> bool isRoot = (getName() % getSize()) 
== 0; >>>>> >>>>> - //cout << "DEBUG startSurveyor isRoot" << isRoot << endl; >>>>> - >>>>> - // get a list of files. >>>>> + // Set matricesAreReady to true in case user doesn't want >>>>> + // to print out kmers matrix. >>>>> + m_matricesAreReady = true; >>>>> >>>>> vector<string> * commands = m_parameters->getCommands(); >>>>> >>>>> - >>>>> for(int i = 0 ; i < (int) commands->size() ; ++i) { >>>>> >>>>> string & element = commands->at(i); >>>>> >>>>> - // DONE: Check bounds for file names >>>>> + if (element != "-print-kmers-matrix") { >>>> The name should be kmer-matrix, not kmers-matrix. >>>> >>>> It is like groceries store vs grocery store. >>>> >>>>> + // DONE: Check bounds for file names >>>>> >>>>> - map<string,int> fastTable; >>>>> + map<string,int> fastTable; >>>>> >>>>> - fastTable["-read-sample-graph"] = INPUT_TYPE_GRAPH; >>>>> - fastTable["-read-sample-assembly"] = INPUT_TYPE_ASSEMBLY; >>>>> + fastTable["-read-sample-graph"] = INPUT_TYPE_GRAPH; >>>>> + fastTable["-read-sample-assembly"] = >>>>> INPUT_TYPE_ASSEMBLY; >>>>> >>>>> - // Unsupported option >>>>> - if(fastTable.count(element) == 0 || i+2> (int) >>>>> commands->size()) >>>>> - continue; >>>>> + // Unsupported option >>>>> + if(fastTable.count(element) == 0 || i+2> (int) >>>>> commands->size()) >>>>> + continue; >>>>> >>>>> - string sampleName = commands->at(++i); >>>>> - string fileName = commands->at(++i); >>>>> + string sampleName = commands->at(++i); >>>>> + string fileName = commands->at(++i); >>>>> >>>>> - m_sampleNames.push_back(sampleName); >>>>> + m_sampleNames.push_back(sampleName); >>>>> >>>>> - // DONE implement this m_assemblyFileNames + type >>>>> - m_inputFileNames.push_back(fileName); >>>>> + // DONE implement this m_assemblyFileNames + type >>>>> + m_inputFileNames.push_back(fileName); >>>>> >>>>> - int type = fastTable[element]; >>>>> + int type = fastTable[element]; >>>>> >>>>> - m_sampleInputTypes.push_back(type); >>>>> + m_sampleInputTypes.push_back(type); >>>>> + >>>>> + 
} else { >>>>> + m_matricesAreReady = false; >>>>> + m_printKmersMatrix = true; >>>> Question: if m_printKmersMatrix is false, I suppose the code >>>> follows the usual path of printing just one matrix, right ? >>>> >>>>> + } >>>>> >>>>> } >>>>> >>>>> @@ -468,6 +493,9 @@ void Mother::startSurveyor() { >>>>> >>>>> m_storeKeepers.push_back(actor->getName()); >>>>> >>>>> + actor->setOutputKmersMatrixPath(m_parameters->getPrefix()); >>>> The path should be prefix/Surveyor/<whatever the kmer matrix's name is> >>>> >>>>> + actor->setSamplesSize(m_sampleNames.size()); >>>> sample size, not samples size. >>>> >>>>> + >>>>> // tell the CoalescenceManager about the local StoreKeeper >>>>> Message dummyMessage; >>>>> int localStore = actor->getName(); >>>>> @@ -568,6 +596,80 @@ void Mother::spawnReader() { >>>>> } >>>>> } >>>>> >>>>> + >>>>> +void Mother::spawnMatrixOwner() { >>>>> + >>>>> + // spawn the MatrixOwner here ! >>>>> + MatrixOwner * matrixOwner = new MatrixOwner(); >>>>> + spawn(matrixOwner); >>>>> + >>>>> + m_matrixOwner = matrixOwner->getName(); >>>>> + >>>>> + printName(); >>>>> + cout << "Spawned MatrixOwner actor !" 
<< m_matrixOwner << endl; >>>>> + >>>>> + // tell the StoreKeeper actors to send their stuff to the >>>>> + // MatrixOwner actor >>>>> + // The Mother of Mother will wait for a signal from MatrixOwner >>>>> + >>>>> + Message greetingMessage; >>>>> + >>>>> + vector<string> * names = & m_sampleNames; >>>>> + >>>>> + char buffer[32]; >>>>> + int offset = 0; >>>>> + memcpy(buffer + offset, &m_parameters, sizeof(m_parameters)); >>>>> + offset += sizeof(m_parameters); >>>>> + memcpy(buffer + offset, &names, sizeof(names)); >>>>> + offset += sizeof(names); >>>>> + >>>>> + greetingMessage.setBuffer(&buffer); >>>>> + greetingMessage.setNumberOfBytes(offset); >>>>> + >>>>> + greetingMessage.setTag(MatrixOwner::GREETINGS); >>>>> + send(m_matrixOwner, greetingMessage); >>>>> + >>>>> + sendToFirstMother(MERGE_GRAM_MATRIX, MERGE_GRAM_MATRIX_OK); >>>>> +} >>>>> + >>>>> +void Mother::spawnKmersMatrixOwner() { >>>>> + >>>>> + // spawn the MatrixOwner here ! >>>>> + KmersMatrixOwner * kmersMatrixOwner = new KmersMatrixOwner(); >>>>> + spawn(kmersMatrixOwner); >>>>> + >>>>> + m_kmersMatrixOwner = kmersMatrixOwner->getName(); >>>>> + >>>>> + printName(); >>>>> + cout << "Spawned KmersMatrixOwner actor !" 
<< >>>>> m_kmersMatrixOwner << endl; >>>>> + >>>>> + // tell the StoreKeeper actors to send their stuff to the >>>>> + // KmersMatrixOwner actor >>>>> + // The Mother of Mother will wait for a signal from MatrixOwner >>>>> + >>>>> + Message greetingMessage; >>>>> + >>>>> + vector<string> * names = & m_sampleNames; >>>>> + >>>>> + char buffer[32]; >>>>> + int offset = 0; >>>>> + memcpy(buffer + offset, &m_parameters, sizeof(m_parameters)); >>>>> + offset += sizeof(m_parameters); >>>>> + memcpy(buffer + offset, &names, sizeof(names)); >>>>> + offset += sizeof(names); >>>>> + >>>>> + greetingMessage.setBuffer(&buffer); >>>>> + greetingMessage.setNumberOfBytes(offset); >>>>> + >>>>> + greetingMessage.setTag(KmersMatrixOwner::GREETINGS); >>>>> + send(m_kmersMatrixOwner, greetingMessage); >>>>> + >>>>> + sendToFirstMother(MERGE_KMERS_MATRIX, MERGE_KMERS_MATRIX_OK); >>>>> + >>>>> +} >>>>> + >>>>> + >>>>> void Mother::setParameters(Parameters * parameters) { >>>>> m_parameters = parameters; >>>>> } >>>>> + >>>>> diff --git a/code/Surveyor/Mother.h b/code/Surveyor/Mother.h >>>>> index 092920f..9774c4b 100644 >>>>> --- a/code/Surveyor/Mother.h >>>>> +++ b/code/Surveyor/Mother.h >>>>> @@ -28,6 +28,7 @@ >>>>> >>>>> #include <vector> >>>>> #include <string> >>>>> +#include <iostream> >>>>> using namespace std; >>>>> >>>>> /** >>>>> @@ -55,9 +56,12 @@ class Mother: public Actor { >>>>> private: >>>>> >>>>> int m_matrixOwner; >>>>> + int m_kmersMatrixOwner; >>>>> >>>>> int m_flushedMothers; >>>>> int m_finishedMothers; >>>>> + bool m_matricesAreReady; >>>>> + bool m_printKmersMatrix; >>>>> >>>>> Parameters * m_parameters; >>>>> >>>>> @@ -93,6 +97,13 @@ private: >>>>> */ >>>>> void sendToFirstMother(int forwardTag, int responseTag); >>>>> >>>>> + /* int m_kmersMatrixBlocNumber; */ >>>>> + void printLocalKmersMatrix(string & kmer, string & >>>>> samples_kmers, bool force); >>>>> + void createKmersMatrixOutputFile(); >>>>> + >>>>> + void spawnMatrixOwner(); >>>>> + void 
spawnKmersMatrixOwner(); >>>>> + >>>> That's a good design -- private methods for private uses. >>>> >>>>> public: >>>>> >>>>> Mother(); >>>>> @@ -109,8 +120,10 @@ public: >>>>> FLUSH_AGGREGATOR, >>>>> FLUSH_AGGREGATOR_OK, >>>>> FLUSH_AGGREGATOR_RETURN, >>>>> - MERGE, >>>>> - MERGE_OK, >>>>> + MERGE_GRAM_MATRIX, >>>>> + MERGE_GRAM_MATRIX_OK, >>>>> + MERGE_KMERS_MATRIX, >>>>> + MERGE_KMERS_MATRIX_OK, >>>> kmer matrix, not kmers matrix. >>>> >>>>> LAST_TAG, >>>>> }; >>>>> >>>>> diff --git a/code/Surveyor/StoreKeeper.cpp b/code/Surveyor/StoreKeeper.cpp >>>>> index 84eef34..492208c 100644 >>>>> --- a/code/Surveyor/StoreKeeper.cpp >>>>> +++ b/code/Surveyor/StoreKeeper.cpp >>>>> @@ -22,10 +22,16 @@ >>>>> #include "StoreKeeper.h" >>>>> #include "CoalescenceManager.h" >>>>> #include "MatrixOwner.h" >>>>> +#include "KmersMatrixOwner.h" >>>>> >>>>> #include <code/VerticesExtractor/Vertex.h> >>>>> +#include <RayPlatform/structures/MyHashTableIterator.h> >>>>> +#include <RayPlatform/core/OperatingSystem.h> >>>>> >>>>> #include <iostream> >>>>> +#include <sstream> >>>>> +#include <iomanip> >>>>> +#include <fstream> >>>>> using namespace std; >>>>> >>>>> #include <string.h> >>>>> @@ -83,15 +89,21 @@ void StoreKeeper::receive(Message & message) { >>>>> >>>>> die(); >>>>> >>>>> - } else if(tag == MERGE) { >>>>> + } else if(tag == MERGE_GRAM_MATRIX) { >>>>> >>>>> >>>>> - printName(); >>>>> - cout << "DEBUG at MERGE message reception "; >>>>> - cout << "(StoreKeeper) received " << m_receivedObjects >>>>> << " objects in total"; >>>>> - cout << " with " << m_receivedPushes << " push >>>>> operations" << endl; >>>>> + // printName(); >>>>> + // cout << "DEBUG at MERGE_GRAM_MATRIX message reception "; >>>>> + // cout << "(StoreKeeper) received " << >>>>> m_receivedObjects << " objects in total"; >>>>> + // cout << " with " << m_receivedPushes << " push >>>> You can remove commented lines. 
>>>> >>>>> operations" << endl; >>>>> computeLocalGramMatrix(); >>>>> >>>>> + >>>>> + // TODEL Print matrix bloc >>>>> + // m_kmersMatrixBlocNumber = 0; >>>>> + // printLocalKmersMatrix(); >>>>> + >>>> You can remove commented lines. >>>> >>>>> + >>>>> m_mother = source; >>>>> >>>>> memcpy(&m_matrixOwner, buffer, sizeof(m_matrixOwner)); >>>>> @@ -108,19 +120,32 @@ void StoreKeeper::receive(Message & message) { >>>>> m_iterator2 = m_iterator1->second.begin(); >>>>> } >>>>> >>>>> - /* >>>>> - printName(); >>>>> - cout << "DEBUG printLocalGramMatrix before first >>>>> sendMatrixCell" << endl; >>>>> - printLocalGramMatrix(); >>>>> - */ >>>>> - >>>>> + // printName(); >>>>> + // cout << "DEBUG printLocalGramMatrix before first >>>>> sendMatrixCell" << endl; >>>>> + // printLocalGramMatrix(); >>>> You can remove commented lines. >>>> >>>>> sendMatrixCell(); >>>>> >>>>> } else if(tag == MatrixOwner::PUSH_PAYLOAD_OK) { >>>>> - >>>>> sendMatrixCell(); >>>>> + } else if(tag == MERGE_KMERS_MATRIX) { >>>>> + // cout << "DEBUG at MERGE_GRAM_MATRIX message reception "; >>>>> + // cout << "(StoreKeeper) received " << >>>>> m_receivedObjects << " objects in total"; >>>>> + // cout << " with " << m_receivedPushes << " push >>>>> operations" << endl; >>>> You can remove commented lines. >>>> >>>> Otherwise, add a "#ifdef DEBUG_SOMETHING_SOMETHING / #endif around that lines". 
>>>> >>>> >>>>> + >>>>> + m_mother = source; >>>>> >>>>> - } else if(tag == CoalescenceManager::SET_KMER_LENGTH) { >>>>> + memcpy(&m_kmersMatrixOwner, buffer, >>>>> sizeof(m_kmersMatrixOwner)); >>>>> + >>>>> + m_hashTableIterator.constructor(&m_hashTable); >>>>> + >>>>> + sendKmersSamples(); >>>>> + } >>>>> + else if (tag == KmersMatrixOwner::PUSH_KMER_SAMPLES_END) { >>>>> + } >>>>> + else if(tag == KmersMatrixOwner::PUSH_KMER_SAMPLES_OK) { >>>>> + sendKmersSamples(); >>>>> + } >>>>> + else if(tag == CoalescenceManager::SET_KMER_LENGTH) { >>>>> >>>>> int kmerLength = 0; >>>>> int position = 0; >>>>> @@ -181,8 +206,6 @@ void StoreKeeper::sendMatrixCell() { >>>>> message.setNumberOfBytes(offset); >>>>> message.setTag(MatrixOwner::PUSH_PAYLOAD); >>>>> >>>>> - //cout << " DEBUG send PUSH_PAYLOAD to " << >>>>> m_matrixOwner << endl; >>>>> - >>>>> send(m_matrixOwner, message); >>>>> >>>>> m_iterator2++; >>>>> @@ -207,10 +230,7 @@ void StoreKeeper::sendMatrixCell() { >>>>> // free memory. >>>>> m_localGramMatrix.clear(); >>>>> >>>>> - /* >>>>> printName(); >>>>> - cout << "DEBUG send PUSH_PAYLOAD_END to " << m_matrixOwner << endl; >>>>> - */ >>>>> >>>>> Message response; >>>>> response.setTag(MatrixOwner::PUSH_PAYLOAD_END); >>>>> @@ -236,6 +256,7 @@ void StoreKeeper::configureHashTable() { >>>>> ); >>>>> >>>>> m_configured = true; >>>>> + >>>>> } >>>>> >>>>> void StoreKeeper::printColorReport() { >>>>> @@ -375,6 +396,7 @@ void StoreKeeper::computeLocalGramMatrix() { >>>>> //printLocalGramMatrix(); >>>>> } >>>>> >>>>> + >>>>> void StoreKeeper::printLocalGramMatrix() { >>>>> >>>>> printName(); >>>>> @@ -623,3 +645,73 @@ void StoreKeeper::storeData(Vertex & vertex, int & >>>>> sample) { >>>>> >>>>> */ >>>>> } >>>>> + >>>>> + >>>>> +void StoreKeeper::setSamplesSize(int sampleSize) { >>>>> + m_sampleSize = sampleSize; >>>>> +} >>>>> + >>>>> +void StoreKeeper::setOutputKmersMatrixPath(string pathPrefix) { >>>>> + // m_outputKmersMatrixPath = pathPrefix; >>>>> + // 
m_outputKmersMatrixPath += "/KmersMatrixDump/"; >>>>> + // createDirectory(m_outputKmersMatrixPath.c_str()); >>>> You can remove commented lines. >>>> >>>> >>>> This file could be in prefix/Surveyor/<...> >>>> >>>>> +} >>>>> + >>>>> + >>>>> +void StoreKeeper::sendKmersSamples() { >>>>> + >>>>> + char buffer[4000]; >>>> For portability, use MAXIMUM_MESSAGE_SIZE_IN_BYTES instead of 4000. >>>> >>>>> + int bytes = 0; >>>>> + >>>>> + ExperimentVertex * currentVertex = NULL; >>>>> + VirtualKmerColorHandle currentVirtualColor = NULL_VIRTUAL_COLOR; >>>>> + >>>>> + vector<bool> samplesVector (m_sampleSize, false); >>>>> + >>>>> + if(m_hashTableIterator.hasNext()){ >>>>> + >>>>> + // fill(samplesVector.begin(),samplesVector.end(),false); >>>>> + >>>> You can remove commented lines. >>>> >>>>> + currentVertex = m_hashTableIterator.next(); >>>>> + Kmer kmer = currentVertex->getKey(); >>>>> + >>>>> + bytes += kmer.dump(buffer); >>>>> + >>>>> + currentVirtualColor = currentVertex->getVirtualColor(); >>>>> + set<PhysicalKmerColor> * samples = >>>>> m_colorSet.getPhysicalColors(currentVirtualColor); >>>>> + >>>>> + for(set<PhysicalKmerColor>:: iterator sampleIterator = >>>>> samples->begin(); >>>>> + sampleIterator != samples->end(); ++sampleIterator) { >>>>> + PhysicalKmerColor value = *sampleIterator; >>>>> + samplesVector[value] = true; >>>>> + // cout << " " << value; >>>>> + } >>>>> + >>>>> + for (std::vector<bool>::iterator it = >>>>> samplesVector.begin(); >>>>> + it != samplesVector.end(); ++it) { >>>>> + buffer[bytes] = *it; >>>>> + bytes++; >>>>> + } >>>>> + // buffer[bytes] = '\0'; >>>> You can remove commented lines. 
>>>> >>>>> + } >>>>> + >>>>> + >>>>> + Message message; >>>>> + message.setNumberOfBytes(bytes); >>>>> + message.setBuffer(buffer); >>>>> + >>>>> + // message.setTag(MatrixOwner::PUSH_KMERS_SAMPLES); >>>>> + if(m_hashTableIterator.hasNext()){ >>>>> + message.setTag(KmersMatrixOwner::PUSH_KMER_SAMPLES); >>>>> + }else{ >>>>> + message.setTag(KmersMatrixOwner::PUSH_KMER_SAMPLES_END); >>>>> + } >>>>> + >>>>> + // Message response; >>>>> + // response.setTag(MatrixOwner::PUSH_PAYLOAD_END); >>>>> + // send(m_matrixOwner, response); >>>>> + >>>>> + send(m_kmersMatrixOwner, message); >>>>> + >>>>> +} >>>>> + >>>>> diff --git a/code/Surveyor/StoreKeeper.h b/code/Surveyor/StoreKeeper.h >>>>> index e44cf98..36adf77 100644 >>>>> --- a/code/Surveyor/StoreKeeper.h >>>>> +++ b/code/Surveyor/StoreKeeper.h >>>>> @@ -34,6 +34,10 @@ >>>>> >>>>> #include <RayPlatform/actors/Actor.h> >>>>> #include <RayPlatform/structures/MyHashTable.h> >>>>> +#include <RayPlatform/structures/MyHashTableIterator.h> >>>>> + >>>>> +#include <iostream> >>>>> +#include <sstream> >>>>> >>>>> /** >>>>> * Provides genomic storage. 
>>>>> @@ -55,6 +59,7 @@ private: >>>>> >>>>> int m_mother; >>>>> int m_matrixOwner; >>>>> + int m_kmersMatrixOwner; >>>>> >>>>> bool m_configured; >>>>> >>>>> @@ -64,6 +69,8 @@ private: >>>>> */ >>>>> MyHashTable<Kmer,ExperimentVertex> m_hashTable; >>>>> >>>>> + MyHashTableIterator<Kmer,ExperimentVertex> m_hashTableIterator; >>>>> + >>>>> int m_kmerLength; >>>>> bool m_colorSpaceMode; >>>>> >>>>> @@ -79,6 +86,13 @@ private: >>>>> void printLocalGramMatrix(); >>>>> void printColorReport(); >>>>> >>>>> + /* ostringstream m_currentKmer; */ >>>>> + /* ostringstream m_currentSamplesKmers; */ >>>>> + int m_sampleSize; >>>>> + string m_outputKmersMatrixPath; >>>>> + void printLocalKmersMatrix(string & m_kmer, string & >>>>> m_samplesKmers); >>>>> + void sendKmersSamples(); >>>>> + >>>>> void sendMatrixCell(); >>>>> >>>>> public: >>>>> @@ -86,14 +100,19 @@ public: >>>>> StoreKeeper(); >>>>> ~StoreKeeper(); >>>>> >>>>> + void setOutputKmersMatrixPath(string pathPrefix); >>>>> + void setSamplesSize(int sampleSize); >>>>> + >>>>> void receive(Message & message); >>>>> >>>>> enum { >>>>> FIRST_TAG = 10250, >>>>> PUSH_SAMPLE_VERTEX, >>>>> PUSH_SAMPLE_VERTEX_OK, >>>>> - MERGE, >>>>> - MERGE_OK, >>>>> + MERGE_GRAM_MATRIX, >>>>> + MERGE_GRAM_MATRIX_OK, >>>>> + MERGE_KMERS_MATRIX, >>>>> + MERGE_KMERS_MATRIX_OK, >>>>> LAST_TAG >>>>> }; >>>>> }; |
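The quoted patch leaves a TODO ("check if all matrices are ready") and tracks readiness with a single boolean, m_matricesAreReady, which the review flags as confusing. One possible alternative, shown here only as a minimal sketch, is to count the matrices that are still pending and shut down when the last one reports in. The class PendingMatrixCounter and its methods are hypothetical names introduced for illustration; they are not part of Ray or RayPlatform.

```cpp
#include <cassert>

// Hypothetical helper, not part of Ray: counts how many matrices are still
// pending so that the shutdown is triggered exactly once, by the last one.
class PendingMatrixCounter {
public:
	explicit PendingMatrixCounter(int expectedMatrices)
		: m_pending(expectedMatrices) {
		assert(expectedMatrices > 0);
	}

	// Call once per *_MATRIX_IS_READY message.
	// Returns true only when the last expected matrix has arrived.
	bool markOneReady() {
		assert(m_pending > 0);
		m_pending--;
		return m_pending == 0;
	}

	// Number of matrices that have not reported yet.
	int pending() const {
		return m_pending;
	}

private:
	int m_pending;
};
```

Under this assumption, the two branches for GRAM_MATRIX_IS_READY and KMERS_MATRIX_IS_READY in Mother::receive would both reduce to "if markOneReady() returns true, call sendToFirstMother(SHUTDOWN, SHUTDOWN_OK)". The counter would be built with 1 for the default run and 2 when the k-mer matrix option is given, so no branch needs to know which matrix finished first.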
From: Maxime D. <max...@gm...> - 2014-03-07 15:57:26
|
Hi Seb,

You can find these modifications in the same branch as before:

https://github.com/Zorino/ray.git
patch-kmermatrix

On 03/07/2014 02:18 AM, Sébastien Boisvert wrote:
> Hi Maxime,
>
> I can't merge your code in master without making some changes first.
>
> As the maintainer, here are the changes that I will need to do to increase the value of your work (on
> your next pull request, you can think about some of these points if you want):
>
> Keep in mind that coding style is very important for readability.
> (you can check https://github.com/sebhtml/ray/blob/master/Documentation/CodingStyle.txt )
>
> Here are the changes that I will make tomorrow:
>
> - (C0) As discussed, we don't want to generate Surveyor/KmerMatrix.tsv by default because this is
> quite a large file. There needs to be an option.
>
> - (C1) Obviously, if you want people to use your Surveyor workflow, you need to add documentation.
> You just need to add some code in code/Mock/Parameters.cpp. After that you build Ray,
> then you run ./Ray -help > MANUAL_PAGE.txt

Yes, an option was already needed to launch it, so the file was not generated by default, but I forgot to add the option to the Parameters file (it is now -run-kmer-matrix). Done.

> - (C2) KmerMatrixOwner.h/cpp was done by you. So the copyright in the header belongs to you, not
> me.

Done.

>
> - (C2) there is some red showing up in 'git diff --color', nothing important though
> relevant commits ce6c272 & 4e4949
>
> - (C3) Copyright for code/Surveyor/SequenceKmerReader.cpp/.h should be in 2014

Done.

> - (C4) In code/Surveyor/StoreKeeper.h you must use tabulations and no spaces for indentation.
> for example, the line with MERGE_KMER_MATRIX uses spaces in StoreKeeper.h

Yes, my indent-mode in Emacs suddenly decided to insert 8 spaces instead of tabs. I corrected this and re-indented everything.

> - (C5) code/Surveyor/KmerMatrixOwner.cpp was using spaces instead of tabulations.
>
> - (C6) the method name KmerMatrixOwner::printLocalKmersMatrix is meaningless as it does not
> write a matrix, it writes a kmer.

It is now called dumpKmerMatrixBuffer, since it does not write a single kmer but several of them, using a utility function of yours called flushFileOperationBuffer.

> - (C7) you hard-coded the kmer value (31) !!! m_kmerMatrix << kmer.idToWord(31,0);
> you must avoid that ! [KmerMatrixOwner.cpp]

Big mistake here. I fixed it with m_parameters->getWordSize().

> - (C8) The FIRST_TAG for your reader is the same that is used by the graph reader. I think we said it
> was OK as long as you added a comment about the reason.
> code/Surveyor/GenomeAssemblyReader.h: FIRST_TAG = 10200,
> code/Surveyor/GenomeGraphReader.h: FIRST_TAG = 10200,

I added this comment:

// Using the same tag as GenomeGraphReader
// because we can mix an assembly reader with a graph reader

>
> Keep up the great work, but next time, I think you can make progress on respecting the coding style,
> among other things.

Thanks, I understand the importance you give to coding style.

> I will do a bunch of commits in my branch patch-kmermatrix (fetched from
> remotes/zorino/patch-kmermatrix) to address the comments above.
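Point (C7) above is easier to see with a small, self-contained example. The sketch below is not Ray's Kmer::idToWord; it is a hypothetical decoder written under an assumed encoding (2 bits per nucleotide, A=0, C=1, G=2, T=3, first nucleotide in the lowest bits, at most 32 nucleotides per 64-bit word). It only illustrates why the k-mer length has to come from the run's configuration rather than from a literal such as 31: decoding with the wrong length still produces a plausible-looking string.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for a k-mer decoder; Ray's own Kmer::idToWord is not
// reproduced here. Assumed encoding: 2 bits per nucleotide, A=0 C=1 G=2 T=3,
// first nucleotide in the lowest bits, kmerLength <= 32.
std::string decodeKmer(uint64_t packed, int kmerLength) {
	static const char symbols[] = {'A', 'C', 'G', 'T'};
	std::string word;
	for (int i = 0; i < kmerLength; ++i) {
		word += symbols[(packed >> (2 * i)) & 3];
	}
	return word;
}

// Inverse of decodeKmer under the same assumed encoding.
uint64_t encodeKmer(const std::string & word) {
	uint64_t packed = 0;
	for (std::string::size_type i = 0; i < word.size(); ++i) {
		uint64_t code = 0;
		switch (word[i]) {
			case 'C': code = 1; break;
			case 'G': code = 2; break;
			case 'T': code = 3; break;
			default: code = 0; // 'A' (and anything unexpected) maps to 0
		}
		packed |= code << (2 * i);
	}
	return packed;
}
```

With this sketch, a 4-mer decoded with length 4 round-trips correctly, while the same value decoded with length 6 comes back padded with extra 'A' characters and no error is raised. That silent failure is the reason to pass the configured value (m_parameters->getWordSize() in the fix described above) instead of a constant.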
>
>
> [boiseb01@ls30 ray]$ git diff master..remotes/zorino/patch-kmermatrix --stat
> code/Surveyor/GenomeAssemblyReader.cpp |   3 +-
> code/Surveyor/KmerMatrixOwner.cpp      | 157 ++++++++++++++++++++
> code/Surveyor/KmerMatrixOwner.h        |  72 +++++++++
> code/Surveyor/Makefile                 |   1 +
> code/Surveyor/MatrixOwner.cpp          |  19 +--
> code/Surveyor/MatrixOwner.h            |   3 +-
> code/Surveyor/Mother.cpp               | 253 +++++++++++++++++++++----------
> code/Surveyor/Mother.h                 |  17 ++-
> code/Surveyor/SequenceKmerReader.cpp   |  53 ++++++-
> code/Surveyor/SequenceKmerReader.h     |   2 +
> code/Surveyor/StoreKeeper.cpp          | 117 +++++++++++----
> code/Surveyor/StoreKeeper.h            |  20 +++-
> 12 files changed, 580 insertions(+), 137 deletions(-)
>
> Link: http://sourceforge.net/p/denovoassembler/mailman/denovoassembler-devel/thread/017C19ECDB98F64F99200D83876C44520111A7FC4FE7%40EXCH-MBX-B.ulaval.ca/#msg31999514
> Link: http://sourceforge.net/p/denovoassembler/mailman/denovoassembler-devel/thread/COL131-W760CDDC5071A415AB9C5EDAC870%40phx.gbl/#msg32015211
> Link: http://sourceforge.net/p/denovoassembler/mailman/denovoassembler-devel/thread/f2d8f53ef133555e36529047f4e95084%40boisvert.info/#msg31993547
> Link: http://sourceforge.net/p/denovoassembler/mailman/denovoassembler-devel/thread/5301F42B.6060904%40gmail.com/#msg31988453
>
> ----------------------------------------
>> Date: Wed, 5 Mar 2014 13:41:39 +0000
>> From: max...@gm...
>> To: se...@bo...
>> Subject: Re: [Denovoassembler-devel] git diff kmersmatrix branch
>>
>> Hi Sebastien,
>>
>> I have fixed the issue that occurred when scaffolds were given as input
>> instead of contigs in the SequenceKmerReader class.
>>
>> I also made the edits requested in this review.
>>
>> Please now pull from:
>>
>> https://github.com/Zorino/ray.git
>>
>> patch-kmermatrix
>>
>> Cheers,
>>
>> Maxime
>>
>>
>>
>> On 02/23/2014 12:16 PM, Sébastien Boisvert wrote:
>>> Hey Maxime,
>>>
>>> You did not provide KmersMatrixOwner.h, KmersMatrixOwner.cpp, and changes
>>> to the Surveyor Makefile.
>>> >>> >>> OTher comments are below. >>> >>> ---------------------------------------- >>>> Date: Sat, 22 Feb 2014 10:03:30 +0000 >>>> From: ma...@de... >>>> To: se...@bo... >>>> Subject: git diff kmersmatrix branch >>>> >>>> diff --git a/code/Surveyor/MatrixOwner.cpp b/code/Surveyor/MatrixOwner.cpp >>>> index ffaae00..47cf84a 100644 >>>> --- a/code/Surveyor/MatrixOwner.cpp >>>> +++ b/code/Surveyor/MatrixOwner.cpp >>>> @@ -65,9 +65,12 @@ void MatrixOwner::receive(Message & message) { >>>> assert(m_parameters != NULL); >>>> assert(m_sampleNames != NULL); >>>> #endif >>>> - >>>> m_mother = source; >>>> >>>> + //open the buffer of the file >>>> + // createKmersMatrixOutputFile(); >>>> + >>>> + >>>> } else if(tag == PUSH_PAYLOAD) { >>>> >>>> SampleIdentifier sample1 = -1; >>>> @@ -89,10 +92,10 @@ void MatrixOwner::receive(Message & message) { >>>> assert(count>= 0); >>>> #endif >>>> >>>> - /* >>>> + >>>> printName(); >>>> - cout << "DEBUG add " << sample1 << " " << sample2 << " " >>>> << count << endl; >>>> -*/ >>>> + // cout << "DEBUG add " << sample1 << " " << sample2 << >>> Commented lines should be removed. >>> >>>> " " << count << endl; >>>> + >>>> m_receivedPayloads ++; >>>> >>>> m_localGramMatrix[sample1][sample2] += count; >>>> @@ -100,14 +103,14 @@ void MatrixOwner::receive(Message & message) { >>>> Message response; >>>> response.setTag(PUSH_PAYLOAD_OK); >>>> send(source, response); >>>> + } >>>> + else if(tag == PUSH_PAYLOAD_END) { >>> Use '} else if (' and not '} >>> else if' >>> >>> >>> This is the coding style of the project. >>> see https://github.com/sebhtml/ray/blob/master/Documentation/CodingStyle.txt >>> >>> * Kernighan and Ritchie style, variant "The One True Brace Style" (1TBS) >>> http://en.wikipedia.org/wiki/Indent_style#Variant:_1TBS >>> >>> >>> (you used K&R Variant: Stroustrup). >>> >>>> - } else if(tag == PUSH_PAYLOAD_END) { >>>> - >>>> + cout << "PUSH_PAYLOAD_END" <<endl; >>> Remove this debug message. 
>>> >>>> m_completedStoreActors++; >>>> >>>> if(m_completedStoreActors == getSize()) { >>>> >>>> - >>>> printName(); >>>> cout << "MatrixOwner received " << >>>> m_receivedPayloads << " payloads" << endl; >>>> >>>> @@ -151,10 +154,9 @@ void MatrixOwner::receive(Message & message) { >>>> >>>> >>>> // tell Mother that the matrix is ready now. >>>> - >>>> - Message coolMessage; >>>> - coolMessage.setTag(MATRIX_IS_READY); >>>> - send(m_mother, coolMessage); >>>> + Message coolMessage; >>>> + coolMessage.setTag(GRAM_MATRIX_IS_READY); >>>> + send(m_mother, coolMessage); >>>> >>>> >>>> // clear matrices >>>> @@ -275,3 +277,4 @@ void >>>> MatrixOwner::printLocalGramMatrixWithHash(ostream & stream, map<SampleIdent >>>> stream << endl; >>>> } >>>> } >>>> + >>>> diff --git a/code/Surveyor/MatrixOwner.h b/code/Surveyor/MatrixOwner.h >>>> index ceb17e2..afa9278 100644 >>>> --- a/code/Surveyor/MatrixOwner.h >>>> +++ b/code/Surveyor/MatrixOwner.h >>>> @@ -28,6 +28,7 @@ >>>> >>>> #include <map> >>>> #include <iostream> >>>> +#include <sstream> >>>> using namespace std; >>>> >>>> class MatrixOwner : public Actor { >>>> @@ -62,7 +63,7 @@ public: >>>> PUSH_PAYLOAD, >>>> PUSH_PAYLOAD_OK, >>>> PUSH_PAYLOAD_END, >>>> - MATRIX_IS_READY, >>>> + GRAM_MATRIX_IS_READY, >>>> LAST_TAG >>>> }; >>>> >>>> diff --git a/code/Surveyor/Mother.cpp b/code/Surveyor/Mother.cpp >>>> index 4d2ef9c..103a583 100644 >>>> --- a/code/Surveyor/Mother.cpp >>>> +++ b/code/Surveyor/Mother.cpp >>>> @@ -27,6 +27,7 @@ >>>> #include "GenomeGraphReader.h" >>>> #include "GenomeAssemblyReader.h" >>>> #include "MatrixOwner.h" >>>> +#include "KmersMatrixOwner.h" >>>> >>>> #include <RayPlatform/cryptography/crypto.h> >>>> >>>> @@ -39,11 +40,13 @@ using namespace std; >>>> #define INPUT_TYPE_GRAPH 0 >>>> #define INPUT_TYPE_ASSEMBLY 1 >>>> >>>> - >>>> Mother::Mother() { >>>> >>>> m_coalescenceManager = -1; >>>> m_matrixOwner = -1; >>>> + m_kmersMatrixOwner = -1; >>>> + >>>> + // m_matricesAreReady = true; >>> Remove this 
commented line. >>> >>>> m_parameters = NULL; >>>> m_bigMother = -1; >>>> @@ -91,7 +94,7 @@ void Mother::receive(Message & message) { >>>> notifyController(); >>>> } >>>> >>>> - } else if(tag == MERGE) { >>>> + } else if(tag == MERGE_GRAM_MATRIX) { >>>> >>>> int matrixOwner = -1; >>>> memcpy(&matrixOwner, buffer, sizeof(matrixOwner)); >>>> @@ -102,7 +105,7 @@ void Mother::receive(Message & message) { >>>> #endif >>>> >>>> Message theMessage; >>>> - theMessage.setTag(StoreKeeper::MERGE); >>>> + theMessage.setTag(StoreKeeper::MERGE_GRAM_MATRIX); >>>> theMessage.setBuffer(&matrixOwner); >>>> theMessage.setNumberOfBytes(sizeof(matrixOwner)); >>>> >>>> @@ -111,10 +114,33 @@ void Mother::receive(Message & message) { >>>> send(destination, theMessage); >>>> >>>> Message response; >>>> - response.setTag(MERGE_OK); >>>> + response.setTag(MERGE_GRAM_MATRIX_OK); >>>> + send(source, response); >>>> + >>>> + } else if (tag == MERGE_KMERS_MATRIX) { >>>> + >>>> + int kmersMatrixOwner = -1; >>>> + memcpy(&kmersMatrixOwner, buffer, sizeof(kmersMatrixOwner)); >>>> + >>>> +#ifdef CONFIG_ASSERT >>>> + assert(kmersMatrixOwner>= 0); >>>> + assert(m_storeKeepers.size() == 1); >>>> +#endif >>>> + >>>> + Message theMessage; >>>> + theMessage.setTag(StoreKeeper::MERGE_KMERS_MATRIX); >>> The name should be MERGE_KMER_MATRIX and not MERGE_KMERS_MATRIX. 
>>> >>>> + theMessage.setBuffer(&kmersMatrixOwner); >>>> + theMessage.setNumberOfBytes(sizeof(kmersMatrixOwner)); >>>> + >>>> + int destination = m_storeKeepers[0]; >>>> + >>>> + send(destination, theMessage); >>>> + >>>> + Message response; >>>> + response.setTag(MERGE_KMERS_MATRIX_OK); >>>> send(source, response); >>>> >>>> - } else if(tag == SHUTDOWN) { >>>> + } else if(tag == SHUTDOWN) { >>>> >>>> Message response; >>>> response.setTag(SHUTDOWN_OK); >>>> @@ -122,18 +148,16 @@ void Mother::receive(Message & message) { >>>> >>>> stop(); >>>> >>>> - } else if(tag == StoreKeeper::MERGE_OK) { >>>> + } else if(tag == StoreKeeper::MERGE_GRAM_MATRIX_OK) { >>>> >>>> // TODO: the bug https://github.com/sebhtml/ray/issues/216 >>>> // is caused by the fact that this message is not >>>> // received . >>>> >>>> - /* >>>> - Message newMessage; >>>> - newMessage.setTag(MERGE_OK); >>>> + // Message newMessage; >>>> + // newMessage.setTag(MERGE_OK); >>>> >>>> - send(m_bigMother, newMessage); >>>> - */ >>>> + // send(m_bigMother, newMessage); >>> Remove these commented lines. >>> >>>> } else if(tag == FINISH_JOB) { >>>> >>>> @@ -153,6 +177,7 @@ void Mother::receive(Message & message) { >>>> >>>> sendToFirstMother(FLUSH_AGGREGATOR, >>>> FLUSH_AGGREGATOR_RETURN); >>>> } >>>> + >>>> } else if(tag == FLUSH_AGGREGATOR) { >>>> >>>> /* >>>> @@ -188,64 +213,52 @@ void Mother::receive(Message & message) { >>>> cout << "DEBUG sending FLUSH_AGGREGATOR_OK to >>>> m_bigMother" << endl; >>>> */ >>>> >>>> - } else if(tag == MatrixOwner::MATRIX_IS_READY) { >>>> + } else if(tag == MatrixOwner::GRAM_MATRIX_IS_READY) { >>>> + >>>> + //TODO : check if all matrices are ready >>>> + if(m_matricesAreReady){ >>>> + sendToFirstMother(SHUTDOWN, SHUTDOWN_OK); >>>> + }else { >>>> + cout << "GRAM_MATRIX_IS_READY" << endl; >>> When an actor speak, you must print its name too in stdout. >>> (with printName()). 
>>> >>>> + m_matricesAreReady = true; >>>> + } >>>> >>>> - sendToFirstMother(SHUTDOWN, SHUTDOWN_OK); >>>> + } >>>> + else if(tag == KmersMatrixOwner::KMERS_MATRIX_IS_READY) { >>>> + >>>> + cout << "KMERS_MATRIX_IS_READY" << endl; >>>> + if(m_matricesAreReady){ >>>> + sendToFirstMother(SHUTDOWN, SHUTDOWN_OK); >>>> + }else { >>>> + cout << "KMERS_MATRIX_IS_READY" << endl; >>>> + m_matricesAreReady = true; >>> In one comment above, I saw that m_matricesAreReady = false >>> was commented. Check that out. >>> >>>> + } >>>> >>>> } else if(tag == FLUSH_AGGREGATOR_OK) { >>>> >>>> - /* >>>> printName(); >>>> cout << "DEBUG received FLUSH_AGGREGATOR_OK" << endl; >>>> - */ >>>> >>>> m_flushedMothers++; >>>> >>>> if(m_flushedMothers < getSize()) >>>> return; >>>> >>>> - // spawn the MatrixOwner here ! >>>> - >>>> - MatrixOwner * matrixOwner = new MatrixOwner(); >>>> - spawn(matrixOwner); >>>> - >>>> - m_matrixOwner = matrixOwner->getName(); >>>> - >>>> - printName(); >>>> - cout << "Spawned MatrixOwner actor !" << endl; >>>> - >>>> - // tell the StoreKeeper actors to send their stuff to the >>>> - // MatrixOwner actor >>>> - // The Mother of Mother will wait for a signal from >>>> MatrixOwner >>>> - >>>> - Message greetingMessage; >>>> - >>>> - vector<string> * names = & m_sampleNames; >>>> - >>>> - char buffer[32]; >>>> - int offset = 0; >>>> - memcpy(buffer + offset, &m_parameters, >>>> sizeof(m_parameters)); >>>> - offset += sizeof(m_parameters); >>>> - memcpy(buffer + offset, &names, sizeof(names)); >>>> - offset += sizeof(names); >>>> - >>>> - greetingMessage.setBuffer(&buffer); >>>> - greetingMessage.setNumberOfBytes(offset); >>>> - >>>> - greetingMessage.setTag(MatrixOwner::GREETINGS); >>>> - send(m_matrixOwner, greetingMessage); >>>> - >>>> - sendToFirstMother(MERGE, MERGE_OK); >>>> - >>>> + spawnMatrixOwner(); >>> I like that. A method to spawn an actor. Good ! 
>>> >>>> } else if(tag == m_responseTag) { >>>> >>>> - >>>> if(m_responseTag == SHUTDOWN_OK) { >>>> >>>> - } else if(m_responseTag == MERGE_OK) { >>>> - >>>> - } else if(m_responseTag == FLUSH_AGGREGATOR_RETURN) { >>>> + } else if(m_responseTag == MERGE_GRAM_MATRIX_OK) { >>>> + // All mothers merged their GRAM MATRIX >>>> + // Spawn KmersMatrixOwner to print >>>> + if(m_motherToKill < getSize() && >>>> m_printKmersMatrix){ >>>> + spawnKmersMatrixOwner(); >>>> + } >>>> + } else if(m_responseTag == MERGE_KMERS_MATRIX_OK) { >>>> + } >>>> + else if(m_responseTag == FLUSH_AGGREGATOR_RETURN) { >>> Again, put closing brace on same line (} else if ( ...) { >>> >>>> /* >>>> printName(); >>>> @@ -254,9 +267,8 @@ void Mother::receive(Message & message) { >>>> */ >>>> } >>>> >>>> - // every mother was informed. >>>> + // every mother was not informed. >>> Good catch ! >>> >>>> if(m_motherToKill>= getSize()) { >>>> - >>>> sendMessageWithReply(m_motherToKill, m_forwardTag); >>>> m_motherToKill--; >>>> } >>>> @@ -284,11 +296,15 @@ void Mother::sendMessageWithReply(int & actor, int >>>> tag) { >>>> Message message; >>>> message.setTag(tag); >>>> >>>> - if(tag == MERGE) { >>>> + if(tag == MERGE_GRAM_MATRIX) { >>>> message.setBuffer(&m_matrixOwner); >>>> message.setNumberOfBytes(sizeof(m_matrixOwner)); >>>> - >>>> - } else if(tag == FLUSH_AGGREGATOR) { >>>> + } >>>> + else if(tag == MERGE_KMERS_MATRIX) { >>>> + message.setBuffer(&m_kmersMatrixOwner); >>>> + message.setNumberOfBytes(sizeof(m_kmersMatrixOwner)); >>>> + } >>>> + else if(tag == FLUSH_AGGREGATOR) { >>>> >>>> /* >>>> printName(); >>>> @@ -328,6 +344,10 @@ void Mother::stop() { >>>> m_matrixOwner = -1; >>>> } >>>> >>>> + if(m_kmersMatrixOwner>= 0) { >>>> + send(m_kmersMatrixOwner, kill); >>>> + m_kmersMatrixOwner = -1; >>>> + } >>>> >>>> die(); >>>> >>>> @@ -410,39 +430,44 @@ void Mother::startSurveyor() { >>>> >>>> bool isRoot = (getName() % getSize()) == 0; >>>> >>>> - //cout << "DEBUG startSurveyor isRoot" << isRoot << 
endl; >>>> - >>>> - // get a list of files. >>>> + // Set matricesAreReady to true in case user doesn't want >>>> + // to print out kmers matrix. >>>> + m_matricesAreReady = true; >>>> >>>> vector<string> * commands = m_parameters->getCommands(); >>>> >>>> - >>>> for(int i = 0 ; i < (int) commands->size() ; ++i) { >>>> >>>> string & element = commands->at(i); >>>> >>>> - // DONE: Check bounds for file names >>>> + if (element != "-print-kmers-matrix") { >>> The name should be kmer-matrix, not kmers-matrix. >>> >>> It is like groceries store vs grocery store. >>> >>>> + // DONE: Check bounds for file names >>>> >>>> - map<string,int> fastTable; >>>> + map<string,int> fastTable; >>>> >>>> - fastTable["-read-sample-graph"] = INPUT_TYPE_GRAPH; >>>> - fastTable["-read-sample-assembly"] = INPUT_TYPE_ASSEMBLY; >>>> + fastTable["-read-sample-graph"] = INPUT_TYPE_GRAPH; >>>> + fastTable["-read-sample-assembly"] = >>>> INPUT_TYPE_ASSEMBLY; >>>> >>>> - // Unsupported option >>>> - if(fastTable.count(element) == 0 || i+2> (int) >>>> commands->size()) >>>> - continue; >>>> + // Unsupported option >>>> + if(fastTable.count(element) == 0 || i+2> (int) >>>> commands->size()) >>>> + continue; >>>> >>>> - string sampleName = commands->at(++i); >>>> - string fileName = commands->at(++i); >>>> + string sampleName = commands->at(++i); >>>> + string fileName = commands->at(++i); >>>> >>>> - m_sampleNames.push_back(sampleName); >>>> + m_sampleNames.push_back(sampleName); >>>> >>>> - // DONE implement this m_assemblyFileNames + type >>>> - m_inputFileNames.push_back(fileName); >>>> + // DONE implement this m_assemblyFileNames + type >>>> + m_inputFileNames.push_back(fileName); >>>> >>>> - int type = fastTable[element]; >>>> + int type = fastTable[element]; >>>> >>>> - m_sampleInputTypes.push_back(type); >>>> + m_sampleInputTypes.push_back(type); >>>> + >>>> + } else { >>>> + m_matricesAreReady = false; >>>> + m_printKmersMatrix = true; >>> Question: if m_printKmersMatrix is false, I 
suppose the code >>> follows the usual path of printing just one matrix, right ? >>> >>>> + } >>>> >>>> } >>>> >>>> @@ -468,6 +493,9 @@ void Mother::startSurveyor() { >>>> >>>> m_storeKeepers.push_back(actor->getName()); >>>> >>>> + actor->setOutputKmersMatrixPath(m_parameters->getPrefix()); >>> The path should be prefix/Surveyor/<whatever the kmer matrix's name is> >>> >>>> + actor->setSamplesSize(m_sampleNames.size()); >>> sample size, not samples size. >>> >>>> + >>>> // tell the CoalescenceManager about the local StoreKeeper >>>> Message dummyMessage; >>>> int localStore = actor->getName(); >>>> @@ -568,6 +596,80 @@ void Mother::spawnReader() { >>>> } >>>> } >>>> >>>> + >>>> +void Mother::spawnMatrixOwner() { >>>> + >>>> + // spawn the MatrixOwner here ! >>>> + MatrixOwner * matrixOwner = new MatrixOwner(); >>>> + spawn(matrixOwner); >>>> + >>>> + m_matrixOwner = matrixOwner->getName(); >>>> + >>>> + printName(); >>>> + cout << "Spawned MatrixOwner actor !" << m_matrixOwner << endl; >>>> + >>>> + // tell the StoreKeeper actors to send their stuff to the >>>> + // MatrixOwner actor >>>> + // The Mother of Mother will wait for a signal from MatrixOwner >>>> + >>>> + Message greetingMessage; >>>> + >>>> + vector<string> * names = & m_sampleNames; >>>> + >>>> + char buffer[32]; >>>> + int offset = 0; >>>> + memcpy(buffer + offset, &m_parameters, sizeof(m_parameters)); >>>> + offset += sizeof(m_parameters); >>>> + memcpy(buffer + offset, &names, sizeof(names)); >>>> + offset += sizeof(names); >>>> + >>>> + greetingMessage.setBuffer(&buffer); >>>> + greetingMessage.setNumberOfBytes(offset); >>>> + >>>> + greetingMessage.setTag(MatrixOwner::GREETINGS); >>>> + send(m_matrixOwner, greetingMessage); >>>> + >>>> + sendToFirstMother(MERGE_GRAM_MATRIX, MERGE_GRAM_MATRIX_OK); >>>> +} >>>> + >>>> +void Mother::spawnKmersMatrixOwner() { >>>> + >>>> + // spawn the MatrixOwner here ! 
>>>> + KmersMatrixOwner * kmersMatrixOwner = new KmersMatrixOwner(); >>>> + spawn(kmersMatrixOwner); >>>> + >>>> + m_kmersMatrixOwner = kmersMatrixOwner->getName(); >>>> + >>>> + printName(); >>>> + cout << "Spawned KmersMatrixOwner actor !" << >>>> m_kmersMatrixOwner << endl; >>>> + >>>> + // tell the StoreKeeper actors to send their stuff to the >>>> + // KmersMatrixOwner actor >>>> + // The Mother of Mother will wait for a signal from MatrixOwner >>>> + >>>> + Message greetingMessage; >>>> + >>>> + vector<string> * names = & m_sampleNames; >>>> + >>>> + char buffer[32]; >>>> + int offset = 0; >>>> + memcpy(buffer + offset, &m_parameters, sizeof(m_parameters)); >>>> + offset += sizeof(m_parameters); >>>> + memcpy(buffer + offset, &names, sizeof(names)); >>>> + offset += sizeof(names); >>>> + >>>> + greetingMessage.setBuffer(&buffer); >>>> + greetingMessage.setNumberOfBytes(offset); >>>> + >>>> + greetingMessage.setTag(KmersMatrixOwner::GREETINGS); >>>> + send(m_kmersMatrixOwner, greetingMessage); >>>> + >>>> + sendToFirstMother(MERGE_KMERS_MATRIX, MERGE_KMERS_MATRIX_OK); >>>> + >>>> +} >>>> + >>>> + >>>> void Mother::setParameters(Parameters * parameters) { >>>> m_parameters = parameters; >>>> } >>>> + >>>> diff --git a/code/Surveyor/Mother.h b/code/Surveyor/Mother.h >>>> index 092920f..9774c4b 100644 >>>> --- a/code/Surveyor/Mother.h >>>> +++ b/code/Surveyor/Mother.h >>>> @@ -28,6 +28,7 @@ >>>> >>>> #include <vector> >>>> #include <string> >>>> +#include <iostream> >>>> using namespace std; >>>> >>>> /** >>>> @@ -55,9 +56,12 @@ class Mother: public Actor { >>>> private: >>>> >>>> int m_matrixOwner; >>>> + int m_kmersMatrixOwner; >>>> >>>> int m_flushedMothers; >>>> int m_finishedMothers; >>>> + bool m_matricesAreReady; >>>> + bool m_printKmersMatrix; >>>> >>>> Parameters * m_parameters; >>>> >>>> @@ -93,6 +97,13 @@ private: >>>> */ >>>> void sendToFirstMother(int forwardTag, int responseTag); >>>> >>>> + /* int m_kmersMatrixBlocNumber; */ >>>> + void 
printLocalKmersMatrix(string & kmer, string & >>>> samples_kmers, bool force); >>>> + void createKmersMatrixOutputFile(); >>>> + >>>> + void spawnMatrixOwner(); >>>> + void spawnKmersMatrixOwner(); >>>> + >>> That's a good design -- private methods for private uses. >>> >>>> public: >>>> >>>> Mother(); >>>> @@ -109,8 +120,10 @@ public: >>>> FLUSH_AGGREGATOR, >>>> FLUSH_AGGREGATOR_OK, >>>> FLUSH_AGGREGATOR_RETURN, >>>> - MERGE, >>>> - MERGE_OK, >>>> + MERGE_GRAM_MATRIX, >>>> + MERGE_GRAM_MATRIX_OK, >>>> + MERGE_KMERS_MATRIX, >>>> + MERGE_KMERS_MATRIX_OK, >>> kmer matrix, not kmers matrix. >>> >>>> LAST_TAG, >>>> }; >>>> >>>> diff --git a/code/Surveyor/StoreKeeper.cpp b/code/Surveyor/StoreKeeper.cpp >>>> index 84eef34..492208c 100644 >>>> --- a/code/Surveyor/StoreKeeper.cpp >>>> +++ b/code/Surveyor/StoreKeeper.cpp >>>> @@ -22,10 +22,16 @@ >>>> #include "StoreKeeper.h" >>>> #include "CoalescenceManager.h" >>>> #include "MatrixOwner.h" >>>> +#include "KmersMatrixOwner.h" >>>> >>>> #include <code/VerticesExtractor/Vertex.h> >>>> +#include <RayPlatform/structures/MyHashTableIterator.h> >>>> +#include <RayPlatform/core/OperatingSystem.h> >>>> >>>> #include <iostream> >>>> +#include <sstream> >>>> +#include <iomanip> >>>> +#include <fstream> >>>> using namespace std; >>>> >>>> #include <string.h> >>>> @@ -83,15 +89,21 @@ void StoreKeeper::receive(Message & message) { >>>> >>>> die(); >>>> >>>> - } else if(tag == MERGE) { >>>> + } else if(tag == MERGE_GRAM_MATRIX) { >>>> >>>> >>>> - printName(); >>>> - cout << "DEBUG at MERGE message reception "; >>>> - cout << "(StoreKeeper) received " << m_receivedObjects >>>> << " objects in total"; >>>> - cout << " with " << m_receivedPushes << " push >>>> operations" << endl; >>>> + // printName(); >>>> + // cout << "DEBUG at MERGE_GRAM_MATRIX message reception "; >>>> + // cout << "(StoreKeeper) received " << >>>> m_receivedObjects << " objects in total"; >>>> + // cout << " with " << m_receivedPushes << " push >>> You can remove 
commented lines. >>> >>>> operations" << endl; >>>> computeLocalGramMatrix(); >>>> >>>> + >>>> + // TODEL Print matrix bloc >>>> + // m_kmersMatrixBlocNumber = 0; >>>> + // printLocalKmersMatrix(); >>>> + >>> You can remove commented lines. >>> >>>> + >>>> m_mother = source; >>>> >>>> memcpy(&m_matrixOwner, buffer, sizeof(m_matrixOwner)); >>>> @@ -108,19 +120,32 @@ void StoreKeeper::receive(Message & message) { >>>> m_iterator2 = m_iterator1->second.begin(); >>>> } >>>> >>>> - /* >>>> - printName(); >>>> - cout << "DEBUG printLocalGramMatrix before first >>>> sendMatrixCell" << endl; >>>> - printLocalGramMatrix(); >>>> - */ >>>> - >>>> + // printName(); >>>> + // cout << "DEBUG printLocalGramMatrix before first >>>> sendMatrixCell" << endl; >>>> + // printLocalGramMatrix(); >>> You can remove commented lines. >>> >>>> sendMatrixCell(); >>>> >>>> } else if(tag == MatrixOwner::PUSH_PAYLOAD_OK) { >>>> - >>>> sendMatrixCell(); >>>> + } else if(tag == MERGE_KMERS_MATRIX) { >>>> + // cout << "DEBUG at MERGE_GRAM_MATRIX message reception "; >>>> + // cout << "(StoreKeeper) received " << >>>> m_receivedObjects << " objects in total"; >>>> + // cout << " with " << m_receivedPushes << " push >>>> operations" << endl; >>> You can remove commented lines. >>> >>> Otherwise, add a "#ifdef DEBUG_SOMETHING_SOMETHING / #endif around that lines". 
>>> >>> >>>> + >>>> + m_mother = source; >>>> >>>> - } else if(tag == CoalescenceManager::SET_KMER_LENGTH) { >>>> + memcpy(&m_kmersMatrixOwner, buffer, >>>> sizeof(m_kmersMatrixOwner)); >>>> + >>>> + m_hashTableIterator.constructor(&m_hashTable); >>>> + >>>> + sendKmersSamples(); >>>> + } >>>> + else if (tag == KmersMatrixOwner::PUSH_KMER_SAMPLES_END) { >>>> + } >>>> + else if(tag == KmersMatrixOwner::PUSH_KMER_SAMPLES_OK) { >>>> + sendKmersSamples(); >>>> + } >>>> + else if(tag == CoalescenceManager::SET_KMER_LENGTH) { >>>> >>>> int kmerLength = 0; >>>> int position = 0; >>>> @@ -181,8 +206,6 @@ void StoreKeeper::sendMatrixCell() { >>>> message.setNumberOfBytes(offset); >>>> message.setTag(MatrixOwner::PUSH_PAYLOAD); >>>> >>>> - //cout << " DEBUG send PUSH_PAYLOAD to " << >>>> m_matrixOwner << endl; >>>> - >>>> send(m_matrixOwner, message); >>>> >>>> m_iterator2++; >>>> @@ -207,10 +230,7 @@ void StoreKeeper::sendMatrixCell() { >>>> // free memory. >>>> m_localGramMatrix.clear(); >>>> >>>> - /* >>>> printName(); >>>> - cout << "DEBUG send PUSH_PAYLOAD_END to " << m_matrixOwner << endl; >>>> - */ >>>> >>>> Message response; >>>> response.setTag(MatrixOwner::PUSH_PAYLOAD_END); >>>> @@ -236,6 +256,7 @@ void StoreKeeper::configureHashTable() { >>>> ); >>>> >>>> m_configured = true; >>>> + >>>> } >>>> >>>> void StoreKeeper::printColorReport() { >>>> @@ -375,6 +396,7 @@ void StoreKeeper::computeLocalGramMatrix() { >>>> //printLocalGramMatrix(); >>>> } >>>> >>>> + >>>> void StoreKeeper::printLocalGramMatrix() { >>>> >>>> printName(); >>>> @@ -623,3 +645,73 @@ void StoreKeeper::storeData(Vertex & vertex, int & >>>> sample) { >>>> >>>> */ >>>> } >>>> + >>>> + >>>> +void StoreKeeper::setSamplesSize(int sampleSize) { >>>> + m_sampleSize = sampleSize; >>>> +} >>>> + >>>> +void StoreKeeper::setOutputKmersMatrixPath(string pathPrefix) { >>>> + // m_outputKmersMatrixPath = pathPrefix; >>>> + // m_outputKmersMatrixPath += "/KmersMatrixDump/"; >>>> + // 
createDirectory(m_outputKmersMatrixPath.c_str()); >>> You can remove commented lines. >>> >>> >>> This file could be in prefix/Surveyor/<...> >>> >>>> +} >>>> + >>>> + >>>> +void StoreKeeper::sendKmersSamples() { >>>> + >>>> + char buffer[4000]; >>> For portability, use MAXIMUM_MESSAGE_SIZE_IN_BYTES instead of 4000. >>> >>>> + int bytes = 0; >>>> + >>>> + ExperimentVertex * currentVertex = NULL; >>>> + VirtualKmerColorHandle currentVirtualColor = NULL_VIRTUAL_COLOR; >>>> + >>>> + vector<bool> samplesVector (m_sampleSize, false); >>>> + >>>> + if(m_hashTableIterator.hasNext()){ >>>> + >>>> + // fill(samplesVector.begin(),samplesVector.end(),false); >>>> + >>> You can remove commented lines. >>> >>>> + currentVertex = m_hashTableIterator.next(); >>>> + Kmer kmer = currentVertex->getKey(); >>>> + >>>> + bytes += kmer.dump(buffer); >>>> + >>>> + currentVirtualColor = currentVertex->getVirtualColor(); >>>> + set<PhysicalKmerColor> * samples = >>>> m_colorSet.getPhysicalColors(currentVirtualColor); >>>> + >>>> + for(set<PhysicalKmerColor>:: iterator sampleIterator = >>>> samples->begin(); >>>> + sampleIterator != samples->end(); ++sampleIterator) { >>>> + PhysicalKmerColor value = *sampleIterator; >>>> + samplesVector[value] = true; >>>> + // cout << " " << value; >>>> + } >>>> + >>>> + for (std::vector<bool>::iterator it = >>>> samplesVector.begin(); >>>> + it != samplesVector.end(); ++it) { >>>> + buffer[bytes] = *it; >>>> + bytes++; >>>> + } >>>> + // buffer[bytes] = '\0'; >>> You can remove commented lines. 
>>> >>>> + } >>>> + >>>> + >>>> + Message message; >>>> + message.setNumberOfBytes(bytes); >>>> + message.setBuffer(buffer); >>>> + >>>> + // message.setTag(MatrixOwner::PUSH_KMERS_SAMPLES); >>>> + if(m_hashTableIterator.hasNext()){ >>>> + message.setTag(KmersMatrixOwner::PUSH_KMER_SAMPLES); >>>> + }else{ >>>> + message.setTag(KmersMatrixOwner::PUSH_KMER_SAMPLES_END); >>>> + } >>>> + >>>> + // Message response; >>>> + // response.setTag(MatrixOwner::PUSH_PAYLOAD_END); >>>> + // send(m_matrixOwner, response); >>>> + >>>> + send(m_kmersMatrixOwner, message); >>>> + >>>> +} >>>> + >>>> diff --git a/code/Surveyor/StoreKeeper.h b/code/Surveyor/StoreKeeper.h >>>> index e44cf98..36adf77 100644 >>>> --- a/code/Surveyor/StoreKeeper.h >>>> +++ b/code/Surveyor/StoreKeeper.h >>>> @@ -34,6 +34,10 @@ >>>> >>>> #include <RayPlatform/actors/Actor.h> >>>> #include <RayPlatform/structures/MyHashTable.h> >>>> +#include <RayPlatform/structures/MyHashTableIterator.h> >>>> + >>>> +#include <iostream> >>>> +#include <sstream> >>>> >>>> /** >>>> * Provides genomic storage. 
>>>> @@ -55,6 +59,7 @@ private: >>>> >>>> int m_mother; >>>> int m_matrixOwner; >>>> + int m_kmersMatrixOwner; >>>> >>>> bool m_configured; >>>> >>>> @@ -64,6 +69,8 @@ private: >>>> */ >>>> MyHashTable<Kmer,ExperimentVertex> m_hashTable; >>>> >>>> + MyHashTableIterator<Kmer,ExperimentVertex> m_hashTableIterator; >>>> + >>>> int m_kmerLength; >>>> bool m_colorSpaceMode; >>>> >>>> @@ -79,6 +86,13 @@ private: >>>> void printLocalGramMatrix(); >>>> void printColorReport(); >>>> >>>> + /* ostringstream m_currentKmer; */ >>>> + /* ostringstream m_currentSamplesKmers; */ >>>> + int m_sampleSize; >>>> + string m_outputKmersMatrixPath; >>>> + void printLocalKmersMatrix(string & m_kmer, string & >>>> m_samplesKmers); >>>> + void sendKmersSamples(); >>>> + >>>> void sendMatrixCell(); >>>> >>>> public: >>>> @@ -86,14 +100,19 @@ public: >>>> StoreKeeper(); >>>> ~StoreKeeper(); >>>> >>>> + void setOutputKmersMatrixPath(string pathPrefix); >>>> + void setSamplesSize(int sampleSize); >>>> + >>>> void receive(Message & message); >>>> >>>> enum { >>>> FIRST_TAG = 10250, >>>> PUSH_SAMPLE_VERTEX, >>>> PUSH_SAMPLE_VERTEX_OK, >>>> - MERGE, >>>> - MERGE_OK, >>>> + MERGE_GRAM_MATRIX, >>>> + MERGE_GRAM_MATRIX_OK, >>>> + MERGE_KMERS_MATRIX, >>>> + MERGE_KMERS_MATRIX_OK, >>>> LAST_TAG >>>> }; >>>> }; >> |
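Note on the "check if all matrices are ready" TODO and the review question about m_matricesAreReady in the diff quoted above: one possible way to make the shutdown condition explicit is to count the matrices that are still expected rather than toggling a single boolean. The following is only a minimal sketch under that assumption. MatrixReadiness, markOneReady() and pending() are hypothetical names that do not exist in Ray or RayPlatform, and the sketch does not use the actor API at all.

```cpp
#include <cassert>

// Hypothetical helper (not part of Ray): tracks how many matrices
// must still be reported before Mother may start the shutdown.
class MatrixReadiness {
public:
	// The Gram matrix is always produced; the kmer matrix is
	// expected only when the user asked for it on the command line.
	explicit MatrixReadiness(bool printKmerMatrix)
		: m_pending(printKmerMatrix ? 2 : 1) {
	}

	// Call once for each *_MATRIX_IS_READY message received.
	// Returns true when the last expected matrix has arrived.
	bool markOneReady() {
		assert(m_pending > 0);
		m_pending--;
		return m_pending == 0;
	}

	// Number of matrices that have not been reported yet.
	int pending() const {
		return m_pending;
	}

private:
	int m_pending;
};
```

With such a counter, both the GRAM_MATRIX_IS_READY and the KMERS_MATRIX_IS_READY branches could call markOneReady() and send SHUTDOWN only when it returns true, so the order in which the two messages arrive would not matter.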
From: Sébastien B. <se...@bo...> - 2014-03-07 02:25:36
Hi Maxime,

How do I activate your new code ?

I ran Surveyor on some graphs to check if the kmer matrix would be generated by default (there is no option to turn it on or off).

But I only got the usual files:

[boiseb01@ls30 Surveyor]$ find RaySurveyorResults/Surveyor
RaySurveyorResults/Surveyor
RaySurveyorResults/Surveyor/SimilarityMatrix.tsv
RaySurveyorResults/Surveyor/DistanceMatrix.tsv

----------------------------------------
> Date: Wed, 5 Mar 2014 13:41:39 +0000
> From: max...@gm...
> To: se...@bo...
> Subject: Re: [Denovoassembler-devel] git diff kmersmatrix branch
>
> Hi Sebastien,
>
> I have fixed the issue that occurred when scaffolds were given as input instead of
> contigs in the SequenceKmerReader class.
>
> I also made the edits according to this review.
>
> Please now pull from :
>
> https://github.com/Zorino/ray.git
>
> patch-kmermatrix
>
> Cheers,
>
> Maxime
>
>
>
> On 02/23/2014 12:16 PM, Sébastien Boisvert wrote:
>> Hey Maxime,
>>
>> You did not provide KmersMatrixOwner.h, KmersMatrixOwner.cpp, and changes
>> to the Surveyor Makefile.
>>
>>
>> Other comments are below.
>>
>> ----------------------------------------
>>> Date: Sat, 22 Feb 2014 10:03:30 +0000
>>> From: ma...@de...
>>> To: se...@bo...
>>> Subject: git diff kmersmatrix branch >>> >>> diff --git a/code/Surveyor/MatrixOwner.cpp b/code/Surveyor/MatrixOwner.cpp >>> index ffaae00..47cf84a 100644 >>> --- a/code/Surveyor/MatrixOwner.cpp >>> +++ b/code/Surveyor/MatrixOwner.cpp >>> @@ -65,9 +65,12 @@ void MatrixOwner::receive(Message & message) { >>> assert(m_parameters != NULL); >>> assert(m_sampleNames != NULL); >>> #endif >>> - >>> m_mother = source; >>> >>> + //open the buffer of the file >>> + // createKmersMatrixOutputFile(); >>> + >>> + >>> } else if(tag == PUSH_PAYLOAD) { >>> >>> SampleIdentifier sample1 = -1; >>> @@ -89,10 +92,10 @@ void MatrixOwner::receive(Message & message) { >>> assert(count>= 0); >>> #endif >>> >>> - /* >>> + >>> printName(); >>> - cout << "DEBUG add " << sample1 << " " << sample2 << " " >>> << count << endl; >>> -*/ >>> + // cout << "DEBUG add " << sample1 << " " << sample2 << >> Commented lines should be removed. >> >>> " " << count << endl; >>> + >>> m_receivedPayloads ++; >>> >>> m_localGramMatrix[sample1][sample2] += count; >>> @@ -100,14 +103,14 @@ void MatrixOwner::receive(Message & message) { >>> Message response; >>> response.setTag(PUSH_PAYLOAD_OK); >>> send(source, response); >>> + } >>> + else if(tag == PUSH_PAYLOAD_END) { >> Use '} else if (' and not '} >> else if' >> >> >> This is the coding style of the project. >> see https://github.com/sebhtml/ray/blob/master/Documentation/CodingStyle.txt >> >> * Kernighan and Ritchie style, variant "The One True Brace Style" (1TBS) >> http://en.wikipedia.org/wiki/Indent_style#Variant:_1TBS >> >> >> (you used K&R Variant: Stroustrup). >> >>> - } else if(tag == PUSH_PAYLOAD_END) { >>> - >>> + cout << "PUSH_PAYLOAD_END" <<endl; >> Remove this debug message. 
>> >>> m_completedStoreActors++; >>> >>> if(m_completedStoreActors == getSize()) { >>> >>> - >>> printName(); >>> cout << "MatrixOwner received " << >>> m_receivedPayloads << " payloads" << endl; >>> >>> @@ -151,10 +154,9 @@ void MatrixOwner::receive(Message & message) { >>> >>> >>> // tell Mother that the matrix is ready now. >>> - >>> - Message coolMessage; >>> - coolMessage.setTag(MATRIX_IS_READY); >>> - send(m_mother, coolMessage); >>> + Message coolMessage; >>> + coolMessage.setTag(GRAM_MATRIX_IS_READY); >>> + send(m_mother, coolMessage); >>> >>> >>> // clear matrices >>> @@ -275,3 +277,4 @@ void >>> MatrixOwner::printLocalGramMatrixWithHash(ostream & stream, map<SampleIdent >>> stream << endl; >>> } >>> } >>> + >>> diff --git a/code/Surveyor/MatrixOwner.h b/code/Surveyor/MatrixOwner.h >>> index ceb17e2..afa9278 100644 >>> --- a/code/Surveyor/MatrixOwner.h >>> +++ b/code/Surveyor/MatrixOwner.h >>> @@ -28,6 +28,7 @@ >>> >>> #include <map> >>> #include <iostream> >>> +#include <sstream> >>> using namespace std; >>> >>> class MatrixOwner : public Actor { >>> @@ -62,7 +63,7 @@ public: >>> PUSH_PAYLOAD, >>> PUSH_PAYLOAD_OK, >>> PUSH_PAYLOAD_END, >>> - MATRIX_IS_READY, >>> + GRAM_MATRIX_IS_READY, >>> LAST_TAG >>> }; >>> >>> diff --git a/code/Surveyor/Mother.cpp b/code/Surveyor/Mother.cpp >>> index 4d2ef9c..103a583 100644 >>> --- a/code/Surveyor/Mother.cpp >>> +++ b/code/Surveyor/Mother.cpp >>> @@ -27,6 +27,7 @@ >>> #include "GenomeGraphReader.h" >>> #include "GenomeAssemblyReader.h" >>> #include "MatrixOwner.h" >>> +#include "KmersMatrixOwner.h" >>> >>> #include <RayPlatform/cryptography/crypto.h> >>> >>> @@ -39,11 +40,13 @@ using namespace std; >>> #define INPUT_TYPE_GRAPH 0 >>> #define INPUT_TYPE_ASSEMBLY 1 >>> >>> - >>> Mother::Mother() { >>> >>> m_coalescenceManager = -1; >>> m_matrixOwner = -1; >>> + m_kmersMatrixOwner = -1; >>> + >>> + // m_matricesAreReady = true; >> Remove this commented line. 
>> >>> m_parameters = NULL; >>> m_bigMother = -1; >>> @@ -91,7 +94,7 @@ void Mother::receive(Message & message) { >>> notifyController(); >>> } >>> >>> - } else if(tag == MERGE) { >>> + } else if(tag == MERGE_GRAM_MATRIX) { >>> >>> int matrixOwner = -1; >>> memcpy(&matrixOwner, buffer, sizeof(matrixOwner)); >>> @@ -102,7 +105,7 @@ void Mother::receive(Message & message) { >>> #endif >>> >>> Message theMessage; >>> - theMessage.setTag(StoreKeeper::MERGE); >>> + theMessage.setTag(StoreKeeper::MERGE_GRAM_MATRIX); >>> theMessage.setBuffer(&matrixOwner); >>> theMessage.setNumberOfBytes(sizeof(matrixOwner)); >>> >>> @@ -111,10 +114,33 @@ void Mother::receive(Message & message) { >>> send(destination, theMessage); >>> >>> Message response; >>> - response.setTag(MERGE_OK); >>> + response.setTag(MERGE_GRAM_MATRIX_OK); >>> + send(source, response); >>> + >>> + } else if (tag == MERGE_KMERS_MATRIX) { >>> + >>> + int kmersMatrixOwner = -1; >>> + memcpy(&kmersMatrixOwner, buffer, sizeof(kmersMatrixOwner)); >>> + >>> +#ifdef CONFIG_ASSERT >>> + assert(kmersMatrixOwner>= 0); >>> + assert(m_storeKeepers.size() == 1); >>> +#endif >>> + >>> + Message theMessage; >>> + theMessage.setTag(StoreKeeper::MERGE_KMERS_MATRIX); >> The name should be MERGE_KMER_MATRIX and not MERGE_KMERS_MATRIX. 
>> >>> + theMessage.setBuffer(&kmersMatrixOwner); >>> + theMessage.setNumberOfBytes(sizeof(kmersMatrixOwner)); >>> + >>> + int destination = m_storeKeepers[0]; >>> + >>> + send(destination, theMessage); >>> + >>> + Message response; >>> + response.setTag(MERGE_KMERS_MATRIX_OK); >>> send(source, response); >>> >>> - } else if(tag == SHUTDOWN) { >>> + } else if(tag == SHUTDOWN) { >>> >>> Message response; >>> response.setTag(SHUTDOWN_OK); >>> @@ -122,18 +148,16 @@ void Mother::receive(Message & message) { >>> >>> stop(); >>> >>> - } else if(tag == StoreKeeper::MERGE_OK) { >>> + } else if(tag == StoreKeeper::MERGE_GRAM_MATRIX_OK) { >>> >>> // TODO: the bug https://github.com/sebhtml/ray/issues/216 >>> // is caused by the fact that this message is not >>> // received . >>> >>> - /* >>> - Message newMessage; >>> - newMessage.setTag(MERGE_OK); >>> + // Message newMessage; >>> + // newMessage.setTag(MERGE_OK); >>> >>> - send(m_bigMother, newMessage); >>> - */ >>> + // send(m_bigMother, newMessage); >> Remove these commented lines. >> >>> } else if(tag == FINISH_JOB) { >>> >>> @@ -153,6 +177,7 @@ void Mother::receive(Message & message) { >>> >>> sendToFirstMother(FLUSH_AGGREGATOR, >>> FLUSH_AGGREGATOR_RETURN); >>> } >>> + >>> } else if(tag == FLUSH_AGGREGATOR) { >>> >>> /* >>> @@ -188,64 +213,52 @@ void Mother::receive(Message & message) { >>> cout << "DEBUG sending FLUSH_AGGREGATOR_OK to >>> m_bigMother" << endl; >>> */ >>> >>> - } else if(tag == MatrixOwner::MATRIX_IS_READY) { >>> + } else if(tag == MatrixOwner::GRAM_MATRIX_IS_READY) { >>> + >>> + //TODO : check if all matrices are ready >>> + if(m_matricesAreReady){ >>> + sendToFirstMother(SHUTDOWN, SHUTDOWN_OK); >>> + }else { >>> + cout << "GRAM_MATRIX_IS_READY" << endl; >> When an actor speak, you must print its name too in stdout. >> (with printName()). 
>> >>> + m_matricesAreReady = true; >>> + } >>> >>> - sendToFirstMother(SHUTDOWN, SHUTDOWN_OK); >>> + } >>> + else if(tag == KmersMatrixOwner::KMERS_MATRIX_IS_READY) { >>> + >>> + cout << "KMERS_MATRIX_IS_READY" << endl; >>> + if(m_matricesAreReady){ >>> + sendToFirstMother(SHUTDOWN, SHUTDOWN_OK); >>> + }else { >>> + cout << "KMERS_MATRIX_IS_READY" << endl; >>> + m_matricesAreReady = true; >> >> In one comment above, I saw that m_matricesAreReady = false >> was commented. Check that out. >> >>> + } >>> >>> } else if(tag == FLUSH_AGGREGATOR_OK) { >>> >>> - /* >>> printName(); >>> cout << "DEBUG received FLUSH_AGGREGATOR_OK" << endl; >>> - */ >>> >>> m_flushedMothers++; >>> >>> if(m_flushedMothers < getSize()) >>> return; >>> >>> - // spawn the MatrixOwner here ! >>> - >>> - MatrixOwner * matrixOwner = new MatrixOwner(); >>> - spawn(matrixOwner); >>> - >>> - m_matrixOwner = matrixOwner->getName(); >>> - >>> - printName(); >>> - cout << "Spawned MatrixOwner actor !" << endl; >>> - >>> - // tell the StoreKeeper actors to send their stuff to the >>> - // MatrixOwner actor >>> - // The Mother of Mother will wait for a signal from >>> MatrixOwner >>> - >>> - Message greetingMessage; >>> - >>> - vector<string> * names = & m_sampleNames; >>> - >>> - char buffer[32]; >>> - int offset = 0; >>> - memcpy(buffer + offset, &m_parameters, >>> sizeof(m_parameters)); >>> - offset += sizeof(m_parameters); >>> - memcpy(buffer + offset, &names, sizeof(names)); >>> - offset += sizeof(names); >>> - >>> - greetingMessage.setBuffer(&buffer); >>> - greetingMessage.setNumberOfBytes(offset); >>> - >>> - greetingMessage.setTag(MatrixOwner::GREETINGS); >>> - send(m_matrixOwner, greetingMessage); >>> - >>> - sendToFirstMother(MERGE, MERGE_OK); >>> - >>> + spawnMatrixOwner(); >> I like that. A method to spawn an actor. Good ! 
>> >>> } else if(tag == m_responseTag) { >>> >>> - >>> if(m_responseTag == SHUTDOWN_OK) { >>> >>> - } else if(m_responseTag == MERGE_OK) { >>> - >>> - } else if(m_responseTag == FLUSH_AGGREGATOR_RETURN) { >>> + } else if(m_responseTag == MERGE_GRAM_MATRIX_OK) { >>> + // All mothers merged their GRAM MATRIX >>> + // Spawn KmersMatrixOwner to print >>> + if(m_motherToKill < getSize() && >>> m_printKmersMatrix){ >>> + spawnKmersMatrixOwner(); >>> + } >>> + } else if(m_responseTag == MERGE_KMERS_MATRIX_OK) { >>> + } >>> + else if(m_responseTag == FLUSH_AGGREGATOR_RETURN) { >> Again, put closing brace on same line (} else if ( ...) { >> >>> /* >>> printName(); >>> @@ -254,9 +267,8 @@ void Mother::receive(Message & message) { >>> */ >>> } >>> >>> - // every mother was informed. >>> + // every mother was not informed. >> Good catch ! >> >>> if(m_motherToKill>= getSize()) { >>> - >>> sendMessageWithReply(m_motherToKill, m_forwardTag); >>> m_motherToKill--; >>> } >>> @@ -284,11 +296,15 @@ void Mother::sendMessageWithReply(int & actor, int >>> tag) { >>> Message message; >>> message.setTag(tag); >>> >>> - if(tag == MERGE) { >>> + if(tag == MERGE_GRAM_MATRIX) { >>> message.setBuffer(&m_matrixOwner); >>> message.setNumberOfBytes(sizeof(m_matrixOwner)); >>> - >>> - } else if(tag == FLUSH_AGGREGATOR) { >>> + } >>> + else if(tag == MERGE_KMERS_MATRIX) { >>> + message.setBuffer(&m_kmersMatrixOwner); >>> + message.setNumberOfBytes(sizeof(m_kmersMatrixOwner)); >>> + } >>> + else if(tag == FLUSH_AGGREGATOR) { >>> >>> /* >>> printName(); >>> @@ -328,6 +344,10 @@ void Mother::stop() { >>> m_matrixOwner = -1; >>> } >>> >>> + if(m_kmersMatrixOwner>= 0) { >>> + send(m_kmersMatrixOwner, kill); >>> + m_kmersMatrixOwner = -1; >>> + } >>> >>> die(); >>> >>> @@ -410,39 +430,44 @@ void Mother::startSurveyor() { >>> >>> bool isRoot = (getName() % getSize()) == 0; >>> >>> - //cout << "DEBUG startSurveyor isRoot" << isRoot << endl; >>> - >>> - // get a list of files. 
>>> + // Set matricesAreReady to true in case user doesn't want >>> + // to print out kmers matrix. >>> + m_matricesAreReady = true; >>> >>> vector<string> * commands = m_parameters->getCommands(); >>> >>> - >>> for(int i = 0 ; i < (int) commands->size() ; ++i) { >>> >>> string & element = commands->at(i); >>> >>> - // DONE: Check bounds for file names >>> + if (element != "-print-kmers-matrix") { >> The name should be kmer-matrix, not kmers-matrix. >> >> It is like groceries store vs grocery store. >> >>> + // DONE: Check bounds for file names >>> >>> - map<string,int> fastTable; >>> + map<string,int> fastTable; >>> >>> - fastTable["-read-sample-graph"] = INPUT_TYPE_GRAPH; >>> - fastTable["-read-sample-assembly"] = INPUT_TYPE_ASSEMBLY; >>> + fastTable["-read-sample-graph"] = INPUT_TYPE_GRAPH; >>> + fastTable["-read-sample-assembly"] = >>> INPUT_TYPE_ASSEMBLY; >>> >>> - // Unsupported option >>> - if(fastTable.count(element) == 0 || i+2> (int) >>> commands->size()) >>> - continue; >>> + // Unsupported option >>> + if(fastTable.count(element) == 0 || i+2> (int) >>> commands->size()) >>> + continue; >>> >>> - string sampleName = commands->at(++i); >>> - string fileName = commands->at(++i); >>> + string sampleName = commands->at(++i); >>> + string fileName = commands->at(++i); >>> >>> - m_sampleNames.push_back(sampleName); >>> + m_sampleNames.push_back(sampleName); >>> >>> - // DONE implement this m_assemblyFileNames + type >>> - m_inputFileNames.push_back(fileName); >>> + // DONE implement this m_assemblyFileNames + type >>> + m_inputFileNames.push_back(fileName); >>> >>> - int type = fastTable[element]; >>> + int type = fastTable[element]; >>> >>> - m_sampleInputTypes.push_back(type); >>> + m_sampleInputTypes.push_back(type); >>> + >>> + } else { >>> + m_matricesAreReady = false; >>> + m_printKmersMatrix = true; >> >> Question: if m_printKmersMatrix is false, I suppose the code >> follows the usual path of printing just one matrix, right ? 
>> >>> + } >>> >>> } >>> >>> @@ -468,6 +493,9 @@ void Mother::startSurveyor() { >>> >>> m_storeKeepers.push_back(actor->getName()); >>> >>> + actor->setOutputKmersMatrixPath(m_parameters->getPrefix()); >> The path should be prefix/Surveyor/<whatever the kmer matrix's name is> >> >>> + actor->setSamplesSize(m_sampleNames.size()); >> sample size, not samples size. >> >>> + >>> // tell the CoalescenceManager about the local StoreKeeper >>> Message dummyMessage; >>> int localStore = actor->getName(); >>> @@ -568,6 +596,80 @@ void Mother::spawnReader() { >>> } >>> } >>> >>> + >>> +void Mother::spawnMatrixOwner() { >>> + >>> + // spawn the MatrixOwner here ! >>> + MatrixOwner * matrixOwner = new MatrixOwner(); >>> + spawn(matrixOwner); >>> + >>> + m_matrixOwner = matrixOwner->getName(); >>> + >>> + printName(); >>> + cout << "Spawned MatrixOwner actor !" << m_matrixOwner << endl; >>> + >>> + // tell the StoreKeeper actors to send their stuff to the >>> + // MatrixOwner actor >>> + // The Mother of Mother will wait for a signal from MatrixOwner >>> + >>> + Message greetingMessage; >>> + >>> + vector<string> * names = & m_sampleNames; >>> + >>> + char buffer[32]; >>> + int offset = 0; >>> + memcpy(buffer + offset, &m_parameters, sizeof(m_parameters)); >>> + offset += sizeof(m_parameters); >>> + memcpy(buffer + offset, &names, sizeof(names)); >>> + offset += sizeof(names); >>> + >>> + greetingMessage.setBuffer(&buffer); >>> + greetingMessage.setNumberOfBytes(offset); >>> + >>> + greetingMessage.setTag(MatrixOwner::GREETINGS); >>> + send(m_matrixOwner, greetingMessage); >>> + >>> + sendToFirstMother(MERGE_GRAM_MATRIX, MERGE_GRAM_MATRIX_OK); >>> +} >>> + >>> +void Mother::spawnKmersMatrixOwner() { >>> + >>> + // spawn the MatrixOwner here ! >>> + KmersMatrixOwner * kmersMatrixOwner = new KmersMatrixOwner(); >>> + spawn(kmersMatrixOwner); >>> + >>> + m_kmersMatrixOwner = kmersMatrixOwner->getName(); >>> + >>> + printName(); >>> + cout << "Spawned KmersMatrixOwner actor !" 
<< >>> m_kmersMatrixOwner << endl; >>> + >>> + // tell the StoreKeeper actors to send their stuff to the >>> + // KmersMatrixOwner actor >>> + // The Mother of Mother will wait for a signal from MatrixOwner >>> + >>> + Message greetingMessage; >>> + >>> + vector<string> * names = & m_sampleNames; >>> + >>> + char buffer[32]; >>> + int offset = 0; >>> + memcpy(buffer + offset, &m_parameters, sizeof(m_parameters)); >>> + offset += sizeof(m_parameters); >>> + memcpy(buffer + offset, &names, sizeof(names)); >>> + offset += sizeof(names); >>> + >>> + greetingMessage.setBuffer(&buffer); >>> + greetingMessage.setNumberOfBytes(offset); >>> + >>> + greetingMessage.setTag(KmersMatrixOwner::GREETINGS); >>> + send(m_kmersMatrixOwner, greetingMessage); >>> + >>> + sendToFirstMother(MERGE_KMERS_MATRIX, MERGE_KMERS_MATRIX_OK); >>> + >>> +} >>> + >>> + >>> void Mother::setParameters(Parameters * parameters) { >>> m_parameters = parameters; >>> } >>> + >>> diff --git a/code/Surveyor/Mother.h b/code/Surveyor/Mother.h >>> index 092920f..9774c4b 100644 >>> --- a/code/Surveyor/Mother.h >>> +++ b/code/Surveyor/Mother.h >>> @@ -28,6 +28,7 @@ >>> >>> #include <vector> >>> #include <string> >>> +#include <iostream> >>> using namespace std; >>> >>> /** >>> @@ -55,9 +56,12 @@ class Mother: public Actor { >>> private: >>> >>> int m_matrixOwner; >>> + int m_kmersMatrixOwner; >>> >>> int m_flushedMothers; >>> int m_finishedMothers; >>> + bool m_matricesAreReady; >>> + bool m_printKmersMatrix; >>> >>> Parameters * m_parameters; >>> >>> @@ -93,6 +97,13 @@ private: >>> */ >>> void sendToFirstMother(int forwardTag, int responseTag); >>> >>> + /* int m_kmersMatrixBlocNumber; */ >>> + void printLocalKmersMatrix(string & kmer, string & >>> samples_kmers, bool force); >>> + void createKmersMatrixOutputFile(); >>> + >>> + void spawnMatrixOwner(); >>> + void spawnKmersMatrixOwner(); >>> + >> That's a good design -- private methods for private uses. 
>> >>> public: >>> >>> Mother(); >>> @@ -109,8 +120,10 @@ public: >>> FLUSH_AGGREGATOR, >>> FLUSH_AGGREGATOR_OK, >>> FLUSH_AGGREGATOR_RETURN, >>> - MERGE, >>> - MERGE_OK, >>> + MERGE_GRAM_MATRIX, >>> + MERGE_GRAM_MATRIX_OK, >>> + MERGE_KMERS_MATRIX, >>> + MERGE_KMERS_MATRIX_OK, >> kmer matrix, not kmers matrix. >> >>> LAST_TAG, >>> }; >>> >>> diff --git a/code/Surveyor/StoreKeeper.cpp b/code/Surveyor/StoreKeeper.cpp >>> index 84eef34..492208c 100644 >>> --- a/code/Surveyor/StoreKeeper.cpp >>> +++ b/code/Surveyor/StoreKeeper.cpp >>> @@ -22,10 +22,16 @@ >>> #include "StoreKeeper.h" >>> #include "CoalescenceManager.h" >>> #include "MatrixOwner.h" >>> +#include "KmersMatrixOwner.h" >>> >>> #include <code/VerticesExtractor/Vertex.h> >>> +#include <RayPlatform/structures/MyHashTableIterator.h> >>> +#include <RayPlatform/core/OperatingSystem.h> >>> >>> #include <iostream> >>> +#include <sstream> >>> +#include <iomanip> >>> +#include <fstream> >>> using namespace std; >>> >>> #include <string.h> >>> @@ -83,15 +89,21 @@ void StoreKeeper::receive(Message & message) { >>> >>> die(); >>> >>> - } else if(tag == MERGE) { >>> + } else if(tag == MERGE_GRAM_MATRIX) { >>> >>> >>> - printName(); >>> - cout << "DEBUG at MERGE message reception "; >>> - cout << "(StoreKeeper) received " << m_receivedObjects >>> << " objects in total"; >>> - cout << " with " << m_receivedPushes << " push >>> operations" << endl; >>> + // printName(); >>> + // cout << "DEBUG at MERGE_GRAM_MATRIX message reception "; >>> + // cout << "(StoreKeeper) received " << >>> m_receivedObjects << " objects in total"; >>> + // cout << " with " << m_receivedPushes << " push >> You can remove commented lines. >> >>> operations" << endl; >>> computeLocalGramMatrix(); >>> >>> + >>> + // TODEL Print matrix bloc >>> + // m_kmersMatrixBlocNumber = 0; >>> + // printLocalKmersMatrix(); >>> + >> You can remove commented lines. 
>> >>> + >>> m_mother = source; >>> >>> memcpy(&m_matrixOwner, buffer, sizeof(m_matrixOwner)); >>> @@ -108,19 +120,32 @@ void StoreKeeper::receive(Message & message) { >>> m_iterator2 = m_iterator1->second.begin(); >>> } >>> >>> - /* >>> - printName(); >>> - cout << "DEBUG printLocalGramMatrix before first >>> sendMatrixCell" << endl; >>> - printLocalGramMatrix(); >>> - */ >>> - >>> + // printName(); >>> + // cout << "DEBUG printLocalGramMatrix before first >>> sendMatrixCell" << endl; >>> + // printLocalGramMatrix(); >> You can remove commented lines. >> >>> sendMatrixCell(); >>> >>> } else if(tag == MatrixOwner::PUSH_PAYLOAD_OK) { >>> - >>> sendMatrixCell(); >>> + } else if(tag == MERGE_KMERS_MATRIX) { >>> + // cout << "DEBUG at MERGE_GRAM_MATRIX message reception "; >>> + // cout << "(StoreKeeper) received " << >>> m_receivedObjects << " objects in total"; >>> + // cout << " with " << m_receivedPushes << " push >>> operations" << endl; >> You can remove commented lines. >> >> Otherwise, add a "#ifdef DEBUG_SOMETHING_SOMETHING / #endif around that lines". 
>> >> >>> + >>> + m_mother = source; >>> >>> - } else if(tag == CoalescenceManager::SET_KMER_LENGTH) { >>> + memcpy(&m_kmersMatrixOwner, buffer, >>> sizeof(m_kmersMatrixOwner)); >>> + >>> + m_hashTableIterator.constructor(&m_hashTable); >>> + >>> + sendKmersSamples(); >>> + } >>> + else if (tag == KmersMatrixOwner::PUSH_KMER_SAMPLES_END) { >>> + } >>> + else if(tag == KmersMatrixOwner::PUSH_KMER_SAMPLES_OK) { >>> + sendKmersSamples(); >>> + } >>> + else if(tag == CoalescenceManager::SET_KMER_LENGTH) { >>> >>> int kmerLength = 0; >>> int position = 0; >>> @@ -181,8 +206,6 @@ void StoreKeeper::sendMatrixCell() { >>> message.setNumberOfBytes(offset); >>> message.setTag(MatrixOwner::PUSH_PAYLOAD); >>> >>> - //cout << " DEBUG send PUSH_PAYLOAD to " << >>> m_matrixOwner << endl; >>> - >>> send(m_matrixOwner, message); >>> >>> m_iterator2++; >>> @@ -207,10 +230,7 @@ void StoreKeeper::sendMatrixCell() { >>> // free memory. >>> m_localGramMatrix.clear(); >>> >>> - /* >>> printName(); >>> - cout << "DEBUG send PUSH_PAYLOAD_END to " << m_matrixOwner << endl; >>> - */ >>> >>> Message response; >>> response.setTag(MatrixOwner::PUSH_PAYLOAD_END); >>> @@ -236,6 +256,7 @@ void StoreKeeper::configureHashTable() { >>> ); >>> >>> m_configured = true; >>> + >>> } >>> >>> void StoreKeeper::printColorReport() { >>> @@ -375,6 +396,7 @@ void StoreKeeper::computeLocalGramMatrix() { >>> //printLocalGramMatrix(); >>> } >>> >>> + >>> void StoreKeeper::printLocalGramMatrix() { >>> >>> printName(); >>> @@ -623,3 +645,73 @@ void StoreKeeper::storeData(Vertex & vertex, int & >>> sample) { >>> >>> */ >>> } >>> + >>> + >>> +void StoreKeeper::setSamplesSize(int sampleSize) { >>> + m_sampleSize = sampleSize; >>> +} >>> + >>> +void StoreKeeper::setOutputKmersMatrixPath(string pathPrefix) { >>> + // m_outputKmersMatrixPath = pathPrefix; >>> + // m_outputKmersMatrixPath += "/KmersMatrixDump/"; >>> + // createDirectory(m_outputKmersMatrixPath.c_str()); >> You can remove commented lines. 
>> >> >> This file could be in prefix/Surveyor/<...> >> >>> +} >>> + >>> + >>> +void StoreKeeper::sendKmersSamples() { >>> + >>> + char buffer[4000]; >> For portability, use MAXIMUM_MESSAGE_SIZE_IN_BYTES instead of 4000. >> >>> + int bytes = 0; >>> + >>> + ExperimentVertex * currentVertex = NULL; >>> + VirtualKmerColorHandle currentVirtualColor = NULL_VIRTUAL_COLOR; >>> + >>> + vector<bool> samplesVector (m_sampleSize, false); >>> + >>> + if(m_hashTableIterator.hasNext()){ >>> + >>> + // fill(samplesVector.begin(),samplesVector.end(),false); >>> + >> You can remove commented lines. >> >>> + currentVertex = m_hashTableIterator.next(); >>> + Kmer kmer = currentVertex->getKey(); >>> + >>> + bytes += kmer.dump(buffer); >>> + >>> + currentVirtualColor = currentVertex->getVirtualColor(); >>> + set<PhysicalKmerColor> * samples = >>> m_colorSet.getPhysicalColors(currentVirtualColor); >>> + >>> + for(set<PhysicalKmerColor>:: iterator sampleIterator = >>> samples->begin(); >>> + sampleIterator != samples->end(); ++sampleIterator) { >>> + PhysicalKmerColor value = *sampleIterator; >>> + samplesVector[value] = true; >>> + // cout << " " << value; >>> + } >>> + >>> + for (std::vector<bool>::iterator it = >>> samplesVector.begin(); >>> + it != samplesVector.end(); ++it) { >>> + buffer[bytes] = *it; >>> + bytes++; >>> + } >>> + // buffer[bytes] = '\0'; >> You can remove commented lines. 
>> >>> + } >>> + >>> + >>> + Message message; >>> + message.setNumberOfBytes(bytes); >>> + message.setBuffer(buffer); >>> + >>> + // message.setTag(MatrixOwner::PUSH_KMERS_SAMPLES); >>> + if(m_hashTableIterator.hasNext()){ >>> + message.setTag(KmersMatrixOwner::PUSH_KMER_SAMPLES); >>> + }else{ >>> + message.setTag(KmersMatrixOwner::PUSH_KMER_SAMPLES_END); >>> + } >>> + >>> + // Message response; >>> + // response.setTag(MatrixOwner::PUSH_PAYLOAD_END); >>> + // send(m_matrixOwner, response); >>> + >>> + send(m_kmersMatrixOwner, message); >>> + >>> +} >>> + >>> diff --git a/code/Surveyor/StoreKeeper.h b/code/Surveyor/StoreKeeper.h >>> index e44cf98..36adf77 100644 >>> --- a/code/Surveyor/StoreKeeper.h >>> +++ b/code/Surveyor/StoreKeeper.h >>> @@ -34,6 +34,10 @@ >>> >>> #include <RayPlatform/actors/Actor.h> >>> #include <RayPlatform/structures/MyHashTable.h> >>> +#include <RayPlatform/structures/MyHashTableIterator.h> >>> + >>> +#include <iostream> >>> +#include <sstream> >>> >>> /** >>> * Provides genomic storage. 
>>> @@ -55,6 +59,7 @@ private:
>>>
>>>   int m_mother;
>>>   int m_matrixOwner;
>>> + int m_kmersMatrixOwner;
>>>
>>>   bool m_configured;
>>>
>>> @@ -64,6 +69,8 @@ private:
>>>   */
>>>   MyHashTable<Kmer,ExperimentVertex> m_hashTable;
>>>
>>> + MyHashTableIterator<Kmer,ExperimentVertex> m_hashTableIterator;
>>> +
>>>   int m_kmerLength;
>>>   bool m_colorSpaceMode;
>>>
>>> @@ -79,6 +86,13 @@ private:
>>>   void printLocalGramMatrix();
>>>   void printColorReport();
>>>
>>> + /* ostringstream m_currentKmer; */
>>> + /* ostringstream m_currentSamplesKmers; */
>>> + int m_sampleSize;
>>> + string m_outputKmersMatrixPath;
>>> + void printLocalKmersMatrix(string & m_kmer, string & m_samplesKmers);
>>> + void sendKmersSamples();
>>> +
>>>   void sendMatrixCell();
>>>
>>> public:
>>> @@ -86,14 +100,19 @@ public:
>>>   StoreKeeper();
>>>   ~StoreKeeper();
>>>
>>> + void setOutputKmersMatrixPath(string pathPrefix);
>>> + void setSamplesSize(int sampleSize);
>>> +
>>>   void receive(Message & message);
>>>
>>>   enum {
>>>   FIRST_TAG = 10250,
>>>   PUSH_SAMPLE_VERTEX,
>>>   PUSH_SAMPLE_VERTEX_OK,
>>> - MERGE,
>>> - MERGE_OK,
>>> + MERGE_GRAM_MATRIX,
>>> + MERGE_GRAM_MATRIX_OK,
>>> + MERGE_KMERS_MATRIX,
>>> + MERGE_KMERS_MATRIX_OK,
>>>   LAST_TAG
>>>   };
>>>   };
>> _______________________________________________
>> Denovoassembler-devel mailing list
>> Den...@li...
>> https://lists.sourceforge.net/lists/listinfo/denovoassembler-devel
>
|
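The review quoted in the message above asks that StoreKeeper::sendKmersSamples() size its buffer with MAXIMUM_MESSAGE_SIZE_IN_BYTES rather than a literal 4000. Here is a minimal sketch of that idea, not the code that was merged. The function packKmerSamples and the constant kMaxMessageBytes are hypothetical names made up for this illustration; the constant only stands in for RayPlatform's MAXIMUM_MESSAGE_SIZE_IN_BYTES so that the example compiles without RayPlatform, and the k-mer is treated as an opaque byte string instead of a real Kmer object.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for RayPlatform's MAXIMUM_MESSAGE_SIZE_IN_BYTES.
// The value is an assumption made only so that this sketch is self-contained.
static const std::size_t kMaxMessageBytes = 4000;

// Hypothetical helper (not part of Ray): writes an opaque k-mer encoding
// followed by one byte per sample (1 = the k-mer occurs in that sample,
// 0 = it does not) into `buffer`.
// Returns the number of bytes written, or 0 if the payload does not fit
// within `capacity` bytes, in which case nothing is written.
std::size_t packKmerSamples(const char * kmerBytes, std::size_t kmerSize,
		const std::vector<bool> & samples,
		char * buffer, std::size_t capacity) {

	assert(kmerBytes != NULL && buffer != NULL);

	// Refuse to overflow the message buffer.
	if (kmerSize + samples.size() > capacity) {
		return 0;
	}

	std::size_t bytes = 0;

	std::memcpy(buffer, kmerBytes, kmerSize);
	bytes += kmerSize;

	for (std::size_t i = 0 ; i < samples.size() ; ++i) {
		buffer[bytes++] = samples[i] ? 1 : 0;
	}

	return bytes;
}
```

A caller would declare `char buffer[kMaxMessageBytes];`, pass `sizeof(buffer)` as the capacity, and skip the send (or split the payload) when the function returns 0. With one byte per sample, the number of samples that fits in a single message is bounded by the buffer size minus the k-mer encoding.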
From: Sébastien B. <se...@bo...> - 2014-03-07 02:18:39
|
Hi Maxime,

I can't merge your code into master without making some changes first.

As the maintainer, here are the changes that I will need to make to increase the value of your work (for your next pull request, you can think about some of these points if you want).

Keep in mind that coding style is very important for readability (you can check https://github.com/sebhtml/ray/blob/master/Documentation/CodingStyle.txt ).

Here are the changes that I will make tomorrow:

- (C0) As discussed, we don't want to generate Surveyor/KmerMatrix.tsv by default because it is quite a large file. There needs to be an option.

- (C1) Obviously, if you want people to use your Surveyor workflow, you need to add documentation. You just need to add some code in code/Mock/Parameters.cpp. After that, you build Ray, then you run

      ./Ray -help > MANUAL_PAGE.txt

- (C2) KmerMatrixOwner.h/cpp was written by you, so the copyright in the header belongs to you, not me.

- (C2) There is some red showing up in 'git diff --color'. Nothing important, though. Relevant commits: ce6c272 & 4e4949

- (C3) The copyright for code/Surveyor/SequenceKmerReader.cpp/.h should be 2014.

- (C4) In code/Surveyor/StoreKeeper.h, you must use tabulations and not spaces for indentation. For example, the line with MERGE_KMER_MATRIX uses spaces in StoreKeeper.h.

- (C5) code/Surveyor/KmerMatrixOwner.cpp was using spaces instead of tabulations.

- (C6) The method name KmerMatrixOwner::printLocalKmersMatrix is meaningless, as it does not write a matrix; it writes a kmer.

- (C7) You hard-coded the kmer value (31) !!!

      m_kmerMatrix << kmer.idToWord(31,0);

  You must avoid that ! [KmerMatrixOwner.cpp]

- (C8) The FIRST_TAG for your reader is the same as the one used by the graph reader. I think we said it was OK as long as you added a comment about the reason.
      code/Surveyor/GenomeAssemblyReader.h: FIRST_TAG = 10200,
      code/Surveyor/GenomeGraphReader.h: FIRST_TAG = 10200,

Keep up the great work, but next time, I think you can make progress on respecting the coding style, among other things.

I will do a bunch of commits in my branch patch-kmermatrix (fetched from remotes/zorino/patch-kmermatrix) to address the comments above.

[boiseb01@ls30 ray]$ git diff master..remotes/zorino/patch-kmermatrix --stat
 code/Surveyor/GenomeAssemblyReader.cpp |   3 +-
 code/Surveyor/KmerMatrixOwner.cpp      | 157 ++++++++++++++++++++
 code/Surveyor/KmerMatrixOwner.h        |  72 +++++++++
 code/Surveyor/Makefile                 |   1 +
 code/Surveyor/MatrixOwner.cpp          |  19 +--
 code/Surveyor/MatrixOwner.h            |   3 +-
 code/Surveyor/Mother.cpp               | 253 +++++++++++++++++++++----------
 code/Surveyor/Mother.h                 |  17 ++-
 code/Surveyor/SequenceKmerReader.cpp   |  53 ++++++-
 code/Surveyor/SequenceKmerReader.h     |   2 +
 code/Surveyor/StoreKeeper.cpp          | 117 +++++++++++---
 code/Surveyor/StoreKeeper.h            |  20 +++-
 12 files changed, 580 insertions(+), 137 deletions(-)

Link: http://sourceforge.net/p/denovoassembler/mailman/denovoassembler-devel/thread/017C19ECDB98F64F99200D83876C44520111A7FC4FE7%40EXCH-MBX-B.ulaval.ca/#msg31999514
Link: http://sourceforge.net/p/denovoassembler/mailman/denovoassembler-devel/thread/COL131-W760CDDC5071A415AB9C5EDAC870%40phx.gbl/#msg32015211
Link: http://sourceforge.net/p/denovoassembler/mailman/denovoassembler-devel/thread/f2d8f53ef133555e36529047f4e95084%40boisvert.info/#msg31993547
Link: http://sourceforge.net/p/denovoassembler/mailman/denovoassembler-devel/thread/5301F42B.6060904%40gmail.com/#msg31988453

----------------------------------------
> Date: Wed, 5 Mar 2014 13:41:39 +0000
> From: max...@gm...
> To: se...@bo...
> Subject: Re: [Denovoassembler-devel] git diff kmersmatrix branch
>
> Hi Sebastien,
>
> I have fixed the issue that occurred when scaffolds were given as input
> instead of contigs in the SequenceKmerReader class.
>
> I also made the edits according to this review.
>
> Please now pull from :
>
> https://github.com/Zorino/ray.git
>
> patch-kmermatrix
>
> Cheers,
>
> Maxime
>
>
> On 02/23/2014 12:16 PM, Sébastien Boisvert wrote:
>> Hey Maxime,
>>
>> You did not provide KmersMatrixOwner.h, KmersMatrixOwner.cpp, and changes
>> to the Surveyor Makefile.
>>
>> Other comments are below.
>>
>> ----------------------------------------
>>> Date: Sat, 22 Feb 2014 10:03:30 +0000
>>> From: ma...@de...
>>> To: se...@bo...
>>> Subject: git diff kmersmatrix branch
>>>
>>> diff --git a/code/Surveyor/MatrixOwner.cpp b/code/Surveyor/MatrixOwner.cpp
>>> index ffaae00..47cf84a 100644
>>> --- a/code/Surveyor/MatrixOwner.cpp
>>> +++ b/code/Surveyor/MatrixOwner.cpp
>>> @@ -65,9 +65,12 @@ void MatrixOwner::receive(Message & message) {
>>>   assert(m_parameters != NULL);
>>>   assert(m_sampleNames != NULL);
>>>   #endif
>>> -
>>>   m_mother = source;
>>>
>>> + //open the buffer of the file
>>> + // createKmersMatrixOutputFile();
>>> +
>>> +
>>>   } else if(tag == PUSH_PAYLOAD) {
>>>
>>>   SampleIdentifier sample1 = -1;
>>> @@ -89,10 +92,10 @@ void MatrixOwner::receive(Message & message) {
>>>   assert(count>= 0);
>>>   #endif
>>>
>>> - /*
>>> +
>>>   printName();
>>> - cout << "DEBUG add " << sample1 << " " << sample2 << " " << count << endl;
>>> -*/
>>> + // cout << "DEBUG add " << sample1 << " " << sample2 << " " << count << endl;
>> Commented lines should be removed.
>>
>>> +
>>>   m_receivedPayloads ++;
>>>
>>>   m_localGramMatrix[sample1][sample2] += count;
>>> @@ -100,14 +103,14 @@ void MatrixOwner::receive(Message & message) {
>>>   Message response;
>>>   response.setTag(PUSH_PAYLOAD_OK);
>>>   send(source, response);
>>> + }
>>> + else if(tag == PUSH_PAYLOAD_END) {
>> Use '} else if (' and not '}
>> else if'
>>
>> This is the coding style of the project.
>> see https://github.com/sebhtml/ray/blob/master/Documentation/CodingStyle.txt >> >> * Kernighan and Ritchie style, variant "The One True Brace Style" (1TBS) >> http://en.wikipedia.org/wiki/Indent_style#Variant:_1TBS >> >> >> (you used K&R Variant: Stroustrup). >> >>> - } else if(tag == PUSH_PAYLOAD_END) { >>> - >>> + cout << "PUSH_PAYLOAD_END" <<endl; >> Remove this debug message. >> >>> m_completedStoreActors++; >>> >>> if(m_completedStoreActors == getSize()) { >>> >>> - >>> printName(); >>> cout << "MatrixOwner received " << >>> m_receivedPayloads << " payloads" << endl; >>> >>> @@ -151,10 +154,9 @@ void MatrixOwner::receive(Message & message) { >>> >>> >>> // tell Mother that the matrix is ready now. >>> - >>> - Message coolMessage; >>> - coolMessage.setTag(MATRIX_IS_READY); >>> - send(m_mother, coolMessage); >>> + Message coolMessage; >>> + coolMessage.setTag(GRAM_MATRIX_IS_READY); >>> + send(m_mother, coolMessage); >>> >>> >>> // clear matrices >>> @@ -275,3 +277,4 @@ void >>> MatrixOwner::printLocalGramMatrixWithHash(ostream & stream, map<SampleIdent >>> stream << endl; >>> } >>> } >>> + >>> diff --git a/code/Surveyor/MatrixOwner.h b/code/Surveyor/MatrixOwner.h >>> index ceb17e2..afa9278 100644 >>> --- a/code/Surveyor/MatrixOwner.h >>> +++ b/code/Surveyor/MatrixOwner.h >>> @@ -28,6 +28,7 @@ >>> >>> #include <map> >>> #include <iostream> >>> +#include <sstream> >>> using namespace std; >>> >>> class MatrixOwner : public Actor { >>> @@ -62,7 +63,7 @@ public: >>> PUSH_PAYLOAD, >>> PUSH_PAYLOAD_OK, >>> PUSH_PAYLOAD_END, >>> - MATRIX_IS_READY, >>> + GRAM_MATRIX_IS_READY, >>> LAST_TAG >>> }; >>> >>> diff --git a/code/Surveyor/Mother.cpp b/code/Surveyor/Mother.cpp >>> index 4d2ef9c..103a583 100644 >>> --- a/code/Surveyor/Mother.cpp >>> +++ b/code/Surveyor/Mother.cpp >>> @@ -27,6 +27,7 @@ >>> #include "GenomeGraphReader.h" >>> #include "GenomeAssemblyReader.h" >>> #include "MatrixOwner.h" >>> +#include "KmersMatrixOwner.h" >>> >>> #include 
<RayPlatform/cryptography/crypto.h> >>> >>> @@ -39,11 +40,13 @@ using namespace std; >>> #define INPUT_TYPE_GRAPH 0 >>> #define INPUT_TYPE_ASSEMBLY 1 >>> >>> - >>> Mother::Mother() { >>> >>> m_coalescenceManager = -1; >>> m_matrixOwner = -1; >>> + m_kmersMatrixOwner = -1; >>> + >>> + // m_matricesAreReady = true; >> Remove this commented line. >> >>> m_parameters = NULL; >>> m_bigMother = -1; >>> @@ -91,7 +94,7 @@ void Mother::receive(Message & message) { >>> notifyController(); >>> } >>> >>> - } else if(tag == MERGE) { >>> + } else if(tag == MERGE_GRAM_MATRIX) { >>> >>> int matrixOwner = -1; >>> memcpy(&matrixOwner, buffer, sizeof(matrixOwner)); >>> @@ -102,7 +105,7 @@ void Mother::receive(Message & message) { >>> #endif >>> >>> Message theMessage; >>> - theMessage.setTag(StoreKeeper::MERGE); >>> + theMessage.setTag(StoreKeeper::MERGE_GRAM_MATRIX); >>> theMessage.setBuffer(&matrixOwner); >>> theMessage.setNumberOfBytes(sizeof(matrixOwner)); >>> >>> @@ -111,10 +114,33 @@ void Mother::receive(Message & message) { >>> send(destination, theMessage); >>> >>> Message response; >>> - response.setTag(MERGE_OK); >>> + response.setTag(MERGE_GRAM_MATRIX_OK); >>> + send(source, response); >>> + >>> + } else if (tag == MERGE_KMERS_MATRIX) { >>> + >>> + int kmersMatrixOwner = -1; >>> + memcpy(&kmersMatrixOwner, buffer, sizeof(kmersMatrixOwner)); >>> + >>> +#ifdef CONFIG_ASSERT >>> + assert(kmersMatrixOwner>= 0); >>> + assert(m_storeKeepers.size() == 1); >>> +#endif >>> + >>> + Message theMessage; >>> + theMessage.setTag(StoreKeeper::MERGE_KMERS_MATRIX); >> The name should be MERGE_KMER_MATRIX and not MERGE_KMERS_MATRIX. 
>> >>> + theMessage.setBuffer(&kmersMatrixOwner); >>> + theMessage.setNumberOfBytes(sizeof(kmersMatrixOwner)); >>> + >>> + int destination = m_storeKeepers[0]; >>> + >>> + send(destination, theMessage); >>> + >>> + Message response; >>> + response.setTag(MERGE_KMERS_MATRIX_OK); >>> send(source, response); >>> >>> - } else if(tag == SHUTDOWN) { >>> + } else if(tag == SHUTDOWN) { >>> >>> Message response; >>> response.setTag(SHUTDOWN_OK); >>> @@ -122,18 +148,16 @@ void Mother::receive(Message & message) { >>> >>> stop(); >>> >>> - } else if(tag == StoreKeeper::MERGE_OK) { >>> + } else if(tag == StoreKeeper::MERGE_GRAM_MATRIX_OK) { >>> >>> // TODO: the bug https://github.com/sebhtml/ray/issues/216 >>> // is caused by the fact that this message is not >>> // received . >>> >>> - /* >>> - Message newMessage; >>> - newMessage.setTag(MERGE_OK); >>> + // Message newMessage; >>> + // newMessage.setTag(MERGE_OK); >>> >>> - send(m_bigMother, newMessage); >>> - */ >>> + // send(m_bigMother, newMessage); >> Remove these commented lines. >> >>> } else if(tag == FINISH_JOB) { >>> >>> @@ -153,6 +177,7 @@ void Mother::receive(Message & message) { >>> >>> sendToFirstMother(FLUSH_AGGREGATOR, >>> FLUSH_AGGREGATOR_RETURN); >>> } >>> + >>> } else if(tag == FLUSH_AGGREGATOR) { >>> >>> /* >>> @@ -188,64 +213,52 @@ void Mother::receive(Message & message) { >>> cout << "DEBUG sending FLUSH_AGGREGATOR_OK to >>> m_bigMother" << endl; >>> */ >>> >>> - } else if(tag == MatrixOwner::MATRIX_IS_READY) { >>> + } else if(tag == MatrixOwner::GRAM_MATRIX_IS_READY) { >>> + >>> + //TODO : check if all matrices are ready >>> + if(m_matricesAreReady){ >>> + sendToFirstMother(SHUTDOWN, SHUTDOWN_OK); >>> + }else { >>> + cout << "GRAM_MATRIX_IS_READY" << endl; >> When an actor speak, you must print its name too in stdout. >> (with printName()). 
>> >>> + m_matricesAreReady = true; >>> + } >>> >>> - sendToFirstMother(SHUTDOWN, SHUTDOWN_OK); >>> + } >>> + else if(tag == KmersMatrixOwner::KMERS_MATRIX_IS_READY) { >>> + >>> + cout << "KMERS_MATRIX_IS_READY" << endl; >>> + if(m_matricesAreReady){ >>> + sendToFirstMother(SHUTDOWN, SHUTDOWN_OK); >>> + }else { >>> + cout << "KMERS_MATRIX_IS_READY" << endl; >>> + m_matricesAreReady = true; >> >> In one comment above, I saw that m_matricesAreReady = false >> was commented. Check that out. >> >>> + } >>> >>> } else if(tag == FLUSH_AGGREGATOR_OK) { >>> >>> - /* >>> printName(); >>> cout << "DEBUG received FLUSH_AGGREGATOR_OK" << endl; >>> - */ >>> >>> m_flushedMothers++; >>> >>> if(m_flushedMothers < getSize()) >>> return; >>> >>> - // spawn the MatrixOwner here ! >>> - >>> - MatrixOwner * matrixOwner = new MatrixOwner(); >>> - spawn(matrixOwner); >>> - >>> - m_matrixOwner = matrixOwner->getName(); >>> - >>> - printName(); >>> - cout << "Spawned MatrixOwner actor !" << endl; >>> - >>> - // tell the StoreKeeper actors to send their stuff to the >>> - // MatrixOwner actor >>> - // The Mother of Mother will wait for a signal from >>> MatrixOwner >>> - >>> - Message greetingMessage; >>> - >>> - vector<string> * names = & m_sampleNames; >>> - >>> - char buffer[32]; >>> - int offset = 0; >>> - memcpy(buffer + offset, &m_parameters, >>> sizeof(m_parameters)); >>> - offset += sizeof(m_parameters); >>> - memcpy(buffer + offset, &names, sizeof(names)); >>> - offset += sizeof(names); >>> - >>> - greetingMessage.setBuffer(&buffer); >>> - greetingMessage.setNumberOfBytes(offset); >>> - >>> - greetingMessage.setTag(MatrixOwner::GREETINGS); >>> - send(m_matrixOwner, greetingMessage); >>> - >>> - sendToFirstMother(MERGE, MERGE_OK); >>> - >>> + spawnMatrixOwner(); >> I like that. A method to spawn an actor. Good ! 
>> >>> } else if(tag == m_responseTag) { >>> >>> - >>> if(m_responseTag == SHUTDOWN_OK) { >>> >>> - } else if(m_responseTag == MERGE_OK) { >>> - >>> - } else if(m_responseTag == FLUSH_AGGREGATOR_RETURN) { >>> + } else if(m_responseTag == MERGE_GRAM_MATRIX_OK) { >>> + // All mothers merged their GRAM MATRIX >>> + // Spawn KmersMatrixOwner to print >>> + if(m_motherToKill < getSize() && >>> m_printKmersMatrix){ >>> + spawnKmersMatrixOwner(); >>> + } >>> + } else if(m_responseTag == MERGE_KMERS_MATRIX_OK) { >>> + } >>> + else if(m_responseTag == FLUSH_AGGREGATOR_RETURN) { >> Again, put closing brace on same line (} else if ( ...) { >> >>> /* >>> printName(); >>> @@ -254,9 +267,8 @@ void Mother::receive(Message & message) { >>> */ >>> } >>> >>> - // every mother was informed. >>> + // every mother was not informed. >> Good catch ! >> >>> if(m_motherToKill>= getSize()) { >>> - >>> sendMessageWithReply(m_motherToKill, m_forwardTag); >>> m_motherToKill--; >>> } >>> @@ -284,11 +296,15 @@ void Mother::sendMessageWithReply(int & actor, int >>> tag) { >>> Message message; >>> message.setTag(tag); >>> >>> - if(tag == MERGE) { >>> + if(tag == MERGE_GRAM_MATRIX) { >>> message.setBuffer(&m_matrixOwner); >>> message.setNumberOfBytes(sizeof(m_matrixOwner)); >>> - >>> - } else if(tag == FLUSH_AGGREGATOR) { >>> + } >>> + else if(tag == MERGE_KMERS_MATRIX) { >>> + message.setBuffer(&m_kmersMatrixOwner); >>> + message.setNumberOfBytes(sizeof(m_kmersMatrixOwner)); >>> + } >>> + else if(tag == FLUSH_AGGREGATOR) { >>> >>> /* >>> printName(); >>> @@ -328,6 +344,10 @@ void Mother::stop() { >>> m_matrixOwner = -1; >>> } >>> >>> + if(m_kmersMatrixOwner>= 0) { >>> + send(m_kmersMatrixOwner, kill); >>> + m_kmersMatrixOwner = -1; >>> + } >>> >>> die(); >>> >>> @@ -410,39 +430,44 @@ void Mother::startSurveyor() { >>> >>> bool isRoot = (getName() % getSize()) == 0; >>> >>> - //cout << "DEBUG startSurveyor isRoot" << isRoot << endl; >>> - >>> - // get a list of files. 
>>> + // Set matricesAreReady to true in case user doesn't want >>> + // to print out kmers matrix. >>> + m_matricesAreReady = true; >>> >>> vector<string> * commands = m_parameters->getCommands(); >>> >>> - >>> for(int i = 0 ; i < (int) commands->size() ; ++i) { >>> >>> string & element = commands->at(i); >>> >>> - // DONE: Check bounds for file names >>> + if (element != "-print-kmers-matrix") { >> The name should be kmer-matrix, not kmers-matrix. >> >> It is like groceries store vs grocery store. >> >>> + // DONE: Check bounds for file names >>> >>> - map<string,int> fastTable; >>> + map<string,int> fastTable; >>> >>> - fastTable["-read-sample-graph"] = INPUT_TYPE_GRAPH; >>> - fastTable["-read-sample-assembly"] = INPUT_TYPE_ASSEMBLY; >>> + fastTable["-read-sample-graph"] = INPUT_TYPE_GRAPH; >>> + fastTable["-read-sample-assembly"] = >>> INPUT_TYPE_ASSEMBLY; >>> >>> - // Unsupported option >>> - if(fastTable.count(element) == 0 || i+2> (int) >>> commands->size()) >>> - continue; >>> + // Unsupported option >>> + if(fastTable.count(element) == 0 || i+2> (int) >>> commands->size()) >>> + continue; >>> >>> - string sampleName = commands->at(++i); >>> - string fileName = commands->at(++i); >>> + string sampleName = commands->at(++i); >>> + string fileName = commands->at(++i); >>> >>> - m_sampleNames.push_back(sampleName); >>> + m_sampleNames.push_back(sampleName); >>> >>> - // DONE implement this m_assemblyFileNames + type >>> - m_inputFileNames.push_back(fileName); >>> + // DONE implement this m_assemblyFileNames + type >>> + m_inputFileNames.push_back(fileName); >>> >>> - int type = fastTable[element]; >>> + int type = fastTable[element]; >>> >>> - m_sampleInputTypes.push_back(type); >>> + m_sampleInputTypes.push_back(type); >>> + >>> + } else { >>> + m_matricesAreReady = false; >>> + m_printKmersMatrix = true; >> >> Question: if m_printKmersMatrix is false, I suppose the code >> follows the usual path of printing just one matrix, right ? 
>> >>> + } >>> >>> } >>> >>> @@ -468,6 +493,9 @@ void Mother::startSurveyor() { >>> >>> m_storeKeepers.push_back(actor->getName()); >>> >>> + actor->setOutputKmersMatrixPath(m_parameters->getPrefix()); >> The path should be prefix/Surveyor/<whatever the kmer matrix's name is> >> >>> + actor->setSamplesSize(m_sampleNames.size()); >> sample size, not samples size. >> >>> + >>> // tell the CoalescenceManager about the local StoreKeeper >>> Message dummyMessage; >>> int localStore = actor->getName(); >>> @@ -568,6 +596,80 @@ void Mother::spawnReader() { >>> } >>> } >>> >>> + >>> +void Mother::spawnMatrixOwner() { >>> + >>> + // spawn the MatrixOwner here ! >>> + MatrixOwner * matrixOwner = new MatrixOwner(); >>> + spawn(matrixOwner); >>> + >>> + m_matrixOwner = matrixOwner->getName(); >>> + >>> + printName(); >>> + cout << "Spawned MatrixOwner actor !" << m_matrixOwner << endl; >>> + >>> + // tell the StoreKeeper actors to send their stuff to the >>> + // MatrixOwner actor >>> + // The Mother of Mother will wait for a signal from MatrixOwner >>> + >>> + Message greetingMessage; >>> + >>> + vector<string> * names = & m_sampleNames; >>> + >>> + char buffer[32]; >>> + int offset = 0; >>> + memcpy(buffer + offset, &m_parameters, sizeof(m_parameters)); >>> + offset += sizeof(m_parameters); >>> + memcpy(buffer + offset, &names, sizeof(names)); >>> + offset += sizeof(names); >>> + >>> + greetingMessage.setBuffer(&buffer); >>> + greetingMessage.setNumberOfBytes(offset); >>> + >>> + greetingMessage.setTag(MatrixOwner::GREETINGS); >>> + send(m_matrixOwner, greetingMessage); >>> + >>> + sendToFirstMother(MERGE_GRAM_MATRIX, MERGE_GRAM_MATRIX_OK); >>> +} >>> + >>> +void Mother::spawnKmersMatrixOwner() { >>> + >>> + // spawn the MatrixOwner here ! >>> + KmersMatrixOwner * kmersMatrixOwner = new KmersMatrixOwner(); >>> + spawn(kmersMatrixOwner); >>> + >>> + m_kmersMatrixOwner = kmersMatrixOwner->getName(); >>> + >>> + printName(); >>> + cout << "Spawned KmersMatrixOwner actor !" 
<< >>> m_kmersMatrixOwner << endl; >>> + >>> + // tell the StoreKeeper actors to send their stuff to the >>> + // KmersMatrixOwner actor >>> + // The Mother of Mother will wait for a signal from MatrixOwner >>> + >>> + Message greetingMessage; >>> + >>> + vector<string> * names = & m_sampleNames; >>> + >>> + char buffer[32]; >>> + int offset = 0; >>> + memcpy(buffer + offset, &m_parameters, sizeof(m_parameters)); >>> + offset += sizeof(m_parameters); >>> + memcpy(buffer + offset, &names, sizeof(names)); >>> + offset += sizeof(names); >>> + >>> + greetingMessage.setBuffer(&buffer); >>> + greetingMessage.setNumberOfBytes(offset); >>> + >>> + greetingMessage.setTag(KmersMatrixOwner::GREETINGS); >>> + send(m_kmersMatrixOwner, greetingMessage); >>> + >>> + sendToFirstMother(MERGE_KMERS_MATRIX, MERGE_KMERS_MATRIX_OK); >>> + >>> +} >>> + >>> + >>> void Mother::setParameters(Parameters * parameters) { >>> m_parameters = parameters; >>> } >>> + >>> diff --git a/code/Surveyor/Mother.h b/code/Surveyor/Mother.h >>> index 092920f..9774c4b 100644 >>> --- a/code/Surveyor/Mother.h >>> +++ b/code/Surveyor/Mother.h >>> @@ -28,6 +28,7 @@ >>> >>> #include <vector> >>> #include <string> >>> +#include <iostream> >>> using namespace std; >>> >>> /** >>> @@ -55,9 +56,12 @@ class Mother: public Actor { >>> private: >>> >>> int m_matrixOwner; >>> + int m_kmersMatrixOwner; >>> >>> int m_flushedMothers; >>> int m_finishedMothers; >>> + bool m_matricesAreReady; >>> + bool m_printKmersMatrix; >>> >>> Parameters * m_parameters; >>> >>> @@ -93,6 +97,13 @@ private: >>> */ >>> void sendToFirstMother(int forwardTag, int responseTag); >>> >>> + /* int m_kmersMatrixBlocNumber; */ >>> + void printLocalKmersMatrix(string & kmer, string & >>> samples_kmers, bool force); >>> + void createKmersMatrixOutputFile(); >>> + >>> + void spawnMatrixOwner(); >>> + void spawnKmersMatrixOwner(); >>> + >> That's a good design -- private methods for private uses. 
>> >>> public: >>> >>> Mother(); >>> @@ -109,8 +120,10 @@ public: >>> FLUSH_AGGREGATOR, >>> FLUSH_AGGREGATOR_OK, >>> FLUSH_AGGREGATOR_RETURN, >>> - MERGE, >>> - MERGE_OK, >>> + MERGE_GRAM_MATRIX, >>> + MERGE_GRAM_MATRIX_OK, >>> + MERGE_KMERS_MATRIX, >>> + MERGE_KMERS_MATRIX_OK, >> kmer matrix, not kmers matrix. >> >>> LAST_TAG, >>> }; >>> >>> diff --git a/code/Surveyor/StoreKeeper.cpp b/code/Surveyor/StoreKeeper.cpp >>> index 84eef34..492208c 100644 >>> --- a/code/Surveyor/StoreKeeper.cpp >>> +++ b/code/Surveyor/StoreKeeper.cpp >>> @@ -22,10 +22,16 @@ >>> #include "StoreKeeper.h" >>> #include "CoalescenceManager.h" >>> #include "MatrixOwner.h" >>> +#include "KmersMatrixOwner.h" >>> >>> #include <code/VerticesExtractor/Vertex.h> >>> +#include <RayPlatform/structures/MyHashTableIterator.h> >>> +#include <RayPlatform/core/OperatingSystem.h> >>> >>> #include <iostream> >>> +#include <sstream> >>> +#include <iomanip> >>> +#include <fstream> >>> using namespace std; >>> >>> #include <string.h> >>> @@ -83,15 +89,21 @@ void StoreKeeper::receive(Message & message) { >>> >>> die(); >>> >>> - } else if(tag == MERGE) { >>> + } else if(tag == MERGE_GRAM_MATRIX) { >>> >>> >>> - printName(); >>> - cout << "DEBUG at MERGE message reception "; >>> - cout << "(StoreKeeper) received " << m_receivedObjects >>> << " objects in total"; >>> - cout << " with " << m_receivedPushes << " push >>> operations" << endl; >>> + // printName(); >>> + // cout << "DEBUG at MERGE_GRAM_MATRIX message reception "; >>> + // cout << "(StoreKeeper) received " << >>> m_receivedObjects << " objects in total"; >>> + // cout << " with " << m_receivedPushes << " push >> You can remove commented lines. >> >>> operations" << endl; >>> computeLocalGramMatrix(); >>> >>> + >>> + // TODEL Print matrix bloc >>> + // m_kmersMatrixBlocNumber = 0; >>> + // printLocalKmersMatrix(); >>> + >> You can remove commented lines. 
>> >>> + >>> m_mother = source; >>> >>> memcpy(&m_matrixOwner, buffer, sizeof(m_matrixOwner)); >>> @@ -108,19 +120,32 @@ void StoreKeeper::receive(Message & message) { >>> m_iterator2 = m_iterator1->second.begin(); >>> } >>> >>> - /* >>> - printName(); >>> - cout << "DEBUG printLocalGramMatrix before first >>> sendMatrixCell" << endl; >>> - printLocalGramMatrix(); >>> - */ >>> - >>> + // printName(); >>> + // cout << "DEBUG printLocalGramMatrix before first >>> sendMatrixCell" << endl; >>> + // printLocalGramMatrix(); >> You can remove commented lines. >> >>> sendMatrixCell(); >>> >>> } else if(tag == MatrixOwner::PUSH_PAYLOAD_OK) { >>> - >>> sendMatrixCell(); >>> + } else if(tag == MERGE_KMERS_MATRIX) { >>> + // cout << "DEBUG at MERGE_GRAM_MATRIX message reception "; >>> + // cout << "(StoreKeeper) received " << >>> m_receivedObjects << " objects in total"; >>> + // cout << " with " << m_receivedPushes << " push >>> operations" << endl; >> You can remove commented lines. >> >> Otherwise, add "#ifdef DEBUG_SOMETHING_SOMETHING" / "#endif" around those lines.
>> >> >>> + >>> + m_mother = source; >>> >>> - } else if(tag == CoalescenceManager::SET_KMER_LENGTH) { >>> + memcpy(&m_kmersMatrixOwner, buffer, >>> sizeof(m_kmersMatrixOwner)); >>> + >>> + m_hashTableIterator.constructor(&m_hashTable); >>> + >>> + sendKmersSamples(); >>> + } >>> + else if (tag == KmersMatrixOwner::PUSH_KMER_SAMPLES_END) { >>> + } >>> + else if(tag == KmersMatrixOwner::PUSH_KMER_SAMPLES_OK) { >>> + sendKmersSamples(); >>> + } >>> + else if(tag == CoalescenceManager::SET_KMER_LENGTH) { >>> >>> int kmerLength = 0; >>> int position = 0; >>> @@ -181,8 +206,6 @@ void StoreKeeper::sendMatrixCell() { >>> message.setNumberOfBytes(offset); >>> message.setTag(MatrixOwner::PUSH_PAYLOAD); >>> >>> - //cout << " DEBUG send PUSH_PAYLOAD to " << >>> m_matrixOwner << endl; >>> - >>> send(m_matrixOwner, message); >>> >>> m_iterator2++; >>> @@ -207,10 +230,7 @@ void StoreKeeper::sendMatrixCell() { >>> // free memory. >>> m_localGramMatrix.clear(); >>> >>> - /* >>> printName(); >>> - cout << "DEBUG send PUSH_PAYLOAD_END to " << m_matrixOwner << endl; >>> - */ >>> >>> Message response; >>> response.setTag(MatrixOwner::PUSH_PAYLOAD_END); >>> @@ -236,6 +256,7 @@ void StoreKeeper::configureHashTable() { >>> ); >>> >>> m_configured = true; >>> + >>> } >>> >>> void StoreKeeper::printColorReport() { >>> @@ -375,6 +396,7 @@ void StoreKeeper::computeLocalGramMatrix() { >>> //printLocalGramMatrix(); >>> } >>> >>> + >>> void StoreKeeper::printLocalGramMatrix() { >>> >>> printName(); >>> @@ -623,3 +645,73 @@ void StoreKeeper::storeData(Vertex & vertex, int & >>> sample) { >>> >>> */ >>> } >>> + >>> + >>> +void StoreKeeper::setSamplesSize(int sampleSize) { >>> + m_sampleSize = sampleSize; >>> +} >>> + >>> +void StoreKeeper::setOutputKmersMatrixPath(string pathPrefix) { >>> + // m_outputKmersMatrixPath = pathPrefix; >>> + // m_outputKmersMatrixPath += "/KmersMatrixDump/"; >>> + // createDirectory(m_outputKmersMatrixPath.c_str()); >> You can remove commented lines. 
>> >> >> This file could be in prefix/Surveyor/<...> >> >>> +} >>> + >>> + >>> +void StoreKeeper::sendKmersSamples() { >>> + >>> + char buffer[4000]; >> For portability, use MAXIMUM_MESSAGE_SIZE_IN_BYTES instead of 4000. >> >>> + int bytes = 0; >>> + >>> + ExperimentVertex * currentVertex = NULL; >>> + VirtualKmerColorHandle currentVirtualColor = NULL_VIRTUAL_COLOR; >>> + >>> + vector<bool> samplesVector (m_sampleSize, false); >>> + >>> + if(m_hashTableIterator.hasNext()){ >>> + >>> + // fill(samplesVector.begin(),samplesVector.end(),false); >>> + >> You can remove commented lines. >> >>> + currentVertex = m_hashTableIterator.next(); >>> + Kmer kmer = currentVertex->getKey(); >>> + >>> + bytes += kmer.dump(buffer); >>> + >>> + currentVirtualColor = currentVertex->getVirtualColor(); >>> + set<PhysicalKmerColor> * samples = >>> m_colorSet.getPhysicalColors(currentVirtualColor); >>> + >>> + for(set<PhysicalKmerColor>:: iterator sampleIterator = >>> samples->begin(); >>> + sampleIterator != samples->end(); ++sampleIterator) { >>> + PhysicalKmerColor value = *sampleIterator; >>> + samplesVector[value] = true; >>> + // cout << " " << value; >>> + } >>> + >>> + for (std::vector<bool>::iterator it = >>> samplesVector.begin(); >>> + it != samplesVector.end(); ++it) { >>> + buffer[bytes] = *it; >>> + bytes++; >>> + } >>> + // buffer[bytes] = '\0'; >> You can remove commented lines. 
>> >>> + } >>> + >>> + >>> + Message message; >>> + message.setNumberOfBytes(bytes); >>> + message.setBuffer(buffer); >>> + >>> + // message.setTag(MatrixOwner::PUSH_KMERS_SAMPLES); >>> + if(m_hashTableIterator.hasNext()){ >>> + message.setTag(KmersMatrixOwner::PUSH_KMER_SAMPLES); >>> + }else{ >>> + message.setTag(KmersMatrixOwner::PUSH_KMER_SAMPLES_END); >>> + } >>> + >>> + // Message response; >>> + // response.setTag(MatrixOwner::PUSH_PAYLOAD_END); >>> + // send(m_matrixOwner, response); >>> + >>> + send(m_kmersMatrixOwner, message); >>> + >>> +} >>> + >>> diff --git a/code/Surveyor/StoreKeeper.h b/code/Surveyor/StoreKeeper.h >>> index e44cf98..36adf77 100644 >>> --- a/code/Surveyor/StoreKeeper.h >>> +++ b/code/Surveyor/StoreKeeper.h >>> @@ -34,6 +34,10 @@ >>> >>> #include <RayPlatform/actors/Actor.h> >>> #include <RayPlatform/structures/MyHashTable.h> >>> +#include <RayPlatform/structures/MyHashTableIterator.h> >>> + >>> +#include <iostream> >>> +#include <sstream> >>> >>> /** >>> * Provides genomic storage. 
>>> @@ -55,6 +59,7 @@ private: >>> >>> int m_mother; >>> int m_matrixOwner; >>> + int m_kmersMatrixOwner; >>> >>> bool m_configured; >>> >>> @@ -64,6 +69,8 @@ private: >>> */ >>> MyHashTable<Kmer,ExperimentVertex> m_hashTable; >>> >>> + MyHashTableIterator<Kmer,ExperimentVertex> m_hashTableIterator; >>> + >>> int m_kmerLength; >>> bool m_colorSpaceMode; >>> >>> @@ -79,6 +86,13 @@ private: >>> void printLocalGramMatrix(); >>> void printColorReport(); >>> >>> + /* ostringstream m_currentKmer; */ >>> + /* ostringstream m_currentSamplesKmers; */ >>> + int m_sampleSize; >>> + string m_outputKmersMatrixPath; >>> + void printLocalKmersMatrix(string & m_kmer, string & >>> m_samplesKmers); >>> + void sendKmersSamples(); >>> + >>> void sendMatrixCell(); >>> >>> public: >>> @@ -86,14 +100,19 @@ public: >>> StoreKeeper(); >>> ~StoreKeeper(); >>> >>> + void setOutputKmersMatrixPath(string pathPrefix); >>> + void setSamplesSize(int sampleSize); >>> + >>> void receive(Message & message); >>> >>> enum { >>> FIRST_TAG = 10250, >>> PUSH_SAMPLE_VERTEX, >>> PUSH_SAMPLE_VERTEX_OK, >>> - MERGE, >>> - MERGE_OK, >>> + MERGE_GRAM_MATRIX, >>> + MERGE_GRAM_MATRIX_OK, >>> + MERGE_KMERS_MATRIX, >>> + MERGE_KMERS_MATRIX_OK, >>> LAST_TAG >>> }; >>> }; |
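On the question quoted above about the path taken when m_printKmersMatrix is false: the patch tracks completion with a single boolean, m_matricesAreReady, which starts at true and is flipped to false when -print-kmers-matrix is seen. In the hunks quoted here, m_printKmersMatrix is only ever assigned true, so it may need an explicit false initialization unless that happens in code not shown. A counter is one possible alternative that is easier to extend to more matrices. The sketch below is only an illustration under those assumptions; PendingMatrices and its methods are hypothetical names and are not part of Ray or RayPlatform.

```cpp
#include <cassert>

// Hypothetical helper, not part of Ray: counts how many result matrices
// are still being written before the SHUTDOWN sequence may start.
class PendingMatrices {
public:
    // The Gram matrix is always produced; the k-mer matrix only on request.
    explicit PendingMatrices(bool printKmerMatrix)
        : m_pending(printKmerMatrix ? 2 : 1) {
    }

    // Call once for each *_MATRIX_IS_READY message received.
    // Returns true when every expected matrix has been reported ready.
    bool markOneReady() {
        assert(m_pending > 0);
        --m_pending;
        return m_pending == 0;
    }

    // Number of matrices that have not been reported ready yet.
    int pending() const {
        return m_pending;
    }

private:
    int m_pending;
};
```

With such a counter, the GRAM_MATRIX_IS_READY and KMERS_MATRIX_IS_READY branches could share one code path: call markOneReady() and start the shutdown only when it returns true.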
From: Sébastien B. <se...@bo...> - 2014-02-23 12:29:03
|
Hey Maxime, You did not provide KmersMatrixOwner.h, KmersMatrixOwner.cpp, and changes to the Surveyor Makefile. Other comments are below. ---------------------------------------- > Date: Sat, 22 Feb 2014 10:03:30 +0000 > From: ma...@de... > To: se...@bo... > Subject: git diff kmersmatrix branch > > diff --git a/code/Surveyor/MatrixOwner.cpp b/code/Surveyor/MatrixOwner.cpp > index ffaae00..47cf84a 100644 > --- a/code/Surveyor/MatrixOwner.cpp > +++ b/code/Surveyor/MatrixOwner.cpp > @@ -65,9 +65,12 @@ void MatrixOwner::receive(Message & message) { > assert(m_parameters != NULL); > assert(m_sampleNames != NULL); > #endif > - > m_mother = source; > > + //open the buffer of the file > + // createKmersMatrixOutputFile(); > + > + > } else if(tag == PUSH_PAYLOAD) { > > SampleIdentifier sample1 = -1; > @@ -89,10 +92,10 @@ void MatrixOwner::receive(Message & message) { > assert(count>= 0); > #endif > > - /* > + > printName(); > - cout << "DEBUG add " << sample1 << " " << sample2 << " " > << count << endl; > -*/ > + // cout << "DEBUG add " << sample1 << " " << sample2 << Commented lines should be removed. > " " << count << endl; > + > m_receivedPayloads ++; > > m_localGramMatrix[sample1][sample2] += count; > @@ -100,14 +103,14 @@ void MatrixOwner::receive(Message & message) { > Message response; > response.setTag(PUSH_PAYLOAD_OK); > send(source, response); > + } > + else if(tag == PUSH_PAYLOAD_END) { Use '} else if (' on a single line, not '}' followed by 'else if' on the next line. This is the coding style of the project. See https://github.com/sebhtml/ray/blob/master/Documentation/CodingStyle.txt * Kernighan and Ritchie style, variant "The One True Brace Style" (1TBS) http://en.wikipedia.org/wiki/Indent_style#Variant:_1TBS (you used K&R Variant: Stroustrup). > > - } else if(tag == PUSH_PAYLOAD_END) { > - > + cout << "PUSH_PAYLOAD_END" <<endl; Remove this debug message.
> m_completedStoreActors++; > > if(m_completedStoreActors == getSize()) { > > - > printName(); > cout << "MatrixOwner received " << > m_receivedPayloads << " payloads" << endl; > > @@ -151,10 +154,9 @@ void MatrixOwner::receive(Message & message) { > > > // tell Mother that the matrix is ready now. > - > - Message coolMessage; > - coolMessage.setTag(MATRIX_IS_READY); > - send(m_mother, coolMessage); > + Message coolMessage; > + coolMessage.setTag(GRAM_MATRIX_IS_READY); > + send(m_mother, coolMessage); > > > // clear matrices > @@ -275,3 +277,4 @@ void > MatrixOwner::printLocalGramMatrixWithHash(ostream & stream, map<SampleIdent > stream << endl; > } > } > + > diff --git a/code/Surveyor/MatrixOwner.h b/code/Surveyor/MatrixOwner.h > index ceb17e2..afa9278 100644 > --- a/code/Surveyor/MatrixOwner.h > +++ b/code/Surveyor/MatrixOwner.h > @@ -28,6 +28,7 @@ > > #include <map> > #include <iostream> > +#include <sstream> > using namespace std; > > class MatrixOwner : public Actor { > @@ -62,7 +63,7 @@ public: > PUSH_PAYLOAD, > PUSH_PAYLOAD_OK, > PUSH_PAYLOAD_END, > - MATRIX_IS_READY, > + GRAM_MATRIX_IS_READY, > LAST_TAG > }; > > diff --git a/code/Surveyor/Mother.cpp b/code/Surveyor/Mother.cpp > index 4d2ef9c..103a583 100644 > --- a/code/Surveyor/Mother.cpp > +++ b/code/Surveyor/Mother.cpp > @@ -27,6 +27,7 @@ > #include "GenomeGraphReader.h" > #include "GenomeAssemblyReader.h" > #include "MatrixOwner.h" > +#include "KmersMatrixOwner.h" > > #include <RayPlatform/cryptography/crypto.h> > > @@ -39,11 +40,13 @@ using namespace std; > #define INPUT_TYPE_GRAPH 0 > #define INPUT_TYPE_ASSEMBLY 1 > > - > Mother::Mother() { > > m_coalescenceManager = -1; > m_matrixOwner = -1; > + m_kmersMatrixOwner = -1; > + > + // m_matricesAreReady = true; Remove this commented line. 
> > m_parameters = NULL; > m_bigMother = -1; > @@ -91,7 +94,7 @@ void Mother::receive(Message & message) { > notifyController(); > } > > - } else if(tag == MERGE) { > + } else if(tag == MERGE_GRAM_MATRIX) { > > int matrixOwner = -1; > memcpy(&matrixOwner, buffer, sizeof(matrixOwner)); > @@ -102,7 +105,7 @@ void Mother::receive(Message & message) { > #endif > > Message theMessage; > - theMessage.setTag(StoreKeeper::MERGE); > + theMessage.setTag(StoreKeeper::MERGE_GRAM_MATRIX); > theMessage.setBuffer(&matrixOwner); > theMessage.setNumberOfBytes(sizeof(matrixOwner)); > > @@ -111,10 +114,33 @@ void Mother::receive(Message & message) { > send(destination, theMessage); > > Message response; > - response.setTag(MERGE_OK); > + response.setTag(MERGE_GRAM_MATRIX_OK); > + send(source, response); > + > + } else if (tag == MERGE_KMERS_MATRIX) { > + > + int kmersMatrixOwner = -1; > + memcpy(&kmersMatrixOwner, buffer, sizeof(kmersMatrixOwner)); > + > +#ifdef CONFIG_ASSERT > + assert(kmersMatrixOwner>= 0); > + assert(m_storeKeepers.size() == 1); > +#endif > + > + Message theMessage; > + theMessage.setTag(StoreKeeper::MERGE_KMERS_MATRIX); The name should be MERGE_KMER_MATRIX and not MERGE_KMERS_MATRIX. > + theMessage.setBuffer(&kmersMatrixOwner); > + theMessage.setNumberOfBytes(sizeof(kmersMatrixOwner)); > + > + int destination = m_storeKeepers[0]; > + > + send(destination, theMessage); > + > + Message response; > + response.setTag(MERGE_KMERS_MATRIX_OK); > send(source, response); > > - } else if(tag == SHUTDOWN) { > + } else if(tag == SHUTDOWN) { > > Message response; > response.setTag(SHUTDOWN_OK); > @@ -122,18 +148,16 @@ void Mother::receive(Message & message) { > > stop(); > > - } else if(tag == StoreKeeper::MERGE_OK) { > + } else if(tag == StoreKeeper::MERGE_GRAM_MATRIX_OK) { > > // TODO: the bug https://github.com/sebhtml/ray/issues/216 > // is caused by the fact that this message is not > // received . 
> > - /* > - Message newMessage; > - newMessage.setTag(MERGE_OK); > > - send(m_bigMother, newMessage); > - */ > + // Message newMessage; > + // newMessage.setTag(MERGE_OK); > > + // send(m_bigMother, newMessage); Remove these commented lines. > > } else if(tag == FINISH_JOB) { > > @@ -153,6 +177,7 @@ void Mother::receive(Message & message) { > > sendToFirstMother(FLUSH_AGGREGATOR, > FLUSH_AGGREGATOR_RETURN); > } > + > } else if(tag == FLUSH_AGGREGATOR) { > > /* > @@ -188,64 +213,52 @@ void Mother::receive(Message & message) { > cout << "DEBUG sending FLUSH_AGGREGATOR_OK to > m_bigMother" << endl; > */ > > - } else if(tag == MatrixOwner::MATRIX_IS_READY) { > + } else if(tag == MatrixOwner::GRAM_MATRIX_IS_READY) { > + > + //TODO : check if all matrices are ready > + if(m_matricesAreReady){ > + sendToFirstMother(SHUTDOWN, SHUTDOWN_OK); > + }else { > + cout << "GRAM_MATRIX_IS_READY" << endl; When an actor speaks, you must also print its name to stdout (with printName()). > + m_matricesAreReady = true; > + } > > - sendToFirstMother(SHUTDOWN, SHUTDOWN_OK); > + } > + else if(tag == KmersMatrixOwner::KMERS_MATRIX_IS_READY) { > + > + cout << "KMERS_MATRIX_IS_READY" << endl; > + if(m_matricesAreReady){ > + sendToFirstMother(SHUTDOWN, SHUTDOWN_OK); > + }else { > + cout << "KMERS_MATRIX_IS_READY" << endl; > + m_matricesAreReady = true; In the constructor above, I saw that the m_matricesAreReady initialization was commented out. Check that out. > + } > > } else if(tag == FLUSH_AGGREGATOR_OK) { > > - /* > printName(); > cout << "DEBUG received FLUSH_AGGREGATOR_OK" << endl; > - */ > > m_flushedMothers++; > > if(m_flushedMothers < getSize()) > return; > > - // spawn the MatrixOwner here ! > - > - MatrixOwner * matrixOwner = new MatrixOwner(); > - spawn(matrixOwner); > - > - m_matrixOwner = matrixOwner->getName(); > - > - printName(); > - cout << "Spawned MatrixOwner actor !" 
<< endl; > - > - // tell the StoreKeeper actors to send their stuff to the > - // MatrixOwner actor > - // The Mother of Mother will wait for a signal from > MatrixOwner > - > - Message greetingMessage; > - > - vector<string> * names = & m_sampleNames; > - > - char buffer[32]; > - int offset = 0; > - memcpy(buffer + offset, &m_parameters, > sizeof(m_parameters)); > - offset += sizeof(m_parameters); > - memcpy(buffer + offset, &names, sizeof(names)); > - offset += sizeof(names); > - > - greetingMessage.setBuffer(&buffer); > - greetingMessage.setNumberOfBytes(offset); > - > - greetingMessage.setTag(MatrixOwner::GREETINGS); > - send(m_matrixOwner, greetingMessage); > - > - sendToFirstMother(MERGE, MERGE_OK); > - > + spawnMatrixOwner(); I like that. A method to spawn an actor. Good ! > > } else if(tag == m_responseTag) { > > - > if(m_responseTag == SHUTDOWN_OK) { > > - } else if(m_responseTag == MERGE_OK) { > - > - } else if(m_responseTag == FLUSH_AGGREGATOR_RETURN) { > + } else if(m_responseTag == MERGE_GRAM_MATRIX_OK) { > + // All mothers merged their GRAM MATRIX > + // Spawn KmersMatrixOwner to print > + if(m_motherToKill < getSize() && > m_printKmersMatrix){ > + spawnKmersMatrixOwner(); > + } > + } else if(m_responseTag == MERGE_KMERS_MATRIX_OK) { > + } > + else if(m_responseTag == FLUSH_AGGREGATOR_RETURN) { Again, put closing brace on same line (} else if ( ...) { > > /* > printName(); > @@ -254,9 +267,8 @@ void Mother::receive(Message & message) { > */ > } > > - // every mother was informed. > + // every mother was not informed. Good catch ! 
> if(m_motherToKill>= getSize()) { > - > sendMessageWithReply(m_motherToKill, m_forwardTag); > m_motherToKill--; > } > @@ -284,11 +296,15 @@ void Mother::sendMessageWithReply(int & actor, int > tag) { > Message message; > message.setTag(tag); > > - if(tag == MERGE) { > + if(tag == MERGE_GRAM_MATRIX) { > message.setBuffer(&m_matrixOwner); > message.setNumberOfBytes(sizeof(m_matrixOwner)); > - > - } else if(tag == FLUSH_AGGREGATOR) { > + } > + else if(tag == MERGE_KMERS_MATRIX) { > + message.setBuffer(&m_kmersMatrixOwner); > + message.setNumberOfBytes(sizeof(m_kmersMatrixOwner)); > + } > + else if(tag == FLUSH_AGGREGATOR) { > > /* > printName(); > @@ -328,6 +344,10 @@ void Mother::stop() { > m_matrixOwner = -1; > } > > + if(m_kmersMatrixOwner>= 0) { > + send(m_kmersMatrixOwner, kill); > + m_kmersMatrixOwner = -1; > + } > > die(); > > @@ -410,39 +430,44 @@ void Mother::startSurveyor() { > > bool isRoot = (getName() % getSize()) == 0; > > - //cout << "DEBUG startSurveyor isRoot" << isRoot << endl; > - > - // get a list of files. > + // Set matricesAreReady to true in case user doesn't want > + // to print out kmers matrix. > + m_matricesAreReady = true; > > vector<string> * commands = m_parameters->getCommands(); > > - > for(int i = 0 ; i < (int) commands->size() ; ++i) { > > string & element = commands->at(i); > > - // DONE: Check bounds for file names > + if (element != "-print-kmers-matrix") { The name should be kmer-matrix, not kmers-matrix. It is like groceries store vs grocery store. 
> + // DONE: Check bounds for file names > > - map<string,int> fastTable; > + map<string,int> fastTable; > > - fastTable["-read-sample-graph"] = INPUT_TYPE_GRAPH; > - fastTable["-read-sample-assembly"] = INPUT_TYPE_ASSEMBLY; > + fastTable["-read-sample-graph"] = INPUT_TYPE_GRAPH; > + fastTable["-read-sample-assembly"] = > INPUT_TYPE_ASSEMBLY; > > - // Unsupported option > - if(fastTable.count(element) == 0 || i+2> (int) > commands->size()) > - continue; > + // Unsupported option > + if(fastTable.count(element) == 0 || i+2> (int) > commands->size()) > + continue; > > - string sampleName = commands->at(++i); > - string fileName = commands->at(++i); > + string sampleName = commands->at(++i); > + string fileName = commands->at(++i); > > - m_sampleNames.push_back(sampleName); > + m_sampleNames.push_back(sampleName); > > - // DONE implement this m_assemblyFileNames + type > - m_inputFileNames.push_back(fileName); > + // DONE implement this m_assemblyFileNames + type > + m_inputFileNames.push_back(fileName); > > - int type = fastTable[element]; > + int type = fastTable[element]; > > - m_sampleInputTypes.push_back(type); > + m_sampleInputTypes.push_back(type); > + > + } else { > + m_matricesAreReady = false; > + m_printKmersMatrix = true; Question: if m_printKmersMatrix is false, I suppose the code follows the usual path of printing just one matrix, right ? > + } > > } > > @@ -468,6 +493,9 @@ void Mother::startSurveyor() { > > m_storeKeepers.push_back(actor->getName()); > > + actor->setOutputKmersMatrixPath(m_parameters->getPrefix()); The path should be prefix/Surveyor/<whatever the kmer matrix's name is> > + actor->setSamplesSize(m_sampleNames.size()); sample size, not samples size. > + > // tell the CoalescenceManager about the local StoreKeeper > Message dummyMessage; > int localStore = actor->getName(); > @@ -568,6 +596,80 @@ void Mother::spawnReader() { > } > } > > + > +void Mother::spawnMatrixOwner() { > + > + // spawn the MatrixOwner here ! 
> + MatrixOwner * matrixOwner = new MatrixOwner(); > + spawn(matrixOwner); > + > + m_matrixOwner = matrixOwner->getName(); > + > + printName(); > + cout << "Spawned MatrixOwner actor !" << m_matrixOwner << endl; > + > + // tell the StoreKeeper actors to send their stuff to the > + // MatrixOwner actor > + // The Mother of Mother will wait for a signal from MatrixOwner > + > + Message greetingMessage; > + > + vector<string> * names = & m_sampleNames; > + > + char buffer[32]; > + int offset = 0; > + memcpy(buffer + offset, &m_parameters, sizeof(m_parameters)); > + offset += sizeof(m_parameters); > + memcpy(buffer + offset, &names, sizeof(names)); > + offset += sizeof(names); > + > + greetingMessage.setBuffer(&buffer); > + greetingMessage.setNumberOfBytes(offset); > + > + greetingMessage.setTag(MatrixOwner::GREETINGS); > + send(m_matrixOwner, greetingMessage); > + > + sendToFirstMother(MERGE_GRAM_MATRIX, MERGE_GRAM_MATRIX_OK); > +} > + > +void Mother::spawnKmersMatrixOwner() { > + > + // spawn the MatrixOwner here ! > + KmersMatrixOwner * kmersMatrixOwner = new KmersMatrixOwner(); > + spawn(kmersMatrixOwner); > + > + m_kmersMatrixOwner = kmersMatrixOwner->getName(); > + > + printName(); > + cout << "Spawned KmersMatrixOwner actor !" 
<< > m_kmersMatrixOwner << endl; > + > + // tell the StoreKeeper actors to send their stuff to the > + // KmersMatrixOwner actor > + // The Mother of Mother will wait for a signal from MatrixOwner > + > + Message greetingMessage; > + > + vector<string> * names = & m_sampleNames; > + > + char buffer[32]; > + int offset = 0; > + memcpy(buffer + offset, &m_parameters, sizeof(m_parameters)); > + offset += sizeof(m_parameters); > + memcpy(buffer + offset, &names, sizeof(names)); > + offset += sizeof(names); > + > + greetingMessage.setBuffer(&buffer); > + greetingMessage.setNumberOfBytes(offset); > + > + greetingMessage.setTag(KmersMatrixOwner::GREETINGS); > + send(m_kmersMatrixOwner, greetingMessage); > + > + sendToFirstMother(MERGE_KMERS_MATRIX, MERGE_KMERS_MATRIX_OK); > + > +} > + > + > void Mother::setParameters(Parameters * parameters) { > m_parameters = parameters; > } > + > diff --git a/code/Surveyor/Mother.h b/code/Surveyor/Mother.h > index 092920f..9774c4b 100644 > --- a/code/Surveyor/Mother.h > +++ b/code/Surveyor/Mother.h > @@ -28,6 +28,7 @@ > > #include <vector> > #include <string> > +#include <iostream> > using namespace std; > > /** > @@ -55,9 +56,12 @@ class Mother: public Actor { > private: > > int m_matrixOwner; > + int m_kmersMatrixOwner; > > int m_flushedMothers; > int m_finishedMothers; > + bool m_matricesAreReady; > + bool m_printKmersMatrix; > > Parameters * m_parameters; > > @@ -93,6 +97,13 @@ private: > */ > void sendToFirstMother(int forwardTag, int responseTag); > > + /* int m_kmersMatrixBlocNumber; */ > + void printLocalKmersMatrix(string & kmer, string & > samples_kmers, bool force); > + void createKmersMatrixOutputFile(); > + > + void spawnMatrixOwner(); > + void spawnKmersMatrixOwner(); > + That's a good design -- private methods for private uses. 
> public: > > Mother(); > @@ -109,8 +120,10 @@ public: > FLUSH_AGGREGATOR, > FLUSH_AGGREGATOR_OK, > FLUSH_AGGREGATOR_RETURN, > - MERGE, > - MERGE_OK, > + MERGE_GRAM_MATRIX, > + MERGE_GRAM_MATRIX_OK, > + MERGE_KMERS_MATRIX, > + MERGE_KMERS_MATRIX_OK, kmer matrix, not kmers matrix. > LAST_TAG, > }; > > diff --git a/code/Surveyor/StoreKeeper.cpp b/code/Surveyor/StoreKeeper.cpp > index 84eef34..492208c 100644 > --- a/code/Surveyor/StoreKeeper.cpp > +++ b/code/Surveyor/StoreKeeper.cpp > @@ -22,10 +22,16 @@ > #include "StoreKeeper.h" > #include "CoalescenceManager.h" > #include "MatrixOwner.h" > +#include "KmersMatrixOwner.h" > > #include <code/VerticesExtractor/Vertex.h> > +#include <RayPlatform/structures/MyHashTableIterator.h> > +#include <RayPlatform/core/OperatingSystem.h> > > #include <iostream> > +#include <sstream> > +#include <iomanip> > +#include <fstream> > using namespace std; > > #include <string.h> > @@ -83,15 +89,21 @@ void StoreKeeper::receive(Message & message) { > > die(); > > - } else if(tag == MERGE) { > + } else if(tag == MERGE_GRAM_MATRIX) { > > > - printName(); > - cout << "DEBUG at MERGE message reception "; > - cout << "(StoreKeeper) received " << m_receivedObjects > << " objects in total"; > - cout << " with " << m_receivedPushes << " push > operations" << endl; > + // printName(); > + // cout << "DEBUG at MERGE_GRAM_MATRIX message reception "; > + // cout << "(StoreKeeper) received " << > m_receivedObjects << " objects in total"; > + // cout << " with " << m_receivedPushes << " push You can remove commented lines. > operations" << endl; > computeLocalGramMatrix(); > > + > + // TODEL Print matrix bloc > + // m_kmersMatrixBlocNumber = 0; > + // printLocalKmersMatrix(); > + You can remove commented lines. 
> + > m_mother = source; > > memcpy(&m_matrixOwner, buffer, sizeof(m_matrixOwner)); > @@ -108,19 +120,32 @@ void StoreKeeper::receive(Message & message) { > m_iterator2 = m_iterator1->second.begin(); > } > > - /* > - printName(); > - cout << "DEBUG printLocalGramMatrix before first > sendMatrixCell" << endl; > - printLocalGramMatrix(); > - */ > - > + // printName(); > + // cout << "DEBUG printLocalGramMatrix before first > sendMatrixCell" << endl; > + // printLocalGramMatrix(); You can remove commented lines. > sendMatrixCell(); > > } else if(tag == MatrixOwner::PUSH_PAYLOAD_OK) { > - > sendMatrixCell(); > + } else if(tag == MERGE_KMERS_MATRIX) { > + // cout << "DEBUG at MERGE_GRAM_MATRIX message reception "; > + // cout << "(StoreKeeper) received " << > m_receivedObjects << " objects in total"; > + // cout << " with " << m_receivedPushes << " push > operations" << endl; You can remove commented lines. Otherwise, add "#ifdef DEBUG_SOMETHING_SOMETHING" / "#endif" around those lines. > + > + m_mother = source; > > - } else if(tag == CoalescenceManager::SET_KMER_LENGTH) { > + memcpy(&m_kmersMatrixOwner, buffer, > sizeof(m_kmersMatrixOwner)); > + > + m_hashTableIterator.constructor(&m_hashTable); > + > + sendKmersSamples(); > + } > + else if (tag == KmersMatrixOwner::PUSH_KMER_SAMPLES_END) { > + } > + else if(tag == KmersMatrixOwner::PUSH_KMER_SAMPLES_OK) { > + sendKmersSamples(); > + } > + else if(tag == CoalescenceManager::SET_KMER_LENGTH) { > > int kmerLength = 0; > int position = 0; > @@ -181,8 +206,6 @@ void StoreKeeper::sendMatrixCell() { > message.setNumberOfBytes(offset); > message.setTag(MatrixOwner::PUSH_PAYLOAD); > > - //cout << " DEBUG send PUSH_PAYLOAD to " << > m_matrixOwner << endl; > - > send(m_matrixOwner, message); > > m_iterator2++; > @@ -207,10 +230,7 @@ void StoreKeeper::sendMatrixCell() { > // free memory. 
> m_localGramMatrix.clear(); > > - /* > printName(); > - cout << "DEBUG send PUSH_PAYLOAD_END to " << m_matrixOwner << endl; > - */ > > Message response; > response.setTag(MatrixOwner::PUSH_PAYLOAD_END); > @@ -236,6 +256,7 @@ void StoreKeeper::configureHashTable() { > ); > > m_configured = true; > + > } > > void StoreKeeper::printColorReport() { > @@ -375,6 +396,7 @@ void StoreKeeper::computeLocalGramMatrix() { > //printLocalGramMatrix(); > } > > + > void StoreKeeper::printLocalGramMatrix() { > > printName(); > @@ -623,3 +645,73 @@ void StoreKeeper::storeData(Vertex & vertex, int & > sample) { > > */ > } > + > + > +void StoreKeeper::setSamplesSize(int sampleSize) { > + m_sampleSize = sampleSize; > +} > + > +void StoreKeeper::setOutputKmersMatrixPath(string pathPrefix) { > + // m_outputKmersMatrixPath = pathPrefix; > + // m_outputKmersMatrixPath += "/KmersMatrixDump/"; > + // createDirectory(m_outputKmersMatrixPath.c_str()); You can remove commented lines. This file could be in prefix/Surveyor/<...> > +} > + > + > +void StoreKeeper::sendKmersSamples() { > + > + char buffer[4000]; For portability, use MAXIMUM_MESSAGE_SIZE_IN_BYTES instead of 4000. > + int bytes = 0; > + > + ExperimentVertex * currentVertex = NULL; > + VirtualKmerColorHandle currentVirtualColor = NULL_VIRTUAL_COLOR; > + > + vector<bool> samplesVector (m_sampleSize, false); > + > + if(m_hashTableIterator.hasNext()){ > + > + // fill(samplesVector.begin(),samplesVector.end(),false); > + You can remove commented lines. 
> + currentVertex = m_hashTableIterator.next(); > + Kmer kmer = currentVertex->getKey(); > + > + bytes += kmer.dump(buffer); > + > + currentVirtualColor = currentVertex->getVirtualColor(); > + set<PhysicalKmerColor> * samples = > m_colorSet.getPhysicalColors(currentVirtualColor); > + > + for(set<PhysicalKmerColor>:: iterator sampleIterator = > samples->begin(); > + sampleIterator != samples->end(); ++sampleIterator) { > + PhysicalKmerColor value = *sampleIterator; > + samplesVector[value] = true; > + // cout << " " << value; > + } > + > + for (std::vector<bool>::iterator it = > samplesVector.begin(); > + it != samplesVector.end(); ++it) { > + buffer[bytes] = *it; > + bytes++; > + } > + // buffer[bytes] = '\0'; You can remove commented lines. > + } > + > + > + Message message; > + message.setNumberOfBytes(bytes); > + message.setBuffer(buffer); > + > + // message.setTag(MatrixOwner::PUSH_KMERS_SAMPLES); > + if(m_hashTableIterator.hasNext()){ > + message.setTag(KmersMatrixOwner::PUSH_KMER_SAMPLES); > + }else{ > + message.setTag(KmersMatrixOwner::PUSH_KMER_SAMPLES_END); > + } > + > + // Message response; > + // response.setTag(MatrixOwner::PUSH_PAYLOAD_END); > + // send(m_matrixOwner, response); > + > + send(m_kmersMatrixOwner, message); > + > +} > + > diff --git a/code/Surveyor/StoreKeeper.h b/code/Surveyor/StoreKeeper.h > index e44cf98..36adf77 100644 > --- a/code/Surveyor/StoreKeeper.h > +++ b/code/Surveyor/StoreKeeper.h > @@ -34,6 +34,10 @@ > > #include <RayPlatform/actors/Actor.h> > #include <RayPlatform/structures/MyHashTable.h> > +#include <RayPlatform/structures/MyHashTableIterator.h> > + > +#include <iostream> > +#include <sstream> > > /** > * Provides genomic storage. 
> @@ -55,6 +59,7 @@ private: > > int m_mother; > int m_matrixOwner; > + int m_kmersMatrixOwner; > > bool m_configured; > > @@ -64,6 +69,8 @@ private: > */ > MyHashTable<Kmer,ExperimentVertex> m_hashTable; > > + MyHashTableIterator<Kmer,ExperimentVertex> m_hashTableIterator; > + > int m_kmerLength; > bool m_colorSpaceMode; > > @@ -79,6 +86,13 @@ private: > void printLocalGramMatrix(); > void printColorReport(); > > + /* ostringstream m_currentKmer; */ > + /* ostringstream m_currentSamplesKmers; */ > + int m_sampleSize; > + string m_outputKmersMatrixPath; > + void printLocalKmersMatrix(string & m_kmer, string & > m_samplesKmers); > + void sendKmersSamples(); > + > void sendMatrixCell(); > > public: > @@ -86,14 +100,19 @@ public: > StoreKeeper(); > ~StoreKeeper(); > > + void setOutputKmersMatrixPath(string pathPrefix); > + void setSamplesSize(int sampleSize); > + > void receive(Message & message); > > enum { > FIRST_TAG = 10250, > PUSH_SAMPLE_VERTEX, > PUSH_SAMPLE_VERTEX_OK, > - MERGE, > - MERGE_OK, > + MERGE_GRAM_MATRIX, > + MERGE_GRAM_MATRIX_OK, > + MERGE_KMERS_MATRIX, > + MERGE_KMERS_MATRIX_OK, > LAST_TAG > }; > }; |
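The review above asks for a named size limit instead of the literal 4000 in sendKmersSamples(). The following is a minimal sketch of how the per-sample flags could be written with an explicit capacity check. The constant kMaxMessageBytes and the function packSampleFlags are hypothetical names used only for illustration; the real limit would come from RayPlatform's MAXIMUM_MESSAGE_SIZE_IN_BYTES, and this is not the code that was merged.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for RayPlatform's MAXIMUM_MESSAGE_SIZE_IN_BYTES.
const std::size_t kMaxMessageBytes = 4000;

// Writes one byte (0 or 1) per sample into `buffer`, starting at `offset`.
// Returns the new offset, or `offset` unchanged when the flags do not fit.
std::size_t packSampleFlags(const std::vector<bool>& flags,
                            char* buffer,
                            std::size_t capacity,
                            std::size_t offset) {
    assert(offset <= capacity);
    if (flags.size() > capacity - offset) {
        return offset;  // not enough room; the caller must split the payload
    }
    for (std::size_t i = 0; i < flags.size(); ++i) {
        buffer[offset + i] = flags[i] ? 1 : 0;
    }
    return offset + flags.size();
}
```

In the patch, `offset` would be the value returned by kmer.dump(buffer), so the k-mer bytes come first and the flags follow them.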
From: Sébastien B. <seb...@ul...> - 2014-02-19 19:13:35
|
Review:

On 19 February 2014 08:41, Maxime Deraspe [ma...@de...] wrote:
> To: Sébastien Boisvert
> Subject: kmers matrix
>
> diff --git a/code/Surveyor/MatrixOwner.cpp b/code/Surveyor/MatrixOwner.cpp
> index ffaae00..13911ef 100644
> --- a/code/Surveyor/MatrixOwner.cpp
> +++ b/code/Surveyor/MatrixOwner.cpp
> @@ -36,6 +36,8 @@ MatrixOwner::MatrixOwner() {
>
> m_receivedPayloads = 0;
>
> + matricesIsReady = false;

"matrices" is plural, so you should use *are* (e.g. matricesAreReady).

> +
> }
>
> MatrixOwner::~MatrixOwner() {
> @@ -65,7 +67,6 @@ void MatrixOwner::receive(Message & message) {
> assert(m_parameters != NULL);
> assert(m_sampleNames != NULL);
> #endif
> -
> m_mother = source;
>
> } else if(tag == PUSH_PAYLOAD) {
> @@ -100,8 +101,8 @@ void MatrixOwner::receive(Message & message) {
> Message response;
> response.setTag(PUSH_PAYLOAD_OK);
> send(source, response);
> -
> - } else if(tag == PUSH_PAYLOAD_END) {
> + }
> + else if(tag == PUSH_PAYLOAD_END) {
>
> m_completedStoreActors++;
>
> @@ -152,9 +153,12 @@ void MatrixOwner::receive(Message & message) {
>
> // tell Mother that the matrix is ready now.
>
> - Message coolMessage;
> - coolMessage.setTag(MATRIX_IS_READY);
> - send(m_mother, coolMessage);
> + if(matricesIsReady){
> + Message coolMessage;
> + coolMessage.setTag(MATRIX_IS_READY);
> + send(m_mother, coolMessage);
> + }
> + matricesIsReady = true;
>
>
> // clear matrices
> @@ -162,7 +166,79 @@ void MatrixOwner::receive(Message & message) {
> m_localGramMatrix.clear();
> m_kernelDistanceMatrix.clear();
> }
> + }
> + else if(tag == PUSH_KMERS_SAMPLES) {

Change to PUSH_KMER_SAMPLES.

> +
> + char * kmer;
> + char * samples_vector;

Use lower camel case (sampleVector).

Also, this pointer is uninitialized. Using it as is leads to undefined behavior.
> + // vector<bool> samples_vector;
> +
> + int offset = 0;
> +
> + memcpy(&kmer, buffer + offset, sizeof(kmer));
> + offset += sizeof(kmer);
> + memcpy(&samples_vector, buffer + offset,
> sizeof(samples_vector));

You are memcpy'ing into an uninitialized pointer.

> + offset += sizeof(samples_vector);

I think sizeof(samples_vector) is always 8 bytes on 64-bit systems: it is the size of the pointer itself, not the number of bytes it points to.

> +
> +#ifdef CONFIG_ASSERT
> + assert(kmer >= 0);

To check that your pointer is not NULL, use kmer != NULL.

It is invalid to compare a pointer (char * kmer) with an integer (0).

> + assert(samples_vector >= 0);
> +#endif
> + kmer[strlen(kmer)+1] = '\0';
> + samples_vector[strlen(samples_vector)+1] = '\0';
> +
> + // TODEL :
> + cout << "DEBUG push_kmers_samples : " << kmer << endl;
> +
> + string kmerS(kmer);
> + string samples_vectorS(samples_vector);
> + printLocalKmersMatrix(kmerS, samples_vectorS, false);
> +
> + Message response;
> + response.setTag(PUSH_KMERS_SAMPLES_OK);

Change to PUSH_KMER_SAMPLES_OK.

> + send(source, response);
> +
> }
> + else if(tag == PUSH_KMERS_SAMPLES_END) {
> +
> + char * kmer;

The pointer needs storage. Use one of these:

char kmer[255];
char * kmer = (char *) malloc(255*sizeof(char));
char * kmer = new char[255];

To create a Kmer from a char *, see code/Mock/common_functions.h:

Kmer wordId(const char*a);

To transfer a Kmer in a network buffer, use load/dump (interface CarriageableItem; Kmer implements it):
int load(const char * buffer);
int dump(char * buffer) const;
int getRequiredNumberOfBytes() const;

To dump a vector<bool> in a network buffer:

vector<bool> kmerSamples;
kmerSamples.resize(numberOfSamples);

for(int i = 0 ; i < (int) kmerSamples.size() ; ++i)
    kmerSamples[i] = false;

// fetch samples from VirtualColor
// bla bla bla

char buffer[4000];
int bytes = 0;

bytes += kmerObject.dump(buffer);

for(vector<bool>::iterator myIterator = kmerSamples.begin() ;
    myIterator != kmerSamples.end() ; ++myIterator) {
    buffer[bytes] = *myIterator;
    bytes++;
}

Message message;
message.setNumberOfBytes(bytes);
message.setBuffer(buffer);
send(HENRY, message);

> + char * samples_vector;
> + // vector<bool> samples_vector;
> +
> + int offset = 0;
> +
> + memcpy(&kmer, buffer + offset, sizeof(kmer));
> + offset += sizeof(kmer);
> + memcpy(&samples_vector, buffer + offset,
> sizeof(samples_vector));
> + offset += sizeof(samples_vector);
> +
> +#ifdef CONFIG_ASSERT
> + assert(kmer >= 0);

Check that it is not NULL with kmer != NULL.

> + assert(samples_vector >= 0);
> +#endif
> + kmer[strlen(kmer)+1] = '\0';
> + samples_vector[strlen(samples_vector)+1] = '\0';
> + // TODEL :
> + cout << "DEBUG push_kmers_samples END : " << kmer << endl;
> +
> + string kmerS(kmer);
> + string samples_vectorS(samples_vector);
> + printLocalKmersMatrix(kmerS, samples_vectorS, false);
> +
> + Message response;
> + response.setTag(PUSH_KMERS_SAMPLES_OK);
> + send(source, response);
> +
> + // tell Mother that the matrix is ready now.
> +
> + if(matricesIsReady){
> + Message coolMessage;
> + coolMessage.setTag(MATRIX_IS_READY);
> + send(m_mother, coolMessage);
> + }

You should probably create two types of Actor (KmerFileOwner and StoreKeeperIterator, or something like this), because otherwise a single actor has to support both personalities.
> + matricesIsReady = true;
> +
> + }
> }
>
>
> @@ -275,3 +351,29 @@ void
> MatrixOwner::printLocalGramMatrixWithHash(ostream & stream, map<SampleIdent
> stream << endl;
> }
> }
> +
> +
> +
> +void MatrixOwner::printLocalKmersMatrix(string & kmer, string &
> samples_kmers, bool force) {
> +
> + m_kmersMatrix << kmer;
> + for(std::string::iterator sampleKmerBool =
> samples_kmers.begin(); sampleKmerBool != samples_kmers.end();
> ++sampleKmerBool) {
> + // do_things_with(*sampleKmerBool);
> + m_kmersMatrix << "\t" << *sampleKmerBool;
> + // TODEL :
> + cout << "\t" << *sampleKmerBool;
> + }
> + m_kmersMatrix << endl;
> +
> +
> flushFileOperationBuffer(force,&m_kmersMatrix,&m_kmersMatrixFile, 4096);

Use CONFIG_FILE_IO_BUFFER_SIZE instead of 4096.

> +}
> +
> +
> +void MatrixOwner::createKmersMatrixOutputFile() {

Use "kmer matrix", not "kmers matrix".

> +
> + ostringstream kmersMatrix;
> + kmersMatrix << m_parameters->getPrefix() << "/Surveyor/";
> + kmersMatrix << "KmersMatrix.tsv";
> + m_kmersMatrixFile.open(kmersMatrix.str().c_str());
> + // similarityFile.close();
> +}
> diff --git a/code/Surveyor/MatrixOwner.h b/code/Surveyor/MatrixOwner.h
> index ceb17e2..ef0cc5f 100644
> --- a/code/Surveyor/MatrixOwner.h
> +++ b/code/Surveyor/MatrixOwner.h
> @@ -28,6 +28,7 @@
>
> #include <map>
> #include <iostream>
> +#include <sstream>
> using namespace std;
>
> class MatrixOwner : public Actor {
> @@ -49,6 +50,15 @@ private:
>
> void computeDistanceMatrix();
>
> + ostringstream m_kmersMatrix;
> + ofstream m_kmersMatrixFile;
> +
> + void printLocalKmersMatrix(string & kmer, string &
> samples_kmers, bool force);
> + void createKmersMatrixOutputFile();
> +
> +
> + bool matricesIsReady;
> +
> public:
>
> MatrixOwner();
> @@ -62,6 +72,9 @@ public:
> PUSH_PAYLOAD,
> PUSH_PAYLOAD_OK,
> PUSH_PAYLOAD_END,
> + PUSH_KMERS_SAMPLES,
> + PUSH_KMERS_SAMPLES_OK,
> + PUSH_KMERS_SAMPLES_END,
> MATRIX_IS_READY,
> LAST_TAG
> };
> diff --git a/code/Surveyor/Mother.cpp b/code/Surveyor/Mother.cpp
> index
4d2ef9c..8fe0789 100644 > --- a/code/Surveyor/Mother.cpp > +++ b/code/Surveyor/Mother.cpp > @@ -410,6 +410,9 @@ void Mother::startSurveyor() { > > bool isRoot = (getName() % getSize()) == 0; > > + //TODEL > + // m_kmersMatrixBlocNumber = 0; > + > //cout << "DEBUG startSurveyor isRoot" << isRoot << endl; > > // get a list of files. > @@ -468,6 +471,13 @@ void Mother::startSurveyor() { > > m_storeKeepers.push_back(actor->getName()); > > + //TODEL > + // set the vector of samples into the storekeeper, and > path to write > + // actor->setSamplesVector(&m_sampleNames); > + actor->setOutputKmersMatrixPath(m_parameters->getPrefix()); > + // > actor->setKmersMatrixBlocNumber(m_kmersMatrixBlocNumber); > + // ++m_kmersMatrixBlocNumber; > + > // tell the CoalescenceManager about the local StoreKeeper > Message dummyMessage; > int localStore = actor->getName(); > diff --git a/code/Surveyor/Mother.h b/code/Surveyor/Mother.h > index 092920f..207127b 100644 > --- a/code/Surveyor/Mother.h > +++ b/code/Surveyor/Mother.h > @@ -28,6 +28,7 @@ > > #include <vector> > #include <string> > +#include <iostream> > using namespace std; > > /** > @@ -93,6 +94,11 @@ private: > */ > void sendToFirstMother(int forwardTag, int responseTag); > > + /* int m_kmersMatrixBlocNumber; */ > + void printLocalKmersMatrix(string & kmer, string & > samples_kmers, bool force); > + void createKmersMatrixOutputFile(); > + > + > public: > > Mother(); > diff --git a/code/Surveyor/StoreKeeper.cpp b/code/Surveyor/StoreKeeper.cpp > index 84eef34..0dd84e3 100644 > --- a/code/Surveyor/StoreKeeper.cpp > +++ b/code/Surveyor/StoreKeeper.cpp > @@ -24,8 +24,13 @@ > #include "MatrixOwner.h" > > #include <code/VerticesExtractor/Vertex.h> > +#include <RayPlatform/structures/MyHashTableIterator.h> > +#include <RayPlatform/core/OperatingSystem.h> > > -#include <iostream> > +#include <iostream> > +#include <sstream> > +#include <iomanip> > +#include <fstream> > using namespace std; > > #include <string.h> > @@ -92,6 +97,12 @@ 
void StoreKeeper::receive(Message & message) { > cout << " with " << m_receivedPushes << " push operations" << endl; > computeLocalGramMatrix(); > > + > + // TODEL Print matrix bloc > + // m_kmersMatrixBlocNumber = 0; block > + // printLocalKmersMatrix(); > + > + > m_mother = source; > > memcpy(&m_matrixOwner, buffer, sizeof(m_matrixOwner)); > @@ -114,13 +125,19 @@ void StoreKeeper::receive(Message & message) { > printLocalGramMatrix(); > */ > > + m_hashTableIterator.constructor(&m_hashTable); > + > sendMatrixCell(); > > - } else if(tag == MatrixOwner::PUSH_PAYLOAD_OK) { > + sendKmersSamples(); > > + } else if(tag == MatrixOwner::PUSH_PAYLOAD_OK) { > sendMatrixCell(); > > - } else if(tag == CoalescenceManager::SET_KMER_LENGTH) { > + } else if(tag == MatrixOwner::PUSH_KMERS_SAMPLES_OK) { > + sendKmersSamples(); > + } > + else if(tag == CoalescenceManager::SET_KMER_LENGTH) { > > int kmerLength = 0; > int position = 0; > @@ -236,6 +253,8 @@ void StoreKeeper::configureHashTable() { > ); > > m_configured = true; > + > + // m_hashTableIterator.constructor(&m_hashTable); > } > > void StoreKeeper::printColorReport() { > @@ -375,6 +394,7 @@ void StoreKeeper::computeLocalGramMatrix() { > //printLocalGramMatrix(); > } > > + > void StoreKeeper::printLocalGramMatrix() { > > printName(); > @@ -623,3 +643,123 @@ void StoreKeeper::storeData(Vertex & vertex, int & > sample) { > > */ > } > + > + > +// void StoreKeeper::setSamplesVector(vector<string> * samplesId) { > +// for (std::vector<bool>::iterator it = samplesVector.begin() ; > +// it != samplesVector.end(); ++it) { > +// m_currentSamplesKmers << *it << "\t"; > +// } > +// m_currentSamplesKmers = samplesId; > +// } > + > +void StoreKeeper::setOutputKmersMatrixPath(string pathPrefix) { > + // m_outputKmersMatrixPath = pathPrefix; > + // m_outputKmersMatrixPath += "/KmersMatrixDump/"; > + // createDirectory(m_outputKmersMatrixPath.c_str()); > +} > + > + > +// void StoreKeeper::setKmersMatrixBlocNumber(int blocNb) { > + > +// // 
m_kmersMatrixBlocNumber = blocNb; > +// } > + > +void StoreKeeper::sendKmersSamples() { > + > + cout << "sendKmersSamples_traces"<< endl; > + > + string kmerString; > + string samplesKmers; > + > + printLocalKmersMatrix(kmerString, samplesKmers); > + > + cout << "DEBUG sendKmersSamples :" << kmerString << > samplesKmers << endl; > + > + > + Message message; > + char buffer[4096]; > + int offset = 0; > + > + memcpy(buffer + offset, kmerString.c_str(), kmerString.length()); > + offset += kmerString.length(); > + memcpy(buffer + offset, samplesKmers.c_str(), > samplesKmers.length()); > + offset += samplesKmers.length(); > + > + message.setBuffer(buffer); > + message.setNumberOfBytes(offset); > + > + message.setTag(MatrixOwner::PUSH_KMERS_SAMPLES); > + if(m_hashTableIterator.hasNext()){ > + message.setTag(MatrixOwner::PUSH_KMERS_SAMPLES); > + }else{ > + message.setTag(MatrixOwner::PUSH_KMERS_SAMPLES_END); > + } > + > + send(m_matrixOwner, message); > + > +} > + > + > +void StoreKeeper::printLocalKmersMatrix(string & kmerString, string & > samplesKmers) { > + > + ExperimentVertex * currentVertex; > + VirtualKmerColorHandle currentVirtualColor; > + > + vector<bool> samplesVector (m_currentSamplesKmers.tellp(), false); > + > + // ofstream kmersMatrixOutFile; > + // stringstream matrixOutFileName; > + > + // m_currentKmer.clear(); > + // m_currentSamplesKmers.clear(); > + > + cout << "YOYOYO "<< m_hashTableIterator.hasNext() << endl; > + // matrixOutFileName << m_outputKmersMatrixPath; > + // matrixOutFileName << "kmatrix_bloc-"; > + // matrixOutFileName << setw(3) << setfill('0') << > m_kmersMatrixBlocNumber; > + // matrixOutFileName << ".tsv"; > + > + // > kmersMatrixOutFile.open(matrixOutFileName.str().c_str(),ios::app); > + > + if(m_hashTableIterator.hasNext()){ > + > + fill(samplesVector.begin(),samplesVector.end(),false); > + currentVertex = m_hashTableIterator.next(); > + Kmer kmer = currentVertex->getKey(); > + > + // cout << "DEBUG vertex :" << > 
kmer.idToWord(m_kmerLength, m_colorSpaceMode) << " color: "; > + // kmersMatrixOutFile << kmer.idToWord(m_kmerLength, > m_colorSpaceMode) << "\t"; > + // m_currentKmer << kmer.idToWord(m_kmerLength, > m_colorSpaceMode) << "\t"; > + kmerString = kmer.idToWord(m_kmerLength, m_colorSpaceMode); > + > + currentVirtualColor = currentVertex->getVirtualColor(); > + set<PhysicalKmerColor> * samples = > m_colorSet.getPhysicalColors(currentVirtualColor); > + > + for(set<PhysicalKmerColor>:: iterator sampleIterator = > samples->begin(); > + sampleIterator != samples->end(); ++sampleIterator) { > + PhysicalKmerColor value = *sampleIterator; > + samplesVector[value] = true; > + // cout << " " << value; > + } > + > + for (std::vector<bool>::iterator it = > samplesVector.begin() ; > + it != samplesVector.end(); ++it) { > + // m_currentSamplesKmers << *it << "\t"; > + samplesKmers += '\t'; > + samplesKmers += *it; > + } > + > + // cout << endl; > + // samplesKmers += '\n'; > + // m_currentSamplesKmers << '\n'; > + // kmersMatrixOutFile << endl; > + } > + > + cout << "DEBUG printLocalKmers " << kmerString << samplesKmers > << endl; > + > + // kmerString = m_currentKmer.str(); > + // samplesKmers = m_currentSamplesKmers.str(); > + // kmersMatrixOutFile.close(); > + // m_kmersMatrixBlocNumber++; > +} > diff --git a/code/Surveyor/StoreKeeper.h b/code/Surveyor/StoreKeeper.h > index e44cf98..94ced7a 100644 > --- a/code/Surveyor/StoreKeeper.h > +++ b/code/Surveyor/StoreKeeper.h > @@ -34,6 +34,10 @@ > > #include <RayPlatform/actors/Actor.h> > #include <RayPlatform/structures/MyHashTable.h> > +#include <RayPlatform/structures/MyHashTableIterator.h> > + > +#include <iostream> > +#include <sstream> > > /** > * Provides genomic storage. 
> @@ -64,6 +68,8 @@ private: > */ > MyHashTable<Kmer,ExperimentVertex> m_hashTable; > > + MyHashTableIterator<Kmer,ExperimentVertex> m_hashTableIterator; > + > int m_kmerLength; > bool m_colorSpaceMode; > > @@ -79,6 +85,13 @@ private: > void printLocalGramMatrix(); > void printColorReport(); > > + ostringstream m_currentKmer; > + ostringstream m_currentSamplesKmers; > + string m_outputKmersMatrixPath; > + void printLocalKmersMatrix(string & m_kmer, string & > m_samplesKmers); > + > + void sendKmersSamples(); > + > void sendMatrixCell(); > > public: > @@ -86,6 +99,8 @@ public: > StoreKeeper(); > ~StoreKeeper(); > > + void setOutputKmersMatrixPath(string pathPrefix); > + > void receive(Message & message); > > enum { > |
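The review points out that the receiving side of the patch copies bytes into pointer variables instead of decoding the payload. As a complement to the sender-side snippet in the review, here is a minimal sketch of the receiving side, assuming the payload layout used there (k-mer bytes first, then one byte per sample). The function names unpackSampleFlags and formatMatrixRow are hypothetical, and the k-mer is passed as already-decoded text rather than through Kmer::load(). Note that the row uses the characters '0' and '1', whereas appending a bool directly to a string, as the patch does, stores the raw byte values 0 and 1.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Reads `sampleCount` one-byte flags from `buffer`, starting at `offset`.
// `messageBytes` is the payload size reported by the message.
std::vector<bool> unpackSampleFlags(const char* buffer,
                                    std::size_t messageBytes,
                                    std::size_t offset,
                                    std::size_t sampleCount) {
    assert(buffer != NULL);
    assert(offset + sampleCount <= messageBytes);
    std::vector<bool> flags(sampleCount, false);
    for (std::size_t i = 0; i < sampleCount; ++i) {
        flags[i] = (buffer[offset + i] != 0);
    }
    return flags;
}

// Builds one tab-separated row: the k-mer text, then '0' or '1' per sample.
std::string formatMatrixRow(const std::string& kmerText,
                            const std::vector<bool>& flags) {
    std::string row = kmerText;
    for (std::size_t i = 0; i < flags.size(); ++i) {
        row += '\t';
        row += flags[i] ? '1' : '0';
    }
    return row;
}
```

Because the number of samples is known to the receiver, no terminating '\0' and no strlen() call are needed to find the end of the flags.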
From: Sébastien B. <se...@bo...> - 2014-02-18 16:28:30
|
Hi Maxime,

Option 1: store keepers send messages to a single actor that writes the file in real time.

Be sure to buffer the I/O.

StoreKeeper ------------> MatrixOwner ---------> KmerFeatureFile
StoreKeeper ------------>
StoreKeeper ------------>
StoreKeeper ------------>

One issue with this solution is that two runs lead to two different KmerFeatureFile files, because the order of the kmers will be different. However, both files will contain the same kmers, so strictly speaking that's the same data.

Option 2: use MPI I/O

The other approach would be to use MPI I/O (there is one StoreKeeper actor per MPI rank, so it would work). You would basically need to know how many bytes are required by each StoreKeeper in order to set the MPI file views (offsets, in some way).

I prefer option 1 because it is more natural with regard to the actor model. I think that the MPI I/O option will lead to better performance on file systems like GPFS or Lustre.

Anyway, that's pretty much the idea.
|
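The buffering advice for Option 1 can be sketched as follows: rows accumulate in memory and reach the output stream only once a byte threshold is crossed. This is an illustration under assumptions, not RayPlatform code; the class BufferedRowWriter is a hypothetical name, and in Ray the threshold would presumably be a configuration constant such as CONFIG_FILE_IO_BUFFER_SIZE.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>

// Accumulates rows in memory and writes them to `out` in larger chunks.
class BufferedRowWriter {
public:
    BufferedRowWriter(std::ostream& out, std::size_t thresholdBytes)
        : m_out(out), m_threshold(thresholdBytes), m_pendingBytes(0) {
        assert(thresholdBytes > 0);
    }

    // Queues one row; flushes when the pending bytes reach the threshold.
    void addRow(const std::string& row) {
        m_pending << row << '\n';
        m_pendingBytes += row.size() + 1;
        if (m_pendingBytes >= m_threshold) {
            flush();
        }
    }

    // Writes everything that is pending; call once more at the end of the run.
    void flush() {
        m_out << m_pending.str();
        m_pending.str("");
        m_pendingBytes = 0;
    }

private:
    std::ostream& m_out;
    std::size_t m_threshold;
    std::ostringstream m_pending;
    std::size_t m_pendingBytes;
};
```

The single writing actor would call addRow() for each message it receives and flush() when the last store keeper reports that it is done.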
From: Sébastien B. <seb...@ul...> - 2014-02-17 20:31:33
|
I merged your changes. I also added proper documentation (Ray -help).

Keep up the good work!

On 17 February 2014 07:12, Maxime Déraspe [max...@gm...] wrote:
> To: Sébastien Boisvert; den...@li...
> Subject: Re: RE : [Denovoassembler-devel] pull request for GenomeAssemblyReader
>
> On 02/17/2014 04:55 PM, Sébastien Boisvert wrote:
>> On 17 February 2014 06:36, Maxime Déraspe [max...@gm...] wrote:
>>> To: den...@li...
>>> Subject: [Denovoassembler-devel] pull request for GenomeAssemblyReader
>>>
>>> Hi,
>>>
>>> I have added 2 classes in Ray to load and "kmerize" any fasta files,
>>> it can be assembly contigs or reference genomes.
>>>
>>> It aims to be used with Surveyor the same way as it was for loading a
>>> graph :
>>> -read-sample-assembly
>>> instead of
>>> -read-sample-graph
>>>
>>> ------
>>> mpiexec -n 2 ../Ray \
>>> -k 31 \
>>> -output RaySurveyorResults-assembly \
>>> -run-surveyor \
>>> -read-sample-assembly test1 ./contigs/test.fasta \
>>> -read-sample-assembly test2 ./contigs/test2.fasta
>>> ------
>>>
>>> it can be found in this commit
>>> bb7c62d5c8ed0dceab6571dcbac9382467b15de7
>>> @ https://github.com/Zorino/ray/commits/master
>>>
>> Hi,
>>
>>
>> 1. In Mother.cpp, why is there a +1 here: actor->setKmerSize(m_parameters->getWordSize()+1);
>>
>> 2. +#define I_LIKE_FAST_IO is not used in AssemblyReader.h
>>
>>
>> You should work in a branch.
>>
>> When comment 1. is addressed, I'll merge. Comment 2. is not important.
>>
>>
>
> please pull from
>
> https://github.com/Zorino/ray.git
> fixkmerlen
>
>>> cheers,
>>>
>>> Maxime
>>>
>>>
>>> ------------------------------------------------------------------------------
>>> Managing the Performance of Cloud-Based Applications
>>> Take advantage of what the Cloud has to offer - Avoid Common Pitfalls.
>>> Read the Whitepaper.
>>> http://pubads.g.doubleclick.net/gampad/clk?id=121054471&iu=/4140/ostg.clktrk
>>> _______________________________________________
>>> Denovoassembler-devel mailing list
>>> Den...@li...
>>> https://lists.sourceforge.net/lists/listinfo/denovoassembler-devel
>
|
From: Maxime D. <max...@gm...> - 2014-02-17 17:14:05
|
On 02/17/2014 04:55 PM, Sébastien Boisvert wrote:
> On 17 February 2014 06:36, Maxime Déraspe [max...@gm...] wrote:
>> To: den...@li...
>> Subject: [Denovoassembler-devel] pull request for GenomeAssemblyReader
>>
>> Hi,
>>
>> I have added 2 classes in Ray to load and "kmerize" any fasta files,
>> it can be assembly contigs or reference genomes.
>>
>> It aims to be used with Surveyor the same way as it was for loading a
>> graph :
>> -read-sample-assembly
>> instead of
>> -read-sample-graph
>>
>> ------
>> mpiexec -n 2 ../Ray \
>> -k 31 \
>> -output RaySurveyorResults-assembly \
>> -run-surveyor \
>> -read-sample-assembly test1 ./contigs/test.fasta \
>> -read-sample-assembly test2 ./contigs/test2.fasta
>> ------
>>
>> it can be found in this commit
>> bb7c62d5c8ed0dceab6571dcbac9382467b15de7
>> @ https://github.com/Zorino/ray/commits/master
>>
> Hi,
>
>
> 1. In Mother.cpp, why is there a +1 here: actor->setKmerSize(m_parameters->getWordSize()+1);
>
> 2. +#define I_LIKE_FAST_IO is not used in AssemblyReader.h
>
>
> You should work in a branch.
>
> When comment 1. is addressed, I'll merge. Comment 2. is not important.
>
>

please pull from

https://github.com/Zorino/ray.git
fixkmerlen

>> cheers,
>>
>> Maxime
|
From: Sébastien B. <seb...@ul...> - 2014-02-17 16:56:54
|
On 17 February 2014 06:36, Maxime Déraspe [max...@gm...] wrote:
> To: den...@li...
> Subject: [Denovoassembler-devel] pull request for GenomeAssemblyReader
>
> Hi,
>
> I have added 2 classes in Ray to load and "kmerize" any fasta files,
> it can be assembly contigs or reference genomes.
>
> It aims to be used with Surveyor the same way as it was for loading a
> graph :
> -read-sample-assembly
> instead of
> -read-sample-graph
>
> ------
> mpiexec -n 2 ../Ray \
> -k 31 \
> -output RaySurveyorResults-assembly \
> -run-surveyor \
> -read-sample-assembly test1 ./contigs/test.fasta \
> -read-sample-assembly test2 ./contigs/test2.fasta
> ------
>
> it can be found in this commit
> bb7c62d5c8ed0dceab6571dcbac9382467b15de7
> @ https://github.com/Zorino/ray/commits/master
>

Hi,

1. In Mother.cpp, why is there a +1 here: actor->setKmerSize(m_parameters->getWordSize()+1);

2. +#define I_LIKE_FAST_IO is not used in AssemblyReader.h

You should work in a branch.

When comment 1. is addressed, I'll merge. Comment 2. is not important.

> cheers,
>
> Maxime
|
From: Maxime D. <max...@gm...> - 2014-02-17 16:37:21
|
Hi,

I have added 2 classes in Ray to load and "kmerize" any FASTA file; the input can be assembly contigs or reference genomes.

They are meant to be used with Surveyor the same way a graph is loaded, with
-read-sample-assembly
instead of
-read-sample-graph

------
mpiexec -n 2 ../Ray \
 -k 31 \
 -output RaySurveyorResults-assembly \
 -run-surveyor \
 -read-sample-assembly test1 ./contigs/test.fasta \
 -read-sample-assembly test2 ./contigs/test2.fasta
------

The code can be found in this commit:
bb7c62d5c8ed0dceab6571dcbac9382467b15de7
@ https://github.com/Zorino/ray/commits/master

cheers,

Maxime
|
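For readers unfamiliar with the term, "kmerizing" a sequence means enumerating its overlapping substrings of length k. The sketch below illustrates the idea on a plain string; it is not the SequenceKmerReader class from the commit. The function name kmerize is hypothetical, and skipping windows that contain characters other than A, C, G and T is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns every substring of length k, from left to right.
// Windows containing a character other than A, C, G or T are skipped.
std::vector<std::string> kmerize(const std::string& sequence, std::size_t k) {
    assert(k > 0);
    std::vector<std::string> kmers;
    for (std::size_t start = 0; start + k <= sequence.size(); ++start) {
        const std::string window = sequence.substr(start, k);
        if (window.find_first_not_of("ACGT") == std::string::npos) {
            kmers.push_back(window);
        }
    }
    return kmers;
}
```

With -k 31 as in the command above, a contig of length n yields at most n - 30 k-mers.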
From: Sébastien B. <seb...@ul...> - 2014-02-14 20:57:08
|
Hi Maxime,

Your productivity with the "RayPlatform Actor Playground API" shows that the actor model is superior to RayPlatform's old "monolithic" model.

Your code:

[boisver1@cp0260-mp2 ray]$ git shortlog master..zorino/assemblyReader
Maxime Déraspe (4):
      added .gitignore for conviviality
      Added 2 classes : SequenceKmerReader (read a fasta file into kmers, GenomeAssemblyReader : the actor to manage the creation of kmers and offloading to the storekeeper
      Changed Mother class to distinguish between a graph reader and a sequence assembly reader
      Added GenomeAssemblyReader and SequenceKmerReader to the Makefile

[boisver1@cp0260-mp2 ray]$ git diff --stat master..zorino/assemblyReader
 .gitignore | 5 +
 code/Surveyor/GenomeAssemblyReader.cpp | 256 ++++++++++++++++++++++++++++++++
 code/Surveyor/GenomeAssemblyReader.h | 89 +++++++++++
 code/Surveyor/Makefile | 2 +
 code/Surveyor/Mother.cpp | 96 +++++++++----
 code/Surveyor/Mother.h | 3 +-
 code/Surveyor/SequenceKmerReader.cpp | 117 +++++++++++++++
 code/Surveyor/SequenceKmerReader.h | 59 ++++++++
 8 files changed, 599 insertions(+), 28 deletions(-)

My comments on your code:

1. The copyright in b/code/Surveyor/GenomeAssemblyReader.cpp must be Maxime Déraspe 2014, not 2013.

2. You can replace KMER_SIZE with m_parameters->getWordSize(). If your actor does not have it, give it to the actor somewhere in Mother (near the spawn() call).

3. The FIRST_TAG of your AssemblyReader could be different from 10200, because GraphReader also uses this range. In theory and in practice, this changes nothing since your messages are

4. There is red (whitespace errors) in GenomeAssemblyReader::startParty with 'git diff --color'.

5. b/code/Surveyor/GenomeAssemblyReader.h must be Copyright Maxime Déraspe, not Sébastien Boisvert.

6. +#define I_LIKE_FAST_IO is not used in AssemblyReader.h.

7. The readKmer method in AssemblyReader must be private.

8. If you want, you can add a Copyright line with your name in Mother.{h,cpp} after mine.

9. In Mother::startSurveyor -> line xxx, there is one extra space before '}'.

10. /SequenceKmerReader.cpp has a lot of red in 'git diff --color'.

11. The commit message of 9d6b4d057a67bafb8bb611ccc1f3a7136693f9fd is too long. But you can leave it like that because it is hard to change.

You just have to implement these changes in the same branch, and then I will merge.

On 14 February 2014 09:32, Maxime Déraspe [max...@ul...] wrote:
> To: Sébastien Boisvert
> Subject: Re: RE : RE : C++ quotes
>
> On 02/14/2014 06:54 PM, Sébastien Boisvert wrote:
>> On 14 February 2014 08:38, Maxime Déraspe [max...@ul...] wrote:
>>> To: Sébastien Boisvert
>>> Subject: Re: RE : C++ quotes
>>>
>>> I just noticed that I had left some DEBUG output in...
>>> I will make another commit.
>>
>> Tell me when it is ready to be reviewed.
> https://github.com/Zorino/ray/commits/assemblyReader
> it's ready
>>
>>>
>>> On 02/14/2014 05:48 PM, Sébastien Boisvert wrote:
>>>> On 14 February 2014 06:26, Maxime Déraspe [max...@ul...] wrote:
>>>>> To: Sébastien Boisvert
>>>>> Subject: Re: C++ quotes
>>>>>
>>>>> My loader finally works!!
>>>>> I am pretty proud of myself, and thanks again for your help!
>>>>> I will make a clean commit this afternoon or over the weekend so that you can review
>>>>> it and push it afterwards.
>>>>> I still need to make it cleaner to pass the parent and child to the
>>>>> storekeeper, and maybe the depth, unless that is already handled in the
>>>>> storekeeper. The depth would be easy to add in the storekeeper, I
>>>>> think, but parent and child might be better handled in the
>>>>> loader.
>>>>> And at some undetermined point in the future I will try profiling on
>>>>> proteins!!
>>>>> I am starting to get a taste for C++.
>>>> hehe
>>>>
>>>>> Could you remind me which code I could use as a basis to print the
>>>>> matrix of colors per kmer, that is, the samples that have them?
>>>> There is code that iterates over the k-mers in CoverageGatherer.cpp
>>>>
>>>>> I need the iterator over the kmers and then the one over the colors
>>>>> (with their tag), if I remember correctly.
>>>> The vertices have a "virtual color" member
>>>>
>>>>> I think the Gram matrix is computed in Storekeeper?
>>>> Yes, locally. Synchronized in MatrixOwner.
>>>>
>>>>> For the iterator, you told me to look in the CoalescenceManager,
>>>>> is that right?
>>>> In CoverageGatherer
>>>>
>>>>> Thanks
>>>>>
>>>>> Maxime
>>>>>
>>>>> On 02/13/2014 09:44 PM, Sébastien Boisvert wrote:
>>>>>> http://harmful.cat-v.org/software/c++/
>>>>>>
>>>>>>
>>>>>> Everyday things made in C++ http://www.stroustrup.com/applications.html
|