From: Josh F. <jf...@pv...> - 2007-02-26 18:43:59
Why require multiple SDs? Why not instead require a single SD and force spooling? It is a bit restrictive, but it simplifies things: we would have spooled data that a single SD can then write repeatedly, once for each pool to be written to. Should any write operation fail, all volumes involved can be marked in error and the job failed in the usual way. This would also work with a single-drive autoloader, should have no timeout problem if operator intervention were needed to load volumes, and requires no client changes.

On defining copy pools: I would think a linked list of "pools to be written" could be built at job startup. The primary pool is added first. If it has a "CopyPool" defined, add that copy pool to the list; if the first copy pool has a CopyPool of its own, add the second copy pool, and so on, until one is added that has no CopyPool defined. This way there is no inherent restriction on the number of copy pools.

David Boyes wrote:
>> Also, Arno presented an idea to me at FOSDEM for doing copies from SD
>> to SD which seems much easier to do than trying to multiplex a FD. As
>> a result, at the same time, I'll take a look at what extra work it
>> would be to have two SDs talk to each other for doing copies ...
>
> That's an interesting idea, for a number of reasons. I'm not sure it's
> any simpler to implement than the proxy FD idea, but it opens up some
> other things that might be useful (zero-copy stuff in SANs, more
> effective use of flashcopy, etc).
>
> Some thoughts on possible logic for this:
>
> Assumptions: each "primary" pool (a pool with copypools enabled) would
> have a maximum of two copies (the pool itself, copy 1, and copy 2).
> These are normal Pools, containing volumes. The main pool definition
> would have Copy Pool 1= and Copy Pool 2= keywords in the pool resource,
> indicating what pools to use to select volumes for the copies.
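The pool-chain idea above (primary pool first, then each CopyPool in turn) might be sketched roughly as below. This is only an illustration: `Pool`, `copy_pool`, and `build_write_list` are hypothetical names, not Bacula's actual Pool resource or code.

```python
# Hypothetical sketch of building the "pools to be written" list at job
# startup.  Pool/copy_pool stand in for a Pool resource with a
# "CopyPool =" directive; not real Bacula code.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Pool:
    name: str
    copy_pool: Optional["Pool"] = None   # the pool's CopyPool, if any

def build_write_list(primary: Pool, limit: int = 16) -> List[Pool]:
    """Primary pool first, then follow CopyPool links until a pool has
    no CopyPool defined.  `limit` guards against a cyclic chain."""
    chain: List[Pool] = []
    p: Optional[Pool] = primary
    while p is not None:
        if len(chain) >= limit:
            raise ValueError("CopyPool chain too long (cycle?)")
        chain.append(p)
        p = p.copy_pool
    return chain

offsite2 = Pool("Offsite2")
offsite1 = Pool("Offsite1", copy_pool=offsite2)
full     = Pool("Full", copy_pool=offsite1)
print([p.name for p in build_write_list(full)])
# prints ['Full', 'Offsite1', 'Offsite2']
```

Since the list is built by simply walking the chain, there is no fixed limit on the number of copy pools beyond the cycle guard.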
> These pools MAY NOT have copy pools of their own, but MAY have Next
> Pool resources and participate in migration.
>
> 1) Job starts, and the usual evaluation of resources occurs. When the
> director looks at the Pool resource, it detects that copypool1 and/or
> copypool2 are non-null.
>
> 2) Director does device reservation processing for the main pool, then
> does device reservation for one device in each copy pool. If devices
> are not available, or no volumes are available in the primary AND all
> copy pools AND there is no scratch pool available with media
> appropriate for the pool definition, fail the job. If there is scratch
> media available of the right type for a pool, move the volume from the
> scratch pool into the appropriate pool and continue.
>
> 3) Director forks off a proxy SD/FD process and marks the reservations
> as assigned to the proxy process.
>
> 4) Director connects to the proxy and feeds it the number of streams
> and what SDs to connect to, including appropriate credentials to use.
>
> 5) Proxy connects to the SDs controlling the reserved devices using
> the FD's credentials (some thought needed here to determine whether we
> need the FD's creds or whether the director's creds would be the right
> thing to use). We wait until all connections are completed before
> returning success. When complete, we start listening on a port as an
> SD.
>
> 6) Original director tells the original FD to connect to the proxy SD
> in the usual way.
>
> 7) Proxy SD receives data blocks from the FD, writes to each real SD
> connection in turn, and acks to the FD when the data block is
> committed to all SDs atomically.
>
> 8) Proxy SD writes file records into the database indicating that the
> file can be found on volumes A, B, C.
>
> 9) Repeat until all FD traffic is completed.
>
> 10) On FD close, the proxy SD flushes all remaining data to the real
> SDs, and then closes down the proxy SD connections to the SDs
> normally. When all connections are closed down, the proxy SD reports
> completion to the original director, and exits.
>
> 11) Original director records job status and moves on.
>
> If there is an error writing to a volume on the real SDs, it's handled
> the same as it is today (close the volume, mark it in error, dismount,
> mount another one). From the FD's perspective, the writes to the proxy
> just take a very long time (possibly requiring longer timeouts in the
> FD/SD settings).
>
> Noting Arno's idea, you may be able to insert the mux function into
> the SD code supporting the primary pool, and implement the additional
> copies as RPCs from the initial SD to the copypool SDs. You'd still
> need some way of reserving the devices in the copypool SDs at the
> director level, so as not to get the director confused about who is
> using what resources.
>
> I think that logic allows zero changes in the client, and gets us
> simultaneous copies with minimum pain.
>
> _______________________________________________
> Bacula-devel mailing list
> Bac...@li...
> https://lists.sourceforge.net/lists/listinfo/bacula-devel
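For what it's worth, the proxy-SD fan-out step David describes (receive a block from the FD, write it to each real SD, ack only once every copy has committed) could be sketched like this. Again purely illustrative: `proxy_stream` and the callables standing in for SD connections and the FD ack channel are hypothetical, not the real SD protocol.

```python
# Hypothetical sketch of the proxy-SD fan-out: each data block from the
# FD goes to every real SD, and the FD is acked only after all SDs have
# committed the block.  The sd callables stand in for real SD
# connections; not actual Bacula code.
from typing import Callable, Iterable, List

def proxy_stream(blocks: Iterable[bytes],
                 sds: List[Callable[[bytes], bool]],
                 ack: Callable[[int], None]) -> bool:
    """Fan each block out to all SDs, acking the FD per committed block.
    Returns False (job fails) as soon as any SD write fails; real code
    would then mark the volumes in error as described above."""
    for seq, block in enumerate(blocks):
        if not all(sd(block) for sd in sds):
            return False          # a write failed: fail the job
        ack(seq)                  # every copy committed: ack the FD
    return True

written: List[List[bytes]] = [[], []]     # two fake SDs recording blocks
acks: List[int] = []
sds = [lambda b, w=w: (w.append(b), True)[1] for w in written]
ok = proxy_stream([b"blk0", b"blk1"], sds, acks.append)
print(ok, acks, written[0] == written[1])
# prints True [0, 1] True
```

From the FD's perspective this is exactly the "writes just take a very long time" behaviour noted above, since the ack waits on the slowest SD.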