Re: [Burp-users] Large file problems?
Brought to you by:
grke
|
From: Brendon H. <br...@qu...> - 2026-02-07 20:28:36
|
Hi Graham,
An extra couple of observations to add to my previous:
* I moved the server config and backup medium to a different machine - a laptop
that's running a similar but not identical OS install (Debian testing). The
client remained on the original machine. The new machine had the same problem
occur as the old one did, so the problem is not limited to local sockets.
* I needed to resume a functional backup routine, so I deleted the failing
folder and it's "working" symlink, and took a new backup. There were zero
problems with taking a new backup, despite the problem file having not changed
in the interim.
Hope it helps,
Brendon
On Tuesday, February 3, 2026 12:45:36 a.m. Eastern Standard Time Graham
Keeling wrote:
> Thanks for the debugging attempt, I will be looking at reproducing the
> problem myself, once I have simplified my automatic testing infrastructure
> (it has blown up again, I'm going to get rid of jenkins so that there
> is less to go wrong).
>
> On Sun, Feb 01, 2026 at 06:54:33PM -0500, Brendon Higgins wrote:
> > Hi all,
> >
> > I spent a few hours today digging into the communication problem that's
> > preventing my local backup resuming properly (and possibly related
> > issues). I added a bunch of debug logging (see attached), ran backups,
> > and made some observations of the logs produced. Here are the ones that
> > seem most significant:
> >
> > * At some point while transmitting the next file, the server just stops
> > reading. Its final call to SSL_read() returns -1, with error 2
> > (SSL_ERROR_WANT_READ), and after that the server does not try SSL_read()
> > again until after the client disconnects.
> >
> > * The client sends considerably more packets of data which the server
> > doesn't read. The client does eventually stop sending, however.
> >
> > * During this sending process, the client never calls SSL_read(), even
> > though the server is sending it small packets (responses?). I wonder if
> > this might be an important point.
> >
> > * Reading is handled by the async_io(), which seems to check for the
> > presence of data using FD_ISSET(). For the server, a loop in
> > backup_phase2_server_all() causes execution to enter async_io() via some
> > other functions. For all loop iterations after the server's SSL_read()
> > failure, the FD_ISSET() condition is not met, preventing further
> > SSL_read() calls.
> >
> > * I added a call to SSL_want() after FS_ISSET() condition failures. For
> > only the first iteration after the SSL_read() failure, SSL_want() returns
> > 3 (SSL_READING). This seems inconsistent with the FD_ISSET() response.
> >
> > * For all following iterations, SSL_want() returns 1 (SSL_NOTHING). This
> > seems inconsistent with the fact that the server still hasn't called
> > SSL_read() (data should be still present in the buffer, no?) and the fact
> > that the client does send more data.
> >
> > Now, I've never encountered this kind of construction for a read loop
> > before - I'm more familiar with loops that just call SSL_read() (or
> > read()) and let the return value indicate whether data was or wasn't
> > available. That makes me wonder if some behaviour of OpenSSL (or
> > something further upstream) has changed in such a way that this
> > construction no longer works reliably. The fact that FS_ISSET() seems to
> > be contradicting SSL_want(), at least initially, feels consistent with
> > that.
> >
> > No idea what could be done about it though. And I'm also not sure what to
> > examine further.
> >
> > Best,
> > Brendon
> >
> > diff --git a/src/asfd.c b/src/asfd.c
> > index 02af6138..30ec92ae 100644
> > --- a/src/asfd.c
> > +++ b/src/asfd.c
> > @@ -203,11 +203,13 @@ static int asfd_do_read_ssl(struct asfd *asfd)
> >
> > asfd->read_blocked_on_write=0;
> >
> > ERR_clear_error();
> >
> > + logp("%s: SSL_read(%p, %p, %lu)...\n", asfd->desc, asfd->ssl,
> > asfd->readbuf+asfd->readbuflen, asfd->bufmaxsize-asfd->readbuflen);>
> > r=SSL_read(
> >
> > asfd->ssl,
> > asfd->readbuf+asfd->readbuflen,
> > asfd->bufmaxsize-asfd->readbuflen
> >
> > );
> >
> > + logp("-> %zd [err: %d]\n", r, SSL_get_error(asfd->ssl, r));
> >
> > switch((e=SSL_get_error(asfd->ssl, r)))
> > {
> >
> > @@ -221,6 +223,9 @@ static int asfd_do_read_ssl(struct asfd *asfd)
> >
> > SSL_shutdown(asfd->ssl);
> > goto error;
> >
> > case SSL_ERROR_WANT_READ:
> > + // HACK:
> > + //sleep(1);
> > + //return asfd_do_read_ssl(asfd);
> >
> > break;
> >
> > case SSL_ERROR_WANT_WRITE:
> > asfd->read_blocked_on_write=1;
> >
> > @@ -327,7 +332,9 @@ static int asfd_do_write_ssl(struct asfd *asfd)
> >
> > if(asfd->ratelimit && check_ratelimit(asfd)) return 0;
> > ERR_clear_error();
> >
> > + logp("%s: SSL_write(%p, %p, %lu)...\n", asfd->desc, asfd->ssl,
> > asfd->writebuf, asfd->writebuflen);>
> > w=SSL_write(asfd->ssl, asfd->writebuf, asfd->writebuflen);
> >
> > + logp("-> %zd\n", w);
> >
> > switch((e=SSL_get_error(asfd->ssl, w)))
> > {
> >
> > @@ -481,9 +488,11 @@ int asfd_read_expect(struct asfd *asfd, enum cmd cmd,
> > const char *expect)>
> > static int asfd_write(struct asfd *asfd, struct iobuf *wbuf)
> > {
> >
> > + logp("Entering %s()\n", __func__);
> >
> > if(asfd->as->doing_estimate) return 0;
> > while(wbuf->len)
> > {
> >
> > + logp("%s: Iterate\n", __func__);
> >
> > if(asfd->errors)
> >
> > return -1;
> >
> > if(asfd->append_all_to_write_buffer(asfd,
wbuf)==APPEND_ERROR)
> >
> > diff --git a/src/async.c b/src/async.c
> > index 41d9152e..382d0533 100644
> > --- a/src/async.c
> > +++ b/src/async.c
> > @@ -82,6 +82,7 @@ static int windows_stupidity_hacks(struct asfd *asfd,
> >
> > static int async_io(struct async *as, int doread)
> > {
> >
> > + logp("Entering %s(%p, %d)\n", __func__, as, doread);
> >
> > int mfd=-1;
> > fd_set fsr;
> > fd_set fsw;
> >
> > @@ -151,6 +152,7 @@ static int async_io(struct async *as, int doread)
> >
> > if(mfd>0)
> > {
> >
> > + logp("%s: mfd>0\n", __func__);
> >
> > errno=0;
> > s=select(mfd+1, &fsr, &fsw, &fse, &tval);
> > if(errno==EAGAIN || errno==EINTR) goto end;
> >
> > @@ -185,30 +187,48 @@ static int async_io(struct async *as, int doread)
> >
> > }
> >
> > }
> >
> > - if(asfd->doread && FD_ISSET(asfd->fd, &fsr)) // Able to
read.
> > + if(asfd->doread)
> >
> > {
> >
> > - asfd->network_timeout=asfd-
>max_network_timeout;
> > - switch(asfd->fdtype)
> > + logp("%s: doread true\n", __func__);
> > + if (FD_ISSET(asfd->fd, &fsr)) // Able to
read.
> >
> > {
> >
> > - case
ASFD_FD_SERVER_LISTEN_MAIN:
> > - case
ASFD_FD_SERVER_LISTEN_STATUS:
> > - // Indicate to the
caller that we have
> > - // a new
incoming client.
> > - asfd-
>new_client++;
> > - break;
> > - default:
> > - if(asfd-
>do_read(asfd)
> > - || asfd-
>parse_readbuf(asfd))
> > -
return asfd_problem(asfd);
> > - break;
> > + asfd->network_timeout=asfd-
>max_network_timeout;
> > + switch(asfd->fdtype)
> > + {
> > + case
ASFD_FD_SERVER_LISTEN_MAIN:
> > + case
ASFD_FD_SERVER_LISTEN_STATUS:
> > + //
Indicate to the caller that we have
> > + // a
new incoming client.
> > +
asfd->new_client++;
> > +
break;
> > + default:
> > +
if(asfd->do_read(asfd)
> > + ||
asfd->parse_readbuf(asfd))
> > +
return asfd_problem(asfd);
> > +
break;
> > + }
> > + }
> > + else
> > + {
> > + logp("%s: not able to do_read.
SSL_want says: %d\n", __func__,
> > + asfd->ssl ?
SSL_want(asfd->ssl) : 0xDEADBEEF);
> >
> > }
> >
> > }
> >
> > - if(asfd->dowrite && FD_ISSET(asfd->fd, &fsw)) // Able
to write.
> > + if(asfd->dowrite)
> >
> > {
> >
> > - asfd->network_timeout=asfd-
>max_network_timeout;
> > - if(asfd->do_write(asfd))
> > - return asfd_problem(asfd);
> > + logp("%s: dowrite true\n", __func__);
> > + if (FD_ISSET(asfd->fd, &fsw)) // Able to
write.
> > + {
> > + asfd->network_timeout=asfd-
>max_network_timeout;
> > + if(asfd->do_write(asfd))
> > + return
asfd_problem(asfd);
> > + }
> > + else
> > + {
> > + logp("%s: not able to
do_write. SSL_want says: %d\n", __func__,
> > + asfd->ssl ?
SSL_want(asfd->ssl) : 0xDEADBEEF);
> > + }
> >
> > }
> >
> > if((!asfd->doread || !FD_ISSET(asfd->fd, &fsr))
> >
> > @@ -229,6 +249,7 @@ static int async_io(struct async *as, int doread)
> >
> > end:
> > as->last_time=as->now;
> >
> > + logp("Exiting %s() normally\n", __func__);
> >
> > return 0;
> >
> > }
> >
> > @@ -239,6 +260,7 @@ static int async_read_write(struct async *as)
> >
> > static int async_write(struct async *as)
> > {
> >
> > + logp("Entering %s()\n", __func__);
> >
> > return async_io(as, 0 /* No read. */);
> >
> > }
> >
> > diff --git a/src/client/backup_phase2.c b/src/client/backup_phase2.c
> > index 16357bf5..4674b443 100644
> > --- a/src/client/backup_phase2.c
> > +++ b/src/client/backup_phase2.c
> > @@ -153,6 +153,7 @@ static enum send_e send_whole_file_w(struct asfd
> > *asfd,
> >
> > struct cntr *cntr, int compression, struct BFILE *bfd,
> > const char *extrameta, size_t elen)
> >
> > {
> >
> > + logp("Entering %s\n", __func__);
> >
> > if((compression || encpassword) && sb->path.cmd!=CMD_EFS_FILE)
> > {
> >
> > int key_deriv=sb->encryption;
> >
> > @@ -212,6 +213,7 @@ static int size_checks(struct asfd *asfd, struct sbuf
> > *sb, struct conf **confs)>
> > static int deal_with_data(struct asfd *asfd, struct sbuf *sb,
> >
> > struct BFILE *bfd, struct conf **confs)
> >
> > {
> >
> > + logp("Entering %s\n", __func__);
> >
> > int ret=-1;
> > int forget=0;
> > size_t elen=0;
> >
> > @@ -342,7 +344,7 @@ static int deal_with_data(struct asfd *asfd, struct
> > sbuf *sb,>
> > }
> > else
> > {
> >
> > - //logp("need to send whole file: %s\n", sb.path);
> > + logp("need to send whole file: %s\n",
iobuf_to_printable(&sb->path));
> >
> > // send the whole file.
> >
> > if(asfd->write(asfd, &sb->attr)
> >
> > @@ -391,6 +393,7 @@ static int parse_rbuf(struct asfd *asfd, struct sbuf
> > *sb,>
> > {
> >
> > static struct iobuf *rbuf;
> > rbuf=asfd->rbuf;
> >
> > + logp("Entering %s: rbuf->cmd = %d\n", __func__, rbuf->cmd);
> >
> > if(rbuf->cmd==CMD_DATAPTH)
> > {
> >
> > iobuf_move(&(sb->datapth), rbuf);
> >
> > @@ -426,6 +429,7 @@ static int parse_rbuf(struct asfd *asfd, struct sbuf
> > *sb,>
> > static int do_backup_phase2_client(struct asfd *asfd,
> >
> > struct conf **confs, int resume)
> >
> > {
> >
> > + logp("Entering %s(%p, %p, %d)\n", __func__, asfd, confs, resume);
> >
> > int ret=-1;
> > // For efficiency, open Windows files for the VSS data, and do not
> > // close them until another time around the loop, when the actual
> >
> > @@ -464,10 +468,13 @@ static int do_backup_phase2_client(struct asfd
> > *asfd,
> >
> > while(1)
> > {
> >
> > + logp("Iterating loop in %s\n", __func__);
> >
> > iobuf_free_content(rbuf);
> >
> > + logp("%s: About to asfd->read()...\n", __func__);
> >
> > if(asfd->read(asfd)) goto end;
> > else if(!rbuf->buf) continue;
> >
> > + logp("%s: asfd->read() yielded non-NULL rbuf->buf\n",
__func__);
> >
> > if(rbuf->cmd==CMD_GEN && !strcmp(rbuf->buf,
"backupphase2end"))
> > {
> >
> > if(asfd->write_str(asfd, CMD_GEN,
"okbackupphase2end"))
> >
> > @@ -486,6 +493,7 @@ end:
> > bfile_free(&bfd);
> > iobuf_free_content(rbuf);
> > sbuf_free(&sb);
> >
> > + logp("Exiting %s\n", __func__);
> >
> > return ret;
> >
> > }
> >
> > diff --git a/src/fzp.c b/src/fzp.c
> > index eb55afeb..76b9ddab 100644
> > --- a/src/fzp.c
> > +++ b/src/fzp.c
> > @@ -24,9 +24,11 @@ static void fzp_free(struct fzp **fzp)
> >
> > static FILE *open_fp(const char *fname, const char *mode)
> > {
> >
> > + logp("Entering %s(\"%s\", \"%s\")\n", __func__, fname, mode);
> >
> > FILE *fp=NULL;
> > if(!(fp=fopen(fname, mode)))
> >
> > logp("could not open %s: %s\n", fname,
strerror(errno));
> >
> > + logp("Exiting %s() -> %p\n", __func__, fp);
> >
> > return fp;
> >
> > }
> >
> > @@ -84,6 +86,7 @@ static void not_open(const char *func)
> >
> > static struct fzp *fzp_do_open(const char *path, const char *mode,
> >
> > enum fzp_type type)
> >
> > {
> >
> > + //logp("Entering %s(\"%s\", \"%s\", %d)\n", __func__, path, mode,
type);
> >
> > struct fzp *fzp=NULL;
> >
> > if(!(fzp=fzp_alloc())) goto error;
> >
> > @@ -93,6 +96,7 @@ static struct fzp *fzp_do_open(const char *path, const
> > char *mode,>
> > case FZP_FILE:
> > if(!(fzp->fp=open_fp(path, mode)))
> >
> > goto error;
> >
> > + //logp("Exiting %s() -> %p\n", __func__,
fzp);
> >
> > return fzp;
> >
> > case FZP_COMPRESSED:
> > if(!(fzp->zp=open_zp(path, mode)))
> >
> > @@ -103,6 +107,7 @@ static struct fzp *fzp_do_open(const char *path, const
> > char *mode,>
> > goto error;
> >
> > }
> >
> > error:
> > + //logp("Erroring-out of %s()\n", __func__);
> >
> > fzp_close(&fzp);
> > return NULL;
> >
> > }
> >
> > diff --git a/src/handy_extra.c b/src/handy_extra.c
> > index 5f10ea92..f7f798b8 100644
> > --- a/src/handy_extra.c
> > +++ b/src/handy_extra.c
> > @@ -147,6 +147,7 @@ char *get_endfile_str(uint64_t bytes, uint8_t
> > *checksum)>
> > int write_endfile(struct asfd *asfd, uint64_t bytes, uint8_t *checksum)
> > {
> >
> > + logp("Entering %s\n", __func__);
> >
> > return asfd->write_str(asfd,
> >
> > CMD_END_FILE, get_endfile_str(bytes, checksum));
> >
> > }
> >
> > @@ -165,6 +166,7 @@ enum send_e send_whole_file_gzl(struct asfd *asfd,
> > const char *datapth,>
> > struct cntr *cntr, int compression, struct BFILE *bfd,
> > const char *extrameta, size_t elen, int key_deriv, uint64_t salt)
> >
> > {
> >
> > + logp("Entering %s\n", __func__);
> >
> > enum send_e ret=SEND_OK;
> > int zret=0;
> > struct md5 *md5=NULL;
> >
> > @@ -222,6 +224,7 @@ enum send_e send_whole_file_gzl(struct asfd *asfd,
> > const char *datapth,>
> > do
> > {
> >
> > + logp("%s: Iterate\n", __func__);
> >
> > if(metadata)
> > {
> >
> > if(metalen>ZCHUNK)
> >
> > @@ -324,8 +327,10 @@ enum send_e send_whole_file_gzl(struct asfd *asfd,
> > const char *datapth,>
> > else
> > {
> >
> > iobuf_set(&wbuf,
CMD_APPEND, (char *)out, have);
> >
> > + logp("%s: About to asfd-
>write()...\n", __func__);
> >
> > if(asfd->write(asfd, &wbuf))
> > {
> >
> > + logp("%s: asfd-
>write() error\n", __func__);
> >
> > ret=SEND_FATAL;
> > break;
> >
> > }
> >
> > diff --git a/src/server/backup_phase2.c b/src/server/backup_phase2.c
> > index a48bfc14..461f09d9 100644
> > --- a/src/server/backup_phase2.c
> > +++ b/src/server/backup_phase2.c
> > @@ -613,6 +613,7 @@ enum sts_e
> >
> > static enum sts_e do_stuff_to_send(struct asfd *asfd,
> >
> > struct sbuf *p1b, char **last_requested)
> >
> > {
> >
> > + logp("Entering %s()\n", __func__);
> >
> > static struct iobuf wbuf;
> > if(p1b->flags & SBUF_SEND_DATAPTH)
> > {
> >
> > @@ -834,6 +835,7 @@ static enum str_e do_stuff_to_receive(struct asfd
> > *asfd,>
> > struct sbuf *rb, struct manios *manios,
> > struct dpth *dpth, char **last_requested)
> >
> > {
> >
> > + logp("Entering %s()\n", __func__);
> >
> > struct iobuf *rbuf=asfd->rbuf;
> >
> > if(rbuf->cmd==CMD_MESSAGE
> >
> > @@ -987,6 +989,7 @@ static int process_next_file_from_manios(struct asfd
> > *asfd,>
> > struct sbuf *cb,
> > struct conf **cconfs)
> >
> > {
> >
> > + logp("Entering %s()\n", __func__);
> >
> > if(!(*p1b=sbuf_alloc()))
> >
> > goto error;
> >
> > switch(manio_read(manios->phase1, *p1b))
> >
> > @@ -1138,6 +1141,8 @@ end:
> > int backup_phase2_server_all(struct async *as, struct sdirs *sdirs,
> >
> > const char *incexc, int resume, struct conf **cconfs)
> >
> > {
> >
> > + logp("Entering %s()\n", __func__);
> > +
> >
> > int ret=0;
> > man_off_t *pos_phase1=NULL;
> > man_off_t *pos_current=NULL;
> >
> > @@ -1220,6 +1225,7 @@ int backup_phase2_server_all(struct async *as,
> > struct sdirs *sdirs,>
> > goto error;
> >
> > if(cntr_send_sdirs(asfd, sdirs, cconfs,
CNTR_STATUS_BACKUP))
> >
> > goto error;
> >
> > + logp("Resume commenced in %s\n", __func__);
> >
> > }
> >
> > if(!(manios=manios_open_phase2(sdirs, pos_phase1, pos_current)))
> >
> > @@ -1239,6 +1245,7 @@ int backup_phase2_server_all(struct async *as,
> > struct sdirs *sdirs,>
> > while(1)
> > {
> >
> > + logp("%s: Iterate...\n", __func__);
> >
> > if(check_fail_on_warning(fail_on_warning, warn_ent))
> >
> > goto error;
> >
> > @@ -1252,6 +1259,7 @@ int backup_phase2_server_all(struct async *as,
> > struct sdirs *sdirs,>
> > || !manios->phase1
> > || asfd->writebuflen)
> >
> > {
> >
> > + logp("%s: sat cond for do_stuff_to_receive()
\n", __func__);
> >
> > iobuf_free_content(asfd->rbuf);
> > if(asfd->as->read_write(asfd->as))
> > {
> >
> > @@ -1275,6 +1283,11 @@ int backup_phase2_server_all(struct async *as,
> > struct sdirs *sdirs,>
> > goto error;
> >
> > }
> >
> > }
> >
> > + else
> > + {
> > + logp("%s: failed cond for
do_stuff_to_receive(): %p (\"%s\"), %p,
> > %lu\n", __func__, + last_requested,
> > last_requested?last_requested:"", manios->phase1, asfd->writebuflen);
> > + }
> >
> > if(p1b) switch(do_stuff_to_send(asfd, p1b,
&last_requested))
> > {
> >
> > _______________________________________________
> > Burp-users mailing list
> > Bur...@li...
> > https://lists.sourceforge.net/lists/listinfo/burp-users
>
> _______________________________________________
> Burp-users mailing list
> Bur...@li...
> https://lists.sourceforge.net/lists/listinfo/burp-users
|