Re: [GSTP-devel] some thoughts on the specification


On Thu, 2002-11-07 at 17:18, Alexander Haväng wrote:
> > If GSTP is ever to be a widely used protocol, it will have many more or
> > less messed up client implementations, some of them very widespread. To
> > handle those clients, workarounds may be needed on the server side, as is
> > done today for the HTTP protocol (for example, many versions of
> > keep-alive don't get handled correctly in Internet Explorer).
> > Unfortunately the HTTP protocol doesn't specify how client software
> > should identify itself other than that it should send a string to the
> > server. This leads to advanced pattern-matching logic in the server on
> > the client version string to try to determine the exact version of the
> > client software.
>
> We haven't figured out how to use the CAPABILITIES command yet.. but I
> think that command should work out most of the problems I can think of.
> But.. you do have a point, and if we can figure out a good way of
> distributing nice client software ids then I think we should add your
> fields to the hello or capabilities command.

Capabilities work in the opposite direction: they help when a client needs
to work around peculiarities in the server. When it is the other way around,
you need good info in the HELLO command that identifies the client
implementation. I think that some byte string that identifies the client
plus a numeric version number (8 + 8 bits) should do it.

A 16-bit vendor ID plus some official registry where you can register your
implementation is perhaps also something to think about.
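
To make the idea concrete, here is a rough sketch of how such an identity
block could be packed. The field order, the 8-bit name length and the function
names are all my own assumptions for illustration; nothing like this is in the
spec yet.

```python
import struct

# Hypothetical layout, NOT taken from the GSTP specification:
#   vendor_id (16 bits) | major (8 bits) | minor (8 bits) | name_len (8 bits) | name
_HEADER = struct.Struct("!HBBB")  # network byte order, no padding


def pack_client_id(vendor_id, major, minor, name):
    """Pack a client identity; name is a byte string of at most 255 bytes."""
    if len(name) > 255:
        raise ValueError("client name does not fit an 8-bit length field")
    return _HEADER.pack(vendor_id, major, minor, len(name)) + name


def unpack_client_id(data):
    """Return (vendor_id, major, minor, name) or raise on truncated input."""
    vendor_id, major, minor, name_len = _HEADER.unpack_from(data)
    name = data[_HEADER.size:_HEADER.size + name_len]
    if len(name) != name_len:
        raise ValueError("truncated client identity")
    return vendor_id, major, minor, name
```

With fixed numeric fields the server can key its workarounds on
(vendor_id, major, minor) directly instead of pattern matching a free-form
string.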

> > 2**32 char path lengths?
> > 32 bit file name length in the open command? 2**10 should be enough for
> > anyone, rounded to the nearest byte, 16 bits. This applies to all fields
> > that indicate length of paths or globs of any kind.
>
> We have 2 choices.. 16 bit or 32 bit. 16 bit will limit paths to 65k.
> I personally think that is too small, and just the idea that it might be
> too small should be enough incentive to keep the 32 bits.
> So.. no, not granted :)
>

Are you serious? 65k paths are EXTREMELY long. Like 650 directories nested
inside each other, 100 chars each. I'd definitely say that if you need paths
even remotely close to 65k long (say 1k) you are doing lots of stuff
terribly wrong and deserve breakage.

Just out of curiosity, do you have an example of an application where
paths longer than 65k would be thinkable?

One thing to think about when defining the specification is that if you
define the standard to handle 2**32 char paths, implementors need to be
able to handle that too, or at least fail in some predictable way. That is
an additional burden.
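
As an illustration of "fail in some predictable way", here is a minimal
sketch of a path field with a 16-bit length prefix. The names are made up,
and encoding the path as UTF-8 is just an assumption for the example (the
charset question is still open).

```python
import struct

MAX_PATH_BYTES = 0xFFFF  # largest value a 16-bit length field can carry


class PathTooLong(ValueError):
    """Raised instead of silently truncating an oversized path."""


def encode_path(path):
    # Assumption: paths travel as UTF-8 bytes behind a 16-bit big-endian length.
    raw = path.encode("utf-8")
    if len(raw) > MAX_PATH_BYTES:
        raise PathTooLong("path is %d bytes, limit is %d" % (len(raw), MAX_PATH_BYTES))
    return struct.pack("!H", len(raw)) + raw


def decode_path(data):
    if len(data) < 2:
        raise ValueError("missing length prefix")
    (length,) = struct.unpack_from("!H", data)
    raw = data[2:2 + length]
    if len(raw) != length:
        raise ValueError("truncated path")
    return raw.decode("utf-8")
```

With a 16-bit field the worst case an implementation has to buffer is 64 KiB
per path; with a 32-bit field the same check has to guard against 4 GiB.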

> > 2**32 filedescriptors?
>
> 65k open files on one server is not enough. While this is not
> exactly the same thing as the protocol filedescriptor field, it's
> easiest to implement that way, and 2 bytes isn't that much overhead
> anyway. I'm fairly open on this though..
>

I thought the obvious way to implement a server was to fork() off one
process per connection, thus you would have a private number of open
files per client connection. If you're going the threaded way, a mapping
table between protocol filedescriptor and server filedescriptor is
trivial, and wouldn't hurt performance much.
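
Something like the following is what I have in mind for the threaded case: one
small table per connection. This is only a sketch; the class and method names
are invented, and the default limit assumes a 16-bit protocol field.

```python
class FdTable:
    """Per-connection map from protocol descriptors to server descriptors."""

    def __init__(self, limit=0x10000):
        self._limit = limit   # number of ids a 16-bit field can express
        self._map = {}        # protocol fd -> server fd
        self._free = []       # released protocol fds, reused first
        self._next = 0        # next never-used protocol fd

    def register(self, server_fd):
        """Hand out a protocol descriptor for an already opened server fd."""
        if self._free:
            proto_fd = self._free.pop()
        elif self._next < self._limit:
            proto_fd = self._next
            self._next += 1
        else:
            raise OSError("no free protocol descriptors on this connection")
        self._map[proto_fd] = server_fd
        return proto_fd

    def lookup(self, proto_fd):
        """Translate a protocol descriptor; raises KeyError if unknown."""
        return self._map[proto_fd]

    def release(self, proto_fd):
        """Forget a protocol descriptor and return the server fd to close."""
        server_fd = self._map.pop(proto_fd)
        self._free.append(proto_fd)
        return server_fd
```

Each lookup is a single dictionary access, so the limit applies per
connection rather than per server.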

You say "2 bytes isn't that much overhead anyway" and that is correct.
However, two bytes here and two bytes there add up quickly and the
protocol becomes more bloated than it needs to be (not much, but the
perfectionist in me doesn't like it).

> > The response message for a read command has a field for "read message
> > length". This field seems redundant, as the information can be easily
> > calculated from the message size field (the first field in the message).
> > This also applies to the write command.
>
> It does.. ehm.. haven't got time to look at the source right now.. there
> might be a reason for this.. and then again, there might not :) I'll
> check.
>
> > How does a response indicate that there are more responses to come?
>
> A response that is not SUCCESS or FAILURE will _always_ have more
> responses to come. A command->response chain always ends with a SUCCESS
> or FAILURE response from the server, unless the client requests that he
> doesn't want a SUCCESS reply (this is because you don't want a SUCCESS
> response for each write command).

Oh, that's the way you do it. I'll document that then :)
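
For the documentation, the rule as I understand it can be sketched from the
client side like this. The response kinds are placeholder strings, not the
real wire codes, and the sketch ignores the case where the client has asked
for no SUCCESS reply.

```python
SUCCESS = "SUCCESS"  # placeholder names, not actual GSTP response codes
FAILURE = "FAILURE"


def collect_chain(responses):
    """Consume one command's responses up to the terminating SUCCESS/FAILURE.

    responses is any iterable of (kind, payload) pairs. Returns (ok, parts),
    where parts holds every intermediate response in arrival order.
    """
    parts = []
    for kind, payload in responses:
        if kind == SUCCESS:
            return True, parts
        if kind == FAILURE:
            return False, parts
        # Anything else promises that more responses will follow.
        parts.append((kind, payload))
    raise EOFError("stream ended before SUCCESS or FAILURE")
```

Because the function returns at the terminator, anything left in the iterator
belongs to the next command.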

> > How do you determine what a symlink points to when using the create
> > command?
>
> Ehm.. dunno.. I'll modify the create command.
>
> > When writing a client that operates over a slow link it would be useful to
> > know the total size of the reply message(s) that corresponds to for
> > example a "list directory" command. If this information is sent, progress
> > could be displayed to the user in a reliable way. The client could estimate
> > the total size of the replies by estimating the size of a single file
> > property and multiply it with the file count, but that seems like a hack to
> > me.
>
> This is true.. I'll split some of the responses into different types,
> one for the "first" response that includes the total length (if known).
>

good

> > To enable file transfer between systems with different file name encodings
> > I propose that the specification dictates that all filenames should be
> > encoded in the UTF-8 charset. The other way to do this would be to have a
> > mechanism for the client to query the server for filename encoding charset
> > and then do any encoding conversion on the client side.
>
> This.. is something I really hate.. charsets suck. Just tell me the
> right way to do it and I'll do it if the implementation doesn't suffer
> too much.
>

Yes, it's a mess, especially once East Asian character sets enter the mix.
I'll do a writeup of the different ways to do it in a separate email.

I'll also create a TODO list for protocol changes and clarifications.

/noa