From: Hungerburg <pc...@my...> - 2014-03-14 13:01:50
Moved to exist-dev.

I am just a user who, out of fondness for the product, happened to peek at the code and tinker with it a little. I'd like to expand on the analysis with a top-down approach, to try to find the vocabulary needed to get a clear picture.

1) The sandwich: eXist-db lies between two interfaces. On one side is the public interface (REST, WebDAV, XML-RPC, etc.); on the other side are its own internal storage and messaging and, for binary resources, the operating system's file system.

# The public interface is documented in RFCs, but there are also a number of implementations one wants to interoperate with, and these do not always conform strictly or interpret the standard the same way.

# The internal storage can seemingly accommodate almost anything. The operating system interface, though, differs between Linux, Mac and Windows, if only to a small degree.

Some translation has to happen in between, and it has to be transparent and idempotent. Most of it (not all) is done in the xmldb-uri class. The documentation there is a little scarce and sometimes outright misleading.

2) Main areas of interest:

# URI layer: A name passed in as a path segment (GET, PUT) is encoded differently from a name passed in as a query segment (GET, POST). Both must decode to the same name. The current practice of double encoding and decoding is bad.

# FS layer: How to handle names that are valid in URIs but that (mostly) NTFS prohibits? And the other way round: one wants to store UTF-8 names rather than names whose non-ASCII characters (diacritics etc.) are percent-encoded as in a URI.

# Messaging layer: It would of course be best to always pass java.net.URI objects, but for performance reasons a plain String is sometimes preferred… This gets hairy quickly; very clear guidelines on normalization must be in place.

All in all, this requires quite some effort for probably marginal gains. I suspect, though, that the more people use WebDAV, the more the need to do something will show.
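
To make the URI-layer point above concrete, here is a minimal sketch in Java (assuming Java 10 or later). It is not eXist-db code: the class and method names are hypothetical, and only java.net.URI and java.net.URLDecoder are real APIs. It shows the same name arriving once as a path segment and once as a query value, each decoded exactly once, and what goes wrong when a name is decoded twice.

```java
import java.net.URI;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration, not part of eXist-db.
final class NameDecodingSketch {

    // Decode a name that arrived as a single, still-encoded path segment.
    // A leading "/" is added so that a ":" in the name is not taken for a
    // URI scheme separator; getPath() decodes percent-escapes once, as UTF-8.
    static String fromPathSegment(String rawSegment) {
        return URI.create("/" + rawSegment).getPath().substring(1);
    }

    // Decode a name that arrived as a form-encoded query value.
    // Here "+" means a space, which is not the case in a path segment.
    static String fromQueryValue(String rawValue) {
        return URLDecoder.decode(rawValue, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The same name, encoded the way each part of a URI typically encodes it.
        String viaPath = fromPathSegment("caf%C3%A9%20menu.xml");
        String viaQuery = fromQueryValue("caf%C3%A9+menu.xml");
        System.out.println("same name from path and query: " + viaPath.equals(viaQuery));

        // A name that literally contains "%20" must be sent as "%2520".
        String once = fromPathSegment("a%2520b.xml");
        String twice = fromPathSegment(once); // the double-decoding mistake
        System.out.println("decoded once:  " + once);
        System.out.println("decoded twice: " + twice);
    }
}
```

Decoding once keeps "a%20b.xml" and "a b.xml" apart as two distinct names; decoding twice collapses them into one, which is the kind of ambiguity a single, clearly documented decoding step per layer would avoid.
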
People are used to certain inconveniences when using web-based systems, but not when interacting with something that looks like a conventional file system.

Kind regards
Peter