#56 GMAIL protocol SDK

open
nobody
None
5
2004-09-11
2004-09-11
MSDaibert
No

More information:
http://johnvey.com/features/gmailapi/

About the Gmail engine and protocol
You've probably noticed that Gmail's interface is
extremely fast when compared to other web-based
email systems like Yahoo! Mail and Hotmail. This is a
result of Gmail's placement of the UI engine on the
client-side as a JavaScript module. Whenever you log in
to Gmail, a copy of the UI engine is loaded into one of
the HTML page frames and remains there for the
duration of your session (credit goes to Oddpost for
being the first to perfect this idea). Subsequent
actions from the Gmail interface are
then routed through the Gmail UI engine in your
browser, which in turn makes HTTP requests (via the
XMLHttpRequest object) to the Gmail server, interprets
the returned DataPack (more on this later), and updates the UI
dynamically. In contrast, Hotmail and Yahoo! Mail follow
traditional web application models and reload the entire
UI after almost every action.

The item most relevant to this project is what I refer to
as the “DataPack”, a base HTML file that contains only
JavaScript array declarations that the UI engine parses
and then uses to determine what to update. The
advantages of this should be immediately obvious:
reduced traffic load, and increased functionality —
especially for developers who no longer have to resort
to crude “screen scraping” techniques to interface with
web applications. Although the ideal situation for
external developers would be an XML-based DataPack,
the JavaScript version is sufficient (and I suspect it was
chosen for performance reasons as well).

The DataPack format consists of individual “DataItems”,
or JavaScript arrays wrapped in an envelope function. An
example:

D(["ts",0,50,106,0,"Inbox","fd36721220",154]);

The function D() references a runtime evaluator within
the Gmail engine, which then interprets the attached
array parameters. The "ts" element indicates that this is
a threadlist summary item, and the subsequent elements
denote start index, threads per page, estimated total,
threadlist title, threadlist timestamp, and total threads.
This is the same format that is applied to all array
parameters sent through the DataPack:

[<DataItem_name>(,<custom_attribute>)]
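
As a rough illustration of the format above, here is a minimal Python sketch that unwraps one DataItem line. The function name parse_data_item is hypothetical and not part of any Gmail library. It assumes the array inside D(...) also happens to be valid JSON, which is true for the example line but is not guaranteed for every array the server sends.

```python
import json

def parse_data_item(line):
    """Split one 'D([...]);' line into (DataItem name, attribute list)."""
    text = line.strip()
    if not (text.startswith("D(") and text.endswith(");")):
        raise ValueError("not a DataItem: %r" % line)
    # Drop the leading 'D(' and the trailing ');' to leave the array literal.
    array_literal = text[2:-2]
    # Assumption: the literal is valid JSON (true for the example line only).
    values = json.loads(array_literal)
    return values[0], values[1:]

name, attrs = parse_data_item('D(["ts",0,50,106,0,"Inbox","fd36721220",154]);')
print(name)   # ts
print(attrs)  # [0, 50, 106, 0, 'Inbox', 'fd36721220', 154]
```

A JSON parser is used here only because it is the shortest safe route for the sample; a robust client would probably need a tolerant JavaScript array parser instead.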

The mappings for all the DataItems can be found in the
engine source code (/gmail?view=page&name=js). For
instance, qu contains quota information, while ct
contains category (a.k.a. labels) definitions. Read
through that file if you really want to get everything you
can out of Gmail.

Determining the right URL to retrieve the DataPack is
pretty straightforward, as most requests will return the
same basic information, such as quota, category count,
and inbox count. The main thing that changes is the
threadlist summary, which depends on what page you're
looking at. The main folders (inbox, starred, trash,
spam, etc.) are really just pre-defined searches
within Gmail. For example, the inbox DataPack URL is:

/gmail?search=inbox&view=tl&start=0&init=1&zx=

The search query for all unread threads is:

/gmail?search=query&q=is%3Aunread&view=tl&start=0&init=1&zx=

The main parameters are search= and q=, which define
what set of threads the user is requesting. The zx=
parameter is a proxy cache defeater, and I've omitted its
value here for brevity. See GmailAdapter.MakeUniqueUrl() for
more information.
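
The two URLs above can be assembled from their parameters. The Python sketch below is only an illustration: threadlist_url and make_zx are hypothetical helpers, and the random token is an assumed stand-in for a cache-defeating value, not a description of what GmailAdapter.MakeUniqueUrl() actually does.

```python
import random
from urllib.parse import urlencode

def make_zx():
    # Hypothetical cache defeater: any value unlikely to repeat would do.
    return "%x" % random.getrandbits(48)

def threadlist_url(search, query=None, start=0, zx=None):
    """Build a threadlist DataPack URL from the parameters described above."""
    params = [("search", search)]
    if query is not None:
        params.append(("q", query))  # urlencode escapes ':' as %3A
    params += [("view", "tl"), ("start", start), ("init", 1),
               ("zx", zx if zx is not None else make_zx())]
    return "/gmail?" + urlencode(params)

# Passing zx="" reproduces the URLs as printed in the text.
print(threadlist_url("inbox", zx=""))
# /gmail?search=inbox&view=tl&start=0&init=1&zx=
print(threadlist_url("query", "is:unread", zx=""))
# /gmail?search=query&q=is%3Aunread&view=tl&start=0&init=1&zx=
```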

Gmail exploits another advantage of the DataPack model
to increase efficiency: the server can return an empty
document when nothing has changed. This is used by the
2-minute auto-refresh request, whose inbox URL adds a
few more parameters:

/gmail?view=tl&search=inbox&start=0&tlt=fd8dfa2e31&fp=c155594240dcc7cb&auto=1&zx=

The tlt= parameter is the thread list timestamp, which is
treated like a checksum in determining the state of the
client versus the mailbox state on the server. If the
client timestamp is older than the one on the server,
then a full DataPack is sent. Otherwise, Gmail sends an
essentially empty DataPack.
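
On the client side, the timestamp handling described above might look roughly like the following Python sketch. It rests on two assumptions that the text does not confirm: that an "essentially empty" DataPack carries no "ts" item, and that the threadlist timestamp is the string at index 6 of the "ts" array, as in the earlier example line. The function name apply_refresh is hypothetical.

```python
def apply_refresh(current_tlt, items):
    """Return (tlt to send next time, whether the threadlist changed).

    `items` is a list of already-parsed DataItem arrays.
    """
    for item in items:
        if item and item[0] == "ts":
            # Assumption: index 6 holds the timestamp, as in the example item.
            new_tlt = item[6]
            return new_tlt, new_tlt != current_tlt
    # Assumption: no "ts" item means the server sent an empty DataPack.
    return current_tlt, False

full_pack = [["ts", 0, 50, 106, 0, "Inbox", "fd36721220", 154]]
print(apply_refresh("fd8dfa2e31", full_pack))  # ('fd36721220', True)
print(apply_refresh("fd36721220", []))         # ('fd36721220', False)
```

The returned timestamp would then be sent as the tlt= value of the next auto-refresh request.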
