From: Roberto P. <ro...@sp...> - 2002-02-16 22:01:37
> The state issue thus becomes a bit of a red herring: in reality no one
> really tries to explicitly enumerate all possible states for a system
> (which is in fact what you would have to do if you code directly in vxml);
> they write systems that (hopefully) are capable of generating all necessary
> states. To tie this back to Communicator: most sites, whether explicitly or
> implicitly, tried to develop some "theory" of state generation that would
> do this in a reasonably efficient manner. Notions like "mixed-initiative",
> I think really had to do with the problem of accommodating otherwise
> unanticipatable dialog states.

I do agree with this. Not even the so-called "state-based" dialog systems attempt to enumerate all the possible states, and that includes the commercial - "mostly" system-initiative - applications developed by SpeechWorks and its partners. We tend to use the expression "state-based" for those dialog systems whose strategy can be depicted as a graph with blobs and arcs, but we tend to forget that there are conditions on those arcs that can be pretty complex and that can refer to global variables - and this could be done, in principle, also in VoiceXML (although I agree that nobody in their right mind would build a Communicator mixed-initiative system in pure VoiceXML). So the states in a "state-based" system are really "clusters of states", with the clustering function arbitrarily complex. If the complexity of those clustering functions is high, it becomes rather impractical to represent the dialog with blobs and arcs.

On the other hand, we at SpeechWorks have learned the importance (for the goal of a task completion rate close to 100%) of a finely tuned UI, which often requires assigning specific prompts and grammars in specific situations in the interaction. And that is generally easier to do if you have a "state-based" dialog manager where you can directly control "every state" of the interaction.
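To make the "clusters of states" point concrete, here is a toy Python sketch (not SpeechWorks code; all names, states, and the confidence threshold are invented for illustration) of a graph whose arcs carry arbitrary conditions over global context:

```python
# Toy sketch of a "state-based" dialog manager: each blob in the graph
# is really a cluster of states, because arc conditions can consult
# arbitrary global context. Everything here is illustrative.

class DialogState:
    def __init__(self, name, prompt, grammar):
        self.name = name
        self.prompt = prompt    # prompt tuned for this specific state
        self.grammar = grammar  # grammar active in this specific state
        self.arcs = []          # list of (condition(context) -> bool, target)

    def add_arc(self, condition, target):
        self.arcs.append((condition, target))

    def next(self, context):
        # The first arc whose (possibly complex) condition holds fires;
        # conditions may read any global variable in `context`.
        for condition, target in self.arcs:
            if condition(context):
                return target
        return None


# Two states of a toy flight-booking interaction.
ask_city = DialogState("ask_city", "Which city?", "city_grammar")
confirm = DialogState("confirm", "Did you say {city}?", "yes_no_grammar")

# The arc condition mixes the recognition result with global context
# (here, low ASR confidence forces explicit confirmation) - this is the
# "clustering function" hiding inside an innocent-looking arc.
ask_city.add_arc(lambda ctx: ctx.get("confidence", 0.0) < 0.8, confirm)
ask_city.add_arc(lambda ctx: True, None)  # high confidence: move on

state = ask_city.next({"city": "Boston", "confidence": 0.5})
print(state.name)  # low confidence routes to the confirmation state
```

The point of the sketch: the graph still has only two blobs, but the conditions on the arcs can be made as complex as you like, which is why drawing the dialog as blobs and arcs stops being practical.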
I guess this is one of the many reasons it is hard to build a general theory of dialog management that would satisfy everybody - i.e. the necessity of generating (or clustering) states at a high level, and the opposite necessity of tailoring the UI to each individual state.

As far as the VoiceXML discussion is concerned, as Bob explained, we can look at VoiceXML *only* as the protocol used by the application server to invoke ASR and TTS. The dialog is managed on the server side, and the dialog manager could be state-based or anything else - as long as a VoiceXML document can be generated at each turn describing which grammar (or language model) to activate and which prompt to play. Under that limited use, VoiceXML can be used within the Galaxy architecture with, in my opinion, at least a couple of advantages:

1. buzz-word compliance
2. easy (but not completely transparent yet...) porting to different telephony platforms, as long as they are VoiceXML compliant.

Roberto
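As an illustration of the "one VoiceXML document per turn" use, here is a minimal Python sketch of what a server-side dialog manager might emit each turn. The helper function, grammar file, and URL are invented for the example; the element names (`<vxml>`, `<form>`, `<field>`, `<prompt>`, `<grammar>`, `<filled>`, `<submit>`) are standard VoiceXML.

```python
# Sketch of "VoiceXML as protocol": the server-side dialog manager
# (state-based or otherwise) emits one small VoiceXML document per
# turn, naming only the prompt to play and the grammar to activate.

def vxml_turn(prompt, grammar_src, submit_url):
    # One <field> per turn: play `prompt`, listen with `grammar_src`,
    # then post the recognition result back to the application server,
    # which computes the next turn and generates the next document.
    return (
        '<?xml version="1.0"?>\n'
        '<vxml version="1.0">\n'
        '  <form>\n'
        '    <field name="answer">\n'
        f'      <prompt>{prompt}</prompt>\n'
        f'      <grammar src="{grammar_src}"/>\n'
        '      <filled>\n'
        f'        <submit next="{submit_url}" namelist="answer"/>\n'
        '      </filled>\n'
        '    </field>\n'
        '  </form>\n'
        '</vxml>\n'
    )

doc = vxml_turn("Which city?", "city.grxml", "http://example.com/next-turn")
print(doc)
```

Under this use the VoiceXML browser never holds the dialog logic; it only executes one turn at a time, which is what makes the porting across VoiceXML-compliant telephony platforms possible.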