Update of /cvsroot/octave/octave-forge/main/info-theory/inst
In directory sc8-pr-cvs3.sourceforge.net:/tmp/cvs-serv24756

Added Files:
	arithmetic_decode.m arithmetic_encode.m binaryn.m bscchannel.m
	conditionalentropy_XY.m conditionalentropy_YX.m entropy.m
	huffman.m huffman_minvar.m jointentropy.m laverage.m marginalc.m
	marginalr.m mutualinformation.m redundancy.m relativeentropy.m
	rle_decode.m rle_encode.m shannonfano_code.m tunstallcode.m
	unarydec.m unaryenc.m
Log Message:
Functions in Information Theory and Source coding

--- NEW FILE: arithmetic_decode.m ---
## (C) 2006, August, Muthiah Annamalai, <mut...@ut...>
##
## This program is free software; you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation; either version 2 of the License, or
## (at your option) any later version.
##
## This program is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
## GNU General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with this program; if not, write to the Free Software
## Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
##
## usage: arithmetic_decode(tag_message,symbol_probabilities_list)
## recovers the message from the arithmetic code, given the symbol
## probabilities. The arithmetic decoding procedure assumes
## that the message is a list of numbers and that the symbol
## probabilities correspond to the indices. The message is returned.
##
## example: symbols=[1,2,3,4]; sym_prob=[0.5 0.25 0.15 0.10];
##          message=[1, 1, 2, 3, 4];
##          arithmetic_encode(message,sym_prob) ans=0.18078
##          arithmetic_decode(0.18078,sym_prob) ans=[1 1 2 3 4];
##
## see also: arithmetic_encode
##

% FIXME
% -----
% Limited by floating-point precision. Not foolproof
% (interval doubling is not implemented). May not converge!
%
% First assume that the list is in symbolic order
% (sorting is not necessary).
%
% Tag is a floating-point value generated by the encoder.
%
% Message is an array of symbols (numbers).
%
function message=arithmetic_decode(tag,problist,tolerance)
  if nargin < 2
    error("Usage: arithmetic_decode(tag,problist,tolerance=1e-8)");
  elseif nargin < 3
    tolerance=1e-8;
  end

  %start from the extreme extents & find the sweet spot.
  Up=1.0;
  Lo=0.0;
  L_SYM=length(problist);
  mytag=0;
  message=[];

  while (mytag~=tag)
    %initial guess for the sweet spot;
    %some loop invariants.
    found=0;
    T1=Lo;
    T2=+(Up-Lo);
    for itr=1:L_SYM
      T3=sum(problist(1:itr));
      tL=T1+T2*(T3-problist(itr));
      tU=T1+T2*T3;
      %identify the correct spot.
      if((tL < tag) && (tag < tU))
        found=1;
        break;
      end
    end
    if(~found)
      warning("Cannot decode letter. Defaults to max symbol.\n");
    end
    Up=tU;
    Lo=tL;
    mytag=0.5*(tL+tU);
    message=[message itr];
    if(abs(tag-mytag)<tolerance)
      break;
    end
  end
  return;
end

--- NEW FILE: huffman_minvar.m ---
## (C) 2006, May 31, Muthiah Annamalai, <mut...@ut...>
##
## This program is free software; you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation; either version 2 of the License, or
## (at your option) any later version.
##
## This program is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
## GNU General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with this program; if not, write to the Free Software
## Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
##
## usage: huffman_minvar(source probability vector, {toggle})
## toggle is a 1/0 optional argument that starts building the
## code from 1's or 0's, defaulting to 0. Computes the
## minimum-variance Huffman code, which can potentially cut down
## on buffering needs by using an even weight distribution.
##
## This function builds a Huffman code,
## given the probability list.
## The Huffman codes per symbol are output
## as a list of strings per source symbol.
## This function uses the minimum-variance method of building
## Huffman codes, in which newly combined symbols are
## promoted to the top of the tree.
##
## example: huffman([0.5 0.25 0.15 0.1]) => CW(0,10,111,110)
##          huffman(0.25*ones(1,4)) => CW(11,10,01,00)
##
## see also: huffman
##
function cw_list=huffman_minvar(source_prob,togglecode)
  if nargin < 1
    error("Usage: huffman_minvar(source_prob,{togglecode 1-0 in code});")
  elseif nargin < 2
    togglecode=0;
  end

  ## Huffman code algorithm:
  ## while (uncombined symbols remain)
  ##   combine the two least probable symbols into one whose
  ##   probability is the sum of the two;
  ##   save this result as a stage, at the lowest-order rung.
  ##   (Moving it to the lowest position possible gives non-minimum-variance
  ##   entropy coding.)
  ## end
  ##
  ## for each (stage)
  ##   walk the tree we built, and assign each row either a 1 or a 0
  ## end
  ##
  ## reverse each symbol, and dump it out.
  ##

  %
  %need to find & eliminate the zero-probability code words.
  %in practice we do not need to assign anything to them.
  %
  origsource_prob=source_prob;
  %
  %sort the list and record the index transpositions.
  %
  L=length(source_prob);
  index=[1:L];
  for itr1=1:L
    for itr2=itr1:L
      if(source_prob(itr1) < source_prob(itr2))
        t=source_prob(itr1);
        source_prob(itr1)=source_prob(itr2);
        source_prob(itr2)=t;
        t=index(itr1);
        index(itr1)=index(itr2);
        index(itr2)=t;
      end
    end
  end
  %index
  %source_prob;

  idx=find(source_prob==0);
  if(not(isempty(idx)))
    source_prob=source_prob(1:idx(1)-1);
  end
  clear idx;

  stage_list={};
  cw_list={};
  stage_curr={};
  stage_curr.prob_list=source_prob;
  stage_curr.sym_list={};

  S=length(source_prob);
  for i=1:S;
    stage_curr.sym_list{i}=[i];
    cw_list{i}="";
  end

  I=1;
  while (I<S)
    L=length(stage_curr.prob_list);
    nprob=stage_curr.prob_list(L-1)+stage_curr.prob_list(L);
    nsym=[stage_curr.sym_list{L-1}(1:end),stage_curr.sym_list{L}(1:end)];
    %stage_curr;
    %archive old stage list.
    stage_list{I}=stage_curr;
    %
    %insert the new probability into the list, at the
    %first-position (greedy?) possible.
    %
    for i=1:(L-2)
      if(stage_curr.prob_list(i)<=nprob)
        break;
      end
    end
    stage_curr.prob_list=[stage_curr.prob_list(1:i-1) nprob stage_curr.prob_list(i:L-2)];
    stage_curr.sym_list={stage_curr.sym_list{1:i-1}, nsym, stage_curr.sym_list{i:L-2}};
    % Loopie
    I=I+1;
  end

  one_cw="1";
  zero_cw="0";
  if (togglecode==0)
    one_cw="1";
    zero_cw="0";
  else
    one_cw="0";
    zero_cw="1";
  end

  %printf("Exit Loop");
  I=I-1;
  while (I>0)
    stage_curr=stage_list{I};
    L=length(stage_curr.sym_list);
    clist=stage_curr.sym_list{L};
    for k=1:length(clist)
      cw_list{clist(k)}=strcat(cw_list{clist(k)},one_cw);
    end
    clist=stage_curr.sym_list{L-1};
    for k=1:length(clist)
      cw_list{clist(k)}=strcat(cw_list{clist(k)},zero_cw);
    end
    % Loopie
    I=I-1;
  end

  %
  %zero all the code-words of zero-probability length, 'cos they
  %never occur.
  %
  S=length(source_prob);
  for itr=(S+1):length(origsource_prob)
    cw_list{itr}="";
  end
  cw_list;
  %
  % Re-sort the indices according to the probability list.
  %
  L=length(source_prob);
  for itr=1:(L)
    if(index(itr)==-1)
      continue;
    end
    t=cw_list{index(itr)};
    cw_list{index(itr)}=cw_list{itr};
    cw_list{itr}=t;
    index(index(itr))=-1;
  end

  %zero all the code-words of zero-probability length, 'cos they
  %never occur.
  %for itr=1:L
  %  if(origsource_prob(itr)==0)
  %    cw_list{itr}="";
  %  end
  %end
  return
end
%!assert(huffman_minvar(0.25*ones(1,4),1),{"11","10","01","00"},0)

--- NEW FILE: shannonfano_code.m ---
## (C) 2006, September 29, Muthiah Annamalai, <mut...@ut...>
##
## This program is free software; you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation; either version 2 of the License, or
## (at your option) any later version.
##
## This program is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
## GNU General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with this program; if not, write to the Free Software
## Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
##
## usage: shannonfano_code(symbol_probabilities)
## computes the code words for the symbol probabilities using the
## Shannon-Fano scheme. Output is a cell array of strings, which
## are the codewords, in the order of the input probabilities.
##
## example: [CW]=shannonfano_code([0.5 0.25 0.15 0.1]);
##          assert(redundancy(CW,[0.5 0.25 0.15 0.1]),0.25841,0.001)
##
function [cw_list]=shannonfano_code(P)
  DMAX=length(P);
  S=1:DMAX;
  #
  #FIXME: Is there a better way of doing this
  #sort? Insertion sort is bad.
  #
  #Sort the probability list in
  #descending/non-increasing order.
  #Using plain vanilla insertion sort.
  #
  for i=1:length(P)
    for j=i:length(P)
      if(P(i) < P(j))
        #Swap prob
        t=P(i); P(i)=P(j); P(j)=t;
        #Swap symb
        t=S(i); S(i)=S(j); S(j)=t;
      end
    end
  end

  #Now for each symbol you need to
  #create its code as the first [-log p(i)] bits of the
  #cumulative function sum(p(1:i))
  #
  #printf("Shannon Codes\n");
  #data_table=zeros(1,DMAX);
  cw_list={};
  for i=1:length(S)
    if(P(i)>0)
      digits=ceil(-log2(P(i))); #so many digits needed
    else
      digits=0; #don't assign digits
    end
    Psum = sum([0 P](1:i)); #Cumulative probability
    s=binaryn(Psum,digits);
    if(length(s)==0)
      v=0;
    else
      v=bin2dec(s);
      if(v==0)
        s=strcat("1",s);
        v=bin2dec(s);
      end
    end
    # printf(" %d => %s %d %d\n",S(i),s,v,i);
    #data_table(S(i))=v;
    cw_list{i}=s;
  end

  #re-arrange the list according to the input prob list.
  cw_list2={};
  for i=1:length(cw_list)
    cw_list2{S(i)}=cw_list{i};
  end
  cw_list=cw_list2;
  return;
end
%!
%!CW=shannonfano_code([0.5 0.25 0.15 0.1]);
%!assert(redundancy(CW,[0.5 0.25 0.15 0.1]),0.25841,0.001)
%!CW=shannonfano_code([0.5 0.15 0.25 0.1]);
%!

--- NEW FILE: marginalc.m ---
## (C) 2005, December, Muthiah Annamalai, <mut...@ut...>
##
## This program is free software; you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation; either version 2 of the License, or
## (at your option) any later version.
##
## This program is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
## GNU General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with this program; if not, write to the Free Software
## Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
##
## usage: marginalc(XY)
## computes marginal probabilities along columns.
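As an illustrative cross-check of the Shannon-Fano construction used in shannonfano_code.m above (each codeword is the first ceil(-log2 p) bits of the cumulative probability of the more probable symbols), here is a minimal Python sketch of the textbook scheme. It is not a port of the file: it deliberately omits the file's special-case handling of all-zero codewords, and the function name is illustrative.

```python
from math import ceil, log2

def shannon_fano(probs):
    # Sort symbols by descending probability, remembering original positions.
    order = sorted(range(len(probs)), key=lambda i: -probs[i])
    p = [probs[i] for i in order]
    codes = [None] * len(probs)
    cum = 0.0
    for i, pi in enumerate(p):
        digits = ceil(-log2(pi)) if pi > 0 else 0
        # First `digits` bits of the binary expansion of the cumulative sum.
        word, frac = "", cum
        for _ in range(digits):
            frac *= 2
            if frac >= 1:
                word += "1"
                frac -= 1
            else:
                word += "0"
        codes[order[i]] = word
        cum += pi
    return codes
```

For the header's example distribution this yields the prefix-free set `["0", "10", "110", "1110"]`.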
##
## XY is the transition matrix
##
## see also: marginalr
##
function val=marginalc(XY)
  val=sum(XY);
  return
end

--- NEW FILE: tunstallcode.m ---
## (C) June 2006, Muthiah Annamalai Sat Jun 3 01:44:59 CDT 2006
## <mut...@ut...>
##
## This program is free software; you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation; either version 2 of the License, or
## (at your option) any later version.
##
## This program is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
## GNU General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with this program; if not, write to the Free Software
## Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
##
## usage: code_dictionary = tunstallcode(probability_list)
## Implementation of an |A|-bit Tunstall coder,
## given the source probabilities of the |A| symbols,
## with 2^|A| code words involved.
## The ordering of symbols in probability_list is
## preserved in the output symbol/code dictionary.
## The Tunstall code is a variable-to-fixed source coding scheme;
## the arrangement of the codeword list corresponds
## to the regular Tunstall ordering of the variable source
## word list, and the codes for each of them are enumerations
## from 1:2^N. Only the ordering (grouping) of source symbols
## is returned, as the index of a match is the corresponding
## code word. The probabilities of the various source words
## are also returned.
##
## example: [cw_list,prob_list]=tunstallcode([0.6 .3 0.1])
## essentially you will use cw_list to parse the input,
## and then compute the code as the binary value of the index
## of the match, since it is a variable-to-fixed code.
##
## Reference: "Synthesis of noiseless compression codes", Ph.D. Thesis
## of B.P. Tunstall, Georgia Tech, Sept 1967
##
function [cw_list,prob_list]=tunstallcode(prob_list)
  if nargin < 1
    error('usage: tunstallcode(probability_list)');
  end

  N=length(prob_list);
  CWMAX=2**N;
  porig_list=prob_list;

  for idx=1:N
    cw_list{idx}=sprintf("%d",idx);
  end

  while ((length(cw_list)+N-1) <= CWMAX)
    L=length(cw_list);
    %
    %assuming the maximum of the list is on top,
    %i.e. all lists are in descending order.
    %
    base=cw_list{1};
    base_prob=prob_list(1);
    prob_list=[prob_list(2:end) base_prob];
    cw_list={cw_list{2:end}};
    for idx=1:N
      prob_list(L+idx-1)=base_prob*porig_list(idx);
      w=sprintf("%s%d",base,idx);
      cw_list{L+idx-1}=w;
    end
    idx=find(prob_list==max(prob_list));
    prob_list=[prob_list(idx) prob_list(1:idx-1) prob_list(idx+1:end)];
    cw_list={cw_list{idx} cw_list{1:idx-1} cw_list{idx+1:end}};
  end
  return
end
%!
%!P=[ 0.500000 0.250000 0.125000 0.125];
%!tunstallcode(P)
%!tunstallcode([0.6 .3 0.1])
%!

--- NEW FILE: relativeentropy.m ---
## (C) 2005, December, Muthiah Annamalai, <mut...@ut...>
##
## This program is free software; you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation; either version 2 of the License, or
## (at your option) any later version.
##
## This program is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
## GNU General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with this program; if not, write to the Free Software
## Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
##
## usage: relativeentropy(p,q) computes the
## relative entropy between the two given pdfs.
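The expansion loop in tunstallcode.m above (repeatedly split the most probable parse word into |A| children until the 2^|A| codeword budget is reached) can be sketched in Python. This is an illustrative sketch, not a port: the dictionary is returned as (word, probability) pairs rather than digit strings, and the function name is assumed.

```python
def tunstall(probs, max_words=None):
    # Variable-to-fixed Tunstall dictionary: start with the single symbols,
    # then repeatedly expand the most probable word into n children.
    n = len(probs)
    if max_words is None:
        max_words = 2 ** n          # n-bit fixed-length output codes
    words = [((i,), probs[i]) for i in range(n)]
    # Expanding one word removes it and adds n children: net gain n - 1.
    while len(words) + n - 1 <= max_words:
        words.sort(key=lambda wp: -wp[1])
        base, base_p = words.pop(0)
        words.extend((base + (i,), base_p * probs[i]) for i in range(n))
    words.sort(key=lambda wp: -wp[1])
    return words
```

Because the dictionary is a complete parse tree, the word probabilities always sum to 1; for `[0.6, 0.3, 0.1]` the loop stops at 7 words, matching the `2^3 = 8` budget with net gain 2 per expansion.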
##
## d(p || q) = SUM( p(i)*log2( p(i)/q(i) ) )
## The Kullback-Leibler distance between two probability
## distributions is the relative entropy. It is not a true
## metric, as it is not symmetric. Wherever an infinity arises,
## the term is set to zero.
##
## see also: entropy, conditionalentropy
##
function val=relativeentropy(p,q)
  if nargin < 2 || any(size(p)~=size(q)) || ~isvector(p) || ~isvector(q)
    error('usage: relativeentropy(p,q) of equal length');
  end
  val=zeros(1,max(size(p)));
  p_by_q=p./q;
  p_by_q(p_by_q == inf)=0;
  nonzero_idx=(p_by_q > 0);
  val(nonzero_idx)= p(nonzero_idx).*log2(p_by_q(nonzero_idx));
  return
end
%!
%!
%!

--- NEW FILE: unaryenc.m ---
## (C) 2006, August, Muthiah Annamalai, <mut...@ut...>
##
## This program is free software; you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation; either version 2 of the License, or
## (at your option) any later version.
##
## This program is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
## GNU General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with this program; if not, write to the Free Software
## Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
##
## usage: unaryenc(value)
## This function unary-encodes the decimal value;
## useful if you are trying to perform Golomb-Rice coding.
## value needs to be a non-negative number or row-vector.
##
## Theory: Unary encoding of a non-negative number N is done as
## follows: use N ones followed by one zero. For instance, the
## unary-coded value of 5 will then be (111110)base2,
## which is 31x2 = 62.
## From this definition, decoding follows.
##
## example:
## message=[ 5 4 4 1 1 1]
## coded=unaryenc(message); #gives 62 30 30 2 2 2
## unarydec(coded) #gives message back.
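The closed forms behind the unaryenc/unarydec pair above (N ones followed by a zero, read as a binary number, gives (2^N - 1)*2; taking log2(v + 2) - 1 inverts it) can be sketched in Python as a round-trip check. Function names here are illustrative.

```python
from math import log2

def unary_encode(n):
    # n ones followed by a zero, read as a binary number: (2**n - 1) * 2.
    return (2 ** n - 1) * 2 if n > 0 else 0

def unary_decode(v):
    # Inverse of the closed form above: n = log2(v + 2) - 1.
    return int(round(log2(v + 2) - 1))
```

For the header's example, `[5, 4, 4, 1, 1, 1]` encodes to `[62, 30, 30, 2, 2, 2]` and decodes back exactly.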
##
## See also: unarydec
##
## Optimal for exponential sources, i.e. where the
## symbol probabilities are of the form pow(2,-n).
##
function rval=unaryenc(val)
  if val==0
    rval=val;
    return
  end
  rval=2.^(val)-1; % add so many 1's
  rval=rval*2;     % append a 0
end

--- NEW FILE: huffman.m ---
## (C) 2006, May 31, Muthiah Annamalai, <mut...@ut...>
##
## This program is free software; you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation; either version 2 of the License, or
## (at your option) any later version.
##
## This program is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
## GNU General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with this program; if not, write to the Free Software
## Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
##
## usage: huffman(source probability vector, {toggle})
## toggle is a 1/0 optional argument that starts building the
## code from 1's or 0's, defaulting to 0.
##
## This function builds a Huffman code,
## given the probability list.
## The Huffman codes per symbol are output
## as a list of strings per source symbol.
##
## example: huffman([0.5 0.25 0.15 0.1]) => CW(0,10,111,110)
##          huffman(0.25*ones(1,4)) => CW(11,10,01,00)
##
function cw_list=huffman(source_prob,togglecode)
  if nargin < 1
    error("Usage: huffman(source_prob,{togglecode 1-0 in code});")
  elseif nargin < 2
    togglecode=0;
  end

  ## Huffman code algorithm:
  ## while (uncombined symbols remain)
  ##   combine the two least probable symbols into one whose
  ##   probability is the sum of the two;
  ##   save this result as a stage, at the lowest-order rung.
  ##   (Moving it to the lowest position possible gives non-minimum-variance
  ##   entropy coding.)
  ## end
  ##
  ## for each (stage)
  ##   walk the tree we built, and assign each row either a 1 or a 0
  ## end
  ##
  ## reverse each symbol, and dump it out.
  ##

  %
  %need to find & eliminate the zero-probability code words.
  %in practice we do not need to assign anything to them.
  %
  origsource_prob=source_prob;
  %
  %sort the list and record the index transpositions.
  %
  L=length(source_prob);
  index=[1:L];
  for itr1=1:L
    for itr2=itr1:L
      if(source_prob(itr1) < source_prob(itr2))
        t=source_prob(itr1);
        source_prob(itr1)=source_prob(itr2);
        source_prob(itr2)=t;
        t=index(itr1);
        index(itr1)=index(itr2);
        index(itr2)=t;
      end
    end
  end
  %index
  %source_prob;

  idx=find(source_prob==0);
  if(not(isempty(idx)))
    source_prob=source_prob(1:idx(1)-1);
  end
  clear idx;

  stage_list={};
  cw_list={};
  stage_curr={};
  stage_curr.prob_list=source_prob;
  stage_curr.sym_list={};

  S=length(source_prob);
  for i=1:S;
    stage_curr.sym_list{i}=[i];
    cw_list{i}="";
  end

  I=1;
  while (I<S)
    L=length(stage_curr.prob_list);
    nprob=stage_curr.prob_list(L-1)+stage_curr.prob_list(L);
    nsym=[stage_curr.sym_list{L-1}(1:end),stage_curr.sym_list{L}(1:end)];
    %stage_curr;
    %archive old stage list.
    stage_list{I}=stage_curr;
    %
    %insert the new probability into the list, at the
    %first-position (greedy?) possible.
    %
    for i=1:(L-2)
      if(stage_curr.prob_list(i)<nprob)
        break;
      end
    end
    stage_curr.prob_list=[stage_curr.prob_list(1:i-1) nprob stage_curr.prob_list(i:L-2)];
    stage_curr.sym_list={stage_curr.sym_list{1:i-1}, nsym, stage_curr.sym_list{i:L-2}};
    % Loopie
    I=I+1;
  end

  one_cw="1";
  zero_cw="0";
  if (togglecode==0)
    one_cw="1";
    zero_cw="0";
  else
    one_cw="0";
    zero_cw="1";
  end

  %printf("Exit Loop");
  I=I-1;
  while (I>0)
    stage_curr=stage_list{I};
    L=length(stage_curr.sym_list);
    clist=stage_curr.sym_list{L};
    for k=1:length(clist)
      cw_list{clist(k)}=strcat(cw_list{clist(k)},one_cw);
    end
    clist=stage_curr.sym_list{L-1};
    for k=1:length(clist)
      cw_list{clist(k)}=strcat(cw_list{clist(k)},zero_cw);
    end
    % Loopie
    I=I-1;
  end

  %
  %zero all the code-words of zero-probability length, 'cos they
  %never occur.
  %
  S=length(source_prob);
  for itr=(S+1):length(origsource_prob)
    cw_list{itr}="";
  end
  cw_list;
  %
  % Re-sort the indices according to the probability list.
  %
  L=length(source_prob);
  for itr=1:(L)
    if(index(itr)==-1)
      continue;
    end
    t=cw_list{index(itr)};
    cw_list{index(itr)}=cw_list{itr};
    cw_list{itr}=t;
    index(index(itr))=-1;
  end

  %zero all the code-words of zero-probability length, 'cos they
  %never occur.
  %for itr=1:L
  %  if(origsource_prob(itr)==0)
  %    cw_list{itr}="";
  %  end
  %end
  return
end
%!assert(huffman([0.5 0.25 0.15 0.1],1), {"0","10","111","110"},0)
%!assert(huffman(0.25*ones(1,4),1),{"11","10","01","00"},0)

--- NEW FILE: redundancy.m ---
## (C) 2006, September 28, Muthiah Annamalai, <mut...@ut...>
##
## This program is free software; you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation; either version 2 of the License, or
## (at your option) any later version.
##
## This program is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
## GNU General Public License for more details.
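A quick way to sanity-check the stage-based Huffman builder above is to compare code lengths against a heap-based merge, since tie-breaking may produce different (but equally optimal) codewords. The following Python sketch computes only the per-symbol code lengths; the function name is illustrative.

```python
import heapq

def huffman_lengths(probs):
    # Each heap entry carries (probability, member symbol indices).
    # Merging the two least probable groups adds one bit to every
    # symbol inside them, exactly as the stage walk above does.
    heap = [(p, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    while len(heap) > 1:
        p1, s1 = heapq.heappop(heap)
        p2, s2 = heapq.heappop(heap)
        for i in s1 + s2:
            lengths[i] += 1
        heapq.heappush(heap, (p1 + p2, s1 + s2))
    return lengths
```

For `[0.5, 0.25, 0.15, 0.1]` this gives lengths `[1, 2, 3, 3]`, consistent with the header example CW(0,10,111,110).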
##
## You should have received a copy of the GNU General Public License
## along with this program; if not, write to the Free Software
## Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
##
## usage: redundancy(code_word_list,symbol_probabilities)
## computes the fractional redundancy, 1 - entropy/average-length,
## of a particular coding scheme.
##
## example: prob_list=[0.5 0.25 0.15 0.1];
##          min_bits=entropy(prob_list);
##          cw_list=huffman(prob_list);
##          redundancy(cw_list,prob_list)
##
function [val,ent,lavg]=redundancy(code_word_list,symprob)
  if ( nargin < 2 )
    error("usage: redundancy(code_word_list,symbol_probability_list); computes entropy in base-2");
  end
  ent=entropy(symprob);
  lavg=laverage(code_word_list,symprob);
  val=1.0-(ent/lavg);
  return
end
%!
%!assert(redundancy({"1","01","000","001"},[0.5 0.25 0.15 0.1]),0.0041499,1e-3)
%!assert(redundancy({"00","01","10","11"},[0.25 0.25 0.25 0.25]),0,0)
%!

--- NEW FILE: entropy.m ---
## (C) 2006, May 31, Muthiah Annamalai, <mut...@ut...>
##
## This program is free software; you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation; either version 2 of the License, or
## (at your option) any later version.
##
## This program is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
## GNU General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with this program; if not, write to the Free Software
## Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
##
## usage: entropy(symbol_probabilities,{base})
## computes the Shannon entropy of a discrete source whose
## probabilities are given in the first argument; optionally a
## base can be specified. The base of the logarithm defaults to 2,
## in which case the entropy can be thought of as a measure of the
## bits needed to represent any message of the source.
##
## example: entropy([0.25 0.25 0.25 0.25]) ans = 2
##          entropy([0.25 0.25 0.25 0.25],4) ans = 1
##
function val=entropy(symprob,base)
  if nargin < 1
    error("usage: entropy(symbol_probability_list); computes entropy in base-2");
  elseif nargin < 2
    base=2;
  end
  val=0.0;
  #eliminate zeros from x.
  x=symprob(symprob > 0);
  val=-sum(log10(x).*x)/log10(base);
  return
end
%!
%!assert(entropy([0.25 0.25 0.25 0.25]),2,0)
%!

--- NEW FILE: arithmetic_encode.m ---
## (C) 2006, August, Muthiah Annamalai, <mut...@ut...>
##
## This program is free software; you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation; either version 2 of the License, or
## (at your option) any later version.
##
## This program is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
## GNU General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with this program; if not, write to the Free Software
## Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
##
## usage: arithmetic_encode(message,symbol_probabilities_list)
## computes the arithmetic code for the message, given the symbol
## probabilities. The arithmetic coding procedure assumes
## that the message is a list of numbers and that the symbol
## probabilities correspond to the indices.
##
## example: symbols=[1,2,3,4]; sym_prob=[0.5 0.25 0.15 0.10];
##          message=[1, 1, 2, 3, 4];
##          arithmetic_encode(message,sym_prob) ans=0.18078
##
## see also: arithmetic_decode
##

% FIXME
% -----
% Limited by floating-point precision. Not foolproof
% (interval doubling is not implemented).
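The interval-narrowing step that arithmetic_encode.m performs translates directly; here is a Python sketch using the same 1-based symbol convention (float precision only, as the FIXME above warns). The function name mirrors the file but the sketch is illustrative.

```python
def arithmetic_encode(message, probs):
    # Narrow [lo, up) by the cumulative interval of each symbol; the
    # transmitted tag is the midpoint of the final interval.
    lo, up = 0.0, 1.0
    for m in message:                    # m is a 1-based symbol index
        width = up - lo
        hi_cum = sum(probs[:m])          # cumulative prob through symbol m
        lo_cum = hi_cum - probs[m - 1]
        lo, up = lo + width * lo_cum, lo + width * hi_cum
    return 0.5 * (lo + up)
```

With the header's example, `arithmetic_encode([1, 1, 2, 3, 4], [0.5, 0.25, 0.15, 0.1])` reproduces the documented tag 0.18078.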
%
% First assume that the list is in symbolic order
% (sorting is not necessary).
%
% Message is an array of symbols (numbers).
%
function tag=arithmetic_encode(message,problist)
  if nargin < 2
    error('Usage: arithmetic_encode(message,problist)');
  end
  Up=1;
  Lo=0;
  L_MSG=length(message);
  for itr=1:L_MSG
    Mi=message(itr);
    T1=Lo;
    T2=+(Up-Lo);
    T3=sum(problist(1:Mi));
    Lo=T1+T2*(T3-problist(Mi));
    Up=T1+T2*T3;
  end
  tag=0.5*(Up+Lo);
  return;
end

--- NEW FILE: marginalr.m ---
## (C) 2005, December, Muthiah Annamalai, <mut...@ut...>
##
## This program is free software; you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation; either version 2 of the License, or
## (at your option) any later version.
##
## This program is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
## GNU General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with this program; if not, write to the Free Software
## Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
##
## usage: marginalr(XY)
## computes marginal probabilities along rows.
##
## XY is the transition matrix
##
## see also: marginalc
##
function val=marginalr(XY)
  val=sum(XY');
  return
end

--- NEW FILE: conditionalentropy_YX.m ---
## (C) 2005, December, Muthiah Annamalai, <mut...@ut...>
##
## This program is free software; you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation; either version 2 of the License, or
## (at your option) any later version.
##
## This program is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
## GNU General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with this program; if not, write to the Free Software
## Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
##
## usage: conditionalentropy_YX(XY) computes
## H(Y/X) = SUM( P(Xi)*H(Y/Xi) ), where H(Y/Xi) = SUM( -P(Yk/Xi)log(P(Yk/Xi)) )
##
## The XY matrix must have Y along rows
## and X along columns.
## Xi = SUM( COLi )
## Yi = SUM( ROWi )
## H(Y|X) = H(X,Y) - H(X)
##
## See also: conditionalentropy_XY
##
function val=conditionalentropy_YX(XY)
  val=0.0;
  for i=1:size(XY,2)
    Xi = sum(XY(:,i));
    val = val + Xi*entropy(XY(:,i)/Xi);
  end
  return
end

--- NEW FILE: rle_decode.m ---
## (C) 2006, August, Muthiah Annamalai, <mut...@ut...>
##
## This program is free software; you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation; either version 2 of the License, or
## (at your option) any later version.
##
## This program is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
## GNU General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with this program; if not, write to the Free Software
## Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
##
## usage: rle_decode(message)
## This function decodes a run-length encoded message into its
## original form. The encoded message has to be in the form of a
## row-vector. The encoded RLE message format is pairs of
## (repetition factor, value).
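The (repetition factor, value) pair format that rle_decode.m consumes can be sketched end-to-end in Python. Note that `rle_encode` below is an assumed counterpart matching the pair format, since the contents of rle_encode.m are not shown in this commit message; both names are illustrative.

```python
def rle_encode(msg):
    # Emit [count, value, count, value, ...] for each run.
    out = []
    for v in msg:
        if len(out) >= 2 and out[-1] == v:
            out[-2] += 1          # extend the current run
        else:
            out.extend([1, v])    # start a new run
    return out

def rle_decode(coded):
    # Expand each (count, value) pair back into a run.
    out = []
    for count, value in zip(coded[::2], coded[1::2]):
        out.extend([value] * count)
    return out
```

Round-tripping the header's example, `[5, 4, 4, 1, 1, 1]` encodes to `[1, 5, 2, 4, 3, 1]` and back.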
##
## example:
## message=[1 5 2 4 3 1];
## rle_decode(message) #gives
## ans = [ 5 4 4 1 1 1]
##
## see also: rle_encode()
##
function rmsg=rle_decode(message)
  if nargin < 1
    error('Usage: rle_decode(message)')
  end
  rmsg=[];
  L=length(message);
  itr=1;
  while itr < L
    times=message(itr);
    val=message(itr+1);
    rmsg=[rmsg val*(ones(1,times))];
    itr=itr+2;
  end
  return
end

--- NEW FILE: binaryn.m ---
## (C) Mon May 22 15:46:24 2006, Muthiah Annamalai, <mut...@ut...>
##
## This program is free software; you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation; either version 2 of the License, or
## (at your option) any later version.
##
## This program is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
## GNU General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with this program; if not, write to the Free Software
## Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
##
## usage: binaryn(val,digits) extracts the first MSB $digits of the
## number val in a string format.
##
function rval=binaryn(Val,digits)
  %
  % This function returns the first $digits
  % binary digits/bits of the value $Val
  % as a string, most significant bit first.
  %
  % FIXME: Is there a faster way to do this?
  rval=sprintf("%d",ones(1,digits));
  for idx=1:digits
    Val=Val*2;
    if(Val >= 1)
      Val=Val-1;
      rval(idx)="1";
    else
      rval(idx)="0";
    end
  end
  return
end
%!
%!assert(binaryn(255,8),"11111111",0);
%!assert(binaryn(128,1),"1",0);
%!
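binaryn.m peels off the leading bits of a value by repeated doubling (double the fraction; if it reaches 1, emit a 1 and subtract 1, else emit a 0). A minimal MSB-first Python sketch of that loop, with an illustrative name:

```python
def first_bits(val, digits):
    # Repeatedly double and subtract 1 to peel off the leading binary
    # digits of a fractional value, most significant bit first.
    bits = ""
    for _ in range(digits):
        val *= 2
        if val >= 1:
            bits += "1"
            val -= 1
        else:
            bits += "0"
    return bits
```

For example, the cumulative probabilities used by shannonfano_code.m expand as `first_bits(0.75, 2)` giving `"11"` and `first_bits(0.9, 4)` giving `"1110"`.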
--- NEW FILE: bscchannel.m ---
## (C) 2005, December, Muthiah Annamalai, <mut...@ut...>
##
## This program is free software; you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation; either version 2 of the License, or
## (at your option) any later version.
##
## This program is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
## GNU General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with this program; if not, write to the Free Software
## Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
##
## usage: bscchannel(p)
## Returns the transition matrix for a Binary Symmetric
## Channel with error probability p.
##
## see also: entropy
##
function transmat=bscchannel(p)
  transmat=[ 1-p p; p 1-p];
  return
end

--- NEW FILE: unarydec.m ---
## (C) 2006, August, Muthiah Annamalai, <mut...@ut...>
##
## This program is free software; you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation; either version 2 of the License, or
## (at your option) any later version.
##
## This program is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
## GNU General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with this program; if not, write to the Free Software
## Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
##
## usage: unarydec(value)
## This function decodes a unary-encoded value;
## useful if you are trying to perform Golomb-Rice coding.
## value needs to be a number or row-vector.
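bscchannel.m above returns only the transition matrix; as a usage illustration, the standard capacity of a binary symmetric channel, C = 1 - H2(p) with H2 the binary entropy function, can be computed alongside it. The capacity formula is the textbook result, not something the file provides, and the Python names here are illustrative.

```python
from math import log2

def bsc_matrix(p):
    # Transition matrix of a binary symmetric channel with crossover p,
    # mirroring bscchannel.m: [1-p p; p 1-p].
    return [[1 - p, p], [p, 1 - p]]

def bsc_capacity(p):
    # C = 1 - H2(p), where H2(p) = -p*log2(p) - (1-p)*log2(1-p).
    h2 = -sum(q * log2(q) for q in (p, 1 - p) if q > 0)
    return 1 - h2
```

A noiseless channel (p = 0) has capacity 1 bit per use; a completely scrambled one (p = 0.5) has capacity 0.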
##
## example:
## message=[ 5 4 4 1 1 1]
## coded=unaryenc(message); #gives 62 30 30 2 2 2
## unarydec(coded) #gives message back.
##
## See also: unaryenc
##
function rval=unarydec(val)
  rval=log2(val+2)-1;
end

--- NEW FILE: conditionalentropy_XY.m ---

## (C) 2005, December, Muthiah Annamalai, <mut...@ut...>
##
## This program is free software; you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation; either version 2 of the License, or
## (at your option) any later version.
##
## This program is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
## GNU General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with this program; if not, write to the Free Software
## Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
##
## usage: conditionalentropy_XY(XY) computes
## H(X|Y) = SUM( P(Yi)*H(X|Yi) ),
## where H(X|Yi) = SUM( -P(Xk|Yi)*log(P(Xk|Yi)) )
## and P(Xk|Yi) = P(Xk,Yi)/P(Yi).
## The XY matrix must have Y along rows and
## X along columns, so that
## Xi = SUM( COLi )
## Yi = SUM( ROWi )
## H(X|Y) = H(X,Y) - H(Y)
##
## see also: entropy, conditionalentropy
##
function val=conditionalentropy_XY(XY)
  val=0.0;
  for i=1:size(XY,1)
    Yi = sum(XY(i,:));
    val = val + Yi*entropy(XY(i,:)/Yi);
  end
  return
end

--- NEW FILE: laverage.m ---

## (C) Aug, 2006 Muthiah Annamalai, <mut...@ut...>
##
## This program is free software; you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation; either version 2 of the License, or
## (at your option) any later version.
##
## This program is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
## GNU General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with this program; if not, write to the Free Software
## Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
##
## usage: avgbits = laverage(codebook,problist);
##
## Compute the average word length SUM(i=1:N) Li*Pi,
## where codebook is a cell array of strings, each string
## representing a codeword, and problist holds the
## corresponding probability values.
##
## example: x={"0","111","1110"}; p=[0.1 0.5 0.4];
## laverage(x,p) #must give
## ans = 3.2000
function Lavg=laverage(codebook,problist)
  if nargin < 2
    error('usage laverage(codebook,problist)');
  end
  Lavg=0.0;
  for itr=1:length(codebook)
    Lavg=Lavg+length(codebook{itr})*problist(itr);
  end
  return
end
%!test
%! x={"0","111","1110"};
%! p=[0.1 0.5 0.4];
%! assert(laverage(x,p),3.200,0.001);

--- NEW FILE: mutualinformation.m ---

## (C) 2005, December, Muthiah Annamalai, <mut...@ut...>
##
## This program is free software; you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation; either version 2 of the License, or
## (at your option) any later version.
##
## This program is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
## GNU General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with this program; if not, write to the Free Software
## Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
##
## usage: mutualinformation(XY) computes the
## mutual information of the given channel transition matrix.
## By definition, I(X;Y) is given as
## I(X;Y) = SUM( P(x,y) log2( p(x,y) / (p(x)p(y)) ) )
##        = relative_entropy( P(X,Y) || P(X)P(Y) )
## Mutual information is the amount of information one variable
## has about the other; it is the reduction of uncertainty.
## This is a symmetric function.
##
## see also: entropy, conditionalentropy
##
function val=mutualinformation(XY)
  if nargin < 1 || ~ismatrix(XY)
    error('Usage: mutualinformation(XY) where XY is the transition matrix');
  end
  val=0.0;
  X=marginalc(XY);
  Y=marginalr(XY);
  rows=size(XY,1);
  cols=size(XY,2);
  for i=1:rows
    tVal=XY(i,:)./(X*Y(i)); #cannot handle infinities; user should be prepared
    nz_idx=(tVal > 0);
    val=val+sum(XY(i,nz_idx).*log2(tVal(nz_idx)));
  end
  return
end

--- NEW FILE: jointentropy.m ---

## (C) 2005, December, Muthiah Annamalai, <mut...@ut...>
##
## This program is free software; you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation; either version 2 of the License, or
## (at your option) any later version.
##
## This program is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
## GNU General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with this program; if not, write to the Free Software
## Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
##
## usage: jointentropy(XY) computes the
## joint entropy of the given channel transition matrix.
## By definition, H(X,Y) is given as
## H(X,Y) = -SUMx( SUMy( P(x,y) log2( p(x,y) ) ) )
##
## see also: entropy, conditionalentropy
##
function val=jointentropy(XY)
  if nargin < 1 || ~ismatrix(XY)
    error('Usage: jointentropy(XY) where XY is the transition matrix');
  end
  val=0.0;
  S=size(XY);
  if (S(1)==1)
    val=entropy(XY);
  else
    rows=S(1);
    for i=1:rows
      val=val+entropy(XY(i,:));
    end
  end
  return
end

--- NEW FILE: rle_encode.m ---

## (C) 2006, August, Muthiah Annamalai, <mut...@ut...>
##
## This program is free software; you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation; either version 2 of the License, or
## (at your option) any later version.
##
## This program is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
## GNU General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with this program; if not, write to the Free Software
## Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
##
## usage: rle_encode(message)
## This function run-length encodes a message into its
## RLE form. The original message has to be a
## row-vector. The encoded format is pairs of a
## repetition count followed by the value.
##
## example:
## message=[ 5 4 4 1 1 1]
## rle_encode(message) #gives
## ans = [1 5 2 4 3 1];
##
## see also: rle_decode()
##
function rmsg=rle_encode(message)
  if nargin < 1
    error('Usage: rle_encode(message)')
  end
  rmsg=[];
  L=length(message);
  itr=1;
  while itr <= L
    times=0;
    val=message(itr);
    while (itr <= L && message(itr)==val)
      itr=itr+1;
      times=times+1;
    end
    rmsg=[rmsg times val];
  end
  return
end
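The (count, value) pair format shared by rle_encode and rle_decode round-trips losslessly. A minimal Python sketch of the same scheme (purely illustrative, not part of the package, with the Octave function names reused for clarity):

```python
def rle_encode(msg):
    """Encode a list as flat (count, value) pairs: [5,4,4] -> [1,5, 2,4]."""
    out = []
    i = 0
    while i < len(msg):
        j = i
        while j < len(msg) and msg[j] == msg[i]:
            j += 1                 # scan to the end of the current run
        out += [j - i, msg[i]]     # repetition count, then the value
        i = j
    return out

def rle_decode(code):
    """Invert rle_encode: expand each (count, value) pair."""
    out = []
    for k in range(0, len(code), 2):
        out += [code[k + 1]] * code[k]
    return out

# rle_encode([5, 4, 4, 1, 1, 1]) -> [1, 5, 2, 4, 3, 1]
```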