[Octave-cvsupdate] octave-forge/main/comm/inst huffmandeco.m, NONE, 1.1 huffmandict.m, NONE, 1.1 hu

SourceForge Headquarters 225 Broadway Suite 1600 San Diego, CA 92101 +1 (858) 454-5900

Update of /cvsroot/octave/octave-forge/main/comm/inst
In directory sc8-pr-cvs3.sourceforge.net:/tmp/cvs-serv3667

Added Files:
	huffmandeco.m huffmandict.m huffmanenco.m 
Log Message:
Octave now has huffman coding algorithms which are compatible with the other-leading-brand and even better with 2 other options for Huffman code generation including a minimum variance of code word option. The computational complexity of these routines are way high, sorry about that. But O(N*log(N)) encoding & decoding exist which we can implement as time permits.

--- NEW FILE: huffmandict.m ---
## (C) 2006, Sep 30, Muthiah Annamalai, <mut...@ut...>
## 
## This program is free software; you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation; either version 2 of the License, or
## (at your option) any later version.
##
## This program is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
## GNU General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with this program; if not, write to the Free Software
## Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
##

## usage: huffmandict(symbols, source probability vector, {toggle},{minvar})
## toggle is a 1,0 optional argument that starts building a
## code based on 1's or 0's, defaulting to 0. Also minvar is a boolean
## value that is useful in choosing if you want to optimize buffer for
## transmission in the applications of Huffman coding, however it doesnt
## affect the type or average codeword length of the generated code.
##
## This function builds a Huffman code, given the probability list.
## The Huffman codes persymbol are output as a list of
## strings-per-source symbol. A zero probability symbol is NOT assigned
## any codeword as this symbol doesnt occur in practice anyway.
## 
## example: huffmandict(symbols, [0.5 0.25 0.15 0.1]) => CW(0,10,111,110)
##          huffmandict(symbols, 0.25*ones(1,4)) => CW(11,10,01,00)
##
##          prob=[0.5 0 0.25 0.15 0.1]
##          dict=huffmandict(1:5,[0.5 0 0.25 0.15 0.1],1)
##          entropy(prob)
##          laverage(dict,prob)
##         
##          x =   [0.20000   0.40000   0.20000   0.10000   0.10000];
##          #illustrates the minimum variance thing.
##          huffmandict(1,x,0,true) #min variance tree.
##          huffmandict(1,x)     #normal huffman tree.
##
##          Ref: Dr.Rao's course EE5351 Digital Video Coding,
##               at UT-Arlington.
##
## Huffman code algorithm.
## while (uncombined_symbols_remain)
##       combine least probable symbols into one with,
##      their probability being the sum of the two.
##       save this result as a stage at lowest order rung.
##       (Moving to lowest position possible makes it non-minimum variance
##        entropy coding)
## end
##
## for each (stage)
## Walk the tree we built, and assign each row either 1,
## or 0 of 
## end
## 
## reverse each symbol, and dump it out.
##

function cw_list=huffmandict(sym,source_prob,togglecode,minvar)
  if nargin < 2
    error("Usage: huffman_dict(source_prob,{togglecode 1-0 in code});")
  elseif nargin < 3
    togglecode=0;
    minvar=0;
  end

  if(sum(source_prob)!=1.0)
    error("source probabilities must add up to 1.0");
  end

  %
  %need to find & eliminate the zero-probability code words.
  %in practice we donot need to assign anything to them. Reasoning
  %being that if it doesnt occur why bother saving its vale anyway?
  %

  origsource_prob=source_prob;

  %
  %sort the list and know the index transpositions. kills the speed O(n^2).
  %
  L=length(source_prob);
  index=[1:L];
  for itr1=1:L
    for itr2=itr1:L
      if(source_prob(itr1) < source_prob(itr2))
	t=source_prob(itr1);
	source_prob(itr1)=source_prob(itr2);
	source_prob(itr2)=t;

	t=index(itr1);
	index(itr1)=index(itr2);
	index(itr2)=t;
      end
    end
  end

  %index
  %source_prob

  idx=find(source_prob==0);
  if(not(isempty(idx)))    
    source_prob=source_prob(1:idx(1)-1);
  end
  clear idx;

  stage_list={};
  cw_list{1:L}=[];

  stage_curr={};
  stage_curr.prob_list=source_prob;
  stage_curr.sym_list={};
  S=length(source_prob);
  for i=1:S; 
    stage_curr.sym_list{i}=[i];
    #cw_list{i}="";
  end

  %
  % another O(n^2) part.
  %
  I=1;
  while (I<S)
    L=length(stage_curr.prob_list);
    nprob=stage_curr.prob_list(L-1)+stage_curr.prob_list(L);
    nsym=[stage_curr.sym_list{L-1}(1:end),stage_curr.sym_list{L}(1:end)];

    %stage_curr;
    %archive old stage list.
    stage_list{I}=stage_curr;

    %
    %insert the new probability into the list, at the
    %first-position (greedy?) possible.
    %
    for i=1:(L-2)
      if((minvar && stage_curr.prob_list(i)<=nprob) || \
	 stage_curr.prob_list(i) < nprob)
	break;
      end
    end

    stage_curr.prob_list=[stage_curr.prob_list(1:i-1) nprob stage_curr.prob_list(i:L-2)];
    stage_curr.sym_list={stage_curr.sym_list{1:i-1}, nsym, stage_curr.sym_list{i:L-2}};

    % Loopie
    I=I+1;
  end

  one_cw="1";
  zero_cw="0";

  if (togglecode==0)
      one_cw=1;
      zero_cw=0;
  else
     one_cw=0;
     zero_cw=1;
  end

  %
  % another O(n^2) part.
  %
  %printf("Exit Loop");
  I=I-1;
  while (I>0)
    stage_curr=stage_list{I};
    L=length(stage_curr.sym_list);

    clist=stage_curr.sym_list{L};
    for k=1:length(clist)
      cw_list{1,clist(k)}=[cw_list{1,clist(k)} one_cw];
    end

    clist=stage_curr.sym_list{L-1};
    for k=1:length(clist)
      cw_list{1,clist(k)}=[cw_list{1,clist(k)}, zero_cw];
    end

    % Loopie
    I=I-1;
  end

  %
  %zero all the code-words of zero-probability length, 'cos they
  %never occur.
  %
  S=length(source_prob);
  for itr=(S+1):length(origsource_prob)
    cw_list{1,itr}=-1;
  end

  %disp('Before resorting')
  %cw_list

  nw_list{1:L}=[];
  %
  % Re-sort the indices according to the probability list.
  %
  L=length(source_prob);
  for itr=1:(L)
    t=cw_list{index(itr)};
    nw_list{index(itr)}=cw_list{itr};
  end
  cw_list=nw_list;

  %zero all the code-words of zero-probability length, 'cos they
  %never occur.

  %for itr=1:L
  %  if(origsource_prob(itr)==0)
  %    cw_list{itr}="";
  %  end
  %end

  return
end
%!
%!assert(huffmandict(1:4,[0.5 0.25 0.15 0.1],1), {[0],[1 0],[1 1 1],[1 1 0]},0)
%!assert(huffman(0.25*ones(1,4),1),{[1 1],[1 0],[0 1],[0 0]},0)
%!

--- NEW FILE: huffmanenco.m ---
## (C) 2006, Sep 30, Muthiah Annamalai, <mut...@ut...>
## 
## This program is free software; you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation; either version 2 of the License, or
## (at your option) any later version.
##
## This program is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
## GNU General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with this program; if not, write to the Free Software
## Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
##

##
## usage: huffmanenco(sig,dict)
## The function returns the Huffman encoded signal using dict.
## This function uses a dict built from the huffmandict
## and uses it to encode a signal list into a huffman list.
## Restrictions include a signal set that strictly belongs
## in the range [1,N] with N=length(dict). Also dict can
## only be from the huffmandict() routine.
## 
## 
## example: hd=huffmandict(1:4,[0.5 0.25 0.15 0.10])
##          huffmanenco(1:4,hd) #  [ 1   0   1   0   0   0   0   0   1 ]
##          
##
function hcode=huffmanenco(sig,dict)
  if ( nargin < 2 || strcmp(class(dict),"cell")~=1 )
    error('usage: huffmanenco(sig,dict)');
  end
  if (max(sig) > length(dict)) || ( min(sig) < 1)
    error("signal has elements that are outside alphabet set ...
	Cannot encode.");
  end
  hcode=[dict{sig}];
  return
end
%! 
%! assert(huffmanenco(1:4, huffmandict(1:4,[0.5 0.25 0.15 0.10])), [ 1   0   1   0   0   0   0   0   1 ],0)
%!

--- NEW FILE: huffmandeco.m ---
## (C) 2006, Sep 30, Muthiah Annamalai, <mut...@ut...>
## 
## This program is free software; you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation; either version 2 of the License, or
## (at your option) any later version.
##
## This program is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
## GNU General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with this program; if not, write to the Free Software
## Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
##

##
## usage: sig=huffmandeco(hcode,dict)
## The function returns the original signal that was 
## Huffman encoded signal using huffmanenco. This function uses
## a dict built from the huffmandict and uses it to decode a signal 
## list into a huffman list.
## Restrictions include hcode is expected to be a binary code;
## returned signal set that strictly belongs in the range [1,N]
## with N=length(dict). Also dict can  only be from the
## huffmandict() routine. Whenever decoding fails,
## those signal values are indicated by -1, and we successively 
## try to restart decoding from the next bit that hasnt failed in 
## decoding, ad-infinitum. 
##
## example: hd=huffmandict(1:4,[0.5 0.25 0.15 0.10])
##          hcode=huffmanenco(1:4,hd) #  [ 1   0   1   0   0   0   0   0   1 ]
##          huffmandeco(hcode,hd) # [1 2 3 4]
## 
function sig=huffmandeco(hcode,dict)
  if ( nargin < 2 || strcmp(class(dict),"cell")~=1 )
    error('usage: huffmanenco(sig,dict)');
  end
  if (max(hcode) > 1 || min(hcode) < 0)
    error("hcode has elements that are outside alphabet set ...
	Cannot decode.");
  end

# TODO:
# O(log(N)) algorithms exist, but we need some effort to implement those
# Maybe sometime later, it would be a nice 1-day project
# Ref: A memory efficient Huffman Decoding Algorithm, AINA 2005, IEEE
#

# TODO: Somebody can implement a 'faster' algorithm than O(N^3) at present
# Decoding Algorithm O(N+k*log(N)) which is approximately O(N+Nlog(N))
#
# 1.  Separate the dictionary  by the lengths
#     and knows symbol correspondence.
#
# 2.  Find the symbol in the dict of lengths l,h where
#     l = smallest cw length ignoring 0 length CW, and
#     h = largest cw length , with k=h-l+1;
#     
# 3.  Match the k codewords to the dict using binary
#     search, and then you can declare decision.
#
# 4.  In case of non-decodable words skip the start-bit
#     used in the previous case, and then restart the same
#     procedure from the next bit.
#
  L=length(dict);
  lenL=length(dict{1});
  lenH=0;

#  
# Know the ranges of operation
#  
  for itr=1:L
    t=length(dict{itr});
    if(t < lenL)
      lenL=t;
    end
    if(t > lenH)
      lenH=t;
    end
  end

#
# Now do a O(N^4) algorithm
#
  itr=0; #offset into the hcode
  sig=[]; #output
  CL=length(hcode);

  %whole decoding loop.
  while ( itr <  CL )
    %decode one element (or just try to).
    for itr_y=lenL:lenH
      %look for word in dictionary.
      %with flag to indicate found 
      %or not found. Detect end of buffer.

      if ( (itr+itr_y) > CL )
	break;
      end

      word=hcode(itr+1:itr+itr_y);
      flag=0;

      %word

      %search loop.
      for itr_z=1:L
	if(isequal(dict{itr_z},word))
	  itr=itr+itr_y; 
	  sig=[sig itr_z];
	  flag=1;
	  break;
	end
      end %for_ loop breaker

      if( flag == 1 )
	break; %also need to break_ above then.
      end

    end %for_ loop

    #could not decode
    if( flag == 0 )
      itr=itr+1;
      sig=[sig -1]; 
    end

  end  %while_ loop
end
%! 
%! assert(huffmandeco(huffmanenco(1:4, huffmandict(1:4,[0.5 0.25 0.15 0.10])), huffmandict(1:4,[0.5 0.25 0.15 0.10])), [1:4],0)
%!

[Octave-cvsupdate] octave-forge/main/comm/inst huffmandeco.m, NONE, 1.1 huffmandict.m, NONE, 1.1 hu

A collection of packages providing extra functionality for GNU Octave

[Octave-cvsupdate] octave-forge/main/comm/inst huffmandeco.m, NONE, 1.1 huffmandict.m, NONE, 1.1 huffmanenco.m, NONE, 1.1