CMU Sphinx / Forums / Sphinx4 Help: MFCC+DTW

Hi all,
If I'm using MFCC features and dynamic time warping for comparison, must I apply DTW to each frame, or to all the extracted features at once?
Thanks

jinio adham - 2008-10-12

OK,
I'm testing it, and it works fine.
The method measure() returns a similarity rather than a distance, so if reference = target we get DTWSimilarity.measure() = 1; we therefore have to take the max of the results.
But when I apply it to MFCC features I don't get good recognition.
So what am I doing wrong?
Thanks in advance
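
As a small illustration of the point about taking the maximum: the posted code maps a DTW distance x to a similarity with f(x) = 1 - x / (1 + x), which is the same as 1 / (1 + x), so a distance of 0 gives exactly 1 and larger distances approach 0. The sketch below assumes one distance per reference recording has already been computed; the class and method names (`SimilarityScore`, `distanceToSimilarity`, `bestMatch`) and the sample distances are made up for the example.

```java
final class SimilarityScore {

    /** Maps a non-negative distance x into (0, 1]: f(x) = 1 - x / (1 + x). */
    static double distanceToSimilarity(double x) {
        return 1.0 - (x / (1.0 + x));
    }

    /** Returns the index of the largest similarity, i.e. the best-matching reference. */
    static int bestMatch(double[] similarities) {
        int best = 0;
        for (int i = 1; i < similarities.length; i++) {
            if (similarities[i] > similarities[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Hypothetical DTW distances of one test recording to three references.
        double[] distances = {3.0, 0.0, 1.0};
        double[] similarities = new double[distances.length];
        for (int i = 0; i < distances.length; i++) {
            similarities[i] = distanceToSimilarity(distances[i]);
        }
        // Similarities are 0.25, 1.0 and 0.5, so reference 1 is the best match.
        System.out.println(bestMatch(similarities)); // prints 1
    }
}
```

Because the mapping is strictly decreasing, taking the maximum similarity picks the same reference as taking the minimum distance.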

- Nickolay V. Shmyrev - 2008-10-12
  
  Share your code so I can easily test it
  

OK,
this is the DTW:
package dtw;

import java.util.ArrayList;
import edu.cmu.sphinx.tools.feature.FeatureFileDumper;

/**
 * A similarity measure based on "Dynamic Time Warping". The DTW distance is
 * mapped to a similarity measure using f(x)= 1 - (x / (1 + x)). Feature weights
 * are also supported.
 *
 */
public class DTWSimilarity {
/**
 *
 */
private static final long serialVersionUID = -8898553450277603746L;
// the optimal distance
double Distance=0.0D;
/**
 * XXX DOC
 *
 * @param i
 * @param j
 * @param ts1
 * @param ts2
 * @return
 */
private double pointDistance(int i, int j, ArrayList<double[]> ts1, ArrayList<double[]> ts2) {
    // ts1.get(i) is frame i of the first recording
    // ts2.get(j) is frame j of the second recording
    // ts1.size() is the number of frames in the first recording
    // ts2.size() is the number of frames in the second recording
    double diff = 0;
    for(int k=0;k<39;k++)
        diff=diff+((double[])ts1.get(i))[k] - ((double[])ts2.get(j))[k];
    return (diff * diff);// return the distance (Euclidean distance) between two frames
}

/**
 * XXX DOC
 * 
 * @param x
 * @return
 */
private double distance2Similarity(double x) {
    return (1.0 - (x / (1 + x)));
}

/**
 * XXX DOC
 */
public double measure(ArrayList<double[]> ts1, ArrayList<double[]> ts2) {


    int i, j;


    /** Build a point-to-point distance matrix */
    double[][] dP2P = new double[ts1.size()][ts2.size()];
    for (i = 0; i < ts1.size(); i++) {
        for (j = 0; j < ts2.size(); j++) {
            dP2P[i][j] = pointDistance(i, j, ts1, ts2);
        }
    }

    /** Check for some special cases due to ultra short time series */
    if (ts1.size() == 0 || ts2.size() == 0) {
        return Double.NaN;
    }

    if (ts1.size() == 1 && ts2.size() == 1) {
        return distance2Similarity(Math.sqrt(dP2P[0][0]));
    }

    /**
     * Build the optimal distance matrix using a dynamic programming
     * approach
     */
    double[][] D = new double[ts1.size()][ts2.size()];

    D[0][0] = dP2P[0][0]; // Starting point

    for (i = 1; i < ts1.size(); i++) { // Fill the first column of our
        // distance matrix with optimal
        // values
        D[i][0] = dP2P[i][0] + D[i - 1][0];
    }

    if (ts2.size() == 1) { // TS2 is a point
        double sum = 0;
        for (i = 0; i < ts1.size(); i++) {
            sum += D[i][0];
        }
        return distance2Similarity(Math.sqrt(sum) / ts1.size());
    }

    for (j = 1; j < ts2.size(); j++) { // Fill the first row of our
        // distance matrix with optimal
        // values
        D[0][j] = dP2P[0][j] + D[0][j - 1];
    }

    if (ts1.size() == 1) { // TS1 is a point
        double sum = 0;
        for (j = 0; j < ts2.size(); j++) {
            sum += D[0][j];
        }
        return distance2Similarity(Math.sqrt(sum) / ts2.size());
    }

    for (i = 1; i < ts1.size(); i++) { // Fill the rest
        for (j = 1; j < ts2.size(); j++) {
            double[] steps = { D[i - 1][j - 1], D[i - 1][j], D[i][j - 1] };
            double min = Math.min(steps[0], Math.min(steps[1], steps[2]));
            D[i][j] = dP2P[i][j] + min;
        }
    }

    /**
     * Calculate the distance between the two time series through optimal
     * alignment.
     */
    i = ts1.size() - 1;
    j = ts2.size() - 1;
    int k = 1;
    double dist = D[i][j];

    while (i + j > 2) {
        if (i == 0) {
            j--;
        } else if (j == 0) {
            i--;
        } else {
            double[] steps = { D[i - 1][j - 1], D[i - 1][j], D[i][j - 1] };
            double min = Math.min(steps[0], Math.min(steps[1], steps[2]));

            if (min == steps[0]) {
                i--;
                j--;
            } else if (min == steps[1]) {
                i--;
            } else if (min == steps[2]) {
                j--;
            }
        }
        k++;
        dist += D[i][j];
    }
    Distance=dist;
    return distance2Similarity(Math.sqrt(dist) / k);
}

public static void main(String args[]){
    DTWSimilarity dtws=new DTWSimilarity ();
    FeatureFileDumper ffd1=new FeatureFileDumper(cm,frontEndName,inputfile);
    FeatureFileDumper ffd2=new FeatureFileDumper(cm,frontEndName,inputfile);
    ArrayList<double[]> ts1=ffd1.allfeatres;
    ArrayList<double[]> ts2=ffd2.allfeatres;
    // I just modified your class to expose allfeatres as an ArrayList of double[] vectors
    dtws.measure(ts1,ts2);

}
}

Nickolay V. Shmyrev - 2008-10-12

Hey, reread my request. Do you think I have to execute this code on a Java machine embedded in my brain? I need a sample that is easy to run.

By the way, please stop spamming my mail.


jinio adham - 2008-10-12

Sorry, I don't understand.
What do you mean?
And sorry again about your mail.

Nickolay V. Shmyrev - 2008-10-13

> private double pointDistance(int i, int j, ArrayList<double[]> ts1, ArrayList<double[]> ts2) {
> double diff = 0;
> for(int k=0;k<39;k++)
> diff=diff+((double[])ts1.get(i))[k] - ((double[])ts2.get(j))[k];
> return (diff * diff);
> }

This function is not correct. I suggest you read up on what the Euclidean distance between vectors is.
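
To make the difference concrete, here is a minimal sketch (not taken from java-ml; the class and method names `FrameDistance`, `squaredEuclidean` and `squaredSumOfDifferences` are illustrative) that compares the sum of squared per-coefficient differences with the square of the summed differences, which is what the quoted function computes. In the second form positive and negative differences cancel, so two clearly different frames can end up at distance zero. The sketch uses the array length instead of the hard-coded 39.

```java
final class FrameDistance {

    /** Squared Euclidean distance: sum over k of (a[k] - b[k])^2. */
    static double squaredEuclidean(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("frames must have the same dimension");
        }
        double sum = 0.0;
        for (int k = 0; k < a.length; k++) {
            double d = a[k] - b[k];
            sum += d * d;
        }
        return sum;
    }

    /** What the quoted function computes: (sum over k of (a[k] - b[k]))^2. */
    static double squaredSumOfDifferences(double[] a, double[] b) {
        double diff = 0.0;
        for (int k = 0; k < a.length; k++) {
            diff += a[k] - b[k];
        }
        return diff * diff;
    }

    public static void main(String[] args) {
        // Two toy "frames" whose differences are +2 and -2.
        double[] a = {1.0, -1.0};
        double[] b = {-1.0, 1.0};
        System.out.println(squaredEuclidean(a, b));        // prints 8.0
        System.out.println(squaredSumOfDifferences(a, b)); // prints 0.0
    }
}
```

Taking `Math.sqrt` of the first value gives the Euclidean distance itself; for ranking matches the squared form is often used directly, since it preserves the ordering.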

> What do you mean?

I mean that in order to help you I need to be able to run your program. I don't need the big samples of incomplete code you're providing. You can always pack your program into an archive, upload it to a public resource, and give me a link so I can try it.

If you need help, provide working samples I can run in a minute without any additional steps. Otherwise it will be impossible to suggest anything.

jinio adham - 2008-10-13

Thank you, Nickolay.
You're right. I had guessed it before your reply (for the Euclidean distance it's dif = dif + (a-b)*(a-b); return dif;), and I think I now have good results. I don't want to disturb you any more, but can anyone help me with my other question (MFCC+HMM)?
I would be very grateful.
Thank you very much

Nickolay V. Shmyrev - 2008-10-10

> must apply DTW for each frame or for all features extracted?

DTW aligns frames. You calculate the distance between source and target frames to compare them, and then try to build a mapping from the source frames to the target ones.
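
A minimal sketch of that idea, assuming each recording is a list of equal-length feature frames and using the squared Euclidean distance between frames as the local cost. The class and method names (`FrameDtw`, `frameDistance`, `cumulativeCost`, `alignmentPath`) are illustrative and not part of Sphinx or java-ml, and the one-dimensional frames in `main` are toy data rather than real MFCCs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class FrameDtw {

    /** Squared Euclidean distance between two feature frames of equal length. */
    static double frameDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int k = 0; k < a.length; k++) {
            double d = a[k] - b[k];
            sum += d * d;
        }
        return sum;
    }

    /** cost[i][j] is the cheapest cost of aligning source[0..i] with target[0..j]. */
    static double[][] cumulativeCost(List<double[]> source, List<double[]> target) {
        int n = source.size();
        int m = target.size();
        double[][] cost = new double[n][m];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < m; j++) {
                double best;
                if (i == 0 && j == 0) {
                    best = 0.0;
                } else if (i == 0) {
                    best = cost[0][j - 1];
                } else if (j == 0) {
                    best = cost[i - 1][0];
                } else {
                    best = Math.min(cost[i - 1][j - 1],
                            Math.min(cost[i - 1][j], cost[i][j - 1]));
                }
                cost[i][j] = frameDistance(source.get(i), target.get(j)) + best;
            }
        }
        return cost;
    }

    /** Backtracks from the last cell to (0, 0); returns {sourceIndex, targetIndex} pairs in time order. */
    static List<int[]> alignmentPath(double[][] cost) {
        int i = cost.length - 1;
        int j = cost[0].length - 1;
        List<int[]> path = new ArrayList<>();
        path.add(new int[] {i, j});
        while (i > 0 || j > 0) {
            if (i == 0) {
                j--;
            } else if (j == 0) {
                i--;
            } else {
                double diag = cost[i - 1][j - 1];
                double up = cost[i - 1][j];
                double left = cost[i][j - 1];
                if (diag <= up && diag <= left) {
                    i--;
                    j--;
                } else if (up <= left) {
                    i--;
                } else {
                    j--;
                }
            }
            path.add(new int[] {i, j});
        }
        Collections.reverse(path);
        return path;
    }

    public static void main(String[] args) {
        // The target repeats the middle frame, as if that sound were spoken more slowly.
        List<double[]> source = Arrays.asList(new double[] {0}, new double[] {1}, new double[] {2});
        List<double[]> target = Arrays.asList(
                new double[] {0}, new double[] {1}, new double[] {1}, new double[] {2});
        double[][] cost = cumulativeCost(source, target);
        System.out.println(cost[2][3]); // total alignment cost, prints 0.0
        for (int[] step : alignmentPath(cost)) {
            System.out.println(step[0] + " -> " + step[1]);
        }
    }
}
```

Each pair in the returned path is {source frame index, target frame index}. In this toy input the second source frame is mapped to both the second and third target frames, which is the time warping.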

> What I want to say is that it's just about comparing the content of two audio files: if I get features from the first one and features from the second one, I can decide whether the two files are similar or not. My files contain conversation, of course.

Can you explain more? Do the files contain the same recordings, or do they contain the same text spoken by different speakers?

Nickolay V. Shmyrev - 2008-10-10

By the way, for a DTW implementation I suggest you take an existing package like java-ml:

http://java-ml.sourceforge.net/

So you can use Sphinx to extract features and java-ml to compare files.

jinio adham - 2008-10-10

Hi Nickolay,
Thank you so much for your reply.
In fact I'm studying the DTW in java-ml, because I have already used this framework for another project.

> Can you explain more? Do files have the same recordings or do they have the same text spoken by different speakers?

Yes, the files contain the same recordings but under different conditions: in the reference file the volume is very good, but in the test file the volume is low.

- Nickolay V. Shmyrev - 2008-10-10
  
  OK, you can take the existing DTW.java, but you have to modify it to work with vectors instead of doubles, so:
  
  private double pointDistance(int i, int j, double[] ts1, double[] ts2) {
      double diff = ts1[i] - ts2[j];
      return (diff * diff);
  }
  
  will become:
  
  private double pointDistance(int i, int j, double[][] ts1, double[][] ts2) {
      double diff = 0;
      for (int k = 0; k < 39; k++) {
          diff = diff + (ts1[k][i] - ts2[k][j]) * (ts1[k][i] - ts2[k][j]);
      }
      return diff;
  }
  
  So instead of doubles you align MFCC vectors.
  

Hi all,
Sorry, it's me again,
but I need help with this point.
Here is the code for DTWSimilarity.java, modified from java-ml.
I haven't tested whether it works yet, but I'd like your opinion on whether it's OK. If I need to correct something, please give me some details, because I'm having problems with this point.
Thank you so much:
/*
* This file is part of the Java Machine Learning Library
*
* The Java Machine Learning Library is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* The Java Machine Learning Library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with the Java Machine Learning Library; if not, write to the Free Software
* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
*
* Copyright (c) 2006-2008, Thomas Abeel
*
* Project: http://java-ml.sourceforge.net/
*
*/
package dtw;

import java.util.ArrayList;

/**
* A similarity measure based on "Dynamic Time Warping". The DTW distance is
* mapped to a similarity measure using f(x)= 1 - (x / (1 + x)). Feature weights
* are also supported.
*
* @author Piotr Kasprzak
* @author Thomas Abeel
*
*/
public class DTWSimilarity {
/**
 *
 */
private static final long serialVersionUID = -8898553450277603746L;

/**
 * XXX DOC
 * 
 * @param i
 * @param j
 * @param ts1
 * @param ts2
 * @return
 */
private double pointDistance(int i, int j, ArrayList<double[]> ts1, ArrayList<double[]> ts2) {
    // ts1.get(i) is frame i of the first recording
    // ts2.get(j) is frame j of the second recording
    // ts1.size() is the number of frames in the first recording
    // ts2.size() is the number of frames in the second recording
    double diff = 0;
    for(int k=0;k<39;k++)
        diff=diff+((double[])ts1.get(i))[k] - ((double[])ts2.get(j))[k];
    return (diff * diff);// return the distance (Euclidean distance) between two frames
}

/**
 * XXX DOC
 * 
 * @param x
 * @return
 */
private double distance2Similarity(double x) {
    return (1.0 - (x / (1 + x)));
}

/**
 * XXX DOC
 */
public double measure(ArrayList<double[]> ts1, ArrayList<double[]> ts2) {


    int i, j;


    /** Build a point-to-point distance matrix */
    double[][] dP2P = new double[ts1.size()][ts2.size()];
    for (i = 0; i < ts1.size(); i++) {
        for (j = 0; j < ts2.size(); j++) {
            dP2P[i][j] = pointDistance(i, j, ts1, ts2);
        }
    }

    /** Check for some special cases due to ultra short time series */
    if (ts1.size() == 0 || ts2.size() == 0) {
        return Double.NaN;
    }

    if (ts1.size() == 1 && ts2.size() == 1) {
        return distance2Similarity(Math.sqrt(dP2P[0][0]));
    }

    /**
     * Build the optimal distance matrix using a dynamic programming
     * approach
     */
    double[][] D = new double[ts1.size()][ts2.size()];

    D[0][0] = dP2P[0][0]; // Starting point

    for (i = 1; i < ts1.size(); i++) { // Fill the first column of our
        // distance matrix with optimal
        // values
        D[i][0] = dP2P[i][0] + D[i - 1][0];
    }

    if (ts2.size() == 1) { // TS2 is a point
        double sum = 0;
        for (i = 0; i < ts1.size(); i++) {
            sum += D[i][0];
        }
        return distance2Similarity(Math.sqrt(sum) / ts1.size());
    }

    for (j = 1; j < ts2.size(); j++) { // Fill the first row of our
        // distance matrix with optimal
        // values
        D[0][j] = dP2P[0][j] + D[0][j - 1];
    }

    if (ts1.size() == 1) { // TS1 is a point
        double sum = 0;
        for (j = 0; j < ts2.size(); j++) {
            sum += D[0][j];
        }
        return distance2Similarity(Math.sqrt(sum) / ts2.size());
    }

    for (i = 1; i < ts1.size(); i++) { // Fill the rest
        for (j = 1; j < ts2.size(); j++) {
            double[] steps = { D[i - 1][j - 1], D[i - 1][j], D[i][j - 1] };
            double min = Math.min(steps[0], Math.min(steps[1], steps[2]));
            D[i][j] = dP2P[i][j] + min;
        }
    }

    /**
     * Calculate the distance between the two time series through optimal
     * alignment.
     */
    i = ts1.size() - 1;
    j = ts2.size() - 1;
    int k = 1;
    double dist = D[i][j];

    while (i + j > 2) {
        if (i == 0) {
            j--;
        } else if (j == 0) {
            i--;
        } else {
            double[] steps = { D[i - 1][j - 1], D[i - 1][j], D[i][j - 1] };
            double min = Math.min(steps[0], Math.min(steps[1], steps[2]));

            if (min == steps[0]) {
                i--;
                j--;
            } else if (min == steps[1]) {
                i--;
            } else if (min == steps[2]) {
                j--;
            }
        }
        k++;
        dist += D[i][j];
    }

    return distance2Similarity(Math.sqrt(dist) / k); // what is the result here?
    // is it the score that I can compare to decide on the result?
}

/**
 * XXX doc
 */
/*public double getMaximumDistance(Dataset data) {
    // TODO implement
    throw new UnsupportedOperationException("Not yet implemented");
}*/

/**
 * XXX doc
 */
/*public double getMinimumDistance(Dataset data) {
    // TODO implement
    throw new UnsupportedOperationException("Not yet implemented");
}

*/
}
