aparat-users Mailing List for TKK Aparat
Project moved to http://research.spa.aalto.fi/projects/aparat/
Status: Beta
Brought to you by:
mairas
2005 | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct (5) | Nov | Dec |
---|---|---|---|---|---|---|---|---|---|---|---|---|
2008 | Jan | Feb | Mar | Apr (2) | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec |
2010 | Jan | Feb | Mar | Apr | May | Jun | Jul (1) | Aug | Sep | Oct | Nov (1) | Dec |
2012 | Jan | Feb | Mar | Apr | May (2) | Jun | Jul | Aug | Sep | Oct | Nov | Dec |
From: Matti A. <ma...@ik...> - 2012-05-27 18:06:31
|
Dear Philipp,

(Sorry for the slight delay in my reply!)

All of Aparat's functionality is indeed fairly easily available on the Matlab command line as well. I see that the website at http://aparat.sf.net is down at the moment - it would have had some tutorial documentation on the matter, if I remember correctly. Meanwhile, you might just want to browse the contents of the Aparat source package and see the inline documentation for each function. For example, IAIF contains the actual inverse filtering functionality.

Aparat quite heavily utilizes the Matsig signal processing class library for Matlab, so most functions return (or even expect as input) Matsig signal objects. Matsig has the following tutorial online:

http://matsig.sourceforge.net/doc/Tutorial.html

I hope this is of some use; I don't have access to Matlab at the moment and it's been 4 years since I last worked with Aparat.

Best regards,
Matti Airas

On 22 May 2012 15:35, Philipp Aichinger <phi...@me...> wrote:
> dear aparat users,
>
> I am currently working with aparat in its gui mode. Can you tell me what
> the best way is to call the tool from the command line?
> is there a function that starts the inverse filtering, with the
> possibility to define all filtering options?
>
> thank you,
> philipp aichinger
 |
From: Philipp A. <phi...@me...> - 2012-05-22 12:54:44
|
dear aparat users,

I am currently working with aparat in its gui mode. Can you tell me what the best way is to call the tool from the command line? is there a function that starts the inverse filtering, with the possibility to define all filtering options?

thank you,
philipp aichinger

--
DI Philipp Aichinger
Medizinische Universität Wien
Universitätsklinik für Hals-, Nasen- und Ohrenkrankheiten
Klinische Abteilung Phoniatrie-Logopädie
Währinger Gürtel 18-20
A-1090 Wien
Tel. +43 1 40400 1167
Mobil +43 699 12292869
 |
From: Joseph H. <jha...@ya...> - 2010-11-16 18:01:11
|
You have a great tool for voice filtering & parameterization and I like how it works, but I have questions on its use.

I understand that your solver function glottal_lf.m uses the boundary conditions in order to solve for the LF model parameters of the glottal derivative estimate, but I am curious why you have the following statement at the end of the fitlf.m function:

    if len(lfg)~=45
        d=1e5*ones(1,len(lfg));
    else
        d=g_per.s-lfg.s;
    end

Is the comparison to 45 measurements a significant part of the program? I assume that as long as g_per.s and lfg.s are the same length, doing the subtraction would be OK. How do you find out whether T1 and T2 in the glottal_lf.m program are making the lengths of its output arguments accurate?

Also, a question on the use of lsqnonlin. I was trying to replace these calls with fminbnd and fminsearch since I do not have the Optimization Toolbox. Have you tried other Matlab optimizers to solve for the LF parameters? I was looking at the possibility of using levmar.m, a freeware version of the lsqnonlin method, but thought that I'd ask about the accuracy of lsqnonlin versus other optimization methods.

Thanks in advance,
Joe
 |
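Editorial sketch of the fminsearch route mentioned above: lsqnonlin takes a function returning a residual vector and minimizes its sum of squares, whereas fminsearch needs that scalar cost written out explicitly. The model below is a toy decaying exponential, not the LF model, and every name in it (`residual`, `cost`, `p_true`, `p0`) is an assumption made for illustration; it says nothing about how fitlf.m is actually structured or how accurate either optimizer is on LF fitting.

```matlab
% Toy data: y = a*exp(-b*t). This stands in for the glottal pulse being
% fitted; it is NOT the LF model.
t      = linspace(0, 1, 50);
p_true = [2.0, 3.0];                       % [a, b] used to generate the data
y      = p_true(1) * exp(-p_true(2) * t);

% Residual vector, i.e. what one would hand to lsqnonlin.
residual = @(p) p(1) * exp(-p(2) * t) - y;

% fminsearch minimizes a scalar, so wrap the residual in a sum of squares.
cost = @(p) sum(residual(p).^2);

% Tight tolerances; fminsearch (Nelder-Mead) uses no derivatives or bounds.
opts  = optimset('TolX', 1e-10, 'TolFun', 1e-12, ...
                 'MaxFunEvals', 5000, 'MaxIter', 5000);
p0    = [1.0, 1.0];                        % initial guess
p_hat = fminsearch(cost, p0, opts);
```

Because fminsearch is unconstrained, any bounds that were passed to lsqnonlin would have to be enforced separately, for example by adding a penalty to `cost` when a parameter leaves its allowed range.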
From: Mustafa C. O. <orh...@gm...> - 2010-07-01 12:58:23
|
Hi,

I am new to Aparat. I am now going through the tutorial, but when I try to download the samples from the page http://aparat.sourceforge.net/images/0/01/Aparat-tutorial-samples.zip I get an error page saying "An error has been encountered in accessing this page." The full error message follows.

Regards,
Cem

An error has been encountered in accessing this page.

1. Server: aparat.sourceforge.net
2. URL path: /images/0/01/Aparat-tutorial-samples.zip
3. Error notes: NONE
4. Error type: 403
5. Request method: GET
6. Request query string: NONE
7. Time: 2010-07-01 10:57:08 UTC (1277981828)

Reporting this problem: The problem you have encountered is with a project web site hosted by SourceForge.net. This issue should be reported to the SourceForge.net-hosted project (not to SourceForge.net).

If this is a severe or recurring/persistent problem, please do one of the following, and provide the error text (numbered 1 through 7, above):

- Contact the project via their designated support resources.
- Contact the project administrators of this project via email (see the upper right-hand corner of the Project Summary page for their usernames) at use...@us...

If you are a maintainer of this web content, please refer to the Site Documentation regarding web services for further assistance.

NOTE: As of 2008-10-23 directory index display has been disabled by default. This option may be re-enabled by the project by placing a file with the name ".htaccess" with this line:

    Options +Indexes
 |
From: Matti A. <ma...@ik...> - 2008-04-02 14:15:03
|
Paul Modart wrote:
> Can I extract the results of the IAIF algorithm and save them in a file
> in order to use them in another work? And especially how can I save
> them? More precisely, I would like to compare at least two different
> speeches with their spectra.

Hi Paul,

Are you using the standalone version or the Matlab version? The standalone version does not currently offer facilities for extracting the signals, while in the Matlab version it is fairly simple to do.

All data in Aparat are saved in the Matlab workspace in the aprt variable. The spectra are not explicitly saved, but you can easily recreate them:

    plot(half1(db(fft(win(aprt.cut_glotflow,@hamming)))))

So, just save the data of each file in another variable and then plot them simultaneously, and that's it.

ma.
 |
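Editorial sketch of the same comparison in plain Matlab, for readers without Matsig at hand. It assumes the two glottal flow estimates have already been copied out as ordinary numeric column vectors; here `x1` and `x2` are synthetic placeholders, and the sampling rate, frame length and the helper name `spec_db` are likewise assumptions. The Hamming window is written out so that no toolbox is needed.

```matlab
fs = 8000;                                   % assumed sampling rate (Hz)
n  = 512;                                    % frame length (samples)
t  = (0:n-1)' / fs;

% Placeholder signals standing in for two extracted glottal flow frames.
x1 = sin(2*pi*100*t) + 0.5*sin(2*pi*200*t);
x2 = sin(2*pi*100*t) + 0.1*sin(2*pi*200*t);

% Hamming window computed directly from its definition.
w = 0.54 - 0.46*cos(2*pi*(0:n-1)'/(n-1));

% Windowed magnitude spectrum in dB (eps avoids log of zero).
spec_db = @(x) 20*log10(abs(fft(x .* w)) + eps);

% Keep only the non-negative frequencies (first half of the FFT).
S1 = spec_db(x1);  S1 = S1(1:n/2+1);
S2 = spec_db(x2);  S2 = S2(1:n/2+1);
f  = (0:n/2)' * fs / n;                      % frequency axis (Hz)
```

With these in the workspace, `plot(f, S1, f, S2)` overlays the two spectra, and `save('spectra.mat', 'f', 'S1', 'S2')` writes them to a file for later use.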
From: Paul M. <pau...@gm...> - 2008-04-02 13:09:29
|
Hi,

I'm using Aparat (version of around March 2008) in Matlab 7.0. Can I extract the results of the IAIF algorithm and save them in a file in order to use them in another work? And especially how can I save them? More precisely, I would like to compare at least two different speeches with their spectra.

Thanks a lot
Paul
 |
From: Dinoj S. <di...@gm...> - 2005-10-30 09:56:50
|
> OK. If you get any interesting error printouts, please report them so
> that we can get those fixed!

It's been a while (apologies), but here are some bug reports - this is from Version 0.1.2 (released 2005-10-149). (I didn't know October had 149 days.)

    EDU>> addpath C:\doc'uments and Settings\'dinoj\M'y Documents\'aparat-0.1.2\
    EDU>> addpath C:\doc'uments and Settings\'dinoj\desktop\matsig-0.2.2\
    EDU>> aparat
    ??? Error using ==> load
    Unable to read MAT-file C:\documents and Settings\dinoj\My Documents\aparat-0.1.2\aparat.fig
    File may be corrupt.

    Error in ==> hgload at 44
    fileVars = load(filename,'-mat');

    Error in ==> openfig at 84
    [fig, oldvis] = hgload(filename, struct('Visible','off'));

    Error in ==> gui_mainfcn>local_openfig at 206
    gui_hFigure = openfig(name, singleton);

    Error in ==> gui_mainfcn at 94
    gui_hFigure = local_openfig(gui_State.gui_Name, gui_SingletonOpt);

    Error in ==> aparat at 42
    gui_mainfcn(gui_State, varargin{:});
    ,..

Note that the CVS entry works perfectly. I just checked it out of CVS and tested it again to be sure.

Dinoj
 |
From: Matti A. <ma...@ik...> - 2005-10-19 11:52:13
|
On ti, 2005-10-18 at 15:25 -0500, Dinoj Surendran wrote:
> Thanks for a very informative response! I've modified measvq.m in
> accordance with your suggestions, and placed it at
> http://people.cs.uchicago.edu/~dinoj/matlab/measvq_18oct05.m
>
> I also added an option to have multiple fixed window lengths. Plots
> made with the above script using window sizes of 32, 64 and 128 ms can
> be found here. The first three were computed on the same wav files as
> those in my previous post.
>
> http://people.cs.uchicago.edu/~dinoj/vq/breathy_oct18.jpg
> http://people.cs.uchicago.edu/~dinoj/vq/near_modal_oct18.jpg
> http://people.cs.uchicago.edu/~dinoj/vq/pressed_oct18.jpg
> http://people.cs.uchicago.edu/~dinoj/vq/heavy_creaky_oct18.jpg
>
> > What do you use the time instants for? Generally speaking, reliable
> > assessment of the opening time instants is rather difficult, so if you
> > want stable time instants within the periods, better use t_max or t_c
> > (or maybe even t_dmin).
>
> The time instants are required to produce a NAQ contour (like a pitch
> contour) -- and I'll use t_max now.
>
> > I wouldn't use auto_iaif, as it gives quite unpredictable results
> > sometimes. I'd rather just use some sensible defaults, like
> > r=floor(fs/1000) and rho=0.995 and see if they work.
>
> Cool, done that - and things do look better now, particularly the
> error values. How much should p be set to? Right now I've set it to be
> the same as r.

I've never really experimented with p values different from r, so it should be OK to set it to be the same as r. :-)

> That's a good warning. Would the error values reflect this? In any
> case, I can survive this as I'm working on tone recognition in
> Mandarin, and only use the vocalic parts of syllables. Having said
> that, this is often nasalized, especially in syllables ending with a
> nasal, so I'll just have to continue with it and see what happens.

Probably the error value would show it, but I don't have much insight into that, so I could be wrong.

> Would NAQ and OQa be, at least in theory, more robust than OQ, SQ,
> ClQ, etc? Should I look at any glottal frequency parameters instead?

Yes, the amplitude-domain parameters should be somewhat more robust than the time-based parameters, since it is much easier to pick the maximum and minimum amplitudes (and derivatives thereof) than to try to deduce the exact closing (or especially opening) instant of the pulse.

The frequency-domain parameters might be interesting as well. At least H1-H2 should be so simple that it would be very insensitive to imperfect inverse filtering, at least for vowels in which the first formant is relatively high. The PSP might otherwise give more interesting results, but it could be more prone to IF defects.

> Yes, good idea - I was going to do this as a postprocessing step once
> I was sure the rest of it was reasonable... OTOH, I don't see how
> using the signal object's trim function would help.

Um, no, not in the median calculation, that's true. It just reduces the manual bookkeeping of the time instants, that's all.

> Last time I checked, it was the version of aparat (1.1) on the web
> that had those problems, while the CVS entry didn't. Having said that,
> it's got other problems that are reported to the matlab command window
> when the GUI gets something it doesn't like, but those are solved by
> restarting aparat. I should check this again, though.

OK. If you get any interesting error printouts, please report them so that we can get those fixed!

Cheers,
m.
 |
From: Dinoj S. <di...@gm...> - 2005-10-18 20:25:23
|
Thanks for a very informative response! I've modified measvq.m in accordance with your suggestions, and placed it at

http://people.cs.uchicago.edu/~dinoj/matlab/measvq_18oct05.m

I also added an option to have multiple fixed window lengths. Plots made with the above script using window sizes of 32, 64 and 128 ms can be found here. The first three were computed on the same wav files as those in my previous post.

http://people.cs.uchicago.edu/~dinoj/vq/breathy_oct18.jpg
http://people.cs.uchicago.edu/~dinoj/vq/near_modal_oct18.jpg
http://people.cs.uchicago.edu/~dinoj/vq/pressed_oct18.jpg
http://people.cs.uchicago.edu/~dinoj/vq/heavy_creaky_oct18.jpg

> What do you use the time instants for? Generally speaking, reliable
> assessment of the opening time instants is rather difficult, so if you
> want stable time instants within the periods, better use t_max or t_c
> (or maybe even t_dmin).

The time instants are required to produce a NAQ contour (like a pitch contour) -- and I'll use t_max now.

> I wouldn't use auto_iaif, as it gives quite unpredictable results
> sometimes. I'd rather just use some sensible defaults, like
> r=floor(fs/1000) and rho=0.995 and see if they work.

Cool, done that - and things do look better now, particularly the error values. How much should p be set to? Right now I've set it to be the same as r.

> You might want to try to avoid IFing any part of the /m/ sound. Inverse
> filtering (well, at least if it uses an all-pole model of the vocal
> tract) doesn't work for nasals, as the nasal coupling introduces zeros
> in the transfer function. Having said that, the NAQ parameter should be
> quite robust, so that even if the inverse filtering results aren't too
> optimal, you'll still get quite good results. The time-based parameters
> (OQ, SQ, ClQ and such) will probably fail horribly in such situations,
> though.

That's a good warning. Would the error values reflect this? In any case, I can survive this as I'm working on tone recognition in Mandarin, and only use the vocalic parts of syllables. Having said that, this is often nasalized, especially in syllables ending with a nasal, so I'll just have to continue with it and see what happens.

Would NAQ and OQa be, at least in theory, more robust than OQ, SQ, ClQ, etc? Should I look at any glottal frequency parameters instead?

> A quick note about the sprintf-eval pattern you use: you might be able
> to substitute
>
> cmd=sprintf('N=length(times.%s);',meas{j});
> eval(cmd);
>
> with
>
> N=length(times.(meas{j}));

Great! I didn't know one could do this. I've modified my code.

> A further note about the method for looping through the signal - if you
> want to do it in Matsig domain, you can just take the whole signal (say,
> s1) and loop through it:
...
> s = trim(s1,tcur,tcur+winsize);

Are tcur and winsize here in seconds?

> This method would have the advantage that any time instants you get
> during the processing are relative to the original signal instead of the
> single window, which might make any further processing somewhat easier.
> For example, you could try to get the median value of the multiple
> parameter values calculated at a single time instant (due to overlapping
> windows) to increase the robustness of the measurement.

Yes, good idea - I was going to do this as a postprocessing step once I was sure the rest of it was reasonable... OTOH, I don't see how using the signal object's trim function would help.

> BTW, Dinoj, I just began to try to find out the problem you reported
> last week regarding the .fig files not loading in different versions of
> Matlab. I couldn't quite yet find the cause for it, but I'll try to test
> Aparat on some Windows machines in a hope to understand the cause for
> it.

Last time I checked, it was the version of aparat (1.1) on the web that had those problems, while the CVS entry didn't. Having said that, it's got other problems that are reported to the matlab command window when the GUI gets something it doesn't like, but those are solved by restarting aparat. I should check this again, though.

Dinoj
 |
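Editorial sketch of the dynamic field name substitution discussed in this message, run on made-up data. The struct `times` and the list `meas` reuse the names from the thread, but their contents here are placeholders, and `N_eval` and `N_dyn` are names introduced only for this comparison.

```matlab
% Placeholder struct: one vector of time instants per measure.
times = struct('t_max', [0.010 0.020 0.030], 't_c', [0.012 0.022]);
meas  = {'t_max', 't_c'};

N_eval = zeros(1, numel(meas));
N_dyn  = zeros(1, numel(meas));

for j = 1:numel(meas)
    % Old pattern: build a command string and evaluate it.
    cmd = sprintf('N = length(times.%s);', meas{j});
    eval(cmd);
    N_eval(j) = N;

    % Replacement: index the struct with a dynamic field name.
    N_dyn(j) = length(times.(meas{j}));
end
```

Both loops give the same counts; the dynamic field form avoids building and parsing a string on every iteration.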
From: Matti A. <ma...@ik...> - 2005-10-18 07:31:17
|
On ma, 2005-10-17 at 22:08 -0500, Dinoj Surendran wrote:
> Question 1 : I have assumed that the ideal case is for each glottal
> period to be found in all pieces that fully contain it. Is this
> correct?

I think so, if I got your question right. :-)

> Question 2: Would it be better to use t_max instead of t_po?

What do you use the time instants for? Generally speaking, reliable assessment of the opening time instants is rather difficult, so if you want stable time instants within the periods, better use t_max or t_c (or maybe even t_dmin).

> Question 3: How should one calibrate this? At the moment I do so by
> calling auto_iaif across the entire segment of speech v, and I'm not
> sure if this is right, especially if v is long.

I wouldn't use auto_iaif, as it gives quite unpredictable results sometimes. I'd rather just use some sensible defaults, like r=floor(fs/1000) and rho=0.995 and see if they work. You might at least want to periodically plot the inverse filtered window to see that things aren't totally haywire... :-)

> If the assumption in 1 is correct, then one can assume that glottal
> periods that are found with similar NAQ values many times are more
> likely to actually exist and be measured correctly. I'm not sure what
> the voice quality is for sections of speech where no glottal period
> can be found - is NAQ there very large, or can one just interpolate
> between the nearest available NAQ values?

Interpolation might give the most sensible results, if you need to attach some value to those sections as well.

> Examples plots of T and V, for samples of breathy, modal, and pressed
> voice saying 'ma' are given here (with both NAQ and OQa plots). They
> were done with window size 32ms stepped every 4ms.

You might want to try to avoid IFing any part of the /m/ sound. Inverse filtering (well, at least if it uses an all-pole model of the vocal tract) doesn't work for nasals, as the nasal coupling introduces zeros in the transfer function. Having said that, the NAQ parameter should be quite robust, so that even if the inverse filtering results aren't too optimal, you'll still get quite good results. The time-based parameters (OQ, SQ, ClQ and such) will probably fail horribly in such situations, though.

A quick note about the sprintf-eval pattern you use: you might be able to substitute

    cmd=sprintf('N=length(times.%s);',meas{j});
    eval(cmd);

with

    N=length(times.(meas{j}));

which should be more robust and easier to read as well.

A further note about the method for looping through the signal - if you want to do it in Matsig domain, you can just take the whole signal (say, s1) and loop through it:

    tcur = s1.time.begin;
    te = s1.time.end;
    while tcur+winsize<te
        s = trim(s1,tcur,tcur+winsize);
        PROCESS HERE
        tcur = tcur+winstep;
    end

This method would have the advantage that any time instants you get during the processing are relative to the original signal instead of the single window, which might make any further processing somewhat easier. For example, you could try to get the median value of the multiple parameter values calculated at a single time instant (due to overlapping windows) to increase the robustness of the measurement.

OTOH, if you want to keep your stuff as close to pure Matlab code as possible, I'd have no problem with that. :-)

BTW, Dinoj, I just began to try to find out the problem you reported last week regarding the .fig files not loading in different versions of Matlab. I couldn't quite yet find the cause for it, but I'll try to test Aparat on some Windows machines in the hope of understanding the cause.

Cheers,
m.
 |
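Editorial sketch of the median-over-overlapping-windows idea, in plain Matlab with no Matsig or Aparat calls. Everything measured here is simulated: `instants` are hypothetical period time instants on the absolute time axis, `true_val` is a made-up parameter contour, and each window "measures" the instants it contains with added noise. Only the grouping step at the end is the point of the example.

```matlab
winsize = 0.032;                       % window length (s)
winstep = 0.004;                       % window step (s)
sig_end = 1.0;                         % assumed signal duration (s)

% Hypothetical period instants and a made-up parameter value at each.
instants = (0:99) * 0.010 + 0.005;
true_val = 0.10 + 0.05 * sin(2*pi*instants);

T = [];  V = [];
tcur = 0;
while tcur + winsize <= sig_end
    % Instants falling inside the current window (absolute time).
    in  = find(instants >= tcur & instants < tcur + winsize);
    % PROCESS HERE: stand-in for a per-window measurement with some error.
    est = true_val(in) + 0.01 * randn(size(in));
    T = [T, instants(in)];
    V = [V, est];
    tcur = tcur + winstep;
end

% Collapse repeated measurements of the same instant to their median.
[Tu, first_idx, grp] = unique(T);
Vmed = accumarray(grp(:), V(:), [], @median)';
```

Because the time stamps are kept on the absolute axis, repeated estimates of one instant share an identical `T` value and can be grouped directly with `unique`.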
From: Dinoj S. <di...@gm...> - 2005-10-18 03:14:47
|
Hi -

I'm using Aparat (CVS version of around 15 Oct 2005) in Matlab 7 and matsig 0.2.2 on Windows. I've written a wrapper Matlab script that uses Aparat to measure voice quality. I'm not sure if it's correct, though, and would appreciate some comments. The script is at

http://people.cs.uchicago.edu/~dinoj/matlab/measvq_17oct05.m

if you want to try it yourself.

Here's what the script does for measuring, say, NAQ. It takes as input a sound vector v, the sample rate r, window size w (default 0.02 sec) and window step p (default 0.005 sec) and outputs vectors V and T of the same length such that the NAQ at time T(i) is V(i). There could be more than one value of NAQ recorded at the same time.

What it does is chop the input signal into pieces of fixed length w, find all glottal periods in each piece, and then join them together. Here's quasipseudocode:

    for each piece
        offset = time at start of piece
        g = glottaltimeperiods(iaif(piece));
        for each entry in g
            t = offset + time of entry   (I took this to be t_po)
            v = NAQ in entry
            V = [V v]
            T = [T t]
        end
    end

Question 1: I have assumed that the ideal case is for each glottal period to be found in all pieces that fully contain it. Is this correct?

Question 2: Would it be better to use t_max instead of t_po?

Question 3: How should one calibrate this? At the moment I do so by calling auto_iaif across the entire segment of speech v, and I'm not sure if this is right, especially if v is long.

If the assumption in 1 is correct, then one can assume that glottal periods that are found with similar NAQ values many times are more likely to actually exist and be measured correctly. I'm not sure what the voice quality is for sections of speech where no glottal period can be found - is NAQ there very large, or can one just interpolate between the nearest available NAQ values?

Example plots of T and V, for samples of breathy, modal, and pressed voice saying 'ma' are given here (with both NAQ and OQa plots). They were done with window size 32ms stepped every 4ms.

http://people.cs.uchicago.edu/~dinoj/vq/breathy_vq.jpg
http://people.cs.uchicago.edu/~dinoj/vq/near_modal_vq.jpg
http://people.cs.uchicago.edu/~dinoj/vq/pressed_vq.jpg

Comments appreciated, especially if I'm totally wrong.

Thanks!
Dinoj Surendran
 |
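Editorial sketch of the interpolation option raised in this message: filling a gap in a contour where no glottal period was found by interpolating linearly between the nearest available values. The numbers are placeholders rather than measured NAQ values, and times are given in whole milliseconds only to keep the arithmetic exact; whether linear interpolation is appropriate for a given recording is a judgment call the sketch does not settle.

```matlab
% Instants (ms) at which a period was found, and placeholder NAQ values.
T = [10 20 30 60 70];
V = [0.12 0.13 0.12 0.15 0.14];

% Regular 10 ms grid covering the same span, including the 40-50 ms gap.
tq = 10:10:70;

% Linear interpolation; query points outside [T(1), T(end)] would be NaN.
Vq = interp1(T, V, tq, 'linear');
```

Here `Vq` reproduces `V` at the measured instants and fills 40 ms and 50 ms with values on the straight line between the 30 ms and 60 ms measurements.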