[Crf-users] Urgent problem about the input data structure
From: Anan Liu - 2010-11-27 14:53:45

```
Hi all,

I am a beginner with CRF and semi-CRF, and I am attracted to the semi-CRF model. I downloaded the CRF package from http://crf.sourceforge.net/ and can currently run the "segment.java" demo. I have the following questions:

Question 1: My understanding of the train/test data format is as follows. X denotes the observation sequence and Y denotes the labels. In "us50.train.tagged", each word corresponds to Xi, one node of the sequence X = {X1, X2, ..., Xn}, and the number after "|" is its label. So each Xi in that input has only one dimension. In my problem, however, each Xi is a vector. Assuming each Xi has m dimensions and X contains n such Xi, my input is a matrix of size m*n. I am confused about how to feed train/test data of this form into the semi-CRF model.

Question 2: When I run segment.java with the model type changed to 'semi-markov', training stops at iteration 18, and the resulting model performs much worse than the one trained with the CRF model. Can someone show me how to get a well-performing semi-CRF?

Thanks a lot!
leo
```
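To make the data shape in Question 1 concrete, here is a minimal sketch of a sequence whose observations are vectors: n positions, each with an m-dimensional Xi and one integer label Yi. The class `VectorSequence` and its methods are hypothetical names chosen for illustration and are not taken from the CRF package. The sketch only shows the layout of the data, not how the package would consume it.

```java
import java.util.Arrays;

/** Hypothetical container: one sequence of n positions, each an m-dimensional observation. */
final class VectorSequence {
    private final double[][] observations; // observations[i] is the vector X_i (length m)
    private final int[] labels;            // labels[i] is the label Y_i

    VectorSequence(double[][] observations, int[] labels) {
        if (observations.length != labels.length) {
            throw new IllegalArgumentException("need exactly one label per position");
        }
        int m = observations.length == 0 ? 0 : observations[0].length;
        for (double[] row : observations) {
            if (row.length != m) {
                throw new IllegalArgumentException("all X_i must have the same dimension");
            }
        }
        this.observations = observations;
        this.labels = labels;
    }

    /** Number of positions n in the sequence. */
    int length() { return labels.length; }

    /** Dimension m of each observation vector. */
    int dimension() { return observations.length == 0 ? 0 : observations[0].length; }

    /** Observation vector X_i at position i. */
    double[] x(int i) { return observations[i]; }

    /** Label Y_i at position i. */
    int y(int i) { return labels[i]; }

    public static void main(String[] args) {
        // Made-up values: n = 2 positions, m = 3 dimensions per position.
        double[][] x = { {0.1, 0.2, 0.3}, {0.4, 0.5, 0.6} };
        int[] y = { 0, 1 };
        VectorSequence seq = new VectorSequence(x, y);
        // Print each position as "vector | label", mirroring the "word | label" idea.
        for (int i = 0; i < seq.length(); i++) {
            System.out.println(Arrays.toString(seq.x(i)) + " | " + seq.y(i));
        }
    }
}
```

The matrix is stored here with one row per position (n rows of length m), so a position's whole vector can be handed to feature code at once. Storing it as m rows of length n would hold the same information.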

