#28 CQPweb: derive metadata from XML attribute

closed
CQPweb (45)
8
2009-12-14
2009-10-29
No

For corpora with CWB pre-indexing.

Allow text-level metadata to be extracted from existing XML attributes that have been indexed as s-elements.

Use the following CQP trick:

> Texts = <text> [];
> tabulate Texts match text_id, match cat_name, match cat_code, match cat_major > "brown_meta.txt";

(Actually, it might be possible to get the data straight from the CQP::execute method - look into this)
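As a rough illustration of the trick above, the following Python sketch builds the same kind of `tabulate` command for an arbitrary list of s-attributes and then turns the tab-separated output (one line per text) into metadata rows. The function names `build_tabulate_command` and `parse_tabulate_output` are hypothetical and not part of CQPweb, and the sketch assumes the output has already been read into a string, whether from the redirected file or from CQP directly.

```python
def build_tabulate_command(query_name, attributes, out_path):
    """Build a CQP tabulate command with one 'match <att>' column per attribute.

    Hypothetical helper for illustration; not CQPweb's own code.
    """
    columns = ", ".join(f"match {att}" for att in attributes)
    return f'tabulate {query_name} {columns} > "{out_path}";'


def parse_tabulate_output(text, attributes):
    """Turn tab-separated tabulate output into a list of dicts, one per text."""
    rows = []
    for line in text.splitlines():
        if line == "":
            continue  # ignore blank lines, e.g. a trailing newline
        values = line.split("\t")
        if len(values) != len(attributes):
            raise ValueError(
                f"expected {len(attributes)} columns, got {len(values)}: {line!r}"
            )
        rows.append(dict(zip(attributes, values)))
    return rows
```

Called as `build_tabulate_command("Texts", ["text_id", "cat_name", "cat_code", "cat_major"], "brown_meta.txt")`, the first function reproduces the command quoted above.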

Discussion

  • Andrew Hardie - 2009-11-07

    Also, allow a metadata table to be auto-generated containing just the text_ids, for corpora without any metadata fields.

     
  • Andrew Hardie - 2009-12-14

    Both new methods are complete in the latest commit, but extracting metadata from s-attributes has not been tested, as I do not have any corpora to test it on.

     
  • Andrew Hardie - 2009-12-14
    • status: open --> closed
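
The text_id-only metadata table suggested in the first comment could look something like the Python sketch below. It is only an assumption about the shape of such a table (one text_id per line, no other columns); the function name `text_id_only_table` is hypothetical and this is not the code from the commit mentioned above.

```python
def text_id_only_table(text_ids):
    """Return a one-column metadata table: one text_id per line.

    Hypothetical helper for illustration. Duplicate IDs are rejected,
    on the assumption that each text must have a unique identifier.
    """
    seen = set()
    lines = []
    for text_id in text_ids:
        if text_id in seen:
            raise ValueError(f"duplicate text_id: {text_id}")
        seen.add(text_id)
        lines.append(text_id)
    return "".join(line + "\n" for line in lines)
```

The list of IDs itself could come from the same `tabulate` approach with `match text_id` as the only column.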
     
