The OpenNLP Grok Library / Discussion / Help: Tagging financial row definitions

John Nagle - 2001-01-03

I'm interested in using "grok" to help interpret the tags on the rows of financial statements. Typically, these are not sentences, but noun phrases. Some examples:

ASSETS:
Cash and cash equivalents
Marketable securities
Total cash, cash equivalents and marketable securities
Accounts receivable - trade and other
Inventories
Prepaid employee benefits, taxes and other expenses
Finance receivables and retained interests in sold receivables
Property and equipment
Special tools
Intangible assets
Other noncurrent assets

LIABILITIES:
Accounts payable
Accrued liabilities and expenses
Short-term debt
Payments due within one year on long-term debt
Long-term debt
Accrued noncurrent employee benefits
Other noncurrent liabilities

If I could get a sentence diagram out, in which I could then look for stock phrases and identify subordinate clauses to them, that would be sufficient. What I need, in a LISP-like notation, is parsing into something like

(and (cash) (cash (equivalents)))
(securities (marketable))
(total (cash) (cash (equivalents)) (securities (marketable)))
(accounts (receivable (and (trade) (other))))

Is this something one can reasonably do with GROK?
Thanks.
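
For readers following along, here is a minimal sketch of how output in the LISP-like notation above could be consumed once a parser produces it. It reads one parenthesized expression into nested Python lists and then looks up every sub-phrase headed by a given stock word. The function names `read_sexpr` and `find` are hypothetical helpers written for this illustration; nothing here is part of Grok, and it assumes the parser's output is available as plain text in the notation shown.

```python
def tokenize(text):
    """Split an S-expression string into '(', ')' and word tokens."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def read_sexpr(text):
    """Parse one parenthesized expression into nested Python lists.

    Hypothetical helper for illustration only; raises ValueError when
    the parentheses do not balance.
    """
    tokens = tokenize(text)
    pos = 0

    def read():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            items = []
            while pos < len(tokens) and tokens[pos] != ")":
                items.append(read())
            if pos >= len(tokens):
                raise ValueError("missing closing parenthesis")
            pos += 1  # consume the ')'
            return items
        if tok == ")":
            raise ValueError("unexpected ')'")
        return tok

    tree = read()
    if pos != len(tokens):
        raise ValueError("trailing tokens after expression")
    return tree


def find(tree, head):
    """Return every sub-list whose first element is the word `head`."""
    found = []
    if isinstance(tree, list):
        if tree and tree[0] == head:
            found.append(tree)
        for child in tree:
            found.extend(find(child, head))
    return found


if __name__ == "__main__":
    row = read_sexpr(
        "(total (cash) (cash (equivalents)) (securities (marketable)))"
    )
    # Every phrase headed by the stock word "cash", with its modifiers.
    for phrase in find(row, "cash"):
        print(phrase)
```

Keeping the head word first in each list means a lookup for a stock phrase is just a walk over the tree comparing first elements; the modifiers come along as the remaining elements.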

- Gann Bierner - 2001-01-03
  
  Well, yes and no.
  
  The Grok parser will certainly parse these noun phrases and produce something resembling the structures you describe. The problem is that, in general, correctly parsing NPs is really hard because you need a lot of world knowledge to know what the attachments are.
  
  The good news is that, I believe, Grok has a category tagger trained on Wall Street Journal (i.e., financial) text, so it might actually do a decent job in your case. Jason is the person to ask about this, and I believe that he is writing some code to make the parser simpler to use. Jason?
  
  Gann
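
To give a rough sense of why attachment is the hard part, the sketch below counts and lists the ways a flat word sequence can be grouped into nested pairs. It is a purely combinatorial illustration and says nothing about how Grok itself works: it assumes strictly binary grouping, and a real grammar would rule out many of these groupings. The function names are hypothetical. The count for n words is the Catalan number C(n-1).

```python
from math import comb


def bracketings(n_words):
    """Number of distinct binary bracketings of n_words words.

    This is the Catalan number C(n_words - 1).
    """
    if n_words < 1:
        raise ValueError("need at least one word")
    k = n_words - 1
    return comb(2 * k, k) // (k + 1)


def enumerate_bracketings(words):
    """List every binary bracketing of `words` as nested tuples."""
    if len(words) == 1:
        return [words[0]]
    results = []
    for split in range(1, len(words)):
        for left in enumerate_bracketings(words[:split]):
            for right in enumerate_bracketings(words[split:]):
                results.append((left, right))
    return results


if __name__ == "__main__":
    # Three words already allow two readings.
    for reading in enumerate_bracketings(["Other", "noncurrent", "assets"]):
        print(reading)

    # The count grows quickly with the length of the row label.
    label = "Finance receivables and retained interests in sold receivables"
    print(len(label.split()), "words:", bracketings(len(label.split())))
```

Choosing among the candidate groupings is where the world knowledge comes in; the word sequence alone does not say whether "noncurrent" modifies "assets" or "other".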
  
- John Nagle - 2001-01-03
  
  Thanks for the quick reply. You can contact me directly at "nagle@downside.com".
  Visiting http://www.downside.com will show what I'm doing with this info. It could get Grok some publicity; Downside gets a substantial number of hits.
  
  Right now, I'm working on the code that extracts tables from the SEC database (http://www.sec.gov), and finds rows and columns. Once I have that done, I'll have text to feed into Grok.
  