Python parsing module / Discussion / Help/Open Discussion: Match character if character NOT in list?

Keith McCammon - 2006-01-23

I'm in the process of building a SQL parser, based on the example that you provide with the pyparsing distribution. I'm now able to handle all of the common query types, table and column aliases, joins, etc. pyparsing makes this all quite simple, and for that I am thankful! But I digress...

I'm having trouble getting past the following:

"SELECT f.*, b.col1, b.col2 FROM foo f, bar b WHERE . . . "

A columnName is defined as:

columnName = delimitedList( ident, ".", combine=True )

And a selectStmt looks like this:

selectToken + ( "*" | columnNameList ).setResultsName("columns") + fromToken + . . .

The problem? According to our grammar, a select statement is either "SELECT * FROM foo" or "SELECT f.col1, f.col2 FROM foo f". Because it finds "*" in "SELECT f.* FROM foo f", it expects fromToken to follow the "*".

I've tried changing columnName such that the elements within the delimited list can be either ident OR * as follows:

columnName = delimitedList( ("*" | ident ), ".", combine=True )

This will allow the remained of the statement to be properly parsed, but will ignore column names after the "*". Thus, in "SELECT f.* b.col1 FROM foo f, bar b", tokens.tables will only contain [f.*].

Apologies for the long-winded description of the problem, but I thought it more convenient than having to (potentially) post several clarifications.

Many thanks in advance for all of your thoughts and/or suggestions.

Keith

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
- Keith McCammon - 2006-01-23
  
  Apologies for the representation of the double-quotes in my post ("). First time posting to an SF board. Wasn't aware that it would mangle something like this, as it's clearly not part of any tag or code (of which I'm aware).
  
  If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
- Keith McCammon - 2006-01-24
  
  OK. Was able to fix this after taking a night off :)
  
  Removed the * from the selectStmt def, and left it in the columnName def only:
  
  columnName = delimitedList( "*" | ident ), ".", combine=True )
  
  If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
- Paul McGuire - 2006-01-26
  
  Keith -
  
  (Sorry you had to fix this yourself. I thought I replied to this, but SF must have eaten my reply. Here it is below.)
  
  I think the cleanest approach would be to get away from the delimitedList trickery, and just say what you mean:
  
  columnRef = "*" | ident
  tableRef = ident
  columnAlias = quotedString | ident
  columnName = Combine(Optional(tableRef + ".") + columnRef).setResultsName("colName") + Optional(columnAlias).setResultsName("colAlias")
  
  Does this make things better or worse?
  
  -- Paul
  
  If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Match character if character NOT in list?

Forums

Help

Match character if character NOT in list? document.SUBSCRIPTION_OPTIONS = { "thing": "topic", "subscribed": false, "url": "subscribe", "icon": { "css": "fa fa-envelope-o" } };

Match character if character NOT in list?