Menu

Match character if character NOT in list?

2006-01-23
2013-05-14
  • Keith McCammon

    Keith McCammon - 2006-01-23

    I'm in the process of building a SQL parser, based on the example that you provide with the pyparsing distribution.  I'm now able to handle all of the common query types, table and column aliases, joins, etc.  pyparsing makes this all quite simple, and for that I am thankful!  But I digress...

    I'm having trouble getting past the following:

    "SELECT f.*, b.col1, b.col2 FROM foo f, bar b WHERE . . . "

    A columnName is defined as:

    columnName = delimitedList( ident, ".", combine=True )

    And a selectStmt looks like this:

    selectToken + ( "*" | columnNameList ).setResultsName("columns") + fromToken + . . .

    The problem?  According to our grammar, a select statement is either "SELECT * FROM foo" or "SELECT f.col1, f.col2 FROM foo f".  Because it finds "*" in "SELECT f.* FROM foo f", it expects fromToken to follow the "*".

    I've tried changing columnName such that the elements within the delimited list can be either ident OR * as follows:

    columnName = delimitedList( ("*" | ident ), ".", combine=True )

    This will allow the remained of the statement to be properly parsed, but will ignore column names after the "*".  Thus, in "SELECT f.* b.col1 FROM foo f, bar b", tokens.tables will only contain [f.*].

    Apologies for the long-winded description of the problem, but I thought it more convenient than having to (potentially) post several clarifications.

    Many thanks in advance for all of your thoughts and/or suggestions.

    Keith

     
    • Keith McCammon

      Keith McCammon - 2006-01-23

      Apologies for the representation of the double-quotes in my post (").  First time posting to an SF board.  Wasn't aware that it would mangle something like this, as it's clearly not part of any tag or code (of which I'm aware).

       
    • Keith McCammon

      Keith McCammon - 2006-01-24

      OK. Was able to fix this after taking a night off :)

      Removed the * from the selectStmt def, and left it in the columnName def only:

      columnName = delimitedList( "*" | ident ), ".", combine=True ) 

       
    • Paul McGuire

      Paul McGuire - 2006-01-26

      Keith -

      (Sorry you had to fix this yourself.  I thought I replied to this, but SF must have eaten my reply.  Here it is below.)

      I think the cleanest approach would be to get away from the delimitedList trickery, and just say what you mean:

      columnRef = "*" | ident
      tableRef = ident
      columnAlias = quotedString | ident
      columnName = Combine(Optional(tableRef + ".") + columnRef).setResultsName("colName") + Optional(columnAlias).setResultsName("colAlias")

      Does this make things better or worse?

      -- Paul

       

Log in to post a comment.

Want the latest updates on software, tech news, and AI?
Get latest updates about software, tech news, and AI from SourceForge directly in your inbox once a month.