SQLDOM / Blog: Recent posts

New branch / flavor of SQLDOM

Starting with version .927 there is an alternate flavor of SQLDOM available.

The standard version uses only temporary tables and temporary stored procedures. This is very portable: no changes are made to persistent database objects.

However, there is overhead associated with creation and compilation of the stored procedures.

In a production environment you probably want to have persistent stored procedures...

Posted by David Rueter 2013-03-21

New Release (.927)

Version .927 3/21/2013

Added @HUIDLike as an optional parameter for #spgetDOMHTML to allow
retrieving a subset of the HTML (where HUID LIKE @HUIDLike + '%'). Added
SET NOCOUNT ON to avoid a performance hit if the caller has not set this option.

The new @HUIDLike parameter is really useful. For example, suppose you want to extract a subset of the HTML--perhaps just a particular DIV. You can find the HUID of the DIV you want by looking at #tblDOMHierarchy (or #spgetDOM), and then simply call:...

Posted by David Rueter 2013-03-21

New Release (.925)

Version .925 10/10/2012

Added table #tblDOMHierarchy that is populated by #spgetDOM (to avoid needing to insert results of #spgetDOM into a user-created temp table to utilize the HUID field when joining to DOM rows). NOTE: call #spgetDOM before using #tblDOMHierarchy

Added procedure #spgetText for convenience in getting text values once #spgetDOM has been called.

Fixed a problem with parsing script tags that contain text nodes in certain cases, caused by a bug introduced by the optimizations in version .918...

Posted by David Rueter 2012-10-10 Labels: New Release

Theory of Operation for #spgetDOMHTML

1) Create a temp table and populate via #spgetDOM. This allows us to leverage the functionality of this procedure that produces HUID and SortHUID values.
2) Populate a Sequence column--which is just a ROW_NUMBER generated over SortHUID.
3) Populate a HasChild column by joining Sequence to Sequence + 1.

The basic idea is that we will use a cursor to walk through the temporary table in Sequence order. This allows us to emit tags as we encounter them: they are in the correct order to render, without respect to hierarchy...
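
The three steps and the ordered walk can be sketched outside of SQL. The Python below is only an illustration of the idea, not SQLDOM's T-SQL: the Node fields, the dotted sort_huid format, and the prepare and render helpers are all hypothetical names made up for this sketch. In particular, the way closing tags are handled (a stack of still-open elements) is an assumption, since the excerpt above stops before describing it.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    node_id: int
    parent_id: Optional[int]
    sort_huid: str          # hypothetical sortable hierarchy key, e.g. "001.002"
    tag: str
    sequence: int = 0       # filled in by prepare()
    has_child: bool = False  # filled in by prepare()


def prepare(nodes: List[Node]) -> List[Node]:
    """Steps 2 and 3: number rows by sort key, then flag rows that have a child."""
    # Step 2: Sequence is simply a row number over the SortHUID ordering.
    ordered = sorted(nodes, key=lambda n: n.sort_huid)
    for i, node in enumerate(ordered, start=1):
        node.sequence = i
    # Step 3: compare each row with the row at Sequence + 1. In document
    # order a node's first child directly follows it, so one look-ahead
    # is enough to know whether the node has any children.
    for current, following in zip(ordered, ordered[1:]):
        current.has_child = following.parent_id == current.node_id
    return ordered


def render(ordered: List[Node]) -> str:
    """Walk rows in Sequence order (the cursor loop) and emit tags."""
    out: List[str] = []
    open_nodes: List[Node] = []  # assumption: elements awaiting a closing tag
    for node in ordered:
        # Close any open element that is not the parent of this row.
        while open_nodes and open_nodes[-1].node_id != node.parent_id:
            out.append(f"</{open_nodes.pop().tag}>")
        out.append(f"<{node.tag}>")
        if node.has_child:
            open_nodes.append(node)
        else:
            out.append(f"</{node.tag}>")
    while open_nodes:
        out.append(f"</{open_nodes.pop().tag}>")
    return "".join(out)


if __name__ == "__main__":
    rows = [
        Node(4, 2, "001.001.002", "p"),
        Node(1, None, "001", "html"),
        Node(3, 2, "001.001.001", "div"),
        Node(2, 1, "001.001", "body"),
    ]
    print(render(prepare(rows)))
    # prints <html><body><div></div><p></p></body></html>
```

The point of the look-ahead in prepare is that the walk never has to query the hierarchy again: each row already carries enough information to decide whether its tag stays open.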

Posted by David Rueter 2012-02-23

New Release (.920)

Version .920 2/23/2012

  1. Refactor #spgetDOMHTML to fix bugs, streamline

Note: .919 (2/21/2012) had only minor fixes to the rendering of comments in #spgetDOMHTML. It was completely replaced by the refactored version in .920.

Posted by David Rueter 2012-02-23

New Release (.918)

Version .918 2/20/2012

  1. Removed dependencies on 3 UDF string helper functions
  2. Performance increase (approx. 24%: 130ms vs 170ms)
  3. Clean up some comments

Note: Performance times are benchmarks for parsing the Google home page--average of 100 iterations.

Posted by David Rueter 2012-02-20

Introduction to SQLDOM

SQLDOM is a set of temporary tables and native T-SQL code for Microsoft SQL Server 2005 or later.

SQLDOM allows for easy and robust parsing of HTML into a table-based DOM (Document Object Model). It also provides routines for manipulating the DOM, and for rendering the DOM out as HTML.

This means that SQLDOM is useful for digesting existing HTML pages, modifying HTML, and creating new HTML programmatically--all inside SQL, with no external dependencies...

Posted by David Rueter 2012-02-19 Labels: SQLDOM introduction parse HUID