Starting with version .927 there is an alternate flavor of SQLDOM available.
The standard version uses only temporary tables and temporary stored procedures. This is very portable: no changes are made to persistent database objects.
However, there is overhead associated with creation and compilation of the stored procedures.
In a production environment you probably want to have persistent stored procedures.
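As a rough illustration of the trade-off, the sketch below creates a temporary stored procedure in plain T-SQL. The procedure name #spDemoHello is hypothetical and is not part of SQLDOM; the persistent variant appears only as a comment so that running the sketch changes no persistent database objects.

```sql
-- Temporary procedure: created in tempdb, visible only to the current
-- session, and dropped automatically when the session ends.
-- #spDemoHello is a hypothetical name used only for this illustration.
CREATE PROCEDURE #spDemoHello
AS
    SET NOCOUNT ON;  -- suppress "rows affected" messages
    SELECT 'hello from a temporary procedure' AS Msg;
GO

EXEC #spDemoHello;
GO

-- A persistent equivalent would be created once and reused by every
-- session, avoiding the per-session creation and compilation cost:
--
--   CREATE PROCEDURE dbo.spDemoHello
--   AS
--       SET NOCOUNT ON;
--       SELECT 'hello from a persistent procedure' AS Msg;

DROP PROCEDURE #spDemoHello;
GO
```

GO is a batch separator understood by client tools such as SSMS and sqlcmd; it is needed here because CREATE PROCEDURE must be the first statement in its batch.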
Version .927 3/21/2013
Added @HUIDLike as an optional parameter for #spgetDOMHTML to allow
retrieving a subset of the HTML (where HUID LIKE @HUIDLike + '%'). Added
SET NOCOUNT ON to avoid a performance hit if the caller had not set this option.
The new @HUIDLike parameter is really useful. For example, suppose you want to extract a subset of the HTML--perhaps just a particular DIV. You can find the HUID of the DIV you want by looking at #tblDOMHierarchy (or #spgetDOM), and then simply call: ...
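As a rough illustration of the prefix filter that @HUIDLike applies (HUID LIKE @HUIDLike + '%'), the following sketch uses a hypothetical stand-in table rather than SQLDOM's own objects. The table name #DemoNodes, its columns, and the dotted HUID values are all assumptions made for the example; the real HUID format is not shown here.

```sql
-- Hypothetical stand-in table; NOT SQLDOM's actual schema.
CREATE TABLE #DemoNodes (
    HUID varchar(100) NOT NULL,  -- assumed dotted hierarchy id
    Tag  varchar(50)  NOT NULL
);

-- Single-row inserts keep the sketch compatible with SQL Server 2005.
INSERT INTO #DemoNodes (HUID, Tag) VALUES ('1',     'body');
INSERT INTO #DemoNodes (HUID, Tag) VALUES ('1.1',   'div');
INSERT INTO #DemoNodes (HUID, Tag) VALUES ('1.1.1', 'p');
INSERT INTO #DemoNodes (HUID, Tag) VALUES ('1.2',   'div');

DECLARE @HUIDLike varchar(100);
SET @HUIDLike = '1.1';  -- HUID of the DIV we want to extract

-- Same predicate as described in the changelog entry:
-- keeps the chosen node and everything whose HUID starts with it.
SELECT HUID, Tag
FROM #DemoNodes
WHERE HUID LIKE @HUIDLike + '%'
ORDER BY HUID;

DROP TABLE #DemoNodes;
```

With this sample data the query keeps the rows '1.1' and '1.1.1' and drops '1' and '1.2'. Note that with the assumed dotted values a prefix of '1.1' would also match a sibling such as '1.10'; whether that can occur depends on the actual HUID format.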
Version .925 10/10/2012
Added table #tblDOMHierarchy that is populated by #spgetDOM (to avoid needing to insert results of #spgetDOM into a user-created temp table to utilize the HUID field when joining to DOM rows). NOTE: call #spgetDOM before using #tblDOMHierarchy
Added procedure #spgetText for convenience in getting text values once #spgetDOM has been called.
Fixed a problem with parsing script tags that contain text nodes in certain cases due to a bug introduced by optimizations in version .918.
Theory of Operation for #spgetDOMHTML
1) Create a temp table and populate via #spgetDOM. This allows us to leverage the functionality of this procedure that produces HUID and SortHUID values.
2) Populate a Sequence column--which is just a ROW_NUMBER generated over SortHUID
3) Populate a HasChild column by joining Sequence to Sequence + 1
The basic idea is that we will use a cursor to walk through the temporary table in Sequence order. This allows us to emit tags as we encounter them: they are in the correct order to render, without respect to hierarchy.
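The three steps and the cursor walk can be sketched in standalone T-SQL as follows. This is a minimal illustration under stated assumptions, not the code of #spgetDOMHTML: the table #DemoDOM, its columns, the dotted HUID values, and the zero-padded SortHUID values are hypothetical stand-ins for what #spgetDOM produces, and the loop emits opening tags only.

```sql
-- Step 1 (stand-in): a temp table holding rows like those #spgetDOM returns.
-- Table name, columns, and HUID/SortHUID formats are assumptions.
CREATE TABLE #DemoDOM (
    HUID     varchar(100) NOT NULL,
    SortHUID varchar(100) NOT NULL,  -- assumed to sort in document order
    Tag      varchar(50)  NOT NULL,
    Sequence int          NULL,
    HasChild bit          NOT NULL DEFAULT 0
);

INSERT INTO #DemoDOM (HUID, SortHUID, Tag) VALUES ('1',     '0001',           'html');
INSERT INTO #DemoDOM (HUID, SortHUID, Tag) VALUES ('1.1',   '0001.0001',      'body');
INSERT INTO #DemoDOM (HUID, SortHUID, Tag) VALUES ('1.1.1', '0001.0001.0001', 'p');
INSERT INTO #DemoDOM (HUID, SortHUID, Tag) VALUES ('1.1.2', '0001.0001.0002', 'div');

-- Step 2: Sequence is a ROW_NUMBER generated over SortHUID.
WITH Ordered AS (
    SELECT Sequence, ROW_NUMBER() OVER (ORDER BY SortHUID) AS rn
    FROM #DemoDOM
)
UPDATE Ordered SET Sequence = rn;

-- Step 3: join each row to the next row (Sequence + 1); the row has a
-- child if the next row's HUID extends its own (assumed dotted format).
UPDATE p
SET HasChild = 1
FROM #DemoDOM AS p
JOIN #DemoDOM AS c ON c.Sequence = p.Sequence + 1
WHERE c.HUID LIKE p.HUID + '.%';

-- Walk the rows in Sequence order and emit a tag for each one.
DECLARE @Tag varchar(50), @HasChild bit;
DECLARE cur CURSOR LOCAL FAST_FORWARD FOR
    SELECT Tag, HasChild FROM #DemoDOM ORDER BY Sequence;

OPEN cur;
FETCH NEXT FROM cur INTO @Tag, @HasChild;
WHILE @@FETCH_STATUS = 0
BEGIN
    -- Childless nodes are self-closed; closing tags for parents are
    -- deliberately left out of this sketch.
    PRINT '<' + @Tag + CASE WHEN @HasChild = 1 THEN '>' ELSE ' />' END;
    FETCH NEXT FROM cur INTO @Tag, @HasChild;
END
CLOSE cur;
DEALLOCATE cur;

DROP TABLE #DemoDOM;
```

Because the rows are already in render order, the loop needs no recursion: HasChild alone tells it whether a node can be closed immediately or must stay open for the rows that follow.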
Version .920 2/23/2012
Note: .919 2/21/2012 had only minor fixes to rendering of comments in #spgetDOMHTML. This was completely replaced by a refactored version in .920
Note: Performance times are benchmarks for parsing the Google home page--average of 100 iterations.
SQLDOM is a set of temporary tables and native T-SQL code for Microsoft SQL Server 2005 or later.
SQLDOM allows for easy and robust parsing of HTML into a table-based DOM (Document Object Model). It also provides routines for manipulating the DOM, and for rendering the DOM out as HTML.
This means that SQLDOM is useful for digesting existing HTML pages, modifying HTML, and creating new HTML programmatically--all inside SQL, with no external dependencies.