Hi, I am trying to use RAP to parse and re-serialize some custom RSS 1.0 feeds (i.e., feeds with some custom namespaces).
I assume that each RSS file may contain any number of channels, and each channel any number of related items. Each channel and its items should be unrelated to the other channels.
I would like to grab all the triples for a given channel and its elements and put them into a new model for serialization, effectively splitting the multi-channel file into several single-channel files. (Even handier would be a "serialize the graph from this node" function.)
I know how to grab the relevant channel resources, and I know that I can recursively forward-walk the graph starting from the channel. However, I am concerned about endless loops if someone feeds me bogus RDF, so I guess I would have to keep a list of "already seen" nodes.
Then, for each statement I find during the walk, I would add it to a new model. When the walk is finished, I serialize the new model, and voila, a file specific to that channel.
Can anyone comment on the validity and efficiency of this approach or recommend something better? I'd rather not reinvent the wheel. Does code for this already exist somewhere?
regards,
Dan Libby
In case it is useful to someone else, I have come up with the following code that seems to do the trick.
// Copies the subgraph reachable from $resource into $model and returns the
// resource re-bound to that model.
function copy_node_to_model( &$model, $resource ) {
    $seen_list = array();
    copy_node_to_model_worker( $model, $resource, $seen_list );
    $resource->setAssociatedModel( $model );
    return $resource;
}

// Recursive worker: copies every statement about $resource into $model, then
// follows any resource-valued objects. $seen_list (keyed by URI) guards
// against endless loops if the RDF contains cycles.
function copy_node_to_model_worker( &$model, $resource, &$seen_list ) {
    $uri = $resource->getURI();
    if( !isset( $seen_list[$uri] ) ) {
        $seen_list[$uri] = 1;
        $statements = $resource->listProperties();
        foreach( $statements as $s ) {
            // Duplicate the statement, re-pointing the subject, predicate,
            // and object at the new model.
            $sub = $s->getSubject();
            $pred = $s->getPredicate();
            $obj = $s->getObject();
            $sub->model =& $model;
            $pred->model =& $model;
            $obj->model =& $model;
            $model->addWithoutDuplicates( new Statement( $sub, $pred, $obj ) );
            // Re-fetch the object so the recursion walks it against its
            // original model, not the copy we just re-pointed.
            $obj = $s->getObject();
            if( is_a( $obj, 'Resource' ) ) {
                copy_node_to_model_worker( $model, $obj, $seen_list );
            }
        }
    }
}
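
To actually split the feed, something along these lines should work. This is only a sketch: $channels stands for the channel resources you said you already know how to grab, and the MemModel and RdfSerializer class names are assumptions about RAP's API from memory, so adjust them to whatever your RAP version provides.

// Hypothetical usage: copy each located channel into its own model and
// serialize it to a separate file. MemModel / RdfSerializer are assumed
// RAP class names -- check your RAP version (older releases used ModelMem).
foreach( $channels as $i => $channel ) {
    $split_model =& new MemModel();
    copy_node_to_model( $split_model, $channel );

    $ser =& new RdfSerializer();                 // RAP's RDF/XML serializer (assumed)
    $xml = $ser->serialize( $split_model );

    $fp = fopen( "channel_$i.rdf", 'w' );        // one file per channel
    fwrite( $fp, $xml );
    fclose( $fp );
}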