Hello
This is maybe not really a metaf2xml question, but I'll take my chances and post it here anyway.
I think that this is a great program, and I had an idea that I would be able to export the parsed data to MySQL in some way.
However, it seems to be a bit problematic to get the XML data (not produced by MySQL itself) into MySQL, so I thought I would try some other way. With this in mind I thought I would start by checking the parser output to see what it looked like. I am not a Perl whiz, but I was trying to view the parser module output using:
while ( ($k, $v) = each %report_data ) {
    print "$k => $v\n";
}
with the result:
metaf2xml-1.43$ perl metaf2xml.pl -x test.xml 'SYNOP AAXX 25154 06180 22675 52915 10152 20044 30216 40222 54000 85801 333 60007 82845 85656 91130'
visPrev => HASH(0xa063998)
obsStationId => HASH(0xa2337f8)
.
.
.
How can I get the values out in plain text?
Is there any other way to get the data out in a more MySQL-friendly way?
Regards
Jonas Jonasson
Hello Jonas,
there are several solutions to get the values in plain text (e.g. using the provided example XSLT style sheets), but there is no simple way to get the values in a form suitable for an SQL INSERT. Some suggestions:
a) use the return value of parser::parse_report() (as you did)
This is a bit tricky because the data is a hash which contains other hashes, arrays and scalars. It is like a tree with many leaves, so the code has to "walk" along the branches of the tree to find the values. You can use a module like Dumpvalue (from the standard Perl distribution) to see what the structure looks like:
use Dumpvalue;
Dumpvalue->new(veryCompact=>1)->dumpValue(\%report_data);
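or with Data::Dumper (also from the standard Perl distribution):
use Data::Dumper;
print Dumper(\%report_data);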
Or you could have a look at the module metaf2xml::XML, which produces the XML data.
Once there was a module metaf2xml::FGFS (not in the git repository), which converted the %report_data (for METAR only) directly to plain text output similar to that of the program "metar" from the FlightGear project. It was the only other "consumer" of %report_data.
If you know the path to the value (from a data dumper and from metaf.dtd), you can check if the value exists and retrieve it, e.g.:
# current air temperature (here: always in degrees Celsius!)
$air_temp = undef;
if (   exists $report_data{temperature}
    && exists $report_data{temperature}{air}
    && exists $report_data{temperature}{air}{temp})
{
    $air_temp = $report_data{temperature}{air}{temp}{v};
    if ($report_data{temperature}{air}{temp}{u} eq 'F') {
        $air_temp = ($air_temp - 32) / 1.8;
    }
}
or:
# maximum air temperature in section 3 of a SYNOP (here: with original unit)
$max_air_temp = undef;
for (exists $report_data{section3} ? @{$report_data{section3}} : ()) {
    if (exists $_->{tempMaxPeriod} && exists $_->{tempMaxPeriod}{temp}) {
        $max_air_temp      = $_->{tempMaxPeriod}{temp}{v};
        $max_air_temp_unit = $_->{tempMaxPeriod}{temp}{u};
        if (exists $_->{timePeriod}) {
            $max_air_temp_period = ...;   # store value of timePeriod
        } else {
            $max_air_temp_period = ...;   # store value of timeBeforeObs
        }
    }
}
Note: there is not always a hash with the key "v" for the value. For values which can have a qualifier, like a unit in "u", it is always used. For those that cannot have qualifiers it is never used. For some, it depends on the data: e.g. "dir" can have "rp", "rn" and "q"; if it has none of these, the value is directly in …{dir}, otherwise it is in …{dir}{v}. In the XML, the value is always in the attribute "v" of the node.
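If you want a single accessor for both cases, a small helper can hide the difference (just a sketch; value_of is not part of metaf2xml):
# return the value whether or not it is wrapped in a hash with "v"
sub value_of {
    my ($node) = @_;
    return ref $node eq 'HASH' ? $node->{v} : $node;
}
my $dir = value_of($wind->{dir});   # $wind: any hash that contains a "dir"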
b) use the return value of parser::parse_report() and parse its dump
The text output of some data dump module can be processed by a program which looks for certain strings and extracts the values (but be aware of the Note above).
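A minimal sketch of that approach (the regular expression is deliberately naive: it grabs the first "temp" hash in the dump, which may not be the one you want):
use Data::Dumper;
my $dump = Dumper(\%report_data);
# look for a "temp" hash and pull its "v" value out
if ($dump =~ /'temp' => \{[^}]*'v' => '([^']+)'/) {
    print "found a temperature: $1\n";
}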
c) use the XML produced by XML::print_report() and process with XSLT
You could write an XSLT style sheet to extract the values, perform unit conversions and create SQL statements in one go. Have a look at the .xsl files from the project, especially metaf-sum.xsl. Referencing via unique database ids could be a bit tricky, though; using auto-incremented ones seems impossible without some post-processing. So the result of the XSLT transformation can only be an intermediate step.
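To drive such a style sheet from Perl (xsltproc on the command line would work just as well), something like this should do; metaf2sql.xsl here is the hypothetical sheet you would write:
use XML::LibXML;
use XML::LibXSLT;

my $xslt  = XML::LibXSLT->new();
my $sheet = $xslt->parse_stylesheet(
    XML::LibXML->load_xml(location => 'metaf2sql.xsl'));
my $result = $sheet->transform(
    XML::LibXML->load_xml(location => 'test.xml'));
print $sheet->output_string($result);   # the generated SQL statements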
d) use the XML produced by XML::print_report() and process with a non-XML parser
Similar to solution b, the XML can be processed by a program which looks for certain strings and extracts the values.
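For example (this assumes the temperature ends up in the XML as a "temp" node with the attributes "v" and "u" in that order; check metaf.dtd and real output before relying on it):
open my $fh, '<', 'test.xml' or die "test.xml: $!";
while (my $line = <$fh>) {
    if ($line =~ /<temp v="([^"]+)" u="([^"]+)"/) {
        print "air temperature: $1 $2\n";
        last;
    }
}
close $fh;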
e) use the result of XSL transformation with one of the example .xsl files
The output of these is text (plain or in HTML tables). metaf.xsl produces detailed information, metaf-sum.xsl a summary from a subset of the data. You could write code to parse the text and convert it into SQL statements.
f) use a native XML database
If you are not restricted to MySQL, there are of course databases which store XML directly (eXist DB, BaseX, Apache Xindice, …). I would strongly recommend developing a separate XML schema for the database, though, and transforming the data with XSLT.
From these solutions I would recommend f, then c, then a. XML is supposed to be _the_ interface for this project: it is better documented, and if there are changes it is easier to adapt. Solution a is the direct and fast way, but it is rather an internal interface. Solution e is good if you want to avoid internal interfaces, but it is actually intended as a human-readable interface and less suited to being processed by other programs. Like solution e, solutions b and d require additional text parsing, which for me somehow defeats the purpose of this project.
If you tell me which solution (one of those mentioned, or any other) you prefer and (some part of) your database schema, I'll be happy to help you get started with some example code.
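For orientation, the MySQL side will look roughly the same whichever extraction method you choose; a minimal DBI sketch (the table "observation" and its columns are invented, and $station_id/$air_temp would come from code like the examples under a):
use DBI;

my $dbh = DBI->connect('dbi:mysql:database=wx', $db_user, $db_pass,
                       { RaiseError => 1 });
# placeholders let DBI handle quoting and NULL (undef) values
$dbh->do('INSERT INTO observation (station, air_temp) VALUES (?, ?)',
         undef, $station_id, $air_temp);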
BTW, you are not the only one with the task of filling a database. I know of at least one user (using solution c or d, no idea about the state or progress), and I have also started on this topic (with a). My code is still incomplete, and it's hard to say when it will be ready. It is just an attempt at demonstrating feasibility and uses the CSV "database". I haven't thought hard about which values to store (especially how to store min/max values) because that also depends on how the data is used afterwards.
As a first step, I will include my existing code "as is" (db_csv.pm and metaf2db.pl) in the branch "unstable" in the git repository, but with no guarantee that it will be in the next or indeed any release.
Up to now, the primary goal of the project has been producing XML and presenting it as text/HTML. If there is increased interest in storing data in a database, priorities may change and it may get more "CPU" :-)
Kind regards,
Thomas
Hello Jonas,
if you're still looking for a decent way to get the parsed data into MySQL, maybe the new feature announced today in the Project News helps.
Kind regards,
Thomas