Remove blocks based on source list

burn2013
2013-12-04
2013-12-15
  • burn2013

    burn2013 - 2013-12-04

    I have an XML I'm editing. It has about 40,000 lines.

    1   <Rule>
    2   <Conditions>
    3      <Tooltype>TextTool</Tooltype>
    4      <LongID>N232321</LongID>
    5   </Conditions>
    6   <Actions>
    7      <LongID>MTRSEFM_DSCH</LongID>
    9  </Actions>
    10 </Rule>
    11
    12 <Rule>
    13 <Conditions>
    14     <Tooltype>TextTool</Tooltype>
    15     <LongID>N45423</LongID>
    16 </Conditions>
    17 <Actions>
    18     <LongID>COMPPSI</LongID>
    19 </Actions>
    20 </Rule>
    

    What this is doing is finding an object in a file based on the conditions. It'll be a TextTool and have the LongID designated in each set. Once it finds it, it will replace the LongID with the given LongID below in the Actions block.

    What I want to do is narrow my XML down to about 400 substitution blocks instead of 4,000.

    My question is: is there a way to take a source list of all my LongIDs and remove the ~3,600 blocks I don't need? So my list is:

    MTRSEFM_DSCH
    COMPPSI
    etc.

    I want to somehow find those (which in this example would be found on lines 7 and 18) and capture the whole <Rule> block associated with each (lines 1-10 and 12-20), and put those blocks into a separate file. There will be a ton of LongIDs that aren't in my source list but still appear in a Rule block in my XML. I basically want to omit the blocks that don't have a match in my source list.

    I'm new to this, so I don't know where else to post this. Any help is appreciated. Thank you.

     
    Last edit: burn2013 2013-12-04
  • dail8859

    dail8859 - 2013-12-05

    So if I understand you correctly, you want all the LongIDs that are inside the Actions blocks.

    Note: I'm assuming that all LongIDs inside the Conditions blocks are N followed by some number, and that the LongIDs inside the Actions blocks consist only of A through Z and _. If this is not the case then my suggestion probably won't work too well.

    Install the LineFilter2 plugin (download is on the bottom of that page). This plugin allows you to pull out lines of a file into a new file. Run the plugin and search for <LongID>\w+</LongID>, making sure to enable Regular Expression search mode. That should give you a new file pretty close to what you want; you will then need to do some simple search/replace to remove the unneeded text.
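
    If a script is an option, the same extraction can be sketched in Python. This is only an illustration of the regex above, not what the plugin does internally; the function names are made up, and the N-plus-digits filter relies on the assumption stated in the note.

```python
import re

# Same pattern as the plugin search, with a group around the ID itself.
LONGID = re.compile(r"<LongID>(\w+)</LongID>")

def extract_long_ids(xml_text):
    """Return every LongID value found in xml_text, in document order."""
    return LONGID.findall(xml_text)

def action_ids_only(ids):
    """Drop IDs shaped like N<digits> (assumed to be Conditions IDs)."""
    return [i for i in ids if not re.fullmatch(r"N\d+", i)]

sample = """<Rule>
<Conditions>
   <Tooltype>TextTool</Tooltype>
   <LongID>N232321</LongID>
</Conditions>
<Actions>
   <LongID>MTRSEFM_DSCH</LongID>
</Actions>
</Rule>"""

ids = extract_long_ids(sample)
print(ids)                   # ['N232321', 'MTRSEFM_DSCH']
print(action_ids_only(ids))  # ['MTRSEFM_DSCH']
```

    Note that \w+ also matches the Conditions IDs, which is why the second step is needed.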

    If you need any more help, feel free to ask.

     
  • cchris

    cchris - 2013-12-15

    This solution just extracts the LongID tags. If what you need is to remove all the blocks containing these tags, you'll need to search for
    <Rule>.*?<LongID>COMPPSI.*?</Rule>
    and replace with nothing, in regular expression mode, checking the ". matches newline" box so that the pattern can span several lines.
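
    For the original goal (keeping only the Rule blocks whose Actions LongID is in a source list), a short script may be easier than one search/replace per ID. A minimal Python sketch, assuming Rule blocks are never nested and each has a single LongID inside Actions; the function name and the NOT_WANTED / N99999 sample values are made up for illustration.

```python
import re

# One match per <Rule>...</Rule> block; DOTALL lets "." cross line breaks.
RULE_BLOCK = re.compile(r"<Rule>.*?</Rule>", re.DOTALL)
# The first LongID that appears after <Actions> inside a block.
ACTION_ID = re.compile(r"<Actions>.*?<LongID>\s*(\w+)\s*</LongID>", re.DOTALL)

def keep_wanted_rules(xml_text, wanted_ids):
    """Return the Rule blocks whose Actions LongID is in wanted_ids."""
    wanted = set(wanted_ids)
    kept = []
    for block in RULE_BLOCK.findall(xml_text):
        match = ACTION_ID.search(block)
        if match and match.group(1) in wanted:
            kept.append(block)
    return kept

def make_rule(condition_id, action_id):
    """Build a sample Rule block in the layout shown in the first post."""
    return (
        "<Rule>\n<Conditions>\n   <Tooltype>TextTool</Tooltype>\n"
        f"   <LongID>{condition_id}</LongID>\n</Conditions>\n<Actions>\n"
        f"   <LongID>{action_id}</LongID>\n</Actions>\n</Rule>"
    )

sample = "\n\n".join([
    make_rule("N232321", "MTRSEFM_DSCH"),
    make_rule("N99999", "NOT_WANTED"),   # made-up block that should be dropped
    make_rule("N45423", "COMPPSI"),
])

kept = keep_wanted_rules(sample, ["MTRSEFM_DSCH", "COMPPSI"])
print(len(kept))  # 2
```

    For the real file, the source list could be read one ID per line and the kept blocks joined with blank lines and written to a new file.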

    Again, I'm not sure I got it either.
    CChris

     
