Converging two lines of data based on similar contents

Hello1111
2013-07-25
2013-07-26
  • Hello1111

    Hello1111 - 2013-07-25

    I have lines of data in one file that look like this:

    xxxxxxxxx:ooooooooo
    yyyyyyyyy:qqqqqqqqq
    zzzzzzzzz:aaaaaaaaa

    and lines in another file that look like this:

    ooooooooo:123456
    qqqqqqqqq:234567
    aaaaaaaaa:345678

    Is there any way to combine the two so that the text before the delimiter in the first file (xxxxxxxxx) is joined with the matching value from the second file (123456)? The problem is that the lines are not in the same order: if xxxxxxxxx:ooooooooo is the first line of one file, ooooooooo:123456 is not necessarily the first line of the other.

    If someone could assist me with this it would be a great help.

     
  • THEVENOT Guy

    THEVENOT Guy - 2013-07-25

    Hi, Hello1111,

    I think I have found a way to do what you need.

    But, before posting my solution, I would like to verify three points with you:


    1)

    From your first file :

    xxxxxxxxx:ooooooooo
    yyyyyyyyy:qqqqqqqqq
    zzzzzzzzz:aaaaaaaaa

    And your second file :

    ooooooooo:123456
    qqqqqqqqq:234567
    aaaaaaaaa:345678

    You would like to obtain the list below in a third file, wouldn't you?

    xxxxxxxxx123456:ooooooooo
    yyyyyyyyy234567:qqqqqqqqq
    zzzzzzzzz345678:aaaaaaaaa


    2)

    Does the colon (:), which seems to be the delimiter, occur only ONCE on each line in your two files?


    3)

    Do your two files contain EXACTLY the SAME number of lines? I mean: can all your records be organized as a list of pairs?
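
    Points 2 and 3 above can be checked mechanically before any merge is attempted. The following is only a minimal sketch in Python, not part of the search-and-replace solution mentioned in this post. The function name check_inputs is hypothetical, and the sketch assumes each file has already been read into a list of lines.

```python
def check_inputs(first_lines, second_lines):
    """Return a list of problems found; an empty list means all checks passed.

    check_inputs is a hypothetical helper written for this illustration.
    """
    problems = []
    # Ignore blank lines and surrounding whitespace.
    first = [ln.strip() for ln in first_lines if ln.strip()]
    second = [ln.strip() for ln in second_lines if ln.strip()]

    # Point 2: the colon should occur exactly once on each line.
    for name, lines in (("first", first), ("second", second)):
        for ln in lines:
            if ln.count(":") != 1:
                problems.append("%s file: expected exactly one ':' in %r" % (name, ln))

    # Point 3: both files should contain the same number of lines.
    if len(first) != len(second):
        problems.append("line counts differ: %d vs %d" % (len(first), len(second)))

    # Pairing: the text after ':' in the first file should appear
    # before ':' somewhere in the second file.
    keys = {ln.split(":", 1)[0] for ln in second if ":" in ln}
    for ln in first:
        if ":" in ln:
            value = ln.split(":", 1)[1]
            if value not in keys:
                problems.append("no match in second file for %r" % value)

    return problems


first = ["xxxxxxxxx:ooooooooo", "yyyyyyyyy:qqqqqqqqq", "zzzzzzzzz:aaaaaaaaa"]
second = ["ooooooooo:123456", "qqqqqqqqq:234567", "aaaaaaaaa:345678"]
print(check_inputs(first, second))  # prints []
```

    An empty list means the sample data satisfies all three conditions; any other result lists the lines that would need attention first.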


    My solution needs two rounds of search and replace in regular expression mode, plus one or two sorts, depending on whether you want the results sorted.

    I tested my solution, based on your positive answers to the three questions above, and it worked well !

    Cheers,

    guy038

     
    Last edit: THEVENOT Guy 2013-07-26
  • Loreia2

    Loreia2 - 2013-07-26

    Hi, this is a job for a script. I would do this in Python by creating two dictionaries (one for each file), and then creating a third dictionary by combining the first two.

    Something like:

    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    # I presume here that each line contains some text with ':' present,
    # and that `lines` holds the lines of the first file
    d1 = {}
    for line in lines:
        parts = line.split(':')
        d1[parts[0].strip()] = parts[1].strip()

    # create d2 in a similar way
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    Now it is time to combine d1 and d2:

    d3 = {}
    for k, v in d1.items():
        d3[k] = d2[v]

    # now just print everything
    for k, v in d3.items():
        print(k + ':' + v)

    # again, no error checking, as I presume that every line has a match
    

    If you need to keep the order, you can use OrderedDict instead.

    BR
    Loreia
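
    The dictionary approach above can be put together into one runnable script. This is only a sketch under stated assumptions: the helper names parse_pairs and join_files are hypothetical, the sample lists stand in for the contents of the two files, and the output format (first-file key, colon, second-file value) follows the dictionary code in this post rather than any format confirmed by the original poster. Lines without a match are collected instead of raising an error.

```python
def parse_pairs(lines):
    """Build a dict from 'key:value' lines, skipping blank or malformed lines."""
    pairs = {}
    for line in lines:
        key, sep, value = line.strip().partition(":")
        if not sep:
            continue  # no ':' on this line, so there is nothing to split
        pairs[key.strip()] = value.strip()
    return pairs


def join_files(first_lines, second_lines):
    """Map each key of the first file to the value the second file gives
    for the text after ':' in the first file."""
    d1 = parse_pairs(first_lines)
    d2 = parse_pairs(second_lines)
    merged = {}
    unmatched = []
    for key, link in d1.items():
        if link in d2:
            merged[key] = d2[link]
        else:
            unmatched.append(key)  # keep track instead of failing
    return merged, unmatched


# Sample data from the question; the second list is deliberately reordered.
first = ["xxxxxxxxx:ooooooooo", "yyyyyyyyy:qqqqqqqqq", "zzzzzzzzz:aaaaaaaaa"]
second = ["aaaaaaaaa:345678", "ooooooooo:123456", "qqqqqqqqq:234567"]

merged, unmatched = join_files(first, second)
for key, value in merged.items():
    print(key + ":" + value)
# prints:
# xxxxxxxxx:123456
# yyyyyyyyy:234567
# zzzzzzzzz:345678
```

    In Python 3.7 and later, a plain dict keeps insertion order, so the output follows the order of the first file without OrderedDict; on older versions, collections.OrderedDict would be needed for that.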

     
