
trouble with dynamic network mode

tafit3
2014-11-13
2014-11-17
  • tafit3

    tafit3 - 2014-11-13

    I am having trouble getting dynamic network mode to work properly. I have read the wiki description of the different modes and use cases. When I run the commands from Use Case #9, the command in the first terminal works, but the command in the second terminal fails.

    Steps to reproduce:
    1 Execute the command in the first terminal:
    $ flom -A 239.255.0.1 -d -1 -- true

    Now the daemon is running and netstat returns:

        $ netstat -anp | grep flom
        tcp        0      0 0.0.0.0:49532           0.0.0.0:*               LISTEN      3509/flom
        udp        0      0 0.0.0.0:28015           0.0.0.0:*                           3509/flom
    

    2 Execute the command in the second terminal:

        $ flom -t /somefile -V -A 239.255.0.1 -d 0 -- true
    
        [Trace]/DaemonTraceFile='/somefile'
        [Trace]/CommandTraceFile=''
        [Trace]/Verbose=1
        [Resource]/Name='_RESOURCE'
        [Resource]/Wait=1
        [Resource]/Timeout=-1
        [Resource]/Quantity=1
        [Resource]/LockMode=5
        [Resource]/Create=1
        [Resource]/IdleLifespan=0
        [Daemon]/SocketName=''
        [Daemon]/Lifespan=0
        [Daemon]/UnicastAddress=''
        [Daemon]/UnicastPort=28015
        [Daemon]/MulticastAddress='239.255.0.1'
        [Daemon]/MulticastPort=28015
        [Network]/DiscoveryAttempts=2
        [Network]/DiscoveryTimeout=500
        [Network]/DiscoveryTTL=1
        [Network]/TcpKeepaliveTime=60
        [Network]/TcpKeepaliveIntvl=10
        [Network]/TcpKeepaliveProbes=6
        sending UDP multicast datagram to 239.255.0.1/28015 ('<?xml version="1.0" encoding="UTF-8" ?><msg level="2" verb="4" step="8"></msg>')
        reply from 239.255.0.1/28015 is '<?xml version="1.0" encoding="UTF-8" ?><msg level="2" verb="4" step="16"><network address="0.0.0.0" port="49532"/></msg>'
        flom_client_connect: ret_cod=-104 (ERROR: 'connect' function returned an error condition)
    

    I expected the second command to connect to the first host, but it returned an error, probably because of the address "0.0.0.0".
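The suspicion about "0.0.0.0" can be checked mechanically. The following is only a minimal sketch, not part of flom: it parses the discovery reply shown in the trace and tests whether the advertised address is something a remote client could connect to. The function names `advertised_endpoint` and `is_connectable` are hypothetical helpers introduced for illustration.

```python
import ipaddress
import xml.etree.ElementTree as ET


def advertised_endpoint(reply_xml: str):
    """Return (address, port) from a discovery reply like the one in the trace."""
    # Encode to bytes so the parser honours the XML encoding declaration.
    root = ET.fromstring(reply_xml.encode("utf-8"))
    net = root.find("network")
    return net.get("address"), int(net.get("port"))


def is_connectable(address: str) -> bool:
    """False for an empty or wildcard (unspecified) address such as 0.0.0.0."""
    if not address:
        return False
    return not ipaddress.ip_address(address).is_unspecified


# Reply copied from the trace above.
reply = ('<?xml version="1.0" encoding="UTF-8" ?>'
         '<msg level="2" verb="4" step="16">'
         '<network address="0.0.0.0" port="49532"/></msg>')

address, port = advertised_endpoint(reply)
print(address, port, is_connectable(address))  # prints: 0.0.0.0 49532 False
```

0.0.0.0 is the wildcard a server binds to in order to listen on all interfaces; it does not identify a specific host, so a client on another machine cannot use it as a destination.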

    I also checked Use Cases #7 and #8 and they work without problems. In #7 and #8 the address is not 0.0.0.0 (e.g. 192.168.0.10), and the command from the second host connects to the first host.

    What am I doing wrong? How should I invoke the command in the dynamic network mode on the second host, so that the command is executed without the error described above? Is there any Linux network configuration I need to change?

     

    Last edit: tafit3 2014-11-13
  • Christian Ferrari

    There's no mistake on your side: you discovered a bug that is not covered by the test suite. The tests run on a single system, and this bug can be triggered only with two distinct hosts. I suppose I introduced the bug after the 0.3.0 release: that's the release I used to write the wiki instructions you followed.
    I committed a new development version to both the SourceForge and GitHub git repositories:
    https://sourceforge.net/p/flom/code/ci/1337eae274a8ad3ffaab89f555aa53867dd799ad/
    https://github.com/tiian/flom/commit/1337eae274a8ad3ffaab89f555aa53867dd799ad
    The fix is in the function "flom_accept_discover_reply" in the file "src/flom_daemon.c"; the other changes are scaffolding.
    The committed version is a pre-alpha of the future 0.9.0 release, but it passes all the test cases and should be quite usable.
    With this version you should get output like the following:

    tiian@mojan:~/src/flom$ flom -A 239.255.0.1 -d 0 -T /tmp/flom_client0.trc -V -- true
    [Trace]/DaemonTraceFile='/tmp/flom-daemon.trc'
    [Trace]/CommandTraceFile='/tmp/flom_client0.trc'
    [Trace]/Verbose=1
    [Resource]/Name='_RESOURCE'
    [Resource]/Wait=1
    [Resource]/Timeout=-1
    [Resource]/Quantity=1
    [Resource]/LockMode=5
    [Resource]/Create=1
    [Resource]/IdleLifespan=0
    [Daemon]/SocketName=''
    [Daemon]/Lifespan=0
    [Daemon]/UnicastAddress=''
    [Daemon]/UnicastPort=28015
    [Daemon]/MulticastAddress='239.255.0.1'
    [Daemon]/MulticastPort=28015
    [Network]/DiscoveryAttempts=2
    [Network]/DiscoveryTimeout=500
    [Network]/DiscoveryTTL=1
    [Network]/TcpKeepaliveTime=60
    [Network]/TcpKeepaliveIntvl=10
    [Network]/TcpKeepaliveProbes=6
    sending UDP multicast datagram to 239.255.0.1/28015 ('<?xml version="1.0" encoding="UTF-8" ?><msg level="2" verb="4" step="8"></msg>')
    reply from 239.255.0.1/28015 is '<?xml version="1.0" encoding="UTF-8" ?><msg level="2" verb="4" step="16"><network address="" port="36123"/></msg>'

    You guessed right: the issue was related to the address "0.0.0.0". The daemon must not put its address in the reply when it is bound to INADDR_ANY.
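One way a client could handle the empty address in the fixed reply is sketched below. This is an assumption for illustration, not flom's confirmed implementation: when the reply carries no usable address, the client falls back to the source IP of the UDP datagram (as returned by `recvfrom`). The function name `resolve_daemon_endpoint` and the sender address 192.168.0.10 are hypothetical.

```python
import ipaddress
import xml.etree.ElementTree as ET


def resolve_daemon_endpoint(reply_xml: str, sender_ip: str):
    """Pick the TCP endpoint to connect to after discovery.

    Assumption: if the daemon advertises an empty or wildcard address,
    the source IP of the UDP reply is used instead.
    """
    root = ET.fromstring(reply_xml.encode("utf-8"))
    net = root.find("network")
    address = net.get("address", "")
    port = int(net.get("port"))
    if not address or ipaddress.ip_address(address).is_unspecified:
        address = sender_ip
    return address, port


# Reply copied from the trace above; the sender IP is a made-up example.
fixed_reply = ('<?xml version="1.0" encoding="UTF-8" ?>'
               '<msg level="2" verb="4" step="16">'
               '<network address="" port="36123"/></msg>')

print(resolve_daemon_endpoint(fixed_reply, "192.168.0.10"))
# prints: ('192.168.0.10', 36123)
```

With this approach, a daemon bound to a specific unicast address can still advertise it explicitly, while a daemon bound to INADDR_ANY leaves the choice to the client.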
    Thank you for your valuable feedback; please tell me whether the fix solves your issue.
    Best Regards
    Ch.F.

     

    Last edit: Christian Ferrari 2014-11-13
  • tafit3

    tafit3 - 2014-11-17

    It works now. However, I had a little trouble with the build. I cloned the repository from GitHub:

    commit 5d59cb4cff11a13360cf3c882428929f94a20688
    Author: Christian Ferrari <tiian@users.sourceforge.net>
    Date: Sun Nov 16 22:37:41 2014 +0100

    Documented a piece of API
    

    After executing "./configure" and "make" (as described in the README file), I get:

    $ make
    make  all-recursive
    make[1]: Entering directory `/home/box/flom_0.90_alpha/second/flom'
    Making all in src
    make[2]: Entering directory `/home/box/flom_0.90_alpha/second/flom/src'
    /bin/bash ../libtool --tag=CC   --mode=compile gcc -DHAVE_CONFIG_H -I. -I..  -D_SYSCONFDIR='"/usr/local/etc"' -D_TRACE -pthread -I/usr/include/glib-2.0 -I/usr/lib/x86_64-linux-gnu/glib-2.0/include  -Wall -g -O2 -MT flom_client.lo -MD -MP -MF .deps/flom_client.Tpo -c -o flom_client.lo flom_client.c
    libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I.. -D_SYSCONFDIR=\"/usr/local/etc\" -D_TRACE -pthread -I/usr/include/glib-2.0 -I/usr/lib/x86_64-linux-gnu/glib-2.0/include -Wall -g -O2 -MT flom_client.lo -MD -MP -MF .deps/flom_client.Tpo -c flom_client.c  -fPIC -DPIC -o .libs/flom_client.o
    flom_client.c:46:25: fatal error: flom_errors.h: No such file or directory
    compilation terminated.
    make[2]: *** [flom_client.lo] Error 1
    make[2]: Leaving directory `/home/box/flom_0.90_alpha/second/flom/src'
    make[1]: *** [all-recursive] Error 1
    make[1]: Leaving directory `/home/box/flom_0.90_alpha/second/flom'
    make: *** [all] Error 2
    

    Probably some Makefile dependency is missing. The workaround is:
    cd src
    make flom_errors.h
    cd ..
    make

    Thanks for the fix.

     
