I have trouble getting dynamic network mode to work properly. I have read the wiki description of different modes and use cases. When I try to execute commands from Use Case #9, the command in the first terminal works, but the command in the second terminal fails.
Steps to reproduce:
1 Execute the command in the first terminal:
$ flom -A 239.255.0.1 -d -1 -- true
Now the daemon is running and netstat returns:
2 Execute the command in the second terminal:
$ flom -t /somefile -V -A 239.255.0.1 -d 0 -- true
[Trace]/DaemonTraceFile='/somefile'
[Trace]/CommandTraceFile=''
[Trace]/Verbose=1
[Resource]/Name='_RESOURCE'
[Resource]/Wait=1
[Resource]/Timeout=-1
[Resource]/Quantity=1
[Resource]/LockMode=5
[Resource]/Create=1
[Resource]/IdleLifespan=0
[Daemon]/SocketName=''
[Daemon]/Lifespan=0
[Daemon]/UnicastAddress=''
[Daemon]/UnicastPort=28015
[Daemon]/MulticastAddress='239.255.0.1'
[Daemon]/MulticastPort=28015
[Network]/DiscoveryAttempts=2
[Network]/DiscoveryTimeout=500
[Network]/DiscoveryTTL=1
[Network]/TcpKeepaliveTime=60
[Network]/TcpKeepaliveIntvl=10
[Network]/TcpKeepaliveProbes=6
sending UDP multicast datagram to 239.255.0.1/28015 ('<?xml version="1.0" encoding="UTF-8" ?><msg level="2" verb="4" step="8"></msg>')
reply from 239.255.0.1/28015 is '<?xml version="1.0" encoding="UTF-8" ?><msg level="2" verb="4" step="16"><network address="0.0.0.0" port="49532"/></msg>'
flom_client_connect: ret_cod=-104 (ERROR: 'connect' function returned an error condition)
I expected the second command to connect to the first host, but it returned an error, probably because of the address "0.0.0.0".
I also checked Use Cases #7 and #8 and they work without problems. In #7 and #8 the address in the reply is not 0.0.0.0 (e.g. 192.168.0.10), and the command from the second host connects to the first host.
What am I doing wrong? How should I invoke the command in the dynamic network mode on the second host, so that the command is executed without the error described above? Is there any Linux network configuration I need to change?
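The suspicion about "0.0.0.0" can be checked mechanically. The sketch below is not flom code: it assumes the discovery reply is well-formed XML of the shape `<msg ...><network address="..." port="..."/></msg>`, and the helper names `parse_discovery_reply` and `is_connectable` are made up for illustration. It extracts the advertised address and port and flags an empty or wildcard address, which a client on another host cannot use as a connect target.

```python
import ipaddress
import xml.etree.ElementTree as ET


def parse_discovery_reply(datagram: bytes) -> tuple:
    """Return (address, port) from the <network> element of a reply.

    Hypothetical helper: the element and attribute names are taken from
    the trace in this thread, not from flom's source.
    """
    root = ET.fromstring(datagram)
    network = root.find("network")
    if network is None:
        raise ValueError("reply has no <network> element")
    return network.get("address", ""), int(network.get("port", "0"))


def is_connectable(address: str) -> bool:
    """False for an empty or wildcard (unspecified) address."""
    if not address:
        return False
    try:
        return not ipaddress.ip_address(address).is_unspecified
    except ValueError:
        # Not an IP literal (for example a hostname): assume it is usable.
        return True


# Reply taken from the failing run described above.
reply = (b'<?xml version="1.0" encoding="UTF-8" ?>'
         b'<msg level="2" verb="4" step="16">'
         b'<network address="0.0.0.0" port="49532"/></msg>')

address, port = parse_discovery_reply(reply)
print(address, port, is_connectable(address))
```

With the reply from Use Cases #7 and #8 (an address such as 192.168.0.10) the same check would report the address as connectable.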
Last edit: tafit3 2014-11-13
There's no mistake on your side: you discovered a bug that is not covered by the case test suite. The tests run on a single system, and this bug shows up only with two distinct hosts. I suppose I introduced the bug after the 0.3.0 release: that's the release I used to write the wiki instructions you followed.
I committed a new development version on both SourceForge and GitHub git repositories:
https://sourceforge.net/p/flom/code/ci/1337eae274a8ad3ffaab89f555aa53867dd799ad/
https://github.com/tiian/flom/commit/1337eae274a8ad3ffaab89f555aa53867dd799ad
The fix is related to function "flom_accept_discover_reply" in file "src/flom_daemon.c", while the other changes are scaffolding.
The committed version is a pre-alpha version of the future 0.9.0 release, but it passes all the case tests and it should be quite usable.
Using this version you should obtain something like below:
tiian@mojan:~/src/flom$ flom -A 239.255.0.1 -d 0 -T /tmp/flom_client0.trc -V -- true
[Trace]/DaemonTraceFile='/tmp/flom-daemon.trc'
[Trace]/CommandTraceFile='/tmp/flom_client0.trc'
[Trace]/Verbose=1
[Resource]/Name='_RESOURCE'
[Resource]/Wait=1
[Resource]/Timeout=-1
[Resource]/Quantity=1
[Resource]/LockMode=5
[Resource]/Create=1
[Resource]/IdleLifespan=0
[Daemon]/SocketName=''
[Daemon]/Lifespan=0
[Daemon]/UnicastAddress=''
[Daemon]/UnicastPort=28015
[Daemon]/MulticastAddress='239.255.0.1'
[Daemon]/MulticastPort=28015
[Network]/DiscoveryAttempts=2
[Network]/DiscoveryTimeout=500
[Network]/DiscoveryTTL=1
[Network]/TcpKeepaliveTime=60
[Network]/TcpKeepaliveIntvl=10
[Network]/TcpKeepaliveProbes=6
sending UDP multicast datagram to 239.255.0.1/28015 ('<?xml version="1.0" encoding="UTF-8" ?><msg level="2" verb="4" step="8"></msg>')
reply from 239.255.0.1/28015 is '<?xml version="1.0" encoding="UTF-8" ?><msg level="2" verb="4" step="16"><network address="" port="36123"/></msg>'
You guessed correctly that the issue was related to address "0.0.0.0": the daemon must not reply with its own address if it is bound to INADDR_ANY.
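The rule can be sketched as follows. This is only an illustration under stated assumptions, not the code in "flom_accept_discover_reply": the function names are hypothetical, and the client-side fallback to the source IP of the reply datagram is an assumption that fits the empty `address=""` in the trace above but is not confirmed in this thread.

```python
import ipaddress


def _is_wildcard(address: str) -> bool:
    """True for "" or an unspecified address such as 0.0.0.0 or ::."""
    if not address:
        return True
    try:
        return ipaddress.ip_address(address).is_unspecified
    except ValueError:
        return False  # not an IP literal (e.g. a hostname)


def advertised_address(bound_address: str) -> str:
    """Daemon side: value for the reply's address attribute.

    A daemon bound to the wildcard address (INADDR_ANY) has no single
    address to advertise, so it advertises nothing.
    """
    return "" if _is_wildcard(bound_address) else bound_address


def connect_address(advertised: str, datagram_sender: str) -> str:
    """Client side (assumed behaviour): use the advertised address when
    it is usable, otherwise the IP the reply datagram came from."""
    return datagram_sender if _is_wildcard(advertised) else advertised


# Daemon bound to INADDR_ANY, reply received from 192.168.0.10 (example IP).
adv = advertised_address("0.0.0.0")
print(repr(adv), connect_address(adv, "192.168.0.10"))
```

Keeping the wildcard check in one helper means the daemon and the client agree on what counts as "no usable address", including a client that still receives "0.0.0.0" from an older daemon.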
Thank you for your valuable feedback; please tell me if the fix solves your issue.
Best Regards
Ch.F.
Last edit: Christian Ferrari 2014-11-13
It works now. However, I had a little trouble with the build. I cloned the repository from GitHub.
commit 5d59cb4cff11a13360cf3c882428929f94a20688
Author: Christian Ferrari <tiian@users.sourceforge.net>
Date: Sun Nov 16 22:37:41 2014 +0100
After executing "./configure" and "make" (as described in README file) I get:
Probably some Makefile dependency is missing. The workaround is:
cd src
make flom_errors.h
cd ..
make
Thanks for the fix.