libonvif Wiki

A client-side ONVIF library for Windows and Linux

Discovery

Using libonvif to discover cameras on the network

Discovery is the most common reason to use this library. It lets software find cameras on the network even when they obtain their IP addresses through DHCP, so their addresses are not known in advance. The end result of the discovery process is the RTSP string associated with a camera, which can be passed to another part of the program for decoding and display. libonvif can also control other camera functions, including resolution, frame rate and PTZ. These functions are discussed later in this document.

libonvif includes an example program, called libonvif_test.exe or libonvifdll_test.exe, that performs discovery. A detailed analysis of the program is provided below.

The discovery program starts by allocating memory for two data structures:

struct OnvifSession *onvif_session = (struct OnvifSession*)malloc(sizeof(struct OnvifSession));
struct OnvifData *onvif_data = (struct OnvifData*)malloc(sizeof(struct OnvifData));

If you are used to C++ programming, these calls might seem strange. libonvif is a C library, so you must allocate and release the memory for its data structures yourself, whereas C++ code usually leaves this to constructors, destructors and containers. The structures are defined in the onvif.h header file, so you can refer to that for more detail. The malloc calls shown above allocate memory on the heap so that each structure has a place to reside. Under this convention the programmer is responsible for releasing the memory once the structures are no longer needed. Failure to do so results in memory leaks, which may cause unusual behavior later during program execution. The memory is released with the free calls shown below.

free(onvif_data);
free(onvif_session);
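
The allocate, check and free pattern described above can be sketched in isolation. This is a minimal illustration, not libonvif code: struct DemoSession and both helper functions are hypothetical stand-ins, and in a real program the structure would be struct OnvifSession or struct OnvifData from onvif.h. The sketch uses calloc so the memory starts zeroed, and it shows where a failed allocation would be detected.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a library structure such as struct OnvifSession. */
struct DemoSession {
    char label[32];
    int id;
};

/* Allocate a zero-initialized DemoSession.
   Returns NULL if the system is out of memory, so callers must check. */
struct DemoSession *demo_session_new(void)
{
    return calloc(1, sizeof(struct DemoSession));
}

/* Release the structure and reset the caller's pointer to NULL,
   which makes an accidental second free harmless. */
void demo_session_free(struct DemoSession **session)
{
    if (session != NULL) {
        free(*session);
        *session = NULL;
    }
}
```

A caller would test the result of demo_session_new() against NULL before using it, and call demo_session_free(&session) on every exit path.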

In addition to the allocation of memory, a number of initializations must be performed before discovery can begin. Similar to the malloc and free calls shown above, the initializeSession and closeSession functions form a pair that performs these duties. The function pair is shown below. As with memory allocation, initializeSession must be called before any further processing, and closeSession is called when the routine is finished.

initializeSession(onvif_session);
closeSession(onvif_session);

initializeSession sets a GUID identifier that is sent out with the broadcast packet. The idea is that this will discriminate between multiple instances of the broadcast routine. It also sets discovery_msg_id. The discovery message is an XML string that identifies the broadcast packet as an ONVIF discovery call. The default value is 1, but there is a secondary message that may be used if a camera does not respond to the first one. This can happen in practice, as camera makers are not very strict about compliance with standards. The secondary message has nearly the same content in a slightly different format, which may elicit a response from cameras that are not fully compliant. On Windows, the network stack is initialized as well. The XML parser included with libxml2 is also initialized, although it is unclear whether this is strictly necessary. There was a time when libxml2 was considered unsafe with respect to memory leaks and re-entrancy, although these problems seem to have been fixed. The closeSession call performs the corresponding cleanup for each of these steps.
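
The text does not describe how libonvif builds its GUID, so the following is only a general sketch of how an identifier of this kind can be formatted. The function name format_uuid_v4 is hypothetical, and the caller is assumed to supply 16 random bytes from whatever source is appropriate. The function sets the version and variant bits defined by RFC 4122 and writes the usual 8-4-4-4-12 hexadecimal form.

```c
#include <stddef.h>
#include <stdio.h>

/* Format 16 caller-supplied bytes as a version 4 UUID string.
   The output needs 37 bytes: 36 characters plus the terminator.
   Returns 0 on success, -1 on invalid arguments. */
int format_uuid_v4(const unsigned char in[16], char *out, size_t out_size)
{
    unsigned char b[16];
    size_t i;

    if (in == NULL || out == NULL || out_size < 37)
        return -1;

    for (i = 0; i < 16; i++)
        b[i] = in[i];

    b[6] = (unsigned char)((b[6] & 0x0F) | 0x40); /* version 4 */
    b[8] = (unsigned char)((b[8] & 0x3F) | 0x80); /* RFC 4122 variant */

    snprintf(out, out_size,
             "%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-"
             "%02x%02x%02x%02x%02x%02x",
             b[0], b[1], b[2], b[3], b[4], b[5], b[6], b[7],
             b[8], b[9], b[10], b[11], b[12], b[13], b[14], b[15]);
    return 0;
}
```

Because each program instance starts from different random bytes, responses carrying one instance's identifier can be told apart from those meant for another.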

Once everything has been set up, the program can send a broadcast message over UDP to the local network. The call is shown below:

int number_of_cameras = broadcast(onvif_session);

This call is a higher-level abstraction over a number of steps in the code. Depending on the operating system, libonvif initializes the broadcast socket appropriately, sends out the UDP broadcast packet and processes the responses from compliant devices on the network. Each response packet contains basic information about the camera and the data required for further communication. If you look through the source code, you will see that the packets from all devices are collected first, before any processing is done on the individual packets. This maximizes the chance of collecting every packet, because the responses arrive in a quick burst. The packets are stored in the OnvifSession data structure.
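
The collect-first, process-later idea can be sketched with a small fixed-size store. Everything here is hypothetical and chosen for illustration: struct ResponseStore, its sizes and the two functions are not the layout used by OnvifSession. The point is that receiving a packet only costs one bounded copy, so the receive loop can return to the socket quickly.

```c
#include <string.h>

#define MAX_RESPONSES 16      /* illustrative capacity */
#define MAX_RESPONSE_LEN 1024 /* illustrative packet size limit */

/* Hypothetical store for raw response packets. */
struct ResponseStore {
    char buf[MAX_RESPONSES][MAX_RESPONSE_LEN];
    int len[MAX_RESPONSES];
    int count;
};

void response_store_init(struct ResponseStore *store)
{
    store->count = 0;
}

/* Copy one raw packet into the next free slot.
   Returns the slot index, or -1 if the store is full
   or the packet does not fit (one byte is kept for the terminator). */
int response_store_add(struct ResponseStore *store, const char *data, int data_len)
{
    int slot;

    if (data == NULL || data_len < 0 || data_len >= MAX_RESPONSE_LEN)
        return -1;
    if (store->count >= MAX_RESPONSES)
        return -1;

    slot = store->count++;
    memcpy(store->buf[slot], data, (size_t)data_len);
    store->buf[slot][data_len] = '\0';
    store->len[slot] = data_len;
    return slot;
}
```

After the receive loop times out, the caller walks slots 0 to count - 1 and parses each packet at leisure.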

Note that, for various reasons, cameras may fail to respond to the UDP broadcast. It may be useful to repeat the broadcast a number of times and filter the responses for duplicates to ensure that as many cameras as possible are discovered. If you choose this approach, bear in mind that you don't want to flood the network with broadcast packets, as this may exacerbate the problem. A common strategy is to wait for increasing intervals between re-broadcasts, which gives the network time to clear and gives the cameras time to recover from whatever may be causing them to lock up. For example, wait one second after the first broadcast before sending the next, then wait three seconds before re-broadcasting again. You might also try sending the secondary message (onvif_session->discovery_msg_id = 2) during a re-broadcast to capture non-compliant cameras.
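
The two pieces of that strategy, an increasing wait and a duplicate filter, can be sketched as follows. All names here are hypothetical. The delay formula is one assumption that reproduces the one second, three second example from the text, and using an address string as the duplicate key is also an assumption; any field that uniquely identifies a camera would do.

```c
#include <string.h>

#define MAX_CAMERAS 64 /* illustrative capacity */
#define ADDR_LEN 128   /* illustrative key length limit */

/* Hypothetical list of camera keys already seen. */
struct SeenList {
    char addr[MAX_CAMERAS][ADDR_LEN];
    int count;
};

/* Seconds to wait before re-broadcast number `attempt` (0-based):
   1, 3, 5, ... so each pause is longer than the last. */
int rebroadcast_delay_seconds(int attempt)
{
    if (attempt < 0)
        attempt = 0;
    return 2 * attempt + 1;
}

/* Record a camera key.
   Returns 1 if it is new, 0 if it was already seen,
   -1 if the key is too long or the list is full. */
int seen_list_add(struct SeenList *list, const char *addr)
{
    int i;

    if (addr == NULL || strlen(addr) >= ADDR_LEN)
        return -1;
    for (i = 0; i < list->count; i++) {
        if (strcmp(list->addr[i], addr) == 0)
            return 0;
    }
    if (list->count >= MAX_CAMERAS)
        return -1;
    strcpy(list->addr[list->count++], addr);
    return 1;
}
```

In a discovery loop, the program would call broadcast, pass each response's key to seen_list_add, handle only the cameras that return 1, then sleep for rebroadcast_delay_seconds(attempt) before the next round. A small cap on the number of rounds keeps the total traffic bounded.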

The broadcast function returns the number of responses received, which serves as the loop bound in the next step.

for (int i = 0; i < number_of_cameras; i++) {
    /* Fill onvif_data from the i-th discovery response */
    prepareOnvifData(i, onvif_session, onvif_data);

    /* Show the camera name and prompt for credentials */
    fprintf(stdout, "%s\n", onvif_data->camera_name);
    fprintf(stdout, "enter username:");
    fgets(onvif_data->username, 128, stdin);
    fprintf(stdout, "enter password:");
    fgets(onvif_data->password, 128, stdin);

    /* Strip the trailing newline that fgets leaves in the buffer */
    onvif_data->username[strcspn(onvif_data->username, "\r\n")] = 0;
    onvif_data->password[strcspn(onvif_data->password, "\r\n")] = 0;

    /* Log in and request the RTSP string; 0 indicates success */
    if (fillRTSP(onvif_data) == 0)
        fprintf(stdout, "%s\n", onvif_data->stream_uri);
    else
        fprintf(stderr, "Error getting camera uri - %s\n", onvif_data->last_error);
}

The loop iterates through the messages stored in OnvifSession from the responses to the broadcast. The call to prepareOnvifData takes the ith message in OnvifSession and uses it to fill an OnvifData structure with information about the camera. It clears the OnvifData structure before populating it, so there is no need to use a fresh structure for each call, although that may be desirable depending on your application. This data is available before credentials are established, and it supplies information needed to log into the camera, which is why the call is made before login. Of note in this data is time_offset, which is used during login to establish that the credentials are current. Some cameras are very strict about this parameter because it helps prevent replay attacks. This can cause problems around Daylight Saving Time, as some cameras that strictly enforce the time_offset requirement handle Daylight Saving Time imperfectly. If there is a login problem during the transition to or from Daylight Saving Time, it may be necessary to adjust the camera's time settings.
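
The text does not spell out how libonvif applies time_offset, so the following is a general sketch of the idea rather than the library's method. It assumes the offset is the camera clock minus the local clock, in seconds; the function name format_camera_time is hypothetical. The result is a UTC timestamp of the kind carried with login credentials, so that it matches the camera's notion of the current time.

```c
#include <stddef.h>
#include <time.h>

/* Write (local_now + offset_seconds) as a UTC timestamp,
   "YYYY-MM-DDTHH:MM:SSZ", which needs a 21-byte buffer.
   Assumption: offset_seconds is camera time minus local time.
   Returns 0 on success, -1 if conversion fails or the buffer is too small. */
int format_camera_time(time_t local_now, long offset_seconds,
                       char *out, size_t out_size)
{
    time_t camera_now = local_now + (time_t)offset_seconds;
    struct tm *utc = gmtime(&camera_now); /* note: gmtime is not thread-safe */

    if (out == NULL || utc == NULL)
        return -1;
    if (strftime(out, out_size, "%Y-%m-%dT%H:%M:%SZ", utc) == 0)
        return -1;
    return 0;
}
```

Working in UTC keeps the local time zone out of the calculation. A camera that reports its own time with a wrong Daylight Saving adjustment will still produce an offset that is wrong by about an hour, which is consistent with the login failures described above.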

Following the prepareOnvifData call, the user credentials should be collected and used by fillRTSP to sign onto the camera and get the RTSP string. This call follows the convention used for most post-login camera operations: the calling program populates the OnvifData structure with the necessary information, passes a pointer to it, and the function fills in the information supplied by the camera. If the call is successful, the function returns 0. Otherwise, onvif_data->last_error contains the error message. In this case, the most likely cause of failure is an unauthorized login attempt. Most cameras use the ONVIF standard Unauthorized message, but this is not always the case, so be prepared for other error text depending on the camera.
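
The credential prompts in the loop combine fgets with strcspn to drop the line ending. That step can be wrapped in a small helper that also reports end of input, which the example loop does not check. The helper name read_trimmed_line is hypothetical and it is not part of libonvif.

```c
#include <stdio.h>
#include <string.h>

/* Read one line from `in` into buf and strip any trailing CR or LF.
   `size` is the buffer size and is assumed to fit in an int.
   Returns 0 on success, -1 on end of input, read error or bad arguments.
   On failure the buffer is left as an empty string. */
int read_trimmed_line(FILE *in, char *buf, size_t size)
{
    if (in == NULL || buf == NULL || size == 0)
        return -1;
    if (fgets(buf, (int)size, in) == NULL) {
        buf[0] = '\0';
        return -1;
    }
    /* strcspn returns the index of the first CR or LF,
       or the string length if neither is present */
    buf[strcspn(buf, "\r\n")] = '\0';
    return 0;
}
```

With this helper, the prompts become read_trimmed_line(stdin, onvif_data->username, 128) and the same for the password, and a return value of -1 tells the program to stop prompting instead of attempting a login with stale buffer contents.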