Note: This manual is for JUNG 2.0 and later. Many things are different in JUNG 2.0 than they were in JUNG 1.x, and we don't address most of those differences here. If you want some basic information on porting 1.x code to 2.0, start here: JUNG 1.0 -> 2.0 Class Migration Guide.
In case this wasn't obvious, this manual is a work in progress. If you notice something missing that you need soon, or if you think you've spotted a mistake or a bug, please contact us and we'll do our best to fix it.
Introduction and Overview
JUNG (the Java Universal Network/Graph Framework) is a software library that provides a common and extensible language for the modeling, analysis, and visualization of data that can be represented as a graph or network. It is written in Java, which allows JUNG-based applications to make use of the extensive built-in capabilities of the Java API, as well as those of other existing third-party Java libraries.
The JUNG architecture is designed to support a variety of representations of entities and their relations, such as directed and undirected graphs, multi-modal graphs, graphs with parallel edges, and hypergraphs. It provides a mechanism for annotating graphs, entities, and relations with metadata. This facilitates the creation of analytic tools for complex data sets that can examine the relations between entities as well as the metadata attached to each entity and relation.
JUNG includes implementations of a number of algorithms from graph theory, data mining, and social network analysis, including clustering, filtering, random graph generation, blockmodeling, calculation of network distances and flows, and a wide variety of metrics (PageRank, HITS, betweenness, closeness, etc.).
JUNG also provides a visualization framework that makes it easy to construct tools for the interactive exploration of network data. Users can use one of the layout algorithms provided, or use the framework to create their own custom layouts.
As an open-source library, JUNG provides a common framework for graph/network analysis and visualization. We hope that JUNG will make it easier for those who work with graph and network data to make use of one another's development efforts, and thus avoid continually re-inventing the wheel.
JUNG requires the following:
- Java 1.5 or later
- Larvalabs' generics version of Apache's Commons Collections libraries: http://larvalabs.com/collections/ (referred to hereafter as the commons-collections library)
- The CERN Colt libraries (for some algorithms and I/O operations): http://www-itg.lbl.gov/~hoschek/colt/ (referred to later as Colt)
These third-party libraries are included in the JUNG distribution.
Graphs, Vertices, and Edges
The basic JUNG type is the graph. Graphs are defined by the interfaces Hypergraph, Graph, Forest, Tree, and KPartiteGraph.
Vertex and Edge Types and Identities
JUNG graphs are analogous to Java collections (such as List, Set, Map, and so on). Just as collections may specify the type of their elements in the declaration (e.g., Set<Integer> or Map<String, YourClass>), JUNG graph declarations may specify the type of each of their element categories, that is, vertices and edges. The graph implementation is generally assumed to be responsible for maintaining the topology of the graph (how vertices and edges are connected to each other); the vertex and edge objects are essentially treated as keys into the graph's internal data structure. This has a couple of practical consequences:
- Vertex and edge objects must be unique within a graph: there cannot be two vertices, or two edges, such that vertex1.equals(vertex2), or edge1.equals(edge2). In this sense the vertex and edge collections have Set semantics, although the internal implementations need not use Set and the collections are not exposed as Sets.
- Vertex and edge objects can be elements of multiple graphs.
This design is one of the major departures from JUNG 1.x, in which vertices and edges (a) were required to implement specific interfaces, (b) maintained most of the graph topology information, and consequently (c) could inhabit only one graph. That arrangement created complicated dependencies among graph, vertex, and edge types; those dependencies no longer exist in the JUNG 2.0 API.
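To make the "objects as keys" idea concrete, here is a minimal sketch. TinyGraph is a hypothetical stand-in written for this illustration, not a JUNG class; it only mimics the behavior described above, in which the graph owns the topology and vertex and edge objects are plain keys.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a graph implementation; NOT part of the JUNG API.
// The graph owns the topology; vertices and edges are just keys into its maps.
class TinyGraph<V, E> {
    private final Map<V, Set<E>> incident = new LinkedHashMap<>();   // vertex -> incident edges
    private final Map<E, List<V>> endpoints = new LinkedHashMap<>(); // edge -> its two endpoints

    // Returns false if a vertex equal to v is already present (Set semantics).
    boolean addVertex(V v) {
        if (incident.containsKey(v)) {
            return false;
        }
        incident.put(v, new HashSet<>());
        return true;
    }

    // Returns false, and changes nothing, if an edge equal to e is already present.
    boolean addEdge(E e, V a, V b) {
        if (endpoints.containsKey(e)) {
            return false;
        }
        addVertex(a);
        addVertex(b);
        endpoints.put(e, Arrays.asList(a, b));
        incident.get(a).add(e);
        incident.get(b).add(e);
        return true;
    }

    int vertexCount() { return incident.size(); }
    int edgeCount() { return endpoints.size(); }
}

class VertexIdentityDemo {
    public static void main(String[] args) {
        TinyGraph<String, Integer> coauthors = new TinyGraph<>();
        TinyGraph<String, Integer> citations = new TinyGraph<>();
        String alice = "alice";

        System.out.println(coauthors.addVertex(alice));               // true
        System.out.println(coauthors.addVertex(new String("alice"))); // false: equals() an existing vertex
        System.out.println(citations.addVertex(alice));               // true: same object, second graph
        System.out.println(coauthors.addEdge(1, "alice", "bob"));     // true
        System.out.println(coauthors.addEdge(1, "bob", "carol"));     // false: edge 1 already exists
        System.out.println(coauthors.vertexCount());                  // 2
    }
}
```

Because identity is decided by equals() and hashCode(), any vertex or edge class you write yourself should implement both consistently.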
If performance (speed, space, or both) is critical, then you may be able to increase the performance by creating a graph implementation that depends on specific vertex and edge types.
Many of JUNG's methods require the user to specify an association between a graph element (vertex or edge) and data of some sort: label text, edge weight, color, etc. By convention this is generally done via a Transformer.
Transformer<I,O> is a commons-collections interface with a single method transform(I input) that returns some object of the output type (O) for any input. This essentially defines a relationship between elements of the input and output types. In a sense, it's something like Map, except that it's much more lightweight and is read-only.
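As a rough illustration of the shape of this interface, the sketch below declares a local stand-in with the same single method, so that it compiles without the commons-collections jar on the classpath. In real code you would import the library's interface instead; the class and field names here (VertexLabelDemo, LABELER) are made up for the example.

```java
// Local stand-in mirroring the single-method shape described above.
// In real code, import the commons-collections interface instead.
interface Transformer<I, O> {
    O transform(I input);
}

class VertexLabelDemo {
    // Associates each Integer vertex with a text label, computed on request.
    static final Transformer<Integer, String> LABELER =
        new Transformer<Integer, String>() {
            public String transform(Integer vertex) {
                return "v" + vertex;
            }
        };

    public static void main(String[] args) {
        System.out.println(LABELER.transform(7)); // prints "v7"
    }
}
```

The anonymous-class form works on Java 1.5. On Java 8 or later, a single-method interface like this one can also be implemented with a lambda such as vertex -> "v" + vertex.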
There are a few different ways to write one of these Transformers. This applies to any place that you're asked to provide a Transformer.
(1) Constant value. There's actually a ConstantTransformer class (in commons-collections) for this purpose. This is useful for situations in which you're asked for a Transformer but in fact all elements should get the same value (e.g. providing an edge weight when it's an unweighted graph).
(2) Map-backed: either a new map or one based on an existing Map. There's a MapTransformer class (again, commons-collections) that handles this case. This is useful when you have about as many distinct values as elements, or when there's no obvious pattern that relates elements to values/outputs.
(a) new map: for each element, you create a (element, value) pair in a Map. If the values don't relate to anything else, this may be appropriate, although that's probably pretty rare.
(b) an existing Map: often you'll have an existing lookup table that does what you need it to (see the note below); no need to create a new one.
(3) Element instance variable-backed. This is much the same as Map-backed but with a different storage model.
(4) Based on an on-the-fly function call (calculation, status report, etc.).
(5) Combinations or variants of the above, e.g. transformers that use picking information to determine which of two colors to use.
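The first four styles in the list above might look something like the following sketch. The Transformer interface is again a local stand-in, Road is a hypothetical edge type, and the helper methods constant and fromMap are written out by hand only to keep the example self-contained; they play the role of the ConstantTransformer and MapTransformer classes mentioned above. Lambdas are used for brevity, which assumes Java 8 or later.

```java
import java.util.HashMap;
import java.util.Map;

// Local stand-in for the commons-collections interface.
interface Transformer<I, O> {
    O transform(I input);
}

// Hypothetical edge type that stores its own length (style 3).
class Road {
    final String name;
    final double km;

    Road(String name, double km) {
        this.name = name;
        this.km = km;
    }
}

class TransformerStyles {
    // (1) Constant value: every element gets the same output.
    static <I, O> Transformer<I, O> constant(O value) {
        return input -> value;
    }

    // (2) Map-backed: look the element up in a new or existing Map.
    static <I, O> Transformer<I, O> fromMap(Map<I, O> map) {
        return map::get;
    }

    // (3) Instance-variable-backed: read a field of the element itself.
    static final Transformer<Road, Double> LENGTH = road -> road.km;

    // (4) On-the-fly calculation: travel time at a given speed.
    static Transformer<Road, Double> hoursAt(double kmPerHour) {
        return road -> road.km / kmPerHour;
    }

    public static void main(String[] args) {
        Road a1 = new Road("A1", 120.0);

        Transformer<Road, Integer> unweighted = constant(1);
        System.out.println(unweighted.transform(a1));  // 1

        Map<Road, String> labels = new HashMap<>();
        labels.put(a1, "motorway");
        System.out.println(fromMap(labels).transform(a1)); // motorway

        System.out.println(LENGTH.transform(a1));      // 120.0
        System.out.println(hoursAt(60).transform(a1)); // 2.0
    }
}
```

Whichever style you choose, code that consumes the Transformer sees only transform(), so you can switch from one style to another without touching that code.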
Note that in any of these cases, the transformer can take a process (map, instance variable, function call) which outputs something other than what you want (e.g., a floating-point value) and translate it to a value of the appropriate type (e.g., a Paint). For example, let's suppose that you want to paint vertices red if they have high PageRank, yellow if they have moderate PageRank, and black if they have low PageRank. You can easily construct a Transformer class that takes the PageRank data (probably itself provided by a Transformer which you provide to the constructor), figures out which of three intervals you want, and then outputs an appropriate color when you give it a vertex. Taking this a step further, it would even be pretty easy to write a threshold-based general Transformer that would take a Transformer from threshold values to colors as its constructor parameter.
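One possible shape for such a threshold-based Transformer is sketched below. It is a variation on the idea rather than a JUNG class: the thresholds are passed as a Map from lower bounds to outputs instead of as a second Transformer, the score values are invented, and the outputs are color names so that the sketch runs without a display. In a visualization the output type would be a Paint such as java.awt.Color.RED. The Transformer interface is a local stand-in, as before.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Local stand-in for the commons-collections interface.
interface Transformer<I, O> {
    O transform(I input);
}

// Buckets a numeric score into bands and returns the output attached to the band.
class ThresholdTransformer<V, O> implements Transformer<V, O> {
    private final Transformer<V, Double> score;
    private final TreeMap<Double, O> bands; // lower bound of band -> output

    ThresholdTransformer(Transformer<V, Double> score, Map<Double, O> bands) {
        this.score = score;
        this.bands = new TreeMap<>(bands);
    }

    @Override
    public O transform(V element) {
        // floorEntry finds the band with the greatest lower bound <= the score.
        Map.Entry<Double, O> band = bands.floorEntry(score.transform(element));
        if (band == null) {
            throw new IllegalArgumentException("score is below the lowest band");
        }
        return band.getValue();
    }
}

class PageRankColorDemo {
    static Transformer<String, String> build() {
        // Made-up PageRank-like scores, standing in for a real scorer.
        Map<String, Double> rank = new HashMap<>();
        rank.put("a", 0.30);
        rank.put("b", 0.10);
        rank.put("c", 0.02);

        Map<Double, String> bands = new HashMap<>();
        bands.put(0.00, "black");  // low
        bands.put(0.05, "yellow"); // moderate
        bands.put(0.20, "red");    // high

        return new ThresholdTransformer<String, String>(rank::get, bands);
    }

    public static void main(String[] args) {
        Transformer<String, String> color = build();
        System.out.println(color.transform("a")); // red
        System.out.println(color.transform("b")); // yellow
        System.out.println(color.transform("c")); // black
    }
}
```

The class knows nothing about PageRank or colors; both arrive through the constructor, so the same class could map edge weights to stroke widths or activity levels to icons.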
This is really the key insight about using Transformers in JUNG: we're trying to use them in a way that means that you have as little work to do as possible in order to, say, run an algorithm where edge weights are based on the number of papers coauthored by the incident vertices (representing authors), or create a visualization for which vertex color is a function of activity level.
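The coauthorship case could be sketched roughly as follows. The two maps (edge to endpoints, author to papers) are hypothetical lookup tables invented for this example; in a real application the endpoints would more likely be obtained from the graph itself. The Transformer interface is a local stand-in, and the lambda assumes Java 8 or later.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Local stand-in for the commons-collections interface.
interface Transformer<I, O> {
    O transform(I input);
}

class CoauthorWeights {
    // Edge weight = number of papers shared by the two authors the edge connects.
    // Nothing is stored per edge; the weight is computed when it is asked for.
    static Transformer<String, Integer> sharedPapers(
            Map<String, String[]> endpoints,
            Map<String, Set<String>> papersByAuthor) {
        return edge -> {
            String[] pair = endpoints.get(edge);
            Set<String> shared = new HashSet<>(papersByAuthor.get(pair[0]));
            shared.retainAll(papersByAuthor.get(pair[1]));
            return shared.size();
        };
    }

    static Transformer<String, Integer> example() {
        Map<String, String[]> endpoints = new HashMap<>();
        endpoints.put("e1", new String[] {"ann", "bo"});

        Map<String, Set<String>> papers = new HashMap<>();
        papers.put("ann", new HashSet<>(Arrays.asList("p1", "p2", "p3")));
        papers.put("bo", new HashSet<>(Arrays.asList("p2", "p3", "p4")));

        return sharedPapers(endpoints, papers);
    }

    public static void main(String[] args) {
        System.out.println(example().transform("e1")); // 2
    }
}
```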
Input and Output
Appendix: How to Build JUNG
This is a brief intro to building JUNG jars with maven2 (the build system that JUNG currently uses).
First, ensure that you have a JDK of at least version 1.5: JUNG 2.0+ requires Java 1.5+. Ensure that your JAVA_HOME variable is set to the location of the JDK. On a Windows platform, you may have a separate JRE (Java Runtime Environment) and JDK (Java Development Kit). The JRE has no capability to compile Java source files, so you must have a JDK installed. If your JAVA_HOME variable is set to the location of the JRE, and not the location of the JDK, you will be unable to compile.
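If you are not sure whether a given Java installation is a JDK or only a JRE, one way to check is sketched below. It relies on the standard javax.tools API, which was added in Java 6 (so it is not available on Java 1.5), and it inspects the JVM that runs the program, which is not necessarily the one that JAVA_HOME points to. The class name JdkCheck is made up for the example.

```java
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

class JdkCheck {
    // Reports whether the running JVM ships with a Java compiler.
    static String describe() {
        // getSystemJavaCompiler() returns null when no compiler is available,
        // which is what you would expect from a JRE-only installation.
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        String kind = (compiler != null) ? "compiler available" : "no compiler found";
        return kind
            + "; java.home=" + System.getProperty("java.home")
            + "; JAVA_HOME=" + System.getenv("JAVA_HOME");
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

If the two paths in the output differ, Maven may be using a different installation from the one you just tested.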
Download and install maven from maven.apache.org:
At time of writing (June 2012), the latest version was maven-3.0.4. You should generally use the latest version of Maven.
Follow the installation instructions on the Maven website, then confirm a successful installation by typing 'mvn --version' in a command terminal window.
Get the JUNG code from CVS:
If you are a developer, do this:
cvs -z3 -d:ext:email@example.com:/cvsroot/jung co -P jung2
If you are a user, do this:
cvs -z3 -d:pserver:firstname.lastname@example.org:/cvsroot/jung co -P jung2
If you're unable (or unwilling) to use CVS from a command-line console, see the Eclipse-based instructions below.
cd jung2
mvn install
This should build the sub-projects and run unit tests. During the build process, maven downloads code it needs from maven repositories. The code is cached in your local repository that maven creates in your home directory ($HOME/.m2/repository). If the download of something is interrupted, the build may fail. If so, just run it again (and again) and it should eventually succeed. Once all the files are cached in your local maven repository, the build process will be faster.
Prepare JUNG for Eclipse
(This step is only relevant if you use Eclipse as your IDE, of course.)
To prepare jung2 for eclipse, run the following maven command from the jung2 directory:
mvn eclipse:eclipse
which will generate the .classpath and .project files for eclipse.
The .classpath file will make reference to a M2_REPO variable, which you must define in eclipse so that M2_REPO points to your local repository. You can do that in eclipse by bringing up project properties and adding the variable M2_REPO, or you can run the following command to have maven set the variable for you:
mvn -Declipse.workspace=<path-to-eclipse-workspace> eclipse:add-maven-repo
If that does not work, you'll need to open the properties of one of the projects and use the 'add variables' button in the 'libraries' tab.
To load JUNG in eclipse, you need to overcome an eclipse limitation: Eclipse projects cannot contain subprojects. (JUNG currently contains 6 sub-projects.) The common work-around is to make eclipse think that each sub-project is a top-level project.
The most common way to proceed is as follows:
Add each subproject (jung-api, jung-graph-impl, jung-visualization, jung-algorithms, jung-samples, jung-io) as a top-level project in eclipse, each with its own classpath dependencies.
One approach is to use the eclipse feature for importing existing projects AFTER mvn eclipse:eclipse has been run as shown above. Simply point the eclipse import project file chooser to the jung2 directory, then check off the list of subprojects that are shown. You can import all of the subprojects at once this way.
Another approach is to manually add each subproject as follows:
In the 'New Project' dialog, select 'Java Project', then 'Create project from existing source'. Create the new project to point to where you downloaded jung2 and its subprojects. For example, you would create a new project from the existing source in '/where/it/is/jung2/jung-api' and name that project 'jung-api'.
Because you previously ran mvn eclipse:eclipse at the jung2 directory level, the projects will already reference the other projects they depend on (instead of the jars from those projects).
You do not want to use jung2 (the parent project) as the eclipse project, as each eclipse project can have only one classpath, and you would then have difficulty maintaining the correct dependencies between the sub-projects.
Checking JUNG Out Using Eclipse
If you are unable to use cvs from the command prompt, you may check out jung2 using eclipse. However, because eclipse cannot manage nested projects (the limitation described above), you must use the following trick:
Create a new workspace that you will be using only to check out the project. You will not be using this workspace to work on the project. Let's call the new workspace $HOME/checkout_base. From that workspace, use eclipse to check out jung2 from cvs. Next, open a command prompt console and change directory to the newly created $HOME/checkout_base/jung2. Execute this command:
mvn eclipse:eclipse
That will build the eclipse artifacts.
Next, change eclipse to point to a different workspace, one that you will actually be working in. Use the above instructions to import the jung2 subprojects from $HOME/checkout_base/jung2 into your real workspace.
Running JUNG Sample Code
Once you have built everything (see the preceding instructions), here is a straightforward way to run some demos from the command line:
(NOTE: you may need to change the version part of the jar names below. It could be jung-samples-2.0.1.jar for example. Look at the actual jar file names to see.)
tar xvf jung-samples-2.0-dependencies.tar
java -cp jung-samples-2.0.jar samples.graph.VertexImageShaperDemo
java -cp jung-samples-2.0.jar samples.graph.SatelliteViewDemo
java -cp jung-samples-2.0.jar samples.graph.ShowLayouts
The jung-samples-dependencies.tar file contains all of the jar dependencies for the jung-samples project. It was created as part of the maven build process.