From: <sta...@us...> - 2007-03-20 00:51:03
Revision: 1593
          http://archive-access.svn.sourceforge.net/archive-access/?rev=1593&view=rev
Author:   stack-sf
Date:     2007-03-19 17:36:23 -0700 (Mon, 19 Mar 2007)

Log Message:
-----------
More on moving nutchwax to maven2.

* projects/nutchwax/src/plugin/index-wax/plugin.xml
* projects/nutchwax/src/plugin/index-wax/lib/archive-commons-1.12.0.jar
Updated plugin.
* projects/nutchwax/src/plugin/index-wax/lib/archive-commons-1.11.0-200702160009.jar
Removed.
* projects/nutchwax/src/plugin/build-plugin.xml
Call from maven2. Pass the maven2 dependencies to javac as
extra classpath argument.
* projects/nutchwax/src/plugin/build.xml
Replaced by call from maven2.
* projects/nutchwax/nutchwax-core/pom.xml
Moved generation of sources to the nutchwax-thirdparty module.
Make this module dependent on nutchwax-thirdparty module.
* projects/nutchwax/pom.xml
Set scope on dependencies. Make update of released plugins
daily instead of always. Add in new modules thirdparty and plugins.
* projects/nutchwax/build.xml
Remove lib dir references (It's been removed).
* projects/nutchwax/nutchwax-job/src/main/assembly/assemble-job.xml
Assemble a job jar (We used to do this in an ant file).
* projects/nutchwax/nutchwax-job/pom.xml
Renamed assembler as assemble-job... It used to be just a placeholder
but now we no longer need to do the copy-from-parent hack that we
used to rely on.

Modified Paths:
--------------
    trunk/archive-access/projects/nutchwax/build.xml
    trunk/archive-access/projects/nutchwax/nutchwax-core/pom.xml
    trunk/archive-access/projects/nutchwax/nutchwax-job/pom.xml
    trunk/archive-access/projects/nutchwax/nutchwax-job/src/main/assembly/assemble-job.xml
    trunk/archive-access/projects/nutchwax/pom.xml
    trunk/archive-access/projects/nutchwax/src/plugin/build-plugin.xml
    trunk/archive-access/projects/nutchwax/src/plugin/index-wax/plugin.xml

Added Paths:
-----------
    trunk/archive-access/projects/nutchwax/src/plugin/index-wax/lib/archive-commons-1.12.0.jar

Removed Paths:
-------------
    trunk/archive-access/projects/nutchwax/src/plugin/build.xml
    trunk/archive-access/projects/nutchwax/src/plugin/index-wax/lib/archive-commons-1.11.0-200702160009.jar

Modified: trunk/archive-access/projects/nutchwax/build.xml
===================================================================
--- trunk/archive-access/projects/nutchwax/build.xml 2007-03-20 00:29:13 UTC (rev 1592)
+++ trunk/archive-access/projects/nutchwax/build.xml 2007-03-20 00:36:23 UTC (rev 1593)
@@ -38,13 +38,10 @@
   <property name="build.encoding" value="ISO-8859-1"/>
-  <fileset id="lib.jars" dir="${root}" includes="lib/*.jar"/>
-
   <!-- the normal classpath -->
   <path id="classpath">
     <pathelement location="${build.classes}"/>
     <pathelement location="${nutch.root}/build/classes"/>
-    <fileset refid="lib.jars"/>
     <fileset dir="${nutch.root}/lib">
       <include name="*.jar" />
     </fileset>
@@ -146,7 +143,9 @@
       <zipfileset file="${nutch.root}/conf/nutch-default.xml"/>
       <zipfileset file="${nutch.root}/conf/common-terms.utf8"/>
       <zipfileset prefix="bin" file="${basedir}/src/plugin/parse-waxext/bin/parse-pdf.sh" filemode="555"/>
-      <zipfileset refid="lib.jars"/>
+      <!--<zipfileset refid="lib.jars"/>
+      -->
+
       <!--Include all class files both nutch and nutchwax at top level so all needed to launch a job using the 'hadoop jar nutchwax.jobs' is on the classpath (Only classes that are at top-level in a jar can

Modified: trunk/archive-access/projects/nutchwax/nutchwax-core/pom.xml
===================================================================
--- trunk/archive-access/projects/nutchwax/nutchwax-core/pom.xml 2007-03-20 00:29:13 UTC (rev 1592)
+++ trunk/archive-access/projects/nutchwax/nutchwax-core/pom.xml 2007-03-20 00:36:23 UTC (rev 1593)
@@ -19,39 +19,22 @@
         <configuration>
           <source>1.5</source>
           <target>1.5</target>
+          <!--
+          <compilerArgument> -verbose -classpath ../third-party/nutch/build/classes</compilerArgument>
+          -->
         </configuration>
       </plugin>
-      <plugin>
-        <artifactId>maven-antrun-plugin</artifactId>
-        <executions>
-          <execution>
-            <id>antrun.generate.sources</id>
-            <phase>generate-sources</phase>
-            <configuration>
-              <tasks>
-                <!-- Make these conditional so do not run everytime-->
-                <echo>Compiling third.party dependencies as part of generate-sources</echo>
-                <ant dir=".." target="third.party.jar"/>
-              </tasks>
-            </configuration>
-            <goals>
-              <goal>run</goal>
-            </goals>
-          </execution>
-          <execution>
-            <id>antrun.clean</id>
-            <phase>clean</phase>
-            <configuration>
-              <tasks>
-                <ant dir=".." target="clean-all"/>
-              </tasks>
-            </configuration>
-            <goals>
-              <goal>run</goal>
-            </goals>
-          </execution>
-        </executions>
-      </plugin>
     </plugins>
   </build>
+  <!--Look for placeholder nutchwax-thirdparty jar
+  Means third-party sources have been compiled.
+  The jar itself is empty.
+  -->
+  <dependencies>
+    <dependency>
+      <groupId>org.archive.nutchwax</groupId>
+      <artifactId>nutchwax-thirdparty</artifactId>
+      <scope>compile</scope>
+    </dependency>
+  </dependencies>
 </project>

Modified: trunk/archive-access/projects/nutchwax/nutchwax-job/pom.xml
===================================================================
--- trunk/archive-access/projects/nutchwax/nutchwax-job/pom.xml 2007-03-20 00:29:13 UTC (rev 1592)
+++ trunk/archive-access/projects/nutchwax/nutchwax-job/pom.xml 2007-03-20 00:36:23 UTC (rev 1593)
@@ -21,6 +21,9 @@
   <modelVersion>4.0.0</modelVersion>
   <groupId>org.archive.nutchwax</groupId>
   <artifactId>nutchwax-job</artifactId>
+  <!--Below we attach the job jar to the pom production.
+  The 'attach'ed assembly generates the job jar.
+  -->
   <packaging>pom</packaging>
   <name>NutchWAX Job Jar</name>
   <build>
@@ -33,19 +36,17 @@
         <configuration>
           <descriptors>
             <descriptor>
-              src/main/assembly/placeholder.xml
+              src/main/assembly/assemble-job.xml
             </descriptor>
           </descriptors>
           <appendAssemblyId>
             false
           </appendAssemblyId>
-
           <archive>
             <manifest>
               <mainClass>org.archive.access.nutch.Nutchwax</mainClass>
             </manifest>
           </archive>
-
         </configuration>
         <executions>
           <execution>
@@ -57,25 +58,6 @@
           </execution>
         </executions>
       </plugin>
-      <plugin>
-        <artifactId>maven-antrun-plugin</artifactId>
-        <executions>
-          <execution>
-            <id>antrun.generate.sources</id>
-            <phase>generate-sources</phase>
-            <configuration>
-              <tasks>
-                <!-- Make these conditional so do not run everytime-->
-                <echo>Compiling third.party plugins as part of generate-sources</echo>
-                <ant dir=".." target="third.party.plugins"/>
-              </tasks>
-            </configuration>
-            <goals>
-              <goal>run</goal>
-            </goals>
-          </execution>
-        </executions>
-      </plugin>
     </plugins>
   </build>
   <dependencies>

Modified: trunk/archive-access/projects/nutchwax/nutchwax-job/src/main/assembly/assemble-job.xml
===================================================================
--- trunk/archive-access/projects/nutchwax/nutchwax-job/src/main/assembly/assemble-job.xml 2007-03-20 00:29:13 UTC (rev 1592)
+++ trunk/archive-access/projects/nutchwax/nutchwax-job/src/main/assembly/assemble-job.xml 2007-03-20 00:36:23 UTC (rev 1593)
@@ -6,14 +6,92 @@
   <includeBaseDirectory>false</includeBaseDirectory>
   <fileSets>
     <fileSet>
-      <directory>target/classes</directory>
+      <directory>../target/wax-plugins</directory>
+      <outputDirectory>/wax-plugins</outputDirectory>
+    </fileSet>
+    <fileSet>
+      <directory>../src/plugin/parse-waxext/bin</directory>
+      <outputDirectory>/bin</outputDirectory>
+    </fileSet>
+    <fileSet>
+      <directory>..</directory>
       <outputDirectory>/</outputDirectory>
+      <includes>
+        <include>
+          README*
+        </include>
+      </includes>
     </fileSet>
+    <fileSet>
+      <directory>../conf</directory>
+      <outputDirectory>/</outputDirectory>
+      <includes>
+        <include>log4j.properties</include>
+        <include>wax-parse-plugins.xml</include>
+        <include>wax-default.xml</include>
+        <include>regex-normalize.xml</include>
+        <include>regex-urlfilter.txt</include>
+      </includes>
+    </fileSet>
+    <fileSet>
+      <directory>../third-party/nutch/build/plugins</directory>
+      <outputDirectory>/plugins</outputDirectory>
+      <includes>
+        <include>analysis-*/**</include>
+        <include>index-*/**</include>
+        <include>language-*/**</include>
+        <include>lib-*/**</include>
+        <include>nutch-*/**</include>
+        <include>scoring-*/**</include>
+        <include>query-*/**</include>
+        <include>summary-*/**</include>
+        <include>urlfilter-*/**</include>
+        <include>urlnormalizer-*/**</include>
+        <include>parse-*/**</include>
+      </includes>
+      <excludes>
+        <exclude>parse-js/**</exclude>
+      </excludes>
+    </fileSet>
+    <fileSet>
+      <directory>../third-party/nutch/conf</directory>
+      <outputDirectory>/</outputDirectory>
+      <includes>
+        <include>mime-types.xml</include>
+        <include>nutch-default.xml</include>
+        <include>nutch-site.xml</include>
+        <include>common-terms.utf8</include>
+      </includes>
+    </fileSet>
+    <fileSet>
+      <directory>../third-party/nutch/lib</directory>
+      <outputDirectory>/lib</outputDirectory>
+      <includes>
+        <include>commons-lang*</include>
+        <include>lucene*</include>
+        <include>jakarta-oro*</include>
+        <include>xerces*</include>
+        <include>concurrent*</include>
+      </includes>
+    </fileSet>
   </fileSets>
   <dependencySets>
     <dependencySet>
       <outputDirectory>/lib</outputDirectory>
-      <scope>runtime</scope>
+      <!--<scope>runtime</scope>
+      -->
+      <excludes>
+        <exclude>commons-cli:commons-cli</exclude>
+        <exclude>commons-collections:commons-collections</exclude>
+        <exclude>commons-pool:commons-pool</exclude>
+        <exclude>commons-logging:commons-logging</exclude>
+        <exclude>org.apache:hadoop</exclude>
+        <exclude>org.apache:nutch</exclude>
+        <exclude>org.apache:nutch</exclude>
+        <exclude>com.sleepycat:je</exclude>
+        <exclude>junit:junit</exclude>
+        <exclude>javax.servlet:servlet-api</exclude>
+      </excludes>
     </dependencySet>
   </dependencySets>
 </assembly>

Modified: trunk/archive-access/projects/nutchwax/pom.xml
===================================================================
--- trunk/archive-access/projects/nutchwax/pom.xml 2007-03-20 00:29:13 UTC (rev 1592)
+++ trunk/archive-access/projects/nutchwax/pom.xml 2007-03-20 00:36:23 UTC (rev 1593)
@@ -166,7 +166,9 @@
       <groupId>commons-cli</groupId>
       <artifactId>commons-cli</artifactId>
       <version>1.0-beta-2</version>
+      <scope>compile</scope>
     </dependency>
+    <!--
     <dependency>
       <groupId>org.apache</groupId>
       <artifactId>hadoop</artifactId>
@@ -179,6 +181,7 @@
       <version>0.9-dev-508238</version>
       <scope>compile</scope>
     </dependency>
+    -->
     <dependency>
       <groupId>javax.servlet</groupId>
       <artifactId>servlet-api</artifactId>
@@ -297,7 +300,7 @@
     <repository>
       <releases>
         <enabled>true</enabled>
-        <updatePolicy>always</updatePolicy>
+        <updatePolicy>daily</updatePolicy>
         <checksumPolicy>warn</checksumPolicy>
       </releases>
       <snapshots>
@@ -322,7 +325,7 @@
       <layout>default</layout>
       <releases>
         <enabled>true</enabled>
-        <updatePolicy>always</updatePolicy>
+        <updatePolicy>daily</updatePolicy>
         <checksumPolicy>warn</checksumPolicy>
       </releases>
       <!--
@@ -354,6 +357,11 @@
     <dependencies>
       <dependency>
         <groupId>org.archive.nutchwax</groupId>
+        <artifactId>nutchwax-thirdparty</artifactId>
+        <version>${project.version}</version>
+      </dependency>
+      <dependency>
+        <groupId>org.archive.nutchwax</groupId>
         <artifactId>nutchwax-core</artifactId>
         <version>${project.version}</version>
       </dependency>
@@ -371,15 +379,19 @@
   </dependencyManagement>
   <modules>
     <module>
+      nutchwax-thirdparty
+    </module>
+    <module>
       nutchwax-core
     </module>
     <module>
+      nutchwax-plugins
+    </module>
+    <module>
       nutchwax-job
     </module>
-    <!--
     <module>
       nutchwax-war
     </module>
-    -->
   </modules>
 </project>

Modified: trunk/archive-access/projects/nutchwax/src/plugin/build-plugin.xml
===================================================================
--- trunk/archive-access/projects/nutchwax/src/plugin/build-plugin.xml 2007-03-20 00:29:13 UTC (rev 1592)
+++ trunk/archive-access/projects/nutchwax/src/plugin/build-plugin.xml 2007-03-20 00:36:23 UTC (rev 1593)
@@ -3,6 +3,9 @@
 <!--Copied from nutch/src/plugin. Changes so we build into nutchwax/build and so we get dependencies from a nutch we expect to be in the nutchwax directory.
+
+
+     Called from maven2.
  -->
 <!-- Imported by plugin build.xml files to define default targets. -->
@@ -44,16 +47,11 @@
   <property name="build.encoding" value="ISO-8859-1"/>
-  <fileset id="lib.jars" dir="${root}" includes="lib/*.jar"/>
   <!-- the normal classpath -->
   <path id="classpath">
     <pathelement location="${build.classes}"/>
-    <fileset refid="lib.jars"/>
     <pathelement location="${nutch.root}/target/classes"/>
-    <fileset dir="${nutch.root}/lib">
-      <include name="*.jar" />
-    </fileset>
     <!--IA: Add the nutch jars.-->
     <fileset dir="${real.nutch.root}/lib">
       <include name="*.jar" />
@@ -99,6 +97,9 @@
      debug="${javac.debug}"
      deprecation="${javac.deprecation}">
       <classpath refid="classpath"/>
+      <!--This build file is being called out of maven2. Its
+      setting the below reference to maven.compile.classpath.-->
+      <classpath refid="maven.compile.classpath"/>
     </javac>
   </target>
@@ -124,9 +125,6 @@
     <copy file="plugin.xml" todir="${deploy.dir}" preservelastmodified="true"/>
     <copy file="${build.dir}/${name}.jar" todir="${deploy.dir}"/>
-    <copy todir="${deploy.dir}" flatten="true">
-      <fileset refid="lib.jars"/>
-    </copy>
   </target>
   <!-- ================================================================== -->

Deleted: trunk/archive-access/projects/nutchwax/src/plugin/build.xml
===================================================================
--- trunk/archive-access/projects/nutchwax/src/plugin/build.xml 2007-03-20 00:29:13 UTC (rev 1592)
+++ trunk/archive-access/projects/nutchwax/src/plugin/build.xml 2007-03-20 00:36:23 UTC (rev 1593)
@@ -1,43 +0,0 @@
-<?xml version="1.0"?>
-
-<project name="Nutch" default="deploy" basedir=".">
-
-  <!-- ====================================================== -->
-  <!-- Build & deploy all the plugin jars. -->
-  <!-- ====================================================== -->
-  <target name="deploy">
-    <ant dir="index-wax" target="deploy"/>
-    <ant dir="query-wax" target="deploy"/>
-    <ant dir="parse-default" target="deploy"/>
-    <ant dir="parse-waxext" target="deploy"/>
-    <ant dir="query-host" target="deploy"/>
-    <ant dir="query-anchor" target="deploy"/>
-    <ant dir="query-title" target="deploy"/>
-    <ant dir="query-content" target="deploy"/>
-  </target>
-
-  <!-- ====================================================== -->
-  <!-- Test all of the plugins. -->
-  <!-- ====================================================== -->
-  <target name="test">
-    <ant dir="index-wax" target="test"/>
-    <ant dir="query-wax" target="test"/>
-    <ant dir="parse-default" target="test"/>
-    <ant dir="parse-waxext" target="test"/>
-  </target>
-
-  <!-- ====================================================== -->
-  <!-- Clean all of the plugins. -->
-  <!-- ====================================================== -->
-  <target name="clean">
-    <ant dir="index-wax" target="clean"/>
-    <ant dir="query-wax" target="clean"/>
-    <ant dir="parse-default" target="clean"/>
-    <ant dir="parse-waxext" target="clean"/>
-    <ant dir="query-host" target="clean"/>
-    <ant dir="query-anchor" target="clean"/>
-    <ant dir="query-title" target="clean"/>
-    <ant dir="query-content" target="clean"/>
-  </target>
-
-</project>

Deleted: trunk/archive-access/projects/nutchwax/src/plugin/index-wax/lib/archive-commons-1.11.0-200702160009.jar
===================================================================
(Binary files differ)

Added: trunk/archive-access/projects/nutchwax/src/plugin/index-wax/lib/archive-commons-1.12.0.jar
===================================================================
(Binary files differ)

Property changes on: trunk/archive-access/projects/nutchwax/src/plugin/index-wax/lib/archive-commons-1.12.0.jar
___________________________________________________________________
Name: svn:mime-type
   + application/octet-stream

Modified: trunk/archive-access/projects/nutchwax/src/plugin/index-wax/plugin.xml
===================================================================
--- trunk/archive-access/projects/nutchwax/src/plugin/index-wax/plugin.xml 2007-03-20 00:29:13 UTC (rev 1592)
+++ trunk/archive-access/projects/nutchwax/src/plugin/index-wax/plugin.xml 2007-03-20 00:36:23 UTC (rev 1593)
@@ -12,7 +12,7 @@
       <!--Alternative is to change the nutch script so that it includes libs from other than its local directory. Without that, need to have lib local to plugin.-->
-      <library name="archive-commons-1.11.0-200702160009.jar" />
+      <library name="archive-commons-1.12.0.jar" />
    </runtime>
    <extension id="org.archive.access.nutch.indexer"
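
Note on the build-plugin.xml change: the added <classpath refid="maven.compile.classpath"/> line only resolves when the Ant build is invoked from maven-antrun-plugin, which defines that reference for its tasks. The calling pom is not part of this diff, so the fragment below is only a sketch of what such a call site might look like. The execution id, the phase, and the dir path are hypothetical; the inheritRefs attribute is the standard Ant way to hand references down to a nested build.

```xml
<plugin>
  <artifactId>maven-antrun-plugin</artifactId>
  <executions>
    <execution>
      <!-- Hypothetical id and phase; not taken from this commit. -->
      <id>antrun.compile.plugins</id>
      <phase>compile</phase>
      <configuration>
        <tasks>
          <!-- inheritRefs="true" passes references such as
               maven.compile.classpath to the called build file.
               The dir value is an assumed location of one plugin. -->
          <ant dir="../src/plugin/index-wax" target="deploy" inheritRefs="true"/>
        </tasks>
      </configuration>
      <goals>
        <goal>run</goal>
      </goals>
    </execution>
  </executions>
</plugin>
```

If the reference is not inherited, the nested javac task would likely fail because the refid cannot be resolved, so a call of this shape is one plausible reading of "Pass the maven2 dependencies to javac as extra classpath argument" in the log message.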