This list is closed; nobody may subscribe to it.
Message counts by month:

| Year | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec |
|------|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|
| 2010 |     |     |     |     |     |     | 139 | 94  | 232 | 143 | 138 | 55  |
| 2011 | 127 | 90  | 101 | 74  | 148 | 241 | 169 | 121 | 157 | 199 | 281 | 75  |
| 2012 | 107 | 122 | 184 | 73  | 14  | 49  | 26  | 103 | 133 | 61  | 51  | 55  |
| 2013 | 59  | 72  | 99  | 62  | 92  | 19  | 31  | 138 | 47  | 83  | 95  | 111 |
| 2014 | 125 | 60  | 119 | 136 | 270 | 83  | 88  | 30  | 47  | 27  | 23  |     |
| 2015 |     |     |     |     |     |     |     |     | 3   |     |     |     |
| 2016 |     |     | 4   | 1   |     |     |     |     |     |     |     |     |
From: <tho...@us...> - 2013-09-01 14:29:18
Revision: 7376 (http://bigdata.svn.sourceforge.net/bigdata/?rev=7376&view=rev)
Author: thompsonbry
Date: 2013-09-01 14:29:10 +0000 (Sun, 01 Sep 2013)

Log Message: added missing licenses for bigdata-ganglia.

Added Paths:
- branches/BIGDATA_RELEASE_1_2_0/bigdata-ganglia/LEGAL/junit-license.html
- branches/BIGDATA_RELEASE_1_2_0/bigdata-ganglia/LEGAL/log4j-license.txt

The two added files reproduce standard license texts verbatim: junit-license.html contains the Common Public License, v 1.0 (226 lines of HTML, covering JUnit), and log4j-license.txt contains the Apache License, Version 2.0, January 2004 (201 lines, covering log4j). The full license texts are not repeated here.
From: <tho...@us...> - 2013-08-31 11:40:18
Revision: 7375 (http://bigdata.svn.sourceforge.net/bigdata/?rev=7375&view=rev)
Author: thompsonbry
Date: 2013-08-31 11:40:11 +0000 (Sat, 31 Aug 2013)

Log Message:

Bug fix for #708 (BIND heisenbug). The root cause was a failure to clone the binding set before modifying it.

This view of the query run state log data (for the same run) shows the problem:

```
step  label    bopId  fanIO  navail  nrun  cnksIn  unitsIn  cnksOut  unitsOut
   1  startQ       4      1       1     0
   2  startOp      4      1       0     1
   3  startOp      3      1      -1     2
   4  startOp      2      1      -2     3
   5  startOp      6      1      -3     4
   6  startOp      7      1      -4     5
   7  haltOp       4      2      -2     4       1       1        2         2
   8  haltOp       2      1      -1     3       1       1        1         1
   9  haltOp       3      1       0     2       1       1        1         1
  10  startOp      1      1      -1     3
  11  haltOp       7      0      -1     2       1       1        0         0   <=== NOTHING OUT
  12  haltOp       6      1       0     1       1       1        1         1
  13  startOp      8      1      -1     2
  14  haltOp       1      1       0     1       1       1        1         1
  15  haltOp       8      0       0     0       1       1        1         1
```

BopId 7 is one of the BIND operators. In the table you can see that it did not output any chunks before it halted. Given that nothing was output, there was only one solution flowing into bop#7 (the copyOp where the two solution paths were combined) and only one solution flowing into bop#8 (the projection operator). This is conclusive evidence that the problem is not with the RunState accounting. The haltOp message generated for the BIND() clearly indicates that no solution was produced. Perhaps the solution was eliminated by the BIND(), e.g., through an interaction via the shared binding set objects?

The fix is to ConditionalRoutingOp, line 171. Replace

```java
final IBindingSet bset = chunk[i];
```

with

```java
final IBindingSet bset = chunk[i].clone();
```

This cures the test failures. I have also added source.close() to the finally{} clause, but this is not required in order to cure the problems. com.bigdata.rdf.sparql.ast.eval.TestAll is good. I am going to assign the ticket back to Mike to take a look at the code in ConditionalRoutingOp and see whether it can be optimized in light of the need for this clone().

Modified Paths:
- branches/BIGDATA_RELEASE_1_2_0/bigdata/src/java/com/bigdata/bop/bset/ConditionalRoutingOp.java

```diff
Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata/src/java/com/bigdata/bop/bset/ConditionalRoutingOp.java
===================================================================
@@ -168,7 +168,7 @@
 
         for (int i = 0; i < chunk.length; i++) {
 
-            final IBindingSet bset = chunk[i];
+            final IBindingSet bset = chunk[i].clone();
 
             if (condition.accept(bset)) {
 
@@ -209,7 +209,7 @@
 
             return null;
 
         } finally {
-
+            source.close();
             sink.close();
             if (sink2 != null)
                 sink2.close();
```
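The failure mode behind this fix is worth spelling out: when a routing operator pushes the same mutable binding-set object down two solution paths, a conditional bind on one path silently mutates the other path's input, and the symptom appears or disappears with scheduling order (hence "heisenbug"). The following is a minimal sketch of that aliasing hazard, using a plain `Map` as a stand-in for `IBindingSet`; the types and flow are illustrative, not the bigdata code.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of the aliasing bug fixed in r7375. A plain Map stands in
// for IBindingSet; this is not the bigdata API.
public class BindingSetAliasingDemo {

    public static void main(String[] args) {
        Map<String, Object> solution = new HashMap<>();
        solution.put("x", 1);

        // BUG: one "path" receives a reference to the SAME object.
        Map<String, Object> pathA = solution;
        // FIX: the routing operator clones before conditionally modifying.
        Map<String, Object> pathB = new HashMap<>(solution);

        // A conditional BIND on path A mutates the shared object in place...
        pathA.put("y", 2);

        // ...and path B is unaffected only because it was copied first.
        System.out.println("pathA = " + pathA); // {x=1, y=2}
        System.out.println("pathB = " + pathB); // {x=1}
    }
}
```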
From: <mrp...@us...> - 2013-08-30 20:12:58
Revision: 7374 (http://bigdata.svn.sourceforge.net/bigdata/?rev=7374&view=rev)
Author: mrpersonick
Date: 2013-08-30 20:12:52 +0000 (Fri, 30 Aug 2013)

Log Message: Fixed ticket #733: the range optimizer was not optimizing the slice service.

Modified Paths:
- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/sparql/ast/eval/SliceServiceFactory.java
- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/sparql/ast/optimizers/ASTRangeOptimizer.java
- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/sparql/ast/optimizers/AbstractJoinGroupOptimizer.java

```diff
Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/sparql/ast/eval/SliceServiceFactory.java
===================================================================
@@ -56,6 +56,7 @@
 import com.bigdata.rdf.internal.IV;
 import com.bigdata.rdf.internal.constraints.RangeBOp;
 import com.bigdata.rdf.internal.impl.literal.XSDNumericIV;
+import com.bigdata.rdf.sparql.ast.FilterNode;
 import com.bigdata.rdf.sparql.ast.GroupNodeBase;
 import com.bigdata.rdf.sparql.ast.IGroupMemberNode;
 import com.bigdata.rdf.sparql.ast.StatementPatternNode;
@@ -255,6 +256,13 @@
 
         for (IGroupMemberNode node : group) {
 
+            if (node instanceof FilterNode) {
+
+                // ok to have filters with ranges
+                continue;
+
+            }
+
             if (!(node instanceof StatementPatternNode)) {
 
                 throw new RuntimeException("only statement patterns allowed");

Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/sparql/ast/optimizers/ASTRangeOptimizer.java
===================================================================
@@ -87,7 +87,7 @@
         }
         if (!rangeSafe)
             return;
-        
+
         final Map<VarNode, RangeNode> ranges = new LinkedHashMap<VarNode, RangeNode>();

Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/sparql/ast/optimizers/AbstractJoinGroupOptimizer.java
===================================================================
@@ -249,6 +249,13 @@
 
         } else if (child instanceof ServiceNode && optimizeServiceNodes) {
 
+            final ServiceNode serviceNode = (ServiceNode) child;
+
+            @SuppressWarnings("unchecked")
+            final GraphPatternGroup<IGroupMemberNode> childGroup = (GraphPatternGroup<IGroupMemberNode>) serviceNode
+                    .getGraphPattern();
+
+            optimize(ctx, sa, bSets, childGroup);
 
         }
```
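The substantive change is the last hunk: the join-group optimizer previously treated a SERVICE node as opaque, so optimizers such as the range optimizer never saw the group nested inside it. The sketch below shows the general recursion pattern with deliberately simplified, hypothetical node types (these are not the bigdata AST classes):

```java
import java.util.List;

// Hypothetical, simplified AST node types (not the bigdata classes).
sealed interface AstNode permits GroupNode, ServiceCall, Leaf {}
record GroupNode(List<AstNode> children) implements AstNode {}
record ServiceCall(GroupNode graphPattern) implements AstNode {}
record Leaf(String label) implements AstNode {}

public class DescendIntoServiceDemo {

    // Visit every group, including groups hidden inside SERVICE calls.
    static void optimize(GroupNode group) {
        System.out.println("optimizing group with " + group.children().size() + " children");
        for (AstNode child : group.children()) {
            if (child instanceof GroupNode g) {
                optimize(g);                  // ordinary nested group
            } else if (child instanceof ServiceCall s) {
                optimize(s.graphPattern());   // the r7374 change: recurse into the service
            }
        }
    }

    public static void main(String[] args) {
        GroupNode inner = new GroupNode(List.of(new Leaf("range filter")));
        GroupNode root = new GroupNode(
                List.of(new Leaf("triple pattern"), new ServiceCall(inner)));
        optimize(root); // now reaches the group inside the service
    }
}
```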
From: <tho...@us...> - 2013-08-30 19:30:16
Revision: 7373 (http://bigdata.svn.sourceforge.net/bigdata/?rev=7373&view=rev)
Author: thompsonbry
Date: 2013-08-30 19:30:09 +0000 (Fri, 30 Aug 2013)

Log Message: Made the GAS statistics into a pure interface. See #629.

Modified Paths:
- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/IGASStats.java
- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/GASContext.java
- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/GASRunner.java
- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/GASStats.java

```diff
Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/IGASStats.java
===================================================================
@@ -1,5 +1,31 @@
 package com.bigdata.rdf.graph;
 
+/**
+ * Statistics for GAS algorithms.
+ *
+ * @author <a href="mailto:tho...@us...">Bryan Thompson</a>
+ */
 public interface IGASStats {
 
+    void add(final long frontierSize, final long nedges, final long elapsedNanos);
+
+    void add(final IGASStats o);
+
+    long getNRounds();
+
+    /**
+     * The cumulative size of the frontier across the iterations.
+     */
+    long getFrontierSize();
+
+    /**
+     * The number of traversed edges across the iterations.
+     */
+    long getNEdges();
+
+    /**
+     * The elapsed nanoseconds across the iterations.
+     */
+    long getElapsedNanos();
+
 }

Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/GASContext.java
===================================================================
@@ -254,8 +254,7 @@
 
         final long totalEdges = scatterEdgeCount + gatherEdgeCount;
 
-        // TODO pure interface for this.
-        ((GASStats) stats).add(f.size(), totalEdges, totalElapsed);
+        stats.add(f.size(), totalEdges, totalElapsed);
 
         if (log.isInfoEnabled()) {

Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/GASRunner.java
===================================================================
@@ -27,6 +27,7 @@
 import com.bigdata.rdf.graph.IGASScheduler;
 import com.bigdata.rdf.graph.IGASSchedulerImpl;
 import com.bigdata.rdf.graph.IGASState;
+import com.bigdata.rdf.graph.IGASStats;
 import com.bigdata.rdf.graph.impl.bd.BigdataGASEngine;
 import com.bigdata.rdf.graph.impl.bd.BigdataGASEngine.BigdataGraphAccessor;
 import com.bigdata.rdf.rio.LoadStats;
@@ -42,7 +43,7 @@
  * vertices? For such algorithms, we just run them once per graph
  * (unless the graph is dynamic).
  */
-public class GASRunner<VS, ES, ST> implements Callable<GASStats> {
+public class GASRunner<VS, ES, ST> implements Callable<IGASStats> {
 
     private static final Logger log = Logger.getLogger(GASRunner.class);
@@ -413,7 +414,7 @@
      * create if the effective buffer mode is non-transient, then we can get all
      * this information.
      */
-    public GASStats call() throws Exception {
+    public IGASStats call() throws Exception {
 
         final Properties properties = getProperties(propertyFile);
@@ -600,7 +601,7 @@
      * TODO Are we better off using sampling based on distinct vertices or with
      * a bais based on the #of edges for those vertices.
      */
-    private GASStats runAnalytic(final Journal jnl, final String namespace)
+    private IGASStats runAnalytic(final Journal jnl, final String namespace)
             throws Exception {
@@ -638,7 +639,7 @@
         final IGASState<VS, ES, ST> gasState = gasContext.getGASState();
 
-        final GASStats total = new GASStats();
+        final IGASStats total = new GASStats();
 
         final Value[] samples = graphAccessor.getRandomSample(r, nsamples);
@@ -648,8 +649,7 @@
 
             gasState.init(startingVertex);
 
-            // TODO STATS: Pure interface.
-            final GASStats stats = (GASStats) gasContext.call();
+            final IGASStats stats = (IGASStats) gasContext.call();
 
             total.add(stats);

Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/GASStats.java
===================================================================
@@ -7,12 +7,8 @@
 import com.bigdata.rdf.graph.IGASStats;
 
 /**
- * FIXME STATS: Refactor to a pure interface - see RuleStats.
+ * Statistics for GAS algorithms.
  *
- * FIXME STATS: Collect the details within round statistics and then lift the
- * formatting of the statistics into this class (for the details within round
- * statistics).
- *
  * @author <a href="mailto:tho...@us...">Bryan Thompson</a>
  */
 public class GASStats implements IGASStats {
@@ -22,6 +18,10 @@
     private final CAT nedges = new CAT();
     private final CAT elapsedNanos = new CAT();
 
+    /* (non-Javadoc)
+     * @see com.bigdata.rdf.graph.impl.IFOO#add(long, long, long)
+     */
+    @Override
     public void add(final long frontierSize, final long nedges,
             final long elapsedNanos) {
@@ -35,7 +35,11 @@
 
     }
 
-    public void add(final GASStats o) {
+    /* (non-Javadoc)
+     * @see com.bigdata.rdf.graph.impl.IFOO#add(com.bigdata.rdf.graph.impl.IFOO)
+     */
+    @Override
+    public void add(final IGASStats o) {
 
         nrounds.add(o.getNRounds());
@@ -47,27 +51,34 @@
 
     }
 
+    /* (non-Javadoc)
+     * @see com.bigdata.rdf.graph.impl.IFOO#getNRounds()
+     */
+    @Override
     public long getNRounds() {
         return nrounds.get();
     }
 
-    /**
-     * The cumulative size of the frontier across the iterations.
+    /* (non-Javadoc)
+     * @see com.bigdata.rdf.graph.impl.IFOO#getFrontierSize()
      */
+    @Override
     public long getFrontierSize() {
         return frontierSize.get();
     }
 
-    /**
-     * The number of traversed edges across the iterations.
+    /* (non-Javadoc)
+     * @see com.bigdata.rdf.graph.impl.IFOO#getNEdges()
      */
+    @Override
     public long getNEdges() {
         return nedges.get();
     }
 
-    /**
-     * The elapsed nanoseconds across the iterations.
+    /* (non-Javadoc)
+     * @see com.bigdata.rdf.graph.impl.IFOO#getElapsedNanos()
      */
+    @Override
     public long getElapsedNanos() {
         return elapsedNanos.get();
     }
```
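The point of the refactoring is that callers now program against IGASStats and never downcast to the concrete GASStats (the two removed TODO casts above). A minimal, self-contained sketch of the same pattern with simplified stand-in types (not the bigdata classes):

```java
import java.util.List;

// Sketch of the "pure interface" pattern from r7373: callers accumulate
// statistics through an interface, never the concrete class. Simplified
// stand-in types, not the bigdata API.
interface Stats {
    void add(Stats o);
    long getNEdges();
    long getElapsedNanos();
}

class SimpleStats implements Stats {
    private long nedges, elapsedNanos;
    SimpleStats(long nedges, long elapsedNanos) {
        this.nedges = nedges;
        this.elapsedNanos = elapsedNanos;
    }
    @Override public void add(Stats o) {
        nedges += o.getNEdges();          // works for ANY implementation,
        elapsedNanos += o.getElapsedNanos(); // no downcast required
    }
    @Override public long getNEdges() { return nedges; }
    @Override public long getElapsedNanos() { return elapsedNanos; }
}

public class StatsDemo {
    public static void main(String[] args) {
        Stats total = new SimpleStats(0, 0);
        for (Stats s : List.of(new SimpleStats(10, 5_000), new SimpleStats(20, 7_000))) {
            total.add(s); // aggregate per-run stats into a total
        }
        System.out.println("edges=" + total.getNEdges()
                + " nanos=" + total.getElapsedNanos());
    }
}
```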
From: <tho...@us...> - 2013-08-30 19:18:06
Revision: 7372 (http://bigdata.svn.sourceforge.net/bigdata/?rev=7372&view=rev)
Author: thompsonbry
Date: 2013-08-30 19:18:00 +0000 (Fri, 30 Aug 2013)

Log Message: Added the namespace to the output data. This will be used to identify the different data sets in CI performance runs.

Modified Paths:
- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/GASRunner.java

```diff
Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/GASRunner.java
===================================================================
@@ -664,6 +664,7 @@
         // Total over all sampled vertices.
         System.out.println("TOTAL"//
                 + ": analytic=" + gasProgram.getClass().getSimpleName() //
+                + ", namespace=" + namespace//
                 + ", nseed=" + seed
                 + ", nsamples=" + nsamples //
                 + ", nthreads=" + nthreads
```
From: <tho...@us...> - 2013-08-30 19:14:53
Revision: 7371 (http://bigdata.svn.sourceforge.net/bigdata/?rev=7371&view=rev)
Author: thompsonbry
Date: 2013-08-30 19:14:47 +0000 (Fri, 30 Aug 2013)

Log Message: Rolling back unintended commit of build.properties from a test machine.

Modified Paths:
- branches/BIGDATA_RELEASE_1_2_0/build.properties

```diff
Modified: branches/BIGDATA_RELEASE_1_2_0/build.properties
===================================================================
@@ -1,5 +1,5 @@
 # Bigdata ant build properties.
-#
+# 
 # $Id$
 #
 # The root of the checked out svn source. This assumes that you have checked
@@ -160,11 +160,11 @@
 # it also provides much more detailed oversight into what the federation is
 # doing. You need to set this to true to take advantage of the Excel worksheets
 # for performance monitoring.
-REPORT_ALL=true
+REPORT_ALL=false
 
 # Where the sysstat utilities are found (performance counter reporting for un*x).
-#SYSSTAT_HOME=/usr/local/bin
-SYSSTAT_HOME=/usr/bin
+SYSSTAT_HOME=/usr/local/bin
+#SYSSTAT_HOME=/usr/bin
 
 # Specifies the value of com.sun.jini.jeri.tcp.useNIO. When true, use NIO for RMI.
 USE_NIO=true
@@ -231,10 +231,10 @@
 # an exit status of ZERO (0) to indicate that the lock was obtained.
 #
 # lockfile is part of procmail
-#LOCK_CMD=lockfile -r 1 -1
+LOCK_CMD=lockfile -r 1 -1
 #
 # dotlockfile is in the liblockfile1 package.
-LOCK_CMD=/usr/bin/dotlockfile -r 1
+#LOCK_CMD=/usr/bin/dotlockfile -r 1
 
 # The bigdata subsystem lock file. The user MUST be able to read/write this file
 # on each host. Therefore, if you are not installing as root this will need to be
@@ -244,7 +244,7 @@
 LOCK_FILE=${LAS}/lockFile
 
 # The main bigdata configuration file.
-bigdata.config=${install.config.dir}/miniCluster.config
+bigdata.config=${install.config.dir}/bigdataStandalone.config
 
 # The main jini configuration file.
 jini.config=${install.config.dir}/jini/startAll.config
@@ -267,7 +267,7 @@
 # Note: java.util.logging messages DO NOT get written onto this logger -- only
 # log4j messages.
 #
-LOG4J_SOCKET_LOGGER_HOST = bigdata10
+LOG4J_SOCKET_LOGGER_HOST = bigdata01.systap.com
 LOG4J_SOCKET_LOGGER_PORT = 4445
 
 # The socket logger uses a DailyRollingFileAppender by default and this
```
From: <tho...@us...> - 2013-08-30 19:13:24
Revision: 7370 (http://bigdata.svn.sourceforge.net/bigdata/?rev=7370&view=rev)
Author: thompsonbry
Date: 2013-08-30 19:13:18 +0000 (Fri, 30 Aug 2013)

Log Message: Re-do of commit with the correct files this time.

Modified Paths:
- branches/BIGDATA_RELEASE_1_2_0/bigdata-perf/gas/build.properties
- branches/BIGDATA_RELEASE_1_2_0/bigdata-perf/gas/build.xml

Added Paths:
- branches/BIGDATA_RELEASE_1_2_0/bigdata-perf/gas/Makefile

```diff
Added: branches/BIGDATA_RELEASE_1_2_0/bigdata-perf/gas/Makefile
===================================================================
@@ -0,0 +1,68 @@
+GRAPH_DIR = ../../bigdata-rdf/src/resources/data
+
+#Can restrict this to the graphs you have downloaded. Largest is kron_g500-logn21.
+GRAPHS = foaf lehigh barData bsbm
+#ak2010 belgium_osm delaunay_n13 coAuthorsDBLP delaunay_n21 kron_g500-logn21 soc-LiveJournal1 webbase-1M
+#GRAPHS = ak2010 belgium_osm delaunay_n13 coAuthorsDBLP delaunay_n21 soc-LiveJournal1 webbase-1M
+
+#which algorithms to check
+#ALGORITHMS = pagerank sssp bfs cc
+ALGORITHMS = BFS SSSP
+
+#Match this to number of cores on your system
+#POWERGRAPH_OPTS = --ncpus 4
+
+#GOLD_BINARIES = $(foreach x,$(ALGORITHMS),../PowerGraphReferenceImplementations/$x.x)
+#GOLD_FILES = $(foreach P,$(ALGORITHMS),$(foreach G,$(GRAPHS),$G.$P.gold))
+TEST_FILES = $(foreach P,$(ALGORITHMS),$(foreach G,$(GRAPHS),$G.$P.test))
+
+#REGRESSIONS = $(foreach P,$(ALGORITHMS),$(foreach G,$(GRAPHS),$G.$P.pass))
+
+NSAMPLES=1000
+SEED=217
+PROFILER=
+
+#all: regress
+all: test
+
+#gold: $(GOLD_BINARIES) $(GOLD_FILES)
+
+test: $(TEST_FILES)
+
+#regress: $(REGRESSIONS)
+
+#define MAKEGOLD
+#	rm -f __$(1).$(2).out* ;
+#	../PowerGraphReferenceImplementations/$(2).x --graph $(GRAPH_DIR)/$(1)/$(1).mtx --graph_opts ingress=batch --save __$(1).$(2).out | awk '/Finished Running/{print $$5}' > $(1).$(2).timing ;
+#	cat __$(1).$(2).out* | sort -n > $(1).$(2).gold ;
+#	rm -f __$(1).$(2).out* ;
+#endef
+
+# TODO Add reference implementation and comparison for CC. See the pagerank example below.
+
+%.BFS.test:
+	ant -Danalytic=BFS -Dnsamples=$(NSAMPLES) -Dseed=$(SEED) -DprofilerAgent=$(PROFILER) '-Dload=-load $(GRAPH_DIR)/$*' '-DjournalFile=bigdata-gas-$*.jnl' run-gas-engine > $*.BFS.out
+	grep "TOTAL: " $*.BFS.out > $@
+
+%.SSSP.test:
+	ant -Danalytic=SSSP -Dnsamples=$(NSAMPLES) -Dseed=$(SEED) -DprofilerAgent=$(PROFILER) '-Dload=-load $(GRAPH_DIR)/$*' '-DjournalFile=bigdata-gas-$*.jnl' run-gas-engine > $*.SSSP.out
+	grep "TOTAL: " $*.SSSP.out > $@
+
+#	sort -n __tmp$* > $@
+#	rm -f __tmp$*
+
+#%.bfs.gold: ../PowerGraphReferenceImplementations/bfs.x
+#	$(call MAKEGOLD,$*,bfs)
+
+#%.bfs.test: ../simpleBFS
+#	../simpleBFS $(GRAPH_DIR)/$*/$*.mtx __tmp$* | awk '/Took/{print $$2}' > $*.bfs.timing_gpu
+#	sort -n __tmp$* > $@
+#	rm -f __tmp$*
+
+#%.bfs.pass: %.bfs.test %.bfs.gold
+#	diff -q $*.bfs.test $*.bfs.gold
+#	touch $*.bfs.pass
+
+clean:
+	rm -f *.test *.out
+
+clean-all: clean

Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-perf/gas/build.properties
===================================================================
@@ -44,6 +44,9 @@
 # The namespace of the KB instance (multiple KBs can be in the same database).
 namespace=kb
 
+# The name of the journal file to be used (ignored for memstore).
+journalFile=bigdata.jnl
+
 # The name of the file used to configure the Journal.
 journalPropertyFile=RWStore.properties
 
@@ -68,10 +71,10 @@
 analytic=BFS
 
 # The class used to schedule and compact the new frontier.
-#scheduler=-schedulerClass com.bigdata.rdf.graph.impl.GASState$STScheduler
-#scheduler=-schedulerClass com.bigdata.rdf.graph.impl.GASState$CHSScheduler
+#scheduler=-schedulerClass com.bigdata.rdf.graph.impl.scheduler.STScheduler
 scheduler=-schedulerClass com.bigdata.rdf.graph.impl.scheduler.CHMScheduler
-#scheduler=-schedulerClass com.bigdata.rdf.graph.impl.GASState$TLScheduler
+#scheduler=-schedulerClass com.bigdata.rdf.graph.impl.scheduler.TLScheduler
+#scheduler=-schedulerClass com.bigdata.rdf.graph.impl.scheduler.TLScheduler2
 
 #
 # Profiler parameters.
@@ -109,5 +112,5 @@
 #gcdebug=-XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:jvm_gc.log
 
 # all jvm args for query.
-jvmArgs=-server -Xmx${maxMem} -XX:MaxDirectMemorySize=2g -showversion ${gcopts} ${gcdebug} ${profiler} -Dlog4j.configuration=file:log4j.properties
+jvmArgs=-server -Xmx${maxMem} -XX:MaxDirectMemorySize=2g -showversion ${gcopts} ${gcdebug} ${profiler} -Dlog4j.configuration=file:log4j.properties -Dcom.bigdata.journal.AbstractJournal.file=${journalFile}
 # -Dlog4j.debug

Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-perf/gas/build.xml
===================================================================
@@ -32,7 +32,7 @@
 <java classname="com.bigdata.rdf.graph.impl.GASRunner"
       fork="true" failonerror="true" >
-    <arg line="-bufferMode ${bufferMode} -namespace ${namespace} -seed ${seed} -nsamples ${nsamples} -nthreads ${nthreads} ${load} com.bigdata.rdf.graph.analytics.${analytic} ${journalPropertyFile}" />
+    <arg line="-bufferMode ${bufferMode} -namespace ${namespace} -seed ${seed} -nsamples ${nsamples} -nthreads ${nthreads} ${scheduler} ${load} com.bigdata.rdf.graph.analytics.${analytic} ${journalPropertyFile}" />
     <!-- specify/override the journal file name. -->
     <jvmarg line="${jvmArgs}" />
     <classpath>
```
From: <tho...@us...> - 2013-08-30 19:11:40
Revision: 7369 (http://bigdata.svn.sourceforge.net/bigdata/?rev=7369&view=rev)
Author: thompsonbry
Date: 2013-08-30 19:11:33 +0000 (Fri, 30 Aug 2013)

Log Message: Added a Makefile that can be used as a driver to collect performance data on each implemented algorithm over a collection of data sets.

Modified Paths:
- branches/BIGDATA_RELEASE_1_2_0/build.properties

```diff
Modified: branches/BIGDATA_RELEASE_1_2_0/build.properties
===================================================================
@@ -160,11 +160,11 @@
 # it also provides much more detailed oversight into what the federation is
 # doing. You need to set this to true to take advantage of the Excel worksheets
 # for performance monitoring.
-REPORT_ALL=false
+REPORT_ALL=true
 
 # Where the sysstat utilities are found (performance counter reporting for un*x).
-SYSSTAT_HOME=/usr/local/bin
-#SYSSTAT_HOME=/usr/bin
+#SYSSTAT_HOME=/usr/local/bin
+SYSSTAT_HOME=/usr/bin
 
 # Specifies the value of com.sun.jini.jeri.tcp.useNIO. When true, use NIO for RMI.
 USE_NIO=true
@@ -231,10 +231,10 @@
 # an exit status of ZERO (0) to indicate that the lock was obtained.
 #
 # lockfile is part of procmail
-LOCK_CMD=lockfile -r 1 -1
+#LOCK_CMD=lockfile -r 1 -1
 #
 # dotlockfile is in the liblockfile1 package.
-#LOCK_CMD=/usr/bin/dotlockfile -r 1
+LOCK_CMD=/usr/bin/dotlockfile -r 1
 
 # The bigdata subsystem lock file. The user MUST be able to read/write this file
 # on each host. Therefore, if you are not installing as root this will need to be
@@ -244,7 +244,7 @@
 LOCK_FILE=${LAS}/lockFile
 
 # The main bigdata configuration file.
-bigdata.config=${install.config.dir}/bigdataStandalone.config
+bigdata.config=${install.config.dir}/miniCluster.config
 
 # The main jini configuration file.
 jini.config=${install.config.dir}/jini/startAll.config
@@ -267,7 +267,7 @@
 # Note: java.util.logging messages DO NOT get written onto this logger -- only
 # log4j messages.
 #
-LOG4J_SOCKET_LOGGER_HOST = bigdata01.systap.com
+LOG4J_SOCKET_LOGGER_HOST = bigdata10
 LOG4J_SOCKET_LOGGER_PORT = 4445
 
 # The socket logger uses a DailyRollingFileAppender by default and this
```
From: <tho...@us...> - 2013-08-30 15:52:42
Revision: 7368 (http://bigdata.svn.sourceforge.net/bigdata/?rev=7368&view=rev)
Author: thompsonbry
Date: 2013-08-30 15:52:35 +0000 (Fri, 30 Aug 2013)

Log Message: Allowing override of the journal file name from the command line in the GASRunner. See #629.

Modified Paths:
- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/GASRunner.java

```diff
Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/GASRunner.java
===================================================================
@@ -349,6 +349,35 @@
 
         }
 
+        /*
+         * Allow override of select options from the command line.
+         */
+        {
+            final String[] overrides = new String[] {
+                // Journal options.
+                com.bigdata.journal.Options.FILE,
+//                // RDFParserOptions.
+//                RDFParserOptions.Options.DATATYPE_HANDLING,
+//                RDFParserOptions.Options.PRESERVE_BNODE_IDS,
+//                RDFParserOptions.Options.STOP_AT_FIRST_ERROR,
+//                RDFParserOptions.Options.VERIFY_DATA,
+//                // DataLoader options.
+//                DataLoader.Options.BUFFER_CAPACITY,
+//                DataLoader.Options.CLOSURE,
+//                DataLoader.Options.COMMIT,
+//                DataLoader.Options.FLUSH,
+            };
+            for (String s : overrides) {
+                if (System.getProperty(s) != null) {
+                    // Override/set from the environment.
+                    final String v = System.getProperty(s);
+                    if (log.isInfoEnabled())
+                        log.info("OVERRIDE:: Using: " + s + "=" + v);
+                    properties.setProperty(s, v);
+                }
+            }
+        }
+
         return properties;
 
     } finally {
```
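The pattern above, whitelisting configuration keys that a -D system property may override, is generic and useful beyond this class. A minimal standalone sketch (the key name is taken from the diff; the default value is illustrative):

```java
import java.util.Properties;

// Sketch of the override pattern from r7368: selected configuration keys
// are copied from JVM system properties (-Dkey=value) into the Properties
// loaded from a property file, so the command line wins.
public class OverrideDemo {
    public static void main(String[] args) {
        Properties properties = new Properties();
        // Pretend this came from the property file (illustrative default).
        properties.setProperty("com.bigdata.journal.AbstractJournal.file", "default.jnl");

        // Only whitelisted keys may be overridden from the command line.
        String[] overrides = { "com.bigdata.journal.AbstractJournal.file" };
        for (String key : overrides) {
            String v = System.getProperty(key);
            if (v != null) {
                properties.setProperty(key, v); // command line wins
            }
        }
        System.out.println(properties.getProperty("com.bigdata.journal.AbstractJournal.file"));
    }
}
```

Running this with -Dcom.bigdata.journal.AbstractJournal.file=/tmp/test.jnl prints the overridden value; without the flag it prints the file-based default.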
From: <mrp...@us...> - 2013-08-30 14:43:56
|
Revision: 7367 http://bigdata.svn.sourceforge.net/bigdata/?rev=7367&view=rev
Author: mrpersonick
Date: 2013-08-30 14:43:46 +0000 (Fri, 30 Aug 2013)

Log Message:
-----------
applying patch from ticket #701

Modified Paths:
--------------
    branches/BIGDATA_RELEASE_1_2_0/bigdata-sails/src/java/com/bigdata/rdf/sail/webapp/client/BackgroundTupleResult.java

Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-sails/src/java/com/bigdata/rdf/sail/webapp/client/BackgroundTupleResult.java
===================================================================
--- branches/BIGDATA_RELEASE_1_2_0/bigdata-sails/src/java/com/bigdata/rdf/sail/webapp/client/BackgroundTupleResult.java	2013-08-30 13:57:48 UTC (rev 7366)
+++ branches/BIGDATA_RELEASE_1_2_0/bigdata-sails/src/java/com/bigdata/rdf/sail/webapp/client/BackgroundTupleResult.java	2013-08-30 14:43:46 UTC (rev 7367)
@@ -11,7 +11,6 @@
 import java.util.Collections;
 import java.util.List;
 import java.util.concurrent.CountDownLatch;
-import java.util.concurrent.atomic.AtomicReference;
 
 import org.apache.http.HttpEntity;
 import org.apache.http.util.EntityUtils;
@@ -33,9 +32,11 @@
 public class BackgroundTupleResult extends TupleQueryResultImpl implements
         TupleQueryResult, Runnable, TupleQueryResultHandler {
 
+    final private Object closeLock = new Object();
+
     private volatile boolean closed;
 
-    private volatile Thread parserThread;
+    private Thread parserThread;
 
     final private TupleQueryResultParser parser;
@@ -45,7 +46,9 @@
 
     final private QueueCursor<BindingSet> queue;
 
-    final private AtomicReference<List<String>> bindingNamesRef;
+    // No need to synchronize this field because visibility is guaranteed
+    // by happens-before of CountDownLatch's countDown() and await()
+    private List<String> bindingNames;
 
     final private CountDownLatch bindingNamesReady = new CountDownLatch(1);
@@ -59,19 +62,22 @@
     public BackgroundTupleResult(final QueueCursor<BindingSet> queue,
             final TupleQueryResultParser parser, final InputStream in,
             final HttpEntity entity) {
-        super(Collections.EMPTY_LIST, queue);
+        super(Collections.<String>emptyList(), queue);
         this.queue = queue;
         this.parser = parser;
         this.in = in;
         this.entity = entity;
-        this.bindingNamesRef = new AtomicReference<List<String>>();
     }
 
     @Override
-    public synchronized void close() throws QueryEvaluationException {
-        closed = true;
-        if (parserThread != null) {
-            parserThread.interrupt();
+    public void close() throws QueryEvaluationException {
+        synchronized (closeLock) {
+            if (!closed) {
+                closed = true;
+                if (parserThread != null) {
+                    parserThread.interrupt();
+                }
+            }
         }
     }
@@ -86,7 +92,9 @@
              */
            bindingNamesReady.await();
            queue.checkException();
-            return bindingNamesRef.get();
+            if (closed)
+                throw new UndeclaredThrowableException(null, "Result closed");
+            return bindingNames;
        } catch (InterruptedException e) {
            throw new UndeclaredThrowableException(e);
        } catch (QueryEvaluationException e) {
@@ -96,8 +104,16 @@
 
     @Override
     public void run() {
+        synchronized (closeLock) {
+            if (closed) {
+                // Result was closed before it was run.
+                // Need to unblock latch
+                bindingNamesReady.countDown();
+                return;
+            }
+            parserThread = Thread.currentThread();
+        }
         boolean completed = false;
-        parserThread = Thread.currentThread();
         try {
             parser.setTupleQueryResultHandler(this);
             parser.parse(in);
@@ -111,7 +127,9 @@
         } catch (IOException e) {
             queue.toss(e);
         } finally {
-            parserThread = null;
+            synchronized (closeLock) {
+                parserThread = null;
+            }
             queue.done();
             bindingNamesReady.countDown();
             if (!completed) {
@@ -125,7 +143,7 @@
     @Override
     public void startQueryResult(final List<String> bindingNames)
             throws TupleQueryResultHandlerException {
-        this.bindingNamesRef.set(bindingNames);
+        this.bindingNames = bindingNames;
         bindingNamesReady.countDown();
     }
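The subtle point in this patch is why the new bindingNames field may be a plain field rather than volatile or an AtomicReference: CountDownLatch establishes a happens-before edge from countDown() to a return from await(), so a write made before countDown() is visible to any thread that awaited. A stripped-down illustration of just that idiom, independent of the Sesame classes (all names here are illustrative):

{{{
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CountDownLatch;

/**
 * Demonstrates the visibility guarantee relied on by r7367: the
 * producer writes a plain field and then counts down the latch; the
 * consumer awaits the latch and may then read the field without any
 * further synchronization (JLS 17.4.5: countDown() happens-before
 * await() returning).
 */
public class LatchVisibilityDemo {

    // Deliberately NOT volatile: visibility comes from the latch.
    private List<String> bindingNames;

    private final CountDownLatch ready = new CountDownLatch(1);

    void producer() {
        bindingNames = Arrays.asList("s", "p", "o"); // write first...
        ready.countDown();                           // ...then publish.
    }

    List<String> consumer() throws InterruptedException {
        ready.await();       // happens-after the countDown() above
        return bindingNames; // guaranteed to see the write
    }

    public static void main(final String[] args) throws Exception {
        final LatchVisibilityDemo demo = new LatchVisibilityDemo();
        new Thread(new Runnable() {
            public void run() {
                demo.producer();
            }
        }).start();
        System.out.println(demo.consumer());
    }
}
}}}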
From: <tho...@us...> - 2013-08-30 13:57:55
Revision: 7366 http://bigdata.svn.sourceforge.net/bigdata/?rev=7366&view=rev
Author: thompsonbry
Date: 2013-08-30 13:57:48 +0000 (Fri, 30 Aug 2013)

Log Message:
-----------
The Jetty-based CHSScheduler is close to the performance of the java.util.concurrent.ConcurrentHashMap-based CHMScheduler.
{{{
TOTAL: analytic=BFS, nseed=217, nsamples=1000, nthreads=4, bufferMode=DiskRW, scheduler=CHMScheduler, edges(kb)=444224, stats(total)=nrounds=22834, frontierSize=39967201, ms=145801, edges=93559787, teps=641694
TOTAL: analytic=BFS, nseed=217, nsamples=1000, nthreads=4, bufferMode=DiskRW, scheduler=CHSScheduler, edges(kb)=444224, stats(total)=nrounds=22834, frontierSize=39967201, ms=148106, edges=93559787, teps=631706
}}}
Since the CHSScheduler does not dominate the CHMScheduler and since it creates a dependency on jetty, the CHSScheduler is being removed. See #629 (Graph Mining API)

Removed Paths:
-------------
    branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/scheduler/CHSScheduler.java

Deleted: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/scheduler/CHSScheduler.java
===================================================================
--- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/scheduler/CHSScheduler.java	2013-08-30 13:36:14 UTC (rev 7365)
+++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/scheduler/CHSScheduler.java	2013-08-30 13:57:48 UTC (rev 7366)
@@ -1,51 +0,0 @@
-package com.bigdata.rdf.graph.impl.scheduler;
-
-import org.eclipse.jetty.util.ConcurrentHashSet;
-import org.openrdf.model.Value;
-
-import com.bigdata.rdf.graph.IGASSchedulerImpl;
-import com.bigdata.rdf.graph.IStaticFrontier;
-import com.bigdata.rdf.graph.impl.GASEngine;
-
-/**
- * A simple scheduler based on a concurrent hash collection
- *
- * @author <a href="mailto:tho...@us...">Bryan
- *         Thompson</a>
- *
- *         FIXME SCHEDULER: This is a Jetty class. Unbundle it! Use CHM
- *         instead. See {@link CHMScheduler}.
- */
-public class CHSScheduler implements IGASSchedulerImpl {
-
-    private final ConcurrentHashSet<Value> vertices;
-
-    public CHSScheduler(final GASEngine gasEngine) {
-
-        vertices = new ConcurrentHashSet<Value>();
-
-    }
-
-    @Override
-    public void schedule(final Value v) {
-
-        vertices.add(v);
-
-    }
-
-    @Override
-    public void clear() {
-
-        vertices.clear();
-
-    }
-
-    @Override
-    public void compactFrontier(final IStaticFrontier frontier) {
-
-        frontier.resetFrontier(vertices.size()/* minCapacity */,
-                false/* ordered */, vertices.iterator());
-
-    }
-
-} // CHMScheduler
\ No newline at end of file
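For anyone wanting the same set semantics without the Jetty dependency, the JDK alone can supply an equivalent concurrent set via Collections.newSetFromMap over a ConcurrentHashMap. The sketch below shows that substitution; it is not the actual CHMScheduler source (which is not quoted in this message), and the compactFrontier() plumbing is elided.

{{{
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

import org.openrdf.model.Value;

/**
 * Sketch of a scheduler collection backed purely by JDK classes.
 * Collections.newSetFromMap(new ConcurrentHashMap<...>()) gives the
 * same thread-safe set behavior as Jetty's ConcurrentHashSet, which
 * is why CHSScheduler could be dropped without losing performance.
 */
public class JdkConcurrentSetSketch {

    private final Set<Value> vertices = Collections
            .newSetFromMap(new ConcurrentHashMap<Value, Boolean>());

    public void schedule(final Value v) {
        vertices.add(v); // thread-safe; duplicates are filtered.
    }

    public void clear() {
        vertices.clear();
    }

    public int size() {
        return vertices.size();
    }
}
}}}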
From: <tho...@us...> - 2013-08-30 13:36:21
Revision: 7365 http://bigdata.svn.sourceforge.net/bigdata/?rev=7365&view=rev
Author: thompsonbry
Date: 2013-08-30 13:36:14 +0000 (Fri, 30 Aug 2013)

Log Message:
-----------
The implementation to date has incorrectly failed to treat bnodes as vertices. All of the baselines will have to be recomputed (this has a large effect on the data set we have been using, since bnodes are quite common in the foaf crawl). See #629 (Graph Mining API)

Modified Paths:
--------------
    branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/GASState.java
    branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/bd/BigdataGASState.java

Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/GASState.java
===================================================================
--- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/GASState.java	2013-08-30 13:15:14 UTC (rev 7364)
+++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/GASState.java	2013-08-30 13:36:14 UTC (rev 7365)
@@ -7,6 +7,7 @@
 import java.util.concurrent.atomic.AtomicInteger;
 
 import org.apache.log4j.Logger;
+import org.openrdf.model.Resource;
 import org.openrdf.model.Statement;
 import org.openrdf.model.URI;
 import org.openrdf.model.Value;
@@ -329,7 +330,7 @@
     @Override
     public boolean isEdge(final Statement e) {
 
-        return e.getObject() instanceof URI; // FIXME CORRECTNESS (instanceof Resource)
+        return e.getObject() instanceof Resource;
 
     }
 
     @Override

Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/bd/BigdataGASState.java
===================================================================
--- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/bd/BigdataGASState.java	2013-08-30 13:15:14 UTC (rev 7364)
+++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/bd/BigdataGASState.java	2013-08-30 13:36:14 UTC (rev 7365)
@@ -117,9 +117,23 @@
     public boolean isEdge(final Statement e) {
 
         final ISPO spo = (ISPO) e;
+
+        /**
+         * For the early development of the GAS API, this test was written using
+         * o.isURI() rather than o.isResource(). That caused edges that ended in
+         * a bnode to be ignored, which means that a lot of the FOAF data set we
+         * were using was ignored. This was changed in r7365 to use
+         * isResource(). That change invalidates the historical baseline for the
+         * BFS and SSSP performance. This is also documented at the ticket
+         * below.
+         *
+         * @see <a
+         *      href="https://sourceforge.net/apps/trac/bigdata/ticket/629#comment:33">
+         *      Graph Mining API </a>
+         */
+        return spo.o().isResource();
+//        return spo.o().isURI();
 
-        return spo.o().isURI(); // FIXME CORRECTNESS : isResource()
-
     }
 
     @Override
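The one-line fix turns on the openrdf type hierarchy: URI and BNode are both subtypes of Resource, while Literal is not, so testing the object position with instanceof Resource admits the bnode-terminated links that instanceof URI silently dropped. A small demonstration (the example URIs and the use of ValueFactoryImpl are illustrative):

{{{
import org.openrdf.model.Resource;
import org.openrdf.model.Statement;
import org.openrdf.model.ValueFactory;
import org.openrdf.model.impl.ValueFactoryImpl;

/**
 * Why "instanceof Resource" is the correct edge test: an edge may end
 * in a URI or a blank node, while a literal in the object position is
 * a property value, not an edge.
 */
public class IsEdgeDemo {

    static boolean isEdge(final Statement e) {
        return e.getObject() instanceof Resource; // URI or BNode.
    }

    public static void main(final String[] args) {
        final ValueFactory vf = ValueFactoryImpl.getInstance();
        final Statement toURI = vf.createStatement(
                vf.createURI("http://example.org/a"),
                vf.createURI("http://xmlns.com/foaf/0.1/knows"),
                vf.createURI("http://example.org/b"));
        final Statement toBNode = vf.createStatement(
                vf.createURI("http://example.org/a"),
                vf.createURI("http://xmlns.com/foaf/0.1/knows"),
                vf.createBNode());
        final Statement toLiteral = vf.createStatement(
                vf.createURI("http://example.org/a"),
                vf.createURI("http://xmlns.com/foaf/0.1/name"),
                vf.createLiteral("Alice"));
        System.out.println(isEdge(toURI));     // true
        System.out.println(isEdge(toBNode));   // true (fixed by r7365)
        System.out.println(isEdge(toLiteral)); // false
    }
}
}}}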
From: <tho...@us...> - 2013-08-30 13:15:26
Revision: 7364 http://bigdata.svn.sourceforge.net/bigdata/?rev=7364&view=rev Author: thompsonbry Date: 2013-08-30 13:15:14 +0000 (Fri, 30 Aug 2013) Log Message: ----------- Refactored the GAS API to use the openrdf model classes for vertices (Value) and edges (Statement). The concrete implementation is still based on the bigdata indices and the bigdata AbstractTripleStore. However, it should now be easy to create an implementation based on the MemorySail. Performance should not change with this commit since the underlying implementation is essentially unchanged. We still use IV and ISPO objects, but those implement Value and Statement. All tests pass. The local performance test provides the same results and performance as the previous committed version. See #629 (Graph Mining API) Modified Paths: -------------- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/GASUtil.java branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/IGASContext.java branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/IGASOptions.java branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/IGASProgram.java branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/IGASScheduler.java branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/IGASSchedulerImpl.java branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/IGASState.java branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/IGraphAccessor.java branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/IReducer.java branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/IStaticFrontier.java branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/analytics/BFS.java branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/analytics/SSSP.java branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/BaseGASProgram.java branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/GASContext.java branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/GASEngine.java branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/GASRunner.java branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/GASState.java branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/StaticFrontier2.java branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/VertexTaskFactory.java branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/bd/BigdataGASEngine.java branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/bd/BigdataGASState.java branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/scheduler/CHMScheduler.java branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/scheduler/CHSScheduler.java branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/scheduler/STScheduler.java branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/scheduler/TLScheduler.java branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/scheduler/TLScheduler2.java branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/util/GASImplUtil.java branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/spo/ISPO.java 
branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/spo/SPO.java branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/test/com/bigdata/rdf/graph/AbstractGraphTestCase.java branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/test/com/bigdata/rdf/graph/impl/TestGather.java Added Paths: ----------- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/bd/MergeSortIterator.java Removed Paths: ------------- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/util/MergeSortIterator.java Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/GASUtil.java =================================================================== --- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/GASUtil.java 2013-08-30 10:24:13 UTC (rev 7363) +++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/GASUtil.java 2013-08-30 13:15:14 UTC (rev 7364) @@ -1,11 +1,14 @@ package com.bigdata.rdf.graph; +import java.util.Iterator; import java.util.concurrent.TimeUnit; -import com.bigdata.rdf.internal.IV; -import com.bigdata.rdf.spo.ISPO; +import org.openrdf.model.Statement; +import org.openrdf.model.Value; +import cutthecrap.utils.striterators.EmptyIterator; + /** * Utility class for operations on the public interfaces. * @@ -16,31 +19,6 @@ // private static final Logger log = Logger.getLogger(GASUtil.class); /** - * Return the other end of a link. - * - * @param u - * One end of the link. - * @param e - * The link. - * - * @return The other end of the link. - * - * FIXME We can optimize this to use reference testing if we are - * careful in the GATHER and SCATTER implementations to always use - * the {@link IV} values on the {@link ISPO} object that is exposed - * to the {@link IGASProgram}. - */ - @SuppressWarnings("rawtypes") - public static IV getOtherVertex(final IV u, final ISPO e) { - - if (e.s().equals(u)) - return e.o(); - - return e.s(); - - } - - /** * The average fan out of the frontier. * * @param frontierSize @@ -86,4 +64,16 @@ } + /** + * An empty vertex iterator. + */ + @SuppressWarnings({ "unchecked" }) + public static final Iterator<Value> EMPTY_VERTICES_ITERATOR = EmptyIterator.DEFAULT; + + /** + * An empty edge iterator. + */ + @SuppressWarnings({ "unchecked" }) + public static final Iterator<Statement> EMPTY_EDGES_ITERATOR = EmptyIterator.DEFAULT; + } Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/IGASContext.java =================================================================== --- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/IGASContext.java 2013-08-30 10:24:13 UTC (rev 7363) +++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/IGASContext.java 2013-08-30 13:15:14 UTC (rev 7364) @@ -45,5 +45,11 @@ */ boolean doRound(IGASStats stats) throws Exception, ExecutionException, InterruptedException; + + /** + * Execute the associated {@link IGASProgram}. 
+ */ + @Override + IGASStats call() throws Exception; } \ No newline at end of file Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/IGASOptions.java =================================================================== --- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/IGASOptions.java 2013-08-30 10:24:13 UTC (rev 7363) +++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/IGASOptions.java 2013-08-30 13:15:14 UTC (rev 7364) @@ -1,7 +1,7 @@ package com.bigdata.rdf.graph; -import com.bigdata.rdf.internal.IV; -import com.bigdata.rdf.spo.ISPO; +import org.openrdf.model.Statement; +import org.openrdf.model.Value; import cutthecrap.utils.striterators.IStriterator; @@ -20,7 +20,7 @@ * * @author <a href="mailto:tho...@us...">Bryan Thompson</a> */ -public interface IGASOptions<VS, ES> { +public interface IGASOptions<VS, ES, ST> { /** * Return the set of edges to which the GATHER is applied -or- @@ -41,15 +41,14 @@ * map, so if the algorithm does not use vertex state, then the factory * should return a singleton instance each time it is invoked. */ - @SuppressWarnings("rawtypes") - Factory<IV, VS> getVertexStateFactory(); + Factory<Value, VS> getVertexStateFactory(); /** * Return a factory for edge state objects -or- <code>null</code> if the * {@link IGASProgram} does not use edge state (in which case the edge state * will not be allocated or maintained). */ - Factory<ISPO, ES> getEdgeStateFactory(); + Factory<Statement, ES> getEdgeStateFactory(); /** * Return non-<code>null</code> iff there is a single link type to be @@ -61,55 +60,30 @@ * property set for the vertex. The graph is treated as if it were an * unattributed graph and only mined for the connectivity data. * - * @return The {@link IV} for the predicate that identifies the desired link - * type (there can be many types of links - the return value + * @return The {@link Value} for the predicate that identifies the desired + * link type (there can be many types of links - the return value * specifies which attribute is of interest). * * @see #getLinkAttribType() */ - @SuppressWarnings("rawtypes") - IV getLinkType(); + Value getLinkType(); -// /** -// * Return non-<code>null</code> iff there is a single link type to be -// * visited. This corresponds to a view of the graph as a sparse\xCAmatrix where -// * the data in the matrix provides the link weights. The type of the visited -// * link weights is specified by the return value for this method. The -// * {@link IGASEngine} can optimize traversal patterns using the -// * <code>POS</code> index. -// * <p> -// * Note: When this option is used, the scatter and gather will not visit the -// * property set for the vertex. The graph is treated as if it were an -// * unattributed graph and only mined for the connectivity data. -// * -// * @return The {@link IV} for the predicate that identifies the desired link -// * attribute type (a link can have many attributes - the return -// * value specifies which attribute is of interest). -// * -// * @see #getLinkType() -// */ -// IV getLinkAttribType(); -// -// /** -// * When non-<code>null</code>, the specified {@link Filter} will be used to -// * restrict the visited edges. For example, you can restrict the visitation -// * to a subset of the predicates that are of interest, to only visit edges -// * that have link edges, to visit only select property values, etc. Some -// * useful filters are defined in an abstract implementation of this -// * interface. 
-// * -// * @see #visitPropertySet() -// */ -// IFilterTest getEdgeFilter(); - /** * Hook to impose a constraint on the visited edges and/or property values. * * @param itr * The iterator visiting those edges and/or property values. - * + * * @return Either the same iterator or a constrained iterator. + * + * TODO Rename as constrainEdgeFilter or even split into a + * constrainGatherFilter and a constraintScatterFilter. + * + * FIXME APPLY : If we need access to the vertex property values in + * APPLY (which we probably do, at least optionally), then there + * should be a similar method to decide whether the property values + * for the vertex are made available during the APPLY. */ - IStriterator constrainFilter(IStriterator eitr); + IStriterator constrainFilter(IGASContext<VS, ES, ST> ctx, IStriterator eitr); } \ No newline at end of file Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/IGASProgram.java =================================================================== --- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/IGASProgram.java 2013-08-30 10:24:13 UTC (rev 7363) +++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/IGASProgram.java 2013-08-30 13:15:14 UTC (rev 7364) @@ -1,7 +1,7 @@ package com.bigdata.rdf.graph; -import com.bigdata.rdf.internal.IV; -import com.bigdata.rdf.spo.ISPO; +import org.openrdf.model.Statement; +import org.openrdf.model.Value; /** * Abstract interface for GAS programs. @@ -19,88 +19,8 @@ * the computation). * @author <a href="mailto:tho...@us...">Bryan Thompson</a> */ -@SuppressWarnings("rawtypes") -public interface IGASProgram<VS, ES, ST> extends IGASOptions<VS, ES> { +public interface IGASProgram<VS, ES, ST> extends IGASOptions<VS, ES, ST> { - /* - * TODO Unfortunately this pattern of hiding our more complex interfaces can - * not be made to work without creating wrapper objects that implement the - * derived interface, even though we would just like to use it as a marker - * interface. It might be workable if we put this under the IV and ISPO - * interfaces (as simpler interfaces without generic types). - */ - -// /** -// * A shorthand for the {@link IV} interface that cleans up the generic type -// * warnings. An {@link IV} corresponds to a vertex of the graph or the value -// * of an attribute. {@link IV}s may be materialized or not. For efficiency, -// * it is better to operate without materialization of the corresponding RDF -// * {@link Value}. Many {@link IV}s are <em>inline</em> can be immediately -// * interpreted as if they were materialized RDF {@link Value}s - for -// * example, this is true by default for all <code>xsd</code> numeric -// * datatypes. It may also be true of other kinds of {@link Value}s depending -// * on how the KB was configured. -// * -// * @author <a href="mailto:tho...@us...">Bryan -// * Thompson</a> -// */ -// @SuppressWarnings("rawtypes") -// private interface IV extends com.bigdata.rdf.internal.IV { -// -// } -// -// /** -// * An edge is comprised of a Subject (s), Predicate (p), and Object (o). -// * Depending on the KB configuration, there may also be an Context (c) -// * position on the edge - when present the Context supports the concept of -// * SPARQL named graphs. 
-// * <dl> -// * <dt>Subject</dt> -// * <dd>The Subject is either a {@link URI} or a {@link BNode}.</dd> -// * <dt>Predicate</dt> -// * <dd>The Predicate is always a {@link URI}.</dd> -// * <dt>Object</dt> -// * <dd>The Object is either a {@link URI} (in which case the "edge" is a -// * link) or a {@link Literal} (in which case the edge is a property value).</dd> -// * <dt>Context</dt> -// * <dd>The Context is either a {@link URI} or a {@link BNode}.</dd> -// * </dl> -// * Note that the Subject, Predicate, Object, and Context will be {@link IV} -// * instances and hence might or might not be materialized RDF {@link Value}s -// * and might or might not be <em>inline</em> and hence directly inspectable -// * as if they were materialized RDF {@link Value}s. -// * -// * @author <a href="mailto:tho...@us...">Bryan -// * Thompson</a> -// */ -// private interface ISPO extends com.bigdata.rdf.spo.ISPO { -// -// /** -// * {@inheritDoc} -// */ -// @Override -// IV s(); -// -// /** -// * {@inheritDoc} -// */ -// @Override -// IV p(); -// -// /** -// * {@inheritDoc} -// */ -// @Override -// IV o(); -// -// /** -// * {@inheritDoc} -// */ -// @Override -// IV c(); -// -// } - /** * Callback to initialize the state for each vertex in the initial frontier * before the first iteration. A typical use case is to set the distance of @@ -109,7 +29,7 @@ * @param u * The vertex. */ - void init(IGASState<VS, ES, ST> state, IV u); + void init(IGASState<VS, ES, ST> state, Value u); /** * GATHER is a map/reduce over the edges of the vertex. The SUM provides @@ -143,7 +63,7 @@ * depends on the algorithm. How can we get these constraints into * the API? */ - ST gather(IGASState<VS, ES, ST> state, IV u, ISPO e); + ST gather(IGASState<VS, ES, ST> state, Value u, Statement e); /** * SUM is a pair-wise reduction that is applied during the GATHER. @@ -190,7 +110,7 @@ * when compared to either the frontier or the set of states that * have been in the frontier during the computation. */ - VS apply(IGASState<VS, ES, ST> state, IV u, ST sum); + VS apply(IGASState<VS, ES, ST> state, Value u, ST sum); /** * Return <code>true</code> iff the vertex should run its SCATTER phase. @@ -203,7 +123,7 @@ * The vertex. * @return */ - boolean isChanged(IGASState<VS, ES, ST> state, IV u); + boolean isChanged(IGASState<VS, ES, ST> state, Value u); /** * @@ -213,7 +133,7 @@ * @param e * The edge. */ - void scatter(IGASState<VS, ES, ST> state, IGASScheduler sch, IV u, ISPO e); + void scatter(IGASState<VS, ES, ST> state, IGASScheduler sch, Value u, Statement e); /** * Return <code>true</code> iff the algorithm should continue. This is Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/IGASScheduler.java =================================================================== --- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/IGASScheduler.java 2013-08-30 10:24:13 UTC (rev 7363) +++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/IGASScheduler.java 2013-08-30 13:15:14 UTC (rev 7364) @@ -1,6 +1,6 @@ package com.bigdata.rdf.graph; -import com.bigdata.rdf.internal.IV; +import org.openrdf.model.Value; /** * Interface schedules a vertex for execution. This interface is exposed to the @@ -16,6 +16,6 @@ * @param v * The vertex. 
*/ - void schedule(@SuppressWarnings("rawtypes") IV v); + void schedule(Value v); } Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/IGASSchedulerImpl.java =================================================================== --- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/IGASSchedulerImpl.java 2013-08-30 10:24:13 UTC (rev 7363) +++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/IGASSchedulerImpl.java 2013-08-30 13:15:14 UTC (rev 7364) @@ -1,7 +1,5 @@ package com.bigdata.rdf.graph; -import com.bigdata.rdf.internal.IV; - /** * Extended {@link IGASScheduler} interface. This interface is exposed to the * implementation of the GAS Engine. The methods on this interface are NOT for @@ -15,11 +13,6 @@ /** * Compact the schedule into the new frontier. - * <p> - * Note: Typical contracts ensure that the frontier is compact (no - * duplicates) and in ascending {@link IV} order (this provides cache - * locality for the index reads, even if those reads are against indices - * wired into RAM). */ void compactFrontier(IStaticFrontier frontier); Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/IGASState.java =================================================================== --- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/IGASState.java 2013-08-30 10:24:13 UTC (rev 7363) +++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/IGASState.java 2013-08-30 13:15:14 UTC (rev 7364) @@ -1,7 +1,8 @@ package com.bigdata.rdf.graph; -import com.bigdata.rdf.internal.IV; -import com.bigdata.rdf.spo.ISPO; +import org.openrdf.model.Statement; +import org.openrdf.model.URI; +import org.openrdf.model.Value; /** * Interface exposes access to the VS and ES that is visible during a GATHER or @@ -42,7 +43,7 @@ * @throws IllegalArgumentException * if no vertices are specified. */ - void init(@SuppressWarnings("rawtypes") IV... v); + void init(Value... v); /** * Discard computation state (the frontier, vertex state, and edge state) @@ -70,7 +71,7 @@ * * @see IGASProgram#getVertexStateFactory() */ - VS getState(@SuppressWarnings("rawtypes") IV v); + VS getState(Value v); /** * Get the state for the edge using the appropriate factory. If this is the @@ -84,7 +85,7 @@ * * @see IGASProgram#getEdgeStateFactory() */ - ES getState(ISPO e); + ES getState(Statement e); /** * The current frontier. @@ -120,12 +121,96 @@ void traceState(); /** + * Return the other end of a link. + * + * @param u + * One end of the link. + * @param e + * The link. + * + * @return The other end of the link. + */ + Value getOtherVertex(Value u, Statement e); + + /** * Return a useful representation of an edge (non-batch API, debug only). + * This method is only required when the edge objects are internal database + * objects lacking fully materialized RDF {@link Value}s. In this case, it + * will materialize the RDF Values and present a pleasant view of the edge. + * The materialization step is a random access, which is why this method is + * for debug only. Efficient, vectored mechanisms exist to materialize RDF + * {@link Value}s for other purposes, e.g., when exporting a set of edges as + * as graph in a standard interchange syntax. * * @param e * The edge. * @return The representation of that edge. 
*/ - String toString(ISPO e); + String toString(Statement e); + /** + * Return <code>true</code> iff the given {@link Statement} models an edge + * that connects two vertices ({@link Statement}s also model property + * values). + * + * @param e + * The statement. + * + * @return <code>true</code> iff that {@link Statement} is an edge of the + * graph. + */ + boolean isEdge(final Statement e); + + /** + * Return <code>true</code> iff the given {@link Statement} models an + * property value for a vertex of the graph ({@link Statement}s also model + * edges). + * + * @param e + * The statement. + * @return <code>true</code> iff that {@link Statement} is an edge of the + * graph. + */ + boolean isAttrib(Statement e); + + /** + * Return <code>true</code> iff the statement models a link attribute having + * the specified link type. When this method returns <code>true</code>, the + * {@link Statement#getSubject()} may be decoded to obtain the link + * described by that link attribute using {@link #decodeStatement(Value)}. + * + * @param e + * The statement. + * @param linkAttribType + * The type for the link attribute. + * + * @return <code>true</code> iff the statement is an instance of a link + * attribute for the specified link type. + */ + boolean isLinkAttrib(Statement e, URI linkAttribType); + + /** + * If the vertex is actually an edge, then return the decoded edge. + * <p> + * Note: A vertex may be an edge. A link attribute is modeled by treating + * the link as a vertex and then asserting a property value about that + * "link vertex". For bigdata, this is handled efficiently as inline + * statements about statements. This approach subsumes the property graph + * model (property graphs do not permit recursive nesting of these + * relationships) and is 100% consistent with RDF reification, except that + * the link attributes are modeled efficiently inline with the links. This + * is what we call <a + * href="http://www.bigdata.com/whitepapers/reifSPARQL.pdf" > Reification + * Done Right </a>. + * + * @param v + * The vertex. + * + * @return The edge decoded from that vertex and <code>null</code> iff the + * vertex is not an edge. + * + * TODO RDR : Link to an RDR wiki page as well. + */ + Statement decodeStatement(Value v); + } Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/IGraphAccessor.java =================================================================== --- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/IGraphAccessor.java 2013-08-30 10:24:13 UTC (rev 7363) +++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/IGraphAccessor.java 2013-08-30 13:15:14 UTC (rev 7364) @@ -1,9 +1,10 @@ package com.bigdata.rdf.graph; import java.util.Iterator; +import java.util.Random; -import com.bigdata.rdf.internal.IV; -import com.bigdata.rdf.spo.ISPO; +import org.openrdf.model.Statement; +import org.openrdf.model.Value; /** * Interface abstracts access to a backend graph implementation. @@ -15,15 +16,17 @@ /** * Return the edges for the vertex. * - * @param p The {@link IGASProgram} + * @param p + * The {@link IGASContext}. * @param u * The vertex. * @param edges * Typesafe enumeration indicating which edges should be visited. + * * @return An iterator that will visit the edges for that vertex. 
*/ - @SuppressWarnings("rawtypes") - Iterator<ISPO> getEdges(IGASProgram<?, ?, ?> p, IV u, EdgesEnum edges); + Iterator<Statement> getEdges(IGASContext<?, ?, ?> p, Value u, + EdgesEnum edges); /** * Hook to advance the view of the graph. This is invoked at the end of each @@ -31,4 +34,19 @@ */ void advanceView(); + /** + * Return a sample (without duplicates) of vertices from the graph. + * + * @param desiredSampleSize + * The desired sample size. + * + * @return The distinct samples that were found. + * + * FIXME Specify whether the sample must be uniform over the + * vertices, proportional to the #of (in/out)edges for those + * vertices, etc. Without a clear specification, we will not + * have the same behavior across different backends. + */ + Value[] getRandomSample(final Random r, final int desiredSampleSize); + } Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/IReducer.java =================================================================== --- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/IReducer.java 2013-08-30 10:24:13 UTC (rev 7363) +++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/IReducer.java 2013-08-30 13:15:14 UTC (rev 7364) @@ -26,7 +26,7 @@ */ package com.bigdata.rdf.graph; -import com.bigdata.rdf.internal.IV; +import org.openrdf.model.Value; /** * An interface for computing reductions over the vertices of a graph. @@ -48,7 +48,7 @@ * The result from applying the procedure to a single index * partition. */ - public void visit(IGASState<VS, ES, ST> ctx, @SuppressWarnings("rawtypes") IV u); + public void visit(IGASState<VS, ES, ST> ctx, Value u); /** * Return the aggregated results as an implementation dependent object. Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/IStaticFrontier.java =================================================================== --- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/IStaticFrontier.java 2013-08-30 10:24:13 UTC (rev 7363) +++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/IStaticFrontier.java 2013-08-30 13:15:14 UTC (rev 7364) @@ -2,7 +2,7 @@ import java.util.Iterator; -import com.bigdata.rdf.internal.IV; +import org.openrdf.model.Value; /** * Interface abstracts the fixed frontier as known on entry into a new @@ -11,8 +11,7 @@ * @author <a href="mailto:tho...@us...">Bryan * Thompson</a> */ -@SuppressWarnings("rawtypes") -public interface IStaticFrontier extends Iterable<IV> { +public interface IStaticFrontier extends Iterable<Value> { /** * The number of vertices in the frontier. @@ -43,7 +42,7 @@ boolean isCompact(); /** - * Reset the frontier from the {@link IV}s. + * Reset the frontier from the supplied vertices. * * @param minCapacity * The minimum capacity of the new frontier. (A minimum capacity @@ -53,7 +52,15 @@ * <code>true</code> iff the frontier is known to be ordered. * @param vertices * The vertices in the new frontier. + * + * FIXME The MergeSortIterator and ordered frontiers should only + * be used when the graph is backed by indices. The cache + * efficiency concerns for ordered frontiers apply when accessing + * indices. However, we should not presump that order of + * traversal matters for other backends. For example, a backend + * based on hash collections over RDF Statement objects will not + * benefit from an ordering based on the RDF Value objects. 
*/ - void resetFrontier(int minCapacity, boolean ordered, Iterator<IV> vertices); + void resetFrontier(int minCapacity, boolean ordered, Iterator<Value> vertices); } Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/analytics/BFS.java =================================================================== --- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/analytics/BFS.java 2013-08-30 10:24:13 UTC (rev 7363) +++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/analytics/BFS.java 2013-08-30 13:15:14 UTC (rev 7364) @@ -2,14 +2,15 @@ import java.util.concurrent.atomic.AtomicInteger; +import org.openrdf.model.Statement; +import org.openrdf.model.Value; + import com.bigdata.rdf.graph.EdgesEnum; import com.bigdata.rdf.graph.Factory; import com.bigdata.rdf.graph.IGASContext; import com.bigdata.rdf.graph.IGASScheduler; import com.bigdata.rdf.graph.IGASState; import com.bigdata.rdf.graph.impl.BaseGASProgram; -import com.bigdata.rdf.internal.IV; -import com.bigdata.rdf.spo.ISPO; import cutthecrap.utils.striterators.IStriterator; @@ -21,7 +22,6 @@ * * @author <a href="mailto:tho...@us...">Bryan Thompson</a> */ -@SuppressWarnings("rawtypes") public class BFS extends BaseGASProgram<BFS.VS, BFS.ES, Void> { static class VS { @@ -87,10 +87,10 @@ } - private static final Factory<IV, BFS.VS> vertexStateFactory = new Factory<IV, BFS.VS>() { + private static final Factory<Value, BFS.VS> vertexStateFactory = new Factory<Value, BFS.VS>() { @Override - public BFS.VS initialValue(final IV value) { + public BFS.VS initialValue(final Value value) { return new VS(); @@ -99,14 +99,14 @@ }; @Override - public Factory<IV, BFS.VS> getVertexStateFactory() { + public Factory<Value, BFS.VS> getVertexStateFactory() { return vertexStateFactory; } @Override - public Factory<ISPO, BFS.ES> getEdgeStateFactory() { + public Factory<Statement, BFS.ES> getEdgeStateFactory() { return null; @@ -132,17 +132,18 @@ * Overridden to only visit the edges of the graph. */ @Override - public IStriterator constrainFilter(IStriterator itr) { + public IStriterator constrainFilter( + final IGASContext<BFS.VS, BFS.ES, Void> ctx, final IStriterator itr) { - return itr.addFilter(edgeOnlyFilter); - + return itr.addFilter(getEdgeOnlyFilter(ctx)); + } /** * Not used. */ @Override - public void init(final IGASState<BFS.VS, BFS.ES, Void> state, final IV u) { + public void init(final IGASState<BFS.VS, BFS.ES, Void> state, final Value u) { state.getState(u).visit(0); @@ -152,7 +153,7 @@ * Not used. */ @Override - public Void gather(IGASState<BFS.VS, BFS.ES, Void> state, IV u, ISPO e) { + public Void gather(IGASState<BFS.VS, BFS.ES, Void> state, Value u, Statement e) { throw new UnsupportedOperationException(); } @@ -168,7 +169,7 @@ * NOP */ @Override - public BFS.VS apply(final IGASState<BFS.VS, BFS.ES, Void> state, final IV u, + public BFS.VS apply(final IGASState<BFS.VS, BFS.ES, Void> state, final Value u, final Void sum) { return null; @@ -179,7 +180,7 @@ * Returns <code>true</code>. */ @Override - public boolean isChanged(IGASState<VS, ES, Void> state, IV u) { + public boolean isChanged(IGASState<VS, ES, Void> state, Value u) { return true; @@ -190,14 +191,15 @@ * visited. * <p> * Note: We are scattering to out-edges. Therefore, this vertex is - * {@link ISPO#s()}. The remote vertex is {@link ISPO#o()}. + * {@link Statement#getSubject()}. The remote vertex is + * {@link Statement#getObject()}. 
*/ @Override public void scatter(final IGASState<BFS.VS, BFS.ES, Void> state, - final IGASScheduler sch, final IV u, final ISPO e) { + final IGASScheduler sch, final Value u, final Statement e) { // remote vertex state. - final VS otherState = state.getState(e.o()); + final VS otherState = state.getState(e.getObject()); // visit. if (otherState.visit(state.round() + 1)) { @@ -207,14 +209,14 @@ * schedule for the next iteration. */ - sch.schedule(e.o()); + sch.schedule(e.getObject()); } } @Override - public boolean nextRound(IGASContext ctx) { + public boolean nextRound(IGASContext<BFS.VS, BFS.ES, Void> ctx) { return true; Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/analytics/SSSP.java =================================================================== --- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/analytics/SSSP.java 2013-08-30 10:24:13 UTC (rev 7363) +++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/analytics/SSSP.java 2013-08-30 13:15:14 UTC (rev 7364) @@ -1,14 +1,14 @@ package com.bigdata.rdf.graph.analytics; import org.apache.log4j.Logger; +import org.openrdf.model.Statement; +import org.openrdf.model.Value; import com.bigdata.rdf.graph.Factory; -import com.bigdata.rdf.graph.GASUtil; +import com.bigdata.rdf.graph.IGASContext; import com.bigdata.rdf.graph.IGASScheduler; import com.bigdata.rdf.graph.IGASState; import com.bigdata.rdf.graph.impl.BaseGASProgram; -import com.bigdata.rdf.internal.IV; -import com.bigdata.rdf.spo.ISPO; import cutthecrap.utils.striterators.IStriterator; @@ -31,7 +31,6 @@ * getOtherVertex(e) method to figure out the other edge when using * undirected scatter/gather. Add unit test for undirected. */ -@SuppressWarnings("rawtypes") public class SSSP extends BaseGASProgram<SSSP.VS, SSSP.ES, Integer/* dist */> { private static final Logger log = Logger.getLogger(SSSP.class); @@ -40,10 +39,10 @@ * The length of an edge. * * FIXME RDR: This should be modified to use link weights with RDR. We need - * a pattern to get the link attributes materialized with the {@link ISPO} + * a pattern to get the link attributes materialized with the {@link Statement} * for the link. That could be done using a read-ahead filter on the * striterator if the link weights are always clustered with the ground - * triple. See {@link #decodeStatement(IV)}. + * triple. See {@link #decodeStatement(Value)}. * <P> * When we make this change, the distance should be of the same type as the * link weight or generalized as <code>double</code>. @@ -115,10 +114,10 @@ } - private static final Factory<IV, SSSP.VS> vertexStateFactory = new Factory<IV, SSSP.VS>() { + private static final Factory<Value, SSSP.VS> vertexStateFactory = new Factory<Value, SSSP.VS>() { @Override - public SSSP.VS initialValue(final IV value) { + public SSSP.VS initialValue(final Value value) { return new VS(); @@ -127,7 +126,7 @@ }; @Override - public Factory<IV, SSSP.VS> getVertexStateFactory() { + public Factory<Value, SSSP.VS> getVertexStateFactory() { return vertexStateFactory; @@ -160,9 +159,11 @@ * Overridden to only visit the edges of the graph. 
*/ @Override - public IStriterator constrainFilter(IStriterator itr) { + public IStriterator constrainFilter( + final IGASContext<SSSP.VS, SSSP.ES, Integer> ctx, + final IStriterator itr) { - return itr.addFilter(edgeOnlyFilter); + return itr.addFilter(getEdgeOnlyFilter(ctx)); } @@ -173,7 +174,7 @@ */ @Override public void init(final IGASState<SSSP.VS, SSSP.ES, Integer> state, - final IV u) { + final Value u) { final VS us = state.getState(u); @@ -196,11 +197,11 @@ */ @Override public Integer gather(final IGASState<SSSP.VS, SSSP.ES, Integer> state, - final IV u, final ISPO e) { + final Value u, final Statement e) { // assert e.o().equals(u); - final VS src = state.getState(e.s()); + final VS src = state.getState(e.getSubject()); final int d = src.dist(); @@ -233,7 +234,7 @@ */ @Override public SSSP.VS apply(final IGASState<SSSP.VS, SSSP.ES, Integer> state, - final IV u, final Integer sum) { + final Value u, final Integer sum) { if (sum != null) { @@ -263,7 +264,7 @@ @Override public boolean isChanged(final IGASState<SSSP.VS, SSSP.ES, Integer> state, - final IV u) { + final Value u) { return state.getState(u).isChanged(); @@ -273,7 +274,8 @@ * The remote vertex is scheduled if this vertex is changed. * <p> * Note: We are scattering to out-edges. Therefore, this vertex is - * {@link ISPO#s()}. The remote vertex is {@link ISPO#o()}. + * {@link Statement#getSubect()}. The remote vertex is + * {@link Statement#getObject()}. * <p> * {@inheritDoc} * @@ -297,9 +299,9 @@ */ @Override public void scatter(final IGASState<SSSP.VS, SSSP.ES, Integer> state, - final IGASScheduler sch, final IV u, final ISPO e) { + final IGASScheduler sch, final Value u, final Statement e) { - final IV other = GASUtil.getOtherVertex(u, e); + final Value other = state.getOtherVertex(u, e); final VS selfState = state.getState(u); @@ -323,7 +325,7 @@ + ", scheduling: " + other + " with newDist=" + newDist); // Then add the remote vertex to the next frontier. - sch.schedule(e.o()); + sch.schedule(e.getObject()); } Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/BaseGASProgram.java =================================================================== --- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/BaseGASProgram.java 2013-08-30 10:24:13 UTC (rev 7363) +++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/BaseGASProgram.java 2013-08-30 13:15:14 UTC (rev 7364) @@ -1,13 +1,14 @@ package com.bigdata.rdf.graph.impl; +import org.openrdf.model.Statement; +import org.openrdf.model.URI; +import org.openrdf.model.Value; + import com.bigdata.rdf.graph.EdgesEnum; import com.bigdata.rdf.graph.Factory; import com.bigdata.rdf.graph.IGASContext; import com.bigdata.rdf.graph.IGASProgram; import com.bigdata.rdf.graph.IGASState; -import com.bigdata.rdf.internal.IV; -import com.bigdata.rdf.internal.impl.bnode.SidIV; -import com.bigdata.rdf.spo.ISPO; import cutthecrap.utils.striterators.Filter; import cutthecrap.utils.striterators.IFilter; @@ -21,134 +22,120 @@ * @param <ES> * @param <ST> */ -@SuppressWarnings("rawtypes") abstract public class BaseGASProgram<VS, ES, ST> implements IGASProgram<VS, ES, ST> { /** + * {@inheritDoc} + * <p> + * The default implementation does not restrict the visitation to a + * connectivity matrix (returns <code>null</code>). + */ + @Override + public Value getLinkType() { + + return null; + + } + + /** + * {@inheritDoc} + * <p> + * The default implementation returns its argument. 
+ */ + @Override + public IStriterator constrainFilter(final IGASContext<VS, ES, ST> ctx, + final IStriterator itr) { + + return itr; + + } + + /** + * Return an {@link IFilter} that will only visit the edges of the graph. + * + * @see IGASState#isEdge(Statement) + */ + protected IFilter getEdgeOnlyFilter(final IGASContext<VS, ES, ST> ctx) { + + return new EdgeOnlyFilter(ctx); + + } + + /** * Filter visits only edges (filters out attribute values). * <p> * Note: This filter is pushed down onto the AP and evaluated close to the * data. */ - protected static final IFilter edgeOnlyFilter = new Filter() { + private class EdgeOnlyFilter extends Filter { private static final long serialVersionUID = 1L; - + private final IGASState<VS, ES, ST> gasState; + private EdgeOnlyFilter(final IGASContext<VS, ES, ST> ctx) { + this.gasState = ctx.getGASState(); + } @Override public boolean isValid(final Object e) { - return ((ISPO) e).o().isURI(); + return gasState.isEdge((Statement) e); } }; - + /** - * Return <code>true</code> iff the visited {@link ISPO} is an instance - * of the specified link attribute type. + * Return a filter that only visits the edges of graph that are instances of + * the specified link attribute type. + * <p> + * Note: For bigdata, the visited edges can be decoded to recover the + * original link as well. * - * @return + * @see IGASState#isLinkAttrib(Statement, URI) + * @see IGASState#decodeStatement(Value) */ - protected static final IFilter newLinkAttribFilter(final IV linkAttribType) { + protected IFilter getLinkAttribFilter(final IGASContext<VS, ES, ST> ctx, + final URI linkAttribType) { - return new LinkAttribFilter(linkAttribType); - + return new LinkAttribFilter(ctx, linkAttribType); + } - static class LinkAttribFilter extends Filter { - + /** + * Filter visits only edges where the {@link Statement} is an instance of + * the specified link attribute type. For bigdata, the visited edges can be + * decoded to recover the original link as well. + */ + private class LinkAttribFilter extends Filter { private static final long serialVersionUID = 1L; - private final IV linkAttribType; + private final IGASState<VS, ES, ST> gasState; + private final URI linkAttribType; - public LinkAttribFilter(final IV linkAttribType) { - + public LinkAttribFilter(final IGASContext<VS, ES, ST> ctx, + final URI linkAttribType) { if (linkAttribType == null) throw new IllegalArgumentException(); - + this.gasState = ctx.getGASState(); this.linkAttribType = linkAttribType; - } @Override public boolean isValid(final Object e) { - final ISPO edge = (ISPO) e; - if(!edge.p().equals(linkAttribType)) { - // Edge does not use the specified link attribute type. - return false; - } - if (!(edge.s() instanceof SidIV)) { - // The subject of the edge is not a Statement. - return false; - } - return true; + return gasState.isLinkAttrib((Statement) e, linkAttribType); } - } - /** - * If the vertex is actually an edge, then return the decoded edge. - * <p> - * Note: A vertex may be an edge. A link attribute is modeled by treating - * the link as a vertex and then asserting a property value about that - * "link vertex". For bigdata, this is handled efficiently as inline - * statements about statements. This approach subsumes the property graph - * model (property graphs do not permit recursive nesting of these - * relationships) and is 100% consistent with RDF reification, except that - * the link attributes are modeled efficiently inline with the links. 
This - * is what we call <a - * href="http://www.bigdata.com/whitepapers/reifSPARQL.pdf" > Reification - * Done Right </a>. - * - * @param v - * The vertex. - * - * @return The edge decoded from that vertex and <code>null</code> iff the - * vertex is not an edge. - * - * TODO RDR : Link to an RDR wiki page as well. - * - * TODO We can almost write the same logic at the openrdf layer - * using <code>v instanceof Statement</code>. However, v can not be - * a Statement for openrdf and there is no way to decode the vertex - * as a Statement in openrdf. - */ - protected ISPO decodeStatement(final IV v) { +// /** +// * If the vertex is actually an edge, then return the decoded edge. +// * +// * @see GASUtil#decodeStatement(Value) +// */ +// protected Statement decodeStatement(final Value v) { +// +// return GASUtil.decodeStatement(v); +// +// } - if (!v.isStatement()) - return null; - - final ISPO decodedEdge = (ISPO) v.getInlineValue(); - - return decodedEdge; - - } - /** * {@inheritDoc} * <p> - * The default implementation does not restrict the visitation to a - * connectivity matrix (returns <code>null</code>). - */ - @Override - public IV getLinkType() { - - return null; - - } - - /** - * {@inheritDoc} - * <p> - * The default implementation returns its argument. - */ - @Override - public IStriterator constrainFilter(IStriterator itr) { - - return itr; - - } - - /** - * {@inheritDoc} - * <p> * The default gathers on the {@link EdgesEnum#InEdges}. */ @Override @@ -176,13 +163,13 @@ * The default is a NOP. */ @Override - public void init(final IGASState<VS, ES, ST> state, final IV u) { + public void init(final IGASState<VS, ES, ST> state, final Value u) { // NOP } -// public Factory<IV, VS> getVertexStateFactory(); +// public Factory<Value, VS> getVertexStateFactory(); /** * {@inheritDoc} @@ -191,7 +178,7 @@ * the algorithm uses per-edge computation state. */ @Override - public Factory<ISPO, ES> getEdgeStateFactory() { + public Factory<Statement, ES> getEdgeStateFactory() { return null; @@ -204,7 +191,7 @@ * you know whether or not the computation state of this vertex has changed. 
*/ @Override - public boolean isChanged(IGASState<VS, ES, ST> state, IV u) { + public boolean isChanged(IGASState<VS, ES, ST> state, Value u) { return true; Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/GASContext.java =================================================================== --- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/GASContext.java 2013-08-30 10:24:13 UTC (rev 7363) +++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/GASContext.java 2013-08-30 13:15:14 UTC (rev 7364) @@ -6,6 +6,8 @@ import java.util.concurrent.TimeUnit; import org.apache.log4j.Logger; +import org.openrdf.model.Statement; +import org.openrdf.model.Value; import com.bigdata.rdf.graph.EdgesEnum; import com.bigdata.rdf.graph.GASUtil; @@ -17,10 +19,7 @@ import com.bigdata.rdf.graph.IGraphAccessor; import com.bigdata.rdf.graph.IReducer; import com.bigdata.rdf.graph.IStaticFrontier; -import com.bigdata.rdf.internal.IV; -import com.bigdata.rdf.spo.ISPO; -@SuppressWarnings("rawtypes") public class GASContext<VS, ES, ST> implements IGASContext<VS, ES, ST> { private static final Logger log = Logger.getLogger(GASContext.class); @@ -331,7 +330,7 @@ */ private void apply(final IStaticFrontier f) { - for (IV u : f) { + for (Value u : f) { program.apply(gasState, u, null/* sum */); @@ -360,7 +359,7 @@ class ScatterVertexTaskFactory implements VertexTaskFactory<Long> { - public Callable<Long> newVertexTask(final IV u) { + public Callable<Long> newVertexTask(final Value u) { return new ScatterTask(u) { @Override @@ -411,7 +410,7 @@ class GatherVertexTaskFactory implements VertexTaskFactory<Long> { - public Callable<Long> newVertexTask(final IV u) { + public Callable<Long> newVertexTask(final Value u) { return new GatherTask(u) { @Override @@ -458,9 +457,9 @@ */ abstract private class VertexEdgesTask implements Callable<Long> { - protected final IV u; + protected final Value u; - public VertexEdgesTask(final IV u) { + public VertexEdgesTask(final Value u) { this.u = u; @@ -484,7 +483,7 @@ */ abstract private class ScatterTask extends VertexEdgesTask { - public ScatterTask(final IV u) { + public ScatterTask(final Value u) { super(u); @@ -526,15 +525,15 @@ final IGASScheduler sch = scheduler(); - final Iterator<ISPO> eitr = graphAccessor.getEdges(program, u, - getEdgesEnum()); + final Iterator<Statement> eitr = graphAccessor.getEdges( + GASContext.this, u, getEdgesEnum()); try { while (eitr.hasNext()) { // edge - final ISPO e = eitr.next(); + final Statement e = eitr.next(); nedges++; @@ -565,7 +564,7 @@ */ abstract private class GatherTask extends VertexEdgesTask { - public GatherTask(final IV u) { + public GatherTask(final Value u) { super(u); @@ -576,8 +575,8 @@ long nedges = 0; - final Iterator<ISPO> eitr = graphAccessor.getEdges(program, u, - getEdgesEnum()); + final Iterator<Statement> eitr = graphAccessor.getEdges( + GASContext.this, u, getEdgesEnum()); try { @@ -591,7 +590,7 @@ while (eitr.hasNext()) { - final ISPO e = eitr.next(); + final Statement e = eitr.next(); if (log.isTraceEnabled()) // TODO Batch resolve if @ TRACE log.trace("u=" + u + ", e=" + gasState.toString(e) + ", sum=" Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/GASEngine.java =================================================================== --- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/GASEngine.java 2013-08-30 10:24:13 UTC (rev 7363) +++ 
branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/GASEngine.java 2013-08-30 13:15:14 UTC (rev 7364) @@ -10,6 +10,8 @@ import java.util.concurrent.FutureTask; import java.util.concurrent.atomic.AtomicReference; +import org.openrdf.model.Value; + import com.bigdata.rdf.graph.IGASEngine; import com.bigdata.rdf.graph.IGASProgram; import com.bigdata.rdf.graph.IGASScheduler; @@ -18,7 +20,6 @@ import com.bigdata.rdf.graph.IGraphAccessor; import com.bigdata.rdf.graph.IStaticFrontier; import com.bigdata.rdf.graph.impl.scheduler.CHMScheduler; -import com.bigdata.rdf.internal.IV; import com.bigdata.util.concurrent.DaemonThreadFactory; /** @@ -198,7 +199,7 @@ long nedges = 0L; // For all vertices in the frontier. - for (IV u : f) { + for (Value u : f) { nedges += taskFactory.newVertexTask(u).call(); @@ -230,7 +231,7 @@ * about order for this collection. It is only there to filter out * duplicate work. */ - private final HashSet<IV> scheduled; + private final HashSet<Value> scheduled; ParallelFrontierStrategy(final VertexTaskFactory<Long> taskFactory, final IStaticFrontier f) { @@ -247,7 +248,7 @@ * If the frontier is known to be compact, then this map is not * initialized and is not used. */ - this.scheduled = f.isCompact() ? null : new HashSet<IV>(f.size()); + this.scheduled = f.isCompact() ? null : new HashSet<Value>(f.size()); } @@ -274,7 +275,7 @@ try { // For all vertices in the frontier. - for (IV u : f) { + for (Value u : f) { if (scheduled != null) { Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/GASRunner.java =================================================================== --- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/GASRunner.java 2013-08-30 10:24:13 UTC (rev 7363) +++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/GASRunner.java 2013-08-30 13:15:14 UTC (rev 7364) @@ -14,6 +14,7 @@ import java.util.concurrent.Callable; import org.apache.log4j.Logger; +import org.openrdf.model.Value; import org.openrdf.rio.RDFFormat; import com.bigdata.Banner; @@ -28,8 +29,6 @@ import com.bigdata.rdf.graph.IGASState; import com.bigdata.rdf.graph.impl.bd.BigdataGASEngine; import com.bigdata.rdf.graph.impl.bd.BigdataGASEngine.BigdataGraphAccessor; -import com.bigdata.rdf.graph.impl.bd.BigdataGASUtil; -import com.bigdata.rdf.internal.IV; import com.bigdata.rdf.rio.LoadStats; import com.bigdata.rdf.sail.BigdataSail; import com.bigdata.rdf.store.AbstractTripleStore; @@ -583,9 +582,6 @@ .getResourceLocator() .locate(namespace, jnl.getLastCommitTime()); - @SuppressWarnings("rawtypes") - final IV[] samples = BigdataGASUtil.getRandomSample(r, kb, nsamples); - // total #of edges in that graph. 
final long nedges = kb.getStatementCount(); @@ -615,10 +611,11 @@ final GASStats total = new GASStats(); + final Value[] samples = graphAccessor.getRandomSample(r, nsamples); + for (int i = 0; i < samples.length; i++) { - @SuppressWarnings("rawtypes") - final IV startingVertex = samples[i]; + final Value startingVertex = samples[i]; gasState.init(startingVertex); Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/GASState.java =================================================================== --- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/GASState.java 2013-08-30 10:24:13 UTC (rev 7363) +++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/GASState.java 2013-08-30 13:15:14 UTC (rev 7364) @@ -7,42 +7,42 @@ import java.util.concurrent.atomic.AtomicInteger; import org.apache.log4j.Logger; +import org.openrdf.model.Statement; +import org.openrdf.model.URI; +import org.openrdf.model.Value; import com.bigdata.rdf.graph.Factory; +import com.bigdata.rdf.graph.GASUtil; import com.bigdata.rdf.graph.IGASProgram; import com.bigdata.rdf.graph.IGASSchedulerImpl; import com.bigdata.rdf.graph.IGASState; import com.bigdata.rdf.graph.IGraphAccessor; import com.bigdata.rdf.graph.IReducer; import com.bigdata.rdf.graph.IStaticFrontier; -import com.bigdata.rdf.graph.impl.util.GASImplUtil; -import com.bigdata.rdf.internal.IV; -import com.bigdata.rdf.spo.ISPO; -@SuppressWarnings("rawtypes") public class GASState<VS, ES, ST> implements IGASState<VS, ES, ST> { private static final Logger log = Logger.getLogger(GASState.class); -// /** -// * The {@link GASEngine} on which the {@link IGASProgram} will be run. -// */ -// private final GASEngine gasEngine; + // /** + // * The {@link GASEngine} on which the {@link IGASProgram} will be run. + // */ + // private final GASEngine gasEngine; /** * The {@link IGASProgram} to be run. */ private final IGASProgram<VS, ES, ST> gasProgram; - + /** * Factory for the vertex state objects. */ - private final Factory<IV, VS> vsf; + private final Factory<Value, VS> vsf; /** * Factory for the edge state objects. */ - private final Factory<ISPO, ES> esf; + private final Factory<Statement, ES> esf; /** * The set of vertices that were identified in the current iteration. @@ -60,7 +60,7 @@ * {@link #frontier} at the end of the round. */ private final IGASSchedulerImpl scheduler; - + /** * The current evaluation round. */ @@ -74,13 +74,13 @@ * visited sets using a full traveral strategy, but overflow to an HTree or * (if fixed stride) a MemStore or BigArray could help). */ - protected final ConcurrentMap<IV, VS> vertexState = new ConcurrentHashMap<IV, VS>(); + protected final ConcurrentMap<Value, VS> vertexState = new ConcurrentHashMap<Value, VS>(); /** * TODO EDGE STATE: state needs to be configurable. When disabled, leave * this as <code>null</code>. */ - protected final ConcurrentMap<ISPO, ES> edgeState = null; + protected final ConcurrentMap<Statement, ES> edgeState = null; /** * Provides access to the backing graph. 
Used to decode vertices and edges @@ -115,9 +115,9 @@ this.esf = gasProgram.getEdgeStateFactory(); this.frontier = frontier; - + this.scheduler = gasScheduler; - + } /** @@ -127,9 +127,9 @@ protected IGraphAccessor getGraphAccessor() { return graphAccessor; - + } - + @Override public IStaticFrontier frontier() { @@ -145,7 +145,7 @@ } @Override - public VS getState(final IV v) { + public VS getState(final Value v) { VS vs = vertexState.get(v); @@ -167,7 +167,7 @@ } @Override - public ES getState(final ISPO e) { + public ES getState(final Statement e) { if (edgeState == null) return null; @@ -209,12 +209,12 @@ edgeState.clear(); frontier.resetFrontier(0/* minCapacity */, true/* ordered */, - GASImplUtil.EMPTY_VERTICES_ITERATOR); + GASUtil.EMPTY_VERTICES_ITERATOR); } - + @Override - public void init(final IV... vertices) { + public void init(final Value... vertices) { if (vertices == null) throw new IllegalArgumentException(); @@ -222,21 +222,19 @@ reset(); // Used to ensure that the initial frontier is distinct. - final Set<IV> tmp = new HashSet<IV>(); + final Set<Value> tmp = new HashSet<Val... [truncated message content] |
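For readers following the IV-to-Value migration in the diffs above: the pattern of filtering duplicate work through a HashSet while iterating a frontier typed as org.openrdf.model.Value boils down to the following self-contained sketch. This is not the bigdata GASEngine itself; the frontier contents and the vertex-task body are made up for illustration.

import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

import org.openrdf.model.Value;
import org.openrdf.model.impl.URIImpl;

public class FrontierDedupSketch {

    public static void main(final String[] args) {

        // Hypothetical non-compact frontier: "v1" appears twice.
        final List<Value> frontier = Arrays.<Value> asList(
                new URIImpl("http://example.org/v1"),
                new URIImpl("http://example.org/v2"),
                new URIImpl("http://example.org/v1"));

        // As in ParallelFrontierStrategy: order is irrelevant; the set
        // is only there to filter out duplicate work.
        final Set<Value> scheduled = new HashSet<Value>(frontier.size());

        // For all vertices in the frontier.
        for (Value u : frontier) {

            if (!scheduled.add(u)) {
                continue; // already scheduled in this round.
            }

            // Stand-in for taskFactory.newVertexTask(u).call().
            System.out.println("running vertex task for " + u);

        }

    }

}

Because the API is now stated against the openrdf Value interface rather than the internal IV, a sketch like this compiles against any Sesame-compatible model classes.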
From: <tho...@us...> - 2013-08-30 10:24:22
|
Revision: 7363
          http://bigdata.svn.sourceforge.net/bigdata/?rev=7363&view=rev
Author:   thompsonbry
Date:     2013-08-30 10:24:13 +0000 (Fri, 30 Aug 2013)

Log Message:
-----------
updated package documentation

Modified Paths:
--------------
    branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/package.html

Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/package.html
===================================================================
--- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/package.html	2013-08-29 14:06:54 UTC (rev 7362)
+++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/package.html	2013-08-30 10:24:13 UTC (rev 7363)
@@ -3,9 +3,63 @@
 <title>GAS Engine API</title>
 </head>
 <body>
-<p>
-The GAS (Gather Apply Scatter) API was developed for PowerGraph. This is
-a port of that API to the Java platform.
-</p>
+  <p>The GAS (Gather Apply Scatter) API was developed for PowerGraph
+  (aka GraphLab 2.1). This is a port of that API to the Java platform
+  and schema-flexible attributed graphs using RDF.</p>
+  <p>Graph algorithms are stated using the GAS (Gather, Apply,
+  Scatter) API. This API provides a vertex-centric approach to graph
+  processing ("think like a vertex") that can be used to write a large
+  number of graph algorithms (page rank, triangle counting, connected
+  components, SSSP, betweenness centrality, etc.). The GAS API allows
+  the GATHER operation to be efficiently decomposed using fine-grained
+  parallelism over a cluster.</p>
+  <p>Part of our effort under the XDATA program is to examine how
+  fine-grained parallelism can be leveraged on GPUs and other many-core
+  devices to deliver extreme performance on graph algorithms. We are
+  looking at how the GAS abstraction can be evolved to expose more
+  parallelism.</p>
+  <p>The interfaces of this API are stated in terms of RDF {@link
+  org.openrdf.model.Value} objects (for vertices) and {@link
+  org.openrdf.model.Statement} objects (for edges). Link attributes are
+  handled efficiently by the bigdata implementation, which co-locates
+  them in the indices with the links and then applies prefix compression
+  to deliver a compact on-disk footprint. See the section on
+  Reification Done Right (below) for more details.</p>
+  <h2>Reification Done Right and Property Graphs</h2>
+  <p>
+  <a href="http://www.bigdata.com/whitepapers/reifSPARQL.pdf">Reification
+  Done Right</a> (RDR) explains the relationship between the somewhat
+  opaque concept of RDF reification (which we use only for interchange)
+  and statements about statements (more generally, the ability to turn
+  any edge into a vertex and make statements about that vertex). There
+  are different ways to handle statements about statements efficiently
+  in the database; however, these are internal physical schema design
+  questions. From a user perspective, the main concern should be the
+  performance of the database platform when using this feature. Bigdata
+  uses a combination of inlining and prefix compression to provide a
+  dense, fast, bi-directional encoding of statements about statements and
+  fast access paths whether querying by vertices, property values, or
+  link attributes. You can also write queries using a high-level query
+  language (SPARQL) that are automatically optimized and executed
+  against the graph.
+  </p>
+  <p>
+  The RDR approach is more general than the
+  <a href="https://github.com/tinkerpop/blueprints/wiki/Property-Graph-Model">
+  Property Graph Model</a> - <em>anything</em> that you can do with a
+  property graph you can do as efficiently in an intelligently designed
+  RDF database. Further, RDF graphs allow efficient handling of the
+  following cases that are disallowed under the property graph model:
+  </p>
+  <ul>
+  <li>A vertex may have multiple property values for the same key.</li>
+  <li>A link may have multiple link attributes for the same key.</li>
+  <li>A link may serve as a vertex - thus you may have links whose
+  sources or targets are other links (hypergraphs).</li>
+  </ul>
+  <p>Because of its lack of cardinality constraints on property values
+  and its generality, RDF data sets may be freely combined and then
+  leveraged. Data-level collisions simply do not occur.</p>
 </body>
 </html>
\ No newline at end of file
|
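To make the vertex-centric ("think like a vertex") style described in this package documentation concrete, here is a minimal BFS written in the GAS idiom over a plain in-memory adjacency list. It is a sketch only: it uses local types rather than the IGASProgram API, the graph data is made up, and BFS needs only the scatter and apply phases (its gather is trivial).

import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative only: local types, not the bigdata IGASProgram API.
public class GASBFSSketch {

    /** Depth of every vertex reachable from the source. */
    public static Map<String, Integer> bfs(
            final Map<String, List<String>> graph, final String src) {

        final Map<String, Integer> depth = new HashMap<String, Integer>();

        Set<String> frontier = new LinkedHashSet<String>();

        depth.put(src, 0);
        frontier.add(src);

        int round = 0;

        while (!frontier.isEmpty()) {

            // Vertices scheduled for the next round.
            final Set<String> next = new LinkedHashSet<String>();

            for (String u : frontier) {

                final List<String> edges = graph.containsKey(u) //
                        ? graph.get(u)
                        : Collections.<String> emptyList();

                // SCATTER: push state along the out-edges of each vertex.
                for (String v : edges) {

                    if (!depth.containsKey(v)) {

                        depth.put(v, round + 1); // APPLY on first visit.

                        next.add(v); // schedule for the next round.

                    }

                }

            }

            frontier = next;

            round++;

        }

        return depth;

    }

    public static void main(final String[] args) {

        final Map<String, List<String>> g = new HashMap<String, List<String>>();

        g.put("a", Arrays.asList("b", "c"));
        g.put("b", Arrays.asList("c", "d"));

        // Prints {a=0, b=1, c=1, d=2} (map order may vary).
        System.out.println(bfs(g, "a"));

    }

}

The round-at-a-time frontier loop here is the same structural skeleton that the GASEngine runs, with the per-vertex work parallelized instead of sequential.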
From: <tho...@us...> - 2013-08-29 14:07:01
|
Revision: 7362
          http://bigdata.svn.sourceforge.net/bigdata/?rev=7362&view=rev
Author:   thompsonbry
Date:     2013-08-29 14:06:54 +0000 (Thu, 29 Aug 2013)

Log Message:
-----------
removed import from outside of the bigdata-client JAR

Modified Paths:
--------------
    branches/BIGDATA_RELEASE_1_2_0/bigdata-sails/src/java/com/bigdata/rdf/sail/webapp/client/RemoteRepository.java

Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-sails/src/java/com/bigdata/rdf/sail/webapp/client/RemoteRepository.java
===================================================================
--- branches/BIGDATA_RELEASE_1_2_0/bigdata-sails/src/java/com/bigdata/rdf/sail/webapp/client/RemoteRepository.java	2013-08-29 10:42:07 UTC (rev 7361)
+++ branches/BIGDATA_RELEASE_1_2_0/bigdata-sails/src/java/com/bigdata/rdf/sail/webapp/client/RemoteRepository.java	2013-08-29 14:06:54 UTC (rev 7362)
@@ -97,7 +97,9 @@
 import org.xml.sax.Attributes;
 import org.xml.sax.ext.DefaultHandler2;

-import com.bigdata.rdf.sparql.ast.service.RemoteServiceOptions;
+// Note: Do not import. Not part of the bigdata-client.jar
+//
+//import com.bigdata.rdf.sparql.ast.service.RemoteServiceOptions;

 /**
@@ -235,8 +237,6 @@
      * The method which may be "POST" or "GET".
      *
      * @see #getQueryMethod()
-     *
-     * @see RemoteServiceOptions#setGET(boolean)
      */
     public void setQueryMethod(final String method) {
|
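The "POST" or "GET" choice exposed by setQueryMethod() amounts to how the SPARQL query is carried over HTTP. The sketch below shows the two wire-level shapes using only JDK classes; the endpoint URL is a placeholder, and this is not the RemoteRepository implementation.

import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;

public class SparqlQueryMethodSketch {

    public static InputStream query(final String endpoint,
            final String sparql, final String method) throws Exception {

        if ("GET".equals(method)) {

            // GET: the query rides in the URL, which keeps the request
            // cacheable but is subject to URL length limits.
            return new URL(endpoint + "?query="
                    + URLEncoder.encode(sparql, "UTF-8")).openStream();

        }

        // POST: the query travels in the request body, so its size is
        // effectively unbounded.
        final HttpURLConnection conn = (HttpURLConnection) new URL(endpoint)
                .openConnection();

        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type",
                "application/x-www-form-urlencoded");

        conn.getOutputStream().write(
                ("query=" + URLEncoder.encode(sparql, "UTF-8"))
                        .getBytes("UTF-8"));

        return conn.getInputStream();

    }

    public static void main(final String[] args) throws Exception {

        // Hypothetical endpoint; point at a live SPARQL service to run.
        query("http://localhost:8080/bigdata/sparql",
                "SELECT * WHERE { ?s ?p ?o } LIMIT 1", "POST");

    }

}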
From: <mar...@us...> - 2013-08-29 10:42:18
|
Revision: 7361
          http://bigdata.svn.sourceforge.net/bigdata/?rev=7361&view=rev
Author:   martyncutcher
Date:     2013-08-29 10:42:07 +0000 (Thu, 29 Aug 2013)

Log Message:
-----------
Remove unnecessary delay in constructor and use HAConnection method to build HAGlue[] from UUID[]

Modified Paths:
--------------
    branches/READ_CACHE2/bigdata-jini/src/java/com/bigdata/journal/jini/ha/DumpLogDigests.java

Modified: branches/READ_CACHE2/bigdata-jini/src/java/com/bigdata/journal/jini/ha/DumpLogDigests.java
===================================================================
--- branches/READ_CACHE2/bigdata-jini/src/java/com/bigdata/journal/jini/ha/DumpLogDigests.java	2013-08-29 10:32:38 UTC (rev 7360)
+++ branches/READ_CACHE2/bigdata-jini/src/java/com/bigdata/journal/jini/ha/DumpLogDigests.java	2013-08-29 10:42:07 UTC (rev 7361)
@@ -99,8 +99,8 @@
     public DumpLogDigests(final String[] configFiles)
             throws ConfigurationException, IOException, InterruptedException {
-        // wait for zk services to register!
-        Thread.sleep(1000);
+        // Sleeping should not be necessary
+        // Thread.sleep(1000); // wait for zk services to register!
         client = new HAClient(configFiles);
     }
@@ -577,10 +577,10 @@
         final Quorum<HAGlue, QuorumClient<HAGlue>> quorum = cnxn.getHAGlueQuorum(serviceRoot);
         final UUID[] uuids = quorum.getJoined();
-        QuorumClient<HAGlue> qclient = quorum.getClient();
-
-        for (UUID uuid : uuids) {
-            ret.add(qclient.getService(uuid));
+        final HAGlue[] haglues = cnxn.getHAGlueService(uuids);
+
+        for (HAGlue haglue : haglues) {
+            ret.add(haglue);
         }

         return ret;
|
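The second hunk replaces N per-UUID lookups against the quorum client with a single batched call on the connection. A rough sketch of the shape of that refactoring follows; the Directory interface and its methods are hypothetical stand-ins, not the HAClient API.

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.UUID;

public class BatchLookupSketch {

    /** Hypothetical stand-in for a service directory (not HAClient). */
    interface Directory<S> {

        S lookup(UUID serviceId); // one round trip per call.

        S[] lookupAll(UUID[] serviceIds); // one round trip for the batch.

    }

    /** Before: resolve each joined service individually. */
    static <S> List<S> resolveOneByOne(final Directory<S> dir,
            final UUID[] joined) {

        final List<S> ret = new ArrayList<S>(joined.length);

        for (UUID uuid : joined) {
            ret.add(dir.lookup(uuid)); // N lookups.
        }

        return ret;

    }

    /** After: resolve the whole membership in a single call. */
    static <S> List<S> resolveBatched(final Directory<S> dir,
            final UUID[] joined) {

        return new ArrayList<S>(Arrays.asList(dir.lookupAll(joined)));

    }

}

Batching matters most when each lookup is a remote call: the round trips collapse from N to one.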
From: <tho...@us...> - 2013-08-29 10:32:45
|
Revision: 7360 http://bigdata.svn.sourceforge.net/bigdata/?rev=7360&view=rev Author: thompsonbry Date: 2013-08-29 10:32:38 +0000 (Thu, 29 Aug 2013) Log Message: ----------- Added @Override annotations on SPO Modified Paths: -------------- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/spo/SPO.java Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/spo/SPO.java =================================================================== --- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/spo/SPO.java 2013-08-29 10:19:03 UTC (rev 7359) +++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/spo/SPO.java 2013-08-29 10:32:38 UTC (rev 7360) @@ -142,6 +142,7 @@ private static int SIDABLE_BIT = 6; + @Override @SuppressWarnings("rawtypes") final public IV get(final int index) { switch(index) { @@ -153,21 +154,25 @@ } } + @Override @SuppressWarnings("rawtypes") final public IV s() { return s; } + @Override @SuppressWarnings("rawtypes") final public IV p() { return p; } + @Override @SuppressWarnings("rawtypes") final public IV o() { return o; } + @Override @SuppressWarnings("rawtypes") final public IV c() { @@ -180,6 +185,7 @@ } + @Override public final void setStatementIdentifier(final boolean sid) { if (sid && type() != StatementEnum.Explicit) { @@ -208,18 +214,21 @@ } + @Override final public boolean hasStatementIdentifier() { return sidable(); } + @Override public void setOverride(final boolean override) { override(override); } + @Override public boolean isOverride() { return override(); @@ -411,6 +420,7 @@ /** * Return <code>true</code> IFF the {@link SPO} is marked as {@link StatementEnum#Explicit}. */ + @Override public final boolean isExplicit() { return type() == StatementEnum.Explicit; @@ -420,6 +430,7 @@ /** * Return <code>true</code> IFF the {@link SPO} is marked as {@link StatementEnum#Inferred}. */ + @Override public final boolean isInferred() { return type() == StatementEnum.Inferred; @@ -429,6 +440,7 @@ /** * Return <code>true</code> IFF the {@link SPO} is marked as {@link StatementEnum#Axiom}. */ + @Override public final boolean isAxiom() { return type() == StatementEnum.Axiom; @@ -438,6 +450,7 @@ /** * Return <code>true</code> IFF the {@link SPO} has the user flag bit set. */ + @Override public final boolean getUserFlag() { return userFlag(); @@ -449,6 +462,7 @@ * * @parm userFlag boolean flag */ + @Override public final void setUserFlag(final boolean userFlag) { userFlag(userFlag); @@ -461,6 +475,7 @@ * Hash code for the (s,p,o) per Sesame's {@link Statement#hashCode()}. It * DOES NOT consider the context position. */ + @Override public int hashCode() { if (hashCode == 0) { @@ -488,6 +503,7 @@ } + @Override public boolean equals(final Object o) { if (this == o) @@ -529,6 +545,7 @@ * * @see ITripleStore#toString(IV, IV, IV) */ + @Override public String toString() { return ("< " + toString(s) + ", " + toString(p) + ", " + toString(o)) @@ -547,6 +564,7 @@ * The term identifier. * @return */ + @SuppressWarnings("rawtypes") public static String toString(final IV iv) { if (iv == null) @@ -567,6 +585,7 @@ * * @return The externalized representation of the statement. 
*/ + @Override public String toString(final IRawTripleStore store) { if (store != null) { @@ -595,18 +614,21 @@ } + @Override final public boolean isFullyBound() { return s != null && p != null && o != null; } + @Override final public StatementEnum getStatementType() { return type(); } + @Override final public void setStatementType(final StatementEnum type) { if(this.type() != null && this.type() != type) { @@ -619,24 +641,28 @@ } + @Override final public boolean hasStatementType() { return type() != null; } + @Override public boolean isModified() { return modified() != ModifiedEnum.NONE; } + @Override public void setModified(final ModifiedEnum modified) { modified(modified); } + @Override public ModifiedEnum getModified() { return modified(); |
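Adding @Override throughout is cheap insurance: if a supertype method is later renamed or its signature changes (as happens when an interface evolves, for example when an interface such as IV starts extending Value), the compiler flags the stale method instead of silently leaving an accidental overload behind. A minimal illustration with made-up types:

// Illustration with made-up types; not from the bigdata code base.
public class OverrideSketch {

    interface Shape {
        double area();
    }

    static class Circle implements Shape {

        private final double r;

        Circle(final double r) {
            this.r = r;
        }

        @Override
        public double area() { // compiles: really overrides Shape#area().
            return Math.PI * r * r;
        }

//        @Override
//        public double getArea() { // compile-time error if uncommented:
//            return Math.PI * r * r; // no supertype method to override.
//        }

    }

}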
From: <tho...@us...> - 2013-08-29 10:19:12
|
Revision: 7359 http://bigdata.svn.sourceforge.net/bigdata/?rev=7359&view=rev Author: thompsonbry Date: 2013-08-29 10:19:03 +0000 (Thu, 29 Aug 2013) Log Message: ----------- Modified IV to implement Value. All concrete IV implementations already implement Value. I have run TestLocalTripleStore, TestBigdataSailWithQuads, ast.eval.TestAll, and com.bigdata.rdf.TestAll with this change. Everything is fine. I had to modify SearchInSearchServiceFactory to clarify a method call which became ambiguous with this change (compiler error). I had to add a missing method (Value.stringValue()) to TestEncodeDecodeKeys. This change was made to support #629 (Graph mining API). Modified Paths: -------------- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/internal/IV.java branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/sparql/ast/eval/SearchInSearchServiceFactory.java branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/test/com/bigdata/rdf/internal/TestEncodeDecodeKeys.java Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/internal/IV.java =================================================================== --- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/internal/IV.java 2013-08-28 21:41:23 UTC (rev 7358) +++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/internal/IV.java 2013-08-29 10:19:03 UTC (rev 7359) @@ -47,7 +47,7 @@ * The generic type for the inline value. */ public interface IV<V extends BigdataValue, T> extends Serializable, - Comparable<IV>, IVCache<V,T> { + Comparable<IV>, IVCache<V,T>, Value { /** * The value of the flags representing the {@link VTE} and the {@link DTE}. Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/sparql/ast/eval/SearchInSearchServiceFactory.java =================================================================== --- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/sparql/ast/eval/SearchInSearchServiceFactory.java 2013-08-28 21:41:23 UTC (rev 7358) +++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/sparql/ast/eval/SearchInSearchServiceFactory.java 2013-08-29 10:19:03 UTC (rev 7359) @@ -535,7 +535,7 @@ final IV o = (IV) src.next().getDocId(); final Iterator<ISPO> it = - store.getAccessPath(null, null, o).iterator(); + store.getAccessPath((IV)null, (IV)null, o).iterator(); while (it.hasNext()) { Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/test/com/bigdata/rdf/internal/TestEncodeDecodeKeys.java =================================================================== --- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/test/com/bigdata/rdf/internal/TestEncodeDecodeKeys.java 2013-08-28 21:41:23 UTC (rev 7358) +++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/test/com/bigdata/rdf/internal/TestEncodeDecodeKeys.java 2013-08-29 10:19:03 UTC (rev 7359) @@ -1,796 +1,808 @@ -/** - -Copyright (C) SYSTAP, LLC 2006-2010. All rights reserved. - -Contact: - SYSTAP, LLC - 4501 Tower Road - Greensboro, NC 27410 - lic...@bi... - -This program is free software; you can redistribute it and/or modify -it under the terms of the GNU General Public License as published by -the Free Software Foundation; version 2 of the License. - -This program is distributed in the hope that it will be useful, -but WITHOUT ANY WARRANTY; without even the implied warranty of -MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the -GNU General Public License for more details. 
- -You should have received a copy of the GNU General Public License -along with this program; if not, write to the Free Software -Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA -*/ -/* - * Created on Apr 19, 2010 - */ - -package com.bigdata.rdf.internal; - -import java.util.LinkedList; -import java.util.List; -import java.util.Random; -import java.util.TimeZone; -import java.util.UUID; - -import javax.xml.datatype.DatatypeConfigurationException; -import javax.xml.datatype.DatatypeFactory; - -import org.openrdf.model.Literal; -import org.openrdf.model.URI; -import org.openrdf.model.impl.LiteralImpl; - -import com.bigdata.rdf.internal.ColorsEnumExtension.Color; -import com.bigdata.rdf.internal.impl.AbstractIV; -import com.bigdata.rdf.internal.impl.BlobIV; -import com.bigdata.rdf.internal.impl.bnode.NumericBNodeIV; -import com.bigdata.rdf.internal.impl.bnode.SidIV; -import com.bigdata.rdf.internal.impl.bnode.UUIDBNodeIV; -import com.bigdata.rdf.internal.impl.extensions.DateTimeExtension; -import com.bigdata.rdf.internal.impl.extensions.DerivedNumericsExtension; -import com.bigdata.rdf.internal.impl.literal.LiteralExtensionIV; -import com.bigdata.rdf.internal.impl.literal.UUIDLiteralIV; -import com.bigdata.rdf.internal.impl.literal.XSDBooleanIV; -import com.bigdata.rdf.internal.impl.literal.XSDNumericIV; -import com.bigdata.rdf.internal.impl.uri.VocabURIByteIV; -import com.bigdata.rdf.internal.impl.uri.VocabURIShortIV; -import com.bigdata.rdf.lexicon.LexiconRelation; -import com.bigdata.rdf.model.BigdataBNode; -import com.bigdata.rdf.model.BigdataLiteral; -import com.bigdata.rdf.model.BigdataURI; -import com.bigdata.rdf.model.BigdataValue; -import com.bigdata.rdf.model.BigdataValueFactory; -import com.bigdata.rdf.model.BigdataValueFactoryImpl; -import com.bigdata.rdf.model.StatementEnum; -import com.bigdata.rdf.spo.SPO; -import com.bigdata.rdf.vocab.Vocabulary; - -/** - * Unit tests for encoding and decoding compound keys (such as are used by the - * statement indices) in which some of the key components are inline values - * having variable component lengths while others are term identifiers. - * - * @author <a href="mailto:tho...@us...">Bryan Thompson</a> - * @version $Id: TestEncodeDecodeKeys.java 2756 2010-05-03 22:26:18Z - * thompsonbry$ - * - * FIXME Test heterogenous sets of {@link IV}s, ideally randomly - * generated for each type of VTE and DTE. 
- */ -public class TestEncodeDecodeKeys extends AbstractEncodeDecodeKeysTestCase { - - public TestEncodeDecodeKeys() { - super(); - } - - public TestEncodeDecodeKeys(String name) { - super(name); - } - - public void test_InlineValue() { - - for (VTE vte : VTE.values()) { - - for (DTE dte : DTE.values()) { - - final IV<?, ?> v = new AbstractIV(vte, - true/* inline */, false/* extension */, dte) { - - private static final long serialVersionUID = 1L; - - @Override - public boolean equals(Object o) { - if (this == o) - return true; - return false; - } - - public int byteLength() { - throw new UnsupportedOperationException(); - } - - @Override - public int hashCode() { - return 0; - } - - public IV<?, ?> clone(boolean clearCache) { - throw new UnsupportedOperationException(); - } - - public int _compareTo(IV o) { - throw new UnsupportedOperationException(); - } - - public BigdataValue asValue(final LexiconRelation lex) - throws UnsupportedOperationException { - return null; - } - - public Object getInlineValue() - throws UnsupportedOperationException { - return null; - } - - public boolean isInline() { - return true; - } - - public boolean needsMaterialization() { - return false; - } - - }; - - assertTrue(v.isInline()); - -// if (termId == 0L) { -// assertTrue(v.toString(), v.isNull()); -// } else { -// assertFalse(v.toString(), v.isNull()); -// } -// -// assertEquals(termId, v.getTermId()); - - // should not throw an exception. - v.getInlineValue(); - - assertEquals("flags=" + v.flags(), vte, v.getVTE()); - - assertEquals(dte, v.getDTE()); - - switch (vte) { - case URI: - assertTrue(v.isURI()); - assertFalse(v.isBNode()); - assertFalse(v.isLiteral()); - assertFalse(v.isStatement()); - break; - case BNODE: - assertFalse(v.isURI()); - assertTrue(v.isBNode()); - assertFalse(v.isLiteral()); - assertFalse(v.isStatement()); - break; - case LITERAL: - assertFalse(v.isURI()); - assertFalse(v.isBNode()); - assertTrue(v.isLiteral()); - assertFalse(v.isStatement()); - break; - case STATEMENT: - assertFalse(v.isURI()); - assertFalse(v.isBNode()); - assertFalse(v.isLiteral()); - assertTrue(v.isStatement()); - break; - default: - fail("vte=" + vte); - } - - } - - } - - } - - /** - * Unit test for encoding and decoding a statement formed from - * {@link BlobIV}s. - */ - public void test_encodeDecode_allTermIds() { - - final IV<?, ?>[] e = {// - newTermId(VTE.URI),// - newTermId(VTE.URI),// - newTermId(VTE.URI),// - newTermId(VTE.URI) // - }; - - doEncodeDecodeTest(e); - - doComparatorTest(e); - - } - - /** - * Unit test where the RDF Object position is an xsd:boolean. - */ - public void test_encodeDecode_XSDBoolean() { - - final IV<?, ?>[] e = {// - new XSDBooleanIV<BigdataLiteral>(true),// - new XSDBooleanIV<BigdataLiteral>(false),// - }; - - doEncodeDecodeTest(e); - - doComparatorTest(e); - - } - - /** - * Unit test for {@link XSDNumericIV}. - */ - public void test_encodeDecode_XSDByte() { - - final IV<?, ?>[] e = {// - new XSDNumericIV<BigdataLiteral>((byte)Byte.MIN_VALUE),// - new XSDNumericIV<BigdataLiteral>((byte)-1),// - new XSDNumericIV<BigdataLiteral>((byte)0),// - new XSDNumericIV<BigdataLiteral>((byte)1),// - new XSDNumericIV<BigdataLiteral>((byte)Byte.MAX_VALUE),// - }; - - doEncodeDecodeTest(e); - - doComparatorTest(e); - - } - - /** - * Unit test for {@link XSDNumericIV}. 
- */ - public void test_encodeDecode_XSDShort() { - - final IV<?, ?>[] e = {// - new XSDNumericIV<BigdataLiteral>((short)-1),// - new XSDNumericIV<BigdataLiteral>((short)0),// - new XSDNumericIV<BigdataLiteral>((short)1),// - new XSDNumericIV<BigdataLiteral>((short)Short.MIN_VALUE),// - new XSDNumericIV<BigdataLiteral>((short)Short.MAX_VALUE),// - }; - - doEncodeDecodeTest(e); - - doComparatorTest(e); - - } - - /** - * Unit test for {@link XSDNumericIV}. - */ - public void test_encodeDecode_XSDInt() { - - final IV<?, ?>[] e = {// - new XSDNumericIV<BigdataLiteral>(1),// - new XSDNumericIV<BigdataLiteral>(0),// - new XSDNumericIV<BigdataLiteral>(-1),// - new XSDNumericIV<BigdataLiteral>(Integer.MAX_VALUE),// - new XSDNumericIV<BigdataLiteral>(Integer.MIN_VALUE),// - }; - - doEncodeDecodeTest(e); - - doComparatorTest(e); - - } - - /** - * Unit test for {@link XSDNumericIV}. - */ - public void test_encodeDecode_XSDLong() { - - final IV<?, ?>[] e = {// - new XSDNumericIV<BigdataLiteral>(1L),// - new XSDNumericIV<BigdataLiteral>(0L),// - new XSDNumericIV<BigdataLiteral>(-1L),// - new XSDNumericIV<BigdataLiteral>(Long.MIN_VALUE),// - new XSDNumericIV<BigdataLiteral>(Long.MAX_VALUE),// - }; - - doEncodeDecodeTest(e); - - doComparatorTest(e); - - } - - /** - * Unit test for {@link XSDNumericIV}. - */ - public void test_encodeDecode_XSDFloat() { - - /* - * Note: -0f and +0f are converted to the same point in the value space. - */ -// new XSDNumericIV<BigdataLiteral>(-0f); - - final IV<?, ?>[] e = {// - new XSDNumericIV<BigdataLiteral>(1f),// - new XSDNumericIV<BigdataLiteral>(-1f),// - new XSDNumericIV<BigdataLiteral>(+0f),// - new XSDNumericIV<BigdataLiteral>(Float.MAX_VALUE),// - new XSDNumericIV<BigdataLiteral>(Float.MIN_VALUE),// - new XSDNumericIV<BigdataLiteral>(Float.MIN_NORMAL),// - new XSDNumericIV<BigdataLiteral>(Float.POSITIVE_INFINITY),// - new XSDNumericIV<BigdataLiteral>(Float.NEGATIVE_INFINITY),// - new XSDNumericIV<BigdataLiteral>(Float.NaN),// - }; - - doEncodeDecodeTest(e); - - doComparatorTest(e); - - } - - /** - * Unit test for {@link XSDNumericIV}. - */ - public void test_encodeDecode_XSDDouble() { - - /* - * Note: -0d and +0d are converted to the same point in the value space. - */ -// new XSDNumericIV<BigdataLiteral>(-0d); - - final IV<?, ?>[] e = {// - new XSDNumericIV<BigdataLiteral>(1d),// - new XSDNumericIV<BigdataLiteral>(-1d),// - new XSDNumericIV<BigdataLiteral>(+0d),// - new XSDNumericIV<BigdataLiteral>(Double.MAX_VALUE),// - new XSDNumericIV<BigdataLiteral>(Double.MIN_VALUE),// - new XSDNumericIV<BigdataLiteral>(Double.MIN_NORMAL),// - new XSDNumericIV<BigdataLiteral>(Double.POSITIVE_INFINITY),// - new XSDNumericIV<BigdataLiteral>(Double.NEGATIVE_INFINITY),// - new XSDNumericIV<BigdataLiteral>(Double.NaN),// - }; - - doEncodeDecodeTest(e); - - doComparatorTest(e); - - } - - /** - * Unit test for {@link UUIDLiteralIV}. - */ - public void test_encodeDecode_UUID() { - - final IV<?, ?>[] e = new IV[100]; - - for (int i = 0; i < e.length; i++) { - - e[i] = new UUIDLiteralIV<BigdataLiteral>(UUID.randomUUID()); - - } - - doEncodeDecodeTest(e); - - doComparatorTest(e); - - } - - /** - * Unit test for {@link UUIDBNodeIV}, which provides support for inlining a - * told blank node whose <code>ID</code> can be parsed as a {@link UUID}. 
- */ - public void test_encodeDecode_BNode_UUID_ID() { - - final IV<?, ?>[] e = new IV[100]; - - for (int i = 0; i < e.length; i++) { - - e[i] = new UUIDBNodeIV<BigdataBNode>(UUID.randomUUID()); - - } - - doEncodeDecodeTest(e); - - doComparatorTest(e); - - } - - /** - * Unit test for {@link NumericBNodeIV}, which provides support for inlining - * a told blank node whose <code>ID</code> can be parsed as an - * {@link Integer}. - */ - public void test_encodeDecode_BNode_INT_ID() { - - final IV<?, ?>[] e = {// - new NumericBNodeIV<BigdataBNode>(-1),// - new NumericBNodeIV<BigdataBNode>(0),// - new NumericBNodeIV<BigdataBNode>(1),// - new NumericBNodeIV<BigdataBNode>(-52),// - new NumericBNodeIV<BigdataBNode>(52),// - new NumericBNodeIV<BigdataBNode>(Integer.MAX_VALUE),// - new NumericBNodeIV<BigdataBNode>(Integer.MIN_VALUE),// - }; - - doEncodeDecodeTest(e); - - doComparatorTest(e); - - } - - /** - * Unit test for the {@link EpochExtension}. - */ - public void test_encodeDecodeEpoch() { - - final BigdataValueFactory vf = BigdataValueFactoryImpl.getInstance("test"); - - final EpochExtension<BigdataValue> ext = - new EpochExtension<BigdataValue>(new IDatatypeURIResolver() { - public BigdataURI resolve(final URI uri) { - final BigdataURI buri = vf.createURI(uri.stringValue()); - buri.setIV(newTermId(VTE.URI)); - return buri; - } - }); - - final Random r = new Random(); - - final IV<?, ?>[] e = new IV[100]; - - for (int i = 0; i < e.length; i++) { - - final long v = r.nextLong(); - - final String s = Long.toString(v); - - final Literal lit = new LiteralImpl(s, EpochExtension.EPOCH); - - final IV<?,?> iv = ext.createIV(lit); - - if (iv == null) - fail("Did not create IV: lit=" + lit); - - e[i] = iv; - - }; - - doEncodeDecodeTest(e); - - doComparatorTest(e); - - } - - /** - * Unit test for the {@link ColorsEnumExtension}. - */ - public void test_encodeDecodeColor() { - - final BigdataValueFactory vf = BigdataValueFactoryImpl.getInstance("test"); - - final ColorsEnumExtension<BigdataValue> ext = - new ColorsEnumExtension<BigdataValue>(new IDatatypeURIResolver() { - public BigdataURI resolve(URI uri) { - final BigdataURI buri = vf.createURI(uri.stringValue()); - buri.setIV(newTermId(VTE.URI)); - return buri; - } - }); - - final List<IV<?, ?>> a = new LinkedList<IV<?, ?>>(); - - for (Color c : Color.values()) { - - a.add(ext.createIV(new LiteralImpl(c.name(), - ColorsEnumExtension.COLOR))); - - } - - final IV<?, ?>[] e = a.toArray(new IV[0]); - - doEncodeDecodeTest(e); - - doComparatorTest(e); - - } - - /** - * Unit test for round-trip of xsd:dateTime values. 
- */ - public void test_encodeDecodeDateTime() throws Exception { - - final BigdataValueFactory vf = BigdataValueFactoryImpl.getInstance("test"); - - final DatatypeFactory df = DatatypeFactory.newInstance(); - - final DateTimeExtension<BigdataValue> ext = - new DateTimeExtension<BigdataValue>(new IDatatypeURIResolver() { - public BigdataURI resolve(URI uri) { - final BigdataURI buri = vf.createURI(uri.stringValue()); - buri.setIV(newTermId(VTE.URI)); - return buri; - } - }, - TimeZone.getDefault() - ); - - final BigdataLiteral[] dt = { - vf.createLiteral( - df.newXMLGregorianCalendar("2001-10-26T21:32:52")), - vf.createLiteral( - df.newXMLGregorianCalendar("2001-10-26T21:32:52+02:00")), - vf.createLiteral( - df.newXMLGregorianCalendar("2001-10-26T19:32:52Z")), - vf.createLiteral( - df.newXMLGregorianCalendar("2001-10-26T19:32:52+00:00")), - vf.createLiteral( - df.newXMLGregorianCalendar("-2001-10-26T21:32:52")), - vf.createLiteral( - df.newXMLGregorianCalendar("2001-10-26T21:32:52.12679")), - vf.createLiteral( - df.newXMLGregorianCalendar("1901-10-26T21:32:52")), - }; - - final IV<?, ?>[] e = new IV[dt.length]; - - for (int i = 0; i < dt.length; i++) { - - e[i] = ext.createIV(dt[i]); - - } - - final IV<?, ?>[] a = doEncodeDecodeTest(e); - - if (log.isInfoEnabled()) { - for (int i = 0; i < e.length; i++) { - log.info("original: " + dt[i]); - log.info("asValue : " + ext.asValue((LiteralExtensionIV<?>) e[i], vf)); - log.info("decoded : " + ext.asValue((LiteralExtensionIV<?>) a[i], vf)); - log.info(""); - } -// log.info(svf.createLiteral( -// df.newXMLGregorianCalendar("2001-10-26T21:32:52.12679"))); - } - - doComparatorTest(e); - - } - - /** - * Unit test verifies that the inline xsd:dateTime representation preserves - * the milliseconds units. However, precision beyond milliseconds is NOT - * preserved by the inline representation, which is based on milliseconds - * since the epoch. - * - * @throws DatatypeConfigurationException - */ - public void test_dateTime_preservesMillis() - throws DatatypeConfigurationException { - - final BigdataValueFactory vf = BigdataValueFactoryImpl - .getInstance("test"); - - final DatatypeFactory df = DatatypeFactory.newInstance(); - - final DateTimeExtension<BigdataValue> ext = new DateTimeExtension<BigdataValue>( - new IDatatypeURIResolver() { - public BigdataURI resolve(URI uri) { - final BigdataURI buri = vf.createURI(uri.stringValue()); - buri.setIV(newTermId(VTE.URI)); - return buri; - } - }, TimeZone.getTimeZone("GMT")); - - /* - * The string representation of the dateTime w/ milliseconds+ precision. - * This is assumed to be a time in the time zone specified to the date - * time extension. - */ - final String givenStr = "2001-10-26T21:32:52.12679"; - - /* - * The string representation w/ only milliseconds precision. This will - * be a time in the time zone given to the date time extension. The - * canonical form of a GMT time zone is "Z", indicating "Zulu", which is - * why that is part of the expected representation here. - */ - final String expectedStr = "2001-10-26T21:32:52.126Z"; - - /* - * A bigdata literal w/o inlining from the *givenStr*. This - * representation has greater milliseconds+ precision. - */ - final BigdataLiteral lit = vf.createLiteral(df - .newXMLGregorianCalendar(givenStr)); - - // Verify the representation is exact. - assertEquals(givenStr, lit.stringValue()); - - /* - * The IV representation of the dateTime. 
This will convert the date - * time into the time zone given to the extension and will also truncate - * the precision to no more than milliseconds. - */ - final LiteralExtensionIV<?> iv = ext.createIV(lit); - - // Convert the IV back into a bigdata literal. - final BigdataLiteral lit2 = (BigdataLiteral) ext.asValue(iv, vf); - - // Verify that millisecond precision was retained. - assertEquals(expectedStr, lit2.stringValue()); - - } - - /** - * Unit test for round-trip of derived numeric values. - */ - public void test_encodeDecodeDerivedNumerics() throws Exception { - - final BigdataValueFactory vf = BigdataValueFactoryImpl.getInstance("test"); - - final DatatypeFactory df = DatatypeFactory.newInstance(); - - final DerivedNumericsExtension<BigdataValue> ext = - new DerivedNumericsExtension<BigdataValue>(new IDatatypeURIResolver() { - public BigdataURI resolve(URI uri) { - final BigdataURI buri = vf.createURI(uri.stringValue()); - buri.setIV(newTermId(VTE.URI)); - return buri; - } - }); - - final BigdataLiteral[] dt = { - vf.createLiteral("1", XSD.POSITIVE_INTEGER), - vf.createLiteral("-1", XSD.NEGATIVE_INTEGER), - vf.createLiteral("-1", XSD.NON_POSITIVE_INTEGER), - vf.createLiteral("1", XSD.NON_NEGATIVE_INTEGER), - vf.createLiteral("0", XSD.NON_POSITIVE_INTEGER), - vf.createLiteral("0", XSD.NON_NEGATIVE_INTEGER), - }; - - final IV<?, ?>[] e = new IV[dt.length]; - - for (int i = 0; i < dt.length; i++) { - - e[i] = ext.createIV(dt[i]); - - } - - final IV<?, ?>[] a = doEncodeDecodeTest(e); - - if (log.isInfoEnabled()) { - for (int i = 0; i < e.length; i++) { - log.info("original: " + dt[i]); - log.info("asValue : " + ext.asValue((LiteralExtensionIV<?>) e[i], vf)); - log.info("decoded : " + ext.asValue((LiteralExtensionIV<?>) a[i], vf)); - log.info(""); - } -// log.info(svf.createLiteral( -// df.newXMLGregorianCalendar("2001-10-26T21:32:52.12679"))); - } - - doComparatorTest(e); - - } - - /** - * Unit test for {@link SidIV}. 
- */ - public void test_encodeDecode_sids() { - - final IV<?,?> s1 = newTermId(VTE.URI); - final IV<?,?> s2 = newTermId(VTE.URI); - final IV<?,?> p1 = newTermId(VTE.URI); - final IV<?,?> p2 = newTermId(VTE.URI); - final IV<?,?> o1 = newTermId(VTE.URI); - final IV<?,?> o2 = newTermId(VTE.BNODE); - final IV<?,?> o3 = newTermId(VTE.LITERAL); - - final SPO spo1 = new SPO(s1, p1, o1, StatementEnum.Explicit); - final SPO spo2 = new SPO(s1, p1, o2, StatementEnum.Explicit); - final SPO spo3 = new SPO(s1, p1, o3, StatementEnum.Explicit); - final SPO spo4 = new SPO(s1, p2, o1, StatementEnum.Explicit); - final SPO spo5 = new SPO(s1, p2, o2, StatementEnum.Explicit); - final SPO spo6 = new SPO(s1, p2, o3, StatementEnum.Explicit); - final SPO spo7 = new SPO(s2, p1, o1, StatementEnum.Explicit); - final SPO spo8 = new SPO(s2, p1, o2, StatementEnum.Explicit); - final SPO spo9 = new SPO(s2, p1, o3, StatementEnum.Explicit); - final SPO spo10 = new SPO(s2, p2, o1, StatementEnum.Explicit); - final SPO spo11 = new SPO(s2, p2, o2, StatementEnum.Explicit); - final SPO spo12 = new SPO(s2, p2, o3, StatementEnum.Explicit); - spo1.setStatementIdentifier(true); - spo2.setStatementIdentifier(true); - spo3.setStatementIdentifier(true); - spo6.setStatementIdentifier(true); - final SPO spo13 = new SPO(spo1.getStatementIdentifier(), p1, o1, - StatementEnum.Explicit); - final SPO spo14 = new SPO(spo2.getStatementIdentifier(), p2, o2, - StatementEnum.Explicit); - final SPO spo15 = new SPO(s1, p1, spo3.getStatementIdentifier(), - StatementEnum.Explicit); - spo15.setStatementIdentifier(true); - final SPO spo16 = new SPO(s1, p1, spo6.getStatementIdentifier(), - StatementEnum.Explicit); - final SPO spo17 = new SPO(spo1.getStatementIdentifier(), p1, spo15 - .getStatementIdentifier(), StatementEnum.Explicit); - - final IV<?, ?>[] e = {// - new SidIV<BigdataBNode>(spo1),// - new SidIV<BigdataBNode>(spo2),// - new SidIV<BigdataBNode>(spo3),// - new SidIV<BigdataBNode>(spo4),// - new SidIV<BigdataBNode>(spo5),// - new SidIV<BigdataBNode>(spo6),// - new SidIV<BigdataBNode>(spo7),// - new SidIV<BigdataBNode>(spo8),// - new SidIV<BigdataBNode>(spo9),// - new SidIV<BigdataBNode>(spo10),// - new SidIV<BigdataBNode>(spo11),// - new SidIV<BigdataBNode>(spo12),// - new SidIV<BigdataBNode>(spo13),// - new SidIV<BigdataBNode>(spo14),// - new SidIV<BigdataBNode>(spo15),// - new SidIV<BigdataBNode>(spo16),// - new SidIV<BigdataBNode>(spo17),// - }; - - doEncodeDecodeTest(e); - - doComparatorTest(e); - - } - - /** - * Unit test for a fully inlined representation of a URI based on a - * <code>byte</code> code. The flags byte looks like: - * <code>VTE=URI, inline=true, extension=false, - * DTE=XSDByte</code>. It is followed by a <code>unsigned byte</code> value - * which is the index of the URI in the {@link Vocabulary} class for the - * triple store. - */ - public void test_encodeDecode_URIByteIV() { - - final IV<?, ?>[] e = {// - new VocabURIByteIV<BigdataURI>((byte) Byte.MIN_VALUE),// - new VocabURIByteIV<BigdataURI>((byte) -1),// - new VocabURIByteIV<BigdataURI>((byte) 0),// - new VocabURIByteIV<BigdataURI>((byte) 1),// - new VocabURIByteIV<BigdataURI>((byte) Byte.MAX_VALUE),// - }; - - doEncodeDecodeTest(e); - - doComparatorTest(e); - - } - - /** - * Unit test for a fully inlined representation of a URI based on a - * <code>short</code> code. The flags byte looks like: - * <code>VTE=URI, inline=true, extension=false, - * DTE=XSDShort</code>. 
It is followed by an <code>unsigned short</code> - * value which is the index of the URI in the {@link Vocabulary} class for - * the triple store. - */ - public void test_encodeDecode_URIShortIV() { - - final IV<?, ?>[] e = {// - new VocabURIShortIV<BigdataURI>((short) Short.MIN_VALUE),// - new VocabURIShortIV<BigdataURI>((short) -1),// - new VocabURIShortIV<BigdataURI>((short) 0),// - new VocabURIShortIV<BigdataURI>((short) 1),// - new VocabURIShortIV<BigdataURI>((short) Short.MAX_VALUE),// - }; - - doEncodeDecodeTest(e); - - doComparatorTest(e); - - } - -} +/** + +Copyright (C) SYSTAP, LLC 2006-2010. All rights reserved. + +Contact: + SYSTAP, LLC + 4501 Tower Road + Greensboro, NC 27410 + lic...@bi... + +This program is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; version 2 of the License. + +This program is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with this program; if not, write to the Free Software +Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA +*/ +/* + * Created on Apr 19, 2010 + */ + +package com.bigdata.rdf.internal; + +import java.util.LinkedList; +import java.util.List; +import java.util.Random; +import java.util.TimeZone; +import java.util.UUID; + +import javax.xml.datatype.DatatypeConfigurationException; +import javax.xml.datatype.DatatypeFactory; + +import org.openrdf.model.Literal; +import org.openrdf.model.URI; +import org.openrdf.model.impl.LiteralImpl; + +import com.bigdata.rdf.internal.ColorsEnumExtension.Color; +import com.bigdata.rdf.internal.impl.AbstractIV; +import com.bigdata.rdf.internal.impl.BlobIV; +import com.bigdata.rdf.internal.impl.bnode.NumericBNodeIV; +import com.bigdata.rdf.internal.impl.bnode.SidIV; +import com.bigdata.rdf.internal.impl.bnode.UUIDBNodeIV; +import com.bigdata.rdf.internal.impl.extensions.DateTimeExtension; +import com.bigdata.rdf.internal.impl.extensions.DerivedNumericsExtension; +import com.bigdata.rdf.internal.impl.literal.LiteralExtensionIV; +import com.bigdata.rdf.internal.impl.literal.UUIDLiteralIV; +import com.bigdata.rdf.internal.impl.literal.XSDBooleanIV; +import com.bigdata.rdf.internal.impl.literal.XSDNumericIV; +import com.bigdata.rdf.internal.impl.uri.VocabURIByteIV; +import com.bigdata.rdf.internal.impl.uri.VocabURIShortIV; +import com.bigdata.rdf.lexicon.LexiconRelation; +import com.bigdata.rdf.model.BigdataBNode; +import com.bigdata.rdf.model.BigdataLiteral; +import com.bigdata.rdf.model.BigdataURI; +import com.bigdata.rdf.model.BigdataValue; +import com.bigdata.rdf.model.BigdataValueFactory; +import com.bigdata.rdf.model.BigdataValueFactoryImpl; +import com.bigdata.rdf.model.StatementEnum; +import com.bigdata.rdf.spo.SPO; +import com.bigdata.rdf.vocab.Vocabulary; + +/** + * Unit tests for encoding and decoding compound keys (such as are used by the + * statement indices) in which some of the key components are inline values + * having variable component lengths while others are term identifiers. 
+ * + * @author <a href="mailto:tho...@us...">Bryan Thompson</a> + * @version $Id: TestEncodeDecodeKeys.java 2756 2010-05-03 22:26:18Z + * thompsonbry$ + * + * FIXME Test heterogenous sets of {@link IV}s, ideally randomly + * generated for each type of VTE and DTE. + */ +public class TestEncodeDecodeKeys extends AbstractEncodeDecodeKeysTestCase { + + public TestEncodeDecodeKeys() { + super(); + } + + public TestEncodeDecodeKeys(String name) { + super(name); + } + + public void test_InlineValue() { + + for (VTE vte : VTE.values()) { + + for (DTE dte : DTE.values()) { + + final IV<?, ?> v = new AbstractIV(vte, + true/* inline */, false/* extension */, dte) { + + private static final long serialVersionUID = 1L; + + @Override + public boolean equals(Object o) { + if (this == o) + return true; + return false; + } + + @Override + public int byteLength() { + throw new UnsupportedOperationException(); + } + + @Override + public int hashCode() { + return 0; + } + + @Override + public IV<?, ?> clone(boolean clearCache) { + throw new UnsupportedOperationException(); + } + + @Override + public int _compareTo(IV o) { + throw new UnsupportedOperationException(); + } + + @Override + public BigdataValue asValue(final LexiconRelation lex) + throws UnsupportedOperationException { + return null; + } + + @Override + public Object getInlineValue() + throws UnsupportedOperationException { + return null; + } + + @Override + public boolean isInline() { + return true; + } + + @Override + public boolean needsMaterialization() { + return false; + } + + @Override + public String stringValue() { + throw new UnsupportedOperationException(); + } + + }; + + assertTrue(v.isInline()); + +// if (termId == 0L) { +// assertTrue(v.toString(), v.isNull()); +// } else { +// assertFalse(v.toString(), v.isNull()); +// } +// +// assertEquals(termId, v.getTermId()); + + // should not throw an exception. + v.getInlineValue(); + + assertEquals("flags=" + v.flags(), vte, v.getVTE()); + + assertEquals(dte, v.getDTE()); + + switch (vte) { + case URI: + assertTrue(v.isURI()); + assertFalse(v.isBNode()); + assertFalse(v.isLiteral()); + assertFalse(v.isStatement()); + break; + case BNODE: + assertFalse(v.isURI()); + assertTrue(v.isBNode()); + assertFalse(v.isLiteral()); + assertFalse(v.isStatement()); + break; + case LITERAL: + assertFalse(v.isURI()); + assertFalse(v.isBNode()); + assertTrue(v.isLiteral()); + assertFalse(v.isStatement()); + break; + case STATEMENT: + assertFalse(v.isURI()); + assertFalse(v.isBNode()); + assertFalse(v.isLiteral()); + assertTrue(v.isStatement()); + break; + default: + fail("vte=" + vte); + } + + } + + } + + } + + /** + * Unit test for encoding and decoding a statement formed from + * {@link BlobIV}s. + */ + public void test_encodeDecode_allTermIds() { + + final IV<?, ?>[] e = {// + newTermId(VTE.URI),// + newTermId(VTE.URI),// + newTermId(VTE.URI),// + newTermId(VTE.URI) // + }; + + doEncodeDecodeTest(e); + + doComparatorTest(e); + + } + + /** + * Unit test where the RDF Object position is an xsd:boolean. + */ + public void test_encodeDecode_XSDBoolean() { + + final IV<?, ?>[] e = {// + new XSDBooleanIV<BigdataLiteral>(true),// + new XSDBooleanIV<BigdataLiteral>(false),// + }; + + doEncodeDecodeTest(e); + + doComparatorTest(e); + + } + + /** + * Unit test for {@link XSDNumericIV}. 
+ */ + public void test_encodeDecode_XSDByte() { + + final IV<?, ?>[] e = {// + new XSDNumericIV<BigdataLiteral>((byte)Byte.MIN_VALUE),// + new XSDNumericIV<BigdataLiteral>((byte)-1),// + new XSDNumericIV<BigdataLiteral>((byte)0),// + new XSDNumericIV<BigdataLiteral>((byte)1),// + new XSDNumericIV<BigdataLiteral>((byte)Byte.MAX_VALUE),// + }; + + doEncodeDecodeTest(e); + + doComparatorTest(e); + + } + + /** + * Unit test for {@link XSDNumericIV}. + */ + public void test_encodeDecode_XSDShort() { + + final IV<?, ?>[] e = {// + new XSDNumericIV<BigdataLiteral>((short)-1),// + new XSDNumericIV<BigdataLiteral>((short)0),// + new XSDNumericIV<BigdataLiteral>((short)1),// + new XSDNumericIV<BigdataLiteral>((short)Short.MIN_VALUE),// + new XSDNumericIV<BigdataLiteral>((short)Short.MAX_VALUE),// + }; + + doEncodeDecodeTest(e); + + doComparatorTest(e); + + } + + /** + * Unit test for {@link XSDNumericIV}. + */ + public void test_encodeDecode_XSDInt() { + + final IV<?, ?>[] e = {// + new XSDNumericIV<BigdataLiteral>(1),// + new XSDNumericIV<BigdataLiteral>(0),// + new XSDNumericIV<BigdataLiteral>(-1),// + new XSDNumericIV<BigdataLiteral>(Integer.MAX_VALUE),// + new XSDNumericIV<BigdataLiteral>(Integer.MIN_VALUE),// + }; + + doEncodeDecodeTest(e); + + doComparatorTest(e); + + } + + /** + * Unit test for {@link XSDNumericIV}. + */ + public void test_encodeDecode_XSDLong() { + + final IV<?, ?>[] e = {// + new XSDNumericIV<BigdataLiteral>(1L),// + new XSDNumericIV<BigdataLiteral>(0L),// + new XSDNumericIV<BigdataLiteral>(-1L),// + new XSDNumericIV<BigdataLiteral>(Long.MIN_VALUE),// + new XSDNumericIV<BigdataLiteral>(Long.MAX_VALUE),// + }; + + doEncodeDecodeTest(e); + + doComparatorTest(e); + + } + + /** + * Unit test for {@link XSDNumericIV}. + */ + public void test_encodeDecode_XSDFloat() { + + /* + * Note: -0f and +0f are converted to the same point in the value space. + */ +// new XSDNumericIV<BigdataLiteral>(-0f); + + final IV<?, ?>[] e = {// + new XSDNumericIV<BigdataLiteral>(1f),// + new XSDNumericIV<BigdataLiteral>(-1f),// + new XSDNumericIV<BigdataLiteral>(+0f),// + new XSDNumericIV<BigdataLiteral>(Float.MAX_VALUE),// + new XSDNumericIV<BigdataLiteral>(Float.MIN_VALUE),// + new XSDNumericIV<BigdataLiteral>(Float.MIN_NORMAL),// + new XSDNumericIV<BigdataLiteral>(Float.POSITIVE_INFINITY),// + new XSDNumericIV<BigdataLiteral>(Float.NEGATIVE_INFINITY),// + new XSDNumericIV<BigdataLiteral>(Float.NaN),// + }; + + doEncodeDecodeTest(e); + + doComparatorTest(e); + + } + + /** + * Unit test for {@link XSDNumericIV}. + */ + public void test_encodeDecode_XSDDouble() { + + /* + * Note: -0d and +0d are converted to the same point in the value space. + */ +// new XSDNumericIV<BigdataLiteral>(-0d); + + final IV<?, ?>[] e = {// + new XSDNumericIV<BigdataLiteral>(1d),// + new XSDNumericIV<BigdataLiteral>(-1d),// + new XSDNumericIV<BigdataLiteral>(+0d),// + new XSDNumericIV<BigdataLiteral>(Double.MAX_VALUE),// + new XSDNumericIV<BigdataLiteral>(Double.MIN_VALUE),// + new XSDNumericIV<BigdataLiteral>(Double.MIN_NORMAL),// + new XSDNumericIV<BigdataLiteral>(Double.POSITIVE_INFINITY),// + new XSDNumericIV<BigdataLiteral>(Double.NEGATIVE_INFINITY),// + new XSDNumericIV<BigdataLiteral>(Double.NaN),// + }; + + doEncodeDecodeTest(e); + + doComparatorTest(e); + + } + + /** + * Unit test for {@link UUIDLiteralIV}. 
+ */ + public void test_encodeDecode_UUID() { + + final IV<?, ?>[] e = new IV[100]; + + for (int i = 0; i < e.length; i++) { + + e[i] = new UUIDLiteralIV<BigdataLiteral>(UUID.randomUUID()); + + } + + doEncodeDecodeTest(e); + + doComparatorTest(e); + + } + + /** + * Unit test for {@link UUIDBNodeIV}, which provides support for inlining a + * told blank node whose <code>ID</code> can be parsed as a {@link UUID}. + */ + public void test_encodeDecode_BNode_UUID_ID() { + + final IV<?, ?>[] e = new IV[100]; + + for (int i = 0; i < e.length; i++) { + + e[i] = new UUIDBNodeIV<BigdataBNode>(UUID.randomUUID()); + + } + + doEncodeDecodeTest(e); + + doComparatorTest(e); + + } + + /** + * Unit test for {@link NumericBNodeIV}, which provides support for inlining + * a told blank node whose <code>ID</code> can be parsed as an + * {@link Integer}. + */ + public void test_encodeDecode_BNode_INT_ID() { + + final IV<?, ?>[] e = {// + new NumericBNodeIV<BigdataBNode>(-1),// + new NumericBNodeIV<BigdataBNode>(0),// + new NumericBNodeIV<BigdataBNode>(1),// + new NumericBNodeIV<BigdataBNode>(-52),// + new NumericBNodeIV<BigdataBNode>(52),// + new NumericBNodeIV<BigdataBNode>(Integer.MAX_VALUE),// + new NumericBNodeIV<BigdataBNode>(Integer.MIN_VALUE),// + }; + + doEncodeDecodeTest(e); + + doComparatorTest(e); + + } + + /** + * Unit test for the {@link EpochExtension}. + */ + public void test_encodeDecodeEpoch() { + + final BigdataValueFactory vf = BigdataValueFactoryImpl.getInstance("test"); + + final EpochExtension<BigdataValue> ext = + new EpochExtension<BigdataValue>(new IDatatypeURIResolver() { + public BigdataURI resolve(final URI uri) { + final BigdataURI buri = vf.createURI(uri.stringValue()); + buri.setIV(newTermId(VTE.URI)); + return buri; + } + }); + + final Random r = new Random(); + + final IV<?, ?>[] e = new IV[100]; + + for (int i = 0; i < e.length; i++) { + + final long v = r.nextLong(); + + final String s = Long.toString(v); + + final Literal lit = new LiteralImpl(s, EpochExtension.EPOCH); + + final IV<?,?> iv = ext.createIV(lit); + + if (iv == null) + fail("Did not create IV: lit=" + lit); + + e[i] = iv; + + }; + + doEncodeDecodeTest(e); + + doComparatorTest(e); + + } + + /** + * Unit test for the {@link ColorsEnumExtension}. + */ + public void test_encodeDecodeColor() { + + final BigdataValueFactory vf = BigdataValueFactoryImpl.getInstance("test"); + + final ColorsEnumExtension<BigdataValue> ext = + new ColorsEnumExtension<BigdataValue>(new IDatatypeURIResolver() { + public BigdataURI resolve(URI uri) { + final BigdataURI buri = vf.createURI(uri.stringValue()); + buri.setIV(newTermId(VTE.URI)); + return buri; + } + }); + + final List<IV<?, ?>> a = new LinkedList<IV<?, ?>>(); + + for (Color c : Color.values()) { + + a.add(ext.createIV(new LiteralImpl(c.name(), + ColorsEnumExtension.COLOR))); + + } + + final IV<?, ?>[] e = a.toArray(new IV[0]); + + doEncodeDecodeTest(e); + + doComparatorTest(e); + + } + + /** + * Unit test for round-trip of xsd:dateTime values. 
+ */ + public void test_encodeDecodeDateTime() throws Exception { + + final BigdataValueFactory vf = BigdataValueFactoryImpl.getInstance("test"); + + final DatatypeFactory df = DatatypeFactory.newInstance(); + + final DateTimeExtension<BigdataValue> ext = + new DateTimeExtension<BigdataValue>(new IDatatypeURIResolver() { + public BigdataURI resolve(URI uri) { + final BigdataURI buri = vf.createURI(uri.stringValue()); + buri.setIV(newTermId(VTE.URI)); + return buri; + } + }, + TimeZone.getDefault() + ); + + final BigdataLiteral[] dt = { + vf.createLiteral( + df.newXMLGregorianCalendar("2001-10-26T21:32:52")), + vf.createLiteral( + df.newXMLGregorianCalendar("2001-10-26T21:32:52+02:00")), + vf.createLiteral( + df.newXMLGregorianCalendar("2001-10-26T19:32:52Z")), + vf.createLiteral( + df.newXMLGregorianCalendar("2001-10-26T19:32:52+00:00")), + vf.createLiteral( + df.newXMLGregorianCalendar("-2001-10-26T21:32:52")), + vf.createLiteral( + df.newXMLGregorianCalendar("2001-10-26T21:32:52.12679")), + vf.createLiteral( + df.newXMLGregorianCalendar("1901-10-26T21:32:52")), + }; + + final IV<?, ?>[] e = new IV[dt.length]; + + for (int i = 0; i < dt.length; i++) { + + e[i] = ext.createIV(dt[i]); + + } + + final IV<?, ?>[] a = doEncodeDecodeTest(e); + + if (log.isInfoEnabled()) { + for (int i = 0; i < e.length; i++) { + log.info("original: " + dt[i]); + log.info("asValue : " + ext.asValue((LiteralExtensionIV<?>) e[i], vf)); + log.info("decoded : " + ext.asValue((LiteralExtensionIV<?>) a[i], vf)); + log.info(""); + } +// log.info(svf.createLiteral( +// df.newXMLGregorianCalendar("2001-10-26T21:32:52.12679"))); + } + + doComparatorTest(e); + + } + + /** + * Unit test verifies that the inline xsd:dateTime representation preserves + * the milliseconds units. However, precision beyond milliseconds is NOT + * preserved by the inline representation, which is based on milliseconds + * since the epoch. + * + * @throws DatatypeConfigurationException + */ + public void test_dateTime_preservesMillis() + throws DatatypeConfigurationException { + + final BigdataValueFactory vf = BigdataValueFactoryImpl + .getInstance("test"); + + final DatatypeFactory df = DatatypeFactory.newInstance(); + + final DateTimeExtension<BigdataValue> ext = new DateTimeExtension<BigdataValue>( + new IDatatypeURIResolver() { + public BigdataURI resolve(URI uri) { + final BigdataURI buri = vf.createURI(uri.stringValue()); + buri.setIV(newTermId(VTE.URI)); + return buri; + } + }, TimeZone.getTimeZone("GMT")); + + /* + * The string representation of the dateTime w/ milliseconds+ precision. + * This is assumed to be a time in the time zone specified to the date + * time extension. + */ + final String givenStr = "2001-10-26T21:32:52.12679"; + + /* + * The string representation w/ only milliseconds precision. This will + * be a time in the time zone given to the date time extension. The + * canonical form of a GMT time zone is "Z", indicating "Zulu", which is + * why that is part of the expected representation here. + */ + final String expectedStr = "2001-10-26T21:32:52.126Z"; + + /* + * A bigdata literal w/o inlining from the *givenStr*. This + * representation has greater milliseconds+ precision. + */ + final BigdataLiteral lit = vf.createLiteral(df + .newXMLGregorianCalendar(givenStr)); + + // Verify the representation is exact. + assertEquals(givenStr, lit.stringValue()); + + /* + * The IV representation of the dateTime. 
This will convert the date + * time into the time zone given to the extension and will also truncate + * the precision to no more than milliseconds. + */ + final LiteralExtensionIV<?> iv = ext.createIV(lit); + + // Convert the IV back into a bigdata literal. + final BigdataLiteral lit2 = (BigdataLiteral) ext.asValue(iv, vf); + + // Verify that millisecond precision was retained. + assertEquals(expectedStr, lit2.stringValue()); + + } + + /** + * Unit test for round-trip of derived numeric values. + */ + public void test_encodeDecodeDerivedNumerics() throws Exception { + + final BigdataValueFactory vf = BigdataValueFactoryImpl.getInstance("test"); + + final DatatypeFactory df = DatatypeFactory.newInstance(); + + final DerivedNumericsExtension<BigdataValue> ext = + new DerivedNumericsExtension<BigdataValue>(new IDatatypeURIResolver() { + public BigdataURI resolve(URI uri) { + final BigdataURI buri = vf.createURI(uri.stringValue()); + buri.setIV(newTermId(VTE.URI)); + return buri; + } + }); + + final BigdataLiteral[] dt = { + vf.createLiteral("1", XSD.POSITIVE_INTEGER), + vf.createLiteral("-1", XSD.NEGATIVE_INTEGER), + vf.createLiteral("-1", XSD.NON_POSITIVE_INTEGER), + vf.createLiteral("1", XSD.NON_NEGATIVE_INTEGER), + vf.createLiteral("0", XSD.NON_POSITIVE_INTEGER), + vf.createLiteral("0", XSD.NON_NEGATIVE_INTEGER), + }; + + final IV<?, ?>[] e = new IV[dt.length]; + + for (int i = 0; i < dt.length; i++) { + + e[i] = ext.createIV(dt[i]); + + } + + final IV<?, ?>[] a = doEncodeDecodeTest(e); + + if (log.isInfoEnabled()) { + for (int i = 0; i < e.length; i++) { + log.info("original: " + dt[i]); + log.info("asValue : " + ext.asValue((LiteralExtensionIV<?>) e[i], vf)); + log.info("decoded : " + ext.asValue((LiteralExtensionIV<?>) a[i], vf)); + log.info(""); + } +// log.info(svf.createLiteral( +// df.newXMLGregorianCalendar("2001-10-26T21:32:52.12679"))); + } + + doComparatorTest(e); + + } + + /** + * Unit test for {@link SidIV}. 
+ */ + public void test_encodeDecode_sids() { + + final IV<?,?> s1 = newTermId(VTE.URI); + final IV<?,?> s2 = newTermId(VTE.URI); + final IV<?,?> p1 = newTermId(VTE.URI); + final IV<?,?> p2 = newTermId(VTE.URI); + final IV<?,?> o1 = newTermId(VTE.URI); + final IV<?,?> o2 = newTermId(VTE.BNODE); + final IV<?,?> o3 = newTermId(VTE.LITERAL); + + final SPO spo1 = new SPO(s1, p1, o1, StatementEnum.Explicit); + final SPO spo2 = new SPO(s1, p1, o2, StatementEnum.Explicit); + final SPO spo3 = new SPO(s1, p1, o3, StatementEnum.Explicit); + final SPO spo4 = new SPO(s1, p2, o1, StatementEnum.Explicit); + final SPO spo5 = new SPO(s1, p2, o2, StatementEnum.Explicit); + final SPO spo6 = new SPO(s1, p2, o3, StatementEnum.Explicit); + final SPO spo7 = new SPO(s2, p1, o1, StatementEnum.Explicit); + final SPO spo8 = new SPO(s2, p1, o2, StatementEnum.Explicit); + final SPO spo9 = new SPO(s2, p1, o3, StatementEnum.Explicit); + final SPO spo10 = new SPO(s2, p2, o1, StatementEnum.Explicit); + final SPO spo11 = new SPO(s2, p2, o2, StatementEnum.Explicit); + final SPO spo12 = new SPO(s2, p2, o3, StatementEnum.Explicit); + spo1.setStatementIdentifier(true); + spo2.setStatementIdentifier(true); + spo3.setStatementIdentifier(true); + spo6.setStatementIdentifier(true); + final SPO spo13 = new SPO(spo1.getStatementIdentifier(), p1, o1, + StatementEnum.Explicit); + final SPO spo14 = new SPO(spo2.getStatementIdentifier(), p2, o2, + StatementEnum.Explicit); + final SPO spo15 = new SPO(s1, p1, spo3.getStatementIdentifier(), + StatementEnum.Explicit); + spo15.setStatementIdentifier(true); + final SPO spo16 = new SPO(s1, p1, spo6.getStatementIdentifier(), + StatementEnum.Explicit); + final SPO spo17 = new SPO(spo1.getStatementIdentifier(), p1, spo15 + .getStatementIdentifier(), StatementEnum.Explicit); + + final IV<?, ?>[] e = {// + new SidIV<BigdataBNode>(spo1),// + new SidIV<BigdataBNode>(spo2),// + new SidIV<BigdataBNode>(spo3),// + new SidIV<BigdataBNode>(spo4),// + new SidIV<BigdataBNode>(spo5),// + new SidIV<BigdataBNode>(spo6),// + new SidIV<BigdataBNode>(spo7),// + new SidIV<BigdataBNode>(spo8),// + new SidIV<BigdataBNode>(spo9),// + new SidIV<BigdataBNode>(spo10),// + new SidIV<BigdataBNode>(spo11),// + new SidIV<BigdataBNode>(spo12),// + new SidIV<BigdataBNode>(spo13),// + new SidIV<BigdataBNode>(spo14),// + new SidIV<BigdataBNode>(spo15),// + new SidIV<BigdataBNode>(spo16),// + new SidIV<BigdataBNode>(spo17),// + }; + + doEncodeDecodeTest(e); + + doComparatorTest(e); + + } + + /** + * Unit test for a fully inlined representation of a URI based on a + * <code>byte</code> code. The flags byte looks like: + * <code>VTE=URI, inline=true, extension=false, + * DTE=XSDByte</code>. It is followed by a <code>unsigned byte</code> value + * which is the index of the URI in the {@link Vocabulary} class for the + * triple store. + */ + public void test_encodeDecode_URIByteIV() { + + final IV<?, ?>[] e = {// + new VocabURIByteIV<BigdataURI>((byte) Byte.MIN_VALUE),// + new VocabURIByteIV<BigdataURI>((byte) -1),// + new VocabURIByteIV<BigdataURI>((byte) 0),// + new VocabURIByteIV<BigdataURI>((byte) 1),// + new VocabURIByteIV<BigdataURI>((byte) Byte.MAX_VALUE),// + }; + + doEncodeDecodeTest(e); + + doComparatorTest(e); + + } + + /** + * Unit test for a fully inlined representation of a URI based on a + * <code>short</code> code. The flags byte looks like: + * <code>VTE=URI, inline=true, extension=false, + * DTE=XSDShort</code>. 
It is followed by an <code>unsigned short</code> + * value which is the index of the URI in the {@link Vocabulary} class for + * the triple store. + */ + public void test_encodeDecode_URIShortIV() { + + final IV<?, ?>[] e = {// + new VocabURIShortIV<BigdataURI>((short) Short.MIN_VALUE),// + new VocabURIShortIV<BigdataURI>((short) -1),// + new VocabURIShortIV<BigdataURI>((short) 0),// + new VocabURIShortIV<BigdataURI>((short) 1),// + new VocabURIShortIV<BigdataURI>((short) Short.MAX_VALUE),// + }; + + doEncodeDecodeTest(e); + + doComparatorTest(e); + + } + +} This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site. |
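The three encode/decode tests above all follow the same discipline: encode each IV to an unsigned byte[] key, decode it back, assert equality, and separately check that key order agrees with value order. A minimal self-contained sketch of that discipline, using a hypothetical Codec interface rather than the suite's actual doEncodeDecodeTest/doComparatorTest helpers:

interface Codec<T extends Comparable<T>> {
    byte[] encode(T value); // serialize to an unsigned byte[] key
    T decode(byte[] key);   // materialize the value from the key
}

final class RoundTripCheck {

    /**
     * Verify decode(encode(v)).equals(v) for each value, and that for
     * adjacent pairs of the input the unsigned byte[] key order agrees
     * with the natural order of the values.
     */
    static <T extends Comparable<T>> void check(final Codec<T> codec, final T[] e) {
        final byte[][] keys = new byte[e.length][];
        for (int i = 0; i < e.length; i++) {
            keys[i] = codec.encode(e[i]);
            final T decoded = codec.decode(keys[i]);
            if (!e[i].equals(decoded))
                throw new AssertionError("round-trip failed at " + i + ": " + e[i]);
        }
        for (int i = 1; i < e.length; i++) {
            final int byCodec = compareUnsigned(keys[i - 1], keys[i]);
            final int byValue = e[i - 1].compareTo(e[i]);
            if (Integer.signum(byCodec) != Integer.signum(byValue))
                throw new AssertionError("key order disagrees with value order at " + i);
        }
    }

    /** Lexicographic comparison of unsigned bytes, shorter array first on ties. */
    private static int compareUnsigned(final byte[] a, final byte[] b) {
        final int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            final int x = a[i] & 0xff, y = b[i] & 0xff;
            if (x != y)
                return x < y ? -1 : 1;
        }
        return a.length - b.length;
    }
}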
From: <tho...@us...> - 2013-08-28 21:41:32
Revision: 7358 http://bigdata.svn.sourceforge.net/bigdata/?rev=7358&view=rev Author: thompsonbry Date: 2013-08-28 21:41:23 +0000 (Wed, 28 Aug 2013) Log Message: ----------- Optimize the thread local scheduler in a new implementation that defers the eliding of duplicate vertices in the frontier until the next round. This should reduce the time in between rounds (and the time in which the code is single threaded) significantly. @see #629 Modified Paths: -------------- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/IStaticFrontier.java branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/GASEngine.java branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/StaticFrontier2.java branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/scheduler/TLScheduler.java branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/util/ManagedArray.java Added Paths: ----------- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/scheduler/TLScheduler2.java Removed Paths: ------------- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/StaticFrontier.java Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/IStaticFrontier.java =================================================================== --- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/IStaticFrontier.java 2013-08-28 20:01:46 UTC (rev 7357) +++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/IStaticFrontier.java 2013-08-28 21:41:23 UTC (rev 7358) @@ -25,6 +25,22 @@ * Return <code>true</code> if the frontier is known to be empty. */ boolean isEmpty(); + + /** + * Return <code>true</code> iff the frontier is known to be compact (no + * duplicate vertices). + * <p> + * Note: If the frontier is not compact, then the {@link IGASEngine} may + * optionally elect to eliminate duplicate work when it schedules the + * vertices in the frontier. + * <p> + * Note: A non-compact frontier can arise when the {@link IGASScheduler} + * chooses a per-thread approach and then copies the per-thread segments + * onto the shared backing array in parallel. This can reduce the time + * between rounds, which can speed up the overall execution of the algorithm + * significantly. + */ + boolean isCompact(); /** * Reset the frontier from the {@link IV}s. Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/GASEngine.java =================================================================== --- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/GASEngine.java 2013-08-28 20:01:46 UTC (rev 7357) +++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/GASEngine.java 2013-08-28 21:41:23 UTC (rev 7358) @@ -1,6 +1,7 @@ package com.bigdata.rdf.graph.impl; import java.lang.reflect.Constructor; +import java.util.HashSet; import java.util.LinkedList; import java.util.List; import java.util.concurrent.Callable; @@ -219,6 +220,17 @@ private class ParallelFrontierStrategy extends AbstractFrontierStrategy { private final IStaticFrontier f; + + /** + * Used to eliminate duplicates from the frontier at the time that we + * schedule the tasks. This shifts the duplicate elimination burden from + * the period between the rounds into the execution of the round. 
Since + * we schedule the tasks that consume the frontier from a single thread, + * we do not need to use a thread-safe collection. We also do not care + * about order for this collection. It is only there to filter out + * duplicate work. + */ + private final HashSet<IV> scheduled; ParallelFrontierStrategy(final VertexTaskFactory<Long> taskFactory, final IStaticFrontier f) { @@ -227,6 +239,16 @@ this.f = f; + /* + * Pre-allocate based on the known size of the frontier. The + * frontier may have duplicates (depending on its implementation) so + * this size is conservative. + * + * If the frontier is known to be compact, then this map is not + * initialized and is not used. + */ + this.scheduled = f.isCompact() ? null : new HashSet<IV>(f.size()); + } @Override @@ -254,6 +276,25 @@ // For all vertices in the frontier. for (IV u : f) { + if (scheduled != null) { + +// null values are NOT expected. +// +// if (u == null) { +// +// // null references are ignored for non-compact +// // frontiers. +// continue; +// } + + if (!scheduled.add(u)) { + + // already scheduled. + continue; + + } + } + // Future will compute scatter for vertex. final FutureTask<Long> ft = new FutureTask<Long>( taskFactory.newVertexTask(u)); Deleted: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/StaticFrontier.java =================================================================== --- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/StaticFrontier.java 2013-08-28 20:01:46 UTC (rev 7357) +++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/StaticFrontier.java 2013-08-28 21:41:23 UTC (rev 7358) @@ -1,108 +0,0 @@ -package com.bigdata.rdf.graph.impl; - -import java.util.ArrayList; -import java.util.Arrays; -import java.util.Iterator; - -import com.bigdata.rdf.graph.IStaticFrontier; -import com.bigdata.rdf.internal.IV; - -import cutthecrap.utils.striterators.ArrayIterator; - -/** - * Simple implementation of a "static" frontier. - * <p> - * Note: This implementation has package private methods that permit certain - * kinds of mutation. - * - * @author <a href="mailto:tho...@us...">Bryan Thompson</a> - * @deprecated by {@link StaticFrontier2}, which is more efficient. - */ -@SuppressWarnings("rawtypes") -public class StaticFrontier implements IStaticFrontier { - - private final ArrayList<IV> vertices; - - StaticFrontier() { - - vertices = new ArrayList<IV>(); - - } - - @Override - public int size() { - - return vertices.size(); - - } - - @Override - public boolean isEmpty() { - - return vertices.isEmpty(); - - } - - @Override - public Iterator<IV> iterator() { - - return vertices.iterator(); - - } - - private void ensureCapacity(final int minCapacity) { - - vertices.ensureCapacity(minCapacity); - - } - - // private void clear() { - // - // vertices.clear(); - // - // } - - // private void schedule(IV v) { - // - // vertices.add(v); - // - // } - - @Override - public void resetFrontier(final int minCapacity, final boolean ordered, - Iterator<IV> itr) { - - if (!ordered) { - - final IV[] a = new IV[minCapacity]; - - int i=0; - while(itr.hasNext()) { - a[i++] = itr.next(); - } - - // sort. - Arrays.sort(a, 0/* fromIndex */, i/* toIndex */); - - // iterator over the ordered vertices. - itr = new ArrayIterator<IV>(a, 0/* fromIndex */, i/* len */); - - } - - // clear the old frontier. - vertices.clear(); - - // ensure enough capacity for the new frontier. 
- ensureCapacity(minCapacity); - - while (itr.hasNext()) { - - final IV v = itr.next(); - - vertices.add(v); - - } - - } - -} \ No newline at end of file Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/StaticFrontier2.java =================================================================== --- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/StaticFrontier2.java 2013-08-28 20:01:46 UTC (rev 7357) +++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/StaticFrontier2.java 2013-08-28 21:41:23 UTC (rev 7358) @@ -32,6 +32,8 @@ */ private IArraySlice<IV> vertices; + private boolean compact = true; + StaticFrontier2() { /* @@ -48,6 +50,19 @@ } @Override + public boolean isCompact() { + + return compact; + + } + + public void setCompact(final boolean newValue) { + + this.compact = newValue; + + } + + @Override public int size() { return vertices.len(); @@ -67,8 +82,67 @@ return vertices.iterator(); } + + /** + * Grow the backing array iff necessary. Regardless, the entries from end of + * the current view to the first non-<code>null</code> are cleared. This is + * done to faciltiate GC by clearing references that would otherwise remain + * if/when the frontier contracted. + * + * @param minCapacity + * The required minimum capacity. + */ + public void resetAndEnsureCapacity(final int minCapacity) { + + final int len0 = size(); + + backing.ensureCapacity(minCapacity); + + final int len1 = backing.len(); + + if (len1 > len0) { + + final IV[] a = backing.array(); + + for (int i = len0; i < len1; i++) { + + if (a[i] == null) + break; + + a[i] = null; + + } + + } + + /* + * Replace the view. The caller will need to copy the data into the + * backing array before it will appear in the new view. + */ + this.vertices = backing.slice(0/* off */, minCapacity/* len */); + + } /** + * Copy a slice into the backing array. This method is intended for use by + * parallel threads. The backing array MUST have sufficient capacity. The + * threads MUST write to offsets that are known to not overlap. NO checking + * is done to ensure that the concurrent copy of these slices will not + * overlap. + * + * @param off + * The offset at which to copy the slice. + * @param slice + * The slice. + */ + public void copyIntoResetFrontier(final int off, final IArraySlice<IV> slice) { + + backing.put(off/* dstoff */, slice.array()/* src */, + slice.off()/* srcoff */, slice.len()/* srclen */); + + } + + /** * Setup the same static frontier object for the new compact fronter (it is * reused in each round). */ @@ -77,7 +151,7 @@ final Iterator<IV> itr) { copyScheduleIntoFrontier(minCapacity, itr); - + if (!ordered) { /* @@ -129,7 +203,7 @@ break; a[i] = null; } - + /* * Take a slice of the backing showing only the valid entries and use it * to replace the view of the backing array. 
@@ -138,4 +212,12 @@ } + @Override + public String toString() { + + return getClass().getName() + "{size=" + size() + ",compact=" + + isCompact() + ",capacity=" + backing.capacity() + "}"; + + } + } \ No newline at end of file Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/scheduler/TLScheduler.java =================================================================== --- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/scheduler/TLScheduler.java 2013-08-28 20:01:46 UTC (rev 7357) +++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/scheduler/TLScheduler.java 2013-08-28 21:41:23 UTC (rev 7358) @@ -29,6 +29,8 @@ * is a sequential step with a linear cost in the size of the frontier. * * @author <a href="mailto:tho...@us...">Bryan Thompson</a> + * + * TODO Discard if dominated by {@link TLScheduler2}. */ @SuppressWarnings("rawtypes") public class TLScheduler implements IGASSchedulerImpl { @@ -123,8 +125,8 @@ * Clear the per-thread maps, but do not discard. They will be reused in * the next round. * - * FIXME This is a big cost. Try simply clearing [map] and see if that - * is less expensive. + * Note: This is a big cost. Simply clearing [map] results in much less + * time and less GC. */ // for (STScheduler s : map.values()) { // @@ -182,11 +184,8 @@ } } - // // Clear the new frontier. - // frontier.clear(); + if (nvertices == 0) { - if (nsources == 0) { - /* * The new frontier is empty. */ @@ -209,7 +208,7 @@ + nthreads); } - + /* * Now merge sort those arrays and populate the new frontier. */ Added: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/scheduler/TLScheduler2.java =================================================================== --- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/scheduler/TLScheduler2.java (rev 0) +++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/scheduler/TLScheduler2.java 2013-08-28 21:41:23 UTC (rev 7358) @@ -0,0 +1,300 @@ +package com.bigdata.rdf.graph.impl.scheduler; + +import java.util.ArrayList; +import java.util.LinkedHashSet; +import java.util.List; +import java.util.concurrent.Callable; +import java.util.concurrent.ConcurrentHashMap; +import java.util.concurrent.ExecutionException; +import java.util.concurrent.Future; + +import org.apache.log4j.Logger; + +import com.bigdata.rdf.graph.IGASScheduler; +import com.bigdata.rdf.graph.IGASSchedulerImpl; +import com.bigdata.rdf.graph.IStaticFrontier; +import com.bigdata.rdf.graph.impl.GASEngine; +import com.bigdata.rdf.graph.impl.StaticFrontier2; +import com.bigdata.rdf.graph.impl.util.GASImplUtil; +import com.bigdata.rdf.graph.impl.util.IArraySlice; +import com.bigdata.rdf.graph.impl.util.ManagedArray; +import com.bigdata.rdf.graph.impl.util.MergeSortIterator; +import com.bigdata.rdf.internal.IV; + +/** + * This scheduler uses thread-local buffers ({@link LinkedHashSet}) to track the + * distinct vertices scheduled by each execution thread. After the computation + * round, those per-thread segments of the frontier are combined into a single + * global, compact, and ordered frontier. To maximize the parallel activity, the + * per-thread frontiers are sorted using N threads (one per segment). Finally, + * the frontier segments are combined using a {@link MergeSortIterator} - this + * is a sequential step with a linear cost in the size of the frontier. 
+ * + * @author <a href="mailto:tho...@us...">Bryan Thompson</a> + */ +@SuppressWarnings("rawtypes") +public class TLScheduler2 implements IGASSchedulerImpl { + + private static final Logger log = Logger.getLogger(TLScheduler2.class); + + /** + * Class bundles a reusable, extensible array for sorting the thread-local + * frontier. + * + * @author <a href="mailto:tho...@us...">Bryan + * Thompson</a> + */ + private static class MySTScheduler extends STScheduler { + + /** + * This is used to sort the thread-local frontier (that is, the frontier + * for a single thread). The backing array will grow as necessary and is + * reused in each round. + * <P> + * Note: The schedule (for each thread) is using a set - see the + * {@link STScheduler} base class. This means that the schedule (for + * each thread) is compact, but not ordered. We need to use (and re-use) + * an array to order that compact per-thread schedule. The compact + * per-thread schedules are then combined into a single compact frontier + * for the new round. + */ + private final ManagedArray<IV> tmp; + + public MySTScheduler(final GASEngine gasEngine) { + + super(gasEngine); + + tmp = new ManagedArray<IV>(IV.class, 64); + + } + + } // class MySTScheduler + + private final GASEngine gasEngine; + private final int nthreads; + private final ConcurrentHashMap<Long/* threadId */, MySTScheduler> map; + + public TLScheduler2(final GASEngine gasEngine) { + + this.gasEngine = gasEngine; + + this.nthreads = gasEngine.getNThreads(); + + this.map = new ConcurrentHashMap<Long, MySTScheduler>( + nthreads/* initialCapacity */, .75f/* loadFactor */, nthreads); + + } + + private IGASScheduler threadLocalScheduler() { + + final Long id = Thread.currentThread().getId(); + + MySTScheduler s = map.get(id); + + if (s == null) { + + final IGASScheduler old = map.putIfAbsent(id, s = new MySTScheduler( + gasEngine)); + + if (old != null) { + + /* + * We should not have a key collision since this is based on the + * threadId. + */ + + throw new AssertionError(); + + } + + } + + return s; + + } + + @Override + public void schedule(final IV v) { + + threadLocalScheduler().schedule(v); + + } + + @Override + public void clear() { + + /* + * Clear the per-thread maps, but do not discard. They will be reused in + * the next round. + * + * Note: This is a big cost. Simply clearing [map] results in much less + * time and less GC. + */ +// for (STScheduler s : map.values()) { +// +// s.clear(); +// +// } + map.clear(); + } + + @Override + public void compactFrontier(final IStaticFrontier frontier) { + + /* + * Figure out the #of sources and the #of vertices across those sources. + * + * This also computes the cumulative offsets into the new frontier for + * the different per-thread segments. + */ + final int[] off = new int[nthreads]; // zero for 1st thread. + final int nsources; + final int nvertices; + { + int ns = 0, nv = 0; + for (MySTScheduler s : map.values()) { + final MySTScheduler t = s; + final int sz = t.vertices.size(); + off[ns] = nv; // starting index. + ns++; + nv += sz; + } + nsources = ns; + nvertices = nv; + } + + if (nsources > nthreads) { + + /* + * nsources could be LT nthreads if we have a very small frontier, + * but it should never be GTE nthreads. + */ + + throw new AssertionError("nsources=" + nsources + ", nthreads=" + + nthreads); + + } + + if (nvertices == 0) { + + /* + * The new frontier is empty. 
+ */ + + frontier.resetFrontier(0/* minCapacity */, true/* ordered */, + GASImplUtil.EMPTY_VERTICES_ITERATOR); + + return; + + } + + /* + * Parallel copy of the per-thread frontiers onto the new frontier. + * + * Note: This DOES NOT produce a compact frontier! The code that maps + * the gather/reduce operations over the frontier will eliminate + * duplicate work. + */ + +// /* +// * Extract a sorted, compact frontier from each thread local frontier. +// */ +// @SuppressWarnings("unchecked") +// final IArraySlice<IV>[] frontiers = new IArraySlice[nsources]; + + // TODO Requires a specific class to work! API! + final StaticFrontier2 f2 = (StaticFrontier2) frontier; + { + + // ensure sufficient capacity! + f2.resetAndEnsureCapacity(nvertices); + f2.setCompact(false); // NOT COMPACT! + + final List<Callable<IArraySlice<IV>>> tasks = new ArrayList<Callable<IArraySlice<IV>>>( + nsources); + + int i = 0; + for (MySTScheduler s : map.values()) { // TODO Paranoia suggests to put these into an [] so we know that we have the same traversal order as above. That might not be guaranteed. + final MySTScheduler t = s; + final int index = i++; + tasks.add(new Callable<IArraySlice<IV>>() { + @Override + public IArraySlice<IV> call() throws Exception { + final IArraySlice<IV> orderedSegment = GASImplUtil + .compactAndSort(t.vertices, t.tmp); + f2.copyIntoResetFrontier(off[index], orderedSegment); + return orderedSegment; // TODO CAN RETURN Void now! + } + }); + } + + // invokeAll() - futures will be done() before it returns. + final List<Future<IArraySlice<IV>>> futures; + try { + futures = gasEngine.getGASThreadPool().invokeAll(tasks); + } catch (InterruptedException e) { + throw new RuntimeException(e); + } + + for (Future<IArraySlice<IV>> f : futures) { + + try { + f.get(); +// final IArraySlice<IV> b = frontiers[nsources] = f.get(); +// nvertices += b.len(); +// nsources++; + } catch (InterruptedException e) { + throw new RuntimeException(e); + } catch (ExecutionException e) { + throw new RuntimeException(e); + } + + } + } + if (log.isInfoEnabled()) + log.info("Done: " + this.getClass().getCanonicalName() + + ",frontier=" + frontier); +// /* +// * Now merge sort those arrays and populate the new frontier. +// */ +// mergeSortSourcesAndSetFrontier(nsources, nvertices, frontiers, frontier); + + } + +// /** +// * Now merge sort the ordered frontier segments and populate the new +// * frontier. +// * +// * @param nsources +// * The #of frontier segments. +// * @param nvertices +// * The total #of vertice across those segments (may double-count +// * across segments). +// * @param frontiers +// * The ordered, compact frontier segments +// * @param frontier +// * The new frontier to be populated. +// */ +// private void mergeSortSourcesAndSetFrontier(final int nsources, +// final int nvertices, final IArraySlice<IV>[] frontiers, +// final IStaticFrontier frontier) { +// +// // wrap IVs[] as Iterators. +// @SuppressWarnings("unchecked") +// final Iterator<IV>[] itrs = new Iterator[nsources]; +// +// for (int i = 0; i < nsources; i++) { +// +// itrs[i] = frontiers[i].iterator(); +// +// } +// +// // merge sort of those iterators. 
+// final Iterator<IV> itr = new MergeSortIterator(itrs); +// +// frontier.resetFrontier(nvertices/* minCapacity */, true/* ordered */, +// itr); +// +// } + +} \ No newline at end of file Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/util/ManagedArray.java =================================================================== --- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/util/ManagedArray.java 2013-08-28 20:01:46 UTC (rev 7357) +++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/util/ManagedArray.java 2013-08-28 21:41:23 UTC (rev 7358) @@ -235,8 +235,8 @@ final int capacity = Math.max(required, capacity() * 2); - if (log.isInfoEnabled()) - log.info("Extending buffer to capacity=" + capacity + " bytes."); + if (log.isDebugEnabled()) + log.debug("Extending buffer to capacity=" + capacity + " bytes."); return capacity; This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site. |
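The scatter loop added to GASEngine above reduces to the following pattern. This is a standalone sketch with hypothetical names (submit, Long vertex ids) standing in for the engine's FutureTask plumbing and IV type:

import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class ScheduleWithDedup {

    /**
     * Schedule one task per distinct vertex. When the frontier is not
     * compact (may contain duplicates), a single-threaded HashSet elides
     * the duplicate work at scheduling time; when it is compact, the set
     * is skipped entirely.
     */
    static void scheduleAll(final List<Long> frontier, final boolean compact) {
        final Set<Long> scheduled = compact ? null : new HashSet<Long>(frontier.size());
        for (Long v : frontier) {
            if (scheduled != null && !scheduled.add(v)) {
                continue; // duplicate: already scheduled in this round.
            }
            submit(v);
        }
    }

    private static void submit(final Long v) {
        // Stand-in for wrapping the vertex task in a FutureTask and
        // handing it to the executor service.
        System.out.println("scheduled vertex " + v);
    }

    public static void main(String[] args) {
        scheduleAll(Arrays.asList(1L, 2L, 2L, 3L, 1L), false /* compact */);
    }
}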
From: <tho...@us...> - 2013-08-28 20:01:53
Revision: 7357 http://bigdata.svn.sourceforge.net/bigdata/?rev=7357&view=rev Author: thompsonbry Date: 2013-08-28 20:01:46 +0000 (Wed, 28 Aug 2013) Log Message: ----------- Modified StaticFrontier2 to clear out any non-null elements in the backing array. This addresses a GC leak during the computation. Modified TLScheduler to clear the map of per-thread schedulers rather than clearing each per-thread scheduler. This is a big hot spot. See #629 Modified Paths: -------------- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/StaticFrontier2.java branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/scheduler/TLScheduler.java Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/StaticFrontier2.java =================================================================== --- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/StaticFrontier2.java 2013-08-28 19:47:36 UTC (rev 7356) +++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/StaticFrontier2.java 2013-08-28 20:01:46 UTC (rev 7357) @@ -119,6 +119,18 @@ } /* + * Null fill until the end of the last frontier. That will help out GC. + * Otherwise those IV references are pinned and can hang around. We + * could track the high water mark on the backing array for this + * purpose. + */ + for (int i = nvertices; i < a.length; i++) { + if (a[i] == null) + break; + a[i] = null; + } + + /* * Take a slice of the backing showing only the valid entries and use it * to replace the view of the backing array. */ Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/scheduler/TLScheduler.java =================================================================== --- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/scheduler/TLScheduler.java 2013-08-28 19:47:36 UTC (rev 7356) +++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/scheduler/TLScheduler.java 2013-08-28 20:01:46 UTC (rev 7357) @@ -122,13 +122,16 @@ /* * Clear the per-thread maps, but do not discard. They will be reused in * the next round. + * + * FIXME This is a big cost. Try simply clearing [map] and see if that + * is less expensive. */ - for (STScheduler s : map.values()) { - - s.clear(); - - } - +// for (STScheduler s : map.values()) { +// +// s.clear(); +// +// } + map.clear(); } @Override This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site. |
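The GC fix reduces to one small helper: when a reused backing array's logical size shrinks, the stale references past the new end must be nulled or they remain strongly reachable. A sketch with a generic array standing in for IV[]:

final class ClearTail {

    /**
     * Null out entries from [newSize] up to the first slot that is already
     * null. Stopping at the first null works because the array is only ever
     * filled as a prefix, so everything beyond the old high-water mark is
     * already clear.
     */
    static <T> void clearTail(final T[] a, final int newSize) {
        for (int i = newSize; i < a.length; i++) {
            if (a[i] == null)
                break;
            a[i] = null;
        }
    }
}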
From: <tho...@us...> - 2013-08-28 19:47:42
Revision: 7356 http://bigdata.svn.sourceforge.net/bigdata/?rev=7356&view=rev Author: thompsonbry Date: 2013-08-28 19:47:36 +0000 (Wed, 28 Aug 2013) Log Message: ----------- Deprecated the old StaticFrontier class. Refactored the new StaticFrontier2 class so that the time required to copy in the new frontier and the time to sort it show up clearly in the profiler and can be clearly distinguished. Modified Paths: -------------- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/StaticFrontier.java branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/StaticFrontier2.java Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/StaticFrontier.java =================================================================== --- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/StaticFrontier.java 2013-08-28 19:17:16 UTC (rev 7355) +++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/StaticFrontier.java 2013-08-28 19:47:36 UTC (rev 7356) @@ -16,6 +16,7 @@ * kinds of mutation. * * @author <a href="mailto:tho...@us...">Bryan Thompson</a> + * @deprecated by {@link StaticFrontier2}, which is more efficient. */ @SuppressWarnings("rawtypes") public class StaticFrontier implements IStaticFrontier { Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/StaticFrontier2.java =================================================================== --- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/StaticFrontier2.java 2013-08-28 19:17:16 UTC (rev 7355) +++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/StaticFrontier2.java 2013-08-28 19:47:36 UTC (rev 7356) @@ -67,19 +67,7 @@ return vertices.iterator(); } - - // private void clear() { - // - // vertices.clear(); - // - // } - - // private void schedule(IV v) { - // - // vertices.add(v); - // - // } - + /** * Setup the same static frontier object for the new compact fronter (it is * reused in each round). @@ -88,6 +76,32 @@ public void resetFrontier(final int minCapacity, final boolean ordered, final Iterator<IV> itr) { + copyScheduleIntoFrontier(minCapacity, itr); + + if (!ordered) { + + /* + * Sort the current slice of the backing array. + */ + + Arrays.sort(backing.array(), 0/* fromIndex */, vertices.len()/* toIndex */); + + } + + } + + /** + * Copy the data from the iterator into the backing array and update the + * slice which provides our exposed view of the backing array. + * + * @param minCapacity + * The minimum required capacity for the backing array. + * @param itr + * The source from which we will repopulate the backing array. + */ + private void copyScheduleIntoFrontier(final int minCapacity, + final Iterator<IV> itr) { + // ensure enough capacity for the new frontier. backing.ensureCapacity(minCapacity); @@ -104,19 +118,12 @@ } - if (!ordered) { + /* + * Take a slice of the backing showing only the valid entries and use it + * to replace the view of the backing array. + */ + this.vertices = backing.slice(0/* off */, nvertices); - // Sort. - Arrays.sort(a, 0/* fromIndex */, nvertices/* toIndex */); - - } - - // take a slice of the backing showing only the valid entries. - final IArraySlice<IV> tmp = backing.slice(0/* off */, nvertices); - - // update the view. 
- this.vertices = tmp; - } } \ No newline at end of file This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site. |
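The refactoring here is method extraction for profiler visibility: each phase gets its own stack frame, so a sampling profiler can attribute time to the copy and the sort separately. A simplified sketch of the shape (hypothetical names, not the actual StaticFrontier2 code):

import java.util.Arrays;
import java.util.Iterator;

final class PhaseSplit<T extends Comparable<T>> {

    private final T[] a;  // reusable backing array
    private int size;     // logical size of the current frontier

    @SuppressWarnings("unchecked")
    PhaseSplit(final int capacity) {
        a = (T[]) new Comparable[capacity];
    }

    /** Entry point: two named phases rather than one monolithic loop. */
    void reset(final Iterator<T> itr, final boolean ordered) {
        copyIn(itr);      // shows up as its own frame in the profiler.
        if (!ordered)
            sortPrefix(); // ditto.
    }

    // Assumes sufficient capacity; the real class grows the backing array.
    private void copyIn(final Iterator<T> itr) {
        size = 0;
        while (itr.hasNext())
            a[size++] = itr.next();
    }

    private void sortPrefix() {
        Arrays.sort(a, 0, size);
    }
}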
From: <tho...@us...> - 2013-08-28 19:17:22
Revision: 7355 http://bigdata.svn.sourceforge.net/bigdata/?rev=7355&view=rev Author: thompsonbry Date: 2013-08-28 19:17:16 +0000 (Wed, 28 Aug 2013) Log Message: ----------- javadoc Modified Paths: -------------- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/scheduler/TLScheduler.java Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/scheduler/TLScheduler.java =================================================================== --- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/scheduler/TLScheduler.java 2013-08-28 19:14:10 UTC (rev 7354) +++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/scheduler/TLScheduler.java 2013-08-28 19:17:16 UTC (rev 7355) @@ -46,6 +46,13 @@ * This is used to sort the thread-local frontier (that is, the frontier * for a single thread). The backing array will grow as necessary and is * reused in each round. + * <P> + * Note: The schedule (for each thread) is using a set - see the + * {@link STScheduler} base class. This means that the schedule (for + * each thread) is compact, but not ordered. We need to use (and re-use) + * an array to order that compact per-thread schedule. The compact + * per-thread schedules are then combined into a single compact frontier + * for the new round. */ private final ManagedArray<IV> tmp; This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site. |
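The design the javadoc describes pairs two structures: a hash set for O(1) duplicate elimination while vertices are scheduled, and an array that gives the ordering pass contiguous input. A minimal illustration (the real code reuses a growable array across rounds rather than allocating one per round):

import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

final class CompactThenOrder {

    public static void main(String[] args) {
        // Phase 1: compact but unordered - the set drops duplicates as
        // vertices are scheduled.
        final Set<Long> schedule = new LinkedHashSet<Long>();
        for (long v : new long[] { 7, 3, 7, 1, 3 })
            schedule.add(v);

        // Phase 2: copy into an array and sort to get index order.
        final Long[] ordered = schedule.toArray(new Long[0]);
        Arrays.sort(ordered);

        System.out.println(Arrays.toString(ordered)); // [1, 3, 7]
    }
}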
From: <tho...@us...> - 2013-08-28 19:14:18
Revision: 7354 http://bigdata.svn.sourceforge.net/bigdata/?rev=7354&view=rev Author: thompsonbry Date: 2013-08-28 19:14:10 +0000 (Wed, 28 Aug 2013) Log Message: ----------- Modified the thread-local scheduler to reuse the same backing array to sort the thread-local frontier in each round. Modified Paths: -------------- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/scheduler/TLScheduler.java branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/util/GASImplUtil.java Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/scheduler/TLScheduler.java =================================================================== --- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/scheduler/TLScheduler.java 2013-08-28 18:57:07 UTC (rev 7353) +++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/scheduler/TLScheduler.java 2013-08-28 19:14:10 UTC (rev 7354) @@ -14,12 +14,11 @@ import com.bigdata.rdf.graph.IStaticFrontier; import com.bigdata.rdf.graph.impl.GASEngine; import com.bigdata.rdf.graph.impl.util.GASImplUtil; +import com.bigdata.rdf.graph.impl.util.IArraySlice; import com.bigdata.rdf.graph.impl.util.ManagedArray; import com.bigdata.rdf.graph.impl.util.MergeSortIterator; import com.bigdata.rdf.internal.IV; -import cutthecrap.utils.striterators.ArrayIterator; - /** * This scheduler uses thread-local buffers ({@link LinkedHashSet}) to track the * distinct vertices scheduled by each execution thread. After the computation @@ -34,35 +33,35 @@ @SuppressWarnings("rawtypes") public class TLScheduler implements IGASSchedulerImpl { -// /** -// * Class bundles a reusable, extensible array for sorting the thread-local -// * frontier. -// * -// * @author <a href="mailto:tho...@us...">Bryan -// * Thompson</a> -// */ -// private static class MySTScheduler extends STScheduler { -// -// /** -// * This is used to sort the thread-local frontier (that is, the frontier -// * for a single thread). The backing array will grow as necessary and is -// * reused in each round. -// */ -// private final ManagedArray<IV> tmp; -// -// public MySTScheduler(final GASEngine gasEngine) { -// -// super(gasEngine); -// -// tmp = new ManagedArray<IV>(IV.class, 64); -// -// } -// -// } + /** + * Class bundles a reusable, extensible array for sorting the thread-local + * frontier. + * + * @author <a href="mailto:tho...@us...">Bryan + * Thompson</a> + */ + private static class MySTScheduler extends STScheduler { + + /** + * This is used to sort the thread-local frontier (that is, the frontier + * for a single thread). The backing array will grow as necessary and is + * reused in each round. 
+ */ + private final ManagedArray<IV> tmp; + + public MySTScheduler(final GASEngine gasEngine) { + + super(gasEngine); + + tmp = new ManagedArray<IV>(IV.class, 64); + + } + + } // class MySTScheduler private final GASEngine gasEngine; private final int nthreads; - private final ConcurrentHashMap<Long/* threadId */, STScheduler> map; + private final ConcurrentHashMap<Long/* threadId */, MySTScheduler> map; public TLScheduler(final GASEngine gasEngine) { @@ -70,7 +69,7 @@ this.nthreads = gasEngine.getNThreads(); - this.map = new ConcurrentHashMap<Long, STScheduler>( + this.map = new ConcurrentHashMap<Long, MySTScheduler>( nthreads/* initialCapacity */, .75f/* loadFactor */, nthreads); } @@ -79,11 +78,11 @@ final Long id = Thread.currentThread().getId(); - STScheduler s = map.get(id); + MySTScheduler s = map.get(id); if (s == null) { - final IGASScheduler old = map.putIfAbsent(id, s = new STScheduler( + final IGASScheduler old = map.putIfAbsent(id, s = new MySTScheduler( gasEngine)); if (old != null) { @@ -131,38 +130,38 @@ /* * Extract a sorted, compact frontier from each thread local frontier. */ - final IV[][] frontiers = new IV[nthreads][]; + @SuppressWarnings("unchecked") + final IArraySlice<IV>[] frontiers = new IArraySlice[nthreads]; int nsources = 0; int nvertices = 0; { - final List<Callable<IV[]>> tasks = new ArrayList<Callable<IV[]>>( + final List<Callable<IArraySlice<IV>>> tasks = new ArrayList<Callable<IArraySlice<IV>>>( nthreads); - for (STScheduler s : map.values()) { - final STScheduler t = s; - tasks.add(new Callable<IV[]>() { + for (MySTScheduler s : map.values()) { + final MySTScheduler t = s; + tasks.add(new Callable<IArraySlice<IV>>() { @Override - public IV[] call() throws Exception { - return GASImplUtil.compactAndSort(t.vertices); + public IArraySlice<IV> call() throws Exception { + return GASImplUtil.compactAndSort(t.vertices, t.tmp); } }); } // invokeAll() - futures will be done() before it returns. - final List<Future<IV[]>> futures; + final List<Future<IArraySlice<IV>>> futures; try { futures = gasEngine.getGASThreadPool().invokeAll(tasks); } catch (InterruptedException e) { throw new RuntimeException(e); } - for (Future<IV[]> f : futures) { + for (Future<IArraySlice<IV>> f : futures) { - final IV[] b; try { - b = frontiers[nsources] = f.get(); - nvertices += b.length; + final IArraySlice<IV> b = frontiers[nsources] = f.get(); + nvertices += b.len(); nsources++; } catch (InterruptedException e) { throw new RuntimeException(e); @@ -223,7 +222,7 @@ * The new frontier to be populated. */ private void mergeSortSourcesAndSetFrontier(final int nsources, - final int nvertices, final IV[][] frontiers, + final int nvertices, final IArraySlice<IV>[] frontiers, final IStaticFrontier frontier) { // wrap IVs[] as Iterators. 
@@ -232,7 +231,7 @@ for (int i = 0; i < nsources; i++) { - itrs[i] = new ArrayIterator<IV>(frontiers[i]); + itrs[i] = frontiers[i].iterator(); } Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/util/GASImplUtil.java =================================================================== --- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/util/GASImplUtil.java 2013-08-28 18:57:07 UTC (rev 7353) +++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/util/GASImplUtil.java 2013-08-28 19:14:10 UTC (rev 7354) @@ -17,45 +17,45 @@ @SuppressWarnings({ "unchecked", "rawtypes" }) public static final Iterator<IV> EMPTY_VERTICES_ITERATOR = EmptyIterator.DEFAULT; - /** - * Compact a collection of vertices into an ordered frontier. - * - * @param vertices - * The collection of vertices for the new frontier. - * - * @return The compact, ordered frontier. - * - * @deprecated This implementation fails to reuse/grow the array for each - * round. This causes a lot of avoidable heap pressure during - * the single-threaded execution between each round and is a - * large percentage of the total runtime costs of the engine! - */ - @Deprecated - @SuppressWarnings("rawtypes") - public static IV[] compactAndSort(final Set<IV> vertices) { - - final IV[] a; - - final int size = vertices.size(); - - /* - * FIXME FRONTIER: Grow/reuse this array for each round! This is 15% of - * all time in the profiler. The #1 hot spot with the CHMScheduler. We - * need to reuse the target array!!! - */ - vertices.toArray(a = new IV[size]); - - /* - * Order for index access. An ordered scan on a B+Tree is 10X faster - * than random access lookups. - * - * Note: This uses natural V order, which is also the index order. - */ - java.util.Arrays.sort(a); - - return a; - - } +// /** +// * Compact a collection of vertices into an ordered frontier. +// * +// * @param vertices +// * The collection of vertices for the new frontier. +// * +// * @return The compact, ordered frontier. +// * +// * @deprecated This implementation fails to reuse/grow the array for each +// * round. This causes a lot of avoidable heap pressure during +// * the single-threaded execution between each round and is a +// * large percentage of the total runtime costs of the engine! +// */ +// @Deprecated +// @SuppressWarnings("rawtypes") +// public static IV[] compactAndSort(final Set<IV> vertices) { +// +// final IV[] a; +// +// final int size = vertices.size(); +// +// /* +// * FRONTIER: Grow/reuse this array for each round! This is 15% of +// * all time in the profiler. The #1 hot spot with the CHMScheduler. We +// * need to reuse the target array!!! +// */ +// vertices.toArray(a = new IV[size]); +// +// /* +// * Order for index access. An ordered scan on a B+Tree is 10X faster +// * than random access lookups. +// * +// * Note: This uses natural V order, which is also the index order. +// */ +// java.util.Arrays.sort(a); +// +// return a; +// +// } /** * Compact a collection of vertices into an ordered frontier. @@ -69,7 +69,7 @@ * @return A slice onto just the new frontier. */ @SuppressWarnings("rawtypes") - public static IArraySlice compactAndSort(final Set<IV> vertices, + public static IArraySlice<IV> compactAndSort(final Set<IV> vertices, final IManagedArray<IV> buffer) { final int nvertices = vertices.size(); @@ -93,8 +93,6 @@ * B+Tree is 10X faster than random access lookups. * * Note: This uses natural V order, which is also the index order. 
- * - * FIXME FRONTIER : We should parallelize this sort! */ java.util.Arrays.sort(a, 0/* fromIndex */, nvertices/* toIndex */); This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site. |
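Once each per-thread segment has been compacted and sorted in parallel, a single k-way merge produces the ordered global frontier; that is the role of MergeSortIterator. A self-contained sketch of a k-way merge using a priority queue (note this sketch keeps cross-segment duplicates rather than eliding them):

import java.util.Arrays;
import java.util.Comparator;
import java.util.PriorityQueue;

final class KWayMerge {

    /** Merge k individually-sorted arrays into one sorted array. */
    static long[] merge(final long[][] segments) {
        int total = 0;
        for (long[] s : segments)
            total += s.length;

        // Heap entries: {value, segment index, position within segment}.
        final PriorityQueue<long[]> heap = new PriorityQueue<long[]>(
                Math.max(1, segments.length), new Comparator<long[]>() {
                    @Override
                    public int compare(final long[] x, final long[] y) {
                        return Long.compare(x[0], y[0]);
                    }
                });
        for (int i = 0; i < segments.length; i++)
            if (segments[i].length > 0)
                heap.add(new long[] { segments[i][0], i, 0 });

        final long[] out = new long[total];
        int n = 0;
        while (!heap.isEmpty()) {
            final long[] e = heap.poll();
            out[n++] = e[0];
            final int seg = (int) e[1], pos = (int) e[2] + 1;
            if (pos < segments[seg].length)
                heap.add(new long[] { segments[seg][pos], seg, pos });
        }
        return out;
    }

    public static void main(String[] args) {
        final long[][] segs = { { 1, 4, 9 }, { 2, 3 }, { 5 } };
        System.out.println(Arrays.toString(merge(segs))); // [1, 2, 3, 4, 5, 9]
    }
}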
From: <tho...@us...> - 2013-08-28 18:57:18
Revision: 7353 http://bigdata.svn.sourceforge.net/bigdata/?rev=7353&view=rev Author: thompsonbry Date: 2013-08-28 18:57:07 +0000 (Wed, 28 Aug 2013) Log Message: ----------- Added a managed array abstraction and replaced the static frontier implementation with that abstraction. The code paths that compact and sort the frontier have been modified to reuse an array that grows as necessary. This should reduce the overhead associated with memory allocation for the frontier. I still need to optimize this for the thread-local frontier. I still need to use a parallel sort for the frontiers (other than the thread-local one, which already breaks the sort into N threads since each per-thread frontier is sorted independently). Modified Paths: -------------- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/IStaticFrontier.java branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/GASEngine.java branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/GASState.java branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/bd/BigdataGASEngine.java branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/bd/BigdataGASState.java branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/scheduler/CHMScheduler.java branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/scheduler/CHSScheduler.java branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/scheduler/STScheduler.java branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/scheduler/TLScheduler.java branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/util/GASImplUtil.java Added Paths: ----------- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/StaticFrontier.java branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/StaticFrontier2.java branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/util/IArraySlice.java branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/util/IIntArraySlice.java branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/util/IManagedArray.java branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/util/IManagedIntArray.java branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/util/ManagedArray.java branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/util/ManagedIntArray.java Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/IStaticFrontier.java =================================================================== --- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/IStaticFrontier.java 2013-08-28 14:24:03 UTC (rev 7352) +++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/IStaticFrontier.java 2013-08-28 18:57:07 UTC (rev 7353) @@ -33,9 +33,11 @@ * The minimum capacity of the new frontier. (A minimum capacity * is specified since many techniques to compact the frontier can * only estimate the required capacity.) + * @param ordered + * <code>true</code> iff the frontier is known to be ordered. * @param vertices * The vertices in the new frontier. 
*/ - void resetFrontier(int minCapacity, Iterator<IV> vertices); - + void resetFrontier(int minCapacity, boolean ordered, Iterator<IV> vertices); + } Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/GASEngine.java =================================================================== --- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/GASEngine.java 2013-08-28 14:24:03 UTC (rev 7352) +++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/GASEngine.java 2013-08-28 18:57:07 UTC (rev 7353) @@ -326,6 +326,18 @@ } + /** + * Return an {@link IStaticFrontier} - this is the object that models the + * frontier that is consumed during a given round of evaluation. + * + * TODO Config overrides + */ + public IStaticFrontier newStaticFrontier() { + + return new StaticFrontier2(); + + } + public IGASSchedulerImpl newScheduler() { final Class<IGASSchedulerImpl> cls = schedulerClassRef.get(); @@ -352,9 +364,11 @@ final IGraphAccessor graphAccessor, final IGASProgram<VS, ES, ST> gasProgram) { + final IStaticFrontier frontier = newStaticFrontier(); + final IGASSchedulerImpl gasScheduler = newScheduler(); - return new GASState<VS, ES, ST>(graphAccessor, gasScheduler, gasProgram); + return new GASState<VS, ES, ST>(graphAccessor, frontier, gasScheduler, gasProgram); } Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/GASState.java =================================================================== --- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/GASState.java 2013-08-28 14:24:03 UTC (rev 7352) +++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/GASState.java 2013-08-28 18:57:07 UTC (rev 7353) @@ -1,9 +1,6 @@ package com.bigdata.rdf.graph.impl; -import java.util.ArrayList; -import java.util.Arrays; import java.util.HashSet; -import java.util.Iterator; import java.util.Set; import java.util.concurrent.ConcurrentHashMap; import java.util.concurrent.ConcurrentMap; @@ -22,8 +19,6 @@ import com.bigdata.rdf.internal.IV; import com.bigdata.rdf.spo.ISPO; -import cutthecrap.utils.striterators.ArrayIterator; - @SuppressWarnings("rawtypes") public class GASState<VS, ES, ST> implements IGASState<VS, ES, ST> { @@ -54,11 +49,11 @@ * <p> * Note: This data structure is reused for each round. 
* - * @see StaticFrontier - * @see CHMScheduler + * @see IStaticFrontier + * @see IGASSchedulerImpl * @see #scheduler */ - private final StaticFrontier frontier; + private final IStaticFrontier frontier; /** * Used to schedule the new frontier and then compact it onto @@ -94,13 +89,17 @@ private IGraphAccessor graphAccessor; public GASState(final IGraphAccessor graphAccessor, // + final IStaticFrontier frontier,// final IGASSchedulerImpl gasScheduler,// final IGASProgram<VS, ES, ST> gasProgram// - ) { + ) { if (graphAccessor == null) throw new IllegalArgumentException(); + if (frontier == null) + throw new IllegalArgumentException(); + if (gasScheduler == null) throw new IllegalArgumentException(); @@ -110,12 +109,12 @@ this.graphAccessor = graphAccessor; this.gasProgram = gasProgram; - + this.vsf = gasProgram.getVertexStateFactory(); this.esf = gasProgram.getEdgeStateFactory(); - this.frontier = new StaticFrontier(); + this.frontier = frontier; this.scheduler = gasScheduler; @@ -209,7 +208,8 @@ if (edgeState != null) edgeState.clear(); - frontier.resetFrontier(0/* minCapacity */, GASImplUtil.EMPTY_VERTICES_ITERATOR); + frontier.resetFrontier(0/* minCapacity */, true/* ordered */, + GASImplUtil.EMPTY_VERTICES_ITERATOR); } @@ -230,26 +230,20 @@ } - - // dense vector. - final IV[] a = tmp.toArray(new IV[tmp.size()]); - // Ascending order - Arrays.sort(a); - /* * Callback to initialize the vertex state before the first * iteration. */ - for(IV v : a) { + for(IV v : tmp) { gasProgram.init(this, v); } // Reset the frontier. - frontier.resetFrontier(a.length/* minCapacity */, - new ArrayIterator<IV>(a)); + frontier.resetFrontier(tmp.size()/* minCapacity */, false/* ordered */, + tmp.iterator()); } @@ -292,88 +286,6 @@ } - /** - * Simple implementation of a "static" frontier. - * <p> - * Note: This implementation has package private methods that permit certain - * kinds of mutation. - * - * @author <a href="mailto:tho...@us...">Bryan Thompson</a> - */ - static class StaticFrontier implements IStaticFrontier { - - private final ArrayList<IV> vertices; - - private StaticFrontier() { - - vertices = new ArrayList<IV>(); - - } - - @Override - public int size() { - - return vertices.size(); - - } - - @Override - public boolean isEmpty() { - - return vertices.isEmpty(); - - } - - @Override - public Iterator<IV> iterator() { - - return vertices.iterator(); - - } - - private void ensureCapacity(final int minCapacity) { - - vertices.ensureCapacity(minCapacity); - - } - -// private void clear() { -// -// vertices.clear(); -// -// } - -// private void schedule(IV v) { -// -// vertices.add(v); -// -// } - - /** - * Setup the same static frontier object for the new compact fronter (it - * is reused in each round). - */ - @Override - public void resetFrontier(final int minCapacity, final Iterator<IV> itr) { - - // clear the old frontier. - vertices.clear(); - - // ensure enough capacity for the new frontier. 
- ensureCapacity(minCapacity); - - while (itr.hasNext()) { - - final IV v = itr.next(); - - vertices.add(v); - - } - - } - - } - @Override public String toString(final ISPO e) { Added: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/StaticFrontier.java =================================================================== --- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/StaticFrontier.java (rev 0) +++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/StaticFrontier.java 2013-08-28 18:57:07 UTC (rev 7353) @@ -0,0 +1,107 @@ +package com.bigdata.rdf.graph.impl; + +import java.util.ArrayList; +import java.util.Arrays; +import java.util.Iterator; + +import com.bigdata.rdf.graph.IStaticFrontier; +import com.bigdata.rdf.internal.IV; + +import cutthecrap.utils.striterators.ArrayIterator; + +/** + * Simple implementation of a "static" frontier. + * <p> + * Note: This implementation has package private methods that permit certain + * kinds of mutation. + * + * @author <a href="mailto:tho...@us...">Bryan Thompson</a> + */ +@SuppressWarnings("rawtypes") +public class StaticFrontier implements IStaticFrontier { + + private final ArrayList<IV> vertices; + + StaticFrontier() { + + vertices = new ArrayList<IV>(); + + } + + @Override + public int size() { + + return vertices.size(); + + } + + @Override + public boolean isEmpty() { + + return vertices.isEmpty(); + + } + + @Override + public Iterator<IV> iterator() { + + return vertices.iterator(); + + } + + private void ensureCapacity(final int minCapacity) { + + vertices.ensureCapacity(minCapacity); + + } + + // private void clear() { + // + // vertices.clear(); + // + // } + + // private void schedule(IV v) { + // + // vertices.add(v); + // + // } + + @Override + public void resetFrontier(final int minCapacity, final boolean ordered, + Iterator<IV> itr) { + + if (!ordered) { + + final IV[] a = new IV[minCapacity]; + + int i=0; + while(itr.hasNext()) { + a[i++] = itr.next(); + } + + // sort. + Arrays.sort(a, 0/* fromIndex */, i/* toIndex */); + + // iterator over the ordered vertices. + itr = new ArrayIterator<IV>(a, 0/* fromIndex */, i/* len */); + + } + + // clear the old frontier. + vertices.clear(); + + // ensure enough capacity for the new frontier. + ensureCapacity(minCapacity); + + while (itr.hasNext()) { + + final IV v = itr.next(); + + vertices.add(v); + + } + + } + +} \ No newline at end of file Added: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/StaticFrontier2.java =================================================================== --- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/StaticFrontier2.java (rev 0) +++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/StaticFrontier2.java 2013-08-28 18:57:07 UTC (rev 7353) @@ -0,0 +1,122 @@ +package com.bigdata.rdf.graph.impl; + +import java.util.Arrays; +import java.util.Iterator; + +import com.bigdata.rdf.graph.IStaticFrontier; +import com.bigdata.rdf.graph.impl.util.IArraySlice; +import com.bigdata.rdf.graph.impl.util.IManagedArray; +import com.bigdata.rdf.graph.impl.util.ManagedArray; +import com.bigdata.rdf.internal.IV; + +/** + * An implementation of a "static" frontier that grows and reuses the backing + * vertex array. + * <p> + * Note: This implementation has package private methods that permit certain + * kinds of mutation. 
+ * + * @author <a href="mailto:tho...@us...">Bryan Thompson</a> + */ +@SuppressWarnings("rawtypes") +public class StaticFrontier2 implements IStaticFrontier { + + /** + * The backing structure. + */ + private final IManagedArray<IV> backing; + + /** + * A slice onto the {@link #backing} structure for the current frontier. + * This gets replaced when the frontier is changed. + */ + private IArraySlice<IV> vertices; + + StaticFrontier2() { + + /* + * The managed backing array. + */ + backing = new ManagedArray<IV>(IV.class); + + /* + * Initialize with an empty slice. The backing [] will grow as + * necessary. + */ + vertices = backing.slice(0/* off */, 0/* len */); + + } + + @Override + public int size() { + + return vertices.len(); + + } + + @Override + public boolean isEmpty() { + + return vertices.len() == 0; + + } + + @Override + public Iterator<IV> iterator() { + + return vertices.iterator(); + + } + + // private void clear() { + // + // vertices.clear(); + // + // } + + // private void schedule(IV v) { + // + // vertices.add(v); + // + // } + + /** + * Setup the same static frontier object for the new compact fronter (it is + * reused in each round). + */ + @Override + public void resetFrontier(final int minCapacity, final boolean ordered, + final Iterator<IV> itr) { + + // ensure enough capacity for the new frontier. + backing.ensureCapacity(minCapacity); + + // the actual backing array. should not changed since pre-extended. + final IV[] a = backing.array(); + + int nvertices = 0; + + while (itr.hasNext()) { + + final IV v = itr.next(); + + a[nvertices++] = v; + + } + + if (!ordered) { + + // Sort. + Arrays.sort(a, 0/* fromIndex */, nvertices/* toIndex */); + + } + + // take a slice of the backing showing only the valid entries. + final IArraySlice<IV> tmp = backing.slice(0/* off */, nvertices); + + // update the view. 
+ this.vertices = tmp; + + } + +} \ No newline at end of file Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/bd/BigdataGASEngine.java =================================================================== --- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/bd/BigdataGASEngine.java 2013-08-28 14:24:03 UTC (rev 7352) +++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/bd/BigdataGASEngine.java 2013-08-28 18:57:07 UTC (rev 7353) @@ -19,6 +19,7 @@ import com.bigdata.rdf.graph.IGASSchedulerImpl; import com.bigdata.rdf.graph.IGASState; import com.bigdata.rdf.graph.IGraphAccessor; +import com.bigdata.rdf.graph.IStaticFrontier; import com.bigdata.rdf.graph.impl.GASContext; import com.bigdata.rdf.graph.impl.GASEngine; import com.bigdata.rdf.internal.IV; @@ -138,10 +139,13 @@ final IGraphAccessor graphAccessor, final IGASProgram<VS, ES, ST> gasProgram) { + final IStaticFrontier frontier = newStaticFrontier(); + final IGASSchedulerImpl gasScheduler = newScheduler(); return new BigdataGASState<VS, ES, ST>( - (BigdataGraphAccessor) graphAccessor, gasScheduler, gasProgram); + (BigdataGraphAccessor) graphAccessor, frontier, gasScheduler, + gasProgram); } Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/bd/BigdataGASState.java =================================================================== --- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/bd/BigdataGASState.java 2013-08-28 14:24:03 UTC (rev 7352) +++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/bd/BigdataGASState.java 2013-08-28 18:57:07 UTC (rev 7353) @@ -9,6 +9,7 @@ import com.bigdata.rdf.graph.IGASProgram; import com.bigdata.rdf.graph.IGASSchedulerImpl; +import com.bigdata.rdf.graph.IStaticFrontier; import com.bigdata.rdf.graph.impl.GASState; import com.bigdata.rdf.graph.impl.bd.BigdataGASEngine.BigdataGraphAccessor; import com.bigdata.rdf.internal.IV; @@ -28,10 +29,11 @@ } public BigdataGASState(final BigdataGraphAccessor graphAccessor, + final IStaticFrontier frontier, final IGASSchedulerImpl gasScheduler, final IGASProgram<VS, ES, ST> gasProgram) { - super(graphAccessor, gasScheduler, gasProgram); + super(graphAccessor, frontier, gasScheduler, gasProgram); } Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/scheduler/CHMScheduler.java =================================================================== --- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/scheduler/CHMScheduler.java 2013-08-28 14:24:03 UTC (rev 7352) +++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/scheduler/CHMScheduler.java 2013-08-28 18:57:07 UTC (rev 7353) @@ -5,11 +5,8 @@ import com.bigdata.rdf.graph.IGASSchedulerImpl; import com.bigdata.rdf.graph.IStaticFrontier; import com.bigdata.rdf.graph.impl.GASEngine; -import com.bigdata.rdf.graph.impl.util.GASImplUtil; import com.bigdata.rdf.internal.IV; -import cutthecrap.utils.striterators.ArrayIterator; - /** * A simple scheduler based on a {@link ConcurrentHashMap}. 
* @@ -44,10 +41,9 @@ @Override public void compactFrontier(final IStaticFrontier frontier) { - final IV[] a = GASImplUtil.compactAndSort(vertices.keySet()); - - frontier.resetFrontier(a.length, new ArrayIterator<IV>(a)); - + frontier.resetFrontier(vertices.size()/* minCapacity */, + false/* ordered */, vertices.keySet().iterator()); + } } // CHMScheduler \ No newline at end of file Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/scheduler/CHSScheduler.java =================================================================== --- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/scheduler/CHSScheduler.java 2013-08-28 14:24:03 UTC (rev 7352) +++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/scheduler/CHSScheduler.java 2013-08-28 18:57:07 UTC (rev 7353) @@ -5,11 +5,8 @@ import com.bigdata.rdf.graph.IGASSchedulerImpl; import com.bigdata.rdf.graph.IStaticFrontier; import com.bigdata.rdf.graph.impl.GASEngine; -import com.bigdata.rdf.graph.impl.util.GASImplUtil; import com.bigdata.rdf.internal.IV; -import cutthecrap.utils.striterators.ArrayIterator; - /** * A simple scheduler based on a concurrent hash collection * @@ -47,10 +44,9 @@ @Override public void compactFrontier(final IStaticFrontier frontier) { - final IV[] a = GASImplUtil.compactAndSort(vertices); + frontier.resetFrontier(vertices.size()/* minCapacity */, + false/* ordered */, vertices.iterator()); - frontier.resetFrontier(a.length, new ArrayIterator<IV>(a)); - } } // CHMScheduler \ No newline at end of file Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/scheduler/STScheduler.java =================================================================== --- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/scheduler/STScheduler.java 2013-08-28 14:24:03 UTC (rev 7352) +++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/scheduler/STScheduler.java 2013-08-28 18:57:07 UTC (rev 7353) @@ -6,11 +6,8 @@ import com.bigdata.rdf.graph.IGASSchedulerImpl; import com.bigdata.rdf.graph.IStaticFrontier; import com.bigdata.rdf.graph.impl.GASEngine; -import com.bigdata.rdf.graph.impl.util.GASImplUtil; import com.bigdata.rdf.internal.IV; -import cutthecrap.utils.striterators.ArrayIterator; - /** * A scheduler suitable for a single thread. 
* @@ -37,10 +34,9 @@ @Override public void compactFrontier(final IStaticFrontier frontier) { - final IV[] a = GASImplUtil.compactAndSort(vertices); + frontier.resetFrontier(vertices.size()/* minCapacity */, + false/* ordered */, vertices.iterator()); - frontier.resetFrontier(a.length, new ArrayIterator<IV>(a)); - } @Override Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/scheduler/TLScheduler.java =================================================================== --- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/scheduler/TLScheduler.java 2013-08-28 14:24:03 UTC (rev 7352) +++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/scheduler/TLScheduler.java 2013-08-28 18:57:07 UTC (rev 7353) @@ -14,6 +14,7 @@ import com.bigdata.rdf.graph.IStaticFrontier; import com.bigdata.rdf.graph.impl.GASEngine; import com.bigdata.rdf.graph.impl.util.GASImplUtil; +import com.bigdata.rdf.graph.impl.util.ManagedArray; import com.bigdata.rdf.graph.impl.util.MergeSortIterator; import com.bigdata.rdf.internal.IV; @@ -33,6 +34,32 @@ @SuppressWarnings("rawtypes") public class TLScheduler implements IGASSchedulerImpl { +// /** +// * Class bundles a reusable, extensible array for sorting the thread-local +// * frontier. +// * +// * @author <a href="mailto:tho...@us...">Bryan +// * Thompson</a> +// */ +// private static class MySTScheduler extends STScheduler { +// +// /** +// * This is used to sort the thread-local frontier (that is, the frontier +// * for a single thread). The backing array will grow as necessary and is +// * reused in each round. +// */ +// private final ManagedArray<IV> tmp; +// +// public MySTScheduler(final GASEngine gasEngine) { +// +// super(gasEngine); +// +// tmp = new ManagedArray<IV>(IV.class, 64); +// +// } +// +// } + private final GASEngine gasEngine; private final int nthreads; private final ConcurrentHashMap<Long/* threadId */, STScheduler> map; @@ -155,7 +182,7 @@ * The new frontier is empty. */ - frontier.resetFrontier(0/* minCapacity */, + frontier.resetFrontier(0/* minCapacity */, true/* ordered */, GASImplUtil.EMPTY_VERTICES_ITERATOR); return; @@ -212,20 +239,9 @@ // merge sort of those iterators. final Iterator<IV> itr = new MergeSortIterator(itrs); - frontier.resetFrontier(nvertices/* minCapacity */, itr); + frontier.resetFrontier(nvertices/* minCapacity */, true/* ordered */, + itr); - // // ensure enough capacity for the new frontier. - // frontier.ensureCapacity(nvertices); - // - // // and populate the new frontier. - // while (itr.hasNext()) { - // - // final IV v = itr.next(); - // - // frontier.vertices.add(v); - // - // } - } } \ No newline at end of file Modified: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/util/GASImplUtil.java =================================================================== --- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/util/GASImplUtil.java 2013-08-28 14:24:03 UTC (rev 7352) +++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/util/GASImplUtil.java 2013-08-28 18:57:07 UTC (rev 7353) @@ -24,7 +24,13 @@ * The collection of vertices for the new frontier. * * @return The compact, ordered frontier. + * + * @deprecated This implementation fails to reuse/grow the array for each + * round. 
This causes a lot of avoidable heap pressure during + * the single-threaded execution between each round and is a + * large percentage of the total runtime costs of the engine! */ + @Deprecated @SuppressWarnings("rawtypes") public static IV[] compactAndSort(final Set<IV> vertices) { @@ -32,7 +38,11 @@ final int size = vertices.size(); - // TODO FRONTIER: Could reuse this array for each round! + /* + * FIXME FRONTIER: Grow/reuse this array for each round! This is 15% of + * all time in the profiler. The #1 hot spot with the CHMScheduler. We + * need to reuse the target array!!! + */ vertices.toArray(a = new IV[size]); /* @@ -47,4 +57,50 @@ } + /** + * Compact a collection of vertices into an ordered frontier. + * + * @param vertices + * The collection of vertices for the new frontier. + * @param buffer + * The backing buffer for the new frontier - it will be resized + * if necessary. + * + * @return A slice onto just the new frontier. + */ + @SuppressWarnings("rawtypes") + public static IArraySlice compactAndSort(final Set<IV> vertices, + final IManagedArray<IV> buffer) { + + final int nvertices = vertices.size(); + + // ensure buffer has sufficient capacity. + buffer.ensureCapacity(nvertices); + + // backing array reference (presized above). + final IV[] a = buffer.array(); + + // copy frontier into backing array. + int i = 0; + for (IV v : vertices) { + + a[i++] = v; + + } + + /* + * Order the frontier for efficient index access. An ordered scan on a + * B+Tree is 10X faster than random access lookups. + * + * Note: This uses natural V order, which is also the index order. + * + * FIXME FRONTIER : We should parallelize this sort! + */ + java.util.Arrays.sort(a, 0/* fromIndex */, nvertices/* toIndex */); + + // A view onto just the new frontier. + return buffer.slice(0/* off */, nvertices/* len */); + + } + } Added: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/util/IArraySlice.java =================================================================== --- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/util/IArraySlice.java (rev 0) +++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/util/IArraySlice.java 2013-08-28 18:57:07 UTC (rev 7353) @@ -0,0 +1,128 @@ +package com.bigdata.rdf.graph.impl.util; + +/** + * Interface for a slice of a backing array. + * + * @author <a href="mailto:tho...@us...">Bryan Thompson</a> + * @version $Id: IByteArraySlice.java 4548 2011-05-25 19:36:34Z thompsonbry $ + */ +public interface IArraySlice<T> extends Iterable<T> { + + /** + * The backing array. This method DOES NOT guarantee that the backing array + * reference will remain constant. Some implementations use an extensible + * backing array and will replace the reference when the backing buffer is + * extended. + */ + T[] array(); + + /** + * The start of the slice in the {@link #array()}. + */ + int off(); + + /** + * The length of the slice in the {@link #array()}. + */ + int len(); + + /** + * Return a copy of the data in the slice. + * + * @return A new array containing data in the slice. + */ + T[] toArray(); + + /** + * Return a slice of the backing buffer. The slice will always reference the + * current backing {@link #array()}, even when the buffer is extended and + * the array reference is replaced. + * + * @param off + * The starting offset into the backing buffer of the slice. + * @param len + * The length of that slice. + * + * @return The slice. 
+ */ + IArraySlice<T> slice(final int off, final int len); + + /** + * Absolute put of a value at an index. + * + * @param pos + * The index. + * @param v + * The value. + */ + void put(int pos, T v); + + /** + * Absolute get of a value at an index. + * + * @param pos + * The index. + * + * @return The value. + */ + T get(int pos); + + /** + * Absolute bulk <i>put</i> copies all values in the caller's array into + * this buffer starting at the specified position within the slice defined + * by this buffer. + * + * @param pos + * The starting position within the slice defined by this buffer. + * @param src + * The source data. + */ + void put(int pos, T[] src); + + /** + * Absolute bulk <i>put</i> copies the specified slice of values + * from the caller's array into this buffer starting at the specified + * position within the slice defined by this buffer. + * + * @param dstoff + * The offset into the slice to which the data will be copied. + * @param src + * The source data. + * @param srcoff + * The offset of the 1st value in the source data to + * be copied. + * @param srclen + * The #of values to be copied. + */ + void put(int dstoff, T[] src, int srcoff, int srclen); + + /** + * Absolute bulk <i>get</i> copies <code>dst.length</code> values + * from the specified offset into the slice defined by this buffer into the + * caller's array. + * + * @param srcoff + * The offset into the slice of the first value to be copied. + * @param dst + * The array into which the data will be copied. + */ + void get(final int srcoff, final T[] dst); + + /** + * Absolute bulk <i>get</i> copies the specified slice of values + * from this buffer into the specified slice of the caller's array. + * + * @param srcoff + * The offset into the slice defined by this buffer of the first + * value to be copied. + * @param dst + * The array into which the data will be copied. + * @param dstoff + * The offset of the first value in that array onto + * which the data will be copied. + * @param dstlen + * The #of values to be copied. + */ + void get(final int srcoff, final T[] dst, final int dstoff, final int dstlen); + +} Added: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/util/IIntArraySlice.java =================================================================== --- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/util/IIntArraySlice.java (rev 0) +++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/util/IIntArraySlice.java 2013-08-28 18:57:07 UTC (rev 7353) @@ -0,0 +1,129 @@ +package com.bigdata.rdf.graph.impl.util; + +/** + * Interface for a slice of a backing int[]. + * + * @author <a href="mailto:tho...@us...">Bryan Thompson</a> + * @version $Id: IByteArraySlice.java 4548 2011-05-25 19:36:34Z thompsonbry $ + */ +public interface IIntArraySlice { + + /** + * The backing array. This method DOES NOT guarantee that the backing array + * reference will remain constant. Some implementations use an extensible + * backing array and will replace the reference when the backing buffer is + * extended. + */ + int[] array(); + + /** + * The start of the slice in the {@link #array()}. + */ + int off(); + + /** + * The length of the slice in the {@link #array()}. + */ + int len(); + + /** + * Return a copy of the data in the slice. + * + * @return A new array containing data in the slice. + */ + int[] toArray(); + + /** + * Return a slice of the backing buffer. 
The slice will always reference the + * current backing {@link #array()}, even when the buffer is extended and + * the array reference is replaced. + * + * @param off + * The starting offset into the backing buffer of the slice. + * @param len + * The length of that slice. + * + * @return The slice. + */ + IIntArraySlice slice(final int off, final int len); + + /** + * Absolute put of a value at an index. + * + * @param pos + * The index. + * @param v + * The value. + */ + void putInt(int pos, int v); + + /** + * Absolute get of a value at an index. + * + * @param pos + * The index. + * + * @return The value. + */ + int getInt(int pos); + + /** + * Absolute bulk <i>put</i> copies all <code>int</code>s in the caller's + * array into this buffer starting at the specified position within the + * slice defined by this buffer. + * + * @param pos + * The starting position within the slice defined by this buffer. + * @param src + * The source data. + */ + void put(int pos, int[] src); + + /** + * Absolute bulk <i>put</i> copies the specified slice of <code>int</code>s + * from the caller's array into this buffer starting at the specified + * position within the slice defined by this buffer. + * + * @param dstoff + * The offset into the slice to which the data will be copied. + * @param src + * The source data. + * @param srcoff + * The offset of the 1st <code>int</code> in the source data to + * be copied. + * @param srclen + * The #of <code>int</code>s to be copied. + */ + void put(int dstoff, int[] src, int srcoff, int srclen); + + /** + * Absolute bulk <i>get</i> copies <code>dst.length</code> <code>int</code>s + * from the specified offset into the slice defined by this buffer into the + * caller's array. + * + * @param srcoff + * The offset into the slice of the first <code>int</code> to be copied. + * @param dst + * The array into which the data will be copied. + */ + void get(final int srcoff, final int[] dst); + + /** + * Absolute bulk <i>get</i> copies the specified slice of <code>int</code>s + * from this buffer into the specified slice of the caller's array. + * + * @param srcoff + * The offset into the slice defined by this buffer of the first + * <code>int</code> to be copied. + * @param dst + * The array into which the data will be copied. + * @param dstoff + * The offset of the first <code>int</code> in that array onto + * which the data will be copied. + * @param dstlen + * The #of <code>int</code>s to be copied. + */ + void get(final int srcoff, final int[] dst, final int dstoff, + final int dstlen); + +} Added: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/util/IManagedArray.java =================================================================== --- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/util/IManagedArray.java (rev 0) +++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/util/IManagedArray.java 2013-08-28 18:57:07 UTC (rev 7353) @@ -0,0 +1,26 @@ +package com.bigdata.rdf.graph.impl.util; + +/** + * An interface for a managed array. Implementations of this interface may + * permit transparent extension of the managed array. + * + * @author <a href="mailto:tho...@us...">Bryan Thompson</a> + * @version $Id: IManagedByteArray.java 4548 2011-05-25 19:36:34Z thompsonbry $ + */ +public interface IManagedArray<T> extends IArraySlice<T> { + + /** + * Return the capacity of the backing buffer. 
+ */ + int capacity(); + + /** + * Ensure that the buffer capacity is a least <i>capacity</i> total values. + * The buffer may be grown by this operation but it will not be truncated. + * + * @param capacity + * The minimum #of values in the buffer. + */ + void ensureCapacity(int capacity); + +} Added: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/util/IManagedIntArray.java =================================================================== --- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/util/IManagedIntArray.java (rev 0) +++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/util/IManagedIntArray.java 2013-08-28 18:57:07 UTC (rev 7353) @@ -0,0 +1,26 @@ +package com.bigdata.rdf.graph.impl.util; + +/** + * An interface for a managed int[]. Implementations of this interface may + * permit transparent extension of the managed int[]. + * + * @author <a href="mailto:tho...@us...">Bryan Thompson</a> + * @version $Id: IManagedByteArray.java 4548 2011-05-25 19:36:34Z thompsonbry $ + */ +public interface IManagedIntArray extends IIntArraySlice { + + /** + * Return the capacity of the backing buffer. + */ + int capacity(); + + /** + * Ensure that the buffer capacity is a least <i>capacity</i> total values. + * The buffer may be grown by this operation but it will not be truncated. + * + * @param capacity + * The minimum #of values in the buffer. + */ + void ensureCapacity(int capacity); + +} Added: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/util/ManagedArray.java =================================================================== --- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/util/ManagedArray.java (rev 0) +++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/util/ManagedArray.java 2013-08-28 18:57:07 UTC (rev 7353) @@ -0,0 +1,531 @@ +package com.bigdata.rdf.graph.impl.util; + +import java.util.Iterator; + +import org.apache.log4j.Logger; + +import com.bigdata.io.ByteArrayBuffer; + +import cutthecrap.utils.striterators.ArrayIterator; + +/** + * A view on a mutable int[] that may be extended. + * <p> + * Note: The backing int[] always has an {@link #off() offset} of ZERO (0) and a + * {@link #len() length} equal to the capacity of the backing int[]. + * <p> + * This class is NOT thread-safe for mutation. The operation which replaces the + * {@link #array()} when the capacity of the backing buffer must be extended is + * not atomic. + * + * @author <a href="mailto:tho...@us...">Bryan Thompson</a> + * @version $Id: ByteArrayBuffer.java 6279 2012-04-12 15:27:30Z thompsonbry $ + * + * TODO Refactor to create a test suite for this. Especially the slice + * methods since the slice relies on a nested inner class for its + * semantics. + */ +public class ManagedArray<T> implements IManagedArray<T> { + + private static final transient Logger log = Logger + .getLogger(ManagedArray.class); + + /** + * The default capacity of the buffer. + */ + final public static int DEFAULT_INITIAL_CAPACITY = 128;// 1024; + + /** + * The {@link Class} of the elements of the array. This is required in order + * to safely allocate new arrays of the same type. + */ + private final Class<? extends T> elementClass; + + /** + * The backing array. This is re-allocated whenever the capacity of the + * buffer is too small and reused otherwise. 
+ */ + private T[] buf; + + /** + * {@inheritDoc} This is re-allocated whenever the capacity of the buffer is + * too small and reused otherwise. + */ + @Override + final public T[] array() { + + return buf; + + } + + /** + * {@inheritDoc} + * <p> + * The offset of the slice into the backing byte[] is always zero. + */ + @Override + final public int off() { + + return 0; + + } + + /** + * {@inheritDoc} + * <p> + * The length of the slice is always the capacity of the backing byte[]. + */ + @Override + final public int len() { + + return buf.length; + + } + + /** + * Return a new instance of an array of the correct generic type. + * + * @param capacity + * The capacity of the array. + * + * @return The array. + */ + @SuppressWarnings("unchecked") + private T[] newArray(final int capacity) { + + return (T[]) java.lang.reflect.Array + .newInstance(elementClass, capacity); + + } + + /** + * Throws exception unless the value is non-negative. + * + * @param msg + * The exception message. + * @param v + * The value. + * + * @return The value. + * + * @exception IllegalArgumentException + * unless the value is non-negative. + */ + protected static int assertNonNegative(final String msg, final int v) { + + if (v < 0) + throw new IllegalArgumentException(msg); + + return v; + + } + + /** + * Creates a buffer with an initial capacity of + * {@value #DEFAULT_INITIAL_CAPACITY} bytes. The capacity of the buffer will + * be automatically extended as required. + */ + public ManagedArray(final Class<T> elementClass) { + + this(elementClass, DEFAULT_INITIAL_CAPACITY); + + } + + /** + * Creates a buffer with the specified initial capacity. The capacity of the + * buffer will be automatically extended as required. + * + * @param initialCapacity + * The initial capacity. + */ + public ManagedArray(final Class<? extends T> elementClass, + final int initialCapacity) { + + if (elementClass == null) + throw new IllegalArgumentException(); + + this.elementClass = elementClass; + + this.buf = newArray(assertNonNegative("initialCapacity", + initialCapacity)); + + } + + /** + * Create a view wrapping the entire array. + * <p> + * Note: the caller's reference will be used until and unless the array is + * grown, at which point the caller's reference will be replaced by a larger + * array having the same data. + * + * @param array + * The array. + */ + @SuppressWarnings("unchecked") + public ManagedArray(final T[] array) { + + if (array == null) + throw new IllegalArgumentException(); + + this.elementClass = (Class<? extends T>) array.getClass() + .getComponentType(); + + this.buf = array; + + } + + @Override + final public String toString() { + + return getClass().getName() + "{capacity=" + capacity() + "}"; + + } + + @Override + final public void ensureCapacity(final int capacity) { + + if (capacity < 0) + throw new IllegalArgumentException(); + + if (buf == null) { + + buf = newArray(capacity); + + return; + + } + + final int overflow = capacity - buf.length; + + if (overflow > 0) { + + // Extend to at least the target capacity. + final T[] tmp = newArray(extend(capacity)); + + // copy all bytes to the new byte[]. + System.arraycopy(buf, 0, tmp, 0, buf.length); + + // update the reference to use the new byte[]. + buf = tmp; + + } + + } + + @Override + final public int capacity() { + + return buf == null ? 0 : buf.length; + + } + + /** + * Return the new capacity for the buffer (default is always large enough + * and will normally double the buffer capacity each time it overflows). 
+ * + * @param required + * The minimum required capacity. + * + * @return The new capacity. + * + * @todo this does not need to be final. also, caller's could set the policy + * including a policy that refuses to extend the capacity. + */ + private int extend(final int required) { + + final int capacity = Math.max(required, capacity() * 2); + + if (log.isInfoEnabled()) + log.info("Extending buffer to capacity=" + capacity + " bytes."); + + return capacity; + + } + + @Override + public Iterator<T> iterator() { + + return new ArrayIterator<T>(array(), off(), len()); + + } + + /* + * Absolute put/get methods. + */ + + @Override + final public void put(final int pos, // + final T[] b) { + + put(pos, b, 0, b.length); + + } + + @Override + final public void put(final int pos,// + final T[] b, final int off, final int len) { + + ensureCapacity(pos + len); + + System.arraycopy(b, off, buf, pos, len); + + } + + @Override + final public void get(final int srcoff, final T[] dst) { + + get(srcoff, dst, 0/* dstoff */, dst.length); + + } + + @Override + final public void get(final int srcoff, final T[] dst, final int dstoff, + final int dstlen) { + + System.arraycopy(buf, srcoff, dst, dstoff, dstlen); + + } + + @Override + final public void put(final int pos, final T v) { + + if (pos + 1 > buf.length) + ensureCapacity(pos + 1); + + buf[pos] = v; + + } + + @Override + final public T get(final int pos) { + + return buf[pos]; + + } + + @Override + final public T[] toArray() { + + final T[] tmp = newArray(buf.length); + + System.arraycopy(buf, 0, tmp, 0, buf.length); + + return tmp; + + } + + @Override + public IArraySlice<T> slice(final int off, final int len) { + + return new SliceImpl(off, len); + + } + + /** + * A slice of the outer {@link ByteArrayBuffer}. The slice will always + * reflect the backing {@link #array()} for the instance of the outer class. + * + * @author <a href="mailto:tho...@us...">Bryan + * Thompson</a> + */ + private class SliceImpl implements IArraySlice<T> { + + /** + * The start of the slice in the {@link #array()}. + */ + private final int off; + + /** + * The length of the slice in the {@link #array()}. + */ + private final int len; + + @Override + final public int off() { + + return off; + + } + + @Override + final public int len() { + + return len; + + } + + /** + * Protected constructor used to create a slice. The caller is + * responsible for verifying that the slice is valid for the backing + * byte[] buffer. + * + * @param off + * The offset of the start of the slice. + * @param len + * The length of the slice. + */ + protected SliceImpl(final int off, final int len) { + + if (off < 0) + throw new IllegalArgumentException("off<0"); + + if (len < 0) + throw new IllegalArgumentException("len<0"); + + this.off = off; + + this.len = len; + + } + + @Override + public String toString() { + + return super.toString() + "{off=" + off() + ",len=" + len() + "}"; + + } + + @Override + public T[] array() { + + return ManagedArray.this.array(); + + } + + /* + * Absolute get/put operations. + */ + + /** + * Verify that an operation starting at the specified offset into the + * slice and having the specified length is valid against the slice. + * + * @param aoff + * The offset into the slice. + * @param alen + * The #of bytes to be addressed starting from that offset. + * + * @return <code>true</code>. + * + * @throws IllegalArgumentException + * if the operation is not valid. 
+ */ + private boolean rangeCheck(final int aoff, final int alen) { + + if (aoff < 0) + throw new IndexOutOfBoundsException(); + + if (alen < 0) + throw new IndexOutOfBoundsException(); + + if ((aoff + alen) > len) { + + /* + * The operation run length at that offset would extend beyond + * the end of the slice. + */ + + throw new IndexOutOfBoundsException(); + + } + + return true; + + } + + @Override + final public void put(final int pos, final T[] b) { + + put(pos, b, 0, b.length); + + } + + @Override + final public void put(final int dstoff,// + final T[] src, final int srcoff, final int srclen) { + + assert rangeCheck(dstoff, srclen); + + System.arraycopy(src, srcoff, array(), off + dstoff, srclen); + + } + + @Override + final public void get(final int srcoff, final T[] dst) { + + get(srcoff, dst, 0/* dstoff */, dst.length); + + } + + @Override + final public void get(final int srcoff, final T[] dst, + final int dstoff, final int dstlen) { + + assert rangeCheck(srcoff, dstlen); + + System.arraycopy(array(), off + srcoff, dst, dstoff, dstlen); + + } + + @Override + final public void put(final int pos, final T v) { + + assert rangeCheck(pos, 1); + + array()[pos] = v; + + } + + @Override + final public T get(final int pos) { + + assert rangeCheck(pos, 1); + + final T v = array()[pos]; + + return v; + + } + + @Override + final public T[] toArray() { + + final T[] tmp = newArray(len); + + System.arraycopy(array(), off/* srcPos */, tmp/* dst */, + 0/* destPos */, len); + + return tmp; + + } + + @Override + public IArraySlice<T> slice(final int aoff, final int alen) { + + final ManagedArray<T> outer = ManagedArray.this; + + assert rangeCheck(aoff, alen); + + return new SliceImpl(off() + aoff, alen) { + + @Override + public T[] array() { + + return outer.array(); + + } + + }; + + } + + @Override + public Iterator<T> iterator() { + + return new ArrayIterator<T>(array(), off(), len()); + + } + + } // class SliceImpl + +} Added: branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/util/ManagedIntArray.java =================================================================== --- branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/util/ManagedIntArray.java (rev 0) +++ branches/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/java/com/bigdata/rdf/graph/impl/util/ManagedIntArray.java 2013-08-28 18:57:07 UTC (rev 7353) @@ -0,0 +1,470 @@ +package com.bigdata.rdf.graph.impl.util; + +import org.apache.log4j.Logger; + +import com.bigdata.io.ByteArrayBuffer; + +/** + * A view on a mutable int[] that may be extended. + * <p> + * Note: The backing int[] always has an {@link #off() offset} of ZERO (0) and a + * {@link #len() length} equal to the capacity of the backing int[]. + * <p> + * This class is NOT thread-safe for mutation. The operation which replaces the + * {@link #array()} when the capacity of the backing buffer must be extended is + * not atomic. + * + * @author <a href="mailto:tho...@us...">Bryan Thompson</a> + * @version $Id: ByteArrayBuffer.java 6279 2012-04-12 15:27:30Z thompsonbry $ + * + * TODO Refactor to create a test suite for this. + */ +public class ManagedIntArray implements IManagedIntArray { + + private static final transient Logger log = Logger + .getLogger(ManagedIntArray.class); + + /** + * The default capacity of the buffer. + */ + final public static int DEFAULT_INITIAL_CAPACITY = 128;// 1024; + + /** + * The backing array. This is re-allocated whenever the capacity of the + * buffer is too small and reused otherwise. 
+ */ + private int[] buf; + + /** + * {@inheritDoc} This is re-allocated whenever the capacity of the buffer is + * too small and reused otherwise. + */ + @Override + final public int[] array() { + + return buf; + + } + + /** + * {@inheritDoc} + * <p> + * The offset of the slice into the backing byte[] is always zero. + */ + @Override + final public int off() { + + return 0; + + } + + /** + * {@inheritDoc} + * <p> + * The length of the slice is always the capacity of the backing byte[]. + */ + @Override + final public int len() { + + return buf.length; + + } + + /** + * Throws exception unless the value is non-negative. + * + * @param msg + * The exception message. + * @param v + * The value. + * + * @return The value. + * + * @exception IllegalArgumentException + * unless the value is non-negative. + */ + protected static int assertNonNegative(final String msg, final int v) { + + if (v < 0) + throw new IllegalArgumentException(msg); + + return v; + + } + + /** + * Creates a buffer with an initial capacity of + * {@value #DEFAULT_INITIAL_CAPACITY} bytes. The capacity of the buffer will + * be automatically extended as required. + */ + public ManagedIntArray() { + + this(DEFAULT_INITIAL_CAPACITY); + + } + + /** + * Creates a buffer with the specified initial capacity. The capacity of the + * buffer will be automatically extended as required. + * + * @param initialCapacity + * The initial capacity. + */ + public ManagedIntArray(final int initialCapacity) { + + this.buf = new int[assertNonNegative("initialCapacity", initialCapacity)]; + + } + + /** + * Create a view wrapping the entire array. + * <p> + * Note: the caller's reference will be used until and unless the array is + * grown, at which point the caller's reference will be replaced by a larger + * array having the same data. + * + * @param array + * The array. + */ + public ManagedIntArray(final int[] array) { + + if (array == null) + throw new IllegalArgumentException(); + + this.buf = array; + + } + + final public void ensureCapacity(final int capacity) { + + if (capacity < 0) + throw new IllegalArgumentException(); + + if (buf == null) { + + buf = new int[capacity]; + + return; + + } + + final int overflow = capacity - buf.length; + + if (overflow > 0) { + + // Extend to at least the target capacity. + final int[] tmp = new int[extend(capacity)]; + + // copy all bytes to the new byte[]. + System.arraycopy(buf, 0, tmp, 0, buf.length); + + // update the reference to use the new byte[]. + buf = tmp; + + } + + } + + @Override + final public int capacity() { + + return buf == null ? 0 : buf.length; + + } + + /** + * Return the new capacity for the buffer (default is always large enough + * and will normally double the buffer capacity each time it overflows). + * + * @param required + * The minimum required capacity. + * + * @return The new capacity. + * + * @todo this does not need to be final. also, caller's could set the policy + * including a policy that refuses to extend the capacity. + */ + private int extend(final int required) { + + final int capacity = Math.max(required, capacity() * 2); + + if (log.isInfoEnabled()) + log.info("Extending buffer to capacity=" + capacity + " bytes."); + + return capacity; + + } + + /* + * Absolute put/get methods. 
+ */ + + @Override + final public void put(final int pos, // + final int[] b) { + + put(pos, b, 0, b.length); + + } + + @Override + final public void put(final int pos,// + final int[] b, final int off, final int len) { + + ensureCapacity(pos + len); + + System.arraycopy(b, off, buf, pos, len); + + } + + @Override + final public void get(final int srcoff, final int[] dst) { + + get(srcoff, dst, 0/* dstoff */, dst.length); + + } + + @Override + final public void get(final int srcoff, final int[] dst, final int dstoff, + final int dstlen) { + + System.arraycopy(buf, srcoff, dst, dstoff, dstlen); + + } + + @Override + final public void putInt(final int pos, final int v) { + + if (pos + 1 > buf.length) + ensureCapacity(pos + 1); + + buf[pos] = v; + + } + + @Override + final public int getInt(final int pos) { + + return buf[pos]; + + } + + @Override + final public int[] toArray() { + + final int[] tmp = new int[buf.length]; + + System.arraycopy(buf, 0, tmp, 0, buf.length); + + return tmp; + + } + + @Override + public IIntArraySlice slice(final int off, final int len) { + + return new SliceImpl(off, len); + + } + + /** + * A slice of the outer {@link ByteArrayBuffer}. The slice will always + * reflect the backing {@link #array()} for the instance of the o... [truncated message content] |
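The frontier refactoring in this commit replaces the deprecated one-argument compactAndSort(Set), which allocated a fresh IV[] on every round (the "15% of all time in the profiler" hot spot flagged in GASImplUtil), with a ManagedArray whose backing array grows geometrically and is reused from round to round. The sketch below illustrates that reuse pattern in isolation. It is a minimal, hedged illustration only: the names ReusableBuffer and compactAndSort, and the use of long keys in place of IV, are stand-ins for the ManagedArray machinery, not bigdata code.

import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

/**
 * Minimal sketch of the buffer-reuse pattern adopted in r7353. The
 * names and the long keys are illustrative assumptions, not bigdata code.
 */
public class FrontierReuseSketch {

    /** A growable long[] reused across rounds, in the spirit of ManagedArray. */
    static final class ReusableBuffer {

        private long[] buf = new long[64]; // initial capacity (arbitrary)

        /** Grow to at least minCapacity; never shrink, so later rounds reuse it. */
        void ensureCapacity(final int minCapacity) {
            if (minCapacity > buf.length) {
                // At least double, so the amortized copy cost is O(1) per element.
                buf = Arrays.copyOf(buf, Math.max(minCapacity, buf.length * 2));
            }
        }

        long[] array() {
            return buf;
        }

    }

    /**
     * Copy the unordered vertex set into the reusable buffer and sort only
     * the occupied prefix. Returns the frontier size; the caller reads
     * buffer.array()[0..n) as the compact, ordered frontier.
     */
    static int compactAndSort(final Set<Long> vertices, final ReusableBuffer buffer) {

        final int n = vertices.size();

        // Reuse: no allocation unless the frontier outgrew the buffer.
        buffer.ensureCapacity(n);

        final long[] a = buffer.array();

        int i = 0;
        for (long v : vertices) {
            a[i++] = v;
        }

        // Sort once per round: an ordered index scan beats random lookups.
        Arrays.sort(a, 0, n);

        return n;
    }

    public static void main(final String[] args) {

        final ReusableBuffer buffer = new ReusableBuffer(); // lives across rounds

        final int n = compactAndSort(new TreeSet<>(Arrays.asList(42L, 7L, 19L)), buffer);

        System.out.println(Arrays.toString(Arrays.copyOf(buffer.array(), n))); // [7, 19, 42]
    }

}

Contrast this with the deprecated variant above, whose vertices.toArray(new IV[size]) call pays a full allocation on every round; geometric growth plus reuse turns that per-round cost into a one-time cost once the buffer reaches steady-state size.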
From: <tho...@us...> - 2013-08-28 14:24:10
Revision: 7352 http://bigdata.svn.sourceforge.net/bigdata/?rev=7352&view=rev Author: thompsonbry Date: 2013-08-28 14:24:03 +0000 (Wed, 28 Aug 2013) Log Message: ----------- Exposed the HAClient so a Callable running on the HAJournalServer process can cast the IIndexManager to an HAJournal and then access the HAClient. ((HAJournal)indexManager).getHAClient() See #728 (Refactor to create HAClient) Modified Paths: -------------- branches/READ_CACHE2/bigdata-jini/src/java/com/bigdata/journal/jini/ha/HAJournal.java Modified: branches/READ_CACHE2/bigdata-jini/src/java/com/bigdata/journal/jini/ha/HAJournal.java =================================================================== --- branches/READ_CACHE2/bigdata-jini/src/java/com/bigdata/journal/jini/ha/HAJournal.java 2013-08-28 14:23:42 UTC (rev 7351) +++ branches/READ_CACHE2/bigdata-jini/src/java/com/bigdata/journal/jini/ha/HAJournal.java 2013-08-28 14:24:03 UTC (rev 7352) @@ -245,6 +245,18 @@ } /** + * Return the {@link HAClient} instance that is in use by the + * {@link HAJournalServer}. + * + * @see HAClient#getConnection() + */ + public HAClient getHAClient() { + + return server.getHAClient(); + + } + + /** * The {@link HAJournalServer} instance that is managing this * {@link HAJournal}. */
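For readers wiring a task into the server process, the access pattern named in the log message looks roughly like the following when packaged as a Callable. This is a hedged sketch, not code from the commit: the cast-and-accessor chain ((HAJournal) indexManager).getHAClient() and the HAClient#getConnection() reference come from the diff above, but the import paths and the way the task is handed its IIndexManager are assumptions.

import java.util.concurrent.Callable;

import com.bigdata.journal.IIndexManager;    // assumed package
import com.bigdata.journal.jini.ha.HAClient; // assumed package
import com.bigdata.journal.jini.ha.HAJournal;

/**
 * Sketch of a Callable running inside the HAJournalServer process that
 * reaches the HAClient via the accessor added in r7352. How the task
 * obtains its IIndexManager is service-specific and assumed here.
 */
public class ResolveHAClientTask implements Callable<HAClient> {

    private final IIndexManager indexManager;

    public ResolveHAClientTask(final IIndexManager indexManager) {
        this.indexManager = indexManager;
    }

    @Override
    public HAClient call() {
        // The pattern from the log message: on the HAJournalServer the
        // IIndexManager is an HAJournal, so cast and use the new accessor.
        final HAClient client = ((HAJournal) indexManager).getHAClient();
        // See HAClient#getConnection() (cited in the javadoc above) for
        // obtaining the live connection once the client is in hand.
        return client;
    }

}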