<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Recent changes to Home</title><link>https://sourceforge.net/p/crgrep/wiki/Home/</link><description>Recent changes to Home</description><atom:link href="https://sourceforge.net/p/crgrep/wiki/Home/feed" rel="self"/><language>en</language><lastBuildDate>Tue, 11 Aug 2015 22:25:03 -0000</lastBuildDate><atom:link href="https://sourceforge.net/p/crgrep/wiki/Home/feed" rel="self" type="application/rss+xml"/><item><title>Discussion for Home page</title><link>https://sourceforge.net/p/crgrep/wiki/Home/</link><description>&lt;div class="markdown_content"&gt;&lt;p&gt;One of the documents I ship is a ROADMAP.txt with various notes I keep on feature ideas for future crgrep releases. I'll post the latest (1.0.4) roadmap here rather than require a full download to view it. I'm looking for comments on the list I've compiled and any new resource types you suggest I might add to the list. &lt;/p&gt;
&lt;h1 id="roadmap"&gt;ROADMAP&lt;/h1&gt;
&lt;p&gt;Backlog of possible features, bugs, fixes, extensions.&lt;/p&gt;
&lt;h2 id="versions"&gt;Versions&lt;/h2&gt;
&lt;p&gt;Within the next couple of releases:&lt;br/&gt;
    ResultMatcher&lt;br/&gt;
    -v invert match&lt;br/&gt;
    filters / --include/--exclude options&lt;/p&gt;
&lt;h2 id="resource-types-general-new"&gt;Resource Types - General / New&lt;/h2&gt;
&lt;p&gt;. other archive formats: bzip, ar, cpio, dump etc&lt;br/&gt;
. OpenOffice ODF formats&lt;br/&gt;
. RSS feeds. rss/torrent grep for title, description as 'category.title'. RSS/REST only.&lt;br/&gt;
   rss structure: &lt;a href="http://www.landofcode.com/rss-tutorials/rss-structure.php" rel="nofollow"&gt;http://www.landofcode.com/rss-tutorials/rss-structure.php&lt;/a&gt;&lt;br/&gt;
       'foo' 'channel-category.item-category'&lt;br/&gt;
       for &amp;lt;category&amp;gt; &lt;br/&gt;
. social network grep. fb/t/li&lt;br/&gt;
. Decompile .class files and grep result. Most likely candidate: &lt;br/&gt;
&lt;a href="https://bitbucket.org/mstrobel/procyon/wiki/Java%20Decompiler" rel="nofollow"&gt;https://bitbucket.org/mstrobel/procyon/wiki/Java%20Decompiler&lt;/a&gt;&lt;/p&gt;
&lt;h2 id="resource-types-web"&gt;Resource Types - Web&lt;/h2&gt;
&lt;p&gt;. HTTP search, follow links in HTML pages.&lt;br/&gt;
. HTTP search, support multiple resource list(s)&lt;/p&gt;
&lt;h2 id="resource-types-maven-etc"&gt;Resource Types - Maven, etc&lt;/h2&gt;
&lt;p&gt;. maven pom file, follow &amp;lt;modules&amp;gt; entries for child pom files&lt;br/&gt;
. gradle dependencies&lt;/p&gt;
&lt;h2 id="resource-types-file"&gt;Resource Types - File&lt;/h2&gt;
&lt;p&gt;. file search to support URL (file://..)&lt;br/&gt;
. MS docs - embedded docs (Word within Excel etc)&lt;br/&gt;
. file type specific settings, eg&lt;br/&gt;
     zip.password=password for decrypting zip files&lt;br/&gt;
     (doc,ppt,xls).password=password for decrypting a list of MS file types&lt;/p&gt;
&lt;h2 id="resource-types-database"&gt;Resource Types - Database&lt;/h2&gt;
&lt;p&gt;. Database column data containing URLs, option to follow these links.&lt;br/&gt;
. Database column data containing image data, apply OCR.&lt;br/&gt;
. database search, support multiple resource list(s)&lt;br/&gt;
. neo4j include relationship properties in search&lt;br/&gt;
. mongodb, other graph databases&lt;br/&gt;
. db search with &amp;gt;1 'tab.col'&lt;br/&gt;
. db query for collation rules for -i&lt;/p&gt;
&lt;h2 id="input-environment-command-line-options"&gt;Input / Environment / Command Line / Options&lt;/h2&gt;
&lt;p&gt;. More 'grep' features, -x (whole line match only), -s (no errors), -c (count) &lt;br/&gt;
    from &lt;a href="http://www.gnu.org/software/grep/manual/grep.html" rel="nofollow"&gt;http://www.gnu.org/software/grep/manual/grep.html&lt;/a&gt;&lt;br/&gt;
. Some 'find' features (time based options, -type, -user)&lt;br/&gt;
. read from a file containing resource paths&lt;br/&gt;
. options to output to file or stream (for use as a library)&lt;br/&gt;
. inclusion/exclusion filters for resource lists&lt;br/&gt;
    find like functionality&lt;br/&gt;
    -r -include '*.pdf' 'pattern' dir/to/search&lt;br/&gt;
       which will only look at pdf files buried inside stuff under dir/to/search&lt;br/&gt;
    -r -l -include '*.pdf' '*' dir/to/search&lt;br/&gt;
       lists all pdf files found under dir/to/search. Alias to crgrep --find '*.pdf' dir/to/search ... ??&lt;br/&gt;
. add user and system .crgrep?&lt;br/&gt;
. .crgrep file type specific passwords eg 'docx.password = 1234' &lt;br/&gt;
. Use Grep diffs:&lt;br/&gt;
    - grep uses "Binary file &amp;lt;f&amp;gt; matches"&lt;br/&gt;
    - grep doesn't show line numbers by default. Requires -n/--line-number for line numbers.&lt;br/&gt;
    - grep has -h to suppress filename on output, makes no sense for embedded grep. Doc 'diffs to normal grep'.&lt;br/&gt;
. --env to verify that all drivers and external libs/deps are available&lt;br/&gt;
. -l listing&lt;br/&gt;
   - means match by name&lt;br/&gt;
   - should mean list filename if contents match. Or have 2 options? 1: match name, don't open. 2: open, match data &amp;amp; if match show name only&lt;br/&gt;
   - some results include content.&lt;br/&gt;
. when -r is specified, '*.xml' will search inside archives for file name matches.&lt;/p&gt;
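&lt;p&gt;The inclusion/exclusion filters listed above could be sketched with the JDK's java.nio.file glob matchers. This is only a minimal illustration, not crgrep's implementation: the class GlobFilterSketch, the method accept and its parameters are hypothetical names, and it assumes globs are matched against the file name only.&lt;/p&gt;

```java
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

// Hypothetical helper, not part of crgrep: decides whether a resource
// passes an include glob and is not rejected by an exclude glob.
final class GlobFilterSketch {

    // Returns true when the file name of 'resource' matches includeGlob
    // and does not match excludeGlob (both are plain glob patterns).
    static boolean accept(String resource, String includeGlob, String excludeGlob) {
        PathMatcher include = FileSystems.getDefault().getPathMatcher("glob:" + includeGlob);
        PathMatcher exclude = FileSystems.getDefault().getPathMatcher("glob:" + excludeGlob);

        // Match on the last path element so '*.pdf' works at any depth.
        Path fileName = Paths.get(resource).getFileName();
        if (fileName == null) {
            return false;
        }
        if (!include.matches(fileName)) {
            return false;
        }
        return !exclude.matches(fileName);
    }
}
```

&lt;p&gt;For example, accept("dir/to/search/report.pdf", "*.pdf", "draft-*") would be expected to pass, while a file named draft-report.pdf would be filtered out by the exclude glob.&lt;/p&gt;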
&lt;h2 id="output-results-formatting-display"&gt;Output / Results / Formatting / Display&lt;/h2&gt;
&lt;p&gt;. swing gui, see &lt;a href="https://sourceforge.net/projects/grepgui"&gt;https://sourceforge.net/projects/grepgui&lt;/a&gt;&lt;br/&gt;
. chained output filters for line wrap, output formats&lt;br/&gt;
. formatted output: html, reports, tree view of nested results&lt;br/&gt;
     test.war&lt;br/&gt;
            |__foo.jar&lt;br/&gt;
                     |__src/com/bar&lt;br/&gt;
                                  |__test.java&lt;br/&gt;
     --&amp;gt;       if (matched == true) {&lt;br/&gt;
. context around match. Need class to capture every line tested, keep the last 5 for example.&lt;br/&gt;
   in ResourceMatcher?&lt;/p&gt;
&lt;h2 id="cleanup-bugs-issues-refactor-tech-debt"&gt;Cleanup / Bugs / Issues / Refactor / Tech Debt&lt;/h2&gt;
&lt;p&gt;. move ocr/ libs into lib/&lt;br/&gt;
. fix -Xdebug to not display file listings not searched such as binaries. Add to trace instead.&lt;/p&gt;
&lt;h2 id="documentation"&gt;Documentation&lt;/h2&gt;
&lt;p&gt;. Change all remaining docs to .md format&lt;br/&gt;
&lt;a href="http://daringfireball.net/projects/markdown/" rel="nofollow"&gt;http://daringfireball.net/projects/markdown/&lt;/a&gt;&lt;br/&gt;
  verify it displays correctly in bitbucket/sourceforge&lt;/p&gt;
&lt;h2 id="misc"&gt;Misc&lt;/h2&gt;
&lt;p&gt;. create github Pages&lt;br/&gt;
  ref to this for ideas: &lt;a href="http://thechangelog.com/top-ten-reasons-why-i-wont-use-your-open-source-project/" rel="nofollow"&gt;http://thechangelog.com/top-ten-reasons-why-i-wont-use-your-open-source-project/&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;. deps uplift. Tess4j is now at 2.0 beta (Tesseract 3.03)&lt;br/&gt;
  uplift to newer neo4j version&lt;br/&gt;
  uplift postgresql to 9.4 etc&lt;br/&gt;
  uplift sonatype to eclipse package&lt;br/&gt;
. gradle build system to replace pom.xml&lt;/p&gt;
&lt;h2 id="notes"&gt;Notes&lt;/h2&gt;
&lt;h2 id="fileresourcematcher"&gt;FileResourceMatcher:&lt;/h2&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;- interface to access all specified files
new (ResourceList, ExcludesGlob, IncludesGlob, isRecursive)

filters.addFilter(new FileFilter(excludeGlob));
filters.addFilter(new FileFilter(includeGlob));

for (res : resourceList.args) {
    // e.g. res = '**/?foo/*.tar'
    pathMatchers.add(new PathMatcher(res, isRecursive));
}
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;hasNext()&lt;br/&gt;
    p = getPathMatcher()&lt;br/&gt;
    return p == null ? false : p.hasNext()&lt;br/&gt;
getPathMatcher()&lt;br/&gt;
    if currPath.hasNext()&lt;br/&gt;
       return currPath&lt;br/&gt;
    return nextPathMatcher()&lt;br/&gt;
next()&lt;br/&gt;
       p-&amp;gt; nextPath()&lt;br/&gt;
           -&amp;gt; applyFilters &lt;br/&gt;
           -&amp;gt; next File(..)&lt;/p&gt;&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">craig</dc:creator><pubDate>Tue, 11 Aug 2015 22:25:03 -0000</pubDate><guid>https://sourceforge.neta2fe536f66cd5e5be335c61ecd08909505d74da4</guid></item><item><title>Home modified by craig</title><link>https://sourceforge.net/p/crgrep/wiki/Home/</link><description>&lt;div class="markdown_content"&gt;&lt;pre&gt;--- v1
+++ v2
@@ -1,8 +1,5 @@
-Welcome to your wiki!

-This is the default page, edit it as you see fit. To add a new page simply reference it within brackets, e.g.: [SamplePage].
-
-The wiki uses [Markdown](/p/crgrep/wiki/markdown_syntax/) syntax.
+A Grep utility to match text in documents, inside archives, database tables, images and other resource types.

 [[members limit=20]]
 [[download_button]]
&lt;/pre&gt;
&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">craig</dc:creator><pubDate>Sun, 01 Jun 2014 08:04:00 -0000</pubDate><guid>https://sourceforge.net5cca2b12b88ae96dbc67945c35d7b30a52584a80</guid></item><item><title>Home modified by craig</title><link>https://sourceforge.net/p/crgrep/wiki/Home/</link><description>&lt;div class="markdown_content"&gt;&lt;p&gt;Welcome to your wiki!&lt;/p&gt;
&lt;p&gt;This is the default page, edit it as you see fit. To add a new page simply reference it within brackets, e.g.: &lt;span&gt;[SamplePage]&lt;/span&gt;.&lt;/p&gt;
&lt;p&gt;The wiki uses &lt;a class="" href="/p/crgrep/wiki/markdown_syntax/"&gt;Markdown&lt;/a&gt; syntax.&lt;/p&gt;
&lt;p&gt;&lt;h6&gt;Project Members:&lt;/h6&gt;
&lt;ul class="md-users-list"&gt;
&lt;li&gt;&lt;a href="/u/cryandublin/"&gt;craig&lt;/a&gt; (admin)&lt;/li&gt;
&lt;/ul&gt;&lt;br /&gt;
&lt;/p&gt;&lt;p&gt;&lt;span class="download-button-51fa680b90954753227b8e68" style="margin-bottom: 1em; display: block;"&gt;&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">craig</dc:creator><pubDate>Thu, 01 Aug 2013 13:52:12 -0000</pubDate><guid>https://sourceforge.nete3e26afb8959840bbb7835499839ce41429ba5dc</guid></item></channel></rss>