Name | Modified | Size | Downloads / Week |
---|---|---|---|
README.md | 2023-04-23 | 8.0 kB | |
crgrep-1.0.6.zip | 2023-04-23 | 127.1 MB | |
crgrep-1.0.5.zip | 2016-01-20 | 57.4 MB | |
crgrep-1.0.4.zip | 2015-07-14 | 67.4 MB | |
crgrep-1.0.3.zip | 2015-05-20 | 67.9 MB | |
crgrep-1.0.2.zip | 2015-03-12 | 65.8 MB | |
crgrep-1.0.1.zip | 2015-02-09 | 53.7 MB | |
crgrep-1.0.0.zip | 2014-11-25 | 56.6 MB | |
README.txt | 2014-07-28 | 6.5 kB | |
crgrep-0.5.zip | 2014-06-01 | 24.7 MB | |
crgrep-0.4.zip | 2014-04-14 | 23.2 MB | |
crgrep-0.3.zip | 2013-10-23 | 16.8 MB | |
crgrep-0.2.zip | 2013-08-17 | 12.7 MB | |
crgrep-0.1.zip | 2013-08-01 | 12.7 MB | |
Totals: 14 Items | 585.9 MB | 0 |
Common Resource Grep
Version: 1.0.6, April 23rd, 2023.
(C) Copyright 2013-2023 Craig Ryan. All rights reserved.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License (LGPL) version 2.1 which accompanies this distribution, and is available at http://www.gnu.org/licenses/lgpl-2.1.html
CRGREP is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.
Description
CRGREP (COMMON RESOURCE GREP) is an open source command line utility that matches text against the names and content of resources whose data is difficult to access.
Search for pattern matches in database tables, ZIP and other archive files, MS Office formats, images and scanned documents, maven dependencies, web resources and combinations of supported resources nested within other resources.
Differences to plain old grep
CRGREP combines features of both find and grep for deep search and pattern matching. Users of plain old grep will notice some general differences. Not all standard grep options are provided, either because they are yet to be developed or because they do not make sense for CRGREP style search. CRGREP displays search results with details specific to the type of resource matched, such as the page number, the column name or the sheet number in Excel.
Documentation
README.md: (this file) discover CRGREP capabilities and usage.
INSTALL.txt: read this if you download a binary distribution ready to run. It is important to read this document, particularly the configuration details for some third party software.
BUILD.txt: read this if you download a source distribution to build CRGREP yourself.
CHANGELOG.txt: history of versions and changes in CRGREP.
This document is accompanied by additional documents:
docs/USAGE.txt usage, platform and configuration details
docs/FILE_GREP.txt file grep in detail
docs/DATABASE_GREP.txt database grep in detail
docs/WEB_GREP.txt web grep in detail
As this is a multi-platform download, a 'docs/unix/' directory also contains the additional documents in Unix/Mac friendly format.
CRGREP in action
Here are some examples showing what you can do with CRGREP.
1/ What files and data are nested anywhere under my 'target' directory matching 'key' including data buried inside archives?
$ crgrep -r key target
target/simple_file.txt: a key moment
target/misc.zip[misc/nested_monkey.txt]
target/monkey-pics.txt:1:A file about happy monkeys.
target/test-ear.ear[META-INF/MANIFEST.MF]:5:Created-By: Apache monkey
2/ Is there data in my database matching 'handle'? ('~/.crgrep' defines user/password)
$ crgrep -d -U "jdbc:sqlite:/databases/db.sqlite3" handle '*'
(relational database)
customers: [id,name,status,joined_date,handle]
customers: 3,Craig,active,2012-10-24 01:05:44,Craig's handle
tags: [id,tag]
tags: 4,handle
3/ Does my scanned report document contain the word 'report'?
$ crgrep --ocr report report_scan.png
report_scan.png:10: abc report for management
4/ Which of my Microsoft Office files mention 'profit'?
$ crgrep profit msoffice/*
agm.doc:2:4:The annual profit has risen this year
statement.xlsx:1:2:4:Annual profit:
board.ppt:3:Highlights contributing to profit figures
5/ Where is my favourite AC/DC track under an MP3 library?
$ crgrep "Back in Black" music/*
music/HellsBells.mp3: @{Album=Back In Black, TrackTitle=Hells Bells, Music By=Angus Young,..}
6/ Which of my photo images are tagged with comments of our holiday in Perth?
$ crgrep Perth pics/*.jpg
pics/pic1112.jpg: @{JpegComment=Lovely shot of Perth city, just arrived.}
pics/pic1113.jpg: @{JpegComment=Breakfast in a Perth cafe}
7/ Does the Google web page contain a 'favicon' reference?
$ crgrep google_favicon http://www.google.com
http://www.google.com:<!doctype html><html itemscope="itemscope" itemtype="http://schema.org/WebPage"><head><meta itemprop="image" content="/images/google_favicon_128.png"><title>Google</title>...
8/ Do I have any maven (POM) dependencies in my project with content matching 'RunWith'?
$ crgrep -m RunWith pom.xml
C:/Users/Craig/.m2/repository/junit/junit/4.8.2/junit-4.8.2.jar[org/junit/runner/RunWith.class]
CRGREP will search for text matches within any combination of supported resources contained within, or referenced by, other supported resources.
Calling CRGREP
Ensure you have at least Java 8 installed and that your environment has JAVA_HOME and PATH set correctly, for example:
$ set JAVA_HOME=C:\path\to\java1.8
$ set PATH=...;C:\path\to\crgrep\bin
All CRGREP operations involve a similar set of command line arguments:
$ crgrep <pattern> <resource path(s)>
Simple wildcards '*' and '?' may be specified in 'pattern' while 'resource path' supports full glob pattern search including 'ant style' glob (such as 'a/**/*.txt').
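As a rough illustration of the two kinds of matching described above, the following Python sketch translates '*' and '?' into a regular expression and expands an ant style glob with pathlib. It is not CRGREP's implementation. The function names (wildcard_to_regex, line_matches, expand_resources) are hypothetical, and the assumption that a pattern may match anywhere in the text is inferred from the examples in this README. pathlib's '**' handling is only an approximation of ant style globbing.

```python
import re
from pathlib import Path
from typing import List


def wildcard_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate '*' and '?' into regex syntax; every other character is literal."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")   # any run of characters
        elif ch == "?":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


def line_matches(pattern: str, text: str) -> bool:
    # Assumption: the pattern may match anywhere in the text, as with grep.
    return wildcard_to_regex(pattern).search(text) is not None


def expand_resources(root: str, glob: str) -> List[str]:
    # Path.glob treats '**' as "this directory and all subdirectories".
    return sorted(p.as_posix() for p in Path(root).glob(glob) if p.is_file())
```

For instance, line_matches("k?y", "a key moment") is true, while expand_resources(".", "a/**/*.txt") lists every .txt file at any depth under the directory 'a'.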
To read from standard input (stdin) either specify no <resource path> or provide a hyphen '-':
$ cat file.txt | crgrep sometext [-]
See docs/USAGE.txt for further usage details.
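The stdin convention above (no resource path, or a single hyphen) is common to many command line tools. The Python sketch below shows one way such a rule could be expressed. The names read_from_stdin and grep_stream are hypothetical, plain substring matching is used for brevity, and nothing here reflects how CRGREP itself handles its input.

```python
import io
from typing import Iterator, List, TextIO, Tuple


def read_from_stdin(resource_paths: List[str]) -> bool:
    """True when no resource path is given, or the only path is '-'."""
    return not resource_paths or resource_paths == ["-"]


def grep_stream(needle: str, stream: TextIO) -> Iterator[Tuple[int, str]]:
    """Yield (line number, line) for each line of the stream containing needle."""
    for lineno, line in enumerate(stream, start=1):
        if needle in line:
            yield lineno, line.rstrip("\n")


# Demonstration with an in-memory stream standing in for sys.stdin.
demo = io.StringIO("first line\nsometext here\n")
print(list(grep_stream("sometext", demo)))  # [(2, 'sometext here')]
```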
Displaying Results
Results are displayed in a format that depends on the type of resource and include the name of the resource, any nesting information, line and page numbers and any matching text.
The basic format for displaying results for file based resources is:
<resource>[[:pagenum]:linenum:matching_content]
Some examples of results displayed by CRGREP:
Output | Match
------------------------------------- | -----------------------------
src/foo.java | File listing match
src/bar.txt:25:some text | File content match (+lineno)
lib/all.zip[image.gif] | Archive file listing match
lib/app.war[WEB-INF/web.xml]:6:<d..> | Archive file content match
pom.xml->stuff.zip[doc.txt] | File listing match
mypic.jpg: @{Size=25,Com=Scene} | File meta-data match
TAB: [COL1,COL2,COL3] | Table column name match
TAB: data1,data2,data3 | Table data match
Node[1]:{name:"John"} | Graph database node match
sample.pdf:1:1:Sample PDF Document | Text extracted from a PDF
| (+pageno and +linenum)
report.docx:2:Second paragraph | MS Word text (+paragraph)
See docs/USAGE.txt for a detailed description of output formats.
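For scripts that post-process CRGREP output, the basic file format '<resource>[[:pagenum]:linenum:matching_content]' can be split apart with a regular expression. The Python sketch below is one possible approach, not part of CRGREP. The Result type and parse_result function are hypothetical names. The sketch assumes the resource name contains no colon (so Windows drive letters and URLs are not handled), and it covers neither meta-data lines nor database lines. Matching content that itself begins with 'digits:' is inherently ambiguous in this format and may be read as a line number.

```python
import re
from typing import NamedTuple, Optional


class Result(NamedTuple):
    resource: str
    page: Optional[int]
    line: Optional[int]
    text: Optional[str]


# resource, optional ':pagenum', then ':linenum:' and the matching content.
_CONTENT = re.compile(
    r"^(?P<resource>[^:]+)(?::(?P<page>\d+))?:(?P<line>\d+):(?P<text>.*)$"
)


def parse_result(output_line: str) -> Result:
    m = _CONTENT.match(output_line)
    if m is None:
        # No line number present: treat the whole line as a listing match.
        return Result(output_line, None, None, None)
    page = int(m["page"]) if m["page"] is not None else None
    return Result(m["resource"], page, int(m["line"]), m["text"])
```

With the table above as input, 'sample.pdf:1:1:Sample PDF Document' yields page 1 and line 1, while 'lib/all.zip[image.gif]' is returned as a listing match with no line number.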
File Grep
A file grep will search for textual matches in the following resource types:
- Plain text files similar to normal grep
- Resources within archives such as ZIP, TAR, WAR, EAR and JAR formats
- Metadata in images (jpeg, gif, etc.) and MP3 audio files
- Text in scanned documents (jpeg/gif/tiff/bmp/png), extracted using Optical Character Recognition (OCR) techniques.
- Text extracted from PDF files and MS Office formats (doc[x], xls[x], ppt[x])
- Maven POM files, following dependency trees of resource artifacts
The file docs/FILE_GREP.txt provides more details on file based grep.
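To make the idea of searching inside archives concrete, here is a minimal Python sketch that reports both entry names and entry content matching a pattern, using labels in the 'archive[entry]' style shown earlier in this README. It handles ZIP files only, does not descend into archives nested within archives, and decodes every entry as UTF-8 text. The function name grep_zip is hypothetical, and the sketch should not be read as CRGREP's actual algorithm.

```python
import io
import re
import zipfile
from typing import Iterator


def grep_zip(pattern: str, archive_path: str) -> Iterator[str]:
    """Yield listing and content matches for entries of a ZIP archive."""
    regex = re.compile(pattern)
    with zipfile.ZipFile(archive_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            label = f"{archive_path}[{info.filename}]"
            # Listing match: the entry name itself matches the pattern.
            if regex.search(info.filename):
                yield label
            # Content match: scan the entry line by line.
            with archive.open(info) as entry:
                text = io.TextIOWrapper(entry, encoding="utf-8", errors="replace")
                for lineno, line in enumerate(text, start=1):
                    if regex.search(line):
                        yield f"{label}:{lineno}:{line.rstrip()}"
```

Because JAR, WAR and EAR files use the ZIP container format, the same function can usually read them as well.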
Database Grep
When a database grep is specified (-d option), the search attempts to match the search pattern against persisted data in either a relational (SQL) database or a graph database identified by a URI. For example:
$ crgrep -d -U jdbc:vendor:db 'john' 'mytable.column*'
See docs/DATABASE_GREP.txt for detailed database grep behaviour. This document contains important configuration requirements for connecting to supported database servers.
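The general idea of a database grep can be sketched in Python with the standard sqlite3 module: walk every table, report column names that match, then report rows whose data matches. This is an illustration under simplifying assumptions, not CRGREP's implementation. The function name grep_database is hypothetical, matching is plain substring matching, the table and column filter given as the second argument in the example above is omitted, and the column list is only reported when a column name itself matches.

```python
import sqlite3
from typing import Iterator


def grep_database(conn: sqlite3.Connection, needle: str) -> Iterator[str]:
    """Yield 'table: ...' lines for matching column names and row data."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    for table in tables:
        cursor = conn.execute(f'SELECT * FROM "{table}"')
        columns = [col[0] for col in cursor.description]
        # Column name match: report the full column list of the table.
        if any(needle in col for col in columns):
            yield f"{table}: [{','.join(columns)}]"
        # Data match: report each row that contains the needle in any value.
        for row in cursor:
            values = [str(value) for value in row]
            if any(needle in value for value in values):
                yield f"{table}: {','.join(values)}"
```

A connection obtained from sqlite3.connect("/databases/db.sqlite3") could be passed as conn; other database vendors would need their own driver and catalogue query.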
Web Grep
A web page search is attempted when the <resource> begins with 'http://' and no -d (database) option is specified.
See docs/WEB_GREP.txt for usage and examples.
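The selection rule described above (database grep when -d is given, web grep when the resource begins with 'http://', file grep otherwise) can be written as a small decision function. The Python sketch below only restates that rule. The name choose_grep_mode and the mode labels are hypothetical, and the fallback to file grep is an assumption based on the rest of this README.

```python
def choose_grep_mode(resource: str, database_option: bool) -> str:
    """Pick a grep mode from the resource string and the -d option."""
    if database_option:
        return "database"   # -d takes precedence over the resource prefix
    if resource.startswith("http://"):
        return "web"
    return "file"           # assumed default for everything else
```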