These would need proper tests before I include them: ideally test files that actually contain macros (I don't know the details of this format). If you would like to help, please open a PR with macro-enabled versions of, say, the core/src/test/resources/ms files 'excel.xls' and 'word.docx', so that I can include them in the tests and do a rebuild. (The build is tricky to complete cleanly with all the environment setup required, so it's easier if I do it.)
This is an across-the-board uplift of DB drivers, tools and other library dependencies. This release was built and tested with Java 19. The models (NLP/mood and OCR) should also be improved. The main new feature is -e/--regex for file search, which interprets the 'search pattern' as a regular expression instead of the default wildcard. DB search is based on LIKE queries, so it remains wildcard only (if -d is specified then -e is ignored, with a warning). I removed Neo4J at the same time, the uplift...
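To illustrate the difference between the two interpretations of the 'search pattern', here is a minimal Python sketch. It is not crgrep's implementation (crgrep is a Java tool); the function names `compile_pattern` and `matches` are hypothetical, and it assumes a wildcard must cover the whole text while a regex may match anywhere, which may differ from what crgrep actually does.

```python
import fnmatch
import re


def compile_pattern(pattern: str, use_regex: bool = False) -> re.Pattern:
    """Compile a search pattern as a regex (in the spirit of -e) or as a wildcard."""
    if use_regex:
        # The pattern is taken as a regular expression, unchanged.
        return re.compile(pattern)
    # fnmatch.translate turns a shell-style wildcard into an anchored regex string.
    return re.compile(fnmatch.translate(pattern))


def matches(pattern: str, text: str, use_regex: bool = False) -> bool:
    """Return True if text matches pattern under the chosen interpretation."""
    compiled = compile_pattern(pattern, use_regex)
    if use_regex:
        # Assumption: a regex may match anywhere in the text.
        return compiled.search(text) is not None
    # Assumption: a wildcard has to cover the whole text.
    return compiled.match(text) is not None
```

With these assumptions, `matches("*.txt", "notes.txt")` succeeds as a wildcard, while a pattern such as `\d+\s?mm` only behaves as intended when `use_regex=True`.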
Yes! Beta testing would be much appreciated, on any platform. My own testing has been on Windows, so Mac and Linux testing needs some help.
This requires regex support, which I expect to include in the upcoming 1.0.6 release. Stay tuned.
Hi Darren, thanks for your post. Sorry, notifications of new posts were lost somehow and I only just noticed this one! The future is good: I've finally found time to do a major uplift and am in the process of completing all the integration tests, which are the hardest to set up. I expect to release 1.0.6 in the near future with some initial feature tweaks, the main one being regex support for the search pattern. CRGREP comes with a lot of docs but looking again you're right, the 'search-pattern' is...
Hi Culverine, crgrep is not meant to be bundled or packaged with the standalone external tools included. The third-party packages I make use of (NLP, OCR etc.) need to be installed independently of crgrep because their datasets and software are typically quite large, while the default crgrep distro is small, given that it's a simple CLI tool. Users can then decide which extra tools they wish to install and use with it. I've made every attempt to provide complete documentation, have you tried...
Hi Darren, I could look at handling nbsp the same as a space. Leave that with me. In the meantime, I created a Word doc with one line containing a non-breaking space (between 34 and mm) and another line using a normal space character. The crgrep call below matched both for me using a wildcard in the pattern.

Data (the first line will display as '34mm' on the command line):

    34 mm nbsp
    34 mm space

    $ crgrep '34*mm' nbspace.docx
    nbspace.docx:P:34mm nbsp
    nbspace.docx:P:34 mm space

Hope that helps as a workaround.
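For anyone curious what "handling nbsp the same as space" could look like, here is a small Python sketch. It is only an illustration and not how crgrep does or will do it; `normalize_spaces` and `wildcard_search` are hypothetical helpers, and the wildcard is assumed to match anywhere in the line.

```python
import fnmatch
import re

NBSP = "\u00a0"  # Unicode NO-BREAK SPACE


def normalize_spaces(text: str) -> str:
    """Replace non-breaking spaces with ordinary spaces."""
    return text.replace(NBSP, " ")


def wildcard_search(pattern: str, line: str, normalize: bool = True) -> bool:
    """Return True if the wildcard pattern occurs somewhere in the line."""
    if normalize:
        # Normalize both sides so nbsp and space compare as equal.
        pattern = normalize_spaces(pattern)
        line = normalize_spaces(line)
    # Surround with '*' so the pattern may match anywhere in the line (an assumption).
    regex = re.compile(fnmatch.translate("*" + pattern + "*"))
    return regex.match(line) is not None
```

Without normalization, a literal `'34 mm'` misses the nbsp line, whereas the `'34*mm'` wildcard still finds it because `*` matches any character, which is why the workaround above helps.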