sleuthkit-users Mailing List for The Sleuth Kit (Page 11)
From: Hoyt H. <hoy...@gm...> - 2016-12-06 17:11:40
My vote is option 2 - read-only. I think a minor version increment shouldn't totally abandon the work product of any release of the same major version, so no option 1. As long as it's clear that older v4 cases will be read-only and that the previous minor release should be retained (or re-downloaded), that should be fine. Option 3 is probably not worth the effort.

Hoyt

On Dec 6, 2016 9:04 AM, "Brian Carrier" <ca...@sl...> wrote:
> [Brian's original message quoted in full; see his 2016-12-06 15:00 post in this thread.]

_______________________________________________
sleuthkit-users mailing list
https://lists.sourceforge.net/lists/listinfo/sleuthkit-users
http://www.sleuthkit.org
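Option 2 above amounts to a schema-version gate at case-open time. A minimal sketch of such a gate, with hypothetical version numbers and a hypothetical function name (not Autopsy's actual case-open logic):

```python
# Illustrative schema-version gate for the "read-only" compatibility option.
# The version tuples and the function are hypothetical, not Autopsy's real API.

CURRENT_SCHEMA = (4, 3)  # schema written by this (hypothetical) release

def open_mode(case_schema: tuple) -> str:
    """Decide how a case may be opened, given the schema it was created with."""
    if case_schema == CURRENT_SCHEMA:
        return "read-write"    # same schema: full access
    if case_schema[0] == CURRENT_SCHEMA[0] and case_schema < CURRENT_SCHEMA:
        return "read-only"     # older minor version: searchable, no new data sources
    return "unsupported"       # newer schema, or a different major version

print(open_mode((4, 3)))  # read-write
print(open_mode((4, 2)))  # read-only
print(open_mode((5, 0)))  # unsupported
```

The point of the sketch is that the read-only option needs no schema conversion at all, only a comparison at open time, which is why it is the cheapest form of backward compatibility.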
From: MATT P. <mat...@ad...> - 2016-12-06 16:56:13
Thank you for all the hard work you put in. I am perfectly OK with breaking backward compatibility. I would hope that the tool recognizes older case types and warns the user about the issue. It would be great if the user were able to create a new case from the old case's source data to fit the new capabilities.

-----Original Message-----
From: Brian Carrier [mailto:ca...@sl...]
Sent: Tuesday, December 6, 2016 9:00 AM
To: sle...@li... users <sle...@li...>
Subject: [sleuthkit-users] Solr / RegExp Update and Survey

[Brian's original message quoted in full; see his 2016-12-06 15:00 post in this thread.]
From: Danilo M. <da...@gm...> - 2016-12-06 15:51:03
Option 1. Autopsy lets you keep earlier versions installed, so it wouldn't be a big deal to open old cases with the matching older release.

On Dec 6, 2016 13:01, "Brian Carrier" <ca...@sl...> wrote:
> [Brian's original message quoted in full; see his 2016-12-06 15:00 post in this thread.]
From: John L. <slo...@gm...> - 2016-12-06 15:33:15
I second the motion.

Sent from my iPhone

> On Dec 6, 2016, at 07:22, Derrick Karpo <dk...@gm...> wrote:
> [Derrick's reply and Brian's original message quoted in full; see the 2016-12-06 15:22 and 15:00 posts in this thread.]
From: Derrick K. <dk...@gm...> - 2016-12-06 15:22:47
I'd be happy with either the first or second approach, but would opt for no backward compatibility, at least for the way we use it. Since each Autopsy version installs standalone in its own directory, we typically keep multiple installs going and rarely upgrade mid-case. Once a case is started with a specific version, we stick with it barring major issues.

Derrick

On Tue, Dec 6, 2016 at 08:04 Brian Carrier <ca...@sl...> wrote:
> [Brian's original message quoted in full; see his 2016-12-06 15:00 post in this thread.]
From: Brian C. <ca...@sl...> - 2016-12-06 15:00:00
I have an update on the Solr / Elastic / regular expression work and a question about backward compatibility.

Update: We're sticking with Solr and will be breaking text into 32KB chunks to use a different regular expression searching approach that gives us better results. It is actually faster than before!

Question: How much backward compatibility are people expecting? We have three general options:

- No backward compatibility: You need Autopsy 4.2 to open existing 4.2 cases. Existing cases are not upgraded. We'd probably need to call this release Autopsy 5 to make it clear what can open what. I'm not sure there are enough new features to justify such a major version increase.
- Read-only: Autopsy 4.2 cases can be opened in the new Autopsy (let's call it 4.3), but only searched. You can't add new data sources to them, and they would have the old regular expression searching. If you need to add data sources, open the case in 4.2.
- Fully: Autopsy converts the old schema to the new schema (a time-intensive process). You could open cases originally created with 4.2 in 4.3 and add to them.

I'll bias this thread by saying my preference is the read-only approach. It's the least amount of work to provide some level of backward compatibility. Historically, we have always upgraded cases to work with new versions of Autopsy. It is just a lot of work to fully upgrade, and it isn't clear that there is a lot of value in doing it.

Who would be sad if we did the read-only approach?
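The 32KB chunking mentioned in the update can be sketched roughly as follows. This is an illustrative sketch only, assuming simple fixed-size splitting that backs up to a whitespace boundary; Autopsy's actual chunker may differ:

```python
# Illustrative sketch of splitting extracted text into ~32KB chunks before
# indexing, breaking on whitespace where possible so a regex match is less
# likely to be cut mid-token. Not Autopsy's actual implementation.

CHUNK_SIZE = 32 * 1024  # 32KB of characters per chunk (assumption)

def chunk_text(text, size=CHUNK_SIZE):
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + size, len(text))
        if end < len(text):
            # Back up to the last whitespace inside the window, if any,
            # so we don't cut a word (and a potential match) in half.
            ws = text.rfind(" ", start, end)
            if ws > start:
                end = ws + 1
        chunks.append(text[start:end])
        start = end
    return chunks

sample = "alpha beta gamma " * 5000      # ~85KB of text
parts = chunk_text(sample)
assert "".join(parts) == sample          # chunking loses no data
assert all(len(p) <= CHUNK_SIZE for p in parts)
```

One design consequence worth noting: a match that genuinely spans a chunk boundary can still be missed under this scheme, which is the usual trade-off of chunked indexing.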
From: Richard C. <rco...@ba...> - 2016-12-06 14:26:59
The solr.stdout.log file looks like it has some clues in it. If you can, when you get the error shown in error2.jpg, it would be helpful if you would click on the hyperlink and send a screenshot of the full message (assuming it has more detail). Thanks!

On Tue, Dec 6, 2016 at 9:20 AM, Richard Cordovano <rco...@ba...> wrote:
> [Earlier messages in this thread quoted in full; see the 2016-12-06 14:21 and 2016-12-05 23:38 posts.]
From: Richard C. <rco...@ba...> - 2016-12-06 14:21:03
Thanks!

On Tue, Dec 6, 2016 at 4:24 AM, Nanni Bassetti <dig...@gm...> wrote:
> Done! :-)
>
> [Earlier messages in this thread quoted in full; see the 2016-12-05 23:38 post.]
From: Richard C. <rco...@ba...> - 2016-12-05 23:38:36
Nanni, I have combed through the logs you sent. The local Solr server process appears to start normally. However, when Autopsy sends a core (index) creation request to the Solr process during case creation, Autopsy is unable to connect. It is not clear whether this is because the process has shut down shortly after starting, or is just refusing the connection request. Then, when you try to run ingest, the keyword search module tries to open the core (index) for the case and fails, because it does not exist. The module does not start, and when a module does not start, ingest is aborted and you get the message to disable the ingest module that would not start, in this case the keyword search module.

It looks like you closed Autopsy altogether to get the case to open and the ingest to run, which means that the misbehaving Solr process (if it was still running) was terminated and a new process was started. Unfortunately, this means that the solr.stdout.log file was deleted and recreated, so I have no trace of any error messages that the Solr server may have written. The interesting thing is that this new Solr process appears to experience no unexpected errors, as evidenced by both your success and the solr.stdout.log file you sent me.

Are you able to reproduce this problem? If so, here are a few things you could do to help me to help you:

- When Autopsy is started, but before you try to open a case, open a browser and go to the Solr Admin web page at http://localhost:23232/solr/#. Look to see if there are any error messages on the logging page (push the Logging button) and send me a screenshot if there are.
- After you open the case, go back to the Solr Admin page and check whether you can use the Core Selector button to choose the core for the case, which will be a core with a name that looks like your case name with a time/date stamp suffix. Also, check the logging page again.
- After you shut down Autopsy, but before you restart, collect a copy of ~/Users/[your user name]/AppData/roaming/autopsy/var/log/solr.stdout.log for me. This should agree with the logging page snapshots from the Solr Admin page.

Thanks,
Richard

On Wed, Nov 23, 2016 at 12:39 PM, Nanni Bassetti <dig...@gm...> wrote:
> No problem, see the attachment.
>
> 2016-11-23 18:20 GMT+01:00 Richard Cordovano <rco...@ba...>:
>> Nanni, thank you for sending the Autopsy logs from the case folder. Autopsy was failing to connect to the Solr server that it starts up in Jetty on your machine. Will you kindly also send me the entire contents (all log files) of the ~/Users/[your user name]/AppData/roaming/autopsy/var/log folder?
>>
>> Thanks,
>> Richard Cordovano
>> Autopsy Team Lead
>> Basis Technology
>>
>> On Wed, Nov 23, 2016 at 2:35 AM, Nanni Bassetti <dig...@gm...> wrote:
>>> I tried to run Autopsy 4.2.0 two times directly with 2 pen drives and once with an EWF disk image. Every time, after creating the case, Autopsy said that I must disable the keyword ingest module; but if I close everything and re-run it, opening the same already-created case, the problem disappears.
>>>
>>> I attach the log file of one of my tests.
>>>
>>> --
>>> Dott. Nanni Bassetti
>>> http://www.nannibassetti.com
>>> CAINE project manager - http://www.caine-live.net
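The "unable to connect" symptom described above can also be checked from outside Autopsy with a quick TCP probe of the embedded Solr port. This is an illustrative sketch; the port 23232 comes from the Solr Admin URL mentioned in the thread, and the function name is mine:

```python
# Illustrative probe: is anything accepting TCP connections on the port the
# embedded Solr server is expected to listen on? A refused connection here
# would match the "Autopsy is unable to connect" symptom in the thread.
import socket

def port_is_open(host, port, timeout=1.0):
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# On the affected machine, one would check the port from the thread:
# print(port_is_open("localhost", 23232))
```

Running the probe right after case creation would distinguish "the Solr process died" (connection refused) from "the process is up but the core is missing" (connection accepted, but core requests fail in the Admin UI).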
From: Richard C. <rco...@ba...> - 2016-12-05 15:28:38
|
There currently is no documentation of module dependencies, but I can sum it up simply for the core modules that ship with Autopsy: run the hash lookup and file type identification modules first, and always run the file type identification module. The reason is that other modules can be configured to skip known files and several modules need to know file types. In fact, some modules will run file type detection if it has not already been done. Also, running these modules tends to load file content into cache memory. I agree that it would be nice to have finer-grained control of the ingest process and prevention of artifact duplication. However, please be aware that although Basis Technology donates resources to Autopsy development, major features are generally added when Basis customers paying for Autopsy customization request them. Often these funded features go to directly into open source Autopsy, to the benefit of the entire community. The features we are discussing are not currently being developed, but they are reasonably high on the list of potential future enhancements. Richard Cordovano Autopsy and Autopsy Customization Team Leads Basis Technology On Mon, Dec 5, 2016 at 9:55 AM, Alessandro Fiorenzi < ale...@al...> wrote: > Sorry have the same problem of Nanni, and believe a resume function should > be appreciate for tow reason: > - it do not duplicate data > - it safe time of analysis > > instead of warning of previous ingest module x session, I think should be > better to have a resume funtion o if it is impossible clear all data to do > not have duplicates > > > Is there a flow diagram of ingest module dipendencies? so to start befeore > some task and later the other,; this becasuse I have expericenced with > analysis time of 48/72 hours on disk grather than 500GB/1TB and doing > modular execution could safe time. > > Alessandro Fiorenzi > > > [image: Studio Fiorenzi] <http://www.studiofiorenzi.it/> > > Dott. Alessandro Fiorenzi > af...@st... 
/ +39 3487920172 <+39%20348%20792%200172> > > Studio Fiorenzi > 0550351263 > Vai Daniele Manin, 50 50019 Sesto Fiorentino > http://www.studiofiorenzi.it > > IMPORTANTE: questa e-mail (inclusi tutti gli allegati) è inviata dallo > Studio Informatica Forense Fiorenzi Alessandro e può contenere informazioni > riservate soggette a segreto professionale. Essa può essere letta, copiata > e usata solo dal destinatario indicato e non deve essere ritrasmessa con > modifiche senza il nostro consenso. Se l'avete ricevuta per errore, Vi > preghiamo di contattarci per e-mail o telefono e, quindi, di distruggerla > senza mostrarla ad alcun estraneo. La sicurezza e l'affidabilità delle > e-mail non è garantita. Noi adottiamo programmi anti virus, ma decliniamo > ogni responsabilità in ordine alla prevenzione degli eventuali virus. > > 2016-12-05 15:22 GMT+01:00 Richard Cordovano <rco...@ba...>: > >> It is not currently possible to stop an ingest job (i.e., a data source >> [e.g., an image], a set of ingest modules, and the settings for those >> modules) or an individual ingest module and later start again where you >> left off. Instead, you will have an incomplete set of results (artifacts, >> carved files, etc.). On a related note, if you run the same ingest modules >> on the same inputs, duplicate results (artifacts, carved files, etc.) will >> be generated. However, we have recently implemented an ingest history >> feature, which among other things, warns users if a particular module is >> about to be used to analyze the same input data source. This feature uses >> case database tables that relate ingest modules by version to data sources, >> and is a first step towards more comprehensive tracking of what has been >> executed by Autopsy. 
>> >> Richard Cordovano >> Autopsy and Autopsy Customization Teams Lead >> Basis Technology >> >> On Sun, Dec 4, 2016 at 7:22 AM, Nanni Bassetti <dig...@gm...> >> wrote: >> >>> Hi all, >>> it seems that if you stop some ingesting engines, when you restart them, >>> they start again from the beginning...why? >>> Is it possible to restart them from the breaking point? >>> Thanks >>> >>> -- >>> Dott. Nanni Bassetti >>> http://www.nannibassetti.com >>> CAINE project manager - http://www.caine-live.net >>> >>> ------------------------------------------------------------ >>> ------------------ >>> Check out the vibrant tech community on one of the world's most >>> engaging tech sites, SlashDot.org! http://sdm.link/slashdot >>> _______________________________________________ >>> sleuthkit-users mailing list >>> https://lists.sourceforge.net/lists/listinfo/sleuthkit-users >>> http://www.sleuthkit.org >>> >>> >> >> ------------------------------------------------------------ >> ------------------ >> >> _______________________________________________ >> sleuthkit-users mailing list >> https://lists.sourceforge.net/lists/listinfo/sleuthkit-users >> http://www.sleuthkit.org >> >> > |
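The ordering advice above (hash lookup and file type identification before everything else, because other modules skip known files and need detected file types) amounts to a dependency graph over ingest modules. The sketch below is illustrative only: the module names are stock Autopsy modules mentioned in this thread, but the dependency sets are assumptions for the example, not Autopsy's actual scheduler logic. A topological sort then yields a safe execution order:

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical dependency map: each module lists the modules whose
# results it needs before it can run usefully. Hash lookup and file
# type identification have no prerequisites, so every valid
# topological order runs them first -- matching the advice above.
deps = {
    "File Type Identification": set(),
    "Hash Lookup": set(),
    # can be configured to skip known files, and needs file types
    "Keyword Search": {"Hash Lookup", "File Type Identification"},
    # comparing extensions against content requires detected types
    "Extension Mismatch Detector": {"File Type Identification"},
}

order = list(TopologicalSorter(deps).static_order())
print(order)
```

Any order the sorter emits places the two prerequisite-free modules ahead of the modules that depend on them, which is exactly the rule of thumb Richard describes.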
From: Alessandro F. <ale...@al...> - 2016-12-05 15:02:33
|
Sorry, I have the same problem as Nanni, and I believe a resume function would be appreciated for two reasons: - it does not duplicate data - it saves analysis time Instead of a warning about a previous ingest-module session, I think it would be better to have a resume function or, if that is impossible, to clear all the data so as not to have duplicates. Is there a flow diagram of ingest module dependencies, so as to start some tasks first and others later? I ask because I have experienced analysis times of 48/72 hours on disks larger than 500GB/1TB, and modular execution could save time. Alessandro Fiorenzi [image: Studio Fiorenzi] <http://www.studiofiorenzi.it/> Dott. Alessandro Fiorenzi af...@st... / +39 3487920172 Studio Fiorenzi 0550351263 Via Daniele Manin, 50 50019 Sesto Fiorentino http://www.studiofiorenzi.it IMPORTANT: this e-mail (including all attachments) is sent by Studio Informatica Forense Fiorenzi Alessandro and may contain confidential information subject to professional secrecy. It may be read, copied and used only by the indicated recipient and must not be forwarded with modifications without our consent. If you have received it in error, please contact us by e-mail or telephone and then destroy it without showing it to anyone else. The security and reliability of e-mail are not guaranteed. We use anti-virus software, but we decline all responsibility for the prevention of any viruses. 2016-12-05 15:22 GMT+01:00 Richard Cordovano <rco...@ba...>: > It is not currently possible to stop an ingest job (i.e., a data source > [e.g., an image], a set of ingest modules, and the settings for those > modules) or an individual ingest module and later start again where you > left off. Instead, you will have an incomplete set of results (artifacts, > carved files, etc.). On a related note, if you run the same ingest modules > on the same inputs, duplicate results (artifacts, carved files, etc.) will > be generated. 
However, we have recently implemented an ingest history > feature, which among other things, warns users if a particular module is > about to be used to analyze the same input data source. This feature uses > case database tables that relate ingest modules by version to data sources, > and is a first step towards more comprehensive tracking of what has been > executed by Autopsy. > > Richard Cordovano > Autopsy and Autopsy Customization Teams Lead > Basis Technology > > On Sun, Dec 4, 2016 at 7:22 AM, Nanni Bassetti <dig...@gm...> wrote: > >> Hi all, >> it seems that if you stop some ingesting engines, when you restart them, >> they start again from the beginning...why? >> Is it possible to restart them from the breaking point? >> Thanks >> >> -- >> Dott. Nanni Bassetti >> http://www.nannibassetti.com >> CAINE project manager - http://www.caine-live.net >> >> ------------------------------------------------------------ >> ------------------ >> Check out the vibrant tech community on one of the world's most >> engaging tech sites, SlashDot.org! http://sdm.link/slashdot >> _______________________________________________ >> sleuthkit-users mailing list >> https://lists.sourceforge.net/lists/listinfo/sleuthkit-users >> http://www.sleuthkit.org >> >> > > ------------------------------------------------------------ > ------------------ > > _______________________________________________ > sleuthkit-users mailing list > https://lists.sourceforge.net/lists/listinfo/sleuthkit-users > http://www.sleuthkit.org > > |
From: Richard C. <rco...@ba...> - 2016-12-05 14:22:27
|
It is not currently possible to stop an ingest job (i.e., a data source [e.g., an image], a set of ingest modules, and the settings for those modules) or an individual ingest module and later start again where you left off. Instead, you will have an incomplete set of results (artifacts, carved files, etc.). On a related note, if you run the same ingest modules on the same inputs, duplicate results (artifacts, carved files, etc.) will be generated. However, we have recently implemented an ingest history feature, which among other things, warns users if a particular module is about to be used to analyze the same input data source. This feature uses case database tables that relate ingest modules by version to data sources, and is a first step towards more comprehensive tracking of what has been executed by Autopsy. Richard Cordovano Autopsy and Autopsy Customization Teams Lead Basis Technology On Sun, Dec 4, 2016 at 7:22 AM, Nanni Bassetti <dig...@gm...> wrote: > Hi all, > it seems that if you stop some ingesting engines, when you restart them, > they start again from the beginning...why? > Is it possible to restart them from the breaking point? > Thanks > > -- > Dott. Nanni Bassetti > http://www.nannibassetti.com > CAINE project manager - http://www.caine-live.net > > ------------------------------------------------------------ > ------------------ > Check out the vibrant tech community on one of the world's most > engaging tech sites, SlashDot.org! http://sdm.link/slashdot > _______________________________________________ > sleuthkit-users mailing list > https://lists.sourceforge.net/lists/listinfo/sleuthkit-users > http://www.sleuthkit.org > > |
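The ingest history feature described above relies on case database tables that relate ingest modules, by version, to data sources. A toy sketch of that idea follows; the table and column names here are invented for illustration and are NOT Autopsy's actual case-database schema:

```python
import sqlite3

# Hypothetical schema: record which (module, version) pairs have run
# against which data source, so a repeat run can trigger a warning
# instead of silently producing duplicate artifacts.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE ingest_history (
                    data_source_id INTEGER,
                    module_name    TEXT,
                    module_version TEXT,
                    PRIMARY KEY (data_source_id, module_name, module_version))""")

def already_ran(ds_id, module, version):
    """True if this module/version has already analyzed this data source."""
    row = conn.execute(
        "SELECT 1 FROM ingest_history WHERE data_source_id=? "
        "AND module_name=? AND module_version=?", (ds_id, module, version)).fetchone()
    return row is not None

def record_run(ds_id, module, version):
    conn.execute("INSERT OR IGNORE INTO ingest_history VALUES (?,?,?)",
                 (ds_id, module, version))

record_run(1, "Keyword Search", "4.2.0")
print(already_ran(1, "Keyword Search", "4.2.0"))  # True
```

On a second attempt with the same module version on the same data source, `already_ran()` returns True and the application can show the duplicate-results warning described above before re-ingesting.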
From: Nanni B. <dig...@gm...> - 2016-12-04 12:22:16
|
Hi all, it seems that if you stop some ingesting engines, when you restart them, they start again from the beginning...why? Is it possible to restart them from the breaking point? Thanks -- Dott. Nanni Bassetti http://www.nannibassetti.com CAINE project manager - http://www.caine-live.net |
From: D3 k. <dee...@gm...> - 2016-12-02 19:20:35
|
Thanks, a worthwhile initiative. On 2 December 2016 at 18:42, Brian Carrier <ca...@sl...> wrote: > At OSDFCon this year, there were several educators who wanted to meet > others who were using Autopsy in the classroom. We setup a Google Group to > allow educators to share their experiences, data, slides, etc. If you are > using Autopsy in your classroom, sign up and join the conversation. > > https://groups.google.com/a/basistech.com/forum/#!forum/ > autopsy-educators > > thanks, > brian > ------------------------------------------------------------ > ------------------ > Check out the vibrant tech community on one of the world's most > engaging tech sites, SlashDot.org! http://sdm.link/slashdot > _______________________________________________ > sleuthkit-users mailing list > https://lists.sourceforge.net/lists/listinfo/sleuthkit-users > http://www.sleuthkit.org > -- *Deepak Kumar* Digital Forensics | Cyber Intelligence Profile <http://www.D3pak.branded.me> =============================================== *CONFIDENTIALITY NOTICE *: This email and any files transmitted with it are confidential and intended solely for the use of the individual or entity to whom they are addressed. If you are not the named addressee you should not disseminate, distribute or copy this e-mail. Please notify the sender immediately by e-mail if you have received this e-mail by mistake and delete this e-mail from your system. If you are not the intended recipient you are notified that disclosing, copying, distributing or taking any action in reliance on the contents of this information is strictly prohibited. *!!!! Try to be a rainbow in someone's cloud !!!!* |
From: Brian C. <ca...@sl...> - 2016-12-02 15:42:25
|
At OSDFCon this year, there were several educators who wanted to meet others who were using Autopsy in the classroom. We set up a Google Group to allow educators to share their experiences, data, slides, etc. If you are using Autopsy in your classroom, sign up and join the conversation. https://groups.google.com/a/basistech.com/forum/#!forum/autopsy-educators thanks, brian |
From: Brian C. <ca...@sl...> - 2016-12-02 14:34:53
|
For those who could not attend OSDFCon (over a month ago!), you missed another successful year. Thanks again to the speakers. The presentations are up on the website: http://www.osdfcon.org/2016-event/2016-agenda/ The Autopsy module competition submissions are also on the website. We had a record number of 21 submissions. http://www.osdfcon.org/2016-event/2016-module-development-contest/ As usual, the attendees of OSDFCon voted based on presentations/videos and the winners were: - Emily Wicki (1st place) - Mathias Vetsch and Luca Taennler (2nd place) - Mark McKinnon (3rd place) Congratulations! brian |
From: Nanni B. <dig...@gm...> - 2016-11-23 17:39:55
|
no problem....see the attachment. 2016-11-23 18:20 GMT+01:00 Richard Cordovano <rco...@ba...>: > Nanni, thank you for sending the autopsy logs from the case folder. > Autopsy was failing to connect to the Solr server that it starts up in > jetty on your machine. Will you kindly also send me the entire contents > (all log files) of the ~/Users/[your user name]/AppData/roaming/autopsy/var/log > folder? > > Thanks, > > Richard Cordovano > Autopsy Team Lead > Basis Technology > > On Wed, Nov 23, 2016 at 2:35 AM, Nanni Bassetti <dig...@gm...> > wrote: > >> I tried to run Autopsy 4.2.0 working 2 times directly with 2 pendrives >> and 1 time with an EWF disk image. >> Everytime, after to have create the case, Autopsy said that I must >> disable keyword ingest module, but if I close all and re-run it opening the >> same case, already created, the problem disappeared. >> >> I attach the log file of one test of mine. >> >> -- >> Dott. Nanni Bassetti >> http://www.nannibassetti.com >> CAINE project manager - http://www.caine-live.net >> >> ------------------------------------------------------------ >> ------------------ >> >> _______________________________________________ >> sleuthkit-users mailing list >> https://lists.sourceforge.net/lists/listinfo/sleuthkit-users >> http://www.sleuthkit.org >> >> > -- Dott. Nanni Bassetti http://www.nannibassetti.com CAINE project manager - http://www.caine-live.net |
From: Richard C. <rco...@ba...> - 2016-11-23 17:20:42
|
Nanni, thank you for sending the autopsy logs from the case folder. Autopsy was failing to connect to the Solr server that it starts up in jetty on your machine. Will you kindly also send me the entire contents (all log files) of the ~/Users/[your user name]/AppData/roaming/autopsy/var/log folder? Thanks, Richard Cordovano Autopsy Team Lead Basis Technology On Wed, Nov 23, 2016 at 2:35 AM, Nanni Bassetti <dig...@gm...> wrote: > I tried to run Autopsy 4.2.0 working 2 times directly with 2 pendrives and > 1 time with an EWF disk image. > Everytime, after to have create the case, Autopsy said that I must disable > keyword ingest module, but if I close all and re-run it opening the same > case, already created, the problem disappeared. > > I attach the log file of one test of mine. > > -- > Dott. Nanni Bassetti > http://www.nannibassetti.com > CAINE project manager - http://www.caine-live.net > > ------------------------------------------------------------ > ------------------ > > _______________________________________________ > sleuthkit-users mailing list > https://lists.sourceforge.net/lists/listinfo/sleuthkit-users > http://www.sleuthkit.org > > |
From: Nanni B. <dig...@gm...> - 2016-11-23 07:36:08
|
I tried running Autopsy 4.2.0 three times: twice working directly with 2 pen drives and once with an EWF disk image. Every time, after creating the case, Autopsy said that I had to disable the keyword search ingest module, but if I closed everything and re-ran it, opening the same, already-created case, the problem disappeared. I attach the log file of one of my tests. -- Dott. Nanni Bassetti http://www.nannibassetti.com CAINE project manager - http://www.caine-live.net |
From: Richard C. <rco...@ba...> - 2016-11-22 23:42:03
|
Alessandro, a ConcurrentModificationException is thrown when the Java runtime detects that a collection has been structurally modified while it is being iterated. It is symptomatic of a multithreaded programming error, such as more than one thread accessing an unguarded collection. Is it at all possible to send me a copy of the autopsy_traces.log files from the logs directory of the case folder and the messages.log files and autopsy_traces.log files from ~\Users\[your user name]\AppData\Roaming\autopsy\var\log folder? If you are able to do this, I will be more than happy to try to track down and fix the problem based on what I can glean from those application logs. Judging from the large number of errors that have been registered (19722) and the lack of progress, I am sorry to say that I believe that Autopsy is hung and will need to be restarted. Sincerely, Richard Cordovano Autopsy Team Lead Basis Technology 2016-11-22 17:30 GMT-05:00 Alessandro Fiorenzi < ale...@al...>: > Autopsy 4.2.0 > > Using keyword search of ingest modules on E0x acqusition I get a strange > error I have never seen before: > > “CET Error > > Ubale to close connection to case database caused by: An SQLExeception was > provoked by the following failure:java.util.ConcurrentModificationExceptio > n” > > > > I close many times this message but analysis si still on 12% after 2 days > is it right? What does it means previuous error? > > > > Thanks > > > > Alessandro Fiorenzi > > > > > > [image: Studio Fiorenzi] > <http://www.studiofiorenzi.it/>*Dott. Alessandro Fiorenzi* > > www.studiofiorenzi.it > > af...@st... / +39 3487920172 > > > > *Studio Fiorenzi** - Security & Forensics* > Tel 0550351263 > Vai Daniele Manin, 50 50019 Sesto Fiorentino > http://www.studiofiorenzi.it > > IMPORTANTE: questa e-mail (inclusi tutti gli allegati) è inviata dallo > Studio Informatica Forense Fiorenzi Alessandro e può contenere informazioni > riservate soggette a segreto professionale. 
Essa può essere letta, copiata > e usata solo dal destinatario indicato e non deve essere ritrasmessa con > modifiche senza il nostro consenso. Se l'avete ricevuta per errore, Vi > preghiamo di contattarci per e-mail o telefono e, quindi, di distruggerla > senza mostrarla ad alcun estraneo. La sicurezza e l'affidabilità delle > e-mail non è garantita. Noi adottiamo programmi anti virus, ma decliniamo > ogni responsabilità in ordine alla prevenzione degli eventuali virus. > > > > ------------------------------------------------------------ > ------------------ > > _______________________________________________ > sleuthkit-users mailing list > https://lists.sourceforge.net/lists/listinfo/sleuthkit-users > http://www.sleuthkit.org > > |
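For readers unfamiliar with the exception discussed above: Java's ConcurrentModificationException comes from fail-fast iterators, which detect that a collection was structurally modified mid-iteration — commonly, as suspected here, because two threads share an unguarded collection. Python's dict iterators are fail-fast in the same spirit, so a minimal single-threaded analogue (not Autopsy code, just an illustration of the failure mode) looks like this:

```python
# Analogue of Java's ConcurrentModificationException: CPython's dict
# iterator notices a structural modification made while it is live
# and raises RuntimeError, just as Java's fail-fast iterators throw.
artifacts = {"kw_hit_1": "file_a.doc", "kw_hit_2": "file_b.pdf"}

error = None
try:
    for key in artifacts:
        # structural modification while an iterator is active
        artifacts["kw_hit_3"] = "file_c.txt"
except RuntimeError as exc:
    error = exc

print(error)  # dictionary changed size during iteration
```

In the multithreaded case Richard describes, the modification comes from another thread rather than the loop body, which is why the fix is guarding the shared collection rather than changing the iteration.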
From: Alessandro F. <ale...@al...> - 2016-11-22 22:59:12
|
Autopsy 4.2.0 Using the keyword search ingest module on an E0x acquisition, I get a strange error I have never seen before: “CET Error Unable to close connection to case database caused by: An SQLException was provoked by the following failure: java.util.ConcurrentModificationException” I have closed this message many times, but the analysis is still at 12% after 2 days. Is that right? What does the previous error mean? Thanks Alessandro Fiorenzi [image: Studio Fiorenzi] <http://www.studiofiorenzi.it/>*Dott. Alessandro Fiorenzi* www.studiofiorenzi.it af...@st... / +39 3487920172 *Studio Fiorenzi** - Security & Forensics* Tel 0550351263 Via Daniele Manin, 50 50019 Sesto Fiorentino http://www.studiofiorenzi.it IMPORTANT: this e-mail (including all attachments) is sent by Studio Informatica Forense Fiorenzi Alessandro and may contain confidential information subject to professional secrecy. It may be read, copied and used only by the indicated recipient and must not be forwarded with modifications without our consent. If you have received it in error, please contact us by e-mail or telephone and then destroy it without showing it to anyone else. The security and reliability of e-mail are not guaranteed. We use anti-virus software, but we decline all responsibility for the prevention of any viruses. |
From: Luís F. N. <lfc...@gm...> - 2016-11-16 09:26:26
|
Hi Brian, Thanks for the explanation. I do not know about Solr, but a String field in Lucene is indexed and not tokenized. I have never tried regex on lucene string Fields, but I think it may match only at the beginning of the text if you do not start the regex with something like .* maybe I am wrong, so it will not be fast although the Field is indexed. Regards, Luis Em 15 de nov de 2016 00:28, "Brian Carrier" <ca...@sl...> escreveu: > Hi Luis, > > We currently (and will in the future) maintain two “copies” of the text to > support text and regexp searches. What will change if we adopt the 32KB > approach is to start storing the text in a non-indexed “string” field > (which has a size limitation of 32KB). It will not be tokenized and Solr > will apply the regular expression to each text field. > > So, this is in essence what Jon was also proposing of just doing a regexp > on the extracted text. Because this new field is not indexed, it will be > slower. Exact search performance hit TBD. > > brian > > > > > > > On Nov 14, 2016, at 8:53 PM, Luís Filipe Nassif <lfc...@gm...> > wrote: > > > > Hi Brian, > > > > I didn't understand exactly how text chunk size will help to index > spaces and other chars that breaks words into tokens. You will index text > twice? First with default tokenization, breaking words at spaces and > similar chars, and second time will index the whole text chunk as one > single token? Does the 32KB is the maximum Lucene token size? I think you > can do the second indexing (with performance consequences if you index > twice, it should be configurable, so users could disable it if they do not > need regex or if performance is critical). But I think you should not > disable the default indexing (with tokenization), otherwise users will have > to always use * as prefix and suffix of their searches, if not they will > miss a lot of hits. 
I do not known if they will be able to do phrase > searches, because Lucene does not allow to use * into a phrase search (* > between two " "). I do not know about Solr and if it extended that. > > > > Regards, > > Luis Nassif > > > > 2016-11-14 20:14 GMT-02:00 Brian Carrier <ca...@sl...>: > > Making this a little more specific, we seem to have two options to solve > this problem (which is inherent to Lucene/Solr/Elastic): > > > > 1) We store text in 32KB chunks (instead of our current 1MB chunks) and > can have the full power of regular expressions. The downside of the > smaller chunks is that there are more boundaries and places where a term > could span the boundary and we could miss a hit if it spans that boundary. > If we needed to, we could do some fancy overlapping. 32KB of text is > about 12 pages of English text (less for non-English). > > > > 2) We limit the types of regular expressions that people can use and > keep our 1MB chunks. We’ll add some logic into Autopsy to span tokens, but > we won’t be able to support all expressions. For example, if you gave us > “\d\d\d\s\d\d\d\d” we’d turn that into a search for “\d\d\d \d\d\d\d”, but > we wouldn’t able to support a search like “\d\d\d[\s-]\d\d\d\d”. Well we > could in theory, but we dont’ want to add crazy complexity here. > > > > So, the question is if you’d rather have smaller chunks and the full > breadth of regular expressions or a more limited set of expressions and > bigger chunks. We are looking at the performance differences now, but > wanted to get some initial opinions. > > > > > > > > > > > On Nov 14, 2016, at 1:09 PM, Brian Carrier <ca...@sl...> > wrote: > > > > > > Autopsy currently has a limitation when searching for regular > expressions, that spaces are not supported. It’s not a problem for Email > addresses and URLs, but becomes an issue phone numbers, account numbers, > etc. This limitation comes from using an indexed search engine (since > spaces are used to break text into tokens). 
> > > > > > We’re looking at ways of solving that and need some guidance. > > > > > > If you write your own regular expressions, can you please let me know > and share what they look like. We want to know how complex the expressions > are that people use in real life. > > > > > > Thanks! > > > ------------------------------------------------------------ > ------------------ > > > _______________________________________________ > > > sleuthkit-users mailing list > > > https://lists.sourceforge.net/lists/listinfo/sleuthkit-users > > > http://www.sleuthkit.org > > > > > > ------------------------------------------------------------ > ------------------ > > _______________________________________________ > > sleuthkit-users mailing list > > https://lists.sourceforge.net/lists/listinfo/sleuthkit-users > > http://www.sleuthkit.org > > > > |
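The tokenization limitation Brian and Luís are discussing can be shown in a few lines. This is an illustrative sketch, not Lucene/Solr code: a plain whitespace split stands in for an indexed analyzer's tokenizer, and a regex over the raw text stands in for the proposed untokenized string field:

```python
import re

text = "call 555 1234 now"
tokens = text.split()  # stand-in for a whitespace tokenizer
pattern = re.compile(r"\d{3}\s\d{4}")  # phone-number-style regex with a space

# Against the token stream, no single token can match a regex that
# contains whitespace -- the tokenizer has already thrown the spaces
# away. This is exactly the limitation described above.
token_hits = [t for t in tokens if pattern.search(t)]
print(token_hits)  # []

# Against the raw, untokenized text the same regex matches fine,
# which is why storing an unanalyzed copy restores full regex power.
raw_hit = pattern.search(text)
print(raw_hit.group())  # 555 1234
```

The cost, as Brian notes, is that matching against the unanalyzed copy is a scan rather than an index lookup, hence slower.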
From: Brian C. <ca...@sl...> - 2016-11-15 02:35:26
|
The Autopsy design is a bit interesting for this because we only report in the DB one hit per keyword per file. The details of each hit are then figured out when the user wants to review the file. So de-duping at that level is not important (we make an artifact at the first hit). But we need to do something fancy with Solr to allow both for the overlap for matching and for a human to page through the file without reading the overlapping text and thinking “I feel like I just read that on the previous page”… > On Nov 14, 2016, at 6:39 PM, Simson Garfinkel <si...@ac...> wrote: > > Hi Tim, > > Take a look at the bulk_extractor paper, which explains this in detail. There is no need to index block[N-1] below, just block[N] || X bytes from block[N+1], where X is the margin. > > You always need to worry about the margins, because if you don't, you double-report findings. It turns out that there are a lot of optimizations that you can implement if you do things the way I recommend below. For example, you never need to do duplicate suppression if you only index strings that being in block[N], even if they extend into block[N+1]. > >> On Nov 14, 2016, at 6:04 PM, Tim <tim...@se...> wrote: >> >> >> Right. Why not go with 32KB blocks and then index based on overlapped >> windows? To index block[N], you include this string in the index: >> (block[N-1] || block[N] || block[N+1]) >> >> Then when a match occurs, you just add some logic to figure out where >> it actually showed up (only in margin blocks or partially in block[N]) >> >> This is perhaps more naive than what Simson suggests, but with small >> blocks you don't need to worry about having the margins be much >> smaller than the block you're indexing. >> >> tim >> >> PS - I'm probably missing something here. I've been out of the game a >> while. 
>> >> >> On Mon, Nov 14, 2016 at 05:22:26PM -0500, Simson Garfinkel wrote: >>> Brian, >>> >>> With respect to #1 - I solved this problem with bulk_extractor by using an overlapping margin. Extend each block 1K or so into the next block. The extra 1k is called the Margin. Only report hits on string search if the text string beings in the main block, not if it begin in the margin (because then it is included entirely in the next block). You can tune the margin size to describe the largest text object that you wish to find with search. >>> >>> Simson >>> >>> >>>> On Nov 14, 2016, at 5:14 PM, Brian Carrier <ca...@sl...> wrote: >>>> >>>> Making this a little more specific, we seem to have two options to solve this problem (which is inherent to Lucene/Solr/Elastic): >>>> >>>> 1) We store text in 32KB chunks (instead of our current 1MB chunks) and can have the full power of regular expressions. The downside of the smaller chunks is that there are more boundaries and places where a term could span the boundary and we could miss a hit if it spans that boundary. If we needed to, we could do some fancy overlapping. 32KB of text is about 12 pages of English text (less for non-English). >>>> >>>> 2) We limit the types of regular expressions that people can use and keep our 1MB chunks. We’ll add some logic into Autopsy to span tokens, but we won’t be able to support all expressions. For example, if you gave us “\d\d\d\s\d\d\d\d” we’d turn that into a search for “\d\d\d \d\d\d\d”, but we wouldn’t able to support a search like “\d\d\d[\s-]\d\d\d\d”. Well we could in theory, but we dont’ want to add crazy complexity here. >>>> >>>> So, the question is if you’d rather have smaller chunks and the full breadth of regular expressions or a more limited set of expressions and bigger chunks. We are looking at the performance differences now, but wanted to get some initial opinions. 
>>>> >>>> >>>> >>>> >>>>> On Nov 14, 2016, at 1:09 PM, Brian Carrier <ca...@sl...> wrote: >>>>> >>>>> Autopsy currently has a limitation when searching for regular expressions, that spaces are not supported. It’s not a problem for Email addresses and URLs, but becomes an issue phone numbers, account numbers, etc. This limitation comes from using an indexed search engine (since spaces are used to break text into tokens). >>>>> >>>>> We’re looking at ways of solving that and need some guidance. >>>>> >>>>> If you write your own regular expressions, can you please let me know and share what they look like. We want to know how complex the expressions are that people use in real life. >>>>> >>>>> Thanks! >>>>> ------------------------------------------------------------------------------ >>>>> _______________________________________________ >>>>> sleuthkit-users mailing list >>>>> https://lists.sourceforge.net/lists/listinfo/sleuthkit-users >>>>> http://www.sleuthkit.org >>>> >>>> >>>> ------------------------------------------------------------------------------ >>>> _______________________________________________ >>>> sleuthkit-users mailing list >>>> https://lists.sourceforge.net/lists/listinfo/sleuthkit-users >>>> http://www.sleuthkit.org >>> >>> >>> ------------------------------------------------------------------------------ >>> _______________________________________________ >>> sleuthkit-users mailing list >>> https://lists.sourceforge.net/lists/listinfo/sleuthkit-users >>> http://www.sleuthkit.org > |
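Simson's margin scheme above — extend each block a little into the next one, and only report hits that begin in the main block — can be sketched as follows. This is an illustrative toy (tiny block sizes, a Python regex standing in for the search engine), not bulk_extractor or Autopsy code:

```python
import re

def search_chunks(text, pattern, block_size=32, margin=8):
    """Scan fixed-size blocks plus an overlap margin. A hit is kept
    only if it *starts* inside the main block, so a term spanning a
    block boundary is found exactly once and never double-reported."""
    rx = re.compile(pattern)
    hits = []
    for start in range(0, len(text), block_size):
        window = text[start:start + block_size + margin]
        for m in rx.finditer(window):
            # hits beginning in the margin belong to the next block
            if m.start() < block_size:
                hits.append((start + m.start(), m.group()))
    return hits

# "555 1234" starts at offset 30 and spans the 32-byte block boundary;
# without the margin, scanning each block alone would miss it.
text = "x" * 30 + "555 1234" + "y" * 30
print(search_chunks(text, r"\d{3}\s\d{4}"))  # [(30, '555 1234')]
```

As in bulk_extractor, the margin must be at least as long as the longest term you want to find; Simson's point about tuning the margin size corresponds to that parameter here.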
From: Brian C. <ca...@sl...> - 2016-11-15 02:28:59
|
Hi Luis, We currently (and will in the future) maintain two “copies” of the text to support text and regexp searches. What will change if we adopt the 32KB approach is to start storing the text in a non-indexed “string” field (which has a size limitation of 32KB). It will not be tokenized and Solr will apply the regular expression to each text field. So, this is in essence what Jon was also proposing of just doing a regexp on the extracted text. Because this new field is not indexed, it will be slower. Exact search performance hit TBD. brian > On Nov 14, 2016, at 8:53 PM, Luís Filipe Nassif <lfc...@gm...> wrote: > > Hi Brian, > > I didn't understand exactly how text chunk size will help to index spaces and other chars that breaks words into tokens. You will index text twice? First with default tokenization, breaking words at spaces and similar chars, and second time will index the whole text chunk as one single token? Does the 32KB is the maximum Lucene token size? I think you can do the second indexing (with performance consequences if you index twice, it should be configurable, so users could disable it if they do not need regex or if performance is critical). But I think you should not disable the default indexing (with tokenization), otherwise users will have to always use * as prefix and suffix of their searches, if not they will miss a lot of hits. I do not known if they will be able to do phrase searches, because Lucene does not allow to use * into a phrase search (* between two " "). I do not know about Solr and if it extended that. > > Regards, > Luis Nassif > > 2016-11-14 20:14 GMT-02:00 Brian Carrier <ca...@sl...>: > Making this a little more specific, we seem to have two options to solve this problem (which is inherent to Lucene/Solr/Elastic): > > 1) We store text in 32KB chunks (instead of our current 1MB chunks) and can have the full power of regular expressions. 
The downside of the smaller chunks is that there are more boundaries and places where a term could span the boundary and we could miss a hit if it spans that boundary. If we needed to, we could do some fancy overlapping. 32KB of text is about 12 pages of English text (less for non-English). > > 2) We limit the types of regular expressions that people can use and keep our 1MB chunks. We’ll add some logic into Autopsy to span tokens, but we won’t be able to support all expressions. For example, if you gave us “\d\d\d\s\d\d\d\d” we’d turn that into a search for “\d\d\d \d\d\d\d”, but we wouldn’t able to support a search like “\d\d\d[\s-]\d\d\d\d”. Well we could in theory, but we dont’ want to add crazy complexity here. > > So, the question is if you’d rather have smaller chunks and the full breadth of regular expressions or a more limited set of expressions and bigger chunks. We are looking at the performance differences now, but wanted to get some initial opinions. > > > > > > On Nov 14, 2016, at 1:09 PM, Brian Carrier <ca...@sl...> wrote: > > > > Autopsy currently has a limitation when searching for regular expressions, that spaces are not supported. It’s not a problem for Email addresses and URLs, but becomes an issue phone numbers, account numbers, etc. This limitation comes from using an indexed search engine (since spaces are used to break text into tokens). > > > > We’re looking at ways of solving that and need some guidance. > > > > If you write your own regular expressions, can you please let me know and share what they look like. We want to know how complex the expressions are that people use in real life. > > > > Thanks! 
> ------------------------------------------------------------------------------ > _______________________________________________ > sleuthkit-users mailing list > https://lists.sourceforge.net/lists/listinfo/sleuthkit-users > http://www.sleuthkit.org
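To make option 1 concrete, here is a minimal sketch (plain Python; the chunk size, overlap length, and phone-number pattern are illustrative, not Autopsy's actual values) of storing extracted text in 32KB chunks with a small overlap and applying a regex to each stored chunk, the way Solr would to a non-indexed string field:

```python
import re

CHUNK_SIZE = 32 * 1024   # Solr string-field limit mentioned in the thread
OVERLAP = 64             # "fancy overlapping": longer than any expected hit

def chunk_text(text, size=CHUNK_SIZE, overlap=OVERLAP):
    """Split text into `size`-character chunks; each chunk repeats the
    last `overlap` characters of the previous one, so a hit spanning a
    chunk boundary still falls entirely inside some chunk."""
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks

def regex_search_chunks(pattern, chunks):
    """Apply the regex to each stored chunk (as Solr would to a
    non-tokenized string field) and collect the unique hit strings."""
    rx = re.compile(pattern)
    hits = set()
    for chunk in chunks:
        hits.update(rx.findall(chunk))
    return hits

# Example: a phone number placed so it straddles the first chunk boundary.
text = "x" * (CHUNK_SIZE - 4) + "555 123-4567" + "y" * 100
hits = regex_search_chunks(r"\d{3}[\s-]\d{3}-\d{4}", chunk_text(text))
```

Without the overlap, a hit straddling the 32KB boundary is split across two chunks and found in neither; the overlap only needs to be longer than the longest hit the expressions can produce.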
From: Jon S. <JSt...@St...> - 2016-11-15 02:12:45
Presumably what you've proposed so far works off of Lucene's capabilities. The other way to go would be simply to have a background processing job that greps the extracted text documents and saves the search hits. You lose the speed of an indexed search, but since the text has already been extracted it may still run in a reasonable timeframe, and you could search documents concurrently. Handling of overlaps is something that liblightgrep supports well. If you provide it the shingled overlap text, it will look into the overlap only far enough to evaluate any potential search hits beginning before it, and it will exit early once all the potential hits resolve. This way you don't have to dedupe/filter hits. A lot of matters, especially ones involving discovery in some fashion, will revolve around a set of search terms that have been negotiated by different parties, and it is common to use regexps to reduce false positives as well as to account for variations. For those types of matters, it can be tedious to perform a series of interactive searches, depending on how easy it is to record the results. Jon > On Nov 14, 2016, at 5:18 PM, Brian Carrier <ca...@sl...> wrote: > > Making this a little more specific, we seem to have two options to solve this problem (which is inherent to Lucene/Solr/Elastic): > > 1) We store text in 32KB chunks (instead of our current 1MB chunks) and can have the full power of regular expressions. The downside of the smaller chunks is that there are more boundaries and places where a term could span the boundary, and we could miss a hit if it spans that boundary. If we needed to, we could do some fancy overlapping. 32KB of text is about 12 pages of English text (less for non-English). > > 2) We limit the types of regular expressions that people can use and keep our 1MB chunks. We’ll add some logic into Autopsy to span tokens, but we won’t be able to support all expressions. 
For example, if you gave us “\d\d\d\s\d\d\d\d” we’d turn that into a search for “\d\d\d \d\d\d\d”, but we wouldn’t be able to support a search like “\d\d\d[\s-]\d\d\d\d”. Well, we could in theory, but we don’t want to add that much complexity here. > > So, the question is whether you’d rather have smaller chunks and the full breadth of regular expressions, or a more limited set of expressions and bigger chunks. We are looking at the performance differences now, but wanted to get some initial opinions. > > > > >> On Nov 14, 2016, at 1:09 PM, Brian Carrier <ca...@sl...> wrote: >> >> Autopsy currently has a limitation when searching for regular expressions: spaces are not supported. It’s not a problem for email addresses and URLs, but it becomes an issue for phone numbers, account numbers, etc. This limitation comes from using an indexed search engine (since spaces are used to break text into tokens). >> >> We’re looking at ways of solving that and need some guidance. >> >> If you write your own regular expressions, can you please let me know and share what they look like? We want to know how complex the expressions are that people use in real life. >> >> Thanks!
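Jon's shingled-overlap behavior can be imitated in a few lines (a plain-Python sketch of the idea, not liblightgrep's actual API; the sizes and pattern are made up): each chunk is searched together with a shingle of the text that follows it, but the chunk only claims hits that begin inside its own range. A boundary-spanning hit is therefore reported exactly once, by the chunk where it starts, and no dedupe pass is needed:

```python
import re

def search_shingled(pattern, text, chunk_size, overlap):
    """Search fixed-size chunks, each extended by `overlap` characters
    of the following text (the "shingle"). A chunk reports only hits
    that START inside its own range; hits that merely continue into the
    shingle are completed there, and hits that start in the shingle are
    left for the next chunk. The overlap must be at least as long as
    the longest possible hit."""
    rx = re.compile(pattern)
    results = []
    for start in range(0, len(text), chunk_size):
        window = text[start:start + chunk_size + overlap]
        for m in rx.finditer(window):
            if m.start() < chunk_size:   # hit begins before the shingle
                results.append((start + m.start(), m.group()))
    return results

# A number straddling the first chunk boundary (chunk_size=10) is
# reported once, with its global offset, by the chunk where it starts:
hits = search_shingled(r"\d{3}[\s-]\d{3}-\d{4}",
                       "pad pad 555 123-4567 end",
                       chunk_size=10, overlap=12)
```

Because the second chunk sees only the tail of the number, its partial text never matches, so nothing needs to be filtered out afterwards.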