Thread: [sleuthkit-users] Charset encoder error when processing mbox files
From: Joseph H. <hyl...@is...> - 2018-12-09 08:17:30
Hi all,

First post to the list.

I am trying to use Autopsy to run some keyword searches on mbox files downloaded from Gmail. Unfortunately, Autopsy returns an error: "Error while processing: Could not find appropriate charset encoder." I am running Autopsy on Caine 10 in a KVM VM with 8GB RAM on a Lenovo P51 with a Core i7 processor.

Any help would be appreciated.

--
"Far better it is to dare mighty things, to win glorious triumphs, even
though checkered by failure, than to take rank with those poor spirits
who neither enjoy much nor suffer much, because they live in the gray
twilight that knows neither victory nor defeat."

-- Theodore Roosevelt, "The Strenuous Life."

From: Derrick K. <dk...@gm...> - 2018-12-09 23:06:16
Hi Joseph.

This question might be better asked directly to Nanni, as it sounds like it may be Caine specific! I just tested mbox parsing under Debian testing w/Autopsy 4.9.1 and didn't have any issues with keyword searches.

While I don't have a copy of Caine to test with at the moment, I wonder if it's a manifestation of your system's locale. If you fire up a terminal emulator, can you send the output from 'locale' and 'locale charmap'? From MboxParser.java:111 in Autopsy, it looks like it throws that message when it can't detect the character encoder, but I could be way off base here.

Derrick

From: <hyl...@is...> - 2018-12-10 05:19:15
Hi Derrick,

Thank you very much for the quick reply. Below is the output of 'locale':

jhylkema@caine-vm:~$ locale
LANG=en_US.UTF-8
LANGUAGE=en_US
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC=it_IT.UTF-8
LC_TIME=it_IT.UTF-8
LC_COLLATE="en_US.UTF-8"
LC_MONETARY=it_IT.UTF-8
LC_MESSAGES="en_US.UTF-8"
LC_PAPER=it_IT.UTF-8
LC_NAME=it_IT.UTF-8
LC_ADDRESS=it_IT.UTF-8
LC_TELEPHONE=it_IT.UTF-8
LC_MEASUREMENT=it_IT.UTF-8
LC_IDENTIFICATION=it_IT.UTF-8
LC_ALL=

And below is the output of 'locale charmap':

jhylkema@caine-vm:~$ locale charmap
UTF-8

If I were a betting man, my money would be on the fact that LC_ALL isn't set. Is that environment variable set in your Debian test distro?

I will also email Nanni.

Thank you.

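An empty LC_ALL is the normal state on most systems; it only overrides the other categories when it is set. For character handling, the part of each value that matters is the codeset after the dot. A minimal sketch for reducing a 'locale' listing to its distinct codesets follows. The function name locale_codesets is made up for illustration, and the normalisation (lower-casing and dropping hyphens so that "UTF-8" and "utf8" compare equal) is an assumption about what counts as the same codeset.

```shell
# Hypothetical helper: read `locale` output on stdin and print each distinct
# codeset once. Entries without a ".codeset" part (LC_ALL=, LANGUAGE=en_US)
# are skipped because the pattern requires a dot.
locale_codesets() {
    # Keep only the text between the dot and any closing quote or @modifier.
    sed -n 's/^[A-Z_]*="\{0,1\}[^."]*\.\([^"@]*\).*$/\1/p' |
        tr '[:upper:]' '[:lower:]' |   # UTF-8 -> utf-8
        tr -d '-' |                    # utf-8 -> utf8
        sort -u
}
```

Run as `locale | locale_codesets`. For the listing above, where the language parts mix en_US and it_IT but every codeset is UTF-8, this prints the single line utf8.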
From: Nanni B. <dig...@gm...> - 2018-12-10 09:08:02
Hi,

I did not test this, but I have to remind you that Caine 10 has Autopsy 4.9 onboard and not 4.9.1, so could it be a 4.9 issue?

Thanks

--
Dott. Nanni Bassetti
http://www.nannibassetti.com
CAINE project manager - http://www.caine-live.net

From: Derrick K. <dk...@gm...> - 2018-12-11 01:58:07
Hi Joseph.

I've attached my locale output below:

dk@anubis:~$ locale
LANG=en_CA.utf8
LANGUAGE=en_CA:en
LC_CTYPE="en_CA.utf8"
LC_NUMERIC="en_CA.utf8"
LC_TIME="en_CA.utf8"
LC_COLLATE="en_CA.utf8"
LC_MONETARY="en_CA.utf8"
LC_MESSAGES="en_CA.utf8"
LC_PAPER="en_CA.utf8"
LC_NAME="en_CA.utf8"
LC_ADDRESS="en_CA.utf8"
LC_TELEPHONE="en_CA.utf8"
LC_MEASUREMENT="en_CA.utf8"
LC_IDENTIFICATION="en_CA.utf8"
LC_ALL=
dk@anubis:~$ locale charmap
UTF-8

I tested my system under Autopsy 4.9.0 and 4.9.1 and both ran fine. While I'm not convinced we are on the right track with the locales stuff, we could try something:

$ sudo dpkg-reconfigure locales  (generate "en_US.UTF-8" and set it as the default locale)
<log out of the Caine X session>
$ locale  (make sure it's all "en_US.utf8")
<test Autopsy again>

Derrick

From: Joseph H. <hyl...@is...> - 2018-12-11 07:21:48
Okay, here's what I did:

I changed the contents of /etc/default/locale to remove the hard-coded Italian references in that file, changed the default locale to en_US.UTF-8, and got the same locale output as Derrick did.

I then attempted to re-run the ingest... and got the same error.

I then upgraded to Autopsy 4.9.1... and got the same error.

I then installed Autopsy 4.9.1 in a Mint test VM, spun it up, ran it against the data... and got the same error.

I am wondering if maybe I should just punt and install all of the locales? After all, this data has God-only-knows what character encoding in it.

So, it's probably not a CAINE issue. It could be an issue with the data itself. Perhaps I could import it into Thunderbird (read-only and off-network) and see if there is any strange encoding in it.

Thoughts?

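One way to look for strange encodings without opening the mail in a client is to tally the charset labels declared in the mbox itself. The sketch below is a rough check, not a MIME parser: it assumes GNU grep (for the -a and -o options), the function name list_charsets is made up for illustration, and it will also count charset= strings that appear in HTML bodies, not only in Content-Type headers.

```shell
# Hypothetical helper: print every declared charset label in FILE with a
# count, most frequent first. Labels are lower-cased so "UTF-8" and "utf-8"
# are counted together.
# Usage: list_charsets FILE
list_charsets() {
    # -a: treat the file as text even if it contains binary bytes
    # -o: print only the matched part; -i: ignore case; -E: extended regex
    grep -aoiE 'charset="?[A-Za-z0-9._:+-]+' "$1" |
        sed -e 's/^[^=]*=//' -e 's/^"//' |   # drop 'charset=' and a leading quote
        tr '[:upper:]' '[:lower:]' |
        sort | uniq -c | sort -rn
}
```

Labels near the bottom of the list, or ones that are misspelled or vendor-specific, would be the first candidates to check against the error.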
From: Derrick K. <dk...@gm...> - 2018-12-11 15:31:03
Is it possible to share your data at all?

Derrick

From: <hyl...@is...> - 2018-12-11 18:18:13
> Is it possible to share your data at all?

Unfortunately no as it is proprietary client information. I can share Autopsy log files, though, or anything of that nature (system log files, etc.) What can I provide that would be helpful?

From: Derrick K. <dk...@gm...> - 2018-12-11 20:51:33
Hello.

I'm not sure what else can be done without seeing the data. I don't even think going into Autopsy's "Help -> About -> Activate verbose logging" will help, but you can give it a shot. Autopsy uses Tika's CharsetDetector, which is straight from ICU4J I believe, and this could be an upstream issue in Tika as it seems very specific to your data. I understand about not being able to share your data though!

As a thought to isolate this, how about splitting your mbox into a zillion individual mboxes and running Autopsy against the split versions to see if a specific culprit message can be found? The procmail package has the 'formail' utility which can do the splitting for you, i.e.:

dk@anubis:/tmp/bleck$ mkdir splitmbox
dk@anubis:/tmp/bleck$ cat mbox | formail -ds sh -c 'cat > splitmbox/msg.$FILENO'

Derrick

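With a large mailbox, ingesting every split file separately may be slow. A possible refinement is to bisect: glue a contiguous slice of the split messages back into one mbox, ingest that, and keep halving whichever slice still produces the error. The sketch below assumes the split files are named msg.* as in the formail command above and that each one begins with its own "From " separator line, so plain concatenation yields a valid mbox. The function name join_slice is made up for illustration.

```shell
# Hypothetical helper: concatenate messages FIRST..LAST (1-based positions in
# sorted file-name order) from DIR into OUTFILE.
# Usage: join_slice DIR FIRST LAST OUTFILE
join_slice() {
    dir=$1 first=$2 last=$3 out=$4
    # Sorting the names keeps the positions stable from one run to the next.
    find "$dir" -type f -name 'msg.*' | sort |
        sed -n "${first},${last}p" |
        while IFS= read -r f; do cat "$f"; done > "$out"
}
```

For example, with 1000 split messages, `join_slice splitmbox 1 500 first-half.mbox` and `join_slice splitmbox 501 1000 second-half.mbox` give two files to ingest. If a single message is responsible, about ten rounds of halving narrow it down.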