Different file encodings between backups

Help
2012-05-05
2012-10-08
  • Stefan Willinger

    Hello,
    Shotwell just destroyed my image library and I need to restore it. I thought
    this would be no problem, because I have been using Areca Backup for years :-)
    But after the restore, several hundred files were missing. The missing files
    were located in folders with German umlauts in their names, like "Märchenpark".
    The crazy thing is that the full backup (done during a GUI session) contains
    these files, while the incremental backup (done by a cron job) does not.
    The backup details show that the full backup uses UTF-8, while the incremental
    backup uses US-ASCII. I think this is the reason why the incremental backup
    does not contain the files.
    But why does Areca use two different file encodings? And how can I force the
    incremental backup to use UTF-8?

    By the way, luckily I had added no new folders with German umlauts since the
    last full backup, so I can restore my data by hand :-)
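
The effect of the two encodings on a name like "Märchenpark" can be illustrated with a minimal Java sketch. It only shows what happens when a string is encoded and decoded with each charset; the class and method names are hypothetical, and whether Areca loses the files in exactly this way is an assumption, not something confirmed in this thread.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class UmlautRoundTrip {
    // Encode a name with the given charset, then decode it again.
    // Characters the charset cannot represent are replaced during encoding.
    static String roundTrip(String name, Charset charset) {
        return new String(name.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        // "\u00e4" is the umlaut in "Märchenpark"; the escape keeps this
        // source file independent of the compiler's own encoding.
        String name = "M\u00e4rchenpark";
        System.out.println("UTF-8:    " + roundTrip(name, StandardCharsets.UTF_8));
        System.out.println("US-ASCII: " + roundTrip(name, StandardCharsets.US_ASCII));
    }
}
```

With UTF-8 the name survives unchanged. With US-ASCII the umlaut is replaced by "?", so the resulting name no longer matches the folder on disk.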

     
  • aventin

    aventin - 2012-05-05

    Hi

    A few questions :
    1) Is the encoding correct if you perform an incremental backup with the gui ?
    2) Could you post your target configuration ? (.bcfg file)
    3) Could you also post the manifest of the full backup and the manifest of the
    corrupted incremental backup ?

    thanks !

     
  • Stefan Willinger

    Hi,

    1) Is the encoding correct if you perform an incremental backup with the gui ?

    Yes, then UTF-8 is used

    2) Could you post your target configuration ? (.bcfg file)

    <?xml version="1.0" encoding="UTF-8"?>
    
    <target version="7" id="2" uid="634530762" follow_symlinks="false" register_empty_directories="true" follow_subdirs="true" xml_security_copy="true" name="Digicam gesamt" forward_preproc_errors="true" description="Alle Daten der Digicam">
    <source path="/home/XYZ/.shotwell"/>
    <source path="/mnt/Digicam"/>
    <medium  type="directory"  policy="hd" path="/media/Dasi/dasi/634530762" archive_name="%YYYY%%MM%%DD%" encrypted="false"  track_permissions="true" overwrite="false">
    <handler type="standard"/>
    <transaction_configuration use_transactions="true" transaction_size="51200"/>
    </medium>
    <filter_group logical_not="false" operator="and" >
    <extension_filter  logical_not="true">
    <ext>.tmp</ext>
    <ext>.temp</ext>
    </extension_filter>
    <locked_filter  logical_not="true"/>
    <directory_filter  logical_not="true" directory="/mnt/Digicam/.Trash-1000"/>
    </filter_group>
    </target>
    

    3) Could you also post the manifest of the full backup and the manifest of the
    corrupted incremental backup ?
    The full backup

    <?xml version="1.0" encoding="UTF-8"?>
    <manifest version="1" type="0" date="2012_01_01 10h36-14-738">
    <properties>
    <property key="Archive name" value="20120101" />
    <property key="Archive size" value="23428449800" />
    <property key="Backup duration" value="1 h 1 mn 4 s " />
    <property key="Build id" value="-1" />
    <property key="File encoding" value="UTF-8" />
    <property key="Filtered entries" value="0" />
    <property key="Operating system" value="Linux - 2.6.35-31-generic" />
    <property key="Option [backup scheme]" value="Full backup" />
    <property key="Scanned entries (files or directories)" value="10367" />
    <property key="Source path" value="/mnt/Digicam" />
    <property key="Stored files" value="9514" />
    <property key="Target ID" value="634530762" />
    <property key="Unfiltered directories" value="853" />
    <property key="Unfiltered files" value="9514" />
    <property key="Unmodified files (not stored)" value="0" />
    <property key="Version" value="7.2.3" />
    <property key="Version date" value="20. August 2011" />
    </properties>
    

    corrupted incremental backup

    <?xml version="1.0" encoding="UTF-8"?>
    <manifest version="1" type="0" date="2012_04_29 15h40-35-721">
    <properties>
    <property key="Archive name" value="20120429" />
    <property key="Archive size" value="1248792850" />
    <property key="Backup duration" value="40 s " />
    <property key="Build id" value="7464008258520651107" />
    <property key="File encoding" value="US-ASCII" />
    <property key="Filtered entries" value="0" />
    <property key="Operating system" value="Linux - 2.6.35-32-generic" />
    <property key="Option [backup scheme]" value="Incremental backup" />
    <property key="Scanned entries (files or directories)" value="9235" />
    <property key="Source path" value="/mnt/Digicam" />
    <property key="Stored files" value="200" />
    <property key="Target ID" value="634530762" />
    <property key="Unfiltered directories" value="819" />
    <property key="Unfiltered files" value="8416" />
    <property key="Unmodified files (not stored)" value="8216" />
    <property key="Version" value="7.2.4" />
    <property key="Version date" value="January 18, 2012" />
    </properties>
    

    Manifest of the correct incremental backup, done by a GUI session

    <?xml version="1.0" encoding="UTF-8"?>
    <manifest version="1" type="0" date="2012_05_06 20h56-26-139">
    <properties>
    <property key="Archive name" value="20120506_1" />
    <property key="Archive size" value="217836475" />
    <property key="Backup duration" value="54 s " />
    <property key="Build id" value="7464008258520651107" />
    <property key="File encoding" value="UTF-8" />
    <property key="Filtered entries" value="2" />
    <property key="Operating system" value="Linux - 3.2.0-24-generic" />
    <property key="Option [backup scheme]" value="Incremental backup" />
    <property key="Scanned entries (files or directories)" value="28080" />
    <property key="Source path" value="/" />
    <property key="Stored files" value="218" />
    <property key="Target ID" value="634530762" />
    <property key="Unfiltered directories" value="904" />
    <property key="Unfiltered files" value="27174" />
    <property key="Unmodified files (not stored)" value="26956" />
    <property key="Version" value="7.2.4" />
    <property key="Version date" value="January 18, 2012" />
    </properties>
    
     
  • aventin

    aventin - 2012-05-06

    Thanks for your answer

    If you are using Sun Microsystems' JRE, you can work around the problem by
    adding the following option to the last line of the "areca_run.sh" file:
    -Dfile.encoding=UTF-8
    ... which would lead to something like this:

    "${JAVA_PROGRAM_DIR}java" -Dfile.encoding=UTF-8 -Xmx256m -Xms64m -cp "${CLASSPATH}" -Duser.dir="${PROGRAM_DIR}" -Djava.library.path="${LIBRARY_PATH}" -Djava.system.class.loader=com.application.areca.impl.tools.ArecaClassLoader $1 "$2" "$3" "$4" "$5" "$6" "$7" "$8" "$9" "${10}" "${11}" "${12}"
    

    I'll try to find a way to ensure that the encoding used by the JVM is the same
    as that of the underlying system.

    Best regards

     
  • aventin

    aventin - 2012-05-06

    Could you please do the following test : launch areca.sh from a shell window,
    go to the "about Areca" window, go to the "system" tab and post the values of
    the following properties :
    - sun.jnu.encoding
    - file.encoding
    without adding the -Dfile.encoding=UTF-8 option mentioned above

    Comparing those properties could be a way to detect potential issues and warn
    the user
    Thanks !
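
The comparison suggested above could look roughly like the following Java sketch. It is not Areca's code: the class and method names are hypothetical, and it assumes that both property values are charset names known to the JVM.

```java
import java.nio.charset.Charset;

public class EncodingCheck {
    // Resolve aliases (e.g. "UTF8" vs "UTF-8") to the JVM's canonical name.
    static String canonical(String name) {
        return Charset.forName(name).name();
    }

    // Returns a warning text, or null if the two encodings agree.
    static String warningFor(String jnuEncoding, String fileEncoding) {
        if (jnuEncoding == null || fileEncoding == null) {
            return "encoding properties are not available on this JVM";
        }
        String jnu = canonical(jnuEncoding);
        String file = canonical(fileEncoding);
        if (jnu.equals(file)) {
            return null;
        }
        return "sun.jnu.encoding (" + jnu + ") differs from file.encoding (" + file + ")";
    }

    public static void main(String[] args) {
        String warning = warningFor(System.getProperty("sun.jnu.encoding"),
                                    System.getProperty("file.encoding"));
        System.out.println(warning == null ? "Encodings agree" : "WARNING: " + warning);
    }
}
```

Note that such a check only catches a mismatch between the two properties. It reports nothing when both are set to the same unexpected encoding.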

     
  • aventin

    aventin - 2012-05-06

    (It is important that you launch it from a shell window, not from a shortcut
    or the like.)

     
  • Stefan Willinger

    Running areca.sh as root from a shell - the values of the system properties:

    sun.jnu.encoding : UTF-8
    file.encoding : UTF-8
    
     
  • Stefan Willinger

    I have done a small additional test.
    I wrote a simple Java program that writes both system properties to a
    temporary file.
    When I run this program as a user or as root, both properties are UTF-8.
    But when it runs via anacron:

    sun.jnu.encoding : ANSI_X3.4-1968
    file.encoding : ANSI_X3.4-1968
    

    According to Wikipedia, ANSI_X3.4-1968 is the canonical name of US-ASCII.
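
A test program of the kind described above might look like the following sketch. It is a reconstruction under assumptions, not the program actually used: the class name, the temporary file prefix and the output layout are hypothetical. The `canonicalName` helper additionally asks the JVM which charset a reported name maps to.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

public class EncodingDump {
    // Format one system property as "key : value".
    static String line(String key) {
        return key + " : " + System.getProperty(key);
    }

    // The two properties discussed in this thread.
    static List<String> report() {
        return Arrays.asList(line("sun.jnu.encoding"), line("file.encoding"));
    }

    // Map a charset name or alias to the JVM's canonical name.
    static String canonicalName(String charsetName) {
        return Charset.forName(charsetName).name();
    }

    public static void main(String[] args) throws IOException {
        // Write to a file, because a job started by cron or anacron
        // has no terminal to print to.
        Path out = Files.createTempFile("encoding-", ".txt");
        Files.write(out, report(), StandardCharsets.UTF_8);
        System.out.println("Wrote " + out);
    }
}
```

Calling `canonicalName("ANSI_X3.4-1968")` returns `US-ASCII`, which matches the "File encoding" value in the manifest of the incremental backup.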

     
  • aventin

    aventin - 2012-05-10

    OK, so it is probably a problem with the default encoding configuration when
    running scheduled tasks ... perhaps this post might help:
    http://sourceforge.net/projects/areca/forums/forum/587586/topic/3715997

    By the way: did setting the encoding directly in Areca's script, as suggested
    in one of my previous posts, help?

    thanks

     
  • Stefan Willinger

    Thanks for the help, and excuse the slow response :-)
    In the meantime I had migrated to Ubuntu 12.04, so I first had to reproduce
    the problem.
    Specifying UTF-8 in areca_run.sh solves the issue, as long as I do not
    update Areca.

    And yes, the linked post seems to describe the same issue. I will check it at
    the weekend.

    Maybe it would be a good idea if the backup configuration could specify the
    encoding.

    Thanks
    Stefan

     
