Menu

xls File validation failure

wxsnail
2017-02-17
2023-10-27
1 2 > >> (Page 1 of 2)
  • wxsnail

    wxsnail - 2017-02-17

    Hi friends:
    1. I creat one worksheet in one xls file,
    2. insert 32833 rows in two columns,
    but, when I open xls file, there are exactly 32833 rows of data but failed to open,
    Excel says "Office has detected a problem with this file. Editing it may harm your computer. Click for more details"
    via support.office.com

    File validation failure When you see a message in Protected View that says Office has detected a problem with this file. Editing it may harm your computer. Click for more details., the file didn’t pass file validation.

    32832 rows is OK, but 32833 or more rows will be failed to open

    and more:
    when I add more worksheet in this file after step 2,
    worksheet can be successful add in, but I can not insert any data in this new sheet
    that 's a big question.

    Can you please help me why?
    Is this a bug or limit of xlslib or my mistake
    Thanks!

    here is my test code:
    <Ubuntu 16.04_i386="" xlslib-package-2.5.0="">

    ///-----------------------------------------------//
    include <stdio.h>
    include <stdlib.h>
    include <stdbool.h>
    include <xlslib xlslib.h="">

    int main(int argc, char argv[])
    {
    int i = 0;
    workbook
    wb;
    worksheet *ws;
    int max = strtoul(argv[1], NULL, 0);

    wb = xlsNewWorkbook();
    ws = xlsWorkbookSheet(wb, "first Sheet");
    
    for (i = 0; i < max; i++) {
        xlsWorksheetLabel(ws, i, 0, "Sheet1_test", NULL);//if max > 32832, Excel will report "File validation failure"
    }
    
    //if max > 32832 here can not be insert any more
    ws = xlsWorkbookSheet(wb, "Second Sheet");
    xlsWorksheetLabel(ws, 0, 0, "Sheet2_test", NULL);
    
    int err = xlsWorkbookDump(wb, "test.xls");
    
    xlsDeleteWorkbook(wb);
    
    if (err != 0) {
        fprintf(stderr, "Dump xls file failed: I/O failure %d.\n", err);
        return -1;
    }
    
    return 0;
    

    }

     
    • David Hoerl

      David Hoerl - 2017-02-18

      People use this all the time with 64000 rows and multiple columns.
      Please add either an assert of a printf after you set max. For sure if
      you do not add at least one cell the program fails (we know this, never
      fixed it). Note that 32832 is the 0x7FFF and 32833 is 0x8000 - if there
      is code somewhere that is making that number sign extend to a negative
      number you will fail. Also, it could be in your compiler or in the way
      the library is build. I'm afraid you will need to add some logging to
      find out exactly where 32833 is getting turned into a negative number.

      Hopefuly you will find that "int max = strtoul(argv[1], NULL, 0);" is
      returning a negative number for your argument, since that would explain
      everything. You might want to make that line read "unsigned int max".

      Good luck

      On 2/17/17 4:07 AM, wxsnail wrote:

      Hi friends:
      1. I creat one worksheet in one xls file,
      2. insert 32833 rows in two columns,
      but, when I open xls file, there are exactly 32833 rows of data but
      failed to open,
      Excel says "Office has detected a problem with this file. Editing it
      may harm your computer. Click for more details
      "
      via support.office.com
      https://support.office.com/en-us/article/What-is-Protected-View-d6f09ac7-e6b9-4495-8e43-2bbcdbcb6653

      /File validation failure When you see a message in Protected View that
      says Office has detected a problem with this file. Editing it may harm
      your computer. Click for more details., the file didn’t pass file
      validation./

      32832 rows is OK, but 32833 or more rows will be failed to open

      and more:
      when I add more worksheet in this file after step 2,
      worksheet can be successful add in, but I can not insert any data in
      this new sheet

      that 's a big question.

      Can you please help me why?
      Is this a bug or limit of xlslib or my mistake
      Thanks!

      here is my test code:
      <Ubuntu 16.04_i386="" xlslib-package-2.5.0="">

      ///-----------------------------------------------//
      include <stdio.h>
      include <stdlib.h>
      include <stdbool.h>
      include <xlslib xlslib.h="">

      int main(int argc, char /argv[])
      {
      int i = 0;
      workbook /wb;
      worksheet *ws;
      int max = strtoul(argv[1], NULL, 0);

      wb = xlsNewWorkbook();
      ws = xlsWorkbookSheet(wb, "first Sheet");

      for (i = 0; i < max; i++) {
      xlsWorksheetLabel(ws, i, 0, "Sheet1_test", NULL);//if max > 32832, Excel will report "File validation failure"
      }

      //if max > 32832 here can not be insert any more
      ws = xlsWorkbookSheet(wb, "Second Sheet");
      xlsWorksheetLabel(ws, 0, 0, "Sheet2_test", NULL);

      int err = xlsWorkbookDump(wb, "test.xls");

      xlsDeleteWorkbook(wb);

      if (err != 0) {
      fprintf(stderr, "Dump xls file failed: I/O failure %d.\n", err);
      return -1;
      }

      return 0;

      }


      xls File validation failure
      https://sourceforge.net/p/xlslib/discussion/402379/thread/1bf8608a/?limit=25#b030


      Sent from sourceforge.net because you indicated interest in
      https://sourceforge.net/p/xlslib/discussion/402379/

      To unsubscribe from further messages, please visit
      https://sourceforge.net/auth/subscriptions/

       
  • wxsnail

    wxsnail - 2017-02-20

    Thank you for the relpy,
    First 32832=0x8040 32833=0x8041;
    Second I don't think it's a signed / unsigned problem

    here is what I do:
    The file validation is still failure, even if I modify the test code like this :

    //-----------------------------------------------------------//
    unsigned int i = 0;
    while(i < 32832)
    {
    fprintf(stderr, "%d\n", i);
    xlsWorksheetLabel(ws, i++, 0, "Sheet1_test", NULL);
    }

    //until now every thing work fine if you comment next line
    //but when you uncomment next line, File validation will be failed.

    xlsWorksheetLabel(ws, 32832, 0, "Failed Line", NULL);
    //-----------------------------------------------------------//

    and what I need to explan is
    this "Failed Line" still can write into this sheet, only when you open xls file, Excel will show you an alarm report

    Actually this alarm doesn't disturb write dates in one worksheet, it's just an alarm, you can just ignore.
    but when you add another worksheet in this file,
    no more date can be wrote into Sheet2
    that's the issues I care.

    I still need your help, Thank you very much.

    function declaration of "xlsWorksheetLabel" is:
    EXTERN_TYPE cell_t xlsWorksheetLabel(worksheet w, unsigned32_t row, unsigned32_t col, const char strlabel, xf_t pxformat);

     

    Last edit: wxsnail 2017-02-20
  • David Hoerl

    David Hoerl - 2017-02-21

    The complaint about file corruption is probably due to xlslib not correctly writing the file. Can you try yet another line:

    xlsWorksheetLabel(ws, 32833, 0, "Failed Line", NULL);

    To see if its a threshold issue (size > ) or a problem that occurs on a boundary (size ==) - we had one such problem with size == several years ago.

    I'll try to replicate but its going to be a few days - traveling.

     
  • wxsnail

    wxsnail - 2017-02-22

    I tried
    xlsWorksheetLabel(ws, 32833, 0, "Failed Line", NULL);
    &
    xlsWorksheetLabel(ws, 32840, 0, "Failed Line", NULL);
    both of them still caused file corruption.

    May be its a threshold issue (size > ) as your explain.

    Thanks.

     
  • David Hoerl

    David Hoerl - 2017-02-22

    Again, we have tested this thing with 255 columns and 64K cell items. Can you try numbers? Since you are using so many strings, it might overflow some other table - try numbers instead or a mix. It would just help me narrow it down if you can exclude/include some combinations of values and sizes. I'll assume in the past we used more numbers than strings.

     
  • wxsnail

    wxsnail - 2017-02-23

    Same issues
    I modify xlsWorksheetNumberInt instead of xlsWorksheetLabel
    write every cell number 1
    modified it like this but the same issues:

    I don't know how you tested this thing with 255 columns and 64K cell
    could you show me your test as an example?

    I have made this test code as sample as possable, to remove any interference factor.
    but failure is still there

    Thanks again
    //////////////////////////////////////////////////////////////////////
    #include <stdio.h>
    #include <stdlib.h>
    #include <stdbool.h>
    #include <xlslib/xlslib.h>

    int main(int argc, char *argv[])
    {
    workbook *wb = xlsNewWorkbook();
    worksheet *ws = xlsWorkbookSheet(wb, "first Sheet");

    unsigned int i = 0;
    while(i < 32832)
    {
    xlsWorksheetNumberInt(ws, i++, 0, 1, NULL);
    //xlsWorksheetLabel(ws, i++, 0, "Sheet1_test", NULL);
    }

    //until now every thing work fine if you comment next line
    //but when you uncomment next line, File validation will be failed.

    xlsWorksheetNumberInt(ws, 32840, 0, 1, NULL);
    //xlsWorksheetLabel(ws, 32833, 0, "Failed Line", NULL);

    int err = xlsWorkbookDump(wb, "Test.xls");

    xlsDeleteWorkbook(wb);

    if (err != 0)
    {
    fprintf(stderr, "Dump xls file failed: I/O failure %d.\n", err);
    return -1;
    }

    return 0;
    }
    ///////////////////////////////////////////////////////////////////////////////

     
  • David Hoerl

    David Hoerl - 2017-03-01

    So testing using C++, everything works just fine - no problems with either numbers or labels as you used. I tried 0 to 32840. That said, I'm using the C++ interface. I'm going to have to dig around and find some test harness for the C interface.

     
  • wxsnail

    wxsnail - 2017-03-02

    Unfortunately, C++ test code with xlslib has the same issues.....
    both numbers and labels are the same
    Could you show me your test code as an example?

    This is my test code:

    g++ -g3 Test.cpp -o test -lxls -I/usr/include/xlslib
    /////////////////////////////////////////////////////////////////////////

    #include <stdio.h>
    #include "xlslib/xlslib.h"

    using namespace std;
    using namespace xlslib_core;

    int main(int argc, char *argv[])
    {
    unsigned int i = 0;
    workbook wb;
    worksheet *sh = wb.sheet("first Sheet");

    while (i < 32832) {
    // sh->number(i++, 0, 1.0);
    sh->label(i++,0,"Sheet1_test");
    }

    //until now every thing work fine if you comment next line
    //but when you uncomment next line, File validation will be failed.

    // sh->number(32832, 0, 1.0);
    sh->label(i++,0,"Sheet1_test");

    int err = wb.Dump("Test.xls");
    if (err != 0) {
    fprintf(stderr, "Dump xls file failed: I/O failure %d.\n", err);
    return -1;
    }
    return 0;
    }
    //////////////////////////////////////////////////////////////////////////////////////////

     

    Last edit: wxsnail 2017-03-02
  • David Hoerl

    David Hoerl - 2017-03-02

    It would appear your library wasn't properly built. I have a tool (libxls) that decodes a .xls file. I can try running it through that and see what happens - send me the file with numbers it will be easier for me to test.

    My code is:

    {
        workbook wb;
        worksheet *sh = wb.sheet("Test1");
    
        for(unsigned int row=0; row<32840; ++row) {
            for(unsigned int col=0; col<1; ++col) {
                sh->label(row, col, "Sheet1_test");
                //sh->number(row, col, (float)row*(float)(col+1));
            }
        }
        wb.Dump("/Volumes/Data/Users/dhoerl/Public/Corrupt2.xls");
        fprintf(stderr, "Dump End Bug \n");
        return(0);
    }
    

    My test setup is on a Mac using Xcode/Clang to build the library. I don't have a current version so have to test with an older one on another computer - thus the file is saved to a file I can mount from that machine.

     
  • David Hoerl

    David Hoerl - 2017-03-02

    Another way for me to test is to open the file with the Apple "Numbers" program - it complains bitterly if the file is in any was corrupt and then will refuse to open the file. So its less forgiving than Excel itself. And its opening the files just fine.

     
  • wxsnail

    wxsnail - 2017-03-03

    All right, forget this Excel warning.
    actually I'm no care about this, I can just ignore.

    what I want to say and care about is:

    When this warning happens in the "first Sheet" (Ignore warning)
    Add another worksheet ("Second Sheet") in this file, (this step is OK)
    Add whatever lables/numbers into "Second Sheet" (this step is NOT OK, none of them can be wrote into "Second Sheet")

    that means when I open a xls file, there are two sheets "first Sheet" and "Second Sheet"
    "first Sheet" have more than 32832 rows
    "Second Sheet" have no datas

    That's the issues I care.

     
  • wxsnail

    wxsnail - 2017-03-03

    Oh, Damn it, I know why, It's Excel

    Same xls file,
    Excel open it with warning and no datas in "2nd Sheet".
    but, It's OK in LibreOffice in Ubuntu

    I figure out why your Apple is OK too
    I'm crazy about this. @_@

    Why?......
    Is that means xlslib will no longer Compatible with Excel ?

     

    Last edit: wxsnail 2017-03-03
    • David Hoerl

      David Hoerl - 2017-03-04

      As I said earlier, my 2010 version of Excel is opening the files, as are
      Pages. Please upload your problem file so I can try it on my Mac with
      both old Excel and Pages.

      The process of creating the .xls file is EXTREMELY complicated. We used
      the Open Office spec on Excel. If even one bit is out of place real
      Excel may get unhappy about the file. When Open Office did their spec,
      there was no real MS document on the format. Later MS did produce a
      REALLY long spec which is almost impossible to read.

      There really isn't anything I can do to help you. Without a reproducible
      problem what could I do?

      There are open source projects to create an Excel file in I recall PHP,
      and newer ones to create that new XML formatted file. You could try
      those. Also, someone sells a library to do xls - unsure of how it would
      work for you thought.

       
  • wxsnail

    wxsnail - 2017-03-06

    Yes Thank you, the attachment is my problem file

    it with 32833 row of number one in "1st Sheet"
    and with 3 column of number one in "2st Sheet"

    I can see the "2st Sheet" with no data base on Excel (Office 365 ProPlus)
    BUT
    I can see the "2st Sheet" with all data (3 column of number one) base on LibreOffice Calc (4.2.8.2

    May be it's OK on your Apple.

     
  • David Hoerl

    David Hoerl - 2017-03-06

    It opens on my old version, there is a sheet 2, but nothing is in it. No complaint but hey that Excel is quite old. Numbers on the Mac won't open it. Please upload exact code you used to build it.

     
  • wxsnail

    wxsnail - 2017-03-07

    "Test.cpp" is exact code that I used to build "32833_int_Err.xls"

    "Test.c" is a C code which can cause the same issues

     
  • wxsnail

    wxsnail - 2017-03-14

    Hi David, do you have any idea about this issue?

    as you can see in the code, we certainly wrote data in "2st Sheet"
    but they are invisible base on Excel, and they are visible base on LibreOffice or Mac
    all this are because of file validation failure cause by too many data(>32832) in "1st Sheet"

    I don't know why and how to fix this issue.
    do you have any idea about this?

    I'm looking forward for you reply.
    Thank you very much

    Test.cpp
    This is exact code that I used to build "32833_int_Err.xls"

    #include <stdio.h>
    #include "xlslib/xlslib.h"
    
    using namespace std;
    using namespace xlslib_core;
    
    int main(int argc, char *argv[])
    {
        unsigned int i = 0;
        workbook wb;
        worksheet *sh1 = wb.sheet("1st Sheet");
        worksheet *sh2 = wb.sheet("2nd Sheet");
    
        while (i < 32832) {
            sh1->number(i++, 0, 1.0);
        }
    
    //until now every thing work fine if you comment next line
    //but when you uncomment next line, File validation will be failed.
        sh1->number(32832, 0, 1.0);
    
    //writ data in "2nd Sheet"
        sh2->number(0, 0, 1.0);
        sh2->number(0, 1, 1.0);
        sh2->number(0, 2, 1.0);
    //If File validation failed, all these three numbers in 2nd Sheet will be invisible base on Excel
    //But they are visible base on LibreOffice or Mac
    //That's the issue I care about
    
        int err = wb.Dump("Test.xls");
        if (err != 0) {
            fprintf(stderr, "Dump xls file failed: I/O failure %d.\n", err);
            return -1;
        }
    
        return 0;
    }
    

    Test.c

    #include <stdio.h>
    #include <stdlib.h>
    #include <stdbool.h>
    #include <xlslib/xlslib.h>
    
    int main(int argc, char *argv[])
    {
        workbook *wb = xlsNewWorkbook();
        worksheet *ws1 = xlsWorkbookSheet(wb, "1st Sheet");
        worksheet *ws2 = xlsWorkbookSheet(wb, "2nd Sheet");
    
        unsigned int i = 0;
        while(i < 32832)
        {
            xlsWorksheetNumberInt(ws1, i++, 0, 1, NULL);
        }
    
    //until now every thing work fine if you comment next line
    //but when you uncomment next line, File validation will be failed.
        xlsWorksheetNumberInt(ws1, 32832, 0, 1, NULL);
    
    //writ data in "2nd Sheet"
        xlsWorksheetNumberInt(ws2, 0, 0, 1, NULL);
        xlsWorksheetNumberInt(ws2, 0, 1, 1, NULL);
        xlsWorksheetNumberInt(ws2, 0, 2, 1, NULL);
    //If File validation failed, all these three numbers in 2nd Sheet will be invisible base on Excel
    //But they are visible base on LibreOffice or Mac
    //That's the issue I care about
    
        int err = xlsWorkbookDump(wb, "Test.xls");
    
        xlsDeleteWorkbook(wb);
    
        if (err != 0)
        {
            fprintf(stderr, "Dump xls file failed: I/O failure %d.\n", err);
            return -1;
        }
    
        return 0;
    }
    
     
  • David Hoerl

    David Hoerl - 2017-03-17

    I was traveling last week overseas - I'll look at the file you produced but I'm not very optimistic I can find the root problem.

     
  • David Hoerl

    David Hoerl - 2017-03-19

    Well, I dumped your file and can see a problem - a missing record - on the problem row. I'm going to guess that this is a bug in your compiler/optimizer since clang/llvm (and all other compilers I've used in the past does create a good file.)

    I'd like you to run one more test - instead of creating rows that start at 0 and going up to the problem point:

    for(unsigned int row=0; row<32840; ++row)

    change the for loop to start just below the problem, then see if Excel will open the file wihtout complaint or not. Knowing if the is the row index or the total numbers of cells will help. Also, you could just try starting at say row 19 - some odd number - and see if the row index for the problem shifts up by that number, or stays where it is now.

    Hopefully it will turn out that just having a dozen cells or so around the problematic row number will force the error. If so please send me the corrup file and the C++ source that you use to create it (the loop is good enough).

    You only need to have one sheet - I don't think the second sheet affects the problem.

    I'll post again with some helpful print statements you can add to sheetrec.cpp when I figure what to print. If you can get a small number of cells to produce the error debugging is going to get a lot easier.

     
  • David Hoerl

    David Hoerl - 2017-03-19

    Now not sure sure if that is really the issue. Try this for cell generation:

    {
        workbook wb;
        worksheet *sh = wb.sheet("Test1");
    
        for(unsigned int row=32820; row<32840; ++row) { // 1750 OK 1800 bad
            for(unsigned int col=0; col<1; ++col) {
                //sh->label(row, col, "Sheet1_test");
                sh->number(row, col, (float)row*(float)(col+1));
            }
        }
        wb.Dump("/Volumes/Data/Users/dhoerl/Public/Thresh.xls");
        fprintf(stderr, "Dump End Bug \n");
        return(0);
    }
    

    The file I created for that is attached - see if you can open it. If you can, and you cannot open your generation of that, attach the xls file to your response.

     
  • wxsnail

    wxsnail - 2017-03-22

    1.I can open your "Thresh.xls" without "file validation failure" warning from Excel.

    2.I use your code, use the xls dynamic-link library and "g++" compiler on my machine.
    running the code creat a "Thresh.xls". this file is the same with your "Thresh.xls" attached ,
    there is NO "file validation failure" warning from Excel

    1. I change the "for" loop to start, and I found that:
      if I start write from row 101, It'll fail > 32932 (32832+100)
      if I start write from row 201, It'll fail > 33032 (32832+200)
      if I start wrtie from row 1001, It'll fail > 33832 (32832+1000)

    As I had tested before, the problem is the total numbers of cells (<=32832 is OK)
    whatever which row of data, total row numbers of record >32832 will be failed
    Whatever the blank cell is on the top or middle of row

    #include <stdio.h>
    #include "xlslib/xlslib.h"
    
    using namespace std;
    using namespace xlslib_core;
    
    int main(int argc, char *argv[])
    {
       workbook wb;
       worksheet *sh = wb.sheet("Test1");
    
       unsigned int start = 100;
       unsigned int total = 32833;//<=32832 is OK; >32832 will file validation failure
       unsigned int end   = start + total;
    
       for (unsigned int row = start; row < end; row++) {
          sh->number(row, 0, 123.0);
       }
    
       wb.Dump("Thresh.xls");
       fprintf(stderr, "Dump from %u to %u, total:%u End Bug \n", start, total, end);
       return (0);
    }
    
     

    Last edit: wxsnail 2017-03-22
  • David Hoerl

    David Hoerl - 2017-03-22

    In hex, the 32832 number is 8000x + 40x. Probably there are some records xls spits out every 64th (dec) - some kind of row/block record - and that's what internally is going negative, and thus xlslib stops outputting the record.

    I use libxls to dump these files. But you can imagine having to "more" through 32K records to find the missing one isn't an easy job.

     
  • David Hoerl

    David Hoerl - 2017-03-22

    In any case, its quite useful to know the problem is the absolute number of cells and not the actual row number.

    Can you please try one more test - do 32832/2 cells in one group starting from 0, then skip 100 rows, then do the second 32832/2 cells - see if the problem happens or not. Then try adding one more to the second group.

    This information will better help me figure out where to look.

     
  • wxsnail

    wxsnail - 2017-03-24
    1. No matter how many groups in one column, as long as the total number of data in one column greater than 32832, the file validation will be failed.

    2. No matter how many column in the sheet, as long as the total number of data in one column greater than 32832, the file validation will be failed.

    I did a instrenting test:
    1.Creates a random row index array with a total of no duplicates 32833 elements from 1~65536(Max. Rows of ".xls" file is 65536).

    2.Fill all these cells with nunber or string in one or more columns

    3.as long as the total number of data in one column greater than 32832, the file validation will be failed.

    The attatched files is
    3col * Random 32832row is OK
    3col * Random 32833row is Error

    Source code

    #include <stdio.h>
    #include "xlslib/xlslib.h"
    
    using namespace std;
    using namespace xlslib_core;
    
    //fun: Creates a random index array with a total of no duplicates n elements from 1~m
    void Random_m_n(int m, int n, int *arry)
    {
       srand((int)time(0));
       int *data_m = new int[m];
    
       for (int i = 0; i < m; i++) {
          data_m[i] = i + 1;
       }
    
       while (n--) {
          int i = rand() % m;
          arry[n] = data_m[i];
          data_m[i] = data_m[m];
          m--;
       }
    }
    
    int main(int argc, char *argv[])
    {
       workbook wb;
       worksheet *sh = wb.sheet("Test1");
    
       unsigned int row_total = 32833;  //<=32832 is OK; >32832 will file validation failure
       unsigned int col_total = 3;      //whatever how many columns in this sheet.
    
       int *randRow = (int *)malloc(row_total * sizeof(int));
       Random_m_n(65536, row_total, randRow);   //Max. Rows of ".xls" file is 65536
    
       for (unsigned int iRow = 0; iRow < row_total; iRow++) {
          for (unsigned int iCol = 0; iCol < col_total; iCol++) {
             sh->number(randRow[iRow], iCol, 1.0);
          }
       }
    
       wb.Dump("Thresh.xls");
       return (0);
    }
    
     
1 2 > >> (Page 1 of 2)

Log in to post a comment.