#48 File too big, open block by block

Status: open
Owner: nobody
Labels: None
Priority: 1
Updated: 2019-02-06
Created: 2018-11-12
Creator: Raph Schim
Private: No

Hi!
First of all, sorry: I don't know which priority I should set for this question.

I'm working with really big point clouds (currently ~50 GB, about 2 billion points, and they can grow bigger).
From what I can see, I have to reserve space in my vectors before calling CSetUpData3DPointsData(...). The problem is that my point cloud has so many points that I don't have enough memory to reserve that space, and I always end up with a bad_alloc.
My question is: is there a way to get the points block by block (for example total_size / 4 at a time), work on those points (apply a filter or something similar), save them to a file, and then take the next total_size / 4 points and do the same? That way I would be able to read a file of any size.
So far I can't find a way to do this.

Thanks a lot for reading, and I'm ready to try any ideas you come up with.
Have a great day!
Best regards!

Raphaël Schim


Discussion

  • Raph Schim

    Raph Schim - 2019-02-06

    Hi! I'm coming back here because I thought this was resolved, but it appears it's not.
    Say I have a file with 1 million points and I want to read it in blocks of 250k points, so that I only have to reserve 250k elements (instead of 1 million) for each of xData, yData, zData, isInvalidData, intData, redData, greenData and blueData. (This is a simplified case; my real files are bigger, and reserving memory for all the points crashes my computer.)
    So I created all the necessary vectors, resized them to 250k, and then tried this code:

    for (int numDiv = 0; numDiv < 4; ++numDiv) {
        // The reader is set up again on every pass of this loop.
        e57::CompressedVectorReader dataReader = eReader.SetUpData3DPointsData(
            0,
            sizeChunks,
            read.xData_r.data(),
            read.yData_r.data(),
            read.zData_r.data(),
            read.isInvalidData_r.data(),
            read.intData_r.data(),
            NULL,
            read.redData_r.data(),
            read.greenData_r.data(),
            read.blueData_r.data(),
            NULL,
            NULL,
            NULL,
            NULL,
            NULL,
            read.rowIndex_r.data(),
            read.columnIndex_r.data()
        );

        unsigned size = 0;
        size_t col = 0;
        size_t row = 0;
        int64_t count = 0;
        pcl::PointCloud<MyPoint>::Ptr pts(new pcl::PointCloud<MyPoint>);
        pts->points.resize(sizeChunks);
        pts->width = pts->points.size();
        pts->height = 1;

        // read() returns the number of points written into the buffers.
        while ((size = dataReader.read()) > 0) {
            for (unsigned i = 0; i < size; ++i) {

                if (read.columnIndex_r.data())
                    col = read.columnIndex_r[i];
                else
                    col = 0;

                if (read.rowIndex_r.data())
                    row = read.rowIndex_r[i];
                else
                    row = count;

                // Debug output: first three points of each read.
                if (i < 3) {
                    std::cout << read.xData_r[i] << " " << read.yData_r[i] << " " << read.zData_r[i] << "\n";
                }

                MyPoint& pt = pts->points[i];
                // Copy the coordinates unless the point is flagged invalid.
                if (!read.bInvalidState || read.isInvalidData_r[i] == 0) {
                    pt.x = read.xData_r[i];
                    pt.y = read.yData_r[i];
                    pt.z = read.zData_r[i];
                }

                if (read.bIntens) {         // Normalize intensity to 0 - 1.
                    pt.intensity = (read.intData_r[i] - read.intOffset) / read.intRange;
                }

                if (read.bColor) {          // Normalize color to 0 - 255.
                    int red = ((read.redData_r[i] - read.colorRedOffset) * 255) / read.colorRedRange;
                    int green = ((read.greenData_r[i] - read.colorGreenOffset) * 255) / read.colorGreenRange;
                    int blue = ((read.blueData_r[i] - read.colorBlueOffset) * 255) / read.colorBlueRange;
                    pt.r = red;
                    pt.g = green;
                    pt.b = blue;
                }

                count++;
            }
        }
        dataReader.close();
    }
    

    The problem is that this code always gives me the first 250k points of the file.
    In the first iteration I get points [0 to 250k], in the second I again get points [0 to 250k], and so on. I can't get points [250k to 500k], because the SetUpData3DPointsData call doesn't change and always starts at the same place.

    Is there a way to take the next 250k points, so that after the 4 iterations of my for loop I have read all the points of the big file?

     
    • Stan Coleby

      Stan Coleby - 2019-02-06

      Put the eReader.SetUpData3DPointsData call outside the loop. It
      should only be called once for each scan. And of course, don't close() the
      reader until you are done.
      The FoundationAPI was written by someone else, so I don't have a good feel
      for that area.
      The CompressedVectorReader::seek() function was never completed, which
      forces you to read from the beginning in order to reach the data at the
      end. It would have been nice to be able to seek() into the middle, but the
      author left the E57 committee for another job and lost interest in finishing
      this area. So the only way to access the data at random is to make each block
      its own scan.
      Hope that helps.
      Stan Coleby
      E57 Committee member.


      • Raph Schim

        Raph Schim - 2019-02-08

        Ah Ok! I see!

        Thanks a lot for all these details, they really helped me. Too bad the
        person left before finishing the seek function.

        Best regards!

        Raphaël S.

