
Failed plotting with R: "Simple error: cannot allocate vector of size 4.0 Gb"

Anonymous
2016-12-26
2016-12-27
  • Anonymous

    Anonymous - 2016-12-26

    Hello. First, thank you for KH Coder.
    I get an error when running Cluster Analysis of Documents, and I cannot fix it by configuring the R environment variables (R_MAX_MEM_SIZE, R_VSIZE).
    I have 7 GB of RAM. How can I fix the problem?
    Thank you for your help.

     
  • HIGUCHI Koichi

    HIGUCHI Koichi - 2016-12-26

    Hello,

    Well, it seems that I need some more info to answer your question.

    How many documents are you dealing with? And how many words do you use for that analysis?

    Also, open Task Manager and watch the memory usage while you run the analysis. Does R use 6 GB or 7 GB and then die? Or does it use only 2 GB or so?

     

    Last edit: HIGUCHI Koichi 2016-12-26
  • Anonymous

    Anonymous - 2016-12-26

    Thank you for your reply.
    I am using a single document (314,000 sentences; 6,400,000 tokens).
    I want to run Cluster Analysis of Documents with 1000 tags and 50 clusters (Ward method).
    In Task Manager, Rterm.exe stops when its memory consumption reaches between 6 and 7 GB.

     
  • HIGUCHI Koichi

    HIGUCHI Koichi - 2016-12-26

    Hello,

    It seems that R is actually running out of physical RAM. So, you have to (I) reduce the data size, (II) try a different clustering method, or (III) increase the physical RAM.

    (I) What matters here is the size of the data matrix sent to R.

    [a] Number of rows = number of documents
    [b] Number of columns = number of distinctive words that are used for the analysis

    About the value of [a]: if you select “Sentences” as the “Unit”, it will be 314,000. But if you select “Paragraphs” as the “Unit”, the value of [a] will decrease somewhat. Or you can perform random sampling and rebuild the data file to reduce this value.

    About the value of [b]: you can check it as “Number of selected words” on the cluster analysis option screen. You can reduce this value by increasing “Min. TF” or “Min. DF”.
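    As a back-of-the-envelope check, the figures in this thread can be plugged into R directly. This is only a sketch: it assumes 8-byte doubles and that Ward clustering goes through a full pairwise distance matrix (as R's standard `hclust` does); KH Coder's exact internal call is not shown here.

    ```r
    # Rough memory arithmetic using the numbers from this thread (8 bytes per double).
    rows <- 314000                               # documents (sentences) = value [a]
    cols <- 1000                                 # selected words / tags = value [b]
    dtm_gb  <- rows * cols * 8 / 2^30            # document-term matrix, in GB
    dist_gb <- rows * (rows - 1) / 2 * 8 / 2^30  # full distance matrix for Ward, in GB
    round(c(dtm = dtm_gb, dist = dist_gb), 1)
    ```

    The document-term matrix itself (about 2.3 GB) could fit in 7 GB of RAM, but the pairwise distance matrix needed for hierarchical clustering is hundreds of gigabytes, which is why reducing [a] matters far more than reducing [b] here.
    
    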

    (II) You can try CLARA as the clustering method. It stands for Clustering LARge Applications. You can choose it in the cluster analysis option screen.

    (III) R consumes a relatively large amount of RAM. You could add physical RAM to your system, but I don’t think it would help a lot here.
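    For reference, option (II) can also be tried directly in R: `clara()` from the recommended `cluster` package runs PAM on subsamples, so its memory use does not grow with the square of the number of documents. The matrix `m` and the parameters below are illustrative toy values, not KH Coder's internal call.

    ```r
    library(cluster)                               # clara() ships with the "cluster" package
    set.seed(1)
    m  <- matrix(rpois(2000 * 20, 2), nrow = 2000) # toy document-term matrix: 2000 docs, 20 terms
    cl <- clara(m, k = 5, samples = 10)            # CLARA: PAM on random subsamples
    table(cl$clustering)                           # cluster sizes
    ```
    
    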

     

    Last edit: HIGUCHI Koichi 2016-12-26
  • Anonymous

    Anonymous - 2016-12-27

    Hello. I am already working on a reduced random sample, so I cannot reduce it any further.
    Using CLARA has solved the memory problem :-).
    Now I have another problem, but that may deserve a new post.
    I use tags in several languages in order to build multi-language aggregates. The result drops Arabic, Japanese, Chinese, ... and I do not understand why.
    I will analyze the distribution of these words and check whether it is a frequency problem; otherwise I will open a separate post.
    Thanks a lot for your help.

     
