From: Victor M. <ma...@fh...> - 2019-12-13 18:59:59
Hi,

I'm making extensive use of the C++ interface, and the aggressive garbage collection in eclipse-clp is still giving me trouble. So as I understand it, any term goes BAD after it's been given to the eclipse engine once.

Now I already have infrastructure in place to deal with that, but in my control flow it's becoming a huge issue that I never really know whether a term has gone BAD or not. Is there any way to query whether a certain term (handle) has already been garbage collected?

Thanks in advance,
Victor

From: Joachim S. <jsc...@co...> - 2019-12-14 01:37:01
Hello Victor,

On 13/12/2019 18:46, Victor Mataré wrote:
> I'm making extensive use of the C++ interface, and the aggressive garbage
> collection in eclipse-clp is still giving me trouble. So as I understand it,
> any term goes BAD after it's been given to the eclipse engine once.
>
> Now I already have infrastructure in place to deal with that, but in my
> control flow it's becoming a huge issue that I never really know whether a
> term has gone BAD or not. Is there any way to query whether a certain term
> (handle) has already been garbage collected?

I'm quite confident that the C++ interface has the functionality to solve your problem in a clean way, no strange workarounds should be necessary.

From what you say, I suspect that you may not be using term references where they are needed.

To give some background: "terms" are always stored and managed by an ECLiPSe engine. When an engine runs, it may (a) attempt to garbage-collect terms, and (b) move terms as a side effect of garbage collection. For that reason, the engine must know about any references to terms that you are keeping in your own C++ data structures. The way to do that is to use the EC_ref/EC_refs class: terms assigned to an EC_ref/EC_refs will not be garbage collected, and the references will be correctly relocated when the term is moved in memory.

In contrast, the EC_word class must be considered volatile: the content of an EC_word becomes invalid once you pass control to the engine (the reason EC_words exist at all is that they have less overhead, and are good enough to hold subterms temporarily while constructing or deconstructing complex terms).

That's my best guess without knowing your code. It would also be helpful to know whether you observe this problem already in ECLiPSe 6.1, or whether it is new in 7.0.

Cheers,
Joachim

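The difference between a volatile handle and a registered reference can be illustrated with a toy model. The sketch below is not ECLiPSe code: `ToyEngine`, `make_term`, `make_ref`, `deref` and `resume` are invented names, and the "heap" is just a vector of strings. It assumes only what the message above states, namely that running the engine may discard unreferenced terms and move the surviving ones, and that registered references are updated when that happens.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy model only: none of these names belong to the ECLiPSe API.
class ToyEngine {
public:
    // Store a term and return its current slot number (a "raw" handle).
    std::size_t make_term(const std::string& text) {
        heap_.push_back(text);
        return heap_.size() - 1;
    }

    // Register a slot as a root and return a stable reference id.
    std::size_t make_ref(std::size_t slot) {
        roots_.push_back(slot);
        return roots_.size() - 1;
    }

    // Look a term up through a registered reference.
    const std::string& deref(std::size_t ref) const {
        return heap_.at(roots_.at(ref));
    }

    // "Run the engine": keep only rooted terms, compact them, and
    // relocate every root to the new slot of its term.
    void resume() {
        std::vector<std::string> compacted;
        for (std::size_t& slot : roots_) {
            compacted.push_back(heap_.at(slot));
            slot = compacted.size() - 1;
        }
        heap_ = std::move(compacted);
    }

    std::size_t live_terms() const { return heap_.size(); }

private:
    std::vector<std::string> heap_;   // term storage owned by the engine
    std::vector<std::size_t> roots_;  // slots the client asked to keep alive
};

// Usage sketch: the raw slot number is stale after resume(), the ref is not.
inline void toy_engine_demo() {
    ToyEngine engine;
    engine.make_term("foo(X)");                    // temporary, never registered
    std::size_t raw = engine.make_term("bar(Y)");  // raw slot number, here 1
    std::size_t ref = engine.make_ref(raw);
    engine.resume();
    assert(engine.live_terms() == 1);       // foo(X) was collected
    assert(engine.deref(ref) == "bar(Y)");  // the reference was relocated
    assert(raw == 1);                       // raw still says 1; bar(Y) now lives in slot 0
}
```

In this model the raw slot number plays the role of an EC_word and the reference id plays the role of an EC_ref; the real interface is of course more involved.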
From: Victor M. <ma...@fh...> - 2019-12-15 04:14:50
On Saturday, 14 December 2019 02:21:33 CET Joachim Schimpf wrote:
> Hello Victor,
>
> On 13/12/2019 18:46, Victor Mataré wrote:
> > I'm making extensive use of the C++ interface, and the aggressive garbage
> > collection in eclipse-clp is still giving me trouble. So as I understand
> > it, any term goes BAD after it's been given to the eclipse engine once.
> >
> > Now I already have infrastructure in place to deal with that, but in my
> > control flow it's becoming a huge issue that I never really know whether a
> > term has gone BAD or not. Is there any way to query whether a certain term
> > (handle) has already been garbage collected?
>
> I'm quite confident that the C++ interface has the functionality to solve
> your problem in a clean way, no strange workarounds should be necessary.
>
> From what you say, I suspect that you may not be using term references
> where they are needed.

It's not that. I don't need persistent term references because I'm not pulling information out of the eclipse engine. In the part I'm talking about, I'm basically just using compile_term/1 to "compile" a C++ object structure into a bunch of Prolog clauses. The problem is that some of those C++ objects represent variables that are referenced in other objects (i.e. they are not singleton variables).

So I can't simply re-initialize the variable terms whenever they're referenced because that would make them singletons. I've now resorted to initializing these variable terms once before building a term or clause that has variables, but that is error-prone because I didn't expect to have that problem when I designed the control flow.

Now that I think of it, some clearer documentation might have helped me in the beginning. There is just this rather vague half-sentence in the Embedding and Interfacing Manual:

"terms do not survive the execution of ECLiPSe"

None of these words are really clear. What terms are really affected and how, what does "survive" mean and what does "execution" mean?

I think the Embedding Manual should have at least a short section about memory management that clearly states what happens. In fact, it should have a big fat warning along these lines:

==============================================================================
Every EC_word that has been put into post_goal() is invalid after EC_resume().
That includes all pieces of complex terms. Putting an invalid EC_word into
another post_goal() leads to undefined behaviour, which might cause the
eclipse engine to corrupt data, crash immediately, crash randomly, or to never
crash.
==============================================================================

At least that is the behaviour I observed so far when I made mistakes with this. I assume that the undefined behaviour helps performance, but what would also be really useful for debugging these things is a switch that makes it crash immediately.

Sure, I can design my client application so that it cannot happen, but I failed to do that initially because there was neither a fat warning nor a fail-fast behaviour. It's really a huge pitfall if you're designing a larger application.

Hope I did get the problem across ;-)

Best regards & thanks for your helpful support,
Victor

PS: If you look at e.g. SWI-Prolog, they expose a GC frame object which makes their (also non-trivial) memory management explicit, controllable and therefore obvious. Lacking that, I believe one does need very clear documentation.

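The fail-fast behaviour asked for above can be approximated on the client side. The following is a minimal sketch, not a feature of the ECLiPSe interface: `Epoch` and `Checked` are hypothetical helper classes, and the sketch assumes the application treats a handle as stale as soon as control has been passed to the engine after the handle was created. The application itself would have to call `advance()` at every such point.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical client-side helper; not part of the ECLiPSe interface.
// Counts how often control has been passed to the engine.
class Epoch {
public:
    // Call this wherever the application passes control to the engine.
    void advance() { ++value_; }
    unsigned long value() const { return value_; }

private:
    unsigned long value_ = 0;
};

// Wraps any volatile handle type and remembers the epoch it was created in.
template <typename Handle>
class Checked {
public:
    Checked(const Epoch& epoch, Handle handle)
        : epoch_(&epoch), born_(epoch.value()), handle_(std::move(handle)) {}

    // True once the epoch has advanced since this handle was created.
    bool stale() const { return epoch_->value() != born_; }

    // Fail fast instead of silently handing out a stale handle.
    const Handle& get() const {
        if (stale()) {
            throw std::logic_error("handle used after control was passed to the engine");
        }
        return handle_;
    }

private:
    const Epoch* epoch_;
    unsigned long born_;
    Handle handle_;
};

// Usage sketch with a string standing in for a real term handle.
inline void checked_demo() {
    Epoch epoch;
    Checked<std::string> var(epoch, "X");
    assert(!var.stale());
    epoch.advance();  // e.g. right before the application resumes the engine
    assert(var.stale());
}
```

The wrapper answers the question "has this handle gone bad?" only as reliably as the application calls `advance()`; it detects nothing inside the engine.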
From: Joachim S. <jsc...@co...> - 2019-12-16 14:07:37
On 15/12/2019 03:39, Victor Mataré wrote:
> On Saturday, 14 December 2019 02:21:33 CET Joachim Schimpf wrote:
>> Hello Victor,
>>
>> On 13/12/2019 18:46, Victor Mataré wrote:
>>> I'm making extensive use of the C++ interface, and the aggressive garbage
>>> collection in eclipse-clp is still giving me trouble. So as I understand
>>> it, any term goes BAD after it's been given to the eclipse engine once.
>>>
>>> Now I already have infrastructure in place to deal with that, but in my
>>> control flow it's becoming a huge issue that I never really know whether a
>>> term has gone BAD or not. Is there any way to query whether a certain term
>>> (handle) has already been garbage collected?
>>
>> I'm quite confident that the C++ interface has the functionality to solve
>> your problem in a clean way, no strange workarounds should be necessary.
>>
>> From what you say, I suspect that you may not be using term references
>> where they are needed.
>
> It's not that. I don't need persistent term references because I'm not pulling
> information out of the eclipse engine.

You don't need them in principle, but you _do_ need them if you want to reuse subterms (such as variables) across invocations of EC_resume().

The rules are really quite simple:

- All EC_words must be considered invalid after an EC_resume()
- EC_refs remain valid after EC_resume() (although the term they refer to may have changed)

In C++ parlance, you might think of EC_words as raw pointers, and of EC_refs as smart pointers.

In your case, you seem to repeatedly construct terms and invoke EC_resume(). Because you don't _need_ to share variables between the constructed terms, you have two choices:

- you can construct fresh terms each time, and assign them either to your old invalid EC_words, or to new EC_words. [I think this is what you are doing now]
- you can choose to reuse old subterms, but then you must assign them to EC_refs to carry them across invocations of EC_resume()

> In the part I'm talking about, I'm
> basically just using compile_term/1 to "compile" a C++ object structure into a
> bunch of prolog clauses. The problem is that some of those C++ objects
> represent variables that are referenced in other objects (i.e. they are not
> singleton variables).
>
> So I can't simply re-initialize the variable terms whenever they're referenced
> because that would make them singletons. I've now resorted to initializing
> these variable terms once before building a term or clause that has variables,
> but that is error prone because I didn't expect to have that problem when I
> designed the control flow.
>
> Now that I think of it, some clearer documentation might have helped me in the
> beginning. There is just this rather vague half-sentence in the Embedding and
> Interfacing Manual:
>
> "terms do not survive the execution of ECLiPSe"
>
> None of these words are really clear. What terms are really affected and how,
> what does "survive" mean and what does "execution" mean?

"Term" has its standard meaning in Prolog/ECLiPSe, namely the universal data type (everything is a term: constants, structures and lists, even variables). Everything you can assign to an EC_word is a term.

"Execution of ECLiPSe" means passing control to ECLiPSe, either by calling EC_resume(), or (in case your code was called from ECLiPSe as an external) by returning.

"not survive" means your EC_words may contain nonsense.

The meanings were probably crystal clear to the implementer who once wrote them, but less so from a user's perspective ;)

>
> I think the Embedding Manual should have at least a short section about memory
> management that clearly states what happens. In fact, it should have a big fat
> warning along these lines:
>
> ==============================================================================
> Every EC_word that has been put into post_goal() is invalid after EC_resume().
> That includes all pieces of complex terms. Putting an invalid EC_word into
> another post_goal() leads to undefined behaviour, which might cause the
> eclipse engine to corrupt data, crash immediately, crash randomly, or to never
> crash.
> ==============================================================================

post_goal() has nothing to do with it, only EC_resume() is important.

The C++ interface is a thin layer on top of the C interface. It is a low-level interface involving pointers etc, and thus of course capable of causing crashes. One could design a safer, less direct, higher-level interface (like the Java or TCL one), but that's not what we have here.

> ...
> Hope I did get the problem across ;-)

Sure, I hope I could help a little.

-- Joachim
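The first of the two choices above (construct fresh terms each time) still has to make sure that one application object maps to one variable within a single construction. A minimal sketch of that bookkeeping follows. It does not use the ECLiPSe API: `VarObject`, `ClauseScope` and `build_clause` are hypothetical names, and generated variable names in a clause text stand in for real variable terms. The assumption is that a lookup table is created per clause and thrown away before control passes to the engine, so nothing in it can go stale.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in for an application object that represents a variable.
struct VarObject {
    std::string name;
};

// Lives for one clause construction only. Within that scope the same
// object always maps to the same variable, so shared variables stay shared.
class ClauseScope {
public:
    const std::string& term_for(const VarObject& v) {
        auto it = vars_.find(&v);
        if (it == vars_.end()) {
            // First occurrence in this clause: create a fresh variable name.
            it = vars_.emplace(&v, "_G" + std::to_string(vars_.size())).first;
        }
        return it->second;
    }

    std::size_t distinct_vars() const { return vars_.size(); }

private:
    std::map<const VarObject*, std::string> vars_;  // keyed by object identity
};

// Builds the text of p(X, Y) :- q(X) with a fresh scope for every call.
inline std::string build_clause(const VarObject& x, const VarObject& y) {
    ClauseScope scope;
    const std::string head_x = scope.term_for(x);
    const std::string head_y = scope.term_for(y);
    const std::string body_x = scope.term_for(x);  // reuses the variable made for x
    return "p(" + head_x + ", " + head_y + ") :- q(" + body_x + ").";
}

// Usage sketch: x occurs twice in the clause but yields a single variable.
inline void clause_scope_demo() {
    VarObject x{"x"};
    VarObject y{"y"};
    assert(build_clause(x, y) == "p(_G0, _G1) :- q(_G0).");
}
```

Keying the table by object address keeps the mapping independent of the objects' own names; variables that must outlive an EC_resume() would instead need the EC_ref route described in the second choice.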