From: David W. <we...@in...> - 2017-06-21 10:13:10
Hi everyone,

As I'm stuck with this problem, I would appreciate any kind of advice.

Best Regards,
David

On 07.06.2017 at 15:13, David Werner wrote:
> Hi everyone,
>
> after Denis Huber left the project, I am in charge of making our
> checkpoint/restore component work.
> Therefore I would like to ask some more questions on the IRQ kernel
> object.
>
> 1. When is the IRQ object created? Does every component have its own
> IRQ object?
>
> I tried to figure out when the IRQ object is mapped into the object
> space of a component on its startup. Therefore I took a look at the
> code in [repos/base-foc/src/core/signal_source_component.cc]. The IRQ
> object appears in the object space after the "_sem =
> <Rpc_request_semaphore>();" statement in the constructor.
>
> As far as I could follow the implementation, the "request_semaphore"
> RPC call is answered by the "Signal_source_rpc_object" in
> [base-foc/src/include/signal_source/rpc_object.h], which
> returns/delegates the native capability "_blocking_semaphore", an
> attribute of the "Signal_source_rpc_object". It seems to me that
> the IRQ object already exists at this point and is only delegated to
> the component.
>
> But when is the IRQ object created and by whom? Is it created when a
> new PD session is created?
>
> 2. Does the IRQ object carry any information? Do I need to checkpoint
> this information in order to be able to recreate the object properly
> during a restore process? Is the IRQ object created automatically (and
> I only have to make sure that the object is getting mapped into the
> object space of the target) or do I have to create it manually?
>
> In our current implementation of the restore process we restore a
> component by recreating its sessions to core services (+timer) with
> the help of information we gathered using a custom runtime
> environment. After the sessions are restored we place them in the
> object space at the correct position.
> Will I also have to somehow
> store information about the IRQ object? Or is it just some object that
> needs to exist?
>
> Kind Regards,
> David
>
> On 29.03.2017 at 14:05, Stefan Kalkowski wrote:
>> Hello Denis,
>>
>> On 03/27/2017 04:14 PM, Denis Huber wrote:
>>> Dear Genode community,
>>>
>>> Preliminary: We implemented a Checkpoint/Restore mechanism on the
>>> basis of Genode/Fiasco.OC (thanks to the great help of you all). We
>>> store the state of the target component by monitoring its RPC
>>> function calls, which go through the parent component (= our
>>> Checkpoint/Restore component). The capability space is indirectly
>>> checkpointed through the capability map.
>>> The state of the target is restored by restoring the RPC objects
>>> used by the target component (e.g. PD session, dataspaces, region
>>> maps, etc.). The capabilities of the restored objects also have to
>>> be restored in the capability space (kernel) and in the capability
>>> map (userspace).
>>>
>>> For restoring the target component Norman suggested the usage of the
>>> Genode::Child constructor with an invalid ROM dataspace capability,
>>> which does not trigger the bootstrap mechanism. Thus, we have full
>>> control of inserting the capabilities of the restored RPC objects
>>> into the capability space/map.
>>>
>>> Our problem is the following: We restore the RPC objects and insert
>>> them into the capability map and then into the capability space.
>>> From the kernel point of view these capabilities are all "IPC Gates".
>>> Unfortunately, there was also an IRQ kernel object created by the
>>> bootstrap mechanism.
>>> The following table shows the kernel debugger
>>> output of the capability space of the freshly bootstrapped target
>>> component:
>>>
>>> 000204 :0016e* Gate   0015f* Gate   00158* Gate   00152* Gate
>>> 000208 :00154* Gate   0017e* Gate   0017f* Gate   00179* Gate
>>> 00020c :00180* Gate   00188* Gate        --            --
>>> 000210 :     --            --       0018a* Gate   0018c* Gate
>>> 000214 :0018e* Gate   00196* Gate   00145* Gate   00144* IRQ
>>> 000218 :00198* Gate        --            --            --
>>> 00021c :     --       0019c* Gate        --            --
>>>
>>> At address 000217 you can see the IRQ kernel object. What does this
>>> object do, how can we store/monitor it, and how can it be restored?
>>> Where can we find the source code which creates this object in
>>> Genode's bootstrap code?
>> The IRQ kernel object you refer to is used by the "signal_handler"
>> thread to block for signals of core's corresponding service. It is a
>> base-foc specific internal core RPC object[1] that is used by the signal
>> handler[2], and the related capability gets returned by the call to
>> 'alloc_signal_source()' provided by the PD session[3].
>>
>> I have to admit, I did not follow your current implementation approach
>> in depth. Therefore, I do not know exactly how to handle this specific
>> signal handler thread and its semaphore-like IRQ object, but maybe the
>> references already help you further.
>>
>> Regards
>> Stefan
>>
>> [1] repos/base-foc/src/core/signal_source_component.cc
>> [2] repos/base-foc/src/lib/base/signal_source_client.cc
>> [3] repos/base/src/core/include/pd_session_component.h
>>>
>>> Best regards,
>>> Denis
>>>
>>> On 11.12.2016 13:01, Denis Huber wrote:
>>>> Hello Norman,
>>>>
>>>>> What you observe here is the ELF loading of the child's binary. As
>>>>> part of the 'Child' object, the so-called '_process' member is
>>>>> constructed. You can find the corresponding code at
>>>>> 'base/src/lib/base/child_process.cc'.
>>>>> The code parses the ELF executable and loads the program segments,
>>>>> specifically the read-only text segment and the read-writable
>>>>> data/bss segment. For the latter, a RAM dataspace is allocated and
>>>>> filled with the content of the ELF binary's data. In your case,
>>>>> when resuming, this procedure is wrong. After all, you want to
>>>>> supply the checkpointed data to the new child, not the initial
>>>>> data provided by the ELF binary.
>>>>>
>>>>> Fortunately, I encountered the same problem when implementing fork
>>>>> for noux. I solved it by letting the 'Child_process' constructor
>>>>> accept an invalid dataspace capability as ELF argument. This has
>>>>> two effects: First, the ELF loading is skipped (obviously - there
>>>>> is no ELF to load). And second, the creation of the initial thread
>>>>> is skipped as well.
>>>>>
>>>>> In short, by supplying an invalid dataspace capability as binary
>>>>> for the new child, you avoid all those unwanted operations. The
>>>>> new child will not start at 'Component::construct'. You will have
>>>>> to manually create and start the threads of the new child via the
>>>>> PD and CPU session interfaces.
>>>> Thank you for the hint. I will try out your approach.
>>>>
>>>>> The approach looks good. I presume that you encounter
>>>>> base-foc-specific peculiarities of the thread-creation procedure.
>>>>> I would try to follow the code in
>>>>> 'base-foc/src/core/platform_thread.cc' to see what the interaction
>>>>> of core with the kernel looks like. The order of operations might
>>>>> be important.
>>>>>
>>>>> One remaining problem may be that - even though you may be able to
>>>>> restore most parts of the thread state - the kernel-internal state
>>>>> cannot be captured. E.g., think of a thread that was blocking in
>>>>> the kernel via 'l4_ipc_reply_and_wait' when checkpointed.
>>>>> When resumed, the new thread can naturally not be in this blocking
>>>>> state because the kernel's state is not part of the checkpointed
>>>>> state. The new thread would possibly start its execution at the
>>>>> instruction pointer of the syscall and issue the system call
>>>>> again, but I am not sure what really happens in practice.
>>>> Is there a way to avoid this situation? Can I postpone the
>>>> checkpoint by letting the entrypoint thread finish the intercepted
>>>> RPC function call, then increment the ip of the child's thread to
>>>> the next command?
>>>>
>>>>> I think that you don't need the LOG-session quirk if you follow my
>>>>> suggestion to skip the ELF loading for the restored component
>>>>> altogether. Could you give it a try?
>>>> You are right, the LOG-session quirk seems a bit clumsy. I like your
>>>> idea of skipping the ELF loading and automated creation of CPU threads
>>>> more, because it gives me the control to create and start the threads
>>>> from the stored ip and sp.
>>>>
>>>> Best regards,
>>>> Denis
>>>>
>>>> _______________________________________________
>>>> genode-main mailing list
>>>> gen...@li...
>>>> https://lists.sourceforge.net/lists/listinfo/genode-main
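
For anyone who wants to check a capability-space dump like the one quoted above for objects that are not IPC gates, here is a minimal sketch in plain C++. It is not Genode or Fiasco.OC code. The names Cap_slot, parse_cap_dump, non_gate_slots and sample_dump are made up for illustration, and the row layout (a hexadecimal base address, a colon, then one "<object> <kind>" pair or "--" per slot, with slot i living at base + i) is an assumption read off the quoted kernel debugger output.

```cpp
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// One occupied slot of the capability space as printed by the kernel debugger.
// (Hypothetical type, not part of Genode or Fiasco.OC.)
struct Cap_slot {
    std::uint32_t index;   // slot address, e.g. 0x217
    std::string   object;  // object id as printed, e.g. "00144*"
    std::string   kind;    // "Gate", "IRQ", ...
};

// The dump quoted in this thread.
char const *const sample_dump = R"(
000204 :0016e* Gate   0015f* Gate   00158* Gate   00152* Gate
000208 :00154* Gate   0017e* Gate   0017f* Gate   00179* Gate
00020c :00180* Gate   00188* Gate        --            --
000210 :     --            --       0018a* Gate   0018c* Gate
000214 :0018e* Gate   00196* Gate   00145* Gate   00144* IRQ
000218 :00198* Gate        --            --            --
00021c :     --       0019c* Gate        --            --
)";

// Parse rows of the assumed form "<base> :<obj> <kind> <obj> <kind> ...",
// where "--" marks an empty slot and slot i of a row has address base + i.
std::vector<Cap_slot> parse_cap_dump(std::string const &dump)
{
    std::vector<Cap_slot> slots;
    std::istringstream lines(dump);
    std::string line;

    while (std::getline(lines, line)) {
        std::size_t const colon = line.find(':');
        if (colon == std::string::npos)
            continue;   // not a table row (e.g. an empty line)

        // The text before the colon is the hexadecimal address of column 0.
        std::uint32_t const base = static_cast<std::uint32_t>(
            std::stoul(line.substr(0, colon), nullptr, 16));

        std::istringstream cells(line.substr(colon + 1));
        std::string token;
        std::uint32_t column = 0;

        while (cells >> token) {
            if (token == "--") { ++column; continue; }   // empty slot
            std::string kind;
            if (!(cells >> kind))
                break;                                   // malformed cell
            slots.push_back({base + column, token, kind});
            ++column;
        }
    }
    return slots;
}

// Return every slot whose kernel object is not a plain IPC gate.
std::vector<Cap_slot> non_gate_slots(std::vector<Cap_slot> const &slots)
{
    std::vector<Cap_slot> result;
    for (Cap_slot const &slot : slots)
        if (slot.kind != "Gate")
            result.push_back(slot);
    return result;
}
```

Calling non_gate_slots(parse_cap_dump(sample_dump)) yields the single IRQ entry in the fourth column of row 000214, i.e. slot 0x217, which matches the address Denis names. Running the same check on a dump taken after a restore would show whether that slot was repopulated with an object of the same kind.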