From: Matt H. <mat...@us...> - 2006-05-17 06:05:45
|
Jes, all, I've built on Chandra's Task Notifiers patch to modularize the process events connector. I haven't completed testing, however I thought I'd post the patches to get some early feedback. I believe I've cc'd everyone from the original "simple task notifier support" thread. Please speak up if I've neglected to add you or you're no longer interested :). PATCHES This series applies to 2.6.17-rc4-mm1: task_notifiers.patch The patch posted by Chandra, updated for 2.6.17-rc4-mm1. tn_move_fork_notification tn_move_exec_notification Move the location of the fork and exec notification chain calls as I recommended in: http://marc.theaimsgroup.com/?l=lse-tech&m=114618434520870&w=4 tn_make_exit_code_valid_in_task_notifier_handlers This patch moves assignment of the task's exit code to allow registered handlers access to the valid exit_code field. tn_renumber This patch changes the values for each invocation to reflect roughly the order they are expected to be called while also providing room for new values. tn_add_id_notifications This patch adds notification for id changes. tn_registration This patch adds externs describing the registration functions in include/linux/notifier.h. Honestly, I'm not sure where they belong. However, something like this is necessary so that modules can take advantage of Task Notifiers. protect_task_notifier_chain This patch switches Task Notifiers from atomic to blocking. This means it uses a global rwsem and allows handlers to block. cn_proc_move_exit_connector This patch moves the location used to invoke proc_exit_connector. It's intended to make the next patch easier to understand and does not affect functionality. cn_proc_modularize This patch uses Task Notifiers to invoke the Process Events Connector functions. Note that it removes the invokations of functions in fork/exec/[re][ug]id/exit paths. per_task_notifiers This patch introduces a module that demonstrates what I was trying to describe with the "2-level mechanism" discussion earlier. Unlike the the previous portions of this series, this patch is completely untested (see TESTING). It's intended to show one way per-task notifier chains can be implemented using the global task notifier mechanism. That's it for the series of patches. Something very much like Alan Stern's quick rwsem implementation should be usable for the blocking notifier chains if cacheline bouncing is a concern. TESTING I've done some quick testing by hand to ensure that the patches compile, boot, and that the process events connector still functions as expected. I also repeatedly ran LTP with and without the series applied. In all runs there was a signal 11 in the clone06: clone06 1 BROK : Unexpected signal 11 received. The fact that it was in all runs suggests it was not caused by the patches. I've only tested with a configuration where connector and process events are both modules. I still need to test the other 4 combinations (cn:y, cn_proc:m; yy; yn; nn) just to be sure. Cheers, -Matt Helsley |
From: Matt H. <mat...@us...> - 2006-05-17 06:27:13
|
The global task notifier implementation posted by Chandra, updated to apply cleanly to 2.6.17-rc4-mm1. Index: linux-2.6.17-rc4/kernel/exit.c =================================================================== --- linux-2.6.17-rc4.orig/kernel/exit.c +++ linux-2.6.17-rc4/kernel/exit.c @@ -36,10 +36,11 @@ #include <linux/mutex.h> #include <linux/futex.h> #include <linux/compat.h> #include <linux/pipe_fs_i.h> #include <linux/audit.h> /* for audit_free() */ +#include <linux/notifier.h> #include <asm/uaccess.h> #include <asm/unistd.h> #include <asm/pgtable.h> #include <asm/mmu_context.h> @@ -911,10 +912,11 @@ fastcall NORET_TYPE void do_exit(long co if (unlikely(tsk->compat_robust_list)) compat_exit_robust_list(tsk); #endif if (unlikely(tsk->audit_context)) audit_free(tsk); + task_notifier_call_chain(TN_EXIT, tsk); taskstats_exit_send(tsk, tidstats, tgidstats); taskstats_exit_free(tidstats, tgidstats); delayacct_tsk_exit(tsk); exit_mm(tsk); Index: linux-2.6.17-rc4/kernel/fork.c =================================================================== --- linux-2.6.17-rc4.orig/kernel/fork.c +++ linux-2.6.17-rc4/kernel/fork.c @@ -43,10 +43,11 @@ #include <linux/profile.h> #include <linux/rmap.h> #include <linux/acct.h> #include <linux/cn_proc.h> #include <linux/delayacct.h> +#include <linux/notifier.h> #include <asm/pgtable.h> #include <asm/pgalloc.h> #include <asm/uaccess.h> #include <asm/mmu_context.h> @@ -1119,10 +1120,12 @@ static task_t *copy_process(unsigned lon /* ok, now we should be set up.. */ p->exit_signal = (clone_flags & CLONE_THREAD) ? -1 : (clone_flags & CSIGNAL); p->pdeath_signal = 0; p->exit_state = 0; + if (task_notifier_call_chain(TN_FORK, p)) + goto bad_fork_cleanup_namespace; /* * Ok, make it visible to the rest of the system. * We dont wake it up yet. */ p->group_leader = p; @@ -1244,10 +1247,11 @@ bad_fork_cleanup_semundo: exit_sem(p); bad_fork_cleanup_audit: audit_free(p); bad_fork_cleanup_security: security_task_free(p); + task_notifier_call_chain(TN_EXIT, p); bad_fork_cleanup_policy: #ifdef CONFIG_NUMA mpol_free(p->mempolicy); bad_fork_cleanup_cpuset: #endif Index: linux-2.6.17-rc4/fs/exec.c =================================================================== --- linux-2.6.17-rc4.orig/fs/exec.c +++ linux-2.6.17-rc4/fs/exec.c @@ -47,10 +47,11 @@ #include <linux/security.h> #include <linux/syscalls.h> #include <linux/rmap.h> #include <linux/acct.h> #include <linux/cn_proc.h> +#include <linux/notifier.h> #include <asm/uaccess.h> #include <asm/mmu_context.h> #ifdef CONFIG_KMOD @@ -987,10 +988,11 @@ void compute_creds(struct linux_binprm * if (bprm->e_uid != current->uid) suid_keys(current); exec_keys(current); + task_notifier_call_chain(TN_EXEC, NULL); task_lock(current); unsafe = unsafe_exec(current); security_bprm_apply_creds(bprm, unsafe); task_unlock(current); security_bprm_post_apply_creds(bprm); Index: linux-2.6.17-rc4/include/linux/notifier.h =================================================================== --- linux-2.6.17-rc4.orig/include/linux/notifier.h +++ linux-2.6.17-rc4/include/linux/notifier.h @@ -152,7 +152,12 @@ extern int raw_notifier_call_chain(struc #define CPU_UP_CANCELED 0x0004 /* CPU (unsigned)v NOT coming up */ #define CPU_DOWN_PREPARE 0x0005 /* CPU (unsigned)v going down */ #define CPU_DOWN_FAILED 0x0006 /* CPU (unsigned)v NOT going down */ #define CPU_DEAD 0x0007 /* CPU (unsigned)v dead */ +#define TN_FORK 0x0001 /* Task notifier - fork */ +#define TN_EXIT 0x0002 /* Task notifier - exit */ +#define TN_FREE 0x0003 /* Task notifier - free */ +#define TN_EXEC 0x0004 /* Task notifier - exec */ + #endif /* __KERNEL__ */ #endif /* _LINUX_NOTIFIER_H */ Index: linux-2.6.17-rc4/include/linux/sched.h =================================================================== --- linux-2.6.17-rc4.orig/include/linux/sched.h +++ linux-2.6.17-rc4/include/linux/sched.h @@ -208,10 +208,11 @@ long io_schedule_timeout(long timeout); extern void cpu_init (void); extern void trap_init(void); extern void update_process_times(int user); extern void scheduler_tick(void); +extern int task_notifier_call_chain(unsigned long, void *); #ifdef CONFIG_DETECT_SOFTLOCKUP extern void softlockup_tick(void); extern void spawn_softlockup_task(void); extern void touch_softlockup_watchdog(void); Index: linux-2.6.17-rc4/kernel/sched.c =================================================================== --- linux-2.6.17-rc4.orig/kernel/sched.c +++ linux-2.6.17-rc4/kernel/sched.c @@ -55,10 +55,30 @@ #include <linux/delayacct.h> #include <asm/tlb.h> #include <asm/unistd.h> +/* Task notifier Mechanism */ +static ATOMIC_NOTIFIER_HEAD(task_notifier); + +int register_task_notifier(struct notifier_block *nb) +{ + return atomic_notifier_chain_register(&task_notifier, nb); +} +EXPORT_SYMBOL(register_task_notifier); + +int unregister_task_notifier(struct notifier_block *nb) +{ + return atomic_notifier_chain_unregister(&task_notifier, nb); +} +EXPORT_SYMBOL(unregister_task_notifier); + +int task_notifier_call_chain(unsigned long val, void *v) +{ + return atomic_notifier_call_chain(&task_notifier, val, v); +} + /* * Convert user-nice values [ -20 ... 0 ... 19 ] * to static priority [ MAX_RT_PRIO..MAX_PRIO-1 ], * and back. */ |
From: Matt H. <mat...@us...> - 2006-05-17 06:28:47
|
Move the location of the fork notification chain calls because Process Events, which will use Task Notifier, needs the real parent's pid in the task_struct. Index: linux-2.6.17-rc4/kernel/fork.c =================================================================== --- linux-2.6.17-rc4.orig/kernel/fork.c +++ linux-2.6.17-rc4/kernel/fork.c @@ -1120,12 +1120,10 @@ static task_t *copy_process(unsigned lon /* ok, now we should be set up.. */ p->exit_signal = (clone_flags & CLONE_THREAD) ? -1 : (clone_flags & CSIGNAL); p->pdeath_signal = 0; p->exit_state = 0; - if (task_notifier_call_chain(TN_FORK, p)) - goto bad_fork_cleanup_namespace; /* * Ok, make it visible to the rest of the system. * We dont wake it up yet. */ p->group_leader = p; @@ -1158,10 +1156,11 @@ static task_t *copy_process(unsigned lon p->real_parent = current->real_parent; else p->real_parent = current; p->parent = p->real_parent; + spin_lock(¤t->sighand->siglock); /* * Process group and session signals need to be delivered to just the * parent before the fork or both the parent and the child after the @@ -1223,10 +1222,11 @@ static task_t *copy_process(unsigned lon } total_forks++; spin_unlock(¤t->sighand->siglock); write_unlock_irq(&tasklist_lock); + task_notifier_call_chain(TN_FORK, p); proc_fork_connector(p); return p; bad_fork_cleanup_namespace: exit_namespace(p); |
From: Matt H. <mat...@us...> - 2006-05-17 06:30:54
|
Move the location of the exec notification chain calls to a point where we're certain that the exec call will be successfull. Pass in current for consistency with the other invokations. -- As I recommended in: http://marc.theaimsgroup.com/?l=lse-tech&m=114618434520870&w=4 Index: linux-2.6.17-rc4/fs/exec.c =================================================================== --- linux-2.6.17-rc4.orig/fs/exec.c +++ linux-2.6.17-rc4/fs/exec.c @@ -988,11 +988,10 @@ void compute_creds(struct linux_binprm * if (bprm->e_uid != current->uid) suid_keys(current); exec_keys(current); - task_notifier_call_chain(TN_EXEC, NULL); task_lock(current); unsafe = unsafe_exec(current); security_bprm_apply_creds(bprm, unsafe); task_unlock(current); security_bprm_post_apply_creds(bprm); @@ -1091,10 +1090,11 @@ int search_binary_handler(struct linux_b allow_write_access(bprm->file); if (bprm->file) fput(bprm->file); bprm->file = NULL; current->did_exec = 1; + task_notifier_call_chain(TN_EXEC, current); proc_exec_connector(current); return retval; } read_lock(&binfmt_lock); put_binfmt(fmt); |
From: Matt H. <mat...@us...> - 2006-05-17 06:33:06
|
This patch moves assignment of the task's exit code to allow registered handlers access to the final, valid exit_code field. Process Events needs this change in order to use Task Notifiers. Index: linux-2.6.17-rc4/kernel/exit.c =================================================================== --- linux-2.6.17-rc4.orig/kernel/exit.c +++ linux-2.6.17-rc4/kernel/exit.c @@ -912,10 +912,11 @@ fastcall NORET_TYPE void do_exit(long co if (unlikely(tsk->compat_robust_list)) compat_exit_robust_list(tsk); #endif if (unlikely(tsk->audit_context)) audit_free(tsk); + tsk->exit_code = code; task_notifier_call_chain(TN_EXIT, tsk); taskstats_exit_send(tsk, tidstats, tgidstats); taskstats_exit_free(tidstats, tgidstats); delayacct_tsk_exit(tsk); @@ -934,11 +935,10 @@ fastcall NORET_TYPE void do_exit(long co module_put(task_thread_info(tsk)->exec_domain->module); if (tsk->binfmt) module_put(tsk->binfmt->module); - tsk->exit_code = code; proc_exit_connector(tsk); exit_notify(tsk); #ifdef CONFIG_NUMA mpol_free(tsk->mempolicy); tsk->mempolicy = NULL; |
From: Matt H. <mat...@us...> - 2006-05-17 06:35:19
|
This patch changes the values for each invocation to reflect the order they are expected to be called while also providing room for new values. Index: linux-2.6.17-rc3-mm1-rg-04-27-2006/include/linux/notifier.h =================================================================== --- linux-2.6.17-rc3-mm1-rg-04-27-2006.orig/include/linux/notifier.h +++ linux-2.6.17-rc3-mm1-rg-04-27-2006/include/linux/notifier.h @@ -153,11 +153,11 @@ extern int raw_notifier_call_chain(struc #define CPU_DOWN_PREPARE 0x0005 /* CPU (unsigned)v going down */ #define CPU_DOWN_FAILED 0x0006 /* CPU (unsigned)v NOT going down */ #define CPU_DEAD 0x0007 /* CPU (unsigned)v dead */ #define TN_FORK 0x0001 /* Task notifier - fork */ -#define TN_EXIT 0x0002 /* Task notifier - exit */ -#define TN_FREE 0x0003 /* Task notifier - free */ -#define TN_EXEC 0x0004 /* Task notifier - exec */ +#define TN_EXEC 0x0002 /* Task notifier - exec */ +#define TN_EXIT 0x7FFE /* Task notifier - exit */ +#define TN_FREE 0x7FFF /* Task notifier - free */ #endif /* __KERNEL__ */ #endif /* _LINUX_NOTIFIER_H */ |
From: Matt H. <mat...@us...> - 2006-05-17 06:35:54
|
This patch adds notification for id changes. The goal is to be able to remove the proc_id_connector() calls below each call inserted in this patch. Index: linux-2.6.17-rc4/include/linux/notifier.h =================================================================== --- linux-2.6.17-rc4.orig/include/linux/notifier.h +++ linux-2.6.17-rc4/include/linux/notifier.h @@ -154,10 +154,12 @@ extern int raw_notifier_call_chain(struc #define CPU_DOWN_FAILED 0x0006 /* CPU (unsigned)v NOT going down */ #define CPU_DEAD 0x0007 /* CPU (unsigned)v dead */ #define TN_FORK 0x0001 /* Task notifier - fork */ #define TN_EXEC 0x0002 /* Task notifier - exec */ +#define TN_UID 0x0003 /* Task notifier - [re]uid changes */ +#define TN_GID 0x0004 /* Task notifier - [re]gid changed */ #define TN_EXIT 0x7FFE /* Task notifier - exit */ #define TN_FREE 0x7FFF /* Task notifier - free */ #endif /* __KERNEL__ */ #endif /* _LINUX_NOTIFIER_H */ Index: linux-2.6.17-rc4/kernel/sys.c =================================================================== --- linux-2.6.17-rc4.orig/kernel/sys.c +++ linux-2.6.17-rc4/kernel/sys.c @@ -839,10 +839,11 @@ asmlinkage long sys_setregid(gid_t rgid, current->sgid = new_egid; current->fsgid = new_egid; current->egid = new_egid; current->gid = new_rgid; key_fsgid_changed(current); + task_notifier_call_chain(TN_GID, current); proc_id_connector(current, PROC_EVENT_GID); return 0; } /* @@ -879,10 +880,11 @@ asmlinkage long sys_setgid(gid_t gid) } else return -EPERM; key_fsgid_changed(current); + task_notifier_call_chain(TN_GID, current); proc_id_connector(current, PROC_EVENT_GID); return 0; } static int set_user(uid_t new_ruid, int dumpclear) @@ -969,10 +971,11 @@ asmlinkage long sys_setreuid(uid_t ruid, (euid != (uid_t) -1 && euid != old_ruid)) current->suid = current->euid; current->fsuid = current->euid; key_fsuid_changed(current); + task_notifier_call_chain(TN_UID, current); proc_id_connector(current, PROC_EVENT_UID); return security_task_post_setuid(old_ruid, old_euid, old_suid, LSM_SETID_RE); } @@ -1017,10 +1020,11 @@ asmlinkage long sys_setuid(uid_t uid) } current->fsuid = current->euid = uid; current->suid = new_suid; key_fsuid_changed(current); + task_notifier_call_chain(TN_UID, current); proc_id_connector(current, PROC_EVENT_UID); return security_task_post_setuid(old_ruid, old_euid, old_suid, LSM_SETID_ID); } @@ -1066,10 +1070,11 @@ asmlinkage long sys_setresuid(uid_t ruid current->fsuid = current->euid; if (suid != (uid_t) -1) current->suid = suid; key_fsuid_changed(current); + task_notifier_call_chain(TN_UID, current); proc_id_connector(current, PROC_EVENT_UID); return security_task_post_setuid(old_ruid, old_euid, old_suid, LSM_SETID_RES); } @@ -1119,10 +1124,11 @@ asmlinkage long sys_setresgid(gid_t rgid current->gid = rgid; if (sgid != (gid_t) -1) current->sgid = sgid; key_fsgid_changed(current); + task_notifier_call_chain(TN_GID, current); proc_id_connector(current, PROC_EVENT_GID); return 0; } asmlinkage long sys_getresgid(gid_t __user *rgid, gid_t __user *egid, gid_t __user *sgid) @@ -1162,10 +1168,11 @@ asmlinkage long sys_setfsuid(uid_t uid) } current->fsuid = uid; } key_fsuid_changed(current); + task_notifier_call_chain(TN_UID, current); proc_id_connector(current, PROC_EVENT_UID); security_task_post_setuid(old_fsuid, (uid_t)-1, (uid_t)-1, LSM_SETID_FS); return old_fsuid; @@ -1191,10 +1198,11 @@ asmlinkage long sys_setfsgid(gid_t gid) current->mm->dumpable = suid_dumpable; smp_wmb(); } current->fsgid = gid; key_fsgid_changed(current); + task_notifier_call_chain(TN_GID, current); proc_id_connector(current, PROC_EVENT_GID); } return old_fsgid; } |
From: Matt H. <mat...@us...> - 2006-05-17 06:36:07
|
This patch adds externs describing the registration functions in include/linux/notifier.h. Honestly, I'm not sure where they belong. However, something like this is necessary so that modules can take advantage of Task Notifiers. Index: linux-2.6.17-rc3-mm1-rg-04-27-2006/include/linux/notifier.h =================================================================== --- linux-2.6.17-rc3-mm1-rg-04-27-2006.orig/include/linux/notifier.h +++ linux-2.6.17-rc3-mm1-rg-04-27-2006/include/linux/notifier.h @@ -152,10 +152,12 @@ extern int raw_notifier_call_chain(struc #define CPU_UP_CANCELED 0x0004 /* CPU (unsigned)v NOT coming up */ #define CPU_DOWN_PREPARE 0x0005 /* CPU (unsigned)v going down */ #define CPU_DOWN_FAILED 0x0006 /* CPU (unsigned)v NOT going down */ #define CPU_DEAD 0x0007 /* CPU (unsigned)v dead */ +extern int register_task_notifier(struct notifier_block *nb); +extern int unregister_task_notifier(struct notifier_block *nb); #define TN_FORK 0x0001 /* Task notifier - fork */ #define TN_EXEC 0x0002 /* Task notifier - exec */ #define TN_UID 0x0003 /* Task notifier - [re]uid changes */ #define TN_GID 0x0004 /* Task notifier - [re]gid changed */ #define TN_EXIT 0x7FFE /* Task notifier - exit */ |
From: Matt H. <mat...@us...> - 2006-05-17 06:36:51
|
This patch switches Task Notifiers from atomic to blocking. This means it uses a global rwsem and allows handlers to block. Index: linux-2.6.17-rc4/kernel/sched.c =================================================================== --- linux-2.6.17-rc4.orig/kernel/sched.c +++ linux-2.6.17-rc4/kernel/sched.c @@ -56,27 +56,27 @@ #include <asm/tlb.h> #include <asm/unistd.h> /* Task notifier Mechanism */ -static ATOMIC_NOTIFIER_HEAD(task_notifier); +static BLOCKING_NOTIFIER_HEAD(task_notifier); int register_task_notifier(struct notifier_block *nb) { - return atomic_notifier_chain_register(&task_notifier, nb); + return blocking_notifier_chain_register(&task_notifier, nb); } EXPORT_SYMBOL(register_task_notifier); int unregister_task_notifier(struct notifier_block *nb) { - return atomic_notifier_chain_unregister(&task_notifier, nb); + return blocking_notifier_chain_unregister(&task_notifier, nb); } EXPORT_SYMBOL(unregister_task_notifier); int task_notifier_call_chain(unsigned long val, void *v) { - return atomic_notifier_call_chain(&task_notifier, val, v); + return blocking_notifier_call_chain(&task_notifier, val, v); } /* * Convert user-nice values [ -20 ... 0 ... 19 ] * to static priority [ MAX_RT_PRIO..MAX_PRIO-1 ], |
From: Balbir S. <ba...@in...> - 2006-05-18 04:28:20
|
On Tue, May 16, 2006 at 11:25:07PM -0700, Matt Helsley wrote: > This patch switches Task Notifiers from atomic to blocking. This > means it uses a global rwsem and allows handlers to block. > > Index: linux-2.6.17-rc4/kernel/sched.c > =================================================================== > --- linux-2.6.17-rc4.orig/kernel/sched.c > +++ linux-2.6.17-rc4/kernel/sched.c > @@ -56,27 +56,27 @@ > #include <asm/tlb.h> > > #include <asm/unistd.h> > > /* Task notifier Mechanism */ > -static ATOMIC_NOTIFIER_HEAD(task_notifier); > +static BLOCKING_NOTIFIER_HEAD(task_notifier); > > int register_task_notifier(struct notifier_block *nb) > { > - return atomic_notifier_chain_register(&task_notifier, nb); > + return blocking_notifier_chain_register(&task_notifier, nb); > } > EXPORT_SYMBOL(register_task_notifier); > > int unregister_task_notifier(struct notifier_block *nb) > { > - return atomic_notifier_chain_unregister(&task_notifier, nb); > + return blocking_notifier_chain_unregister(&task_notifier, nb); > } > EXPORT_SYMBOL(unregister_task_notifier); > > int task_notifier_call_chain(unsigned long val, void *v) > { > - return atomic_notifier_call_chain(&task_notifier, val, v); > + return blocking_notifier_call_chain(&task_notifier, val, v); > } > > /* > * Convert user-nice values [ -20 ... 0 ... 19 ] > * to static priority [ MAX_RT_PRIO..MAX_PRIO-1 ], > > > > Why not use blocking notifiers in the first patch itself? Why do you use atomic notifiers and then change to blocking notifiers? Warm Regards, Balbir Singh, Linux Technology Center, IBM Software Labs |
From: Matt H. <mat...@us...> - 2006-05-18 20:57:43
|
On Thu, 2006-05-18 at 09:54 +0530, Balbir Singh wrote: > On Tue, May 16, 2006 at 11:25:07PM -0700, Matt Helsley wrote: > > This patch switches Task Notifiers from atomic to blocking. This > > means it uses a global rwsem and allows handlers to block. > > > > Index: linux-2.6.17-rc4/kernel/sched.c > > =================================================================== > > --- linux-2.6.17-rc4.orig/kernel/sched.c > > +++ linux-2.6.17-rc4/kernel/sched.c > > @@ -56,27 +56,27 @@ > > #include <asm/tlb.h> > > > > #include <asm/unistd.h> > > > > /* Task notifier Mechanism */ > > -static ATOMIC_NOTIFIER_HEAD(task_notifier); > > +static BLOCKING_NOTIFIER_HEAD(task_notifier); > > > > int register_task_notifier(struct notifier_block *nb) > > { > > - return atomic_notifier_chain_register(&task_notifier, nb); > > + return blocking_notifier_chain_register(&task_notifier, nb); > > } > > EXPORT_SYMBOL(register_task_notifier); > > > > int unregister_task_notifier(struct notifier_block *nb) > > { > > - return atomic_notifier_chain_unregister(&task_notifier, nb); > > + return blocking_notifier_chain_unregister(&task_notifier, nb); > > } > > EXPORT_SYMBOL(unregister_task_notifier); > > > > int task_notifier_call_chain(unsigned long val, void *v) > > { > > - return atomic_notifier_call_chain(&task_notifier, val, v); > > + return blocking_notifier_call_chain(&task_notifier, val, v); > > } > > > > /* > > * Convert user-nice values [ -20 ... 0 ... 19 ] > > * to static priority [ MAX_RT_PRIO..MAX_PRIO-1 ], > > > > > > > > > > Why not use blocking notifiers in the first patch itself? Why do you use atomic > notifiers and then change to blocking notifiers? > > Warm Regards, > Balbir Singh, > Linux Technology Center, > IBM Software Labs The first patch is Chandra's implementation using a global notifier chain rather than Jes and Jack's per-task notifier chain. I've refreshed the patch in quilt to apply cleanly to 2.6.17-rc4-mm1 but left it otherwise unchanged. The short patches that follow it show some modifications I recommend and allow for folks to comment on those separately from the Process Events modularization patch. Would it make it easier if I merged them? I could merge the "move" patches for instance.. Cheers, -Matt Helsley |
From: Chandra S. <sek...@us...> - 2006-05-18 23:09:39
|
On Thu, 2006-05-18 at 13:52 -0700, Matt Helsley wrote: <snipped> > > > > Why not use blocking notifiers in the first patch itself? Why do you use atomic > > notifiers and then change to blocking notifiers? > > > > Warm Regards, > > Balbir Singh, > > Linux Technology Center, > > IBM Software Labs > > The first patch is Chandra's implementation using a global notifier > chain rather than Jes and Jack's per-task notifier chain. I've refreshed > the patch in quilt to apply cleanly to 2.6.17-rc4-mm1 but left it > otherwise unchanged. > > The short patches that follow it show some modifications I recommend > and allow for folks to comment on those separately from the Process > Events modularization patch. > > Would it make it easier if I merged them? I could merge the "move" > patches for instance.. Yes, IMO, merging/moving patches 1-8 would make it easier to follow. > > Cheers, > -Matt Helsley > > > > ------------------------------------------------------- > Using Tomcat but need to do more? Need to support web services, security? > Get stuff done quickly with pre-integrated technology to make your job easier > Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642 > _______________________________________________ > Lse-tech mailing list > Lse...@li... > https://lists.sourceforge.net/lists/listinfo/lse-tech -- ---------------------------------------------------------------------- Chandra Seetharaman | Be careful what you choose.... - sek...@us... | .......you may get it. ---------------------------------------------------------------------- |
From: Jes S. <je...@sg...> - 2006-05-19 07:30:09
|
Matt Helsley wrote: > The first patch is Chandra's implementation using a global notifier > chain rather than Jes and Jack's per-task notifier chain. I've refreshed > the patch in quilt to apply cleanly to 2.6.17-rc4-mm1 but left it > otherwise unchanged. > > The short patches that follow it show some modifications I recommend > and allow for folks to comment on those separately from the Process > Events modularization patch. > > Would it make it easier if I merged them? I could merge the "move" > patches for instance.. I'd say go ahead and merge the core of it and then leave the parts that are still open to debate as follow on patches. That way it also gets easier for Andrew when it gets pushed upstream. Cheers, Jes |
From: Matt H. <mat...@us...> - 2006-05-17 06:37:08
|
This patch moves the location used to invoke proc_exit_connector(). It's intended to make the next patch easier to understand and does not affect functionality. Index: linux-2.6.17-rc4/kernel/exit.c =================================================================== --- linux-2.6.17-rc4.orig/kernel/exit.c +++ linux-2.6.17-rc4/kernel/exit.c @@ -914,10 +914,11 @@ fastcall NORET_TYPE void do_exit(long co #endif if (unlikely(tsk->audit_context)) audit_free(tsk); tsk->exit_code = code; task_notifier_call_chain(TN_EXIT, tsk); + proc_exit_connector(tsk); taskstats_exit_send(tsk, tidstats, tgidstats); taskstats_exit_free(tidstats, tgidstats); delayacct_tsk_exit(tsk); exit_mm(tsk); @@ -935,11 +936,10 @@ fastcall NORET_TYPE void do_exit(long co module_put(task_thread_info(tsk)->exec_domain->module); if (tsk->binfmt) module_put(tsk->binfmt->module); - proc_exit_connector(tsk); exit_notify(tsk); #ifdef CONFIG_NUMA mpol_free(tsk->mempolicy); tsk->mempolicy = NULL; #endif |
From: Matt H. <mat...@us...> - 2006-05-17 06:38:19
|
This patch uses Task Notifiers to trigger the Process Events. Note that it removes the invocations of functions in fork/exec/[re][ug]id/exit paths. It also makes Process Events configurable as a module. Index: linux-2.6.17-rc4/drivers/connector/cn_proc.c =================================================================== --- linux-2.6.17-rc4.orig/drivers/connector/cn_proc.c +++ linux-2.6.17-rc4/drivers/connector/cn_proc.c @@ -25,10 +25,11 @@ #include <linux/module.h> #include <linux/kernel.h> #include <linux/ktime.h> #include <linux/init.h> #include <linux/connector.h> +#include <linux/notifier.h> #include <asm/atomic.h> #include <linux/cn_proc.h> #define CN_PROC_MSG_SIZE (sizeof(struct cn_msg) + sizeof(struct proc_event)) @@ -44,11 +45,11 @@ static inline void get_seq(__u32 *ts, in *ts = get_cpu_var(proc_event_counts)++; *cpu = smp_processor_id(); put_cpu_var(proc_event_counts); } -void proc_fork_connector(struct task_struct *task) +static void proc_fork_connector(struct task_struct *task) { struct cn_msg *msg; struct proc_event *ev; __u8 buffer[CN_PROC_MSG_SIZE]; @@ -70,11 +71,11 @@ void proc_fork_connector(struct task_str msg->len = sizeof(*ev); /* If cn_netlink_send() failed, the data is not sent */ cn_netlink_send(msg, CN_IDX_PROC, GFP_KERNEL); } -void proc_exec_connector(struct task_struct *task) +static void proc_exec_connector(struct task_struct *task) { struct cn_msg *msg; struct proc_event *ev; __u8 buffer[CN_PROC_MSG_SIZE]; @@ -93,11 +94,11 @@ void proc_exec_connector(struct task_str msg->ack = 0; /* not used */ msg->len = sizeof(*ev); cn_netlink_send(msg, CN_IDX_PROC, GFP_KERNEL); } -void proc_id_connector(struct task_struct *task, int which_id) +static void proc_id_connector(struct task_struct *task, int which_id) { struct cn_msg *msg; struct proc_event *ev; __u8 buffer[CN_PROC_MSG_SIZE]; @@ -124,11 +125,11 @@ void proc_id_connector(struct task_struc msg->ack = 0; /* not used */ msg->len = sizeof(*ev); cn_netlink_send(msg, CN_IDX_PROC, GFP_KERNEL); } -void proc_exit_connector(struct task_struct *task) +static void proc_exit_connector(struct task_struct *task) { struct cn_msg *msg; struct proc_event *ev; __u8 buffer[CN_PROC_MSG_SIZE]; @@ -207,10 +208,50 @@ static void cn_proc_mcast_ctl(void *data break; } cn_proc_ack(err, msg->seq, msg->ack); } + +/* + * Dispatch task notifier events to the appropriate process event + * generation function. + */ +static int cn_proc_notify(struct notifier_block *nb, unsigned long event, + void *v) +{ + struct task_struct *task = v; + int rc = NOTIFY_OK; + + switch (event) { + case TN_FORK: + proc_fork_connector(task); + break; + case TN_EXEC: + proc_exec_connector(task); + break; + case TN_UID: + proc_id_connector(task, PROC_EVENT_UID); + break; + case TN_GID: + proc_id_connector(task, PROC_EVENT_GID); + break; + case TN_EXIT: + proc_exit_connector(task); + break; + default: + rc = NOTIFY_DONE; /* ignore all other notifications */ + break; + } + return rc; +} + +static struct notifier_block cn_proc_nb = { + .notifier_call = cn_proc_notify, + .next = NULL, + .priority = 0, +}; + /* * cn_proc_init - initialization entry point * * Adds the connector callback to the connector driver. */ @@ -219,11 +260,34 @@ static int __init cn_proc_init(void) int err; if ((err = cn_add_callback(&cn_proc_event_id, "cn_proc", &cn_proc_mcast_ctl))) { printk(KERN_WARNING "cn_proc failed to register\n"); - return err; + goto out; } - return 0; + + err = register_task_notifier(&cn_proc_nb); + if (err != 0) + cn_del_callback(&cn_proc_event_id); +out: + return err; } +static void cn_proc_fini(void) +{ + int err; + + err = unregister_task_notifier(&cn_proc_nb); + if (err != 0) + printk(KERN_WARNING + "cn_proc failed to unregister its task notify block\n"); + + cn_del_callback(&cn_proc_event_id); +} module_init(cn_proc_init); +module_exit(cn_proc_fini); + +MODULE_AUTHOR("Matt Helsley <mat...@us...>, IBM Corp. 2005, " + "Guillaume Thouvenin <gui...@bu...>"); +MODULE_LICENSE("GPL"); +MODULE_DESCRIPTION("Notification of process events."); +MODULE_VERSION("2:18.0"); Index: linux-2.6.17-rc4/kernel/exit.c =================================================================== --- linux-2.6.17-rc4.orig/kernel/exit.c +++ linux-2.6.17-rc4/kernel/exit.c @@ -914,11 +914,10 @@ fastcall NORET_TYPE void do_exit(long co #endif if (unlikely(tsk->audit_context)) audit_free(tsk); tsk->exit_code = code; task_notifier_call_chain(TN_EXIT, tsk); - proc_exit_connector(tsk); taskstats_exit_send(tsk, tidstats, tgidstats); taskstats_exit_free(tidstats, tgidstats); delayacct_tsk_exit(tsk); exit_mm(tsk); Index: linux-2.6.17-rc4/kernel/fork.c =================================================================== --- linux-2.6.17-rc4.orig/kernel/fork.c +++ linux-2.6.17-rc4/kernel/fork.c @@ -1223,11 +1223,10 @@ static task_t *copy_process(unsigned lon total_forks++; spin_unlock(¤t->sighand->siglock); write_unlock_irq(&tasklist_lock); task_notifier_call_chain(TN_FORK, p); - proc_fork_connector(p); return p; bad_fork_cleanup_namespace: exit_namespace(p); bad_fork_cleanup_keys: Index: linux-2.6.17-rc4/fs/exec.c =================================================================== --- linux-2.6.17-rc4.orig/fs/exec.c +++ linux-2.6.17-rc4/fs/exec.c @@ -1091,11 +1091,10 @@ int search_binary_handler(struct linux_b if (bprm->file) fput(bprm->file); bprm->file = NULL; current->did_exec = 1; task_notifier_call_chain(TN_EXEC, current); - proc_exec_connector(current); return retval; } read_lock(&binfmt_lock); put_binfmt(fmt); if (retval != -ENOEXEC || bprm->mm == NULL) Index: linux-2.6.17-rc4/kernel/sys.c =================================================================== --- linux-2.6.17-rc4.orig/kernel/sys.c +++ linux-2.6.17-rc4/kernel/sys.c @@ -840,11 +840,10 @@ asmlinkage long sys_setregid(gid_t rgid, current->fsgid = new_egid; current->egid = new_egid; current->gid = new_rgid; key_fsgid_changed(current); task_notifier_call_chain(TN_GID, current); - proc_id_connector(current, PROC_EVENT_GID); return 0; } /* * setgid() is implemented like SysV w/ SAVED_IDS @@ -881,11 +880,10 @@ asmlinkage long sys_setgid(gid_t gid) else return -EPERM; key_fsgid_changed(current); task_notifier_call_chain(TN_GID, current); - proc_id_connector(current, PROC_EVENT_GID); return 0; } static int set_user(uid_t new_ruid, int dumpclear) { @@ -972,11 +970,10 @@ asmlinkage long sys_setreuid(uid_t ruid, current->suid = current->euid; current->fsuid = current->euid; key_fsuid_changed(current); task_notifier_call_chain(TN_UID, current); - proc_id_connector(current, PROC_EVENT_UID); return security_task_post_setuid(old_ruid, old_euid, old_suid, LSM_SETID_RE); } @@ -1021,11 +1018,10 @@ asmlinkage long sys_setuid(uid_t uid) current->fsuid = current->euid = uid; current->suid = new_suid; key_fsuid_changed(current); task_notifier_call_chain(TN_UID, current); - proc_id_connector(current, PROC_EVENT_UID); return security_task_post_setuid(old_ruid, old_euid, old_suid, LSM_SETID_ID); } @@ -1071,11 +1067,10 @@ asmlinkage long sys_setresuid(uid_t ruid if (suid != (uid_t) -1) current->suid = suid; key_fsuid_changed(current); task_notifier_call_chain(TN_UID, current); - proc_id_connector(current, PROC_EVENT_UID); return security_task_post_setuid(old_ruid, old_euid, old_suid, LSM_SETID_RES); } asmlinkage long sys_getresuid(uid_t __user *ruid, uid_t __user *euid, uid_t __user *suid) @@ -1125,11 +1120,10 @@ asmlinkage long sys_setresgid(gid_t rgid if (sgid != (gid_t) -1) current->sgid = sgid; key_fsgid_changed(current); task_notifier_call_chain(TN_GID, current); - proc_id_connector(current, PROC_EVENT_GID); return 0; } asmlinkage long sys_getresgid(gid_t __user *rgid, gid_t __user *egid, gid_t __user *sgid) { @@ -1169,11 +1163,10 @@ asmlinkage long sys_setfsuid(uid_t uid) current->fsuid = uid; } key_fsuid_changed(current); task_notifier_call_chain(TN_UID, current); - proc_id_connector(current, PROC_EVENT_UID); security_task_post_setuid(old_fsuid, (uid_t)-1, (uid_t)-1, LSM_SETID_FS); return old_fsuid; } @@ -1199,11 +1192,10 @@ asmlinkage long sys_setfsgid(gid_t gid) smp_wmb(); } current->fsgid = gid; key_fsgid_changed(current); task_notifier_call_chain(TN_GID, current); - proc_id_connector(current, PROC_EVENT_GID); } return old_fsgid; } asmlinkage long sys_times(struct tms __user * tbuf) Index: linux-2.6.17-rc4/include/linux/cn_proc.h =================================================================== --- linux-2.6.17-rc4.orig/include/linux/cn_proc.h +++ linux-2.6.17-rc4/include/linux/cn_proc.h @@ -93,28 +93,6 @@ struct proc_event { pid_t process_tgid; __u32 exit_code, exit_signal; } exit; } event_data; }; - -#ifdef __KERNEL__ -#ifdef CONFIG_PROC_EVENTS -void proc_fork_connector(struct task_struct *task); -void proc_exec_connector(struct task_struct *task); -void proc_id_connector(struct task_struct *task, int which_id); -void proc_exit_connector(struct task_struct *task); -#else -static inline void proc_fork_connector(struct task_struct *task) -{} - -static inline void proc_exec_connector(struct task_struct *task) -{} - -static inline void proc_id_connector(struct task_struct *task, - int which_id) -{} - -static inline void proc_exit_connector(struct task_struct *task) -{} -#endif /* CONFIG_PROC_EVENTS */ -#endif /* __KERNEL__ */ #endif /* CN_PROC_H */ Index: linux-2.6.17-rc4/drivers/connector/Kconfig =================================================================== --- linux-2.6.17-rc4.orig/drivers/connector/Kconfig +++ linux-2.6.17-rc4/drivers/connector/Kconfig @@ -9,13 +9,13 @@ config CONNECTOR Connector support can also be built as a module. If so, the module will be called cn.ko. config PROC_EVENTS - boolean "Report process events to userspace" - depends on CONNECTOR=y - default y - ---help--- + tristate "Report process events to userspace" + default m + depends on CONNECTOR + help Provide a connector that reports process events to userspace. Send events such as fork, exec, id change (uid, gid, suid, etc), and exit. endmenu |
From: Matt H. <mat...@us...> - 2006-05-17 06:44:20
|
This patch introduces a module that demonstrates what I was trying to describe with the "2-level mechanism" discussion earlier: http://marc.theaimsgroup.com/?l=lse-tech&m=114654571406352&w=4 It's intended to show one way per-task notifier chains can be implemented using the global task notifier mechanism (This corresponds to the combination in 2.i in the above email). Unlike the the previous portions of this series, this patch is completely untested (see TESTING in the introductory email). Index: linux-2.6.17-rc4/drivers/misc/per_task_notifiers.c =================================================================== --- /dev/null +++ linux-2.6.17-rc4/drivers/misc/per_task_notifiers.c @@ -0,0 +1,41 @@ +#include <linux/notifier.h> + +static int per_task_notify (struct notifier_block *nb, unsigned int val, + struct task_struct *task) +{ + return raw_notifier_call_chain(&task->notify, val, task); +} + +static struct notifier_block per_task_nb __read_mostly { + .notifier_call = per_task_notify, + .next = NULL, + .priority = 0, +}; + +int register_per_task_notifier(struct task_struct *task, + struct notifier_block *nb) +{ + BUG_ON(task != current); + return raw_notifier_chain_register(&task->notify, nb); +} +EXPORT_SYMBOL_GPL(register_per_task_notifier); + +int unregister_per_task_notifier(struct notifier_block *nb) +{ + BUG_ON(task != current); + return raw_notifier_chain_unregister(&task->notify, nb); +} +EXPORT_SYMBOL_GPL(unregister_per_task_notifier); + +static int __init per_task_notifiers_init(void) +{ + return register_task_notifier(&per_task_nb); +} + +static int per_task_notifiers_fini(void) +{ + return unregister_task_notifier(&per_task_nb); +} + +module_init(per_task_notifiers_init); +module_exit(per_task_notifiers_fini); Index: linux-2.6.17-rc4/include/linux/sched.h =================================================================== --- linux-2.6.17-rc4.orig/include/linux/sched.h +++ linux-2.6.17-rc4/include/linux/sched.h @@ -967,10 +967,12 @@ struct task_struct { struct futex_pi_state *pi_state_cache; atomic_t fs_excl; /* holding fs exclusive resources */ struct rcu_head rcu; + struct raw_notifier_head notify; /* generic per-task notifications */ + /* * cache last used pipe for splice */ struct pipe_inode_info *splice_pipe; #ifdef CONFIG_TASK_DELAY_ACCT Index: linux-2.6.17-rc4/include/linux/init_task.h =================================================================== --- linux-2.6.17-rc4.orig/include/linux/init_task.h +++ linux-2.6.17-rc4/include/linux/init_task.h @@ -121,10 +121,11 @@ extern struct group_info init_groups; .blocked = {{0}}, \ .alloc_lock = SPIN_LOCK_UNLOCKED, \ .journal_info = NULL, \ .cpu_timers = INIT_CPU_TIMERS(tsk.cpu_timers), \ .fs_excl = ATOMIC_INIT(0), \ + .task_notifier = RAW_NOTIFIER_INIT(tsk.notify), \ INIT_RT_MUTEXES(tsk) \ } #define INIT_CPU_TIMERS(cpu_timers) \ Index: linux-2.6.17-rc4/kernel/fork.c =================================================================== --- linux-2.6.17-rc4.orig/kernel/fork.c +++ linux-2.6.17-rc4/kernel/fork.c @@ -1058,10 +1058,12 @@ static task_t *copy_process(unsigned lon p->tgid = p->pid; if (clone_flags & CLONE_THREAD) p->tgid = current->tgid; + RAW_INIT_NOTIFIER_HEAD(&p->notify); + if ((retval = security_task_alloc(p))) goto bad_fork_cleanup_policy; if ((retval = audit_alloc(p))) goto bad_fork_cleanup_security; /* copy all the process information */ Index: linux-2.6.17-rc4/drivers/misc/Kconfig =================================================================== --- linux-2.6.17-rc4.orig/drivers/misc/Kconfig +++ linux-2.6.17-rc4/drivers/misc/Kconfig @@ -26,7 +26,21 @@ config IBM_ASM for your IBM server. If unsure, say N. +config PER_TASK_NOTIFIERS + tristate "Task-specific event notification" + depends on EXPERIMENTAL + help + This option allows modules to register for notification of events + pertaining to a specific task. Modules may receive events when + a task forks, execs, changes id (real or effective, user or group), + exits, and is about to be freed. + + NOTE: This option is not needed if modules wish to register for + notification of events for every task in the system. + + If unsure, and you have configured loadable modules, say M. + endmenu Index: linux-2.6.17-rc4/drivers/misc/Makefile =================================================================== --- linux-2.6.17-rc4.orig/drivers/misc/Makefile +++ linux-2.6.17-rc4/drivers/misc/Makefile @@ -3,5 +3,6 @@ # obj- := misc.o # Dummy rule to force built-in.o to be made obj-$(CONFIG_IBM_ASM) += ibmasm/ obj-$(CONFIG_HDPU_FEATURES) += hdpuftrs/ +obj-$(CONFIG_PER_TASK_NOTIFIERS) += per_task_notifiers.o |
From: Matt H. <mat...@us...> - 2006-05-17 23:03:03
|
Note: I messed up on the subject line for this patch. It should be: [RFC][PATCH 011/011] Per-task notifier chains |
From: Jes S. <je...@sg...> - 2006-05-17 11:51:15
|
Matt Helsley wrote: > Jes, all, > > I've built on Chandra's Task Notifiers patch to modularize the process > events connector. I haven't completed testing, however I thought I'd > post the patches to get some early feedback. > > I believe I've cc'd everyone from the original "simple task notifier > support" thread. Please speak up if I've neglected to add you or you're > no longer interested :). Matt, A couple of points, first of all I do object to the naming. These are clearly not task notifiers, but process event notifiers (or something along those lines). The use of the original names from Jack's and my code seems inappropriate. Second, patch 8/11, changing it to a blocking chain is not good. It does exactly what we've been discussing for a while, adding an expensive semaphore to the wrong place. This is where Alan Stern's per CPU locks would be ideal. Last I really don't see how the proof of concept 11/11 patch proves how to do things. It means you lose the ability to pass in any data to the notifier chain since the task struct is passed in instead. This may not be sufficient for some cases. Anyway, the patch clearly won't compile, so lets ignore it for now. If we can resolve the naming and locking issues, then I don't have any particular objections to this, provided we see some in-tree users of it. The process connector switch over seems a good candidate for an in tree client. If task_notifier_call_chain() etc. were inline'd we'd save one level of indirection without growing the code size given that these functions only continue calling into the next level of notifier functions. Cheers, Jes |
From: Alan S. <st...@ro...> - 2006-05-17 14:40:04
|
On Wed, 17 May 2006, Jes Sorensen wrote: > If task_notifier_call_chain() etc. were inline'd we'd save one > level of indirection without growing the code size given that these > functions only continue calling into the next level of notifier > functions. The way Matt did it is quite common; lots of other notifier chains in the kernel work this way. Keeping those routines out-of-line rather than in-line has two advantages: 1: It makes the kernel a little bit smaller, because each caller has to submit only two arguments instead of three. (Of course, traded off against this is the need to include the code for the out-of-line routine itself. The more callers, the bigger the advantage.) 2: It allows you to hide the underlying mechanism (the chain head) and expose only an API. This simplifies later changes, such as the move from an atomic chain to a blocking chain. Alan Stern |
From: Jes S. <je...@sg...> - 2006-05-18 12:01:14
|
Alan Stern wrote: > On Wed, 17 May 2006, Jes Sorensen wrote: > >> If task_notifier_call_chain() etc. were inline'd we'd save one >> level of indirection without growing the code size given that these >> functions only continue calling into the next level of notifier >> functions. > > The way Matt did it is quite common; lots of other notifier chains in the > kernel work this way. Keeping those routines out-of-line rather than > in-line has two advantages: > > 1: It makes the kernel a little bit smaller, because each caller has > to submit only two arguments instead of three. (Of course, traded > off against this is the need to include the code for the > out-of-line routine itself. The more callers, the bigger the > advantage.) Hi Alan, In this case I think 1) is a wash, since it's a function call straight to another function call, only dereferencing a pointer once. The cost of the call by far outweighs it. > 2: It allows you to hide the underlying mechanism (the chain head) > and expose only an API. This simplifies later changes, such as > the move from an atomic chain to a blocking chain. The users have to be recompiled for any kernel update anyway so I don't feel this is applicable either. If someone finds it makes sense the register/unregister calls could be kept out-lined, but I think the call chain should be inlined for performance reasons. Cheers, Jes |
From: Alan S. <st...@ro...> - 2006-05-18 15:27:20
|
On Thu, 18 May 2006, Jes Sorensen wrote: > > The way Matt did it is quite common; lots of other notifier chains in the > > kernel work this way. Keeping those routines out-of-line rather than > > in-line has two advantages: > > > > 1: It makes the kernel a little bit smaller, because each caller has > > to submit only two arguments instead of three. (Of course, traded > > off against this is the need to include the code for the > > out-of-line routine itself. The more callers, the bigger the > > advantage.) > > Hi Alan, > > In this case I think 1) is a wash, since it's a function call straight > to another function call, only dereferencing a pointer once. The cost of > the call by far outweighs it. How can you know until you've measured the timing? There are conflicting effects: the time required to execute the extra function call versus the time required to satisfy the cache misses caused by the code that pushes an extra argument on the stack. My guess is that if you tried to measure the timing difference, you would not be able to see it because it would be too small. > > 2: It allows you to hide the underlying mechanism (the chain head) > > and expose only an API. This simplifies later changes, such as > > the move from an atomic chain to a blocking chain. > > The users have to be recompiled for any kernel update anyway so I don't > feel this is applicable either. Recompilation isn't the issue; robustness against future changes is. If the interface needs to be changed, it's a lot simpler and easier to get it correct if you need to change it in only one spot than if you need to search through the entire kernel for multiple occurrences. Alan Stern |
From: Jes S. <je...@sg...> - 2006-05-19 07:27:31
|
Alan Stern wrote: > On Thu, 18 May 2006, Jes Sorensen wrote: >> In this case I think 1) is a wash, since it's a function call straight >> to another function call, only dereferencing a pointer once. The cost of >> the call by far outweighs it. > > How can you know until you've measured the timing? There are conflicting > effects: the time required to execute the extra function call versus the > time required to satisfy the cache misses caused by the code that pushes > an extra argument on the stack. You'd still have to prepare the extra argument since the call goes to a function that unconditionally calls another function. On the other hand having the preparation of arg #3 in the bigger function allows the compiler to be more flexible and possibly prepare it in advance. Also note that arguments only go on the stack on some architectures. >> The users have to be recompiled for any kernel update anyway so I don't >> feel this is applicable either. > > Recompilation isn't the issue; robustness against future changes is. If > the interface needs to be changed, it's a lot simpler and easier to get it > correct if you need to change it in only one spot than if you need to > search through the entire kernel for multiple occurrences. This really isn't very likely to change in the future, and the changes would still only be visible in the few files declaring it. It's not like the callout type would be visible in every client. Cheers, Jes |
From: Alan S. <st...@ro...> - 2006-05-19 17:05:29
|
On Fri, 19 May 2006, Jes Sorensen wrote: > Alan Stern wrote: > > On Thu, 18 May 2006, Jes Sorensen wrote: > >> In this case I think 1) is a wash, since it's a function call straight > >> to another function call, only dereferencing a pointer once. The cost of > >> the call by far outweighs it. > > > > How can you know until you've measured the timing? There are conflicting > > effects: the time required to execute the extra function call versus the > > time required to satisfy the cache misses caused by the code that pushes > > an extra argument on the stack. > > You'd still have to prepare the extra argument since the call goes to a > function that unconditionally calls another function. I'm not getting through. Look at it like this (the numbers are just made-up values): My way: ------- caller: task_watcher_call_chain(val, ptr); // Has 2 arguments, generates 10 bytes of code // Occurs in 20 different callers core: int task_watcher_call_chain(unsigned long val, void *ptr) { return blocking_notifier_call_chain( task_watcher_chain, val, ptr); } // Generates 20 bytes of code, but only occurs once Your way: --------- caller: blocking_notifier_call_chain(task_watcher_chain, val, ptr); // Has 3 arguments, generates 15 bytes of code // Occurs in 20 different callers Now compare the total code sizes. My way has 10*20 + 20 = 220 bytes, and yours has 15*20 = 300 bytes. Those extra code bytes mean extra i-cache misses. > On the other hand > having the preparation of arg #3 in the bigger function allows the > compiler to be more flexible and possibly prepare it in advance. Possibly. Like I said, the timing effects are impossible to predict and can be determined only by actual measurement (unless they are too small to be visible). > Also > note that arguments only go on the stack on some architectures. It doesn't matter. On any architecture, passing 3 arguments takes more code space than passing 2. Alan Stern |
From: Jes S. <je...@sg...> - 2006-05-22 09:18:59
|
Alan Stern wrote: > On Fri, 19 May 2006, Jes Sorensen wrote: > caller: blocking_notifier_call_chain(task_watcher_chain, val, ptr); > // Has 3 arguments, generates 15 bytes of code > // Occurs in 20 different callers > > > Now compare the total code sizes. My way has 10*20 + 20 = 220 bytes, and > yours has 15*20 = 300 bytes. Those extra code bytes mean extra i-cache > misses. On which architecture? On ia64 all it would take is an add, which in many cases can be slotted in for free. In this case there would be zero impact on code size and icache misses. >> Also >> note that arguments only go on the stack on some architectures. > > It doesn't matter. On any architecture, passing 3 arguments takes more > code space than passing 2. All depends on how good the compiler is, in some cases the value you are passing is already available in the function calling in. It may however cause more pipeline stalls since you have a short function where it has to do the load then and there. Sorry, but with modern compilers on modern architectures, I just don't buy that argument. Cheers, Jes |
From: Alan S. <st...@ro...> - 2006-05-22 14:59:46
|
On Mon, 22 May 2006, Jes Sorensen wrote: > > It doesn't matter. On any architecture, passing 3 arguments takes more > > code space than passing 2. > > All depends on how good the compiler is, in some cases the value you are > passing is already available in the function calling in. It may however > cause more pipeline stalls since you have a short function where it has > to do the load then and there. > > Sorry, but with modern compilers on modern architectures, I just don't > buy that argument. You're arguing that _sometimes_ on _some_ architectures there won't be any penalty and therefore the function should always be inline. I'm arguing that sometimes there may indeed be a penalty and the only way to find out for sure which is best is by actual measurement. To me, my argument seems more compelling. Alan Stern P.S.: At various times there have been long discussions on LKML about the advantages and disadvantages of inlining functions. Linus's viewpoint has generally been that the best rule-of-thumb is overall code size. There's no question that this is a borderline case, which might work out better one way on some architectures and better the other way on other architectures. |