* [POC][RFC][PATCH 0/2] PROOF OF CONCEPT: Dynamic Functions (jump functions)
@ 2018-10-06  1:51 Steven Rostedt
  2018-10-06  1:51 ` [POC][RFC][PATCH 1/2] jump_function: Addition of new feature "jump_function" Steven Rostedt
  2018-10-06  1:51 ` [POC][RFC][PATCH 2/2] tracepoints: Implement it with dynamic functions Steven Rostedt
  0 siblings, 2 replies; 43+ messages in thread
From: Steven Rostedt @ 2018-10-06  1:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Linus Torvalds, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Masami Hiramatsu, Mathieu Desnoyers,
	Matthew Helsley, Rafael J . Wysocki, David Woodhouse,
	Paolo Bonzini, Josh Poimboeuf, Jason Baron, Jiri Kosina


This is just a Proof Of Concept (POC). I have done some "no-no"s, like
having x86 asm code in generic code paths, and it still needs a way to
work when an arch does not support this feature. Not to mention, I didn't
add proper change logs (those will come later).

Background:

 During David Woodhouse's presentation on Spectre and Meltdown at Kernel
Recipes, he talked about how retpolines are implemented. I hadn't had
time to look at the details, so I hadn't given it much thought. But when
he demonstrated that retpolines add a measurable overhead to indirect
calls, I realized how much this can affect tracepoints. Tracepoints are
implemented with indirect calls, where the code iterates over an array,
calling each callback that has registered with the tracepoint.
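
For reference, the iteration inside __DO_TRACE() looks roughly like this
(simplified from include/linux/tracepoint.h; "proto" and "args" stand in
for the tracepoint's own argument list). The indirect call through
it_func is what turns into a retpoline:

	struct tracepoint_func *it_func_ptr;
	void *it_func;
	void *__data;

	it_func_ptr = rcu_dereference_raw(tp->funcs);
	if (it_func_ptr) {
		do {
			it_func = (it_func_ptr)->func;
			__data = (it_func_ptr)->data;
			/* indirect call: a retpoline when CONFIG_RETPOLINE=y */
			((void(*)(proto))(it_func))(args);
		} while ((++it_func_ptr)->func);
	}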

I ran a test to see how much overhead this entails.

With RETPOLINE disabled (CONFIG_RETPOLINE=n):

# trace-cmd start -e all
# perf stat -r 10 /work/c/hackbench 50
Time: 29.369
Time: 28.998
Time: 28.816
Time: 28.734
Time: 29.034
Time: 28.631
Time: 28.594
Time: 28.762
Time: 28.915
Time: 28.741

 Performance counter stats for '/work/c/hackbench 50' (10 runs):

     232926.801609      task-clock (msec)         #    7.465 CPUs utilized            ( +-  0.26% )
         3,175,526      context-switches          #    0.014 M/sec                    ( +-  0.50% )
           394,920      cpu-migrations            #    0.002 M/sec                    ( +-  1.71% )
            44,273      page-faults               #    0.190 K/sec                    ( +-  1.06% )
   859,904,212,284      cycles                    #    3.692 GHz                      ( +-  0.26% )
   526,010,328,375      stalled-cycles-frontend   #   61.17% frontend cycles idle     ( +-  0.26% )
   799,414,387,443      instructions              #    0.93  insn per cycle
                                                  #    0.66  stalled cycles per insn  ( +-  0.25% )
   157,516,396,866      branches                  #  676.248 M/sec                    ( +-  0.25% )
       445,888,666      branch-misses             #    0.28% of all branches          ( +-  0.19% )

      31.201263687 seconds time elapsed                                          ( +-  0.24% )

With RETPOLINE enabled (CONFIG_RETPOLINE=y):

# trace-cmd start -e all
# perf stat -r 10 /work/c/hackbench 50
Time: 31.087
Time: 31.180
Time: 31.250
Time: 30.905
Time: 31.024
Time: 32.056
Time: 31.312
Time: 31.409
Time: 31.451
Time: 31.275

 Performance counter stats for '/work/c/hackbench 50' (10 runs):

     252893.216212      task-clock (msec)         #    7.444 CPUs utilized            ( +-  0.31% )
         3,218,524      context-switches          #    0.013 M/sec                    ( +-  0.45% )
           427,129      cpu-migrations            #    0.002 M/sec                    ( +-  1.52% )
            43,666      page-faults               #    0.173 K/sec                    ( +-  0.92% )
   933,615,337,142      cycles                    #    3.692 GHz                      ( +-  0.31% )
   593,141,521,286      stalled-cycles-frontend   #   63.53% frontend cycles idle     ( +-  0.32% )
   806,848,677,318      instructions              #    0.86  insn per cycle
                                                  #    0.74  stalled cycles per insn  ( +-  0.30% )
   161,289,933,342      branches                  #  637.779 M/sec                    ( +-  0.29% )
     2,070,719,044      branch-misses             #    1.28% of all branches          ( +-  0.25% )

      33.971942318 seconds time elapsed                                          ( +-  0.28% )


What the above represents: running "hackbench 50" with all trace events
enabled went from 31.201263687 seconds to 33.971942318 seconds, which is
an 8.9% increase!

So I thought about how to solve this, and came up with "jump_functions".
These are similar to jump_labels, but instead of a static branch, we
have a dynamic function: a function dynfunc_X() that can be assigned any
other function, just as if it were a variable, and that will then call
the newly assigned function. Talking with other kernel developers at
Kernel Recipes, I was told that this feature would be useful for other
subsystems in the kernel, not just for tracing.
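
To make this concrete, here is a minimal sketch of the intended usage,
using the macros from patch 1 (my_func, my_func_default and
my_other_func are made-up names for illustration):

	/* In a header: declares dynfunc_my_func() and its assign helper */
	DECLARE_DYNAMIC_FUNCTION(my_func, PARAMS(int a), ARGS(a));

	/* In a C file: the function it starts out calling at boot */
	static int my_func_default(int a)
	{
		return a;
	}
	DEFINE_DYNAMIC_FUNCTION(my_func, my_func_default, PARAMS(int a));

	/* Call sites look like ordinary function calls */
	ret = dynfunc_my_func(42);

	/* Redirect every call site to another function at runtime */
	assign_dynamic_function_my_func(my_other_func);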

The first attempt created the call in inline assembly and used macro
tricks to build the parameters, but this was overly complex, especially
when one of the trace events has 12 parameters!

Then I decided to simplify it: have dynfunc_X() call a trampoline that
does a direct jump. It's similar to what a retpoline does, except that a
retpoline does an indirect jump. A direct jump is much more efficient.
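
Conceptually, the trampoline emitted by arch_dynfunc_trampoline() on x86
is nothing more than a patchable direct jump (reusing the made-up names
from above):

	dynfunc_my_func:
		jmp my_func_default	/* rewritten to "jmp <new func>" at runtime */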

When changing what function a dynamic function should call, text_poke_bp()
is used to modify the trampoline to call the new function.
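
Simplified from patch 1's arch_assign_dynamic_function(), the update
writes a new 5 byte "jmp rel32" over the start of the trampoline:

	union call_code_union code;

	code.e9 = 0xe9;		/* opcode of the 5 byte direct jmp */
	code.offset = (int)((unsigned long)func - (dfunc + CALL_SIZE));

	mutex_lock(&text_mutex);
	text_poke_bp((void *)dfunc, code.code, CALL_SIZE, func);
	mutex_unlock(&text_mutex);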

The first "no change log" patch implements the dynamic function (poorly, as
its just a proof of concept), and the second "no change log" patch
implements a way that tracepoints can take advantage of it.

The tracepoint code creates a "default" function that iterates over the
tracepoint array as it does today. But if only a single callback is
attached to the tracepoint (the most common case), the dynamic function
is changed to call that callback directly, without any iteration over
the list.
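
Simplified from patch 2's tracepoint_add_func(), the switch between the
two modes looks like this (tp->dynfunc->func holds the default
iterator):

	if (probes == 1)
		/* one callback: jump straight to it, no iteration */
		assign_dynamic_function(tp->dynfunc, tp_funcs->func);
	else
		/* multiple callbacks: point back at the iterator */
		assign_dynamic_function(tp->dynfunc, tp->dynfunc->func);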

After implementing this, running the above test produced:

# trace-cmd start -e all
# perf stat -r 10 /work/c/hackbench 50
Time: 29.927
Time: 29.504
Time: 29.761
Time: 29.693
Time: 29.430
Time: 29.999
Time: 29.389
Time: 29.404
Time: 29.871
Time: 29.335

 Performance counter stats for '/work/c/hackbench 50' (10 runs):

     239377.553785      task-clock (msec)         #    7.447 CPUs utilized            ( +-  0.27% )
         3,203,640      context-switches          #    0.013 M/sec                    ( +-  0.36% )
           417,511      cpu-migrations            #    0.002 M/sec                    ( +-  1.56% )
            43,462      page-faults               #    0.182 K/sec                    ( +-  0.98% )
   883,720,553,554      cycles                    #    3.692 GHz                      ( +-  0.27% )
   553,115,449,444      stalled-cycles-frontend   #   62.59% frontend cycles idle     ( +-  0.27% )
   792,603,930,472      instructions              #    0.90  insn per cycle
                                                  #    0.70  stalled cycles per insn  ( +-  0.27% )
   159,390,986,499      branches                  #  665.856 M/sec                    ( +-  0.27% )
     1,310,355,667      branch-misses             #    0.82% of all branches          ( +-  0.18% )

      32.146081513 seconds time elapsed                                          ( +-  0.25% )

We didn't get back 100% of the performance. I didn't expect to, as
retpolines cause overhead in areas other than just tracing. But we went
from 33.971942318 to 32.146081513 seconds. Instead of being 8.9% slower
with retpolines enabled, we are now just 3% slower.

I tried this patch set without RETPOLINE and had this:

# trace-cmd start -e all
# perf stat -r 10 /work/c/hackbench 50
Time: 28.830
Time: 28.457
Time: 29.078
Time: 28.606
Time: 28.377
Time: 28.629
Time: 28.642
Time: 29.005
Time: 28.513
Time: 28.357

 Performance counter stats for '/work/c/hackbench 50' (10 runs):

     231452.110483      task-clock (msec)         #    7.466 CPUs utilized            ( +-  0.28% )
         3,181,305      context-switches          #    0.014 M/sec                    ( +-  0.44% )
           393,496      cpu-migrations            #    0.002 M/sec                    ( +-  1.20% )
            43,673      page-faults               #    0.189 K/sec                    ( +-  0.61% )
   854,481,304,821      cycles                    #    3.692 GHz                      ( +-  0.28% )
   528,175,627,905      stalled-cycles-frontend   #   61.81% frontend cycles idle     ( +-  0.28% )
   787,765,717,278      instructions              #    0.92  insn per cycle
                                                  #    0.67  stalled cycles per insn  ( +-  0.28% )
   157,169,268,775      branches                  #  679.057 M/sec                    ( +-  0.27% )
       366,443,397      branch-misses             #    0.23% of all branches          ( +-  0.15% )

      31.002540109 seconds time elapsed 

That went from 31.201263687 to 31.002540109 seconds, which is a 0.6%
speedup. Not great, but not bad either.

Note, there's also test code that creates some files in a debugfs
directory called "jump_funcs": func0, func1, func2, func3 and func4,
where each file has a dynamic function associated with it that takes the
number of parameters in the file's name. There are three functions that
each of these dynamic functions can be changed to, and echoing in "0",
"1" or "2" will update the dynamic function. Reading from a file causes
its dynamic function to be called, and the called functions printk() to
the console so you can see how it worked.
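
For example (assuming debugfs is mounted at /sys/kernel/debug):

# cd /sys/kernel/debug/jump_funcs
# cat func2
# echo 1 > func2
# cat func2

The first read should show myfunc2_default in the kernel log, and the
second read myfunc2_test1, called directly.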

Now what?

OK, for the TODO: if nobody has any issues with this, I was going to
hand this off to Matt Helsley to make it into something that's actually
presentable for inclusion.

1) We need to move the x86 specific code into x86 specific locations.

2) We need to have this work without the dynamic updates (for archs
that don't have this implemented). Basically, the dynamic function is
probably going to be a macro with a function pointer that does an
indirect jump to the code that is assigned to the dynamic function (a
minimal sketch is below).

3) Write up proper change logs ;-)

And I'm sure there's more to do.
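
For item 2, here's a minimal sketch of what the generic fallback could
look like. This is an assumption about a future version, not code from
these patches:

	/* Generic fallback (assumed): a plain function pointer, so a
	 * call to dynfunc_my_func() is an ordinary indirect call. */
	#define DECLARE_DYNAMIC_FUNCTION(name, proto, args)		\
		extern int (*dynfunc_##name)(proto)

	#define DEFINE_DYNAMIC_FUNCTION(name, def, proto)		\
		int (*dynfunc_##name)(proto) = def

	/* assign_dynamic_function() then reduces to a pointer update. */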

Enjoy,

-- Steve



  git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git
ftrace/jump_function

Head SHA1: 1a2e530e7534d82b95eaa9ddc5218c5652a60d49


Steven Rostedt (VMware) (2):
      jump_function: Addition of new feature "jump_function"
      tracepoints: Implement it with dynamic functions

----
 include/asm-generic/vmlinux.lds.h |   4 +
 include/linux/jump_function.h     |  93 ++++++++++
 include/linux/tracepoint-defs.h   |   3 +
 include/linux/tracepoint.h        |  65 ++++---
 include/trace/define_trace.h      |  14 +-
 kernel/Makefile                   |   2 +-
 kernel/jump_function.c            | 368 ++++++++++++++++++++++++++++++++++++++
 kernel/tracepoint.c               |  29 ++-
 8 files changed, 545 insertions(+), 33 deletions(-)
 create mode 100644 include/linux/jump_function.h
 create mode 100644 kernel/jump_function.c


* [POC][RFC][PATCH 1/2] jump_function: Addition of new feature "jump_function"
  2018-10-06  1:51 [POC][RFC][PATCH 0/2] PROOF OF CONCEPT: Dynamic Functions (jump functions) Steven Rostedt
@ 2018-10-06  1:51 ` Steven Rostedt
  2018-10-06  2:00   ` Steven Rostedt
                     ` (4 more replies)
  2018-10-06  1:51 ` [POC][RFC][PATCH 2/2] tracepoints: Implement it with dynamic functions Steven Rostedt
  1 sibling, 5 replies; 43+ messages in thread
From: Steven Rostedt @ 2018-10-06  1:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Linus Torvalds, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Masami Hiramatsu, Mathieu Desnoyers,
	Matthew Helsley, Rafael J . Wysocki, David Woodhouse,
	Paolo Bonzini, Josh Poimboeuf, Jason Baron, Jiri Kosina


From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 include/asm-generic/vmlinux.lds.h |   4 +
 include/linux/jump_function.h     |  93 ++++++++
 kernel/Makefile                   |   2 +-
 kernel/jump_function.c            | 368 ++++++++++++++++++++++++++++++
 4 files changed, 466 insertions(+), 1 deletion(-)
 create mode 100644 include/linux/jump_function.h
 create mode 100644 kernel/jump_function.c

diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 7b75ff6e2fce..0e205069ff36 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -257,6 +257,10 @@
 	__start___jump_table = .;					\
 	KEEP(*(__jump_table))                                           \
 	__stop___jump_table = .;					\
+	. = ALIGN(8);                                                   \
+	__start___dynfunc_table = .;					\
+	KEEP(*(__dynfunc_table))					\
+	__stop___dynfunc_table = .;					\
 	. = ALIGN(8);							\
 	__start___verbose = .;						\
 	KEEP(*(__verbose))                                              \
diff --git a/include/linux/jump_function.h b/include/linux/jump_function.h
new file mode 100644
index 000000000000..8c6b0bab5f10
--- /dev/null
+++ b/include/linux/jump_function.h
@@ -0,0 +1,93 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_JUMP_FUNCTION_H
+#define _LINUX_JUMP_FUNCTION_H
+
+
+//// This all should be in arch/x86/include/asm
+
+typedef long dynfunc_t;
+
+struct dynfunc_struct;
+
+#define arch_dynfunc_trampoline(name, def)	\
+	asm volatile (				\
+	".globl dynfunc_" #name "; \n\t"	\
+	"dynfunc_" #name ": \n\t"		\
+	"jmp " #def " \n\t"			\
+	".balign 8 \n \t"			\
+	: : : "memory" )
+
+int arch_assign_dynamic_function(const struct dynfunc_struct *dynfunc, void *func);
+
+//////////////// The below should be in include/linux
+
+#ifndef PARAMS
+#define PARAMS(x...) x
+#endif
+
+#ifndef ARGS
+#define ARGS(x...) x
+#endif
+
+struct dynfunc_struct {
+	const void		*dynfunc;
+	void			*func;
+};
+
+int assign_dynamic_function(const struct dynfunc_struct *dynfunc, void *func);
+
+/*
+ * DECLARE_DYNAMIC_FUNCTION - Declaration to create a dynamic function call
+ * @name: The name of the function call to create
+ * @proto: The proto-type of the function (up to 4 args)
+ * @args: The arguments used by @proto
+ *
+ * This macro creates the function that can by used to create a dynamic
+ * function call later. It also creates the function to modify what is
+ * called:
+ *
+ *   dynfunc_[name](args);
+ *
+ * This is placed in the code where the dynamic function should be called
+ * from.
+ *
+ *   assign_dynamic_function_[name](func);
+ *
+ * This is used to make the dynfunc_[name]() call a different function.
+ * It will then call (func) instead.
+ *
+ * This must be added in a header for users of the above two functions.
+ */
+#define DECLARE_DYNAMIC_FUNCTION(name, proto, args)			\
+	extern struct dynfunc_struct ___dyn_func__##name;		\
+	static inline int assign_dynamic_function_##name(int(*func)(proto)) { \
+		return assign_dynamic_function(&___dyn_func__##name, func); \
+	}								\
+	extern int dynfunc_##name(proto)
+
+/*
+ * DEFINE_DYNAMIC_FUNCTION - Define the dynamic function and default
+ * @name: The name of the function call to create
+ * @def: The default function to call
+ * @proto: The proto-type of the function (up to 4 args)
+ *
+ * Must be placed in a C file.
+ *
+ * This sets up the dynamic function that other places may call
+ * dynfunc_[name]().
+ *
+ * It defines the default function that the dynamic function will start
+ * out calling at boot up.
+ */
+#define DEFINE_DYNAMIC_FUNCTION(name, def, proto)			\
+	static void __used __dyn_func_trampoline_##name(void)		\
+	{								\
+		arch_dynfunc_trampoline(name, def);			\
+		unreachable();						\
+	}								\
+	struct dynfunc_struct ___dyn_func__##name __used = {		\
+		.dynfunc	= (void *)dynfunc_##name,		\
+		.func		= def,					\
+	}
+
+#endif	/*  _LINUX_JUMP_FUNCTION_H */
diff --git a/kernel/Makefile b/kernel/Makefile
index 7a63d567fdb5..c647c7f15318 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -10,7 +10,7 @@ obj-y     = fork.o exec_domain.o panic.o \
 	    extable.o params.o \
 	    kthread.o sys_ni.o nsproxy.o \
 	    notifier.o ksysfs.o cred.o reboot.o \
-	    async.o range.o smpboot.o ucount.o
+	    async.o range.o smpboot.o ucount.o jump_function.o
 
 obj-$(CONFIG_MODULES) += kmod.o
 obj-$(CONFIG_MULTIUSER) += groups.o
diff --git a/kernel/jump_function.c b/kernel/jump_function.c
new file mode 100644
index 000000000000..f3decae1bb84
--- /dev/null
+++ b/kernel/jump_function.c
@@ -0,0 +1,368 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * dynamic function support
+ *
+ * Copyright (C) 2018 VMware inc, Steven Rostedt <rostedt@goodmis.org>
+ *
+ */
+
+#include <linux/jump_function.h>
+#include <linux/memory.h>
+#include <linux/module.h>
+#include <linux/list.h>
+#include <linux/sort.h>
+#include <linux/err.h>
+
+#include <asm/sections.h>
+#include <asm/text-patching.h>
+
+#include <linux/uaccess.h>
+
+static DEFINE_MUTEX(dynfunc_mutex);
+
+
+////// The below should be in arch/x86/kernel
+
+#define CALL_SIZE 5
+
+union call_code_union {
+	unsigned char code[CALL_SIZE];
+	struct {
+		unsigned char e9;
+		int offset;
+	} __attribute__((packed));
+};
+
+int arch_assign_dynamic_function(const struct dynfunc_struct *dynfunc,
+				void *func)
+{
+	unsigned long dfunc = (unsigned long)dynfunc->dynfunc;
+	union call_code_union code;
+
+	/* Debug to see what we are replacing (remove this) */
+	probe_kernel_read(code.code, (void *)dfunc, CALL_SIZE);
+#if 0
+	printk("old code = %02x %02x %02x %02x %02x %pS (%lx)\n",
+		code.code[0], code.code[1], code.code[2], code.code[3], code.code[4],
+	       (void *)(code.offset + dfunc + CALL_SIZE),
+	       code.offset + dfunc + CALL_SIZE);
+#endif
+
+	code.e9 = 0xe9;
+	code.offset = (int)((unsigned long)func - (dfunc + CALL_SIZE));
+
+#if 0
+	/* Debug to see what we are updating to (remove this) */
+	printk("adding func %pS to %pS (%lx) %02x %02x %02x %02x %02x\n",
+	       func, (void *)dfunc, (unsigned long)dfunc,
+		code.code[0], code.code[1], code.code[2], code.code[3], code.code[4]);
+#endif
+
+	mutex_lock(&text_mutex);
+	text_poke_bp((void *)dfunc, code.code, CALL_SIZE, func);
+	mutex_unlock(&text_mutex);
+
+	return 0;
+}
+
+////////////// The below can be in kernel/jump_function.c
+
+int assign_dynamic_function(const struct dynfunc_struct *dynfunc, void *func)
+{
+	int ret;
+
+	mutex_lock(&dynfunc_mutex);
+	ret = arch_assign_dynamic_function(dynfunc, func);
+	mutex_unlock(&dynfunc_mutex);
+
+	return ret;
+}
+
+///////// The below is for testing. Can be added in sample code.
+
+#include <linux/debugfs.h>
+
+/*
+ * The below creates a directory in debugfs called "jump_funcs" and
+ * five files within that directory:
+ *
+ * func0, func1, func2, func3, func4.
+ *
+ * Each of those files trigger a dynamic function, with the number
+ * of arguments that match the number in the file name. The
+ * arguments are an "int", "long", "void *" and "char *" (for the defined
+ * arguments of the dynamic functions). The values used are:
+ * "1", "2", "0xdeadbeef" and "random string".
+ *
+ * Reading the file causes a dynamic function to be called. The
+ * functions assigned to the dynamic functions just prints its own
+ * function name, followed by the parameters passed to it.
+ *
+ * Each dynamic function has 3 functions that can be assigned to it.
+ * Echoing a "0" through "2" will change the function that is
+ * assigned. Doing another read of that file should show that
+ * the dynamic function has been updated.
+ */
+DECLARE_DYNAMIC_FUNCTION(myfunc0, PARAMS(void), ARGS());
+DECLARE_DYNAMIC_FUNCTION(myfunc1, PARAMS(int a), ARGS(a));
+DECLARE_DYNAMIC_FUNCTION(myfunc2, PARAMS(int a, long b), ARGS(a, b));
+DECLARE_DYNAMIC_FUNCTION(myfunc3, PARAMS(int a, long b, void *c),
+			 ARGS(a, b, c));
+DECLARE_DYNAMIC_FUNCTION(myfunc4, PARAMS(int a, long b, void *c, char *d),
+			 ARGS(a, b, c, d));
+
+static int myfunc0_default(void)
+{
+	printk("%s\n", __func__);
+	return 0;
+}
+
+static int myfunc1_default(int a)
+{
+	printk("%s %d\n", __func__, a);
+	return 0;
+}
+
+static int myfunc2_default(int a, long b)
+{
+	printk("%s %d %ld\n", __func__, a, b);
+	return 0;
+}
+
+static int myfunc3_default(int a, long b, void *c)
+{
+	printk("%s %d %ld %p\n", __func__, a, b, c);
+	return 0;
+}
+
+static int myfunc4_default(int a, long b, void *c, char *d)
+{
+	printk("%s %d %ld %p %s\n", __func__, a, b, c, d);
+	return 0;
+}
+
+DEFINE_DYNAMIC_FUNCTION(myfunc0, myfunc0_default, PARAMS(void));
+DEFINE_DYNAMIC_FUNCTION(myfunc1, myfunc1_default, PARAMS(int a));
+DEFINE_DYNAMIC_FUNCTION(myfunc2, myfunc2_default, PARAMS(int a, long b));
+DEFINE_DYNAMIC_FUNCTION(myfunc3, myfunc3_default, PARAMS(int a, long b, void *c));
+DEFINE_DYNAMIC_FUNCTION(myfunc4, myfunc4_default,
+			PARAMS(int a, long b, void *c, char *d));
+
+static int myfunc0_test1(void)
+{
+	printk("%s\n", __func__);
+	return 1;
+}
+
+static int myfunc1_test1(int a)
+{
+	printk("%s %d\n", __func__, a);
+	return 1;
+}
+
+static int myfunc2_test1(int a, long b)
+{
+	printk("%s %d %ld\n", __func__, a, b);
+	return 1;
+}
+
+static int myfunc3_test1(int a, long b, void *c)
+{
+	printk("%s %d %ld %p\n", __func__, a, b, c);
+	return 1;
+}
+
+static int myfunc4_test1(int a, long b, void *c, char *d)
+{
+	printk("%s %d %ld %p %s\n", __func__, a, b, c, d);
+	return 1;
+}
+
+static int myfunc0_test2(void)
+{
+	printk("%s\n", __func__);
+	return 2;
+}
+
+static int myfunc1_test2(int a)
+{
+	printk("%s %d\n", __func__, a);
+	return 2;
+}
+
+static int myfunc2_test2(int a, long b)
+{
+	printk("%s %d %ld\n", __func__, a, b);
+	return 2;
+}
+
+static int myfunc3_test2(int a, long b, void *c)
+{
+	printk("%s %d %ld %px\n", __func__, a, b, c);
+	return 2;
+}
+
+static int myfunc4_test2(int a, long b, void *c, char *d)
+{
+	printk("%s %d %ld %px %s\n", __func__, a, b, c, d);
+	return 2;
+}
+
+static int open_generic(struct inode *inode, struct file *filp)
+{
+	filp->private_data = inode->i_private;
+	return 0;
+}
+
+static ssize_t
+jump_func_write(struct file *filp, const char __user *ubuf,
+	       size_t cnt, loff_t *ppos)
+{
+	long type = (long)filp->private_data;
+	unsigned long val;
+	int ret;
+
+	ret = kstrtoul_from_user(ubuf, cnt, 10, &val);
+	if (ret)
+		return ret;
+
+	switch (type) {
+	case 0:
+		switch(val) {
+		case 0:
+			assign_dynamic_function_myfunc0(myfunc0_default);
+			break;
+		case 1:
+			assign_dynamic_function_myfunc0(myfunc0_test1);
+			break;
+		case 2:
+			assign_dynamic_function_myfunc0(myfunc0_test2);
+			break;
+		}
+		break;
+	case 1:
+		switch(val) {
+		case 0:
+			assign_dynamic_function_myfunc1(myfunc1_default);
+			break;
+		case 1:
+			assign_dynamic_function_myfunc1(myfunc1_test1);
+			break;
+		case 2:
+			assign_dynamic_function_myfunc1(myfunc1_test2);
+			break;
+		}
+		break;
+	case 2:
+		switch(val) {
+		case 0:
+			assign_dynamic_function_myfunc2(myfunc2_default);
+			break;
+		case 1:
+			assign_dynamic_function_myfunc2(myfunc2_test1);
+			break;
+		case 2:
+			assign_dynamic_function_myfunc2(myfunc2_test2);
+			break;
+		}
+		break;
+	case 3:
+		switch(val) {
+		case 0:
+			assign_dynamic_function_myfunc3(myfunc3_default);
+			break;
+		case 1:
+			assign_dynamic_function_myfunc3(myfunc3_test1);
+			break;
+		case 2:
+			assign_dynamic_function_myfunc3(myfunc3_test2);
+			break;
+		}
+		break;
+	case 4:
+		switch(val) {
+		case 0:
+			assign_dynamic_function_myfunc4(myfunc4_default);
+			break;
+		case 1:
+			assign_dynamic_function_myfunc4(myfunc4_test1);
+			break;
+		case 2:
+			assign_dynamic_function_myfunc4(myfunc4_test2);
+			break;
+		}
+		break;
+	}
+	return cnt;
+}
+
+static ssize_t
+jump_func_read(struct file *filp, char __user *ubuf,
+	       size_t count, loff_t *ppos)
+{
+	long type = (long)filp->private_data;
+	int a = 1;
+	long b = 2;
+	void *c = (void *)0xdeadbeef;
+	char *d = "random string";
+	long ret;
+
+	switch (type) {
+	case 0:
+		ret = dynfunc_myfunc0();
+		printk("ret=%ld\n", ret);
+		break;
+	case 1:
+		ret = dynfunc_myfunc1(a);
+		printk("ret=%ld\n", ret);
+		break;
+	case 2:
+		ret = dynfunc_myfunc2(a, b);
+		printk("ret=%ld\n", ret);
+		break;
+	case 3:
+		ret = dynfunc_myfunc3(a, b, c);
+		printk("ret=%ld\n", ret);
+		break;
+	case 4:
+		ret = dynfunc_myfunc4(a, b, c, d);
+		printk("ret=%ld\n", ret);
+		break;
+	}
+
+	*ppos += count;
+	return 0;
+}
+
+static const struct file_operations jump_func_ops = {
+	.open			= open_generic,
+	.write			= jump_func_write,
+	.read			= jump_func_read,
+};
+
+
+static __init int setup_test(void)
+{
+	struct dentry *top = debugfs_create_dir("jump_funcs", NULL);
+
+	if (!top)
+		return -ENOMEM;
+
+	debugfs_create_file("func0", 0666, top, (void *)0,
+			    &jump_func_ops);
+
+	debugfs_create_file("func1", 0666, top, (void *)1,
+			    &jump_func_ops);
+
+	debugfs_create_file("func2", 0666, top, (void *)2,
+			    &jump_func_ops);
+
+	debugfs_create_file("func3", 0666, top, (void *)3,
+			    &jump_func_ops);
+
+	debugfs_create_file("func4", 0666, top, (void *)4,
+			    &jump_func_ops);
+
+	return 0;
+}
+__initcall(setup_test);
-- 
2.19.0




* [POC][RFC][PATCH 2/2] tracepoints: Implement it with dynamic functions
  2018-10-06  1:51 [POC][RFC][PATCH 0/2] PROOF OF CONCEPT: Dynamic Functions (jump functions) Steven Rostedt
  2018-10-06  1:51 ` [POC][RFC][PATCH 1/2] jump_function: Addition of new feature "jump_function" Steven Rostedt
@ 2018-10-06  1:51 ` Steven Rostedt
  1 sibling, 0 replies; 43+ messages in thread
From: Steven Rostedt @ 2018-10-06  1:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Linus Torvalds, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Masami Hiramatsu, Mathieu Desnoyers,
	Matthew Helsley, Rafael J . Wysocki, David Woodhouse,
	Paolo Bonzini, Josh Poimboeuf, Jason Baron, Jiri Kosina


From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 include/linux/tracepoint-defs.h |  3 ++
 include/linux/tracepoint.h      | 65 ++++++++++++++++++++++-----------
 include/trace/define_trace.h    | 14 +++----
 kernel/tracepoint.c             | 29 +++++++++++++--
 4 files changed, 79 insertions(+), 32 deletions(-)

diff --git a/include/linux/tracepoint-defs.h b/include/linux/tracepoint-defs.h
index 22c5a46e9693..a9d267be98de 100644
--- a/include/linux/tracepoint-defs.h
+++ b/include/linux/tracepoint-defs.h
@@ -11,6 +11,8 @@
 #include <linux/atomic.h>
 #include <linux/static_key.h>
 
+struct dynfunc_struct;
+
 struct trace_print_flags {
 	unsigned long		mask;
 	const char		*name;
@@ -30,6 +32,7 @@ struct tracepoint_func {
 struct tracepoint {
 	const char *name;		/* Tracepoint name */
 	struct static_key key;
+	struct dynfunc_struct *dynfunc;
 	int (*regfunc)(void);
 	void (*unregfunc)(void);
 	struct tracepoint_func __rcu *funcs;
diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
index 041f7e56a289..800c1b025e1f 100644
--- a/include/linux/tracepoint.h
+++ b/include/linux/tracepoint.h
@@ -21,6 +21,7 @@
 #include <linux/cpumask.h>
 #include <linux/rcupdate.h>
 #include <linux/tracepoint-defs.h>
+#include <linux/jump_function.h>
 
 struct module;
 struct tracepoint;
@@ -94,7 +95,9 @@ extern int syscall_regfunc(void);
 extern void syscall_unregfunc(void);
 #endif /* CONFIG_HAVE_SYSCALL_TRACEPOINTS */
 
+#ifndef PARAMS
 #define PARAMS(args...) args
+#endif
 
 #define TRACE_DEFINE_ENUM(x)
 #define TRACE_DEFINE_SIZEOF(x)
@@ -138,12 +141,11 @@ extern void syscall_unregfunc(void);
  * as "(void *, void)". The DECLARE_TRACE_NOARGS() will pass in just
  * "void *data", where as the DECLARE_TRACE() will pass in "void *data, proto".
  */
-#define __DO_TRACE(tp, proto, args, cond, rcuidle)			\
+#define __DO_TRACE(name, proto, args, cond, rcuidle)			\
 	do {								\
 		struct tracepoint_func *it_func_ptr;			\
-		void *it_func;						\
-		void *__data;						\
 		int __maybe_unused idx = 0;				\
+		void *__data;						\
 									\
 		if (!(cond))						\
 			return;						\
@@ -163,14 +165,11 @@ extern void syscall_unregfunc(void);
 			rcu_irq_enter_irqson();				\
 		}							\
 									\
-		it_func_ptr = rcu_dereference_raw((tp)->funcs);		\
-									\
+		it_func_ptr =						\
+			rcu_dereference_raw((&__tracepoint_##name)->funcs); \
 		if (it_func_ptr) {					\
-			do {						\
-				it_func = (it_func_ptr)->func;		\
-				__data = (it_func_ptr)->data;		\
-				((void(*)(proto))(it_func))(args);	\
-			} while ((++it_func_ptr)->func);		\
+			__data = (it_func_ptr)->data;			\
+			dynfunc_tp_func_##name(args);			\
 		}							\
 									\
 		if (rcuidle) {						\
@@ -186,7 +185,7 @@ extern void syscall_unregfunc(void);
 	static inline void trace_##name##_rcuidle(proto)		\
 	{								\
 		if (static_key_false(&__tracepoint_##name.key))		\
-			__DO_TRACE(&__tracepoint_##name,		\
+			__DO_TRACE(name,				\
 				TP_PROTO(data_proto),			\
 				TP_ARGS(data_args),			\
 				TP_CONDITION(cond), 1);			\
@@ -208,11 +207,13 @@ extern void syscall_unregfunc(void);
  * poking RCU a bit.
  */
 #define __DECLARE_TRACE(name, proto, args, cond, data_proto, data_args) \
+	DECLARE_DYNAMIC_FUNCTION(tp_func_##name, PARAMS(data_proto),	\
+				 PARAMS(data_args));			\
 	extern struct tracepoint __tracepoint_##name;			\
 	static inline void trace_##name(proto)				\
 	{								\
 		if (static_key_false(&__tracepoint_##name.key))		\
-			__DO_TRACE(&__tracepoint_##name,		\
+			__DO_TRACE(name,				\
 				TP_PROTO(data_proto),			\
 				TP_ARGS(data_args),			\
 				TP_CONDITION(cond), 0);			\
@@ -271,21 +272,43 @@ extern void syscall_unregfunc(void);
  * structures, so we create an array of pointers that will be used for iteration
  * on the tracepoints.
  */
-#define DEFINE_TRACE_FN(name, reg, unreg)				 \
+#define DEFINE_TRACE_FN(name, reg, unreg, proto, args)			\
 	static const char __tpstrtab_##name[]				 \
 	__attribute__((section("__tracepoints_strings"))) = #name;	 \
 	struct tracepoint __tracepoint_##name				 \
 	__attribute__((section("__tracepoints"), used)) =		 \
-		{ __tpstrtab_##name, STATIC_KEY_INIT_FALSE, reg, unreg, NULL };\
-	__TRACEPOINT_ENTRY(name);
+		{ __tpstrtab_##name, STATIC_KEY_INIT_FALSE,		\
+		  &___dyn_func__tp_func_##name, reg, unreg, NULL };	\
+	__TRACEPOINT_ENTRY(name);					\
+	int __tracepoint_iter_##name(void *__data, proto)		\
+	{								\
+		struct tracepoint_func *it_func_ptr;			\
+		void *it_func;						\
+									\
+		it_func_ptr =						\
+			rcu_dereference_raw((&__tracepoint_##name)->funcs); \
+		do {							\
+			it_func = (it_func_ptr)->func;			\
+			__data = (it_func_ptr)->data;			\
+			((void(*)(void *, proto))(it_func))(__data, args); \
+		} while ((++it_func_ptr)->func);			\
+		return 0;						\
+	}								\
+	DEFINE_DYNAMIC_FUNCTION(tp_func_##name, __tracepoint_iter_##name, \
+				PARAMS(void *__data, proto))
 
-#define DEFINE_TRACE(name)						\
-	DEFINE_TRACE_FN(name, NULL, NULL);
+#define DEFINE_TRACE(name, proto, args)		\
+	DEFINE_TRACE_FN(name, NULL, NULL, PARAMS(proto), PARAMS(args));
 
 #define EXPORT_TRACEPOINT_SYMBOL_GPL(name)				\
-	EXPORT_SYMBOL_GPL(__tracepoint_##name)
+	EXPORT_SYMBOL_GPL(__tracepoint_##name);				\
+	EXPORT_SYMBOL_GPL(___dyn_func__tp_func_##name);			\
+	EXPORT_SYMBOL_GPL(dynfunc_tp_func_##name)
 #define EXPORT_TRACEPOINT_SYMBOL(name)					\
-	EXPORT_SYMBOL(__tracepoint_##name)
+	EXPORT_SYMBOL(__tracepoint_##name);				\
+	EXPORT_SYMBOL(___dyn_func__tp_func_##name);			\
+	EXPORT_SYMBOL(dynfunc_tp_func_##name)
+
 
 #else /* !TRACEPOINTS_ENABLED */
 #define __DECLARE_TRACE(name, proto, args, cond, data_proto, data_args) \
@@ -314,8 +337,8 @@ extern void syscall_unregfunc(void);
 		return false;						\
 	}
 
-#define DEFINE_TRACE_FN(name, reg, unreg)
-#define DEFINE_TRACE(name)
+#define DEFINE_TRACE_FN(name, reg, unreg, proto, args)
+#define DEFINE_TRACE(name, proto, args)
 #define EXPORT_TRACEPOINT_SYMBOL_GPL(name)
 #define EXPORT_TRACEPOINT_SYMBOL(name)
 
diff --git a/include/trace/define_trace.h b/include/trace/define_trace.h
index cb30c5532144..c19aea44efb2 100644
--- a/include/trace/define_trace.h
+++ b/include/trace/define_trace.h
@@ -25,7 +25,7 @@
 
 #undef TRACE_EVENT
 #define TRACE_EVENT(name, proto, args, tstruct, assign, print)	\
-	DEFINE_TRACE(name)
+	DEFINE_TRACE(name, PARAMS(proto), PARAMS(args))
 
 #undef TRACE_EVENT_CONDITION
 #define TRACE_EVENT_CONDITION(name, proto, args, cond, tstruct, assign, print) \
@@ -39,24 +39,24 @@
 #undef TRACE_EVENT_FN
 #define TRACE_EVENT_FN(name, proto, args, tstruct,		\
 		assign, print, reg, unreg)			\
-	DEFINE_TRACE_FN(name, reg, unreg)
+	DEFINE_TRACE_FN(name, reg, unreg, PARAMS(proto), PARAMS(args))
 
 #undef TRACE_EVENT_FN_COND
 #define TRACE_EVENT_FN_COND(name, proto, args, cond, tstruct,		\
 		assign, print, reg, unreg)			\
-	DEFINE_TRACE_FN(name, reg, unreg)
+	DEFINE_TRACE_FN(name, reg, unreg, PARAMS(proto), PARAMS(args))
 
 #undef DEFINE_EVENT
 #define DEFINE_EVENT(template, name, proto, args) \
-	DEFINE_TRACE(name)
+	DEFINE_TRACE(name, PARAMS(proto), PARAMS(args))
 
 #undef DEFINE_EVENT_FN
 #define DEFINE_EVENT_FN(template, name, proto, args, reg, unreg) \
-	DEFINE_TRACE_FN(name, reg, unreg)
+	DEFINE_TRACE_FN(name, reg, unreg, PARAMS(proto), PARAMS(args))
 
 #undef DEFINE_EVENT_PRINT
 #define DEFINE_EVENT_PRINT(template, name, proto, args, print)	\
-	DEFINE_TRACE(name)
+	DEFINE_TRACE(name, PARAMS(proto), PARAMS(args))
 
 #undef DEFINE_EVENT_CONDITION
 #define DEFINE_EVENT_CONDITION(template, name, proto, args, cond) \
@@ -64,7 +64,7 @@
 
 #undef DECLARE_TRACE
 #define DECLARE_TRACE(name, proto, args)	\
-	DEFINE_TRACE(name)
+	DEFINE_TRACE(name, PARAMS(proto), PARAMS(args))
 
 #undef TRACE_INCLUDE
 #undef __TRACE_INCLUDE
diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c
index bf2c06ef9afc..b141f25d4b3a 100644
--- a/kernel/tracepoint.c
+++ b/kernel/tracepoint.c
@@ -140,7 +140,7 @@ static void debug_print_probes(struct tracepoint_func *funcs)
 
 static struct tracepoint_func *
 func_add(struct tracepoint_func **funcs, struct tracepoint_func *tp_func,
-	 int prio)
+	 int prio, int *tot_probes)
 {
 	struct tracepoint_func *old, *new;
 	int nr_probes = 0;
@@ -183,11 +183,12 @@ func_add(struct tracepoint_func **funcs, struct tracepoint_func *tp_func,
 	new[nr_probes + 1].func = NULL;
 	*funcs = new;
 	debug_print_probes(*funcs);
+	*tot_probes = nr_probes + 1;
 	return old;
 }
 
 static void *func_remove(struct tracepoint_func **funcs,
-		struct tracepoint_func *tp_func)
+		struct tracepoint_func *tp_func, int *left)
 {
 	int nr_probes = 0, nr_del = 0, i;
 	struct tracepoint_func *old, *new;
@@ -241,6 +242,7 @@ static int tracepoint_add_func(struct tracepoint *tp,
 			       struct tracepoint_func *func, int prio)
 {
 	struct tracepoint_func *old, *tp_funcs;
+	int probes = 0;
 	int ret;
 
 	if (tp->regfunc && !static_key_enabled(&tp->key)) {
@@ -251,7 +253,7 @@ static int tracepoint_add_func(struct tracepoint *tp,
 
 	tp_funcs = rcu_dereference_protected(tp->funcs,
 			lockdep_is_held(&tracepoints_mutex));
-	old = func_add(&tp_funcs, func, prio);
+	old = func_add(&tp_funcs, func, prio, &probes);
 	if (IS_ERR(old)) {
 		WARN_ON_ONCE(PTR_ERR(old) != -ENOMEM);
 		return PTR_ERR(old);
@@ -266,6 +268,15 @@ static int tracepoint_add_func(struct tracepoint *tp,
 	rcu_assign_pointer(tp->funcs, tp_funcs);
 	if (!static_key_enabled(&tp->key))
 		static_key_slow_inc(&tp->key);
+
+	if (probes == 1) {
+//		printk("make direct call to %pS\n", tp_funcs->func);
+		assign_dynamic_function(tp->dynfunc, tp_funcs->func);
+	} else {
+//		printk("[%d] make call to iterator %pS\n", probes, tp->dynfunc->func);
+		assign_dynamic_function(tp->dynfunc, tp->dynfunc->func);
+	}
+
 	release_probes(old);
 	return 0;
 }
@@ -280,10 +291,11 @@ static int tracepoint_remove_func(struct tracepoint *tp,
 		struct tracepoint_func *func)
 {
 	struct tracepoint_func *old, *tp_funcs;
+	int probes_left = 0;
 
 	tp_funcs = rcu_dereference_protected(tp->funcs,
 			lockdep_is_held(&tracepoints_mutex));
-	old = func_remove(&tp_funcs, func);
+	old = func_remove(&tp_funcs, func, &probes_left);
 	if (IS_ERR(old)) {
 		WARN_ON_ONCE(PTR_ERR(old) != -ENOMEM);
 		return PTR_ERR(old);
@@ -297,6 +309,15 @@ static int tracepoint_remove_func(struct tracepoint *tp,
 		if (static_key_enabled(&tp->key))
 			static_key_slow_dec(&tp->key);
 	}
+
+	if (probes_left == 1) {
+//		printk("make direct call to %pS\n", tp_funcs->func);
+		assign_dynamic_function(tp->dynfunc, tp_funcs->func);
+	} else {
+//		printk("[%d] make call to iterator %pS\n", probes_left, tp->dynfunc->func);
+		assign_dynamic_function(tp->dynfunc, tp->dynfunc->func);
+	}
+
 	rcu_assign_pointer(tp->funcs, tp_funcs);
 	release_probes(old);
 	return 0;
-- 
2.19.0




* Re: [POC][RFC][PATCH 1/2] jump_function: Addition of new feature "jump_function"
  2018-10-06  1:51 ` [POC][RFC][PATCH 1/2] jump_function: Addition of new feature "jump_function" Steven Rostedt
@ 2018-10-06  2:00   ` Steven Rostedt
  2018-10-06  2:02   ` Steven Rostedt
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 43+ messages in thread
From: Steven Rostedt @ 2018-10-06  2:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: Linus Torvalds, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Masami Hiramatsu, Mathieu Desnoyers,
	Matthew Helsley, Rafael J . Wysocki, David Woodhouse,
	Paolo Bonzini, Josh Poimboeuf, Jason Baron, Jiri Kosina

On Fri, 05 Oct 2018 21:51:11 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
> 
> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
> ---
>  include/asm-generic/vmlinux.lds.h |   4 +
>  include/linux/jump_function.h     |  93 ++++++++
>  kernel/Makefile                   |   2 +-
>  kernel/jump_function.c            | 368 ++++++++++++++++++++++++++++++
>  4 files changed, 466 insertions(+), 1 deletion(-)
>  create mode 100644 include/linux/jump_function.h
>  create mode 100644 kernel/jump_function.c
> 
> diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
> index 7b75ff6e2fce..0e205069ff36 100644
> --- a/include/asm-generic/vmlinux.lds.h
> +++ b/include/asm-generic/vmlinux.lds.h
> @@ -257,6 +257,10 @@
>  	__start___jump_table = .;					\
>  	KEEP(*(__jump_table))                                           \
>  	__stop___jump_table = .;					\
> +	. = ALIGN(8);                                                   \
> +	__start___dynfunc_table = .;					\
> +	KEEP(*(__dynfunc_table))					\
> +	__stop___dynfunc_table = .;					\
>  	. = ALIGN(8);							\
>  	__start___verbose = .;						\
>  	KEEP(*(__verbose))                                              \
>

BAH, this is leftover from my first attempt. It's not needed and can be
nuked.

-- Steve


* Re: [POC][RFC][PATCH 1/2] jump_function: Addition of new feature "jump_function"
  2018-10-06  1:51 ` [POC][RFC][PATCH 1/2] jump_function: Addition of new feature "jump_function" Steven Rostedt
  2018-10-06  2:00   ` Steven Rostedt
@ 2018-10-06  2:02   ` Steven Rostedt
  2018-10-06  2:03   ` Steven Rostedt
                     ` (2 subsequent siblings)
  4 siblings, 0 replies; 43+ messages in thread
From: Steven Rostedt @ 2018-10-06  2:02 UTC (permalink / raw)
  To: linux-kernel
  Cc: Linus Torvalds, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Masami Hiramatsu, Mathieu Desnoyers,
	Matthew Helsley, Rafael J . Wysocki, David Woodhouse,
	Paolo Bonzini, Josh Poimboeuf, Jason Baron, Jiri Kosina

On Fri, 05 Oct 2018 21:51:11 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> +#define arch_dynfunc_trampoline(name, def)	\
> +	asm volatile (				\
> +	".globl dynfunc_" #name "; \n\t"	\
> +	"dynfunc_" #name ": \n\t"		\
> +	"jmp " #def " \n\t"			\
> +	".balign 8 \n \t"			\
> +	: : : "memory" )
> +

Note, the assembler can easily put in a two byte jump here. The .balign
was supposed to also have some padding (nop) in case that happens. It's
fine, because we can just replace it with a 5 byte jump, as long as we
have 3 bytes afterward if it is a two byte jump.
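
For reference, the two jmp encodings in question (the sizes are what
matter here):

	eb XX			/* 2 byte short jmp (rel8)  */
	e9 XX XX XX XX		/* 5 byte near  jmp (rel32) */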

-- Steve


* Re: [POC][RFC][PATCH 1/2] jump_function: Addition of new feature "jump_function"
  2018-10-06  1:51 ` [POC][RFC][PATCH 1/2] jump_function: Addition of new feature "jump_function" Steven Rostedt
  2018-10-06  2:00   ` Steven Rostedt
  2018-10-06  2:02   ` Steven Rostedt
@ 2018-10-06  2:03   ` Steven Rostedt
  2018-10-06 15:15     ` Steven Rostedt
  2018-10-06 12:12   ` Peter Zijlstra
  2018-10-09  3:44   ` Masami Hiramatsu
  4 siblings, 1 reply; 43+ messages in thread
From: Steven Rostedt @ 2018-10-06  2:03 UTC (permalink / raw)
  To: linux-kernel
  Cc: Linus Torvalds, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Masami Hiramatsu, Mathieu Desnoyers,
	Matthew Helsley, Rafael J . Wysocki, David Woodhouse,
	Paolo Bonzini, Josh Poimboeuf, Jason Baron, Jiri Kosina

On Fri, 05 Oct 2018 21:51:11 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> +#ifndef PARAMS
> +#define PARAMS(x...) x
> +#endif
> +
> +#ifndef ARGS
> +#define ARGS(x...) x
> +#endif
> +

This is also leftover from the first attempt and can be nuked.

Yeah, yeah, I should have reviewed my patches better before sending
them. But I was so excited that I got it working I just wanted to share
the joy!

-- Steve


* Re: [POC][RFC][PATCH 1/2] jump_function: Addition of new feature "jump_function"
  2018-10-06  1:51 ` [POC][RFC][PATCH 1/2] jump_function: Addition of new feature "jump_function" Steven Rostedt
                     ` (2 preceding siblings ...)
  2018-10-06  2:03   ` Steven Rostedt
@ 2018-10-06 12:12   ` Peter Zijlstra
  2018-10-06 13:39     ` Steven Rostedt
  2018-10-09  3:44   ` Masami Hiramatsu
  4 siblings, 1 reply; 43+ messages in thread
From: Peter Zijlstra @ 2018-10-06 12:12 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Linus Torvalds, Ingo Molnar, Andrew Morton,
	Thomas Gleixner, Masami Hiramatsu, Mathieu Desnoyers,
	Matthew Helsley, Rafael J . Wysocki, David Woodhouse,
	Paolo Bonzini, Josh Poimboeuf, Jason Baron, Jiri Kosina,
	ard.biesheuvel, Andy Lutomirski

On Fri, Oct 05, 2018 at 09:51:11PM -0400, Steven Rostedt wrote:
> +#define arch_dynfunc_trampoline(name, def)	\
> +	asm volatile (				\
> +	".globl dynfunc_" #name "; \n\t"	\
> +	"dynfunc_" #name ": \n\t"		\
> +	"jmp " #def " \n\t"			\
> +	".balign 8 \n \t"			\
> +	: : : "memory" )

Bah, what is it with you people and trampolines. Why can't we, just like
jump_label, patch the call directly?

The whole call+jmp thing is silly, don't do that. It just wrecks I$ and
is slower for no real reason afaict.

Steve, also see:

  https://lkml.kernel.org/r/20181005081333.15018-1-ard.biesheuvel@linaro.org


* Re: [POC][RFC][PATCH 1/2] jump_function: Addition of new feature "jump_function"
  2018-10-06 12:12   ` Peter Zijlstra
@ 2018-10-06 13:39     ` Steven Rostedt
  2018-10-06 15:13       ` Andy Lutomirski
                         ` (2 more replies)
  0 siblings, 3 replies; 43+ messages in thread
From: Steven Rostedt @ 2018-10-06 13:39 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, Linus Torvalds, Ingo Molnar, Andrew Morton,
	Thomas Gleixner, Masami Hiramatsu, Mathieu Desnoyers,
	Matthew Helsley, Rafael J . Wysocki, David Woodhouse,
	Paolo Bonzini, Josh Poimboeuf, Jason Baron, Jiri Kosina,
	ard.biesheuvel, Andy Lutomirski

On Sat, 6 Oct 2018 14:12:11 +0200
Peter Zijlstra <peterz@infradead.org> wrote:

> On Fri, Oct 05, 2018 at 09:51:11PM -0400, Steven Rostedt wrote:
> > +#define arch_dynfunc_trampoline(name, def)	\
> > +	asm volatile (				\
> > +	".globl dynfunc_" #name "; \n\t"	\
> > +	"dynfunc_" #name ": \n\t"		\
> > +	"jmp " #def " \n\t"			\
> > +	".balign 8 \n \t"			\
> > +	: : : "memory" )  
> 
> Bah, what is it with you people and trampolines. Why can't we, just like
> jump_label, patch the call directly?
> 
> The whole call+jmp thing is silly, don't do that. It just wrecks I$ and
> is slower for no real reason afaict.

My first attempt was to do just that. But to add a label at the
call site required handling all the parameters too. See my branch:
 ftrace/jump_function-v1 for how ugly it got (and it didn't work).

> 
> Steve, also see:
> 
>   https://lkml.kernel.org/r/20181005081333.15018-1-ard.biesheuvel@linaro.org

Interesting. I don't have time to look at it at the moment to see what
was done, but will do so in the near future.

Remember, this was a proof of concept and even with the trampolines, it
showed a great level of improvement. One thought was to do a
"recordmcount.c" type of action to find where the calls were and patch
them directly at boot up. I tried to keep the API the same so that this
could actually be done as an improvement later.

Perhaps a gcc plugin might work too.

I'll have to see what Ard did to handle the function parameters.

-- Steve



* Re: [POC][RFC][PATCH 1/2] jump_function: Addition of new feature "jump_function"
  2018-10-06 13:39     ` Steven Rostedt
@ 2018-10-06 15:13       ` Andy Lutomirski
  2018-10-06 15:16         ` Steven Rostedt
  2018-10-08  7:21       ` Peter Zijlstra
  2018-10-08 11:30       ` Ard Biesheuvel
  2 siblings, 1 reply; 43+ messages in thread
From: Andy Lutomirski @ 2018-10-06 15:13 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Peter Zijlstra, linux-kernel, Linus Torvalds, Ingo Molnar,
	Andrew Morton, Thomas Gleixner, Masami Hiramatsu,
	Mathieu Desnoyers, Matthew Helsley, Rafael J . Wysocki,
	David Woodhouse, Paolo Bonzini, Josh Poimboeuf, Jason Baron,
	Jiri Kosina, ard.biesheuvel, Andy Lutomirski



> On Oct 6, 2018, at 6:39 AM, Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> On Sat, 6 Oct 2018 14:12:11 +0200
> Peter Zijlstra <peterz@infradead.org> wrote:
> 
>>> On Fri, Oct 05, 2018 at 09:51:11PM -0400, Steven Rostedt wrote:
>>> +#define arch_dynfunc_trampoline(name, def)    \
>>> +    asm volatile (                \
>>> +    ".globl dynfunc_" #name "; \n\t"    \
>>> +    "dynfunc_" #name ": \n\t"        \
>>> +    "jmp " #def " \n\t"            \
>>> +    ".balign 8 \n \t"            \
>>> +    : : : "memory" )  
>> 
>> Bah, what is it with you people and trampolines. Why can't we, just like
>> jump_label, patch the call directly?
>> 
>> The whole call+jmp thing is silly, don't do that. It just wrecks I$ and
>> is slower for no real reason afaict.
> 
> My first attempt was to do just that. But to add a label at the
> call site required handling all the parameters too. See my branch:
> ftrace/jump_function-v1 for how ugly it got (and it didn't work).
> 
>> 
>> Steve, also see:
>> 
>>  https://lkml.kernel.org/r/20181005081333.15018-1-ard.biesheuvel@linaro.org
> 
> Interesting. I don't have time to look at it at the moment to see what
> was done, but will do so in the near future.
> 
> Remember, this was a proof of concept and even with the trampolines, it
> showed a great level of improvement. One thought was to do a
> "recordmcount.c" type of action to find where the calls were and patch
> them directly at boot up. I tried to keep the API the same where this
> could actually be done as an improvement later.
> 
> Perhaps a gcc plugin might work too.
> 

My suggestion was to have objtool do the dirty work. Josh said something suspiciously like “sounds fun” on IRC :)

> I'll have to see what Ard did to handle the function parameters.
> 
> -- Steve
> 


* Re: [POC][RFC][PATCH 1/2] jump_function: Addition of new feature "jump_function"
  2018-10-06  2:03   ` Steven Rostedt
@ 2018-10-06 15:15     ` Steven Rostedt
  0 siblings, 0 replies; 43+ messages in thread
From: Steven Rostedt @ 2018-10-06 15:15 UTC (permalink / raw)
  To: linux-kernel
  Cc: Linus Torvalds, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Masami Hiramatsu, Mathieu Desnoyers,
	Matthew Helsley, Rafael J . Wysocki, David Woodhouse,
	Paolo Bonzini, Josh Poimboeuf, Jason Baron, Jiri Kosina

On Fri, 5 Oct 2018 22:03:51 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> On Fri, 05 Oct 2018 21:51:11 -0400
> Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> > +#ifndef PARAMS
> > +#define PARAMS(x...) x
> > +#endif
> > +
> > +#ifndef ARGS
> > +#define ARGS(x...) x
> > +#endif
> > +  
> 
> This is also leftover from the first attempt and can be nuked.
>

I take this back. It is still needed.

-- Steve


* Re: [POC][RFC][PATCH 1/2] jump_function: Addition of new feature "jump_function"
  2018-10-06 15:13       ` Andy Lutomirski
@ 2018-10-06 15:16         ` Steven Rostedt
  0 siblings, 0 replies; 43+ messages in thread
From: Steven Rostedt @ 2018-10-06 15:16 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Peter Zijlstra, linux-kernel, Linus Torvalds, Ingo Molnar,
	Andrew Morton, Thomas Gleixner, Masami Hiramatsu,
	Mathieu Desnoyers, Matthew Helsley, Rafael J . Wysocki,
	David Woodhouse, Paolo Bonzini, Josh Poimboeuf, Jason Baron,
	Jiri Kosina, ard.biesheuvel, Andy Lutomirski

On Sat, 6 Oct 2018 08:13:18 -0700
Andy Lutomirski <luto@amacapital.net> wrote:

> > Perhaps a gcc plugin might work too.
> >   
> 
> My suggestion was to have objtool do the dirty work. Josh said something suspiciously like “sounds fun” on IRC :)
> 

objtool does basically the same thing as recordmcount does. Josh and I
have both said that it's on our todo list to combine the two and make it
more generic for operations like this.

Seems like now's the time to do it.

-- Steve



* Re: [POC][RFC][PATCH 1/2] jump_function: Addition of new feature "jump_function"
  2018-10-06 13:39     ` Steven Rostedt
  2018-10-06 15:13       ` Andy Lutomirski
@ 2018-10-08  7:21       ` Peter Zijlstra
  2018-10-08  8:33         ` Andy Lutomirski
  2018-10-08 11:30       ` Ard Biesheuvel
  2 siblings, 1 reply; 43+ messages in thread
From: Peter Zijlstra @ 2018-10-08  7:21 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Linus Torvalds, Ingo Molnar, Andrew Morton,
	Thomas Gleixner, Masami Hiramatsu, Mathieu Desnoyers,
	Matthew Helsley, Rafael J . Wysocki, David Woodhouse,
	Paolo Bonzini, Josh Poimboeuf, Jason Baron, Jiri Kosina,
	ard.biesheuvel, Andy Lutomirski

On Sat, Oct 06, 2018 at 09:39:05AM -0400, Steven Rostedt wrote:
> On Sat, 6 Oct 2018 14:12:11 +0200
> Peter Zijlstra <peterz@infradead.org> wrote:
> 
> > On Fri, Oct 05, 2018 at 09:51:11PM -0400, Steven Rostedt wrote:
> > > +#define arch_dynfunc_trampoline(name, def)	\
> > > +	asm volatile (				\
> > > +	".globl dynfunc_" #name "; \n\t"	\
> > > +	"dynfunc_" #name ": \n\t"		\
> > > +	"jmp " #def " \n\t"			\
> > > +	".balign 8 \n \t"			\
> > > +	: : : "memory" )  
> > 
> > Bah, what is it with you people and trampolines. Why can't we, just like
> > jump_label, patch the call directly?
> > 
> > The whole call+jmp thing is silly, don't do that. It just wrecks I$ and
> > is slower for no real reason afaict.
> 
> My first attempt was to do just that. But to add a label at the
> call site required handling all the parameters too. See my branch:
>  ftrace/jump_function-v1 for how ugly it got (and it didn't work).

Can't we hijack the relocation records for these functions before they
get thrown out in the (final) link pass or something?


* Re: [POC][RFC][PATCH 1/2] jump_function: Addition of new feature "jump_function"
  2018-10-08  7:21       ` Peter Zijlstra
@ 2018-10-08  8:33         ` Andy Lutomirski
  2018-10-08 15:57           ` Peter Zijlstra
  0 siblings, 1 reply; 43+ messages in thread
From: Andy Lutomirski @ 2018-10-08  8:33 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Steven Rostedt, linux-kernel, Linus Torvalds, Ingo Molnar,
	Andrew Morton, Thomas Gleixner, Masami Hiramatsu,
	Mathieu Desnoyers, Matthew Helsley, Rafael J . Wysocki,
	David Woodhouse, Paolo Bonzini, Josh Poimboeuf, Jason Baron,
	Jiri Kosina, ard.biesheuvel, Andy Lutomirski



> On Oct 8, 2018, at 12:21 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> 
>> On Sat, Oct 06, 2018 at 09:39:05AM -0400, Steven Rostedt wrote:
>> On Sat, 6 Oct 2018 14:12:11 +0200
>> Peter Zijlstra <peterz@infradead.org> wrote:
>> 
>>>> On Fri, Oct 05, 2018 at 09:51:11PM -0400, Steven Rostedt wrote:
>>>> +#define arch_dynfunc_trampoline(name, def)    \
>>>> +    asm volatile (                \
>>>> +    ".globl dynfunc_" #name "; \n\t"    \
>>>> +    "dynfunc_" #name ": \n\t"        \
>>>> +    "jmp " #def " \n\t"            \
>>>> +    ".balign 8 \n \t"            \
>>>> +    : : : "memory" )  
>>> 
>>> Bah, what is it with you people and trampolines. Why can't we, just like
>>> jump_label, patch the call directly?
>>> 
>>> The whole call+jmp thing is silly, don't do that. It just wrecks I$ and
>>> is slower for no real reason afaict.
>> 
>> My first attempt was to do just that. But to add a label at the
>> call site required handling all the parameters too. See my branch:
>> ftrace/jump_function-v1 for how ugly it got (and it didn't work).
> 
> Can't we hijack the relocation records for these functions before they
> get thrown out in the (final) link pass or something?

I could be talking out my arse here, but I thought we could do this, too, then changed my mind.  The relocation records give us the location of the call or jump operand, but they don’t give the address of the beginning of the instruction. If the instruction crosses a cache line boundary, don’t we need to use the int3 patching trick?  And that requires knowing which byte to patch with int3.

Or am I wrong and can the CPUs we care about correctly handle a locked write to part of an instruction that crosses a cache line boundary?


* Re: [POC][RFC][PATCH 1/2] jump_function: Addition of new feature "jump_function"
  2018-10-06 13:39     ` Steven Rostedt
  2018-10-06 15:13       ` Andy Lutomirski
  2018-10-08  7:21       ` Peter Zijlstra
@ 2018-10-08 11:30       ` Ard Biesheuvel
  2 siblings, 0 replies; 43+ messages in thread
From: Ard Biesheuvel @ 2018-10-08 11:30 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Peter Zijlstra, Linux Kernel Mailing List, Linus Torvalds,
	Ingo Molnar, Andrew Morton, Thomas Gleixner, Masami Hiramatsu,
	Mathieu Desnoyers, Matthew Helsley, Rafael J . Wysocki,
	David Woodhouse, Paolo Bonzini, Josh Poimboeuf, Jason Baron,
	Jiri Kosina, Andy Lutomirski

On 6 October 2018 at 15:39, Steven Rostedt <rostedt@goodmis.org> wrote:
> On Sat, 6 Oct 2018 14:12:11 +0200
> Peter Zijlstra <peterz@infradead.org> wrote:
>
>> On Fri, Oct 05, 2018 at 09:51:11PM -0400, Steven Rostedt wrote:
>> > +#define arch_dynfunc_trampoline(name, def) \
>> > +   asm volatile (                          \
>> > +   ".globl dynfunc_" #name "; \n\t"        \
>> > +   "dynfunc_" #name ": \n\t"               \
>> > +   "jmp " #def " \n\t"                     \
>> > +   ".balign 8 \n \t"                       \
>> > +   : : : "memory" )
>>
>> Bah, what is it with you people and trampolines. Why can't we, just like
>> jump_label, patch the call directly?
>>
>> The whole call+jmp thing is silly, don't do that. It just wrecks I$ and
>> is slower for no real reason afaict.
>
> My first attempt was to do just that. But to add a label at the
> call site required handling all the parameters too. See my branch:
>  ftrace/jump_function-v1 for how ugly it got (and it didn't work).
>
>>
>> Steve, also see:
>>
>>   https://lkml.kernel.org/r/20181005081333.15018-1-ard.biesheuvel@linaro.org
>
> Interesting. I don't have time to look at it at the moment to see what
> was done, but will do so in the near future.
>
> Remember, this was a proof of concept and even with the trampolines, it
> showed a great level of improvement. One thought was to do a
> "recordmcount.c" type of action to find where the calls were and patch
> them directly at boot up. I tried to keep the API the same where this
> could actually be done as an improvement later.
>
> Perhaps a gcc plugin might work too.
>
> I'll have to see what Ard did to handle the function parameters.
>

I didn't. My approach is just a generic function pointer which can be
overridden by the arch to be emitted as a trampoline instead.

Patching the call directly simply isn't feasible without compiler
support like we have with asm goto for jump_labels.
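
(For comparison, the compiler support jump_labels rely on is the asm
goto behind the static_branch API; my_key and do_something() are just
placeholder names:)

	DEFINE_STATIC_KEY_FALSE(my_key);

	if (static_branch_unlikely(&my_key))
		do_something();	/* site is patched between a nop and a jmp */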

The use case I am proposing is providing different implementations of
crypto routines or CRC computation etc without having to rely on
function pointers, but still keep them as loadable modules. These
routines are a) heavy weight or we wouldn't bother providing
alternatives in the first place, and b) likely to have a considerable
I$ footprint already (and crypto libraries that have
encrypt/decrypt/setkey or init/update/final entry points will end up
with multiple trampolines in a single I-cache line). Also, the actual
patching only occurs on module load/unload.
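
As a (hypothetical) sketch of the generic flavour, the fallback is just an updatable function pointer behind a wrapper; the names below are made up:

	static u32 (*crc32_impl)(u32 crc, const u8 *p, size_t len) = crc32_generic;

	u32 crc32(u32 crc, const u8 *p, size_t len)
	{
		return crc32_impl(crc, p, len);	/* indirect call; retpoline on x86 */
	}

	void crc32_register(u32 (*impl)(u32, const u8 *, size_t))
	{
		WRITE_ONCE(crc32_impl, impl);	/* swapped on module load/unload */
	}

An arch that opts in replaces the wrapper with a patched trampoline and skips the indirect call entirely.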

I have no idea whether this reasoning applies to Steven's use case as
well, though; I haven't looked at his patches (I wasn't on cc).

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [POC][RFC][PATCH 1/2] jump_function: Addition of new feature "jump_function"
  2018-10-08  8:33         ` Andy Lutomirski
@ 2018-10-08 15:57           ` Peter Zijlstra
  2018-10-08 16:29             ` Andy Lutomirski
  2018-10-08 16:31             ` Steven Rostedt
  0 siblings, 2 replies; 43+ messages in thread
From: Peter Zijlstra @ 2018-10-08 15:57 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Steven Rostedt, linux-kernel, Linus Torvalds, Ingo Molnar,
	Andrew Morton, Thomas Gleixner, Masami Hiramatsu,
	Mathieu Desnoyers, Matthew Helsley, Rafael J . Wysocki,
	David Woodhouse, Paolo Bonzini, Josh Poimboeuf, Jason Baron,
	Jiri Kosina, ard.biesheuvel, Andy Lutomirski

On Mon, Oct 08, 2018 at 01:33:14AM -0700, Andy Lutomirski wrote:
> > Can't we hijack the relocation records for these functions before they
> > get thrown out in the (final) link pass or something?
> 
> I could be talking out my arse here, but I thought we could do this,
> too, then changed my mind.  The relocation records give us the
> location of the call or jump operand, but they don’t give the address
> of the beginning of the instruction.

But that's like 1 byte before the operand, right? We could even double check
this by reading back that byte and ensuring it is in fact 0xE8 (CALL).

AFAICT there is only the _1_ CALL encoding, and that is the 5 byte: E8 <PLT32>,
so if we have the PLT32 location, we also have the instruction location. Or am
I missing something?
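
IOW, something like (sketch):

	/* the reloc points at the 4-byte PLT32 operand; the opcode is
	 * the byte before it, and E8 is the only direct CALL encoding */
	u8 *insn = reloc - 1;

	if (*insn != 0xe8)
		return -EINVAL;	/* not a direct CALL; skip or warn */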

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [POC][RFC][PATCH 1/2] jump_function: Addition of new feature "jump_function"
  2018-10-08 15:57           ` Peter Zijlstra
@ 2018-10-08 16:29             ` Andy Lutomirski
  2018-10-08 16:39               ` Steven Rostedt
                                 ` (2 more replies)
  2018-10-08 16:31             ` Steven Rostedt
  1 sibling, 3 replies; 43+ messages in thread
From: Andy Lutomirski @ 2018-10-08 16:29 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Steven Rostedt, linux-kernel, Linus Torvalds, Ingo Molnar,
	Andrew Morton, Thomas Gleixner, Masami Hiramatsu,
	Mathieu Desnoyers, Matthew Helsley, Rafael J . Wysocki,
	David Woodhouse, Paolo Bonzini, Josh Poimboeuf, Jason Baron,
	Jiri Kosina, ard.biesheuvel, Andy Lutomirski



> On Oct 8, 2018, at 8:57 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> 
> On Mon, Oct 08, 2018 at 01:33:14AM -0700, Andy Lutomirski wrote:
>>> Can't we hijack the relocation records for these functions before they
>>> get thrown out in the (final) link pass or something?
>> 
>> I could be talking out my arse here, but I thought we could do this,
>> too, then changed my mind.  The relocation records give us the
>> location of the call or jump operand, but they don’t give the address
>> of the beginning of the instruction.
> 
> But that's like 1 byte before the operand, right? We could even double check
> this by reading back that byte and ensuring it is in fact 0xE8 (CALL).
> 
> AFAICT there is only the _1_ CALL encoding, and that is the 5 byte: E8 <PLT32>,
> so if we have the PLT32 location, we also have the instruction location. Or am
> I missing something?

There’s also JMP and Jcc, any of which can be used for tail calls, but those are also one byte. I suppose GCC is unlikely to emit a prefixed form of any of these. So maybe we really can assume they’re all one byte.

But there is a nasty potential special case: anything that takes the function’s address. This includes jump tables, computed gotos, and plain old function pointers. And I suspect that any of these could have one of the rather large number of CALL/JMP/Jcc bytes before the relocation by coincidence.



^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [POC][RFC][PATCH 1/2] jump_function: Addition of new feature "jump_function"
  2018-10-08 15:57           ` Peter Zijlstra
  2018-10-08 16:29             ` Andy Lutomirski
@ 2018-10-08 16:31             ` Steven Rostedt
  1 sibling, 0 replies; 43+ messages in thread
From: Steven Rostedt @ 2018-10-08 16:31 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Andy Lutomirski, linux-kernel, Linus Torvalds, Ingo Molnar,
	Andrew Morton, Thomas Gleixner, Masami Hiramatsu,
	Mathieu Desnoyers, Matthew Helsley, Rafael J . Wysocki,
	David Woodhouse, Paolo Bonzini, Josh Poimboeuf, Jason Baron,
	Jiri Kosina, ard.biesheuvel, Andy Lutomirski

On Mon, 8 Oct 2018 17:57:57 +0200
Peter Zijlstra <peterz@infradead.org> wrote:

> On Mon, Oct 08, 2018 at 01:33:14AM -0700, Andy Lutomirski wrote:
> > > Can't we hijack the relocation records for these functions before they
> > > get thrown out in the (final) link pass or something?  
> > 
> > I could be talking out my arse here, but I thought we could do this,
> > too, then changed my mind.  The relocation records give us the
> > location of the call or jump operand, but they don’t give the address
> > of the beginning of the instruction.  
> 
> But that's like 1 byte before the operand, right? We could even double check
> this by reading back that byte and ensuring it is in fact 0xE8 (CALL).
> 
> AFAICT there is only the _1_ CALL encoding, and that is the 5 byte: E8 <PLT32>,
> so if we have the PLT32 location, we also have the instruction location. Or am
> I missing something?

Yes, this is exactly what I was thinking of doing. All we need to do is
have objtool (or a modification of whatever we come up with), to find
the call sites of a specific function (we can have a table to look up
for), that creates a section listing all these call sites, and on boot
up, we can confirm that they are indeed calls (e8 operations).
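
Roughly (a sketch; the entry layout is made up, PC-relative like jump_label):

	struct call_site { s32 offset; };	/* offset to the E8 byte */
	extern struct call_site __start_call_sites[], __stop_call_sites[];
	struct call_site *s;

	for (s = __start_call_sites; s < __stop_call_sites; s++) {
		u8 *insn = (u8 *)&s->offset + s->offset;

		WARN_ONCE(*insn != 0xe8,
			  "unexpected opcode %x at %pS", *insn, (void *)insn);
	}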

-- Steve

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [POC][RFC][PATCH 1/2] jump_function: Addition of new feature "jump_function"
  2018-10-08 16:29             ` Andy Lutomirski
@ 2018-10-08 16:39               ` Steven Rostedt
  2018-10-08 16:39               ` Peter Zijlstra
  2018-10-09  2:17               ` Josh Poimboeuf
  2 siblings, 0 replies; 43+ messages in thread
From: Steven Rostedt @ 2018-10-08 16:39 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Peter Zijlstra, linux-kernel, Linus Torvalds, Ingo Molnar,
	Andrew Morton, Thomas Gleixner, Masami Hiramatsu,
	Mathieu Desnoyers, Matthew Helsley, Rafael J . Wysocki,
	David Woodhouse, Paolo Bonzini, Josh Poimboeuf, Jason Baron,
	Jiri Kosina, ard.biesheuvel, Andy Lutomirski

On Mon, 8 Oct 2018 09:29:56 -0700
Andy Lutomirski <luto@amacapital.net> wrote:

> > On Oct 8, 2018, at 8:57 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> > 
> > On Mon, Oct 08, 2018 at 01:33:14AM -0700, Andy Lutomirski wrote:  
> >>> Can't we hijack the relocation records for these functions before they
> >>> get thrown out in the (final) link pass or something?  
> >> 
> >> I could be talking out my arse here, but I thought we could do this,
> >> too, then changed my mind.  The relocation records give us the
> >> location of the call or jump operand, but they don’t give the address
> >> of the beginning of the instruction.  
> > 
> > But that's like 1 byte before the operand, right? We could even double check
> > this by reading back that byte and ensuring it is in fact 0xE8 (CALL).
> > 
> > AFAICT there is only the _1_ CALL encoding, and that is the 5 byte: E8 <PLT32>,
> > so if we have the PLT32 location, we also have the instruction location. Or am
> > I missing something?  
> 
> There’s also JMP and Jcc, any of which can be used for rail calls, but those are also one byte. I suppose GCC is unlikely to emit a prefixed form of any of these. So maybe we really can assume they’re all one byte.
> 
> But there is a nasty potential special case: anything that takes the function’s address. This includes jump tables, computed gotos, and plain old function pointers. And I suspect that any of these could have one of the rather large number of CALL/JMP/Jcc bytes before the relocation by coincidence.
> 

FYI, your email client is horrible to read from decent email clients :-p

Anyway,

I'd like to have these "dynamic functions" be "special" where they
can't be added to tables or what not. If you need to add one to a
table or function pointer, then you need to have a wrapper function
that does the call. I think we can come up with some kind of wrapper or
linker magic to enforce this too.

-- Steve



^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [POC][RFC][PATCH 1/2] jump_function: Addition of new feature "jump_function"
  2018-10-08 16:29             ` Andy Lutomirski
  2018-10-08 16:39               ` Steven Rostedt
@ 2018-10-08 16:39               ` Peter Zijlstra
  2018-10-08 17:25                 ` Andy Lutomirski
  2018-10-09  2:17               ` Josh Poimboeuf
  2 siblings, 1 reply; 43+ messages in thread
From: Peter Zijlstra @ 2018-10-08 16:39 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Steven Rostedt, linux-kernel, Linus Torvalds, Ingo Molnar,
	Andrew Morton, Thomas Gleixner, Masami Hiramatsu,
	Mathieu Desnoyers, Matthew Helsley, Rafael J . Wysocki,
	David Woodhouse, Paolo Bonzini, Josh Poimboeuf, Jason Baron,
	Jiri Kosina, ard.biesheuvel, Andy Lutomirski

On Mon, Oct 08, 2018 at 09:29:56AM -0700, Andy Lutomirski wrote:
> 
> 
> > On Oct 8, 2018, at 8:57 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> > 
> > On Mon, Oct 08, 2018 at 01:33:14AM -0700, Andy Lutomirski wrote:
> >>> Can't we hijack the relocation records for these functions before they
> >>> get thrown out in the (final) link pass or something?
> >> 
> >> I could be talking out my arse here, but I thought we could do this,
> >> too, then changed my mind.  The relocation records give us the
> >> location of the call or jump operand, but they don’t give the address
> >> of the beginning of the instruction.
> > 
> > But that's like 1 byte before the operand, right? We could even double check
> > this by reading back that byte and ensuring it is in fact 0xE8 (CALL).
> > 
> > AFAICT there is only the _1_ CALL encoding, and that is the 5 byte: E8 <PLT32>,
> > so if we have the PLT32 location, we also have the instruction location. Or am
> > I missing something?
> 
> There’s also JMP and Jcc, any of which can be used for tail calls, but
> those are also one byte. I suppose GCC is unlikely to emit a prefixed
> form of any of these. So maybe we really can assume they’re all one
> byte.

Oh, I had not considered tail calls..

> But there is a nasty potential special case: anything that takes the
> function’s address. This includes jump tables, computed gotos, and
> plain old function pointers. And I suspect that any of these could
> have one of the rather large number of CALL/JMP/Jcc bytes before the
> relocation by coincidence.

We can have objtool verify the CALL/JMP/Jcc only condition. So if
someone tries to take the address of a patchable function, it will error
out.

Heck, it could initially even error out on tail calls.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [POC][RFC][PATCH 1/2] jump_function: Addition of new feature "jump_function"
  2018-10-08 16:39               ` Peter Zijlstra
@ 2018-10-08 17:25                 ` Andy Lutomirski
  2018-10-08 17:30                   ` Ard Biesheuvel
  0 siblings, 1 reply; 43+ messages in thread
From: Andy Lutomirski @ 2018-10-08 17:25 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Steven Rostedt, LKML, Linus Torvalds, Ingo Molnar, Andrew Morton,
	Thomas Gleixner, Masami Hiramatsu, Mathieu Desnoyers, mhelsley,
	Rafael J. Wysocki, David Woodhouse, Paolo Bonzini,
	Josh Poimboeuf, Jason Baron, Jiri Kosina, Ard Biesheuvel,
	Andrew Lutomirski

On Mon, Oct 8, 2018 at 9:40 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Mon, Oct 08, 2018 at 09:29:56AM -0700, Andy Lutomirski wrote:
> >
> >
> > > On Oct 8, 2018, at 8:57 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> > >
> > > On Mon, Oct 08, 2018 at 01:33:14AM -0700, Andy Lutomirski wrote:
> > >>> Can't we hijack the relocation records for these functions before they
> > >>> get thrown out in the (final) link pass or something?
> > >>
> > >> I could be talking out my arse here, but I thought we could do this,
> > >> too, then changed my mind.  The relocation records give us the
> > >> location of the call or jump operand, but they don’t give the address
> > >> of the beginning of the instruction.
> > >
> > > But that's like 1 byte before the operand, right? We could even double check
> > > this by reading back that byte and ensuring it is in fact 0xE8 (CALL).
> > >
> > > AFAICT there is only the _1_ CALL encoding, and that is the 5 byte: E8 <PLT32>,
> > > so if we have the PLT32 location, we also have the instruction location. Or am
> > > I missing something?
> >
> > There’s also JMP and Jcc, any of which can be used for tail calls, but
> > those are also one byte. I suppose GCC is unlikely to emit a prefixed
> > form of any of these. So maybe we really can assume they’re all one
> > byte.
>
> Oh, I had not considered tail calls..
>
> > But there is a nasty potential special case: anything that takes the
> > function’s address. This includes jump tables, computed gotos, and
> > plain old function pointers. And I suspect that any of these could
> > have one of the rather large number of CALL/JMP/Jcc bytes before the
> > relocation by coincidence.
>
> We can have objtool verify the CALL/JMP/Jcc only condition. So if
> someone tries to take the address of a patchable function, it will error
> out.

I think we should just ignore the sites that take the address and
maybe issue a warning.  After all, GCC can create them all by itself.
We'll always have a plain wrapper function, and I think we should just
not patch code that takes its address.  So we do, roughly:

void default_foo(void);

GLOBAL(foo)
  jmp *current_foo(%rip)
ENDPROC(foo)

And code that does:

foo();

as a call, a tail call, a conditional tail call, etc, gets discovered
by objtool + relocation processing or whatever and gets patched.  (And
foo() itself gets patched, too, as a special case.  But we patch foo
itself at some point during boot to turn it into a direct JMP.  Doing
it this way means that the whole mechanism works from very early
boot.)  And anything awful like:

switch(whatever) {
case 0:
  foo();
};

that gets translated to a jump table and gets optimized such that it
jumps straight to foo just gets left alone, since it still works.
It's just a bit suboptimal.  Similarly, code that does:

void (*ptr)(void);
ptr = foo;

gets a bona fide pointer to foo(), and any calls through the pointer
land on foo() and jump to the current selected foo with only a single
indirect branch / retpoline.

Does this seem reasonable?  Is there a reason we should make it more
restrictive?
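
For completeness, the update side of the above would be roughly (a sketch; current_foo, foo_sites and patch_call_sites() are hypothetical names):

	void set_foo(void (*impl)(void))
	{
		/* unpatched callers (pointers, jump tables) still land
		 * on the wrapper, which jumps through current_foo */
		WRITE_ONCE(current_foo, impl);

		/* rewrite every discovered direct CALL/JMP site */
		patch_call_sites(foo_sites, impl);
	}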

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [POC][RFC][PATCH 1/2] jump_function: Addition of new feature "jump_function"
  2018-10-08 17:25                 ` Andy Lutomirski
@ 2018-10-08 17:30                   ` Ard Biesheuvel
  2018-10-08 17:42                     ` Andy Lutomirski
  2018-10-08 17:44                     ` Jiri Kosina
  0 siblings, 2 replies; 43+ messages in thread
From: Ard Biesheuvel @ 2018-10-08 17:30 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Peter Zijlstra, Steven Rostedt, LKML, Linus Torvalds,
	Ingo Molnar, Andrew Morton, Thomas Gleixner, Masami Hiramatsu,
	Mathieu Desnoyers, Matthew Helsley, Rafael J. Wysocki,
	David Woodhouse, Paolo Bonzini, Josh Poimboeuf, Jason Baron,
	Jiri Kosina, Andrew Lutomirski

On 8 October 2018 at 19:25, Andy Lutomirski <luto@amacapital.net> wrote:
> On Mon, Oct 8, 2018 at 9:40 AM Peter Zijlstra <peterz@infradead.org> wrote:
>>
>> On Mon, Oct 08, 2018 at 09:29:56AM -0700, Andy Lutomirski wrote:
>> >
>> >
>> > > On Oct 8, 2018, at 8:57 AM, Peter Zijlstra <peterz@infradead.org> wrote:
>> > >
>> > > On Mon, Oct 08, 2018 at 01:33:14AM -0700, Andy Lutomirski wrote:
>> > >>> Can't we hijack the relocation records for these functions before they
>> > >>> get thrown out in the (final) link pass or something?
>> > >>
>> > >> I could be talking out my arse here, but I thought we could do this,
>> > >> too, then changed my mind.  The relocation records give us the
>> > >> location of the call or jump operand, but they don’t give the address
>> > >> of the beginning of the instruction.
>> > >
>> > > But that's like 1 byte before the operand, right? We could even double check
>> > > this by reading back that byte and ensuring it is in fact 0xE8 (CALL).
>> > >
>> > > AFAICT there is only the _1_ CALL encoding, and that is the 5 byte: E8 <PLT32>,
>> > > so if we have the PLT32 location, we also have the instruction location. Or am
>> > > I missing something?
>> >
> > > There’s also JMP and Jcc, any of which can be used for tail calls, but
>> > those are also one byte. I suppose GCC is unlikely to emit a prefixed
>> > form of any of these. So maybe we really can assume they’re all one
>> > byte.
>>
>> Oh, I had not considered tail calls..
>>
>> > But there is a nasty potential special case: anything that takes the
>> > function’s address. This includes jump tables, computed gotos, and
>> > plain old function pointers. And I suspect that any of these could
>> > have one of the rather large number of CALL/JMP/Jcc bytes before the
>> > relocation by coincidence.
>>
>> We can have objtool verify the CALL/JMP/Jcc only condition. So if
>> someone tries to take the address of a patchable function, it will error
>> out.
>
> I think we should just ignore the sites that take the address and
> maybe issue a warning.  After all, GCC can create them all by itself.
> We'll always have a plain wrapper function, and I think we should just
> not patch code that takes its address.  So we do, roughly:
>
> void default_foo(void);
>
> GLOBAL(foo)
>   jmp *current_foo(%rip)
> ENDPROC(foo)
>
> And code that does:
>
> foo();
>
> as a call, a tail call, a conditional tail call, etc, gets discovered
> by objtool + relocation processing or whatever and gets patched.  (And
> foo() itself gets patched, too, as a special case.  But we patch foo
> itself at some point during boot to turn it into a direct JMP.  Doing
> it this way means that the whole mechanism works from very early
> boot.)

Does that mean that architectures could opt out of doing the whole
objtool + relocation processing thing, and instead take the hit of
going through the trampoline for all calls?

> And anything awful like:
>
> switch(whatever) {
> case 0:
>   foo();
> };
>
> that gets translated to a jump table and gets optimized such that it
> jumps straight to foo just gets left alone, since it still works.
> It's just a bit suboptimal.  Similarly, code that does:
>
> void (*ptr)(void);
> ptr = foo;
>
> gets a bona fide pointer to foo(), and any calls through the pointer
> land on foo() and jump to the current selected foo with only a single
> indirect branch / retpoline.
>
> Does this seem reasonable?  Is there a reason we should make it more
> restrictive?

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [POC][RFC][PATCH 1/2] jump_function: Addition of new feature "jump_function"
  2018-10-08 17:30                   ` Ard Biesheuvel
@ 2018-10-08 17:42                     ` Andy Lutomirski
  2018-10-08 17:44                     ` Jiri Kosina
  1 sibling, 0 replies; 43+ messages in thread
From: Andy Lutomirski @ 2018-10-08 17:42 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Peter Zijlstra, Steven Rostedt, LKML, Linus Torvalds,
	Ingo Molnar, Andrew Morton, Thomas Gleixner, Masami Hiramatsu,
	Mathieu Desnoyers, mhelsley, Rafael J. Wysocki, David Woodhouse,
	Paolo Bonzini, Josh Poimboeuf, Jason Baron, Jiri Kosina,
	Andrew Lutomirski

On Mon, Oct 8, 2018 at 10:30 AM Ard Biesheuvel
<ard.biesheuvel@linaro.org> wrote:
>
> On 8 October 2018 at 19:25, Andy Lutomirski <luto@amacapital.net> wrote:
> > On Mon, Oct 8, 2018 at 9:40 AM Peter Zijlstra <peterz@infradead.org> wrote:
> >>
> >> On Mon, Oct 08, 2018 at 09:29:56AM -0700, Andy Lutomirski wrote:
> >> >
> >> >
> >> > > On Oct 8, 2018, at 8:57 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> >> > >
> >> > > On Mon, Oct 08, 2018 at 01:33:14AM -0700, Andy Lutomirski wrote:
> >> > >>> Can't we hijack the relocation records for these functions before they
> >> > >>> get thrown out in the (final) link pass or something?
> >> > >>
> >> > >> I could be talking out my arse here, but I thought we could do this,
> >> > >> too, then changed my mind.  The relocation records give us the
> >> > >> location of the call or jump operand, but they don’t give the address
> >> > >> of the beginning of the instruction.
> >> > >
> >> > > But that's like 1 byte before the operand, right? We could even double check
> >> > > this by reading back that byte and ensuring it is in fact 0xE8 (CALL).
> >> > >
> >> > > AFAICT there is only the _1_ CALL encoding, and that is the 5 byte: E8 <PLT32>,
> >> > > so if we have the PLT32 location, we also have the instruction location. Or am
> >> > > I missing something?
> >> >
> >> > There’s also JMP and Jcc, any of which can be used for tail calls, but
> >> > those are also one byte. I suppose GCC is unlikely to emit a prefixed
> >> > form of any of these. So maybe we really can assume they’re all one
> >> > byte.
> >>
> >> Oh, I had not considered tail calls..
> >>
> >> > But there is a nasty potential special case: anything that takes the
> >> > function’s address. This includes jump tables, computed gotos, and
> >> > plain old function pointers. And I suspect that any of these could
> >> > have one of the rather large number of CALL/JMP/Jcc bytes before the
> >> > relocation by coincidence.
> >>
> >> We can have objtool verify the CALL/JMP/Jcc only condition. So if
> >> someone tries to take the address of a patchable function, it will error
> >> out.
> >
> > I think we should just ignore the sites that take the address and
> > maybe issue a warning.  After all, GCC can create them all by itself.
> > We'll always have a plain wrapper function, and I think we should just
> > not patch code that takes its address.  So we do, roughly:
> >
> > void default_foo(void);
> >
> > GLOBAL(foo)
> >   jmp *current_foo(%rip)
> > ENDPROC(foo)
> >
> > And code that does:
> >
> > foo();
> >
> > as a call, a tail call, a conditional tail call, etc, gets discovered
> > by objtool + relocation processing or whatever and gets patched.  (And
> > foo() itself gets patched, too, as a special case.  But we patch foo
> > itself at some point during boot to turn it into a direct JMP.  Doing
> > it this way means that the whole mechanism works from very early
> > boot.)
>
> Does that mean that architectures could opt out of doing the whole
> objtool + relocation processing thing, and instead take the hit of
> going through the trampoline for all calls?
>

I don't see why not.

--Andy

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [POC][RFC][PATCH 1/2] jump_function: Addition of new feature "jump_function"
  2018-10-08 17:30                   ` Ard Biesheuvel
  2018-10-08 17:42                     ` Andy Lutomirski
@ 2018-10-08 17:44                     ` Jiri Kosina
  2018-10-08 17:45                       ` Ard Biesheuvel
  2018-10-08 17:47                       ` Andy Lutomirski
  1 sibling, 2 replies; 43+ messages in thread
From: Jiri Kosina @ 2018-10-08 17:44 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Andy Lutomirski, Peter Zijlstra, Steven Rostedt, LKML,
	Linus Torvalds, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Masami Hiramatsu, Mathieu Desnoyers, Matthew Helsley,
	Rafael J. Wysocki, David Woodhouse, Paolo Bonzini,
	Josh Poimboeuf, Jason Baron, Andrew Lutomirski

On Mon, 8 Oct 2018, Ard Biesheuvel wrote:

> Does that mean that architectures could opt out of doing the whole
> objtool + relocation processing thing, and instead take the hit of
> going through the trampoline for all calls?

There are architectures that aren't [currently] supported by objtool at 
all anyway.

-- 
Jiri Kosina
SUSE Labs


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [POC][RFC][PATCH 1/2] jump_function: Addition of new feature "jump_function"
  2018-10-08 17:44                     ` Jiri Kosina
@ 2018-10-08 17:45                       ` Ard Biesheuvel
  2018-10-08 17:47                       ` Andy Lutomirski
  1 sibling, 0 replies; 43+ messages in thread
From: Ard Biesheuvel @ 2018-10-08 17:45 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: Andy Lutomirski, Peter Zijlstra, Steven Rostedt, LKML,
	Linus Torvalds, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Masami Hiramatsu, Mathieu Desnoyers, Matthew Helsley,
	Rafael J. Wysocki, David Woodhouse, Paolo Bonzini,
	Josh Poimboeuf, Jason Baron, Andrew Lutomirski

On 8 October 2018 at 19:44, Jiri Kosina <jikos@kernel.org> wrote:
> On Mon, 8 Oct 2018, Ard Biesheuvel wrote:
>
>> Does that mean that architectures could opt out of doing the whole
>> objtool + relocation processing thing, and instead take the hit of
>> going through the trampoline for all calls?
>
> There are architectures that aren't [currently] supported by objtool at
> all anyway.
>

That was kind of my point :-)

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [POC][RFC][PATCH 1/2] jump_function: Addition of new feature "jump_function"
  2018-10-08 17:44                     ` Jiri Kosina
  2018-10-08 17:45                       ` Ard Biesheuvel
@ 2018-10-08 17:47                       ` Andy Lutomirski
  1 sibling, 0 replies; 43+ messages in thread
From: Andy Lutomirski @ 2018-10-08 17:47 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: Ard Biesheuvel, Peter Zijlstra, Steven Rostedt, LKML,
	Linus Torvalds, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Masami Hiramatsu, Mathieu Desnoyers, mhelsley, Rafael J. Wysocki,
	David Woodhouse, Paolo Bonzini, Josh Poimboeuf, Jason Baron,
	Andrew Lutomirski

On Mon, Oct 8, 2018 at 10:44 AM Jiri Kosina <jikos@kernel.org> wrote:
>
> On Mon, 8 Oct 2018, Ard Biesheuvel wrote:
>
> > Does that mean that architectures could opt out of doing the whole
> > objtool + relocation processing thing, and instead take the hit of
> > going through the trampoline for all calls?
>
> There are architectures that aren't [currently] supported by objtool at
> all anyway.
>

To the credit of most architectures, though, the only reason x86
would want to use objtool instead of digging the results directly out
of the relocation data is that x86 has an overcomplicated instruction
encoding and there's no fully reliable way to find the address of the
instruction that contains a given relocation without fully
disassembling the binary.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [POC][RFC][PATCH 1/2] jump_function: Addition of new feature "jump_function"
  2018-10-08 16:29             ` Andy Lutomirski
  2018-10-08 16:39               ` Steven Rostedt
  2018-10-08 16:39               ` Peter Zijlstra
@ 2018-10-09  2:17               ` Josh Poimboeuf
  2018-10-09  3:57                 ` Steven Rostedt
  2 siblings, 1 reply; 43+ messages in thread
From: Josh Poimboeuf @ 2018-10-09  2:17 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Peter Zijlstra, Steven Rostedt, linux-kernel, Linus Torvalds,
	Ingo Molnar, Andrew Morton, Thomas Gleixner, Masami Hiramatsu,
	Mathieu Desnoyers, Matthew Helsley, Rafael J . Wysocki,
	David Woodhouse, Paolo Bonzini, Jason Baron, Jiri Kosina,
	ard.biesheuvel, Andy Lutomirski

On Mon, Oct 08, 2018 at 09:29:56AM -0700, Andy Lutomirski wrote:
> 
> 
> > On Oct 8, 2018, at 8:57 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> > 
> > On Mon, Oct 08, 2018 at 01:33:14AM -0700, Andy Lutomirski wrote:
> >>> Can't we hijack the relocation records for these functions before they
> >>> get thrown out in the (final) link pass or something?
> >> 
> >> I could be talking out my arse here, but I thought we could do this,
> >> too, then changed my mind.  The relocation records give us the
> >> location of the call or jump operand, but they don’t give the address
> >> of the beginning of the instruction.
> > 
> > But that's like 1 byte before the operand, right? We could even double check
> > this by reading back that byte and ensuring it is in fact 0xE8 (CALL).
> > 
> > AFAICT there is only the _1_ CALL encoding, and that is the 5 byte: E8 <PLT32>,
> > so if we have the PLT32 location, we also have the instruction location. Or am
> > I missing something?
> 
> There’s also JMP and Jcc, any of which can be used for tail calls, but
> those are also one byte. I suppose GCC is unlikely to emit a prefixed
> form of any of these. So maybe we really can assume they’re all one
> byte.

I'm pretty sure only a basic JMP is used for tail calls.

> But there is a nasty potential special case: anything that takes the
> function’s address. This includes jump tables, computed gotos, and
> plain old function pointers. And I suspect that any of these could
> have one of the rather large number of CALL/JMP/Jcc bytes before the
> relocation by coincidence.

But those special cases aren't in a text section, right?  If we just
make sure the relocations are applied to a text section, and that
they're preceded by the CALL or JMP byte, wouldn't that be sufficient?
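
I.e. a filter along these lines (sketch, with made-up types):

	/* accept only relocs that land in executable sections and are
	 * immediately preceded by a CALL or JMP rel32 opcode */
	static bool is_call_site(struct sec *sec, unsigned long off)
	{
		u8 op = sec->bytes[off - 1];	/* byte before the operand */

		if (!(sec->flags & SHF_EXECINSTR))
			return false;		/* reloc not in .text */

		return op == 0xe8 || op == 0xe9;
	}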

I'm not really convinced we need objtool for this, maybe I'll try
whipping up a POC.

-- 
Josh

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [POC][RFC][PATCH 1/2] jump_function: Addition of new feature "jump_function"
  2018-10-06  1:51 ` [POC][RFC][PATCH 1/2] jump_function: Addition of new feature "jump_function" Steven Rostedt
                     ` (3 preceding siblings ...)
  2018-10-06 12:12   ` Peter Zijlstra
@ 2018-10-09  3:44   ` Masami Hiramatsu
  2018-10-09  3:55     ` Steven Rostedt
  2018-10-09  8:59     ` David Laight
  4 siblings, 2 replies; 43+ messages in thread
From: Masami Hiramatsu @ 2018-10-09  3:44 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Linus Torvalds, Ingo Molnar, Andrew Morton,
	Thomas Gleixner, Peter Zijlstra, Masami Hiramatsu,
	Mathieu Desnoyers, Matthew Helsley, Rafael J . Wysocki,
	David Woodhouse, Paolo Bonzini, Josh Poimboeuf, Jason Baron,
	Jiri Kosina

On Fri, 05 Oct 2018 21:51:11 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> +typedef long dynfunc_t;
> +
> +struct dynfunc_struct;
> +
> +#define arch_dynfunc_trampoline(name, def)	\
> +	asm volatile (				\
> +	".globl dynfunc_" #name "; \n\t"	\
> +	"dynfunc_" #name ": \n\t"		\
> +	"jmp " #def " \n\t"			\
> +	".balign 8 \n \t"			\
> +	: : : "memory" )
> +

I have just a question: how is this different from livepatch? :)

I think we can replace the first 5 bytes of the default function
with a jmp instruction (to the alternative function) instead of
making this trampoline.

IOW, as far as I can see, this is changing

----
call %reg (or retpoline_reg)
----

to 

----
call dynfunc_A

dynfunc_A:
jmp func_A or altered_func_A
----

If so, why don't we put the jmp on default func_A directly?
----
call func_A

func_A:
"jmp altered_func" or "original sequence"
----
(this is idealy same as jprobes did)

Of course we have to arbitrate it with ftrace (fentry), but it may
not be so hard (the simplest way is just adding "notrace" on the
default function).
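
Something like this, reusing text_poke_bp() (a sketch; assumes func_A
starts with the 5-byte fentry nop):

	/* replace the first 5 bytes of func_A with "jmp altered_func_A";
	 * int3 patching keeps it safe while it is live */
	u8 insn[5] = { 0xe9, };
	s32 rel = (s32)((long)altered_func_A - ((long)func_A + 5));

	memcpy(&insn[1], &rel, 4);
	text_poke_bp(func_A, insn, 5, altered_func_A);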

BTW, I think "dynamic_function" may not be the correct name; it may
be "alternative_function" or something like that, because this
function must be replaced system-wide, and this means we cannot use
it for generic function pointer usage that depends on thread context
(like file_operations). But it is good for pluggable code (LSM?).


Thank you,


-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [POC][RFC][PATCH 1/2] jump_function: Addition of new feature "jump_function"
  2018-10-09  3:44   ` Masami Hiramatsu
@ 2018-10-09  3:55     ` Steven Rostedt
  2018-10-09 16:04       ` Masami Hiramatsu
  2018-10-09  8:59     ` David Laight
  1 sibling, 1 reply; 43+ messages in thread
From: Steven Rostedt @ 2018-10-09  3:55 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: linux-kernel, Linus Torvalds, Ingo Molnar, Andrew Morton,
	Thomas Gleixner, Peter Zijlstra, Mathieu Desnoyers,
	Matthew Helsley, Rafael J . Wysocki, David Woodhouse,
	Paolo Bonzini, Josh Poimboeuf, Jason Baron, Jiri Kosina

On Tue, 9 Oct 2018 12:44:01 +0900
Masami Hiramatsu <mhiramat@kernel.org> wrote:

> On Fri, 05 Oct 2018 21:51:11 -0400
> Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> > +typedef long dynfunc_t;
> > +
> > +struct dynfunc_struct;
> > +
> > +#define arch_dynfunc_trampoline(name, def)	\
> > +	asm volatile (				\
> > +	".globl dynfunc_" #name "; \n\t"	\
> > +	"dynfunc_" #name ": \n\t"		\
> > +	"jmp " #def " \n\t"			\
> > +	".balign 8 \n \t"			\
> > +	: : : "memory" )
> > +  
> 
> I have just a question: how is this different from livepatch? :)

I actually thought about this a bit, but decided against it.

I didn't want to hook another infrastructure into the fentry nop. It's
already complex enough with kprobes, live patching and ftrace.

The ideal solution is what Peter suggested, and that's to patch the
call sites, and I think that is attainable with objtool modifications.

> 
> I think we can replace the first 5 bytes of the default function
> with a jmp instruction (to the alternative function) instead of
> making this trampoline.
> 
> IOW, as far as I can see, this is changing
> 
> ----
> call %reg (or retpoline_reg)
> ----
> 
> to 
> 
> ----
> call dynfunc_A
> 
> dynfunc_A:
> jmp func_A or altered_func_A
> ----
> 
> If so, why don't we put the jmp on default func_A directly?
> ----
> call func_A
> 
> func_A:
> "jmp altered_func" or "original sequence"
> ----
> (this is idealy same as jprobes did)
> 
> Of course we have to arbitrate it with ftrace (fentry), but it may
> not be so hard (the simplest way is just adding "notrace" on the
> default function).

Then we lose the 5 byte nop.

> 
> BTW, I think "dynamic_function" may not be the correct name; it may
> be "alternative_function" or something like that, because this
> function must be replaced system-wide, and this means we cannot use
> it for generic function pointer usage that depends on thread context
> (like file_operations). But it is good for pluggable code (LSM?).

I don't like the name alternative, as that's usually a one shot deal
(SMP vs UP).

It is dynamic, as it's a function that changes dynamically. Yes its
global, but that's not mutually exclusive to dynamic.

The use case I want this for is for tracing. But it can be useful for
KVM and power management governors. Basically anything that has a
global function pointer (hmm, even the idle call can use this).

-- Steve

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [POC][RFC][PATCH 1/2] jump_function: Addition of new feature "jump_function"
  2018-10-09  2:17               ` Josh Poimboeuf
@ 2018-10-09  3:57                 ` Steven Rostedt
  2018-10-10 17:52                   ` Josh Poimboeuf
  0 siblings, 1 reply; 43+ messages in thread
From: Steven Rostedt @ 2018-10-09  3:57 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Andy Lutomirski, Peter Zijlstra, linux-kernel, Linus Torvalds,
	Ingo Molnar, Andrew Morton, Thomas Gleixner, Masami Hiramatsu,
	Mathieu Desnoyers, Matthew Helsley, Rafael J . Wysocki,
	David Woodhouse, Paolo Bonzini, Jason Baron, Jiri Kosina,
	ard.biesheuvel, Andy Lutomirski

On Mon, 8 Oct 2018 21:17:10 -0500
Josh Poimboeuf <jpoimboe@redhat.com> wrote:

> I'm not really convinced we need objtool for this, maybe I'll try
> whipping up a POC.

Awesome!

I wasn't thinking of actually having objtool itself perform this task,
but instead breaking the internals of objtool up into more of a generic
infrastructure, that recordmcount.c, objtool, and whatever this does
can use.

-- Steve

^ permalink raw reply	[flat|nested] 43+ messages in thread

* RE: [POC][RFC][PATCH 1/2] jump_function: Addition of new feature "jump_function"
  2018-10-09  3:44   ` Masami Hiramatsu
  2018-10-09  3:55     ` Steven Rostedt
@ 2018-10-09  8:59     ` David Laight
  1 sibling, 0 replies; 43+ messages in thread
From: David Laight @ 2018-10-09  8:59 UTC (permalink / raw)
  To: 'Masami Hiramatsu', Steven Rostedt
  Cc: linux-kernel, Linus Torvalds, Ingo Molnar, Andrew Morton,
	Thomas Gleixner, Peter Zijlstra, Mathieu Desnoyers,
	Matthew Helsley, Rafael J . Wysocki, David Woodhouse,
	Paolo Bonzini, Josh Poimboeuf, Jason Baron, Jiri Kosina

From: Masami Hiramatsu
> Sent: 09 October 2018 04:44
...
> I think we can replace the first 5 bytes of the default function
> to jmp instruction (to alternative function) instead of making
> this trampoline.

Or have a trampoline that is just a jump instruction and overwrite
the target address at run time to select the non-default code.
With care the target address can be aligned so that the write is atomic
and can be done while other cpus might be calling the function.

This will be lower impact than a 'jump indirect' - especially since
the latter would have to be implemented using a 'retpoline'.

It would also make it possible to reinstate the default function.
(By saving its address after the jump.)
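
Roughly (a sketch, x86-64; a hypothetical variant of the
arch_dynfunc_trampoline() macro, padded so the jmp's rel32 operand is
naturally aligned):

	asm volatile (
	".balign 8 \n\t"
	".skip 3 \n\t"			/* E9 opcode lands at offset 3... */
	"dynfunc_foo: \n\t"
	"jmp default_foo \n\t"		/* ...so the rel32 sits 4-byte aligned */
	: : : "memory");

The update is then a single aligned 32-bit store of the new
displacement.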

	David



^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [POC][RFC][PATCH 1/2] jump_function: Addition of new feature "jump_function"
  2018-10-09  3:55     ` Steven Rostedt
@ 2018-10-09 16:04       ` Masami Hiramatsu
  0 siblings, 0 replies; 43+ messages in thread
From: Masami Hiramatsu @ 2018-10-09 16:04 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Linus Torvalds, Ingo Molnar, Andrew Morton,
	Thomas Gleixner, Peter Zijlstra, Mathieu Desnoyers,
	Matthew Helsley, Rafael J . Wysocki, David Woodhouse,
	Paolo Bonzini, Josh Poimboeuf, Jason Baron, Jiri Kosina

On Mon, 8 Oct 2018 23:55:34 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> On Tue, 9 Oct 2018 12:44:01 +0900
> Masami Hiramatsu <mhiramat@kernel.org> wrote:
> 
> > On Fri, 05 Oct 2018 21:51:11 -0400
> > Steven Rostedt <rostedt@goodmis.org> wrote:
> > 
> > > +typedef long dynfunc_t;
> > > +
> > > +struct dynfunc_struct;
> > > +
> > > +#define arch_dynfunc_trampoline(name, def)	\
> > > +	asm volatile (				\
> > > +	".globl dynfunc_" #name "; \n\t"	\
> > > +	"dynfunc_" #name ": \n\t"		\
> > > +	"jmp " #def " \n\t"			\
> > > +	".balign 8 \n \t"			\
> > > +	: : : "memory" )
> > > +  
> > 
> > I have just a question: how is this different from livepatch? :)
> 
> I actually thought about this a bit, but decided against it.
> 
> I didn't want to hook another infrastructure into the fentry nop. It's
> already complex enough with kprobes, live patching and ftrace.
> 
> The ideal solution is what Peter suggested, and that's to patch the
> call sites, and I think that is attainable with objtool modifications.

OK, the ideal solution sounds good to me. 

> 
> > 
> > I think we can replace the first 5 bytes of the default function
> > with a jmp instruction (to the alternative function) instead of
> > making this trampoline.
> > 
> > IOW, as far as I can see, this is changing
> > 
> > ----
> > call %reg (or retpoline_reg)
> > ----
> > 
> > to 
> > 
> > ----
> > call dynfunc_A
> > 
> > dynfunc_A:
> > jmp func_A or altered_func_A
> > ----
> > 
> > If so, why don't we put the jmp on default func_A directly?
> > ----
> > call func_A
> > 
> > func_A:
> > "jmp altered_func" or "original sequence"
> > ----
> > (this is idealy same as jprobes did)
> > 
> > Of course we have to arbitrate it with ftrace (fentry), but it may
> > not be so hard (the simplest way is just adding "notrace" on the
> > default function).
> 
> Then we lose the 5 byte nop.

Yeah, but we can remove the trampoline code.

> > BTW, I think "dynamic_function" may not be the correct name; it may
> > be "alternative_function" or something like that, because this
> > function must be replaced system-wide, and this means we cannot use
> > it for generic function pointer usage that depends on thread context
> > (like file_operations). But it is good for pluggable code (LSM?).
> 
> I don't like the name alternative, as that's usually a one shot deal
> (SMP vs UP).
> 
> It is dynamic, as it's a function that changes dynamically. Yes its
> global, but that's not mutually exclusive to dynamic.

OK, so we may add a note that this is "global" patching :)

> The use case I want this for is for tracing. But it can be useful for
> KVM and power management governors. Basically anything that has a
> global function pointer (hmm, even the idle call can use this).

Indeed.

Thanks,

> 
> -- Steve


-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [POC][RFC][PATCH 1/2] jump_function: Addition of new feature "jump_function"
  2018-10-09  3:57                 ` Steven Rostedt
@ 2018-10-10 17:52                   ` Josh Poimboeuf
  2018-10-10 18:03                     ` Andy Lutomirski
  0 siblings, 1 reply; 43+ messages in thread
From: Josh Poimboeuf @ 2018-10-10 17:52 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Andy Lutomirski, Peter Zijlstra, linux-kernel, Linus Torvalds,
	Ingo Molnar, Andrew Morton, Thomas Gleixner, Masami Hiramatsu,
	Mathieu Desnoyers, Matthew Helsley, Rafael J . Wysocki,
	David Woodhouse, Paolo Bonzini, Jason Baron, Jiri Kosina,
	ard.biesheuvel, Andy Lutomirski

On Mon, Oct 08, 2018 at 11:57:50PM -0400, Steven Rostedt wrote:
> On Mon, 8 Oct 2018 21:17:10 -0500
> Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> 
> > I'm not really convinced we need objtool for this, maybe I'll try
> > whipping up a POC.
> 
> Awesome!
> 
> I wasn't thinking of actually having objtool itself perform this task,
> but instead breaking the internals of objtool up into more of a generic
> infrastructure, that recordmcount.c, objtool, and whatever this does
> can use.

So I had been thinking that we could find the call sites at runtime, by
looking at the relocations.  But I managed to forget that vmlinux
relocations are resolved during linking.  So yeah, some kind of tooling
magic would be needed.

I worked up a POC using objtool.  It doesn't *have* to be done with
objtool, but since it's already reading/writing all the ELF stuff
anyway, it was pretty easy to add this on.

This patch has at least a few issues:

- No module support.

- For some reason, the sync_cores in text_poke_bp() don't always seem to
  be working as expected.  Running this patch on my VM, the test code in
  cmdline_proc_show() works *most* of the time, but it occasionally
  branches off into the weeds.  I have no idea what the problem is yet.

diff --git a/arch/Kconfig b/arch/Kconfig
index 9d329608913e..20ff5624dad7 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -865,6 +865,9 @@ config HAVE_ARCH_PREL32_RELOCATIONS
 	  architectures, and don't require runtime relocation on relocatable
 	  kernels.
 
+config HAVE_ARCH_STATIC_CALL
+	bool
+
 source "kernel/gcov/Kconfig"
 
 source "scripts/gcc-plugins/Kconfig"
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 5136a1281870..1a14c8f87876 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -128,6 +128,7 @@ config X86
 	select HAVE_ARCH_COMPAT_MMAP_BASES	if MMU && COMPAT
 	select HAVE_ARCH_PREL32_RELOCATIONS
 	select HAVE_ARCH_SECCOMP_FILTER
+	select HAVE_ARCH_STATIC_CALL		if X86_64
 	select HAVE_ARCH_THREAD_STRUCT_WHITELIST
 	select HAVE_ARCH_TRACEHOOK
 	select HAVE_ARCH_TRANSPARENT_HUGEPAGE
diff --git a/arch/x86/include/asm/static_call.h b/arch/x86/include/asm/static_call.h
new file mode 100644
index 000000000000..40fec631b760
--- /dev/null
+++ b/arch/x86/include/asm/static_call.h
@@ -0,0 +1,35 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_STATIC_CALL_H
+#define _ASM_STATIC_CALL_H
+
+#ifdef CONFIG_X86_64
+
+#include <linux/frame.h>
+
+void static_call_init(void);
+extern void __static_call_update(void *tramp, void *func);
+
+#define DECLARE_STATIC_CALL(tramp, func)				\
+	extern typeof(func) tramp;					\
+	static void __used __section(.discard.static_call_tramps)	\
+		*__static_call_tramp_##tramp = tramp
+
+#define DEFINE_STATIC_CALL(tramp, func)					\
+	DECLARE_STATIC_CALL(tramp, func);				\
+	asm(".pushsection .text, \"ax\"				\n"	\
+	    ".align 4						\n"	\
+	    ".globl " #tramp "					\n"	\
+	    ".type " #tramp ", @function			\n"	\
+	    #tramp ":						\n"	\
+	    "jmp " #func "					\n"	\
+	    ASM_NOP3 "						\n"	\
+	    ".popsection					\n")
+
+#define static_call_update(tramp, func)					\
+	__static_call_update(tramp, func)
+
+#else /* !CONFIG_X86_64 */
+static inline void static_call_init(void) {}
+#endif
+
+#endif /* _ASM_STATIC_CALL_H */
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 8824d01c0c35..e5d9f3a1e73f 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -62,6 +62,7 @@ obj-y			+= tsc.o tsc_msr.o io_delay.o rtc.o
 obj-y			+= pci-iommu_table.o
 obj-y			+= resource.o
 obj-y			+= irqflags.o
+obj-$(CONFIG_X86_64)	+= static_call.o
 
 obj-y				+= process.o
 obj-y				+= fpu/
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index b4866badb235..447401fc8d65 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -117,6 +117,7 @@
 #include <asm/microcode.h>
 #include <asm/kaslr.h>
 #include <asm/unwind.h>
+#include <asm/static_call.h>
 
 /*
  * max_low_pfn_mapped: highest direct mapped pfn under 4GB
@@ -874,6 +875,7 @@ void __init setup_arch(char **cmdline_p)
 	early_cpu_init();
 	arch_init_ideal_nops();
 	jump_label_init();
+	static_call_init();
 	early_ioremap_init();
 
 	setup_olpc_ofw_pgd();
diff --git a/arch/x86/kernel/static_call.c b/arch/x86/kernel/static_call.c
new file mode 100644
index 000000000000..e7a17ee6942d
--- /dev/null
+++ b/arch/x86/kernel/static_call.c
@@ -0,0 +1,117 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#include <linux/init.h>
+#include <linux/static_call.h>
+#include <linux/printk.h>
+#include <linux/bug.h>
+#include <linux/smp.h>
+#include <linux/memory.h>
+#include <asm/text-patching.h>
+#include <asm/processor.h>
+
+extern int cmdline_proc_show(void);
+
+/* The static call table is created by objtool */
+struct static_call_entry {
+	s32 insn, tramp;
+};
+extern struct static_call_entry __start_static_call_table[],
+			        __stop_static_call_table[];
+
+void __init static_call_init(void)
+{
+	struct static_call_entry *entry;
+	unsigned long insn, tramp, func;
+	unsigned char insn_opcode, tramp_opcode;
+	s32 call_dest;
+
+	for (entry = __start_static_call_table;
+	     entry < __stop_static_call_table; entry++) {
+
+		insn = (long)entry->insn + (unsigned long)&entry->insn;
+		tramp = (long)entry->tramp + (unsigned long)&entry->tramp;
+
+		insn_opcode = *(unsigned char *)insn;
+		if (insn_opcode != 0xe8 && insn_opcode != 0xe9) {
+			WARN_ONCE(1, "unexpected static call insn opcode %x at %pS",
+				  insn_opcode, (void *)insn);
+			continue;
+		}
+
+		tramp_opcode = *(unsigned char *)tramp;
+		if (tramp_opcode != 0xeb && tramp_opcode != 0xe9) {
+			WARN_ONCE(1, "unexpected trampoline jump opcode %x at %ps",
+				 tramp_opcode, (void *)tramp);
+			continue;
+		}
+
+		if (tramp_opcode == 0xeb)
+			func = *(s8 *)(tramp + 1) + (tramp + 2);
+		else
+			func = *(s32 *)(tramp + 1) + (tramp + 5);
+
+		call_dest = (long)(func) - (long)(insn + 5);
+
+		printk("static_call_init: poking %lx at %lx\n", (unsigned long)call_dest, (insn+1));
+
+		text_poke_early((void *)(insn + 1), &call_dest, 4);
+	}
+}
+
+/* cribbed from arch/x86/kernel/alternative.c */
+static void do_sync_core(void *info)
+{
+	sync_core();
+}
+
+void __static_call_update(void *tramp, void *func)
+{
+	struct static_call_entry *entry;
+	unsigned long insn, t;
+	s32 call_dest;
+	unsigned char opcodes[5];
+
+	mutex_lock(&text_mutex);
+
+	/*
+	 * Reuse the (now unused) trampoline to be the fallback handler
+	 * for text_poke_bp():
+	 */
+	call_dest = (long)(func) - (long)(tramp + 5);
+	opcodes[0] = 0xe8;
+	memcpy(&opcodes[1], &call_dest, 4);
+	text_poke(tramp, opcodes, 5);
+	on_each_cpu(do_sync_core, NULL, 1);
+
+	/* Patch the call sites: */
+	for (entry = __start_static_call_table;
+	     entry < __stop_static_call_table; entry++) {
+
+		t = (long)entry->tramp + (unsigned long)&entry->tramp;
+		if ((void *)t != tramp)
+			continue;
+
+		insn = (long)entry->insn + (unsigned long)&entry->insn;
+		call_dest = (long)(func) - (long)(insn + 5);
+		opcodes[0] = 0xe8;
+		memcpy(&opcodes[1], &call_dest, 4);
+
+		text_poke_bp((void *)insn, opcodes, 5, tramp);
+	}
+
+	mutex_unlock(&text_mutex);
+}
+
+/*** TEST CODE BELOW - called from cmdline_proc_show() ***/
+
+int my_func_add(int arg1, int arg2)
+{
+	return arg1 + arg2;
+}
+
+int my_func_sub(int arg1, int arg2)
+{
+	return arg1 - arg2;
+}
+
+DEFINE_STATIC_CALL(my_static_call, my_func_add);
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 0d618ee634ac..cf0566f8a13c 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -185,6 +185,9 @@ SECTIONS
 
 	BUG_TABLE
 
+	/* FIXME: move to read-only section */
+	STATIC_CALL_TABLE
+
 	ORC_UNWIND_TABLE
 
 	. = ALIGN(PAGE_SIZE);
diff --git a/fs/proc/cmdline.c b/fs/proc/cmdline.c
index fa762c5fbcb2..c704b9e1fe5f 100644
--- a/fs/proc/cmdline.c
+++ b/fs/proc/cmdline.c
@@ -3,9 +3,27 @@
 #include <linux/init.h>
 #include <linux/proc_fs.h>
 #include <linux/seq_file.h>
+#include <linux/static_call.h>
+
+extern int my_func_add(int arg1, int arg2);
+extern int my_func_sub(int arg1, int arg2);
+DECLARE_STATIC_CALL(my_static_call, my_func_add);
 
 static int cmdline_proc_show(struct seq_file *m, void *v)
 {
+	int ret;
+
+	ret = my_static_call(1, 2);
+	printk("static call (orig): ret=%d\n", ret);
+
+	static_call_update(my_static_call, my_func_sub);
+	ret = my_static_call(1, 2);
+	printk("static call (sub): ret=%d\n", ret);
+
+	static_call_update(my_static_call, my_func_add);
+	ret = my_static_call(1, 2);
+	printk("static call (add): ret=%d\n", ret);
+
 	seq_puts(m, saved_command_line);
 	seq_putc(m, '\n');
 	return 0;
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index f09ee3c544bc..a1c7bda1b22a 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -722,6 +722,14 @@
 #define BUG_TABLE
 #endif
 
+#define STATIC_CALL_TABLE						\
+	. = ALIGN(8);							\
+	__static_call_table : AT(ADDR(__static_call_table) - LOAD_OFFSET) { \
+		__start_static_call_table = .;			\
+		KEEP(*(__static_call_table))				\
+		__stop_static_call_table = .;				\
+	}
+
 #ifdef CONFIG_UNWINDER_ORC
 #define ORC_UNWIND_TABLE						\
 	. = ALIGN(4);							\
diff --git a/include/linux/static_call.h b/include/linux/static_call.h
new file mode 100644
index 000000000000..729e7ee4c66b
--- /dev/null
+++ b/include/linux/static_call.h
@@ -0,0 +1,20 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_STATIC_CALL_H
+#define _LINUX_STATIC_CALL_H
+
+#ifdef CONFIG_HAVE_ARCH_STATIC_CALL
+#include <asm/static_call.h>
+#else
+
+#define DECLARE_STATIC_CALL(ptr, func)					\
+	extern typeof(func) *ptr
+
+#define DEFINE_STATIC_CALL(ptr, func)					\
+	typeof(func) *ptr = func
+
+#define static_call_update(ptr, func)					\
+	WRITE_ONCE(ptr, func)
+
+#endif /* !CONFIG_HAVE_ARCH_STATIC_CALL */
+
+#endif /* _LINUX_STATIC_CALL_H */
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 0414a0d52262..a8e7d3b92513 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -525,6 +525,10 @@ static int add_jump_destinations(struct objtool_file *file)
 		} else {
 			/* sibling call */
 			insn->jump_dest = 0;
+			if (rela->sym->static_call_tramp) {
+				list_add_tail(&insn->static_call_node,
+					      &file->static_call_list);
+			}
 			continue;
 		}
 
@@ -1202,6 +1206,21 @@ static int read_retpoline_hints(struct objtool_file *file)
 	return 0;
 }
 
+static int read_static_call_tramps(struct objtool_file *file)
+{
+	struct section *sec;
+	struct rela *rela;
+
+	sec = find_section_by_name(file->elf, ".rela.discard.static_call_tramps");
+	if (!sec)
+		return 0;
+
+	list_for_each_entry(rela, &sec->rela_list, list)
+		rela->sym->static_call_tramp = true;
+
+	return 0;
+}
+
 static void mark_rodata(struct objtool_file *file)
 {
 	struct section *sec;
@@ -1267,6 +1286,10 @@ static int decode_sections(struct objtool_file *file)
 	if (ret)
 		return ret;
 
+	ret = read_static_call_tramps(file);
+	if (ret)
+		return ret;
+
 	return 0;
 }
 
@@ -1920,6 +1943,11 @@ static int validate_branch(struct objtool_file *file, struct instruction *first,
 			if (is_fentry_call(insn))
 				break;
 
+			if (insn->call_dest->static_call_tramp) {
+				list_add_tail(&insn->static_call_node,
+					      &file->static_call_list);
+			}
+
 			ret = dead_end_function(file, insn->call_dest);
 			if (ret == 1)
 				return 0;
@@ -2167,6 +2195,83 @@ static int validate_reachable_instructions(struct objtool_file *file)
 	return 0;
 }
 
+struct static_call_entry {
+	s32 insn, tramp;
+};
+
+static int create_static_call_sections(struct objtool_file *file)
+{
+	struct section *sec, *rela_sec;
+	struct rela *rela;
+	struct static_call_entry *entry;
+	struct instruction *insn;
+	int idx;
+
+	sec = find_section_by_name(file->elf, "__static_call_table");
+	if (sec) {
+		WARN("file already has __static_call_table section, skipping");
+		return -1;
+	}
+
+	if (list_empty(&file->static_call_list))
+		return 0;
+
+	idx = 0;
+	list_for_each_entry(insn, &file->static_call_list, static_call_node)
+		idx++;
+
+	sec = elf_create_section(file->elf, "__static_call_table",
+				 sizeof(struct static_call_entry), idx);
+	if (!sec)
+		return -1;
+
+	rela_sec = elf_create_rela_section(file->elf, sec);
+	if (!rela_sec)
+		return -1;
+
+	idx = 0;
+	list_for_each_entry(insn, &file->static_call_list, static_call_node) {
+
+		entry = (struct static_call_entry *)sec->data->d_buf + idx;
+		memset(entry, 0, sizeof(struct static_call_entry));
+
+		/* populate rela for 'insn' */
+		rela = malloc(sizeof(*rela));
+		if (!rela) {
+			perror("malloc");
+			return -1;
+		}
+		memset(rela, 0, sizeof(*rela));
+		rela->sym = insn->sec->sym;
+		rela->addend = insn->offset;
+		rela->type = R_X86_64_PC32;
+		rela->offset = idx * sizeof(struct static_call_entry);
+		list_add_tail(&rela->list, &rela_sec->rela_list);
+		hash_add(rela_sec->rela_hash, &rela->hash, rela->offset);
+
+		/* populate rela for 'tramp' */
+		rela = malloc(sizeof(*rela));
+		if (!rela) {
+			perror("malloc");
+			return -1;
+		}
+		memset(rela, 0, sizeof(*rela));
+		rela->sym = insn->call_dest;
+		rela->addend = 0;
+		rela->type = R_X86_64_PC32;
+		rela->offset = idx * sizeof(struct static_call_entry) + 4;
+		list_add_tail(&rela->list, &rela_sec->rela_list);
+		hash_add(rela_sec->rela_hash, &rela->hash, rela->offset);
+
+		idx++;
+	}
+
+	if (elf_rebuild_rela_section(rela_sec))
+		return -1;
+
+	return 0;
+}
+
 static void cleanup(struct objtool_file *file)
 {
 	struct instruction *insn, *tmpinsn;
@@ -2197,6 +2302,7 @@ int check(const char *_objname, bool orc)
 
 	INIT_LIST_HEAD(&file.insn_list);
 	hash_init(file.insn_hash);
+	INIT_LIST_HEAD(&file.static_call_list);
 	file.whitelist = find_section_by_name(file.elf, ".discard.func_stack_frame_non_standard");
 	file.c_file = find_section_by_name(file.elf, ".comment");
 	file.ignore_unreachables = no_unreachable;
@@ -2236,6 +2342,11 @@ int check(const char *_objname, bool orc)
 		warnings += ret;
 	}
 
+	ret = create_static_call_sections(&file);
+	if (ret < 0)
+		goto out;
+	warnings += ret;
+
 	if (orc) {
 		ret = create_orc(&file);
 		if (ret < 0)
@@ -2244,7 +2355,9 @@ int check(const char *_objname, bool orc)
 		ret = create_orc_sections(&file);
 		if (ret < 0)
 			goto out;
+	}
 
+	if (orc || !list_empty(&file.static_call_list)) {
 		ret = elf_write(file.elf);
 		if (ret < 0)
 			goto out;
diff --git a/tools/objtool/check.h b/tools/objtool/check.h
index e6e8a655b556..56b8b7fb1bd1 100644
--- a/tools/objtool/check.h
+++ b/tools/objtool/check.h
@@ -39,6 +39,7 @@ struct insn_state {
 struct instruction {
 	struct list_head list;
 	struct hlist_node hash;
+	struct list_head static_call_node;
 	struct section *sec;
 	unsigned long offset;
 	unsigned int len;
@@ -60,6 +61,7 @@ struct objtool_file {
 	struct elf *elf;
 	struct list_head insn_list;
 	DECLARE_HASHTABLE(insn_hash, 16);
+	struct list_head static_call_list;
 	struct section *whitelist;
 	bool ignore_unreachables, c_file, hints, rodata;
 };
diff --git a/tools/objtool/elf.h b/tools/objtool/elf.h
index bc97ed86b9cd..3cf44d7cc3ac 100644
--- a/tools/objtool/elf.h
+++ b/tools/objtool/elf.h
@@ -62,6 +62,7 @@ struct symbol {
 	unsigned long offset;
 	unsigned int len;
 	struct symbol *pfunc, *cfunc;
+	bool static_call_tramp;
 };
 
 struct rela {

* Re: [POC][RFC][PATCH 1/2] jump_function: Addition of new feature "jump_function"
  2018-10-10 17:52                   ` Josh Poimboeuf
@ 2018-10-10 18:03                     ` Andy Lutomirski
  2018-10-10 18:16                       ` Josh Poimboeuf
  0 siblings, 1 reply; 43+ messages in thread
From: Andy Lutomirski @ 2018-10-10 18:03 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Steven Rostedt, Peter Zijlstra, LKML, Linus Torvalds,
	Ingo Molnar, Andrew Morton, Thomas Gleixner, Masami Hiramatsu,
	Mathieu Desnoyers, Matthew Helsley, Rafael J. Wysocki,
	David Woodhouse, Paolo Bonzini, Jason Baron, Jiri Kosina,
	Ard Biesheuvel, Andrew Lutomirski

On Wed, Oct 10, 2018 at 10:52 AM Josh Poimboeuf <jpoimboe@redhat.com> wrote:
>
> On Mon, Oct 08, 2018 at 11:57:50PM -0400, Steven Rostedt wrote:
> > On Mon, 8 Oct 2018 21:17:10 -0500
> > Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> >
> > > I'm not really convinced we need objtool for this, maybe I'll try
> > > whipping up a POC.
> >
> > Awesome!
> >
> > I wasn't thinking of actually having objtool itself perform this task,
> > but instead breaking the internals of objtool up into more of a generic
> > infrastructure that recordmcount.c, objtool, and whatever this does
> > can use.
>
> So I had been thinking that we could find the call sites at runtime, by
> looking at the relocations.  But I managed to forget that vmlinux
> relocations are resolved during linking.  So yeah, some kind of tooling
> magic would be needed.
>
> I worked up a POC using objtool.  It doesn't *have* to be done with
> objtool, but since it's already reading/writing all the ELF stuff
> anyway, it was pretty easy to add this on.
>
> This patch has at least a few issues:
>
> - No module support.
>
> - For some reason, the sync_core() calls in text_poke_bp() don't always
>   seem to work as expected.  Running this patch on my VM, the test code in
>   cmdline_proc_show() works *most* of the time, but it occasionally
>   branches off into the weeds.  I have no idea what the problem is yet.
>
> diff --git a/arch/Kconfig b/arch/Kconfig
> index 9d329608913e..20ff5624dad7 100644
> --- a/arch/Kconfig
> +++ b/arch/Kconfig
> @@ -865,6 +865,9 @@ config HAVE_ARCH_PREL32_RELOCATIONS
>           architectures, and don't require runtime relocation on relocatable
>           kernels.
>
> +config HAVE_ARCH_STATIC_CALL
> +       bool
> +
>  source "kernel/gcov/Kconfig"
>
>  source "scripts/gcc-plugins/Kconfig"
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 5136a1281870..1a14c8f87876 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -128,6 +128,7 @@ config X86
>         select HAVE_ARCH_COMPAT_MMAP_BASES      if MMU && COMPAT
>         select HAVE_ARCH_PREL32_RELOCATIONS
>         select HAVE_ARCH_SECCOMP_FILTER
> +       select HAVE_ARCH_STATIC_CALL            if X86_64
>         select HAVE_ARCH_THREAD_STRUCT_WHITELIST
>         select HAVE_ARCH_TRACEHOOK
>         select HAVE_ARCH_TRANSPARENT_HUGEPAGE
> diff --git a/arch/x86/include/asm/static_call.h b/arch/x86/include/asm/static_call.h
> new file mode 100644
> index 000000000000..40fec631b760
> --- /dev/null
> +++ b/arch/x86/include/asm/static_call.h
> @@ -0,0 +1,35 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _ASM_STATIC_CALL_H
> +#define _ASM_STATIC_CALL_H
> +
> +#ifdef CONFIG_X86_64
> +
> +#include <linux/frame.h>
> +
> +void static_call_init(void);
> +extern void __static_call_update(void *tramp, void *func);
> +
> +#define DECLARE_STATIC_CALL(tramp, func)                               \
> +       extern typeof(func) tramp;                                      \
> +       static void __used __section(.discard.static_call_tramps)       \
> +               *__static_call_tramp_##tramp = tramp
> +

Confused.  What's the __static_call_tramp_##tramp variable for?  And
why is a DECLARE_ macro defining a variable?

> +#define DEFINE_STATIC_CALL(tramp, func)                                        \
> +       DECLARE_STATIC_CALL(tramp, func);                               \
> +       asm(".pushsection .text, \"ax\"                         \n"     \
> +           ".align 4                                           \n"     \
> +           ".globl " #tramp "                                  \n"     \
> +           ".type " #tramp ", @function                        \n"     \
> +           #tramp ":                                           \n"     \
> +           "jmp " #func "                                      \n"     \

I think this would be nicer as an indirect call that gets patched to a
direct call so that the update mechanism works even before it's
initialized.  (Currently static_branch blows up horribly if you try to
update one too early, and that's rather annoying IMO.)

Also, I think you're looking for "jmp.d32", which is available in
newer toolchains.  For older toolchains, you could use .byte 0xe9 or
you could use a different section (I think) to force a real 32-bit
jump.

> +void __init static_call_init(void)
> +{
> +       struct static_call_entry *entry;
> +       unsigned long insn, tramp, func;
> +       unsigned char insn_opcode, tramp_opcode;
> +       s32 call_dest;
> +
> +       for (entry = __start_static_call_table;
> +            entry < __stop_static_call_table; entry++) {
> +
> +               insn = (long)entry->insn + (unsigned long)&entry->insn;
> +               tramp = (long)entry->tramp + (unsigned long)&entry->tramp;
> +
> +               insn_opcode = *(unsigned char *)insn;
> +               if (insn_opcode != 0xe8 && insn_opcode != 0xe9) {
> +                       WARN_ONCE(1, "unexpected static call insn opcode %x at %pS",
> +                                 insn_opcode, (void *)insn);
> +                       continue;
> +               }
> +
> +               tramp_opcode = *(unsigned char *)tramp;
> +               if (tramp_opcode != 0xeb && tramp_opcode != 0xe9) {
> +                       WARN_ONCE(1, "unexpected trampoline jump opcode %x at %ps",
> +                                tramp_opcode, (void *)tramp);
> +                       continue;
> +               }
> +
> +               if (tramp_opcode == 0xeb)
> +                       func = *(s8 *)(tramp + 1) + (tramp + 2);

I realize you expect some instances of 0xeb due to the assembler
messing you up (see above), but this seems a bit too permissive, since
a 0xeb without the appropriate set of NOPs is going to explode.  And:

> +               else
> +                       func = *(s32 *)(tramp + 1) + (tramp + 5);
> +
> +               call_dest = (long)(func) - (long)(insn + 5);
> +
> +               printk("static_call_init: poking %lx at %lx\n", (unsigned long)call_dest, (insn+1));
> +
> +               text_poke_early((void *)(insn + 1), &call_dest, 4);

If you really are going to rewrite an 8-bit jump to a 32-bit jump, I
think you need to actually patch the opcode byte :)

* Re: [POC][RFC][PATCH 1/2] jump_function: Addition of new feature "jump_function"
  2018-10-10 18:03                     ` Andy Lutomirski
@ 2018-10-10 18:16                       ` Josh Poimboeuf
  2018-10-10 18:17                         ` Josh Poimboeuf
  2018-10-10 18:33                         ` Josh Poimboeuf
  0 siblings, 2 replies; 43+ messages in thread
From: Josh Poimboeuf @ 2018-10-10 18:16 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Steven Rostedt, Peter Zijlstra, LKML, Linus Torvalds,
	Ingo Molnar, Andrew Morton, Thomas Gleixner, Masami Hiramatsu,
	Mathieu Desnoyers, Matthew Helsley, Rafael J. Wysocki,
	David Woodhouse, Paolo Bonzini, Jason Baron, Jiri Kosina,
	Ard Biesheuvel, Andrew Lutomirski

On Wed, Oct 10, 2018 at 11:03:43AM -0700, Andy Lutomirski wrote:
> > +#define DECLARE_STATIC_CALL(tramp, func)                               \
> > +       extern typeof(func) tramp;                                      \
> > +       static void __used __section(.discard.static_call_tramps)       \
> > +               *__static_call_tramp_##tramp = tramp
> > +
> 
> Confused.  What's the __static_call_tramp_##tramp variable for?  And
> why is a DECLARE_ macro defining a variable?

This is the magic needed for objtool to find all the call sites.

The variable itself isn't needed, but the .discard.static_call_tramps
entry is.  Objtool reads that section to find out which function call
sites are targeted to a static call trampoline.
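
For example, with a hypothetical trampoline named my_tramp and an
already-declared target my_func, DECLARE_STATIC_CALL(my_tramp, my_func)
expands to roughly:

	extern typeof(my_func) my_tramp;
	static void __used __section(.discard.static_call_tramps)
		*__static_call_tramp_my_tramp = my_tramp;

The variable is thrown away at link time; all that matters is that the
object file ends up with a .discard.static_call_tramps relocation
pointing at my_tramp.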

> > +#define DEFINE_STATIC_CALL(tramp, func)                                        \
> > +       DECLARE_STATIC_CALL(tramp, func);                               \
> > +       asm(".pushsection .text, \"ax\"                         \n"     \
> > +           ".align 4                                           \n"     \
> > +           ".globl " #tramp "                                  \n"     \
> > +           ".type " #tramp ", @function                        \n"     \
> > +           #tramp ":                                           \n"     \
> > +           "jmp " #func "                                      \n"     \
> 
> I think this would be nicer as an indirect call that gets patched to a
> direct call so that the update mechanism works even before it's
> initialized.  (Currently static_branch blows up horribly if you try to
> update one too early, and that's rather annoying IMO.)

Yeah, that would be better.  It would also allow trampoline function
pointers to work, which I think you mentioned elsewhere.  And then I
shouldn't trample this code in __static_call_update() -- that was
already kind of nasty anyway.

> Also, I think you're looking for "jmp.d32", which is available in
> newer toolchains.  For older toolchains, you could use .byte 0xe9 or
> you could use a different section (I think) to force a real 32-bit
> jump.

Good idea.

> > +void __init static_call_init(void)
> > +{
> > +       struct static_call_entry *entry;
> > +       unsigned long insn, tramp, func;
> > +       unsigned char insn_opcode, tramp_opcode;
> > +       s32 call_dest;
> > +
> > +       for (entry = __start_static_call_table;
> > +            entry < __stop_static_call_table; entry++) {
> > +
> > +               insn = (long)entry->insn + (unsigned long)&entry->insn;
> > +               tramp = (long)entry->tramp + (unsigned long)&entry->tramp;
> > +
> > +               insn_opcode = *(unsigned char *)insn;
> > +               if (insn_opcode != 0xe8 && insn_opcode != 0xe9) {
> > +                       WARN_ONCE(1, "unexpected static call insn opcode %x at %pS",
> > +                                 insn_opcode, (void *)insn);
> > +                       continue;
> > +               }
> > +
> > +               tramp_opcode = *(unsigned char *)tramp;
> > +               if (tramp_opcode != 0xeb && tramp_opcode != 0xe9) {
> > +                       WARN_ONCE(1, "unexpected trampoline jump opcode %x at %ps",
> > +                                tramp_opcode, (void *)tramp);
> > +                       continue;
> > +               }
> > +
> > +               if (tramp_opcode == 0xeb)
> > +                       func = *(s8 *)(tramp + 1) + (tramp + 2);
> 
> I realize you expect some instances of 0xeb due to the assembler
> messing you up (see above), but this seems a bit too permissive, since
> a 0xeb without the appropriate set of NOPs is going to explode.  And:

Yep.

> > +               else
> > +                       func = *(s32 *)(tramp + 1) + (tramp + 5);
> > +
> > +               call_dest = (long)(func) - (long)(insn + 5);
> > +
> > +               printk("static_call_init: poking %lx at %lx\n", (unsigned long)call_dest, (insn+1));
> > +
> > +               text_poke_early((void *)(insn + 1), &call_dest, 4);
> 
> If you really are going to rewrite an 8-bit jump to a 32-bit jump, I
> think you need to actually patch the opcode byte :)

Oops :-)

-- 
Josh

* Re: [POC][RFC][PATCH 1/2] jump_function: Addition of new feature "jump_function"
  2018-10-10 18:16                       ` Josh Poimboeuf
@ 2018-10-10 18:17                         ` Josh Poimboeuf
  2018-10-10 21:13                           ` Andy Lutomirski
  2018-10-10 18:33                         ` Josh Poimboeuf
  1 sibling, 1 reply; 43+ messages in thread
From: Josh Poimboeuf @ 2018-10-10 18:17 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Steven Rostedt, Peter Zijlstra, LKML, Linus Torvalds,
	Ingo Molnar, Andrew Morton, Thomas Gleixner, Masami Hiramatsu,
	Mathieu Desnoyers, Matthew Helsley, Rafael J. Wysocki,
	David Woodhouse, Paolo Bonzini, Jason Baron, Jiri Kosina,
	Ard Biesheuvel, Andrew Lutomirski

On Wed, Oct 10, 2018 at 01:16:05PM -0500, Josh Poimboeuf wrote:
> On Wed, Oct 10, 2018 at 11:03:43AM -0700, Andy Lutomirski wrote:
> > > +#define DECLARE_STATIC_CALL(tramp, func)                               \
> > > +       extern typeof(func) tramp;                                      \
> > > +       static void __used __section(.discard.static_call_tramps)       \
> > > +               *__static_call_tramp_##tramp = tramp
> > > +
> > 
> > Confused.  What's the __static_call_tramp_##tramp variable for?  And
> > why is a DECLARE_ macro defining a variable?
> 
> This is the magic needed for objtool to find all the call sites.
> 
> The variable itself isn't needed, but the .discard.static_call_tramps
> entry is.  Objtool reads that section to find out which function call
> sites are targeted to a static call trampoline.

To clarify: objtool reads that section to find out which functions are
really static call trampolines.  Then it annotates all the instructions
which call/jmp to those trampolines.  Those annotations are then read by
the kernel.
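
Concretely, each annotation is a pair of PC-relative offsets, using the
same struct static_call_entry on both the objtool and kernel sides:

	struct static_call_entry {
		s32 insn;	/* call site, relative to this field */
		s32 tramp;	/* trampoline, relative to this field */
	};

which static_call_init() converts back to absolute addresses:

	insn  = (long)entry->insn  + (unsigned long)&entry->insn;
	tramp = (long)entry->tramp + (unsigned long)&entry->tramp;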

-- 
Josh

* Re: [POC][RFC][PATCH 1/2] jump_function: Addition of new feature "jump_function"
  2018-10-10 18:16                       ` Josh Poimboeuf
  2018-10-10 18:17                         ` Josh Poimboeuf
@ 2018-10-10 18:33                         ` Josh Poimboeuf
  2018-10-10 18:56                           ` Steven Rostedt
  1 sibling, 1 reply; 43+ messages in thread
From: Josh Poimboeuf @ 2018-10-10 18:33 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Steven Rostedt, Peter Zijlstra, LKML, Linus Torvalds,
	Ingo Molnar, Andrew Morton, Thomas Gleixner, Masami Hiramatsu,
	Mathieu Desnoyers, Matthew Helsley, Rafael J. Wysocki,
	David Woodhouse, Paolo Bonzini, Jason Baron, Jiri Kosina,
	Ard Biesheuvel, Andrew Lutomirski

On Wed, Oct 10, 2018 at 01:16:05PM -0500, Josh Poimboeuf wrote:
> > > +#define DEFINE_STATIC_CALL(tramp, func)                                        \
> > > +       DECLARE_STATIC_CALL(tramp, func);                               \
> > > +       asm(".pushsection .text, \"ax\"                         \n"     \
> > > +           ".align 4                                           \n"     \
> > > +           ".globl " #tramp "                                  \n"     \
> > > +           ".type " #tramp ", @function                        \n"     \
> > > +           #tramp ":                                           \n"     \
> > > +           "jmp " #func "                                      \n"     \
> > 
> > I think this would be nicer as an indirect call that gets patched to a
> > direct call so that the update mechanism works even before it's
> > initialized.  (Currently static_branch blows up horribly if you try to
> > update one too early, and that's rather annoying IMO.)
> 
> Yeah, that would be better.  It would also allow trampoline function
> pointers to work, which I think you mentioned elsewhere.  And then I
> shouldn't trample this code in __static_call_update() -- that was
> already kind of nasty anyway.

Re-reading your suggestion, I may have misunderstood what you're
suggesting here, but I'm thinking about doing something like what you
proposed earlier:

    GLOBAL(tramp)
      jmp *current_func(%rip)
    ENDPROC(tramp)

That is, doing an indirect jump instead of the above direct jump, so
that any previous references to the trampoline would still work (and it
would also work during early boot).

Though it should probably be a retpoline instead of an indirect jump.
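
Something like the usual retpoline sequence (untested sketch; %rax is
fair game here since it's caller-clobbered at any call site):

	GLOBAL(tramp)
		call	1f
	2:	pause
		lfence
		jmp	2b
	1:	mov	current_func(%rip), %rax
		mov	%rax, (%rsp)
		ret
	ENDPROC(tramp)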

-- 
Josh

* Re: [POC][RFC][PATCH 1/2] jump_function: Addition of new feature "jump_function"
  2018-10-10 18:33                         ` Josh Poimboeuf
@ 2018-10-10 18:56                           ` Steven Rostedt
  2018-10-10 20:16                             ` Josh Poimboeuf
  0 siblings, 1 reply; 43+ messages in thread
From: Steven Rostedt @ 2018-10-10 18:56 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Andy Lutomirski, Peter Zijlstra, LKML, Linus Torvalds,
	Ingo Molnar, Andrew Morton, Thomas Gleixner, Masami Hiramatsu,
	Mathieu Desnoyers, Matthew Helsley, Rafael J. Wysocki,
	David Woodhouse, Paolo Bonzini, Jason Baron, Jiri Kosina,
	Ard Biesheuvel, Andrew Lutomirski

On Wed, 10 Oct 2018 13:33:30 -0500
Josh Poimboeuf <jpoimboe@redhat.com> wrote:

> Re-reading your suggestion, I may have misunderstood what you're
> suggesting here, but I'm thinking about doing something like what you
> proposed earlier:
> 
>     GLOBAL(tramp)
>       jmp *current_func(%rip)
>     ENDPROC(tramp)
> 
> That is, doing an indirect jump instead of the above direct jump, so
> that any previous references to the trampoline would still work (and it
> would also work during early boot).
> 
> Though it should probably be a retpoline instead of an indirect jump.

But do we care, as it only takes place during text_poke_bp(), right?

I don't think we need to worry about training trampoline branch
prediction that can only be hit when something enables the jump.

-- Steve

* Re: [POC][RFC][PATCH 1/2] jump_function: Addition of new feature "jump_function"
  2018-10-10 18:56                           ` Steven Rostedt
@ 2018-10-10 20:16                             ` Josh Poimboeuf
  2018-10-10 20:57                               ` Andy Lutomirski
  0 siblings, 1 reply; 43+ messages in thread
From: Josh Poimboeuf @ 2018-10-10 20:16 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Andy Lutomirski, Peter Zijlstra, LKML, Linus Torvalds,
	Ingo Molnar, Andrew Morton, Thomas Gleixner, Masami Hiramatsu,
	Mathieu Desnoyers, Matthew Helsley, Rafael J. Wysocki,
	David Woodhouse, Paolo Bonzini, Jason Baron, Jiri Kosina,
	Ard Biesheuvel, Andrew Lutomirski

On Wed, Oct 10, 2018 at 02:56:08PM -0400, Steven Rostedt wrote:
> On Wed, 10 Oct 2018 13:33:30 -0500
> Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> 
> > Re-reading your suggestion, I may have misunderstood what you're
> > suggesting here, but I'm thinking about doing something like what you
> > proposed earlier:
> > 
> >     GLOBAL(tramp)
> >       jmp *current_func(%rip)
> >     ENDPROC(tramp)
> > 
> > That is, doing an indirect jump instead of the above direct jump, so
> > that any previous references to the trampoline would still work (and it
> > would also work during early boot).
> > 
> > Though it should probably be a retpoline instead of an indirect jump.
> 
> > But do we care, as it only takes place during text_poke_bp(), right?
> 
> I don't think we need to worry about training trampoline branch
> prediction that can only be hit when something enables the jump.

Yeah, I guess it depends on whether we'd expect anybody (or gcc) to get a
function pointer to the trampoline itself.  I can just create a warning
for that in objtool.
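
Since objtool already flags trampoline symbols with static_call_tramp,
the check could be something like (rough sketch):

	/* only a direct call/jmp to a static call trampoline is OK */
	if (rela->sym->static_call_tramp &&
	    insn->type != INSN_CALL &&
	    insn->type != INSN_JUMP_UNCONDITIONAL)
		WARN_FUNC("address of static call trampoline taken",
			  insn->sec, insn->offset);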

-- 
Josh

* Re: [POC][RFC][PATCH 1/2] jump_function: Addition of new feature "jump_function"
  2018-10-10 20:16                             ` Josh Poimboeuf
@ 2018-10-10 20:57                               ` Andy Lutomirski
  0 siblings, 0 replies; 43+ messages in thread
From: Andy Lutomirski @ 2018-10-10 20:57 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Steven Rostedt, Peter Zijlstra, LKML, Linus Torvalds,
	Ingo Molnar, Andrew Morton, Thomas Gleixner, Masami Hiramatsu,
	Mathieu Desnoyers, Matthew Helsley, Rafael J. Wysocki,
	David Woodhouse, Paolo Bonzini, Jason Baron, Jiri Kosina,
	Ard Biesheuvel, Andrew Lutomirski

On Wed, Oct 10, 2018 at 1:16 PM Josh Poimboeuf <jpoimboe@redhat.com> wrote:
>
> On Wed, Oct 10, 2018 at 02:56:08PM -0400, Steven Rostedt wrote:
> > On Wed, 10 Oct 2018 13:33:30 -0500
> > Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> >
> > > Re-reading your suggestion, I may have misunderstood what you're
> > > suggesting here, but I'm thinking about doing something like what you
> > > proposed earlier:
> > >
> > >     GLOBAL(tramp)
> > >       jmp *current_func(%rip)
> > >     ENDPROC(tramp)
> > >
> > > That is, doing an indirect jump instead of the above direct jump, so
> > > that any previous references to the trampoline would still work (and it
> > > would also work during early boot).
> > >
> > > Though it should probably be a retpoline instead of an indirect jump.
> >
> > > But do we care, as it only takes place during text_poke_bp(), right?
> >
> > I don't think we need to worry about training trampoline branch
> > prediction that can only be hit when something enables the jump.
>
> Yeah, I guess it depends on whether we'd expect anybody (or gcc) to get a
> function pointer to the trampoline itself.  I can just create a warning
> for that in objtool.
>

The jmp * in the trampoline itself is harmless even with Spectre
because it won't ever execute -- static_call_init() should just patch
it out even if the actual call target is never updated.  And gcc has
no business generating any unprotected indirect branches to it from
anywhere else, since, as far as gcc is concerned, they're just like
indirect branches to any other function.

* Re: [POC][RFC][PATCH 1/2] jump_function: Addition of new feature "jump_function"
  2018-10-10 18:17                         ` Josh Poimboeuf
@ 2018-10-10 21:13                           ` Andy Lutomirski
  2018-10-11  3:07                             ` Josh Poimboeuf
  0 siblings, 1 reply; 43+ messages in thread
From: Andy Lutomirski @ 2018-10-10 21:13 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Steven Rostedt, Peter Zijlstra, LKML, Linus Torvalds,
	Ingo Molnar, Andrew Morton, Thomas Gleixner, Masami Hiramatsu,
	Mathieu Desnoyers, Matthew Helsley, Rafael J. Wysocki,
	David Woodhouse, Paolo Bonzini, Jason Baron, Jiri Kosina,
	Ard Biesheuvel, Andrew Lutomirski

On Wed, Oct 10, 2018 at 11:17 AM Josh Poimboeuf <jpoimboe@redhat.com> wrote:
>
> On Wed, Oct 10, 2018 at 01:16:05PM -0500, Josh Poimboeuf wrote:
> > On Wed, Oct 10, 2018 at 11:03:43AM -0700, Andy Lutomirski wrote:
> > > > +#define DECLARE_STATIC_CALL(tramp, func)                               \
> > > > +       extern typeof(func) tramp;                                      \
> > > > +       static void __used __section(.discard.static_call_tramps)       \
> > > > +               *__static_call_tramp_##tramp = tramp
> > > > +
> > >
> > > Confused.  What's the __static_call_tramp_##tramp variable for?  And
> > > why is a DECLARE_ macro defining a variable?
> >
> > This is the magic needed for objtool to find all the call sites.
> >
> > The variable itself isn't needed, but the .discard.static_call_tramps
> > entry is.  Objtool reads that section to find out which function call
> > sites are targeted to a static call trampoline.
>
> To clarify: objtool reads that section to find out which functions are
> really static call trampolines.  Then it annotates all the instructions
> which call/jmp to those trampolines.  Those annotations are then read by
> the kernel.
>

Ah, right, and objtool runs on a per-object basis so it has no other
way to know what symbols are actually static calls.

There's another way to skin this cat, though:

extern typeof(func) __static_call_trampoline_##tramp;
#define tramp __static_call_trampoline_##tramp

And objtool could recognize it by name.  But, of course, you can't put
a #define in a macro.  But maybe there's a way to hack it up with a
static inline?

Anyway, your way is probably fine with a few caveats:

 - It won't really work if the call comes from a .S file.
 - There should probably be a comment to help de-confuse future people
like me :)

* Re: [POC][RFC][PATCH 1/2] jump_function: Addition of new feature "jump_function"
  2018-10-10 21:13                           ` Andy Lutomirski
@ 2018-10-11  3:07                             ` Josh Poimboeuf
  2018-10-11 12:52                               ` Josh Poimboeuf
  0 siblings, 1 reply; 43+ messages in thread
From: Josh Poimboeuf @ 2018-10-11  3:07 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Steven Rostedt, Peter Zijlstra, LKML, Linus Torvalds,
	Ingo Molnar, Andrew Morton, Thomas Gleixner, Masami Hiramatsu,
	Mathieu Desnoyers, Matthew Helsley, Rafael J. Wysocki,
	David Woodhouse, Paolo Bonzini, Jason Baron, Jiri Kosina,
	Ard Biesheuvel, Andrew Lutomirski

On Wed, Oct 10, 2018 at 02:13:22PM -0700, Andy Lutomirski wrote:
> On Wed, Oct 10, 2018 at 11:17 AM Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> >
> > On Wed, Oct 10, 2018 at 01:16:05PM -0500, Josh Poimboeuf wrote:
> > > On Wed, Oct 10, 2018 at 11:03:43AM -0700, Andy Lutomirski wrote:
> > > > > +#define DECLARE_STATIC_CALL(tramp, func)                               \
> > > > > +       extern typeof(func) tramp;                                      \
> > > > > +       static void __used __section(.discard.static_call_tramps)       \
> > > > > +               *__static_call_tramp_##tramp = tramp
> > > > > +
> > > >
> > > > Confused.  What's the __static_call_tramp_##tramp variable for?  And
> > > > why is a DECLARE_ macro defining a variable?
> > >
> > > This is the magic needed for objtool to find all the call sites.
> > >
> > > The variable itself isn't needed, but the .discard.static_call_tramps
> > > entry is.  Objtool reads that section to find out which function call
> > > sites are targeted to a static call trampoline.
> >
> > To clarify: objtool reads that section to find out which functions are
> > really static call trampolines.  Then it annotates all the instructions
> > which call/jmp to those trampolines.  Those annotations are then read by
> > the kernel.
> >
> 
> Ah, right, and objtool runs on a per-object basis so it has no other
> way to know what symbols are actually static calls.
> 
> There's another way to skin this cat, though:
> 
> extern typeof(func) __static_call_trampoline_##tramp;
> #define tramp __static_call_trampoline_##tramp
> 
> And objtool could recognize it by name.  But, of course, you can't put
> a #define in a macro.  But maybe there's a way to hack it up with a
> static inline?

Not sure how to do that...

> Anyway, your way is probably fine with a few caveats:
> 
>  - It won't really work if the call comes from a .S file.

Maybe we can have similar macros for asm code.
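
e.g. a GAS macro which emits the same marker section by hand
(hypothetical):

	.macro DECLARE_STATIC_CALL_TRAMP tramp
	.pushsection .discard.static_call_tramps, "aw"
	.quad \tramp
	.popsection
	.endm

since objtool only looks at the relocations in that section, not at the
discarded variable itself.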

>  - There should probably be a comment to help de-confuse future people
> like me :)

Done.  I also converted the trampoline to use an indirect jump.  The
latest code is below.  I'm going off the grid this weekend but I can
probably post proper patches next week.

diff --git a/arch/Kconfig b/arch/Kconfig
index 9d329608913e..20ff5624dad7 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -865,6 +865,9 @@ config HAVE_ARCH_PREL32_RELOCATIONS
 	  architectures, and don't require runtime relocation on relocatable
 	  kernels.
 
+config HAVE_ARCH_STATIC_CALL
+	bool
+
 source "kernel/gcov/Kconfig"
 
 source "scripts/gcc-plugins/Kconfig"
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 5136a1281870..1a14c8f87876 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -128,6 +128,7 @@ config X86
 	select HAVE_ARCH_COMPAT_MMAP_BASES	if MMU && COMPAT
 	select HAVE_ARCH_PREL32_RELOCATIONS
 	select HAVE_ARCH_SECCOMP_FILTER
+	select HAVE_ARCH_STATIC_CALL		if X86_64
 	select HAVE_ARCH_THREAD_STRUCT_WHITELIST
 	select HAVE_ARCH_TRACEHOOK
 	select HAVE_ARCH_TRANSPARENT_HUGEPAGE
diff --git a/arch/x86/include/asm/static_call.h b/arch/x86/include/asm/static_call.h
new file mode 100644
index 000000000000..50006bcc3352
--- /dev/null
+++ b/arch/x86/include/asm/static_call.h
@@ -0,0 +1,47 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_STATIC_CALL_H
+#define _ASM_STATIC_CALL_H
+
+#ifdef CONFIG_X86_64
+
+#include <linux/frame.h>
+
+void static_call_init(void);
+extern void __static_call_update(void **func_ptr, void *func);
+
+#define STATIC_CALL_PTR(tramp) (__static_call_ptr_##tramp)
+
+/*
+ * In addition to declaring external variables, this macro also spits out some
+ * magic for objtool to read.  The .discard.static_call_tramps section tells
+ * objtool which functions are static call trampolines, so it can then annotate
+ * calls to those functions in the __static_call_table section.
+ */
+#define DECLARE_STATIC_CALL(tramp, func)				\
+	extern typeof(func) tramp;					\
+	extern typeof(func) *STATIC_CALL_PTR(tramp);			\
+	static void __used __section(.discard.static_call_tramps)	\
+		*__static_call_tramp_##tramp = tramp
+
+#define DEFINE_STATIC_CALL(tramp, func)					\
+	DECLARE_STATIC_CALL(tramp, func);				\
+	typeof(func) *STATIC_CALL_PTR(tramp) = func;			\
+	asm(".pushsection .text, \"ax\"				\n"	\
+	    ".align 4						\n"	\
+	    ".globl " #tramp "					\n"	\
+	    ".type " #tramp ", @function			\n"	\
+	    #tramp ":						\n"	\
+	    "jmpq *" __stringify(STATIC_CALL_PTR(tramp)) "(%rip)\n"	\
+	    ".popsection					\n")
+
+#define static_call_update(tramp, func)					\
+({									\
+	/* TODO: validate func type */					\
+	__static_call_update((void **)&STATIC_CALL_PTR(tramp), func);	\
+})
+
+#else /* !CONFIG_X86_64 */
+static inline void static_call_init(void) {}
+#endif
+
+#endif /* _ASM_STATIC_CALL_H */
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 8824d01c0c35..e5d9f3a1e73f 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -62,6 +62,7 @@ obj-y			+= tsc.o tsc_msr.o io_delay.o rtc.o
 obj-y			+= pci-iommu_table.o
 obj-y			+= resource.o
 obj-y			+= irqflags.o
+obj-$(CONFIG_X86_64)	+= static_call.o
 
 obj-y				+= process.o
 obj-y				+= fpu/
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index b4866badb235..447401fc8d65 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -117,6 +117,7 @@
 #include <asm/microcode.h>
 #include <asm/kaslr.h>
 #include <asm/unwind.h>
+#include <asm/static_call.h>
 
 /*
  * max_low_pfn_mapped: highest direct mapped pfn under 4GB
@@ -874,6 +875,7 @@ void __init setup_arch(char **cmdline_p)
 	early_cpu_init();
 	arch_init_ideal_nops();
 	jump_label_init();
+	static_call_init();
 	early_ioremap_init();
 
 	setup_olpc_ofw_pgd();
diff --git a/arch/x86/kernel/static_call.c b/arch/x86/kernel/static_call.c
new file mode 100644
index 000000000000..80c0c0de06e7
--- /dev/null
+++ b/arch/x86/kernel/static_call.c
@@ -0,0 +1,118 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#include <linux/init.h>
+#include <linux/static_call.h>
+#include <linux/bug.h>
+#include <linux/smp.h>
+#include <linux/memory.h>
+#include <asm/text-patching.h>
+#include <asm/processor.h>
+
+extern int cmdline_proc_show(void);
+
+/* The static call table is created by objtool */
+struct static_call_entry {
+	s32 insn, func_ptr;
+};
+extern struct static_call_entry __start_static_call_table[],
+			        __stop_static_call_table[];
+
+static int static_call_initialized;
+
+void __init static_call_init(void)
+{
+	struct static_call_entry *entry;
+	unsigned long insn, func_ptr, func;
+	unsigned char insn_opcode;
+	s32 call_dest;
+
+	for (entry = __start_static_call_table;
+	     entry < __stop_static_call_table; entry++) {
+
+		insn = (long)entry->insn + (unsigned long)&entry->insn;
+		func_ptr = (long)entry->func_ptr + (unsigned long)&entry->func_ptr;
+		func = *(unsigned long *)func_ptr;
+
+		/*
+		 * Make sure the original call to be patched is either a call
+		 * or a tail call jump:
+		 */
+		insn_opcode = *(unsigned char *)insn;
+		if (insn_opcode != 0xe8 && insn_opcode != 0xe9) {
+			WARN_ONCE(1, "unexpected static call insn opcode %x at %pS",
+				  insn_opcode, (void *)insn);
+			continue;
+		}
+
+		call_dest = (long)(func) - (long)(insn + 5);
+
+		text_poke_early((void *)(insn + 1), &call_dest, 4);
+	}
+
+	static_call_initialized = 1;
+}
+
+void static_call_bp_handler(void);
+void *bp_handler_func;
+void *bp_handler_continue;
+
+asm(".pushsection .text, \"ax\"						\n"
+    ".globl static_call_bp_handler					\n"
+    ".type static_call_bp_handler, @function				\n"
+    "static_call_bp_handler:						\n"
+    "call *bp_handler_func(%rip)					\n"
+    "jmp *bp_handler_continue(%rip)					\n"
+    ".popsection .text							\n");
+
+void __static_call_update(void **func_ptr, void *func)
+{
+	struct static_call_entry *entry;
+	unsigned long insn, ptr;
+	s32 call_dest;
+	unsigned char opcodes[5];
+
+	/* If called before init, use the trampoline's indirect jump. */
+	if (!static_call_initialized)
+		return;
+
+	*func_ptr = func;
+
+	mutex_lock(&text_mutex);
+
+	for (entry = __start_static_call_table;
+	     entry < __stop_static_call_table; entry++) {
+
+		ptr = (long)entry->func_ptr + (long)&entry->func_ptr;
+		if ((void **)ptr != func_ptr)
+			continue;
+
+		/* Calculate the new call destination: */
+		insn = (long)entry->insn + (unsigned long)&entry->insn;
+		call_dest = (long)(func) - (long)(insn + 5);
+		opcodes[0] = 0xe8;
+		memcpy(&opcodes[1], &call_dest, 4);
+
+		/* Set up the variables for the breakpoint handler: */
+		bp_handler_func = func;
+		bp_handler_continue = (void *)(insn + 5);
+
+		/* Patch the call site: */
+		text_poke_bp((void *)insn, opcodes, 5, static_call_bp_handler);
+	}
+
+	mutex_unlock(&text_mutex);
+}
+
+/*** TEST CODE BELOW -- called from cmdline_proc_show ***/
+
+int my_func_add(int arg1, int arg2)
+{
+	return arg1 + arg2;
+}
+
+int my_func_sub(int arg1, int arg2)
+{
+	return arg1 - arg2;
+}
+
+DEFINE_STATIC_CALL(my_static_call, my_func_add);
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 0d618ee634ac..cf0566f8a13c 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -185,6 +185,9 @@ SECTIONS
 
 	BUG_TABLE
 
+	/* FIXME: move to read-only section */
+	STATIC_CALL_TABLE
+
 	ORC_UNWIND_TABLE
 
 	. = ALIGN(PAGE_SIZE);
diff --git a/fs/proc/cmdline.c b/fs/proc/cmdline.c
index fa762c5fbcb2..c704b9e1fe5f 100644
--- a/fs/proc/cmdline.c
+++ b/fs/proc/cmdline.c
@@ -3,9 +3,27 @@
 #include <linux/init.h>
 #include <linux/proc_fs.h>
 #include <linux/seq_file.h>
+#include <linux/static_call.h>
+
+extern int my_func_add(int arg1, int arg2);
+extern int my_func_sub(int arg1, int arg2);
+DECLARE_STATIC_CALL(my_static_call, my_func_add);
 
 static int cmdline_proc_show(struct seq_file *m, void *v)
 {
+	int ret;
+
+	ret = my_static_call(1, 2);
+	printk("static call (orig): ret=%d\n", ret);
+
+	static_call_update(my_static_call, my_func_sub);
+	ret = my_static_call(1, 2);
+	printk("static call (sub): ret=%d\n", ret);
+
+	static_call_update(my_static_call, my_func_add);
+	ret = my_static_call(1, 2);
+	printk("static call (add): ret=%d\n", ret);
+
 	seq_puts(m, saved_command_line);
 	seq_putc(m, '\n');
 	return 0;
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index f09ee3c544bc..a1c7bda1b22a 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -722,6 +722,14 @@
 #define BUG_TABLE
 #endif
 
+#define STATIC_CALL_TABLE						\
+	. = ALIGN(8);							\
+	__static_call_table : AT(ADDR(__static_call_table) - LOAD_OFFSET) { \
+		__start_static_call_table = .;			\
+		KEEP(*(__static_call_table))				\
+		__stop_static_call_table = .;				\
+	}
+
 #ifdef CONFIG_UNWINDER_ORC
 #define ORC_UNWIND_TABLE						\
 	. = ALIGN(4);							\
diff --git a/include/linux/static_call.h b/include/linux/static_call.h
new file mode 100644
index 000000000000..729e7ee4c66b
--- /dev/null
+++ b/include/linux/static_call.h
@@ -0,0 +1,20 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_STATIC_CALL_H
+#define _LINUX_STATIC_CALL_H
+
+#ifdef CONFIG_HAVE_ARCH_STATIC_CALL
+#include <asm/static_call.h>
+#else
+
+#define DECLARE_STATIC_CALL(ptr, func)					\
+	extern typeof(func) *ptr
+
+#define DEFINE_STATIC_CALL(ptr, func)					\
+	typeof(func) *ptr = func
+
+#define static_call_update(ptr, func)					\
+	WRITE_ONCE(ptr, func)
+
+#endif /* !CONFIG_HAVE_ARCH_STATIC_CALL */
+
+#endif /* _LINUX_STATIC_CALL_H */
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 0414a0d52262..b59770f8ed4b 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -525,6 +525,10 @@ static int add_jump_destinations(struct objtool_file *file)
 		} else {
 			/* sibling call */
 			insn->jump_dest = 0;
+			if (rela->sym->static_call_tramp) {
+				list_add_tail(&insn->static_call_node,
+					      &file->static_call_list);
+			}
 			continue;
 		}
 
@@ -1202,6 +1206,21 @@ static int read_retpoline_hints(struct objtool_file *file)
 	return 0;
 }
 
+static int read_static_call_tramps(struct objtool_file *file)
+{
+	struct section *sec;
+	struct rela *rela;
+
+	sec = find_section_by_name(file->elf, ".rela.discard.static_call_tramps");
+	if (!sec)
+		return 0;
+
+	list_for_each_entry(rela, &sec->rela_list, list)
+		rela->sym->static_call_tramp = true;
+
+	return 0;
+}
+
 static void mark_rodata(struct objtool_file *file)
 {
 	struct section *sec;
@@ -1267,6 +1286,10 @@ static int decode_sections(struct objtool_file *file)
 	if (ret)
 		return ret;
 
+	ret = read_static_call_tramps(file);
+	if (ret)
+		return ret;
+
 	return 0;
 }
 
@@ -1920,6 +1943,11 @@ static int validate_branch(struct objtool_file *file, struct instruction *first,
 			if (is_fentry_call(insn))
 				break;
 
+			if (insn->call_dest->static_call_tramp) {
+				list_add_tail(&insn->static_call_node,
+					      &file->static_call_list);
+			}
+
 			ret = dead_end_function(file, insn->call_dest);
 			if (ret == 1)
 				return 0;
@@ -2167,6 +2195,99 @@ static int validate_reachable_instructions(struct objtool_file *file)
 	return 0;
 }
 
+struct static_call_entry {
+	s32 insn, func_ptr;
+};
+
+static int create_static_call_sections(struct objtool_file *file)
+{
+	struct section *sec, *rela_sec;
+	struct rela *rela;
+	struct static_call_entry *entry;
+	struct instruction *insn;
+	char func_ptr_name[128];
+	struct symbol *func_ptr;
+	int idx, ret;
+
+	sec = find_section_by_name(file->elf, "__static_call_table");
+	if (sec) {
+		WARN("file already has __static_call_table section, skipping");
+		return -1;
+	}
+
+	if (list_empty(&file->static_call_list))
+		return 0;
+
+	idx = 0;
+	list_for_each_entry(insn, &file->static_call_list, static_call_node)
+		idx++;
+
+	sec = elf_create_section(file->elf, "__static_call_table",
+				 sizeof(struct static_call_entry), idx);
+	if (!sec)
+		return -1;
+
+	rela_sec = elf_create_rela_section(file->elf, sec);
+	if (!rela_sec)
+		return -1;
+
+	idx = 0;
+	list_for_each_entry(insn, &file->static_call_list, static_call_node) {
+
+		entry = (struct static_call_entry *)sec->data->d_buf + idx;
+		memset(entry, 0, sizeof(struct static_call_entry));
+
+		/* populate rela for 'insn' */
+		rela = malloc(sizeof(*rela));
+		if (!rela) {
+			perror("malloc");
+			return -1;
+		}
+		memset(rela, 0, sizeof(*rela));
+		rela->sym = insn->sec->sym;
+		rela->addend = insn->offset;
+		rela->type = R_X86_64_PC32;
+		rela->offset = idx * sizeof(struct static_call_entry);
+		list_add_tail(&rela->list, &rela_sec->rela_list);
+		hash_add(rela_sec->rela_hash, &rela->hash, rela->offset);
+
+		/* find function pointer symbol */
+		ret = snprintf(func_ptr_name, 128, "__static_call_ptr_%s",
+			       insn->call_dest->name);
+		if (ret < 0 || ret >= 128) {
+			WARN("snprintf failed");
+			return -1;
+		}
+
+		func_ptr = find_symbol_by_name(file->elf, func_ptr_name);
+		if (!func_ptr) {
+			WARN("can't find static call ptr symbol: %s", func_ptr_name);
+			return -1;
+		}
+
+		/* populate rela for 'func_ptr' */
+		rela = malloc(sizeof(*rela));
+		if (!rela) {
+			perror("malloc");
+			return -1;
+		}
+		memset(rela, 0, sizeof(*rela));
+		rela->sym = func_ptr;
+		rela->addend = 0;
+		rela->type = R_X86_64_PC32;
+		rela->offset = idx * sizeof(struct static_call_entry) + 4;
+		list_add_tail(&rela->list, &rela_sec->rela_list);
+		hash_add(rela_sec->rela_hash, &rela->hash, rela->offset);
+
+		idx++;
+	}
+
+	if (elf_rebuild_rela_section(rela_sec))
+		return -1;
+
+	return 0;
+}
+
 static void cleanup(struct objtool_file *file)
 {
 	struct instruction *insn, *tmpinsn;
@@ -2197,6 +2318,7 @@ int check(const char *_objname, bool orc)
 
 	INIT_LIST_HEAD(&file.insn_list);
 	hash_init(file.insn_hash);
+	INIT_LIST_HEAD(&file.static_call_list);
 	file.whitelist = find_section_by_name(file.elf, ".discard.func_stack_frame_non_standard");
 	file.c_file = find_section_by_name(file.elf, ".comment");
 	file.ignore_unreachables = no_unreachable;
@@ -2236,6 +2358,11 @@ int check(const char *_objname, bool orc)
 		warnings += ret;
 	}
 
+	ret = create_static_call_sections(&file);
+	if (ret < 0)
+		goto out;
+	warnings += ret;
+
 	if (orc) {
 		ret = create_orc(&file);
 		if (ret < 0)
@@ -2244,7 +2371,9 @@ int check(const char *_objname, bool orc)
 		ret = create_orc_sections(&file);
 		if (ret < 0)
 			goto out;
+	}
 
+	if (orc || !list_empty(&file.static_call_list)) {
 		ret = elf_write(file.elf);
 		if (ret < 0)
 			goto out;
diff --git a/tools/objtool/check.h b/tools/objtool/check.h
index e6e8a655b556..56b8b7fb1bd1 100644
--- a/tools/objtool/check.h
+++ b/tools/objtool/check.h
@@ -39,6 +39,7 @@ struct insn_state {
 struct instruction {
 	struct list_head list;
 	struct hlist_node hash;
+	struct list_head static_call_node;
 	struct section *sec;
 	unsigned long offset;
 	unsigned int len;
@@ -60,6 +61,7 @@ struct objtool_file {
 	struct elf *elf;
 	struct list_head insn_list;
 	DECLARE_HASHTABLE(insn_hash, 16);
+	struct list_head static_call_list;
 	struct section *whitelist;
 	bool ignore_unreachables, c_file, hints, rodata;
 };
diff --git a/tools/objtool/elf.h b/tools/objtool/elf.h
index bc97ed86b9cd..3cf44d7cc3ac 100644
--- a/tools/objtool/elf.h
+++ b/tools/objtool/elf.h
@@ -62,6 +62,7 @@ struct symbol {
 	unsigned long offset;
 	unsigned int len;
 	struct symbol *pfunc, *cfunc;
+	bool static_call_tramp;
 };
 
 struct rela {

* Re: [POC][RFC][PATCH 1/2] jump_function: Addition of new feature "jump_function"
  2018-10-11  3:07                             ` Josh Poimboeuf
@ 2018-10-11 12:52                               ` Josh Poimboeuf
  2018-10-11 16:20                                 ` Andy Lutomirski
  0 siblings, 1 reply; 43+ messages in thread
From: Josh Poimboeuf @ 2018-10-11 12:52 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Steven Rostedt, Peter Zijlstra, LKML, Linus Torvalds,
	Ingo Molnar, Andrew Morton, Thomas Gleixner, Masami Hiramatsu,
	Mathieu Desnoyers, Matthew Helsley, Rafael J. Wysocki,
	David Woodhouse, Paolo Bonzini, Jason Baron, Jiri Kosina,
	Ard Biesheuvel, Andrew Lutomirski

On Wed, Oct 10, 2018 at 10:07:38PM -0500, Josh Poimboeuf wrote:
> On Wed, Oct 10, 2018 at 02:13:22PM -0700, Andy Lutomirski wrote:
> > On Wed, Oct 10, 2018 at 11:17 AM Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> > >
> > > On Wed, Oct 10, 2018 at 01:16:05PM -0500, Josh Poimboeuf wrote:
> > > > On Wed, Oct 10, 2018 at 11:03:43AM -0700, Andy Lutomirski wrote:
> > > > > > +#define DECLARE_STATIC_CALL(tramp, func)                               \
> > > > > > +       extern typeof(func) tramp;                                      \
> > > > > > +       static void __used __section(.discard.static_call_tramps)       \
> > > > > > +               *__static_call_tramp_##tramp = tramp
> > > > > > +
> > > > >
> > > > > Confused.  What's the __static_call_tramp_##tramp variable for?  And
> > > > > why is a DECLARE_ macro defining a variable?
> > > >
> > > > This is the magic needed for objtool to find all the call sites.
> > > >
> > > > The variable itself isn't needed, but the .discard.static_call_tramps
> > > > entry is.  Objtool reads that section to find out which function call
> > > > sites are targeted to a static call trampoline.
> > >
> > > To clarify: objtool reads that section to find out which functions are
> > > really static call trampolines.  Then it annotates all the instructions
> > > which call/jmp to those trampolines.  Those annotations are then read by
> > > the kernel.
> > >
> > 
> > Ah, right, and objtool runs on a per-object basis so it has no other
> > way to know what symbols are actually static calls.
> > 
> > There's another way to skin this cat, though:
> > 
> > extern typeof(func) __static_call_trampoline_##tramp;
> > #define tramp __static_call_trampoline_##tramp
> > 
> > And objtool could recognize it by name.  But, of course, you can't put
> > a #define in a macro.  But maybe there's a way to hack it up with a
> > static inline?

I went with something similar in the latest version.  It's less
surprising in a couple of ways:

- DECLARE_STATIC_CALL doesn't have the magic objtool definition.

- Call sites use the static_call() wrapper, which makes static calls
  clearly visible.


#define STATIC_CALL_TRAMP(key) ____static_call_tramp_##key
#define STATIC_CALL_PTR(key)   ____static_call_ptr_##key

#define STATIC_CALL_TRAMP_STR(key) __stringify(STATIC_CALL_TRAMP(key))
#define STATIC_CALL_PTR_STR(key) __stringify(STATIC_CALL_PTR(key))

#define DECLARE_STATIC_CALL(key, func)					\
	extern typeof(func)  STATIC_CALL_TRAMP(key);			\
	extern typeof(func) *STATIC_CALL_PTR(key)

#define DEFINE_STATIC_CALL(key, func)					\
	typeof(func) *STATIC_CALL_PTR(key) = func;			\
	asm(".pushsection .text, \"ax\"				\n"	\
	    ".align 4						\n"	\
	    ".globl " STATIC_CALL_TRAMP_STR(key) "		\n"	\
	    ".type " STATIC_CALL_TRAMP_STR(key) ", @function	\n"	\
	    STATIC_CALL_TRAMP_STR(key) ":			\n"	\
	    "jmpq *" STATIC_CALL_PTR_STR(key) "(%rip)		\n"	\
	    ".popsection					\n")

#define static_call(key, args...) STATIC_CALL_TRAMP(key)(args)

#define static_call_update(key, func)					\
({									\
	STATIC_CALL_PTR(key) = func;					\
	__static_call_update((void **)&STATIC_CALL_PTR(key));		\
})
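
Usage then looks like this (sketch, reusing the my_func_add/my_func_sub
test functions from the patch):

	/* in one translation unit: */
	DEFINE_STATIC_CALL(my_key, my_func_add);

	/* at a call site (after DECLARE_STATIC_CALL in a header): */
	ret = static_call(my_key, 1, 2);	/* runs my_func_add */
	static_call_update(my_key, my_func_sub);
	ret = static_call(my_key, 1, 2);	/* now runs my_func_sub */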

-- 
Josh

* Re: [POC][RFC][PATCH 1/2] jump_function: Addition of new feature "jump_function"
  2018-10-11 12:52                               ` Josh Poimboeuf
@ 2018-10-11 16:20                                 ` Andy Lutomirski
  0 siblings, 0 replies; 43+ messages in thread
From: Andy Lutomirski @ 2018-10-11 16:20 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Steven Rostedt, Peter Zijlstra, LKML, Linus Torvalds,
	Ingo Molnar, Andrew Morton, Thomas Gleixner, Masami Hiramatsu,
	Mathieu Desnoyers, Matthew Helsley, Rafael J. Wysocki,
	David Woodhouse, Paolo Bonzini, Jason Baron, Jiri Kosina,
	Ard Biesheuvel, Andrew Lutomirski



> On Oct 11, 2018, at 5:52 AM, Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> 
>> On Wed, Oct 10, 2018 at 10:07:38PM -0500, Josh Poimboeuf wrote:
>>> On Wed, Oct 10, 2018 at 02:13:22PM -0700, Andy Lutomirski wrote:
>>>> On Wed, Oct 10, 2018 at 11:17 AM Josh Poimboeuf <jpoimboe@redhat.com> wrote:
>>>> 
>>>>> On Wed, Oct 10, 2018 at 01:16:05PM -0500, Josh Poimboeuf wrote:
>>>>> On Wed, Oct 10, 2018 at 11:03:43AM -0700, Andy Lutomirski wrote:
>>>>>>> +#define DECLARE_STATIC_CALL(tramp, func)                               \
>>>>>>> +       extern typeof(func) tramp;                                      \
>>>>>>> +       static void __used __section(.discard.static_call_tramps)       \
>>>>>>> +               *__static_call_tramp_##tramp = tramp
>>>>>>> +
>>>>>> 
>>>>>> Confused.  What's the __static_call_tramp_##tramp variable for?  And
>>>>>> why is a DECLARE_ macro defining a variable?
>>>>> 
>>>>> This is the magic needed for objtool to find all the call sites.
>>>>> 
>>>>> The variable itself isn't needed, but the .discard.static_call_tramps
>>>>> entry is.  Objtool reads that section to find out which function call
>>>>> sites are targeted to a static call trampoline.
>>>> 
>>>> To clarify: objtool reads that section to find out which functions are
>>>> really static call trampolines.  Then it annotates all the instructions
>>>> which call/jmp to those trampolines.  Those annotations are then read by
>>>> the kernel.
>>>> 
>>> 
>>> Ah, right, and objtool runs on a per-object basis so it has no other
>>> way to know what symbols are actually static calls.
>>> 
>>> There's another way to skin this cat, though:
>>> 
>>> extern typeof(func) __static_call_trampoline_##tramp;
>>> #define tramp __static_call_trampoline_##tramp
>>> 
>>> And objtool could recognize it by name.  But, of course, you can't put
>>> a #define in a macro.  But maybe there's a way to hack it up with a
>>> static inline?
> 
> I went with something similar in the latest version.  It's less
> surprising in a couple of ways:
> 
> - DECLARE_STATIC_CALL doesn't have the magic objtool definition.
> 
> - Call sites use the static_call() wrapper, which makes static calls
>  clearly visible.

Seems reasonable. Also, for a real patch, it should be straightforward to have a fallback implementation in include/linux/static_call.h that just dereferences the pointer.
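
Something like this on the !HAVE_ARCH_STATIC_CALL side, keeping the API
identical (sketch, matching the key-based naming above):

	#define DECLARE_STATIC_CALL(key, func)				\
		extern typeof(func) *STATIC_CALL_PTR(key)

	#define DEFINE_STATIC_CALL(key, func)				\
		typeof(func) *STATIC_CALL_PTR(key) = func

	#define static_call(key, args...)				\
		((*READ_ONCE(STATIC_CALL_PTR(key)))(args))

	#define static_call_update(key, func)				\
		WRITE_ONCE(STATIC_CALL_PTR(key), func)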

Thread overview: 43+ messages
2018-10-06  1:51 [POC][RFC][PATCH 0/2] PROOF OF CONCEPT: Dynamic Functions (jump functions) Steven Rostedt
2018-10-06  1:51 ` [POC][RFC][PATCH 1/2] jump_function: Addition of new feature "jump_function" Steven Rostedt
2018-10-06  2:00   ` Steven Rostedt
2018-10-06  2:02   ` Steven Rostedt
2018-10-06  2:03   ` Steven Rostedt
2018-10-06 15:15     ` Steven Rostedt
2018-10-06 12:12   ` Peter Zijlstra
2018-10-06 13:39     ` Steven Rostedt
2018-10-06 15:13       ` Andy Lutomirski
2018-10-06 15:16         ` Steven Rostedt
2018-10-08  7:21       ` Peter Zijlstra
2018-10-08  8:33         ` Andy Lutomirski
2018-10-08 15:57           ` Peter Zijlstra
2018-10-08 16:29             ` Andy Lutomirski
2018-10-08 16:39               ` Steven Rostedt
2018-10-08 16:39               ` Peter Zijlstra
2018-10-08 17:25                 ` Andy Lutomirski
2018-10-08 17:30                   ` Ard Biesheuvel
2018-10-08 17:42                     ` Andy Lutomirski
2018-10-08 17:44                     ` Jiri Kosina
2018-10-08 17:45                       ` Ard Biesheuvel
2018-10-08 17:47                       ` Andy Lutomirski
2018-10-09  2:17               ` Josh Poimboeuf
2018-10-09  3:57                 ` Steven Rostedt
2018-10-10 17:52                   ` Josh Poimboeuf
2018-10-10 18:03                     ` Andy Lutomirski
2018-10-10 18:16                       ` Josh Poimboeuf
2018-10-10 18:17                         ` Josh Poimboeuf
2018-10-10 21:13                           ` Andy Lutomirski
2018-10-11  3:07                             ` Josh Poimboeuf
2018-10-11 12:52                               ` Josh Poimboeuf
2018-10-11 16:20                                 ` Andy Lutomirski
2018-10-10 18:33                         ` Josh Poimboeuf
2018-10-10 18:56                           ` Steven Rostedt
2018-10-10 20:16                             ` Josh Poimboeuf
2018-10-10 20:57                               ` Andy Lutomirski
2018-10-08 16:31             ` Steven Rostedt
2018-10-08 11:30       ` Ard Biesheuvel
2018-10-09  3:44   ` Masami Hiramatsu
2018-10-09  3:55     ` Steven Rostedt
2018-10-09 16:04       ` Masami Hiramatsu
2018-10-09  8:59     ` David Laight
2018-10-06  1:51 ` [POC][RFC][PATCH 2/2] tracepoints: Implement it with dynamic functions Steven Rostedt
