All of lore.kernel.org
 help / color / mirror / Atom feed
From: Vikash Bansal <bvikas@vmware.com>
To: Greg KH <gregkh@linuxfoundation.org>
Cc: "bhelgaas@google.com" <bhelgaas@google.com>,
	"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Srivatsa Bhat <srivatsab@vmware.com>,
	"srivatsa@csail.mit.edu" <srivatsa@csail.mit.edu>,
	Alexey Makhalov <amakhalov@vmware.com>,
	Srinidhi Rao <srinidhir@vmware.com>,
	Anish Swaminathan <anishs@vmware.com>,
	Vasavi Sirnapalli <vsirnapalli@vmware.com>,
	Ajay Kaher <akaher@vmware.com>
Subject: Re: [PATCH] PCI: Speed up device init by parsing capabilities all at once
Date: Fri, 21 Jan 2022 17:26:35 +0000	[thread overview]
Message-ID: <7E2C2648-76CE-4987-AB4F-7B4576F10D7B@vmware.com> (raw)
In-Reply-To: <YekAgfkDgV6z6hYV@kroah.com>

    On 20/01/22, 11:56 AM, "Greg KH" <gregkh@linuxfoundation.org> wrote:

    Run pahole for pci_dev structure, it is not adding any padding bytes.
    Please refer to my previous email for replies to Greg's other comments. 

    >On Tue, Jan 18, 2022 at 09:16:01AM -0800, Vikash Bansal wrote:
    >> In the current implementation, the PCI capability list is parsed from
    >> the beginning to find each capability, which results in a large number
    >> of redundant PCI reads.
    >>
    >> Instead, we can parse the complete list just once, store it in the
    >> pci_dev structure, and get the offset of each capability directly from
    >> the pci_dev structure.
    >>
    >> This implementation improves pci devices initialization time  by ~2-3% in
    >> case of bare metal and 7-8% in case of VM running on ESXi.
    >
    >What is that in terms of "wall clock" time?  % is hard to know here, and
    >of course it will depend on the PCI bus speed, right?
    >
    >> It also adds a memory overhead of 20bytes (value of PCI_CAP_ID_MAX) per
    >> PCI device.
    >>
    >> Signed-off-by: Vikash Bansal <bvikas@vmware.com>
    >> ---
    >>  drivers/pci/pci.c   | 43 ++++++++++++++++++++++++++++++++++++-------
    >>  drivers/pci/probe.c |  5 +++++
    >>  include/linux/pci.h |  2 ++
    >>  3 files changed, 43 insertions(+), 7 deletions(-)
    >>
    >> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
    >> index 3d2fb394986a..8e024db30262 100644
    >> --- a/drivers/pci/pci.c
    >> +++ b/drivers/pci/pci.c
    >> @@ -468,6 +468,41 @@ static u8 __pci_bus_find_cap_start(struct pci_bus *bus,
    >>  	return 0;
    >>  }
    >>  
    >> +
    >> +/**
    >> + * pci_find_all_capabilities - Read all capabilities
    >> + * @dev: the PCI device
    >> + *
    >> + * Read all capabilities and store offsets in cap_off
    >> + * array in pci_dev structure.
    >> + */
    >> +void pci_find_all_capabilities(struct pci_dev *dev)
    >> +{
    >> +	int ttl = PCI_FIND_CAP_TTL;
    >> +	u16 ent;
    >> +	u8 pos;
    >> +	u8 id;
    >> +
    >> +	pos = __pci_bus_find_cap_start(dev->bus, dev->devfn, dev->hdr_type);
    >> +	if (!pos)
    >> +		return;
    >> +	pci_bus_read_config_byte(dev->bus, dev->devfn, pos, &pos);
    >> +	while (ttl--) {
    >> +		if (pos < 0x40)
    >
    >What is this magic value of 0x40?
    >
    >> +			break;
    >> +		pos &= ~3;
    >
    >Why ~3?
    >
    >> +		pci_bus_read_config_word(dev->bus, dev->devfn, pos, &ent);
    >> +		id = ent & 0xff;
    >
    >Do you really need the & if you are truncating it?
    >
    >> +		if (id == 0xff)
    >> +			break;
    >> +
    >> +		/* Read first instance of capability */
    >> +		if (!(dev->cap_off[id]))
    >> +			dev->cap_off[id] = pos;
    >
    >Shouldn't you have checked this before you read the value?
    >
    >> +		pos = (ent >> 8);
    >
    >What about walking the list using __pci_find_next_cap() like before?
    >Why is this somehow the same as the old function?
    >
    >> +	}
    >> +}
    >> +
    >>  /**
    >>   * pci_find_capability - query for devices' capabilities
    >>   * @dev: PCI device to query
    >> @@ -489,13 +524,7 @@ static u8 __pci_bus_find_cap_start(struct pci_bus *bus,
    >>   */
    >>  u8 pci_find_capability(struct pci_dev *dev, int cap)
    >>  {
    >> -	u8 pos;
    >> -
    > -	pos = __pci_bus_find_cap_start(dev->bus, dev->devfn, dev->hdr_type);
    >> -	if (pos)
    >> -		pos = __pci_find_next_cap(dev->bus, dev->devfn, pos, cap);
    >> -
    >> -	return pos;
    >> +	return dev->cap_off[cap];
    >>  }
    >>  EXPORT_SYMBOL(pci_find_capability);
    >>  
    >> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
    >> index 087d3658f75c..bacab12cedbb 100644
    >> --- a/drivers/pci/probe.c
    >> +++ b/drivers/pci/probe.c
    >> @@ -1839,6 +1839,11 @@ int pci_setup_device(struct pci_dev *dev)
    >>  	dev->hdr_type = hdr_type & 0x7f;
    >>  	dev->multifunction = !!(hdr_type & 0x80);
    >>  	dev->error_state = pci_channel_io_normal;
    >> +	/*
    >> +	 * Read all capabilities and store offsets in cap_off
    >> +	 * array in pci_dev structure.
    >> +	 */
    >
    >Comment is not needed if the function name is descriptive.
    >
    >> +	pci_find_all_capabilities(dev);
    >
    >And it is, so no need for the comment.
    >
    >>  	set_pcie_port_type(dev);
    >>  
    >>  	pci_set_of_node(dev);
    >> diff --git a/include/linux/pci.h b/include/linux/pci.h
    >> index 18a75c8e615c..d221c73e67f8 100644
    >> --- a/include/linux/pci.h
    >> +++ b/include/linux/pci.h
    >> @@ -326,6 +326,7 @@ struct pci_dev {
    >>  	unsigned int	class;		/* 3 bytes: (base,sub,prog-if) */
    >>  	u8		revision;	/* PCI revision, low byte of class word */
    >>  	u8		hdr_type;	/* PCI header type (`multi' flag masked out) */
    >> +	u8              cap_off[PCI_CAP_ID_MAX]; /* Offsets of all pci capabilities */
    >
    >Did you run 'pahole' to ensure you are not adding extra padding bytes
    >here?
    >
    >>  #ifdef CONFIG_PCIEAER
    >>  	u16		aer_cap;	/* AER capability offset */
    >>  	struct aer_stats *aer_stats;	/* AER stats for this device */
    >> @@ -1128,6 +1129,7 @@ void pci_sort_breadthfirst(void);
    >>  
    >>  u8 pci_bus_find_capability(struct pci_bus *bus, unsigned int devfn, int cap);
    >>  u8 pci_find_capability(struct pci_dev *dev, int cap);
    >> +void pci_find_all_capabilities(struct pci_dev *dev);
    >
    >Why is this now a global function and not one just local to the pci
    >core?  Who else would ever need to call it?
    >
    >thanks,
    >
    >greg k-h





  parent reply	other threads:[~2022-01-21 17:26 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <1642526161-22499-1-git-send-email-bvikas@vmware.com>
2022-01-20  6:26 ` [PATCH] PCI: Speed up device init by parsing capabilities all at once Greg KH
2022-01-20 17:31   ` Vikash Bansal
2022-01-22 19:09     ` Vikash Bansal
2022-01-21 17:26   ` Vikash Bansal [this message]
2022-01-21 19:42     ` Bjorn Helgaas
2022-01-22  2:12       ` Srivatsa S. Bhat
2022-01-22 18:51         ` Vikash Bansal

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7E2C2648-76CE-4987-AB4F-7B4576F10D7B@vmware.com \
    --to=bvikas@vmware.com \
    --cc=akaher@vmware.com \
    --cc=amakhalov@vmware.com \
    --cc=anishs@vmware.com \
    --cc=bhelgaas@google.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=srinidhir@vmware.com \
    --cc=srivatsa@csail.mit.edu \
    --cc=srivatsab@vmware.com \
    --cc=vsirnapalli@vmware.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.