DPDK PMD


PMD stands for Poll Mode Driver: a driver that runs in user space and works by polling the device.

Setting VFIO aside, the PMD is structured as follows (the original post shows a structure diagram here).

   Although the PMD implements the device driver in user space, it still relies on facilities provided by the kernel. The uio module is the kernel's user-space I/O driver framework, and igb_uio is the kernel module shipped with the DPDK kit that interacts with uio and binds the specified NIC;

  When the DPDK script dpdk-devbind is used to bind a NIC, it talks to the kernel through sysfs so that the kernel matches the NIC with the specified driver. Concretely, it either writes the driver name to /sys/bus/pci/devices/(pci id)/driver_override, or writes the PCI ID of the NIC to be bound to /sys/bus/pci/drivers/igb_uio (driver name)/new_id. The former configures the device so that it selects the driver; the latter configures the driver so that it accepts the new PCI device. According to the kernel documentation (https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-bus-pci), either action causes the driver to bind the new device.

  Nevertheless, the dpdk-devbind script still writes the PCI ID to /sys/bus/pci/drivers/igb_uio (driver name)/bind to make sure the binding actually happens;
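As a minimal sketch of those sysfs writes (the PCI address is illustrative, and the device is assumed to have already been unbound from its previous driver), the same binding can be done directly from C:

/*
 * Hedged sketch: bind a PCI device to igb_uio through sysfs, mirroring what
 * dpdk-devbind does. The PCI address below is hypothetical, and the device
 * is assumed to be unbound from its previous driver.
 */
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

static void sysfs_write(const char *path, const char *value)
{
    FILE *f = fopen(path, "w");

    if (f == NULL) {
        perror(path);
        exit(EXIT_FAILURE);
    }
    fprintf(f, "%s", value);
    fclose(f);
}

int main(void)
{
    const char *pci_addr = "0000:00:09.0";  /* hypothetical device */
    char path[PATH_MAX];

    /* configure the device: prefer the igb_uio driver */
    snprintf(path, sizeof(path),
             "/sys/bus/pci/devices/%s/driver_override", pci_addr);
    sysfs_write(path, "igb_uio");

    /* configure the driver: ask igb_uio to bind this device */
    sysfs_write("/sys/bus/pci/drivers/igb_uio/bind", pci_addr);
    return 0;
}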

  When a device is bound with igb_uio, the kernel calls the probe function that igb_uio registered in its struct pci_driver, namely igbuio_pci_probe,

which in turn calls uio_register_device() to register a uio device.

Opening the uio device

 At this point user-space DPDK can already use the uio device. The DPDK application code opens the uioX device inside the function pci_uio_alloc_resource (shown further below);

   When the corresponding uio device is opened, the kernel entry point is uio_open(), which in turn calls igb_uio's open callback (the original post illustrates this flow with a diagram).
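The sketch below is a heavily condensed reading of uio_open() from the kernel's drivers/uio/uio.c (locking, reference counting and error handling are omitted, local names are illustrative); the point is simply that the open() ends up in the callback igb_uio registered, which enables the interrupt:

/*
 * Condensed sketch of the kernel path taken when DPDK opens /dev/uioX
 * (based on drivers/uio/uio.c; locking, reference counting and error
 * handling are omitted, so treat this as an outline rather than buildable code).
 */
static int uio_open_sketch(struct inode *inode, struct file *filep)
{
    /* look up the uio device from the minor number of /dev/uioX */
    struct uio_device *idev = idr_find(&uio_idr, iminor(inode));
    struct uio_listener *listener;

    listener = kmalloc(sizeof(*listener), GFP_KERNEL);
    listener->dev = idev;
    listener->event_count = atomic_read(&idev->event);
    filep->private_data = listener;

    /* hand over to the driver that registered this uio device: for igb_uio
     * this is the open callback, which calls igbuio_pci_enable_interrupts() */
    if (idev->info->open)
        return idev->info->open(idev->info, inode);

    return 0;
}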

 

 igb_uio's default interrupt mode is RTE_INTR_MODE_MSIX; the key part of igbuio_pci_enable_interrupts() (only the MSI-X branch is shown here) is as follows

static int
igbuio_pci_enable_interrupts(struct rte_uio_pci_dev *udev)
{
    int err = 0;

    switch (igbuio_intr_mode_preferred) {
    case RTE_INTR_MODE_MSIX:
        /* Only 1 msi-x vector needed */
#ifndef HAVE_ALLOC_IRQ_VECTORS
        msix_entry.entry = 0;
        if (pci_enable_msix(udev->pdev, &msix_entry, 1) == 0) {
            dev_dbg(&udev->pdev->dev, "using MSI-X");
            udev->info.irq_flags = IRQF_NO_THREAD;
            udev->info.irq = msix_entry.vector;
            udev->mode = RTE_INTR_MODE_MSIX;
            break;
        }
#else
        if (pci_alloc_irq_vectors(udev->pdev, 1, 1, PCI_IRQ_MSIX) == 1) {
            dev_dbg(&udev->pdev->dev, "using MSI-X");
            udev->info.irq_flags = IRQF_NO_THREAD;
            udev->info.irq = pci_irq_vector(udev->pdev, 0);
            udev->mode = RTE_INTR_MODE_MSIX;
            break;
        }
#endif

    default:
        dev_err(&udev->pdev->dev, "invalid IRQ mode %u",
            igbuio_intr_mode_preferred);
        udev->info.irq = UIO_IRQ_NONE;
        err = -EINVAL;
    }

    if (udev->info.irq != UIO_IRQ_NONE)
        err = request_irq(udev->info.irq, igbuio_pci_irqhandler,
                  udev->info.irq_flags, udev->info.name,
                  udev);
    dev_info(&udev->pdev->dev, "uio device registered with irq %ld\n",
         udev->info.irq);

    return err;
}

   So when the uio device is opened, igb_uio requests an interrupt and registers a handler. At this point you may wonder: isn't the PMD a user-space polling driver? Why does it still request an interrupt and register an interrupt handler? The reason is that even though the application implements the device driver through uio, some device events still have to be handled by the kernel first, which then notifies the application; this split is analogous to the top half and bottom half of interrupt handling.

 Top-half interrupt service routine: its main job is to deliver an event to user space, e.g. by waking up the tasks waiting on the idev->wait wait queue (a user-space sketch of consuming these events follows the handler below):

/**
 * This is interrupt handler which will check if the interrupt is for the right device.
 * If yes, disable it here and will be enable later.
 */
static irqreturn_t
igbuio_pci_irqhandler(int irq, void *dev_id)
{
    struct rte_uio_pci_dev *udev = (struct rte_uio_pci_dev *)dev_id;
    struct uio_info *info = &udev->info;

    /* Legacy mode need to mask in hardware */
    if (udev->mode == RTE_INTR_MODE_LEGACY &&
        !pci_check_and_mask_intx(udev->pdev))
        return IRQ_NONE;

    uio_event_notify(info);

    /* Message signal mode, no share IRQ and automasked */
    return IRQ_HANDLED;
}
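On the user-space side the counterpart of this top half is simple: the uio framework lets a process enable the interrupt by writing a 32-bit 1 to the uio file descriptor (which reaches igb_uio's irqcontrol callback) and wait for events by reading a 32-bit event counter from it (the read blocks on idev->wait until uio_event_notify() runs). DPDK does the equivalent through its EAL interrupt thread on intr_handle.fd; the stand-alone sketch below (device path hypothetical) shows the raw mechanism:

/*
 * Minimal user-space sketch of consuming uio interrupt events.
 * The device path is hypothetical; DPDK performs the equivalent steps
 * from its EAL interrupt thread using intr_handle.fd.
 */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/dev/uio0", O_RDWR);
    uint32_t enable = 1;
    uint32_t events;

    if (fd < 0) {
        perror("open /dev/uio0");
        return 1;
    }

    /* writing 1 reaches igb_uio's irqcontrol callback and unmasks the IRQ */
    if (write(fd, &enable, sizeof(enable)) != sizeof(enable))
        perror("write");

    /* read blocks on idev->wait until uio_event_notify() wakes us up;
     * the value returned is the running event count */
    if (read(fd, &events, sizeof(events)) == sizeof(events))
        printf("interrupt received, %u events so far\n", events);

    close(fd);
    return 0;
}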

When a DPDK application starts, it performs EAL initialization (the original post illustrates the overall flow with a diagram here).
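Before walking through rte_eal_init() itself, here is a minimal, hedged sketch of such an application (DPDK 18.x-era API assumed; port 0, one RX and one TX queue, and the pool/ring sizes are arbitrary illustration values, and a port already bound to a PMD is assumed to exist). It calls rte_eal_init() and then polls the PMD in a loop:

/*
 * Minimal DPDK application sketch (18.x-era API): EAL init, one RX/TX queue
 * on port 0, then a receive polling loop. Sizes are illustrative.
 */
#include <stdlib.h>
#include <rte_eal.h>
#include <rte_debug.h>
#include <rte_ethdev.h>
#include <rte_ether.h>
#include <rte_mbuf.h>

#define RX_RING_SIZE 128
#define TX_RING_SIZE 512
#define BURST_SIZE   32

int main(int argc, char **argv)
{
    struct rte_eth_conf port_conf = {
        .rxmode = { .max_rx_pkt_len = ETHER_MAX_LEN },
    };
    struct rte_mempool *pool;
    uint16_t port = 0;

    if (rte_eal_init(argc, argv) < 0)
        rte_exit(EXIT_FAILURE, "EAL initialization failed\n");

    pool = rte_pktmbuf_pool_create("MBUF_POOL", 8191, 250, 0,
                                   RTE_MBUF_DEFAULT_BUF_SIZE,
                                   rte_socket_id());
    if (pool == NULL)
        rte_exit(EXIT_FAILURE, "cannot create mbuf pool\n");

    if (rte_eth_dev_configure(port, 1, 1, &port_conf) < 0 ||
        rte_eth_rx_queue_setup(port, 0, RX_RING_SIZE,
                               rte_eth_dev_socket_id(port), NULL, pool) < 0 ||
        rte_eth_tx_queue_setup(port, 0, TX_RING_SIZE,
                               rte_eth_dev_socket_id(port), NULL) < 0 ||
        rte_eth_dev_start(port) < 0)
        rte_exit(EXIT_FAILURE, "port %u init failed\n", port);

    for (;;) {
        struct rte_mbuf *bufs[BURST_SIZE];
        /* ends up in the PMD's rx_pkt_burst, e.g. ixgbe_recv_pkts */
        uint16_t nb_rx = rte_eth_rx_burst(port, 0, bufs, BURST_SIZE);
        uint16_t i;

        for (i = 0; i < nb_rx; i++)
            rte_pktmbuf_free(bufs[i]);
    }

    return 0;
}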

 

The rest of this post walks through the ixgbe driver as a concrete example:


RTE_PMD_REGISTER_PCI(net_ixgbe, rte_ixgbe_pmd);

#define RTE_PMD_EXPORT_NAME(name, idx) \
static const char RTE_PMD_EXPORT_NAME_ARRAY(this_pmd_name, idx) \
__attribute__((used)) = RTE_STR(name)

/** Helper for PCI device registration from driver (eth, crypto) instance */
#define RTE_PMD_REGISTER_PCI(nm, pci_drv) \
RTE_INIT(pciinitfn_ ##nm) \
{\
    (pci_drv).driver.name = RTE_STR(nm);\
    rte_pci_register(&pci_drv); \
} \
RTE_PMD_EXPORT_NAME(nm, __COUNTER__)

/* register a driver */
void
rte_pci_register(struct rte_pci_driver *driver)
{
    TAILQ_INSERT_TAIL(&rte_pci_bus.driver_list, driver, next);
    driver->bus = &rte_pci_bus;
}

    RTE_PMD_REGISTER_PCI is a macro defined by DPDK. It relies on GNU C's __attribute__((constructor)) mechanism (wrapped by RTE_INIT), so driver registration completes before main() runs.
  This is how the driver type and the driver's initialization entry points become known to the EAL.
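As a stand-alone illustration of that mechanism (a toy example, not DPDK code), a function marked with the constructor attribute runs before main():

/* toy example (not DPDK code): constructor functions run before main() */
#include <stdio.h>

static const char *registered_driver;

__attribute__((constructor))
static void register_fake_driver(void)
{
    /* in DPDK this is where rte_pci_register() hangs the driver on the bus */
    registered_driver = "net_ixgbe";
}

int main(void)
{
    /* the "driver" is already on the list by the time main() starts */
    printf("driver registered before main: %s\n", registered_driver);
    return 0;
}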

 1. Using the constructor attribute, the registration function runs before main() and hangs the PMD driver on the bus's global driver list (rte_pci_bus.driver_list).

2. By reading /sys/bus/pci/devices/, scan the PCI devices present in the system, initialize the corresponding device structures, and hang them on pci_device_list ordered by PCI address.

3. rte_bus_probe() → rte_pci_probe_one_driver(): a PCI device and a PCI driver are considered a match when the four PCI ID fields vendor_id, device_id, subsystem_vendor_id and subsystem_device_id agree.

  Once a PCI device and a PCI driver match, pci_map_device() is called to create the map resources for that device; pci_uio_map_resource() maps the PCI resources into the process's virtual address space, so that from then on the device is programmed by directly accessing this memory.

4. rte_bus_probe() → rte_eth_dev_create() and eth_ixgbe_pci_probe() initialize the PCI device; eth_ixgbe_dev_init() then allocates a NIC device and reserves its data region in the global array rte_eth_dev_data[].

  At the same time the NIC device itself is initialized: first its operations table (dev_ops) and its receive/transmit burst functions are installed (a simplified sketch of this receive dispatch follows the list).
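To make the connection to polling explicit, the simplified sketch below shows what rte_eth_rx_burst() essentially does (the callback hooks and debug checks in rte_ethdev.h are stripped): the call lands directly on the rx_pkt_burst pointer that eth_ixgbe_dev_init() installs.

/*
 * Simplified sketch of the dispatch performed by rte_eth_rx_burst();
 * callback hooks and debug checks from rte_ethdev.h are stripped.
 */
#include <rte_ethdev.h>

static inline uint16_t
rx_burst_sketch(uint16_t port_id, uint16_t queue_id,
                struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
{
    struct rte_eth_dev *dev = &rte_eth_devices[port_id];

    /* rx_pkt_burst is the function pointer installed by the PMD,
     * e.g. ixgbe_recv_pkts for ixgbe */
    return (*dev->rx_pkt_burst)(dev->data->rx_queues[queue_id],
                                rx_pkts, nb_pkts);
}

The annotated rte_eal_init() source referenced in the steps above follows.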

/* Launch threads, called at application init(). */
int
rte_eal_init(int argc, char **argv)
{
    int i, fctret, ret;
    pthread_t thread_id;
    static rte_atomic32_t run_once = RTE_ATOMIC32_INIT(0);
    const char *p;
    static char logid[PATH_MAX];
    char cpuset[RTE_CPU_AFFINITY_STR_LEN];
    char thread_name[RTE_MAX_THREAD_NAME_LEN];

    /* checks if the machine is adequate */
    if (!rte_cpu_is_supported()) {
        rte_eal_init_alert("unsupported cpu type.");
        rte_errno = ENOTSUP;
        return -1;
    }
    // the static local variable run_once ensures initialization runs only once

    if (!rte_atomic32_test_and_set(&run_once)) {
        rte_eal_init_alert("already called initialization.");
        rte_errno = EALREADY;
        return -1;
    }
// example EAL init args: -c 3 -n 4 --file-prefix vpp --master-lcore 0
    p = strrchr(argv[0], '/');
    strlcpy(logid, p ? p + 1 : argv[0], sizeof(logid));
    thread_id = pthread_self(); /* get the master thread's ID */
// initialize struct internal_config
    eal_reset_internal_config(&internal_config);

    /* set the log level as early as possible */
    eal_log_level_parse(argc, argv);

    // fill in the global struct lcore_config (detect available lcores)
    if (rte_eal_cpu_init() < 0) {
        rte_eal_init_alert("Cannot detect lcores.");
        rte_errno = ENOTSUP;
        return -1;
    }

    fctret = eal_parse_args(argc, argv);
    if (fctret < 0) {
        rte_eal_init_alert("Invalid 'command line' arguments.");
        rte_errno = EINVAL;
        rte_atomic32_clear(&run_once);
        return -1;
    }

    if (eal_plugins_init() < 0) { // the EAL "-d" option specifies shared libraries (plugins) to load
        rte_eal_init_alert("Cannot init plugins\n");
        rte_errno = EINVAL;
        rte_atomic32_clear(&run_once);
        return -1;
    }

    if (eal_option_device_parse()) {
        rte_errno = ENODEV;
        rte_atomic32_clear(&run_once);
        return -1;
    }

    rte_config_init();

    if (rte_eal_intr_init() < 0) {
        rte_eal_init_alert("Cannot init interrupt-handling thread\n");
        return -1;
    }

    /* Put mp channel init before bus scan so that we can init the vdev
     * bus through mp channel in the secondary process before the bus scan.
     */
    if (rte_mp_channel_init() < 0) {
        rte_eal_init_alert("failed to init mp channel\n");
        if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
            rte_errno = EFAULT;
            return -1;
        }
    }

    if (rte_bus_scan()) {
        /* scan every PCI address under /sys/bus/pci/devices: read the PCI
         * address, PCI ID, number of VFs when SR-IOV is enabled, NUMA
         * affinity, device address space, driver type, etc. */
        rte_eal_init_alert("Cannot scan the buses for devices\n");
        rte_errno = ENODEV;
        rte_atomic32_clear(&run_once);
        return -1;
    }

    /* autodetect the iova mapping mode (default is iova_pa) */
    rte_eal_get_configuration()->iova_mode = rte_bus_get_iommu_class();

    /* Workaround for KNI which requires physical address to work */
    if (rte_eal_get_configuration()->iova_mode == RTE_IOVA_VA &&
            rte_eal_check_module("rte_kni") == 1) {
        rte_eal_get_configuration()->iova_mode = RTE_IOVA_PA;
        RTE_LOG(WARNING, EAL,
            "Some devices want IOVA as VA but PA will be used because.. "
            "KNI module inserted\n");
    }

    if (internal_config.no_hugetlbfs == 0) {
        /* rte_config isn't initialized yet */
        ret = internal_config.process_type == RTE_PROC_PRIMARY ?
                eal_hugepage_info_init() :
                eal_hugepage_info_read();
        if (ret < 0) {
            rte_eal_init_alert("Cannot get hugepage information.");
            rte_errno = EACCES;
            rte_atomic32_clear(&run_once);
            return -1;
        }
    }

    if (internal_config.memory == 0 && internal_config.force_sockets == 0) {
        if (internal_config.no_hugetlbfs)
            internal_config.memory = MEMSIZE_IF_NO_HUGE_PAGE;
    }

    if (internal_config.vmware_tsc_map == 1) {
#ifdef RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT
        rte_cycles_vmware_tsc_map = 1;
        RTE_LOG (DEBUG, EAL, "Using VMWARE TSC MAP, "
                "you must have monitor_control.pseudo_perfctr = TRUE\n");
#else
        RTE_LOG (WARNING, EAL, "Ignoring --vmware-tsc-map because "
                "RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT is not set\n");
#endif
    }

    rte_srand(rte_rdtsc());

    if (rte_eal_log_init(logid, internal_config.syslog_facility) < 0) {
        rte_eal_init_alert("Cannot init logging.");
        rte_errno = ENOMEM;
        rte_atomic32_clear(&run_once);
        return -1;
    }

#ifdef VFIO_PRESENT
    if (rte_eal_vfio_setup() < 0) {
        rte_eal_init_alert("Cannot init VFIO\n");
        rte_errno = EAGAIN;
        rte_atomic32_clear(&run_once);
        return -1;
    }
#endif
    /* in secondary processes, memory init may allocate additional fbarrays
     * not present in primary processes, so to avoid any potential issues,
     * initialize memzones first.
     */
    if (rte_eal_memzone_init() < 0) {
        rte_eal_init_alert("Cannot init memzone\n");
        rte_errno = ENODEV;
        return -1;
    }

    if (rte_eal_memory_init() < 0) {
        rte_eal_init_alert("Cannot init memory\n");
        rte_errno = ENOMEM;
        return -1;
    }

    /* the directories are locked during eal_hugepage_info_init */
    eal_hugedirs_unlock();

    if (rte_eal_malloc_heap_init() < 0) {
        rte_eal_init_alert("Cannot init malloc heap\n");
        rte_errno = ENODEV;
        return -1;
    }

    if (rte_eal_tailqs_init() < 0) {
        rte_eal_init_alert("Cannot init tail queues for objects\n");
        rte_errno = EFAULT;
        return -1;
    }

    if (rte_eal_alarm_init() < 0) {
        rte_eal_init_alert("Cannot init interrupt-handling thread\n");
        /* rte_eal_alarm_init sets rte_errno on failure. */
        return -1;
    }

    if (rte_eal_timer_init() < 0) {
        rte_eal_init_alert("Cannot init HPET or TSC timers\n");
        rte_errno = ENOTSUP;
        return -1;
    }

    eal_check_mem_on_local_socket();

    eal_thread_init_master(rte_config.master_lcore);

    ret = eal_thread_dump_affinity(cpuset, sizeof(cpuset));

    RTE_LOG(DEBUG, EAL, "Master lcore %u is ready (tid=%x;cpuset=[%s%s])\n",
        rte_config.master_lcore, (int)thread_id, cpuset,
        ret == 0 ? "" : "...");

    RTE_LCORE_FOREACH_SLAVE(i) {

        /*
         * create communication pipes between master thread
         * and children
         */
        if (pipe(lcore_config[i].pipe_master2slave) < 0)
            rte_panic("Cannot create pipe\n");
        if (pipe(lcore_config[i].pipe_slave2master) < 0)
            rte_panic("Cannot create pipe\n");

        lcore_config[i].state = WAIT;

        /* create a thread for each lcore */
        ret = pthread_create(&lcore_config[i].thread_id, NULL,
                     eal_thread_loop, NULL);
        if (ret != 0)
            rte_panic("Cannot create thread\n");

        /* Set thread_name for aid in debugging. */
        snprintf(thread_name, sizeof(thread_name),
            "lcore-slave-%d", i);
        ret = rte_thread_setname(lcore_config[i].thread_id,
                        thread_name);
        if (ret != 0)
            RTE_LOG(DEBUG, EAL,
                "Cannot set name for lcore thread\n");
    }

    /*
     * Launch a dummy function on all slave lcores, so that master lcore
     * knows they are all ready when this function returns.
     */
    rte_eal_mp_remote_launch(sync_func, NULL, SKIP_MASTER);
    rte_eal_mp_wait_lcore();

    /* initialize services so vdevs register service during bus_probe. */
    ret = rte_service_init();
    if (ret) {
        rte_eal_init_alert("rte_service_init() failed\n");
        rte_errno = ENOEXEC;
        return -1;
    }

    /* Probe all the buses and the devices/drivers on them.
     *
     * Every driver has already registered itself via rte_pci_register(),
     * which appended it to rte_pci_bus.driver_list and set driver->bus:
     *
     *   void rte_pci_register(struct rte_pci_driver *driver)
     *   {
     *       TAILQ_INSERT_TAIL(&rte_pci_bus.driver_list, driver, next);
     *       driver->bus = &rte_pci_bus;
     *   }
     *
     * rte_bus_probe() walks the device and driver lists, binds each
     * pci_device to a matching pci_driver by PCI ID, and calls the driver's
     * probe function to initialize the device. For the PCI bus the call
     * chain is rte_pci_probe() -> pci_probe_all_drivers() ->
     * rte_pci_probe_one_driver() -> rte_pci_map_device() && dr->probe(dr, dev);
     * once a device and a driver match, rte_pci_map_device() creates the
     * map resources for the device.
     */
    if (rte_bus_probe()) {
        rte_eal_init_alert("Cannot probe devices\n");
        rte_errno = ENOTSUP;
        return -1;
    }

#ifdef VFIO_PRESENT
    /* Register mp action after probe() so that we got enough info */
    if (rte_vfio_is_enabled("vfio") && vfio_mp_sync_setup() < 0)
        return -1;
#endif

    /* initialize default service/lcore mappings and start running. Ignore
     * -ENOTSUP, as it indicates no service coremask passed to EAL.
     */
    ret = rte_service_start_with_defaults();
    if (ret < 0 && ret != -ENOTSUP) {
        rte_errno = ENOEXEC;
        return -1;
    }

    rte_eal_mcfg_complete();

    return fctret;
}

/*
Read the files under /sys/bus/pci/devices/0000:00:09.0/uio/uio0/maps/map0/ to obtain the UIO device's map resources, and record them in the struct pci_map data structure.
Check that the PCI device and the UIO device report the same physical address on the memory bus. If they match, mmap a region of the /dev/uioX file and record the mapping in pci_map->addr and rte_pci_device->mem_resource[].addr.
Record the resource information of all UIO devices in struct mapped_pci_resource and hang it on the global list uio_res_list.
*/
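Before looking at the DPDK functions themselves, here is a hedged, stand-alone sketch of that sequence for a single map (the uio index and the map index are illustrative; the file names follow the uio sysfs ABI): read addr/size from sysfs, then mmap /dev/uioX, where the mmap offset selects which map to expose.

/*
 * Sketch of mapping one uio BAR, conceptually what
 * pci_uio_map_resource_by_index() does. The uio/map indices are illustrative.
 */
#include <fcntl.h>
#include <inttypes.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    uint64_t addr = 0, size = 0;
    FILE *f;
    void *va;
    int fd;

    /* physical address and size of map0, exported by the uio framework */
    f = fopen("/sys/class/uio/uio0/maps/map0/addr", "r");
    if (f) { fscanf(f, "%" SCNx64, &addr); fclose(f); }
    f = fopen("/sys/class/uio/uio0/maps/map0/size", "r");
    if (f) { fscanf(f, "%" SCNx64, &size); fclose(f); }

    fd = open("/dev/uio0", O_RDWR);
    if (fd < 0 || size == 0)
        return 1;

    /* for uio the mmap offset selects the map: map N <=> N * page size */
    va = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED,
              fd, 0 * getpagesize());
    if (va == MAP_FAILED)
        return 1;

    printf("BAR phys 0x%" PRIx64 " mapped at %p (%" PRIu64 " bytes)\n",
           addr, va, size);
    munmap(va, size);
    close(fd);
    return 0;
}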

/* Map pci device */
int
rte_pci_map_device(struct rte_pci_device *dev)
{
    int ret = -1;

    /* try mapping the NIC resources using VFIO if it exists */
    switch (dev->kdrv) {
    case RTE_KDRV_VFIO:
#ifdef VFIO_PRESENT
        if (pci_vfio_is_enabled())
            ret = pci_vfio_map_resource(dev);
#endif
        break;
    case RTE_KDRV_IGB_UIO:
    case RTE_KDRV_UIO_GENERIC:
        if (rte_eal_using_phys_addrs()) {
            /* map resources for devices that use uio */
            ret = pci_uio_map_resource(dev);
        }
        break;
    default:
        RTE_LOG(DEBUG, EAL,
            "  Not managed by a supported kernel driver, skipped\n");
        ret = 1;
        break;
    }

    return ret;
}

/* map the PCI resource of a PCI device in virtual memory */
int
pci_uio_map_resource(struct rte_pci_device *dev)
{
    int i, map_idx = 0, ret;
    uint64_t phaddr;
    struct mapped_pci_resource *uio_res = NULL;
    struct mapped_pci_res_list *uio_res_list =
        RTE_TAILQ_CAST(rte_uio_tailq.head, mapped_pci_res_list);

    dev->intr_handle.fd = -1;
    dev->intr_handle.uio_cfg_fd = -1;

    /* secondary processes - use already recorded details */
    if (rte_eal_process_type() != RTE_PROC_PRIMARY)
        return pci_uio_map_secondary(dev);

    /* allocate the uio resource: open the uio device to be managed */
    ret = pci_uio_alloc_resource(dev, &uio_res);
    if (ret)
        return ret;

   /* DPDK also needs to map the PCI device's BARs into user space:
      map all BARs */
    for (i = 0; i != PCI_MAX_RESOURCE; i++) {
        /* skip empty BAR */
        phaddr = dev->mem_resource[i].phys_addr;
        if (phaddr == 0)
            continue;

        ret = pci_uio_map_resource_by_index(dev, i,
                uio_res, map_idx);
        if (ret)
            goto error;

        map_idx++;
    }

    uio_res->nb_maps = map_idx;

    TAILQ_INSERT_TAIL(uio_res_list, uio_res, next);

    return 0;
error:
    for (i = 0; i < map_idx; i++) {
        pci_unmap_resource(uio_res->maps[i].addr,
                (size_t)uio_res->maps[i].size);
        rte_free(uio_res->maps[i].path);
    }
    pci_uio_free_resource(dev, uio_res);
    return -1;
}
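Once the BARs are mapped, programming the NIC is just memory access. Below is a small hedged sketch (the 18.x bus/pci header and the register offset are assumptions) of reading a 32-bit register through the BAR 0 address recorded in mem_resource[0].addr; ixgbe wraps exactly this pattern in IXGBE_READ_REG()/IXGBE_WRITE_REG().

/*
 * Sketch: read a 32-bit NIC register through the mapped BAR 0.
 * The offset passed in is illustrative; ixgbe wraps this pattern in
 * IXGBE_READ_REG()/IXGBE_WRITE_REG().
 */
#include <stdint.h>
#include <rte_bus_pci.h>

static inline uint32_t
bar0_read32(const struct rte_pci_device *dev, uint32_t reg_off)
{
    volatile uint32_t *reg =
        (volatile uint32_t *)((uint8_t *)dev->mem_resource[0].addr + reg_off);

    return *reg;
}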

int
pci_uio_alloc_resource(struct rte_pci_device *dev,
        struct mapped_pci_resource **uio_res)
{
    char devname[PATH_MAX]; /* contains the /dev/uioX */
    struct rte_pci_addr *loc;

    loc = &dev->addr;

    snprintf(devname, sizeof(devname), "/dev/uio@pci:%u:%u:%u",
            dev->addr.bus, dev->addr.devid, dev->addr.function);

    if (access(devname, O_RDWR) < 0) {
        RTE_LOG(WARNING, EAL, "  "PCI_PRI_FMT" not managed by UIO driver, "
                "skipping\n", loc->domain, loc->bus, loc->devid, loc->function);
        return 1;
    }

    /* save fd if in primary process;
     * this fd also initializes the PCI device's interrupt handle */
    dev->intr_handle.fd = open(devname, O_RDWR);
    if (dev->intr_handle.fd < 0) {
        RTE_LOG(ERR, EAL, "Cannot open %s: %s\n",
            devname, strerror(errno));
        goto error;
    }
    dev->intr_handle.type = RTE_INTR_HANDLE_UIO;

    /* allocate the mapping details for secondary processes*/
    *uio_res = rte_zmalloc("UIO_RES", sizeof(**uio_res), 0);
    if (*uio_res == NULL) {
        RTE_LOG(ERR, EAL,
            "%s(): cannot store uio mmap details\n", __func__);
        goto error;
    }

    snprintf((*uio_res)->path, sizeof((*uio_res)->path), "%s", devname);
    memcpy(&(*uio_res)->pci_addr, &dev->addr, sizeof((*uio_res)->pci_addr));

    return 0;

error:
    pci_uio_free_resource(dev, *uio_res);
    return -1;
}

Step 4 again, in more detail: rte_bus_probe() calls rte_eth_dev_create() and eth_ixgbe_pci_probe() to initialize the PCI device, and eth_ixgbe_dev_init() allocates a NIC device and reserves its data region in the global array rte_eth_dev_data[].

  At the same time the NIC device is initialized: first its operations table and its receive/transmit functions are set up.

An interrupt handler is also registered:

rte_intr_callback_register(&(pci_dev->intr_handle),
                eth_igb_interrupt_handler, (void *)eth_dev);

int __rte_experimental
rte_eth_dev_create(struct rte_device *device, const char *name,
    size_t priv_data_size,
    ethdev_bus_specific_init ethdev_bus_specific_init,
    void *bus_init_params,
    ethdev_init_t ethdev_init, void *init_params)
{
    struct rte_eth_dev *ethdev;
    int retval;

    RTE_FUNC_PTR_OR_ERR_RET(*ethdev_init, -EINVAL);

    if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
        /* allocate an ethdev in the global rte_eth_devices[] array and
         * reserve its data region in the global rte_eth_dev_data[] array */
        ethdev = rte_eth_dev_allocate(name);
        if (!ethdev)
            return -ENODEV;

        if (priv_data_size) {
            ethdev->data->dev_private = rte_zmalloc_socket(
                name, priv_data_size, RTE_CACHE_LINE_SIZE,
                device->numa_node);

            if (!ethdev->data->dev_private) {
                RTE_LOG(ERR, EAL, "failed to allocate private data");
                retval = -ENOMEM;
                goto data_alloc_failed;
            }
        }
    } else {
        ethdev = rte_eth_dev_attach_secondary(name);
        if (!ethdev) {
            RTE_LOG(ERR, EAL, "secondary process attach failed, "
                "ethdev doesn't exist");
            return  -ENODEV;
        }
    }

    ethdev->device = device;

    if (ethdev_bus_specific_init) {
        retval = ethdev_bus_specific_init(ethdev, bus_init_params);
        if (retval) {
            RTE_LOG(ERR, EAL,
                "ethdev bus specific initialisation failed");
            goto probe_failed;
        }
    }

    retval = ethdev_init(ethdev, init_params);
    if (retval) {
        RTE_LOG(ERR, EAL, "ethdev initialisation failed");
        goto probe_failed;
    }

    rte_eth_dev_probing_finish(ethdev);

    return retval;
probe_failed:
    /* free ports private data if primary process */
    if (rte_eal_process_type() == RTE_PROC_PRIMARY)
        rte_free(ethdev->data->dev_private);

data_alloc_failed:
    rte_eth_dev_release_port(ethdev);

    return retval;
}


/*
 * This function is based on code in ixgbe_attach() in base/ixgbe.c.
 * It returns 0 on success.
 */
static int
eth_ixgbe_dev_init(struct rte_eth_dev *eth_dev, void *init_params __rte_unused)
{
    struct rte_pci_device *pci_dev = RTE_ETH_DEV_TO_PCI(eth_dev);
    struct rte_intr_handle *intr_handle = &pci_dev->intr_handle;
    struct ixgbe_hw *hw =
        IXGBE_DEV_PRIVATE_TO_HW(eth_dev->data->dev_private);
    struct ixgbe_vfta *shadow_vfta =
        IXGBE_DEV_PRIVATE_TO_VFTA(eth_dev->data->dev_private);
    struct ixgbe_hwstrip *hwstrip =
        IXGBE_DEV_PRIVATE_TO_HWSTRIP_BITMAP(eth_dev->data->dev_private);
    struct ixgbe_dcb_config *dcb_config =
        IXGBE_DEV_PRIVATE_TO_DCB_CFG(eth_dev->data->dev_private);
    struct ixgbe_filter_info *filter_info =
        IXGBE_DEV_PRIVATE_TO_FILTER_INFO(eth_dev->data->dev_private);
    struct ixgbe_bw_conf *bw_conf =
        IXGBE_DEV_PRIVATE_TO_BW_CONF(eth_dev->data->dev_private);
    uint32_t ctrl_ext;
    uint16_t csum;
    int diag, i;

    PMD_INIT_FUNC_TRACE();

    /* install the device operations table and the receive/transmit burst functions */

    eth_dev->dev_ops = &ixgbe_eth_dev_ops;
    eth_dev->rx_pkt_burst = &ixgbe_recv_pkts;
    eth_dev->tx_pkt_burst = &ixgbe_xmit_pkts;
    eth_dev->tx_pkt_prepare = &ixgbe_prep_pkts;

    /*
     * For secondary processes, we don't initialise any further as primary
     * has already done this work. Only check we don't need a different
     * RX and TX function.
     */
    if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
        struct ixgbe_tx_queue *txq;
        /* TX queue function in primary, set by last queue initialized
         * Tx queue may not initialized by primary process
         */
        if (eth_dev->data->tx_queues) {
            txq = eth_dev->data->tx_queues[eth_dev->data->nb_tx_queues-1];
            ixgbe_set_tx_function(eth_dev, txq);
        } else {
            /* Use default TX function if we get here */
            PMD_INIT_LOG(NOTICE, "No TX queues configured yet. "
                     "Using default TX function.");
        }

        ixgbe_set_rx_function(eth_dev);

        return 0;
    }

    rte_eth_copy_pci_info(eth_dev, pci_dev);

    /* Vendor and Device ID need to be set before init of shared code */
    hw->device_id = pci_dev->id.device_id;
    hw->vendor_id = pci_dev->id.vendor_id;
    hw->hw_addr = (void *)pci_dev->mem_resource[0].addr;
    hw->allow_unsupported_sfp = 1;

    /* Initialize the shared code (base driver) */
#ifdef RTE_LIBRTE_IXGBE_BYPASS
    diag = ixgbe_bypass_init_shared_code(hw);
#else
    diag = ixgbe_init_shared_code(hw);
#endif /* RTE_LIBRTE_IXGBE_BYPASS */

    if (diag != IXGBE_SUCCESS) {
        PMD_INIT_LOG(ERR, "Shared code init failed: %d", diag);
        return -EIO;
    }

    /* pick up the PCI bus settings for reporting later */
    ixgbe_get_bus_info(hw);

    /* Unlock any pending hardware semaphore */
    ixgbe_swfw_lock_reset(hw);

#ifdef RTE_LIBRTE_SECURITY
    /* Initialize security_ctx only for primary process*/
    if (ixgbe_ipsec_ctx_create(eth_dev))
        return -ENOMEM;
#endif

    /* Initialize DCB configuration*/
    memset(dcb_config, 0, sizeof(struct ixgbe_dcb_config));
    ixgbe_dcb_init(hw, dcb_config);
    /* Get Hardware Flow Control setting */
    hw->fc.requested_mode = ixgbe_fc_full;
    hw->fc.current_mode = ixgbe_fc_full;
    hw->fc.pause_time = IXGBE_FC_PAUSE;
    for (i = 0; i < IXGBE_DCB_MAX_TRAFFIC_CLASS; i++) {
        hw->fc.low_water[i] = IXGBE_FC_LO;
        hw->fc.high_water[i] = IXGBE_FC_HI;
    }
    hw->fc.send_xon = 1;

    /* Make sure we have a good EEPROM before we read from it */
    diag = ixgbe_validate_eeprom_checksum(hw, &csum);
    if (diag != IXGBE_SUCCESS) {
        PMD_INIT_LOG(ERR, "The EEPROM checksum is not valid: %d", diag);
        return -EIO;
    }

#ifdef RTE_LIBRTE_IXGBE_BYPASS
    diag = ixgbe_bypass_init_hw(hw);
#else
    diag = ixgbe_init_hw(hw);
#endif /* RTE_LIBRTE_IXGBE_BYPASS */

    /*
     * Devices with copper phys will fail to initialise if ixgbe_init_hw()
     * is called too soon after the kernel driver unbinding/binding occurs.
     * The failure occurs in ixgbe_identify_phy_generic() for all devices,
     * but for non-copper devies, ixgbe_identify_sfp_module_generic() is
     * also called. See ixgbe_identify_phy_82599(). The reason for the
     * failure is not known, and only occuts when virtualisation features
     * are disabled in the bios. A delay of 100ms  was found to be enough by
     * trial-and-error, and is doubled to be safe.
     */
    if (diag && (hw->mac.ops.get_media_type(hw) == ixgbe_media_type_copper)) {
        rte_delay_ms(200);
        diag = ixgbe_init_hw(hw);
    }

    if (diag == IXGBE_ERR_SFP_NOT_PRESENT)
        diag = IXGBE_SUCCESS;

    if (diag == IXGBE_ERR_EEPROM_VERSION) {
        PMD_INIT_LOG(ERR, "This device is a pre-production adapter/"
                 "LOM.  Please be aware there may be issues associated "
                 "with your hardware.");
        PMD_INIT_LOG(ERR, "If you are experiencing problems "
                 "please contact your Intel or hardware representative "
                 "who provided you with this hardware.");
    } else if (diag == IXGBE_ERR_SFP_NOT_SUPPORTED)
        PMD_INIT_LOG(ERR, "Unsupported SFP+ Module");
    if (diag) {
        PMD_INIT_LOG(ERR, "Hardware Initialization Failure: %d", diag);
        return -EIO;
    }

    /* Reset the hw statistics */
    ixgbe_dev_stats_reset(eth_dev);

    /* disable interrupt */
    ixgbe_disable_intr(hw);

    /* reset mappings for queue statistics hw counters*/
    ixgbe_reset_qstat_mappings(hw);

    /* Allocate memory for storing MAC addresses */
    eth_dev->data->mac_addrs = rte_zmalloc("ixgbe", ETHER_ADDR_LEN *
                           hw->mac.num_rar_entries, 0);
    if (eth_dev->data->mac_addrs == NULL) {
        PMD_INIT_LOG(ERR,
                 "Failed to allocate %u bytes needed to store "
                 "MAC addresses",
                 ETHER_ADDR_LEN * hw->mac.num_rar_entries);
        return -ENOMEM;
    }
    /* Copy the permanent MAC address */
    ether_addr_copy((struct ether_addr *) hw->mac.perm_addr,
            &eth_dev->data->mac_addrs[0]);

    /* Allocate memory for storing hash filter MAC addresses */
    eth_dev->data->hash_mac_addrs = rte_zmalloc("ixgbe", ETHER_ADDR_LEN *
                            IXGBE_VMDQ_NUM_UC_MAC, 0);
    if (eth_dev->data->hash_mac_addrs == NULL) {
        PMD_INIT_LOG(ERR,
                 "Failed to allocate %d bytes needed to store MAC addresses",
                 ETHER_ADDR_LEN * IXGBE_VMDQ_NUM_UC_MAC);
        return -ENOMEM;
    }

    /* initialize the vfta */
    memset(shadow_vfta, 0, sizeof(*shadow_vfta));

    /* initialize the hw strip bitmap*/
    memset(hwstrip, 0, sizeof(*hwstrip));

    /* initialize PF if max_vfs not zero */
    ixgbe_pf_host_init(eth_dev);

    ctrl_ext = IXGBE_READ_REG(hw, IXGBE_CTRL_EXT);
    /* let hardware know driver is loaded */
    ctrl_ext |= IXGBE_CTRL_EXT_DRV_LOAD;
    /* Set PF Reset Done bit so PF/VF Mail Ops can work */
    ctrl_ext |= IXGBE_CTRL_EXT_PFRSTD;
    IXGBE_WRITE_REG(hw, IXGBE_CTRL_EXT, ctrl_ext);
    IXGBE_WRITE_FLUSH(hw);

    if (ixgbe_is_sfp(hw) && hw->phy.sfp_type != ixgbe_sfp_type_not_present)
        PMD_INIT_LOG(DEBUG, "MAC: %d, PHY: %d, SFP+: %d",
                 (int) hw->mac.type, (int) hw->phy.type,
                 (int) hw->phy.sfp_type);
    else
        PMD_INIT_LOG(DEBUG, "MAC: %d, PHY: %d",
                 (int) hw->mac.type, (int) hw->phy.type);

    PMD_INIT_LOG(DEBUG, "port %d vendorID=0x%x deviceID=0x%x",
             eth_dev->data->port_id, pci_dev->id.vendor_id,
             pci_dev->id.device_id);
    /* register the interrupt handler */
    rte_intr_callback_register(intr_handle,
                   ixgbe_dev_interrupt_handler, eth_dev);

    /* enable uio/vfio intr/eventfd mapping */
    rte_intr_enable(intr_handle);

    /* enable support intr */
    ixgbe_enable_intr(eth_dev);

    /* initialize filter info */
    memset(filter_info, 0,
           sizeof(struct ixgbe_filter_info));

    /* initialize 5tuple filter list */
    TAILQ_INIT(&filter_info->fivetuple_list);

    /* initialize flow director filter list & hash */
    ixgbe_fdir_filter_init(eth_dev);

    /* initialize l2 tunnel filter list & hash */
    ixgbe_l2_tn_filter_init(eth_dev);

    /* initialize flow filter lists */
    ixgbe_filterlist_init();

    /* initialize bandwidth configuration info */
    memset(bw_conf, 0, sizeof(struct ixgbe_bw_conf));

    /* initialize Traffic Manager configuration */
    ixgbe_tm_conf_init(eth_dev);

    return 0;
}

Original article: https://www.cnblogs.com/codestack/p/15674071.html

在用户画像系列——当我们聊用户画像&#xff0c;我们在聊什么&#xff1f; 介绍了用户画像的应用场景: (1)个性化推荐 通过用户标签给用户推荐合适的商品或者内容 (2)营销圈选 参考&#xff1a;用户画像系列——Lookalike在营销圈选扩量中的应用 (3)策略引擎 根据用户标…