Preface
The kernel prints a flood of "AMD-Vi: Completion-Wait loop timed out" messages, accompanied by soft lockups or RCU CPU stalls, as shown below:
Dec 8 10:02:17 kernel: AMD-Vi: Completion-Wait loop timed out
Dec 8 10:02:17 kernel: AMD-Vi: Completion-Wait loop timed out
Dec 8 10:02:17 kernel: AMD-Vi: Completion-Wait loop timed out
Dec 8 10:02:17 kernel: AMD-Vi: Completion-Wait loop timed out
Dec 8 10:02:18 kernel: AMD-Vi: Completion-Wait loop timed out
Dec 8 10:02:18 kernel: AMD-Vi: Completion-Wait loop timed out
Dec 8 10:02:18 kernel: AMD-Vi: Completion-Wait loop timed out
Dec 8 10:02:18 kernel: AMD-Vi: Completion-Wait loop timed out
Dec 8 10:02:18 kernel: AMD-Vi: Completion-Wait loop timed out
Dec 8 10:02:19 kernel: AMD-Vi: Completion-Wait loop timed out
Dec 8 10:02:19 kernel: AMD-Vi: Completion-Wait loop timed out
Dec 8 10:02:19 kernel: AMD-Vi: Completion-Wait loop timed out
Dec 8 10:02:19 kernel: AMD-Vi: Completion-Wait loop timed out
Dec 8 10:02:19 kernel: AMD-Vi: Completion-Wait loop timed out
Dec 8 10:02:20 kernel: AMD-Vi: Completion-Wait loop timed out
Dec 8 10:02:20 kernel: AMD-Vi: Completion-Wait loop timed out
Dec 8 10:02:20 kernel: AMD-Vi: Completion-Wait loop timed out
Dec 8 10:02:20 kernel: AMD-Vi: Completion-Wait loop timed out
Dec 8 10:02:20 kernel: AMD-Vi: Completion-Wait loop timed out
Dec 8 10:02:21 kernel: AMD-Vi: Completion-Wait loop timed out
Dec 8 10:02:21 kernel: AMD-Vi: Completion-Wait loop timed out
Dec 8 10:02:21 kernel: AMD-Vi: Completion-Wait loop timed out
Dec 8 10:02:21 kernel: AMD-Vi: Completion-Wait loop timed out
Dec 8 10:02:21 kernel: AMD-Vi: Completion-Wait loop timed out
Dec 8 10:02:22 kernel: AMD-Vi: Completion-Wait loop timed out
Dec 8 10:02:22 kernel: AMD-Vi: Completion-Wait loop timed out
Dec 8 10:02:22 kernel: AMD-Vi: Completion-Wait loop timed out
Dec 8 10:02:22 kernel: AMD-Vi: Completion-Wait loop timed out
Dec 8 10:02:22 kernel: AMD-Vi: Completion-Wait loop timed out
Dec 8 10:02:22 kernel: watchdog: BUG: soft lockup - CPU#46 stuck for 22s! [swapper/46:0]
Dec 8 10:02:22 kernel: CPU: 46 PID: 0 Comm: swapper/46 Tainted: G L 5.10.128 #2
Dec 8 10:02:22 kernel: RIP: 0010:_raw_spin_unlock_irqrestore+0x15/0x20
Dec 8 10:02:22 kernel: Call Trace:
Dec 8 10:02:22 kernel: <IRQ>
Dec 8 10:02:22 kernel: amd_iommu_flush_iotlb_all+0x4e/0x60
Dec 8 10:02:22 kernel: iommu_dma_flush_iotlb_all+0x1d/0x20
Dec 8 10:02:22 kernel: iova_domain_flush+0x1e/0x30
Dec 8 10:02:22 kernel: fq_flush_timeout+0x39/0xb0
Dec 8 10:02:22 kernel: ? fq_ring_free+0x110/0x110
Dec 8 10:02:22 kernel: call_timer_fn+0x2e/0x100
Dec 8 10:02:22 kernel: __run_timers.part.0+0x1de/0x260
Dec 8 10:02:22 kernel: ? clockevents_program_event+0x8f/0xe0
Dec 8 10:02:22 kernel: ? tick_program_event+0x41/0x80
Dec 8 10:02:22 kernel: run_timer_softirq+0x2a/0x50
Dec 8 10:02:22 kernel: __do_softirq+0xce/0x281
Dec 8 10:02:22 kernel: asm_call_irq_on_stack+0x12/0x20
Dec 8 10:02:22 kernel: </IRQ>
Dec 8 10:02:22 kernel: do_softirq_own_stack+0x3d/0x50
Dec 8 10:02:22 kernel: irq_exit_rcu+0xc5/0x100
Dec 8 10:02:22 kernel: sysvec_apic_timer_interrupt+0x3d/0x90
Dec 8 10:02:22 kernel: asm_sysvec_apic_timer_interrupt+0x12/0x20
Dec 8 10:02:22 kernel: RIP: 0010:native_safe_halt+0xe/0x10
A diligent reader can quickly google their way to the following link:
https://support.lenovo.com/us/en/solutions/tt1512-thinksystem-server-with-amd-processor-running-linux-may-hang-or-crash-with-kernel-message-amd-vi-completion-wait-loop-timed-out
However, it does not explain why the machine hits a soft lockup, nor why the soft lockup keeps happening on the same CPU.
Where the timed-out log comes from
The message comes from COMPLETION_WAIT, a command in the AMD IOMMU architecture; see section 2.4.1 COMPLETION_WAIT of the specification (linked at the end of this article):
The COMPLETION_WAIT command allows software to serialize itself with IOMMU command processing. The COMPLETION_WAIT command does not finish until all older commands issued since a prior COMPLETION_WAIT have completely executed.
The command's fields also describe how its completion is signaled: when the command completes, the IOMMU writes cmd.store_data to the address given in cmd.store_addr. See the code:
5.10.128 iommu_completion_wait()
---
	data = ++iommu->cmd_sem_val;
	build_completion_wait(&cmd, iommu, data);

	ret = __iommu_queue_command_sync(iommu, &cmd, false);
	if (ret)
		goto out_unlock;

	ret = wait_on_sem(iommu, data);
---
build_completion_wait()
---
	u64 paddr = iommu_virt_to_phys((void *)iommu->cmd_sem);

	memset(cmd, 0, sizeof(*cmd));
	cmd->data[0] = lower_32_bits(paddr) | CMD_COMPL_WAIT_STORE_MASK;
	cmd->data[1] = upper_32_bits(paddr);
	cmd->data[2] = data;
	CMD_SET_TYPE(cmd, CMD_COMPL_WAIT);
---
wait_on_sem()
---
	while (*iommu->cmd_sem != data && i < LOOP_TIMEOUT) {
		udelay(1);
		i += 1;
	}

	if (i == LOOP_TIMEOUT) {
		pr_alert("Completion-Wait loop timed out\n");
		return -EIO;
	}
---
This log message therefore means that a COMPLETION_WAIT command did not complete within the polling window.
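The polling window itself is easy to quantify. Below is a minimal user-space sketch of the wait_on_sem() pattern above, assuming LOOP_TIMEOUT is 100000 as in the 5.10 driver (100000 iterations of a ~1 us delay, i.e. roughly 100 ms). The volatile variable stands in for iommu->cmd_sem, which only the IOMMU hardware would update; since nothing updates it here, the loop always burns the full window and times out, just like an IOMMU that never answers the COMPLETION_WAIT.
---
#include <stdio.h>
#include <stdint.h>
#include <unistd.h>

#define LOOP_TIMEOUT 100000		/* assumed to match the 5.10 driver */

static int wait_on_sem(volatile uint64_t *sem, uint64_t data)
{
	int i = 0;

	while (*sem != data && i < LOOP_TIMEOUT) {
		usleep(1);		/* stands in for udelay(1) */
		i += 1;
	}

	if (i == LOOP_TIMEOUT) {
		fprintf(stderr, "Completion-Wait loop timed out\n");
		return -1;		/* -EIO in the kernel */
	}
	return 0;
}

int main(void)
{
	volatile uint64_t sem = 0;

	/* Nothing ever stores 1 to sem, so this spends at least ~100 ms
	 * polling and then reports the timeout. */
	return wait_on_sem(&sem, 1) ? 1 : 0;
}
---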
So why did it never complete? I did not find a definitive answer, but the following patch offers a hint:
iommu/amd: flush IOTLB for specific domains only (v2) - Patchwork
That patch addresses "AMD-Vi: Completion-Wait loop timed out" by reducing the number of domain_flush_tlb() calls; this suggests that issuing too many TLB-flush-type operations can cause COMPLETION_WAIT commands to time out.
Linux Kernel Watchdog
The Linux kernel has two watchdogs, used to detect soft lockups and hard lockups respectively:
- soft lockup: detects the case where this CPU can no longer schedule tasks (but still services interrupts);
- hard lockup: detects the case where this CPU can no longer even respond to interrupts.
For the soft lockup watchdog:
- the dog is fed by the highest-priority task on that CPU, as follows:
5.10.128 watchdog_enable()
---
	hrtimer_init(hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL_HARD);
	hrtimer->function = watchdog_timer_fn;
	hrtimer_start(hrtimer, ns_to_ktime(sample_period),
		      HRTIMER_MODE_REL_PINNED_HARD);
---
watchdog_timer_fn()
  -> stop_one_cpu_nowait(smp_processor_id(), softlockup_fn, NULL,
			 this_cpu_ptr(&softlockup_stop_work));

softlockup_fn()
  -> update_touch_ts()
     -> __this_cpu_write(watchdog_touch_ts, get_timestamp());
A timer periodically hands softlockup_fn() to the stop machine for execution. Note in particular that stop_one_cpu_nowait() hands the callback to the highest-priority scheduling class on that CPU, i.e. stop_sched_class. See:
5.10.128
const struct sched_class stop_sched_class
	__section("__stop_sched_class") = {
	.enqueue_task		= enqueue_task_stop,
	.dequeue_task		= dequeue_task_stop,
	...
};

#define SCHED_DATA				\
	STRUCT_ALIGN();				\
	__begin_sched_classes = .;		\
	*(__idle_sched_class)			\
	*(__fair_sched_class)			\
	*(__rt_sched_class)			\
	*(__dl_sched_class)			\
	*(__stop_sched_class)			\
	__end_sched_classes = .;
If even the highest-priority task on the CPU cannot be scheduled to feed the dog, the CPU can be considered paralyzed, unable to schedule anything, and the watchdog barks.
The barking itself is also driven by the timer; note that this is an hrtimer, which runs in interrupt context:
watchdog_timer_fn()
  -> is_softlockup()
     ---
     if (time_after(now, touch_ts + get_softlockup_thresh()))
     	return now - touch_ts;
     ---
  -> pr_emerg("BUG: soft lockup - CPU#%d stuck for %us! [%s:%d]\n",
	      smp_processor_id(), duration,
	      current->comm, task_pid_nr(current));
So far we know the soft lockup detector is driven by an hrtimer. What if interrupts are disabled? That case is handled by the hard lockup detector, which is based on NMIs:
hardlockup_detector_event_create()
---
	/* Try to register using hardware perf events */
	evt = perf_event_create_kernel_counter(wd_attr, cpu, NULL,
					       watchdog_overflow_callback, NULL);
---
watchdog_overflow_callback()
  -> is_hardlockup()
     ---
     unsigned long hrint = __this_cpu_read(hrtimer_interrupts);

     if (__this_cpu_read(hrtimer_interrupts_saved) == hrint)
     	return true;

     __this_cpu_write(hrtimer_interrupts_saved, hrint);
     return false;
     ---

watchdog_timer_fn()
  -> watchdog_interrupt_count()
The hard lockup detector is driven by the NMI of a perf event; it is fed by the soft lockup hrtimer, which runs in interrupt context.
The hard lockup timeout is watchdog_thresh (/proc/sys/kernel/watchdog_thresh), typically 10 seconds; the soft lockup threshold is twice the hard lockup one.
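As a hedged sketch of that arithmetic (constants taken from kernel/watchdog.c in 5.10: watchdog_thresh defaults to 10, the soft lockup threshold is twice that, and the detection hrtimer fires every softlockup_thresh/5 seconds), the snippet below also shows why the report says "stuck for 22s" even though the threshold is 20 s: the hrtimer only samples every few seconds, so the first sample past the threshold lands a bit beyond it.
---
#include <stdio.h>

int main(void)
{
	int watchdog_thresh   = 10;			/* /proc/sys/kernel/watchdog_thresh, default 10 */

	int hardlockup_thresh = watchdog_thresh;	/* NMI watchdog window, ~10 s        */
	int softlockup_thresh = watchdog_thresh * 2;	/* get_softlockup_thresh(), 20 s     */
	int sample_period_s   = softlockup_thresh / 5;	/* watchdog hrtimer period, 4 s      */

	printf("hard lockup reported after ~%d s, soft lockup after ~%d s\n",
	       hardlockup_thresh, softlockup_thresh);
	printf("the detection hrtimer samples every %d s, so the first report lands a few seconds past %d s\n",
	       sample_period_s, softlockup_thresh);
	return 0;
}
---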
fq_flush_timeout()
When the kernel switches address spaces during a task switch, the TLB is flushed, as follows:
context_switch()
  -> switch_mm_irqs_off()
     -> choose_new_asid(next, next_tlb_gen, &new_asid, &need_flush);
     -> load_new_mm_cr3(next->pgd, new_asid, true);
        ---
        if (need_flush) {
        	invalidate_user_asid(new_asid);
        	new_mm_cr3 = build_cr3(pgdir, new_asid);
        }
        ---
After an address-space switch, the same virtual address most likely maps to a different physical address, so the relevant TLB entries must all be flushed.
The IOMMU has an analogous mechanism, and fq_flush_timeout() plays exactly that role.
In this case, every soft lockup call stack lands in that function, so let us take a quick look at how it works.
The IOMMU flush queue works as follows:
- queue_iova() puts the address to be unmapped into the flush queue; the corresponding queue entry records a sequence number;
- it also arms a timer (an ordinary timer whose callback runs in softirq context) with fq_flush_timeout() as the callback and a 10 ms timeout;
- when the timer expires, fq_flush_timeout() calls iova_domain_flush() to flush all IOMMU TLB entries and bumps the sequence number;
- it then frees all flush queue entries whose sequence number is covered by the completed flush.
See the following code:
queue_iova()
---
	spin_lock_irqsave(&fq->lock, flags);

	fq_ring_free(iovad, fq);

	if (fq_full(fq)) {
		iova_domain_flush(iovad);
		fq_ring_free(iovad, fq);
	}

	idx = fq_ring_add(fq);

	fq->entries[idx].iova_pfn = pfn;
	fq->entries[idx].pages    = pages;
	fq->entries[idx].data     = data;
	fq->entries[idx].counter  = atomic64_read(&iovad->fq_flush_start_cnt);

	spin_unlock_irqrestore(&fq->lock, flags);
	-----------------------------------------------------------------------> STEP 1

	if (!atomic_read(&iovad->fq_timer_on) &&
	    !atomic_xchg(&iovad->fq_timer_on, 1))
		mod_timer(&iovad->fq_timer,
			  jiffies + msecs_to_jiffies(IOVA_FQ_TIMEOUT));
	-----------------------------------------------------------------------> STEP 2
---
fq_flush_timeout()
---
	atomic_set(&iovad->fq_timer_on, 0);
	iova_domain_flush(iovad);
	-----------------------------------------------------------------------> STEP 3

	for_each_possible_cpu(cpu) {
		unsigned long flags;
		struct iova_fq *fq;

		fq = per_cpu_ptr(iovad->fq, cpu);
		spin_lock_irqsave(&fq->lock, flags);
		fq_ring_free(iovad, fq);
		spin_unlock_irqrestore(&fq->lock, flags);
	}
	-----------------------------------------------------------------------> STEP 4
---
The flush queue is per-CPU; fq_flush_timeout() performs a single iova_domain_flush() for the whole domain and then frees the iovas sitting on every CPU's flush queue in one batch.
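To make the mechanism more tangible, here is a minimal user-space sketch of a single flush queue, loosely modeled on the 5.10 code in drivers/iommu/iova.c (IOVA_FQ_SIZE and the start/finish counters are taken from there; locking, the per-CPU aspect and the timer are omitted, so this is an illustration rather than the real implementation). It demonstrates the slow path that becomes important later: when the ring fills up because fq_flush_timeout() has not run, queue_iova() flushes the whole domain itself.
---
#include <stdio.h>
#include <stdint.h>

#define IOVA_FQ_SIZE 256	/* as in the 5.10 driver */

struct iova_fq_entry {
	unsigned long iova_pfn;
	unsigned long pages;
	uint64_t counter;	/* fq_flush_start_cnt at queueing time */
};

struct iova_fq {
	struct iova_fq_entry entries[IOVA_FQ_SIZE];
	unsigned int head, tail;
};

static uint64_t fq_flush_start_cnt, fq_flush_finish_cnt;
static unsigned long flushes_from_queue_iova;

static int fq_full(struct iova_fq *fq)
{
	return ((fq->tail + 1) % IOVA_FQ_SIZE) == fq->head;
}

static void iova_domain_flush(void)
{
	fq_flush_start_cnt++;
	/* the real code flushes the whole IOMMU TLB here, which needs a COMPLETION_WAIT */
	fq_flush_finish_cnt++;
}

static void fq_ring_free(struct iova_fq *fq)
{
	/* free only entries queued before the last completed flush */
	while (fq->head != fq->tail &&
	       fq->entries[fq->head].counter < fq_flush_finish_cnt)
		fq->head = (fq->head + 1) % IOVA_FQ_SIZE;	/* free_iova_fast() in the kernel */
}

static void queue_iova(struct iova_fq *fq, unsigned long pfn, unsigned long pages)
{
	fq_ring_free(fq);

	if (fq_full(fq)) {		/* slow path: this CPU flushes on its own */
		iova_domain_flush();
		flushes_from_queue_iova++;
		fq_ring_free(fq);
	}

	fq->entries[fq->tail] = (struct iova_fq_entry){ pfn, pages, fq_flush_start_cnt };
	fq->tail = (fq->tail + 1) % IOVA_FQ_SIZE;
	/* the real code also arms the 10 ms fq_flush_timeout() timer here */
}

int main(void)
{
	static struct iova_fq fq;

	/* unmap 1000 pages without fq_flush_timeout() ever getting to run */
	for (unsigned long pfn = 0; pfn < 1000; pfn++)
		queue_iova(&fq, pfn, 1);

	printf("flushes triggered directly from queue_iova(): %lu\n",
	       flushes_from_queue_iova);
	return 0;
}
---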
Call stack analysis
With the background in place, let us look at the soft lockup call stacks seen in this case (all trimmed for brevity):
[Call stack A]
kernel: watchdog: BUG: soft lockup - CPU#46 stuck for 22s! [swapper/46:0]
kernel: CPU: 46 PID: 0 Comm: swapper/46 Not tainted 5.10.128 #2
kernel: RIP: 0010:_raw_spin_unlock_irqrestore+0x15/0x20
kernel: fq_flush_timeout+0x79/0xb0

[Call stack B]
kernel: watchdog: BUG: soft lockup - CPU#46 stuck for 22s! [swapper/46:0]
kernel: CPU: 46 PID: 0 Comm: swapper/46 Tainted: G L 5.10.128 #2
kernel: RIP: 0010:_raw_spin_unlock_irqrestore+0x15/0x20
kernel: amd_iommu_flush_iotlb_all+0x4e/0x60
kernel: iommu_dma_flush_iotlb_all+0x1d/0x20
kernel: iova_domain_flush+0x1e/0x30
kernel: fq_flush_timeout+0x39/0xb0
These two call stacks kept recurring for about two hours, always on CPU 46. See the code:
fq_flush_timeout()
---
	atomic_set(&iovad->fq_timer_on, 0);
	iova_domain_flush(iovad);
	  -> iommu_dma_flush_iotlb_all()
	     -> amd_iommu_flush_iotlb_all()
	        ---
	        spin_lock_irqsave(&dom->lock, flags);
	        domain_flush_tlb_pde(dom);
	        domain_flush_complete(dom);
	        spin_unlock_irqrestore(&dom->lock, flags); -----> [Call stack B]
	        ---

	for_each_possible_cpu(cpu) {
		unsigned long flags;
		struct iova_fq *fq;

		fq = per_cpu_ptr(iovad->fq, cpu);
		spin_lock_irqsave(&fq->lock, flags);
		fq_ring_free(iovad, fq);
		spin_unlock_irqrestore(&fq->lock, flags); ------> [Call stack A]
	}
---
From this we can draw the following conclusions:
- call stacks A and B appear alternately and repeatedly, which shows fq_flush_timeout() does return and is then entered again; it is not stuck inside a single invocation;
- the RIP of the soft lockup stacks sits in _raw_spin_unlock_irqrestore() because that is where interrupts are re-enabled, and only then can the soft lockup hrtimer fire;
- in call stack B the soft lockup is reported inside a spin_lock_irqsave() section, yet no hard lockup is reported; this means the ~20 seconds were not spent inside a single iova_domain_flush() with interrupts disabled.
Putting these together, we can further infer that the soft lockup occurs because the CPU keeps re-entering fq_flush_timeout(). Why would that happen?
First, look at the softirq handler:
__do_softirq()
---
	pending = local_softirq_pending();

	if (pending) {
		if (time_before(jiffies, end) && !need_resched() &&
		    --max_restart)
			goto restart;

		wakeup_softirqd();
	}
---
This loop both checks need_resched() and is bounded by max_restart, so the problem is not here.
Next, look at the timer expiry handler:
__run_timers()
---
	while (time_after_eq(jiffies, base->clk) &&
	       time_after_eq(jiffies, base->next_expiry)) {
		levels = collect_expired_timers(base, heads);
		base->clk++;
		base->next_expiry = __next_timer_interrupt(base);

		while (levels--)
			expire_timers(base, heads + levels);
	}
---
The timer callback here will be invoked over and over as long as the following conditions hold:
- the timer is re-armed and expires again before timer->fn() has finished; this condition is satisfied here, because:
  - the timeout of fq_flush_timeout() is 10 ms,
  - each "AMD-Vi: Completion-Wait loop timed out" message means one iommu_completion_wait() took at least 100 ms,
  - once fq_flush_timeout() has started (and cleared fq_timer_on), queue_iova() can arm the timer again;
- the timer must keep being enqueued on the same CPU, i.e. the same timer_base. Is that the case? See the following code:
__mod_timer()
---
	base = lock_timer_base(timer, &flags);
	...
	new_base = get_target_base(base, timer->flags);

	if (base != new_base) {
		/*
		 * We are trying to schedule the timer on the new base.
		 * However we can't change timer's base while it is running,
		 * otherwise del_timer_sync() can't detect that the timer's
		 * handler yet has not finished. This also guarantees that the
		 * timer is serialized wrt itself.
		 */
		if (likely(base->running_timer != timer)) {
			/* See the comment in lock_timer_base() */
			timer->flags |= TIMER_MIGRATING;

			raw_spin_unlock(&base->lock);
			base = new_base;
			raw_spin_lock(&base->lock);
			WRITE_ONCE(timer->flags,
				   (timer->flags & ~TIMER_BASEMASK) | base->cpu);
			forward_timer_base(base);
		}
	}
---
If the timer being armed is the one currently running (base->running_timer == timer), it is not migrated and gets enqueued onto the same timer_base, i.e. the same CPU.
So the soft lockup happens because fq_flush_timeout() runs longer than its own timeout, and every time it runs it is re-armed onto the same CPU. A minimal simulation of this pattern follows.
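In the simulation below, the 10 ms period is IOVA_FQ_TIMEOUT and the 100 ms handler time is the lower bound implied by a single Completion-Wait timeout; everything else is made up for illustration. Because the callback always finishes after the next expiry, __run_timers() finds the timer expired again the moment the callback returns, and the CPU stays in timer softirq processing indefinitely.
---
#include <stdio.h>

int main(void)
{
	long now_ms     = 0;
	long period_ms  = 10;	/* IOVA_FQ_TIMEOUT */
	long handler_ms = 100;	/* >= one Completion-Wait loop timeout */
	long next_expiry;

	/* queue_iova() arms the timer for the first time */
	next_expiry = now_ms + period_ms;

	for (int round = 0; round < 5; round++) {
		/* __run_timers(): the timer has expired, run fq_flush_timeout() */
		now_ms = (now_ms > next_expiry) ? now_ms : next_expiry;
		printf("t=%4ld ms: fq_flush_timeout() starts\n", now_ms);

		/* fq_flush_timeout() clears fq_timer_on, so queue_iova() on some
		 * CPU immediately re-arms the timer ... */
		next_expiry = now_ms + period_ms;

		/* ... while the callback itself is stuck behind slow
		 * Completion-Wait commands for ~100 ms */
		now_ms += handler_ms;
		printf("t=%4ld ms: callback returns, timer already expired %ld ms ago\n",
		       now_ms, now_ms - next_expiry);
	}
	return 0;
}
---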
Where the abnormal time goes
Clearly, the excessive runtime of fq_flush_timeout() is related to the repeatedly printed "AMD-Vi: Completion-Wait loop timed out": each occurrence means fq_flush_timeout() was blocked for at least 100 ms, easily satisfying the condition above. So why does the timeout appear in the first place?
As discussed in the section "Where the timed-out log comes from", it is probably caused by issuing iova_domain_flush() (or similar flush operations) too often.
Looking back at queue_iova(), it contains the following fragment:
	if (fq_full(fq)) {
		iova_domain_flush(iovad);
		fq_ring_free(iovad, fq);
	}
If fq_flush_timeout() does not run in time, the per-CPU flush queues can fill up, and each CPU then calls iova_domain_flush() on its own; that alone can cause Completion-Wait timeouts, which in turn slow fq_flush_timeout() down even further, forming a vicious cycle. In addition, queue_iova() sits on the DMA unmap path, which is usually the I/O completion path:
dma_unmap_sg()
  -> dma_unmap_sg_attrs()
     ---
     if (dma_map_direct(dev, ops))
     	dma_direct_unmap_sg(dev, sg, nents, dir, attrs);
     else if (ops->unmap_sg)
     	ops->unmap_sg(dev, sg, nents, dir, attrs);
     ---

iommu_dma_unmap_sg()
  -> __iommu_dma_unmap()
     -> iommu_dma_free_iova()
        -> queue_iova()
This also degrades I/O performance, and the system's sar data does show it:
12:00:01 AM DEV tps rd_sec/s wr_sec/s avgrq-sz avgqu-sz await svctm %util
09:50:01 AM dev259-0 26.80 12.68 570.67 21.77 0.00 0.10 0.29 0.7
09:51:01 AM dev259-0 49.92 12.93 1325.57 26.81 0.00 0.08 0.28 1.41
09:52:01 AM dev259-0 51.58 25.44 1319.07 26.07 0.00 0.08 0.27 1.39
09:53:01 AM dev259-0 26.80 12.74 590.68 22.52 0.00 0.11 0.30 0.79
09:54:01 AM dev259-0 27.26 12.89 588.92 22.08 0.00 0.11 0.29 0.80
09:55:01 AM dev259-0 23.45 12.73 548.27 23.93 0.00 0.10 0.33 0.78
09:56:01 AM dev259-0 25.63 12.89 575.09 22.94 0.00 0.10 0.31 0.79
09:57:01 AM dev259-0 27.56 25.37 596.40 22.56 0.00 0.11 0.29 0.81
09:58:01 AM dev259-0 24.81 12.96 587.04 24.18 0.00 0.10 0.32 0.80
09:59:01 AM dev259-0 25.07 12.70 564.55 23.03 0.00 0.10 0.31 0.78
10:00:01 AM dev259-0 25.66 12.89 566.23 22.57 0.00 0.10 0.31 0.79
10:01:01 AM dev259-0 55.91 12.70 2219.87 39.93 0.01 0.21 0.29 1.65
10:02:01 AM dev259-0 53.30 12.79 1255.96 23.80 0.01 0.17 0.33 1.76
10:03:01 AM dev259-0 23.84 0.00 401.68 16.85 0.01 0.53 0.37 0.89
10:04:01 AM dev259-0 23.05 0.00 375.07 16.28 0.04 1.62 1.01 2.33
10:05:01 AM dev259-0 22.49 0.00 360.32 16.02 0.02 0.79 0.58 1.30
10:06:01 AM dev259-0 24.27 0.00 402.58 16.59 0.02 0.90 0.65 1.57
10:07:02 AM dev259-0 23.85 0.00 398.34 16.70 0.01 0.23 0.41 0.98
10:08:01 AM dev259-0 23.58 0.00 383.09 16.25 0.01 0.27 0.41 0.98
10:09:01 AM dev259-0 21.90 0.00 358.85 16.39 0.01 0.30 0.56 1.23
10:10:01 AM dev259-0 22.21 0.27 354.27 15.96 0.02 0.96 0.43 0.95
10:11:01 AM dev259-0 47.02 1.20 1091.48 23.24 0.02 0.33 0.32 1.52
10:12:02 AM dev259-0 49.38 0.00 1128.20 22.85 0.22 4.38 0.38 1.89
10:13:01 AM dev259-0 23.64 0.00 399.66 16.90 0.00 0.08 0.35 0.82
10:14:01 AM dev259-0 23.70 0.00 374.11 15.79 0.00 0.16 0.35 0.84
10:15:01 AM dev259-0 22.53 0.00 363.78 16.14 0.01 0.53 0.43 0.96
So what is the original culprit behind the slowdown of fq_flush_timeout()?
It may be the situation described in this commit:
nvme-pci: clamp max_hw_sectors based on DMA optimized limitation - kernel/git/torvalds/linux.git - Linux kernel source tree
Lock contention on the iova allocator degrades the efficiency of fq_flush_timeout().
Appendix
iommu=pt
The relevant handling code is:
iommu_setup() // arch/x86/kernel/pci-dma.c
---
	if (!strncmp(p, "pt", 2))
		iommu_set_default_passthrough(true);
		---
		iommu_def_domain_type = IOMMU_DOMAIN_IDENTITY;
		---
---
iommu_probe_device()
  -> iommu_alloc_default_domain()
     -> iommu_get_def_domain_type()
  -> __iommu_attach_device(group->default_domain, dev);
  -> amd_iommu_probe_finalize()
     ---
     domain = iommu_get_domain_for_dev(dev);
     if (domain->type == IOMMU_DOMAIN_DMA)
     	iommu_setup_dma_ops(dev, IOVA_START_PFN << PAGE_SHIFT, 0);
     ---
With iommu=pt, the default domain is an identity (passthrough) domain, so the IOMMU DMA ops are never installed and the device's dma_ops stay NULL. In that case:
dma_unmap_sg_attrs()
---
	if (dma_map_direct(dev, ops))
		dma_direct_unmap_sg(dev, sg, nents, dir, attrs);
	else if (ops->unmap_sg)
		ops->unmap_sg(dev, sg, nents, dir, attrs);
---
Here dma_map_direct() is true and the direct-mapping path is taken, so queue_iova() and fq_flush_timeout() are never involved.
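As a rough sketch of why, assuming the 5.10 behaviour that dma_map_direct() essentially asks "does this device have per-device dma_map_ops installed?": with iommu=pt the default domain is IDENTITY, iommu_setup_dma_ops() never runs, dev->dma_ops stays NULL, and every unmap goes through the direct path instead of the IOMMU DMA ops. The structures below are simplified stand-ins, not the kernel's definitions.
---
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* simplified stand-ins for the kernel structures */
struct dma_map_ops { int dummy; };
struct device { const struct dma_map_ops *dma_ops; };

/* roughly what dma_map_direct()/dma_go_direct() decide in 5.10 */
static bool dma_map_direct(const struct device *dev)
{
	return dev->dma_ops == NULL;
}

int main(void)
{
	const struct dma_map_ops iommu_dma_ops = { 0 };

	struct device dev_pt  = { .dma_ops = NULL };		/* iommu=pt: no DMA ops installed  */
	struct device dev_dma = { .dma_ops = &iommu_dma_ops };	/* translated (DMA) default domain */

	printf("iommu=pt   -> direct unmap, no queue_iova(): %d\n", dma_map_direct(&dev_pt));
	printf("DMA domain -> iommu_dma_unmap_sg()/queue_iova(): %d\n", !dma_map_direct(&dev_dma));
	return 0;
}
---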
Call stacks
kernel: watchdog: BUG: soft lockup - CPU#46 stuck for 22s! [swapper/46:0]
kernel: CPU: 46 PID: 0 Comm: swapper/46 Not tainted 5.10.128 #2
kernel: RIP: 0010:_raw_spin_unlock_irqrestore+0x15/0x20
kernel: Call Trace:
kernel: <IRQ>
kernel: fq_flush_timeout+0x79/0xb0
kernel: ? fq_ring_free+0x110/0x110
kernel: call_timer_fn+0x2e/0x100
kernel: __run_timers.part.0+0x1de/0x260
kernel: ? clockevents_program_event+0x8f/0xe0
kernel: ? tick_program_event+0x41/0x80
kernel: run_timer_softirq+0x2a/0x50
kernel: __do_softirq+0xce/0x281
kernel: asm_call_irq_on_stack+0x12/0x20

kernel: watchdog: BUG: soft lockup - CPU#46 stuck for 22s! [swapper/46:0]
kernel: CPU: 46 PID: 0 Comm: swapper/46 Tainted: G L 5.10.128 #2
kernel: RIP: 0010:_raw_spin_unlock_irqrestore+0x15/0x20
kernel: Call Trace:
kernel: <IRQ>
kernel: amd_iommu_flush_iotlb_all+0x4e/0x60
kernel: iommu_dma_flush_iotlb_all+0x1d/0x20
kernel: iova_domain_flush+0x1e/0x30
kernel: fq_flush_timeout+0x39/0xb0
kernel: ? fq_ring_free+0x110/0x110
kernel: call_timer_fn+0x2e/0x100
kernel: __run_timers.part.0+0x1de/0x260
kernel: ? clockevents_program_event+0x8f/0xe0
kernel: ? tick_program_event+0x41/0x80
kernel: run_timer_softirq+0x2a/0x50
kernel: __do_softirq+0xce/0x281
kernel: asm_call_irq_on_stack+0x12/0x20
AMD I/O Virtualization Technology (IOMMU) Specification
https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/specifications/48882_IOMMU.pdf