Home Home > GIT Browse
summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorDan Williams <dan.j.williams@intel.com>2018-02-04 10:34:02 -0800
committerDan Williams <dan.j.williams@intel.com>2018-03-02 18:00:04 -0800
commit94db151dc89262bfa82922c44e8320cea2334667 (patch)
treee7127df00533f4fcead7c879e897fba33838740e
parent230f5a8969d8345fc9bbe3683f068246cf1be4b8 (diff)
vfio: disable filesystem-dax page pinning
Filesystem-DAX is incompatible with 'longterm' page pinning. Without page cache indirection a DAX mapping maps filesystem blocks directly. This means that the filesystem must not modify a file's block map while any page in a mapping is pinned. In order to prevent the situation of userspace holding of filesystem operations indefinitely, disallow 'longterm' Filesystem-DAX mappings. RDMA has the same conflict and the plan there is to add a 'with lease' mechanism to allow the kernel to notify userspace that the mapping is being torn down for block-map maintenance. Perhaps something similar can be put in place for vfio. Note that xfs and ext4 still report: "DAX enabled. Warning: EXPERIMENTAL, use at your own risk" ...at mount time, and resolving the dax-dma-vs-truncate problem is one of the last hurdles to remove that designation. Acked-by: Alex Williamson <alex.williamson@redhat.com> Cc: Michal Hocko <mhocko@suse.com> Cc: kvm@vger.kernel.org Cc: <stable@vger.kernel.org> Reported-by: Haozhong Zhang <haozhong.zhang@intel.com> Tested-by: Haozhong Zhang <haozhong.zhang@intel.com> Fixes: d475c6346a38 ("dax,ext2: replace XIP read and write with DAX I/O") Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-rw-r--r--drivers/vfio/vfio_iommu_type1.c18
1 files changed, 15 insertions, 3 deletions
diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index e30e29ae4819..45657e2b1ff7 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -338,11 +338,12 @@ static int vaddr_get_pfn(struct mm_struct *mm, unsigned long vaddr,
{
struct page *page[1];
struct vm_area_struct *vma;
+ struct vm_area_struct *vmas[1];
int ret;
if (mm == current->mm) {
- ret = get_user_pages_fast(vaddr, 1, !!(prot & IOMMU_WRITE),
- page);
+ ret = get_user_pages_longterm(vaddr, 1, !!(prot & IOMMU_WRITE),
+ page, vmas);
} else {
unsigned int flags = 0;
@@ -351,7 +352,18 @@ static int vaddr_get_pfn(struct mm_struct *mm, unsigned long vaddr,
down_read(&mm->mmap_sem);
ret = get_user_pages_remote(NULL, mm, vaddr, 1, flags, page,
- NULL, NULL);
+ vmas, NULL);
+ /*
+ * The lifetime of a vaddr_get_pfn() page pin is
+ * userspace-controlled. In the fs-dax case this could
+ * lead to indefinite stalls in filesystem operations.
+ * Disallow attempts to pin fs-dax pages via this
+ * interface.
+ */
+ if (ret > 0 && vma_is_fsdax(vmas[0])) {
+ ret = -EOPNOTSUPP;
+ put_page(page[0]);
+ }
up_read(&mm->mmap_sem);
}