author Linus Torvalds <torvalds@athlon.transmeta.com> 2002-02-04 23:59:01 -0800
committer Linus Torvalds <torvalds@athlon.transmeta.com> 2002-02-04 23:59:01 -0800
commit fe0976511d3b5cf2894da54bc451e561bd6b1482
tree 92bbccc5167568aa5b48d466891fb0dab40ba8bd
parent 800446073f02f3035bffad7f1ced654ff6b474c9
v2.5.0.10 -> v2.5.0.11
- Jeff Garzik: no longer support old cards in tulip driver
  (see separate driver for old tulip chips)
- Pat Mochel: driverfs/device model documentation
- Ballabio Dario: update eata driver to new IO locking
- Ingo Molnar: raid resync with new bio structures (much more efficient)
  and mempool_resize()
- Jens Axboe: bio queue locking
-rw-r--r-- Documentation/driver-model.txt | 598
-rw-r--r-- Documentation/filesystems/driverfs.txt | 211
-rw-r--r-- Makefile | 2
-rw-r--r-- arch/i386/lib/iodebug.c | 8
-rw-r--r-- drivers/block/cciss.c | 6
-rw-r--r-- drivers/block/cciss.h | 3
-rw-r--r-- drivers/block/cpqarray.c | 6
-rw-r--r-- drivers/block/cpqarray.h | 3
-rw-r--r-- drivers/block/floppy.c | 64
-rw-r--r-- drivers/block/ll_rw_blk.c | 33
-rw-r--r-- drivers/block/nbd.c | 12
-rw-r--r-- drivers/block/paride/pcd.c | 20
-rw-r--r-- drivers/block/paride/pf.c | 38
-rw-r--r-- drivers/block/ps2esdi.c | 2
-rw-r--r-- drivers/block/rd.c | 473
-rw-r--r-- drivers/ide/ide-probe.c | 2
-rw-r--r-- drivers/ide/ide.c | 24
-rw-r--r-- drivers/md/linear.c | 2
-rw-r--r-- drivers/md/md.c | 281
-rw-r--r-- drivers/md/raid0.c | 2
-rw-r--r-- drivers/md/raid1.c | 1383
-rw-r--r-- drivers/net/tulip/ChangeLog | 5
-rw-r--r-- drivers/net/tulip/eeprom.c | 17
-rw-r--r-- drivers/net/tulip/media.c | 37
-rw-r--r-- drivers/net/tulip/timer.c | 54
-rw-r--r-- drivers/net/tulip/tulip_core.c | 119
-rw-r--r-- drivers/scsi/eata.c | 24
-rw-r--r-- drivers/scsi/eata.h | 2
-rw-r--r-- drivers/scsi/scsi.c | 2
-rw-r--r-- drivers/scsi/scsi_error.c | 2
-rw-r--r-- drivers/scsi/scsi_lib.c | 34
-rw-r--r-- drivers/scsi/scsi_merge.c | 26
-rw-r--r-- drivers/scsi/scsi_queue.c | 5
-rw-r--r-- drivers/scsi/u14-34f.c | 32
-rw-r--r-- drivers/scsi/u14-34f.h | 2
-rw-r--r-- fs/bio.c | 2
-rw-r--r-- fs/block_dev.c | 1
-rw-r--r-- fs/buffer.c | 6
-rw-r--r-- fs/ufs/inode.c | 2
-rw-r--r-- include/asm-i386/io.h | 3
-rw-r--r-- include/asm-s390/io.h | 2
-rw-r--r-- include/asm-s390x/io.h | 2
-rw-r--r-- include/linux/blkdev.h | 5
-rw-r--r-- include/linux/devfs_fs_kernel.h | 8
-rw-r--r-- include/linux/ide.h | 1
-rw-r--r-- include/linux/mempool.h | 1
-rw-r--r-- include/linux/nbd.h | 4
-rw-r--r-- include/linux/raid/md.h | 6
-rw-r--r-- include/linux/raid/md_compatible.h | 158
-rw-r--r-- include/linux/raid/md_k.h | 26
-rw-r--r-- include/linux/raid/raid1.h | 72
-rw-r--r-- init/do_mounts.c | 781
-rw-r--r-- mm/memory.c | 4
-rw-r--r-- mm/mempool.c | 76
54 files changed, 2460 insertions, 2234 deletions
diff --git a/Documentation/driver-model.txt b/Documentation/driver-model.txt
new file mode 100644
index 000000000000..f77e051f0582
--- /dev/null
+++ b/Documentation/driver-model.txt
@@ -0,0 +1,598 @@
+The (New) Linux Kernel Driver Model
+
+Version 0.04
+
+Patrick Mochel <mochel@osdl.org>
+
+03 December 2001
+
+
+Overview
+~~~~~~~~
+
+This driver model is a unification of all the disparate driver models that are
+currently in the kernel. It is intended to augment the
+bus-specific drivers for bridges and devices by consolidating a set of data
+and operations into globally accessible data structures.
+
+Current driver models implement some sort of tree-like structure (sometimes
+just a list) for the devices they control. But, there is no linkage between
+the different bus types.
+
+A common data structure can provide this linkage with little overhead: when a
+bus driver discovers a particular device, it can insert it into the global
+tree as well as its local tree. In fact, the local tree becomes just a subset
+of the global tree.
+
+Common data fields can also be moved out of the local bus models into the
+global model. Some of the manipulation of these fields can also be
+consolidated. Most likely, manipulation functions will become a set
+of helper functions, which the bus drivers wrap around to include any
+bus-specific items.
+
+The common device and bridge interface currently reflects the goals of the
+modern PC: namely the ability to do seamless Plug and Play, power management,
+and hot plug. (The model dictated by Intel and Microsoft (read: ACPI) assures
+us that any device in the system may fit any of these criteria.)
+
+In reality, not every bus will be able to support such operations. But, most
+buses will support a majority of those operations, and all future buses will.
+In other words, a bus that doesn't support an operation is the exception,
+instead of the other way around.
+
+
+Drivers
+~~~~~~~
+
+The callbacks for bridges and devices are intended to be singular for a
+particular type of bus. For each type of bus that has support compiled in the
+kernel, there should be one statically allocated structure with the
+appropriate callbacks that each device (or bridge) of that type shares.
+
+Each bus layer should implement the callbacks for these drivers. It then
+forwards the calls on to the device-specific callbacks. This means that
+device-specific drivers must still implement callbacks for each operation.
+But, they are not called from the top level driver layer.
+
+This does add another layer of indirection for calling one of these functions,
+but there are benefits that are believed to outweigh this slowdown.
+
+First, it prevents device-specific drivers from having to know about the
+global device layer. This speeds up integration time incredibly. It also
+allows drivers to be more portable across kernel versions. Note that the
+former was intentional; the latter is an added bonus.
+
+Second, this added indirection allows the bus to perform any additional logic
+necessary for its child devices. A bus layer may add additional information to
+the call, or translate it into something meaningful for its children.
+
+This could be done in the driver, but if it happens for every object of a
+particular type, it is best done at a higher level.
+
+Recap
+~~~~~
+
+Instances of devices and bridges are allocated dynamically as the system
+discovers their existence. Their fields describe the individual object.
+Drivers - in the global sense - are statically allocated and singular for a
+particular type of bus. They describe a set of operations that every type of
+bus could implement, the implementation following the bus's semantics.
+
+
+Downstream Access
+~~~~~~~~~~~~~~~~~
+
+Common data fields have been moved out of individual bus layers into a common
+data structure. But, these fields must still be accessed by the bus layers,
+and sometimes by the device-specific drivers.
+
+Other bus layers are encouraged to do what has been done for the PCI layer.
+struct pci_dev now looks like this:
+
+struct pci_dev {
+ ...
+
+ struct device device;
+};
+
+Note first that it is embedded in struct pci_dev rather than allocated
+separately. This means only one allocation on device discovery. Note also that
+it is at the _end_ of struct pci_dev. This is to make people think about what
+they're doing when switching between the bus driver and the global driver, and
+to guard against mindless casts between the two.
+
+The PCI bus layer freely accesses the fields of struct device. It knows about
+the structure of struct pci_dev, and it should know the structure of struct
+device. PCI devices that have been converted generally do not touch the fields
+of struct device. More precisely, device-specific drivers should not touch
+fields of struct device unless there is a strong compelling reason to do so.
+
+This abstraction prevents unnecessary pain during transitional phases. If a
+field is renamed or removed, then every downstream driver that touches it will
+break. On the other hand, if only the bus layer (and not the device layer)
+accesses struct device, only the bus layer needs to change.
+
+
+User Interface
+~~~~~~~~~~~~~~
+
+By virtue of having a complete hierarchical view of all the devices in the
+system, exporting a complete hierarchical view to userspace becomes relatively
+easy.
+
+Whenever a device is inserted into the tree, a directory is created for it.
+This directory may be populated at each layer of discovery - the global layer,
+the bus layer, or the device layer.
+
+The global layer currently creates two files - 'status' and 'power'. The
+former only reports the name of the device and its bus ID. The latter reports
+the current power state of the device. It can also be used to set the current
+power state.
+
+The bus layer may also create files for the devices it finds while probing the
+bus. For example, the PCI layer currently creates 'wake' and 'resource' files
+for each PCI device.
+
+A device-specific driver may also export files in its directory to expose
+device-specific data or tunable interfaces.
+
+These features were initially implemented using procfs. However, after one
+conversation with Linus, a new filesystem - driverfs - was created to
+implement these features. It is an in-memory filesystem, based heavily on
+ramfs, though it uses procfs as inspiration for its callback functionality.
+
+Each struct device has a 'struct driver_dir_entry' which encapsulates the
+device's directory and the files within.
+
+Device Structures
+~~~~~~~~~~~~~~~~~
+
+struct device {
+ struct list_head bus_list;
+ struct iobus *parent;
+ struct iobus *subordinate;
+
+ char name[DEVICE_NAME_SIZE];
+ char bus_id[BUS_ID_SIZE];
+
+ struct driver_dir_entry * dir;
+
+ spinlock_t lock;
+ atomic_t refcount;
+
+ struct device_driver *driver;
+ void *driver_data;
+ void *platform_data;
+
+ u32 current_state;
+ unsigned char *saved_state;
+};
+
+bus_list:
+ List of all devices on a particular bus; i.e. the device's siblings
+
+parent:
+ The parent bridge for the device.
+
+subordinate:
+ If the device is a bridge itself, this points to the struct iobus that is
+ created for it.
+
+name:
+ Human readable (descriptive) name of device. E.g. "Intel EEPro 100"
+
+bus_id:
+ Parsable (yet ASCII) bus id. E.g. "00:04.00" (PCI Bus 0, Device 4, Function
+ 0). It is necessary to have a searchable bus id for each device; making it
+ ASCII allows us to use it for its directory name without translating it.
+
+dir:
+ Device's driverfs directory.
+
+lock:
+ Driver specific lock.
+
+refcount:
+ Device's usage count.
+ When this goes to 0, the device is assumed to be removed. It will be removed
+ from its parent's list of children. Its remove() callback will be called to
+ inform the driver to clean up after itself.
+
+driver:
+ Pointer to a struct device_driver, the common operations for each device. See
+ next section.
+
+driver_data:
+ Private data for the driver.
+ Much like the PCI implementation of this field, this allows device-specific
+ drivers to keep a pointer to device-specific data.
+
+platform_data:
+ Data that the platform (firmware) provides about the device.
+ For example, the ACPI BIOS or EFI may have additional information about the
+ device that is not directly mappable to any existing kernel data structure.
+ It also allows the platform driver (e.g. ACPI) to pass information to a
+ driver without it needing explicit knowledge of (atrocities like) ACPI.
+
+
+current_state:
+ Current power state of the device. For PCI and other modern devices, this is
+ 0-3, though it's not necessarily limited to those values.
+
+saved_state:
+ Pointer to driver-specific set of saved state.
+ Having it here allows modules to be unloaded on system suspend and reloaded
+ on resume and maintain state across transitions.
+ It also allows generic drivers to maintain state across system state
+ transitions.
+ (I've implemented a generic PCI driver for devices that don't have a
+ device-specific driver. Instead of managing some vector of saved state
+ for each device the generic driver supports, it can simply store it here.)
+
+
+
+struct device_driver {
+ int (*probe) (struct device *dev);
+ int (*remove) (struct device *dev);
+
+ int (*suspend) (struct device *dev, u32 state, u32 level);
+ int (*resume) (struct device *dev, u32 level);
+};
+
+probe:
+ Check for device existence and associate driver with it.
+
+remove:
+ Dissociate driver from device. Releases device so that it could be used by
+ another driver. Also, if it is a hotplug device (hotplug PCI, Cardbus), an
+ ejection event could take place here.
+
+suspend:
+ Perform one step of the device suspend process.
+
+resume:
+ Perform one step of the device resume process.
+
+The probe() and remove() callbacks are intended to be much simpler than the
+current PCI counterparts.
+
+probe() should do the following only:
+
+- Check if hardware is present
+- Register device interface
+- Disable DMA/interrupts, etc, just in case.
+
+Previously, some device initialisation was done in probe(). This should no
+longer be the case. All initialisation should take place in the open() call
+for the device.
+
+Breaking initialisation code out must also be done for the resume() callback,
+as most devices will have to be completely reinitialised when coming back from
+a suspend state.
+
+remove() should simply unregister the device interface.
+
+
+Device power management can be quite complicated, depending on exactly what
+needs to be done. Four operations sum up most of it:
+
+- OS directed power management.
+ The OS takes care of notifying all drivers that a suspend is requested,
+ saving device state, and powering devices down.
+- Firmware controlled power management.
+ The OS only wants to notify devices that a suspend is requested.
+- Device power management.
+ A user wants to place only one device in a low power state, and maybe save
+ state.
+- System reboot.
+ The system wants to place devices in a quiescent state before the system is
+ reset.
+
+In an attempt to satisfy all of these scenarios, the power management
+transition for any device is broken up into three stages - notify, save
+state, and power down. A disable stage, which would happen after notify and
+before save state, has been considered and may be implemented in the future.
+
+Depending on what the system-wide policy is (usually dictated by the power
+management scheme present), each driver's suspend callback may be called
+multiple times, each with a different stage.
+
+On all power management transitions, the stages should be called sequentially
+(notify before save state; save state before power down). However, drivers
+should not assume that any stage was called beforehand. (If a driver gets a
+power down call, it shouldn't assume notify or save state was called first.)
+This allows the framework to be used seamlessly by all power management
+actions. Hopefully.
+
+Resume transitions happen in a similar manner. They are broken up into two
+stages currently (power on and restore state), though a third stage (enable)
+may be added later.
+
+For suspend and resume transitions, the following values are defined to denote
+the stage:
+
+enum {
+ SUSPEND_NOTIFY,
+ SUSPEND_SAVE_STATE,
+ SUSPEND_POWER_DOWN,
+};
+
+enum {
+ RESUME_POWER_ON,
+ RESUME_RESTORE_STATE,
+};
+
+
+During a system power transition, the device tree must be walked in order,
+calling the suspend() or resume() callback for each node. This may happen
+several times.
+
+Initially, this was done in kernel space. However, it has occurred to me that
+doing recursion to a non-bounded depth is dangerous, and that there are a lot
+of inherent race conditions in such an operation.
+
+Non-recursive walking of the device tree is possible. However, this makes for
+convoluted code.
+
+No matter what, if the transition happens in kernel space, it is difficult to
+gracefully recover from errors or to implement a policy that prevents shutting
+down the device(s) that state is being saved to.
+
+Instead, the walking of the device tree has been moved to userspace. When a
+user requests a system suspend, a userspace process walks the device tree, as
+exported via driverfs, and tells each device to go to sleep. It does this
+multiple times, depending on the system policy.
+
+Device resume should happen in the same manner when the system awakens.
+
+Each suspend stage is described below:
+
+SUSPEND_NOTIFY:
+
+This level notifies the driver that it is going to sleep. If it knows that it
+cannot resume the hardware from the requested level, or it feels that it is
+too important to be put to sleep, it should return an error from this function.
+
+It does not have to stop I/O requests or actually save state at this point.
+
+SUSPEND_DISABLE:
+
+The driver should stop taking I/O requests at this stage. Because the save
+state stage happens afterwards, the driver may not want to physically disable
+the device; only mark itself unavailable if possible.
+
+SUSPEND_SAVE_STATE:
+
+The driver should allocate memory and save any device state that is relevant
+for the state it is going to enter.
+
+SUSPEND_POWER_DOWN:
+
+The driver should place the device in the power state requested.
+
+
+For resume, the stages are defined as follows:
+
+RESUME_POWER_ON:
+
+Devices should be powered on and reinitialised to some known working state.
+
+RESUME_RESTORE_STATE:
+
+The driver should restore device state to its pre-suspend state and free any
+memory allocated for its saved state.
+
+RESUME_ENABLE:
+
+The device should start taking I/O requests again.
+
+
+Each driver does not have to implement each stage. But if it does
+implement a stage, it should do what is described above. It should not assume
+that it performed any stage previously, or that it will perform any stage
+later.
+
+It is quite possible that a driver can fail during the suspend process, for
+whatever reason. In this event, the calling process must gracefully recover
+and restore everything to its state before the suspend transition began.
+
+If a driver knows that it cannot suspend or resume properly, it should fail
+during the notify stage. Properly implemented power management schemes should
+make sure that this is the first stage that is called.
+
+If a driver gets a power down request, it should obey it, as it may very
+likely be during a reboot.
+
+
+Bus Structures
+~~~~~~~~~~~~~~
+
+struct iobus {
+ struct list_head node;
+ struct iobus *parent;
+ struct list_head children;
+ struct list_head devices;
+
+ struct list_head bus_list;
+
+ spinlock_t lock;
+ atomic_t refcount;
+
+ struct device *self;
+ struct driver_dir_entry * dir;
+
+ char name[DEVICE_NAME_SIZE];
+ char bus_id[BUS_ID_SIZE];
+
+ struct iobus_driver *driver;
+};
+
+node:
+ Bus's node in sibling list (its parent's list of child buses).
+
+parent:
+ Pointer to parent bridge.
+
+children:
+ List of subordinate buses.
+ In the children, this correlates to their 'node' field.
+
+devices:
+ List of devices on the bus this bridge controls.
+ This field corresponds to the 'bus_list' field in each child device.
+
+bus_list:
+ Each type of bus keeps a list of all bridges that it finds. This is the
+ bridge's entry in that list.
+
+self:
+ Pointer to the struct device for this bridge.
+
+lock:
+ Lock for the bus.
+
+refcount:
+ Usage count for the bus.
+
+dir:
+ Driverfs directory.
+
+name:
+ Human readable ASCII name of bus.
+
+bus_id:
+ Machine readable (though ASCII) description of position on parent bus.
+
+driver:
+ Pointer to operations for bus.
+
+
+struct iobus_driver {
+ char name[16];
+ struct list_head node;
+
+ int (*scan) (struct iobus*);
+ int (*add_device) (struct iobus*, char*);
+};
+
+name:
+ ASCII name of bus.
+
+node:
+ List of buses of this type in system.
+
+scan:
+ Search the bus for new devices. This may happen either at boot - where every
+ device discovered will be new - or later on - in which case there may only be
+ a few (or no) new devices.
+
+add_device:
+ Trigger a device insertion at a particular location.
+
+
+
+The API
+~~~~~~~
+
+There are several functions exported by the global device layer, including
+several optional helper functions, written solely to try and make your life
+easier.
+
+void device_init_dev(struct device * dev);
+
+Initialise a device structure. It first zeros the device, then initialises all
+of the lists. (Note that this would have been called device_init(), but that
+name was already taken. :/)
+
+
+struct device * device_alloc(void)
+
+Allocate memory for a device structure and initialise it.
+First, allocates memory, then calls device_init_dev() with the new pointer.
+
+
+int device_register(struct device * dev);
+
+Register a device with the global device layer.
+The bus layer should call this function upon device discovery, e.g. when
+probing the bus.
+dev should be fully initialised when this is called.
+If dev->parent is not set, it sets its parent to be the device root.
+It then does the following:
+ - inserts it into its parent's list of children
+ - creates a driverfs directory for it
+ - creates a set of default files for the device in its directory
+ - calls platform_notify() to notify the firmware driver of its existence.
+
+
+void get_device(struct device * dev);
+
+Increment the refcount for a device.
+
+
+int valid_device(struct device * dev);
+
+Check if reference count is positive for a device (it's not waiting to be
+freed). If it is positive, it increments the reference count for the device.
+It returns whether or not the device is usable.
+
+
+void put_device(struct device * dev);
+
+Decrement the reference count for the device. If it hits 0, it removes the
+device from its parent's list of children and calls the remove() callback for
+the device.
+
+
+void lock_device(struct device * dev);
+
+Take the spinlock for the device.
+
+
+void unlock_device(struct device * dev);
+
+Release the spinlock for the device.
+
+
+
+void iobus_init(struct iobus * iobus);
+struct iobus * iobus_alloc(void);
+int iobus_register(struct iobus * iobus);
+void get_iobus(struct iobus * iobus);
+int valid_iobus(struct iobus * iobus);
+void put_iobus(struct iobus * iobus);
+void lock_iobus(struct iobus * iobus);
+void unlock_iobus(struct iobus * iobus);
+
+These functions provide the same functionality as the device_*
+counterparts, only operating on a struct iobus. One important thing to note,
+though is that iobus_register() and iobus_unregister() operate recursively. It
+is possible to add an entire tree in one call.
+
+
+
+int device_driver_init(void);
+
+Main initialisation routine.
+
+This makes sure driverfs is up and running and initialises the device tree.
+
+
+void device_driver_exit(void);
+
+This frees up the device tree.
+
+
+
+
+Credits
+~~~~~~~
+
+The following people have been extremely helpful in solidifying this document
+and the driver model.
+
+Randy Dunlap rddunlap@osdl.org
+Jeff Garzik jgarzik@mandrakesoft.com
+Ben Herrenschmidt benh@kernel.crashing.org
+
+
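The staged suspend()/resume() contract described in driver-model.txt above can be sketched in plain userspace C. This is a minimal sketch under stated assumptions, not code from this commit: struct device is cut down to the three fields used here, and struct demo_hw, DEMO_NREGS, demo_suspend() and demo_resume() are hypothetical names invented for illustration. It shows one way a driver might handle each stage without assuming that an earlier stage ran.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stage values as listed in driver-model.txt. */
enum { SUSPEND_NOTIFY, SUSPEND_SAVE_STATE, SUSPEND_POWER_DOWN };
enum { RESUME_POWER_ON, RESUME_RESTORE_STATE };

/* Userspace stand-in for struct device: only the fields this sketch uses. */
struct device {
    void *driver_data;
    uint32_t current_state;
    unsigned char *saved_state;
};

/* Hypothetical hardware: a few registers plus a "busy" flag. */
#define DEMO_NREGS 4
struct demo_hw {
    unsigned char regs[DEMO_NREGS];
    int busy;
};

/* One call per stage; no stage relies on an earlier one having run. */
int demo_suspend(struct device *dev, uint32_t state, uint32_t level)
{
    struct demo_hw *hw = dev->driver_data;

    assert(hw != NULL);
    switch (level) {
    case SUSPEND_NOTIFY:
        /* Refuse early if the device cannot be put to sleep right now. */
        return hw->busy ? -EBUSY : 0;
    case SUSPEND_SAVE_STATE:
        if (dev->saved_state == NULL) {
            dev->saved_state = malloc(DEMO_NREGS);
            if (dev->saved_state == NULL)
                return -ENOMEM;
        }
        memcpy(dev->saved_state, hw->regs, DEMO_NREGS);
        return 0;
    case SUSPEND_POWER_DOWN:
        /* Obeyed even if nothing was saved (e.g. during a reboot). */
        memset(hw->regs, 0, DEMO_NREGS); /* simulate loss of device state */
        dev->current_state = state;
        return 0;
    default:
        return 0; /* stage not implemented by this driver */
    }
}

int demo_resume(struct device *dev, uint32_t level)
{
    struct demo_hw *hw = dev->driver_data;

    assert(hw != NULL);
    switch (level) {
    case RESUME_POWER_ON:
        /* Bring the device back to a known working state. */
        memset(hw->regs, 0, DEMO_NREGS);
        dev->current_state = 0;
        return 0;
    case RESUME_RESTORE_STATE:
        /* Restore only if a save actually happened, then free the copy. */
        if (dev->saved_state != NULL) {
            memcpy(hw->regs, dev->saved_state, DEMO_NREGS);
            free(dev->saved_state);
            dev->saved_state = NULL;
        }
        return 0;
    default:
        return 0;
    }
}
```

A caller would invoke demo_suspend() once per stage (notify, save state, power down) and demo_resume() once per resume stage. Skipping SUSPEND_SAVE_STATE simply leaves nothing to restore later.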
diff --git a/Documentation/filesystems/driverfs.txt b/Documentation/filesystems/driverfs.txt
new file mode 100644
index 000000000000..b1f2553b73f8
--- /dev/null
+++ b/Documentation/filesystems/driverfs.txt
@@ -0,0 +1,211 @@
+
+driverfs - The Device Driver Filesystem
+
+Patrick Mochel <mochel@osdl.org>
+
+3 December 2001
+
+
+What it is:
+~~~~~~~~~~~
+driverfs is a unified means for device drivers to export interfaces to
+userspace.
+
+Some drivers have a need for exporting interfaces for things like
+setting device-specific parameters, or tuning the device performance.
+For example, wireless networking cards export a file in procfs to set
+their SSID.
+
+Other times, the bus on which a device resides may export other
+information about the device. For example, PCI and USB both export
+device information via procfs or usbdevfs.
+
+In these cases, the files or directories are in nearly random places
+in /proc. One benefit of driverfs is that it can consolidate all of
+these interfaces to one standard location.
+
+
+Why it's better than procfs:
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+This of course can't happen without changing every single driver that
+exports a procfs interface, and having some coordination between all
+of them as to what the proper place for their files is. Or can it?
+
+
+driverfs was developed in conjunction with the new driver model for
+the 2.5 kernel. In that model, the system has one unified tree of all
+the devices that are present in the system. It follows naturally that
+this tree can be exported to userspace in the same order.
+
+So, every bus and every device gets a directory in the filesystem.
+This directory is created when the device is registered in the tree,
+before the driver is actually initialised. The dentry for this
+directory is stored in the struct device for this device, so the
+driver has access to it.
+
+Now, every driver has one standard place to export its files.
+
+Granted, the location of the file is not as intuitive as it may have
+been under procfs. But, I argue that with the exception of
+/proc/bus/pci, none of the files had intuitive locations. I also argue
+that the development of userspace tools can help cope with these
+changes and inconsistencies in locations.
+
+
+Why we're not just using procfs:
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+When developing the new driver model, it was initially implemented
+with a procfs tree. In explaining the concept to Linus, he said "Don't
+use proc."
+
+I was a little shocked (especially considering I had already
+implemented it using procfs). "What do you mean 'don't use proc'?"
+
+His argument was that too many things use proc that shouldn't. And
+even more things misuse proc that shouldn't. On top of that, procfs
+was written before the VFS layer was written, so it doesn't use the
+dcache. It reimplements many of the same features that the dcache
+does, and is, in general, crufty.
+
+So, he told me to write my own. Soon after, he pointed me at ramfs,
+the simplest filesystem known to man.
+
+Consequently, we have a virtual filesystem based heavily on ramfs, and
+borrowing some conceptual functionality from procfs.
+
+It may suck, but it does what it was designed to. At least so far.
+
+
+How it works:
+~~~~~~~~~~~~~
+
+Directories are encapsulated like this:
+
+struct driver_dir_entry {
+ char * name;
+ struct dentry * dentry;
+ mode_t mode;
+ struct list_head files;
+};
+
+name:
+ Name of the directory.
+dentry:
+ Dentry for the directory.
+mode:
+ Permissions of the directory.
+files:
+ Linked list of driver_file_entry's that are in the directory.
+
+
+To create a directory, one first calls
+
+struct driver_dir_entry *
+driverfs_create_dir_entry(const char * name, mode_t mode);
+
+which allocates and initialises a struct driver_dir_entry. Then to actually
+create the directory:
+
+int driverfs_create_dir(struct driver_dir_entry *, struct driver_dir_entry *);
+
+To remove a directory:
+
+void driverfs_remove_dir(struct driver_dir_entry * entry);
+
+
+Files are encapsulated like this:
+
+struct driver_file_entry {
+ struct driver_dir_entry * parent;
+ struct list_head node;
+ char * name;
+ mode_t mode;
+ struct dentry * dentry;
+ void * data;
+ struct driverfs_operations * ops;
+};
+
+struct driverfs_operations {
+ ssize_t (*read) (char *, size_t, loff_t, void *);
+ ssize_t (*write)(const char *, size_t, loff_t, void*);
+};
+
+node:
+ Node in its parent directory's list of files.
+
+name:
+ The name of the file.
+
+dentry:
+ The dentry for the file.
+
+data:
+ Caller specific data that is passed to the callbacks when they
+ are called.
+
+ops:
+ Operations for the file. Currently, this only contains read() and write()
+ callbacks for the file.
+
+To create a file, one first calls
+
+struct driver_file_entry *
+driverfs_create_entry (const char * name, mode_t mode,
+ struct driverfs_operations * ops, void * data);
+
+That allocates and initialises a struct driver_file_entry. Then, to actually
+create a file, one calls
+
+int driverfs_create_file(struct driver_file_entry * entry,
+ struct driver_dir_entry * parent);
+
+
+To remove a file, one calls
+
+void driverfs_remove_file(struct driver_dir_entry *, const char * name);
+
+
+The callback functionality is similar to the way procfs works. When a
+user performs a read(2) or write(2) on the file, it first calls a
+driverfs function. This function then checks for a non-NULL pointer in
+the file->private_data field, which it assumes to be a pointer to a
+struct driver_file_entry.
+
+It then checks for the appropriate callback and calls it.
+
+
+What driverfs is not:
+~~~~~~~~~~~~~~~~~~~~~
+It is not a replacement for either devfs or procfs.
+
+It does not handle device nodes, like devfs is intended to do. I think
+this functionality is possible, and indeed think that integration of
+the device nodes and control files should be done. Whether driverfs or
+devfs, or something else, is the place to do it, I don't know.
+
+It is not intended to be a replacement for all of the procfs
+functionality. I think that many of the driver files should be moved
+out of /proc (and maybe a few other things as well ;).
+
+
+
+Limitations:
+~~~~~~~~~~~~
+The driverfs functions assume that at most a page is being either read
+or written each time.
+
+
+Possible bugs:
+~~~~~~~~~~~~~~
+It may not deal with offsets and/or seeks very well, especially if
+they cross a page boundary.
+
+There may be locking issues when dynamically adding/removing files and
+directories rapidly (like if you have a hot plug device).
+
+There are some people that believe that filesystems which add
+files/directories dynamically based on the presence of devices are
+inherently flawed. Though not as technically versed in this area as
+some of those people, I like to believe that they can be made to work,
+with the right guidance.
+
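The read()/write() callback shape documented in driverfs.txt above can be sketched as a pair of handlers for a 'power'-style file. This is a minimal userspace sketch under stated assumptions, not code from this commit: off_t stands in for the kernel's loff_t, and struct demo_dev, demo_power_read(), demo_power_write() and demo_power_ops are hypothetical names. The 0-3 range follows the power states mentioned in driver-model.txt.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Callback table with the signatures given in driverfs.txt
 * (off_t used here in place of the kernel's loff_t). */
struct driverfs_operations {
    ssize_t (*read)(char *buf, size_t count, off_t off, void *data);
    ssize_t (*write)(const char *buf, size_t count, off_t off, void *data);
};

/* Hypothetical per-device data handed back through the 'data' pointer. */
struct demo_dev {
    unsigned int power_state;
};

/* Report the power state as ASCII text, honouring the file offset. */
ssize_t demo_power_read(char *buf, size_t count, off_t off, void *data)
{
    struct demo_dev *d = data;
    char tmp[16];
    int len;

    assert(d != NULL);
    len = snprintf(tmp, sizeof(tmp), "%u\n", d->power_state);
    if (off < 0 || off >= len)
        return 0; /* nothing left to read */
    if (count > (size_t)(len - off))
        count = (size_t)(len - off);
    memcpy(buf, tmp + off, count);
    return (ssize_t)count;
}

/* Parse a decimal power state; accept only 0-3 written at offset 0. */
ssize_t demo_power_write(const char *buf, size_t count, off_t off, void *data)
{
    struct demo_dev *d = data;
    char tmp[16];
    char *end;
    unsigned long v;

    assert(d != NULL);
    if (off != 0 || count == 0 || count >= sizeof(tmp))
        return -EINVAL;
    memcpy(tmp, buf, count);
    tmp[count] = '\0';
    v = strtoul(tmp, &end, 10);
    if (end == tmp || v > 3)
        return -EINVAL;
    d->power_state = (unsigned int)v;
    return (ssize_t)count;
}

struct driverfs_operations demo_power_ops = {
    demo_power_read,
    demo_power_write,
};
```

In the kernel, a table like demo_power_ops and the per-device pointer would be the ops and data arguments passed to driverfs_create_entry(). Both handlers stay well within the one-page limit noted under "Limitations".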
diff --git a/Makefile b/Makefile
index 02dbfd5dd447..a62e69d0f39c 100644
--- a/Makefile
+++ b/Makefile
@@ -1,7 +1,7 @@
VERSION = 2
PATCHLEVEL = 5
SUBLEVEL = 1
-EXTRAVERSION =-pre10
+EXTRAVERSION =-pre11
KERNELRELEASE=$(VERSION).$(PATCHLEVEL).$(SUBLEVEL)$(EXTRAVERSION)
diff --git a/arch/i386/lib/iodebug.c b/arch/i386/lib/iodebug.c
index 701a07fe7229..3f74de6a05fa 100644
--- a/arch/i386/lib/iodebug.c
+++ b/arch/i386/lib/iodebug.c
@@ -9,11 +9,3 @@ void * __io_virt_debug(unsigned long x, const char *file, int line)
return (void *)x;
}
-unsigned long __io_phys_debug(unsigned long x, const char *file, int line)
-{
- if (x < PAGE_OFFSET) {
- printk("io mapaddr 0x%05lx not valid at %s:%d!\n", x, file, line);
- return x;
- }
- return __pa(x);
-}
diff --git a/drivers/block/cciss.c b/drivers/block/cciss.c
index 371755761a53..74aca5367813 100644
--- a/drivers/block/cciss.c
+++ b/drivers/block/cciss.c
@@ -1237,7 +1237,7 @@ queue:
blkdev_dequeue_request(creq);
- spin_unlock_irq(&q->queue_lock);
+ spin_unlock_irq(q->queue_lock);
c->cmd_type = CMD_RWREQ;
c->rq = creq;
@@ -1298,7 +1298,7 @@ queue:
c->Request.CDB[8]= creq->nr_sectors & 0xff;
c->Request.CDB[9] = c->Request.CDB[11] = c->Request.CDB[12] = 0;
- spin_lock_irq(&q->queue_lock);
+ spin_lock_irq(q->queue_lock);
addQ(&(h->reqQ),c);
h->Qdepth++;
@@ -1866,7 +1866,7 @@ static int __init cciss_init_one(struct pci_dev *pdev,
q = BLK_DEFAULT_QUEUE(MAJOR_NR + i);
q->queuedata = hba[i];
- blk_init_queue(q, do_cciss_request);
+ blk_init_queue(q, do_cciss_request, &hba[i]->lock);
blk_queue_bounce_limit(q, hba[i]->pdev->dma_mask);
blk_queue_max_segments(q, MAXSGENTRIES);
blk_queue_max_sectors(q, 512);
diff --git a/drivers/block/cciss.h b/drivers/block/cciss.h
index 357088d21918..03afe43dacf9 100644
--- a/drivers/block/cciss.h
+++ b/drivers/block/cciss.h
@@ -66,6 +66,7 @@ struct ctlr_info
unsigned int Qdepth;
unsigned int maxQsinceinit;
unsigned int maxSG;
+ spinlock_t lock;
//* pointers to command and error info pool */
CommandList_struct *cmd_pool;
@@ -242,7 +243,7 @@ struct board_type {
struct access_method *access;
};
-#define CCISS_LOCK(i) (&((BLK_DEFAULT_QUEUE(MAJOR_NR + i))->queue_lock))
+#define CCISS_LOCK(i) ((BLK_DEFAULT_QUEUE(MAJOR_NR + i))->queue_lock)
#endif /* CCISS_H */
diff --git a/drivers/block/cpqarray.c b/drivers/block/cpqarray.c
index 4ff77277d519..5f85cb0b5b6b 100644
--- a/drivers/block/cpqarray.c
+++ b/drivers/block/cpqarray.c
@@ -467,7 +467,7 @@ int __init cpqarray_init(void)
q = BLK_DEFAULT_QUEUE(MAJOR_NR + i);
q->queuedata = hba[i];
- blk_init_queue(q, do_ida_request);
+ blk_init_queue(q, do_ida_request, &hba[i]->lock);
blk_queue_bounce_limit(q, hba[i]->pci_dev->dma_mask);
blk_queue_max_segments(q, SG_MAX);
blksize_size[MAJOR_NR+i] = ida_blocksizes + (i*256);
@@ -882,7 +882,7 @@ queue_next:
blkdev_dequeue_request(creq);
- spin_unlock_irq(&q->queue_lock);
+ spin_unlock_irq(q->queue_lock);
c->ctlr = h->ctlr;
c->hdr.unit = MINOR(creq->rq_dev) >> NWD_SHIFT;
@@ -915,7 +915,7 @@ DBGPX( printk("Submitting %d sectors in %d segments\n", creq->nr_sectors, seg);
c->req.hdr.cmd = (rq_data_dir(creq) == READ) ? IDA_READ : IDA_WRITE;
c->type = CMD_RWREQ;
- spin_lock_irq(&q->queue_lock);
+ spin_lock_irq(q->queue_lock);
/* Put the request on the tail of the request queue */
addQ(&h->reqQ, c);
diff --git a/drivers/block/cpqarray.h b/drivers/block/cpqarray.h
index bdb8e4108f9c..80b4dba8b83e 100644
--- a/drivers/block/cpqarray.h
+++ b/drivers/block/cpqarray.h
@@ -106,6 +106,7 @@ struct ctlr_info {
cmdlist_t *cmd_pool;
dma_addr_t cmd_pool_dhandle;
__u32 *cmd_pool_bits;
+ spinlock_t lock;
unsigned int Qdepth;
unsigned int maxQsinceinit;
@@ -117,7 +118,7 @@ struct ctlr_info {
unsigned int misc_tflags;
};
-#define IDA_LOCK(i) (&((BLK_DEFAULT_QUEUE(MAJOR_NR + i))->queue_lock))
+#define IDA_LOCK(i) ((BLK_DEFAULT_QUEUE(MAJOR_NR + i))->queue_lock)
#endif
diff --git a/drivers/block/floppy.c b/drivers/block/floppy.c
index 897f3c886b45..2417023debaf 100644
--- a/drivers/block/floppy.c
+++ b/drivers/block/floppy.c
@@ -204,6 +204,8 @@ static int use_virtual_dma;
* record each buffers capabilities
*/
+static spinlock_t floppy_lock;
+
static unsigned short virtual_dma_port=0x3f0;
void floppy_interrupt(int irq, void *dev_id, struct pt_regs * regs);
static int set_dor(int fdc, char mask, char data);
@@ -2296,7 +2298,7 @@ static void request_done(int uptodate)
DRS->maxtrack = 1;
/* unlock chained buffers */
- spin_lock_irqsave(&QUEUE->queue_lock, flags);
+ spin_lock_irqsave(QUEUE->queue_lock, flags);
while (current_count_sectors && !QUEUE_EMPTY &&
current_count_sectors >= CURRENT->current_nr_sectors){
current_count_sectors -= CURRENT->current_nr_sectors;
@@ -2304,7 +2306,7 @@ static void request_done(int uptodate)
CURRENT->sector += CURRENT->current_nr_sectors;
end_request(1);
}
- spin_unlock_irqrestore(&QUEUE->queue_lock, flags);
+ spin_unlock_irqrestore(QUEUE->queue_lock, flags);
if (current_count_sectors && !QUEUE_EMPTY){
/* "unlock" last subsector */
@@ -2329,9 +2331,9 @@ static void request_done(int uptodate)
DRWE->last_error_sector = CURRENT->sector;
DRWE->last_error_generation = DRS->generation;
}
- spin_lock_irqsave(&QUEUE->queue_lock, flags);
+ spin_lock_irqsave(QUEUE->queue_lock, flags);
end_request(0);
- spin_unlock_irqrestore(&QUEUE->queue_lock, flags);
+ spin_unlock_irqrestore(QUEUE->queue_lock, flags);
}
}
@@ -2433,17 +2435,20 @@ static void rw_interrupt(void)
static int buffer_chain_size(void)
{
struct bio *bio;
- int size;
+ struct bio_vec *bv;
+ int size, i;
char *base;
- base = CURRENT->buffer;
+ base = bio_data(CURRENT->bio);
size = 0;
rq_for_each_bio(bio, CURRENT) {
- if (bio_data(bio) != base + size)
- break;
+ bio_for_each_segment(bv, bio, i) {
+ if (page_address(bv->bv_page) + bv->bv_offset != base + size)
+ break;
- size += bio->bi_size;
+ size += bv->bv_len;
+ }
}
return size >> 9;
@@ -2469,9 +2474,10 @@ static int transfer_size(int ssize, int max_sector, int max_size)
static void copy_buffer(int ssize, int max_sector, int max_sector_2)
{
int remaining; /* number of transferred 512-byte sectors */
+ struct bio_vec *bv;
struct bio *bio;
char *buffer, *dma_buffer;
- int size;
+ int size, i;
max_sector = transfer_size(ssize,
minimum(max_sector, max_sector_2),
@@ -2501,12 +2507,17 @@ static void copy_buffer(int ssize, int max_sector, int max_sector_2)
dma_buffer = floppy_track_buffer + ((fsector_t - buffer_min) << 9);
- bio = CURRENT->bio;
size = CURRENT->current_nr_sectors << 9;
- buffer = CURRENT->buffer;
- while (remaining > 0){
- SUPBOUND(size, remaining);
+ rq_for_each_bio(bio, CURRENT) {
+ bio_for_each_segment(bv, bio, i) {
+ if (!remaining)
+ break;
+
+ size = bv->bv_len;
+ SUPBOUND(size, remaining);
+
+ buffer = page_address(bv->bv_page) + bv->bv_offset;
#ifdef FLOPPY_SANITY_CHECK
if (dma_buffer + size >
floppy_track_buffer + (max_buffer_sectors << 10) ||
@@ -2526,24 +2537,14 @@ static void copy_buffer(int ssize, int max_sector, int max_sector_2)
if (((unsigned long)buffer) % 512)
DPRINT("%p buffer not aligned\n", buffer);
#endif
- if (CT(COMMAND) == FD_READ)
- memcpy(buffer, dma_buffer, size);
- else
- memcpy(dma_buffer, buffer, size);
- remaining -= size;
- if (!remaining)
- break;
+ if (CT(COMMAND) == FD_READ)
+ memcpy(buffer, dma_buffer, size);
+ else
+ memcpy(dma_buffer, buffer, size);
- dma_buffer += size;
- bio = bio->bi_next;
-#ifdef FLOPPY_SANITY_CHECK
- if (!bio){
- DPRINT("bh=null in copy buffer after copy\n");
- break;
+ remaining -= size;
+ dma_buffer += size;
}
-#endif
- size = bio->bi_size;
- buffer = bio_data(bio);
}
#ifdef FLOPPY_SANITY_CHECK
if (remaining){
@@ -4169,7 +4170,7 @@ int __init floppy_init(void)
blk_size[MAJOR_NR] = floppy_sizes;
blksize_size[MAJOR_NR] = floppy_blocksizes;
- blk_init_queue(BLK_DEFAULT_QUEUE(MAJOR_NR), DEVICE_REQUEST);
+ blk_init_queue(BLK_DEFAULT_QUEUE(MAJOR_NR), DEVICE_REQUEST, &floppy_lock);
reschedule_timeout(MAXTIMEOUT, "floppy init", MAXTIMEOUT);
config_types();
@@ -4477,6 +4478,7 @@ MODULE_LICENSE("GPL");
#else
__setup ("floppy=", floppy_setup);
+module_init(floppy_init)
/* eject the boot floppy (if we need the drive for a different root floppy) */
/* This should only be called at boot time when we're sure that there's no
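Reviewer note on the floppy.c hunks above: buffer_chain_size() and copy_buffer() now walk each bio segment by segment instead of treating a bio as one flat buffer. The following is a minimal user-space sketch of the contiguity check, not kernel code. The names model_seg, model_bio, model_chain_sectors and model_demo are hypothetical stand-ins for struct bio_vec, struct bio and the rq_for_each_bio()/bio_for_each_segment() iteration. The sketch returns at the first gap; in the hunk the `break` leaves only the inner segment loop, so whether the outer loop should also stop is worth checking against the full source.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one bio_vec: a virtual address plus a length. */
struct model_seg {
    char *addr;
    size_t len;
};

/* Hypothetical stand-in for a bio: an array of segments, chained via next. */
struct model_bio {
    const struct model_seg *segs;
    int nsegs;
    const struct model_bio *next;
};

/*
 * Count how many 512-byte sectors, starting at the first segment, lie
 * back to back in memory. Stops at the first segment that does not start
 * exactly where the previous one ended.
 */
static size_t model_chain_sectors(const struct model_bio *head)
{
    const struct model_bio *bio;
    char *base;
    size_t size = 0;
    int i;

    if (head == NULL || head->nsegs == 0)
        return 0;

    base = head->segs[0].addr;
    for (bio = head; bio != NULL; bio = bio->next) {
        for (i = 0; i < bio->nsegs; i++) {
            if (bio->segs[i].addr != base + size)
                return size >> 9;   /* gap found: leave both loops */
            size += bio->segs[i].len;
        }
    }
    return size >> 9;
}

/*
 * Build two chained bios over one static buffer. The first bio holds two
 * adjacent 512-byte segments; the second holds one segment that starts at
 * third_offset bytes into the buffer.
 */
static size_t model_demo(size_t third_offset)
{
    static char buf[2048];
    struct model_seg a[2] = { { buf, 512 }, { buf + 512, 512 } };
    struct model_seg b[1] = { { buf + third_offset, 512 } };
    struct model_bio second = { b, 1, NULL };
    struct model_bio first = { a, 2, &second };

    return model_chain_sectors(&first);
}
```

Calling model_demo(1024) places the third segment directly after the first two, while model_demo(1536) leaves a 512-byte gap, so the walk stops after the first bio.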
diff --git a/drivers/block/ll_rw_blk.c b/drivers/block/ll_rw_blk.c
index 048dcbdef1ca..9849061f045a 100644
--- a/drivers/block/ll_rw_blk.c
+++ b/drivers/block/ll_rw_blk.c
@@ -254,6 +254,12 @@ void blk_queue_segment_boundary(request_queue_t *q, unsigned long mask)
q->seg_boundary_mask = mask;
}
+void blk_queue_assign_lock(request_queue_t *q, spinlock_t *lock)
+{
+ spin_lock_init(lock);
+ q->queue_lock = lock;
+}
+
static char *rq_flags[] = { "REQ_RW", "REQ_RW_AHEAD", "REQ_BARRIER",
"REQ_CMD", "REQ_NOMERGE", "REQ_STARTED",
"REQ_DONTPREP", "REQ_DRIVE_CMD", "REQ_DRIVE_TASK",
@@ -536,9 +542,9 @@ void generic_unplug_device(void *data)
request_queue_t *q = (request_queue_t *) data;
unsigned long flags;
- spin_lock_irqsave(&q->queue_lock, flags);
+ spin_lock_irqsave(q->queue_lock, flags);
__generic_unplug_device(q);
- spin_unlock_irqrestore(&q->queue_lock, flags);
+ spin_unlock_irqrestore(q->queue_lock, flags);
}
static int __blk_cleanup_queue(struct request_list *list)
@@ -624,7 +630,6 @@ static int blk_init_free_list(request_queue_t *q)
init_waitqueue_head(&q->rq[READ].wait);
init_waitqueue_head(&q->rq[WRITE].wait);
- spin_lock_init(&q->queue_lock);
return 0;
nomem:
blk_cleanup_queue(q);
@@ -661,7 +666,7 @@ static int __make_request(request_queue_t *, struct bio *);
* blk_init_queue() must be paired with a blk_cleanup_queue() call
* when the block device is deactivated (such as at module unload).
**/
-int blk_init_queue(request_queue_t *q, request_fn_proc *rfn)
+int blk_init_queue(request_queue_t *q, request_fn_proc *rfn, spinlock_t *lock)
{
int ret;
@@ -682,6 +687,7 @@ int blk_init_queue(request_queue_t *q, request_fn_proc *rfn)
q->plug_tq.routine = &generic_unplug_device;
q->plug_tq.data = q;
q->queue_flags = (1 << QUEUE_FLAG_CLUSTER);
+ q->queue_lock = lock;
/*
* by default assume old behaviour and bounce for any highmem page
@@ -728,7 +734,7 @@ static struct request *get_request_wait(request_queue_t *q, int rw)
struct request_list *rl = &q->rq[rw];
struct request *rq;
- spin_lock_prefetch(&q->queue_lock);
+ spin_lock_prefetch(q->queue_lock);
generic_unplug_device(q);
add_wait_queue(&rl->wait, &wait);
@@ -736,9 +742,9 @@ static struct request *get_request_wait(request_queue_t *q, int rw)
set_current_state(TASK_UNINTERRUPTIBLE);
if (rl->count < batch_requests)
schedule();
- spin_lock_irq(&q->queue_lock);
+ spin_lock_irq(q->queue_lock);
rq = get_request(q, rw);
- spin_unlock_irq(&q->queue_lock);
+ spin_unlock_irq(q->queue_lock);
} while (rq == NULL);
remove_wait_queue(&rl->wait, &wait);
current->state = TASK_RUNNING;
@@ -949,9 +955,9 @@ void blk_attempt_remerge(request_queue_t *q, struct request *rq)
{
unsigned long flags;
- spin_lock_irqsave(&q->queue_lock, flags);
+ spin_lock_irqsave(q->queue_lock, flags);
__blk_attempt_remerge(q, rq);
- spin_unlock_irqrestore(&q->queue_lock, flags);
+ spin_unlock_irqrestore(q->queue_lock, flags);
}
static int __make_request(request_queue_t *q, struct bio *bio)
@@ -974,7 +980,7 @@ static int __make_request(request_queue_t *q, struct bio *bio)
*/
blk_queue_bounce(q, &bio);
- spin_lock_prefetch(&q->queue_lock);
+ spin_lock_prefetch(q->queue_lock);
latency = elevator_request_latency(elevator, rw);
barrier = test_bit(BIO_RW_BARRIER, &bio->bi_rw);
@@ -983,7 +989,7 @@ again:
req = NULL;
head = &q->queue_head;
- spin_lock_irq(&q->queue_lock);
+ spin_lock_irq(q->queue_lock);
insert_here = head->prev;
if (blk_queue_empty(q) || barrier) {
@@ -1066,7 +1072,7 @@ get_rq:
freereq = NULL;
} else if ((req = get_request(q, rw)) == NULL) {
- spin_unlock_irq(&q->queue_lock);
+ spin_unlock_irq(q->queue_lock);
/*
* READA bit set
@@ -1111,7 +1117,7 @@ get_rq:
out:
if (freereq)
blkdev_release_request(freereq);
- spin_unlock_irq(&q->queue_lock);
+ spin_unlock_irq(q->queue_lock);
return 0;
end_io:
@@ -1608,3 +1614,4 @@ EXPORT_SYMBOL(blk_nohighio);
EXPORT_SYMBOL(blk_dump_rq_flags);
EXPORT_SYMBOL(submit_bio);
EXPORT_SYMBOL(blk_contig_segment);
+EXPORT_SYMBOL(blk_queue_assign_lock);
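Reviewer note on the ll_rw_blk.c hunks above: queue_lock changes from a lock embedded in the queue to a pointer supplied by the driver, which is why every caller drops the `&` and why blk_init_queue() gains a third argument. Below is a minimal user-space sketch of that ownership change, assuming nothing about the kernel beyond what the hunks show. All names (model_lock, model_queue, model_init_queue, model_add_request, model_demo) are hypothetical, and the toy lock only tracks whether it is held. As far as these hunks go, blk_init_queue() just stores the pointer, while blk_queue_assign_lock() also calls spin_lock_init(); the sketch models the former.

```c
#include <assert.h>

/* Toy lock: records whether it is held and how often it was taken. */
struct model_lock {
    int held;
    int acquisitions;
};

static void model_lock_acquire(struct model_lock *l)
{
    assert(!l->held);   /* a real spinlock would spin here instead */
    l->held = 1;
    l->acquisitions++;
}

static void model_lock_release(struct model_lock *l)
{
    assert(l->held);
    l->held = 0;
}

/* Hypothetical queue: the lock is now a pointer, not an embedded member. */
struct model_queue {
    struct model_lock *queue_lock;
    int pending;
};

/* Mirrors the new three-argument shape: the caller provides the lock. */
static void model_init_queue(struct model_queue *q, struct model_lock *lock)
{
    q->pending = 0;
    q->queue_lock = lock;
}

static void model_add_request(struct model_queue *q)
{
    model_lock_acquire(q->queue_lock);   /* no '&': the field is a pointer */
    q->pending++;
    model_lock_release(q->queue_lock);
}

/* One driver-owned lock shared by two queues, as ide does with ide_lock. */
static struct model_lock driver_lock;
static struct model_queue queues[2];

static int model_demo(void)
{
    model_init_queue(&queues[0], &driver_lock);
    model_init_queue(&queues[1], &driver_lock);

    model_add_request(&queues[0]);
    model_add_request(&queues[1]);
    model_add_request(&queues[1]);

    return queues[0].pending + queues[1].pending;
}
```

The design point the sketch illustrates is that a driver can serialize its own state and its request queue with one lock, or share one lock across several queues, instead of juggling a per-queue lock plus a driver lock.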
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 22e5b4a60718..c16b6163af89 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -62,6 +62,8 @@ static u64 nbd_bytesizes[MAX_NBD];
static struct nbd_device nbd_dev[MAX_NBD];
static devfs_handle_t devfs_handle;
+static spinlock_t nbd_lock;
+
#define DEBUG( s )
/* #define DEBUG( s ) printk( s )
*/
@@ -347,22 +349,22 @@ static void do_nbd_request(request_queue_t * q)
#endif
req->errors = 0;
blkdev_dequeue_request(req);
- spin_unlock_irq(&q->queue_lock);
+ spin_unlock_irq(q->queue_lock);
down (&lo->queue_lock);
list_add(&req->queuelist, &lo->queue_head);
nbd_send_req(lo->sock, req); /* Why does this block? */
up (&lo->queue_lock);
- spin_lock_irq(&q->queue_lock);
+ spin_lock_irq(q->queue_lock);
continue;
error_out:
req->errors++;
blkdev_dequeue_request(req);
- spin_unlock(&q->queue_lock);
+ spin_unlock(q->queue_lock);
nbd_end_request(req);
- spin_lock(&q->queue_lock);
+ spin_lock(q->queue_lock);
}
return;
}
@@ -515,7 +517,7 @@ static int __init nbd_init(void)
#endif
blksize_size[MAJOR_NR] = nbd_blksizes;
blk_size[MAJOR_NR] = nbd_sizes;
- blk_init_queue(BLK_DEFAULT_QUEUE(MAJOR_NR), do_nbd_request);
+ blk_init_queue(BLK_DEFAULT_QUEUE(MAJOR_NR), do_nbd_request, &nbd_lock);
for (i = 0; i < MAX_NBD; i++) {
nbd_dev[i].refcnt = 0;
nbd_dev[i].file = NULL;
diff --git a/drivers/block/paride/pcd.c b/drivers/block/paride/pcd.c
index 61e50fec569c..9604464e325f 100644
--- a/drivers/block/paride/pcd.c
+++ b/drivers/block/paride/pcd.c
@@ -146,6 +146,8 @@ static int pcd_drive_count;
#include <asm/uaccess.h>
+static spinlock_t pcd_lock;
+
#ifndef MODULE
#include "setup.h"
@@ -355,7 +357,7 @@ int pcd_init (void) /* preliminary initialisation */
}
}
- blk_init_queue(BLK_DEFAULT_QUEUE(MAJOR_NR), DEVICE_REQUEST);
+ blk_init_queue(BLK_DEFAULT_QUEUE(MAJOR_NR), DEVICE_REQUEST, &pcd_lock);
read_ahead[MAJOR_NR] = 8; /* 8 sector (4kB) read ahead */
for (i=0;i<PCD_UNITS;i++) pcd_blocksizes[i] = 1024;
@@ -821,11 +823,11 @@ static void pcd_start( void )
if (pcd_command(unit,rd_cmd,2048,"read block")) {
pcd_bufblk = -1;
- spin_lock_irqsave(&QUEUE->queue_lock,saved_flags);
+ spin_lock_irqsave(&pcd_lock,saved_flags);
pcd_busy = 0;
end_request(0);
do_pcd_request(NULL);
- spin_unlock_irqrestore(&QUEUE->queue_lock,saved_flags);
+ spin_unlock_irqrestore(&pcd_lock,saved_flags);
return;
}
@@ -845,11 +847,11 @@ static void do_pcd_read( void )
pcd_retries = 0;
pcd_transfer();
if (!pcd_count) {
- spin_lock_irqsave(&QUEUE->queue_lock,saved_flags);
+ spin_lock_irqsave(&pcd_lock,saved_flags);
end_request(1);
pcd_busy = 0;
do_pcd_request(NULL);
- spin_unlock_irqrestore(&QUEUE->queue_lock,saved_flags);
+ spin_unlock_irqrestore(&pcd_lock,saved_flags);
return;
}
@@ -868,19 +870,19 @@ static void do_pcd_read_drq( void )
pi_do_claimed(PI,pcd_start);
return;
}
- spin_lock_irqsave(&QUEUE->queue_lock,saved_flags);
+ spin_lock_irqsave(&pcd_lock,saved_flags);
pcd_busy = 0;
pcd_bufblk = -1;
end_request(0);
do_pcd_request(NULL);
- spin_unlock_irqrestore(&QUEUE->queue_lock,saved_flags);
+ spin_unlock_irqrestore(&pcd_lock,saved_flags);
return;
}
do_pcd_read();
- spin_lock_irqsave(&QUEUE->queue_lock,saved_flags);
+ spin_lock_irqsave(&pcd_lock,saved_flags);
do_pcd_request(NULL);
- spin_unlock_irqrestore(&QUEUE->queue_lock,saved_flags);
+ spin_unlock_irqrestore(&pcd_lock,saved_flags);
}
/* the audio_ioctl stuff is adapted from sr_ioctl.c */
diff --git a/drivers/block/paride/pf.c b/drivers/block/paride/pf.c
index d659bbe4408a..e49565417eda 100644
--- a/drivers/block/paride/pf.c
+++ b/drivers/block/paride/pf.c
@@ -164,6 +164,8 @@ static int pf_drive_count;
#include <asm/uaccess.h>
+static spinlock_t pf_spin_lock;
+
#ifndef MODULE
#include "setup.h"
@@ -358,7 +360,7 @@ int pf_init (void) /* preliminary initialisation */
return -1;
}
q = BLK_DEFAULT_QUEUE(MAJOR_NR);
- blk_init_queue(q, DEVICE_REQUEST);
+ blk_init_queue(q, DEVICE_REQUEST, &pf_spin_lock);
blk_queue_max_segments(q, cluster);
read_ahead[MAJOR_NR] = 8; /* 8 sector (4kB) read ahead */
@@ -876,9 +878,9 @@ static void pf_next_buf( int unit )
{ long saved_flags;
- spin_lock_irqsave(&QUEUE->queue_lock,saved_flags);
+ spin_lock_irqsave(&pf_spin_lock,saved_flags);
end_request(1);
- if (!pf_run) { spin_unlock_irqrestore(&QUEUE->queue_lock,saved_flags);
+ if (!pf_run) { spin_unlock_irqrestore(&pf_spin_lock,saved_flags);
return;
}
@@ -894,7 +896,7 @@ static void pf_next_buf( int unit )
pf_count = CURRENT->current_nr_sectors;
pf_buf = CURRENT->buffer;
- spin_unlock_irqrestore(&QUEUE->queue_lock,saved_flags);
+ spin_unlock_irqrestore(&pf_spin_lock,saved_flags);
}
static void do_pf_read( void )
@@ -918,11 +920,11 @@ static void do_pf_read_start( void )
pi_do_claimed(PI,do_pf_read_start);
return;
}
- spin_lock_irqsave(&QUEUE->queue_lock,saved_flags);
+ spin_lock_irqsave(&pf_spin_lock,saved_flags);
end_request(0);
pf_busy = 0;
do_pf_request(NULL);
- spin_unlock_irqrestore(&QUEUE->queue_lock,saved_flags);
+ spin_unlock_irqrestore(&pf_spin_lock,saved_flags);
return;
}
pf_mask = STAT_DRQ;
@@ -944,11 +946,11 @@ static void do_pf_read_drq( void )
pi_do_claimed(PI,do_pf_read_start);
return;
}
- spin_lock_irqsave(&QUEUE->queue_lock,saved_flags);
+ spin_lock_irqsave(&pf_spin_lock,saved_flags);
end_request(0);
pf_busy = 0;
do_pf_request(NULL);
- spin_unlock_irqrestore(&QUEUE->queue_lock,saved_flags);
+ spin_unlock_irqrestore(&pf_spin_lock,saved_flags);
return;
}
pi_read_block(PI,pf_buf,512);
@@ -959,11 +961,11 @@ static void do_pf_read_drq( void )
if (!pf_count) pf_next_buf(unit);
}
pi_disconnect(PI);
- spin_lock_irqsave(&QUEUE->queue_lock,saved_flags);
+ spin_lock_irqsave(&pf_spin_lock,saved_flags);
end_request(1);
pf_busy = 0;
do_pf_request(NULL);
- spin_unlock_irqrestore(&QUEUE->queue_lock,saved_flags);
+ spin_unlock_irqrestore(&pf_spin_lock,saved_flags);
}
static void do_pf_write( void )
@@ -985,11 +987,11 @@ static void do_pf_write_start( void )
pi_do_claimed(PI,do_pf_write_start);
return;
}
- spin_lock_irqsave(&QUEUE->queue_lock,saved_flags);
+ spin_lock_irqsave(&pf_spin_lock,saved_flags);
end_request(0);
pf_busy = 0;
do_pf_request(NULL);
- spin_unlock_irqrestore(&QUEUE->queue_lock,saved_flags);
+ spin_unlock_irqrestore(&pf_spin_lock,saved_flags);
return;
}
@@ -1002,11 +1004,11 @@ static void do_pf_write_start( void )
pi_do_claimed(PI,do_pf_write_start);
return;
}
- spin_lock_irqsave(&QUEUE->queue_lock,saved_flags);
+ spin_lock_irqsave(&pf_spin_lock,saved_flags);
end_request(0);
pf_busy = 0;
do_pf_request(NULL);
- spin_unlock_irqrestore(&QUEUE->queue_lock,saved_flags);
+ spin_unlock_irqrestore(&pf_spin_lock,saved_flags);
return;
}
pi_write_block(PI,pf_buf,512);
@@ -1032,19 +1034,19 @@ static void do_pf_write_done( void )
pi_do_claimed(PI,do_pf_write_start);
return;
}
- spin_lock_irqsave(&QUEUE->queue_lock,saved_flags);
+ spin_lock_irqsave(&pf_spin_lock,saved_flags);
end_request(0);
pf_busy = 0;
do_pf_request(NULL);
- spin_unlock_irqrestore(&QUEUE->queue_lock,saved_flags);
+ spin_unlock_irqrestore(&pf_spin_lock,saved_flags);
return;
}
pi_disconnect(PI);
- spin_lock_irqsave(&QUEUE->queue_lock,saved_flags);
+ spin_lock_irqsave(&pf_spin_lock,saved_flags);
end_request(1);
pf_busy = 0;
do_pf_request(NULL);
- spin_unlock_irqrestore(&QUEUE->queue_lock,saved_flags);
+ spin_unlock_irqrestore(&pf_spin_lock,saved_flags);
}
/* end of pf.c */
diff --git a/drivers/block/ps2esdi.c b/drivers/block/ps2esdi.c
index 01c8805b83fc..b248b437bf74 100644
--- a/drivers/block/ps2esdi.c
+++ b/drivers/block/ps2esdi.c
@@ -189,6 +189,8 @@ int __init ps2esdi_init(void)
return 0;
} /* ps2esdi_init */
+module_init(ps2esdi_init);
+
#ifdef MODULE
static int cyl[MAX_HD] = {-1,-1};
diff --git a/drivers/block/rd.c b/drivers/block/rd.c
index b2135fc5b319..c6fb55df1682 100644
--- a/drivers/block/rd.c
+++ b/drivers/block/rd.c
@@ -44,9 +44,6 @@
#include <linux/config.h>
#include <linux/sched.h>
-#include <linux/minix_fs.h>
-#include <linux/ext2_fs.h>
-#include <linux/romfs_fs.h>
#include <linux/fs.h>
#include <linux/kernel.h>
#include <linux/hdreg.h>
@@ -79,19 +76,10 @@ extern void wait_for_keypress(void);
/* The RAM disk size is now a parameter */
#define NUM_RAMDISKS 16 /* This cannot be overridden (yet) */
-#ifndef MODULE
-/* We don't have to load RAM disks or gunzip them in a module. */
-#define RD_LOADER
-#define BUILD_CRAMDISK
-
-void rd_load(void);
-static int crd_load(struct file *fp, struct file *outfp);
-
#ifdef CONFIG_BLK_DEV_INITRD
static int initrd_users;
static spinlock_t initrd_users_lock = SPIN_LOCK_UNLOCKED;
#endif
-#endif
/* Various static variables go here. Most are used only in the RAM disk code.
*/
@@ -542,6 +530,8 @@ int __init rd_init (void)
#ifdef CONFIG_BLK_DEV_INITRD
/* We ought to separate initrd operations here */
register_disk(NULL, MKDEV(MAJOR_NR,INITRD_MINOR), 1, &rd_bd_op, rd_size<<1);
+ devfs_register(devfs_handle, "initrd", DEVFS_FL_DEFAULT, MAJOR_NR,
+ INITRD_MINOR, S_IFBLK | S_IRUSR, &rd_bd_op, NULL);
#endif
blksize_size[MAJOR_NR] = rd_blocksizes; /* Avoid set_blocksize() check */
@@ -565,462 +555,3 @@ MODULE_PARM (rd_blocksize, "i");
MODULE_PARM_DESC(rd_blocksize, "Blocksize of each RAM disk in bytes.");
MODULE_LICENSE("GPL");
-
-/* End of non-loading portions of the RAM disk driver */
-
-#ifdef RD_LOADER
-/*
- * This routine tries to find a RAM disk image to load, and returns the
- * number of blocks to read for a non-compressed image, 0 if the image
- * is a compressed image, and -1 if an image with the right magic
- * numbers could not be found.
- *
- * We currently check for the following magic numbers:
- * minix
- * ext2
- * romfs
- * gzip
- */
-static int __init
-identify_ramdisk_image(kdev_t device, struct file *fp, int start_block)
-{
- const int size = 512;
- struct minix_super_block *minixsb;
- struct ext2_super_block *ext2sb;
- struct romfs_super_block *romfsb;
- int nblocks = -1;
- unsigned char *buf;
-
- buf = kmalloc(size, GFP_KERNEL);
- if (buf == 0)
- return -1;
-
- minixsb = (struct minix_super_block *) buf;
- ext2sb = (struct ext2_super_block *) buf;
- romfsb = (struct romfs_super_block *) buf;
- memset(buf, 0xe5, size);
-
- /*
- * Read block 0 to test for gzipped kernel
- */
- if (fp->f_op->llseek)
- fp->f_op->llseek(fp, start_block * BLOCK_SIZE, 0);
- fp->f_pos = start_block * BLOCK_SIZE;
-
- fp->f_op->read(fp, buf, size, &fp->f_pos);
-
- /*
- * If it matches the gzip magic numbers, return -1
- */
- if (buf[0] == 037 && ((buf[1] == 0213) || (buf[1] == 0236))) {
- printk(KERN_NOTICE
- "RAMDISK: Compressed image found at block %d\n",
- start_block);
- nblocks = 0;
- goto done;
- }
-
- /* romfs is at block zero too */
- if (romfsb->word0 == ROMSB_WORD0 &&
- romfsb->word1 == ROMSB_WORD1) {
- printk(KERN_NOTICE
- "RAMDISK: romfs filesystem found at block %d\n",
- start_block);
- nblocks = (ntohl(romfsb->size)+BLOCK_SIZE-1)>>BLOCK_SIZE_BITS;
- goto done;
- }
-
- /*
- * Read block 1 to test for minix and ext2 superblock
- */
- if (fp->f_op->llseek)
- fp->f_op->llseek(fp, (start_block+1) * BLOCK_SIZE, 0);
- fp->f_pos = (start_block+1) * BLOCK_SIZE;
-
- fp->f_op->read(fp, buf, size, &fp->f_pos);
-
- /* Try minix */
- if (minixsb->s_magic == MINIX_SUPER_MAGIC ||
- minixsb->s_magic == MINIX_SUPER_MAGIC2) {
- printk(KERN_NOTICE
- "RAMDISK: Minix filesystem found at block %d\n",
- start_block);
- nblocks = minixsb->s_nzones << minixsb->s_log_zone_size;
- goto done;
- }
-
- /* Try ext2 */
- if (ext2sb->s_magic == cpu_to_le16(EXT2_SUPER_MAGIC)) {
- printk(KERN_NOTICE
- "RAMDISK: ext2 filesystem found at block %d\n",
- start_block);
- nblocks = le32_to_cpu(ext2sb->s_blocks_count);
- goto done;
- }
-
- printk(KERN_NOTICE
- "RAMDISK: Couldn't find valid RAM disk image starting at %d.\n",
- start_block);
-
-done:
- if (fp->f_op->llseek)
- fp->f_op->llseek(fp, start_block * BLOCK_SIZE, 0);
- fp->f_pos = start_block * BLOCK_SIZE;
-
- kfree(buf);
- return nblocks;
-}
-
-/*
- * This routine loads in the RAM disk image.
- */
-static void __init rd_load_image(kdev_t device, int offset, int unit)
-{
- struct inode *inode, *out_inode;
- struct file infile, outfile;
- struct dentry in_dentry, out_dentry;
- mm_segment_t fs;
- kdev_t ram_device;
- int nblocks, i;
- char *buf;
- unsigned short rotate = 0;
- unsigned short devblocks = 0;
-#if !defined(CONFIG_ARCH_S390) && !defined(CONFIG_PPC_ISERIES)
- char rotator[4] = { '|' , '/' , '-' , '\\' };
-#endif
- ram_device = MKDEV(MAJOR_NR, unit);
-
- if ((inode = get_empty_inode()) == NULL)
- return;
- memset(&infile, 0, sizeof(infile));
- memset(&in_dentry, 0, sizeof(in_dentry));
- infile.f_mode = 1; /* read only */
- infile.f_dentry = &in_dentry;
- in_dentry.d_inode = inode;
- infile.f_op = &def_blk_fops;
- init_special_inode(inode, S_IFBLK | S_IRUSR, kdev_t_to_nr(device));
-
- if ((out_inode = get_empty_inode()) == NULL)
- goto free_inode;
- memset(&outfile, 0, sizeof(outfile));
- memset(&out_dentry, 0, sizeof(out_dentry));
- outfile.f_mode = 3; /* read/write */
- outfile.f_dentry = &out_dentry;
- out_dentry.d_inode = out_inode;
- outfile.f_op = &def_blk_fops;
- init_special_inode(out_inode, S_IFBLK | S_IRUSR | S_IWUSR, kdev_t_to_nr(ram_device));
-
- if (blkdev_open(inode, &infile) != 0) {
- iput(out_inode);
- goto free_inode;
- }
- if (blkdev_open(out_inode, &outfile) != 0)
- goto free_inodes;
-
- fs = get_fs();
- set_fs(KERNEL_DS);
-
- nblocks = identify_ramdisk_image(device, &infile, offset);
- if (nblocks < 0)
- goto done;
-
- if (nblocks == 0) {
-#ifdef BUILD_CRAMDISK
- if (crd_load(&infile, &outfile) == 0)
- goto successful_load;
-#else
- printk(KERN_NOTICE
- "RAMDISK: Kernel does not support compressed "
- "RAM disk images\n");
-#endif
- goto done;
- }
-
- /*
- * NOTE NOTE: nblocks suppose that the blocksize is BLOCK_SIZE, so
- * rd_load_image will work only with filesystem BLOCK_SIZE wide!
- * So make sure to use 1k blocksize while generating ext2fs
- * ramdisk-images.
- */
- if (nblocks > (rd_length[unit] >> BLOCK_SIZE_BITS)) {
- printk("RAMDISK: image too big! (%d/%ld blocks)\n",
- nblocks, rd_length[unit] >> BLOCK_SIZE_BITS);
- goto done;
- }
-
- /*
- * OK, time to copy in the data
- */
- buf = kmalloc(BLOCK_SIZE, GFP_KERNEL);
- if (buf == 0) {
- printk(KERN_ERR "RAMDISK: could not allocate buffer\n");
- goto done;
- }
-
- if (blk_size[MAJOR(device)])
- devblocks = blk_size[MAJOR(device)][MINOR(device)];
-
-#ifdef CONFIG_BLK_DEV_INITRD
- if (MAJOR(device) == MAJOR_NR && MINOR(device) == INITRD_MINOR)
- devblocks = nblocks;
-#endif
-
- if (devblocks == 0) {
- printk(KERN_ERR "RAMDISK: could not determine device size\n");
- goto done;
- }
-
- printk(KERN_NOTICE "RAMDISK: Loading %d blocks [%d disk%s] into ram disk... ",
- nblocks, ((nblocks-1)/devblocks)+1, nblocks>devblocks ? "s" : "");
- for (i=0; i < nblocks; i++) {
- if (i && (i % devblocks == 0)) {
- printk("done disk #%d.\n", i/devblocks);
- rotate = 0;
- if (infile.f_op->release(inode, &infile) != 0) {
- printk("Error closing the disk.\n");
- goto noclose_input;
- }
- printk("Please insert disk #%d and press ENTER\n", i/devblocks+1);
- wait_for_keypress();
- if (blkdev_open(inode, &infile) != 0) {
- printk("Error opening disk.\n");
- goto noclose_input;
- }
- infile.f_pos = 0;
- printk("Loading disk #%d... ", i/devblocks+1);
- }
- infile.f_op->read(&infile, buf, BLOCK_SIZE, &infile.f_pos);
- outfile.f_op->write(&outfile, buf, BLOCK_SIZE, &outfile.f_pos);
-#if !defined(CONFIG_ARCH_S390) && !defined(CONFIG_PPC_ISERIES)
- if (!(i % 16)) {
- printk("%c\b", rotator[rotate & 0x3]);
- rotate++;
- }
-#endif
- }
- printk("done.\n");
- kfree(buf);
-
-successful_load:
- ROOT_DEV = MKDEV(MAJOR_NR, unit);
- if (ROOT_DEVICE_NAME != NULL) strcpy (ROOT_DEVICE_NAME, "rd/0");
-
-done:
- infile.f_op->release(inode, &infile);
-noclose_input:
- blkdev_close(out_inode, &outfile);
- iput(inode);
- iput(out_inode);
- set_fs(fs);
- return;
-free_inodes: /* free inodes on error */
- iput(out_inode);
- infile.f_op->release(inode, &infile);
-free_inode:
- iput(inode);
-}
-
-#ifdef CONFIG_MAC_FLOPPY
-int swim3_fd_eject(int devnum);
-#endif
-
-static void __init rd_load_disk(int n)
-{
-
- if (rd_doload == 0)
- return;
-
- if (MAJOR(ROOT_DEV) != FLOPPY_MAJOR
-#ifdef CONFIG_BLK_DEV_INITRD
- && MAJOR(real_root_dev) != FLOPPY_MAJOR
-#endif
- )
- return;
-
- if (rd_prompt) {
-#ifdef CONFIG_BLK_DEV_FD
- floppy_eject();
-#endif
-#ifdef CONFIG_MAC_FLOPPY
- if(MAJOR(ROOT_DEV) == FLOPPY_MAJOR)
- swim3_fd_eject(MINOR(ROOT_DEV));
- else if(MAJOR(real_root_dev) == FLOPPY_MAJOR)
- swim3_fd_eject(MINOR(real_root_dev));
-#endif
- printk(KERN_NOTICE
- "VFS: Insert root floppy disk to be loaded into RAM disk and press ENTER\n");
- wait_for_keypress();
- }
-
- rd_load_image(ROOT_DEV,rd_image_start, n);
-
-}
-
-void __init rd_load(void)
-{
- rd_load_disk(0);
-}
-
-void __init rd_load_secondary(void)
-{
- rd_load_disk(1);
-}
-
-#ifdef CONFIG_BLK_DEV_INITRD
-void __init initrd_load(void)
-{
- rd_load_image(MKDEV(MAJOR_NR, INITRD_MINOR),rd_image_start,0);
-}
-#endif
-
-#endif /* RD_LOADER */
-
-#ifdef BUILD_CRAMDISK
-
-/*
- * gzip declarations
- */
-
-#define OF(args) args
-
-#ifndef memzero
-#define memzero(s, n) memset ((s), 0, (n))
-#endif
-
-typedef unsigned char uch;
-typedef unsigned short ush;
-typedef unsigned long ulg;
-
-#define INBUFSIZ 4096
-#define WSIZE 0x8000 /* window size--must be a power of two, and */
- /* at least 32K for zip's deflate method */
-
-static uch *inbuf;
-static uch *window;
-
-static unsigned insize; /* valid bytes in inbuf */
-static unsigned inptr; /* index of next byte to be processed in inbuf */
-static unsigned outcnt; /* bytes in output buffer */
-static int exit_code;
-static long bytes_out;
-static struct file *crd_infp, *crd_outfp;
-
-#define get_byte() (inptr < insize ? inbuf[inptr++] : fill_inbuf())
-
-/* Diagnostic functions (stubbed out) */
-#define Assert(cond,msg)
-#define Trace(x)
-#define Tracev(x)
-#define Tracevv(x)
-#define Tracec(c,x)
-#define Tracecv(c,x)
-
-#define STATIC static
-
-static int fill_inbuf(void);
-static void flush_window(void);
-static void *malloc(int size);
-static void free(void *where);
-static void error(char *m);
-static void gzip_mark(void **);
-static void gzip_release(void **);
-
-#include "../../lib/inflate.c"
-
-static void __init *malloc(int size)
-{
- return kmalloc(size, GFP_KERNEL);
-}
-
-static void __init free(void *where)
-{
- kfree(where);
-}
-
-static void __init gzip_mark(void **ptr)
-{
-}
-
-static void __init gzip_release(void **ptr)
-{
-}
-
-
-/* ===========================================================================
- * Fill the input buffer. This is called only when the buffer is empty
- * and at least one byte is really needed.
- */
-static int __init fill_inbuf(void)
-{
- if (exit_code) return -1;
-
- insize = crd_infp->f_op->read(crd_infp, inbuf, INBUFSIZ,
- &crd_infp->f_pos);
- if (insize == 0) return -1;
-
- inptr = 1;
-
- return inbuf[0];
-}
-
-/* ===========================================================================
- * Write the output window window[0..outcnt-1] and update crc and bytes_out.
- * (Used for the decompressed data only.)
- */
-static void __init flush_window(void)
-{
- ulg c = crc; /* temporary variable */
- unsigned n;
- uch *in, ch;
-
- crd_outfp->f_op->write(crd_outfp, window, outcnt, &crd_outfp->f_pos);
- in = window;
- for (n = 0; n < outcnt; n++) {
- ch = *in++;
- c = crc_32_tab[((int)c ^ ch) & 0xff] ^ (c >> 8);
- }
- crc = c;
- bytes_out += (ulg)outcnt;
- outcnt = 0;
-}
-
-static void __init error(char *x)
-{
- printk(KERN_ERR "%s", x);
- exit_code = 1;
-}
-
-static int __init
-crd_load(struct file * fp, struct file *outfp)
-{
- int result;
-
- insize = 0; /* valid bytes in inbuf */
- inptr = 0; /* index of next byte to be processed in inbuf */
- outcnt = 0; /* bytes in output buffer */
- exit_code = 0;
- bytes_out = 0;
- crc = (ulg)0xffffffffL; /* shift register contents */
-
- crd_infp = fp;
- crd_outfp = outfp;
- inbuf = kmalloc(INBUFSIZ, GFP_KERNEL);
- if (inbuf == 0) {
- printk(KERN_ERR "RAMDISK: Couldn't allocate gzip buffer\n");
- return -1;
- }
- window = kmalloc(WSIZE, GFP_KERNEL);
- if (window == 0) {
- printk(KERN_ERR "RAMDISK: Couldn't allocate gzip window\n");
- kfree(inbuf);
- return -1;
- }
- makecrc();
- result = gunzip();
- kfree(inbuf);
- kfree(window);
- return result;
-}
-
-#endif /* BUILD_CRAMDISK */
-
diff --git a/drivers/ide/ide-probe.c b/drivers/ide/ide-probe.c
index 82123b2e0573..6201c2d1600d 100644
--- a/drivers/ide/ide-probe.c
+++ b/drivers/ide/ide-probe.c
@@ -597,7 +597,7 @@ static void ide_init_queue(ide_drive_t *drive)
int max_sectors;
q->queuedata = HWGROUP(drive);
- blk_init_queue(q, do_ide_request);
+ blk_init_queue(q, do_ide_request, &ide_lock);
blk_queue_segment_boundary(q, 0xffff);
/* IDE can do up to 128K per request, pdc4030 needs smaller limit */
diff --git a/drivers/ide/ide.c b/drivers/ide/ide.c
index 8a941308ab14..c1b19e1d9255 100644
--- a/drivers/ide/ide.c
+++ b/drivers/ide/ide.c
@@ -177,8 +177,6 @@ static int initializing; /* set while initializing built-in drivers */
/*
* protects global structures etc, we want to split this into per-hwgroup
* instead.
- *
- * anti-deadlock ordering: ide_lock -> DRIVE_LOCK
*/
spinlock_t ide_lock __cacheline_aligned = SPIN_LOCK_UNLOCKED;
@@ -583,11 +581,9 @@ inline int __ide_end_request(ide_hwgroup_t *hwgroup, int uptodate, int nr_secs)
if (!end_that_request_first(rq, uptodate, nr_secs)) {
add_blkdev_randomness(MAJOR(rq->rq_dev));
- spin_lock(DRIVE_LOCK(drive));
blkdev_dequeue_request(rq);
hwgroup->rq = NULL;
end_that_request_last(rq);
- spin_unlock(DRIVE_LOCK(drive));
ret = 0;
}
@@ -900,11 +896,9 @@ void ide_end_drive_cmd (ide_drive_t *drive, byte stat, byte err)
}
}
- spin_lock(DRIVE_LOCK(drive));
blkdev_dequeue_request(rq);
HWGROUP(drive)->rq = NULL;
end_that_request_last(rq);
- spin_unlock(DRIVE_LOCK(drive));
spin_unlock_irqrestore(&ide_lock, flags);
}
@@ -1368,7 +1362,7 @@ repeat:
/*
* Issue a new request to a drive from hwgroup
- * Caller must have already done spin_lock_irqsave(DRIVE_LOCK(drive), ...)
+ * Caller must have already done spin_lock_irqsave(&ide_lock, ...)
*
* A hwgroup is a serialized group of IDE interfaces. Usually there is
* exactly one hwif (interface) per hwgroup, but buggy controllers (eg. CMD640)
@@ -1456,9 +1450,7 @@ static void ide_do_request(ide_hwgroup_t *hwgroup, int masked_irq)
/*
* just continuing an interrupted request maybe
*/
- spin_lock(DRIVE_LOCK(drive));
rq = hwgroup->rq = elv_next_request(&drive->queue);
- spin_unlock(DRIVE_LOCK(drive));
/*
* Some systems have trouble with IDE IRQs arriving while
@@ -1496,19 +1488,7 @@ request_queue_t *ide_get_queue (kdev_t dev)
*/
void do_ide_request(request_queue_t *q)
{
- unsigned long flags;
-
- /*
- * release queue lock, grab IDE global lock and restore when
- * we leave...
- */
- spin_unlock(&q->queue_lock);
-
- spin_lock_irqsave(&ide_lock, flags);
ide_do_request(q->queuedata, 0);
- spin_unlock_irqrestore(&ide_lock, flags);
-
- spin_lock(&q->queue_lock);
}
/*
@@ -1875,7 +1855,6 @@ int ide_do_drive_cmd (ide_drive_t *drive, struct request *rq, ide_action_t actio
if (action == ide_wait)
rq->waiting = &wait;
spin_lock_irqsave(&ide_lock, flags);
- spin_lock(DRIVE_LOCK(drive));
if (blk_queue_empty(&drive->queue) || action == ide_preempt) {
if (action == ide_preempt)
hwgroup->rq = NULL;
@@ -1886,7 +1865,6 @@ int ide_do_drive_cmd (ide_drive_t *drive, struct request *rq, ide_action_t actio
queue_head = queue_head->next;
}
q->elevator.elevator_add_req_fn(q, rq, queue_head);
- spin_unlock(DRIVE_LOCK(drive));
ide_do_request(hwgroup, 0);
spin_unlock_irqrestore(&ide_lock, flags);
if (action == ide_wait) {
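Reviewer note (not part of the patch): the ide.c hunks above drop DRIVE_LOCK and the unlock/relock sequence in do_ide_request(), and ide-probe.c now passes &ide_lock to blk_init_queue(). Read together, they suggest the request function is entered with ide_lock already held. The userspace model below sketches that calling convention only. Every toy_* name is hypothetical, the "lock" is a plain flag rather than a spinlock, and nothing here is the kernel's actual block-layer code.

```c
#include <stddef.h>

/* Toy stand-in for a spinlock: it only records whether it is held. */
struct toy_lock {
	int held;
};

static void toy_lock_acquire(struct toy_lock *l) { l->held = 1; }
static void toy_lock_release(struct toy_lock *l) { l->held = 0; }

struct toy_queue;
typedef void (*toy_request_fn)(struct toy_queue *q);

/* Hypothetical queue: the driver supplies both the handler and the lock. */
struct toy_queue {
	toy_request_fn request_fn;
	struct toy_lock *lock;
	int lock_was_held;	/* set by the handler, for inspection */
};

/* Models the three-argument init call: the queue remembers the driver's lock. */
static void toy_init_queue(struct toy_queue *q, toy_request_fn fn,
			   struct toy_lock *lock)
{
	q->request_fn = fn;
	q->lock = lock;
	q->lock_was_held = 0;
}

/* "Block layer" side: take the shared lock, then call into the driver. */
static void toy_run_queue(struct toy_queue *q)
{
	toy_lock_acquire(q->lock);
	q->request_fn(q);
	toy_lock_release(q->lock);
}

/* One global lock, in the role ide_lock plays in the patch. */
static struct toy_lock toy_ide_lock;

/*
 * "Driver" side: because the queue lock and the driver lock are the same
 * object, the handler does not drop one lock and take another. It only
 * records that its precondition (lock held on entry) was met.
 */
static void toy_do_request(struct toy_queue *q)
{
	q->lock_was_held = q->lock->held;
}
```

Calling toy_init_queue(&q, toy_do_request, &toy_ide_lock) and then toy_run_queue(&q) leaves q.lock_was_held set and the lock released again. Sharing one lock removes the window in which the old do_ide_request() held neither lock.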
diff --git a/drivers/md/linear.c b/drivers/md/linear.c
index c40dd3a1b58d..b65a67357e1d 100644
--- a/drivers/md/linear.c
+++ b/drivers/md/linear.c
@@ -189,7 +189,7 @@ static mdk_personality_t linear_personality=
status: linear_status,
};
-static int md__init linear_init (void)
+static int __init linear_init (void)
{
return register_md_personality (LINEAR, &linear_personality);
}
diff --git a/drivers/md/md.c b/drivers/md/md.c
index d474faf734c3..ae96e8648a57 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -130,7 +130,7 @@ static struct gendisk md_gendisk=
/*
* Enables to iterate over all existing md arrays
*/
-static MD_LIST_HEAD(all_mddevs);
+static LIST_HEAD(all_mddevs);
/*
* The mapping between kdev and mddev is not necessary a simple
@@ -201,8 +201,8 @@ static mddev_t * alloc_mddev(kdev_t dev)
init_MUTEX(&mddev->reconfig_sem);
init_MUTEX(&mddev->recovery_sem);
init_MUTEX(&mddev->resync_sem);
- MD_INIT_LIST_HEAD(&mddev->disks);
- MD_INIT_LIST_HEAD(&mddev->all_mddevs);
+ INIT_LIST_HEAD(&mddev->disks);
+ INIT_LIST_HEAD(&mddev->all_mddevs);
atomic_set(&mddev->active, 0);
/*
@@ -211,7 +211,7 @@ static mddev_t * alloc_mddev(kdev_t dev)
* if necessary.
*/
add_mddev_mapping(mddev, dev, 0);
- md_list_add(&mddev->all_mddevs, &all_mddevs);
+ list_add(&mddev->all_mddevs, &all_mddevs);
MOD_INC_USE_COUNT;
@@ -221,7 +221,7 @@ static mddev_t * alloc_mddev(kdev_t dev)
mdk_rdev_t * find_rdev_nr(mddev_t *mddev, int nr)
{
mdk_rdev_t * rdev;
- struct md_list_head *tmp;
+ struct list_head *tmp;
ITERATE_RDEV(mddev,rdev,tmp) {
if (rdev->desc_nr == nr)
@@ -232,7 +232,7 @@ mdk_rdev_t * find_rdev_nr(mddev_t *mddev, int nr)
mdk_rdev_t * find_rdev(mddev_t * mddev, kdev_t dev)
{
- struct md_list_head *tmp;
+ struct list_head *tmp;
mdk_rdev_t *rdev;
ITERATE_RDEV(mddev,rdev,tmp) {
@@ -242,17 +242,17 @@ mdk_rdev_t * find_rdev(mddev_t * mddev, kdev_t dev)
return NULL;
}
-static MD_LIST_HEAD(device_names);
+static LIST_HEAD(device_names);
char * partition_name(kdev_t dev)
{
struct gendisk *hd;
static char nomem [] = "<nomem>";
dev_name_t *dname;
- struct md_list_head *tmp = device_names.next;
+ struct list_head *tmp = device_names.next;
while (tmp != &device_names) {
- dname = md_list_entry(tmp, dev_name_t, list);
+ dname = list_entry(tmp, dev_name_t, list);
if (dname->dev == dev)
return dname->name;
tmp = tmp->next;
@@ -275,8 +275,8 @@ char * partition_name(kdev_t dev)
}
dname->dev = dev;
- MD_INIT_LIST_HEAD(&dname->list);
- md_list_add(&dname->list, &device_names);
+ INIT_LIST_HEAD(&dname->list);
+ list_add(&dname->list, &device_names);
return dname->name;
}
@@ -311,7 +311,7 @@ static unsigned int zoned_raid_size(mddev_t *mddev)
{
unsigned int mask;
mdk_rdev_t * rdev;
- struct md_list_head *tmp;
+ struct list_head *tmp;
if (!mddev->sb) {
MD_BUG();
@@ -341,7 +341,7 @@ int md_check_ordering(mddev_t *mddev)
{
int i, c;
mdk_rdev_t *rdev;
- struct md_list_head *tmp;
+ struct list_head *tmp;
/*
* First, all devices must be fully functional
@@ -435,7 +435,7 @@ static int alloc_array_sb(mddev_t * mddev)
mddev->sb = (mdp_super_t *) __get_free_page (GFP_KERNEL);
if (!mddev->sb)
return -ENOMEM;
- md_clear_page(mddev->sb);
+ clear_page(mddev->sb);
return 0;
}
@@ -449,7 +449,7 @@ static int alloc_disk_sb(mdk_rdev_t * rdev)
printk(OUT_OF_MEM);
return -EINVAL;
}
- md_clear_page(rdev->sb);
+ clear_page(rdev->sb);
return 0;
}
@@ -564,7 +564,7 @@ static kdev_t dev_unit(kdev_t dev)
static mdk_rdev_t * match_dev_unit(mddev_t *mddev, kdev_t dev)
{
- struct md_list_head *tmp;
+ struct list_head *tmp;
mdk_rdev_t *rdev;
ITERATE_RDEV(mddev,rdev,tmp)
@@ -576,7 +576,7 @@ static mdk_rdev_t * match_dev_unit(mddev_t *mddev, kdev_t dev)
static int match_mddev_units(mddev_t *mddev1, mddev_t *mddev2)
{
- struct md_list_head *tmp;
+ struct list_head *tmp;
mdk_rdev_t *rdev;
ITERATE_RDEV(mddev1,rdev,tmp)
@@ -586,8 +586,8 @@ static int match_mddev_units(mddev_t *mddev1, mddev_t *mddev2)
return 0;
}
-static MD_LIST_HEAD(all_raid_disks);
-static MD_LIST_HEAD(pending_raid_disks);
+static LIST_HEAD(all_raid_disks);
+static LIST_HEAD(pending_raid_disks);
static void bind_rdev_to_array(mdk_rdev_t * rdev, mddev_t * mddev)
{
@@ -605,7 +605,7 @@ static void bind_rdev_to_array(mdk_rdev_t * rdev, mddev_t * mddev)
mdidx(mddev), partition_name(rdev->dev),
partition_name(same_pdev->dev));
- md_list_add(&rdev->same_set, &mddev->disks);
+ list_add(&rdev->same_set, &mddev->disks);
rdev->mddev = mddev;
mddev->nb_dev++;
printk(KERN_INFO "md: bind<%s,%d>\n", partition_name(rdev->dev), mddev->nb_dev);
@@ -617,8 +617,8 @@ static void unbind_rdev_from_array(mdk_rdev_t * rdev)
MD_BUG();
return;
}
- md_list_del(&rdev->same_set);
- MD_INIT_LIST_HEAD(&rdev->same_set);
+ list_del(&rdev->same_set);
+ INIT_LIST_HEAD(&rdev->same_set);
rdev->mddev->nb_dev--;
printk(KERN_INFO "md: unbind<%s,%d>\n", partition_name(rdev->dev),
rdev->mddev->nb_dev);
@@ -664,13 +664,13 @@ static void export_rdev(mdk_rdev_t * rdev)
MD_BUG();
unlock_rdev(rdev);
free_disk_sb(rdev);
- md_list_del(&rdev->all);
- MD_INIT_LIST_HEAD(&rdev->all);
+ list_del(&rdev->all);
+ INIT_LIST_HEAD(&rdev->all);
if (rdev->pending.next != &rdev->pending) {
printk(KERN_INFO "md: (%s was pending)\n",
partition_name(rdev->dev));
- md_list_del(&rdev->pending);
- MD_INIT_LIST_HEAD(&rdev->pending);
+ list_del(&rdev->pending);
+ INIT_LIST_HEAD(&rdev->pending);
}
#ifndef MODULE
md_autodetect_dev(rdev->dev);
@@ -688,7 +688,7 @@ static void kick_rdev_from_array(mdk_rdev_t * rdev)
static void export_array(mddev_t *mddev)
{
- struct md_list_head *tmp;
+ struct list_head *tmp;
mdk_rdev_t *rdev;
mdp_super_t *sb = mddev->sb;
@@ -723,14 +723,14 @@ static void free_mddev(mddev_t *mddev)
* Make sure nobody else is using this mddev
* (careful, we rely on the global kernel lock here)
*/
- while (md_atomic_read(&mddev->resync_sem.count) != 1)
+ while (atomic_read(&mddev->resync_sem.count) != 1)
schedule();
- while (md_atomic_read(&mddev->recovery_sem.count) != 1)
+ while (atomic_read(&mddev->recovery_sem.count) != 1)
schedule();
del_mddev_mapping(mddev, MKDEV(MD_MAJOR, mdidx(mddev)));
- md_list_del(&mddev->all_mddevs);
- MD_INIT_LIST_HEAD(&mddev->all_mddevs);
+ list_del(&mddev->all_mddevs);
+ INIT_LIST_HEAD(&mddev->all_mddevs);
kfree(mddev);
MOD_DEC_USE_COUNT;
}
@@ -793,7 +793,7 @@ static void print_rdev(mdk_rdev_t *rdev)
void md_print_devices(void)
{
- struct md_list_head *tmp, *tmp2;
+ struct list_head *tmp, *tmp2;
mdk_rdev_t *rdev;
mddev_t *mddev;
@@ -871,12 +871,12 @@ static int uuid_equal(mdk_rdev_t *rdev1, mdk_rdev_t *rdev2)
static mdk_rdev_t * find_rdev_all(kdev_t dev)
{
- struct md_list_head *tmp;
+ struct list_head *tmp;
mdk_rdev_t *rdev;
tmp = all_raid_disks.next;
while (tmp != &all_raid_disks) {
- rdev = md_list_entry(tmp, mdk_rdev_t, all);
+ rdev = list_entry(tmp, mdk_rdev_t, all);
if (rdev->dev == dev)
return rdev;
tmp = tmp->next;
@@ -980,7 +980,7 @@ static int sync_sbs(mddev_t * mddev)
{
mdk_rdev_t *rdev;
mdp_super_t *sb;
- struct md_list_head *tmp;
+ struct list_head *tmp;
ITERATE_RDEV(mddev,rdev,tmp) {
if (rdev->faulty || rdev->alias_device)
@@ -996,15 +996,15 @@ static int sync_sbs(mddev_t * mddev)
int md_update_sb(mddev_t * mddev)
{
int err, count = 100;
- struct md_list_head *tmp;
+ struct list_head *tmp;
mdk_rdev_t *rdev;
repeat:
mddev->sb->utime = CURRENT_TIME;
- if ((++mddev->sb->events_lo)==0)
+ if (!(++mddev->sb->events_lo))
++mddev->sb->events_hi;
- if ((mddev->sb->events_lo|mddev->sb->events_hi)==0) {
+ if (!(mddev->sb->events_lo | mddev->sb->events_hi)) {
/*
* oops, this 64-bit counter should never wrap.
* Either we are in around ~1 trillion A.C., assuming
@@ -1128,8 +1128,8 @@ static int md_import_device(kdev_t newdev, int on_disk)
rdev->desc_nr = -1;
}
}
- md_list_add(&rdev->all, &all_raid_disks);
- MD_INIT_LIST_HEAD(&rdev->pending);
+ list_add(&rdev->all, &all_raid_disks);
+ INIT_LIST_HEAD(&rdev->pending);
if (rdev->faulty && rdev->sb)
free_disk_sb(rdev);
@@ -1167,7 +1167,7 @@ abort_free:
static int analyze_sbs(mddev_t * mddev)
{
int out_of_date = 0, i, first;
- struct md_list_head *tmp, *tmp2;
+ struct list_head *tmp, *tmp2;
mdk_rdev_t *rdev, *rdev2, *freshest;
mdp_super_t *sb;
@@ -1225,7 +1225,7 @@ static int analyze_sbs(mddev_t * mddev)
*/
if (calc_sb_csum(rdev->sb) != rdev->sb->sb_csum) {
if (rdev->sb->events_lo || rdev->sb->events_hi)
- if ((rdev->sb->events_lo--)==0)
+ if (!(rdev->sb->events_lo--))
rdev->sb->events_hi--;
}
@@ -1513,7 +1513,7 @@ static int device_size_calculation(mddev_t * mddev)
int data_disks = 0, persistent;
unsigned int readahead;
mdp_super_t *sb = mddev->sb;
- struct md_list_head *tmp;
+ struct list_head *tmp;
mdk_rdev_t *rdev;
/*
@@ -1572,7 +1572,7 @@ static int device_size_calculation(mddev_t * mddev)
md_size[mdidx(mddev)] = sb->size * data_disks;
readahead = MD_READAHEAD;
- if ((sb->level == 0) || (sb->level == 4) || (sb->level == 5)) {
+ if (!sb->level || (sb->level == 4) || (sb->level == 5)) {
readahead = (mddev->sb->chunk_size>>PAGE_SHIFT) * 4 * data_disks;
if (readahead < data_disks * (MAX_SECTORS>>(PAGE_SHIFT-9))*2)
readahead = data_disks * (MAX_SECTORS>>(PAGE_SHIFT-9))*2;
@@ -1608,7 +1608,7 @@ static int do_md_run(mddev_t * mddev)
{
int pnum, err;
int chunk_size;
- struct md_list_head *tmp;
+ struct list_head *tmp;
mdk_rdev_t *rdev;
@@ -1873,7 +1873,7 @@ int detect_old_array(mdp_super_t *sb)
static void autorun_array(mddev_t *mddev)
{
mdk_rdev_t *rdev;
- struct md_list_head *tmp;
+ struct list_head *tmp;
int err;
if (mddev->disks.prev == &mddev->disks) {
@@ -1913,8 +1913,8 @@ static void autorun_array(mddev_t *mddev)
*/
static void autorun_devices(kdev_t countdev)
{
- struct md_list_head candidates;
- struct md_list_head *tmp;
+ struct list_head candidates;
+ struct list_head *tmp;
mdk_rdev_t *rdev0, *rdev;
mddev_t *mddev;
kdev_t md_kdev;
@@ -1922,11 +1922,11 @@ static void autorun_devices(kdev_t countdev)
printk(KERN_INFO "md: autorun ...\n");
while (pending_raid_disks.next != &pending_raid_disks) {
- rdev0 = md_list_entry(pending_raid_disks.next,
+ rdev0 = list_entry(pending_raid_disks.next,
mdk_rdev_t, pending);
printk(KERN_INFO "md: considering %s ...\n", partition_name(rdev0->dev));
- MD_INIT_LIST_HEAD(&candidates);
+ INIT_LIST_HEAD(&candidates);
ITERATE_RDEV_PENDING(rdev,tmp) {
if (uuid_equal(rdev0, rdev)) {
if (!sb_equal(rdev0->sb, rdev->sb)) {
@@ -1936,8 +1936,8 @@ static void autorun_devices(kdev_t countdev)
continue;
}
printk(KERN_INFO "md: adding %s ...\n", partition_name(rdev->dev));
- md_list_del(&rdev->pending);
- md_list_add(&rdev->pending, &candidates);
+ list_del(&rdev->pending);
+ list_add(&rdev->pending, &candidates);
}
}
/*
@@ -1964,8 +1964,8 @@ static void autorun_devices(kdev_t countdev)
printk(KERN_INFO "md: created md%d\n", mdidx(mddev));
ITERATE_RDEV_GENERIC(candidates,pending,rdev,tmp) {
bind_rdev_to_array(rdev, mddev);
- md_list_del(&rdev->pending);
- MD_INIT_LIST_HEAD(&rdev->pending);
+ list_del(&rdev->pending);
+ INIT_LIST_HEAD(&rdev->pending);
}
autorun_array(mddev);
}
@@ -2025,7 +2025,7 @@ static int autostart_array(kdev_t startdev, kdev_t countdev)
partition_name(startdev));
goto abort;
}
- md_list_add(&start_rdev->pending, &pending_raid_disks);
+ list_add(&start_rdev->pending, &pending_raid_disks);
sb = start_rdev->sb;
@@ -2058,7 +2058,7 @@ static int autostart_array(kdev_t startdev, kdev_t countdev)
MD_BUG();
goto abort;
}
- md_list_add(&rdev->pending, &pending_raid_disks);
+ list_add(&rdev->pending, &pending_raid_disks);
}
/*
@@ -2091,7 +2091,7 @@ static int get_version(void * arg)
ver.minor = MD_MINOR_VERSION;
ver.patchlevel = MD_PATCHLEVEL_VERSION;
- if (md_copy_to_user(arg, &ver, sizeof(ver)))
+ if (copy_to_user(arg, &ver, sizeof(ver)))
return -EFAULT;
return 0;
@@ -2128,7 +2128,7 @@ static int get_array_info(mddev_t * mddev, void * arg)
SET_FROM_SB(layout);
SET_FROM_SB(chunk_size);
- if (md_copy_to_user(arg, &info, sizeof(info)))
+ if (copy_to_user(arg, &info, sizeof(info)))
return -EFAULT;
return 0;
@@ -2144,7 +2144,7 @@ static int get_disk_info(mddev_t * mddev, void * arg)
if (!mddev->sb)
return -EINVAL;
- if (md_copy_from_user(&info, arg, sizeof(info)))
+ if (copy_from_user(&info, arg, sizeof(info)))
return -EFAULT;
nr = info.number;
@@ -2156,7 +2156,7 @@ static int get_disk_info(mddev_t * mddev, void * arg)
SET_FROM_SB(raid_disk);
SET_FROM_SB(state);
- if (md_copy_to_user(arg, &info, sizeof(info)))
+ if (copy_to_user(arg, &info, sizeof(info)))
return -EFAULT;
return 0;
@@ -2191,7 +2191,7 @@ static int add_new_disk(mddev_t * mddev, mdu_disk_info_t *info)
return -EINVAL;
}
if (mddev->nb_dev) {
- mdk_rdev_t *rdev0 = md_list_entry(mddev->disks.next,
+ mdk_rdev_t *rdev0 = list_entry(mddev->disks.next,
mdk_rdev_t, same_set);
if (!uuid_equal(rdev0, rdev)) {
printk(KERN_WARNING "md: %s has different UUID to %s\n",
@@ -2223,7 +2223,7 @@ static int add_new_disk(mddev_t * mddev, mdu_disk_info_t *info)
SET_SB(raid_disk);
SET_SB(state);
- if ((info->state & (1<<MD_DISK_FAULTY))==0) {
+ if (!(info->state & (1<<MD_DISK_FAULTY))) {
err = md_import_device (dev, 0);
if (err) {
printk(KERN_WARNING "md: error, md_import_device() returned %d\n", err);
@@ -2566,7 +2566,7 @@ static int md_ioctl(struct inode *inode, struct file *file,
mddev_t *mddev = NULL;
kdev_t dev;
- if (!md_capable_admin())
+ if (!capable(CAP_SYS_ADMIN))
return -EACCES;
dev = inode->i_rdev;
@@ -2604,12 +2604,12 @@ static int md_ioctl(struct inode *inode, struct file *file,
MD_BUG();
goto abort;
}
- err = md_put_user(md_hd_struct[minor].nr_sects,
+ err = put_user(md_hd_struct[minor].nr_sects,
(unsigned long *) arg);
goto done;
case BLKGETSIZE64: /* Return device size */
- err = md_put_user((u64)md_hd_struct[minor].nr_sects << 9,
+ err = put_user((u64)md_hd_struct[minor].nr_sects << 9,
(u64 *) arg);
goto done;
@@ -2618,7 +2618,7 @@ static int md_ioctl(struct inode *inode, struct file *file,
case BLKFLSBUF:
case BLKBSZGET:
case BLKBSZSET:
- err = blk_ioctl (dev, cmd, arg);
+ err = blk_ioctl(dev, cmd, arg);
goto abort;
default:;
@@ -2670,7 +2670,7 @@ static int md_ioctl(struct inode *inode, struct file *file,
}
if (arg) {
mdu_array_info_t info;
- if (md_copy_from_user(&info, (void*)arg, sizeof(info))) {
+ if (copy_from_user(&info, (void*)arg, sizeof(info))) {
err = -EFAULT;
goto abort_unlock;
}
@@ -2753,17 +2753,17 @@ static int md_ioctl(struct inode *inode, struct file *file,
err = -EINVAL;
goto abort_unlock;
}
- err = md_put_user (2, (char *) &loc->heads);
+ err = put_user (2, (char *) &loc->heads);
if (err)
goto abort_unlock;
- err = md_put_user (4, (char *) &loc->sectors);
+ err = put_user (4, (char *) &loc->sectors);
if (err)
goto abort_unlock;
- err = md_put_user (md_hd_struct[mdidx(mddev)].nr_sects/8,
+ err = put_user (md_hd_struct[mdidx(mddev)].nr_sects/8,
(short *) &loc->cylinders);
if (err)
goto abort_unlock;
- err = md_put_user (get_start_sect(dev),
+ err = put_user (get_start_sect(dev),
(long *) &loc->start);
goto done_unlock;
}
@@ -2787,7 +2787,7 @@ static int md_ioctl(struct inode *inode, struct file *file,
case ADD_NEW_DISK:
{
mdu_disk_info_t info;
- if (md_copy_from_user(&info, (void*)arg, sizeof(info)))
+ if (copy_from_user(&info, (void*)arg, sizeof(info)))
err = -EFAULT;
else
err = add_new_disk(mddev, &info);
@@ -2828,7 +2828,7 @@ static int md_ioctl(struct inode *inode, struct file *file,
{
/* The data is never used....
mdu_param_t param;
- err = md_copy_from_user(&param, (mdu_param_t *)arg,
+ err = copy_from_user(&param, (mdu_param_t *)arg,
sizeof(param));
if (err)
goto abort_unlock;
@@ -2887,7 +2887,7 @@ static int md_release(struct inode *inode, struct file * file)
return 0;
}
-static struct block_device_operations md_fops=
+static struct block_device_operations md_fops =
{
owner: THIS_MODULE,
open: md_open,
@@ -2896,11 +2896,18 @@ static struct block_device_operations md_fops=
};
+static inline void flush_curr_signals(void)
+{
+ spin_lock(&current->sigmask_lock);
+ flush_signals(current);
+ spin_unlock(&current->sigmask_lock);
+}
+
int md_thread(void * arg)
{
mdk_thread_t *thread = arg;
- md_lock_kernel();
+ lock_kernel();
/*
* Detach thread
@@ -2909,8 +2916,9 @@ int md_thread(void * arg)
daemonize();
sprintf(current->comm, thread->name);
- md_init_signals();
- md_flush_signals();
+ current->exit_signal = SIGCHLD;
+ siginitsetinv(&current->blocked, sigmask(SIGKILL));
+ flush_curr_signals();
thread->tsk = current;
/*
@@ -2926,7 +2934,7 @@ int md_thread(void * arg)
*/
current->policy = SCHED_OTHER;
current->nice = -20;
- md_unlock_kernel();
+ unlock_kernel();
complete(thread->event);
while (thread->run) {
@@ -2949,8 +2957,8 @@ int md_thread(void * arg)
run(thread->data);
run_task_queue(&tq_disk);
}
- if (md_signal_pending(current))
- md_flush_signals();
+ if (signal_pending(current))
+ flush_curr_signals();
}
complete(thread->event);
return 0;
@@ -2976,7 +2984,7 @@ mdk_thread_t *md_register_thread(void (*run) (void *),
return NULL;
memset(thread, 0, sizeof(mdk_thread_t));
- md_init_waitqueue_head(&thread->wqueue);
+ init_waitqueue_head(&thread->wqueue);
init_completion(&event);
thread->event = &event;
@@ -3064,7 +3072,7 @@ static int status_unused(char * page)
{
int sz = 0, i = 0;
mdk_rdev_t *rdev;
- struct md_list_head *tmp;
+ struct list_head *tmp;
sz += sprintf(page + sz, "unused devices: ");
@@ -3150,7 +3158,7 @@ static int md_status_read_proc(char *page, char **start, off_t off,
int count, int *eof, void *data)
{
int sz = 0, j, size;
- struct md_list_head *tmp, *tmp2;
+ struct list_head *tmp, *tmp2;
mdk_rdev_t *rdev;
mddev_t *mddev;
@@ -3207,7 +3215,7 @@ static int md_status_read_proc(char *page, char **start, off_t off,
if (mddev->curr_resync) {
sz += status_resync (page+sz, mddev);
} else {
- if (md_atomic_read(&mddev->resync_sem.count) != 1)
+ if (atomic_read(&mddev->resync_sem.count) != 1)
sz += sprintf(page + sz, " resync=DELAYED");
}
sz += sprintf(page + sz, "\n");
@@ -3251,7 +3259,7 @@ mdp_disk_t *get_spare(mddev_t *mddev)
mdp_super_t *sb = mddev->sb;
mdp_disk_t *disk;
mdk_rdev_t *rdev;
- struct md_list_head *tmp;
+ struct list_head *tmp;
ITERATE_RDEV(mddev,rdev,tmp) {
if (rdev->faulty)
@@ -3288,7 +3296,7 @@ void md_sync_acct(kdev_t dev, unsigned long nr_sectors)
static int is_mddev_idle(mddev_t *mddev)
{
mdk_rdev_t * rdev;
- struct md_list_head *tmp;
+ struct list_head *tmp;
int idle;
unsigned long curr_events;
@@ -3311,7 +3319,7 @@ static int is_mddev_idle(mddev_t *mddev)
return idle;
}
-MD_DECLARE_WAIT_QUEUE_HEAD(resync_wait);
+DECLARE_WAIT_QUEUE_HEAD(resync_wait);
void md_done_sync(mddev_t *mddev, int blocks, int ok)
{
@@ -3333,7 +3341,7 @@ int md_do_sync(mddev_t *mddev, mdp_disk_t *spare)
unsigned long mark[SYNC_MARKS];
unsigned long mark_cnt[SYNC_MARKS];
int last_mark,m;
- struct md_list_head *tmp;
+ struct list_head *tmp;
unsigned long last_check;
@@ -3356,8 +3364,8 @@ recheck:
}
if (serialize) {
interruptible_sleep_on(&resync_wait);
- if (md_signal_pending(current)) {
- md_flush_signals();
+ if (signal_pending(current)) {
+ flush_curr_signals();
err = -EINTR;
goto out;
}
@@ -3365,8 +3373,7 @@ recheck:
}
mddev->curr_resync = 1;
-
- max_sectors = mddev->sb->size<<1;
+ max_sectors = mddev->sb->size << 1;
printk(KERN_INFO "md: syncing RAID array md%d\n", mdidx(mddev));
printk(KERN_INFO "md: minimum _guaranteed_ reconstruction speed: %d KB/sec/disc.\n",
@@ -3403,7 +3410,6 @@ recheck:
int sectors;
sectors = mddev->pers->sync_request(mddev, j);
-
if (sectors < 0) {
err = sectors;
goto out;
@@ -3432,13 +3438,13 @@ recheck:
}
- if (md_signal_pending(current)) {
+ if (signal_pending(current)) {
/*
* got a signal, exit.
*/
mddev->curr_resync = 0;
printk(KERN_INFO "md: md_do_sync() got signal ... exiting\n");
- md_flush_signals();
+ flush_curr_signals();
err = -EINTR;
goto out;
}
@@ -3451,7 +3457,7 @@ recheck:
* about not overloading the IO subsystem. (things like an
* e2fsck being done on the RAID array should execute fast)
*/
- if (md_need_resched(current))
+ if (current->need_resched)
schedule();
currspeed = (j-mddev->resync_mark_cnt)/2/((jiffies-mddev->resync_mark)/HZ +1) +1;
@@ -3462,7 +3468,7 @@ recheck:
if ((currspeed > sysctl_speed_limit_max) ||
!is_mddev_idle(mddev)) {
current->state = TASK_INTERRUPTIBLE;
- md_schedule_timeout(HZ/4);
+ schedule_timeout(HZ/4);
goto repeat;
}
} else
@@ -3474,7 +3480,7 @@ recheck:
* this also signals 'finished resyncing' to md_stop
*/
out:
- wait_event(mddev->recovery_wait, atomic_read(&mddev->recovery_active)==0);
+ wait_event(mddev->recovery_wait, !atomic_read(&mddev->recovery_active));
up(&mddev->resync_sem);
out_nolock:
mddev->curr_resync = 0;
@@ -3497,7 +3503,7 @@ void md_do_recovery(void *data)
mddev_t *mddev;
mdp_super_t *sb;
mdp_disk_t *spare;
- struct md_list_head *tmp;
+ struct list_head *tmp;
printk(KERN_INFO "md: recovery thread got woken up ...\n");
restart:
@@ -3581,13 +3587,13 @@ restart:
int md_notify_reboot(struct notifier_block *this,
unsigned long code, void *x)
{
- struct md_list_head *tmp;
+ struct list_head *tmp;
mddev_t *mddev;
- if ((code == MD_SYS_DOWN) || (code == MD_SYS_HALT)
- || (code == MD_SYS_POWER_OFF)) {
+ if ((code == SYS_DOWN) || (code == SYS_HALT) || (code == SYS_POWER_OFF)) {
printk(KERN_INFO "md: stopping all md devices.\n");
+ return NOTIFY_DONE;
ITERATE_MDDEV(mddev,tmp)
do_md_stop (mddev, 1);
@@ -3597,7 +3603,7 @@ int md_notify_reboot(struct notifier_block *this,
* right place to handle this issue is the given
* driver, we do want to have a safe RAID driver ...
*/
- md_mdelay(1000*1);
+ mdelay(1000*1);
}
return NOTIFY_DONE;
}
@@ -3628,7 +3634,7 @@ static void md_geninit(void)
#endif
}
-int md__init md_init(void)
+int __init md_init(void)
{
static char * name = "mdrecoveryd";
int minor;
@@ -3665,7 +3671,7 @@ int md__init md_init(void)
printk(KERN_ALERT
"md: bug: couldn't allocate md_recovery_thread\n");
- md_register_reboot_notifier(&md_notifier);
+ register_reboot_notifier(&md_notifier);
raid_table_header = register_sysctl_table(raid_root_table, 1);
md_geninit();
@@ -3687,7 +3693,7 @@ int md__init md_init(void)
struct {
int set;
int noautodetect;
-} raid_setup_args md__initdata;
+} raid_setup_args __initdata;
/*
* Searches all registered partitions for autorun RAID arrays
@@ -3730,7 +3736,7 @@ static void autostart_arrays(void)
MD_BUG();
continue;
}
- md_list_add(&rdev->pending, &pending_raid_disks);
+ list_add(&rdev->pending, &pending_raid_disks);
}
dev_cnt = 0;
@@ -3742,7 +3748,7 @@ static struct {
int pers[MAX_MD_DEVS];
int chunk[MAX_MD_DEVS];
char *device_names[MAX_MD_DEVS];
-} md_setup_args md__initdata;
+} md_setup_args __initdata;
/*
* Parse the command-line parameters given our kernel, but do not
@@ -3764,7 +3770,7 @@ static struct {
* Shifted name_to_kdev_t() and related operations to md_set_drive()
* for later execution. Rewrote section to make devfs compatible.
*/
-static int md__init md_setup(char *str)
+static int __init md_setup(char *str)
{
int minor, level, factor, fault;
char *pername = "";
@@ -3783,7 +3789,7 @@ static int md__init md_setup(char *str)
}
switch (get_option(&str, &level)) { /* RAID Personality */
case 2: /* could be 0 or -1.. */
- if (level == 0 || level == -1) {
+ if (!level || level == -1) {
if (get_option(&str, &factor) != 2 || /* Chunk Size */
get_option(&str, &fault) != 2) {
printk(KERN_WARNING "md: Too few arguments supplied to md=.\n");
@@ -3825,8 +3831,8 @@ static int md__init md_setup(char *str)
return 1;
}
-extern kdev_t name_to_kdev_t(char *line) md__init;
-void md__init md_setup_drive(void)
+extern kdev_t name_to_kdev_t(char *line) __init;
+void __init md_setup_drive(void)
{
int minor, i;
kdev_t dev;
@@ -3838,7 +3844,8 @@ void md__init md_setup_drive(void)
char *devname;
mdu_disk_info_t dinfo;
- if ((devname = md_setup_args.device_names[minor]) == 0) continue;
+ if (!(devname = md_setup_args.device_names[minor]))
+ continue;
for (i = 0; i < MD_SB_DISKS && devname != 0; i++) {
@@ -3857,7 +3864,7 @@ void md__init md_setup_drive(void)
devfs_get_maj_min(handle, &major, &minor);
dev = MKDEV(major, minor);
}
- if (dev == 0) {
+ if (!dev) {
printk(KERN_WARNING "md: Unknown device name: %s\n", devname);
break;
}
@@ -3869,7 +3876,7 @@ void md__init md_setup_drive(void)
}
devices[i] = 0;
- if (md_setup_args.device_set[minor] == 0)
+ if (!md_setup_args.device_set[minor])
continue;
if (mddev_map[minor].mddev) {
@@ -3933,7 +3940,7 @@ void md__init md_setup_drive(void)
}
}
-static int md__init raid_setup(char *str)
+static int __init raid_setup(char *str)
{
int len, pos;
@@ -3947,7 +3954,7 @@ static int md__init raid_setup(char *str)
wlen = (comma-str)-pos;
else wlen = (len-1)-pos;
- if (strncmp(str, "noautodetect", wlen) == 0)
+ if (!strncmp(str, "noautodetect", wlen))
raid_setup_args.noautodetect = 1;
pos += wlen+1;
}
@@ -3955,7 +3962,7 @@ static int md__init raid_setup(char *str)
return 1;
}
-int md__init md_run_setup(void)
+int __init md_run_setup(void)
{
if (raid_setup_args.noautodetect)
printk(KERN_INFO "md: Skipping autodetection of RAID arrays. (raid=noautodetect)\n");
@@ -4008,23 +4015,23 @@ void cleanup_module(void)
}
#endif
-MD_EXPORT_SYMBOL(md_size);
-MD_EXPORT_SYMBOL(register_md_personality);
-MD_EXPORT_SYMBOL(unregister_md_personality);
-MD_EXPORT_SYMBOL(partition_name);
-MD_EXPORT_SYMBOL(md_error);
-MD_EXPORT_SYMBOL(md_do_sync);
-MD_EXPORT_SYMBOL(md_sync_acct);
-MD_EXPORT_SYMBOL(md_done_sync);
-MD_EXPORT_SYMBOL(md_recover_arrays);
-MD_EXPORT_SYMBOL(md_register_thread);
-MD_EXPORT_SYMBOL(md_unregister_thread);
-MD_EXPORT_SYMBOL(md_update_sb);
-MD_EXPORT_SYMBOL(md_wakeup_thread);
-MD_EXPORT_SYMBOL(md_print_devices);
-MD_EXPORT_SYMBOL(find_rdev_nr);
-MD_EXPORT_SYMBOL(md_interrupt_thread);
-MD_EXPORT_SYMBOL(mddev_map);
-MD_EXPORT_SYMBOL(md_check_ordering);
-MD_EXPORT_SYMBOL(get_spare);
+EXPORT_SYMBOL(md_size);
+EXPORT_SYMBOL(register_md_personality);
+EXPORT_SYMBOL(unregister_md_personality);
+EXPORT_SYMBOL(partition_name);
+EXPORT_SYMBOL(md_error);
+EXPORT_SYMBOL(md_do_sync);
+EXPORT_SYMBOL(md_sync_acct);
+EXPORT_SYMBOL(md_done_sync);
+EXPORT_SYMBOL(md_recover_arrays);
+EXPORT_SYMBOL(md_register_thread);
+EXPORT_SYMBOL(md_unregister_thread);
+EXPORT_SYMBOL(md_update_sb);
+EXPORT_SYMBOL(md_wakeup_thread);
+EXPORT_SYMBOL(md_print_devices);
+EXPORT_SYMBOL(find_rdev_nr);
+EXPORT_SYMBOL(md_interrupt_thread);
+EXPORT_SYMBOL(mddev_map);
+EXPORT_SYMBOL(md_check_ordering);
+EXPORT_SYMBOL(get_spare);
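Reviewer note (not part of the patch): most of the md.c hunks above replace the md_list_* wrappers with the generic list_head primitives. For readers unfamiliar with that idiom, here is a simplified userspace reimplementation of the calls the patch uses, plus a lookup loop shaped like the one in partition_name(). It is a sketch under assumptions: struct toy_name and toy_lookup() are hypothetical, and the helpers are written as small functions and macros for illustration, not copied from the kernel's list.h.

```c
#include <stddef.h>

/* Circular doubly linked list node, embedded in the owning structure. */
struct list_head {
	struct list_head *next, *prev;
};

#define LIST_HEAD_INIT(name) { &(name), &(name) }
#define LIST_HEAD(name) struct list_head name = LIST_HEAD_INIT(name)

/* An empty list (or an unlinked node) points at itself. */
static void INIT_LIST_HEAD(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* Insert new_node right after head. */
static void list_add(struct list_head *new_node, struct list_head *head)
{
	new_node->next = head->next;
	new_node->prev = head;
	head->next->prev = new_node;
	head->next = new_node;
}

/* Unlink entry; its own pointers are left stale until re-initialised. */
static void list_del(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
}

/* Recover the containing structure from a pointer to its embedded node. */
#define list_entry(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical record, loosely modelled on dev_name_t in the diff. */
struct toy_name {
	int dev;
	const char *name;
	struct list_head list;
};

static LIST_HEAD(toy_names);

/* Same traversal shape as the while loop in partition_name(). */
static const char *toy_lookup(int dev)
{
	struct list_head *tmp = toy_names.next;

	while (tmp != &toy_names) {
		struct toy_name *n = list_entry(tmp, struct toy_name, list);

		if (n->dev == dev)
			return n->name;
		tmp = tmp->next;
	}
	return NULL;
}
```

The patch repeatedly pairs list_del() with INIT_LIST_HEAD() on the same node. In this model the reason is visible: after re-initialisation the node points at itself, so a test such as the one in export_rdev() (rdev->pending.next != &rdev->pending) can tell whether the node is currently on a list.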
diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
index 8b21f612e0a5..7203d97a27fd 100644
--- a/drivers/md/raid0.c
+++ b/drivers/md/raid0.c
@@ -334,7 +334,7 @@ static mdk_personality_t raid0_personality=
status: raid0_status,
};
-static int md__init raid0_init (void)
+static int __init raid0_init (void)
{
return register_md_personality (RAID0, &raid0_personality);
}
diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 6c8a5bf21112..57829582b60c 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -1,7 +1,7 @@
/*
* raid1.c : Multiple Devices driver for Linux
*
- * Copyright (C) 1999, 2000 Ingo Molnar, Red Hat
+ * Copyright (C) 1999, 2000, 2001 Ingo Molnar, Red Hat
*
* Copyright (C) 1996, 1997, 1998 Ingo Molnar, Miguel de Icaza, Gadi Oxman
*
@@ -22,330 +22,208 @@
* Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
*/
-#include <linux/module.h>
-#include <linux/slab.h>
#include <linux/raid/raid1.h>
-#include <asm/atomic.h>
#define MAJOR_NR MD_MAJOR
#define MD_DRIVER
#define MD_PERSONALITY
#define MAX_WORK_PER_DISK 128
-
-#define NR_RESERVED_BUFS 32
-
-
/*
- * The following can be used to debug the driver
+ * Number of guaranteed r1bios in case of extreme VM load:
*/
-#define RAID1_DEBUG 0
-
-#if RAID1_DEBUG
-#define PRINTK(x...) printk(x)
-#define inline
-#define __inline__
-#else
-#define PRINTK(x...) do { } while (0)
-#endif
-
+#define NR_RAID1_BIOS 256
static mdk_personality_t raid1_personality;
-static md_spinlock_t retry_list_lock = MD_SPIN_LOCK_UNLOCKED;
-struct raid1_bh *raid1_retry_list = NULL, **raid1_retry_tail;
+static spinlock_t retry_list_lock = SPIN_LOCK_UNLOCKED;
+static LIST_HEAD(retry_list_head);
-static struct buffer_head *raid1_alloc_bh(raid1_conf_t *conf, int cnt)
+static inline void check_all_w_bios_empty(r1bio_t *r1_bio)
{
- /* return a linked list of "cnt" struct buffer_heads.
- * don't take any off the free list unless we know we can
- * get all we need, otherwise we could deadlock
- */
- struct buffer_head *bh=NULL;
-
- while(cnt) {
- struct buffer_head *t;
- md_spin_lock_irq(&conf->device_lock);
- if (!conf->freebh_blocked && conf->freebh_cnt >= cnt)
- while (cnt) {
- t = conf->freebh;
- conf->freebh = t->b_next;
- t->b_next = bh;
- bh = t;
- t->b_state = 0;
- conf->freebh_cnt--;
- cnt--;
- }
- md_spin_unlock_irq(&conf->device_lock);
- if (cnt == 0)
- break;
- t = kmem_cache_alloc(bh_cachep, SLAB_NOIO);
- if (t) {
- t->b_next = bh;
- bh = t;
- cnt--;
- } else {
- PRINTK("raid1: waiting for %d bh\n", cnt);
- conf->freebh_blocked = 1;
- wait_disk_event(conf->wait_buffer,
- !conf->freebh_blocked ||
- conf->freebh_cnt > conf->raid_disks * NR_RESERVED_BUFS/2);
- conf->freebh_blocked = 0;
- }
- }
- return bh;
-}
+ int i;
-static inline void raid1_free_bh(raid1_conf_t *conf, struct buffer_head *bh)
-{
- unsigned long flags;
- spin_lock_irqsave(&conf->device_lock, flags);
- while (bh) {
- struct buffer_head *t = bh;
- bh=bh->b_next;
- if (t->b_pprev == NULL)
- kmem_cache_free(bh_cachep, t);
- else {
- t->b_next= conf->freebh;
- conf->freebh = t;
- conf->freebh_cnt++;
- }
- }
- spin_unlock_irqrestore(&conf->device_lock, flags);
- wake_up(&conf->wait_buffer);
+ return;
+ for (i = 0; i < MD_SB_DISKS; i++)
+ if (r1_bio->write_bios[i])
+ BUG();
}
-static int raid1_grow_bh(raid1_conf_t *conf, int cnt)
+static inline void check_all_bios_empty(r1bio_t *r1_bio)
{
- /* allocate cnt buffer_heads, possibly less if kmalloc fails */
- int i = 0;
-
- while (i < cnt) {
- struct buffer_head *bh;
- bh = kmem_cache_alloc(bh_cachep, SLAB_KERNEL);
- if (!bh) break;
-
- md_spin_lock_irq(&conf->device_lock);
- bh->b_pprev = &conf->freebh;
- bh->b_next = conf->freebh;
- conf->freebh = bh;
- conf->freebh_cnt++;
- md_spin_unlock_irq(&conf->device_lock);
-
- i++;
- }
- return i;
+ return;
+ if (r1_bio->read_bio)
+ BUG();
+ check_all_w_bios_empty(r1_bio);
}
-static void raid1_shrink_bh(raid1_conf_t *conf)
+static void * r1bio_pool_alloc(int gfp_flags, void *data)
{
- /* discard all buffer_heads */
-
- md_spin_lock_irq(&conf->device_lock);
- while (conf->freebh) {
- struct buffer_head *bh = conf->freebh;
- conf->freebh = bh->b_next;
- kmem_cache_free(bh_cachep, bh);
- conf->freebh_cnt--;
- }
- md_spin_unlock_irq(&conf->device_lock);
-}
-
+ r1bio_t *r1_bio;
-static struct raid1_bh *raid1_alloc_r1bh(raid1_conf_t *conf)
-{
- struct raid1_bh *r1_bh = NULL;
+ r1_bio = kmalloc(sizeof(r1bio_t), gfp_flags);
+ if (r1_bio)
+ memset(r1_bio, 0, sizeof(*r1_bio));
- do {
- md_spin_lock_irq(&conf->device_lock);
- if (!conf->freer1_blocked && conf->freer1) {
- r1_bh = conf->freer1;
- conf->freer1 = r1_bh->next_r1;
- conf->freer1_cnt--;
- r1_bh->next_r1 = NULL;
- r1_bh->state = (1 << R1BH_PreAlloc);
- r1_bh->bh_req.b_state = 0;
- }
- md_spin_unlock_irq(&conf->device_lock);
- if (r1_bh)
- return r1_bh;
- r1_bh = (struct raid1_bh *) kmalloc(sizeof(struct raid1_bh), GFP_NOIO);
- if (r1_bh) {
- memset(r1_bh, 0, sizeof(*r1_bh));
- return r1_bh;
- }
- conf->freer1_blocked = 1;
- wait_disk_event(conf->wait_buffer,
- !conf->freer1_blocked ||
- conf->freer1_cnt > NR_RESERVED_BUFS/2
- );
- conf->freer1_blocked = 0;
- } while (1);
+ return r1_bio;
}
-static inline void raid1_free_r1bh(struct raid1_bh *r1_bh)
+static void r1bio_pool_free(void *r1_bio, void *data)
{
- struct buffer_head *bh = r1_bh->mirror_bh_list;
- raid1_conf_t *conf = mddev_to_conf(r1_bh->mddev);
-
- r1_bh->mirror_bh_list = NULL;
-
- if (test_bit(R1BH_PreAlloc, &r1_bh->state)) {
- unsigned long flags;
- spin_lock_irqsave(&conf->device_lock, flags);
- r1_bh->next_r1 = conf->freer1;
- conf->freer1 = r1_bh;
- conf->freer1_cnt++;
- spin_unlock_irqrestore(&conf->device_lock, flags);
- /* don't need to wakeup wait_buffer because
- * raid1_free_bh below will do that
- */
- } else {
- kfree(r1_bh);
- }
- raid1_free_bh(conf, bh);
+ check_all_bios_empty(r1_bio);
+ kfree(r1_bio);
}
-static int raid1_grow_r1bh (raid1_conf_t *conf, int cnt)
-{
- int i = 0;
-
- while (i < cnt) {
- struct raid1_bh *r1_bh;
- r1_bh = (struct raid1_bh*)kmalloc(sizeof(*r1_bh), GFP_KERNEL);
- if (!r1_bh)
- break;
- memset(r1_bh, 0, sizeof(*r1_bh));
- set_bit(R1BH_PreAlloc, &r1_bh->state);
- r1_bh->mddev = conf->mddev;
-
- raid1_free_r1bh(r1_bh);
- i++;
- }
- return i;
-}
+#define RESYNC_BLOCK_SIZE (64*1024)
+#define RESYNC_PAGES ((RESYNC_BLOCK_SIZE + PAGE_SIZE-1) / PAGE_SIZE)
+#define RESYNC_WINDOW (2048*1024)
-static void raid1_shrink_r1bh(raid1_conf_t *conf)
+static void * r1buf_pool_alloc(int gfp_flags, void *data)
{
- md_spin_lock_irq(&conf->device_lock);
- while (conf->freer1) {
- struct raid1_bh *r1_bh = conf->freer1;
- conf->freer1 = r1_bh->next_r1;
- conf->freer1_cnt--;
- kfree(r1_bh);
+ conf_t *conf = data;
+ struct page *page;
+ r1bio_t *r1_bio;
+ struct bio *bio;
+ int i, j;
+
+ r1_bio = mempool_alloc(conf->r1bio_pool, gfp_flags);
+ check_all_bios_empty(r1_bio);
+
+ bio = bio_alloc(gfp_flags, RESYNC_PAGES);
+ if (!bio)
+ goto out_free_r1_bio;
+
+ for (i = 0; i < RESYNC_PAGES; i++) {
+ page = alloc_page(gfp_flags);
+ if (unlikely(!page))
+ goto out_free_pages;
+
+ bio->bi_io_vec[i].bv_page = page;
+ bio->bi_io_vec[i].bv_len = PAGE_SIZE;
+ bio->bi_io_vec[i].bv_offset = 0;
}
- md_spin_unlock_irq(&conf->device_lock);
-}
-
-
-static inline void raid1_free_buf(struct raid1_bh *r1_bh)
-{
- unsigned long flags;
- struct buffer_head *bh = r1_bh->mirror_bh_list;
- raid1_conf_t *conf = mddev_to_conf(r1_bh->mddev);
- r1_bh->mirror_bh_list = NULL;
-
- spin_lock_irqsave(&conf->device_lock, flags);
- r1_bh->next_r1 = conf->freebuf;
- conf->freebuf = r1_bh;
- spin_unlock_irqrestore(&conf->device_lock, flags);
- raid1_free_bh(conf, bh);
+ /*
+ * All RESYNC_PAGES data pages are attached; now set up the bio header.
+ */
+ bio->bi_vcnt = RESYNC_PAGES;
+ bio->bi_idx = 0;
+ bio->bi_size = RESYNC_BLOCK_SIZE;
+ bio->bi_end_io = NULL;
+ atomic_set(&bio->bi_cnt, 1);
+
+ r1_bio->master_bio = bio;
+
+ return r1_bio;
+
+out_free_pages:
+ for (j = 0; j < i; j++)
+ __free_page(bio->bi_io_vec[j].bv_page);
+ bio_put(bio);
+out_free_r1_bio:
+ mempool_free(r1_bio, conf->r1bio_pool);
+ return NULL;
}
-static struct raid1_bh *raid1_alloc_buf(raid1_conf_t *conf)
+static void r1buf_pool_free(void *__r1_bio, void *data)
{
- struct raid1_bh *r1_bh;
-
- md_spin_lock_irq(&conf->device_lock);
- wait_event_lock_irq(conf->wait_buffer, conf->freebuf, conf->device_lock);
- r1_bh = conf->freebuf;
- conf->freebuf = r1_bh->next_r1;
- r1_bh->next_r1= NULL;
- md_spin_unlock_irq(&conf->device_lock);
+ int i;
+ conf_t *conf = data;
+ r1bio_t *r1bio = __r1_bio;
+ struct bio *bio = r1bio->master_bio;
- return r1_bh;
+ check_all_bios_empty(r1bio);
+ if (atomic_read(&bio->bi_cnt) != 1)
+ BUG();
+ for (i = 0; i < RESYNC_PAGES; i++) {
+ __free_page(bio->bi_io_vec[i].bv_page);
+ bio->bi_io_vec[i].bv_page = NULL;
+ }
+ if (atomic_read(&bio->bi_cnt) != 1)
+ BUG();
+ bio_put(bio);
+ mempool_free(r1bio, conf->r1bio_pool);
}
-static int raid1_grow_buffers (raid1_conf_t *conf, int cnt)
+static void put_all_bios(conf_t *conf, r1bio_t *r1_bio)
{
- int i = 0;
-
- md_spin_lock_irq(&conf->device_lock);
- while (i < cnt) {
- struct raid1_bh *r1_bh;
- struct page *page;
-
- page = alloc_page(GFP_KERNEL);
- if (!page)
- break;
+ int i;
- r1_bh = (struct raid1_bh *) kmalloc(sizeof(*r1_bh), GFP_KERNEL);
- if (!r1_bh) {
- __free_page(page);
- break;
+ if (r1_bio->read_bio) {
+ if (atomic_read(&r1_bio->read_bio->bi_cnt) != 1)
+ BUG();
+ bio_put(r1_bio->read_bio);
+ r1_bio->read_bio = NULL;
+ }
+ for (i = 0; i < MD_SB_DISKS; i++) {
+ struct bio **bio = r1_bio->write_bios + i;
+ if (*bio) {
+ if (atomic_read(&(*bio)->bi_cnt) != 1)
+ BUG();
+ bio_put(*bio);
}
- memset(r1_bh, 0, sizeof(*r1_bh));
- r1_bh->bh_req.b_page = page;
- r1_bh->bh_req.b_data = page_address(page);
- r1_bh->next_r1 = conf->freebuf;
- conf->freebuf = r1_bh;
- i++;
+ *bio = NULL;
}
- md_spin_unlock_irq(&conf->device_lock);
- return i;
+ check_all_bios_empty(r1_bio);
}
-static void raid1_shrink_buffers (raid1_conf_t *conf)
+static inline void free_r1bio(r1bio_t *r1_bio)
{
- md_spin_lock_irq(&conf->device_lock);
- while (conf->freebuf) {
- struct raid1_bh *r1_bh = conf->freebuf;
- conf->freebuf = r1_bh->next_r1;
- __free_page(r1_bh->bh_req.b_page);
- kfree(r1_bh);
- }
- md_spin_unlock_irq(&conf->device_lock);
+ conf_t *conf = mddev_to_conf(r1_bio->mddev);
+
+ put_all_bios(conf, r1_bio);
+ mempool_free(r1_bio, conf->r1bio_pool);
+}
+
+static inline void put_buf(r1bio_t *r1_bio)
+{
+ conf_t *conf = mddev_to_conf(r1_bio->mddev);
+ struct bio *bio = r1_bio->master_bio;
+
+ /*
+ * undo any possible partial request fixup magic:
+ */
+ if (bio->bi_size != RESYNC_BLOCK_SIZE)
+ bio->bi_io_vec[bio->bi_vcnt-1].bv_len = PAGE_SIZE;
+ put_all_bios(conf, r1_bio);
+ mempool_free(r1_bio, conf->r1buf_pool);
}
-static int raid1_map (mddev_t *mddev, kdev_t *rdev)
+static int map(mddev_t *mddev, kdev_t *rdev)
{
- raid1_conf_t *conf = mddev_to_conf(mddev);
+ conf_t *conf = mddev_to_conf(mddev);
int i, disks = MD_SB_DISKS;
/*
- * Later we do read balancing on the read side
+ * Later we do read balancing on the read side
* now we use the first available disk.
*/
for (i = 0; i < disks; i++) {
if (conf->mirrors[i].operational) {
*rdev = conf->mirrors[i].dev;
- return (0);
+ return 0;
}
}
printk (KERN_ERR "raid1_map(): huh, no more operational devices?\n");
- return (-1);
+ return -1;
}
-static void raid1_reschedule_retry (struct raid1_bh *r1_bh)
+static void reschedule_retry(r1bio_t *r1_bio)
{
unsigned long flags;
- mddev_t *mddev = r1_bh->mddev;
- raid1_conf_t *conf = mddev_to_conf(mddev);
-
- md_spin_lock_irqsave(&retry_list_lock, flags);
- if (raid1_retry_list == NULL)
- raid1_retry_tail = &raid1_retry_list;
- *raid1_retry_tail = r1_bh;
- raid1_retry_tail = &r1_bh->next_r1;
- r1_bh->next_r1 = NULL;
- md_spin_unlock_irqrestore(&retry_list_lock, flags);
+ mddev_t *mddev = r1_bio->mddev;
+ conf_t *conf = mddev_to_conf(mddev);
+
+ spin_lock_irqsave(&retry_list_lock, flags);
+ list_add(&r1_bio->retry_list, &retry_list_head);
+ spin_unlock_irqrestore(&retry_list_lock, flags);
+
md_wakeup_thread(conf->thread);
}
-static void inline io_request_done(unsigned long sector, raid1_conf_t *conf, int phase)
+static void inline raid_request_done(unsigned long sector, conf_t *conf, int phase)
{
unsigned long flags;
spin_lock_irqsave(&conf->segment_lock, flags);
@@ -359,9 +237,10 @@ static void inline io_request_done(unsigned long sector, raid1_conf_t *conf, int
spin_unlock_irqrestore(&conf->segment_lock, flags);
}
-static void inline sync_request_done (unsigned long sector, raid1_conf_t *conf)
+static void inline sync_request_done(sector_t sector, conf_t *conf)
{
unsigned long flags;
+
spin_lock_irqsave(&conf->segment_lock, flags);
if (sector >= conf->start_ready)
--conf->cnt_ready;
@@ -375,73 +254,80 @@ static void inline sync_request_done (unsigned long sector, raid1_conf_t *conf)
}
/*
- * raid1_end_bh_io() is called when we have finished servicing a mirrored
+ * raid_end_bio_io() is called when we have finished servicing a mirrored
* operation and are ready to return a success/failure code to the buffer
* cache layer.
*/
-static void raid1_end_bh_io (struct raid1_bh *r1_bh, int uptodate)
+static int raid_end_bio_io(r1bio_t *r1_bio, int uptodate, int nr_sectors)
{
- struct buffer_head *bh = r1_bh->master_bh;
+ struct bio *bio = r1_bio->master_bio;
- io_request_done(bh->b_rsector, mddev_to_conf(r1_bh->mddev),
- test_bit(R1BH_SyncPhase, &r1_bh->state));
+ raid_request_done(bio->bi_sector, mddev_to_conf(r1_bio->mddev),
+ test_bit(R1BIO_SyncPhase, &r1_bio->state));
- bh->b_end_io(bh, uptodate);
- raid1_free_r1bh(r1_bh);
+ bio_endio(bio, uptodate, nr_sectors);
+ free_r1bio(r1_bio);
+
+ return 0;
}
-void raid1_end_request (struct buffer_head *bh, int uptodate)
+
+static int end_request(struct bio *bio, int nr_sectors)
{
- struct raid1_bh * r1_bh = (struct raid1_bh *)(bh->b_private);
+ int uptodate = test_bit(BIO_UPTODATE, &bio->bi_flags);
+ r1bio_t * r1_bio = (r1bio_t *)(bio->bi_private);
/*
* this branch is our 'one mirror IO has finished' event handler:
*/
if (!uptodate)
- md_error (r1_bh->mddev, bh->b_dev);
+ md_error(r1_bio->mddev, bio->bi_dev);
else
/*
- * Set R1BH_Uptodate in our master buffer_head, so that
+ * Set R1BIO_Uptodate in our master bio, so that
* we will return a good error code for to the higher
* levels even if IO on some other mirrored buffer fails.
*
- * The 'master' represents the complex operation to
+ * The 'master' represents the complex operation to
* user-side. So if something waits for IO, then it will
- * wait for the 'master' buffer_head.
+ * wait for the 'master' bio.
*/
- set_bit (R1BH_Uptodate, &r1_bh->state);
+ set_bit(R1BIO_Uptodate, &r1_bio->state);
/*
- * We split up the read and write side, imho they are
+ * We split up the read and write side, imho they are
* conceptually different.
*/
- if ( (r1_bh->cmd == READ) || (r1_bh->cmd == READA) ) {
+ if ((r1_bio->cmd == READ) || (r1_bio->cmd == READA)) {
+ if (!r1_bio->read_bio)
+ BUG();
/*
- * we have only one buffer_head on the read side
+ * we have only one bio on the read side
*/
-
if (uptodate) {
- raid1_end_bh_io(r1_bh, uptodate);
- return;
+ raid_end_bio_io(r1_bio, uptodate, nr_sectors);
+ return 0;
}
/*
* oops, read error:
*/
- printk(KERN_ERR "raid1: %s: rescheduling block %lu\n",
- partition_name(bh->b_dev), bh->b_blocknr);
- raid1_reschedule_retry(r1_bh);
- return;
+ printk(KERN_ERR "raid1: %s: rescheduling sector %lu\n",
+ partition_name(bio->bi_dev), r1_bio->sector);
+ reschedule_retry(r1_bio);
+ return 0;
}
+ if (r1_bio->read_bio)
+ BUG();
/*
* WRITE:
*
- * Let's see if all mirrored write operations have finished
+ * Let's see if all mirrored write operations have finished
* already.
*/
-
- if (atomic_dec_and_test(&r1_bh->remaining))
- raid1_end_bh_io(r1_bh, test_bit(R1BH_Uptodate, &r1_bh->state));
+ if (atomic_dec_and_test(&r1_bio->remaining))
+ raid_end_bio_io(r1_bio, test_bit(R1BIO_Uptodate, &r1_bio->state),
+ nr_sectors);
+ return 0;
}
/*
@@ -456,22 +342,20 @@ void raid1_end_request (struct buffer_head *bh, int uptodate)
* reads should be somehow balanced.
*/
-static int raid1_read_balance (raid1_conf_t *conf, struct buffer_head *bh)
+static int read_balance(conf_t *conf, struct bio *bio, r1bio_t *r1_bio)
{
- int new_disk = conf->last_used;
- const int sectors = bh->b_size >> 9;
- const unsigned long this_sector = bh->b_rsector;
- int disk = new_disk;
- unsigned long new_distance;
- unsigned long current_distance;
-
+ const int sectors = bio->bi_size >> 9;
+ const unsigned long this_sector = r1_bio->sector;
+ unsigned long new_distance, current_distance;
+ int new_disk = conf->last_used, disk = new_disk;
+
/*
* Check if it is sane at all to balance
*/
-
+
if (conf->resync_mirrors)
goto rb_out;
-
+
/* make sure that disk is operational */
while( !conf->mirrors[new_disk].operational) {
@@ -483,7 +367,7 @@ static int raid1_read_balance (raid1_conf_t *conf, struct buffer_head *bh)
* Nothing much to do, lets not change anything
* and hope for the best...
*/
-
+
new_disk = conf->last_used;
goto rb_out;
@@ -491,53 +375,51 @@ static int raid1_read_balance (raid1_conf_t *conf, struct buffer_head *bh)
}
disk = new_disk;
/* now disk == new_disk == starting point for search */
-
+
/*
* Don't touch anything for sequential reads.
*/
-
if (this_sector == conf->mirrors[new_disk].head_position)
goto rb_out;
-
+
/*
* If reads have been done only on a single disk
* for a time, lets give another disk a change.
* This is for kicking those idling disks so that
* they would find work near some hotspot.
*/
-
if (conf->sect_count >= conf->mirrors[new_disk].sect_limit) {
conf->sect_count = 0;
do {
- if (new_disk<=0)
+ if (new_disk <= 0)
new_disk = conf->raid_disks;
new_disk--;
if (new_disk == disk)
break;
} while ((conf->mirrors[new_disk].write_only) ||
- (!conf->mirrors[new_disk].operational));
+ (!conf->mirrors[new_disk].operational));
goto rb_out;
}
-
+
current_distance = abs(this_sector -
conf->mirrors[disk].head_position);
-
+
/* Find the disk which is closest */
-
+
do {
if (disk <= 0)
disk = conf->raid_disks;
disk--;
-
+
if ((conf->mirrors[disk].write_only) ||
(!conf->mirrors[disk].operational))
continue;
-
+
new_distance = abs(this_sector -
conf->mirrors[disk].head_position);
-
+
if (new_distance < current_distance) {
conf->sect_count = 0;
current_distance = new_distance;
@@ -554,69 +436,73 @@ rb_out:
return new_disk;
}
-static int raid1_make_request (mddev_t *mddev, int rw,
- struct buffer_head * bh)
-{
- raid1_conf_t *conf = mddev_to_conf(mddev);
- struct buffer_head *bh_req, *bhl;
- struct raid1_bh * r1_bh;
- int disks = MD_SB_DISKS;
- int i, sum_bhs = 0;
- struct mirror_info *mirror;
-
- if (!buffer_locked(bh))
- BUG();
-
/*
- * make_request() can abort the operation when READA is being
- * used and no empty request is available.
- *
- * Currently, just replace the command with READ/WRITE.
+ * Wait if the reconstruction state machine puts up a bar for
+ * new requests in this sector range:
*/
- if (rw == READA)
- rw = READ;
-
- r1_bh = raid1_alloc_r1bh (conf);
-
+static inline void new_request(conf_t *conf, r1bio_t *r1_bio)
+{
spin_lock_irq(&conf->segment_lock);
wait_event_lock_irq(conf->wait_done,
- bh->b_rsector < conf->start_active ||
- bh->b_rsector >= conf->start_future,
+ r1_bio->sector < conf->start_active ||
+ r1_bio->sector >= conf->start_future,
conf->segment_lock);
- if (bh->b_rsector < conf->start_active)
+ if (r1_bio->sector < conf->start_active)
conf->cnt_done++;
else {
conf->cnt_future++;
if (conf->phase)
- set_bit(R1BH_SyncPhase, &r1_bh->state);
+ set_bit(R1BIO_SyncPhase, &r1_bio->state);
}
spin_unlock_irq(&conf->segment_lock);
-
+}
+
+static int make_request(mddev_t *mddev, int rw, struct bio * bio)
+{
+ conf_t *conf = mddev_to_conf(mddev);
+ mirror_info_t *mirror;
+ r1bio_t *r1_bio;
+ struct bio *read_bio;
+ int i, sum_bios = 0, disks = MD_SB_DISKS;
+
/*
- * i think the read and write branch should be separated completely,
- * since we want to do read balancing on the read side for example.
- * Alternative implementations? :) --mingo
+ * make_request() can abort the operation when READA is being
+ * used and no empty request is available.
+ *
+ * Currently, just replace the command with READ.
*/
+ if (rw == READA)
+ rw = READ;
+
+ r1_bio = mempool_alloc(conf->r1bio_pool, GFP_NOIO);
+ check_all_bios_empty(r1_bio);
+
+ r1_bio->master_bio = bio;
+
+ r1_bio->mddev = mddev;
+ r1_bio->sector = bio->bi_sector;
+ r1_bio->cmd = rw;
- r1_bh->master_bh = bh;
- r1_bh->mddev = mddev;
- r1_bh->cmd = rw;
+ new_request(conf, r1_bio);
if (rw == READ) {
/*
* read balancing logic:
*/
- mirror = conf->mirrors + raid1_read_balance(conf, bh);
-
- bh_req = &r1_bh->bh_req;
- memcpy(bh_req, bh, sizeof(*bh));
- bh_req->b_blocknr = bh->b_rsector;
- bh_req->b_dev = mirror->dev;
- bh_req->b_rdev = mirror->dev;
- /* bh_req->b_rsector = bh->n_rsector; */
- bh_req->b_end_io = raid1_end_request;
- bh_req->b_private = r1_bh;
- generic_make_request (rw, bh_req);
+ mirror = conf->mirrors + read_balance(conf, bio, r1_bio);
+
+ read_bio = bio_clone(bio, GFP_NOIO);
+ if (r1_bio->read_bio)
+ BUG();
+ r1_bio->read_bio = read_bio;
+
+ read_bio->bi_sector = r1_bio->sector;
+ read_bio->bi_dev = mirror->dev;
+ read_bio->bi_end_io = end_request;
+ read_bio->bi_rw = rw;
+ read_bio->bi_private = r1_bio;
+
+ generic_make_request(read_bio);
return 0;
}
@@ -624,62 +510,35 @@ static int raid1_make_request (mddev_t *mddev, int rw,
* WRITE:
*/
- bhl = raid1_alloc_bh(conf, conf->raid_disks);
+ check_all_w_bios_empty(r1_bio);
+
for (i = 0; i < disks; i++) {
- struct buffer_head *mbh;
- if (!conf->mirrors[i].operational)
+ struct bio *mbio;
+ if (!conf->mirrors[i].operational)
continue;
-
- /*
- * We should use a private pool (size depending on NR_REQUEST),
- * to avoid writes filling up the memory with bhs
- *
- * Such pools are much faster than kmalloc anyways (so we waste
- * almost nothing by not using the master bh when writing and
- * win alot of cleanness) but for now we are cool enough. --mingo
- *
- * It's safe to sleep here, buffer heads cannot be used in a shared
- * manner in the write branch. Look how we lock the buffer at the
- * beginning of this function to grok the difference ;)
- */
- mbh = bhl;
- if (mbh == NULL) {
- MD_BUG();
- break;
- }
- bhl = mbh->b_next;
- mbh->b_next = NULL;
- mbh->b_this_page = (struct buffer_head *)1;
-
- /*
- * prepare mirrored mbh (fields ordered for max mem throughput):
- */
- mbh->b_blocknr = bh->b_rsector;
- mbh->b_dev = conf->mirrors[i].dev;
- mbh->b_rdev = conf->mirrors[i].dev;
- mbh->b_rsector = bh->b_rsector;
- mbh->b_state = (1<<BH_Req) | (1<<BH_Dirty) |
- (1<<BH_Mapped) | (1<<BH_Lock);
-
- atomic_set(&mbh->b_count, 1);
- mbh->b_size = bh->b_size;
- mbh->b_page = bh->b_page;
- mbh->b_data = bh->b_data;
- mbh->b_list = BUF_LOCKED;
- mbh->b_end_io = raid1_end_request;
- mbh->b_private = r1_bh;
-
- mbh->b_next = r1_bh->mirror_bh_list;
- r1_bh->mirror_bh_list = mbh;
- sum_bhs++;
+
+ mbio = bio_clone(bio, GFP_NOIO);
+ if (r1_bio->write_bios[i])
+ BUG();
+ r1_bio->write_bios[i] = mbio;
+
+ mbio->bi_sector = r1_bio->sector;
+ mbio->bi_dev = conf->mirrors[i].dev;
+ mbio->bi_end_io = end_request;
+ mbio->bi_rw = rw;
+ mbio->bi_private = r1_bio;
+
+ sum_bios++;
}
- if (bhl) raid1_free_bh(conf,bhl);
- if (!sum_bhs) {
- /* Gag - all mirrors non-operational.. */
- raid1_end_bh_io(r1_bh, 0);
+ if (!sum_bios) {
+ /*
+ * If all mirrors are non-operational
+ * then return an IO error:
+ */
+ raid_end_bio_io(r1_bio, 0, 0);
return 0;
}
- md_atomic_set(&r1_bh->remaining, sum_bhs);
+ atomic_set(&r1_bio->remaining, sum_bios);
/*
* We have to be a bit careful about the semaphore above, thats
@@ -688,28 +547,30 @@ static int raid1_make_request (mddev_t *mddev, int rw,
* safer solution. Imagine, end_request decreasing the semaphore
* before we could have set it up ... We could play tricks with
* the semaphore (presetting it and correcting at the end if
- * sum_bhs is not 'n' but we have to do end_request by hand if
+ * sum_bios is not 'n' but we have to do end_request by hand if
* all requests finish until we had a chance to set up the
* semaphore correctly ... lots of races).
*/
- bh = r1_bh->mirror_bh_list;
- while(bh) {
- struct buffer_head *bh2 = bh;
- bh = bh->b_next;
- generic_make_request(rw, bh2);
+ for (i = 0; i < disks; i++) {
+ struct bio *mbio;
+ mbio = r1_bio->write_bios[i];
+ if (!mbio)
+ continue;
+
+ generic_make_request(mbio);
}
- return (0);
+ return 0;
}
-static int raid1_status (char *page, mddev_t *mddev)
+static int status(char *page, mddev_t *mddev)
{
- raid1_conf_t *conf = mddev_to_conf(mddev);
+ conf_t *conf = mddev_to_conf(mddev);
int sz = 0, i;
-
- sz += sprintf (page+sz, " [%d/%d] [", conf->raid_disks,
- conf->working_disks);
+
+ sz += sprintf(page+sz, " [%d/%d] [", conf->raid_disks,
+ conf->working_disks);
for (i = 0; i < conf->raid_disks; i++)
- sz += sprintf (page+sz, "%s",
+ sz += sprintf(page+sz, "%s",
conf->mirrors[i].operational ? "U" : "_");
sz += sprintf (page+sz, "]");
return sz;
@@ -731,10 +592,10 @@ static int raid1_status (char *page, mddev_t *mddev)
#define ALREADY_SYNCING KERN_INFO \
"raid1: syncing already in progress.\n"
-static void mark_disk_bad (mddev_t *mddev, int failed)
+static void mark_disk_bad(mddev_t *mddev, int failed)
{
- raid1_conf_t *conf = mddev_to_conf(mddev);
- struct mirror_info *mirror = conf->mirrors+failed;
+ conf_t *conf = mddev_to_conf(mddev);
+ mirror_info_t *mirror = conf->mirrors+failed;
mdp_super_t *sb = mddev->sb;
mirror->operational = 0;
@@ -749,37 +610,36 @@ static void mark_disk_bad (mddev_t *mddev, int failed)
md_wakeup_thread(conf->thread);
if (!mirror->write_only)
conf->working_disks--;
- printk (DISK_FAILED, partition_name (mirror->dev),
- conf->working_disks);
+ printk(DISK_FAILED, partition_name(mirror->dev),
+ conf->working_disks);
}
-static int raid1_error (mddev_t *mddev, kdev_t dev)
+static int error(mddev_t *mddev, kdev_t dev)
{
- raid1_conf_t *conf = mddev_to_conf(mddev);
- struct mirror_info * mirrors = conf->mirrors;
+ conf_t *conf = mddev_to_conf(mddev);
+ mirror_info_t * mirrors = conf->mirrors;
int disks = MD_SB_DISKS;
int i;
- /* Find the drive.
+ /*
+ * Find the drive.
* If it is not operational, then we have already marked it as dead
* else if it is the last working disks, ignore the error, let the
* next level up know.
* else mark the drive as failed
*/
-
for (i = 0; i < disks; i++)
- if (mirrors[i].dev==dev && mirrors[i].operational)
+ if (mirrors[i].dev == dev && mirrors[i].operational)
break;
if (i == disks)
return 0;
- if (i < conf->raid_disks && conf->working_disks == 1) {
- /* Don't fail the drive, act as though we were just a
+ if (i < conf->raid_disks && conf->working_disks == 1)
+ /*
+ * Don't fail the drive, act as though we were just a
* normal single drive
*/
-
return 1;
- }
mark_disk_bad(mddev, i);
return 0;
}
@@ -790,41 +650,42 @@ static int raid1_error (mddev_t *mddev, kdev_t dev)
#undef START_SYNCING
-static void print_raid1_conf (raid1_conf_t *conf)
+static void print_conf(conf_t *conf)
{
int i;
- struct mirror_info *tmp;
+ mirror_info_t *tmp;
printk("RAID1 conf printout:\n");
if (!conf) {
- printk("(conf==NULL)\n");
+ printk("(!conf)\n");
return;
}
printk(" --- wd:%d rd:%d nd:%d\n", conf->working_disks,
- conf->raid_disks, conf->nr_disks);
+ conf->raid_disks, conf->nr_disks);
for (i = 0; i < MD_SB_DISKS; i++) {
tmp = conf->mirrors + i;
printk(" disk %d, s:%d, o:%d, n:%d rd:%d us:%d dev:%s\n",
- i, tmp->spare,tmp->operational,
- tmp->number,tmp->raid_disk,tmp->used_slot,
+ i, tmp->spare, tmp->operational,
+ tmp->number, tmp->raid_disk, tmp->used_slot,
partition_name(tmp->dev));
}
}
-static void close_sync(raid1_conf_t *conf)
+static void close_sync(conf_t *conf)
{
mddev_t *mddev = conf->mddev;
- /* If reconstruction was interrupted, we need to close the "active" and "pending"
- * holes.
- * we know that there are no active rebuild requests, os cnt_active == cnt_ready ==0
+ /*
+ * If reconstruction was interrupted, we need to close the "active"
+ * and "pending" holes.
+ * we know that there are no active rebuild requests,
+ * so cnt_active == cnt_ready == 0
*/
- /* this is really needed when recovery stops too... */
spin_lock_irq(&conf->segment_lock);
conf->start_active = conf->start_pending;
conf->start_ready = conf->start_pending;
wait_event_lock_irq(conf->wait_ready, !conf->cnt_pending, conf->segment_lock);
- conf->start_active =conf->start_ready = conf->start_pending = conf->start_future;
+ conf->start_active = conf->start_ready = conf->start_pending = conf->start_future;
conf->start_future = mddev->sb->size+1;
conf->cnt_pending = conf->cnt_future;
conf->cnt_future = 0;
@@ -838,18 +699,18 @@ static void close_sync(raid1_conf_t *conf)
wake_up(&conf->wait_done);
}
-static int raid1_diskop(mddev_t *mddev, mdp_disk_t **d, int state)
+static int diskop(mddev_t *mddev, mdp_disk_t **d, int state)
{
int err = 0;
- int i, failed_disk=-1, spare_disk=-1, removed_disk=-1, added_disk=-1;
- raid1_conf_t *conf = mddev->private;
- struct mirror_info *tmp, *sdisk, *fdisk, *rdisk, *adisk;
+ int i, failed_disk = -1, spare_disk = -1, removed_disk = -1, added_disk = -1;
+ conf_t *conf = mddev->private;
+ mirror_info_t *tmp, *sdisk, *fdisk, *rdisk, *adisk;
mdp_super_t *sb = mddev->sb;
mdp_disk_t *failed_desc, *spare_desc, *added_desc;
mdk_rdev_t *spare_rdev, *failed_rdev;
- print_raid1_conf(conf);
- md_spin_lock_irq(&conf->device_lock);
+ print_conf(conf);
+ spin_lock_irq(&conf->device_lock);
/*
* find the disk ...
*/
@@ -871,7 +732,7 @@ static int raid1_diskop(mddev_t *mddev, mdp_disk_t **d, int state)
}
/*
* When we activate a spare disk we _must_ have a disk in
- * the lower (active) part of the array to replace.
+ * the lower (active) part of the array to replace.
*/
if ((failed_disk == -1) || (failed_disk >= conf->raid_disks)) {
MD_BUG();
@@ -982,7 +843,7 @@ static int raid1_diskop(mddev_t *mddev, mdp_disk_t **d, int state)
err = 1;
goto abort;
}
-
+
if (sdisk->raid_disk != spare_disk) {
MD_BUG();
err = 1;
@@ -1007,13 +868,14 @@ static int raid1_diskop(mddev_t *mddev, mdp_disk_t **d, int state)
spare_rdev = find_rdev_nr(mddev, spare_desc->number);
failed_rdev = find_rdev_nr(mddev, failed_desc->number);
- /* There must be a spare_rdev, but there may not be a
- * failed_rdev. That slot might be empty...
+ /*
+ * There must be a spare_rdev, but there may not be a
+ * failed_rdev. That slot might be empty...
*/
spare_rdev->desc_nr = failed_desc->number;
if (failed_rdev)
failed_rdev->desc_nr = spare_desc->number;
-
+
xchg_values(*spare_desc, *failed_desc);
xchg_values(*fdisk, *sdisk);
@@ -1024,7 +886,6 @@ static int raid1_diskop(mddev_t *mddev, mdp_disk_t **d, int state)
* give the proper raid_disk number to the now activated
* disk. (this means we switch back these values)
*/
-
xchg_values(spare_desc->raid_disk, failed_desc->raid_disk);
xchg_values(sdisk->raid_disk, fdisk->raid_disk);
xchg_values(spare_desc->number, failed_desc->number);
@@ -1054,7 +915,7 @@ static int raid1_diskop(mddev_t *mddev, mdp_disk_t **d, int state)
rdisk = conf->mirrors + removed_disk;
if (rdisk->spare && (removed_disk < conf->raid_disks)) {
- MD_BUG();
+ MD_BUG();
err = 1;
goto abort;
}
@@ -1068,14 +929,14 @@ static int raid1_diskop(mddev_t *mddev, mdp_disk_t **d, int state)
added_desc = *d;
if (added_disk != added_desc->number) {
- MD_BUG();
+ MD_BUG();
err = 1;
goto abort;
}
adisk->number = added_desc->number;
adisk->raid_disk = added_desc->raid_disk;
- adisk->dev = MKDEV(added_desc->major,added_desc->minor);
+ adisk->dev = MKDEV(added_desc->major, added_desc->minor);
adisk->operational = 0;
adisk->write_only = 0;
@@ -1087,17 +948,18 @@ static int raid1_diskop(mddev_t *mddev, mdp_disk_t **d, int state)
break;
default:
- MD_BUG();
+ MD_BUG();
err = 1;
goto abort;
}
abort:
- md_spin_unlock_irq(&conf->device_lock);
- if (state == DISKOP_SPARE_ACTIVE || state == DISKOP_SPARE_INACTIVE)
- /* should move to "END_REBUILD" when such exists */
- raid1_shrink_buffers(conf);
+ spin_unlock_irq(&conf->device_lock);
+ if (state == DISKOP_SPARE_ACTIVE || state == DISKOP_SPARE_INACTIVE) {
+ mempool_destroy(conf->r1buf_pool);
+ conf->r1buf_pool = NULL;
+ }
- print_raid1_conf(conf);
+ print_conf(conf);
return err;
}
@@ -1108,6 +970,122 @@ abort:
#define REDIRECT_SECTOR KERN_ERR \
"raid1: %s: redirecting sector %lu to another mirror\n"
+static int end_sync_read(struct bio *bio, int nr_sectors)
+{
+ int uptodate = test_bit(BIO_UPTODATE, &bio->bi_flags);
+ r1bio_t * r1_bio = (r1bio_t *)(bio->bi_private);
+
+ check_all_w_bios_empty(r1_bio);
+ if (r1_bio->read_bio != bio)
+ BUG();
+ /*
+ * we have read a block, now it needs to be re-written,
+ * or re-read if the read failed.
+ * We don't do much here, just schedule handling by raid1d
+ */
+ if (!uptodate)
+ md_error (r1_bio->mddev, bio->bi_dev);
+ else
+ set_bit(R1BIO_Uptodate, &r1_bio->state);
+ reschedule_retry(r1_bio);
+
+ return 0;
+}
+
+static int end_sync_write(struct bio *bio, int nr_sectors)
+{
+ int uptodate = test_bit(BIO_UPTODATE, &bio->bi_flags);
+ r1bio_t * r1_bio = (r1bio_t *)(bio->bi_private);
+ mddev_t *mddev = r1_bio->mddev;
+
+ if (!uptodate)
+ md_error(mddev, bio->bi_dev);
+
+ if (atomic_dec_and_test(&r1_bio->remaining)) {
+ sync_request_done(r1_bio->sector, mddev_to_conf(mddev));
+ md_done_sync(mddev, r1_bio->master_bio->bi_size >> 9, uptodate);
+ put_buf(r1_bio);
+ }
+ return 0;
+}
+
+static void sync_request_write(mddev_t *mddev, r1bio_t *r1_bio)
+{
+ conf_t *conf = mddev_to_conf(mddev);
+ int i, sum_bios = 0;
+ int disks = MD_SB_DISKS;
+ struct bio *bio, *mbio;
+
+ bio = r1_bio->master_bio;
+
+ /*
+ * have to allocate lots of bio structures and
+ * schedule writes
+ */
+ if (!test_bit(R1BIO_Uptodate, &r1_bio->state)) {
+ /*
+ * There is no point trying a read-for-reconstruct as
+ * reconstruct is about to be aborted
+ */
+ printk(IO_ERROR, partition_name(bio->bi_dev), r1_bio->sector);
+ md_done_sync(mddev, r1_bio->master_bio->bi_size >> 9, 0);
+ return;
+ }
+
+ check_all_w_bios_empty(r1_bio);
+
+ for (i = 0; i < disks ; i++) {
+ if (!conf->mirrors[i].operational)
+ continue;
+ if (i == conf->last_used)
+ /*
+ * we read from here, no need to write
+ */
+ continue;
+ if (i < conf->raid_disks && !conf->resync_mirrors)
+ /*
+ * don't need to write this, we are just rebuilding
+ */
+ continue;
+
+ mbio = bio_clone(bio, GFP_NOIO);
+ if (r1_bio->write_bios[i])
+ BUG();
+ r1_bio->write_bios[i] = mbio;
+ mbio->bi_dev = conf->mirrors[i].dev;
+ mbio->bi_sector = r1_bio->sector;
+ mbio->bi_end_io = end_sync_write;
+ mbio->bi_rw = WRITE;
+ mbio->bi_private = r1_bio;
+
+ sum_bios++;
+ }
+ if (i != disks)
+ BUG();
+ atomic_set(&r1_bio->remaining, sum_bios);
+
+
+ if (!sum_bios) {
+ /*
+ * Nowhere to write this to... I guess we
+ * must be done
+ */
+ printk(IO_ERROR, partition_name(bio->bi_dev), r1_bio->sector);
+ sync_request_done(r1_bio->sector, conf);
+ md_done_sync(mddev, r1_bio->master_bio->bi_size >> 9, 0);
+ put_buf(r1_bio);
+ return;
+ }
+ for (i = 0; i < disks ; i++) {
+ mbio = r1_bio->write_bios[i];
+ if (!mbio)
+ continue;
+
+ md_sync_acct(mbio->bi_dev, mbio->bi_size >> 9);
+ generic_make_request(mbio);
+ }
+}
+
/*
* This is a kernel thread which:
*
@@ -1115,134 +1093,56 @@ abort:
* 2. Updates the raid superblock when problems encounter.
* 3. Performs writes following reads for array syncronising.
*/
-static void end_sync_write(struct buffer_head *bh, int uptodate);
-static void end_sync_read(struct buffer_head *bh, int uptodate);
-static void raid1d (void *data)
+static void raid1d(void *data)
{
- struct raid1_bh *r1_bh;
- struct buffer_head *bh;
+ struct list_head *head = &retry_list_head;
+ r1bio_t *r1_bio;
+ struct bio *bio;
unsigned long flags;
mddev_t *mddev;
kdev_t dev;
for (;;) {
- md_spin_lock_irqsave(&retry_list_lock, flags);
- r1_bh = raid1_retry_list;
- if (!r1_bh)
+ spin_lock_irqsave(&retry_list_lock, flags);
+ if (list_empty(head))
break;
- raid1_retry_list = r1_bh->next_r1;
- md_spin_unlock_irqrestore(&retry_list_lock, flags);
+ r1_bio = list_entry(head->prev, r1bio_t, retry_list);
+ list_del(head->prev);
+ spin_unlock_irqrestore(&retry_list_lock, flags);
+ check_all_w_bios_empty(r1_bio);
- mddev = r1_bh->mddev;
+ mddev = r1_bio->mddev;
if (mddev->sb_dirty) {
printk(KERN_INFO "raid1: dirty sb detected, updating.\n");
mddev->sb_dirty = 0;
md_update_sb(mddev);
}
- bh = &r1_bh->bh_req;
- switch(r1_bh->cmd) {
+ bio = r1_bio->master_bio;
+ switch(r1_bio->cmd) {
case SPECIAL:
- /* have to allocate lots of bh structures and
- * schedule writes
- */
- if (test_bit(R1BH_Uptodate, &r1_bh->state)) {
- int i, sum_bhs = 0;
- int disks = MD_SB_DISKS;
- struct buffer_head *bhl, *mbh;
- raid1_conf_t *conf;
-
- conf = mddev_to_conf(mddev);
- bhl = raid1_alloc_bh(conf, conf->raid_disks); /* don't really need this many */
- for (i = 0; i < disks ; i++) {
- if (!conf->mirrors[i].operational)
- continue;
- if (i==conf->last_used)
- /* we read from here, no need to write */
- continue;
- if (i < conf->raid_disks
- && !conf->resync_mirrors)
- /* don't need to write this,
- * we are just rebuilding */
- continue;
- mbh = bhl;
- if (!mbh) {
- MD_BUG();
- break;
- }
- bhl = mbh->b_next;
- mbh->b_this_page = (struct buffer_head *)1;
-
-
- /*
- * prepare mirrored bh (fields ordered for max mem throughput):
- */
- mbh->b_blocknr = bh->b_blocknr;
- mbh->b_dev = conf->mirrors[i].dev;
- mbh->b_rdev = conf->mirrors[i].dev;
- mbh->b_rsector = bh->b_blocknr;
- mbh->b_state = (1<<BH_Req) | (1<<BH_Dirty) |
- (1<<BH_Mapped) | (1<<BH_Lock);
- atomic_set(&mbh->b_count, 1);
- mbh->b_size = bh->b_size;
- mbh->b_page = bh->b_page;
- mbh->b_data = bh->b_data;
- mbh->b_list = BUF_LOCKED;
- mbh->b_end_io = end_sync_write;
- mbh->b_private = r1_bh;
-
- mbh->b_next = r1_bh->mirror_bh_list;
- r1_bh->mirror_bh_list = mbh;
-
- sum_bhs++;
- }
- md_atomic_set(&r1_bh->remaining, sum_bhs);
- if (bhl) raid1_free_bh(conf, bhl);
- mbh = r1_bh->mirror_bh_list;
-
- if (!sum_bhs) {
- /* nowhere to write this too... I guess we
- * must be done
- */
- sync_request_done(bh->b_blocknr, conf);
- md_done_sync(mddev, bh->b_size>>9, 0);
- raid1_free_buf(r1_bh);
- } else
- while (mbh) {
- struct buffer_head *bh1 = mbh;
- mbh = mbh->b_next;
- generic_make_request(WRITE, bh1);
- md_sync_acct(bh1->b_dev, bh1->b_size/512);
- }
- } else {
- /* There is no point trying a read-for-reconstruct
- * as reconstruct is about to be aborted
- */
-
- printk (IO_ERROR, partition_name(bh->b_dev), bh->b_blocknr);
- md_done_sync(mddev, bh->b_size>>9, 0);
- }
-
+ sync_request_write(mddev, r1_bio);
break;
case READ:
case READA:
- dev = bh->b_dev;
- raid1_map (mddev, &bh->b_dev);
- if (bh->b_dev == dev) {
- printk (IO_ERROR, partition_name(bh->b_dev), bh->b_blocknr);
- raid1_end_bh_io(r1_bh, 0);
- } else {
- printk (REDIRECT_SECTOR,
- partition_name(bh->b_dev), bh->b_blocknr);
- bh->b_rdev = bh->b_dev;
- bh->b_rsector = bh->b_blocknr;
- generic_make_request (r1_bh->cmd, bh);
+ dev = bio->bi_dev;
+ map(mddev, &bio->bi_dev);
+ if (bio->bi_dev == dev) {
+ printk(IO_ERROR, partition_name(bio->bi_dev), r1_bio->sector);
+ raid_end_bio_io(r1_bio, 0, 0);
+ break;
}
+ printk(REDIRECT_SECTOR,
+ partition_name(bio->bi_dev), r1_bio->sector);
+ bio->bi_sector = r1_bio->sector;
+ bio->bi_rw = r1_bio->cmd;
+
+ generic_make_request(bio);
break;
}
}
- md_spin_unlock_irqrestore(&retry_list_lock, flags);
+ spin_unlock_irqrestore(&retry_list_lock, flags);
}
#undef IO_ERROR
#undef REDIRECT_SECTOR
@@ -1251,9 +1151,9 @@ static void raid1d (void *data)
* Private kernel thread to reconstruct mirrors after an unclean
* shutdown.
*/
-static void raid1syncd (void *data)
+static void raid1syncd(void *data)
{
- raid1_conf_t *conf = data;
+ conf_t *conf = data;
mddev_t *mddev = conf->mddev;
if (!conf->resync_mirrors)
@@ -1271,7 +1171,56 @@ static void raid1syncd (void *data)
close_sync(conf);
up(&mddev->recovery_sem);
- raid1_shrink_buffers(conf);
+}
+
+static int init_resync(conf_t *conf)
+{
+ int buffs;
+
+ conf->start_active = 0;
+ conf->start_ready = 0;
+ conf->start_pending = 0;
+ conf->start_future = 0;
+ conf->phase = 0;
+
+ buffs = RESYNC_WINDOW / RESYNC_BLOCK_SIZE;
+ if (conf->r1buf_pool)
+ BUG();
+ conf->r1buf_pool = mempool_create(buffs, r1buf_pool_alloc, r1buf_pool_free, conf);
+ if (!conf->r1buf_pool)
+ return -ENOMEM;
+ conf->window = 2048;
+ conf->cnt_future += conf->cnt_done+conf->cnt_pending;
+ conf->cnt_done = conf->cnt_pending = 0;
+ if (conf->cnt_ready || conf->cnt_active)
+ MD_BUG();
+ return 0;
+}
+
+static void wait_sync_pending(conf_t *conf, sector_t sector_nr)
+{
+ spin_lock_irq(&conf->segment_lock);
+ while (sector_nr >= conf->start_pending) {
+// printk("wait .. sect=%lu start_active=%d ready=%d pending=%d future=%d, cnt_done=%d active=%d ready=%d pending=%d future=%d\n", sector_nr, conf->start_active, conf->start_ready, conf->start_pending, conf->start_future, conf->cnt_done, conf->cnt_active, conf->cnt_ready, conf->cnt_pending, conf->cnt_future);
+ wait_event_lock_irq(conf->wait_done, !conf->cnt_active,
+ conf->segment_lock);
+ wait_event_lock_irq(conf->wait_ready, !conf->cnt_pending,
+ conf->segment_lock);
+ conf->start_active = conf->start_ready;
+ conf->start_ready = conf->start_pending;
+ conf->start_pending = conf->start_future;
+ conf->start_future = conf->start_future+conf->window;
+
+ // Note: falling off the end is not a problem
+ conf->phase = conf->phase ^1;
+ conf->cnt_active = conf->cnt_ready;
+ conf->cnt_ready = 0;
+ conf->cnt_pending = conf->cnt_future;
+ conf->cnt_future = 0;
+ wake_up(&conf->wait_done);
+ }
+ conf->cnt_ready++;
+ spin_unlock_irq(&conf->segment_lock);
}
/*
@@ -1279,7 +1228,7 @@ static void raid1syncd (void *data)
*
* We need to make sure that no normal I/O request - particularly write
* requests - conflict with active sync requests.
- * This is achieved by conceptually dividing the device space into a
+ * This is achieved by conceptually dividing the block space into a
* number of sections:
* DONE: 0 .. a-1 These blocks are in-sync
* ACTIVE: a.. b-1 These blocks may have active sync requests, but
@@ -1322,149 +1271,81 @@ static void raid1syncd (void *data)
* issue suitable write requests
*/
-static int raid1_sync_request (mddev_t *mddev, unsigned long sector_nr)
+static int sync_request(mddev_t *mddev, sector_t sector_nr)
{
- raid1_conf_t *conf = mddev_to_conf(mddev);
- struct mirror_info *mirror;
- struct raid1_bh *r1_bh;
- struct buffer_head *bh;
- int bsize;
- int disk;
- int block_nr;
+ conf_t *conf = mddev_to_conf(mddev);
+ mirror_info_t *mirror;
+ r1bio_t *r1_bio;
+ struct bio *read_bio, *bio;
+ sector_t max_sector, nr_sectors;
+ int disk, partial;
- spin_lock_irq(&conf->segment_lock);
- if (!sector_nr) {
- /* initialize ...*/
- int buffs;
- conf->start_active = 0;
- conf->start_ready = 0;
- conf->start_pending = 0;
- conf->start_future = 0;
- conf->phase = 0;
- /* we want enough buffers to hold twice the window of 128*/
- buffs = 128 *2 / (PAGE_SIZE>>9);
- buffs = raid1_grow_buffers(conf, buffs);
- if (buffs < 2)
- goto nomem;
-
- conf->window = buffs*(PAGE_SIZE>>9)/2;
- conf->cnt_future += conf->cnt_done+conf->cnt_pending;
- conf->cnt_done = conf->cnt_pending = 0;
- if (conf->cnt_ready || conf->cnt_active)
- MD_BUG();
- }
- while (sector_nr >= conf->start_pending) {
- PRINTK("wait .. sect=%lu start_active=%d ready=%d pending=%d future=%d, cnt_done=%d active=%d ready=%d pending=%d future=%d\n",
- sector_nr, conf->start_active, conf->start_ready, conf->start_pending, conf->start_future,
- conf->cnt_done, conf->cnt_active, conf->cnt_ready, conf->cnt_pending, conf->cnt_future);
- wait_event_lock_irq(conf->wait_done,
- !conf->cnt_active,
- conf->segment_lock);
- wait_event_lock_irq(conf->wait_ready,
- !conf->cnt_pending,
- conf->segment_lock);
- conf->start_active = conf->start_ready;
- conf->start_ready = conf->start_pending;
- conf->start_pending = conf->start_future;
- conf->start_future = conf->start_future+conf->window;
- // Note: falling off the end is not a problem
- conf->phase = conf->phase ^1;
- conf->cnt_active = conf->cnt_ready;
- conf->cnt_ready = 0;
- conf->cnt_pending = conf->cnt_future;
- conf->cnt_future = 0;
- wake_up(&conf->wait_done);
- }
- conf->cnt_ready++;
- spin_unlock_irq(&conf->segment_lock);
-
+ if (!sector_nr)
+ if (init_resync(conf))
+ return -ENOMEM;
- /* If reconstructing, and >1 working disc,
+ wait_sync_pending(conf, sector_nr);
+
+ /*
+ * If reconstructing, and >1 working disc,
* could dedicate one to rebuild and others to
* service read requests ..
*/
disk = conf->last_used;
/* make sure disk is operational */
while (!conf->mirrors[disk].operational) {
- if (disk <= 0) disk = conf->raid_disks;
+ if (disk <= 0)
+ disk = conf->raid_disks;
disk--;
if (disk == conf->last_used)
break;
}
conf->last_used = disk;
-
+
mirror = conf->mirrors+conf->last_used;
-
- r1_bh = raid1_alloc_buf (conf);
- r1_bh->master_bh = NULL;
- r1_bh->mddev = mddev;
- r1_bh->cmd = SPECIAL;
- bh = &r1_bh->bh_req;
-
- block_nr = sector_nr;
- bsize = 512;
- while (!(block_nr & 1) && bsize < PAGE_SIZE
- && (block_nr+2)*(bsize>>9) < (mddev->sb->size *2)) {
- block_nr >>= 1;
- bsize <<= 1;
- }
- bh->b_size = bsize;
- bh->b_list = BUF_LOCKED;
- bh->b_dev = mirror->dev;
- bh->b_rdev = mirror->dev;
- bh->b_state = (1<<BH_Req) | (1<<BH_Mapped) | (1<<BH_Lock);
- if (!bh->b_page)
- BUG();
- if (!bh->b_data)
- BUG();
- if (bh->b_data != page_address(bh->b_page))
+
+ r1_bio = mempool_alloc(conf->r1buf_pool, GFP_NOIO);
+ check_all_bios_empty(r1_bio);
+
+ r1_bio->mddev = mddev;
+ r1_bio->sector = sector_nr;
+ r1_bio->cmd = SPECIAL;
+
+ max_sector = mddev->sb->size << 1;
+ if (sector_nr >= max_sector)
BUG();
- bh->b_end_io = end_sync_read;
- bh->b_private = r1_bh;
- bh->b_blocknr = sector_nr;
- bh->b_rsector = sector_nr;
- init_waitqueue_head(&bh->b_wait);
- generic_make_request(READ, bh);
- md_sync_acct(bh->b_dev, bh->b_size/512);
+ bio = r1_bio->master_bio;
+ nr_sectors = RESYNC_BLOCK_SIZE >> 9;
+ if (max_sector - sector_nr < nr_sectors)
+ nr_sectors = max_sector - sector_nr;
+ bio->bi_size = nr_sectors << 9;
+ bio->bi_vcnt = (bio->bi_size + PAGE_SIZE-1) / PAGE_SIZE;
+ /*
+ * Is there a partial page at the end of the request?
+ */
+ partial = bio->bi_size % PAGE_SIZE;
+ if (partial)
+ bio->bi_io_vec[bio->bi_vcnt-1].bv_len = partial;
- return (bsize >> 9);
-nomem:
- raid1_shrink_buffers(conf);
- spin_unlock_irq(&conf->segment_lock);
- return -ENOMEM;
-}
+ read_bio = bio_clone(r1_bio->master_bio, GFP_NOIO);
-static void end_sync_read(struct buffer_head *bh, int uptodate)
-{
- struct raid1_bh * r1_bh = (struct raid1_bh *)(bh->b_private);
+ read_bio->bi_sector = sector_nr;
+ read_bio->bi_dev = mirror->dev;
+ read_bio->bi_end_io = end_sync_read;
+ read_bio->bi_rw = READ;
+ read_bio->bi_private = r1_bio;
- /* we have read a block, now it needs to be re-written,
- * or re-read if the read failed.
- * We don't do much here, just schedule handling by raid1d
- */
- if (!uptodate)
- md_error (r1_bh->mddev, bh->b_dev);
- else
- set_bit(R1BH_Uptodate, &r1_bh->state);
- raid1_reschedule_retry(r1_bh);
-}
+ if (r1_bio->read_bio)
+ BUG();
+ r1_bio->read_bio = read_bio;
-static void end_sync_write(struct buffer_head *bh, int uptodate)
-{
- struct raid1_bh * r1_bh = (struct raid1_bh *)(bh->b_private);
-
- if (!uptodate)
- md_error (r1_bh->mddev, bh->b_dev);
- if (atomic_dec_and_test(&r1_bh->remaining)) {
- mddev_t *mddev = r1_bh->mddev;
- unsigned long sect = bh->b_blocknr;
- int size = bh->b_size;
- raid1_free_buf(r1_bh);
- sync_request_done(sect, mddev_to_conf(mddev));
- md_done_sync(mddev,size>>9, uptodate);
- }
+ md_sync_acct(read_bio->bi_dev, nr_sectors);
+
+ generic_make_request(read_bio);
+
+ return nr_sectors;
}
#define INVALID_LEVEL KERN_WARNING \
@@ -1506,15 +1387,15 @@ static void end_sync_write(struct buffer_head *bh, int uptodate)
#define START_RESYNC KERN_WARNING \
"raid1: raid set md%d not clean; reconstructing mirrors\n"
-static int raid1_run (mddev_t *mddev)
+static int run(mddev_t *mddev)
{
- raid1_conf_t *conf;
+ conf_t *conf;
int i, j, disk_idx;
- struct mirror_info *disk;
+ mirror_info_t *disk;
mdp_super_t *sb = mddev->sb;
mdp_disk_t *descriptor;
mdk_rdev_t *rdev;
- struct md_list_head *tmp;
+ struct list_head *tmp;
int start_recovery = 0;
MOD_INC_USE_COUNT;
@@ -1525,11 +1406,10 @@ static int raid1_run (mddev_t *mddev)
}
/*
* copy the already verified devices into our private RAID1
- * bookkeeping area. [whatever we allocate in raid1_run(),
- * should be freed in raid1_stop()]
+ * bookkeeping area. [whatever we allocate in run(),
+ * should be freed in stop()]
*/
-
- conf = kmalloc(sizeof(raid1_conf_t), GFP_KERNEL);
+ conf = kmalloc(sizeof(conf_t), GFP_KERNEL);
mddev->private = conf;
if (!conf) {
printk(MEM_ERROR, mdidx(mddev));
@@ -1537,7 +1417,16 @@ static int raid1_run (mddev_t *mddev)
}
memset(conf, 0, sizeof(*conf));
- ITERATE_RDEV(mddev,rdev,tmp) {
+ conf->r1bio_pool = mempool_create(NR_RAID1_BIOS, r1bio_pool_alloc,
+ r1bio_pool_free, NULL);
+ if (!conf->r1bio_pool) {
+ printk(MEM_ERROR, mdidx(mddev));
+ goto out;
+ }
+
+// for (tmp = (mddev)->disks.next; rdev = ((mdk_rdev_t *)((char *)(tmp)-(unsigned long)(&((mdk_rdev_t *)0)->same_set))), tmp = tmp->next, tmp->prev != &(mddev)->disks ; ) {
+
+ ITERATE_RDEV(mddev, rdev, tmp) {
if (rdev->faulty) {
printk(ERRORS, partition_name(rdev->dev));
} else {
@@ -1573,7 +1462,7 @@ static int raid1_run (mddev_t *mddev)
continue;
}
if ((descriptor->number > MD_SB_DISKS) ||
- (disk_idx > sb->raid_disks)) {
+ (disk_idx > sb->raid_disks)) {
printk(INCONSISTENT,
partition_name(rdev->dev));
@@ -1586,7 +1475,7 @@ static int raid1_run (mddev_t *mddev)
continue;
}
printk(OPERATIONAL, partition_name(rdev->dev),
- disk_idx);
+ disk_idx);
disk->number = descriptor->number;
disk->raid_disk = disk_idx;
disk->dev = rdev->dev;
@@ -1616,10 +1505,9 @@ static int raid1_run (mddev_t *mddev)
conf->raid_disks = sb->raid_disks;
conf->nr_disks = sb->nr_disks;
conf->mddev = mddev;
- conf->device_lock = MD_SPIN_LOCK_UNLOCKED;
+ conf->device_lock = SPIN_LOCK_UNLOCKED;
- conf->segment_lock = MD_SPIN_LOCK_UNLOCKED;
- init_waitqueue_head(&conf->wait_buffer);
+ conf->segment_lock = SPIN_LOCK_UNLOCKED;
init_waitqueue_head(&conf->wait_done);
init_waitqueue_head(&conf->wait_ready);
@@ -1628,25 +1516,8 @@ static int raid1_run (mddev_t *mddev)
goto out_free_conf;
}
-
- /* pre-allocate some buffer_head structures.
- * As a minimum, 1 r1bh and raid_disks buffer_heads
- * would probably get us by in tight memory situations,
- * but a few more is probably a good idea.
- * For now, try NR_RESERVED_BUFS r1bh and
- * NR_RESERVED_BUFS*raid_disks bufferheads
- * This will allow at least NR_RESERVED_BUFS concurrent
- * reads or writes even if kmalloc starts failing
- */
- if (raid1_grow_r1bh(conf, NR_RESERVED_BUFS) < NR_RESERVED_BUFS ||
- raid1_grow_bh(conf, NR_RESERVED_BUFS*conf->raid_disks)
- < NR_RESERVED_BUFS*conf->raid_disks) {
- printk(MEM_ERROR, mdidx(mddev));
- goto out_free_conf;
- }
-
for (i = 0; i < MD_SB_DISKS; i++) {
-
+
descriptor = sb->disks+i;
disk_idx = descriptor->raid_disk;
disk = conf->mirrors + disk_idx;
@@ -1691,10 +1562,10 @@ static int raid1_run (mddev_t *mddev)
}
if (!start_recovery && !(sb->state & (1 << MD_SB_CLEAN)) &&
- (conf->working_disks > 1)) {
+ (conf->working_disks > 1)) {
const char * name = "raid1syncd";
- conf->resync_thread = md_register_thread(raid1syncd, conf,name);
+ conf->resync_thread = md_register_thread(raid1syncd, conf, name);
if (!conf->resync_thread) {
printk(THREAD_ERROR, mdidx(mddev));
goto out_free_conf;
@@ -1731,9 +1602,8 @@ static int raid1_run (mddev_t *mddev)
return 0;
out_free_conf:
- raid1_shrink_r1bh(conf);
- raid1_shrink_bh(conf);
- raid1_shrink_buffers(conf);
+ if (conf->r1bio_pool)
+ mempool_destroy(conf->r1bio_pool);
kfree(conf);
mddev->private = NULL;
out:
@@ -1752,9 +1622,9 @@ out:
#undef NONE_OPERATIONAL
#undef ARRAY_IS_ACTIVE
-static int raid1_stop_resync (mddev_t *mddev)
+static int stop_resync(mddev_t *mddev)
{
- raid1_conf_t *conf = mddev_to_conf(mddev);
+ conf_t *conf = mddev_to_conf(mddev);
if (conf->resync_thread) {
if (conf->resync_mirrors) {
@@ -1769,9 +1639,9 @@ static int raid1_stop_resync (mddev_t *mddev)
return 0;
}
-static int raid1_restart_resync (mddev_t *mddev)
+static int restart_resync(mddev_t *mddev)
{
- raid1_conf_t *conf = mddev_to_conf(mddev);
+ conf_t *conf = mddev_to_conf(mddev);
if (conf->resync_mirrors) {
if (!conf->resync_thread) {
@@ -1785,46 +1655,45 @@ static int raid1_restart_resync (mddev_t *mddev)
return 0;
}
-static int raid1_stop (mddev_t *mddev)
+static int stop(mddev_t *mddev)
{
- raid1_conf_t *conf = mddev_to_conf(mddev);
+ conf_t *conf = mddev_to_conf(mddev);
md_unregister_thread(conf->thread);
if (conf->resync_thread)
md_unregister_thread(conf->resync_thread);
- raid1_shrink_r1bh(conf);
- raid1_shrink_bh(conf);
- raid1_shrink_buffers(conf);
+ if (conf->r1bio_pool)
+ mempool_destroy(conf->r1bio_pool);
kfree(conf);
mddev->private = NULL;
MOD_DEC_USE_COUNT;
return 0;
}
-static mdk_personality_t raid1_personality=
+static mdk_personality_t raid1_personality =
{
name: "raid1",
- make_request: raid1_make_request,
- run: raid1_run,
- stop: raid1_stop,
- status: raid1_status,
- error_handler: raid1_error,
- diskop: raid1_diskop,
- stop_resync: raid1_stop_resync,
- restart_resync: raid1_restart_resync,
- sync_request: raid1_sync_request
+ make_request: make_request,
+ run: run,
+ stop: stop,
+ status: status,
+ error_handler: error,
+ diskop: diskop,
+ stop_resync: stop_resync,
+ restart_resync: restart_resync,
+ sync_request: sync_request
};
-static int md__init raid1_init (void)
+static int __init raid_init(void)
{
- return register_md_personality (RAID1, &raid1_personality);
+ return register_md_personality(RAID1, &raid1_personality);
}
-static void raid1_exit (void)
+static void raid_exit(void)
{
- unregister_md_personality (RAID1);
+ unregister_md_personality(RAID1);
}
-module_init(raid1_init);
-module_exit(raid1_exit);
+module_init(raid_init);
+module_exit(raid_exit);
MODULE_LICENSE("GPL");
diff --git a/drivers/net/tulip/ChangeLog b/drivers/net/tulip/ChangeLog
index a515efcfd338..8a1caaa28d2f 100644
--- a/drivers/net/tulip/ChangeLog
+++ b/drivers/net/tulip/ChangeLog
@@ -1,3 +1,8 @@
+2001-12-11 Jeff Garzik <jgarzik@mandrakesoft.com>
+
+ * eeprom.c, timer.c, media.c, tulip_core.c:
+ Remove 21040 and 21041 chip support.
+
2001-11-13 David S. Miller <davem@redhat.com>
* tulip_core.c (tulip_mwi_config): Kill unused label early_out.
diff --git a/drivers/net/tulip/eeprom.c b/drivers/net/tulip/eeprom.c
index beb1430cc431..8777cc1f3065 100644
--- a/drivers/net/tulip/eeprom.c
+++ b/drivers/net/tulip/eeprom.c
@@ -136,23 +136,6 @@ void __devinit tulip_parse_eeprom(struct net_device *dev)
subsequent_board:
if (ee_data[27] == 0) { /* No valid media table. */
- } else if (tp->chip_id == DC21041) {
- unsigned char *p = (void *)ee_data + ee_data[27 + controller_index*3];
- int media = get_u16(p);
- int count = p[2];
- p += 3;
-
- printk(KERN_INFO "%s: 21041 Media table, default media %4.4x (%s).\n",
- dev->name, media,
- media & 0x0800 ? "Autosense" : medianame[media & MEDIA_MASK]);
- for (i = 0; i < count; i++) {
- unsigned char media_block = *p++;
- int media_code = media_block & MEDIA_MASK;
- if (media_block & 0x40)
- p += 6;
- printk(KERN_INFO "%s: 21041 media #%d, %s.\n",
- dev->name, media_code, medianame[media_code]);
- }
} else {
unsigned char *p = (void *)ee_data + ee_data[27];
unsigned char csr12dir = 0;
diff --git a/drivers/net/tulip/media.c b/drivers/net/tulip/media.c
index 5d1329776d01..e7160fca0e34 100644
--- a/drivers/net/tulip/media.c
+++ b/drivers/net/tulip/media.c
@@ -21,12 +21,6 @@
#include "tulip.h"
-/* This is a mysterious value that can be written to CSR11 in the 21040 (only)
- to support a pre-NWay full-duplex signaling mechanism using short frames.
- No one knows what it should be, but if left at its default value some
- 10base2(!) packets trigger a full-duplex-request interrupt. */
-#define FULL_DUPLEX_MAGIC 0x6969
-
/* The maximum data clock rate is 2.5 Mhz. The minimum timing is usually
met by back-to-back PCI I/O cycles, but we insert a delay to avoid
"overclocking" issues or future 66Mhz PCI. */
@@ -326,17 +320,6 @@ void tulip_select_media(struct net_device *dev, int startup)
printk(KERN_DEBUG "%s: Using media type %s, CSR12 is %2.2x.\n",
dev->name, medianame[dev->if_port],
inl(ioaddr + CSR12) & 0xff);
- } else if (tp->chip_id == DC21041) {
- int port = dev->if_port <= 4 ? dev->if_port : 0;
- if (tulip_debug > 1)
- printk(KERN_DEBUG "%s: 21041 using media %s, CSR12 is %4.4x.\n",
- dev->name, medianame[port == 3 ? 12: port],
- inl(ioaddr + CSR12));
- outl(0x00000000, ioaddr + CSR13); /* Reset the serial interface */
- outl(t21041_csr14[port], ioaddr + CSR14);
- outl(t21041_csr15[port], ioaddr + CSR15);
- outl(t21041_csr13[port], ioaddr + CSR13);
- new_csr6 = 0x80020000;
} else if (tp->chip_id == LC82C168) {
if (startup && ! tp->medialock)
dev->if_port = tp->mii_cnt ? 11 : 0;
@@ -363,26 +346,6 @@ void tulip_select_media(struct net_device *dev, int startup)
new_csr6 = 0x00420000;
outl(0x1F078, ioaddr + 0xB8);
}
- } else if (tp->chip_id == DC21040) { /* 21040 */
- /* Turn on the xcvr interface. */
- int csr12 = inl(ioaddr + CSR12);
- if (tulip_debug > 1)
- printk(KERN_DEBUG "%s: 21040 media type is %s, CSR12 is %2.2x.\n",
- dev->name, medianame[dev->if_port], csr12);
- if (tulip_media_cap[dev->if_port] & MediaAlwaysFD)
- tp->full_duplex = 1;
- new_csr6 = 0x20000;
- /* Set the full duplux match frame. */
- outl(FULL_DUPLEX_MAGIC, ioaddr + CSR11);
- outl(0x00000000, ioaddr + CSR13); /* Reset the serial interface */
- if (t21040_csr13[dev->if_port] & 8) {
- outl(0x0705, ioaddr + CSR14);
- outl(0x0006, ioaddr + CSR15);
- } else {
- outl(0xffff, ioaddr + CSR14);
- outl(0x0000, ioaddr + CSR15);
- }
- outl(0x8f01 | t21040_csr13[dev->if_port], ioaddr + CSR13);
} else { /* Unknown chip type with no media table. */
if (tp->default_port == 0)
dev->if_port = tp->mii_cnt ? 11 : 3;
diff --git a/drivers/net/tulip/timer.c b/drivers/net/tulip/timer.c
index 4079772ae9fe..53c43912bad7 100644
--- a/drivers/net/tulip/timer.c
+++ b/drivers/net/tulip/timer.c
@@ -33,60 +33,6 @@ void tulip_timer(unsigned long data)
inl(ioaddr + CSR14), inl(ioaddr + CSR15));
}
switch (tp->chip_id) {
- case DC21040:
- if (!tp->medialock && csr12 & 0x0002) { /* Network error */
- printk(KERN_INFO "%s: No link beat found.\n",
- dev->name);
- dev->if_port = (dev->if_port == 2 ? 0 : 2);
- tulip_select_media(dev, 0);
- dev->trans_start = jiffies;
- }
- break;
- case DC21041:
- if (tulip_debug > 2)
- printk(KERN_DEBUG "%s: 21041 media tick CSR12 %8.8x.\n",
- dev->name, csr12);
- if (tp->medialock) break;
- switch (dev->if_port) {
- case 0: case 3: case 4:
- if (csr12 & 0x0004) { /*LnkFail */
- /* 10baseT is dead. Check for activity on alternate port. */
- tp->mediasense = 1;
- if (csr12 & 0x0200)
- dev->if_port = 2;
- else
- dev->if_port = 1;
- printk(KERN_INFO "%s: No 21041 10baseT link beat, Media switched to %s.\n",
- dev->name, medianame[dev->if_port]);
- outl(0, ioaddr + CSR13); /* Reset */
- outl(t21041_csr14[dev->if_port], ioaddr + CSR14);
- outl(t21041_csr15[dev->if_port], ioaddr + CSR15);
- outl(t21041_csr13[dev->if_port], ioaddr + CSR13);
- next_tick = 10*HZ; /* 2.4 sec. */
- } else
- next_tick = 30*HZ;
- break;
- case 1: /* 10base2 */
- case 2: /* AUI */
- if (csr12 & 0x0100) {
- next_tick = (30*HZ); /* 30 sec. */
- tp->mediasense = 0;
- } else if ((csr12 & 0x0004) == 0) {
- printk(KERN_INFO "%s: 21041 media switched to 10baseT.\n",
- dev->name);
- dev->if_port = 0;
- tulip_select_media(dev, 0);
- next_tick = (24*HZ)/10; /* 2.4 sec. */
- } else if (tp->mediasense || (csr12 & 0x0002)) {
- dev->if_port = 3 - dev->if_port; /* Swap ports. */
- tulip_select_media(dev, 0);
- next_tick = 20*HZ;
- } else {
- next_tick = 20*HZ;
- }
- break;
- }
- break;
case DC21140:
case DC21142:
case MX98713:
diff --git a/drivers/net/tulip/tulip_core.c b/drivers/net/tulip/tulip_core.c
index f67ff13732cb..917f1a9be8cf 100644
--- a/drivers/net/tulip/tulip_core.c
+++ b/drivers/net/tulip/tulip_core.c
@@ -15,8 +15,8 @@
*/
#define DRV_NAME "tulip"
-#define DRV_VERSION "0.9.15-pre9"
-#define DRV_RELDATE "Nov 6, 2001"
+#define DRV_VERSION "1.1.0"
+#define DRV_RELDATE "Dec 11, 2001"
#include <linux/config.h>
#include <linux/module.h>
@@ -130,12 +130,8 @@ int tulip_debug = 1;
*/
struct tulip_chip_table tulip_tbl[] = {
- /* DC21040 */
- { "Digital DC21040 Tulip", 128, 0x0001ebef, 0, tulip_timer },
-
- /* DC21041 */
- { "Digital DC21041 Tulip", 128, 0x0001ebef,
- HAS_MEDIA_TABLE | HAS_NWAY, tulip_timer },
+ { }, /* placeholder for array, slot unused currently */
+ { }, /* placeholder for array, slot unused currently */
/* DC21140 */
{ "Digital DS21140 Tulip", 128, 0x0001ebef,
@@ -192,8 +188,6 @@ struct tulip_chip_table tulip_tbl[] = {
static struct pci_device_id tulip_pci_tbl[] __devinitdata = {
- { 0x1011, 0x0002, PCI_ANY_ID, PCI_ANY_ID, 0, 0, DC21040 },
- { 0x1011, 0x0014, PCI_ANY_ID, PCI_ANY_ID, 0, 0, DC21041 },
{ 0x1011, 0x0009, PCI_ANY_ID, PCI_ANY_ID, 0, 0, DC21140 },
{ 0x1011, 0x0019, PCI_ANY_ID, PCI_ANY_ID, 0, 0, DC21143 },
{ 0x11AD, 0x0002, PCI_ANY_ID, PCI_ANY_ID, 0, 0, LC82C168 },
@@ -224,19 +218,6 @@ MODULE_DEVICE_TABLE(pci, tulip_pci_tbl);
/* A full-duplex map for media types. */
const char tulip_media_cap[32] =
{0,0,0,16, 3,19,16,24, 27,4,7,5, 0,20,23,20, 28,31,0,0, };
-u8 t21040_csr13[] = {2,0x0C,8,4, 4,0,0,0, 0,0,0,0, 4,0,0,0};
-
-/* 21041 transceiver register settings: 10-T, 10-2, AUI, 10-T, 10T-FD*/
-u16 t21041_csr13[] = {
- csr13_mask_10bt, /* 10-T */
- csr13_mask_auibnc, /* 10-2 */
- csr13_mask_auibnc, /* AUI */
- csr13_mask_10bt, /* 10-T */
- csr13_mask_10bt, /* 10T-FD */
-};
-u16 t21041_csr14[] = { 0xFFFF, 0xF7FD, 0xF7FD, 0x7F3F, 0x7F3D, };
-u16 t21041_csr15[] = { 0x0008, 0x0006, 0x000E, 0x0008, 0x0008, };
-
static void tulip_tx_timeout(struct net_device *dev);
static void tulip_init_ring(struct net_device *dev);
@@ -388,19 +369,6 @@ media_picked:
outl(0x0008, ioaddr + CSR15);
}
tulip_select_media(dev, 1);
- } else if (tp->chip_id == DC21041) {
- dev->if_port = 0;
- tp->nway = tp->mediasense = 1;
- tp->nwayset = tp->lpar = 0;
- outl(0x00000000, ioaddr + CSR13);
- outl(0xFFFFFFFF, ioaddr + CSR14);
- outl(0x00000008, ioaddr + CSR15); /* Listen on AUI also. */
- tp->csr6 = 0x80020000;
- if (tp->sym_advertise & 0x0040)
- tp->csr6 |= FullDuplex;
- outl(tp->csr6, ioaddr + CSR6);
- outl(0x0000EF01, ioaddr + CSR13);
-
} else if (tp->chip_id == DC21142) {
if (tp->mii_cnt) {
tulip_select_media(dev, 1);
@@ -538,33 +506,6 @@ static void tulip_tx_timeout(struct net_device *dev)
if (tulip_debug > 1)
printk(KERN_WARNING "%s: Transmit timeout using MII device.\n",
dev->name);
- } else if (tp->chip_id == DC21040) {
- if ( !tp->medialock && inl(ioaddr + CSR12) & 0x0002) {
- dev->if_port = (dev->if_port == 2 ? 0 : 2);
- printk(KERN_INFO "%s: 21040 transmit timed out, switching to "
- "%s.\n",
- dev->name, medianame[dev->if_port]);
- tulip_select_media(dev, 0);
- }
- goto out;
- } else if (tp->chip_id == DC21041) {
- int csr12 = inl(ioaddr + CSR12);
-
- printk(KERN_WARNING "%s: 21041 transmit timed out, status %8.8x, "
- "CSR12 %8.8x, CSR13 %8.8x, CSR14 %8.8x, resetting...\n",
- dev->name, inl(ioaddr + CSR5), csr12,
- inl(ioaddr + CSR13), inl(ioaddr + CSR14));
- tp->mediasense = 1;
- if ( ! tp->medialock) {
- if (dev->if_port == 1 || dev->if_port == 2)
- if (csr12 & 0x0004) {
- dev->if_port = 2 - dev->if_port;
- } else
- dev->if_port = 0;
- else
- dev->if_port = 1;
- tulip_select_media(dev, 0);
- }
} else if (tp->chip_id == DC21140 || tp->chip_id == DC21142
|| tp->chip_id == MX98713 || tp->chip_id == COMPEX9881
|| tp->chip_id == DM910X) {
@@ -636,7 +577,6 @@ static void tulip_tx_timeout(struct net_device *dev)
tp->stats.tx_errors++;
-out:
spin_unlock_irqrestore (&tp->lock, flags);
dev->trans_start = jiffies;
netif_wake_queue (dev);
@@ -802,10 +742,6 @@ static void tulip_down (struct net_device *dev)
/* release any unconsumed transmit buffers */
tulip_clean_tx_ring(tp);
- /* 21040 -- Leave the card in 10baseT state. */
- if (tp->chip_id == DC21040)
- outl (0x00000004, ioaddr + CSR13);
-
if (inl (ioaddr + CSR6) != 0xffffffff)
tp->stats.rx_missed_errors += inl (ioaddr + CSR8) & 0xffff;
@@ -966,16 +902,14 @@ static int private_ioctl (struct net_device *dev, struct ifreq *rq, int cmd)
0x1848 +
((csr12&0x7000) == 0x5000 ? 0x20 : 0) +
((csr12&0x06) == 6 ? 0 : 4);
- if (tp->chip_id != DC21041)
- data->val_out |= 0x6048;
+ data->val_out |= 0x6048;
break;
case 4:
/* Advertised value, bogus 10baseTx-FD value from CSR6. */
data->val_out =
((inl(ioaddr + CSR6) >> 3) & 0x0040) +
((csr14 >> 1) & 0x20) + 1;
- if (tp->chip_id != DC21041)
- data->val_out |= ((csr14 >> 9) & 0x03C0);
+ data->val_out |= ((csr14 >> 9) & 0x03C0);
break;
case 5: data->val_out = tp->lpar; break;
default: data->val_out = 0; break;
@@ -1358,7 +1292,6 @@ static int __devinit tulip_init_one (struct pci_dev *pdev,
long ioaddr;
static int board_idx = -1;
int chip_idx = ent->driver_data;
- unsigned int t2104x_mode = 0;
unsigned int eeprom_missing = 0;
unsigned int force_csr0 = 0;
@@ -1527,31 +1460,12 @@ static int __devinit tulip_init_one (struct pci_dev *pdev,
/* Clear the missed-packet counter. */
inl(ioaddr + CSR8);
- if (chip_idx == DC21041) {
- if (inl(ioaddr + CSR9) & 0x8000) {
- chip_idx = DC21040;
- t2104x_mode = 1;
- } else {
- t2104x_mode = 2;
- }
- }
-
/* The station address ROM is read byte serially. The register must
be polled, waiting for the value to be read bit serially from the
EEPROM.
*/
sum = 0;
- if (chip_idx == DC21040) {
- outl(0, ioaddr + CSR9); /* Reset the pointer with a dummy write. */
- for (i = 0; i < 6; i++) {
- int value, boguscnt = 100000;
- do
- value = inl(ioaddr + CSR9);
- while (value < 0 && --boguscnt > 0);
- dev->dev_addr[i] = value;
- sum += value & 0xff;
- }
- } else if (chip_idx == LC82C168) {
+ if (chip_idx == LC82C168) {
for (i = 0; i < 3; i++) {
int value, boguscnt = 100000;
outl(0x600 | i, ioaddr + 0x98);
@@ -1719,10 +1633,6 @@ static int __devinit tulip_init_one (struct pci_dev *pdev,
dev->name, tulip_tbl[chip_idx].chip_name, chip_rev, ioaddr);
pci_set_drvdata(pdev, dev);
- if (t2104x_mode == 1)
- printk(" 21040 compatible mode,");
- else if (t2104x_mode == 2)
- printk(" 21041 mode,");
if (eeprom_missing)
printk(" EEPROM not present,");
for (i = 0; i < 6; i++)
@@ -1731,26 +1641,13 @@ static int __devinit tulip_init_one (struct pci_dev *pdev,
if (tp->chip_id == PNIC2)
tp->link_change = pnic2_lnk_change;
- else if ((tp->flags & HAS_NWAY) || tp->chip_id == DC21041)
+ else if (tp->flags & HAS_NWAY)
tp->link_change = t21142_lnk_change;
else if (tp->flags & HAS_PNICNWAY)
tp->link_change = pnic_lnk_change;
/* Reset the xcvr interface and turn on heartbeat. */
switch (chip_idx) {
- case DC21041:
- if (tp->sym_advertise == 0)
- tp->sym_advertise = 0x0061;
- outl(0x00000000, ioaddr + CSR13);
- outl(0xFFFFFFFF, ioaddr + CSR14);
- outl(0x00000008, ioaddr + CSR15); /* Listen on AUI also. */
- outl(inl(ioaddr + CSR6) | csr6_fd, ioaddr + CSR6);
- outl(0x0000EF01, ioaddr + CSR13);
- break;
- case DC21040:
- outl(0x00000000, ioaddr + CSR13);
- outl(0x00000004, ioaddr + CSR13);
- break;
case DC21140:
case DM910X:
default:
diff --git a/drivers/scsi/eata.c b/drivers/scsi/eata.c
index 1ce0fa803975..fa97dfb7bdde 100644
--- a/drivers/scsi/eata.c
+++ b/drivers/scsi/eata.c
@@ -1,6 +1,9 @@
/*
* eata.c - Low-level driver for EATA/DMA SCSI host adapters.
*
+ * 11 Dec 2001 Rev. 7.00 for linux 2.5.1
+ * + Use host->host_lock instead of io_request_lock.
+ *
* 1 May 2001 Rev. 6.05 for linux 2.4.4
* + Clean up all pci related routines.
* + Fix data transfer direction for opcode SEND_CUE_SHEET (0x5d)
@@ -438,13 +441,6 @@ MODULE_AUTHOR("Dario Ballabio");
#include <linux/ctype.h>
#include <linux/spinlock.h>
-#define SPIN_FLAGS unsigned long spin_flags;
-#define SPIN_LOCK spin_lock_irq(&io_request_lock);
-#define SPIN_LOCK_SAVE spin_lock_irqsave(&io_request_lock, spin_flags);
-#define SPIN_UNLOCK spin_unlock_irq(&io_request_lock);
-#define SPIN_UNLOCK_RESTORE \
- spin_unlock_irqrestore(&io_request_lock, spin_flags);
-
/* Subversion values */
#define ISA 0
#define ESA 1
@@ -1589,10 +1585,12 @@ static inline int do_reset(Scsi_Cmnd *SCarg) {
#endif
HD(j)->in_reset = TRUE;
- SPIN_UNLOCK
+
+ spin_unlock_irq(&sh[j]->host_lock);
time = jiffies;
while ((jiffies - time) < (10 * HZ) && limit++ < 200000) udelay(100L);
- SPIN_LOCK
+ spin_lock_irq(&sh[j]->host_lock);
+
printk("%s: reset, interrupts disabled, loops %d.\n", BN(j), limit);
for (i = 0; i < sh[j]->can_queue; i++) {
@@ -2036,14 +2034,14 @@ static inline void ihdlr(int irq, unsigned int j) {
static void do_interrupt_handler(int irq, void *shap, struct pt_regs *regs) {
unsigned int j;
- SPIN_FLAGS
+ unsigned long spin_flags;
/* Check if the interrupt must be processed by this handler */
if ((j = (unsigned int)((char *)shap - sha)) >= num_boards) return;
- SPIN_LOCK_SAVE
+ spin_lock_irqsave(&sh[j]->host_lock, spin_flags);
ihdlr(irq, j);
- SPIN_UNLOCK_RESTORE
+ spin_unlock_irqrestore(&sh[j]->host_lock, spin_flags);
}
int eata2x_release(struct Scsi_Host *shpnt) {
@@ -2077,4 +2075,4 @@ static Scsi_Host_Template driver_template = EATA;
#ifndef MODULE
__setup("eata=", option_setup);
#endif /* end MODULE */
-MODULE_LICENSE("Dual BSD/GPL");
+MODULE_LICENSE("GPL");
diff --git a/drivers/scsi/eata.h b/drivers/scsi/eata.h
index afa5e27870f9..de0bad6efaab 100644
--- a/drivers/scsi/eata.h
+++ b/drivers/scsi/eata.h
@@ -13,7 +13,7 @@ int eata2x_abort(Scsi_Cmnd *);
int eata2x_reset(Scsi_Cmnd *);
int eata2x_biosparam(Disk *, kdev_t, int *);
-#define EATA_VERSION "6.05.00"
+#define EATA_VERSION "7.00.00"
#define EATA { \
name: "EATA/DMA 2.0x rev. " EATA_VERSION " ", \
diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index 3713c3284243..656766c09f2d 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -183,7 +183,7 @@ void scsi_initialize_queue(Scsi_Device * SDpnt, struct Scsi_Host * SHpnt)
request_queue_t *q = &SDpnt->request_queue;
int max_segments = SHpnt->sg_tablesize;
- blk_init_queue(q, scsi_request_fn);
+ blk_init_queue(q, scsi_request_fn, &SHpnt->host_lock);
q->queuedata = (void *) SDpnt;
#ifdef DMA_CHUNK_SIZE
diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index af0bb409c9d6..b6894649e12f 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -1254,9 +1254,7 @@ STATIC void scsi_restart_operations(struct Scsi_Host *host)
break;
}
- spin_lock(&q->queue_lock);
q->request_fn(q);
- spin_unlock(&q->queue_lock);
}
spin_unlock_irqrestore(&host->host_lock, flags);
}
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index a723b3404227..d7cc000bcdd2 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -70,7 +70,7 @@ static void __scsi_insert_special(request_queue_t *q, struct request *rq,
{
unsigned long flags;
- ASSERT_LOCK(&q->queue_lock, 0);
+ ASSERT_LOCK(q->queue_lock, 0);
/*
* tell I/O scheduler that this isn't a regular read/write (ie it
@@ -91,10 +91,10 @@ static void __scsi_insert_special(request_queue_t *q, struct request *rq,
* head of the queue for things like a QUEUE_FULL message from a
* device, or a host that is unable to accept a particular command.
*/
- spin_lock_irqsave(&q->queue_lock, flags);
+ spin_lock_irqsave(q->queue_lock, flags);
__elv_add_request(q, rq, !at_head, 0);
q->request_fn(q);
- spin_unlock_irqrestore(&q->queue_lock, flags);
+ spin_unlock_irqrestore(q->queue_lock, flags);
}
@@ -250,9 +250,9 @@ void scsi_queue_next_request(request_queue_t * q, Scsi_Cmnd * SCpnt)
Scsi_Device *SDpnt;
struct Scsi_Host *SHpnt;
- ASSERT_LOCK(&q->queue_lock, 0);
+ ASSERT_LOCK(q->queue_lock, 0);
- spin_lock_irqsave(&q->queue_lock, flags);
+ spin_lock_irqsave(q->queue_lock, flags);
if (SCpnt != NULL) {
/*
@@ -325,7 +325,7 @@ void scsi_queue_next_request(request_queue_t * q, Scsi_Cmnd * SCpnt)
SHpnt->some_device_starved = 0;
}
}
- spin_unlock_irqrestore(&q->queue_lock, flags);
+ spin_unlock_irqrestore(q->queue_lock, flags);
}
/*
@@ -360,7 +360,7 @@ static Scsi_Cmnd *__scsi_end_request(Scsi_Cmnd * SCpnt,
request_queue_t *q = &SCpnt->device->request_queue;
struct request *req = &SCpnt->request;
- ASSERT_LOCK(&q->queue_lock, 0);
+ ASSERT_LOCK(q->queue_lock, 0);
/*
* If there are blocks left over at the end, set up the command
@@ -445,7 +445,7 @@ static void scsi_release_buffers(Scsi_Cmnd * SCpnt)
{
struct request *req = &SCpnt->request;
- ASSERT_LOCK(&SCpnt->device->request_queue.queue_lock, 0);
+ ASSERT_LOCK(&SCpnt->host->host_lock, 0);
/*
* Free up any indirection buffers we allocated for DMA purposes.
@@ -518,7 +518,7 @@ void scsi_io_completion(Scsi_Cmnd * SCpnt, int good_sectors,
* would be used if we just wanted to retry, for example.
*
*/
- ASSERT_LOCK(&q->queue_lock, 0);
+ ASSERT_LOCK(q->queue_lock, 0);
/*
* Free up any indirection buffers we allocated for DMA purposes.
@@ -746,8 +746,6 @@ struct Scsi_Device_Template *scsi_get_request_dev(struct request *req)
kdev_t dev = req->rq_dev;
int major = MAJOR(dev);
- ASSERT_LOCK(&req->q->queue_lock, 1);
-
for (spnt = scsi_devicelist; spnt; spnt = spnt->next) {
/*
* Search for a block device driver that supports this
@@ -804,7 +802,7 @@ void scsi_request_fn(request_queue_t * q)
struct Scsi_Host *SHpnt;
struct Scsi_Device_Template *STpnt;
- ASSERT_LOCK(&q->queue_lock, 1);
+ ASSERT_LOCK(q->queue_lock, 1);
SDpnt = (Scsi_Device *) q->queuedata;
if (!SDpnt) {
@@ -871,9 +869,9 @@ void scsi_request_fn(request_queue_t * q)
*/
SDpnt->was_reset = 0;
if (SDpnt->removable && !in_interrupt()) {
- spin_unlock_irq(&q->queue_lock);
+ spin_unlock_irq(q->queue_lock);
scsi_ioctl(SDpnt, SCSI_IOCTL_DOORLOCK, 0);
- spin_lock_irq(&q->queue_lock);
+ spin_lock_irq(q->queue_lock);
continue;
}
}
@@ -982,7 +980,7 @@ void scsi_request_fn(request_queue_t * q)
* another.
*/
req = NULL;
- spin_unlock_irq(&q->queue_lock);
+ spin_unlock_irq(q->queue_lock);
if (SCpnt->request.flags & REQ_CMD) {
/*
@@ -1012,7 +1010,7 @@ void scsi_request_fn(request_queue_t * q)
{
panic("Should not have leftover blocks\n");
}
- spin_lock_irq(&q->queue_lock);
+ spin_lock_irq(q->queue_lock);
SHpnt->host_busy--;
SDpnt->device_busy--;
continue;
@@ -1028,7 +1026,7 @@ void scsi_request_fn(request_queue_t * q)
{
panic("Should not have leftover blocks\n");
}
- spin_lock_irq(&q->queue_lock);
+ spin_lock_irq(q->queue_lock);
SHpnt->host_busy--;
SDpnt->device_busy--;
continue;
@@ -1049,7 +1047,7 @@ void scsi_request_fn(request_queue_t * q)
* Now we need to grab the lock again. We are about to mess
* with the request queue and try to find another command.
*/
- spin_lock_irq(&q->queue_lock);
+ spin_lock_irq(q->queue_lock);
}
}
diff --git a/drivers/scsi/scsi_merge.c b/drivers/scsi/scsi_merge.c
index 9d455e89574a..89def7c84d79 100644
--- a/drivers/scsi/scsi_merge.c
+++ b/drivers/scsi/scsi_merge.c
@@ -307,7 +307,7 @@ __inline static int __scsi_back_merge_fn(request_queue_t * q,
}
#ifdef DMA_CHUNK_SIZE
- if (MERGEABLE_BUFFERS(bio, req->bio))
+ if (MERGEABLE_BUFFERS(req->biotail, bio))
return scsi_new_mergeable(q, req, bio);
#endif
@@ -461,9 +461,7 @@ inline static int scsi_merge_requests_fn(request_queue_t * q,
* (mainly because we don't need queue management functions
* which keep the tally uptodate.
*/
-__inline static int __init_io(Scsi_Cmnd * SCpnt,
- int sg_count_valid,
- int dma_host)
+__inline static int __init_io(Scsi_Cmnd * SCpnt, int dma_host)
{
struct bio * bio;
char * buff;
@@ -480,11 +478,7 @@ __inline static int __init_io(Scsi_Cmnd * SCpnt,
/*
* First we need to know how many scatter gather segments are needed.
*/
- if (!sg_count_valid) {
- count = __count_segments(req, dma_host, NULL);
- } else {
- count = req->nr_segments;
- }
+ count = req->nr_segments;
/*
* If the dma pool is nearly empty, then queue a minimal request
@@ -721,20 +715,14 @@ __inline static int __init_io(Scsi_Cmnd * SCpnt,
return 1;
}
-#define INITIO(_FUNCTION, _VALID, _DMA) \
+#define INITIO(_FUNCTION, _DMA) \
static int _FUNCTION(Scsi_Cmnd * SCpnt) \
{ \
- return __init_io(SCpnt, _VALID, _DMA); \
+ return __init_io(SCpnt, _DMA); \
}
-/*
- * ll_rw_blk.c now keeps track of the number of segments in
- * a request. Thus we don't have to do it any more here.
- * We always force "_VALID" to 1. Eventually clean this up
- * and get rid of the extra argument.
- */
-INITIO(scsi_init_io_v, 1, 0)
-INITIO(scsi_init_io_vd, 1, 1)
+INITIO(scsi_init_io_v, 0)
+INITIO(scsi_init_io_vd, 1)
/*
* Function: initialize_merge_fn()
diff --git a/drivers/scsi/scsi_queue.c b/drivers/scsi/scsi_queue.c
index b864fc04507f..1d9a90bbdd56 100644
--- a/drivers/scsi/scsi_queue.c
+++ b/drivers/scsi/scsi_queue.c
@@ -80,7 +80,6 @@ int scsi_mlqueue_insert(Scsi_Cmnd * cmd, int reason)
{
struct Scsi_Host *host;
unsigned long flags;
- request_queue_t *q = &cmd->device->request_queue;
SCSI_LOG_MLQUEUE(1, printk("Inserting command %p into mlqueue\n", cmd));
@@ -138,10 +137,10 @@ int scsi_mlqueue_insert(Scsi_Cmnd * cmd, int reason)
* Decrement the counters, since these commands are no longer
* active on the host/device.
*/
- spin_lock_irqsave(&q->queue_lock, flags);
+ spin_lock_irqsave(&cmd->host->host_lock, flags);
cmd->host->host_busy--;
cmd->device->device_busy--;
- spin_unlock_irqrestore(&q->queue_lock, flags);
+ spin_unlock_irqrestore(&cmd->host->host_lock, flags);
/*
* Insert this command at the head of the queue for it's device.
diff --git a/drivers/scsi/u14-34f.c b/drivers/scsi/u14-34f.c
index 41cff9e57108..adacf2fd49a0 100644
--- a/drivers/scsi/u14-34f.c
+++ b/drivers/scsi/u14-34f.c
@@ -1,6 +1,9 @@
/*
* u14-34f.c - Low-level driver for UltraStor 14F/34F SCSI host adapters.
*
+ * 11 Dec 2001 Rev. 7.00 for linux 2.5.1
+ * + Use host->host_lock instead of io_request_lock.
+ *
* 1 May 2001 Rev. 6.05 for linux 2.4.4
* + Fix data transfer direction for opcode SEND_CUE_SHEET (0x5d)
*
@@ -334,7 +337,6 @@
* the driver sets host->wish_block = TRUE for all ISA boards.
*/
-#include <linux/module.h>
#include <linux/version.h>
#ifndef LinuxVersionCode
@@ -343,6 +345,9 @@
#define MAX_INT_PARAM 10
+#if defined(MODULE)
+#include <linux/module.h>
+
MODULE_PARM(boot_options, "s");
MODULE_PARM(io_port, "1-" __MODULE_STRING(MAX_INT_PARAM) "i");
MODULE_PARM(linked_comm, "i");
@@ -352,6 +357,8 @@ MODULE_PARM(max_queue_depth, "i");
MODULE_PARM(ext_tran, "i");
MODULE_AUTHOR("Dario Ballabio");
+#endif
+
#include <linux/string.h>
#include <linux/sched.h>
#include <linux/kernel.h>
@@ -374,13 +381,6 @@ MODULE_AUTHOR("Dario Ballabio");
#include <linux/ctype.h>
#include <linux/spinlock.h>
-#define SPIN_FLAGS unsigned long spin_flags;
-#define SPIN_LOCK spin_lock_irq(&io_request_lock);
-#define SPIN_LOCK_SAVE spin_lock_irqsave(&io_request_lock, spin_flags);
-#define SPIN_UNLOCK spin_unlock_irq(&io_request_lock);
-#define SPIN_UNLOCK_RESTORE \
- spin_unlock_irqrestore(&io_request_lock, spin_flags);
-
/* Values for the PRODUCT_ID ports for the 14/34F */
#define PRODUCT_ID1 0x56
#define PRODUCT_ID2 0x40 /* NOTE: Only upper nibble is used */
@@ -672,10 +672,8 @@ static int board_inquiry(unsigned int j) {
/* Issue OGM interrupt */
outb(CMD_OGM_INTR, sh[j]->io_port + REG_LCL_INTR);
- SPIN_UNLOCK
time = jiffies;
while ((jiffies - time) < HZ && limit++ < 20000) udelay(100L);
- SPIN_LOCK
if (cpp->adapter_status || HD(j)->cp_stat[0] != FREE) {
HD(j)->cp_stat[0] = FREE;
@@ -1274,10 +1272,12 @@ static inline int do_reset(Scsi_Cmnd *SCarg) {
#endif
HD(j)->in_reset = TRUE;
- SPIN_UNLOCK
+
+ spin_unlock_irq(&sh[j]->host_lock);
time = jiffies;
while ((jiffies - time) < (10 * HZ) && limit++ < 200000) udelay(100L);
- SPIN_LOCK
+ spin_lock_irq(&sh[j]->host_lock);
+
printk("%s: reset, interrupts disabled, loops %d.\n", BN(j), limit);
for (i = 0; i < sh[j]->can_queue; i++) {
@@ -1718,14 +1718,14 @@ static inline void ihdlr(int irq, unsigned int j) {
static void do_interrupt_handler(int irq, void *shap, struct pt_regs *regs) {
unsigned int j;
- SPIN_FLAGS
+ unsigned long spin_flags;
/* Check if the interrupt must be processed by this handler */
if ((j = (unsigned int)((char *)shap - sha)) >= num_boards) return;
- SPIN_LOCK_SAVE
+ spin_lock_irqsave(&sh[j]->host_lock, spin_flags);
ihdlr(irq, j);
- SPIN_UNLOCK_RESTORE
+ spin_unlock_irqrestore(&sh[j]->host_lock, spin_flags);
}
int u14_34f_release(struct Scsi_Host *shpnt) {
@@ -1752,7 +1752,6 @@ int u14_34f_release(struct Scsi_Host *shpnt) {
return FALSE;
}
-MODULE_LICENSE("BSD without advertisement clause");
static Scsi_Host_Template driver_template = ULTRASTOR_14_34F;
#include "scsi_module.c"
@@ -1760,3 +1759,4 @@ static Scsi_Host_Template driver_template = ULTRASTOR_14_34F;
#ifndef MODULE
__setup("u14-34f=", option_setup);
#endif /* end MODULE */
+MODULE_LICENSE("GPL");
diff --git a/drivers/scsi/u14-34f.h b/drivers/scsi/u14-34f.h
index 1d2988d739b5..d8d1d400fdd9 100644
--- a/drivers/scsi/u14-34f.h
+++ b/drivers/scsi/u14-34f.h
@@ -13,7 +13,7 @@ int u14_34f_abort(Scsi_Cmnd *);
int u14_34f_reset(Scsi_Cmnd *);
int u14_34f_biosparam(Disk *, kdev_t, int *);
-#define U14_34F_VERSION "6.05.00"
+#define U14_34F_VERSION "7.00.00"
#define ULTRASTOR_14_34F { \
name: "UltraStor 14F/34F rev. " U14_34F_VERSION " ", \
diff --git a/fs/bio.c b/fs/bio.c
index d04cbca7ab1b..36fe91f4a636 100644
--- a/fs/bio.c
+++ b/fs/bio.c
@@ -48,7 +48,7 @@ static const int bvec_pool_sizes[BIOVEC_NR_POOLS] = { 1, 4, 16, 64, 128, 256 };
#define BIO_MAX_PAGES (bvec_pool_sizes[BIOVEC_NR_POOLS - 1])
-static void * slab_pool_alloc(int gfp_mask, void *data)
+static void *slab_pool_alloc(int gfp_mask, void *data)
{
return kmem_cache_alloc(data, gfp_mask);
}
diff --git a/fs/block_dev.c b/fs/block_dev.c
index de4cb8afade6..301a62ef5777 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -324,6 +324,7 @@ struct block_device *bdget(dev_t dev)
new_bdev->bd_dev = dev;
new_bdev->bd_op = NULL;
new_bdev->bd_inode = inode;
+ inode->i_mode = S_IFBLK;
inode->i_rdev = kdev;
inode->i_dev = kdev;
inode->i_bdev = new_bdev;
diff --git a/fs/buffer.c b/fs/buffer.c
index 405e81410c88..e724f5ade105 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -2005,12 +2005,12 @@ int generic_direct_IO(int rw, struct inode * inode, struct kiobuf * iobuf, unsig
{
int i, nr_blocks, retval;
sector_t *blocks = iobuf->blocks;
- struct buffer_head bh;
- bh.b_dev = inode->i_dev;
nr_blocks = iobuf->length / blocksize;
/* build the blocklist */
for (i = 0; i < nr_blocks; i++, blocknr++) {
+ struct buffer_head bh;
+
bh.b_state = 0;
bh.b_dev = inode->i_dev;
bh.b_size = blocksize;
@@ -2037,7 +2037,7 @@ int generic_direct_IO(int rw, struct inode * inode, struct kiobuf * iobuf, unsig
}
/* This does not understand multi-device filesystems currently */
- retval = brw_kiovec(rw, 1, &iobuf, bh.b_dev, blocks, blocksize);
+ retval = brw_kiovec(rw, 1, &iobuf, inode->i_dev, blocks, blocksize);
out:
return retval;
diff --git a/fs/ufs/inode.c b/fs/ufs/inode.c
index 6ad90b306b0c..cff561ab9b5f 100644
--- a/fs/ufs/inode.c
+++ b/fs/ufs/inode.c
@@ -311,7 +311,7 @@ out:
return result;
}
-static int ufs_getfrag_block (struct inode *inode, long fragment, struct buffer_head *bh_result, int create)
+static int ufs_getfrag_block (struct inode *inode, sector_t fragment, struct buffer_head *bh_result, int create)
{
struct super_block * sb;
struct ufs_sb_private_info * uspi;
diff --git a/include/asm-i386/io.h b/include/asm-i386/io.h
index 0c5e61d14eef..d8d68e8c296d 100644
--- a/include/asm-i386/io.h
+++ b/include/asm-i386/io.h
@@ -51,12 +51,9 @@
*/
#if CONFIG_DEBUG_IOVIRT
extern void *__io_virt_debug(unsigned long x, const char *file, int line);
- extern unsigned long __io_phys_debug(unsigned long x, const char *file, int line);
#define __io_virt(x) __io_virt_debug((unsigned long)(x), __FILE__, __LINE__)
-//#define __io_phys(x) __io_phys_debug((unsigned long)(x), __FILE__, __LINE__)
#else
#define __io_virt(x) ((void *)(x))
-//#define __io_phys(x) __pa(x)
#endif
/*
diff --git a/include/asm-s390/io.h b/include/asm-s390/io.h
index a9c1a917a8fc..e044135ef779 100644
--- a/include/asm-s390/io.h
+++ b/include/asm-s390/io.h
@@ -19,7 +19,7 @@
#define IO_SPACE_LIMIT 0xffffffff
#define __io_virt(x) ((void *)(PAGE_OFFSET | (unsigned long)(x)))
-#define __io_phys(x) ((unsigned long)(x) & ~PAGE_OFFSET)
+
/*
* Change virtual addresses to physical addresses and vv.
* These are pretty trivial
diff --git a/include/asm-s390x/io.h b/include/asm-s390x/io.h
index 2d0d2e79a274..088e26498d68 100644
--- a/include/asm-s390x/io.h
+++ b/include/asm-s390x/io.h
@@ -19,7 +19,7 @@
#define IO_SPACE_LIMIT 0xffffffff
#define __io_virt(x) ((void *)(PAGE_OFFSET | (unsigned long)(x)))
-#define __io_phys(x) ((unsigned long)(x) & ~PAGE_OFFSET)
+
/*
* Change virtual addresses to physical addresses and vv.
* These are pretty trivial
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 204ab9765514..fad87a308171 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -160,7 +160,7 @@ struct request_queue
/*
* protects queue structures from reentrancy
*/
- spinlock_t queue_lock;
+ spinlock_t *queue_lock;
/*
* queue settings
@@ -258,13 +258,14 @@ extern void blk_put_request(struct request *);
extern void blk_plug_device(request_queue_t *);
extern void blk_recount_segments(request_queue_t *, struct bio *);
extern inline int blk_contig_segment(request_queue_t *q, struct bio *, struct bio *);
+extern void blk_queue_assign_lock(request_queue_t *q, spinlock_t *);
extern int block_ioctl(kdev_t, unsigned int, unsigned long);
/*
* Access functions for manipulating queue properties
*/
-extern int blk_init_queue(request_queue_t *, request_fn_proc *);
+extern int blk_init_queue(request_queue_t *, request_fn_proc *, spinlock_t *);
extern void blk_cleanup_queue(request_queue_t *);
extern void blk_queue_make_request(request_queue_t *, make_request_fn *);
extern void blk_queue_bounce_limit(request_queue_t *, u64);
diff --git a/include/linux/devfs_fs_kernel.h b/include/linux/devfs_fs_kernel.h
index 7ca978981e2c..0a241a076158 100644
--- a/include/linux/devfs_fs_kernel.h
+++ b/include/linux/devfs_fs_kernel.h
@@ -47,14 +47,6 @@
typedef struct devfs_entry * devfs_handle_t;
-
-#ifdef CONFIG_BLK_DEV_INITRD
-# define ROOT_DEVICE_NAME ((real_root_dev ==ROOT_DEV) ? root_device_name:NULL)
-#else
-# define ROOT_DEVICE_NAME root_device_name
-#endif
-
-
#ifdef CONFIG_DEVFS_FS
struct unique_numspace
diff --git a/include/linux/ide.h b/include/linux/ide.h
index 38a17222c225..5bcdab80f3f7 100644
--- a/include/linux/ide.h
+++ b/include/linux/ide.h
@@ -1001,7 +1001,6 @@ unsigned long ide_get_or_set_dma_base (ide_hwif_t *hwif, int extra, const char *
void hwif_unregister (ide_hwif_t *hwif);
-#define DRIVE_LOCK(drive) (&(drive)->queue.queue_lock)
extern spinlock_t ide_lock;
#endif /* _IDE_H */
diff --git a/include/linux/mempool.h b/include/linux/mempool.h
index 07e97d109ac8..bd3745152632 100644
--- a/include/linux/mempool.h
+++ b/include/linux/mempool.h
@@ -25,6 +25,7 @@ struct mempool_s {
};
extern mempool_t * mempool_create(int min_nr, mempool_alloc_t *alloc_fn,
mempool_free_t *free_fn, void *pool_data);
+extern void mempool_resize(mempool_t *pool, int new_min_nr, int gfp_mask);
extern void mempool_destroy(mempool_t *pool);
extern void * mempool_alloc(mempool_t *pool, int gfp_mask);
extern void mempool_free(void *element, mempool_t *pool);
diff --git a/include/linux/nbd.h b/include/linux/nbd.h
index 0dbf87851169..6c8bc1e4438e 100644
--- a/include/linux/nbd.h
+++ b/include/linux/nbd.h
@@ -46,7 +46,7 @@ nbd_end_request(struct request *req)
#ifdef PARANOIA
requests_out++;
#endif
- spin_lock_irqsave(&q->queue_lock, flags);
+ spin_lock_irqsave(q->queue_lock, flags);
while((bio = req->bio) != NULL) {
nsect = bio_sectors(bio);
blk_finished_io(nsect);
@@ -55,7 +55,7 @@ nbd_end_request(struct request *req)
bio_endio(bio, uptodate, nsect);
}
blkdev_release_request(req);
- spin_unlock_irqrestore(&q->queue_lock, flags);
+ spin_unlock_irqrestore(q->queue_lock, flags);
}
#define MAX_NBD 128
diff --git a/include/linux/raid/md.h b/include/linux/raid/md.h
index a7e18913ec09..233163eb2872 100644
--- a/include/linux/raid/md.h
+++ b/include/linux/raid/md.h
@@ -37,8 +37,12 @@
#include <linux/kernel_stat.h>
#include <asm/io.h>
#include <linux/completion.h>
+#include <linux/mempool.h>
+#include <linux/list.h>
+#include <linux/reboot.h>
+#include <linux/vmalloc.h>
+#include <linux/blkpg.h>
-#include <linux/raid/md_compatible.h>
/*
* 'md_p.h' holds the 'physical' layout of RAID devices
* 'md_u.h' holds the user <=> kernel API
diff --git a/include/linux/raid/md_compatible.h b/include/linux/raid/md_compatible.h
deleted file mode 100644
index 74dadd4bb663..000000000000
--- a/include/linux/raid/md_compatible.h
+++ /dev/null
@@ -1,158 +0,0 @@
-
-/*
- md.h : Multiple Devices driver compatibility layer for Linux 2.0/2.2
- Copyright (C) 1998 Ingo Molnar
-
- This program is free software; you can redistribute it and/or modify
- it under the terms of the GNU General Public License as published by
- the Free Software Foundation; either version 2, or (at your option)
- any later version.
-
- You should have received a copy of the GNU General Public License
- (for example /usr/src/linux/COPYING); if not, write to the Free
- Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
-*/
-
-#include <linux/version.h>
-
-#ifndef _MD_COMPATIBLE_H
-#define _MD_COMPATIBLE_H
-
-/** 2.3/2.4 stuff: **/
-
-#include <linux/reboot.h>
-#include <linux/vmalloc.h>
-#include <linux/blkpg.h>
-
-/* 000 */
-#define md__get_free_pages(x,y) __get_free_pages(x,y)
-
-#if defined(__i386__) || defined(__x86_64__)
-/* 001 */
-static __inline__ int md_cpu_has_mmx(void)
-{
- return test_bit(X86_FEATURE_MMX, &boot_cpu_data.x86_capability);
-}
-#else
-#define md_cpu_has_mmx(x) (0)
-#endif
-
-/* 002 */
-#define md_clear_page(page) clear_page(page)
-
-/* 003 */
-#define MD_EXPORT_SYMBOL(x) EXPORT_SYMBOL(x)
-
-/* 004 */
-#define md_copy_to_user(x,y,z) copy_to_user(x,y,z)
-
-/* 005 */
-#define md_copy_from_user(x,y,z) copy_from_user(x,y,z)
-
-/* 006 */
-#define md_put_user put_user
-
-/* 007 */
-static inline int md_capable_admin(void)
-{
- return capable(CAP_SYS_ADMIN);
-}
-
-/* 008 */
-#define MD_FILE_TO_INODE(file) ((file)->f_dentry->d_inode)
-
-/* 009 */
-static inline void md_flush_signals (void)
-{
- spin_lock(&current->sigmask_lock);
- flush_signals(current);
- spin_unlock(&current->sigmask_lock);
-}
-
-/* 010 */
-static inline void md_init_signals (void)
-{
- current->exit_signal = SIGCHLD;
- siginitsetinv(&current->blocked, sigmask(SIGKILL));
-}
-
-/* 011 */
-#define md_signal_pending signal_pending
-
-/* 012 - md_set_global_readahead - nowhere used */
-
-/* 013 */
-#define md_mdelay(x) mdelay(x)
-
-/* 014 */
-#define MD_SYS_DOWN SYS_DOWN
-#define MD_SYS_HALT SYS_HALT
-#define MD_SYS_POWER_OFF SYS_POWER_OFF
-
-/* 015 */
-#define md_register_reboot_notifier register_reboot_notifier
-
-/* 016 */
-#define md_test_and_set_bit test_and_set_bit
-
-/* 017 */
-#define md_test_and_clear_bit test_and_clear_bit
-
-/* 018 */
-#define md_atomic_read atomic_read
-#define md_atomic_set atomic_set
-
-/* 019 */
-#define md_lock_kernel lock_kernel
-#define md_unlock_kernel unlock_kernel
-
-/* 020 */
-
-#include <linux/init.h>
-
-#define md__init __init
-#define md__initdata __initdata
-#define md__initfunc(__arginit) __initfunc(__arginit)
-
-/* 021 */
-
-
-/* 022 */
-
-#define md_list_head list_head
-#define MD_LIST_HEAD(name) LIST_HEAD(name)
-#define MD_INIT_LIST_HEAD(ptr) INIT_LIST_HEAD(ptr)
-#define md_list_add list_add
-#define md_list_del list_del
-#define md_list_empty list_empty
-
-#define md_list_entry(ptr, type, member) list_entry(ptr, type, member)
-
-/* 023 */
-
-#define md_schedule_timeout schedule_timeout
-
-/* 024 */
-#define md_need_resched(tsk) ((tsk)->need_resched)
-
-/* 025 */
-#define md_spinlock_t spinlock_t
-#define MD_SPIN_LOCK_UNLOCKED SPIN_LOCK_UNLOCKED
-
-#define md_spin_lock spin_lock
-#define md_spin_unlock spin_unlock
-#define md_spin_lock_irq spin_lock_irq
-#define md_spin_unlock_irq spin_unlock_irq
-#define md_spin_unlock_irqrestore spin_unlock_irqrestore
-#define md_spin_lock_irqsave spin_lock_irqsave
-
-/* 026 */
-typedef wait_queue_head_t md_wait_queue_head_t;
-#define MD_DECLARE_WAITQUEUE(w,t) DECLARE_WAITQUEUE((w),(t))
-#define MD_DECLARE_WAIT_QUEUE_HEAD(x) DECLARE_WAIT_QUEUE_HEAD(x)
-#define md_init_waitqueue_head init_waitqueue_head
-
-/* END */
-
-#endif
-
diff --git a/include/linux/raid/md_k.h b/include/linux/raid/md_k.h
index 5382bc072c3d..6bf45496c507 100644
--- a/include/linux/raid/md_k.h
+++ b/include/linux/raid/md_k.h
@@ -158,9 +158,9 @@ static inline void mark_disk_nonsync(mdp_disk_t * d)
*/
struct mdk_rdev_s
{
- struct md_list_head same_set; /* RAID devices within the same set */
- struct md_list_head all; /* all RAID devices */
- struct md_list_head pending; /* undetected RAID devices */
+ struct list_head same_set; /* RAID devices within the same set */
+ struct list_head all; /* all RAID devices */
+ struct list_head pending; /* undetected RAID devices */
kdev_t dev; /* Device number */
kdev_t old_dev; /* "" when it was last imported */
@@ -197,7 +197,7 @@ struct mddev_s
int __minor;
mdp_super_t *sb;
int nb_dev;
- struct md_list_head disks;
+ struct list_head disks;
int sb_dirty;
mdu_param_t param;
int ro;
@@ -212,9 +212,9 @@ struct mddev_s
atomic_t active;
atomic_t recovery_active; /* blocks scheduled, but not written */
- md_wait_queue_head_t recovery_wait;
+ wait_queue_head_t recovery_wait;
- struct md_list_head all_mddevs;
+ struct list_head all_mddevs;
};
struct mdk_personality_s
@@ -240,7 +240,7 @@ struct mdk_personality_s
int (*stop_resync)(mddev_t *mddev);
int (*restart_resync)(mddev_t *mddev);
- int (*sync_request)(mddev_t *mddev, unsigned long block_nr);
+ int (*sync_request)(mddev_t *mddev, sector_t sector_nr);
};
@@ -269,9 +269,9 @@ extern mdp_disk_t *get_spare(mddev_t *mddev);
*/
#define ITERATE_RDEV_GENERIC(head,field,rdev,tmp) \
\
- for (tmp = head.next; \
- rdev = md_list_entry(tmp, mdk_rdev_t, field), \
- tmp = tmp->next, tmp->prev != &head \
+ for ((tmp) = (head).next; \
+ (rdev) = (list_entry((tmp), mdk_rdev_t, field)), \
+ (tmp) = (tmp)->next, (tmp)->prev != &(head) \
; )
/*
* iterates through the 'same array disks' ringlist
@@ -305,7 +305,7 @@ extern mdp_disk_t *get_spare(mddev_t *mddev);
#define ITERATE_MDDEV(mddev,tmp) \
\
for (tmp = all_mddevs.next; \
- mddev = md_list_entry(tmp, mddev_t, all_mddevs), \
+ mddev = list_entry(tmp, mddev_t, all_mddevs), \
tmp = tmp->next, tmp->prev != &all_mddevs \
; )
@@ -325,7 +325,7 @@ static inline void unlock_mddev (mddev_t * mddev)
typedef struct mdk_thread_s {
void (*run) (void *data);
void *data;
- md_wait_queue_head_t wqueue;
+ wait_queue_head_t wqueue;
unsigned long flags;
struct completion *event;
struct task_struct *tsk;
@@ -337,7 +337,7 @@ typedef struct mdk_thread_s {
#define MAX_DISKNAME_LEN 64
typedef struct dev_name_s {
- struct md_list_head list;
+ struct list_head list;
kdev_t dev;
char namebuf [MAX_DISKNAME_LEN];
char *name;
diff --git a/include/linux/raid/raid1.h b/include/linux/raid/raid1.h
index 40675b40ca0f..c03eabf2e55c 100644
--- a/include/linux/raid/raid1.h
+++ b/include/linux/raid/raid1.h
@@ -3,6 +3,8 @@
#include <linux/raid/md.h>
+typedef struct mirror_info mirror_info_t;
+
struct mirror_info {
int number;
int raid_disk;
@@ -20,34 +22,21 @@ struct mirror_info {
int used_slot;
};
-struct raid1_private_data {
+typedef struct r1bio_s r1bio_t;
+
+struct r1_private_data_s {
mddev_t *mddev;
- struct mirror_info mirrors[MD_SB_DISKS];
+ mirror_info_t mirrors[MD_SB_DISKS];
int nr_disks;
int raid_disks;
int working_disks;
int last_used;
- unsigned long next_sect;
+ sector_t next_sect;
int sect_count;
mdk_thread_t *thread, *resync_thread;
int resync_mirrors;
- struct mirror_info *spare;
- md_spinlock_t device_lock;
-
- /* buffer pool */
- /* buffer_heads that we have pre-allocated have b_pprev -> &freebh
- * and are linked into a stack using b_next
- * raid1_bh that are pre-allocated have R1BH_PreAlloc set.
- * All these variable are protected by device_lock
- */
- struct buffer_head *freebh;
- int freebh_cnt; /* how many are on the list */
- int freebh_blocked;
- struct raid1_bh *freer1;
- int freer1_blocked;
- int freer1_cnt;
- struct raid1_bh *freebuf; /* each bh_req has a page allocated */
- md_wait_queue_head_t wait_buffer;
+ mirror_info_t *spare;
+ spinlock_t device_lock;
/* for use when syncing mirrors: */
unsigned long start_active, start_ready,
@@ -56,18 +45,21 @@ struct raid1_private_data {
cnt_pending, cnt_future;
int phase;
int window;
- md_wait_queue_head_t wait_done;
- md_wait_queue_head_t wait_ready;
- md_spinlock_t segment_lock;
+ wait_queue_head_t wait_done;
+ wait_queue_head_t wait_ready;
+ spinlock_t segment_lock;
+
+ mempool_t *r1bio_pool;
+ mempool_t *r1buf_pool;
};
-typedef struct raid1_private_data raid1_conf_t;
+typedef struct r1_private_data_s conf_t;
/*
* this is the only point in the RAID code where we violate
* C type safety. mddev->private is an 'opaque' pointer.
*/
-#define mddev_to_conf(mddev) ((raid1_conf_t *) mddev->private)
+#define mddev_to_conf(mddev) ((conf_t *) mddev->private)
/*
* this is our 'private' 'collective' RAID1 buffer head.
@@ -75,20 +67,32 @@ typedef struct raid1_private_data raid1_conf_t;
* for this RAID1 operation, and about their status:
*/
-struct raid1_bh {
+struct r1bio_s {
atomic_t remaining; /* 'have we finished' count,
* used from IRQ handlers
*/
int cmd;
+ sector_t sector;
unsigned long state;
mddev_t *mddev;
- struct buffer_head *master_bh;
- struct buffer_head *mirror_bh_list;
- struct buffer_head bh_req;
- struct raid1_bh *next_r1; /* next for retry or in free list */
+ /*
+ * original bio going to /dev/mdx
+ */
+ struct bio *master_bio;
+ /*
+ * if the IO is in READ direction, then this bio is used:
+ */
+ struct bio *read_bio;
+ /*
+ * if the IO is in WRITE direction, then multiple bios are used:
+ */
+ struct bio *write_bios[MD_SB_DISKS];
+
+ r1bio_t *next_r1; /* next for retry or in free list */
+ struct list_head retry_list;
};
-/* bits for raid1_bh.state */
-#define R1BH_Uptodate 1
-#define R1BH_SyncPhase 2
-#define R1BH_PreAlloc 3 /* this was pre-allocated, add to free list */
+
+/* bits for r1bio.state */
+#define R1BIO_Uptodate 1
+#define R1BIO_SyncPhase 2
#endif
diff --git a/init/do_mounts.c b/init/do_mounts.c
index d34fdd7ae7f8..e6a94292c2b4 100644
--- a/init/do_mounts.c
+++ b/init/do_mounts.c
@@ -14,37 +14,44 @@
#include <linux/nfs_fs.h>
#include <linux/nfs_fs_sb.h>
#include <linux/nfs_mount.h>
+#include <linux/minix_fs.h>
+#include <linux/ext2_fs.h>
+#include <linux/romfs_fs.h>
#include <asm/uaccess.h>
-/* syscalls missing from unistd.h */
-
-static inline _syscall2(int,mkdir,char *,name,int,mode);
-static inline _syscall1(int,chdir,char *,name);
-static inline _syscall1(int,chroot,char *,name);
-static inline _syscall1(int,unlink,char *,name);
-static inline _syscall3(int,mknod,char *,name,int,mode,dev_t,dev);
-static inline _syscall5(int,mount,char *,dev,char *,dir,char *,type,
- unsigned long,flags,void *,data);
-static inline _syscall2(int,umount,char *,name,int,flags);
-
-extern void rd_load(void);
-extern void initrd_load(void);
+#define BUILD_CRAMDISK
+
extern int get_filesystem_list(char * buf);
extern void wait_for_keypress(void);
-asmlinkage long sys_mount(char * dev_name, char * dir_name, char * type,
- unsigned long flags, void * data);
+asmlinkage long sys_mount(char *dev_name, char *dir_name, char *type,
+ unsigned long flags, void *data);
+asmlinkage long sys_mkdir(char *name, int mode);
+asmlinkage long sys_chdir(char *name);
+asmlinkage long sys_chroot(char *name);
+asmlinkage long sys_unlink(char *name);
+asmlinkage long sys_symlink(char *old, char *new);
+asmlinkage long sys_mknod(char *name, int mode, dev_t dev);
+asmlinkage long sys_umount(char *name, int flags);
+asmlinkage long sys_ioctl(int fd, int cmd, unsigned long arg);
#ifdef CONFIG_BLK_DEV_INITRD
unsigned int real_root_dev; /* do_proc_dointvec cannot handle kdev_t */
#endif
-int root_mountflags = MS_RDONLY;
-char root_device_name[64];
+#ifdef CONFIG_BLK_DEV_RAM
+extern int rd_doload;
+#else
+static int rd_doload = 0;
+#endif
+int root_mountflags = MS_RDONLY | MS_VERBOSE;
+static char root_device_name[64];
/* this is initialized in init/main.c */
kdev_t ROOT_DEV;
+static int do_devfs = 0;
+
static int __init readonly(char *str)
{
if (*str)
@@ -275,91 +282,20 @@ static void __init get_fs_names(char *page)
}
*s = '\0';
}
-
-static void __init mount_root(void)
+static void __init mount_block_root(char *name, int flags)
{
- void *handle;
- char path[64];
- char *name = "/dev/root";
- char *fs_names, *p;
- int do_devfs = 0;
+ char *fs_names = __getname();
+ char *p;
- root_mountflags |= MS_VERBOSE;
-
- fs_names = __getname();
get_fs_names(fs_names);
-
-#ifdef CONFIG_ROOT_NFS
- if (MAJOR(ROOT_DEV) == UNNAMED_MAJOR) {
- void *data;
- data = nfs_root_data();
- if (data) {
- int err = mount("/dev/root", "/root", "nfs", root_mountflags, data);
- if (!err)
- goto done;
- }
- printk(KERN_ERR "VFS: Unable to mount root fs via NFS, trying floppy.\n");
- ROOT_DEV = MKDEV(FLOPPY_MAJOR, 0);
- }
-#endif
-
-#ifdef CONFIG_BLK_DEV_FD
- if (MAJOR(ROOT_DEV) == FLOPPY_MAJOR) {
-#ifdef CONFIG_BLK_DEV_RAM
- extern int rd_doload;
- extern void rd_load_secondary(void);
-#endif
- floppy_eject();
-#ifndef CONFIG_BLK_DEV_RAM
- printk(KERN_NOTICE "(Warning, this kernel has no ramdisk support)\n");
-#else
- /* rd_doload is 2 for a dual initrd/ramload setup */
- if(rd_doload==2)
- rd_load_secondary();
- else
-#endif
- {
- printk(KERN_NOTICE "VFS: Insert root floppy and press ENTER\n");
- wait_for_keypress();
- }
- }
-#endif
-
- devfs_make_root (root_device_name);
- handle = devfs_find_handle (NULL, ROOT_DEVICE_NAME,
- MAJOR (ROOT_DEV), MINOR (ROOT_DEV),
- DEVFS_SPECIAL_BLK, 1);
- if (handle) {
- int n;
- unsigned major, minor;
-
- devfs_get_maj_min (handle, &major, &minor);
- ROOT_DEV = MKDEV (major, minor);
- if (!ROOT_DEV)
- panic("I have no root and I want to scream");
- n = devfs_generate_path (handle, path + 5, sizeof (path) - 5);
- if (n >= 0) {
- name = path + n;
- devfs_mk_symlink (NULL, "root", DEVFS_FL_DEFAULT,
- name + 5, NULL, NULL);
- memcpy (name, "/dev/", 5);
- do_devfs = 1;
- }
- }
- chdir("/dev");
- unlink("root");
- mknod("root", S_IFBLK|0600, kdev_t_to_nr(ROOT_DEV));
- if (do_devfs)
- mount("devfs", ".", "devfs", 0, NULL);
retry:
for (p = fs_names; *p; p += strlen(p)+1) {
- int err;
- err = sys_mount(name,"/root",p,root_mountflags,root_mount_data);
+ int err = sys_mount(name, "/root", p, flags, root_mount_data);
switch (err) {
case 0:
- goto done;
+ goto out;
case -EACCES:
- root_mountflags |= MS_RDONLY;
+ flags |= MS_RDONLY;
goto retry;
case -EINVAL:
continue;
@@ -375,94 +311,324 @@ retry:
kdevname(ROOT_DEV));
}
panic("VFS: Unable to mount root fs on %s", kdevname(ROOT_DEV));
-
-done:
+out:
putname(fs_names);
- if (do_devfs)
- umount(".", 0);
+ sys_chdir("/root");
+ ROOT_DEV = current->fs->pwdmnt->mnt_sb->s_dev;
+ printk("VFS: Mounted root (%s filesystem)%s.\n",
+ current->fs->pwdmnt->mnt_sb->s_type->name,
+ (current->fs->pwdmnt->mnt_sb->s_flags & MS_RDONLY) ? " readonly" : "");
}
+
+#ifdef CONFIG_ROOT_NFS
+static int __init mount_nfs_root(void)
+{
+ void *data = nfs_root_data();
-#ifdef CONFIG_BLK_DEV_INITRD
+ if (data && sys_mount("/dev/root","/root","nfs",root_mountflags,data) == 0)
+ return 1;
+ return 0;
+}
+#endif
-static int __init change_root(kdev_t new_root_dev,const char *put_old)
+static int __init create_dev(char *name, kdev_t dev, char *devfs_name)
+{
+ void *handle;
+ char path[64];
+ int n;
+
+ sys_unlink(name);
+ if (!do_devfs)
+ return sys_mknod(name, S_IFBLK|0600, kdev_t_to_nr(dev));
+
+ handle = devfs_find_handle(NULL, dev ? NULL : devfs_name,
+ MAJOR(dev), MINOR(dev), DEVFS_SPECIAL_BLK, 1);
+ if (!handle)
+ return -1;
+ n = devfs_generate_path(handle, path + 5, sizeof (path) - 5);
+ if (n < 0)
+ return -1;
+ return sys_symlink(path + n + 5, name);
+}
+
+#ifdef CONFIG_MAC_FLOPPY
+int swim3_fd_eject(int devnum);
+#endif
+static void __init change_floppy(char *fmt, ...)
{
- struct vfsmount *old_rootmnt;
- struct nameidata devfs_nd;
- char *new_devname = kmalloc(strlen("/dev/root.old")+1, GFP_KERNEL);
- int error = 0;
-
- if (new_devname)
- strcpy(new_devname, "/dev/root.old");
-
- /* .. here is directory mounted over root */
- mount("..", ".", NULL, MS_MOVE, NULL);
- chdir("/old");
-
- read_lock(&current->fs->lock);
- old_rootmnt = mntget(current->fs->pwdmnt);
- read_unlock(&current->fs->lock);
-
- /* First unmount devfs if mounted */
- if (path_init("/old/dev", LOOKUP_FOLLOW|LOOKUP_POSITIVE, &devfs_nd))
- error = path_walk("/old/dev", &devfs_nd);
- if (!error) {
- if (devfs_nd.mnt->mnt_sb->s_magic == DEVFS_SUPER_MAGIC &&
- devfs_nd.dentry == devfs_nd.mnt->mnt_root)
- umount("/old/dev", 0);
- path_release(&devfs_nd);
+ extern void wait_for_keypress(void);
+ char buf[80];
+ va_list args;
+ va_start(args, fmt);
+ vsprintf(buf, fmt, args);
+ va_end(args);
+#ifdef CONFIG_BLK_DEV_FD
+ floppy_eject();
+#endif
+#ifdef CONFIG_MAC_FLOPPY
+ swim3_fd_eject(MINOR(ROOT_DEV));
+#endif
+ printk(KERN_NOTICE "VFS: Insert %s and press ENTER\n", buf);
+ wait_for_keypress();
+}
+
+#ifdef CONFIG_BLK_DEV_RAM
+
+static int __init crd_load(int in_fd, int out_fd);
+
+/*
+ * This routine tries to find a RAM disk image to load, and returns the
+ * number of blocks to read for a non-compressed image, 0 if the image
+ * is a compressed image, and -1 if an image with the right magic
+ * numbers could not be found.
+ *
+ * We currently check for the following magic numbers:
+ * minix
+ * ext2
+ * romfs
+ * gzip
+ */
+static int __init
+identify_ramdisk_image(int fd, int start_block)
+{
+ const int size = 512;
+ struct minix_super_block *minixsb;
+ struct ext2_super_block *ext2sb;
+ struct romfs_super_block *romfsb;
+ int nblocks = -1;
+ unsigned char *buf;
+
+ buf = kmalloc(size, GFP_KERNEL);
+ if (buf == 0)
+ return -1;
+
+ minixsb = (struct minix_super_block *) buf;
+ ext2sb = (struct ext2_super_block *) buf;
+ romfsb = (struct romfs_super_block *) buf;
+ memset(buf, 0xe5, size);
+
+ /*
+ * Read block 0 to test for a gzipped image
+ */
+ lseek(fd, start_block * BLOCK_SIZE, 0);
+ read(fd, buf, size);
+
+ /*
+ * If it matches the gzip magic numbers, return 0
+ */
+ if (buf[0] == 037 && ((buf[1] == 0213) || (buf[1] == 0236))) {
+ printk(KERN_NOTICE
+ "RAMDISK: Compressed image found at block %d\n",
+ start_block);
+ nblocks = 0;
+ goto done;
}
- ROOT_DEV = new_root_dev;
- mount_root();
+ /* romfs is at block zero too */
+ if (romfsb->word0 == ROMSB_WORD0 &&
+ romfsb->word1 == ROMSB_WORD1) {
+ printk(KERN_NOTICE
+ "RAMDISK: romfs filesystem found at block %d\n",
+ start_block);
+ nblocks = (ntohl(romfsb->size)+BLOCK_SIZE-1)>>BLOCK_SIZE_BITS;
+ goto done;
+ }
- chdir("/root");
- ROOT_DEV = current->fs->pwdmnt->mnt_sb->s_dev;
- printk("VFS: Mounted root (%s filesystem)%s.\n",
- current->fs->pwdmnt->mnt_sb->s_type->name,
- (current->fs->pwdmnt->mnt_sb->s_flags & MS_RDONLY) ? " readonly" : "");
+ /*
+ * Read block 1 to test for minix and ext2 superblock
+ */
+ lseek(fd, (start_block+1) * BLOCK_SIZE, 0);
+ read(fd, buf, size);
+
+ /* Try minix */
+ if (minixsb->s_magic == MINIX_SUPER_MAGIC ||
+ minixsb->s_magic == MINIX_SUPER_MAGIC2) {
+ printk(KERN_NOTICE
+ "RAMDISK: Minix filesystem found at block %d\n",
+ start_block);
+ nblocks = minixsb->s_nzones << minixsb->s_log_zone_size;
+ goto done;
+ }
+
+ /* Try ext2 */
+ if (ext2sb->s_magic == cpu_to_le16(EXT2_SUPER_MAGIC)) {
+ printk(KERN_NOTICE
+ "RAMDISK: ext2 filesystem found at block %d\n",
+ start_block);
+ nblocks = le32_to_cpu(ext2sb->s_blocks_count);
+ goto done;
+ }
-#if 1
- shrink_dcache();
- printk("change_root: old root has d_count=%d\n",
- atomic_read(&old_rootmnt->mnt_root->d_count));
+ printk(KERN_NOTICE
+ "RAMDISK: Couldn't find valid RAM disk image starting at %d.\n",
+ start_block);
+
+done:
+ lseek(fd, start_block * BLOCK_SIZE, 0);
+ kfree(buf);
+ return nblocks;
+}
#endif
- error = mount("/old", "/root/initrd", NULL, MS_MOVE, NULL);
- if (error) {
- int blivet;
- struct block_device *ramdisk = old_rootmnt->mnt_sb->s_bdev;
-
- atomic_inc(&ramdisk->bd_count);
- blivet = blkdev_get(ramdisk, FMODE_READ, 0, BDEV_FS);
- printk(KERN_NOTICE "Trying to unmount old root ... ");
- umount("/old", MNT_DETACH);
- if (!blivet) {
- blivet = ioctl_by_bdev(ramdisk, BLKFLSBUF, 0);
- blkdev_put(ramdisk, BDEV_FS);
- }
- if (blivet) {
- printk(KERN_ERR "error %d\n", blivet);
- } else {
- printk("okay\n");
- error = 0;
+static int __init rd_load_image(char *from)
+{
+ int res = 0;
+
+#ifdef CONFIG_BLK_DEV_RAM
+ int in_fd, out_fd;
+ int nblocks, rd_blocks, devblocks, i;
+ char *buf;
+ unsigned short rotate = 0;
+#if !defined(CONFIG_ARCH_S390) && !defined(CONFIG_PPC_ISERIES)
+ char rotator[4] = { '|' , '/' , '-' , '\\' };
+#endif
+
+ out_fd = open("/dev/ram", O_RDWR, 0);
+ if (out_fd < 0)
+ goto out;
+
+ in_fd = open(from, O_RDONLY, 0);
+ if (in_fd < 0)
+ goto noclose_input;
+
+ nblocks = identify_ramdisk_image(in_fd, rd_image_start);
+ if (nblocks < 0)
+ goto done;
+
+ if (nblocks == 0) {
+#ifdef BUILD_CRAMDISK
+ if (crd_load(in_fd, out_fd) == 0)
+ goto successful_load;
+#else
+ printk(KERN_NOTICE
+ "RAMDISK: Kernel does not support compressed "
+ "RAM disk images\n");
+#endif
+ goto done;
+ }
+
+ /*
+ * NOTE NOTE: nblocks assumes that the block size is BLOCK_SIZE, so
+ * rd_load_image will work only with filesystems that use
+ * BLOCK_SIZE-wide blocks! So make sure to use a 1k block size when
+ * generating ext2fs ramdisk images.
+ */
+ if (sys_ioctl(out_fd, BLKGETSIZE, (unsigned long)&rd_blocks) < 0)
+ rd_blocks = 0;
+ else
+ rd_blocks >>= 1;
+
+ if (nblocks > rd_blocks) {
+ printk("RAMDISK: image too big! (%d/%d blocks)\n",
+ nblocks, rd_blocks);
+ goto done;
+ }
+
+ /*
+ * OK, time to copy in the data
+ */
+ buf = kmalloc(BLOCK_SIZE, GFP_KERNEL);
+ if (buf == 0) {
+ printk(KERN_ERR "RAMDISK: could not allocate buffer\n");
+ goto done;
+ }
+
+ if (sys_ioctl(in_fd, BLKGETSIZE, (unsigned long)&devblocks) < 0)
+ devblocks = 0;
+ else
+ devblocks >>= 1;
+
+ if (strcmp(from, "/dev/initrd") == 0)
+ devblocks = nblocks;
+
+ if (devblocks == 0) {
+ printk(KERN_ERR "RAMDISK: could not determine device size\n");
+ goto done;
+ }
+
+ printk(KERN_NOTICE "RAMDISK: Loading %d blocks [%d disk%s] into ram disk... ",
+ nblocks, ((nblocks-1)/devblocks)+1, nblocks>devblocks ? "s" : "");
+ for (i=0; i < nblocks; i++) {
+ if (i && (i % devblocks == 0)) {
+ printk("done disk #%d.\n", i/devblocks);
+ rotate = 0;
+ if (close(in_fd)) {
+ printk("Error closing the disk.\n");
+ goto noclose_input;
+ }
+ change_floppy("disk #%d", i/devblocks+1);
+ in_fd = open(from, O_RDONLY, 0);
+ if (in_fd < 0) {
+ printk("Error opening disk.\n");
+ goto noclose_input;
+ }
+ printk("Loading disk #%d... ", i/devblocks+1);
}
- } else {
- spin_lock(&dcache_lock);
- if (new_devname) {
- void *p = old_rootmnt->mnt_devname;
- old_rootmnt->mnt_devname = new_devname;
- new_devname = p;
+ read(in_fd, buf, BLOCK_SIZE);
+ write(out_fd, buf, BLOCK_SIZE);
+#if !defined(CONFIG_ARCH_S390) && !defined(CONFIG_PPC_ISERIES)
+ if (!(i % 16)) {
+ printk("%c\b", rotator[rotate & 0x3]);
+ rotate++;
}
- spin_unlock(&dcache_lock);
+#endif
}
+ printk("done.\n");
+ kfree(buf);
- /* put the old stuff */
- mntput(old_rootmnt);
- kfree(new_devname);
- return error;
+successful_load:
+ res = 1;
+done:
+ close(in_fd);
+noclose_input:
+ close(out_fd);
+out:
+ sys_unlink("/dev/ram");
+#endif
+ return res;
+}
+
+static int __init rd_load_disk(int n)
+{
+#ifdef CONFIG_BLK_DEV_RAM
+ extern int rd_prompt;
+ if (rd_prompt)
+ change_floppy("root floppy disk to be loaded into RAM disk");
+ create_dev("/dev/ram", MKDEV(RAMDISK_MAJOR, n), NULL);
+#endif
+ return rd_load_image("/dev/root");
}
+static void __init mount_root(void)
+{
+#ifdef CONFIG_ROOT_NFS
+ if (MAJOR(ROOT_DEV) == UNNAMED_MAJOR) {
+ if (mount_nfs_root()) {
+ sys_chdir("/root");
+ ROOT_DEV = current->fs->pwdmnt->mnt_sb->s_dev;
+ printk("VFS: Mounted root (nfs filesystem).\n");
+ return;
+ }
+ printk(KERN_ERR "VFS: Unable to mount root fs via NFS, trying floppy.\n");
+ ROOT_DEV = MKDEV(FLOPPY_MAJOR, 0);
+ }
#endif
+ devfs_make_root(root_device_name);
+ create_dev("/dev/root", ROOT_DEV, root_device_name);
+#ifdef CONFIG_BLK_DEV_FD
+ if (MAJOR(ROOT_DEV) == FLOPPY_MAJOR) {
+ /* rd_doload is 2 for a dual initrd/ramload setup */
+ if (rd_doload==2) {
+ if (rd_load_disk(1)) {
+ ROOT_DEV = MKDEV(RAMDISK_MAJOR, 1);
+ create_dev("/dev/root", ROOT_DEV, NULL);
+ }
+ } else
+ change_floppy("root floppy");
+ }
+#endif
+ mount_block_root("/dev/root", root_mountflags);
+}
#ifdef CONFIG_BLK_DEV_INITRD
static int do_linuxrc(void * shell)
@@ -470,9 +636,9 @@ static int do_linuxrc(void * shell)
static char *argv[] = { "linuxrc", NULL, };
extern char * envp_init[];
- chdir("/root");
- mount(".", "/", NULL, MS_MOVE, NULL);
- chroot(".");
+ sys_chdir("/root");
+ sys_mount(".", "/", NULL, MS_MOVE, NULL);
+ sys_chroot(".");
mount_devfs_fs ();
@@ -486,76 +652,247 @@ static int do_linuxrc(void * shell)
#endif
+static void __init handle_initrd(void)
+{
+#ifdef CONFIG_BLK_DEV_INITRD
+ int ram0 = kdev_t_to_nr(MKDEV(RAMDISK_MAJOR,0));
+ int error;
+ int i, pid;
+
+ create_dev("/dev/root.old", ram0, NULL);
+ mount_block_root("/dev/root.old", root_mountflags & ~MS_RDONLY);
+ sys_mkdir("/old", 0700);
+ sys_chdir("/old");
+
+ pid = kernel_thread(do_linuxrc, "/linuxrc", SIGCHLD);
+ if (pid > 0) {
+ while (pid != wait(&i)) {
+ current->policy |= SCHED_YIELD;
+ schedule();
+ }
+ }
+
+ sys_mount("..", ".", NULL, MS_MOVE, NULL);
+ sys_umount("/old/dev", 0);
+
+ if (real_root_dev == ram0) {
+ sys_chdir("/old");
+ return;
+ }
+
+ ROOT_DEV = real_root_dev;
+ mount_root();
+
+ printk(KERN_NOTICE "Trying to move old root to /initrd ... ");
+ error = sys_mount("/old", "/root/initrd", NULL, MS_MOVE, NULL);
+ if (!error)
+ printk("okay\n");
+ else {
+ int fd = open("/dev/root.old", O_RDWR, 0);
+ printk("failed\n");
+ printk(KERN_NOTICE "Unmounting old root\n");
+ sys_umount("/old", MNT_DETACH);
+ printk(KERN_NOTICE "Trying to free ramdisk memory ... ");
+ if (fd < 0) {
+ error = fd;
+ } else {
+ error = sys_ioctl(fd, BLKFLSBUF, 0);
+ close(fd);
+ }
+ printk(!error ? "okay\n" : "failed\n");
+ }
+#endif
+}
+
+static int __init initrd_load(void)
+{
+#ifdef CONFIG_BLK_DEV_INITRD
+ create_dev("/dev/ram", MKDEV(RAMDISK_MAJOR, 0), NULL);
+ create_dev("/dev/initrd", MKDEV(RAMDISK_MAJOR, INITRD_MINOR), NULL);
+#endif
+ return rd_load_image("/dev/initrd");
+}
+
/*
* Prepare the namespace - decide what/where to mount, load ramdisks, etc.
*/
void prepare_namespace(void)
{
+ int do_initrd = 0;
+ int is_floppy = MAJOR(ROOT_DEV) == FLOPPY_MAJOR;
#ifdef CONFIG_BLK_DEV_INITRD
- int real_root_mountflags = root_mountflags;
if (!initrd_start)
mount_initrd = 0;
if (mount_initrd)
- root_mountflags &= ~MS_RDONLY;
+ do_initrd = 1;
real_root_dev = ROOT_DEV;
#endif
- mkdir("/dev", 0700);
- mkdir("/root", 0700);
-
-#ifdef CONFIG_BLK_DEV_RAM
-#ifdef CONFIG_BLK_DEV_INITRD
- if (mount_initrd)
- initrd_load();
- else
-#endif
- rd_load();
+ sys_mkdir("/dev", 0700);
+ sys_mkdir("/root", 0700);
+#ifdef CONFIG_DEVFS_FS
+ sys_mount("devfs", "/dev", "devfs", 0, NULL);
+ do_devfs = 1;
#endif
- /* Mount the root filesystem.. */
+ create_dev("/dev/root", ROOT_DEV, NULL);
+ if (do_initrd) {
+ if (initrd_load() && ROOT_DEV != MKDEV(RAMDISK_MAJOR, 0)) {
+ handle_initrd();
+ goto out;
+ }
+ } else if (is_floppy && rd_doload && rd_load_disk(0))
+ ROOT_DEV = MKDEV(RAMDISK_MAJOR, 0);
mount_root();
- chdir("/root");
- ROOT_DEV = current->fs->pwdmnt->mnt_sb->s_dev;
- printk("VFS: Mounted root (%s filesystem)%s.\n",
- current->fs->pwdmnt->mnt_sb->s_type->name,
- (current->fs->pwdmnt->mnt_sb->s_flags & MS_RDONLY) ? " readonly" : "");
+out:
+ sys_umount("/dev", 0);
+ sys_mount(".", "/", NULL, MS_MOVE, NULL);
+ sys_chroot(".");
+ mount_devfs_fs ();
+}
-#ifdef CONFIG_BLK_DEV_INITRD
- root_mountflags = real_root_mountflags;
- if (mount_initrd && ROOT_DEV != real_root_dev
- && MAJOR(ROOT_DEV) == RAMDISK_MAJOR && MINOR(ROOT_DEV) == 0) {
- int error;
- int i, pid;
- mkdir("/old", 0700);
- chdir("/old");
-
- pid = kernel_thread(do_linuxrc, "/linuxrc", SIGCHLD);
- if (pid > 0) {
- while (pid != wait(&i)) {
- current->policy |= SCHED_YIELD;
- schedule();
- }
- }
- if (MAJOR(real_root_dev) != RAMDISK_MAJOR
- || MINOR(real_root_dev) != 0) {
- error = change_root(real_root_dev,"/initrd");
- if (error)
- printk(KERN_ERR "Change root to /initrd: "
- "error %d\n",error);
-
- chdir("/root");
- mount(".", "/", NULL, MS_MOVE, NULL);
- chroot(".");
-
- mount_devfs_fs ();
- return;
- }
- chroot("..");
- chdir("/");
- return;
- }
+#ifdef BUILD_CRAMDISK
+
+/*
+ * gzip declarations
+ */
+
+#define OF(args) args
+
+#ifndef memzero
+#define memzero(s, n) memset ((s), 0, (n))
#endif
- mount(".", "/", NULL, MS_MOVE, NULL);
- chroot(".");
- mount_devfs_fs ();
+typedef unsigned char uch;
+typedef unsigned short ush;
+typedef unsigned long ulg;
+
+#define INBUFSIZ 4096
+#define WSIZE 0x8000 /* window size--must be a power of two, and */
+ /* at least 32K for zip's deflate method */
+
+static uch *inbuf;
+static uch *window;
+
+static unsigned insize; /* valid bytes in inbuf */
+static unsigned inptr; /* index of next byte to be processed in inbuf */
+static unsigned outcnt; /* bytes in output buffer */
+static int exit_code;
+static long bytes_out;
+static int crd_infd, crd_outfd;
+
+#define get_byte() (inptr < insize ? inbuf[inptr++] : fill_inbuf())
+
+/* Diagnostic functions (stubbed out) */
+#define Assert(cond,msg)
+#define Trace(x)
+#define Tracev(x)
+#define Tracevv(x)
+#define Tracec(c,x)
+#define Tracecv(c,x)
+
+#define STATIC static
+
+static int fill_inbuf(void);
+static void flush_window(void);
+static void *malloc(int size);
+static void free(void *where);
+static void error(char *m);
+static void gzip_mark(void **);
+static void gzip_release(void **);
+
+#include "../lib/inflate.c"
+
+static void __init *malloc(int size)
+{
+ return kmalloc(size, GFP_KERNEL);
+}
+
+static void __init free(void *where)
+{
+ kfree(where);
+}
+
+static void __init gzip_mark(void **ptr)
+{
}
+
+static void __init gzip_release(void **ptr)
+{
+}
+
+
+/* ===========================================================================
+ * Fill the input buffer. This is called only when the buffer is empty
+ * and at least one byte is really needed.
+ */
+static int __init fill_inbuf(void)
+{
+ if (exit_code) return -1;
+
+ insize = read(crd_infd, inbuf, INBUFSIZ);
+ if (insize == 0) return -1;
+
+ inptr = 1;
+
+ return inbuf[0];
+}
+
+/* ===========================================================================
+ * Write the output window window[0..outcnt-1] and update crc and bytes_out.
+ * (Used for the decompressed data only.)
+ */
+static void __init flush_window(void)
+{
+ ulg c = crc; /* temporary variable */
+ unsigned n;
+ uch *in, ch;
+
+ write(crd_outfd, window, outcnt);
+ in = window;
+ for (n = 0; n < outcnt; n++) {
+ ch = *in++;
+ c = crc_32_tab[((int)c ^ ch) & 0xff] ^ (c >> 8);
+ }
+ crc = c;
+ bytes_out += (ulg)outcnt;
+ outcnt = 0;
+}
+
+static void __init error(char *x)
+{
+ printk(KERN_ERR "%s", x);
+ exit_code = 1;
+}
+
+static int __init crd_load(int in_fd, int out_fd)
+{
+ int result;
+
+ insize = 0; /* valid bytes in inbuf */
+ inptr = 0; /* index of next byte to be processed in inbuf */
+ outcnt = 0; /* bytes in output buffer */
+ exit_code = 0;
+ bytes_out = 0;
+ crc = (ulg)0xffffffffL; /* shift register contents */
+
+ crd_infd = in_fd;
+ crd_outfd = out_fd;
+ inbuf = kmalloc(INBUFSIZ, GFP_KERNEL);
+ if (inbuf == 0) {
+ printk(KERN_ERR "RAMDISK: Couldn't allocate gzip buffer\n");
+ return -1;
+ }
+ window = kmalloc(WSIZE, GFP_KERNEL);
+ if (window == 0) {
+ printk(KERN_ERR "RAMDISK: Couldn't allocate gzip window\n");
+ kfree(inbuf);
+ return -1;
+ }
+ makecrc();
+ result = gunzip();
+ kfree(inbuf);
+ kfree(window);
+ return result;
+}
+
+#endif /* BUILD_CRAMDISK */
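
The image-type checks in identify_ramdisk_image() above can be tried outside the kernel. What follows is a minimal user-space sketch, not the kernel code: the sketch_* names and the SKETCH_BLOCK_SIZE constants are hypothetical, and the 1024-byte block size is an assumption taken from the "1k blocksize" note in rd_load_image(). It shows only the gzip magic test and the round-up-to-blocks arithmetic used in the romfs branch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to mirror the kernel's BLOCK_SIZE / BLOCK_SIZE_BITS (1 KiB blocks). */
#define SKETCH_BLOCK_SIZE      1024u
#define SKETCH_BLOCK_SIZE_BITS 10u

/*
 * Return 1 if the buffer starts with one of the two gzip magic pairs
 * tested in the patch (037 0213 or 037 0236, octal), else 0.
 */
static int sketch_is_gzip(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    if (len < 2)
        return 0;
    return buf[0] == 037 && (buf[1] == 0213 || buf[1] == 0236);
}

/*
 * Round a byte count up to whole blocks, as the romfs branch does with
 * (size + BLOCK_SIZE - 1) >> BLOCK_SIZE_BITS. The 64-bit intermediate
 * is an addition of this sketch to avoid wrap-around near UINT32_MAX.
 */
static uint32_t sketch_bytes_to_blocks(uint32_t bytes)
{
    uint64_t rounded = (uint64_t)bytes + SKETCH_BLOCK_SIZE - 1;
    return (uint32_t)(rounded >> SKETCH_BLOCK_SIZE_BITS);
}
```

In the patch, a gzip match yields nblocks = 0, which rd_load_image() treats as "compressed image" and hands to crd_load(); any positive value is the number of blocks to copy verbatim.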
diff --git a/mm/memory.c b/mm/memory.c
index cd99761be475..1315130e918f 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1221,8 +1221,10 @@ static int do_no_page(struct mm_struct * mm, struct vm_area_struct * vma,
*/
if (write_access && !(vma->vm_flags & VM_SHARED)) {
struct page * page = alloc_page(GFP_HIGHUSER);
- if (!page)
+ if (!page) {
+ page_cache_release(new_page);
return -1;
+ }
copy_highpage(page, new_page);
page_cache_release(new_page);
lru_cache_add(page);
diff --git a/mm/mempool.c b/mm/mempool.c
index 8116cac13cf4..0c0bf99965ca 100644
--- a/mm/mempool.c
+++ b/mm/mempool.c
@@ -1,9 +1,9 @@
/*
* linux/mm/mempool.c
*
- * memory buffer pool support. Such pools are mostly used to
- * guarantee deadlock-free IO operations even during extreme
- * VM load.
+ * memory buffer pool support. Such pools are mostly used
+ * for guaranteed, deadlock-free memory allocations during
+ * extreme VM load.
*
* started by Ingo Molnar, Copyright (C) 2001
*/
@@ -75,6 +75,71 @@ mempool_t * mempool_create(int min_nr, mempool_alloc_t *alloc_fn,
}
/**
+ * mempool_resize - resize an existing memory pool
+ * @pool: pointer to the memory pool which was allocated via
+ * mempool_create().
+ * @new_min_nr: the new minimum number of elements guaranteed to be
+ * allocated for this pool.
+ * @gfp_mask: the usual allocation bitmask.
+ *
+ * This function shrinks/grows the pool. In the case of growing,
+ * it cannot be guaranteed that the pool will be grown to the new
+ * size immediately, but new mempool_free() calls will refill it.
+ *
+ * Note, the caller must guarantee that no mempool_destroy is called
+ * while this function is running. mempool_alloc() & mempool_free()
+ * might be called (eg. from IRQ contexts) while this function executes.
+ */
+void mempool_resize(mempool_t *pool, int new_min_nr, int gfp_mask)
+{
+ int delta;
+ void *element;
+ unsigned long flags;
+ struct list_head *tmp;
+
+ if (new_min_nr <= 0)
+ BUG();
+
+ spin_lock_irqsave(&pool->lock, flags);
+ if (new_min_nr < pool->min_nr) {
+ pool->min_nr = new_min_nr;
+ /*
+ * Free possible excess elements.
+ */
+ while (pool->curr_nr > pool->min_nr) {
+ tmp = pool->elements.next;
+ if (tmp == &pool->elements)
+ BUG();
+ list_del(tmp);
+ element = tmp;
+ pool->curr_nr--;
+ spin_unlock_irqrestore(&pool->lock, flags);
+
+ pool->free(element, pool->pool_data);
+
+ spin_lock_irqsave(&pool->lock, flags);
+ }
+ spin_unlock_irqrestore(&pool->lock, flags);
+ return;
+ }
+ delta = new_min_nr - pool->min_nr;
+ pool->min_nr = new_min_nr;
+ spin_unlock_irqrestore(&pool->lock, flags);
+
+ /*
+ * We refill the pool up to the new threshold - but we don't
+ * (cannot) guarantee that the refill succeeds.
+ */
+ while (delta) {
+ element = pool->alloc(gfp_mask, pool->pool_data);
+ if (!element)
+ break;
+ mempool_free(element, pool);
+ delta--;
+ }
+}
+
+/**
* mempool_destroy - deallocate a memory pool
* @pool: pointer to the memory pool which was allocated via
* mempool_create().
@@ -110,7 +175,7 @@ void mempool_destroy(mempool_t *pool)
* @gfp_mask: the usual allocation bitmask.
*
* this function only sleeps if the alloc_fn function sleeps or
- * returns NULL. Note that due to preallocation guarantees this function
+ * returns NULL. Note that due to preallocation, this function
* *never* fails.
*/
void * mempool_alloc(mempool_t *pool, int gfp_mask)
@@ -175,7 +240,7 @@ repeat_alloc:
/**
* mempool_free - return an element to the pool.
- * @gfp_mask: pool element pointer.
+ * @element: pool element pointer.
* @pool: pointer to the memory pool which was allocated via
* mempool_create().
*
@@ -200,6 +265,7 @@ void mempool_free(void *element, mempool_t *pool)
}
EXPORT_SYMBOL(mempool_create);
+EXPORT_SYMBOL(mempool_resize);
EXPORT_SYMBOL(mempool_destroy);
EXPORT_SYMBOL(mempool_alloc);
EXPORT_SYMBOL(mempool_free);
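
The shrink and grow paths of mempool_resize() above can be modelled in user space. The following is a rough sketch under stated assumptions, not the kernel implementation: struct toy_pool, toy_free(), toy_resize() and TOY_MAX are hypothetical names, malloc()/free() stand in for the pool's alloc and free callbacks, and all locking is omitted. It also assumes that returning an element refills the reserve while the pool is below its minimum, which is the behaviour the doc comment attributes to mempool_free().

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_MAX 64  /* hypothetical upper bound on the toy reserve */

struct toy_pool {
    int min_nr;            /* guaranteed reserve size */
    int curr_nr;           /* elements currently held in reserve */
    void *slots[TOY_MAX];  /* the reserve itself */
    size_t elem_size;      /* size of each element */
};

/* Return an element: refill the reserve if below min_nr, else release it. */
static void toy_free(struct toy_pool *p, void *elem)
{
    if (p->curr_nr < p->min_nr)
        p->slots[p->curr_nr++] = elem;
    else
        free(elem);
}

/* Shrink or grow the reserve; growing is best effort, as in the patch. */
static void toy_resize(struct toy_pool *p, int new_min_nr)
{
    int delta;

    assert(new_min_nr > 0 && new_min_nr <= TOY_MAX);

    if (new_min_nr < p->min_nr) {
        /* Shrinking: lower the minimum, then release excess elements. */
        p->min_nr = new_min_nr;
        while (p->curr_nr > p->min_nr)
            free(p->slots[--p->curr_nr]);
        return;
    }

    /* Growing: raise the minimum, then try to allocate the difference. */
    delta = new_min_nr - p->min_nr;
    p->min_nr = new_min_nr;
    while (delta) {
        void *elem = malloc(p->elem_size);
        if (!elem)
            break;  /* refill is not guaranteed to complete */
        toy_free(p, elem);
        delta--;
    }
}
```

As in the patch, growing allocates only the difference between the old and new minimum, so a pool whose reserve was already depleted stays below the new minimum until later frees refill it.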