629 files changed, 114926 insertions, 24574 deletions
diff --git a/Documentation/DMA-mapping.txt b/Documentation/DMA-mapping.txt index 6d2961c..1690b0e 100644 --- a/Documentation/DMA-mapping.txt +++ b/Documentation/DMA-mapping.txt @@ -8,7 +8,7 @@ Most of the 64bit platforms have special hardware that translates bus addresses (DMA addresses) into physical addresses. This is similar to how page tables and/or a TLB translates virtual addresses to physical -addresses on a cpu. This is needed so that e.g. PCI devices can +addresses on a CPU. This is needed so that e.g. PCI devices can access with a Single Address Cycle (32bit DMA address) any page in the 64bit physical address space. Previously in Linux those 64bit platforms had to set artificial limits on the maximum RAM size in the @@ -37,7 +37,7 @@ returned from the DMA mapping functions. What memory is DMA'able? The first piece of information you must know is what kernel memory can -be used with the DMA mapping facilitites. There has been an unwritten +be used with the DMA mapping facilities. There has been an unwritten set of rules regarding this, and this text is an attempt to finally write them down. @@ -106,7 +106,7 @@ This means that in the failure case, you have three options: 3) Ignore this device and do not initialize it. It is recommended that your driver print a kernel KERN_WARNING message -when you end up performing either #2 or #2. In this manner, if a user +when you end up performing either #2 or #3. In this manner, if a user of your driver reports that performance is bad or that the device is not even detected, you can ask them for the kernel messages to find out exactly why. @@ -146,7 +146,7 @@ all 64-bits during a DAC cycle: If your 64-bit device is going to be an enormous consumer of DMA mappings, this can be problematic since the DMA mappings are a finite resource on many platforms. 
Please see the "DAC Addressing -for Address Space Hungry Devices" setion near the end of this +for Address Space Hungry Devices" section near the end of this document for how to handle this case. Finally, if your device can only drive the low 24-bits of @@ -205,7 +205,7 @@ There are two types of DMA mappings: - Consistent DMA mappings which are usually mapped at driver initialization, unmapped at the end and for which the hardware should - guarantee that the device and the cpu can access the data + guarantee that the device and the CPU can access the data in parallel and will see updates made by each other without any explicit software flushing. @@ -222,12 +222,12 @@ There are two types of DMA mappings: - Device firmware microcode executed out of main memory. - The invariant these examples all require is that any cpu store + The invariant these examples all require is that any CPU store to memory is immediately visible to the device, and vice versa. Consistent mappings guarantee this. IMPORTANT: Consistent DMA memory does not preclude the usage of - proper memory barriers. The cpu may reorder stores to + proper memory barriers. The CPU may reorder stores to consistent memory just as it may normal memory. Example: if it is important for the device to see the first word of a descriptor updated before the second, you must do @@ -284,7 +284,7 @@ driver needs regions sized smaller than a page, you may prefer using the pci_pool interface, described below. The consistent DMA mapping interfaces, for non-NULL dev, will always -return a DMA address which is SAC (Single Address Cycle) addressible. +return a DMA address which is SAC (Single Address Cycle) addressable. Even if the device indicates (via PCI dma mask) that it may address the upper 32-bits and thus perform DAC cycles, consistent allocation will still only return 32-bit PCI addresses for DMA. This is true @@ -622,7 +622,7 @@ use the normal APIs without any problems. 
Note that for streaming type mappings you must either use these interfaces, or the dynamic mapping interfaces above. You may not mix usage of both for the same device. Such an act is illegal and is -guarenteed to put a banana in your tailpipe. +guaranteed to put a banana in your tailpipe. However, consistent mappings may in fact be used in conjunction with these interfaces. Remember that, as defined, consistent mappings are @@ -637,7 +637,7 @@ This routine behaves identically to pci_set_dma_mask. You may not use the following interfaces if this routine fails. Next, DMA addresses using this API are kept track of using the -dma64_addr_t type. It is guarenteed to be big enough to hold any +dma64_addr_t type. It is guaranteed to be big enough to hold any DAC address the platform layer will give to you from the following routines. If you have consistent mappings as well, you still use plain dma_addr_t to keep track of those. @@ -745,7 +745,7 @@ transform some example code. PCI_DMA_FROMDEVICE); It really should be self-explanatory. We treat the ADDR and LEN -seperately, because it is possible for an implementation to only +separately, because it is possible for an implementation to only need the address in order to perform the unmap operation. Platform Issues diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking index 9d17b7b..9d1a96f 100644 --- a/Documentation/filesystems/Locking +++ b/Documentation/filesystems/Locking @@ -121,10 +121,16 @@ by better scheme anyway. --------------------------- file_system_type --------------------------- prototypes: - struct super_block *(*read_super) (struct super_block *, void *, int); + struct super_block *(*get_sb) (struct file_system_type *, int, char *, void *); + void (*kill_sb) (struct super_block *); locking rules: -may block BKL ->s_lock mount_sem -yes yes yes maybe + may block BKL +get_sb yes yes +kill_sb yes yes + +->get_sb() returns error or a locked superblock (exclusive on ->s_umount). 
+->kill_sb() takes a locked superblock, does all shutdown work on it, +unlocks and drops the reference. --------------------------- address_space_operations -------------------------- prototypes: diff --git a/Documentation/filesystems/porting b/Documentation/filesystems/porting index 0a4efdd..af16ee3 100644 --- a/Documentation/filesystems/porting +++ b/Documentation/filesystems/porting @@ -99,3 +99,20 @@ free to drop it... ->link() callers hold ->i_sem on the object we are linking to. Some of your problems might be over... + +--- +[mandatory] + +new file_system_type method - kill_sb(superblock). If you are converting +an existing filesystem, set it according to ->fs_flags: + FS_REQUIRES_DEV - kill_block_super + FS_LITTER - kill_litter_super + neither - kill_anon_super +FS_LITTER is gone - just remove it from fs_flags. + +--- +[mandatory] + + FS_SINGLE is gone (actually, that had happened back when ->get_sb() +went in - and hadn't been documented ;-/). Just remove it from fs_flags +(and see ->get_sb() entry for other actions). diff --git a/Documentation/ia64/IRQ-redir.txt b/Documentation/ia64/IRQ-redir.txt new file mode 100644 index 0000000..b2096c3 --- a/dev/null +++ b/Documentation/ia64/IRQ-redir.txt @@ -0,0 +1,69 @@ +IRQ affinity on IA64 platforms +------------------------------ + 07.01.2002, Erich Focht <efocht@ess.nec.de> + + +By writing to /proc/irq/IRQ#/smp_affinity the interrupt routing can be +controlled. The behavior on IA64 platforms is slightly different from +that described in Documentation/IRQ-affinity.txt for i386 systems. + +Because of the usage of SAPIC mode and physical destination mode the +IRQ target is one particular CPU and cannot be a mask of several +CPUs. Only the first non-zero bit is taken into account. + + +Usage examples: + +The target CPU has to be specified as a hexadecimal CPU mask. The +first non-zero bit is the selected CPU. This format has been kept for +compatibility reasons with i386. 
+ +Set the delivery mode of interrupt 41 to fixed and route the +interrupts to CPU #3 (logical CPU number) (2^3=0x08): + echo "8" >/proc/irq/41/smp_affinity + +Set the default route for IRQ number 41 to CPU 6 in lowest priority +delivery mode (redirectable): + echo "r 40" >/proc/irq/41/smp_affinity + +The output of the command + cat /proc/irq/IRQ#/smp_affinity +gives the target CPU mask for the specified interrupt vector. If the CPU +mask is preceeded by the character "r", the interrupt is redirectable +(i.e. lowest priority mode routing is used), otherwise its route is +fixed. + + + +Initialization and default behavior: + +If the platform features IRQ redirection (info provided by SAL) all +IO-SAPIC interrupts are initialized with CPU#0 as their default target +and the routing is the so called "lowest priority mode" (actually +fixed SAPIC mode with hint). The XTP chipset registers are used as hints +for the IRQ routing. Currently in Linux XTP registers can have three +values: + - minimal for an idle task, + - normal if any other task runs, + - maximal if the CPU is going to be switched off. +The IRQ is routed to the CPU with lowest XTP register value, the +search begins at the default CPU. Therefore most of the interrupts +will be handled by CPU #0. + +If the platform doesn't feature interrupt redirection IOSAPIC fixed +routing is used. The target CPUs are distributed in a round robin +manner. IRQs will be routed only to the selected target CPUs. Check +with + cat /proc/interrupts + + + +Comments: + +On large (multi-node) systems it is recommended to route the IRQs to +the node to which the corresponding device is connected. +For systems like the NEC AzusA we get IRQ node-affinity for free. This +is because usually the chipsets on each node redirect the interrupts +only to their own CPUs (as they cannot see the XTP registers on the +other nodes). 
+ diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index e02079a..b4ed0da 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -467,6 +467,14 @@ running once the system is up. whatever the firmware may have done. + usepirqmask [IA-32] Honor the possible IRQ mask + stored in the BIOS $PIR table. This is + needed on some systems with broken + BIOSes, notably some HP Pavilion N5400 + and Omnibook XE3 notebooks. This will + have no effect if ACPI IRQ routing is + enabled. + pd. [PARIDE] pf. [PARIDE] diff --git a/Documentation/video4linux/API.html b/Documentation/video4linux/API.html index 7f144ad..4b3d8f6 100644 --- a/Documentation/video4linux/API.html +++ b/Documentation/video4linux/API.html @@ -105,7 +105,7 @@ call <b>VIDIOCGWIN</b> to check if the nearest match was suitable. The <TR><TD><b>height</b><TD>The height of the image capture.</TD> <TR><TD><b>chromakey</b><TD>A host order RGB32 value for the chroma key.</TD> <TR><TD><b>flags</b><TD>Additional capture flags.</TD> -<TR><TD><b>clips</b><TD>A list of clipping rectangles. <em>(Set only)</em)</TD> +<TR><TD><b>clips</b><TD>A list of clipping rectangles. <em>(Set only)</em></TD> <TR><TD><b>clipcount</b><TD>The number of clipping rectangles. <em>(Set only)</em></TD> </TABLE> <P> @@ -120,6 +120,7 @@ fields available to the user. </TABLE> <P> Merely setting the window does not enable capturing. Overlay capturing +(i.e. PCI-PCI transfer to the frame buffer of the video card) is activated by passing the <b>VIDIOCCAPTURE</b> ioctl a value of 1, and disabled by passing it a value of 0. <P> @@ -310,9 +311,10 @@ The following decoding modes are defined </TABLE> <P> <H3>Reading Images</H3> -Each call to the <b>read</b> syscall returns the next available image from -the device. It is up to the caller to set the format and then to pass a -suitable size buffer and length to the function. 
Not all devices will support +Each call to the <b>read</b> syscall returns the next available image +from the device. It is up to the caller to set format and size (using +the VIDIOCSPICT and VIDIOCSWIN ioctls) and then to pass a suitable +size buffer and length to the function. Not all devices will support read operations. <P> A second way to handle image capture is via the mmap interface if supported. @@ -329,16 +331,39 @@ The video_mbuf structure contains the following fields <TR><TD><b>offsets</b><TD>The offset of each frame</TD> </TABLE> <P> -Once the mmap has been made the VIDIOCMCAPTURE ioctl sets the image size -you wish to use (which should match or be below the initial query size). -Having done so it will begin capturing to the memory mapped buffer. Whenever -a buffer is "used" by the program it should called VIDIOCSYNC to free this -frame up and continue. <em>to add:</em>VIDIOCSYNC takes the frame number -you are freeing as its argument. When the buffer is unmapped or all the -buffers are full capture ceases. While capturing to memory the driver will -make a "best effort" attempt to capture to screen as well if requested. This -normally means all frames that "miss" memory mapped capture will go to the -display. +Once the mmap has been made the VIDIOCMCAPTURE ioctl starts the +capture to a frame using the format and image size specified in the +video_mmap (which should match or be below the initial query size). +When the VIDIOCMCAPTURE ioctl returns the frame is <em>not</em> +captured yet, the driver just instructed the hardware to start the +capture. The application has to use the VIDIOCSYNC ioctl to wait +until the capture of a frame is finished. VIDIOCSYNC takes the frame +number you want to wait for as argument. +<p> +It is allowed to call VIDIOCMCAPTURE multiple times (with different +frame numbers in video_mmap->frame of course) and thus have multiple +outstanding capture requests. 
A simple way do to double-buffering +using this feature looks like this: +<pre> +/* setup everything */ +VIDIOCMCAPTURE(0) +while (whatever) { + VIDIOCMCAPTURE(1) + VIDIOCSYNC(0) + /* process frame 0 while the hardware captures frame 1 */ + VIDIOCMCAPTURE(0) + VIDIOCSYNC(1) + /* process frame 1 while the hardware captures frame 0 */ +} +</pre> +Note that you are <em>not</em> limited to only two frames. The API +allows up to 32 frames, the VIDIOCGMBUF ioctl returns the number of +frames the driver granted. Thus it is possible to build deeper queues +to avoid loosing frames on load peaks. +<p> +While capturing to memory the driver will make a "best effort" attempt +to capture to screen as well if requested. This normally means all +frames that "miss" memory mapped capture will go to the display. <P> A final ioctl exists to allow a device to obtain related devices if a driver has multiple components (for example video0 may not be associated diff --git a/MAINTAINERS b/MAINTAINERS index 973c2fde..85f1801 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -709,14 +709,12 @@ W: http://www.developer.ibm.com/welcome/netfinity/serveraid.html S: Supported IDE DRIVER [GENERAL] -P: Andre Hedrick -M: andre@linux-ide.org -M: andre@linuxdiskcert.org +P: Martin Dalecki +M: martin@dalecki.de +I: pl_PL.ISO8859-2, de_DE.ISO8859-15, (en_US.ISO8859-1) L: linux-kernel@vger.kernel.org -W: http://www.kernel.org/pub/linux/kernel/people/hedrick/ -W: http://www.linux-ide.org/ -W: http://www.linuxdiskcert.org/ -S: Maintained +W: http://www.dalecki.de +S: Developement IDE/ATAPI CDROM DRIVER P: Jens Axboe @@ -246,14 +246,14 @@ MRPROPER_DIRS = \ include arch/$(ARCH)/Makefile -export CPPFLAGS CFLAGS AFLAGS +export CPPFLAGS CFLAGS CFLAGS_KERNEL AFLAGS AFLAGS_KERNEL export NETWORKS DRIVERS LIBS HEAD LDFLAGS LINKFLAGS MAKEBOOT ASFLAGS .S.s: - $(CPP) $(AFLAGS) -traditional -o $*.s $< + $(CPP) $(AFLAGS) $(AFLAGS_KERNEL) -traditional -o $*.s $< .S.o: - $(CC) $(AFLAGS) -traditional -c -o $*.o $< + $(CC) 
$(AFLAGS) $(AFLAGS_KERNEL) -traditional -c -o $*.o $< Version: dummy @rm -f include/linux/compile.h diff --git a/arch/alpha/defconfig b/arch/alpha/defconfig index 8309ed9..84cd97f 100644 --- a/arch/alpha/defconfig +++ b/arch/alpha/defconfig @@ -255,7 +255,6 @@ CONFIG_BLK_DEV_IDEDMA=y # CONFIG_IDEDMA_PCI_WIP is not set # CONFIG_BLK_DEV_IDEDMA_TIMEOUT is not set # CONFIG_IDEDMA_NEW_DRIVE_LISTINGS is not set -CONFIG_BLK_DEV_ADMA=y # CONFIG_BLK_DEV_AEC62XX is not set # CONFIG_AEC62XX_TUNING is not set CONFIG_BLK_DEV_ALI15X3=y diff --git a/arch/arm/def-configs/badge4 b/arch/arm/def-configs/badge4 index 1229bda..8a2c5f8 100644 --- a/arch/arm/def-configs/badge4 +++ b/arch/arm/def-configs/badge4 @@ -15,7 +15,14 @@ CONFIG_RWSEM_GENERIC_SPINLOCK=y # Code maturity level options # CONFIG_EXPERIMENTAL=y -# CONFIG_OBSOLETE is not set + +# +# General setup +# +CONFIG_NET=y +# CONFIG_SYSVIPC is not set +# CONFIG_BSD_PROCESS_ACCT is not set +CONFIG_SYSCTL=y # # Loadable module support @@ -27,6 +34,7 @@ CONFIG_KMOD=y # # System Type # +# CONFIG_ARCH_ADIFCC is not set # CONFIG_ARCH_ANAKIN is not set # CONFIG_ARCH_ARCA5K is not set # CONFIG_ARCH_CLPS7500 is not set @@ -36,6 +44,7 @@ CONFIG_KMOD=y # CONFIG_ARCH_CAMELOT is not set # CONFIG_ARCH_FOOTBRIDGE is not set # CONFIG_ARCH_INTEGRATOR is not set +# CONFIG_ARCH_IOP310 is not set # CONFIG_ARCH_L7200 is not set # CONFIG_ARCH_RPC is not set CONFIG_ARCH_SA1100=y @@ -48,15 +57,23 @@ CONFIG_ARCH_SA1100=y # # Archimedes/A5000 Implementations (select only ONE) # +# CONFIG_ARCH_ARC is not set +# CONFIG_ARCH_A5K is not set # # Footbridge Implementations # +# CONFIG_ARCH_CATS is not set +# CONFIG_ARCH_PERSONAL_SERVER is not set +# CONFIG_ARCH_EBSA285_ADDIN is not set +# CONFIG_ARCH_EBSA285_HOST is not set +# CONFIG_ARCH_NETWINDER is not set # # SA11x0 Implementations # # CONFIG_SA1100_ASSABET is not set +# CONFIG_ASSABET_NEPONSET is not set # CONFIG_SA1100_ADSBITSY is not set # CONFIG_SA1100_BRUTUS is not set # CONFIG_SA1100_CERF is not set 
@@ -78,6 +95,7 @@ CONFIG_SA1100_BADGE4=y # CONFIG_SA1100_OMNIMETER is not set # CONFIG_SA1100_PANGOLIN is not set # CONFIG_SA1100_PLEB is not set +# CONFIG_SA1100_PT_SYSTEM3 is not set # CONFIG_SA1100_SHANNON is not set # CONFIG_SA1100_SHERMAN is not set # CONFIG_SA1100_SIMPAD is not set @@ -85,15 +103,23 @@ CONFIG_SA1100_BADGE4=y # CONFIG_SA1100_VICTOR is not set # CONFIG_SA1100_XP860 is not set # CONFIG_SA1100_YOPY is not set +# CONFIG_SA1100_STORK is not set CONFIG_SA1111=y CONFIG_FORCE_MAX_ZONEORDER=9 -CONFIG_SA1100_USB=m -CONFIG_SA1100_USB_NETLINK=m -CONFIG_SA1100_USB_CHAR=m +# CONFIG_SA1100_USB is not set +# CONFIG_SA1100_USB_NETLINK is not set +# CONFIG_SA1100_USB_CHAR is not set +# CONFIG_H3600_SLEEVE is not set # # CLPS711X/EP721X Implementations # +# CONFIG_ARCH_AUTCPU12 is not set +# CONFIG_ARCH_CDB89712 is not set +# CONFIG_ARCH_CLEP7312 is not set +# CONFIG_ARCH_EDB7211 is not set +# CONFIG_ARCH_P720T is not set +# CONFIG_ARCH_FORTUNET is not set # CONFIG_ARCH_EP7211 is not set # CONFIG_ARCH_EP7212 is not set # CONFIG_ARCH_ACORN is not set @@ -117,6 +143,7 @@ CONFIG_CPU_32v4=y # CONFIG_CPU_ARM1020 is not set # CONFIG_CPU_SA110 is not set CONFIG_CPU_SA1100=y +# CONFIG_XSCALE_PMU is not set # CONFIG_ARM_THUMB is not set CONFIG_DISCONTIGMEM=y @@ -126,6 +153,7 @@ CONFIG_DISCONTIGMEM=y # CONFIG_PCI is not set CONFIG_ISA=y # CONFIG_ISA_DMA is not set +# CONFIG_FIQ is not set CONFIG_CPU_FREQ=y CONFIG_HOTPLUG=y @@ -134,13 +162,11 @@ CONFIG_HOTPLUG=y # CONFIG_PCMCIA=y CONFIG_PCMCIA_PROBE=y +# CONFIG_I82092 is not set # CONFIG_I82365 is not set # CONFIG_TCIC is not set +# CONFIG_PCMCIA_CLPS6700 is not set CONFIG_PCMCIA_SA1100=y -CONFIG_NET=y -# CONFIG_SYSVIPC is not set -# CONFIG_BSD_PROCESS_ACCT is not set -CONFIG_SYSCTL=y # # At least one math emulation must be selected @@ -153,6 +179,8 @@ CONFIG_BINFMT_AOUT=m CONFIG_BINFMT_ELF=y CONFIG_BINFMT_MISC=m # CONFIG_PM is not set +# CONFIG_PREEMPT is not set +# CONFIG_APM is not set CONFIG_ARTHUR=m 
CONFIG_CMDLINE="init=/linuxrc root=/dev/mtdblock3" # CONFIG_LEDS is not set @@ -161,14 +189,23 @@ CONFIG_ALIGNMENT_TRAP=y # # Parallel port support # -# CONFIG_PARPORT is not set +CONFIG_PARPORT=m +# CONFIG_PARPORT_PC is not set +# CONFIG_PARPORT_ARC is not set +# CONFIG_PARPORT_AMIGA is not set +# CONFIG_PARPORT_MFC3 is not set +# CONFIG_PARPORT_ATARI is not set +# CONFIG_PARPORT_GSC is not set +# CONFIG_PARPORT_SUNBPP is not set +# CONFIG_PARPORT_OTHER is not set +# CONFIG_PARPORT_1284 is not set # # Memory Technology Devices (MTD) # CONFIG_MTD=y CONFIG_MTD_DEBUG=y -CONFIG_MTD_DEBUG_VERBOSE=1 +CONFIG_MTD_DEBUG_VERBOSE=0 CONFIG_MTD_PARTITIONS=y # CONFIG_MTD_REDBOOT_PARTS is not set # CONFIG_MTD_BOOTLDR_PARTS is not set @@ -205,6 +242,9 @@ CONFIG_MTD_RAM=y # CONFIG_MTD_ROM is not set # CONFIG_MTD_ABSENT is not set # CONFIG_MTD_OBSOLETE_CHIPS is not set +# CONFIG_MTD_AMDSTD is not set +# CONFIG_MTD_SHARP is not set +# CONFIG_MTD_JEDEC is not set # # Mapping drivers for chip access @@ -212,17 +252,21 @@ CONFIG_MTD_RAM=y # CONFIG_MTD_PHYSMAP is not set # CONFIG_MTD_NORA is not set # CONFIG_MTD_ARM_INTEGRATOR is not set +# CONFIG_MTD_CDB89712 is not set CONFIG_MTD_SA1100=y +# CONFIG_MTD_2PARTS_IPAQ is not set +# CONFIG_MTD_DC21285 is not set # CONFIG_MTD_IQ80310 is not set +# CONFIG_MTD_EPXA10DB is not set +# CONFIG_MTD_PCI is not set # # Self-contained MTD device drivers # +# CONFIG_MTD_PMC551 is not set # CONFIG_MTD_SLRAM is not set -CONFIG_MTD_MTDRAM=m -CONFIG_MTDRAM_TOTAL_SIZE=4096 -CONFIG_MTDRAM_ERASE_SIZE=128 -CONFIG_MTD_BLKMTD=m +# CONFIG_MTD_MTDRAM is not set +# CONFIG_MTD_BLKMTD is not set # # Disk-On-Chip Device Drivers @@ -241,22 +285,35 @@ CONFIG_MTD_BLKMTD=m # Plug and Play configuration # # CONFIG_PNP is not set +# CONFIG_ISAPNP is not set +# CONFIG_PNPBIOS is not set # # Block devices # # CONFIG_BLK_DEV_FD is not set # CONFIG_BLK_DEV_XD is not set +# CONFIG_PARIDE is not set +# CONFIG_BLK_CPQ_DA is not set +# CONFIG_BLK_CPQ_CISS_DA is not set +# 
CONFIG_CISS_SCSI_TAPE is not set +# CONFIG_BLK_DEV_DAC960 is not set CONFIG_BLK_DEV_LOOP=y CONFIG_BLK_DEV_NBD=m -CONFIG_BLK_DEV_RAM=y -CONFIG_BLK_DEV_RAM_SIZE=4096 +# CONFIG_BLK_DEV_RAM is not set # CONFIG_BLK_DEV_INITRD is not set # # Multi-device support (RAID and LVM) # # CONFIG_MD is not set +# CONFIG_BLK_DEV_MD is not set +# CONFIG_MD_LINEAR is not set +# CONFIG_MD_RAID0 is not set +# CONFIG_MD_RAID1 is not set +# CONFIG_MD_RAID5 is not set +# CONFIG_MD_MULTIPATH is not set +# CONFIG_BLK_DEV_LVM is not set # # Networking options @@ -325,9 +382,16 @@ CONFIG_NETDEVICES=y # # Ethernet (1000 Mbit) # -# CONFIG_ACENIC_OMIT_TIGON_I is not set +# CONFIG_ACENIC is not set +# CONFIG_DL2K is not set +# CONFIG_MYRI_SBUS is not set +# CONFIG_NS83820 is not set +# CONFIG_HAMACHI is not set +# CONFIG_YELLOWFIN is not set +# CONFIG_SK98LIN is not set # CONFIG_FDDI is not set # CONFIG_HIPPI is not set +# CONFIG_PLIP is not set # CONFIG_PPP is not set # CONFIG_SLIP is not set @@ -336,15 +400,23 @@ CONFIG_NETDEVICES=y # CONFIG_NET_RADIO=y # CONFIG_STRIP is not set -# CONFIG_WAVELAN is not set # CONFIG_ARLAN is not set # CONFIG_AIRONET4500 is not set +# CONFIG_AIRONET4500_NONCS is not set +# CONFIG_AIRONET4500_PROC is not set + +# +# Wireless ISA/PCI cards support +# +# CONFIG_WAVELAN is not set # CONFIG_AIRO is not set CONFIG_HERMES=y # -# Wireless Pcmcia cards support +# Wireless Pcmcia/Cardbus cards support # +CONFIG_PCMCIA_NETWAVE=m +CONFIG_PCMCIA_WAVELAN=m CONFIG_PCMCIA_HERMES=y CONFIG_AIRO_CS=m CONFIG_NET_WIRELESS=y @@ -354,6 +426,7 @@ CONFIG_NET_WIRELESS=y # # CONFIG_TR is not set # CONFIG_NET_FC is not set +# CONFIG_RCPCI is not set # CONFIG_SHAPER is not set # @@ -369,14 +442,15 @@ CONFIG_PCMCIA_3C589=y CONFIG_PCMCIA_3C574=m CONFIG_PCMCIA_FMVJ18X=m CONFIG_PCMCIA_PCNET=y -CONFIG_PCMCIA_AXNET=m CONFIG_PCMCIA_NMCLAN=m CONFIG_PCMCIA_SMC91C92=m CONFIG_PCMCIA_XIRC2PS=m +CONFIG_PCMCIA_AXNET=m +# CONFIG_ARCNET_COM20020_CS is not set +# CONFIG_PCMCIA_IBMTR is not set 
CONFIG_NET_PCMCIA_RADIO=y CONFIG_PCMCIA_RAYCS=m -CONFIG_PCMCIA_NETWAVE=m -CONFIG_PCMCIA_WAVELAN=m +# CONFIG_AIRONET4500_CS is not set # # Amateur Radio support @@ -392,9 +466,16 @@ CONFIG_IRDA=y # IrDA protocols # CONFIG_IRLAN=y +# CONFIG_IRNET is not set CONFIG_IRCOMM=y CONFIG_IRDA_ULTRA=y -# CONFIG_IRDA_OPTIONS is not set + +# +# IrDA options +# +# CONFIG_IRDA_CACHE_LAST_LSAP is not set +# CONFIG_IRDA_FAST_RR is not set +# CONFIG_IRDA_DEBUG is not set # # Infrared-port device drivers @@ -440,23 +521,36 @@ CONFIG_BLK_DEV_IDE=m # CONFIG_BLK_DEV_HD is not set CONFIG_BLK_DEV_IDEDISK=m # CONFIG_IDEDISK_MULTI_MODE is not set +# CONFIG_IDEDISK_STROKE is not set # CONFIG_BLK_DEV_IDEDISK_VENDOR is not set +# CONFIG_BLK_DEV_IDEDISK_FUJITSU is not set +# CONFIG_BLK_DEV_IDEDISK_IBM is not set +# CONFIG_BLK_DEV_IDEDISK_MAXTOR is not set +# CONFIG_BLK_DEV_IDEDISK_QUANTUM is not set +# CONFIG_BLK_DEV_IDEDISK_SEAGATE is not set +# CONFIG_BLK_DEV_IDEDISK_WD is not set # CONFIG_BLK_DEV_COMMERIAL is not set +# CONFIG_BLK_DEV_TIVO is not set # CONFIG_BLK_DEV_IDECS is not set CONFIG_BLK_DEV_IDECD=m -CONFIG_BLK_DEV_IDETAPE=m +# CONFIG_BLK_DEV_IDETAPE is not set CONFIG_BLK_DEV_IDEFLOPPY=m CONFIG_BLK_DEV_IDESCSI=m +# CONFIG_IDE_TASK_IOCTL is not set # # IDE chipset support/bugfixes # # CONFIG_BLK_DEV_CMD640 is not set +# CONFIG_BLK_DEV_CMD640_ENHANCED is not set +# CONFIG_BLK_DEV_ISAPNP is not set # CONFIG_IDE_CHIPSETS is not set # CONFIG_IDEDMA_AUTO is not set # CONFIG_DMA_NONPCI is not set # CONFIG_BLK_DEV_IDE_MODES is not set # CONFIG_BLK_DEV_ATARAID is not set +# CONFIG_BLK_DEV_ATARAID_PDC is not set +# CONFIG_BLK_DEV_ATARAID_HPT is not set # # SCSI support @@ -469,7 +563,7 @@ CONFIG_SCSI=y CONFIG_BLK_DEV_SD=y CONFIG_SD_EXTRA_DEVS=40 CONFIG_CHR_DEV_ST=m -CONFIG_CHR_DEV_OSST=m +# CONFIG_CHR_DEV_OSST is not set CONFIG_BLK_DEV_SR=m # CONFIG_BLK_DEV_SR_VENDOR is not set CONFIG_SR_EXTRA_DEVS=2 @@ -478,7 +572,6 @@ CONFIG_CHR_DEV_SG=y # # Some SCSI devices (e.g. 
CD jukebox) support multiple LUNs # -# CONFIG_SCSI_DEBUG_QUEUES is not set # CONFIG_SCSI_MULTI_LUN is not set # CONFIG_SCSI_CONSTANTS is not set # CONFIG_SCSI_LOGGING is not set @@ -496,8 +589,10 @@ CONFIG_CHR_DEV_SG=y # CONFIG_SCSI_DPT_I2O is not set # CONFIG_SCSI_ADVANSYS is not set # CONFIG_SCSI_IN2000 is not set +# CONFIG_SCSI_AM53C974 is not set # CONFIG_SCSI_MEGARAID is not set # CONFIG_SCSI_BUSLOGIC is not set +# CONFIG_SCSI_DMX3191D is not set # CONFIG_SCSI_DTC3280 is not set # CONFIG_SCSI_EATA is not set # CONFIG_SCSI_EATA_DMA is not set @@ -505,10 +600,12 @@ CONFIG_CHR_DEV_SG=y # CONFIG_SCSI_FUTURE_DOMAIN is not set # CONFIG_SCSI_GDTH is not set # CONFIG_SCSI_GENERIC_NCR5380 is not set +# CONFIG_SCSI_INITIO is not set +# CONFIG_SCSI_INIA100 is not set +# CONFIG_SCSI_PPA is not set +# CONFIG_SCSI_IMM is not set # CONFIG_SCSI_NCR53C406A is not set -# CONFIG_SCSI_NCR53C7xx_sync is not set -# CONFIG_SCSI_NCR53C7xx_FAST is not set -# CONFIG_SCSI_NCR53C7xx_DISCONNECT is not set +# CONFIG_SCSI_NCR53C7xx is not set # CONFIG_SCSI_PAS16 is not set # CONFIG_SCSI_PCI2000 is not set # CONFIG_SCSI_PCI2220I is not set @@ -528,11 +625,11 @@ CONFIG_CHR_DEV_SG=y # # I2O device support # -CONFIG_I2O=m -CONFIG_I2O_BLOCK=m -CONFIG_I2O_LAN=m -CONFIG_I2O_SCSI=m -CONFIG_I2O_PROC=m +# CONFIG_I2O is not set +# CONFIG_I2O_BLOCK is not set +# CONFIG_I2O_LAN is not set +# CONFIG_I2O_SCSI is not set +# CONFIG_I2O_PROC is not set # # ISDN subsystem @@ -540,40 +637,75 @@ CONFIG_I2O_PROC=m # CONFIG_ISDN is not set # -# Input core support +# Input device support # -CONFIG_INPUT=m -CONFIG_INPUT_KEYBDEV=m +# CONFIG_INPUT is not set +# CONFIG_INPUT_KEYBDEV is not set # CONFIG_INPUT_MOUSEDEV is not set # CONFIG_INPUT_JOYDEV is not set # CONFIG_INPUT_EVDEV is not set +# CONFIG_GAMEPORT is not set +CONFIG_SOUND_GAMEPORT=y +# CONFIG_GAMEPORT_NS558 is not set +# CONFIG_GAMEPORT_L4 is not set +# CONFIG_INPUT_EMU10K1 is not set +# CONFIG_GAMEPORT_PCIGAME is not set +# CONFIG_GAMEPORT_FM801 is not 
set +# CONFIG_GAMEPORT_CS461x is not set +# CONFIG_SERIO is not set +# CONFIG_SERIO_SERPORT is not set # # Character devices # # CONFIG_VT is not set # CONFIG_SERIAL is not set +# CONFIG_SERIAL_EXTENDED is not set # CONFIG_SERIAL_NONSTANDARD is not set # # Serial drivers # +# CONFIG_SERIAL_ANAKIN is not set +# CONFIG_SERIAL_ANAKIN_CONSOLE is not set +# CONFIG_SERIAL_AMBA is not set +# CONFIG_SERIAL_AMBA_CONSOLE is not set +# CONFIG_SERIAL_CLPS711X is not set +# CONFIG_SERIAL_CLPS711X_CONSOLE is not set +# CONFIG_SERIAL_21285 is not set +# CONFIG_SERIAL_21285_OLD is not set +# CONFIG_SERIAL_21285_CONSOLE is not set +# CONFIG_SERIAL_UART00 is not set +# CONFIG_SERIAL_UART00_CONSOLE is not set CONFIG_SERIAL_SA1100=y CONFIG_SERIAL_SA1100_CONSOLE=y CONFIG_SA1100_DEFAULT_BAUDRATE=115200 # CONFIG_SERIAL_8250 is not set +# CONFIG_SERIAL_8250_CONSOLE is not set +# CONFIG_ATOMWIDE_SERIAL is not set +# CONFIG_DUALSP_SERIAL is not set +# CONFIG_SERIAL_8250_EXTENDED is not set +# CONFIG_SERIAL_8250_MANY_PORTS is not set +# CONFIG_SERIAL_8250_SHARE_IRQ is not set +# CONFIG_SERIAL_8250_DETECT_IRQ is not set +# CONFIG_SERIAL_8250_MULTIPORT is not set +# CONFIG_SERIAL_8250_RSA is not set CONFIG_SERIAL_CORE=y CONFIG_SERIAL_CORE_CONSOLE=y CONFIG_UNIX98_PTYS=y CONFIG_UNIX98_PTY_COUNT=256 +# CONFIG_PRINTER is not set +# CONFIG_PPDEV is not set # # I2C support # CONFIG_I2C=m CONFIG_I2C_ALGOBIT=m +# CONFIG_I2C_PHILIPSPAR is not set CONFIG_I2C_ELV=m CONFIG_I2C_VELLEMAN=m +# CONFIG_I2C_BIT_SA1100_GPIO is not set CONFIG_I2C_ALGOPCF=m CONFIG_I2C_ELEKTOR=m CONFIG_I2C_CHARDEV=m @@ -582,11 +714,14 @@ CONFIG_I2C_PROC=m # # L3 serial bus support # -CONFIG_L3=m +CONFIG_L3=y +# CONFIG_L3_ALGOBIT is not set +# CONFIG_L3_BIT_SA1100_GPIO is not set # # Other L3 adapters # +CONFIG_L3_SA1111=y # CONFIG_BIT_SA1100_GPIO is not set # @@ -594,17 +729,6 @@ CONFIG_L3=m # # CONFIG_BUSMOUSE is not set # CONFIG_MOUSE is not set - -# -# Joysticks -# -# CONFIG_INPUT_GAMEPORT is not set -# CONFIG_INPUT_SERIO is not 
set - -# -# Joysticks -# -# CONFIG_INPUT_IFORCE_USB is not set # CONFIG_QIC02_TAPE is not set # @@ -618,6 +742,8 @@ CONFIG_SOFT_WATCHDOG=m # CONFIG_PCWATCHDOG is not set # CONFIG_ACQUIRE_WDT is not set # CONFIG_ADVANTECH_WDT is not set +# CONFIG_21285_WATCHDOG is not set +# CONFIG_977_WATCHDOG is not set CONFIG_SA1100_WATCHDOG=m # CONFIG_EUROTECH_WDT is not set # CONFIG_IB700_WDT is not set @@ -626,6 +752,7 @@ CONFIG_SA1100_WATCHDOG=m # CONFIG_60XX_WDT is not set # CONFIG_W83877F_WDT is not set # CONFIG_MACHZ_WDT is not set +# CONFIG_INTEL_RNG is not set # CONFIG_NVRAM is not set CONFIG_RTC=m CONFIG_SA1100_RTC=m @@ -647,7 +774,51 @@ CONFIG_SA1100_RTC=m # # Multimedia devices # -# CONFIG_VIDEO_DEV is not set +CONFIG_VIDEO_DEV=y + +# +# Video For Linux +# +CONFIG_VIDEO_PROC_FS=y +# CONFIG_I2C_PARPORT is not set + +# +# Video Adapters +# +# CONFIG_VIDEO_BT848 is not set +# CONFIG_VIDEO_PMS is not set +# CONFIG_VIDEO_BWQCAM is not set +# CONFIG_VIDEO_CQCAM is not set +# CONFIG_VIDEO_CPIA is not set +# CONFIG_VIDEO_SAA5249 is not set +# CONFIG_TUNER_3036 is not set +# CONFIG_VIDEO_STRADIS is not set +# CONFIG_VIDEO_ZORAN is not set +# CONFIG_VIDEO_ZORAN_BUZ is not set +# CONFIG_VIDEO_ZORAN_DC10 is not set +# CONFIG_VIDEO_ZORAN_LML33 is not set +# CONFIG_VIDEO_ZR36120 is not set +# CONFIG_VIDEO_MEYE is not set +# CONFIG_VIDEO_CYBERPRO is not set + +# +# Radio Adapters +# +# CONFIG_RADIO_CADET is not set +# CONFIG_RADIO_RTRACK is not set +# CONFIG_RADIO_RTRACK2 is not set +# CONFIG_RADIO_AZTECH is not set +# CONFIG_RADIO_GEMTEK is not set +# CONFIG_RADIO_GEMTEK_PCI is not set +# CONFIG_RADIO_MAXIRADIO is not set +# CONFIG_RADIO_MAESTRO is not set +# CONFIG_RADIO_MIROPCM20 is not set +# CONFIG_RADIO_MIROPCM20_RDS is not set +# CONFIG_RADIO_SF16FMI is not set +# CONFIG_RADIO_TERRATEC is not set +# CONFIG_RADIO_TRUST is not set +# CONFIG_RADIO_TYPHOON is not set +# CONFIG_RADIO_ZOLTRIX is not set # # File systems @@ -656,15 +827,18 @@ CONFIG_SA1100_RTC=m # CONFIG_AUTOFS_FS 
is not set # CONFIG_AUTOFS4_FS is not set # CONFIG_REISERFS_FS is not set +# CONFIG_REISERFS_CHECK is not set +# CONFIG_REISERFS_PROC_INFO is not set # CONFIG_ADFS_FS is not set +# CONFIG_ADFS_FS_RW is not set # CONFIG_AFFS_FS is not set # CONFIG_HFS_FS is not set # CONFIG_BFS_FS is not set CONFIG_EXT3_FS=m CONFIG_JBD=m # CONFIG_JBD_DEBUG is not set -CONFIG_FAT_FS=m -CONFIG_MSDOS_FS=m +CONFIG_FAT_FS=y +CONFIG_MSDOS_FS=y # CONFIG_UMSDOS_FS is not set CONFIG_VFAT_FS=m # CONFIG_EFS_FS is not set @@ -672,12 +846,15 @@ CONFIG_VFAT_FS=m CONFIG_JFFS2_FS=y CONFIG_JFFS2_FS_DEBUG=0 CONFIG_CRAMFS=m -# CONFIG_TMPFS is not set -# CONFIG_RAMFS is not set +CONFIG_TMPFS=y +CONFIG_RAMFS=y # CONFIG_ISO9660_FS is not set +# CONFIG_JOLIET is not set +# CONFIG_ZISOFS is not set CONFIG_MINIX_FS=m # CONFIG_VXFS_FS is not set # CONFIG_NTFS_FS is not set +# CONFIG_NTFS_RW is not set # CONFIG_HPFS_FS is not set CONFIG_PROC_FS=y CONFIG_DEVFS_FS=y @@ -685,11 +862,14 @@ CONFIG_DEVFS_MOUNT=y # CONFIG_DEVFS_DEBUG is not set # CONFIG_DEVPTS_FS is not set # CONFIG_QNX4FS_FS is not set +# CONFIG_QNX4FS_RW is not set # CONFIG_ROMFS_FS is not set CONFIG_EXT2_FS=m # CONFIG_SYSV_FS is not set # CONFIG_UDF_FS is not set +# CONFIG_UDF_RW is not set # CONFIG_UFS_FS is not set +# CONFIG_UFS_FS_WRITE is not set # # Network File Systems @@ -698,21 +878,43 @@ CONFIG_EXT2_FS=m # CONFIG_INTERMEZZO_FS is not set CONFIG_NFS_FS=m CONFIG_NFS_V3=y +# CONFIG_ROOT_NFS is not set # CONFIG_NFSD is not set +# CONFIG_NFSD_V3 is not set CONFIG_SUNRPC=m CONFIG_LOCKD=m CONFIG_LOCKD_V4=y CONFIG_SMB_FS=m # CONFIG_SMB_NLS_DEFAULT is not set # CONFIG_NCP_FS is not set +# CONFIG_NCPFS_PACKET_SIGNING is not set +# CONFIG_NCPFS_IOCTL_LOCKING is not set +# CONFIG_NCPFS_STRONG is not set +# CONFIG_NCPFS_NFS_NS is not set +# CONFIG_NCPFS_OS2_NS is not set +# CONFIG_NCPFS_SMALLDOS is not set +# CONFIG_NCPFS_NLS is not set +# CONFIG_NCPFS_EXTRAS is not set # CONFIG_ZISOFS_FS is not set -CONFIG_ZLIB_FS_INFLATE=m # # Partition Types # -# 
CONFIG_PARTITION_ADVANCED is not set +CONFIG_PARTITION_ADVANCED=y +# CONFIG_ACORN_PARTITION is not set +# CONFIG_OSF_PARTITION is not set +# CONFIG_AMIGA_PARTITION is not set +# CONFIG_ATARI_PARTITION is not set +# CONFIG_MAC_PARTITION is not set CONFIG_MSDOS_PARTITION=y +# CONFIG_BSD_DISKLABEL is not set +# CONFIG_MINIX_SUBPARTITION is not set +# CONFIG_SOLARIS_X86_PARTITION is not set +# CONFIG_UNIXWARE_DISKLABEL is not set +# CONFIG_LDM_PARTITION is not set +# CONFIG_SGI_PARTITION is not set +# CONFIG_ULTRIX_PARTITION is not set +# CONFIG_SUN_PARTITION is not set CONFIG_SMB_NLS=y CONFIG_NLS=y @@ -761,26 +963,56 @@ CONFIG_NLS_DEFAULT="iso8859-1" # Sound # CONFIG_SOUND=y + +# +# Open Sound System +# +CONFIG_SOUND_PRIME=y # CONFIG_SOUND_BT878 is not set +# CONFIG_SOUND_CMPCI is not set +# CONFIG_SOUND_EMU10K1 is not set +# CONFIG_MIDI_EMU10K1 is not set # CONFIG_SOUND_FUSION is not set # CONFIG_SOUND_CS4281 is not set +# CONFIG_SOUND_ES1370 is not set +# CONFIG_SOUND_ES1371 is not set # CONFIG_SOUND_ESSSOLO1 is not set # CONFIG_SOUND_MAESTRO is not set +# CONFIG_SOUND_MAESTRO3 is not set +# CONFIG_SOUND_ICH is not set +# CONFIG_SOUND_RME96XX is not set # CONFIG_SOUND_SONICVIBES is not set # CONFIG_SOUND_TRIDENT is not set # CONFIG_SOUND_MSNDCLAS is not set # CONFIG_SOUND_MSNDPIN is not set +# CONFIG_SOUND_VIA82CXXX is not set +# CONFIG_MIDI_VIA82CXXX is not set CONFIG_SOUND_SA1100=y -CONFIG_SOUND_UDA1341=m -CONFIG_SOUND_SA1111_UDA1341=m -CONFIG_SOUND_SA1100SSP=m +CONFIG_SOUND_UDA1341=y +# CONFIG_SOUND_ASSABET_UDA1341 is not set +# CONFIG_SOUND_H3600_UDA1341 is not set +# CONFIG_SOUND_PANGOLIN_UDA1341 is not set +CONFIG_SOUND_SA1111_UDA1341=y +# CONFIG_SOUND_STORK_UDA1341 is not set +# CONFIG_SOUND_SA1100SSP is not set +# CONFIG_SOUND_STORK_AC97 is not set # CONFIG_SOUND_OSS is not set +# CONFIG_SOUND_WAVEARTIST is not set # CONFIG_SOUND_TVMIXER is not set # +# Advanced Linux Sound Architecture +# +# CONFIG_SND is not set + +# # Multimedia Capabilities Port drivers 
# -# CONFIG_MCP is not set +CONFIG_MCP=y +CONFIG_MCP_SA1100=y +# CONFIG_MCP_UCB1200 is not set +# CONFIG_MCP_UCB1200_AUDIO is not set +# CONFIG_MCP_UCB1200_TS is not set # # USB support @@ -796,8 +1028,10 @@ CONFIG_USB_DEVICEFS=y # CONFIG_USB_LONG_TIMEOUT is not set # -# USB Controllers +# USB Host Controller Drivers # +# CONFIG_USB_EHCI_HCD is not set +# CONFIG_USB_OHCI_HCD is not set # CONFIG_USB_UHCI is not set # CONFIG_USB_UHCI_ALT is not set # CONFIG_USB_OHCI is not set @@ -810,24 +1044,23 @@ CONFIG_USB_AUDIO=y CONFIG_USB_BLUETOOTH=m CONFIG_USB_STORAGE=y CONFIG_USB_STORAGE_DEBUG=y -CONFIG_USB_STORAGE_DATAFAB=y -CONFIG_USB_STORAGE_FREECOM=y -CONFIG_USB_STORAGE_ISD200=y -CONFIG_USB_STORAGE_DPCM=y -CONFIG_USB_STORAGE_HP8200e=y -CONFIG_USB_STORAGE_SDDR09=y -CONFIG_USB_STORAGE_JUMPSHOT=y +# CONFIG_USB_STORAGE_DATAFAB is not set +# CONFIG_USB_STORAGE_FREECOM is not set +# CONFIG_USB_STORAGE_ISD200 is not set +# CONFIG_USB_STORAGE_DPCM is not set +# CONFIG_USB_STORAGE_HP8200e is not set +# CONFIG_USB_STORAGE_SDDR09 is not set +# CONFIG_USB_STORAGE_JUMPSHOT is not set CONFIG_USB_ACM=m CONFIG_USB_PRINTER=m # # USB Human Interface Devices (HID) # -CONFIG_USB_HID=m -# CONFIG_USB_HIDDEV is not set -CONFIG_USB_KBD=m -CONFIG_USB_MOUSE=m -CONFIG_USB_WACOM=m + +# +# Input core support is needed for USB HID +# # # USB Imaging devices @@ -841,10 +1074,15 @@ CONFIG_USB_HPUSBSCSI=m # # USB Multimedia devices # - -# -# Video4Linux support is needed for USB Multimedia device support -# +CONFIG_USB_IBMCAM=m +CONFIG_USB_OV511=m +CONFIG_USB_PWC=m +CONFIG_USB_SE401=m +# CONFIG_USB_STV680 is not set +CONFIG_USB_VICAM=m +CONFIG_USB_DSBR=m +CONFIG_USB_DABUSB=m +CONFIG_USB_KONICAWC=m # # USB Network adaptors @@ -858,18 +1096,20 @@ CONFIG_USB_USBNET=m # # USB port drivers # +CONFIG_USB_USS720=m # # USB Serial Converter support # CONFIG_USB_SERIAL=m -# CONFIG_USB_SERIAL_GENERIC is not set +CONFIG_USB_SERIAL_GENERIC=y CONFIG_USB_SERIAL_BELKIN=m CONFIG_USB_SERIAL_WHITEHEAT=m 
CONFIG_USB_SERIAL_DIGI_ACCELEPORT=m CONFIG_USB_SERIAL_EMPEG=m CONFIG_USB_SERIAL_FTDI_SIO=m CONFIG_USB_SERIAL_VISOR=m +# CONFIG_USB_SERIAL_IPAQ is not set CONFIG_USB_SERIAL_IR=m CONFIG_USB_SERIAL_EDGEPORT=m CONFIG_USB_SERIAL_KEYSPAN_PDA=m @@ -883,6 +1123,7 @@ CONFIG_USB_SERIAL_KEYSPAN=m # CONFIG_USB_SERIAL_KEYSPAN_USA19W is not set # CONFIG_USB_SERIAL_KEYSPAN_USA49W is not set CONFIG_USB_SERIAL_MCT_U232=m +# CONFIG_USB_SERIAL_KLSI is not set CONFIG_USB_SERIAL_PL2303=m CONFIG_USB_SERIAL_CYBERJACK=m CONFIG_USB_SERIAL_XIRCOM=m @@ -892,6 +1133,7 @@ CONFIG_USB_SERIAL_OMNINET=m # USB Miscellaneous drivers # CONFIG_USB_RIO500=m +# CONFIG_USB_AUERSWALD is not set # # Bluetooth support @@ -920,4 +1162,12 @@ CONFIG_MAGIC_SYSRQ=y CONFIG_DEBUG_BUGVERBOSE=y CONFIG_DEBUG_ERRORS=y CONFIG_DEBUG_LL=y -CONFIG_DEBUG_LL_SER3=y +# CONFIG_DEBUG_DC21285_PORT is not set +# CONFIG_DEBUG_CLPS711X_UART2 is not set + +# +# Library routines +# +# CONFIG_CRC32 is not set +CONFIG_ZLIB_INFLATE=y +CONFIG_ZLIB_DEFLATE=y diff --git a/arch/arm/def-configs/iq80310 b/arch/arm/def-configs/iq80310 index dfad4f2..d6c12dc 100644 --- a/arch/arm/def-configs/iq80310 +++ b/arch/arm/def-configs/iq80310 @@ -454,7 +454,6 @@ CONFIG_BLK_DEV_IDECD=y CONFIG_BLK_DEV_IDEPCI=y # CONFIG_IDEPCI_SHARE_IRQ is not set CONFIG_BLK_DEV_IDEDMA_PCI=y -CONFIG_BLK_DEV_ADMA=y # CONFIG_BLK_DEV_OFFBOARD is not set CONFIG_IDEDMA_PCI_AUTO=y CONFIG_BLK_DEV_IDEDMA=y diff --git a/arch/arm/def-configs/jornada720 b/arch/arm/def-configs/jornada720 index 3780a2b..cde079e 100644 --- a/arch/arm/def-configs/jornada720 +++ b/arch/arm/def-configs/jornada720 @@ -1,11 +1,15 @@ # -# Automatically generated make config: don't edit +# Automatically generated by make menuconfig: don't edit # CONFIG_ARM=y # CONFIG_EISA is not set # CONFIG_SBUS is not set # CONFIG_MCA is not set CONFIG_UID16=y +CONFIG_RWSEM_GENERIC_SPINLOCK=y +# CONFIG_RWSEM_XCHGADD_ALGORITHM is not set +# CONFIG_GENERIC_BUST_SPINLOCK is not set +# CONFIG_GENERIC_ISA_DMA is not set # # Code 
maturity level options @@ -23,24 +27,23 @@ CONFIG_KMOD=y # # System Type # +# CONFIG_ARCH_ANAKIN is not set # CONFIG_ARCH_ARCA5K is not set # CONFIG_ARCH_CLPS7500 is not set +# CONFIG_ARCH_CLPS711X is not set # CONFIG_ARCH_CO285 is not set # CONFIG_ARCH_EBSA110 is not set -# CONFIG_ARCH_L7200 is not set +# CONFIG_ARCH_CAMELOT is not set # CONFIG_ARCH_FOOTBRIDGE is not set # CONFIG_ARCH_INTEGRATOR is not set +# CONFIG_ARCH_L7200 is not set # CONFIG_ARCH_RPC is not set CONFIG_ARCH_SA1100=y -# CONFIG_ARCH_CLPS711X is not set +# CONFIG_ARCH_SHARK is not set # # Archimedes/A5000 Implementations # - -# -# Archimedes/A5000 Implementations (select only ONE) -# # CONFIG_ARCH_ARC is not set # CONFIG_ARCH_A5K is not set @@ -58,12 +61,18 @@ CONFIG_ARCH_SA1100=y # # CONFIG_SA1100_ASSABET is not set # CONFIG_ASSABET_NEPONSET is not set +# CONFIG_SA1100_ADSBITSY is not set # CONFIG_SA1100_BRUTUS is not set # CONFIG_SA1100_CERF is not set -# CONFIG_SA1100_BITSY is not set +# CONFIG_SA1100_H3100 is not set +# CONFIG_SA1100_H3600 is not set +# CONFIG_SA1100_H3800 is not set +# CONFIG_SA1100_H3XXX is not set # CONFIG_SA1100_EXTENEX1 is not set +# CONFIG_SA1100_FLEXANET is not set # CONFIG_SA1100_FREEBIRD is not set # CONFIG_SA1100_GRAPHICSCLIENT is not set +# CONFIG_SA1100_GRAPHICSMASTER is not set CONFIG_SA1100_JORNADA720=y # CONFIG_SA1100_HUW_WEBPANEL is not set # CONFIG_SA1100_ITSY is not set @@ -72,70 +81,75 @@ CONFIG_SA1100_JORNADA720=y # CONFIG_SA1100_OMNIMETER is not set # CONFIG_SA1100_PANGOLIN is not set # CONFIG_SA1100_PLEB is not set +# CONFIG_SA1100_SHANNON is not set # CONFIG_SA1100_SHERMAN is not set +# CONFIG_SA1100_SIMPAD is not set # CONFIG_SA1100_PFS168 is not set # CONFIG_SA1100_VICTOR is not set # CONFIG_SA1100_XP860 is not set # CONFIG_SA1100_YOPY is not set CONFIG_SA1111=y +CONFIG_FORCE_MAX_ZONEORDER=9 # CONFIG_SA1100_USB is not set # CONFIG_SA1100_USB_NETLINK is not set # CONFIG_SA1100_USB_CHAR is not set -# CONFIG_SA1100_FREQUENCY_SCALE is not set -# 
CONFIG_SA1100_VOLTAGE_SCALE is not set +# CONFIG_REGISTERS is not set # # CLPS711X/EP721X Implementations # +# CONFIG_ARCH_AUTCPU12 is not set +# CONFIG_ARCH_CDB89712 is not set +# CONFIG_ARCH_CLEP7312 is not set +# CONFIG_ARCH_EDB7211 is not set # CONFIG_ARCH_P720T is not set +# CONFIG_ARCH_EP7211 is not set +# CONFIG_ARCH_EP7212 is not set # CONFIG_ARCH_ACORN is not set # CONFIG_FOOTBRIDGE is not set # CONFIG_FOOTBRIDGE_HOST is not set # CONFIG_FOOTBRIDGE_ADDIN is not set CONFIG_CPU_32=y # CONFIG_CPU_26 is not set - -# -# Processor Type -# # CONFIG_CPU_32v3 is not set CONFIG_CPU_32v4=y # CONFIG_CPU_ARM610 is not set # CONFIG_CPU_ARM710 is not set # CONFIG_CPU_ARM720T is not set # CONFIG_CPU_ARM920T is not set +# CONFIG_CPU_ARM922T is not set +# CONFIG_CPU_ARM926T is not set # CONFIG_CPU_ARM1020 is not set # CONFIG_CPU_SA110 is not set CONFIG_CPU_SA1100=y +# CONFIG_ARM_THUMB is not set CONFIG_DISCONTIGMEM=y # # General setup # - -# -# Please ensure that you have read the help on the next option -# -# CONFIG_ANGELBOOT is not set # CONFIG_PCI is not set -# CONFIG_ISA is not set +CONFIG_ISA=y # CONFIG_ISA_DMA is not set +# CONFIG_CPU_FREQ is not set CONFIG_HOTPLUG=y # # PCMCIA/CardBus support # CONFIG_PCMCIA=y +# CONFIG_I82092 is not set # CONFIG_I82365 is not set # CONFIG_TCIC is not set # CONFIG_PCMCIA_CLPS6700 is not set CONFIG_PCMCIA_SA1100=y +# CONFIG_MERCURY_BACKPAQ is not set CONFIG_NET=y CONFIG_SYSVIPC=y # CONFIG_BSD_PROCESS_ACCT is not set CONFIG_SYSCTL=y CONFIG_FPE_NWFPE=y -# CONFIG_FPE_FASTFPE is not set +CONFIG_FPE_FASTFPE=y CONFIG_KCORE_ELF=y # CONFIG_KCORE_AOUT is not set CONFIG_BINFMT_AOUT=m @@ -145,10 +159,8 @@ CONFIG_PM=y # CONFIG_APM is not set # CONFIG_ARTHUR is not set CONFIG_CMDLINE="keepinitrd" -# CONFIG_PFS168_CMDLINE is not set # CONFIG_LEDS is not set -# CONFIG_ALIGNMENT_TRAP is not set -# CONFIG_UCB1200 is not set +CONFIG_ALIGNMENT_TRAP=y # # Parallel port support @@ -159,63 +171,72 @@ CONFIG_CMDLINE="keepinitrd" # Memory Technology Devices 
(MTD) # CONFIG_MTD=y -# CONFIG_MTD_DEBUG is not set - -# -# Disk-On-Chip Device Drivers -# -# CONFIG_MTD_DOC1000 is not set -# CONFIG_MTD_DOC2000 is not set -# CONFIG_MTD_DOC2001 is not set -# CONFIG_MTD_DOCPROBE is not set - -# -# RAM/ROM Device Drivers -# -# CONFIG_MTD_PMC551 is not set -# CONFIG_MTD_SLRAM is not set -# CONFIG_MTD_RAM is not set -# CONFIG_MTD_ROM is not set -# CONFIG_MTD_MTDRAM is not set +CONFIG_MTD_DEBUG=y +CONFIG_MTD_DEBUG_VERBOSE=1 +CONFIG_MTD_PARTITIONS=y +# CONFIG_MTD_REDBOOT_PARTS is not set +CONFIG_MTD_BOOTLDR_PARTS=y +# CONFIG_MTD_AFS_PARTS is not set +CONFIG_MTD_CHAR=m +CONFIG_MTD_BLOCK=y +# CONFIG_FTL is not set +# CONFIG_NFTL is not set # -# Linearly Mapped Flash Device Drivers +# RAM/ROM/Flash chip drivers # CONFIG_MTD_CFI=y -# CONFIG_MTD_CFI_ADV_OPTIONS is not set +# CONFIG_MTD_JEDECPROBE is not set +CONFIG_MTD_GEN_PROBE=y +CONFIG_MTD_CFI_ADV_OPTIONS=y +CONFIG_MTD_CFI_NOSWAP=y +# CONFIG_MTD_CFI_BE_BYTE_SWAP is not set +# CONFIG_MTD_CFI_LE_BYTE_SWAP is not set +CONFIG_MTD_CFI_GEOMETRY=y +# CONFIG_MTD_CFI_B1 is not set +CONFIG_MTD_CFI_B2=y +CONFIG_MTD_CFI_B4=y +CONFIG_MTD_CFI_I1=y +CONFIG_MTD_CFI_I2=y +# CONFIG_MTD_CFI_I4 is not set CONFIG_MTD_CFI_INTELEXT=y # CONFIG_MTD_CFI_AMDSTD is not set +# CONFIG_MTD_RAM is not set +# CONFIG_MTD_ROM is not set +# CONFIG_MTD_ABSENT is not set +# CONFIG_MTD_OBSOLETE_CHIPS is not set # CONFIG_MTD_AMDSTD is not set # CONFIG_MTD_SHARP is not set +# CONFIG_MTD_JEDEC is not set + +# +# Mapping drivers for chip access +# # CONFIG_MTD_PHYSMAP is not set # CONFIG_MTD_NORA is not set -# CONFIG_MTD_PNC2000 is not set -# CONFIG_MTD_RPXLITE is not set -# CONFIG_MTD_SC520CDP is not set -# CONFIG_MTD_SBC_MEDIAGX is not set -# CONFIG_MTD_ELAN_104NC is not set +# CONFIG_MTD_ARM_INTEGRATOR is not set +# CONFIG_MTD_CDB89712 is not set CONFIG_MTD_SA1100=y +# CONFIG_MTD_H3600_BACKPAQ is not set # CONFIG_MTD_DC21285 is not set # CONFIG_MTD_IQ80310 is not set -# CONFIG_MTD_CSTM_CFI_JEDEC is not set -# CONFIG_MTD_JEDEC 
is not set -# CONFIG_MTD_MIXMEM is not set -# CONFIG_MTD_OCTAGON is not set -# CONFIG_MTD_VMAX is not set # -# NAND Flash Device Drivers +# Self-contained MTD device drivers # -# CONFIG_MTD_NAND is not set -# CONFIG_MTD_NAND_SPIA is not set +# CONFIG_MTD_PMC551 is not set +# CONFIG_MTD_SLRAM is not set +# CONFIG_MTD_MTDRAM is not set +# CONFIG_MTD_BLKMTD is not set +# CONFIG_MTD_DOC1000 is not set +# CONFIG_MTD_DOC2000 is not set +# CONFIG_MTD_DOC2001 is not set +# CONFIG_MTD_DOCPROBE is not set # -# User Modules And Translation Layers +# NAND Flash Device Drivers # -CONFIG_MTD_CHAR=y -CONFIG_MTD_BLOCK=y -# CONFIG_FTL is not set -# CONFIG_NFTL is not set +# CONFIG_MTD_NAND is not set # # Plug and Play configuration @@ -246,6 +267,7 @@ CONFIG_BLK_DEV_NBD=m # CONFIG_MD_RAID0 is not set # CONFIG_MD_RAID1 is not set # CONFIG_MD_RAID5 is not set +# CONFIG_MD_MULTIPATH is not set # CONFIG_BLK_DEV_LVM is not set # @@ -258,7 +280,7 @@ CONFIG_RTNETLINK=y # CONFIG_NETLINK_DEV is not set CONFIG_NETFILTER=y # CONFIG_NETFILTER_DEBUG is not set -# CONFIG_FILTER is not set +CONFIG_FILTER=y CONFIG_UNIX=y CONFIG_INET=y CONFIG_IP_MULTICAST=y @@ -282,10 +304,7 @@ CONFIG_IP_MULTICAST=y # CONFIG_IPV6 is not set # CONFIG_KHTTPD is not set # CONFIG_ATM is not set - -# -# -# +# CONFIG_VLAN_8021Q is not set # CONFIG_IPX is not set # CONFIG_ATALK is not set # CONFIG_DECNET is not set @@ -318,7 +337,6 @@ CONFIG_NETDEVICES=y # CONFIG_EQUALIZER is not set # CONFIG_TUN is not set # CONFIG_ETHERTAP is not set -# CONFIG_NET_SB1000 is not set # # Ethernet (10 or 100Mbit) @@ -329,18 +347,44 @@ CONFIG_NETDEVICES=y # Ethernet (1000 Mbit) # # CONFIG_ACENIC is not set +# CONFIG_DL2K is not set +# CONFIG_MYRI_SBUS is not set +# CONFIG_NS83820 is not set # CONFIG_HAMACHI is not set # CONFIG_YELLOWFIN is not set # CONFIG_SK98LIN is not set # CONFIG_FDDI is not set # CONFIG_HIPPI is not set -# CONFIG_PPP is not set +# CONFIG_PLIP is not set +CONFIG_PPP=m +# CONFIG_PPP_MULTILINK is not set +# 
CONFIG_PPP_FILTER is not set +CONFIG_PPP_ASYNC=m +# CONFIG_PPP_SYNC_TTY is not set +CONFIG_PPP_DEFLATE=m +CONFIG_PPP_BSDCOMP=m +# CONFIG_PPPOE is not set # CONFIG_SLIP is not set # # Wireless LAN (non-hamradio) # -# CONFIG_NET_RADIO is not set +CONFIG_NET_RADIO=y +# CONFIG_STRIP is not set +CONFIG_WAVELAN=m +CONFIG_ARLAN=m +CONFIG_AIRONET4500=m +CONFIG_AIRONET4500_NONCS=m +# CONFIG_AIRONET4500_PNP is not set +# CONFIG_AIRONET4500_PCI is not set +# CONFIG_AIRONET4500_ISA is not set +# CONFIG_AIRONET4500_I365 is not set +# CONFIG_AIRONET4500_PROC is not set +# CONFIG_AIRO is not set +CONFIG_HERMES=m +CONFIG_PCMCIA_HERMES=m +CONFIG_AIRO_CS=m +CONFIG_NET_WIRELESS=y # # Token Ring devices @@ -366,11 +410,11 @@ CONFIG_PCMCIA_PCNET=m CONFIG_PCMCIA_NMCLAN=m CONFIG_PCMCIA_SMC91C92=m CONFIG_PCMCIA_XIRC2PS=m +# CONFIG_PCMCIA_AXNET is not set # CONFIG_ARCNET_COM20020_CS is not set # CONFIG_PCMCIA_IBMTR is not set CONFIG_NET_PCMCIA_RADIO=y # CONFIG_PCMCIA_RAYCS is not set -# CONFIG_PCMCIA_HERMES is not set # CONFIG_PCMCIA_NETWAVE is not set CONFIG_PCMCIA_WAVELAN=m CONFIG_AIRONET4500_CS=m @@ -384,10 +428,6 @@ CONFIG_AIRONET4500_CS=m # IrDA (infrared) support # CONFIG_IRDA=m - -# -# IrDA protocols -# CONFIG_IRLAN=m # CONFIG_IRNET is not set CONFIG_IRCOMM=m @@ -397,28 +437,19 @@ CONFIG_IRCOMM=m # # Infrared-port device drivers # - -# -# SIR device drivers -# # CONFIG_IRTTY_SIR is not set # CONFIG_IRPORT_SIR is not set - -# -# FIR device drivers -# +# CONFIG_DONGLE is not set +# CONFIG_USB_IRDA is not set # CONFIG_NSC_FIR is not set # CONFIG_WINBOND_FIR is not set # CONFIG_TOSHIBA_FIR is not set # CONFIG_SMC_IRCC_FIR is not set +# CONFIG_ALI_FIR is not set +# CONFIG_VLSI_FIR is not set CONFIG_SA1100_FIR=m # -# Dongle support -# -# CONFIG_DONGLE is not set - -# # ATA/IDE/MFM/RLL support # CONFIG_IDE=m @@ -427,10 +458,6 @@ CONFIG_IDE=m # IDE, ATA and ATAPI Block devices # CONFIG_BLK_DEV_IDE=m - -# -# Please see Documentation/ide.txt for help/info on IDE drives -# # 
CONFIG_BLK_DEV_HD_IDE is not set # CONFIG_BLK_DEV_HD is not set CONFIG_BLK_DEV_IDEDISK=m @@ -445,14 +472,10 @@ CONFIG_BLK_DEV_IDEDISK=m # CONFIG_BLK_DEV_COMMERIAL is not set # CONFIG_BLK_DEV_TIVO is not set CONFIG_BLK_DEV_IDECS=m -# CONFIG_BLK_DEV_IDECD is not set +CONFIG_BLK_DEV_IDECD=m # CONFIG_BLK_DEV_IDETAPE is not set # CONFIG_BLK_DEV_IDEFLOPPY is not set # CONFIG_BLK_DEV_IDESCSI is not set - -# -# IDE chipset support/bugfixes -# # CONFIG_BLK_DEV_CMD640 is not set # CONFIG_BLK_DEV_CMD640_ENHANCED is not set # CONFIG_BLK_DEV_ISAPNP is not set @@ -460,6 +483,9 @@ CONFIG_BLK_DEV_IDECS=m # CONFIG_IDEDMA_AUTO is not set # CONFIG_DMA_NONPCI is not set # CONFIG_BLK_DEV_IDE_MODES is not set +# CONFIG_BLK_DEV_ATARAID is not set +# CONFIG_BLK_DEV_ATARAID_PDC is not set +# CONFIG_BLK_DEV_ATARAID_HPT is not set # # SCSI support @@ -483,7 +509,13 @@ CONFIG_BLK_DEV_IDECS=m # # Input core support # -# CONFIG_INPUT is not set +CONFIG_INPUT=y +# CONFIG_INPUT_KEYBDEV is not set +CONFIG_INPUT_MOUSEDEV=y +CONFIG_INPUT_MOUSEDEV_SCREEN_X=640 +CONFIG_INPUT_MOUSEDEV_SCREEN_Y=240 +# CONFIG_INPUT_JOYDEV is not set +# CONFIG_INPUT_EVDEV is not set # # Character devices @@ -493,19 +525,37 @@ CONFIG_VT_CONSOLE=y CONFIG_SERIAL=m # CONFIG_SERIAL_EXTENDED is not set # CONFIG_SERIAL_NONSTANDARD is not set + +# +# Serial drivers +# +# CONFIG_SERIAL_ANAKIN is not set +# CONFIG_SERIAL_ANAKIN_CONSOLE is not set +# CONFIG_SERIAL_AMBA is not set +# CONFIG_SERIAL_AMBA_CONSOLE is not set +# CONFIG_SERIAL_CLPS711X is not set +# CONFIG_SERIAL_CLPS711X_CONSOLE is not set +# CONFIG_SERIAL_21285 is not set +# CONFIG_SERIAL_21285_OLD is not set +# CONFIG_SERIAL_21285_CONSOLE is not set +# CONFIG_SERIAL_UART00 is not set +# CONFIG_SERIAL_UART00_CONSOLE is not set CONFIG_SERIAL_SA1100=y CONFIG_SERIAL_SA1100_CONSOLE=y CONFIG_SA1100_DEFAULT_BAUDRATE=115200 -# CONFIG_TOUCHSCREEN_UCB1200 is not set -# CONFIG_TOUCHSCREEN_BITSY is not set -CONFIG_PROFILER=m -# CONFIG_PFS168_SPI is not set -# CONFIG_PFS168_DTMF is 
not set -# CONFIG_PFS168_MISC is not set +# CONFIG_SERIAL_8250 is not set +# CONFIG_SERIAL_8250_CONSOLE is not set +# CONFIG_SERIAL_8250_EXTENDED is not set +# CONFIG_SERIAL_8250_MANY_PORTS is not set +# CONFIG_SERIAL_8250_SHARE_IRQ is not set +# CONFIG_SERIAL_8250_DETECT_IRQ is not set +# CONFIG_SERIAL_8250_MULTIPORT is not set +# CONFIG_SERIAL_8250_HUB6 is not set CONFIG_SERIAL_CORE=y CONFIG_SERIAL_CORE_CONSOLE=y CONFIG_UNIX98_PTYS=y CONFIG_UNIX98_PTY_COUNT=32 +# CONFIG_NEWTONKBD is not set # # I2C support @@ -513,6 +563,15 @@ CONFIG_UNIX98_PTY_COUNT=32 # CONFIG_I2C is not set # +# L3 serial bus support +# +# CONFIG_L3 is not set +# CONFIG_L3_ALGOBIT is not set +# CONFIG_L3_BIT_SA1100_GPIO is not set +# CONFIG_L3_SA1111 is not set +# CONFIG_BIT_SA1100_GPIO is not set + +# # Mice # # CONFIG_BUSMOUSE is not set @@ -524,11 +583,33 @@ CONFIG_MOUSE=m # # Joysticks # -# CONFIG_JOYSTICK is not set - -# -# Input core support is needed for joysticks -# +# CONFIG_INPUT_GAMEPORT is not set +# CONFIG_INPUT_NS558 is not set +# CONFIG_INPUT_LIGHTNING is not set +# CONFIG_INPUT_PCIGAME is not set +# CONFIG_INPUT_CS461X is not set +# CONFIG_INPUT_EMU10K1 is not set +# CONFIG_INPUT_SERIO is not set +# CONFIG_INPUT_SERPORT is not set +# CONFIG_INPUT_ANALOG is not set +# CONFIG_INPUT_A3D is not set +# CONFIG_INPUT_ADI is not set +# CONFIG_INPUT_COBRA is not set +# CONFIG_INPUT_GF2K is not set +# CONFIG_INPUT_GRIP is not set +# CONFIG_INPUT_INTERACT is not set +# CONFIG_INPUT_TMDC is not set +# CONFIG_INPUT_SIDEWINDER is not set +# CONFIG_INPUT_IFORCE_USB is not set +# CONFIG_INPUT_IFORCE_232 is not set +# CONFIG_INPUT_WARRIOR is not set +# CONFIG_INPUT_MAGELLAN is not set +# CONFIG_INPUT_SPACEORB is not set +# CONFIG_INPUT_SPACEBALL is not set +# CONFIG_INPUT_STINGER is not set +# CONFIG_INPUT_DB9 is not set +# CONFIG_INPUT_GAMECON is not set +# CONFIG_INPUT_TURBOGRAFX is not set # CONFIG_QIC02_TAPE is not set # @@ -553,12 +634,13 @@ CONFIG_SA1100_RTC=m # # PCMCIA character devices 
# -CONFIG_PCMCIA_SERIAL_CS=m +# CONFIG_PCMCIA_SERIAL_CS is not set # # Multimedia devices # # CONFIG_VIDEO_DEV is not set +# CONFIG_V4L2_DEV is not set # # File systems @@ -568,37 +650,45 @@ CONFIG_PCMCIA_SERIAL_CS=m # CONFIG_AUTOFS4_FS is not set # CONFIG_REISERFS_FS is not set # CONFIG_REISERFS_CHECK is not set +# CONFIG_REISERFS_PROC_INFO is not set # CONFIG_ADFS_FS is not set # CONFIG_ADFS_FS_RW is not set # CONFIG_AFFS_FS is not set # CONFIG_HFS_FS is not set # CONFIG_BFS_FS is not set -CONFIG_FAT_FS=m -CONFIG_MSDOS_FS=m +# CONFIG_EXT3_FS is not set +# CONFIG_JBD is not set +# CONFIG_JBD_DEBUG is not set +# CONFIG_FAT_FS is not set +# CONFIG_MSDOS_FS is not set # CONFIG_UMSDOS_FS is not set -CONFIG_VFAT_FS=m +# CONFIG_VFAT_FS is not set # CONFIG_EFS_FS is not set # CONFIG_JFFS_FS is not set -# CONFIG_JFFS2_FS is not set -CONFIG_CRAMFS=y +CONFIG_JFFS2_FS=y +CONFIG_JFFS2_FS_DEBUG=2 +# CONFIG_CRAMFS is not set +# CONFIG_TMPFS is not set CONFIG_RAMFS=y -# CONFIG_ISO9660_FS is not set +CONFIG_ISO9660_FS=m # CONFIG_JOLIET is not set +# CONFIG_ZISOFS is not set # CONFIG_MINIX_FS is not set +# CONFIG_VXFS_FS is not set # CONFIG_NTFS_FS is not set # CONFIG_NTFS_RW is not set # CONFIG_HPFS_FS is not set CONFIG_PROC_FS=y -# CONFIG_DEVFS_FS is not set -# CONFIG_DEVFS_MOUNT is not set -# CONFIG_DEVFS_DEBUG is not set +CONFIG_DEVFS_FS=y +CONFIG_DEVFS_MOUNT=y +CONFIG_DEVFS_DEBUG=y +# CONFIG_DRIVERFS_FS is not set CONFIG_DEVPTS_FS=y # CONFIG_QNX4FS_FS is not set # CONFIG_QNX4FS_RW is not set # CONFIG_ROMFS_FS is not set CONFIG_EXT2_FS=y # CONFIG_SYSV_FS is not set -# CONFIG_SYSV_FS_WRITE is not set # CONFIG_UDF_FS is not set # CONFIG_UDF_RW is not set # CONFIG_UFS_FS is not set @@ -608,6 +698,7 @@ CONFIG_EXT2_FS=y # Network File Systems # # CONFIG_CODA_FS is not set +# CONFIG_INTERMEZZO_FS is not set CONFIG_NFS_FS=m CONFIG_NFS_V3=y # CONFIG_ROOT_NFS is not set @@ -616,8 +707,7 @@ CONFIG_NFS_V3=y CONFIG_SUNRPC=m CONFIG_LOCKD=m CONFIG_LOCKD_V4=y -CONFIG_SMB_FS=m -# 
CONFIG_SMB_NLS_DEFAULT is not set +# CONFIG_SMB_FS is not set # CONFIG_NCP_FS is not set # CONFIG_NCPFS_PACKET_SIGNING is not set # CONFIG_NCPFS_IOCTL_LOCKING is not set @@ -627,59 +717,22 @@ CONFIG_SMB_FS=m # CONFIG_NCPFS_SMALLDOS is not set # CONFIG_NCPFS_NLS is not set # CONFIG_NCPFS_EXTRAS is not set +# CONFIG_ZISOFS_FS is not set +# CONFIG_ZLIB_FS_INFLATE is not set # # Partition Types # # CONFIG_PARTITION_ADVANCED is not set CONFIG_MSDOS_PARTITION=y -CONFIG_SMB_NLS=y -CONFIG_NLS=y - -# -# Native Language Support -# -CONFIG_NLS_DEFAULT="iso8859-1" -CONFIG_NLS_CODEPAGE_437=y -# CONFIG_NLS_CODEPAGE_737 is not set -# CONFIG_NLS_CODEPAGE_775 is not set -# CONFIG_NLS_CODEPAGE_850 is not set -# CONFIG_NLS_CODEPAGE_852 is not set -# CONFIG_NLS_CODEPAGE_855 is not set -# CONFIG_NLS_CODEPAGE_857 is not set -# CONFIG_NLS_CODEPAGE_860 is not set -# CONFIG_NLS_CODEPAGE_861 is not set -# CONFIG_NLS_CODEPAGE_862 is not set -# CONFIG_NLS_CODEPAGE_863 is not set -# CONFIG_NLS_CODEPAGE_864 is not set -# CONFIG_NLS_CODEPAGE_865 is not set -# CONFIG_NLS_CODEPAGE_866 is not set -# CONFIG_NLS_CODEPAGE_869 is not set -# CONFIG_NLS_CODEPAGE_874 is not set -# CONFIG_NLS_CODEPAGE_932 is not set -# CONFIG_NLS_CODEPAGE_936 is not set -# CONFIG_NLS_CODEPAGE_949 is not set -# CONFIG_NLS_CODEPAGE_950 is not set -# CONFIG_NLS_ISO8859_1 is not set -# CONFIG_NLS_ISO8859_2 is not set -# CONFIG_NLS_ISO8859_3 is not set -# CONFIG_NLS_ISO8859_4 is not set -# CONFIG_NLS_ISO8859_5 is not set -# CONFIG_NLS_ISO8859_6 is not set -# CONFIG_NLS_ISO8859_7 is not set -# CONFIG_NLS_ISO8859_8 is not set -# CONFIG_NLS_ISO8859_9 is not set -# CONFIG_NLS_ISO8859_14 is not set -# CONFIG_NLS_ISO8859_15 is not set -# CONFIG_NLS_KOI8_R is not set -# CONFIG_NLS_UTF8 is not set +# CONFIG_SMB_NLS is not set +# CONFIG_NLS is not set # # Console drivers # CONFIG_PC_KEYMAP=y # CONFIG_VGA_CONSOLE is not set -CONFIG_FB=y # # Frame-buffer support @@ -687,16 +740,17 @@ CONFIG_FB=y CONFIG_FB=y CONFIG_DUMMY_CONSOLE=y # 
CONFIG_FB_ACORN is not set +# CONFIG_FB_ANAKIN is not set # CONFIG_FB_CLPS711X is not set +# CONFIG_FB_SA1100 is not set CONFIG_FB_EPSON1356=y # CONFIG_FB_CYBER2000 is not set -# CONFIG_FB_SA1100 is not set # CONFIG_FB_VIRTUAL is not set CONFIG_FBCON_ADVANCED=y # CONFIG_FBCON_MFB is not set # CONFIG_FBCON_CFB2 is not set # CONFIG_FBCON_CFB4 is not set -CONFIG_FBCON_CFB8=y +# CONFIG_FBCON_CFB8 is not set CONFIG_FBCON_CFB16=y # CONFIG_FBCON_CFB24 is not set # CONFIG_FBCON_CFB32 is not set @@ -721,10 +775,10 @@ CONFIG_FONT_8x8=y # Sound # CONFIG_SOUND=m -CONFIG_SOUND_UDA1341=m -# CONFIG_SOUND_SA1100_SSP is not set +# CONFIG_SOUND_BT878 is not set # CONFIG_SOUND_CMPCI is not set # CONFIG_SOUND_EMU10K1 is not set +# CONFIG_MIDI_EMU10K1 is not set # CONFIG_SOUND_FUSION is not set # CONFIG_SOUND_CS4281 is not set # CONFIG_SOUND_ES1370 is not set @@ -733,28 +787,121 @@ CONFIG_SOUND_UDA1341=m # CONFIG_SOUND_MAESTRO is not set # CONFIG_SOUND_MAESTRO3 is not set # CONFIG_SOUND_ICH is not set +# CONFIG_SOUND_RME96XX is not set # CONFIG_SOUND_SONICVIBES is not set # CONFIG_SOUND_TRIDENT is not set # CONFIG_SOUND_MSNDCLAS is not set # CONFIG_SOUND_MSNDPIN is not set # CONFIG_SOUND_VIA82CXXX is not set +# CONFIG_MIDI_VIA82CXXX is not set +CONFIG_SOUND_SA1100=m +# CONFIG_SOUND_UDA1341 is not set +# CONFIG_SOUND_ASSABET_UDA1341 is not set +# CONFIG_SOUND_H3600_UDA1341 is not set +# CONFIG_SOUND_PANGOLIN_UDA1341 is not set +# CONFIG_SOUND_SA1111_UDA1341 is not set +# CONFIG_SOUND_SA1100SSP is not set # CONFIG_SOUND_OSS is not set +# CONFIG_SOUND_WAVEARTIST is not set # CONFIG_SOUND_TVMIXER is not set # +# Multimedia Capabilities Port drivers +# +# CONFIG_MCP is not set +# CONFIG_MCP_SA1100 is not set +# CONFIG_MCP_UCB1200 is not set +# CONFIG_MCP_UCB1200_AUDIO is not set +# CONFIG_MCP_UCB1200_TS is not set + +# # USB support # # CONFIG_USB is not set +# CONFIG_USB_UHCI is not set +# CONFIG_USB_UHCI_ALT is not set +# CONFIG_USB_OHCI is not set +# CONFIG_USB_OHCI_SA1111 is not set +# 
CONFIG_USB_AUDIO is not set +# CONFIG_USB_BLUETOOTH is not set +# CONFIG_USB_STORAGE is not set +# CONFIG_USB_STORAGE_DEBUG is not set +# CONFIG_USB_STORAGE_DATAFAB is not set +# CONFIG_USB_STORAGE_FREECOM is not set +# CONFIG_USB_STORAGE_ISD200 is not set +# CONFIG_USB_STORAGE_DPCM is not set +# CONFIG_USB_STORAGE_HP8200e is not set +# CONFIG_USB_STORAGE_SDDR09 is not set +# CONFIG_USB_STORAGE_JUMPSHOT is not set +# CONFIG_USB_ACM is not set +# CONFIG_USB_PRINTER is not set +# CONFIG_USB_HID is not set +# CONFIG_USB_HIDDEV is not set +# CONFIG_USB_KBD is not set +# CONFIG_USB_MOUSE is not set +# CONFIG_USB_WACOM is not set +# CONFIG_USB_DC2XX is not set +# CONFIG_USB_MDC800 is not set +# CONFIG_USB_SCANNER is not set +# CONFIG_USB_MICROTEK is not set +# CONFIG_USB_HPUSBSCSI is not set +# CONFIG_USB_PEGASUS is not set +# CONFIG_USB_KAWETH is not set +# CONFIG_USB_CATC is not set +# CONFIG_USB_CDCETHER is not set +# CONFIG_USB_USBNET is not set +# CONFIG_USB_USS720 is not set + +# +# USB Serial Converter support +# +# CONFIG_USB_SERIAL is not set +# CONFIG_USB_SERIAL_GENERIC is not set +# CONFIG_USB_SERIAL_BELKIN is not set +# CONFIG_USB_SERIAL_WHITEHEAT is not set +# CONFIG_USB_SERIAL_DIGI_ACCELEPORT is not set +# CONFIG_USB_SERIAL_EMPEG is not set +# CONFIG_USB_SERIAL_FTDI_SIO is not set +# CONFIG_USB_SERIAL_VISOR is not set +# CONFIG_USB_SERIAL_IR is not set +# CONFIG_USB_SERIAL_EDGEPORT is not set +# CONFIG_USB_SERIAL_KEYSPAN_PDA is not set +# CONFIG_USB_SERIAL_KEYSPAN is not set +# CONFIG_USB_SERIAL_KEYSPAN_USA28 is not set +# CONFIG_USB_SERIAL_KEYSPAN_USA28X is not set +# CONFIG_USB_SERIAL_KEYSPAN_USA28XA is not set +# CONFIG_USB_SERIAL_KEYSPAN_USA28XB is not set +# CONFIG_USB_SERIAL_KEYSPAN_USA19 is not set +# CONFIG_USB_SERIAL_KEYSPAN_USA18X is not set +# CONFIG_USB_SERIAL_KEYSPAN_USA19W is not set +# CONFIG_USB_SERIAL_KEYSPAN_USA49W is not set +# CONFIG_USB_SERIAL_MCT_U232 is not set +# CONFIG_USB_SERIAL_PL2303 is not set +# CONFIG_USB_SERIAL_CYBERJACK is 
not set +# CONFIG_USB_SERIAL_XIRCOM is not set +# CONFIG_USB_SERIAL_OMNINET is not set +# CONFIG_USB_RIO500 is not set + +# +# Bluetooth support +# +# CONFIG_BLUEZ is not set # # Kernel hacking # # CONFIG_NO_FRAME_POINTER is not set -CONFIG_DEBUG_ERRORS=y # CONFIG_DEBUG_USER is not set # CONFIG_DEBUG_INFO is not set -# CONFIG_MAGIC_SYSRQ is not set # CONFIG_NO_PGT_CACHE is not set +CONFIG_DEBUG_KERNEL=y +CONFIG_DEBUG_SLAB=y +# CONFIG_MAGIC_SYSRQ is not set +# CONFIG_DEBUG_SPINLOCK is not set +# CONFIG_DEBUG_WAITQ is not set +# CONFIG_DEBUG_BUGVERBOSE is not set +CONFIG_DEBUG_ERRORS=y CONFIG_DEBUG_LL=y # CONFIG_DEBUG_DC21285_PORT is not set # CONFIG_DEBUG_CLPS711X_UART2 is not set +# CONFIG_DEBUG_LL_SER3 is not set diff --git a/arch/arm/kernel/dma.c b/arch/arm/kernel/dma.c index 383e939..c6ea827 100644 --- a/arch/arm/kernel/dma.c +++ b/arch/arm/kernel/dma.c @@ -286,3 +286,5 @@ EXPORT_SYMBOL(set_dma_page); EXPORT_SYMBOL(get_dma_residue); EXPORT_SYMBOL(set_dma_sg); EXPORT_SYMBOL(set_dma_speed); + +EXPORT_SYMBOL(dma_spin_lock); diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S index bab87ea..dbec0fa 100644 --- a/arch/arm/kernel/entry-armv.S +++ b/arch/arm/kernel/entry-armv.S @@ -16,6 +16,7 @@ #include <linux/config.h> #include "entry-header.S" #include <asm/thread_info.h> +#include <asm/glue.h> #ifdef IOC_BASE @@ -681,12 +682,12 @@ __dabt_svc: sub sp, sp, #S_FRAME_SIZE /* * This routine must not corrupt r9 */ -#ifdef MULTI_CPU +#ifdef MULTI_ABORT ldr r4, .LCprocfns @ pass r0, r3 to mov lr, pc @ processor code ldr pc, [r4] @ call processor specific code #else - bl cpu_data_abort + bl CPU_ABORT_HANDLER #endif msr cpsr_c, r9 mov r2, sp @@ -799,7 +800,7 @@ __pabt_svc: sub sp, sp, #S_FRAME_SIZE .LCirq: .word __temp_irq .LCund: .word __temp_und .LCabt: .word __temp_abt -#ifdef MULTI_CPU +#ifdef MULTI_ABORT .LCprocfns: .word SYMBOL_NAME(processor) #endif .LCfp: .word SYMBOL_NAME(fp_enter) @@ -823,12 +824,12 @@ __dabt_usr: sub sp, sp, #S_FRAME_SIZE @ 
Allocate frame size in one go
 	alignment_trap r7, r7, __temp_abt
 	zero_fp
 	mov r0, r2 @ remove once everyones in sync
-#ifdef MULTI_CPU
+#ifdef MULTI_ABORT
 	ldr r4, .LCprocfns @ pass r0, r3 to
 	mov lr, pc @ processor code
 	ldr pc, [r4] @ call processor specific code
 #else
-	bl cpu_data_abort
+	bl CPU_ABORT_HANDLER
 #endif
 	set_cpsr_c r2, #MODE_SVC @ Enable interrupts
 	mov r2, sp
diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index b7ace64..41412e4 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -76,6 +76,9 @@ struct processor processor;
 #ifdef MULTI_TLB
 struct cpu_tlb_fns cpu_tlb;
 #endif
+#ifdef MULTI_USER
+struct cpu_user_fns cpu_user;
+#endif
 
 unsigned char aux_device_present;
 char elf_platform[ELF_PLATFORM_SIZE];
@@ -248,6 +251,9 @@ static void __init setup_processor(void)
 #ifdef MULTI_TLB
 	cpu_tlb = *list->tlb;
 #endif
+#ifdef MULTI_USER
+	cpu_user = *list->user;
+#endif
 
 	printk("Processor: %s %s revision %d\n",
 	       proc_info.manufacturer, proc_info.cpu_name,
diff --git a/arch/arm/mach-integrator/pci.c b/arch/arm/mach-integrator/pci.c
index 8dae29c..62d9618 100644
--- a/arch/arm/mach-integrator/pci.c
+++ b/arch/arm/mach-integrator/pci.c
@@ -113,7 +113,6 @@ static int __init integrator_map_irq(struct pci_dev *dev, u8 slot, u8 pin)
 extern void pci_v3_init(void *);
 
 struct hw_pci integrator_pci __initdata = {
-	mem_offset: 0x40000000,
 	swizzle: integrator_swizzle,
 	map_irq: integrator_map_irq,
 	setup: pci_v3_setup,
diff --git a/arch/arm/mach-integrator/pci_v3.c b/arch/arm/mach-integrator/pci_v3.c
index 2b46c7c..75a660b 100644
--- a/arch/arm/mach-integrator/pci_v3.c
+++ b/arch/arm/mach-integrator/pci_v3.c
@@ -435,7 +435,7 @@ static int __init pci_v3_setup_resources(struct resource **resource)
 	resource[1] = &non_mem;
 	resource[2] = &pre_mem;
 
-	return 0;
+	return 1;
 }
 
 /*
@@ -529,8 +529,10 @@ int __init pci_v3_setup(int nr, struct pci_sys_data *sys)
 {
 	int ret = 0;
 
-	if (nr == 0)
+	if (nr == 0) {
+		sys->mem_offset = 0x40000000;
 		ret =
pci_v3_setup_resources(sys->resource);
+	}
 
 	return ret;
 }
@@ -634,7 +636,6 @@ void __init pci_v3_preinit(void)
 void __init pci_v3_postinit(void)
 {
 	unsigned int pci_cmd;
-	int ret;
 
 	pci_cmd = PCI_COMMAND_MEMORY |
 		  PCI_COMMAND_MASTER | PCI_COMMAND_INVALIDATE;
diff --git a/arch/arm/mach-iop310/iq80310-time.c b/arch/arm/mach-iop310/iq80310-time.c
index 1b0fe17..df01f64 100644
--- a/arch/arm/mach-iop310/iq80310-time.c
+++ b/arch/arm/mach-iop310/iq80310-time.c
@@ -47,23 +47,44 @@ static u_long iq80310_read_timer (void)
 	u_long b0, b1, b2, b3, val;
 
 	b0 = *la0; b1 = *la1; b2 = *la2; b3 = *la3;
-	b0 = (((b0 & 0x20) >> 1) | (b0 & 0x1f));
-	b1 = (((b1 & 0x20) >> 1) | (b1 & 0x1f));
-	b2 = (((b2 & 0x20) >> 1) | (b2 & 0x1f));
+	b0 = (((b0 & 0x40) >> 1) | (b0 & 0x1f));
+	b1 = (((b1 & 0x40) >> 1) | (b1 & 0x1f));
+	b2 = (((b2 & 0x40) >> 1) | (b2 & 0x1f));
 	b3 = (b3 & 0x0f);
 	val = ((b0 << 0) | (b1 << 6) | (b2 << 12) | (b3 << 18));
 	return val;
 }
 
-/* IRQs are disabled before entering here from do_gettimeofday() */
+/*
+ * IRQs are disabled before entering here from do_gettimeofday().
+ * Note that the counter may wrap. When it does, 'elapsed' will
+ * be small, but we will have a pending interrupt.
+ */
 static unsigned long iq80310_gettimeoffset (void)
 {
-	unsigned long elapsed, usec;
+	unsigned long elapsed, usec, tmp1;
+	unsigned int stat1, stat2;
 
-	/* We need elapsed timer ticks since last interrupt */
+	stat1 = *(volatile u8 *)IQ80310_INT_STAT;
 	elapsed = iq80310_read_timer();
+	stat2 = *(volatile u8 *)IQ80310_INT_STAT;
+
+	/*
+	 * If an interrupt was pending before we read the timer,
+	 * we've already wrapped. Factor this into the time.
+	 * If an interrupt was pending after we read the timer,
+	 * it may have wrapped between checking the interrupt
+	 * status and reading the timer. Re-read the timer to
+	 * be sure its value is after the wrap.
+ */ + if (stat1 & 1) + elapsed += LATCH; + else if (stat2 & 1) + elapsed = LATCH + iq80310_read_timer(); - /* Now convert them to usec */ + /* + * Now convert them to usec. + */ usec = (unsigned long)(elapsed*tick)/LATCH; return usec; @@ -92,9 +113,7 @@ static void iq80310_timer_interrupt(int irq, void *dev_id, struct pt_regs *regs) * * -DS */ - irq_exit(smp_processor_id(), irq); do_timer(regs); - irq_enter(smp_processor_id(), irq); } extern unsigned long (*gettimeoffset)(void); @@ -116,4 +135,3 @@ void __init time_init(void) *timer_en |= 2; *timer_en |= 1; } - diff --git a/arch/arm/mach-sa1100/Makefile b/arch/arm/mach-sa1100/Makefile index 975cbe4..66e7864 100644 --- a/arch/arm/mach-sa1100/Makefile +++ b/arch/arm/mach-sa1100/Makefile @@ -14,69 +14,105 @@ obj-y := generic.o irq.o dma.o obj-m := obj-n := obj- := +led-y := leds.o -export-objs := assabet.o dma.o flexanet.o freebird.o generic.o h3600.o \ - huw_webpanel.o irq.o pcipool.o sa1111.o sa1111-pcibuf.o \ - yopy.o usb_ctl.o usb_recv.o usb_send.o +export-objs := dma.o generic.o irq.o pcipool.o sa1111.o sa1111-pcibuf.o \ + usb_ctl.o usb_recv.o usb_send.o pm.o # This needs to be cleaned up. We probably need to have SA1100 # and SA1110 config symbols. # # We link the CPU support next, so that RAM timings can be tuned. ifeq ($(CONFIG_CPU_FREQ),y) -obj-$(CONFIG_SA1100_ASSABET) += cpu-sa1110.o -obj-$(CONFIG_SA1100_CERF) += cpu-sa1110.o -obj-$(CONFIG_SA1100_PT_SYSTEM3) += cpu-sa1110.o -obj-$(CONFIG_SA1100_LART) += cpu-sa1100.o +obj-$(CONFIG_SA1100_ASSABET) += cpu-sa1110.o +obj-$(CONFIG_SA1100_CERF) += cpu-sa1110.o +obj-$(CONFIG_SA1100_LART) += cpu-sa1100.o +obj-$(CONFIG_SA1100_PT_SYSTEM3) += cpu-sa1110.o endif # Next, the SA1111 stuff. 
-obj-$(CONFIG_SA1111) += sa1111.o -obj-$(CONFIG_USB_OHCI_SA1111) += sa1111-pcibuf.o pcipool.o +obj-$(CONFIG_SA1111) += sa1111.o +obj-$(CONFIG_USB_OHCI_SA1111) += sa1111-pcibuf.o pcipool.o # Specific board support -obj-$(CONFIG_SA1100_ADSBITSY) += adsbitsy.o -obj-$(CONFIG_SA1100_ASSABET) += assabet.o -obj-$(CONFIG_ASSABET_NEPONSET) += neponset.o -obj-$(CONFIG_SA1100_BRUTUS) += brutus.o -obj-$(CONFIG_SA1100_CERF) += cerf.o -obj-$(CONFIG_SA1100_EMPEG) += empeg.o -obj-$(CONFIG_SA1100_FLEXANET) += flexanet.o -obj-$(CONFIG_SA1100_FREEBIRD) += freebird.o -obj-$(CONFIG_SA1100_GRAPHICSCLIENT) += graphicsclient.o -obj-$(CONFIG_SA1100_GRAPHICSMASTER) += graphicsmaster.o -obj-$(CONFIG_SA1100_H3600) += h3600.o -obj-$(CONFIG_SA1100_HUW_WEBPANEL) += huw_webpanel.o -obj-$(CONFIG_SA1100_ITSY) += itsy.o -obj-$(CONFIG_SA1100_JORNADA720) += jornada720.o -obj-$(CONFIG_SA1100_LART) += lart.o -obj-$(CONFIG_SA1100_NANOENGINE) += nanoengine.o -obj-$(CONFIG_SA1100_OMNIMETER) += omnimeter.o -obj-$(CONFIG_SA1100_PANGOLIN) += pangolin.o -obj-$(CONFIG_SA1100_PFS168) += pfs168.o -obj-$(CONFIG_SA1100_PLEB) += pleb.o -obj-$(CONFIG_SA1100_SHANNON) += shannon.o -obj-$(CONFIG_SA1100_SHERMAN) += sherman.o -obj-$(CONFIG_SA1100_PT_SYSTEM3) += system3.o -obj-$(CONFIG_SA1100_SIMPAD) += simpad.o -obj-$(CONFIG_SA1100_VICTOR) += victor.o -obj-$(CONFIG_SA1100_XP860) += xp860.o -obj-$(CONFIG_SA1100_YOPY) += yopy.o +obj-$(CONFIG_SA1100_ADSBITSY) += adsbitsy.o +led-$(CONFIG_SA1100_ADSBITSY) += leds-adsbitsy.o + +obj-$(CONFIG_SA1100_ASSABET) += assabet.o +export-objs += assabet.o +led-$(CONFIG_SA1100_ASSABET) += leds-assabet.o +obj-$(CONFIG_ASSABET_NEPONSET) += neponset.o + +obj-$(CONFIG_SA1100_BADGE4) += badge4.o +export-objs += badge4.o + +obj-$(CONFIG_SA1100_BRUTUS) += brutus.o +led-$(CONFIG_SA1100_BRUTUS) += leds-brutus.o + +obj-$(CONFIG_SA1100_CERF) += cerf.o +led-$(CONFIG_SA1100_CERF) += leds-cerf.o + +obj-$(CONFIG_SA1100_EMPEG) += empeg.o + +obj-$(CONFIG_SA1100_FLEXANET) += flexanet.o +export-objs += 
flexanet.o +led-$(CONFIG_SA1100_FLEXANET) += leds-flexanet.o + +obj-$(CONFIG_SA1100_FREEBIRD) += freebird.o +export-objs += freebird.o + +obj-$(CONFIG_SA1100_GRAPHICSCLIENT) += graphicsclient.o +led-$(CONFIG_SA1100_GRAPHICSCLIENT) += leds-graphicsclient.o + +obj-$(CONFIG_SA1100_GRAPHICSMASTER) += graphicsmaster.o +led-$(CONFIG_SA1100_GRAPHICSMASTER) += leds-graphicsmaster.o + +obj-$(CONFIG_SA1100_H3600) += h3600.o +export-objs += h3600.o + +obj-$(CONFIG_SA1100_HUW_WEBPANEL) += huw_webpanel.o +export-objs += huw_webpanel.o + +obj-$(CONFIG_SA1100_ITSY) += itsy.o + +obj-$(CONFIG_SA1100_JORNADA720) += jornada720.o + +obj-$(CONFIG_SA1100_LART) += lart.o +led-$(CONFIG_SA1100_LART) += leds-lart.o + +obj-$(CONFIG_SA1100_NANOENGINE) += nanoengine.o + +obj-$(CONFIG_SA1100_OMNIMETER) += omnimeter.o + +obj-$(CONFIG_SA1100_PANGOLIN) += pangolin.o + +obj-$(CONFIG_SA1100_PFS168) += pfs168.o +led-$(CONFIG_SA1100_PFS168) += leds-pfs168.o + +obj-$(CONFIG_SA1100_PLEB) += pleb.o + +obj-$(CONFIG_SA1100_PT_SYSTEM3) += system3.o +led-$(CONFIG_SA1100_PT_SYSTEM3) += leds-system3.o + +obj-$(CONFIG_SA1100_SHANNON) += shannon.o + +obj-$(CONFIG_SA1100_SHERMAN) += sherman.o + +obj-$(CONFIG_SA1100_SIMPAD) += simpad.o +led-$(CONFIG_SA1100_SIMPAD) += leds-simpad.o + +obj-$(CONFIG_SA1100_STORK) += stork.o +export-objs += stork.o + +obj-$(CONFIG_SA1100_VICTOR) += victor.o + +obj-$(CONFIG_SA1100_XP860) += xp860.o + +obj-$(CONFIG_SA1100_YOPY) += yopy.o +export-objs += yopy.o # LEDs support -leds-y := leds.o -leds-$(CONFIG_SA1100_ADSBITSY) += leds-adsbitsy.o -leds-$(CONFIG_SA1100_ASSABET) += leds-assabet.o -leds-$(CONFIG_SA1100_BRUTUS) += leds-brutus.o -leds-$(CONFIG_SA1100_CERF) += leds-cerf.o -leds-$(CONFIG_SA1100_FLEXANET) += leds-flexanet.o -leds-$(CONFIG_SA1100_GRAPHICSCLIENT) += leds-graphicsclient.o -leds-$(CONFIG_SA1100_GRAPHICSMASTER) += leds-graphicsmaster.o -leds-$(CONFIG_SA1100_LART) += leds-lart.o -leds-$(CONFIG_SA1100_PFS168) += leds-pfs168.o -leds-$(CONFIG_SA1100_SIMPAD) += leds-simpad.o 
-leds-$(CONFIG_SA1100_PT_SYSTEM3) += leds-system3.o -obj-$(CONFIG_LEDS) += $(leds-y) +obj-$(CONFIG_LEDS) += $(led-y) # SA1110 USB client support list-multi += sa1100usb_core.o diff --git a/arch/arm/mach-sa1100/cpu-sa1110.c b/arch/arm/mach-sa1100/cpu-sa1110.c index ed8015502..5fcd1e9 100644 --- a/arch/arm/mach-sa1100/cpu-sa1110.c +++ b/arch/arm/mach-sa1100/cpu-sa1110.c @@ -3,7 +3,7 @@ * * Copyright (C) 2001 Russell King * - * $Id: cpu-sa1110.c,v 1.6 2001/10/22 11:53:47 rmk Exp $ + * $Id: cpu-sa1110.c,v 1.8 2002/01/09 17:13:27 rmk Exp $ * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License version 2 as @@ -69,13 +69,23 @@ static struct sdram_params tc59sm716_cl3_params __initdata = { }; static struct sdram_params samsung_k4s641632d_tc75 __initdata = { - rows: 14, - tck: 9, - trcd: 27, - trp: 20, - twr: 9, - refresh: 64000, - cas_latency: 3, + rows: 14, + tck: 9, + trcd: 27, + trp: 20, + twr: 9, + refresh: 64000, + cas_latency: 3, +}; + +static struct sdram_params samsung_km416s4030ct __initdata = { + rows: 13, + tck: 8, + trcd: 24, /* 3 CLKs */ + trp: 24, /* 3 CLKs */ + twr: 16, /* Trdl: 2 CLKs */ + refresh: 64000, + cas_latency: 3, }; static struct sdram_params sdram_params; @@ -273,6 +283,8 @@ static int __init sa1110_clk_init(void) if (machine_is_pt_system3()) sdram = &samsung_k4s641632d_tc75; + if (machine_is_h3100()) + sdram = &samsung_km416s4030ct; if (sdram) { printk(KERN_DEBUG "SDRAM: tck: %d trcd: %d trp: %d" diff --git a/arch/arm/mach-sa1100/pm.c b/arch/arm/mach-sa1100/pm.c index 649e73f..3c6a969 100644 --- a/arch/arm/mach-sa1100/pm.c +++ b/arch/arm/mach-sa1100/pm.c @@ -20,6 +20,7 @@ * in the platform specific files. 
*/ #include <linux/config.h> +#include <linux/module.h> #include <linux/init.h> #include <linux/pm.h> #include <linux/slab.h> @@ -27,6 +28,7 @@ #include <linux/interrupt.h> #include <linux/sysctl.h> #include <linux/errno.h> +#include <linux/cpufreq.h> #include <asm/hardware.h> #include <asm/memory.h> @@ -210,3 +212,5 @@ static int __init pm_init(void) __initcall(pm_init); #endif + +EXPORT_SYMBOL(pm_do_suspend); diff --git a/arch/arm/mach-sa1100/system3.c b/arch/arm/mach-sa1100/system3.c index 34d08d1..e8b8876 100644 --- a/arch/arm/mach-sa1100/system3.c +++ b/arch/arm/mach-sa1100/system3.c @@ -99,10 +99,9 @@ extern void convert_to_tag_list(struct param_struct *params, int mem_init); */ static struct map_desc system3_io_desc[] __initdata = { - /* virtual physical length domain r w c b */ - { 0xe8000000, 0x00000000, 0x01000000, DOMAIN_IO, 0, 1, 0, 0 }, /* Flash bank 0 */ - { 0xf3000000, PT_CPLD_BASE, 0x00100000, DOMAIN_IO, 0, 1, 0, 0 }, /* System Registers */ - { 0xf4000000, PT_SA1111_BASE, 0x00100000, DOMAIN_IO, 0, 1, 0, 0 }, /* SA-1111 */ + /* virtual physical length domain r w c b */ + { 0xf3000000, PT_CPLD_BASE, 0x00100000, DOMAIN_IO, 0, 1, 0, 0 }, /* System Registers */ + { 0xf4000000, PT_SA1111_BASE, 0x00100000, DOMAIN_IO, 0, 1, 0, 0 }, /* SA-1111 */ LAST_DESC }; diff --git a/arch/arm/mm/abort-ev4.S b/arch/arm/mm/abort-ev4.S index 8b3e667..4eb0823 100644 --- a/arch/arm/mm/abort-ev4.S +++ b/arch/arm/mm/abort-ev4.S @@ -1,7 +1,7 @@ #include <linux/linkage.h> #include <asm/assembler.h> /* - * Function: armv4_early_abort + * Function: v4_early_abort * * Params : r2 = address of aborted instruction * : r3 = saved SPSR @@ -18,7 +18,7 @@ * picture. Unfortunately, this does happen. We live with it. 
*/ .align 5 -ENTRY(armv4_early_abort) +ENTRY(v4_early_abort) mrc p15, 0, r1, c5, c0, 0 @ get FSR mrc p15, 0, r0, c6, c0, 0 @ get FAR ldr r3, [r2] @ read aborted ARM instruction diff --git a/arch/arm/mm/abort-ev4t.S b/arch/arm/mm/abort-ev4t.S index dd72ad8..31a38f1 100644 --- a/arch/arm/mm/abort-ev4t.S +++ b/arch/arm/mm/abort-ev4t.S @@ -1,7 +1,7 @@ #include <linux/linkage.h> #include <asm/assembler.h> /* - * Function: armv4t_early_abort + * Function: v4t_early_abort * * Params : r2 = address of aborted instruction * : r3 = saved SPSR @@ -18,7 +18,7 @@ * picture. Unfortunately, this does happen. We live with it. */ .align 5 -ENTRY(armv4t_early_abort) +ENTRY(v4t_early_abort) mrc p15, 0, r1, c5, c0, 0 @ get FSR mrc p15, 0, r0, c6, c0, 0 @ get FAR tst r3, #PSR_T_BIT diff --git a/arch/arm/mm/abort-ev5ej.S b/arch/arm/mm/abort-ev5ej.S index 618a6cb..0f7cd37 100644 --- a/arch/arm/mm/abort-ev5ej.S +++ b/arch/arm/mm/abort-ev5ej.S @@ -1,7 +1,7 @@ #include <linux/linkage.h> #include <asm/assembler.h> /* - * Function: armv5ej_early_abort + * Function: v5ej_early_abort * * Params : r2 = address of aborted instruction * : r3 = saved SPSR @@ -18,7 +18,7 @@ * picture. Unfortunately, this does happen. We live with it. */ .align 5 -ENTRY(armv5ej_early_abort) +ENTRY(v5ej_early_abort) mrc p15, 0, r1, c5, c0, 0 @ get FSR mrc p15, 0, r0, c6, c0, 0 @ get FAR tst r3, #PSR_J_BIT diff --git a/arch/arm/mm/abort-lv4t.S b/arch/arm/mm/abort-lv4t.S index 6cbe9e3..6b768df 100644 --- a/arch/arm/mm/abort-lv4t.S +++ b/arch/arm/mm/abort-lv4t.S @@ -1,7 +1,7 @@ #include <linux/linkage.h> #include <asm/assembler.h> /* - * Function: armv4t_late_abort + * Function: v4t_late_abort * * Params : r2 = address of aborted instruction * : r3 = saved SPSR @@ -17,7 +17,7 @@ * abort here if the I-TLB and D-TLB aren't seeing the same * picture. Unfortunately, this does happen. We live with it. 
*/ -ENTRY(armv4t_late_abort) +ENTRY(v4t_late_abort) tst r3, #PSR_T_BIT @ check for thumb mode mrc p15, 0, r1, c5, c0, 0 @ get FSR mrc p15, 0, r0, c6, c0, 0 @ get FAR diff --git a/arch/arm/mm/copypage-v3.S b/arch/arm/mm/copypage-v3.S index 1666c3f..96e154b 100644 --- a/arch/arm/mm/copypage-v3.S +++ b/arch/arm/mm/copypage-v3.S @@ -20,7 +20,7 @@ * * FIXME: do we need to handle cache stuff... */ -ENTRY(armv3_copy_user_page) +ENTRY(v3_copy_user_page) stmfd sp!, {r4, lr} @ 2 mov r2, #PAGE_SZ/64 @ 1 ldmia r1!, {r3, r4, ip, lr} @ 4+1 @@ -42,7 +42,7 @@ ENTRY(armv3_copy_user_page) * * FIXME: do we need to handle cache stuff... */ -ENTRY(armv3_clear_user_page) +ENTRY(v3_clear_user_page) str lr, [sp, #-4]! mov r1, #PAGE_SZ/64 @ 1 mov r2, #0 @ 1 @@ -57,3 +57,8 @@ ENTRY(armv3_clear_user_page) bne 1b @ 1 ldr pc, [sp], #4 + .section ".text.init", #alloc, #execinstr + +ENTRY(v3_user_fns) + .long v3_clear_user_page + .long v3_copy_user_page diff --git a/arch/arm/mm/copypage-v4.S b/arch/arm/mm/copypage-v4.S index d4cdbdd..e1f1526 100644 --- a/arch/arm/mm/copypage-v4.S +++ b/arch/arm/mm/copypage-v4.S @@ -26,7 +26,7 @@ * instruction. If your processor does not supply this, you have to write your * own copy_user_page that does the right thing. */ -ENTRY(armv4_copy_user_page) +ENTRY(v4_copy_user_page) stmfd sp!, {r4, lr} @ 2 mov r2, #PAGE_SZ/64 @ 1 ldmia r1!, {r3, r4, ip, lr} @ 4 @@ -51,7 +51,7 @@ ENTRY(armv4_copy_user_page) * * Same story as above. */ -ENTRY(armv4_clear_user_page) +ENTRY(v4_clear_user_page) str lr, [sp, #-4]! 
mov r1, #PAGE_SZ/64 @ 1 mov r2, #0 @ 1 @@ -68,3 +68,10 @@ ENTRY(armv4_clear_user_page) bne 1b @ 1 mcr p15, 0, r1, c7, c10, 4 @ 1 drain WB ldr pc, [sp], #4 + + .section ".text.init", #alloc, #execinstr + +ENTRY(v4_user_fns) + .long v4_clear_user_page + .long v4_copy_user_page + diff --git a/arch/arm/mm/copypage-v4mc.S b/arch/arm/mm/copypage-v4mc.S index 8d8d022..2e2d7fb 100644 --- a/arch/arm/mm/copypage-v4mc.S +++ b/arch/arm/mm/copypage-v4mc.S @@ -26,7 +26,7 @@ * instruction. If your processor does not supply this, you have to write your * own copy_user_page that does the right thing. */ -ENTRY(armv4_mc_copy_user_page) +ENTRY(v4_mc_copy_user_page) stmfd sp!, {r4, lr} @ 2 mov r4, r0 mov r0, r1 @@ -53,7 +53,7 @@ ENTRY(armv4_mc_copy_user_page) * * Same story as above. */ -ENTRY(armv4_mc_clear_user_page) +ENTRY(v4_mc_clear_user_page) str lr, [sp, #-4]! mov r1, #PAGE_SZ/64 @ 1 mov r2, #0 @ 1 @@ -69,3 +69,10 @@ ENTRY(armv4_mc_clear_user_page) subs r1, r1, #1 @ 1 bne 1b @ 1 ldr pc, [sp], #4 + + .section ".text.init", #alloc, #execinstr + +ENTRY(v4_mc_user_fns) + .long v4_mc_clear_user_page + .long v4_mc_copy_user_page + diff --git a/arch/arm/mm/copypage-v5te.S b/arch/arm/mm/copypage-v5te.S index 0cd3fff..1685047 100644 --- a/arch/arm/mm/copypage-v5te.S +++ b/arch/arm/mm/copypage-v5te.S @@ -32,7 +32,7 @@ * page. We rely on the mini-cache being smaller than one page, so we'll * cycle through the complete cache anyway. */ -ENTRY(armv5te_copy_user_page) +ENTRY(v5te_mc_copy_user_page) stmfd sp!, {r4, r5, lr} mov r5, r0 mov r0, r1 @@ -62,7 +62,7 @@ ENTRY(armv5te_copy_user_page) * r0 = destination * r1 = virtual user address of ultimate destination page */ -ENTRY(armv5te_clear_user_page) +ENTRY(v5te_mc_clear_user_page) str lr, [sp, #-4]! 
mov r1, #PAGE_SZ/32 mov r2, #0 @@ -77,3 +77,9 @@ ENTRY(armv5te_clear_user_page) subs r1, r1, #1 bne 1b ldr pc, [sp], #4 + + .section ".text.init", #alloc, #execinstr + +ENTRY(v5te_mc_user_fns) + .long v5te_mc_clear_user_page + .long v5te_mc_copy_user_page diff --git a/arch/arm/mm/fault-armv.c b/arch/arm/mm/fault-armv.c index 22428e7..c6efcd2c 100644 --- a/arch/arm/mm/fault-armv.c +++ b/arch/arm/mm/fault-armv.c @@ -181,7 +181,7 @@ bad_pmd: static void make_coherent(struct vm_area_struct *vma, unsigned long addr, struct page *page) { - struct vm_area_struct *mpnt; + struct list_head *l; struct mm_struct *mm = vma->vm_mm; unsigned long pgoff = (addr - vma->vm_start) >> PAGE_SHIFT; int aliases = 0; @@ -191,10 +191,12 @@ make_coherent(struct vm_area_struct *vma, unsigned long addr, struct page *page) * space, then we need to handle them specially to maintain * cache coherency. */ - for (mpnt = page->mapping->i_mmap_shared; mpnt; - mpnt = mpnt->vm_next_share) { + list_for_each(l, &page->mapping->i_mmap_shared) { + struct vm_area_struct *mpnt; unsigned long off; + mpnt = list_entry(l, struct vm_area_struct, shared); + /* * If this VMA is not in our MM, we can ignore it. 
* Note that we intentionally don't mask out the VMA diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c index c9b4ba1..7195077 100644 --- a/arch/arm/mm/init.c +++ b/arch/arm/mm/init.c @@ -1,7 +1,7 @@ /* * linux/arch/arm/mm/init.c * - * Copyright (C) 1995-2000 Russell King + * Copyright (C) 1995-2002 Russell King * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License version 2 as @@ -46,7 +46,7 @@ #define TABLE_OFFSET 0 #endif -#define TABLE_SIZE ((TABLE_OFFSET + PTRS_PER_PTE) * sizeof(void *)) +#define TABLE_SIZE ((TABLE_OFFSET + PTRS_PER_PTE) * sizeof(pte_t)) static unsigned long totalram_pages; extern pgd_t swapper_pg_dir[PTRS_PER_PGD]; @@ -319,7 +319,7 @@ static __init void reserve_node_zero(unsigned int bootmap_pfn, unsigned int boot * and can only be in node 0. */ reserve_bootmem_node(pgdat, __pa(swapper_pg_dir), - PTRS_PER_PGD * sizeof(void *)); + PTRS_PER_PGD * sizeof(pgd_t)); #endif /* * And don't forget to reserve the allocator bitmap, diff --git a/arch/arm/mm/mm-armv.c b/arch/arm/mm/mm-armv.c index d8dfc4e..e17c7cf 100644 --- a/arch/arm/mm/mm-armv.c +++ b/arch/arm/mm/mm-armv.c @@ -1,7 +1,7 @@ /* * linux/arch/arm/mm/mm-armv.c * - * Copyright (C) 1998-2000 Russell King + * Copyright (C) 1998-2002 Russell King * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License version 2 as @@ -82,9 +82,6 @@ pgd_t *get_pgd_slow(struct mm_struct *mm) init_pgd = pgd_offset_k(0); if (vectors_base() == 0) { - init_pmd = pmd_offset(init_pgd, 0); - init_pte = pte_offset(init_pmd, 0); - /* * This lock is here just to satisfy pmd_alloc and pte_lock */ @@ -172,11 +169,14 @@ free: static inline void alloc_init_section(unsigned long virt, unsigned long phys, int prot) { - pmd_t pmd; + pmd_t *pmdp, pmd; - pmd_val(pmd) = phys | prot; + pmdp = pmd_offset(pgd_offset_k(virt), virt); + if (virt & (1 << PMD_SHIFT)) + pmdp++; - 
set_pmd(pmd_offset(pgd_offset_k(virt), virt), pmd); + pmd_val(pmd) = phys | prot; + set_pmd(pmdp, pmd); } /* @@ -189,18 +189,19 @@ alloc_init_section(unsigned long virt, unsigned long phys, int prot) static inline void alloc_init_page(unsigned long virt, unsigned long phys, int domain, int prot) { - pmd_t *pmdp; + pmd_t *pmdp, pmd; pte_t *ptep; pmdp = pmd_offset(pgd_offset_k(virt), virt); if (pmd_none(*pmdp)) { - pte_t *ptep = alloc_bootmem_low_pages(2 * PTRS_PER_PTE * - sizeof(pte_t)); + ptep = alloc_bootmem_low_pages(2 * PTRS_PER_PTE * + sizeof(pte_t)); - ptep += PTRS_PER_PTE; - - set_pmd(pmdp, __mk_pmd(ptep, PMD_TYPE_TABLE | PMD_DOMAIN(domain))); + pmd_val(pmd) = __pa(ptep) | PMD_TYPE_TABLE | PMD_DOMAIN(domain); + set_pmd(pmdp, pmd); + pmd_val(pmd) += 256 * sizeof(pte_t); + set_pmd(pmdp + 1, pmd); } ptep = pte_offset_kernel(pmdp, virt); @@ -266,11 +267,11 @@ static void __init create_mapping(struct map_desc *md) length -= PAGE_SIZE; } - while (length >= PGDIR_SIZE) { + while (length >= (PGDIR_SIZE / 2)) { alloc_init_section(virt, virt + off, prot_sect); - virt += PGDIR_SIZE; - length -= PGDIR_SIZE; + virt += (PGDIR_SIZE / 2); + length -= (PGDIR_SIZE / 2); } while (length >= PAGE_SIZE) { @@ -463,41 +464,3 @@ void __init create_memmap_holes(struct meminfo *mi) for (node = 0; node < numnodes; node++) free_unused_memmap_node(node, mi); } - -/* - * PTE table allocation cache. - * - * This is a move away from our custom 2K page allocator. We now use the - * slab cache to keep track of these objects. - * - * With this, it is questionable as to whether the PGT cache gains us - * anything. We may be better off dropping the PTE stuff from our PGT - * cache implementation. - */ -kmem_cache_t *pte_cache; - -/* - * The constructor gets called for each object within the cache when the - * cache page is created. Note that if slab tries to misalign the blocks, - * we BUG() loudly. 
- */ -static void pte_cache_ctor(void *pte, kmem_cache_t *cache, unsigned long flags) -{ - unsigned long block = (unsigned long)pte; - - if (block & 2047) - BUG(); - - memzero(pte, 2 * PTRS_PER_PTE * sizeof(pte_t)); - cpu_cache_clean_invalidate_range(block, block + - PTRS_PER_PTE * sizeof(pte_t), 0); -} - -void __init pgtable_cache_init(void) -{ - pte_cache = kmem_cache_create("pte-cache", - 2 * PTRS_PER_PTE * sizeof(pte_t), 0, 0, - pte_cache_ctor, NULL); - if (!pte_cache) - BUG(); -} diff --git a/arch/arm/mm/proc-arm1020.S b/arch/arm/mm/proc-arm1020.S index dd1d04b..0c6dad0 100644 --- a/arch/arm/mm/proc-arm1020.S +++ b/arch/arm/mm/proc-arm1020.S @@ -499,7 +499,9 @@ ENTRY(cpu_arm1020_set_pmd) */ .align 5 ENTRY(cpu_arm1020_set_pte) - str r1, [r0], #-1024 @ linux version + tst r0, #2048 + streq r0, [r0, -r0] @ BUG_ON + str r1, [r0], #-2048 @ linux version eor r1, r1, #LPTE_PRESENT | LPTE_YOUNG | LPTE_WRITE | LPTE_DIRTY @@ -608,7 +610,7 @@ __arm1020_setup: */ .type arm1020_processor_functions, #object arm1020_processor_functions: - .word armv4t_early_abort + .word v4t_early_abort .word cpu_arm1020_check_bugs .word cpu_arm1020_proc_init .word cpu_arm1020_proc_fin @@ -635,10 +637,6 @@ arm1020_processor_functions: .word cpu_arm1020_set_pmd .word cpu_arm1020_set_pte - /* misc */ - .word armv4_clear_user_page - .word armv4_copy_user_page - .size arm1020_processor_functions, . - arm1020_processor_functions .type cpu_arm1020_info, #object @@ -672,4 +670,5 @@ __arm1020_proc_info: .long cpu_arm1020_info .long arm1020_processor_functions .long v4wbi_tlb_fns + .long v4_user_fns .size __arm1020_proc_info, . 
- __arm1020_proc_info diff --git a/arch/arm/mm/proc-arm2,3.S b/arch/arm/mm/proc-arm2,3.S index 732cd6b..efa1f62 100644 --- a/arch/arm/mm/proc-arm2,3.S +++ b/arch/arm/mm/proc-arm2,3.S @@ -342,6 +342,7 @@ arm3_elf_name: .asciz "v2" .long cpu_arm2_info .long SYMBOL_NAME(arm2_processor_functions) .long 0 + .long 0 .long 0x41560250 .long 0xfffffff0 @@ -353,6 +354,7 @@ arm3_elf_name: .asciz "v2" .long cpu_arm250_info .long SYMBOL_NAME(arm250_processor_functions) .long 0 + .long 0 .long 0x41560300 .long 0xfffffff0 @@ -364,3 +366,5 @@ arm3_elf_name: .asciz "v2" .long cpu_arm3_info .long SYMBOL_NAME(arm3_processor_functions) .long 0 + .long 0 + diff --git a/arch/arm/mm/proc-arm6,7.S b/arch/arm/mm/proc-arm6,7.S index 56a3bbd..45b08b2 100644 --- a/arch/arm/mm/proc-arm6,7.S +++ b/arch/arm/mm/proc-arm6,7.S @@ -274,7 +274,9 @@ ENTRY(cpu_arm7_set_pmd) .align 5 ENTRY(cpu_arm6_set_pte) ENTRY(cpu_arm7_set_pte) - str r1, [r0], #-1024 @ linux version + tst r0, #2048 + streq r0, [r0, -r0] @ BUG_ON + str r1, [r0], #-2048 @ linux version eor r1, r1, #LPTE_PRESENT | LPTE_YOUNG | LPTE_WRITE | LPTE_DIRTY @@ -373,10 +375,6 @@ ENTRY(arm6_processor_functions) .word cpu_arm6_set_pmd .word cpu_arm6_set_pte - /* other */ - .word armv3_clear_user_page - .word armv3_copy_user_page - .size arm6_processor_functions, . - arm6_processor_functions /* @@ -412,10 +410,6 @@ ENTRY(arm7_processor_functions) .word cpu_arm7_set_pmd .word cpu_arm7_set_pte - /* other */ - .word armv3_clear_user_page - .word armv3_copy_user_page - .size arm7_processor_functions, . - arm7_processor_functions .type cpu_arm6_info, #object @@ -465,6 +459,7 @@ __arm6_proc_info: .long cpu_arm6_info .long arm6_processor_functions .long v3_tlb_fns + .long v3_user_fns .size __arm6_proc_info, . - __arm6_proc_info .type __arm610_proc_info, #object @@ -479,6 +474,7 @@ __arm610_proc_info: .long cpu_arm610_info .long arm6_processor_functions .long v3_tlb_fns + .long v3_user_fns .size __arm610_proc_info, . 
- __arm610_proc_info .type __arm7_proc_info, #object @@ -493,6 +489,7 @@ __arm7_proc_info: .long cpu_arm7_info .long arm7_processor_functions .long v3_tlb_fns + .long v3_user_fns .size __arm7_proc_info, . - __arm7_proc_info .type __arm710_proc_info, #object @@ -507,4 +504,5 @@ __arm710_proc_info: .long cpu_arm710_info .long arm7_processor_functions .long v3_tlb_fns + .long v3_user_fns .size __arm710_proc_info, . - __arm710_proc_info diff --git a/arch/arm/mm/proc-arm720.S b/arch/arm/mm/proc-arm720.S index 7efb11d..ab079ed 100644 --- a/arch/arm/mm/proc-arm720.S +++ b/arch/arm/mm/proc-arm720.S @@ -136,7 +136,9 @@ ENTRY(cpu_arm720_set_pmd) */ .align 5 ENTRY(cpu_arm720_set_pte) - str r1, [r0], #-1024 @ linux version + tst r0, #2048 + streq r0, [r0, -r0] @ BUG_ON + str r1, [r0], #-2048 @ linux version eor r1, r1, #LPTE_PRESENT | LPTE_YOUNG | LPTE_WRITE | LPTE_DIRTY @@ -199,7 +201,7 @@ __arm720_setup: mov r0, #0 */ .type arm720_processor_functions, #object ENTRY(arm720_processor_functions) - .word armv4t_late_abort + .word v4t_late_abort .word cpu_arm720_check_bugs .word cpu_arm720_proc_init .word cpu_arm720_proc_fin @@ -226,10 +228,6 @@ ENTRY(arm720_processor_functions) .word cpu_arm720_set_pmd .word cpu_arm720_set_pte - /* misc */ - .word armv4_clear_user_page - .word armv4_copy_user_page - .size arm720_processor_functions, . - arm720_processor_functions .type cpu_arm720_info, #object @@ -265,4 +263,5 @@ __arm720_proc_info: .long cpu_arm720_info @ info .long arm720_processor_functions .long v4_tlb_fns + .long v4_user_fns .size __arm720_proc_info, . 
- __arm720_proc_info diff --git a/arch/arm/mm/proc-arm920.S b/arch/arm/mm/proc-arm920.S index a91dc90..bd8f81f 100644 --- a/arch/arm/mm/proc-arm920.S +++ b/arch/arm/mm/proc-arm920.S @@ -420,7 +420,9 @@ ENTRY(cpu_arm920_set_pmd) */ .align 5 ENTRY(cpu_arm920_set_pte) - str r1, [r0], #-1024 @ linux version + tst r0, #2048 + streq r0, [r0, -r0] @ BUG_ON + str r1, [r0], #-2048 @ linux version eor r1, r1, #LPTE_PRESENT | LPTE_YOUNG | LPTE_WRITE | LPTE_DIRTY @@ -511,7 +513,7 @@ __arm920_setup: */ .type arm920_processor_functions, #object arm920_processor_functions: - .word armv4t_early_abort + .word v4t_early_abort .word cpu_arm920_check_bugs .word cpu_arm920_proc_init .word cpu_arm920_proc_fin @@ -538,10 +540,6 @@ arm920_processor_functions: .word cpu_arm920_set_pmd .word cpu_arm920_set_pte - /* misc */ - .word armv4_clear_user_page - .word armv4_copy_user_page - .size arm920_processor_functions, . - arm920_processor_functions .type cpu_arm920_info, #object @@ -575,4 +573,5 @@ __arm920_proc_info: .long cpu_arm920_info .long arm920_processor_functions .long v4wbi_tlb_fns + .long v4_user_fns .size __arm920_proc_info, . 
- __arm920_proc_info diff --git a/arch/arm/mm/proc-arm922.S b/arch/arm/mm/proc-arm922.S index 5a92d6a..a1f40d9 100644 --- a/arch/arm/mm/proc-arm922.S +++ b/arch/arm/mm/proc-arm922.S @@ -421,7 +421,9 @@ ENTRY(cpu_arm922_set_pmd) */ .align 5 ENTRY(cpu_arm922_set_pte) - str r1, [r0], #-1024 @ linux version + tst r0, #2048 + streq r0, [r0, -r0] @ BUG_ON + str r1, [r0], #-2048 @ linux version eor r1, r1, #LPTE_PRESENT | LPTE_YOUNG | LPTE_WRITE | LPTE_DIRTY @@ -512,7 +514,7 @@ __arm922_setup: */ .type arm922_processor_functions, #object arm922_processor_functions: - .word armv4t_early_abort + .word v4t_early_abort .word cpu_arm922_check_bugs .word cpu_arm922_proc_init .word cpu_arm922_proc_fin @@ -539,10 +541,6 @@ arm922_processor_functions: .word cpu_arm922_set_pmd .word cpu_arm922_set_pte - /* misc */ - .word armv4_clear_user_page - .word armv4_copy_user_page - .size arm922_processor_functions, . - arm922_processor_functions .type cpu_arm922_info, #object @@ -576,4 +574,5 @@ __arm922_proc_info: .long cpu_arm922_info .long arm922_processor_functions .long v4wbi_tlb_fns + .long v4_user_fns .size __arm922_proc_info, . 
- __arm922_proc_info diff --git a/arch/arm/mm/proc-arm926.S b/arch/arm/mm/proc-arm926.S index 96c9c53..430c962 100644 --- a/arch/arm/mm/proc-arm926.S +++ b/arch/arm/mm/proc-arm926.S @@ -443,7 +443,9 @@ ENTRY(cpu_arm926_set_pmd) */ .align 5 ENTRY(cpu_arm926_set_pte) - str r1, [r0], #-1024 @ linux version + tst r0, #2048 + streq r0, [r0, -r0] @ BUG_ON + str r1, [r0], #-2048 @ linux version eor r1, r1, #LPTE_PRESENT | LPTE_YOUNG | LPTE_WRITE | LPTE_DIRTY @@ -549,7 +551,7 @@ __arm926_setup: */ .type arm926_processor_functions, #object arm926_processor_functions: - .word armv5ej_early_abort + .word v5ej_early_abort .word cpu_arm926_check_bugs .word cpu_arm926_proc_init .word cpu_arm926_proc_fin @@ -576,10 +578,6 @@ arm926_processor_functions: .word cpu_arm926_set_pmd .word cpu_arm926_set_pte - /* misc */ - .word armv4_clear_user_page - .word armv4_copy_user_page - .size arm926_processor_functions, . - arm926_processor_functions .type cpu_arm926_info, #object @@ -613,4 +611,5 @@ __arm926_proc_info: .long cpu_arm926_info .long arm926_processor_functions .long v4wbi_tlb_fns + .long v4_user_fns .size __arm926_proc_info, . 
- __arm926_proc_info diff --git a/arch/arm/mm/proc-sa110.S b/arch/arm/mm/proc-sa110.S index 06e6834..1b29c7c 100644 --- a/arch/arm/mm/proc-sa110.S +++ b/arch/arm/mm/proc-sa110.S @@ -1,7 +1,7 @@ /* * linux/arch/arm/mm/proc-sa110.S * - * Copyright (C) 1997-2000 Russell King + * Copyright (C) 1997-2002 Russell King * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License version 2 as @@ -468,7 +468,9 @@ ENTRY(cpu_sa1100_set_pmd) .align 5 ENTRY(cpu_sa110_set_pte) ENTRY(cpu_sa1100_set_pte) - str r1, [r0], #-1024 @ linux version + tst r0, #2048 + streq r0, [r0, -r0] @ BUG_ON + str r1, [r0], #-2048 @ linux version eor r1, r1, #L_PTE_PRESENT | L_PTE_YOUNG | L_PTE_WRITE | L_PTE_DIRTY @@ -538,7 +540,7 @@ __setup_common: .type sa110_processor_functions, #object ENTRY(sa110_processor_functions) - .word armv4_early_abort + .word v4_early_abort .word cpu_sa110_check_bugs .word cpu_sa110_proc_init .word cpu_sa110_proc_fin @@ -565,10 +567,6 @@ ENTRY(sa110_processor_functions) .word cpu_sa110_set_pmd .word cpu_sa110_set_pte - /* misc */ - .word armv4_clear_user_page - .word armv4_copy_user_page - .size sa110_processor_functions, . - sa110_processor_functions .type cpu_sa110_info, #object @@ -610,10 +608,6 @@ ENTRY(sa1100_processor_functions) .word cpu_sa1100_set_pmd .word cpu_sa1100_set_pte - /* misc */ - .word armv4_mc_clear_user_page - .word armv4_mc_copy_user_page - .size sa1100_processor_functions, . - sa1100_processor_functions cpu_sa1100_info: @@ -651,6 +645,7 @@ __sa110_proc_info: .long cpu_sa110_info .long sa110_processor_functions .long v4wb_tlb_fns + .long v4_user_fns .size __sa110_proc_info, . - __sa110_proc_info .type __sa1100_proc_info,#object @@ -665,6 +660,7 @@ __sa1100_proc_info: .long cpu_sa1100_info .long sa1100_processor_functions .long v4wb_tlb_fns + .long v4_mc_user_fns .size __sa1100_proc_info, . 
- __sa1100_proc_info .type __sa1110_proc_info,#object @@ -679,4 +675,5 @@ __sa1110_proc_info: .long cpu_sa1110_info .long sa1100_processor_functions .long v4wb_tlb_fns + .long v4_mc_user_fns .size __sa1110_proc_info, . - __sa1110_proc_info diff --git a/arch/arm/mm/proc-xscale.S b/arch/arm/mm/proc-xscale.S index 3e67f27..1cd680f 100644 --- a/arch/arm/mm/proc-xscale.S +++ b/arch/arm/mm/proc-xscale.S @@ -602,7 +602,9 @@ ENTRY(cpu_xscale_set_pmd) */ .align 5 ENTRY(cpu_xscale_set_pte) - str r1, [r0], #-1024 @ linux version + tst r0, #2048 + streq r0, [r0, -r0] @ BUG_ON + str r1, [r0], #-2048 @ linux version bic r2, r1, #0xff0 orr r2, r2, #PTE_TYPE_EXT @ extended page @@ -695,7 +697,7 @@ __xscale_setup: .type xscale_processor_functions, #object ENTRY(xscale_processor_functions) - .word armv4t_early_abort + .word v4t_early_abort .word cpu_xscale_check_bugs .word cpu_xscale_proc_init .word cpu_xscale_proc_fin @@ -722,10 +724,6 @@ ENTRY(xscale_processor_functions) .word cpu_xscale_set_pmd .word cpu_xscale_set_pte - /* misc */ - .word armv5te_clear_user_page - .word armv5te_copy_user_page - .size xscale_processor_functions, . - xscale_processor_functions .type cpu_80200_info, #object @@ -765,6 +763,7 @@ __80200_proc_info: .long cpu_80200_info .long xscale_processor_functions .long v4wbi_tlb_fns + .long v5te_mc_user_fns .size __80200_proc_info, . - __80200_proc_info .type __pxa250_proc_info,#object @@ -779,6 +778,7 @@ __pxa250_proc_info: .long cpu_pxa250_info .long xscale_processor_functions .long v4wbi_tlb_fns + .long v5te_mc_user_fns .size __cotulla_proc_info, . - __cotulla_proc_info .size __pxa250_proc_info, . 
- __pxa250_proc_info diff --git a/arch/i386/defconfig b/arch/i386/defconfig index 7d0ca2f..6b6fa38 100644 --- a/arch/i386/defconfig +++ b/arch/i386/defconfig @@ -258,7 +258,6 @@ CONFIG_BLK_DEV_IDEDMA=y # CONFIG_IDEDMA_PCI_WIP is not set # CONFIG_BLK_DEV_IDEDMA_TIMEOUT is not set # CONFIG_IDEDMA_NEW_DRIVE_LISTINGS is not set -CONFIG_BLK_DEV_ADMA=y # CONFIG_BLK_DEV_AEC62XX is not set # CONFIG_AEC62XX_TUNING is not set # CONFIG_BLK_DEV_ALI15X3 is not set diff --git a/arch/i386/kernel/apm.c b/arch/i386/kernel/apm.c index 3a39757..c20530f 100644 --- a/arch/i386/kernel/apm.c +++ b/arch/i386/kernel/apm.c @@ -275,10 +275,11 @@ extern int (*console_blank_hook)(int); */ /* - * Define to always call the APM BIOS busy routine even if the clock was - * not slowed by the idle routine. + * Define as 1 to make the driver always call the APM BIOS busy + * routine even if the clock was not reported as slowed by the + * idle routine. Otherwise, define as 0. */ -#define ALWAYS_CALL_BUSY +#define ALWAYS_CALL_BUSY 1 /* * Define to make the APM BIOS calls zero all data segment registers (so @@ -380,7 +381,7 @@ static int idle_period = DEFAULT_IDLE_PERIOD; static int set_pm_idle; static int suspends_pending; static int standbys_pending; -static int waiting_for_resume; +static int ignore_sys_suspend; static int ignore_normal_resume; static int bounce_interval = DEFAULT_BOUNCE_INTERVAL; @@ -471,6 +472,28 @@ static const lookup_t error_table[] = { }; #define ERROR_COUNT (sizeof(error_table)/sizeof(lookup_t)) +/** + * apm_error - display an APM error + * @str: information string + * @err: APM BIOS return code + * + * Write a meaningful log entry to the kernel log in the event of + * an APM error. 
+ */ + +static void apm_error(char *str, int err) +{ + int i; + + for (i = 0; i < ERROR_COUNT; i++) + if (error_table[i].key == err) break; + if (i < ERROR_COUNT) + printk(KERN_NOTICE "apm: %s: %s\n", str, error_table[i].msg); + else + printk(KERN_NOTICE "apm: %s: unknown error code %#2.2x\n", + str, err); +} + /* * These are the actual BIOS calls. Depending on APM_ZERO_SEGS and * apm_info.allow_ints, we are being really paranoid here! Not only @@ -702,13 +725,13 @@ static int set_power_state(u_short what, u_short state) } /** - * apm_set_power_state - set system wide power state + * set_system_power_state - set system wide power state * @state: which state to enter * * Transition the entire system into a new APM power state. */ -static int apm_set_power_state(u_short state) +static int set_system_power_state(u_short state) { return set_power_state(APM_DEVICE_ALL, state); } @@ -725,7 +748,6 @@ static int apm_set_power_state(u_short state) static int apm_do_idle(void) { u32 eax; - int slowed; if (apm_bios_call_simple(APM_FUNC_IDLE, 0, 0, &eax)) { static unsigned long t; @@ -737,13 +759,8 @@ static int apm_do_idle(void) } return -1; } - slowed = (apm_info.bios.flags & APM_IDLE_SLOWS_CLOCK) != 0; -#ifdef ALWAYS_CALL_BUSY - clock_slowed = 1; -#else - clock_slowed = slowed; -#endif - return slowed; + clock_slowed = (apm_info.bios.flags & APM_IDLE_SLOWS_CLOCK) != 0; + return clock_slowed; } /** @@ -756,7 +773,7 @@ static void apm_do_busy(void) { u32 dummy; - if (clock_slowed) { + if (clock_slowed || ALWAYS_CALL_BUSY) { (void) apm_bios_call_simple(APM_FUNC_BUSY, 0, 0, &dummy); clock_slowed = 0; } @@ -771,7 +788,7 @@ static void apm_do_busy(void) #define IDLE_CALC_LIMIT (HZ * 100) #define IDLE_LEAKY_MAX 16 -static void (*sys_idle)(void); +static void (*original_pm_idle)(void); extern void default_idle(void); @@ -785,14 +802,13 @@ extern void default_idle(void); static void apm_cpu_idle(void) { - static int use_apm_idle = 0; - static unsigned int last_jiffies = 0; - static 
unsigned int last_stime = 0; + static int use_apm_idle; /* = 0 */ + static unsigned int last_jiffies; /* = 0 */ + static unsigned int last_stime; /* = 0 */ - int apm_is_idle = 0; + int apm_idle_done = 0; unsigned int jiffies_since_last_check = jiffies - last_jiffies; - unsigned int t1; - + unsigned int bucket; recalc: if (jiffies_since_last_check > IDLE_CALC_LIMIT) { @@ -810,7 +826,7 @@ recalc: last_stime = current->times.tms_stime; } - t1 = IDLE_LEAKY_MAX; + bucket = IDLE_LEAKY_MAX; while (!need_resched()) { if (use_apm_idle) { @@ -818,23 +834,24 @@ recalc: t = jiffies; switch (apm_do_idle()) { - case 0: apm_is_idle = 1; + case 0: apm_idle_done = 1; if (t != jiffies) { - if (t1) { - t1 = IDLE_LEAKY_MAX; + if (bucket) { + bucket = IDLE_LEAKY_MAX; continue; } - } else if (t1) { - t1--; + } else if (bucket) { + bucket--; continue; } break; - case 1: apm_is_idle = 1; + case 1: apm_idle_done = 1; break; + default: /* BIOS refused */ } } - if (sys_idle) - sys_idle(); + if (original_pm_idle) + original_pm_idle(); else default_idle(); jiffies_since_last_check = jiffies - last_jiffies; @@ -842,7 +859,7 @@ recalc: goto recalc; } - if (apm_is_idle) + if (apm_idle_done) apm_do_busy(); } @@ -890,7 +907,7 @@ static void apm_power_off(void) if (apm_info.realmode_power_off) machine_real_restart(po_bios_call, sizeof(po_bios_call)); else - (void) apm_set_power_state(APM_STATE_OFF); + (void) set_system_power_state(APM_STATE_OFF); } /** @@ -1035,28 +1052,6 @@ static int apm_engage_power_management(u_short device, int enable) return APM_SUCCESS; } -/** - * apm_error - display an APM error - * @str: information string - * @err: APM BIOS return code - * - * Write a meaningful log entry to the kernel log in the event of - * an APM error. 
- */ - -static void apm_error(char *str, int err) -{ - int i; - - for (i = 0; i < ERROR_COUNT; i++) - if (error_table[i].key == err) break; - if (i < ERROR_COUNT) - printk(KERN_NOTICE "apm: %s: %s\n", str, error_table[i].msg); - else - printk(KERN_NOTICE "apm: %s: unknown error code %#2.2x\n", - str, err); -} - #if defined(CONFIG_APM_DISPLAY_BLANK) && defined(CONFIG_VT) /** @@ -1198,9 +1193,9 @@ static int suspend(int vetoable) /* Vetoed */ if (vetoable) { if (apm_info.connection_version > 0x100) - apm_set_power_state(APM_STATE_REJECT); + set_system_power_state(APM_STATE_REJECT); err = -EBUSY; - waiting_for_resume = 0; + ignore_sys_suspend = 0; printk(KERN_WARNING "apm: suspend was vetoed.\n"); goto out; } @@ -1208,9 +1203,10 @@ static int suspend(int vetoable) } get_time_diff(); cli(); - err = apm_set_power_state(APM_STATE_SUSPEND); + err = set_system_power_state(APM_STATE_SUSPEND); reinit_timer(); set_time(); + ignore_normal_resume = 1; sti(); if (err == APM_NO_ERROR) err = APM_SUCCESS; @@ -1219,7 +1215,6 @@ static int suspend(int vetoable) err = (err == APM_SUCCESS) ? 
0 : -EIO; pm_send_all(PM_RESUME, (void *)0); queue_event(APM_NORMAL_RESUME, NULL); - ignore_normal_resume = 1; out: spin_lock(&user_list_lock); for (as = user_list; as != NULL; as = as->next) { @@ -1237,7 +1232,7 @@ static void standby(void) /* If needed, notify drivers here */ get_time_diff(); - err = apm_set_power_state(APM_STATE_STANDBY); + err = set_system_power_state(APM_STATE_STANDBY); if ((err != APM_SUCCESS) && (err != APM_NO_ERROR)) apm_error("standby", err); } @@ -1291,13 +1286,13 @@ static void check_events(void) case APM_USER_SUSPEND: #ifdef CONFIG_APM_IGNORE_USER_SUSPEND if (apm_info.connection_version > 0x100) - apm_set_power_state(APM_STATE_REJECT); + set_system_power_state(APM_STATE_REJECT); break; #endif case APM_SYS_SUSPEND: if (ignore_bounce) { if (apm_info.connection_version > 0x100) - apm_set_power_state(APM_STATE_REJECT); + set_system_power_state(APM_STATE_REJECT); break; } /* @@ -1308,9 +1303,9 @@ static void check_events(void) * sending a SUSPEND event until something else * happens! 
*/ - if (waiting_for_resume) + if (ignore_sys_suspend) return; - waiting_for_resume = 1; + ignore_sys_suspend = 1; queue_event(event, NULL); if (suspends_pending <= 0) (void) suspend(1); @@ -1319,7 +1314,7 @@ static void check_events(void) case APM_NORMAL_RESUME: case APM_CRITICAL_RESUME: case APM_STANDBY_RESUME: - waiting_for_resume = 0; + ignore_sys_suspend = 0; last_resume = jiffies; ignore_bounce = 1; if ((event != APM_NORMAL_RESUME) @@ -1363,7 +1358,7 @@ static void apm_event_handler(void) pending_count = 4; if (debug) printk(KERN_DEBUG "apm: setting state busy\n"); - err = apm_set_power_state(APM_STATE_BUSY); + err = set_system_power_state(APM_STATE_BUSY); if (err) apm_error("busy", err); } @@ -1972,7 +1967,7 @@ static int __init apm_init(void) if (HZ != 100) idle_period = (idle_period * HZ) / 100; if (idle_threshold < 100) { - sys_idle = pm_idle; + original_pm_idle = pm_idle; pm_idle = apm_cpu_idle; set_pm_idle = 1; } @@ -1985,7 +1980,7 @@ static void __exit apm_exit(void) int error; if (set_pm_idle) - pm_idle = sys_idle; + pm_idle = original_pm_idle; if (((apm_info.bios.flags & APM_BIOS_DISENGAGED) == 0) && (apm_info.connection_version > 0x0100)) { error = apm_engage_power_management(APM_DEVICE_ALL, 0); diff --git a/arch/i386/kernel/dmi_scan.c b/arch/i386/kernel/dmi_scan.c index 5323df4..25bc3d7 100644 --- a/arch/i386/kernel/dmi_scan.c +++ b/arch/i386/kernel/dmi_scan.c @@ -492,6 +492,11 @@ static __initdata struct dmi_blacklist dmi_blacklist[]={ MATCH(DMI_BIOS_VERSION, "A04"), MATCH(DMI_BIOS_DATE, "08/24/2000"), NO_MATCH } }, + { broken_apm_power, "Dell Inspiron 2500", { /* Handle problems with APM on Inspiron 2500 */ + MATCH(DMI_BIOS_VENDOR, "Phoenix Technologies LTD"), + MATCH(DMI_BIOS_VERSION, "A12"), + MATCH(DMI_BIOS_DATE, "02/04/2002"), NO_MATCH + } }, { set_realmode_power_off, "Award Software v4.60 PGMA", { /* broken PM poweroff bios */ MATCH(DMI_BIOS_VENDOR, "Award Software International, Inc."), MATCH(DMI_BIOS_VERSION, "4.60 PGMA"), diff --git 
a/arch/i386/kernel/entry.S b/arch/i386/kernel/entry.S index cea5ec0..92c5839 100644 --- a/arch/i386/kernel/entry.S +++ b/arch/i386/kernel/entry.S @@ -717,6 +717,7 @@ ENTRY(sys_call_table) .long SYMBOL_NAME(sys_fremovexattr) .long SYMBOL_NAME(sys_tkill) .long SYMBOL_NAME(sys_sendfile64) + .long SYMBOL_NAME(sys_futex) /* 240 */ .rept NR_syscalls-(.-sys_call_table)/4 .long SYMBOL_NAME(sys_ni_syscall) diff --git a/arch/i386/kernel/pci-i386.h b/arch/i386/kernel/pci-i386.h index 2c821af..8f22892 100644 --- a/arch/i386/kernel/pci-i386.h +++ b/arch/i386/kernel/pci-i386.h @@ -18,6 +18,7 @@ #define PCI_NO_SORT 0x0100 #define PCI_BIOS_SORT 0x0200 #define PCI_NO_CHECKS 0x0400 +#define PCI_USE_PIRQ_MASK 0x0800 #define PCI_ASSIGN_ROMS 0x1000 #define PCI_BIOS_IRQ_SCAN 0x2000 #define PCI_ASSIGN_ALL_BUSSES 0x4000 diff --git a/arch/i386/kernel/pci-irq.c b/arch/i386/kernel/pci-irq.c index 0c45abe..10e307f 100644 --- a/arch/i386/kernel/pci-irq.c +++ b/arch/i386/kernel/pci-irq.c @@ -570,6 +570,10 @@ static int pcibios_lookup_irq(struct pci_dev *dev, int assign) * reported by the device if possible. 
*/ newirq = dev->irq; + if (!((1 << newirq) & mask)) { + if ( pci_probe & PCI_USE_PIRQ_MASK) newirq = 0; + else printk(KERN_WARNING "PCI: IRQ %i for device %s doesn't match PIRQ mask - try pci=usepirqmask\n", newirq, dev->slot_name); + } if (!newirq && assign) { for (i = 0; i < 16; i++) { if (!(mask & (1 << i))) @@ -588,7 +592,8 @@ static int pcibios_lookup_irq(struct pci_dev *dev, int assign) irq = pirq & 0xf; DBG(" -> hardcoded IRQ %d\n", irq); msg = "Hardcoded"; - } else if (r->get && (irq = r->get(pirq_router_dev, dev, pirq))) { + } else if ( r->get && (irq = r->get(pirq_router_dev, dev, pirq)) && \ + ((!(pci_probe & PCI_USE_PIRQ_MASK)) || ((1 << irq) & mask)) ) { DBG(" -> got IRQ %d\n", irq); msg = "Found"; } else if (newirq && r->set && (dev->class >> 8) != PCI_CLASS_DISPLAY_VGA) { @@ -622,7 +627,9 @@ static int pcibios_lookup_irq(struct pci_dev *dev, int assign) continue; if (info->irq[pin].link == pirq) { /* We refuse to override the dev->irq information. Give a warning! */ - if (dev2->irq && dev2->irq != irq) { + if ( dev2->irq && dev2->irq != irq && \ + (!(pci_probe & PCI_USE_PIRQ_MASK) || \ + ((1 << dev2->irq) & mask)) ) { printk(KERN_INFO "IRQ routing conflict for %s, have irq %d, want irq %d\n", dev2->slot_name, dev2->irq, irq); continue; diff --git a/arch/i386/kernel/pci-pc.c b/arch/i386/kernel/pci-pc.c index 7281c71..09dd880 100644 --- a/arch/i386/kernel/pci-pc.c +++ b/arch/i386/kernel/pci-pc.c @@ -1343,6 +1343,9 @@ char * __devinit pcibios_setup(char *str) } else if (!strcmp(str, "assign-busses")) { pci_probe |= PCI_ASSIGN_ALL_BUSSES; return NULL; + } else if (!strcmp(str, "usepirqmask")) { + pci_probe |= PCI_USE_PIRQ_MASK; + return NULL; } else if (!strncmp(str, "irqmask=", 8)) { pcibios_irq_mask = simple_strtol(str+8, NULL, 0); return NULL; diff --git a/arch/ia64/Makefile b/arch/ia64/Makefile index a107bbc..757b58a 100644 --- a/arch/ia64/Makefile +++ b/arch/ia64/Makefile @@ -25,7 +25,7 @@ CFLAGS_KERNEL := -mconstant-gp GCC_VERSION=$(shell 
$(CROSS_COMPILE)$(HOSTCC) -v 2>&1 | fgrep 'gcc version' | cut -f3 -d' ' | cut -f1 -d'.') ifneq ($(GCC_VERSION),2) - CFLAGS += -frename-registers --param max-inline-insns=400 + CFLAGS += -frename-registers --param max-inline-insns=2000 endif ifeq ($(CONFIG_ITANIUM_BSTEP_SPECIFIC),y) @@ -58,7 +58,7 @@ ifdef CONFIG_IA64_SGI_SN CFLAGS += -DBRINGUP SUBDIRS := arch/$(ARCH)/sn/kernel \ arch/$(ARCH)/sn/io \ - arch/$(ARCH)/sn/fprom \ + arch/$(ARCH)/sn/fakeprom \ $(SUBDIRS) CORE_FILES := arch/$(ARCH)/sn/kernel/sn.o \ arch/$(ARCH)/sn/io/sgiio.o \ diff --git a/arch/ia64/config.in b/arch/ia64/config.in index 5c0350e..086c87a 100644 --- a/arch/ia64/config.in +++ b/arch/ia64/config.in @@ -119,6 +119,7 @@ if [ "$CONFIG_IA64_HP_SIM" = "n" ]; then source drivers/mtd/Config.in source drivers/pnp/Config.in source drivers/block/Config.in +source drivers/ieee1394/Config.in source drivers/message/i2o/Config.in source drivers/md/Config.in @@ -230,7 +231,7 @@ if [ "$CONFIG_IA64_HP_SIM" != "n" -o "$CONFIG_IA64_GENERIC" != "n" ]; then mainmenu_option next_comment comment 'Simulated drivers' - tristate 'Simulated Ethernet ' CONFIG_SIMETH + bool 'Simulated Ethernet ' CONFIG_SIMETH bool 'Simulated serial driver support' CONFIG_SIM_SERIAL if [ "$CONFIG_SCSI" != "n" ]; then bool 'Simulated SCSI disk' CONFIG_SCSI_SIM @@ -252,13 +253,20 @@ if [ "$CONFIG_DEBUG_KERNEL" != "n" ]; then bool ' Disable VHPT' CONFIG_DISABLE_VHPT bool ' Magic SysRq key' CONFIG_MAGIC_SYSRQ -# early printk is currently broken for SMP: the secondary processors get stuck... 
-# bool ' Early printk support (requires VGA!)' CONFIG_IA64_EARLY_PRINTK - + bool ' Early printk support (requires VGA!)' CONFIG_IA64_EARLY_PRINTK bool ' Debug memory allocations' CONFIG_DEBUG_SLAB bool ' Spinlock debugging' CONFIG_DEBUG_SPINLOCK bool ' Turn on compare-and-exchange bug checking (slow!)' CONFIG_IA64_DEBUG_CMPXCHG bool ' Turn on irq debug checks (slow!)' CONFIG_IA64_DEBUG_IRQ + bool ' Built-in Kernel Debugger support' CONFIG_KDB + dep_tristate ' KDB modules' CONFIG_KDB_MODULES $CONFIG_KDB + if [ "$CONFIG_KDB" = "y" ]; then + bool ' KDB off by default' CONFIG_KDB_OFF + comment ' Load all symbols for debugging is required for KDB' + define_bool CONFIG_KALLSYMS y + else + bool ' Load all symbols for debugging' CONFIG_KALLSYMS + fi fi endmenu diff --git a/arch/ia64/defconfig b/arch/ia64/defconfig index 6a0a9ff..48671bf 100644 --- a/arch/ia64/defconfig +++ b/arch/ia64/defconfig @@ -207,7 +207,6 @@ CONFIG_BLK_DEV_IDESCSI=y CONFIG_BLK_DEV_IDEPCI=y CONFIG_IDEPCI_SHARE_IRQ=y CONFIG_BLK_DEV_IDEDMA_PCI=y -CONFIG_BLK_DEV_ADMA=y # CONFIG_BLK_DEV_OFFBOARD is not set # CONFIG_IDEDMA_PCI_AUTO is not set CONFIG_BLK_DEV_IDEDMA=y diff --git a/arch/ia64/dig/setup.c b/arch/ia64/dig/setup.c index 66bc8d5..c75a357 100644 --- a/arch/ia64/dig/setup.c +++ b/arch/ia64/dig/setup.c @@ -33,6 +33,7 @@ * is sufficient (the IDE driver will autodetect the drive geometry). */ char drive_info[4*16]; +extern int pcat_compat; unsigned char aux_device_present = 0xaa; /* XXX remove this when legacy I/O is gone */ @@ -81,13 +82,19 @@ dig_setup (char **cmdline_p) screen_info.orig_video_ega_bx = 3; /* XXX fake */ } -void +void __init dig_irq_init (void) { - /* - * Disable the compatibility mode interrupts (8259 style), needs IN/OUT support - * enabled. - */ - outb(0xff, 0xA1); - outb(0xff, 0x21); + if (pcat_compat) { + /* + * Disable the compatibility mode interrupts (8259 style), needs IN/OUT support + * enabled. 
+ */ + printk("%s: Disabling PC-AT compatible 8259 interrupts\n", __FUNCTION__); + outb(0xff, 0xA1); + outb(0xff, 0x21); + } else { + printk("%s: System doesn't have PC-AT compatible dual-8259 setup. " + "Nothing to be done\n", __FUNCTION__); + } } diff --git a/arch/ia64/hp/hpsim_console.c b/arch/ia64/hp/hpsim_console.c index 4782748..450e3bf 100644 --- a/arch/ia64/hp/hpsim_console.c +++ b/arch/ia64/hp/hpsim_console.c @@ -1,15 +1,18 @@ /* * Platform dependent support for HP simulator. * - * Copyright (C) 1998, 1999 Hewlett-Packard Co - * Copyright (C) 1998, 1999 David Mosberger-Tang <davidm@hpl.hp.com> + * Copyright (C) 1998, 1999, 2002 Hewlett-Packard Co + * David Mosberger-Tang <davidm@hpl.hp.com> * Copyright (C) 1999 Vijay Chander <vijay@engr.sgi.com> */ +#include <linux/config.h> + #include <linux/init.h> #include <linux/kernel.h> #include <linux/param.h> #include <linux/string.h> #include <linux/types.h> +#include <linux/tty.h> #include <linux/kdev_t.h> #include <linux/console.h> @@ -57,5 +60,5 @@ simcons_write (struct console *cons, const char *buf, unsigned count) static kdev_t simcons_console_device (struct console *c) { - return MKDEV(TTY_MAJOR, 64 + c->index); + return mk_kdev(TTY_MAJOR, 64 + c->index); } diff --git a/arch/ia64/ia32/binfmt_elf32.c b/arch/ia64/ia32/binfmt_elf32.c index 15a9f9b..1efa244 100644 --- a/arch/ia64/ia32/binfmt_elf32.c +++ b/arch/ia64/ia32/binfmt_elf32.c @@ -142,10 +142,11 @@ ia64_elf32_init (struct pt_regs *regs) /* * Setup GDTD. Note: GDTD is the descrambled version of the pseudo-descriptor * format defined by Figure 3-11 "Pseudo-Descriptor Format" in the IA-32 - * architecture manual. + * architecture manual. Also note that the only fields that are not ignored are + * `base', `limit', 'G', `P' (must be 1) and `S' (must be 0). 
*/ - regs->r31 = IA32_SEG_UNSCRAMBLE(IA32_SEG_DESCRIPTOR(IA32_GDT_OFFSET, IA32_PAGE_SIZE - 1, 0, - 0, 0, 0, 0, 0, 0)); + regs->r31 = IA32_SEG_UNSCRAMBLE(IA32_SEG_DESCRIPTOR(IA32_GDT_OFFSET, IA32_PAGE_SIZE - 1, + 0, 0, 0, 1, 0, 0, 0)); /* Setup the segment selectors */ regs->r16 = (__USER_DS << 16) | __USER_DS; /* ES == DS, GS, FS are zero */ regs->r17 = (__USER_DS << 16) | __USER_CS; /* SS, CS; ia32_load_state() sets TSS and LDT */ @@ -206,6 +207,7 @@ elf32_set_personality (void) set_personality(PER_LINUX32); current->thread.map_base = IA32_PAGE_OFFSET/3; current->thread.task_size = IA32_PAGE_OFFSET; /* use what Linux/x86 uses... */ + current->thread.flags |= IA64_THREAD_XSTACK; /* data must be executable */ set_fs(USER_DS); /* set addr limit for new TASK_SIZE */ } diff --git a/arch/ia64/ia32/ia32_entry.S b/arch/ia64/ia32/ia32_entry.S index 7a85fb8..3730d4a 100644 --- a/arch/ia64/ia32/ia32_entry.S +++ b/arch/ia64/ia32/ia32_entry.S @@ -37,7 +37,7 @@ ENTRY(ia32_clone) mov loc1=r16 // save ar.pfs across do_fork .body zxt4 out1=in1 // newsp - mov out3=0 // stacksize + mov out3=16 // stacksize (compensates for 16-byte scratch area) adds out2=IA64_SWITCH_STACK_SIZE+16,sp // out2 = &regs zxt4 out0=in0 // out0 = clone_flags br.call.sptk.many rp=do_fork @@ -98,7 +98,7 @@ GLOBAL_ENTRY(ia32_ret_from_clone) ld8 r2=[r2] ;; mov r8=0 - tbit.nz p6,p0=r2,PT_TRACESYS_BIT + tbit.nz p6,p0=r2,PT_SYSCALLTRACE_BIT (p6) br.cond.spnt .ia32_strace_check_retval ;; // prevent RAW on r8 END(ia32_ret_from_clone) @@ -220,7 +220,7 @@ ia32_syscall_table: data8 sys32_pipe data8 sys32_times data8 sys32_ni_syscall /* old prof syscall holder */ - data8 sys_brk /* 45 */ + data8 sys32_brk /* 45 */ data8 sys_setgid /* 16-bit version */ data8 sys_getgid /* 16-bit version */ data8 sys32_signal diff --git a/arch/ia64/ia32/ia32_ioctl.c b/arch/ia64/ia32/ia32_ioctl.c index a73a2c9..bb16265 100644 --- a/arch/ia64/ia32/ia32_ioctl.c +++ b/arch/ia64/ia32/ia32_ioctl.c @@ -3,12 +3,14 @@ * * Copyright (C) 2000 VA Linux
Co * Copyright (C) 2000 Don Dugger <n0ano@valinux.com> - * Copyright (C) 2001 Hewlett-Packard Co + * Copyright (C) 2001-2002 Hewlett-Packard Co * David Mosberger-Tang <davidm@hpl.hp.com> */ #include <linux/types.h> #include <linux/dirent.h> +#include <linux/fs.h> /* argh, msdos_fs.h isn't self-contained... */ + #include <linux/msdos_fs.h> #include <linux/mtio.h> #include <linux/ncp_fs.h> @@ -79,6 +81,38 @@ sys32_ioctl (unsigned int fd, unsigned int cmd, unsigned int arg) return ret; } + case IOCTL_NR(SIOCGIFCONF): + { + struct ifconf32 { + int ifc_len; + unsigned int ifc_ptr; + } ifconf32; + struct ifconf ifconf; + int i, n; + char *p32, *p64; + char buf[32]; /* sizeof IA32 ifreq structure */ + + if (copy_from_user(&ifconf32, P(arg), sizeof(ifconf32))) + return -EFAULT; + ifconf.ifc_len = ifconf32.ifc_len; + ifconf.ifc_req = P(ifconf32.ifc_ptr); + ret = DO_IOCTL(fd, SIOCGIFCONF, &ifconf); + ifconf32.ifc_len = ifconf.ifc_len; + if (copy_to_user(P(arg), &ifconf32, sizeof(ifconf32))) + return -EFAULT; + n = ifconf.ifc_len / sizeof(struct ifreq); + p32 = P(ifconf32.ifc_ptr); + p64 = P(ifconf32.ifc_ptr); + for (i = 0; i < n; i++) { + if (copy_from_user(buf, p64, sizeof(struct ifreq))) + return -EFAULT; + if (copy_to_user(p32, buf, sizeof(buf))) + return -EFAULT; + p32 += sizeof(buf); + p64 += sizeof(struct ifreq); + } + return ret; + } case IOCTL_NR(DRM_IOCTL_VERSION): { diff --git a/arch/ia64/ia32/ia32_signal.c b/arch/ia64/ia32/ia32_signal.c index 5b91a81..57a07f8 100644 --- a/arch/ia64/ia32/ia32_signal.c +++ b/arch/ia64/ia32/ia32_signal.c @@ -1,7 +1,7 @@ /* * IA32 Architecture-specific signal handling support. 
* - * Copyright (C) 1999, 2001 Hewlett-Packard Co + * Copyright (C) 1999, 2001-2002 Hewlett-Packard Co * David Mosberger-Tang <davidm@hpl.hp.com> * Copyright (C) 1999 Arun Sharma <arun.sharma@intel.com> * Copyright (C) 2000 VA Linux Co @@ -522,6 +522,7 @@ get_sigframe (struct k_sigaction *ka, struct pt_regs * regs, size_t frame_size) static int setup_frame_ia32 (int sig, struct k_sigaction *ka, sigset_t *set, struct pt_regs * regs) { + struct exec_domain *ed = current_thread_info()->exec_domain; struct sigframe_ia32 *frame; int err = 0; @@ -530,12 +531,8 @@ setup_frame_ia32 (int sig, struct k_sigaction *ka, sigset_t *set, struct pt_regs if (!access_ok(VERIFY_WRITE, frame, sizeof(*frame))) goto give_sigsegv; - err |= __put_user((current->exec_domain - && current->exec_domain->signal_invmap - && sig < 32 - ? (int)(current->exec_domain->signal_invmap[sig]) - : sig), - &frame->sig); + err |= __put_user((ed && ed->signal_invmap && sig < 32 + ? (int)(ed->signal_invmap[sig]) : sig), &frame->sig); err |= setup_sigcontext_ia32(&frame->sc, &frame->fpstate, regs, set->sig[0]); @@ -590,6 +587,7 @@ static int setup_rt_frame_ia32 (int sig, struct k_sigaction *ka, siginfo_t *info, sigset_t *set, struct pt_regs * regs) { + struct exec_domain *ed = current_thread_info()->exec_domain; struct rt_sigframe_ia32 *frame; int err = 0; @@ -598,12 +596,8 @@ setup_rt_frame_ia32 (int sig, struct k_sigaction *ka, siginfo_t *info, if (!access_ok(VERIFY_WRITE, frame, sizeof(*frame))) goto give_sigsegv; - err |= __put_user((current->exec_domain - && current->exec_domain->signal_invmap - && sig < 32 - ? current->exec_domain->signal_invmap[sig] - : sig), - &frame->sig); + err |= __put_user((ed && ed->signal_invmap + && sig < 32 ? 
ed->signal_invmap[sig] : sig), &frame->sig); err |= __put_user((long)&frame->info, &frame->pinfo); err |= __put_user((long)&frame->uc, &frame->puc); err |= copy_siginfo_to_user32(&frame->info, info); diff --git a/arch/ia64/ia32/ia32_support.c b/arch/ia64/ia32/ia32_support.c index 4f536c1..9d0d71e 100644 --- a/arch/ia64/ia32/ia32_support.c +++ b/arch/ia64/ia32/ia32_support.c @@ -3,7 +3,7 @@ * * Copyright (C) 1999 Arun Sharma <arun.sharma@intel.com> * Copyright (C) 2000 Asit K. Mallick <asit.k.mallick@intel.com> - * Copyright (C) 2001 Hewlett-Packard Co + * Copyright (C) 2001-2002 Hewlett-Packard Co * David Mosberger-Tang <davidm@hpl.hp.com> * * 06/16/00 A. Mallick added csd/ssd/tssd for ia32 thread context @@ -153,10 +153,12 @@ ia32_gdt_init (void) /* We never change the TSS and LDT descriptors, so we can share them across all CPUs. */ ldt_size = PAGE_ALIGN(IA32_LDT_ENTRIES*IA32_LDT_ENTRY_SIZE); for (nr = 0; nr < NR_CPUS; ++nr) { - ia32_gdt[_TSS(nr)] = IA32_SEG_DESCRIPTOR(IA32_TSS_OFFSET, 235, - 0xb, 0, 3, 1, 1, 1, 0); - ia32_gdt[_LDT(nr)] = IA32_SEG_DESCRIPTOR(IA32_LDT_OFFSET, ldt_size - 1, - 0x2, 0, 3, 1, 1, 1, 0); + ia32_gdt[_TSS(nr) >> IA32_SEGSEL_INDEX_SHIFT] + = IA32_SEG_DESCRIPTOR(IA32_TSS_OFFSET, 235, + 0xb, 0, 3, 1, 1, 1, 0); + ia32_gdt[_LDT(nr) >> IA32_SEGSEL_INDEX_SHIFT] + = IA32_SEG_DESCRIPTOR(IA32_LDT_OFFSET, ldt_size - 1, + 0x2, 0, 3, 1, 1, 1, 0); } } @@ -172,6 +174,10 @@ ia32_bad_interrupt (unsigned long int_num, struct pt_regs *regs) siginfo.si_signo = SIGTRAP; siginfo.si_errno = int_num; /* XXX is it OK to abuse si_errno like this? */ + siginfo.si_flags = 0; + siginfo.si_isr = 0; + siginfo.si_addr = 0; + siginfo.si_imm = 0; siginfo.si_code = TRAP_BRKPT; force_sig_info(SIGTRAP, &siginfo, current); } diff --git a/arch/ia64/ia32/ia32_traps.c b/arch/ia64/ia32/ia32_traps.c index 8f0bb83..c43d91b 100644 --- a/arch/ia64/ia32/ia32_traps.c +++ b/arch/ia64/ia32/ia32_traps.c @@ -2,7 +2,7 @@ * IA-32 exception handlers * * Copyright (C) 2000 Asit K. 
Mallick <asit.k.mallick@intel.com> - * Copyright (C) 2001 Hewlett-Packard Co + * Copyright (C) 2001-2002 Hewlett-Packard Co * David Mosberger-Tang <davidm@hpl.hp.com> * * 06/16/00 A. Mallick added siginfo for most cases (close to IA32) @@ -40,7 +40,11 @@ ia32_exception (struct pt_regs *regs, unsigned long isr) { struct siginfo siginfo; + /* initialize these fields to avoid leaking kernel bits to user space: */ siginfo.si_errno = 0; + siginfo.si_flags = 0; + siginfo.si_isr = 0; + siginfo.si_imm = 0; switch ((isr >> 16) & 0xff) { case 1: case 2: @@ -103,6 +107,8 @@ ia32_exception (struct pt_regs *regs, unsigned long isr) * and it will suffer the consequences since we won't be able to * fully reproduce the context of the exception */ + siginfo.si_isr = isr; + siginfo.si_flags = __ISR_VALID; switch(((~fcr) & (fsr & 0x3f)) | (fsr & 0x240)) { case 0x000: default: diff --git a/arch/ia64/ia32/sys_ia32.c b/arch/ia64/ia32/sys_ia32.c index e11e8d9..009a644 100644 --- a/arch/ia64/ia32/sys_ia32.c +++ b/arch/ia64/ia32/sys_ia32.c @@ -6,7 +6,7 @@ * Copyright (C) 1999 Arun Sharma <arun.sharma@intel.com> * Copyright (C) 1997,1998 Jakub Jelinek (jj@sunsite.mff.cuni.cz) * Copyright (C) 1997 David S. Miller (davem@caip.rutgers.edu) - * Copyright (C) 2000-2001 Hewlett-Packard Co + * Copyright (C) 2000-2002 Hewlett-Packard Co * David Mosberger-Tang <davidm@hpl.hp.com> * * These routines maintain argument size conversion between 32bit and 64bit @@ -74,6 +74,9 @@ #define PAGE_START(addr) ((addr) & PAGE_MASK) #define PAGE_OFF(addr) ((addr) & ~PAGE_MASK) +#define high2lowuid(uid) ((uid) > 65535 ? 65534 : (uid)) +#define high2lowgid(gid) ((gid) > 65535 ? 
65534 : (gid)) + extern asmlinkage long sys_execve (char *, char **, char **, struct pt_regs *); extern asmlinkage long sys_mprotect (unsigned long, size_t, unsigned long); extern asmlinkage long sys_munmap (unsigned long, size_t); @@ -82,6 +85,7 @@ extern unsigned long arch_get_unmapped_area (struct file *, unsigned long, unsig /* forward declaration: */ asmlinkage long sys32_mprotect (unsigned int, unsigned int, int); +asmlinkage unsigned long sys_brk(unsigned long); /* * Anything that modifies or inspects ia32 user virtual memory must hold this semaphore @@ -400,7 +404,7 @@ emulate_mmap (struct file *file, unsigned long start, unsigned long len, int pro return -EINVAL; } if (!(prot & PROT_WRITE) && sys_mprotect(pstart, pend - pstart, prot) < 0) - return EINVAL; + return -EINVAL; } return start; } @@ -2578,6 +2582,7 @@ sys32_ipc (u32 call, int first, int second, int third, u32 ptr, u32 fifth) default: return -EINVAL; } + return -EINVAL; } /* @@ -3783,6 +3788,19 @@ sys32_personality (unsigned int personality) return ret; } +asmlinkage unsigned long +sys32_brk (unsigned int brk) +{ + unsigned long ret, obrk; + struct mm_struct *mm = current->mm; + + obrk = mm->brk; + ret = sys_brk(brk); + if (ret < obrk) + clear_user((void *) ret, PAGE_ALIGN(ret) - ret); + return ret; +} + #ifdef NOTYET /* UNTESTED FOR IA64 FROM HERE DOWN */ struct ncp_mount_data32 { diff --git a/arch/ia64/kernel/acpi.c b/arch/ia64/kernel/acpi.c index 5014111..4ce4977 100644 --- a/arch/ia64/kernel/acpi.c +++ b/arch/ia64/kernel/acpi.c @@ -5,11 +5,11 @@ * 'IA-64 Extensions to ACPI Specification' Revision 0.6 * * Copyright (C) 1999 VA Linux Systems - * Copyright (C) 1999,2000 Walt Drummond <drummond@valinux.com> - * Copyright (C) 2000 Hewlett-Packard Co. - * Copyright (C) 2000 David Mosberger-Tang <davidm@hpl.hp.com> + * Copyright (C) 1999, 2000 Walt Drummond <drummond@valinux.com> + * Copyright (C) 2000, 2002 Hewlett-Packard Co. 
+ * David Mosberger-Tang <davidm@hpl.hp.com> * Copyright (C) 2000 Intel Corp. - * Copyright (C) 2000,2001 J.I. Lee <jung-ik.lee@intel.com> + * Copyright (C) 2000, 2001 J.I. Lee <jung-ik.lee@intel.com> * ACPI based kernel configuration manager. * ACPI 2.0 & IA64 ext 0.71 */ @@ -44,6 +44,8 @@ int platform_irq_list[ACPI_MAX_PLATFORM_IRQS]; int __initdata available_cpus; int __initdata total_cpus; +int __initdata pcat_compat; + void (*pm_idle) (void); void (*pm_power_off) (void); @@ -293,6 +295,16 @@ acpi20_parse_madt (acpi_madt_t *madt) } else printk("Lapic address set to default 0x%lx\n", ipi_base_addr); + /* + * The PCAT_COMPAT flag indicates that the system has a dual-8259 compatible + * setup. + */ +#ifdef CONFIG_ITANIUM + pcat_compat = 1; /* fw on some Itanium systems is broken... */ +#else + pcat_compat = (madt->flags & MADT_PCAT_COMPAT); +#endif + p = (char *) (madt + 1); end = p + (madt->header.length - sizeof(acpi_madt_t)); @@ -319,17 +331,7 @@ acpi20_parse_madt (acpi_madt_t *madt) case ACPI20_ENTRY_IO_SAPIC: iosapic = (acpi_entry_iosapic_t *) p; if (iosapic_init) - /* - * The PCAT_COMPAT flag indicates that the system has a - * dual-8259 compatible setup. - */ - iosapic_init(iosapic->address, iosapic->irq_base, -#ifdef CONFIG_ITANIUM - 1 /* fw on some Itanium systems is broken... 
*/ -#else - (madt->flags & MADT_PCAT_COMPAT) -#endif - ); + iosapic_init(iosapic->address, iosapic->irq_base, pcat_compat); break; case ACPI20_ENTRY_PLATFORM_INT_SOURCE: @@ -401,7 +403,7 @@ acpi20_parse (acpi20_rsdp_t *rsdp20) # ifdef CONFIG_ACPI acpi_xsdt_t *xsdt; acpi_desc_table_hdr_t *hdrp; - acpi_madt_t *madt; + acpi_madt_t *madt = NULL; int tables, i; if (strncmp(rsdp20->signature, ACPI_RSDP_SIG, ACPI_RSDP_SIG_LEN)) { diff --git a/arch/ia64/kernel/brl_emu.c b/arch/ia64/kernel/brl_emu.c index 8016167..abfd6a8 100644 --- a/arch/ia64/kernel/brl_emu.c +++ b/arch/ia64/kernel/brl_emu.c @@ -2,6 +2,9 @@ * Emulation of the "brl" instruction for IA64 processors that * don't support it in hardware. * Author: Stephan Zeisset, Intel Corp. <Stephan.Zeisset@intel.com> + * + * 02/22/02 D. Mosberger Clear si_flgs, si_isr, and si_imm to avoid + * leaking kernel bits. */ #include <linux/kernel.h> @@ -195,6 +198,9 @@ ia64_emulate_brl (struct pt_regs *regs, unsigned long ar_ec) printk("Woah! Unimplemented Instruction Address Trap!\n"); siginfo.si_signo = SIGILL; siginfo.si_errno = 0; + siginfo.si_flags = 0; + siginfo.si_isr = 0; + siginfo.si_imm = 0; siginfo.si_code = ILL_BADIADDR; force_sig_info(SIGILL, &siginfo, current); } else if (ia64_psr(regs)->tb) { @@ -205,6 +211,10 @@ ia64_emulate_brl (struct pt_regs *regs, unsigned long ar_ec) siginfo.si_signo = SIGTRAP; siginfo.si_errno = 0; siginfo.si_code = TRAP_BRANCH; + siginfo.si_flags = 0; + siginfo.si_isr = 0; + siginfo.si_addr = 0; + siginfo.si_imm = 0; force_sig_info(SIGTRAP, &siginfo, current); } else if (ia64_psr(regs)->ss) { /* @@ -214,6 +224,10 @@ ia64_emulate_brl (struct pt_regs *regs, unsigned long ar_ec) siginfo.si_signo = SIGTRAP; siginfo.si_errno = 0; siginfo.si_code = TRAP_TRACE; + siginfo.si_flags = 0; + siginfo.si_isr = 0; + siginfo.si_addr = 0; + siginfo.si_imm = 0; force_sig_info(SIGTRAP, &siginfo, current); } return rv; diff --git a/arch/ia64/kernel/efivars.c b/arch/ia64/kernel/efivars.c index 5a5cf77..189503b 
100644 --- a/arch/ia64/kernel/efivars.c +++ b/arch/ia64/kernel/efivars.c @@ -29,6 +29,11 @@ * * Changelog: * + * 12 Feb 2002 - Matt Domsch <Matt_Domsch@dell.com> + * use list_for_each_safe when deleting vars. + * remove ifdef CONFIG_SMP around include <linux/smp.h> + * v0.04 release to linux-ia64@linuxia64.org + * * 20 April 2001 - Matt Domsch <Matt_Domsch@dell.com> * Moved vars from /proc/efi to /proc/efi/vars, and made * efi.c own the /proc/efi directory. @@ -56,18 +61,16 @@ #include <linux/sched.h> /* for capable() */ #include <linux/mm.h> #include <linux/module.h> +#include <linux/smp.h> #include <asm/efi.h> #include <asm/uaccess.h> -#ifdef CONFIG_SMP -#include <linux/smp.h> -#endif MODULE_AUTHOR("Matt Domsch <Matt_Domsch@Dell.com>"); MODULE_DESCRIPTION("/proc interface to EFI Variables"); MODULE_LICENSE("GPL"); -#define EFIVARS_VERSION "0.03 2001-Apr-20" +#define EFIVARS_VERSION "0.04 2002-Feb-12" static int efivar_read(char *page, char **start, off_t off, @@ -265,7 +268,7 @@ efivar_write(struct file *file, const char *buffer, { unsigned long strsize1, strsize2; int found=0; - struct list_head *pos; + struct list_head *pos, *n; unsigned long size = sizeof(efi_variable_t); efi_status_t status; efivar_entry_t *efivar = data, *search_efivar = NULL; @@ -297,7 +300,7 @@ efivar_write(struct file *file, const char *buffer, This allows any properly formatted data structure to be written to any of the files in /proc/efi/vars and it will work. 
*/ - list_for_each(pos, &efivar_list) { + list_for_each_safe(pos, n, &efivar_list) { search_efivar = efivar_entry(pos); strsize1 = utf8_strsize(search_efivar->var.VariableName, 1024); strsize2 = utf8_strsize(var_data->VariableName, 1024); @@ -413,12 +416,12 @@ efivars_init(void) static void __exit efivars_exit(void) { - struct list_head *pos; + struct list_head *pos, *n; efivar_entry_t *efivar; spin_lock(&efivars_lock); - list_for_each(pos, &efivar_list) { + list_for_each_safe(pos, n, &efivar_list) { efivar = efivar_entry(pos); remove_proc_entry(efivar->entry->name, efi_vars_dir); list_del(&efivar->list); diff --git a/arch/ia64/kernel/entry.S b/arch/ia64/kernel/entry.S index 832f396..d521bc7 100644 --- a/arch/ia64/kernel/entry.S +++ b/arch/ia64/kernel/entry.S @@ -3,7 +3,7 @@ * * Kernel entry points. * - * Copyright (C) 1998-2001 Hewlett-Packard Co + * Copyright (C) 1998-2002 Hewlett-Packard Co * David Mosberger-Tang <davidm@hpl.hp.com> * Copyright (C) 1999 VA Linux Systems * Copyright (C) 1999 Walt Drummond <drummond@valinux.com> @@ -30,14 +30,15 @@ #include <linux/config.h> +#include <asm/asmmacro.h> #include <asm/cache.h> #include <asm/errno.h> #include <asm/kregs.h> #include <asm/offsets.h> +#include <asm/pgtable.h> #include <asm/processor.h> +#include <asm/thread_info.h> #include <asm/unistd.h> -#include <asm/asmmacro.h> -#include <asm/pgtable.h> #include "minstate.h" @@ -115,7 +116,7 @@ GLOBAL_ENTRY(sys_clone) mov loc1=r16 // save ar.pfs across do_fork .body mov out1=in1 - mov out3=0 + mov out3=16 // stacksize (compensates for 16-byte scratch area) adds out2=IA64_SWITCH_STACK_SIZE+16,sp // out2 = &regs mov out0=in0 // out0 = clone_flags br.call.sptk.many rp=do_fork @@ -128,6 +129,9 @@ END(sys_clone) /* * prev_task <- ia64_switch_to(struct task_struct *next) + * With Ingo's new scheduler, interrupts are disabled when this routine gets + * called. The code starting at .map relies on this. The rest of the code + * doesn't care about the interrupt masking status.
*/ GLOBAL_ENTRY(ia64_switch_to) .prologue @@ -158,10 +162,8 @@ GLOBAL_ENTRY(ia64_switch_to) (p6) srlz.d ld8 sp=[r21] // load kernel stack pointer of new task mov IA64_KR(CURRENT)=r20 // update "current" application register - mov r8=r13 // return pointer to previously running task mov r13=in0 // set "current" pointer ;; -(p6) ssm psr.i // renable psr.i AFTER the ic bit is serialized DO_LOAD_SWITCH_STACK #ifdef CONFIG_SMP @@ -170,7 +172,7 @@ GLOBAL_ENTRY(ia64_switch_to) br.ret.sptk.many rp // boogie on out in new context .map: - rsm psr.i | psr.ic + rsm psr.ic // interrupts (psr.i) are already disabled here movl r25=PAGE_KERNEL ;; srlz.d @@ -433,7 +435,7 @@ GLOBAL_ENTRY(invoke_syscall_trace) .body mov loc2=b6 ;; -#error br.call.sptk.many rp=syscall_trace + br.call.sptk.many rp=syscall_trace .ret3: mov rp=loc0 mov ar.pfs=loc1 mov b6=loc2 @@ -454,7 +456,7 @@ END(invoke_syscall_trace) GLOBAL_ENTRY(ia64_trace_syscall) PT_REGS_UNWIND_INFO(0) -#error br.call.sptk.many rp=invoke_syscall_trace // give parent a chance to catch syscall args + br.call.sptk.many rp=invoke_syscall_trace // give parent a chance to catch syscall args .ret6: br.call.sptk.many rp=b6 // do the syscall strace_check_retval: cmp.lt p6,p0=r8,r0 // syscall failed? 
@@ -467,7 +469,7 @@ strace_save_retval: .mem.offset 0,0; st8.spill [r2]=r8 // store return value in slot for r8 .mem.offset 8,0; st8.spill [r3]=r10 // clear error indication in slot for r10 ia64_strace_leave_kernel: -#error br.call.sptk.many rp=invoke_syscall_trace // give parent a chance to catch return value + br.call.sptk.many rp=invoke_syscall_trace // give parent a chance to catch return value .rety: br.cond.sptk ia64_leave_kernel strace_error: @@ -491,12 +493,12 @@ GLOBAL_ENTRY(ia64_ret_from_clone) */ br.call.sptk.many rp=ia64_invoke_schedule_tail .ret8: - adds r2=IA64_TASK_PTRACE_OFFSET,r13 + adds r2=TI_FLAGS+IA64_TASK_SIZE,r13 ;; - ld8 r2=[r2] + ld4 r2=[r2] ;; mov r8=0 - tbit.nz p6,p0=r2,PT_TRACESYS_BIT + tbit.nz p6,p0=r2,TIF_SYSCALL_TRACE (p6) br.cond.spnt strace_check_retval ;; // added stop bits to prevent r8 dependency END(ia64_ret_from_clone) @@ -516,50 +518,29 @@ END(ia64_ret_from_syscall) // fall through GLOBAL_ENTRY(ia64_leave_kernel) PT_REGS_UNWIND_INFO(0) - lfetch.fault [sp] - movl r14=.restart - ;; - mov.ret.sptk rp=r14,.restart -.restart: - adds r17=IA64_TASK_NEED_RESCHED_OFFSET,r13 - adds r18=IA64_TASK_SIGPENDING_OFFSET,r13 -#ifdef CONFIG_PERFMON - adds r19=IA64_TASK_PFM_MUST_BLOCK_OFFSET,r13 -#endif - ;; -#ifdef CONFIG_PERFMON -(pUser) ld8 r19=[r19] // load current->thread.pfm_must_block -#endif -#error (pUser) ld8 r17=[r17] // load current->need_resched -#error (pUser) ld4 r18=[r18] // load current->sigpending + // work.need_resched etc. mustn't get changed by this CPU before it returns to userspace: +(pUser) cmp.eq.unc p6,p0=r0,r0 // p6 <- pUser +(pUser) rsm psr.i ;; -#ifdef CONFIG_PERFMON -(pUser) cmp.ne.unc p9,p0=r19,r0 // current->thread.pfm_must_block != 0? -#endif -#error (pUser) cmp.ne.unc p7,p0=r17,r0 // current->need_resched != 0? -#errror (pUser) cmp.ne.unc p8,p0=r18,r0 // current->sigpending != 0? 
+(pUser) adds r17=TI_FLAGS+IA64_TASK_SIZE,r13 ;; +.work_processed: +(p6) ld4 r18=[r17] // load current_thread_info()->flags adds r2=PT(R8)+16,r12 adds r3=PT(R9)+16,r12 -#ifdef CONFIG_PERFMON -(p9) br.call.spnt.many b7=pfm_block_on_overflow -#endif -#if __GNUC__ < 3 -(p7) br.call.spnt.many b7=invoke_schedule -#else -(p7) br.call.spnt.many b7=schedule -#endif -(p8) br.call.spnt.many b7=handle_signal_delivery // check & deliver pending signals ;; // start restoring the state saved on the kernel stack (struct pt_regs): ld8.fill r8=[r2],16 ld8.fill r9=[r3],16 +(p6) and r19=TIF_WORK_MASK,r18 // any work other than TIF_SYSCALL_TRACE? ;; ld8.fill r10=[r2],16 ld8.fill r11=[r3],16 +(p6) cmp4.ne.unc p6,p0=r19, r0 // any special work pending? ;; ld8.fill r16=[r2],16 ld8.fill r17=[r3],16 +(p6) br.cond.spnt .work_pending ;; ld8.fill r18=[r2],16 ld8.fill r19=[r3],16 @@ -582,7 +563,7 @@ GLOBAL_ENTRY(ia64_leave_kernel) ld8.fill r30=[r2],16 ld8.fill r31=[r3],16 ;; - rsm psr.i | psr.ic // initiate turning off of interrupts & interruption collection + rsm psr.i | psr.ic // initiate turning off of interrupt and interruption collection invala // invalidate ALAT ;; ld8 r1=[r2],16 // ar.ccv @@ -601,7 +582,7 @@ GLOBAL_ENTRY(ia64_leave_kernel) mov ar.fpsr=r13 mov b0=r14 ;; - srlz.i // ensure interrupts & interruption collection are off + srlz.i // ensure interruption collection is off mov b7=r15 ;; bsw.0 // switch back to bank 0 @@ -729,6 +710,25 @@ skip_rbs_switch: mov ar.unat=rARUNAT mov pr=rARPR,-1 rfi + +.work_pending: + tbit.z p6,p0=r18,TIF_NEED_RESCHED // current_thread_info()->need_resched==0? 
+(p6) br.cond.sptk.few .notify +#if __GNUC__ < 3 + br.call.spnt.many rp=invoke_schedule +#else + br.call.spnt.many rp=schedule +#endif +.ret9: cmp.eq p6,p0=r0,r0 // p6 <- 1 + rsm psr.i + ;; + adds r17=TI_FLAGS+IA64_TASK_SIZE,r13 + br.cond.sptk.many .work_processed // re-check + +.notify: + br.call.spnt.many rp=notify_resume_user +.ret10: cmp.ne p6,p0=r0,r0 // p6 <- 0 + br.cond.sptk.many .work_processed // don't re-check END(ia64_leave_kernel) ENTRY(handle_syscall_error) @@ -802,7 +802,7 @@ END(invoke_schedule) * be set up by the caller. We declare 8 input registers so the system call * args get preserved, in case we need to restart a system call. */ -ENTRY(handle_signal_delivery) +ENTRY(notify_resume_user) .prologue ASM_UNW_PRLG_RP|ASM_UNW_PRLG_PFS, ASM_UNW_PRLG_GRSAVE(8) alloc loc1=ar.pfs,8,2,3,0 // preserve all eight input regs in case of syscall restart! mov r9=ar.unat @@ -816,17 +816,17 @@ ENTRY(handle_signal_delivery) .spillpsp ar.unat, 16 // (note that offset is relative to psp+0x10!) 
st8 [sp]=r9,-16 // allocate space for ar.unat and save it .body -#error br.call.sptk.many rp=ia64_do_signal + br.call.sptk.many rp=do_notify_resume_user .ret15: .restore sp adds sp=16,sp // pop scratch stack space ;; - ld8 r9=[sp] // load new unat from sw->caller_unat + ld8 r9=[sp] // load new unat from sigscratch->scratch_unat mov rp=loc0 ;; mov ar.unat=r9 mov ar.pfs=loc1 br.ret.sptk.many rp -END(handle_signal_delivery) +END(do_notify_resume_user) GLOBAL_ENTRY(sys_rt_sigsuspend) .prologue ASM_UNW_PRLG_RP|ASM_UNW_PRLG_PFS, ASM_UNW_PRLG_GRSAVE(8) @@ -1033,9 +1033,9 @@ sys_call_table: data8 sys_syslog data8 sys_setitimer data8 sys_getitimer - data8 ia64_oldstat // 1120 - data8 ia64_oldlstat - data8 ia64_oldfstat + data8 ia64_ni_syscall // 1120 /* was: ia64_oldstat */ + data8 ia64_ni_syscall /* was: ia64_oldlstat */ + data8 ia64_ni_syscall /* was: ia64_oldfstat */ data8 sys_vhangup data8 sys_lchown data8 sys_vm86 // 1125 @@ -1130,19 +1130,23 @@ sys_call_table: data8 sys_getdents64 data8 sys_getunwind // 1215 data8 sys_readahead + data8 sys_setxattr + data8 sys_lsetxattr + data8 sys_fsetxattr + data8 sys_getxattr // 1220 + data8 sys_lgetxattr + data8 sys_fgetxattr + data8 sys_listxattr + data8 sys_llistxattr + data8 sys_flistxattr // 1225 + data8 sys_removexattr + data8 sys_lremovexattr + data8 sys_fremovexattr +#if 0 data8 sys_tkill +#else data8 ia64_ni_syscall - data8 ia64_ni_syscall - data8 ia64_ni_syscall // 1220 - data8 ia64_ni_syscall - data8 ia64_ni_syscall - data8 ia64_ni_syscall - data8 ia64_ni_syscall - data8 ia64_ni_syscall // 1225 - data8 ia64_ni_syscall - data8 ia64_ni_syscall - data8 ia64_ni_syscall - data8 ia64_ni_syscall +#endif data8 ia64_ni_syscall // 1230 data8 ia64_ni_syscall data8 ia64_ni_syscall diff --git a/arch/ia64/kernel/gate.S b/arch/ia64/kernel/gate.S index 774c1b9..09f619b 100644 --- a/arch/ia64/kernel/gate.S +++ b/arch/ia64/kernel/gate.S @@ -90,7 +90,7 @@ GLOBAL_ENTRY(ia64_sigtramp) (p8) br.cond.spnt setup_rbs // yup -> (clobbers r14, r15, 
and r16) back_from_setup_rbs: - .save ar.pfs, r8 + .spillreg ar.pfs, r8 alloc r8=ar.pfs,0,0,3,0 // get CFM0, EC0, and CPL0 into r8 ld8 out0=[base0],16 // load arg0 (signum) adds base1=(ARG1_OFF-(RBS_BASE_OFF+SIGCONTEXT_OFF)),base1 diff --git a/arch/ia64/kernel/head.S b/arch/ia64/kernel/head.S index 148078b..1d1e63e 100644 --- a/arch/ia64/kernel/head.S +++ b/arch/ia64/kernel/head.S @@ -127,23 +127,21 @@ start_ap: #ifdef CONFIG_SMP /* * Find the init_task for the currently booting CPU. At poweron, and in - * UP mode, cpucount is 0. + * UP mode, task_for_booting_cpu is NULL. */ - movl r3=cpucount + movl r3=task_for_booting_cpu ;; - ld4 r3=[r3] // r3 <- smp_processor_id() - movl r2=init_tasks + ld8 r3=[r3] + movl r2=init_thread_union ;; - shladd r2=r3,3,r2 + cmp.eq isBP,isAP=r3,r0 ;; - ld8 r2=[r2] +(isAP) mov r2=r3 #else - mov r3=0 - movl r2=init_task_union - ;; + movl r2=init_thread_union + cmp.eq isBP,isAP=r0,r0 #endif - cmp4.ne isAP,isBP=r3,r0 - ;; // RAW on r2 + ;; extr r3=r2,0,61 // r3 == phys addr of task struct mov r16=KERNEL_TR_PAGE_NUM ;; @@ -180,10 +178,12 @@ start_ap: .rodata alive_msg: stringz "I'm alive and well\n" +alive_msg_end: .previous alloc r2=ar.pfs,0,0,2,0 movl out0=alive_msg + movl out1=alive_msg_end-alive_msg-1 ;; br.call.sptk.many rp=early_printk 1: // force new bundle diff --git a/arch/ia64/kernel/ia64_ksyms.c b/arch/ia64/kernel/ia64_ksyms.c index 329125a..55b9faf 100644 --- a/arch/ia64/kernel/ia64_ksyms.c +++ b/arch/ia64/kernel/ia64_ksyms.c @@ -24,6 +24,7 @@ EXPORT_SYMBOL(strnlen); EXPORT_SYMBOL(strrchr); EXPORT_SYMBOL(strstr); EXPORT_SYMBOL(strtok); +EXPORT_SYMBOL(strpbrk); #include <linux/irq.h> EXPORT_SYMBOL(isa_irq_to_vector_map); diff --git a/arch/ia64/kernel/init_task.c b/arch/ia64/kernel/init_task.c index 3027ada1a..cdece82 100644 --- a/arch/ia64/kernel/init_task.c +++ b/arch/ia64/kernel/init_task.c @@ -2,8 +2,8 @@ * This is where we statically allocate and initialize the initial * task. 
* - * Copyright (C) 1999 Hewlett-Packard Co - * Copyright (C) 1999 David Mosberger-Tang <davidm@hpl.hp.com> + * Copyright (C) 1999, 2002 Hewlett-Packard Co + * David Mosberger-Tang <davidm@hpl.hp.com> */ #include <linux/init.h> @@ -22,10 +22,20 @@ struct mm_struct init_mm = INIT_MM(init_mm); /* * Initial task structure. * - * We need to make sure that this is page aligned due to the way - * process stacks are handled. This is done by having a special - * "init_task" linker map entry.. + * We need to make sure that this is properly aligned due to the way process stacks are + * handled. This is done by having a special ".data.init_task" section... */ -union task_union init_task_union - __attribute__((section("init_task"))) = - { INIT_TASK(init_task_union.task) }; +#define init_thread_info init_thread_union.s.thread_info + +union init_thread { + struct { + struct task_struct task; + struct thread_info thread_info; + } s; + unsigned long stack[KERNEL_STACK_SIZE/sizeof (unsigned long)]; +} init_thread_union __attribute__((section(".data.init_task"))) = {{ + task: INIT_TASK(init_thread_union.s.task), + thread_info: INIT_THREAD_INFO(init_thread_union.s.thread_info) +}}; + +asm (".global init_task; init_task = init_thread_union"); diff --git a/arch/ia64/kernel/iosapic.c b/arch/ia64/kernel/iosapic.c index 266526f..262fb32 100644 --- a/arch/ia64/kernel/iosapic.c +++ b/arch/ia64/kernel/iosapic.c @@ -3,8 +3,9 @@ * * Copyright (C) 1999 Intel Corp. * Copyright (C) 1999 Asit Mallick <asit.k.mallick@intel.com> - * Copyright (C) 1999-2000 Hewlett-Packard Co. - * Copyright (C) 1999-2000 David Mosberger-Tang <davidm@hpl.hp.com> + * Copyright (C) 2000-2002 J.I. Lee <jung-ik.lee@intel.com> + * Copyright (C) 1999-2000, 2002 Hewlett-Packard Co. + * David Mosberger-Tang <davidm@hpl.hp.com> * Copyright (C) 1999 VA Linux Systems * Copyright (C) 1999,2000 Walt Drummond <drummond@valinux.com> * @@ -15,6 +16,12 @@ * PCI to vector mapping, shared PCI interrupts. * 00/10/27 D. 
Mosberger Document things a bit more to make them more understandable. * Clean up much of the old IOSAPIC cruft. + * 01/07/27 J.I. Lee PCI irq routing, Platform/Legacy interrupts and fixes for + * ACPI S5(SoftOff) support. + * 02/01/23 J.I. Lee iosapic pgm fixes for PCI irq routing from _PRT + * 02/01/07 E. Focht <efocht@ess.nec.de> Redirectable interrupt vectors in + * iosapic_set_affinity(), initializations for + * /proc/irq/#/smp_affinity */ /* * Here is what the interrupt logic between a PCI device and the CPU looks like: @@ -63,6 +70,7 @@ #undef DEBUG_IRQ_ROUTING +#undef OVERRIDE_DEBUG static spinlock_t iosapic_lock = SPIN_LOCK_UNLOCKED; @@ -88,7 +96,7 @@ static struct iosapic_irq { * Translate IOSAPIC irq number to the corresponding IA-64 interrupt vector. If no * entry exists, return -1. */ -static int +int iosapic_irq_to_vector (int irq) { int vector; @@ -121,6 +129,7 @@ set_rte (unsigned int vector, unsigned long dest) u32 low32, high32; char *addr; int pin; + char redir; pin = iosapic_irq[vector].pin; if (pin < 0) @@ -131,6 +140,11 @@ set_rte (unsigned int vector, unsigned long dest) trigger = iosapic_irq[vector].trigger; dmode = iosapic_irq[vector].dmode; + redir = (dmode == IOSAPIC_LOWEST_PRIORITY) ? 1 : 0; +#ifdef CONFIG_SMP + set_irq_affinity_info(vector, (int)(dest & 0xffff), redir); +#endif + low32 = ((pol << IOSAPIC_POLARITY_SHIFT) | (trigger << IOSAPIC_TRIGGER_SHIFT) | (dmode << IOSAPIC_DELIVERY_SHIFT) | @@ -211,6 +225,7 @@ iosapic_set_affinity (unsigned int irq, unsigned long mask) u32 high32, low32; int dest, pin; char *addr; + int redir = (irq & (1<<31)) ? 
1 : 0; mask &= (1UL << smp_num_cpus) - 1; @@ -225,6 +240,8 @@ iosapic_set_affinity (unsigned int irq, unsigned long mask) if (pin < 0) return; /* not an IOSAPIC interrupt */ + set_irq_affinity_info(irq,dest,redir); + /* dest contains both id and eid */ high32 = dest << IOSAPIC_DEST_SHIFT; @@ -234,9 +251,13 @@ iosapic_set_affinity (unsigned int irq, unsigned long mask) writel(IOSAPIC_RTE_LOW(pin), addr + IOSAPIC_REG_SELECT); low32 = readl(addr + IOSAPIC_WINDOW); - /* change delivery mode to fixed */ low32 &= ~(7 << IOSAPIC_DELIVERY_SHIFT); - low32 |= (IOSAPIC_FIXED << IOSAPIC_DELIVERY_SHIFT); + if (redir) + /* change delivery mode to lowest priority */ + low32 |= (IOSAPIC_LOWEST_PRIORITY << IOSAPIC_DELIVERY_SHIFT); + else + /* change delivery mode to fixed */ + low32 |= (IOSAPIC_FIXED << IOSAPIC_DELIVERY_SHIFT); writel(IOSAPIC_RTE_HIGH(pin), addr + IOSAPIC_REG_SELECT); writel(high32, addr + IOSAPIC_WINDOW); @@ -343,29 +364,64 @@ iosapic_version (char *addr) } /* - * ACPI can describe IOSAPIC interrupts via static tables and namespace - * methods. This provides an interface to register those interrupts and - * program the IOSAPIC RTE. 
+ * if the given vector is already owned by other, + * assign a new vector for the other and make the vector available */ -int -iosapic_register_irq (u32 global_vector, unsigned long polarity, unsigned long - edge_triggered, u32 base_irq, char *iosapic_address) +static void +iosapic_reassign_vector (int vector) +{ + int new_vector; + + if (iosapic_irq[vector].pin >= 0 || iosapic_irq[vector].addr + || iosapic_irq[vector].base_irq || iosapic_irq[vector].dmode + || iosapic_irq[vector].polarity || iosapic_irq[vector].trigger) + { + new_vector = ia64_alloc_irq(); + printk("Reassigning Vector 0x%x to 0x%x\n", vector, new_vector); + memcpy (&iosapic_irq[new_vector], &iosapic_irq[vector], + sizeof(struct iosapic_irq)); + memset (&iosapic_irq[vector], 0, sizeof(struct iosapic_irq)); + iosapic_irq[vector].pin = -1; + } +} + +static void +register_irq (u32 global_vector, int vector, int pin, unsigned char delivery, + unsigned long polarity, unsigned long edge_triggered, + u32 base_irq, char *iosapic_address) { irq_desc_t *idesc; struct hw_interrupt_type *irq_type; - int vector; - vector = iosapic_irq_to_vector(global_vector); - if (vector < 0) - vector = ia64_alloc_irq(); - - /* fill in information from this vector's IOSAPIC */ - iosapic_irq[vector].addr = iosapic_address; - iosapic_irq[vector].base_irq = base_irq; - iosapic_irq[vector].pin = global_vector - iosapic_irq[vector].base_irq; + iosapic_irq[vector].pin = pin; iosapic_irq[vector].polarity = polarity ? IOSAPIC_POL_HIGH : IOSAPIC_POL_LOW; - iosapic_irq[vector].dmode = IOSAPIC_LOWEST_PRIORITY; + iosapic_irq[vector].dmode = delivery; + /* + * In override, it does not provide addr/base_irq. 
global_vector is enough to + * locate iosapic addr, base_irq and pin by examining base_irq and max_pin of + * registered iosapics (tbd) + */ +#ifndef OVERRIDE_DEBUG + if (iosapic_address) { + iosapic_irq[vector].addr = iosapic_address; + iosapic_irq[vector].base_irq = base_irq; + } +#else + if (iosapic_address) { + if (iosapic_irq[vector].addr && (iosapic_irq[vector].addr != iosapic_address)) + printk("WARN: register_irq: diff IOSAPIC ADDRESS for gv %x, v %x\n", + global_vector, vector); + iosapic_irq[vector].addr = iosapic_address; + if (iosapic_irq[vector].base_irq && (iosapic_irq[vector].base_irq != base_irq)) { + printk("WARN: register_irq: diff BASE IRQ %x for gv %x, v %x\n", + base_irq, global_vector, vector); + } + iosapic_irq[vector].base_irq = base_irq; + } else if (!iosapic_irq[vector].addr) + printk("WARN: register_irq: invalid override for gv %x, v %x\n", + global_vector, vector); +#endif if (edge_triggered) { iosapic_irq[vector].trigger = IOSAPIC_EDGE; irq_type = &irq_type_iosapic_edge; @@ -377,12 +433,32 @@ iosapic_register_irq (u32 global_vector, unsigned long polarity, unsigned long idesc = irq_desc(vector); if (idesc->handler != irq_type) { if (idesc->handler != &no_irq_type) - printk("iosapic_register_irq(): changing vector 0x%02x from" + printk("register_irq(): changing vector 0x%02x from " "%s to %s\n", vector, idesc->handler->typename, irq_type->typename); idesc->handler = irq_type; } +} + +/* + * ACPI can describe IOSAPIC interrupts via static tables and namespace + * methods. This provides an interface to register those interrupts and + * program the IOSAPIC RTE. 
+ */ +int +iosapic_register_irq (u32 global_vector, unsigned long polarity, unsigned long + edge_triggered, u32 base_irq, char *iosapic_address) +{ + int vector; - printk("IOSAPIC %x(%s,%s) -> Vector %x\n", global_vector, + vector = iosapic_irq_to_vector(global_vector); + if (vector < 0) + vector = ia64_alloc_irq(); + + register_irq (global_vector, vector, global_vector - base_irq, + IOSAPIC_LOWEST_PRIORITY, polarity, edge_triggered, + base_irq, iosapic_address); + + printk("IOSAPIC 0x%x(%s,%s) -> Vector 0x%x\n", global_vector, (polarity ? "high" : "low"), (edge_triggered ? "edge" : "level"), vector); /* program the IOSAPIC routing table */ @@ -395,51 +471,40 @@ iosapic_register_irq (u32 global_vector, unsigned long polarity, unsigned long * Note that the irq_base and IOSAPIC address must be set in iosapic_init(). */ int -iosapic_register_platform_irq (u32 int_type, u32 global_vector, u32 iosapic_vector, - u16 eid, u16 id, unsigned long polarity, +iosapic_register_platform_irq (u32 int_type, u32 global_vector, + u32 iosapic_vector, u16 eid, u16 id, unsigned long polarity, unsigned long edge_triggered, u32 base_irq, char *iosapic_address) { - struct hw_interrupt_type *irq_type; - irq_desc_t *idesc; + unsigned char delivery; int vector; switch (int_type) { - case ACPI20_ENTRY_PIS_CPEI: + case ACPI20_ENTRY_PIS_PMI: + vector = iosapic_vector; + /* + * since PMI vector is alloc'd by FW(ACPI) not by kernel, + * we need to make sure the vector is available + */ + iosapic_reassign_vector(vector); + delivery = IOSAPIC_PMI; + break; + case ACPI20_ENTRY_PIS_CPEI: vector = IA64_PCE_VECTOR; - iosapic_irq[vector].dmode = IOSAPIC_LOWEST_PRIORITY; + delivery = IOSAPIC_LOWEST_PRIORITY; break; - case ACPI20_ENTRY_PIS_INIT: + case ACPI20_ENTRY_PIS_INIT: vector = ia64_alloc_irq(); - iosapic_irq[vector].dmode = IOSAPIC_INIT; + delivery = IOSAPIC_INIT; break; - default: + default: printk("iosapic_register_platform_irq(): invalid int type\n"); return -1; } - /* fill in information from 
this vector's IOSAPIC */ - iosapic_irq[vector].addr = iosapic_address; - iosapic_irq[vector].base_irq = base_irq; - iosapic_irq[vector].pin = global_vector - iosapic_irq[vector].base_irq; - iosapic_irq[vector].polarity = polarity ? IOSAPIC_POL_HIGH : IOSAPIC_POL_LOW; - - if (edge_triggered) { - iosapic_irq[vector].trigger = IOSAPIC_EDGE; - irq_type = &irq_type_iosapic_edge; - } else { - iosapic_irq[vector].trigger = IOSAPIC_LEVEL; - irq_type = &irq_type_iosapic_level; - } - - idesc = irq_desc(vector); - if (idesc->handler != irq_type) { - if (idesc->handler != &no_irq_type) - printk("iosapic_register_platform_irq(): changing vector 0x%02x from" - "%s to %s\n", vector, idesc->handler->typename, irq_type->typename); - idesc->handler = irq_type; - } + register_irq(global_vector, vector, global_vector - base_irq, delivery, polarity, + edge_triggered, base_irq, iosapic_address); - printk("PLATFORM int %x: IOSAPIC %x(%s,%s) -> Vector %x CPU %.02u:%.02u\n", + printk("PLATFORM int 0x%x: IOSAPIC 0x%x(%s,%s) -> Vector 0x%x CPU %.02u:%.02u\n", int_type, global_vector, (polarity ? "high" : "low"), (edge_triggered ? "edge" : "level"), vector, eid, id); @@ -450,15 +515,18 @@ iosapic_register_platform_irq (u32 int_type, u32 global_vector, u32 iosapic_vect /* - * ACPI calls this when it finds an entry for a legacy ISA interrupt. Note that the - * irq_base and IOSAPIC address must be set in iosapic_init(). + * ACPI calls this when it finds an entry for a legacy ISA interrupt. + * Note that the irq_base and IOSAPIC address must be set in iosapic_init(). 
*/ void iosapic_register_legacy_irq (unsigned long irq, unsigned long pin, unsigned long polarity, unsigned long edge_triggered) { - unsigned int vector = isa_irq_to_vector(irq); + int vector = isa_irq_to_vector(irq); + + register_irq(irq, vector, (int)pin, IOSAPIC_LOWEST_PRIORITY, polarity, edge_triggered, + 0, NULL); /* ignored for override */ #ifdef DEBUG_IRQ_ROUTING printk("ISA: IRQ %u -> IOSAPIC irq 0x%02x (%s, %s) -> vector %02x\n", @@ -467,18 +535,14 @@ iosapic_register_legacy_irq (unsigned long irq, vector); #endif - iosapic_irq[vector].pin = pin; - iosapic_irq[vector].dmode = IOSAPIC_LOWEST_PRIORITY; - iosapic_irq[vector].polarity = polarity ? IOSAPIC_POL_HIGH : IOSAPIC_POL_LOW; - iosapic_irq[vector].trigger = edge_triggered ? IOSAPIC_EDGE : IOSAPIC_LEVEL; + /* program the IOSAPIC routing table */ + set_rte(vector, (ia64_get_lid() >> 16) & 0xffff); } void __init iosapic_init (unsigned long phys_addr, unsigned int base_irq, int pcat_compat) { - struct hw_interrupt_type *irq_type; - int i, irq, max_pin, vector; - irq_desc_t *idesc; + int i, irq, max_pin, vector, pin; unsigned int ver; char *addr; static int first_time = 1; @@ -496,7 +560,6 @@ iosapic_init (unsigned long phys_addr, unsigned int base_irq, int pcat_compat) } addr = ioremap(phys_addr, 0); - ver = iosapic_version(addr); max_pin = (ver >> 16) & 0xff; @@ -511,27 +574,18 @@ iosapic_init (unsigned long phys_addr, unsigned int base_irq, int pcat_compat) */ for (irq = 0; irq < 16; ++irq) { vector = isa_irq_to_vector(irq); - iosapic_irq[vector].addr = addr; - iosapic_irq[vector].base_irq = 0; - if (iosapic_irq[vector].pin == -1) - iosapic_irq[vector].pin = irq; - iosapic_irq[vector].dmode = IOSAPIC_LOWEST_PRIORITY; - iosapic_irq[vector].trigger = IOSAPIC_EDGE; - iosapic_irq[vector].polarity = IOSAPIC_POL_HIGH; + if ((pin = iosapic_irq[vector].pin) == -1) + pin = irq; + + register_irq(irq, vector, pin, + /* IOSAPIC_POL_HIGH, IOSAPIC_EDGE */ + IOSAPIC_LOWEST_PRIORITY, 1, 1, base_irq, addr); + #ifdef 
DEBUG_IRQ_ROUTING printk("ISA: IRQ %u -> IOSAPIC irq 0x%02x (high, edge) -> vector 0x%02x\n", irq, iosapic_irq[vector].base_irq + iosapic_irq[vector].pin, vector); #endif - irq_type = &irq_type_iosapic_edge; - idesc = irq_desc(vector); - if (idesc->handler != irq_type) { - if (idesc->handler != &no_irq_type) - printk("iosapic_init: changing vector 0x%02x from %s to " - "%s\n", irq, idesc->handler->typename, - irq_type->typename); - idesc->handler = irq_type; - } /* program the IOSAPIC routing table: */ set_rte(vector, (ia64_get_lid() >> 16) & 0xffff); @@ -540,7 +594,7 @@ iosapic_init (unsigned long phys_addr, unsigned int base_irq, int pcat_compat) for (i = 0; i < pci_irq.num_routes; i++) { irq = pci_irq.route[i].irq; - if ((unsigned) (irq - base_irq) > max_pin) + if ((irq < (int)base_irq) || (irq > (int)(base_irq + max_pin))) /* the interrupt route is for another controller... */ continue; @@ -553,29 +607,18 @@ iosapic_init (unsigned long phys_addr, unsigned int base_irq, int pcat_compat) vector = ia64_alloc_irq(); } - iosapic_irq[vector].addr = addr; - iosapic_irq[vector].base_irq = base_irq; - iosapic_irq[vector].pin = (irq - base_irq); - iosapic_irq[vector].dmode = IOSAPIC_LOWEST_PRIORITY; - iosapic_irq[vector].trigger = IOSAPIC_LEVEL; - iosapic_irq[vector].polarity = IOSAPIC_POL_LOW; + register_irq(irq, vector, irq - base_irq, + /* IOSAPIC_POL_LOW, IOSAPIC_LEVEL */ + IOSAPIC_LOWEST_PRIORITY, 0, 0, base_irq, addr); # ifdef DEBUG_IRQ_ROUTING printk("PCI: (B%d,I%d,P%d) -> IOSAPIC irq 0x%02x -> vector 0x%02x\n", pci_irq.route[i].bus, pci_irq.route[i].pci_id>>16, pci_irq.route[i].pin, iosapic_irq[vector].base_irq + iosapic_irq[vector].pin, vector); # endif - irq_type = &irq_type_iosapic_level; - idesc = irq_desc(vector); - if (idesc->handler != irq_type){ - if (idesc->handler != &no_irq_type) - printk("iosapic_init: changing vector 0x%02x from %s to %s\n", - vector, idesc->handler->typename, irq_type->typename); - idesc->handler = irq_type; - } - /* program the 
IOSAPIC routing table: */ - set_rte(vector, (ia64_get_lid() >> 16) & 0xffff); + /* program the IOSAPIC routing table: */ + set_rte(vector, (ia64_get_lid() >> 16) & 0xffff); } } @@ -585,6 +628,8 @@ iosapic_pci_fixup (int phase) struct pci_dev *dev; unsigned char pin; int vector; + struct hw_interrupt_type *irq_type; + irq_desc_t *idesc; if (phase != 1) return; @@ -611,19 +656,28 @@ iosapic_pci_fixup (int phase) if (vector >= 0) printk(KERN_WARNING "PCI: using PPB(B%d,I%d,P%d) to get vector %02x\n", - bridge->bus->number, PCI_SLOT(bridge->devfn), + dev->bus->number, PCI_SLOT(dev->devfn), pin, vector); else printk(KERN_WARNING - "PCI: Couldn't map irq for (B%d,I%d,P%d)o\n", - bridge->bus->number, PCI_SLOT(bridge->devfn), - pin); + "PCI: Couldn't map irq for (B%d,I%d,P%d)\n", + dev->bus->number, PCI_SLOT(dev->devfn), pin); } if (vector >= 0) { printk("PCI->APIC IRQ transform: (B%d,I%d,P%d) -> 0x%02x\n", dev->bus->number, PCI_SLOT(dev->devfn), pin, vector); dev->irq = vector; + irq_type = &irq_type_iosapic_level; + idesc = irq_desc(vector); + if (idesc->handler != irq_type){ + if (idesc->handler != &no_irq_type) + printk("iosapic_pci_fixup: changing vector 0x%02x from " + "%s to %s\n", vector, + idesc->handler->typename, + irq_type->typename); + idesc->handler = irq_type; + } #ifdef CONFIG_SMP /* * For platforms that do not support interrupt redirect @@ -638,7 +692,16 @@ iosapic_pci_fixup (int phase) cpu_index++; if (cpu_index >= smp_num_cpus) cpu_index = 0; + } else { + /* + * Direct the interrupt vector to the current cpu, + * platform redirection will distribute them. 
+ */ + set_rte(vector, (ia64_get_lid() >> 16) & 0xffff); } +#else + /* direct the interrupt vector to the running cpu id */ + set_rte(vector, (ia64_get_lid() >> 16) & 0xffff); #endif } } diff --git a/arch/ia64/kernel/irq.c b/arch/ia64/kernel/irq.c index 5c8d801..e83200d 100644 --- a/arch/ia64/kernel/irq.c +++ b/arch/ia64/kernel/irq.c @@ -161,7 +161,7 @@ int show_interrupts(struct seq_file *p, void *v) for (action=action->next; action; action = action->next) seq_printf(p, ", %s", action->name); - seq_putc('\n'); + seq_putc(p, '\n'); } seq_puts(p, "NMI: "); for (j = 0; j < smp_num_cpus; j++) @@ -287,10 +287,11 @@ static inline void wait_on_irq(void) * already executing in one.. */ if (!irqs_running()) - if (local_bh_count() || !spin_is_locked(&global_bh_lock)) + if (really_local_bh_count() || !spin_is_locked(&global_bh_lock)) break; /* Duh, we have to loop. Release the lock to avoid deadlocks */ + smp_mb__before_clear_bit(); /* need barrier before releasing lock... */ clear_bit(0,&global_irq_lock); for (;;) { @@ -305,7 +306,7 @@ static inline void wait_on_irq(void) continue; if (global_irq_lock) continue; - if (!local_bh_count() && spin_is_locked(&global_bh_lock)) + if (!really_local_bh_count() && spin_is_locked(&global_bh_lock)) continue; if (!test_and_set_bit(0,&global_irq_lock)) break; @@ -378,14 +379,14 @@ void __global_cli(void) __save_flags(flags); if (flags & IA64_PSR_I) { __cli(); - if (!local_irq_count()) + if (!really_local_irq_count()) get_irqlock(); } #else __save_flags(flags); if (flags & (1 << EFLAGS_IF_SHIFT)) { __cli(); - if (!local_irq_count()) + if (!really_local_irq_count()) get_irqlock(); } #endif @@ -393,7 +394,7 @@ void __global_cli(void) void __global_sti(void) { - if (!local_irq_count()) + if (!really_local_irq_count()) release_irqlock(smp_processor_id()); __sti(); } @@ -422,7 +423,7 @@ unsigned long __global_save_flags(void) retval = 2 + local_enabled; /* check for global flags if we're not in an interrupt */ - if (!local_irq_count()) { + if 
(!really_local_irq_count()) { if (local_enabled) retval = 1; if (global_irq_holder == cpu) @@ -529,7 +530,7 @@ void disable_irq(unsigned int irq) disable_irq_nosync(irq); #ifdef CONFIG_SMP - if (!local_irq_count()) { + if (!really_local_irq_count()) { do { barrier(); } while (irq_desc(irq)->status & IRQ_INPROGRESS); @@ -1009,6 +1010,11 @@ int setup_irq(unsigned int irq, struct irqaction * new) rand_initialize_irq(irq); } + if (new->flags & SA_PERCPU_IRQ) { + desc->status |= IRQ_PER_CPU; + desc->handler = &irq_type_ia64_lsapic; + } + /* * The following block of code has to be executed atomically */ @@ -1089,13 +1095,25 @@ out: static struct proc_dir_entry * smp_affinity_entry [NR_IRQS]; static unsigned long irq_affinity [NR_IRQS] = { [0 ... NR_IRQS-1] = ~0UL }; +static char irq_redir [NR_IRQS]; // = { [0 ... NR_IRQS-1] = 1 }; + +void set_irq_affinity_info(int irq, int hwid, int redir) +{ + unsigned long mask = 1UL<<cpu_logical_id(hwid); + + if (irq >= 0 && irq < NR_IRQS) { + irq_affinity[irq] = mask; + irq_redir[irq] = (char) (redir & 0xff); + } +} static int irq_affinity_read_proc (char *page, char **start, off_t off, int count, int *eof, void *data) { - if (count < HEX_DIGITS+1) + if (count < HEX_DIGITS+3) return -EINVAL; - return sprintf (page, "%08lx\n", irq_affinity[(long)data]); + return sprintf (page, "%s%08lx\n", irq_redir[(long)data] ? 
"r " : "", + irq_affinity[(long)data]); } static int irq_affinity_write_proc (struct file *file, const char *buffer, @@ -1103,11 +1121,20 @@ static int irq_affinity_write_proc (struct file *file, const char *buffer, { int irq = (long) data, full_count = count, err; unsigned long new_value; + const char *buf = buffer; + int redir; if (!irq_desc(irq)->handler->set_affinity) return -EIO; - err = parse_hex_value(buffer, count, &new_value); + if (buf[0] == 'r' || buf[0] == 'R') { + ++buf; + while (*buf == ' ') ++buf; + redir = 1; + } else + redir = 0; + + err = parse_hex_value(buf, count, &new_value); /* * Do not allow disabling IRQs completely - it's a too easy @@ -1117,8 +1144,7 @@ static int irq_affinity_write_proc (struct file *file, const char *buffer, if (!(new_value & cpu_online_map)) return -EINVAL; - irq_affinity[irq] = new_value; - irq_desc(irq)->handler->set_affinity(irq, new_value); + irq_desc(irq)->handler->set_affinity(irq | (redir?(1<<31):0), new_value); return full_count; } diff --git a/arch/ia64/kernel/ivt.S b/arch/ia64/kernel/ivt.S index 957cd80..7cdd0e1 100644 --- a/arch/ia64/kernel/ivt.S +++ b/arch/ia64/kernel/ivt.S @@ -43,6 +43,7 @@ #include <asm/processor.h> #include <asm/ptrace.h> #include <asm/system.h> +#include <asm/thread_info.h> #include <asm/unistd.h> #if 1 @@ -275,6 +276,7 @@ ENTRY(alt_itlb_miss) mov r16=cr.ifa // get address that caused the TLB miss movl r17=PAGE_KERNEL mov r21=cr.ipsr + movl r19=(((1 << IA64_MAX_PHYS_BITS) - 1) & ~0xfff) mov r31=pr ;; #ifdef CONFIG_DISABLE_VHPT @@ -289,12 +291,12 @@ ENTRY(alt_itlb_miss) (p8) br.cond.dptk itlb_fault #endif extr.u r23=r21,IA64_PSR_CPL0_BIT,2 // extract psr.cpl + and r19=r19,r16 // clear ed, reserved bits, and PTE control bits shr.u r18=r16,57 // move address bit 61 to bit 4 - dep r19=0,r16,IA64_MAX_PHYS_BITS,(64-IA64_MAX_PHYS_BITS) // clear ed & reserved bits ;; andcm r18=0x10,r18 // bit 4=~address-bit(61) cmp.ne p8,p0=r0,r23 // psr.cpl != 0? 
- dep r19=r17,r19,0,12 // insert PTE control bits into r19 + or r19=r17,r19 // insert PTE control bits into r19 ;; or r19=r19,r18 // set bit 4 (uncached) if the access was to region 6 (p8) br.cond.spnt page_fault @@ -312,6 +314,7 @@ ENTRY(alt_dtlb_miss) mov r16=cr.ifa // get address that caused the TLB miss movl r17=PAGE_KERNEL mov r20=cr.isr + movl r19=(((1 << IA64_MAX_PHYS_BITS) - 1) & ~0xfff) mov r21=cr.ipsr mov r31=pr ;; @@ -328,15 +331,15 @@ ENTRY(alt_dtlb_miss) #endif extr.u r23=r21,IA64_PSR_CPL0_BIT,2 // extract psr.cpl tbit.nz p6,p7=r20,IA64_ISR_SP_BIT // is speculation bit on? + and r19=r19,r16 // clear ed, reserved bits, and PTE control bits shr.u r18=r16,57 // move address bit 61 to bit 4 - dep r19=0,r16,IA64_MAX_PHYS_BITS,(64-IA64_MAX_PHYS_BITS) // clear ed & reserved bits ;; andcm r18=0x10,r18 // bit 4=~address-bit(61) cmp.ne p8,p0=r0,r23 (p8) br.cond.spnt page_fault dep r21=-1,r21,IA64_PSR_ED_BIT,1 - dep r19=r17,r19,0,12 // insert PTE control bits into r19 + or r19=r19,r17 // insert PTE control bits into r19 ;; or r19=r19,r18 // set bit 4 (uncached) if the access was to region 6 (p6) mov cr.ipsr=r21 @@ -654,16 +657,16 @@ ENTRY(break_fault) ld8 r16=[r16] // load address of syscall entry point mov rp=r15 // set the real return addr ;; - ld8 r2=[r2] // r2 = current->ptrace mov b6=r16 // arrange things so we skip over break instruction when returning: adds r16=16,sp // get pointer to cr_ipsr adds r17=24,sp // get pointer to cr_iip + add r2=TI_FLAGS+IA64_TASK_SIZE,r13 ;; ld8 r18=[r16] // fetch cr_ipsr - tbit.z p8,p0=r2,PT_TRACESYS_BIT // (current->ptrace & PF_TRACESYS) == 0? 
+ ld4 r2=[r2] // r2 = current_thread_info()->flags ;; ld8 r19=[r17] // fetch cr_iip extr.u r20=r18,41,2 // extract ei field @@ -676,6 +679,7 @@ ENTRY(break_fault) ;; (p6) st8 [r17]=r19 // store new cr.iip if cr.isr.ei wrapped around dep r18=r20,r18,41,2 // insert new ei into cr.isr + tbit.z p8,p0=r2,TIF_SYSCALL_TRACE ;; st8 [r16]=r18 // store new value for cr.isr @@ -855,16 +859,16 @@ ENTRY(dispatch_to_ia32_handler) ld4 out5=[r14],8 // r13 == ebp ;; ld4 out3=[r14],8 // r14 == esi - adds r2=IA64_TASK_PTRACE_OFFSET,r13 // r2 = &current->ptrace + adds r2=TI_FLAGS+IA64_TASK_SIZE,r13 ;; ld4 out4=[r14] // r15 == edi movl r16=ia32_syscall_table ;; (p6) shladd r16=r8,3,r16 // force ni_syscall if not valid syscall number - ld8 r2=[r2] // r2 = current->ptrace + ld4 r2=[r2] // r2 = current_thread_info()->flags ;; ld8 r16=[r16] - tbit.z p8,p0=r2,PT_TRACESYS_BIT // (current->ptrace & PT_TRACESYS) == 0? + tbit.z p8,p0=r2,TIF_SYSCALL_TRACE ;; mov b6=r16 movl r15=ia32_ret_from_syscall diff --git a/arch/ia64/kernel/mca.c b/arch/ia64/kernel/mca.c index f090da7..da1cf74 100644 --- a/arch/ia64/kernel/mca.c +++ b/arch/ia64/kernel/mca.c @@ -3,6 +3,9 @@ * Purpose: Generic MCA handling layer * * Updated for latest kernel + * Copyright (C) 2002 Intel + * Copyright (C) Jenna Hall (jenna.s.hall@intel.com) + * * Copyright (C) 2001 Intel * Copyright (C) Fred Lewis (frederick.v.lewis@intel.com) * @@ -12,6 +15,11 @@ * Copyright (C) 1999 Silicon Graphics, Inc. * Copyright (C) Vijay Chander(vijay@engr.sgi.com) * + * 02/01/04 J. Hall Aligned MCA stack to 16 bytes, added platform vs. CPU + * error flag, set SAL default return values, changed + * error record structure to linked list, added init call + * to sal_get_state_info_size(). + * * 01/01/03 F.
Lewis Added setup of CMCI and CPEI IRQs, logging of corrected * platform errors, completed code for logging of * corrected & uncorrected machine check errors, and @@ -27,6 +35,7 @@ #include <linux/interrupt.h> #include <linux/irq.h> #include <linux/smp_lock.h> +#include <linux/bootmem.h> #include <asm/machvec.h> #include <asm/page.h> @@ -50,18 +59,22 @@ ia64_mc_info_t ia64_mc_info; ia64_mca_sal_to_os_state_t ia64_sal_to_os_handoff_state; ia64_mca_os_to_sal_state_t ia64_os_to_sal_handoff_state; u64 ia64_mca_proc_state_dump[512]; -u64 ia64_mca_stack[1024]; +u64 ia64_mca_stack[1024] __attribute__((aligned(16))); u64 ia64_mca_stackframe[32]; u64 ia64_mca_bspstore[1024]; -u64 ia64_init_stack[INIT_TASK_SIZE] __attribute__((aligned(16))); +u64 ia64_init_stack[KERNEL_STACK_SIZE] __attribute__((aligned(16))); +u64 ia64_mca_sal_data_area[1356]; +u64 ia64_mca_min_state_save_info; +u64 ia64_tlb_functional; +u64 ia64_os_mca_recovery_successful; static void ia64_mca_wakeup_ipi_wait(void); static void ia64_mca_wakeup(int cpu); static void ia64_mca_wakeup_all(void); static void ia64_log_init(int); -extern void ia64_monarch_init_handler (void); -extern void ia64_slave_init_handler (void); -extern struct hw_interrupt_type irq_type_iosapic_level; +extern void ia64_monarch_init_handler (void); +extern void ia64_slave_init_handler (void); +extern struct hw_interrupt_type irq_type_iosapic_level; static struct irqaction cmci_irqaction = { handler: ia64_mca_cmc_int_handler, @@ -95,25 +108,31 @@ static struct irqaction mca_cpe_irqaction = { * memory. 
* * Inputs : sal_info_type (Type of error record MCA/CMC/CPE/INIT) - * Outputs : None + * Outputs : platform error status */ -void +int ia64_mca_log_sal_error_record(int sal_info_type) { + int platform_err = 0; + /* Get the MCA error record */ if (!ia64_log_get(sal_info_type, (prfunc_t)printk)) - return; // no record retrieved + return platform_err; // no record retrieved - /* Log the error record */ - ia64_log_print(sal_info_type, (prfunc_t)printk); + /* TODO: + * 1. analyze error logs to determine recoverability + * 2. perform error recovery procedures, if applicable + * 3. set ia64_os_mca_recovery_successful flag, if applicable + */ - /* Clear the CMC SAL logs now that they have been logged */ + platform_err = ia64_log_print(sal_info_type, (prfunc_t)printk); ia64_sal_clear_state_info(sal_info_type); + + return platform_err; } /* - * hack for now, add platform dependent handlers - * here + * platform dependent error handling */ #ifndef PLATFORM_MCA_HANDLERS void @@ -275,8 +294,8 @@ ia64_mca_cmc_vector_setup (void) cmcv_reg_t cmcv; cmcv.cmcv_regval = 0; - cmcv.cmcv_mask = 0; /* Unmask/enable interrupt */ - cmcv.cmcv_vector = IA64_CMC_VECTOR; + cmcv.cmcv_mask = 0; /* Unmask/enable interrupt */ + cmcv.cmcv_vector = IA64_CMC_VECTOR; ia64_set_cmcv(cmcv.cmcv_regval); IA64_MCA_DEBUG("ia64_mca_platform_init: CPU %d corrected " @@ -374,6 +393,9 @@ ia64_mca_init(void) IA64_MCA_DEBUG("ia64_mca_init: begin\n"); + /* initialize recovery success indicator */ + ia64_os_mca_recovery_successful = 0; + /* Clear the Rendez checkin flag for all cpus */ for(i = 0 ; i < NR_CPUS; i++) ia64_mc_info.imi_rendez_checkin[i] = IA64_MCA_RENDEZ_CHECKIN_NOTDONE; @@ -459,7 +481,7 @@ ia64_mca_init(void) /* * Configure the CMCI vector and handler. Interrupts for CMC are - * per-processor, so AP CMC interrupts are setup in smp_callin() (smp.c). + * per-processor, so AP CMC interrupts are setup in smp_callin() (smpboot.c). 
*/ register_percpu_irq(IA64_CMC_VECTOR, &cmci_irqaction); ia64_mca_cmc_vector_setup(); /* Setup vector on BSP & enable */ @@ -498,6 +520,9 @@ ia64_mca_init(void) ia64_log_init(SAL_INFO_TYPE_CMC); ia64_log_init(SAL_INFO_TYPE_CPE); + /* Zero the min state save info */ + ia64_mca_min_state_save_info = 0; + #if defined(MCA_TEST) mca_test(); #endif /* #if defined(MCA_TEST) */ @@ -576,7 +601,7 @@ ia64_mca_wakeup_all(void) int cpu; /* Clear the Rendez checkin flag for all cpus */ - for(cpu = 0 ; cpu < smp_num_cpus; cpu++) + for(cpu = 0; cpu < smp_num_cpus; cpu++) if (ia64_mc_info.imi_rendez_checkin[cpu] == IA64_MCA_RENDEZ_CHECKIN_DONE) ia64_mca_wakeup(cpu); @@ -668,6 +693,13 @@ ia64_return_to_sal_check(void) /* Cold Boot for uncorrectable MCA */ ia64_os_to_sal_handoff_state.imots_os_status = IA64_MCA_COLD_BOOT; + + /* Default = tell SAL to return to same context */ + ia64_os_to_sal_handoff_state.imots_context = IA64_MCA_SAME_CONTEXT; + + /* Register pointer to new min state values */ + /* NOTE: need to do something with this during recovery phase */ + ia64_os_to_sal_handoff_state.imots_new_min_state = &ia64_mca_min_state_save_info; } /* @@ -678,10 +710,10 @@ ia64_return_to_sal_check(void) * This is the place where the core of OS MCA handling is done. * Right now the logs are extracted and displayed in a well-defined * format. This handler code is supposed to be run only on the - * monarch processor. Once the monarch is done with MCA handling + * monarch processor. Once the monarch is done with MCA handling * further MCA logging is enabled by clearing logs. * Monarch also has the duty of sending wakeup-IPIs to pull the - * slave processors out of rendezvous spinloop. + * slave processors out of rendezvous spinloop. * * Inputs : None * Outputs : None @@ -689,20 +721,16 @@ ia64_return_to_sal_check(void) void ia64_mca_ucmc_handler(void) { -#if 0 /* stubbed out @FVL */ - /* - * Attempting to log a DBE error Causes "reserved register/field panic" - * in printk. 
- */ + int platform_err = 0; /* Get the MCA error record and log it */ - ia64_mca_log_sal_error_record(SAL_INFO_TYPE_MCA); -#endif /* stubbed out @FVL */ + platform_err = ia64_mca_log_sal_error_record(SAL_INFO_TYPE_MCA); /* * Do Platform-specific mca error handling if required. */ - mca_handler_platform() ; + if (platform_err) + mca_handler_platform(); /* * Wakeup all the processors which are spinning in the rendezvous @@ -749,13 +777,16 @@ typedef struct ia64_state_log_s { spinlock_t isl_lock; int isl_index; - ia64_err_rec_t isl_log[IA64_MAX_LOGS]; /* need space to store header + error log */ + ia64_err_rec_t *isl_log[IA64_MAX_LOGS]; /* need space to store header + error log */ } ia64_state_log_t; static ia64_state_log_t ia64_state_log[IA64_MAX_LOG_TYPES]; -/* Note: Some of these macros assume IA64_MAX_LOGS is always 2. Should be */ -/* fixed. @FVL */ +#define IA64_LOG_ALLOCATE(it, size) \ + {ia64_state_log[it].isl_log[IA64_LOG_CURR_INDEX(it)] = \ + (ia64_err_rec_t *)alloc_bootmem(size); \ + ia64_state_log[it].isl_log[IA64_LOG_NEXT_INDEX(it)] = \ + (ia64_err_rec_t *)alloc_bootmem(size);} #define IA64_LOG_LOCK_INIT(it) spin_lock_init(&ia64_state_log[it].isl_lock) #define IA64_LOG_LOCK(it) spin_lock_irqsave(&ia64_state_log[it].isl_lock, s) #define IA64_LOG_UNLOCK(it) spin_unlock_irqrestore(&ia64_state_log[it].isl_lock,s) @@ -765,13 +796,13 @@ static ia64_state_log_t ia64_state_log[IA64_MAX_LOG_TYPES]; ia64_state_log[it].isl_index = 1 - ia64_state_log[it].isl_index #define IA64_LOG_INDEX_DEC(it) \ ia64_state_log[it].isl_index = 1 - ia64_state_log[it].isl_index -#define IA64_LOG_NEXT_BUFFER(it) (void *)(&(ia64_state_log[it].isl_log[IA64_LOG_NEXT_INDEX(it)])) -#define IA64_LOG_CURR_BUFFER(it) (void *)(&(ia64_state_log[it].isl_log[IA64_LOG_CURR_INDEX(it)])) +#define IA64_LOG_NEXT_BUFFER(it) (void *)((ia64_state_log[it].isl_log[IA64_LOG_NEXT_INDEX(it)])) +#define IA64_LOG_CURR_BUFFER(it) (void *)((ia64_state_log[it].isl_log[IA64_LOG_CURR_INDEX(it)])) /* * C portion of 
the OS INIT handler * - * Called from ia64_<monarch/slave>_init_handler + * Called from ia64_monarch_init_handler * * Inputs: pointer to pt_regs where processor info was saved. * @@ -885,10 +916,18 @@ ia64_log_prt_section_header (sal_log_section_hdr_t *sh, prfunc_t prfunc) void ia64_log_init(int sal_info_type) { - IA64_LOG_LOCK_INIT(sal_info_type); + u64 max_size = 0; + IA64_LOG_NEXT_INDEX(sal_info_type) = 0; - memset(IA64_LOG_NEXT_BUFFER(sal_info_type), 0, - sizeof(ia64_err_rec_t) * IA64_MAX_LOGS); + IA64_LOG_LOCK_INIT(sal_info_type); + + // SAL will tell us the maximum size of any error record of this type + max_size = ia64_sal_get_state_info_size(sal_info_type); + + // set up OS data structures to hold error info + IA64_LOG_ALLOCATE(sal_info_type, max_size); + memset(IA64_LOG_CURR_BUFFER(sal_info_type), 0, max_size); + memset(IA64_LOG_NEXT_BUFFER(sal_info_type), 0, max_size); } /* @@ -923,8 +962,7 @@ ia64_log_get(int sal_info_type, prfunc_t prfunc) return total_len; } else { IA64_LOG_UNLOCK(sal_info_type); - prfunc("ia64_log_get: Failed to retrieve SAL error record type %d\n", - sal_info_type); + prfunc("ia64_log_get: No SAL error record available for type %d\n", sal_info_type); return 0; } } @@ -1268,7 +1306,7 @@ ia64_log_mem_dev_err_info_print (sal_log_mem_dev_err_info_t *mdei, } if (mdei->valid.oem_data) { - ia64_log_prt_oem_data((int)mdei->header.len, + platform_mem_dev_err_print((int)mdei->header.len, (int)sizeof(sal_log_mem_dev_err_info_t) - 1, &(mdei->oem_data[0]), prfunc); } @@ -1357,7 +1395,7 @@ ia64_log_pci_bus_err_info_print (sal_log_pci_bus_err_info_t *pbei, prfunc("\n"); if (pbei->valid.oem_data) { - ia64_log_prt_oem_data((int)pbei->header.len, + platform_pci_bus_err_print((int)pbei->header.len, (int)sizeof(sal_log_pci_bus_err_info_t) - 1, &(pbei->oem_data[0]), prfunc); } @@ -1456,7 +1494,7 @@ ia64_log_pci_comp_err_info_print(sal_log_pci_comp_err_info_t *pcei, } } if (pcei->valid.oem_data) { - ia64_log_prt_oem_data((int)pcei->header.len, n_pci_data, 
+ platform_pci_comp_err_print((int)pcei->header.len, n_pci_data, p_oem_data, prfunc); prfunc("\n"); } @@ -1485,7 +1523,7 @@ ia64_log_plat_specific_err_info_print (sal_log_plat_specific_err_info_t *psei, ia64_log_prt_guid(&psei->guid, prfunc); } if (psei->valid.oem_data) { - ia64_log_prt_oem_data((int)psei->header.len, + platform_plat_specific_err_print((int)psei->header.len, (int)sizeof(sal_log_plat_specific_err_info_t) - 1, &(psei->oem_data[0]), prfunc); } @@ -1519,7 +1557,7 @@ ia64_log_host_ctlr_err_info_print (sal_log_host_ctlr_err_info_t *hcei, if (hcei->valid.bus_spec_data) prfunc(" Bus Specific Data: %#lx", hcei->bus_spec_data); if (hcei->valid.oem_data) { - ia64_log_prt_oem_data((int)hcei->header.len, + platform_host_ctlr_err_print((int)hcei->header.len, (int)sizeof(sal_log_host_ctlr_err_info_t) - 1, &(hcei->oem_data[0]), prfunc); } @@ -1553,7 +1591,7 @@ ia64_log_plat_bus_err_info_print (sal_log_plat_bus_err_info_t *pbei, if (pbei->valid.bus_spec_data) prfunc(" Bus Specific Data: %#lx", pbei->bus_spec_data); if (pbei->valid.oem_data) { - ia64_log_prt_oem_data((int)pbei->header.len, + platform_plat_bus_err_print((int)pbei->header.len, (int)sizeof(sal_log_plat_bus_err_info_t) - 1, &(pbei->oem_data[0]), prfunc); } @@ -1745,17 +1783,18 @@ ia64_log_processor_info_print(sal_log_record_header_t *lh, prfunc_t prfunc) * Inputs : lh (Pointer to the sal error record header with format * specified by the SAL spec). 
* prfunc (fn ptr of log output function to use) - * Outputs : None + * Outputs : platform error status */ -void +int ia64_log_platform_info_print (sal_log_record_header_t *lh, prfunc_t prfunc) { - sal_log_section_hdr_t *slsh; - int n_sects; - int ercd_pos; + sal_log_section_hdr_t *slsh; + int n_sects; + int ercd_pos; + int platform_err = 0; if (!lh) - return; + return platform_err; #ifdef MCA_PRT_XTRA_DATA // for test only @FVL ia64_log_prt_record_header(lh, prfunc); @@ -1765,7 +1804,7 @@ ia64_log_platform_info_print (sal_log_record_header_t *lh, prfunc_t prfunc) IA64_MCA_DEBUG("ia64_mca_log_print: " "truncated SAL error record. len = %d\n", lh->len); - return; + return platform_err; } /* Print record header info */ @@ -1796,35 +1835,43 @@ ia64_log_platform_info_print (sal_log_record_header_t *lh, prfunc_t prfunc) ia64_log_proc_dev_err_info_print((sal_log_processor_info_t *)slsh, prfunc); } else if (efi_guidcmp(slsh->guid, SAL_PLAT_MEM_DEV_ERR_SECT_GUID) == 0) { + platform_err = 1; prfunc("+Platform Memory Device Error Info Section\n"); ia64_log_mem_dev_err_info_print((sal_log_mem_dev_err_info_t *)slsh, prfunc); } else if (efi_guidcmp(slsh->guid, SAL_PLAT_SEL_DEV_ERR_SECT_GUID) == 0) { + platform_err = 1; prfunc("+Platform SEL Device Error Info Section\n"); ia64_log_sel_dev_err_info_print((sal_log_sel_dev_err_info_t *)slsh, prfunc); } else if (efi_guidcmp(slsh->guid, SAL_PLAT_PCI_BUS_ERR_SECT_GUID) == 0) { + platform_err = 1; prfunc("+Platform PCI Bus Error Info Section\n"); ia64_log_pci_bus_err_info_print((sal_log_pci_bus_err_info_t *)slsh, prfunc); } else if (efi_guidcmp(slsh->guid, SAL_PLAT_SMBIOS_DEV_ERR_SECT_GUID) == 0) { + platform_err = 1; prfunc("+Platform SMBIOS Device Error Info Section\n"); ia64_log_smbios_dev_err_info_print((sal_log_smbios_dev_err_info_t *)slsh, prfunc); } else if (efi_guidcmp(slsh->guid, SAL_PLAT_PCI_COMP_ERR_SECT_GUID) == 0) { + platform_err = 1; prfunc("+Platform PCI Component Error Info Section\n"); 
ia64_log_pci_comp_err_info_print((sal_log_pci_comp_err_info_t *)slsh, prfunc); } else if (efi_guidcmp(slsh->guid, SAL_PLAT_SPECIFIC_ERR_SECT_GUID) == 0) { + platform_err = 1; prfunc("+Platform Specific Error Info Section\n"); ia64_log_plat_specific_err_info_print((sal_log_plat_specific_err_info_t *) slsh, prfunc); } else if (efi_guidcmp(slsh->guid, SAL_PLAT_HOST_CTLR_ERR_SECT_GUID) == 0) { + platform_err = 1; prfunc("+Platform Host Controller Error Info Section\n"); ia64_log_host_ctlr_err_info_print((sal_log_host_ctlr_err_info_t *)slsh, prfunc); } else if (efi_guidcmp(slsh->guid, SAL_PLAT_BUS_ERR_SECT_GUID) == 0) { + platform_err = 1; prfunc("+Platform Bus Error Info Section\n"); ia64_log_plat_bus_err_info_print((sal_log_plat_bus_err_info_t *)slsh, prfunc); @@ -1838,8 +1885,9 @@ ia64_log_platform_info_print (sal_log_record_header_t *lh, prfunc_t prfunc) n_sects, lh->len); if (!n_sects) { prfunc("No Platform Error Info Sections found\n"); - return; + return platform_err; } + return platform_err; } /* @@ -1849,15 +1897,17 @@ ia64_log_platform_info_print (sal_log_record_header_t *lh, prfunc_t prfunc) * * Inputs : info_type (SAL_INFO_TYPE_{MCA,INIT,CMC,CPE}) * prfunc (fn ptr of log output function to use) - * Outputs : None + * Outputs : platform error status */ -void +int ia64_log_print(int sal_info_type, prfunc_t prfunc) { + int platform_err = 0; + switch(sal_info_type) { case SAL_INFO_TYPE_MCA: prfunc("+BEGIN HARDWARE ERROR STATE AT MCA\n"); - ia64_log_platform_info_print(IA64_LOG_CURR_BUFFER(sal_info_type), prfunc); + platform_err = ia64_log_platform_info_print(IA64_LOG_CURR_BUFFER(sal_info_type), prfunc); prfunc("+END HARDWARE ERROR STATE AT MCA\n"); break; case SAL_INFO_TYPE_INIT: @@ -1877,4 +1927,5 @@ ia64_log_print(int sal_info_type, prfunc_t prfunc) prfunc("+MCA UNKNOWN ERROR LOG (UNIMPLEMENTED)\n"); break; } + return platform_err; } diff --git a/arch/ia64/kernel/mca_asm.S b/arch/ia64/kernel/mca_asm.S index 18a1031..6855a5b 100644 --- 
a/arch/ia64/kernel/mca_asm.S +++ b/arch/ia64/kernel/mca_asm.S @@ -7,6 +7,12 @@ // 00/03/29 cfleck Added code to save INIT handoff state in pt_regs format, switch to temp // kstack, switch modes, jump to C INIT handler // +// 02/01/04 J.Hall <jenna.s.hall@intel.com> +// Before entering virtual mode code: +// 1. Check for TLB CPU error +// 2. Restore current thread pointer to kr6 +// 3. Move stack ptr 16 bytes to conform to C calling convention +// #include <linux/config.h> #include <asm/asmmacro.h> @@ -21,10 +27,21 @@ */ #define MINSTATE_PHYS /* Make sure stack access is physical for MINSTATE */ +/* + * Needed for ia64_sal call + */ +#define SAL_GET_STATE_INFO 0x01000001 + +/* + * Needed for return context to SAL + */ +#define IA64_MCA_SAME_CONTEXT 0x0 +#define IA64_MCA_COLD_BOOT -2 + #include "minstate.h" /* - * SAL_TO_OS_MCA_HANDOFF_STATE (SAL 3.0 spec) + * SAL_TO_OS_MCA_HANDOFF_STATE (SAL 3.0 spec) * 1. GR1 = OS GP * 2. GR8 = PAL_PROC physical address * 3. GR9 = SAL_PROC physical address @@ -40,26 +57,34 @@ st8 [_tmp]=r9,0x08;; \ st8 [_tmp]=r10,0x08;; \ st8 [_tmp]=r11,0x08;; \ - st8 [_tmp]=r12,0x08;; + st8 [_tmp]=r12,0x08 /* - * OS_MCA_TO_SAL_HANDOFF_STATE (SAL 3.0 spec) - * 1. GR8 = OS_MCA return status + * OS_MCA_TO_SAL_HANDOFF_STATE (SAL 3.0 spec) + * (p6) is executed if we never entered virtual mode (TLB error) + * (p7) is executed if we entered virtual mode as expected (normal case) + * 1. GR8 = OS_MCA return status * 2. GR9 = SAL GP (physical) - * 3. GR10 = 0/1 returning same/new context - * 4. GR22 = New min state save area pointer - * returns ptr to SAL rtn save loc in _tmp + * 3. GR10 = 0/1 returning same/new context + * 4. 
GR22 = New min state save area pointer + * returns ptr to SAL rtn save loc in _tmp */ -#define OS_MCA_TO_SAL_HANDOFF_STATE_RESTORE(_tmp) \ - movl _tmp=ia64_os_to_sal_handoff_state;; \ - DATA_VA_TO_PA(_tmp);; \ - ld8 r8=[_tmp],0x08;; \ - ld8 r9=[_tmp],0x08;; \ - ld8 r10=[_tmp],0x08;; \ - ld8 r22=[_tmp],0x08;; \ - movl _tmp=ia64_sal_to_os_handoff_state;; \ - DATA_VA_TO_PA(_tmp);; \ - add _tmp=0x28,_tmp;; // point to SAL rtn save location +#define OS_MCA_TO_SAL_HANDOFF_STATE_RESTORE(_tmp) \ +(p6) movl _tmp=ia64_sal_to_os_handoff_state;; \ +(p7) movl _tmp=ia64_os_to_sal_handoff_state;; \ + DATA_VA_TO_PA(_tmp);; \ +(p6) movl r8=IA64_MCA_COLD_BOOT; \ +(p6) movl r10=IA64_MCA_SAME_CONTEXT; \ +(p6) add _tmp=0x18,_tmp;; \ +(p6) ld8 r9=[_tmp],0x10; \ +(p6) movl r22=ia64_mca_min_state_save_info;; \ +(p7) ld8 r8=[_tmp],0x08;; \ +(p7) ld8 r9=[_tmp],0x08;; \ +(p7) ld8 r10=[_tmp],0x08;; \ +(p7) ld8 r22=[_tmp],0x08;; \ + DATA_VA_TO_PA(r22) + // now _tmp is pointing to SAL rtn save location + .global ia64_os_mca_dispatch .global ia64_os_mca_dispatch_end @@ -70,6 +95,9 @@ .global ia64_mca_stackframe .global ia64_mca_bspstore .global ia64_init_stack + .global ia64_mca_sal_data_area + .global ia64_tlb_functional + .global ia64_mca_min_state_save_info .text .align 16 @@ -90,26 +118,34 @@ ia64_os_mca_dispatch: // for ia64_mca_sal_to_os_state_t has been // defined in include/asm/mca.h SAL_TO_OS_MCA_HANDOFF_STATE_SAVE(r2) + ;; // LOG PROCESSOR STATE INFO FROM HERE ON.. 
- ;; begin_os_mca_dump: br ia64_os_mca_proc_state_dump;; ia64_os_mca_done_dump: // Setup new stack frame for OS_MCA handling - movl r2=ia64_mca_bspstore;; // local bspstore area location in r2 + movl r2=ia64_mca_bspstore;; // local bspstore area location in r2 DATA_VA_TO_PA(r2);; - movl r3=ia64_mca_stackframe;; // save stack frame to memory in r3 + movl r3=ia64_mca_stackframe;; // save stack frame to memory in r3 DATA_VA_TO_PA(r3);; - rse_switch_context(r6,r3,r2);; // RSC management in this new context - movl r12=ia64_mca_stack;; - mov r2=8*1024;; // stack size must be same as c array - add r12=r2,r12;; // stack base @ bottom of array + rse_switch_context(r6,r3,r2);; // RSC management in this new context + movl r12=ia64_mca_stack + mov r2=8*1024;; // stack size must be same as C array + add r12=r2,r12;; // stack base @ bottom of array + adds r12=-16,r12;; // allow 16 bytes of scratch + // (C calling convention) DATA_VA_TO_PA(r12);; - // Enter virtual mode from physical mode + // Check to see if the MCA resulted from a TLB error +begin_tlb_error_check: + br ia64_os_mca_tlb_error_check;; + +done_tlb_error_check: + + // If TLB is functional, enter virtual mode from physical mode VIRTUAL_MODE_ENTER(r2, r3, ia64_os_mca_virtual_begin, r4) ia64_os_mca_virtual_begin: @@ -130,25 +166,28 @@ ia64_os_mca_virtual_end: #endif /* #if defined(MCA_TEST) */ // restore the original stack frame here - movl r2=ia64_mca_stackframe // restore stack frame from memory at r2 + movl r2=ia64_mca_stackframe // restore stack frame from memory at r2 ;; DATA_VA_TO_PA(r2) movl r4=IA64_PSR_MC ;; - rse_return_context(r4,r3,r2) // switch from interrupt context for RSE + rse_return_context(r4,r3,r2) // switch from interrupt context for RSE // let us restore all the registers from our PSI structure - mov r8=gp + mov r8=gp ;; begin_os_mca_restore: br ia64_os_mca_proc_state_restore;; ia64_os_mca_done_restore: - ;; + movl r3=ia64_tlb_functional;; + DATA_VA_TO_PA(r3);; + ld8 r3=[r3];; + cmp.eq 
p6,p7=r0,r3;; + OS_MCA_TO_SAL_HANDOFF_STATE_RESTORE(r2);; // branch back to SALE_CHECK - OS_MCA_TO_SAL_HANDOFF_STATE_RESTORE(r2) ld8 r3=[r2];; - mov b0=r3;; // SAL_CHECK return address + mov b0=r3;; // SAL_CHECK return address br b0 ;; ia64_os_mca_dispatch_end: @@ -405,7 +444,7 @@ ia64_os_mca_proc_state_restore: movl r2=ia64_mca_proc_state_dump // Convert virtual address ;; // of OS state dump area DATA_VA_TO_PA(r2) // to physical address - ;; + restore_GRs: // restore bank-1 GRs 16-31 bsw.1;; add r3=16*8,r2;; // to get to NaT of GR 16-31 @@ -621,6 +660,80 @@ end_os_mca_restore: //EndStub////////////////////////////////////////////////////////////////////// +//++ +// Name: +// ia64_os_mca_tlb_error_check() +// +// Stub Description: +// +// This stub checks to see if the MCA resulted from a TLB error +// +//-- + +ia64_os_mca_tlb_error_check: + + // Retrieve sal data structure for uncorrected MCA + + // Make the ia64_sal_get_state_info() call + movl r4=ia64_mca_sal_data_area;; + movl r7=ia64_sal;; + mov r6=r1 // save gp + DATA_VA_TO_PA(r4) // convert to physical address + DATA_VA_TO_PA(r7);; // convert to physical address + ld8 r7=[r7] // get addr of pdesc from ia64_sal + movl r3=SAL_GET_STATE_INFO;; + DATA_VA_TO_PA(r7);; // convert to physical address + ld8 r8=[r7],8;; // get pdesc function pointer + DATA_VA_TO_PA(r8) // convert to physical address + ld8 r1=[r7];; // set new (ia64_sal) gp + DATA_VA_TO_PA(r1) // convert to physical address + mov b6=r8 + + alloc r5=ar.pfs,8,0,8,0;; // allocate stack frame for SAL call + mov out0=r3 // which SAL proc to call + mov out1=r0 // error type == MCA + mov out2=r0 // null arg + mov out3=r4 // data copy area + mov out4=r0 // null arg + mov out5=r0 // null arg + mov out6=r0 // null arg + mov out7=r0;; // null arg + + br.call.sptk.few b0=b6;; + + mov r1=r6 // restore gp + mov ar.pfs=r5;; // restore ar.pfs + + movl r6=ia64_tlb_functional;; + DATA_VA_TO_PA(r6) // needed later + + cmp.eq p6,p7=r0,r8;; // check SAL call return 
address +(p7) st8 [r6]=r0 // clear tlb_functional flag +(p7) br tlb_failure // error; return to SAL + + // examine processor error log for type of error + add r4=40+24,r4;; // parse past record header (length=40) + // and section header (length=24) + ld4 r4=[r4] // get valid field of processor log + mov r5=0xf00;; + and r5=r4,r5;; // read bits 8-11 of valid field + // to determine if we have a TLB error + movl r3=0x1 + cmp.eq p6,p7=r0,r5;; + // if no TLB failure, set tlb_functional flag +(p6) st8 [r6]=r3 + // else clear flag +(p7) st8 [r6]=r0 + + // if no TLB failure, continue with normal virtual mode logging +(p6) br done_tlb_error_check + // else no point in entering virtual mode for logging +tlb_failure: + br ia64_os_mca_virtual_end + +//EndStub////////////////////////////////////////////////////////////////////// + + // ok, the issue here is that we need to save state information so // it can be useable by the kernel debugger and show regs routines. // In order to do this, our best bet is save the current state (plus @@ -633,7 +746,7 @@ end_os_mca_restore: // This has been defined for registration purposes with SAL // as a part of ia64_mca_init. // -// When we get here, the follow registers have been +// When we get here, the following registers have been // set by the SAL for our use // // 1. GR1 = OS INIT GP @@ -649,42 +762,10 @@ end_os_mca_restore: GLOBAL_ENTRY(ia64_monarch_init_handler) -#if defined(CONFIG_SMP) && defined(SAL_MPINIT_WORKAROUND) - // - // work around SAL bug that sends all processors to monarch entry - // - mov r17=cr.lid - // XXX fix me: this is wrong: hard_smp_processor_id() is a pair of lid/eid - movl r18=ia64_cpu_to_sapicid - ;; - dep r18=0,r18,61,3 // convert to physical address - ;; - shr.u r17=r17,16 - ld4 r18=[r18] // get the BSP ID - ;; - dep r17=0,r17,16,48 - ;; - cmp4.ne p6,p0=r17,r18 // Am I the BSP ? 
-(p6) br.cond.spnt slave_init_spin_me - ;; -#endif - -// -// ok, the first thing we do is stash the information -// the SAL passed to os -// -_tmp = r2 - movl _tmp=ia64_sal_to_os_handoff_state - ;; - dep _tmp=0,_tmp, 61, 3 // get physical address + // stash the information the SAL passed to os + SAL_TO_OS_MCA_HANDOFF_STATE_SAVE(r2) ;; - st8 [_tmp]=r1,0x08;; - st8 [_tmp]=r8,0x08;; - st8 [_tmp]=r9,0x08;; - st8 [_tmp]=r10,0x08;; - st8 [_tmp]=r11,0x08;; - st8 [_tmp]=r12,0x08;; // now we want to save information so we can dump registers SAVE_MIN_WITH_COVER @@ -695,12 +776,10 @@ _tmp = r2 ;; SAVE_REST -// ok, enough should be saved at this point to be dangerous, and supply +// ok, enough should be saved at this point to be dangerous, and supply // information for a dump // We need to switch to Virtual mode before hitting the C functions. -// -// -// + movl r2=IA64_PSR_IT|IA64_PSR_IC|IA64_PSR_DT|IA64_PSR_RT|IA64_PSR_DFH|IA64_PSR_BN mov r3=psr // get the current psr, minimum enabled at this point ;; @@ -708,8 +787,8 @@ _tmp = r2 ;; movl r3=IVirtual_Switch ;; - mov cr.iip=r3 // short return to set the appropriate bits - mov cr.ipsr=r2 // need to do an rfi to set appropriate bits + mov cr.iip=r3 // short return to set the appropriate bits + mov cr.ipsr=r2 // need to do an rfi to set appropriate bits ;; rfi ;; @@ -717,7 +796,7 @@ IVirtual_Switch: // // We should now be running virtual // - // Lets call the C handler to get the rest of the state info + // Let's call the C handler to get the rest of the state info // alloc r14=ar.pfs,0,0,1,0 // now it's safe (must be first in insn group!) 
;; // diff --git a/arch/ia64/kernel/minstate.h b/arch/ia64/kernel/minstate.h index a7e15a0..4ce74a4 100644 --- a/arch/ia64/kernel/minstate.h +++ b/arch/ia64/kernel/minstate.h @@ -92,7 +92,6 @@ * * Assumed state upon entry: * psr.ic: off - * psr.dt: off * r31: contains saved predicates (pr) * * Upon exit, the state is as follows: @@ -186,7 +185,6 @@ * * Assumed state upon entry: * psr.ic: on - * psr.dt: on * r2: points to &pt_regs.r16 * r3: points to &pt_regs.r17 */ diff --git a/arch/ia64/kernel/palinfo.c b/arch/ia64/kernel/palinfo.c index cd87720..5b69b5b 100644 --- a/arch/ia64/kernel/palinfo.c +++ b/arch/ia64/kernel/palinfo.c @@ -724,7 +724,7 @@ tr_info(char *page) status = ia64_pal_tr_read(j, i, tr_buffer, &tr_valid); if (status != 0) { - printk(__FUNCTION__ " pal call failed on tr[%d:%d]=%ld\n", i, j, status); + printk("palinfo: pal call failed on tr[%d:%d]=%ld\n", i, j, status); continue; } @@ -842,9 +842,8 @@ static void palinfo_smp_call(void *info) { palinfo_smp_data_t *data = (palinfo_smp_data_t *)info; - /* printk(__FUNCTION__" called on CPU %d\n", smp_processor_id());*/ if (data == NULL) { - printk(KERN_ERR __FUNCTION__" data pointer is NULL\n"); + printk("%s palinfo: data pointer is NULL\n", KERN_ERR); data->ret = 0; /* no output */ return; } @@ -868,11 +867,10 @@ int palinfo_handle_smp(pal_func_cpu_u_t *f, char *page) ptr.page = page; ptr.ret = 0; /* just in case */ - /*printk(__FUNCTION__" calling CPU %d from CPU %d for function %d\n", f->req_cpu,smp_processor_id(), f->func_id);*/ /* will send IPI to other CPU and wait for completion of remote call */ if ((ret=smp_call_function_single(f->req_cpu, palinfo_smp_call, &ptr, 0, 1))) { - printk(__FUNCTION__" remote CPU call from %d to %d on function %d: error %d\n", smp_processor_id(), f->req_cpu, f->func_id, ret); + printk("palinfo: remote CPU call from %d to %d on function %d: error %d\n", smp_processor_id(), f->req_cpu, f->func_id, ret); return 0; } return ptr.ret; @@ -881,7 +879,7 @@ int 
palinfo_handle_smp(pal_func_cpu_u_t *f, char *page) static int palinfo_handle_smp(pal_func_cpu_u_t *f, char *page) { - printk(__FUNCTION__" should not be called with non SMP kernel\n"); + printk("palinfo: should not be called with non SMP kernel\n"); return 0; } #endif /* CONFIG_SMP */ diff --git a/arch/ia64/kernel/perfmon.c b/arch/ia64/kernel/perfmon.c index a5ecf3e..27e72758 100644 --- a/arch/ia64/kernel/perfmon.c +++ b/arch/ia64/kernel/perfmon.c @@ -1,13 +1,16 @@ /* - * This file contains the code to configure and read/write the ia64 performance - * monitoring stuff. + * This file implements the perfmon subsystem which is used + * to program the IA-64 Performance Monitoring Unit (PMU). * * Originaly Written by Ganesh Venkitachalam, IBM Corp. - * Modifications by David Mosberger-Tang, Hewlett-Packard Co. - * Modifications by Stephane Eranian, Hewlett-Packard Co. * Copyright (C) 1999 Ganesh Venkitachalam <venkitac@us.ibm.com> - * Copyright (C) 1999 David Mosberger-Tang <davidm@hpl.hp.com> - * Copyright (C) 2000-2001 Stephane Eranian <eranian@hpl.hp.com> + * + * Modifications by Stephane Eranian, Hewlett-Packard Co. + * Modifications by David Mosberger-Tang, Hewlett-Packard Co. 
+ * + * Copyright (C) 1999-2002 Hewlett Packard Co + * Stephane Eranian <eranian@hpl.hp.com> + * David Mosberger-Tang <davidm@hpl.hp.com> */ #include <linux/config.h> @@ -22,151 +25,137 @@ #include <linux/mm.h> #include <asm/bitops.h> -#include <asm/efi.h> #include <asm/errno.h> -#include <asm/hw_irq.h> #include <asm/page.h> #include <asm/pal.h> #include <asm/perfmon.h> -#include <asm/pgtable.h> #include <asm/processor.h> #include <asm/signal.h> #include <asm/system.h> -#include <asm/system.h> #include <asm/uaccess.h> #include <asm/delay.h> /* for ia64_get_itc() */ #ifdef CONFIG_PERFMON -#define PFM_VERSION "0.3" -#define PFM_SMPL_HDR_VERSION 1 - -#define PMU_FIRST_COUNTER 4 /* first generic counter */ - -#define PFM_WRITE_PMCS 0xa0 -#define PFM_WRITE_PMDS 0xa1 -#define PFM_READ_PMDS 0xa2 -#define PFM_STOP 0xa3 -#define PFM_START 0xa4 -#define PFM_ENABLE 0xa5 /* unfreeze only */ -#define PFM_DISABLE 0xa6 /* freeze only */ -#define PFM_RESTART 0xcf -#define PFM_CREATE_CONTEXT 0xa7 -#define PFM_DESTROY_CONTEXT 0xa8 /* - * Those 2 are just meant for debugging. I considered using sysctl() for - * that but it is a little bit too pervasive. This solution is at least - * self-contained. + * For PMU which rely on the debug registers for some features, you must + * you must enable the following flag to activate the support for + * accessing the registers via the perfmonctl() interface. 
*/ -#define PFM_DEBUG_ON 0xe0 -#define PFM_DEBUG_OFF 0xe1 - -#define PFM_DEBUG_BASE PFM_DEBUG_ON - +#ifdef CONFIG_ITANIUM +#define PFM_PMU_USES_DBR 1 +#endif /* - * perfmon API flags + * perfmon context states */ -#define PFM_FL_INHERIT_NONE 0x00 /* never inherit a context across fork (default) */ -#define PFM_FL_INHERIT_ONCE 0x01 /* clone pfm_context only once across fork() */ -#define PFM_FL_INHERIT_ALL 0x02 /* always clone pfm_context across fork() */ -#define PFM_FL_SMPL_OVFL_NOBLOCK 0x04 /* do not block on sampling buffer overflow */ -#define PFM_FL_SYSTEM_WIDE 0x08 /* create a system wide context */ -#define PFM_FL_EXCL_INTR 0x10 /* exclude interrupt from system wide monitoring */ +#define PFM_CTX_DISABLED 0 +#define PFM_CTX_ENABLED 1 /* - * PMC API flags + * Reset register flags */ -#define PFM_REGFL_OVFL_NOTIFY 1 /* send notification on overflow */ +#define PFM_RELOAD_LONG_RESET 1 +#define PFM_RELOAD_SHORT_RESET 2 /* - * Private flags and masks + * Misc macros and definitions */ +#define PMU_FIRST_COUNTER 4 + +#define PFM_IS_DISABLED() pmu_conf.pfm_is_disabled + +#define PMC_OVFL_NOTIFY(ctx, i) ((ctx)->ctx_soft_pmds[i].flags & PFM_REGFL_OVFL_NOTIFY) #define PFM_FL_INHERIT_MASK (PFM_FL_INHERIT_NONE|PFM_FL_INHERIT_ONCE|PFM_FL_INHERIT_ALL) -#ifdef CONFIG_SMP -#define cpu_is_online(i) (cpu_online_map & (1UL << i)) -#else -#define cpu_is_online(i) 1 -#endif +#define PMC_IS_IMPL(i) (i<pmu_conf.num_pmcs && pmu_conf.impl_regs[i>>6] & (1UL<< (i) %64)) +#define PMD_IS_IMPL(i) (i<pmu_conf.num_pmds && pmu_conf.impl_regs[4+(i>>6)] & (1UL<<(i) % 64)) -#define PMC_IS_IMPL(i) (i < pmu_conf.num_pmcs && pmu_conf.impl_regs[i>>6] & (1<< (i&~(64-1)))) -#define PMD_IS_IMPL(i) (i < pmu_conf.num_pmds && pmu_conf.impl_regs[4+(i>>6)] & (1<< (i&~(64-1)))) -#define PMD_IS_COUNTER(i) (i>=PMU_FIRST_COUNTER && i < (PMU_FIRST_COUNTER+pmu_conf.max_counters)) -#define PMC_IS_COUNTER(i) (i>=PMU_FIRST_COUNTER && i < (PMU_FIRST_COUNTER+pmu_conf.max_counters)) +#define PMD_IS_COUNTING(i) (i >=0 
&& i < 256 && pmu_conf.counter_pmds[i>>6] & (1UL <<(i) % 64)) +#define PMC_IS_COUNTING(i) PMD_IS_COUNTING(i) -/* This is the Itanium-specific PMC layout for counter config */ -typedef struct { - unsigned long pmc_plm:4; /* privilege level mask */ - unsigned long pmc_ev:1; /* external visibility */ - unsigned long pmc_oi:1; /* overflow interrupt */ - unsigned long pmc_pm:1; /* privileged monitor */ - unsigned long pmc_ig1:1; /* reserved */ - unsigned long pmc_es:7; /* event select */ - unsigned long pmc_ig2:1; /* reserved */ - unsigned long pmc_umask:4; /* unit mask */ - unsigned long pmc_thres:3; /* threshold */ - unsigned long pmc_ig3:1; /* reserved (missing from table on p6-17) */ - unsigned long pmc_ism:2; /* instruction set mask */ - unsigned long pmc_ig4:38; /* reserved */ -} pmc_counter_reg_t; - -/* test for EAR/BTB configuration */ -#define PMU_DEAR_EVENT 0x67 -#define PMU_IEAR_EVENT 0x23 -#define PMU_BTB_EVENT 0x11 - -#define PMC_IS_DEAR(a) (((pmc_counter_reg_t *)(a))->pmc_es == PMU_DEAR_EVENT) -#define PMC_IS_IEAR(a) (((pmc_counter_reg_t *)(a))->pmc_es == PMU_IEAR_EVENT) -#define PMC_IS_BTB(a) (((pmc_counter_reg_t *)(a))->pmc_es == PMU_BTB_EVENT) +#define IBR_IS_IMPL(k) (k<pmu_conf.num_ibrs) +#define DBR_IS_IMPL(k) (k<pmu_conf.num_dbrs) + +#define PMC_IS_BTB(a) (((pfm_monitor_t *)(a))->pmc_es == PMU_BTB_EVENT) + +#define LSHIFT(x) (1UL<<(x)) +#define PMM(x) LSHIFT(x) +#define PMC_IS_MONITOR(c) ((pmu_conf.monitor_pmcs[0] & PMM((c))) != 0) + +#define CTX_IS_ENABLED(c) ((c)->ctx_flags.state == PFM_CTX_ENABLED) +#define CTX_OVFL_NOBLOCK(c) ((c)->ctx_fl_block == 0) +#define CTX_INHERIT_MODE(c) ((c)->ctx_fl_inherit) +#define CTX_HAS_SMPL(c) ((c)->ctx_psb != NULL) +#define CTX_USED_PMD(ctx,n) (ctx)->ctx_used_pmds[(n)>>6] |= 1UL<< ((n) % 64) + +#define CTX_USED_IBR(ctx,n) (ctx)->ctx_used_ibrs[(n)>>6] |= 1UL<< ((n) % 64) +#define CTX_USED_DBR(ctx,n) (ctx)->ctx_used_dbrs[(n)>>6] |= 1UL<< ((n) % 64) +#define CTX_USES_DBREGS(ctx) (((pfm_context_t 
*)(ctx))->ctx_fl_using_dbreg==1) + +#define LOCK_CTX(ctx) spin_lock(&(ctx)->ctx_lock) +#define UNLOCK_CTX(ctx) spin_unlock(&(ctx)->ctx_lock) + +#define SET_PMU_OWNER(t) do { pmu_owners[smp_processor_id()].owner = (t); } while(0) +#define PMU_OWNER() pmu_owners[smp_processor_id()].owner + +#define LOCK_PFS() spin_lock(&pfm_sessions.pfs_lock) +#define UNLOCK_PFS() spin_unlock(&pfm_sessions.pfs_lock) + +#define PFM_REG_RETFLAG_SET(flags, val) do { flags &= ~PFM_REG_RETFL_MASK; flags |= (val); } while(0) /* - * This header is at the beginning of the sampling buffer returned to the user. - * It is exported as Read-Only at this point. It is directly followed with the - * first record. + * debugging */ -typedef struct { - int hdr_version; /* could be used to differentiate formats */ - int hdr_reserved; - unsigned long hdr_entry_size; /* size of one entry in bytes */ - unsigned long hdr_count; /* how many valid entries */ - unsigned long hdr_pmds; /* which pmds are recorded */ -} perfmon_smpl_hdr_t; +#define DBprintk(a) \ + do { \ + if (pfm_debug_mode >0) { printk("%s.%d: CPU%d ", __FUNCTION__, __LINE__, smp_processor_id()); printk a; } \ + } while (0) -/* - * Header entry in the buffer as a header as follows. - * The header is directly followed with the PMDS to saved in increasing index order: - * PMD4, PMD5, .... How many PMDs are present is determined by the tool which must - * keep track of it when generating the final trace file. 
+ +/* + * These are some helpful architected PMC and IBR/DBR register layouts */ typedef struct { - int pid; /* identification of process */ - int cpu; /* which cpu was used */ - unsigned long rate; /* initial value of this counter */ - unsigned long stamp; /* timestamp */ - unsigned long ip; /* where did the overflow interrupt happened */ - unsigned long regs; /* which registers overflowed (up to 64)*/ -} perfmon_smpl_entry_t; + unsigned long pmc_plm:4; /* privilege level mask */ + unsigned long pmc_ev:1; /* external visibility */ + unsigned long pmc_oi:1; /* overflow interrupt */ + unsigned long pmc_pm:1; /* privileged monitor */ + unsigned long pmc_ig1:1; /* reserved */ + unsigned long pmc_es:8; /* event select */ + unsigned long pmc_ig2:48; /* reserved */ +} pfm_monitor_t; /* * There is one such data structure per perfmon context. It is used to describe the - * sampling buffer. It is to be shared among siblings whereas the pfm_context isn't. + * sampling buffer. It is to be shared among siblings whereas the pfm_context + * is not. * Therefore we maintain a refcnt which is incremented on fork(). - * This buffer is private to the kernel only the actual sampling buffer including its - * header are exposed to the user. This construct allows us to export the buffer read-write, - * if needed, without worrying about security problems. + * This buffer is private to the kernel only the actual sampling buffer + * including its header are exposed to the user. This construct allows us to + * export the buffer read-write, if needed, without worrying about security + * problems. 
*/ -typedef struct { - atomic_t psb_refcnt; /* how many users for the buffer */ - int reserved; +typedef struct _pfm_smpl_buffer_desc { + spinlock_t psb_lock; /* protection lock */ + unsigned long psb_refcnt; /* how many users for the buffer */ + int psb_flags; /* bitvector of flags */ + void *psb_addr; /* points to location of first entry */ unsigned long psb_entries; /* maximum number of entries */ unsigned long psb_size; /* aligned size of buffer */ - unsigned long psb_index; /* next free entry slot */ + unsigned long psb_index; /* next free entry slot XXX: must use the one in buffer */ unsigned long psb_entry_size; /* size of each entry including entry header */ perfmon_smpl_hdr_t *psb_hdr; /* points to sampling buffer header */ + + struct _pfm_smpl_buffer_desc *psb_next; /* next psb, used for rvfreeing of psb_hdr */ + } pfm_smpl_buffer_desc_t; +#define LOCK_PSB(p) spin_lock(&(p)->psb_lock) +#define UNLOCK_PSB(p) spin_unlock(&(p)->psb_lock) + +#define PFM_PSB_VMA 0x1 /* a VMA is describing the buffer */ /* * This structure is initialized at boot time and contains @@ -180,126 +169,187 @@ typedef struct { unsigned long num_pmcs ; /* highest PMC implemented (may have holes) */ unsigned long num_pmds; /* highest PMD implemented (may have holes) */ unsigned long impl_regs[16]; /* buffer used to hold implememted PMC/PMD mask */ + unsigned long num_ibrs; /* number of instruction debug registers */ + unsigned long num_dbrs; /* number of data debug registers */ + unsigned long monitor_pmcs[4]; /* which pmc are controlling monitors */ + unsigned long counter_pmds[4]; /* which pmd are used as counters */ } pmu_config_t; -#define PERFMON_IS_DISABLED() pmu_conf.pfm_is_disabled - +/* + * 64-bit software counter structure + */ typedef struct { - __u64 val; /* virtual 64bit counter value */ - __u64 ival; /* initial value from user */ - __u64 smpl_rval; /* reset value on sampling overflow */ - __u64 ovfl_rval; /* reset value on overflow */ - int flags; /* notify/do not notify 
*/ + u64 val; /* virtual 64bit counter value */ + u64 ival; /* initial value from user */ + u64 long_reset; /* reset value on sampling overflow */ + u64 short_reset;/* reset value on overflow */ + u64 reset_pmds[4]; /* which other pmds to reset when this counter overflows */ + int flags; /* notify/do not notify */ } pfm_counter_t; -#define PMD_OVFL_NOTIFY(ctx, i) ((ctx)->ctx_pmds[i].flags & PFM_REGFL_OVFL_NOTIFY) /* - * perfmon context. One per process, is cloned on fork() depending on inheritance flags + * perfmon context. One per process, is cloned on fork() depending on + * inheritance flags */ typedef struct { - unsigned int inherit:2; /* inherit mode */ - unsigned int noblock:1; /* block/don't block on overflow with notification */ - unsigned int system:1; /* do system wide monitoring */ - unsigned int frozen:1; /* pmu must be kept frozen on ctxsw in */ - unsigned int exclintr:1;/* exlcude interrupts from system wide monitoring */ - unsigned int reserved:26; + unsigned int state:1; /* 0=disabled, 1=enabled */ + unsigned int inherit:2; /* inherit mode */ + unsigned int block:1; /* when 1, task will blocked on user notifications */ + unsigned int system:1; /* do system wide monitoring */ + unsigned int frozen:1; /* pmu must be kept frozen on ctxsw in */ + unsigned int protected:1; /* allow access to creator of context only */ + unsigned int using_dbreg:1; /* using range restrictions (debug registers) */ + unsigned int reserved:24; } pfm_context_flags_t; +/* + * perfmon context: encapsulates all the state of a monitoring session + * XXX: probably need to change layout + */ typedef struct pfm_context { + pfm_smpl_buffer_desc_t *ctx_psb; /* sampling buffer, if any */ + unsigned long ctx_smpl_vaddr; /* user level virtual address of smpl buffer */ - pfm_smpl_buffer_desc_t *ctx_smpl_buf; /* sampling buffer descriptor, if any */ - unsigned long ctx_dear_counter; /* which PMD holds D-EAR */ - unsigned long ctx_iear_counter; /* which PMD holds I-EAR */ - unsigned long 
ctx_btb_counter; /* which PMD holds BTB */ - - spinlock_t ctx_notify_lock; + spinlock_t ctx_lock; pfm_context_flags_t ctx_flags; /* block/noblock */ - int ctx_notify_sig; /* XXX: SIGPROF or other */ - struct task_struct *ctx_notify_task; /* who to notify on overflow */ - struct task_struct *ctx_creator; /* pid of creator (debug) */ - unsigned long ctx_ovfl_regs; /* which registers just overflowed (notification) */ - unsigned long ctx_smpl_regs; /* which registers to record on overflow */ + struct task_struct *ctx_notify_task; /* who to notify on overflow */ + struct task_struct *ctx_owner; /* pid of creator (debug) */ - struct semaphore ctx_restart_sem; /* use for blocking notification mode */ + unsigned long ctx_ovfl_regs[4]; /* which registers overflowed (notification) */ + unsigned long ctx_smpl_regs[4]; /* which registers to record on overflow */ - unsigned long ctx_used_pmds[4]; /* bitmask of used PMD (speedup ctxsw) */ - unsigned long ctx_used_pmcs[4]; /* bitmask of used PMC (speedup ctxsw) */ + struct semaphore ctx_restart_sem; /* use for blocking notification mode */ - pfm_counter_t ctx_pmds[IA64_NUM_PMD_COUNTERS]; /* XXX: size should be dynamic */ + unsigned long ctx_used_pmds[4]; /* bitmask of used PMD (speedup ctxsw) */ + unsigned long ctx_saved_pmcs[4]; /* bitmask of PMC to save on ctxsw */ + unsigned long ctx_reload_pmcs[4]; /* bitmask of PMC to reload on ctxsw (SMP) */ -} pfm_context_t; + unsigned long ctx_used_ibrs[4]; /* bitmask of used IBR (speedup ctxsw) */ + unsigned long ctx_used_dbrs[4]; /* bitmask of used DBR (speedup ctxsw) */ -#define CTX_USED_PMD(ctx,n) (ctx)->ctx_used_pmds[(n)>>6] |= 1<< ((n) % 64) -#define CTX_USED_PMC(ctx,n) (ctx)->ctx_used_pmcs[(n)>>6] |= 1<< ((n) % 64) + pfm_counter_t ctx_soft_pmds[IA64_NUM_PMD_REGS]; /* XXX: size should be dynamic */ -#define ctx_fl_inherit ctx_flags.inherit -#define ctx_fl_noblock ctx_flags.noblock -#define ctx_fl_system ctx_flags.system -#define ctx_fl_frozen ctx_flags.frozen -#define 
ctx_fl_exclintr ctx_flags.exclintr + u64 ctx_saved_psr; /* copy of psr used for lazy ctxsw */ + unsigned long ctx_saved_cpus_allowed; /* copy of the task cpus_allowed (system wide) */ + unsigned long ctx_cpu; /* cpu to which perfmon is applied (system wide) */ -#define CTX_OVFL_NOBLOCK(c) ((c)->ctx_fl_noblock == 1) -#define CTX_INHERIT_MODE(c) ((c)->ctx_fl_inherit) -#define CTX_HAS_SMPL(c) ((c)->ctx_smpl_buf != NULL) + atomic_t ctx_saving_in_progress; /* flag indicating actual save in progress */ + atomic_t ctx_last_cpu; /* CPU id of current or last CPU used */ +} pfm_context_t; -static pmu_config_t pmu_conf; +#define ctx_fl_inherit ctx_flags.inherit +#define ctx_fl_block ctx_flags.block +#define ctx_fl_system ctx_flags.system +#define ctx_fl_frozen ctx_flags.frozen +#define ctx_fl_protected ctx_flags.protected +#define ctx_fl_using_dbreg ctx_flags.using_dbreg -/* for debug only */ -static int pfm_debug=0; /* 0= nodebug, >0= debug output on */ +/* + * global information about all sessions + * mostly used to synchronize between system wide and per-process + */ +typedef struct { + spinlock_t pfs_lock; /* lock the structure */ -#define DBprintk(a) \ - do { \ - if (pfm_debug >0) { printk(__FUNCTION__" %d: ", __LINE__); printk a; } \ - } while (0); + unsigned long pfs_task_sessions; /* number of per task sessions */ + unsigned long pfs_sys_sessions; /* number of per system wide sessions */ + unsigned long pfs_sys_use_dbregs; /* incremented when a system wide session uses debug regs */ + unsigned long pfs_ptrace_use_dbregs; /* incremented when a process uses debug regs */ + struct task_struct *pfs_sys_session[NR_CPUS]; /* point to task owning a system-wide session */ +} pfm_session_t; -static void ia64_reset_pmu(void); +/* + * structure used to pass argument to/from remote CPU + * using IPI to check and possibly save the PMU context on SMP systems. 
+ * + * not used in UP kernels + */ +typedef struct { + struct task_struct *task; /* which task we are interested in */ + int retval; /* return value of the call: 0=you can proceed, 1=need to wait for completion */ +} pfm_smp_ipi_arg_t; /* - * structure used to pass information between the interrupt handler - * and the tasklet. + * perfmon command descriptions */ typedef struct { - pid_t to_pid; /* which process to notify */ - pid_t from_pid; /* which process is source of overflow */ - int sig; /* with which signal */ - unsigned long bitvect; /* which counters have overflowed */ -} notification_info_t; + int (*cmd_func)(struct task_struct *task, pfm_context_t *ctx, void *arg, int count, struct pt_regs *regs); + int cmd_flags; + unsigned int cmd_narg; + size_t cmd_argsize; +} pfm_cmd_desc_t; +#define PFM_CMD_PID 0x1 /* command requires pid argument */ +#define PFM_CMD_ARG_READ 0x2 /* command must read argument(s) */ +#define PFM_CMD_ARG_WRITE 0x4 /* command must write argument(s) */ +#define PFM_CMD_CTX 0x8 /* command needs a perfmon context */ +#define PFM_CMD_NOCHK 0x10 /* command does not need to check task's state */ -typedef struct { - unsigned long pfs_proc_sessions; - unsigned long pfs_sys_session; /* can only be 0/1 */ - unsigned long pfs_dfl_dcr; /* XXX: hack */ - unsigned int pfs_pp; -} pfm_session_t; +#define PFM_CMD_IDX(cmd) (cmd) -struct { - struct task_struct *owner; -} ____cacheline_aligned pmu_owners[NR_CPUS]; +#define PFM_CMD_IS_VALID(cmd) ((PFM_CMD_IDX(cmd) >= 0) && (PFM_CMD_IDX(cmd) < PFM_CMD_COUNT) \ + && pfm_cmd_tab[PFM_CMD_IDX(cmd)].cmd_func != NULL) +#define PFM_CMD_USE_PID(cmd) ((pfm_cmd_tab[PFM_CMD_IDX(cmd)].cmd_flags & PFM_CMD_PID) != 0) +#define PFM_CMD_READ_ARG(cmd) ((pfm_cmd_tab[PFM_CMD_IDX(cmd)].cmd_flags & PFM_CMD_ARG_READ) != 0) +#define PFM_CMD_WRITE_ARG(cmd) ((pfm_cmd_tab[PFM_CMD_IDX(cmd)].cmd_flags & PFM_CMD_ARG_WRITE) != 0) +#define PFM_CMD_USE_CTX(cmd) ((pfm_cmd_tab[PFM_CMD_IDX(cmd)].cmd_flags & PFM_CMD_CTX) != 0) +#define 
PFM_CMD_CHK(cmd) ((pfm_cmd_tab[PFM_CMD_IDX(cmd)].cmd_flags & PFM_CMD_NOCHK) == 0) -/* - * helper macros +#define PFM_CMD_ARG_MANY -1 /* cannot be zero */ +#define PFM_CMD_NARG(cmd) (pfm_cmd_tab[PFM_CMD_IDX(cmd)].cmd_narg) +#define PFM_CMD_ARG_SIZE(cmd) (pfm_cmd_tab[PFM_CMD_IDX(cmd)].cmd_argsize) + + +/* + * perfmon internal variables */ -#define SET_PMU_OWNER(t) do { pmu_owners[smp_processor_id()].owner = (t); } while(0); -#define PMU_OWNER() pmu_owners[smp_processor_id()].owner +static pmu_config_t pmu_conf; /* PMU configuration */ +static int pfm_debug_mode; /* 0= nodebug, >0= debug output on */ +static pfm_session_t pfm_sessions; /* global sessions information */ +static struct proc_dir_entry *perfmon_dir; /* for debug only */ +static unsigned long pfm_spurious_ovfl_intr_count; /* keep track of spurious ovfl interrupts */ +static unsigned long pfm_ovfl_intr_count; /* keep track of spurious ovfl interrupts */ +static unsigned long pfm_recorded_samples_count; + +static void pfm_vm_close(struct vm_area_struct * area); +static struct vm_operations_struct pfm_vm_ops={ + close: pfm_vm_close +}; -#ifdef CONFIG_SMP -#define PFM_CAN_DO_LAZY() (smp_num_cpus==1 && pfs_info.pfs_sys_session==0) -#else -#define PFM_CAN_DO_LAZY() (pfs_info.pfs_sys_session==0) -#endif +/* + * keep track of task owning the PMU per CPU. 
+ */ +static struct { + struct task_struct *owner; +} ____cacheline_aligned pmu_owners[NR_CPUS]; -static void pfm_lazy_save_regs (struct task_struct *ta); -/* for debug only */ -static struct proc_dir_entry *perfmon_dir; /* - * XXX: hack to indicate that a system wide monitoring session is active + * forward declarations */ -static pfm_session_t pfs_info; +static void ia64_reset_pmu(struct task_struct *); +static void pfm_fetch_regs(int cpu, struct task_struct *task, pfm_context_t *ctx); +static void pfm_lazy_save_regs (struct task_struct *ta); + +static inline unsigned long +pfm_read_soft_counter(pfm_context_t *ctx, int i) +{ + return ctx->ctx_soft_pmds[i].val + (ia64_get_pmd(i) & pmu_conf.perf_ovfl_val); +} + +static inline void +pfm_write_soft_counter(pfm_context_t *ctx, int i, unsigned long val) +{ + ctx->ctx_soft_pmds[i].val = val & ~pmu_conf.perf_ovfl_val; + /* + * writing to unimplemented part is ignore, so we do not need to + * mask off top part + */ + ia64_set_pmd(i, val); +} /* * finds the number of PM(C|D) registers given @@ -324,10 +374,10 @@ find_num_pm_regs(long *buffer) * Generates a unique (per CPU) timestamp */ static inline unsigned long -perfmon_get_stamp(void) +pfm_get_stamp(void) { /* - * XXX: maybe find something more efficient + * XXX: must find something more efficient */ return ia64_get_itc(); } @@ -336,15 +386,16 @@ perfmon_get_stamp(void) * This is used when initializing the contents of the area. 
*/ static inline unsigned long -kvirt_to_pa(unsigned long adr) +pfm_kvirt_to_pa(unsigned long adr) { __u64 pa = ia64_tpa(adr); - DBprintk(("kv2pa(%lx-->%lx)\n", adr, pa)); + //DBprintk(("kv2pa(%lx-->%lx)\n", adr, pa)); return pa; } + static void * -rvmalloc(unsigned long size) +pfm_rvmalloc(unsigned long size) { void *mem; unsigned long adr; @@ -352,6 +403,7 @@ rvmalloc(unsigned long size) size=PAGE_ALIGN(size); mem=vmalloc(size); if (mem) { + //printk("perfmon: CPU%d pfm_rvmalloc(%ld)=%p\n", smp_processor_id(), size, mem); memset(mem, 0, size); /* Clear the ram out, no junk to the user */ adr=(unsigned long) mem; while (size > 0) { @@ -364,37 +416,145 @@ rvmalloc(unsigned long size) } static void -rvfree(void *mem, unsigned long size) +pfm_rvfree(void *mem, unsigned long size) { unsigned long adr; if (mem) { adr=(unsigned long) mem; - while ((long) size > 0) { - mem_map_unreserve(vmalloc_to_page((void *)adr)); + while ((long) size > 0) + mem_map_unreserve(vmalloc_to_page((void*)adr)); adr+=PAGE_SIZE; size-=PAGE_SIZE; } vfree(mem); } + return; +} + +/* + * This function gets called from mm/mmap.c:exit_mmap() only when there is a sampling buffer + * attached to the context AND the current task has a mapping for it, i.e., it is the original + * creator of the context. + * + * This function is used to remember the fact that the vma describing the sampling buffer + * has now been removed. It can only be called when no other tasks share the same mm context. + * + */ +static void +pfm_vm_close(struct vm_area_struct *vma) +{ + pfm_smpl_buffer_desc_t *psb = (pfm_smpl_buffer_desc_t *)vma->vm_private_data; + + if (psb == NULL) { + printk("perfmon: psb is null in [%d]\n", current->pid); + return; + } + /* + * Add PSB to list of buffers to free on release_thread() when no more users + * + * This call is safe because, once the count is zero is cannot be modified anymore. 
+ * This is not because there is no more user of the mm context, that the sampling + * buffer is not being used anymore outside of this task. In fact, it can still + * be accessed from within the kernel by another task (such as the monitored task). + * + * Therefore, we only move the psb into the list of buffers to free when we know + * nobody else is using it. + * The linked list if independent of the perfmon context, because in the case of + * multi-threaded processes, the last thread may not have been involved with + * monitoring however it will be the one removing the vma and it should therefore + * also remove the sampling buffer. This buffer cannot be removed until the vma + * is removed. + * + * This function cannot remove the buffer from here, because exit_mmap() must first + * complete. Given that there is no other vma related callback in the generic code, + * we have created on own with the linked list of sampling buffer to free which + * is part of the thread structure. In release_thread() we check if the list is + * empty. If not we call into perfmon to free the buffer and psb. That is the only + * way to ensure a safe deallocation of the sampling buffer which works when + * the buffer is shared between distinct processes or with multi-threaded programs. + * + * We need to lock the psb because the refcnt test and flag manipulation must + * looked like an atomic operation vis a vis pfm_context_exit() + */ + LOCK_PSB(psb); + + if (psb->psb_refcnt == 0) { + + psb->psb_next = current->thread.pfm_smpl_buf_list; + current->thread.pfm_smpl_buf_list = psb; + + DBprintk(("psb for [%d] smpl @%p size %ld inserted into list\n", + current->pid, psb->psb_hdr, psb->psb_size)); + } + DBprintk(("psb vma flag cleared for [%d] smpl @%p size %ld inserted into list\n", + current->pid, psb->psb_hdr, psb->psb_size)); + + /* + * indicate to pfm_context_exit() that the vma has been removed. 
+ */ + psb->psb_flags &= ~PFM_PSB_VMA; + + UNLOCK_PSB(psb); +} + +/* + * This function is called from pfm_destroy_context() and also from pfm_inherit() + * to explicitely remove the sampling buffer mapping from the user level address space. + */ +static int +pfm_remove_smpl_mapping(struct task_struct *task) +{ + pfm_context_t *ctx = task->thread.pfm_context; + pfm_smpl_buffer_desc_t *psb; + int r; + + /* + * some sanity checks first + */ + if (ctx == NULL || task->mm == NULL || ctx->ctx_smpl_vaddr == 0 || ctx->ctx_psb == NULL) { + printk("perfmon: invalid context mm=%p\n", task->mm); + return -1; + } + psb = ctx->ctx_psb; + + down_write(&task->mm->mmap_sem); + + r = do_munmap(task->mm, ctx->ctx_smpl_vaddr, psb->psb_size); + + up_write(&task->mm->mmap_sem); + if (r !=0) { + printk("perfmon: pid %d unable to unmap sampling buffer @0x%lx size=%ld\n", + task->pid, ctx->ctx_smpl_vaddr, psb->psb_size); + } + DBprintk(("[%d] do_unmap(0x%lx, %ld)=%d\n", + task->pid, ctx->ctx_smpl_vaddr, psb->psb_size, r)); + + /* + * make sure we suppress all traces of this buffer + * (important for pfm_inherit) + */ + ctx->ctx_smpl_vaddr = 0; + + return 0; } static pfm_context_t * pfm_context_alloc(void) { - pfm_context_t *pfc; + pfm_context_t *ctx; /* allocate context descriptor */ - pfc = vmalloc(sizeof(*pfc)); - if (pfc) memset(pfc, 0, sizeof(*pfc)); - - return pfc; + ctx = kmalloc(sizeof(pfm_context_t), GFP_KERNEL); + if (ctx) memset(ctx, 0, sizeof(pfm_context_t)); + + return ctx; } static void -pfm_context_free(pfm_context_t *pfc) +pfm_context_free(pfm_context_t *ctx) { - if (pfc) vfree(pfc); + if (ctx) kfree(ctx); } static int @@ -402,11 +562,13 @@ pfm_remap_buffer(struct vm_area_struct *vma, unsigned long buf, unsigned long ad { unsigned long page; + DBprintk(("CPU%d buf=0x%lx addr=0x%lx size=%ld\n", smp_processor_id(), buf, addr, size)); + while (size > 0) { - page = kvirt_to_pa(buf); + page = pfm_kvirt_to_pa(buf); if (remap_page_range(vma, addr, page, PAGE_SIZE, PAGE_SHARED)) 
return -ENOMEM; - + addr += PAGE_SIZE; buf += PAGE_SIZE; size -= PAGE_SIZE; @@ -426,7 +588,7 @@ pfm_smpl_entry_size(unsigned long *which, unsigned long size) for (i=0; i < size; i++, which++) res += hweight64(*which); - DBprintk((" res=%ld\n", res)); + DBprintk(("weight=%ld\n", res)); return res; } @@ -435,15 +597,16 @@ pfm_smpl_entry_size(unsigned long *which, unsigned long size) * Allocates the sampling buffer and remaps it into caller's address space */ static int -pfm_smpl_buffer_alloc(pfm_context_t *ctx, unsigned long which_pmds, unsigned long entries, void **user_addr) +pfm_smpl_buffer_alloc(pfm_context_t *ctx, unsigned long *which_pmds, unsigned long entries, + void **user_vaddr) { struct mm_struct *mm = current->mm; - struct vm_area_struct *vma; - unsigned long addr, size, regcount; + struct vm_area_struct *vma = NULL; + unsigned long size, regcount; void *smpl_buf; pfm_smpl_buffer_desc_t *psb; - regcount = pfm_smpl_entry_size(&which_pmds, 1); + regcount = pfm_smpl_entry_size(which_pmds, 1); /* note that regcount might be 0, in this case only the header for each * entry will be recorded. @@ -456,132 +619,206 @@ pfm_smpl_buffer_alloc(pfm_context_t *ctx, unsigned long which_pmds, unsigned lon + entries * (sizeof(perfmon_smpl_entry_t) + regcount*sizeof(u64))); /* * check requested size to avoid Denial-of-service attacks - * XXX: may have to refine this test + * XXX: may have to refine this test + * Check against address space limit. + * + * if ((mm->total_vm << PAGE_SHIFT) + len> current->rlim[RLIMIT_AS].rlim_cur) + * return -ENOMEM; */ if (size > current->rlim[RLIMIT_MEMLOCK].rlim_cur) return -EAGAIN; - /* find some free area in address space */ - addr = get_unmapped_area(NULL, 0, size, 0, MAP_PRIVATE); - if (!addr) goto no_addr; + /* + * We do the easy to undo allocations first. 
+ * + * pfm_rvmalloc(), clears the buffer, so there is no leak + */ + smpl_buf = pfm_rvmalloc(size); + if (smpl_buf == NULL) { + DBprintk(("Can't allocate sampling buffer\n")); + return -ENOMEM; + } + + DBprintk(("smpl_buf @%p\n", smpl_buf)); - DBprintk((" entries=%ld aligned size=%ld, unmapped @0x%lx\n", entries, size, addr)); + /* allocate sampling buffer descriptor now */ + psb = kmalloc(sizeof(*psb), GFP_KERNEL); + if (psb == NULL) { + DBprintk(("Can't allocate sampling buffer descriptor\n")); + pfm_rvfree(smpl_buf, size); + return -ENOMEM; + } /* allocate vma */ vma = kmem_cache_alloc(vm_area_cachep, SLAB_KERNEL); - if (!vma) goto no_vma; - + if (!vma) { + DBprintk(("Cannot allocate vma\n")); + goto error; + } /* - * initialize the vma for the sampling buffer + * partially initialize the vma for the sampling buffer */ - vma->vm_mm = mm; - vma->vm_start = addr; - vma->vm_end = addr + size; - vma->vm_flags = VM_READ|VM_MAYREAD; - vma->vm_page_prot = PAGE_READONLY; /* XXX may need to change */ - vma->vm_ops = NULL; - vma->vm_pgoff = 0; - vma->vm_file = NULL; - vma->vm_raend = 0; - - smpl_buf = rvmalloc(size); - if (smpl_buf == NULL) goto no_buffer; - - DBprintk((" smpl_buf @%p\n", smpl_buf)); - - if (pfm_remap_buffer(vma, (unsigned long)smpl_buf, addr, size)) goto cant_remap; - - /* allocate sampling buffer descriptor now */ - psb = vmalloc(sizeof(*psb)); - if (psb == NULL) goto no_buffer_desc; + vma->vm_flags = VM_READ| VM_MAYREAD |VM_RESERVED; + vma->vm_page_prot = PAGE_READONLY; /* XXX may need to change */ + vma->vm_ops = &pfm_vm_ops; /* necesarry to get the close() callback */ + vma->vm_pgoff = 0; + vma->vm_file = NULL; + vma->vm_raend = 0; + vma->vm_private_data = psb; /* information needed by the pfm_vm_close() function */ - /* start with something clean */ - memset(smpl_buf, 0x0, size); + /* + * Now we have everything we need and we can initialize + * and connect all the data structures + */ psb->psb_hdr = smpl_buf; - psb->psb_addr = (char 
*)smpl_buf+sizeof(perfmon_smpl_hdr_t); /* first entry */ + psb->psb_addr = ((char *)smpl_buf)+sizeof(perfmon_smpl_hdr_t); /* first entry */ psb->psb_size = size; /* aligned size */ psb->psb_index = 0; psb->psb_entries = entries; + psb->psb_flags = PFM_PSB_VMA; /* remember that there is a vma describing the buffer */ + psb->psb_refcnt = 1; - atomic_set(&psb->psb_refcnt, 1); + spin_lock_init(&psb->psb_lock); + /* + * XXX: will need to do cacheline alignment to avoid false sharing in SMP mode and + * multitask monitoring. + */ psb->psb_entry_size = sizeof(perfmon_smpl_entry_t) + regcount*sizeof(u64); - DBprintk((" psb @%p entry_size=%ld hdr=%p addr=%p\n", (void *)psb,psb->psb_entry_size, (void *)psb->psb_hdr, (void *)psb->psb_addr)); + DBprintk(("psb @%p entry_size=%ld hdr=%p addr=%p\n", + (void *)psb,psb->psb_entry_size, (void *)psb->psb_hdr, + (void *)psb->psb_addr)); - /* initialize some of the fields of header */ - psb->psb_hdr->hdr_version = PFM_SMPL_HDR_VERSION; - psb->psb_hdr->hdr_entry_size = sizeof(perfmon_smpl_entry_t)+regcount*sizeof(u64); - psb->psb_hdr->hdr_pmds = which_pmds; + /* initialize some of the fields of user visible buffer header */ + psb->psb_hdr->hdr_version = PFM_SMPL_VERSION; + psb->psb_hdr->hdr_entry_size = psb->psb_entry_size; + psb->psb_hdr->hdr_pmds[0] = which_pmds[0]; - /* store which PMDS to record */ - ctx->ctx_smpl_regs = which_pmds; + /* + * Let's do the difficult operations next. + * + * now we atomically find some area in the address space and + * remap the buffer in it. 
+ */ + down_write(&current->mm->mmap_sem); - /* link to perfmon context */ - ctx->ctx_smpl_buf = psb; - vma->vm_private_data = ctx; /* link to pfm_context(not yet used) */ + /* find some free area in address space, must have mmap sem held */ + vma->vm_start = get_unmapped_area(NULL, 0, size, 0, MAP_PRIVATE|MAP_ANONYMOUS); + if (vma->vm_start == 0UL) { + DBprintk(("Cannot find unmapped area for size %ld\n", size)); + up_write(&current->mm->mmap_sem); + goto error; + } + vma->vm_end = vma->vm_start + size; + + DBprintk(("entries=%ld aligned size=%ld, unmapped @0x%lx\n", entries, size, vma->vm_start)); + + /* can only be applied to current, need to have the mm semaphore held when called */ + if (pfm_remap_buffer(vma, (unsigned long)smpl_buf, vma->vm_start, size)) { + DBprintk(("Can't remap buffer\n")); + up_write(&current->mm->mmap_sem); + goto error; + } /* - * now insert the vma in the vm list for the process + * now insert the vma in the vm list for the process, must be + * done with mmap lock held */ insert_vm_struct(mm, vma); mm->total_vm += size >> PAGE_SHIFT; + up_write(&current->mm->mmap_sem); + + /* store which PMDS to record */ + ctx->ctx_smpl_regs[0] = which_pmds[0]; + + + /* link to perfmon context */ + ctx->ctx_psb = psb; + /* - * that's the address returned to the user + * keep track of user level virtual address */ - *user_addr = (void *)addr; + ctx->ctx_smpl_vaddr = *(unsigned long *)user_vaddr = vma->vm_start; return 0; - /* outlined error handling */ -no_addr: - DBprintk(("Cannot find unmapped area for size %ld\n", size)); - return -ENOMEM; -no_vma: - DBprintk(("Cannot allocate vma\n")); - return -ENOMEM; -cant_remap: - DBprintk(("Can't remap buffer\n")); - rvfree(smpl_buf, size); -no_buffer: - DBprintk(("Can't allocate sampling buffer\n")); - kmem_cache_free(vm_area_cachep, vma); - return -ENOMEM; -no_buffer_desc: - DBprintk(("Can't allocate sampling buffer descriptor\n")); - kmem_cache_free(vm_area_cachep, vma); - rvfree(smpl_buf, size); +error: + pfm_rvfree(smpl_buf,
size); + kfree(psb); return -ENOMEM; } +/* + * XXX: do something better here + */ +static int +pfm_bad_permissions(struct task_struct *task) +{ + /* stolen from bad_signal() */ + return (current->session != task->session) + && (current->euid ^ task->suid) && (current->euid ^ task->uid) + && (current->uid ^ task->suid) && (current->uid ^ task->uid); +} + + static int -pfx_is_sane(pfreq_context_t *pfx) +pfx_is_sane(struct task_struct *task, pfarg_context_t *pfx) { int ctx_flags; + int cpu; /* valid signal */ - //if (pfx->notify_sig < 1 || pfx->notify_sig >= _NSIG) return -EINVAL; - if (pfx->notify_sig !=0 && pfx->notify_sig != SIGPROF) return -EINVAL; /* cannot send to process 1, 0 means do not notify */ - if (pfx->notify_pid < 0 || pfx->notify_pid == 1) return -EINVAL; - - ctx_flags = pfx->flags; + if (pfx->ctx_notify_pid == 1) { + DBprintk(("invalid notify_pid %d\n", pfx->ctx_notify_pid)); + return -EINVAL; + } + ctx_flags = pfx->ctx_flags; if (ctx_flags & PFM_FL_SYSTEM_WIDE) { -#ifdef CONFIG_SMP - if (smp_num_cpus > 1) { - printk("perfmon: system wide monitoring on SMP not yet supported\n"); + DBprintk(("cpu_mask=0x%lx\n", pfx->ctx_cpu_mask)); + /* + * cannot block in this mode + */ + if (ctx_flags & PFM_FL_NOTIFY_BLOCK) { + DBprintk(("cannot use blocking mode when in system wide monitoring\n")); return -EINVAL; } -#endif - if ((ctx_flags & PFM_FL_SMPL_OVFL_NOBLOCK) == 0) { - printk("perfmon: system wide monitoring cannot use blocking notification mode\n"); + /* + * must only have one bit set in the CPU mask + */ + if (hweight64(pfx->ctx_cpu_mask) != 1UL) { + DBprintk(("invalid CPU mask specified\n")); + return -EINVAL; + } + /* + * and it must be a valid CPU + */ + cpu = ffs(pfx->ctx_cpu_mask); + if (cpu > smp_num_cpus) { + DBprintk(("CPU%d is not online\n", cpu)); return -EINVAL; } + /* + * check for pre-existing pinning, if conflicting reject + */ + if (task->cpus_allowed != ~0UL && (task->cpus_allowed & (1UL<<cpu)) == 0) { + DBprintk(("[%d] pinned on 0x%lx, 
mask for CPU%d \n", task->pid, + task->cpus_allowed, cpu)); + return -EINVAL; + } + + } else { + /* + * must provide a target for the signal in blocking mode even when + * no counter is configured with PFM_FL_REG_OVFL_NOTIFY + */ + if ((ctx_flags & PFM_FL_NOTIFY_BLOCK) && pfx->ctx_notify_pid == 0) return -EINVAL; } /* probably more to add here */ @@ -589,68 +826,97 @@ pfx_is_sane(pfreq_context_t *pfx) } static int -pfm_context_create(int flags, perfmon_req_t *req) +pfm_create_context(struct task_struct *task, pfm_context_t *ctx, void *req, int count, + struct pt_regs *regs) { - pfm_context_t *ctx; - struct task_struct *task = NULL; - perfmon_req_t tmp; + pfarg_context_t tmp; void *uaddr = NULL; - int ret; + int ret, cpu = 0; int ctx_flags; - pid_t pid; + pid_t notify_pid; - /* to go away */ - if (flags) { - printk("perfmon: use context flags instead of perfmon() flags. Obsoleted API\n"); - } + /* a context has already been defined */ + if (ctx) return -EBUSY; + + /* + * not yet supported + */ + if (task != current) return -EINVAL; if (copy_from_user(&tmp, req, sizeof(tmp))) return -EFAULT; - ret = pfx_is_sane(&tmp.pfr_ctx); + ret = pfx_is_sane(task, &tmp); if (ret < 0) return ret; - ctx_flags = tmp.pfr_ctx.flags; + ctx_flags = tmp.ctx_flags; + + ret = -EBUSY; + + LOCK_PFS(); if (ctx_flags & PFM_FL_SYSTEM_WIDE) { + + /* at this point, we know there is at least one bit set */ + cpu = ffs(tmp.ctx_cpu_mask) - 1; + + DBprintk(("requesting CPU%d currently on CPU%d\n",cpu, smp_processor_id())); + + if (pfm_sessions.pfs_task_sessions > 0) { + DBprintk(("system wide not possible, task_sessions=%ld\n", pfm_sessions.pfs_task_sessions)); + goto abort; + } + + if (pfm_sessions.pfs_sys_session[cpu]) { + DBprintk(("system wide not possible, conflicting session [%d] on CPU%d\n",pfm_sessions.pfs_sys_session[cpu]->pid, cpu)); + goto abort; + } + pfm_sessions.pfs_sys_session[cpu] = task; /* - * XXX: This is not AT ALL SMP safe + * count the number of system wide sessions */ - if 
(pfs_info.pfs_proc_sessions > 0) return -EBUSY; - if (pfs_info.pfs_sys_session > 0) return -EBUSY; - - pfs_info.pfs_sys_session = 1; + pfm_sessions.pfs_sys_sessions++; - } else if (pfs_info.pfs_sys_session >0) { + } else if (pfm_sessions.pfs_sys_sessions == 0) { + pfm_sessions.pfs_task_sessions++; + } else { /* no per-process monitoring while there is a system wide session */ - return -EBUSY; - } else - pfs_info.pfs_proc_sessions++; + goto abort; + } + + UNLOCK_PFS(); + + ret = -ENOMEM; ctx = pfm_context_alloc(); if (!ctx) goto error; - /* record the creator (debug only) */ - ctx->ctx_creator = current; + /* record the creator (important for inheritance) */ + ctx->ctx_owner = current; + + notify_pid = tmp.ctx_notify_pid; - pid = tmp.pfr_ctx.notify_pid; + spin_lock_init(&ctx->ctx_lock); - spin_lock_init(&ctx->ctx_notify_lock); + if (notify_pid == current->pid) { - if (pid == current->pid) { ctx->ctx_notify_task = task = current; current->thread.pfm_context = ctx; - atomic_set(&current->thread.pfm_notifiers_check, 1); + } else if (notify_pid!=0) { + struct task_struct *notify_task; - } else if (pid!=0) { read_lock(&tasklist_lock); - task = find_task_by_pid(pid); - if (task) { + notify_task = find_task_by_pid(notify_pid); + + if (notify_task) { + + ret = -EPERM; + /* - * record who to notify - */ - ctx->ctx_notify_task = task; + * check if we can send this task a signal + */ + if (pfm_bad_permissions(notify_task)) goto buffer_error; /* * make visible @@ -669,7 +935,9 @@ pfm_context_create(int flags, perfmon_req_t *req) * task has been detached from the tasklist otherwise you are * exposed to race conditions. 
*/ - atomic_add(1, &task->thread.pfm_notifiers_check); + atomic_add(1, &ctx->ctx_notify_task->thread.pfm_notifiers_check); + + ctx->ctx_notify_task = notify_task; } read_unlock(&tasklist_lock); } @@ -677,37 +945,48 @@ pfm_context_create(int flags, perfmon_req_t *req) /* * notification process does not exist */ - if (pid != 0 && task == NULL) { + if (notify_pid != 0 && ctx->ctx_notify_task == NULL) { ret = -EINVAL; goto buffer_error; } - ctx->ctx_notify_sig = SIGPROF; /* siginfo imposes a fixed signal */ + if (tmp.ctx_smpl_entries) { + DBprintk(("sampling entries=%ld\n",tmp.ctx_smpl_entries)); - if (tmp.pfr_ctx.smpl_entries) { - DBprintk((" sampling entries=%ld\n",tmp.pfr_ctx.smpl_entries)); - - ret = pfm_smpl_buffer_alloc(ctx, tmp.pfr_ctx.smpl_regs, - tmp.pfr_ctx.smpl_entries, &uaddr); + ret = pfm_smpl_buffer_alloc(ctx, tmp.ctx_smpl_regs, + tmp.ctx_smpl_entries, &uaddr); if (ret<0) goto buffer_error; - tmp.pfr_ctx.smpl_vaddr = uaddr; + tmp.ctx_smpl_vaddr = uaddr; } /* initialization of context's flags */ - ctx->ctx_fl_inherit = ctx_flags & PFM_FL_INHERIT_MASK; - ctx->ctx_fl_noblock = (ctx_flags & PFM_FL_SMPL_OVFL_NOBLOCK) ? 1 : 0; - ctx->ctx_fl_system = (ctx_flags & PFM_FL_SYSTEM_WIDE) ? 1: 0; - ctx->ctx_fl_exclintr = (ctx_flags & PFM_FL_EXCL_INTR) ? 1: 0; - ctx->ctx_fl_frozen = 0; + ctx->ctx_fl_inherit = ctx_flags & PFM_FL_INHERIT_MASK; + ctx->ctx_fl_block = (ctx_flags & PFM_FL_NOTIFY_BLOCK) ? 1 : 0; + ctx->ctx_fl_system = (ctx_flags & PFM_FL_SYSTEM_WIDE) ? 1: 0; + ctx->ctx_fl_frozen = 0; + /* + * setting this flag to 0 here means, that the creator or the task that the + * context is being attached are granted access. Given that a context can only + * be created for the calling process this, in effect only allows the creator + * to access the context. See pfm_protect() for more. 
+ */ + ctx->ctx_fl_protected = 0; + + /* for system wide mode only (only 1 bit set) */ + ctx->ctx_cpu = cpu; + + atomic_set(&ctx->ctx_last_cpu,-1); /* SMP only, means no CPU */ /* * Keep track of the pmds we want to sample * XXX: may be we don't need to save/restore the DEAR/IEAR pmds * but we do need the BTB for sure. This is because of a hardware * buffer of 1 only for non-BTB pmds. + * + * We ignore the unimplemented pmds specified by the user */ - ctx->ctx_used_pmds[0] = tmp.pfr_ctx.smpl_regs; - ctx->ctx_used_pmcs[0] = 1; /* always save/restore PMC[0] */ + ctx->ctx_used_pmds[0] = tmp.ctx_smpl_regs[0] & pmu_conf.impl_regs[4]; + ctx->ctx_saved_pmcs[0] = 1; /* always save/restore PMC[0] */ sema_init(&ctx->ctx_restart_sem, 0); /* init this semaphore to locked */ @@ -717,31 +996,27 @@ pfm_context_create(int flags, perfmon_req_t *req) goto buffer_error; } - DBprintk((" context=%p, pid=%d notify_sig %d notify_task=%p\n",(void *)ctx, current->pid, ctx->ctx_notify_sig, ctx->ctx_notify_task)); - DBprintk((" context=%p, pid=%d flags=0x%x inherit=%d noblock=%d system=%d\n",(void *)ctx, current->pid, ctx_flags, ctx->ctx_fl_inherit, ctx->ctx_fl_noblock, ctx->ctx_fl_system)); + DBprintk(("context=%p, pid=%d notify_task=%p\n", + (void *)ctx, task->pid, ctx->ctx_notify_task)); + + DBprintk(("context=%p, pid=%d flags=0x%x inherit=%d block=%d system=%d\n", + (void *)ctx, task->pid, ctx_flags, ctx->ctx_fl_inherit, + ctx->ctx_fl_block, ctx->ctx_fl_system)); /* * when no notification is required, we can make this visible at the last moment */ - if (pid == 0) current->thread.pfm_context = ctx; - + if (notify_pid == 0) task->thread.pfm_context = ctx; /* - * by default, we always include interrupts for system wide - * DCR.pp is set by default to zero by kernel in cpu_init() + * pin task to CPU and force reschedule on exit to ensure + * that when back to user level the task runs on the designated + * CPU. 
*/ if (ctx->ctx_fl_system) { - if (ctx->ctx_fl_exclintr == 0) { - unsigned long dcr = ia64_get_dcr(); - - ia64_set_dcr(dcr|IA64_DCR_PP); - /* - * keep track of the kernel default value - */ - pfs_info.pfs_dfl_dcr = dcr; - - DBprintk((" dcr.pp is set\n")); - } - } + ctx->ctx_saved_cpus_allowed = task->cpus_allowed; + set_cpus_allowed(task, 1UL << cpu); + DBprintk(("[%d] rescheduled allowed=0x%lx\n", task->pid,task->cpus_allowed)); + } return 0; @@ -751,225 +1026,492 @@ error: /* * undo session reservation */ + LOCK_PFS(); + if (ctx_flags & PFM_FL_SYSTEM_WIDE) { - pfs_info.pfs_sys_session = 0; + pfm_sessions.pfs_sys_session[cpu] = NULL; + pfm_sessions.pfs_sys_sessions--; } else { - pfs_info.pfs_proc_sessions--; + pfm_sessions.pfs_task_sessions--; } +abort: + UNLOCK_PFS(); + return ret; } static void -pfm_reset_regs(pfm_context_t *ctx) +pfm_reset_regs(pfm_context_t *ctx, unsigned long *ovfl_regs, int flag) { - unsigned long mask = ctx->ctx_ovfl_regs; - int i, cnum; + unsigned long mask = ovfl_regs[0]; + unsigned long reset_others = 0UL; + unsigned long val; + int i; + + DBprintk(("masks=0x%lx\n", mask)); - DBprintk((" ovfl_regs=0x%lx\n", mask)); /* * now restore reset value on sampling overflowed counters */ - for(i=0, cnum=PMU_FIRST_COUNTER; i < pmu_conf.max_counters; i++, cnum++, mask >>= 1) { + mask >>= PMU_FIRST_COUNTER; + for(i = PMU_FIRST_COUNTER; mask; i++, mask >>= 1) { if (mask & 0x1) { - DBprintk((" reseting PMD[%d]=%lx\n", cnum, ctx->ctx_pmds[i].smpl_rval & pmu_conf.perf_ovfl_val)); + val = flag == PFM_RELOAD_LONG_RESET ? + ctx->ctx_soft_pmds[i].long_reset: + ctx->ctx_soft_pmds[i].short_reset; + + reset_others |= ctx->ctx_soft_pmds[i].reset_pmds[0]; + + DBprintk(("[%d] %s reset soft_pmd[%d]=%lx\n", + current->pid, + flag == PFM_RELOAD_LONG_RESET ? 
"long" : "short", i, val)); /* upper part is ignored on rval */ - ia64_set_pmd(cnum, ctx->ctx_pmds[i].smpl_rval); + pfm_write_soft_counter(ctx, i, val); + } + } - /* - * we must reset BTB index (clears pmd16.full to make - * sure we do not report the same branches twice. - * The non-blocking case in handled in update_counters() - */ - if (cnum == ctx->ctx_btb_counter) { - DBprintk(("reseting PMD16\n")); - ia64_set_pmd(16, 0); - } + /* + * Now take care of resetting the other registers + */ + for(i = 0; reset_others; i++, reset_others >>= 1) { + + if ((reset_others & 0x1) == 0) continue; + + val = flag == PFM_RELOAD_LONG_RESET ? + ctx->ctx_soft_pmds[i].long_reset: + ctx->ctx_soft_pmds[i].short_reset; + + if (PMD_IS_COUNTING(i)) { + pfm_write_soft_counter(ctx, i, val); + } else { + ia64_set_pmd(i, val); } + + DBprintk(("[%d] %s reset_others pmd[%d]=%lx\n", + current->pid, + flag == PFM_RELOAD_LONG_RESET ? "long" : "short", i, val)); } /* just in case ! */ - ctx->ctx_ovfl_regs = 0; + ctx->ctx_ovfl_regs[0] = 0UL; } static int -pfm_write_pmcs(struct task_struct *ta, perfmon_req_t *req, int count) +pfm_write_pmcs(struct task_struct *ta, pfm_context_t *ctx, void *arg, int count, struct pt_regs *regs) { struct thread_struct *th = &ta->thread; - pfm_context_t *ctx = th->pfm_context; - perfmon_req_t tmp; - unsigned long cnum; + pfarg_reg_t tmp, *req = (pfarg_reg_t *)arg; + unsigned int cnum; int i; + int ret = 0, reg_retval = 0; + + /* we don't quite support this right now */ + if (ta != current) return -EINVAL; + + if (!CTX_IS_ENABLED(ctx)) return -EINVAL; /* XXX: ctx locking may be required here */ for (i = 0; i < count; i++, req++) { + if (copy_from_user(&tmp, req, sizeof(tmp))) return -EFAULT; - cnum = tmp.pfr_reg.reg_num; + cnum = tmp.reg_num; - /* XXX needs to check validity of the data maybe */ - if (!PMC_IS_IMPL(cnum)) { - DBprintk((" invalid pmc[%ld]\n", cnum)); - return -EINVAL; + /* + * we reject all non implemented PMC as well + * as attempts to modify PMC[0-3] 
which are used + * as status registers by the PMU + */ + if (!PMC_IS_IMPL(cnum) || cnum < 4) { + DBprintk(("pmc[%u] is unimplemented or invalid\n", cnum)); + ret = -EINVAL; + goto abort_mission; } + /* + * A PMC used to configure monitors must be: + * - system-wide session: privileged monitor + * - per-task : user monitor + * any other configuration is rejected. + */ + if (PMC_IS_MONITOR(cnum)) { + pfm_monitor_t *p = (pfm_monitor_t *)&tmp.reg_value; - if (PMC_IS_COUNTER(cnum)) { + DBprintk(("pmc[%u].pm = %d\n", cnum, p->pmc_pm)); + if (ctx->ctx_fl_system ^ p->pmc_pm) { + //if ((ctx->ctx_fl_system == 1 && p->pmc_pm == 0) + // ||(ctx->ctx_fl_system == 0 && p->pmc_pm == 1)) { + ret = -EINVAL; + goto abort_mission; + } /* - * we keep track of EARS/BTB to speed up sampling later + * enforce generation of overflow interrupt. Necessary on all + * CPUs which do not implement 64-bit hardware counters. */ - if (PMC_IS_DEAR(&tmp.pfr_reg.reg_value)) { - ctx->ctx_dear_counter = cnum; - } else if (PMC_IS_IEAR(&tmp.pfr_reg.reg_value)) { - ctx->ctx_iear_counter = cnum; - } else if (PMC_IS_BTB(&tmp.pfr_reg.reg_value)) { - ctx->ctx_btb_counter = cnum; + p->pmc_oi = 1; + } + + if (PMC_IS_COUNTING(cnum)) { + if (tmp.reg_flags & PFM_REGFL_OVFL_NOTIFY) { + /* + * must have a target for the signal + */ + if (ctx->ctx_notify_task == NULL) { + ret = -EINVAL; + goto abort_mission; + } + + ctx->ctx_soft_pmds[cnum].flags |= PFM_REGFL_OVFL_NOTIFY; } -#if 0 - if (tmp.pfr_reg.reg_flags & PFM_REGFL_OVFL_NOTIFY) - ctx->ctx_pmds[cnum - PMU_FIRST_COUNTER].flags |= PFM_REGFL_OVFL_NOTIFY; -#endif + /* + * copy reset vector + */ + ctx->ctx_soft_pmds[cnum].reset_pmds[0] = tmp.reg_reset_pmds[0]; + ctx->ctx_soft_pmds[cnum].reset_pmds[1] = tmp.reg_reset_pmds[1]; + ctx->ctx_soft_pmds[cnum].reset_pmds[2] = tmp.reg_reset_pmds[2]; + ctx->ctx_soft_pmds[cnum].reset_pmds[3] = tmp.reg_reset_pmds[3]; + + /* + * needed in case the user does not initialize the equivalent + * PMD. 
Clearing is done in reset_pmu() so there is no possible + * leak here. + */ + CTX_USED_PMD(ctx, cnum); } - /* keep track of what we use */ - CTX_USED_PMC(ctx, cnum); - ia64_set_pmc(cnum, tmp.pfr_reg.reg_value); +abort_mission: + if (ret == -EINVAL) reg_retval = PFM_REG_RETFL_EINVAL; - DBprintk((" setting PMC[%ld]=0x%lx flags=0x%x used_pmcs=0%lx\n", cnum, tmp.pfr_reg.reg_value, ctx->ctx_pmds[cnum - PMU_FIRST_COUNTER].flags, ctx->ctx_used_pmcs[0])); + PFM_REG_RETFLAG_SET(tmp.reg_flags, reg_retval); - } - /* - * we have to set this here event hough we haven't necessarily started monitoring - * because we may be context switched out - */ - if (ctx->ctx_fl_system==0) th->flags |= IA64_THREAD_PM_VALID; + /* + * update register return value, abort all if problem during copy. + */ + if (copy_to_user(req, &tmp, sizeof(tmp))) return -EFAULT; - return 0; + /* + * if there was something wrong on this register, don't touch + * the hardware at all and abort write request for others. + * + * On error, the user mut sequentially scan the table and the first + * entry which has a return flag set is the one that caused the error. + */ + if (ret != 0) { + DBprintk(("[%d] pmc[%u]=0x%lx error %d\n", + ta->pid, cnum, tmp.reg_value, reg_retval)); + break; + } + + /* + * We can proceed with this register! 
+ */ + + /* + * keep copy the pmc, used for register reload + */ + th->pmc[cnum] = tmp.reg_value; + + ia64_set_pmc(cnum, tmp.reg_value); + + DBprintk(("[%d] pmc[%u]=0x%lx flags=0x%x save_pmcs=0%lx reload_pmcs=0x%lx\n", + ta->pid, cnum, tmp.reg_value, + ctx->ctx_soft_pmds[cnum].flags, + ctx->ctx_saved_pmcs[0], ctx->ctx_reload_pmcs[0])); + + } + return ret; } static int -pfm_write_pmds(struct task_struct *ta, perfmon_req_t *req, int count) +pfm_write_pmds(struct task_struct *ta, pfm_context_t *ctx, void *arg, int count, struct pt_regs *regs) { - struct thread_struct *th = &ta->thread; - pfm_context_t *ctx = th->pfm_context; - perfmon_req_t tmp; - unsigned long cnum; + pfarg_reg_t tmp, *req = (pfarg_reg_t *)arg; + unsigned int cnum; int i; + int ret = 0, reg_retval = 0; + + /* we don't quite support this right now */ + if (ta != current) return -EINVAL; + + /* + * Cannot do anything before PMU is enabled + */ + if (!CTX_IS_ENABLED(ctx)) return -EINVAL; + /* XXX: ctx locking may be required here */ for (i = 0; i < count; i++, req++) { - int k; if (copy_from_user(&tmp, req, sizeof(tmp))) return -EFAULT; - cnum = tmp.pfr_reg.reg_num; + cnum = tmp.reg_num; - k = cnum - PMU_FIRST_COUNTER; - - if (!PMD_IS_IMPL(cnum)) return -EINVAL; + if (!PMD_IS_IMPL(cnum)) { + ret = -EINVAL; + goto abort_mission; + } /* update virtualized (64bits) counter */ - if (PMD_IS_COUNTER(cnum)) { - ctx->ctx_pmds[k].ival = tmp.pfr_reg.reg_value; - ctx->ctx_pmds[k].val = tmp.pfr_reg.reg_value & ~pmu_conf.perf_ovfl_val; - ctx->ctx_pmds[k].smpl_rval = tmp.pfr_reg.reg_smpl_reset; - ctx->ctx_pmds[k].ovfl_rval = tmp.pfr_reg.reg_ovfl_reset; - - if (tmp.pfr_reg.reg_flags & PFM_REGFL_OVFL_NOTIFY) - ctx->ctx_pmds[cnum - PMU_FIRST_COUNTER].flags |= PFM_REGFL_OVFL_NOTIFY; + if (PMD_IS_COUNTING(cnum)) { + ctx->ctx_soft_pmds[cnum].ival = tmp.reg_value; + ctx->ctx_soft_pmds[cnum].val = tmp.reg_value & ~pmu_conf.perf_ovfl_val; + ctx->ctx_soft_pmds[cnum].long_reset = tmp.reg_long_reset; + 
ctx->ctx_soft_pmds[cnum].short_reset = tmp.reg_short_reset; + } +abort_mission: + if (ret == -EINVAL) reg_retval = PFM_REG_RETFL_EINVAL; + + PFM_REG_RETFLAG_SET(tmp.reg_flags, reg_retval); + + if (copy_to_user(req, &tmp, sizeof(tmp))) return -EFAULT; + + /* + * if there was something wrong on this register, don't touch + * the hardware at all and abort write request for others. + * + * On error, the user mut sequentially scan the table and the first + * entry which has a return flag set is the one that caused the error. + */ + if (ret != 0) { + DBprintk(("[%d] pmc[%u]=0x%lx error %d\n", + ta->pid, cnum, tmp.reg_value, reg_retval)); + break; + } + /* keep track of what we use */ CTX_USED_PMD(ctx, cnum); /* writes to unimplemented part is ignored, so this is safe */ - ia64_set_pmd(cnum, tmp.pfr_reg.reg_value); + ia64_set_pmd(cnum, tmp.reg_value); /* to go away */ ia64_srlz_d(); - DBprintk((" setting PMD[%ld]: ovfl_notify=%d pmd.val=0x%lx pmd.ovfl_rval=0x%lx pmd.smpl_rval=0x%lx pmd=%lx used_pmds=0%lx\n", - cnum, - PMD_OVFL_NOTIFY(ctx, cnum - PMU_FIRST_COUNTER), - ctx->ctx_pmds[k].val, - ctx->ctx_pmds[k].ovfl_rval, - ctx->ctx_pmds[k].smpl_rval, - ia64_get_pmd(cnum) & pmu_conf.perf_ovfl_val, - ctx->ctx_used_pmds[0])); + DBprintk(("[%d] pmd[%u]: soft_pmd=0x%lx short_reset=0x%lx " + "long_reset=0x%lx hw_pmd=%lx notify=%c used_pmds=0x%lx reset_pmds=0x%lx\n", + ta->pid, cnum, + ctx->ctx_soft_pmds[cnum].val, + ctx->ctx_soft_pmds[cnum].short_reset, + ctx->ctx_soft_pmds[cnum].long_reset, + ia64_get_pmd(cnum) & pmu_conf.perf_ovfl_val, + PMC_OVFL_NOTIFY(ctx, cnum) ? 
'Y':'N', + ctx->ctx_used_pmds[0], + ctx->ctx_soft_pmds[cnum].reset_pmds[0])); } - /* - * we have to set this here event hough we haven't necessarily started monitoring - * because we may be context switched out - */ - if (ctx->ctx_fl_system==0) th->flags |= IA64_THREAD_PM_VALID; - - return 0; + return ret; } static int -pfm_read_pmds(struct task_struct *ta, perfmon_req_t *req, int count) +pfm_read_pmds(struct task_struct *ta, pfm_context_t *ctx, void *arg, int count, struct pt_regs *regs) { struct thread_struct *th = &ta->thread; - pfm_context_t *ctx = th->pfm_context; unsigned long val=0; - perfmon_req_t tmp; + pfarg_reg_t tmp, *req = (pfarg_reg_t *)arg; int i; + if (!CTX_IS_ENABLED(ctx)) return -EINVAL; + /* * XXX: MUST MAKE SURE WE DON"T HAVE ANY PENDING OVERFLOW BEFORE READING - * This is required when the monitoring has been stoppped by user of kernel. - * If ity is still going on, then that's fine because we a re not gauranteed - * to return an accurate value in this case + * This is required when the monitoring has been stoppped by user or kernel. + * If it is still going on, then that's fine because we a re not guaranteed + * to return an accurate value in this case. */ /* XXX: ctx locking may be required here */ + DBprintk(("ctx_last_cpu=%d for [%d]\n", atomic_read(&ctx->ctx_last_cpu), ta->pid)); + for (i = 0; i < count; i++, req++) { - unsigned long reg_val = ~0, ctx_val = ~0; + unsigned long reg_val = ~0UL, ctx_val = ~0UL; if (copy_from_user(&tmp, req, sizeof(tmp))) return -EFAULT; - if (!PMD_IS_IMPL(tmp.pfr_reg.reg_num)) return -EINVAL; + if (!PMD_IS_IMPL(tmp.reg_num)) goto abort_mission; - if (PMD_IS_COUNTER(tmp.pfr_reg.reg_num)) { - if (ta == current){ - val = ia64_get_pmd(tmp.pfr_reg.reg_num); - } else { - val = reg_val = th->pmd[tmp.pfr_reg.reg_num]; + /* + * If the task is not the current one, then we check if the + * PMU state is still in the local live register due to lazy ctxsw. + * If true, then we read directly from the registers. 
+ */ + if (atomic_read(&ctx->ctx_last_cpu) == smp_processor_id()){ + ia64_srlz_d(); + val = reg_val = ia64_get_pmd(tmp.reg_num); + DBprintk(("reading pmd[%u]=0x%lx from hw\n", tmp.reg_num, val)); + } else { +#ifdef CONFIG_SMP + int cpu; + /* + * for SMP system, the context may still be live on another + * CPU so we need to fetch it before proceeding with the read + * This call we only be made once for the whole loop because + * of ctx_last_cpu becoming == -1. + * + * We cannot reuse ctx_last_cpu as it may change before we get to the + * actual IPI call. In this case, we will do the call for nothing but + * there is no way around it. The receiving side will simply do nothing. + */ + cpu = atomic_read(&ctx->ctx_last_cpu); + if (cpu != -1) { + DBprintk(("must fetch on CPU%d for [%d]\n", cpu, ta->pid)); + pfm_fetch_regs(cpu, ta, ctx); } - val &= pmu_conf.perf_ovfl_val; +#endif + /* context has been saved */ + val = reg_val = th->pmd[tmp.reg_num]; + } + if (PMD_IS_COUNTING(tmp.reg_num)) { /* - * lower part of .val may not be zero, so we must be an addition because of - * residual count (see update_counters). 
+ * XXX: need to check for overflow */ - val += ctx_val = ctx->ctx_pmds[tmp.pfr_reg.reg_num - PMU_FIRST_COUNTER].val; + + val &= pmu_conf.perf_ovfl_val; + val += ctx_val = ctx->ctx_soft_pmds[tmp.reg_num].val; } else { - /* for now */ - if (ta != current) return -EINVAL; - ia64_srlz_d(); - val = ia64_get_pmd(tmp.pfr_reg.reg_num); + val = reg_val = ia64_get_pmd(tmp.reg_num); } - tmp.pfr_reg.reg_value = val; + PFM_REG_RETFLAG_SET(tmp.reg_flags, 0); + tmp.reg_value = val; - DBprintk((" reading PMD[%ld]=0x%lx reg=0x%lx ctx_val=0x%lx pmc=0x%lx\n", - tmp.pfr_reg.reg_num, val, reg_val, ctx_val, ia64_get_pmc(tmp.pfr_reg.reg_num))); + DBprintk(("read pmd[%u] soft_pmd=0x%lx reg=0x%lx pmc=0x%lx\n", + tmp.reg_num, ctx_val, reg_val, + ia64_get_pmc(tmp.reg_num))); if (copy_to_user(req, &tmp, sizeof(tmp))) return -EFAULT; } return 0; +abort_mission: + PFM_REG_RETFLAG_SET(tmp.reg_flags, PFM_REG_RETFL_EINVAL); + /* + * XXX: if this fails, we stick we the original failure, flag not updated! + */ + copy_to_user(req, &tmp, sizeof(tmp)); + return -EINVAL; + } +#ifdef PFM_PMU_USES_DBR +/* + * Only call this function when a process it trying to + * write the debug registers (reading is always allowed) + */ +int +pfm_use_debug_registers(struct task_struct *task) +{ + pfm_context_t *ctx = task->thread.pfm_context; + int ret = 0; + + DBprintk(("called for [%d]\n", task->pid)); + + /* + * do it only once + */ + if (task->thread.flags & IA64_THREAD_DBG_VALID) return 0; + + /* + * Even on SMP, we do not need to use an atomic here because + * the only way in is via ptrace() and this is possible only when the + * process is stopped. Even in the case where the ctxsw out is not totally + * completed by the time we come here, there is no way the 'stopped' process + * could be in the middle of fiddling with the pfm_write_ibr_dbr() routine. + * So this is always safe. 
+ */ + if (ctx && ctx->ctx_fl_using_dbreg == 1) return -1; + + /* + * XXX: not pretty + */ + LOCK_PFS(); + + /* + * We only allow the use of debug registers when there is no system + * wide monitoring + * XXX: we could relax this by + */ + if (pfm_sessions.pfs_sys_use_dbregs> 0) + ret = -1; + else + pfm_sessions.pfs_ptrace_use_dbregs++; + + DBprintk(("ptrace_use_dbregs=%lu sys_use_dbregs=%lu by [%d] ret = %d\n", + pfm_sessions.pfs_ptrace_use_dbregs, + pfm_sessions.pfs_sys_use_dbregs, + task->pid, ret)); + + UNLOCK_PFS(); + + return ret; +} + +/* + * This function is called for every task that exits with the + * IA64_THREAD_DBG_VALID set. This indicates a task which was + * able to use the debug registers for debugging purposes via + * ptrace(). Therefore we know it was not using them for + * perfmormance monitoring, so we only decrement the number + * of "ptraced" debug register users to keep the count up to date + */ +int +pfm_release_debug_registers(struct task_struct *task) +{ + int ret; + + LOCK_PFS(); + if (pfm_sessions.pfs_ptrace_use_dbregs == 0) { + printk("perfmon: invalid release for [%d] ptrace_use_dbregs=0\n", task->pid); + ret = -1; + } else { + pfm_sessions.pfs_ptrace_use_dbregs--; + ret = 0; + } + UNLOCK_PFS(); + + return ret; +} +#else /* PFM_PMU_USES_DBR is true */ +/* + * in case, the PMU does not use the debug registers, these two functions are nops. + * The first function is called from arch/ia64/kernel/ptrace.c. + * The second function is called from arch/ia64/kernel/process.c. 
+ */ +int +pfm_use_debug_registers(struct task_struct *task) +{ + return 0; +} +int +pfm_release_debug_registers(struct task_struct *task) +{ + return 0; +} +#endif /* PFM_PMU_USES_DBR */ + static int -pfm_do_restart(struct task_struct *task) +pfm_restart(struct task_struct *task, pfm_context_t *ctx, void *arg, int count, + struct pt_regs *regs) { - struct thread_struct *th = &task->thread; - pfm_context_t *ctx = th->pfm_context; void *sem = &ctx->ctx_restart_sem; + /* + * Cannot do anything before PMU is enabled + */ + if (!CTX_IS_ENABLED(ctx)) return -EINVAL; + + + if (ctx->ctx_fl_frozen==0) { + printk("task %d without pmu_frozen set\n", task->pid); + return -EINVAL; + } + if (task == current) { - DBprintk((" restarting self %d frozen=%d \n", current->pid, ctx->ctx_fl_frozen)); + DBprintk(("restarting self %d frozen=%d \n", current->pid, ctx->ctx_fl_frozen)); + + pfm_reset_regs(ctx, ctx->ctx_ovfl_regs, PFM_RELOAD_LONG_RESET); - pfm_reset_regs(ctx); + ctx->ctx_ovfl_regs[0] = 0UL; /* * We ignore block/don't block because we never block @@ -978,26 +1520,37 @@ pfm_do_restart(struct task_struct *task) ctx->ctx_fl_frozen = 0; if (CTX_HAS_SMPL(ctx)) { - ctx->ctx_smpl_buf->psb_hdr->hdr_count = 0; - ctx->ctx_smpl_buf->psb_index = 0; + ctx->ctx_psb->psb_hdr->hdr_count = 0; + ctx->ctx_psb->psb_index = 0; } - /* pfm_reset_smpl_buffers(ctx,th->pfm_ovfl_regs);*/ - /* simply unfreeze */ ia64_set_pmc(0, 0); ia64_srlz_d(); return 0; - } + } + /* restart on another task */ - /* check if blocking */ + /* + * if blocking, then post the semaphore. + * if non-blocking, then we ensure that the task will go into + * pfm_overflow_must_block() before returning to user mode. + * We cannot explicitely reset another task, it MUST always + * be done by the task itself. This works for system wide because + * the tool that is controlling the session is doing "self-monitoring". + * + * XXX: what if the task never goes back to user? 
+ * + */ if (CTX_OVFL_NOBLOCK(ctx) == 0) { - DBprintk((" unblocking %d \n", task->pid)); + DBprintk(("unblocking %d \n", task->pid)); up(sem); - return 0; + } else { + task->thread.pfm_ovfl_block_reset = 1; + set_tsk_thread_flag(current, TIF_NOTIFY_RESUME); } - +#if 0 /* * in case of non blocking mode, then it's just a matter of * of reseting the sampling buffer (if any) index. The PMU @@ -1008,314 +1561,686 @@ pfm_do_restart(struct task_struct *task) * must reset the header count first */ if (CTX_HAS_SMPL(ctx)) { - DBprintk((" resetting sampling indexes for %d \n", task->pid)); - ctx->ctx_smpl_buf->psb_hdr->hdr_count = 0; - ctx->ctx_smpl_buf->psb_index = 0; + DBprintk(("resetting sampling indexes for %d \n", task->pid)); + ctx->ctx_psb->psb_hdr->hdr_count = 0; + ctx->ctx_psb->psb_index = 0; } +#endif + return 0; +} + +static int +pfm_destroy_context(struct task_struct *task, pfm_context_t *ctx, void *arg, int count, + struct pt_regs *regs) +{ + /* we don't quite support this right now */ + if (task != current) return -EINVAL; + + if (ctx->ctx_fl_system) { + ia64_psr(regs)->pp = 0; + __asm__ __volatile__ ("rsm psr.pp;;"::: "memory"); + } else { + ia64_psr(regs)->up = 0; + __asm__ __volatile__ ("rum psr.up;;"::: "memory"); + + task->thread.flags &= ~IA64_THREAD_PM_VALID; + } + + SET_PMU_OWNER(NULL); + + /* freeze PMU */ + ia64_set_pmc(0, 1); + ia64_srlz_d(); + + /* restore security level */ + ia64_psr(regs)->sp = 1; + + /* + * remove sampling buffer mapping, if any + */ + if (ctx->ctx_smpl_vaddr) pfm_remove_smpl_mapping(task); + + /* now free context and related state */ + pfm_context_exit(task); return 0; } /* - * system-wide mode: propagate activation/desactivation throughout the tasklist - * - * XXX: does not work for SMP, of course + * does nothing at the moment */ -static void -pfm_process_tasklist(int cmd) +static int +pfm_unprotect_context(struct task_struct *task, pfm_context_t *ctx, void *arg, int count, + struct pt_regs *regs) { - struct task_struct *p; - 
struct pt_regs *regs; - - for_each_task(p) { - regs = (struct pt_regs *)((unsigned long)p + IA64_STK_OFFSET); - regs--; - ia64_psr(regs)->pp = cmd; - } + return 0; } static int -do_perfmonctl (struct task_struct *task, int cmd, int flags, perfmon_req_t *req, int count, struct pt_regs *regs) +pfm_protect_context(struct task_struct *task, pfm_context_t *ctx, void *arg, int count, + struct pt_regs *regs) { - perfmon_req_t tmp; - struct thread_struct *th = &task->thread; - pfm_context_t *ctx = th->pfm_context; + DBprintk(("context from [%d] is protected\n", task->pid)); + /* + * from now on, only the creator of the context has access to it + */ + ctx->ctx_fl_protected = 1; - memset(&tmp, 0, sizeof(tmp)); + /* + * reinforce secure monitoring: cannot toggle psr.up + */ + ia64_psr(regs)->sp = 1; - if (ctx == NULL && cmd != PFM_CREATE_CONTEXT && cmd < PFM_DEBUG_BASE) { - DBprintk((" PFM_WRITE_PMCS: no context for task %d\n", task->pid)); - return -EINVAL; - } + return 0; +} - switch (cmd) { - case PFM_CREATE_CONTEXT: - /* a context has already been defined */ - if (ctx) return -EBUSY; +static int +pfm_debug(struct task_struct *task, pfm_context_t *ctx, void *arg, int count, + struct pt_regs *regs) +{ + unsigned int mode = *(unsigned int *)arg; - /* - * cannot directly create a context in another process - */ - if (task != current) return -EINVAL; + pfm_debug_mode = mode == 0 ? 0 : 1; - if (req == NULL || count != 1) return -EINVAL; + printk("perfmon debugging %s\n", pfm_debug_mode ? 
"on" : "off"); - if (!access_ok(VERIFY_READ, req, sizeof(struct perfmon_req_t)*count)) return -EFAULT; + return 0; +} - return pfm_context_create(flags, req); +#ifdef PFM_PMU_USES_DBR - case PFM_WRITE_PMCS: - /* we don't quite support this right now */ - if (task != current) return -EINVAL; +typedef struct { + unsigned long ibr_mask:56; + unsigned long ibr_plm:4; + unsigned long ibr_ig:3; + unsigned long ibr_x:1; +} ibr_mask_reg_t; - if (!access_ok(VERIFY_READ, req, sizeof(struct perfmon_req_t)*count)) return -EFAULT; +typedef struct { + unsigned long dbr_mask:56; + unsigned long dbr_plm:4; + unsigned long dbr_ig:2; + unsigned long dbr_w:1; + unsigned long dbr_r:1; +} dbr_mask_reg_t; - return pfm_write_pmcs(task, req, count); +typedef union { + unsigned long val; + ibr_mask_reg_t ibr; + dbr_mask_reg_t dbr; +} dbreg_t; - case PFM_WRITE_PMDS: - /* we don't quite support this right now */ - if (task != current) return -EINVAL; - if (!access_ok(VERIFY_READ, req, sizeof(struct perfmon_req_t)*count)) return -EFAULT; +static int +pfm_write_ibr_dbr(int mode, struct task_struct *task, void *arg, int count, struct pt_regs *regs) +{ + struct thread_struct *thread = &task->thread; + pfm_context_t *ctx = task->thread.pfm_context; + pfarg_dbreg_t tmp, *req = (pfarg_dbreg_t *)arg; + dbreg_t dbreg; + unsigned int rnum; + int first_time; + int i, ret = 0; - return pfm_write_pmds(task, req, count); + /* + * for range restriction: psr.db must be cleared or the + * the PMU will ignore the debug registers. + * + * XXX: may need more in system wide mode, + * no task can have this bit set? 
+ */ + if (ia64_psr(regs)->db == 1) return -EINVAL; - case PFM_START: - /* we don't quite support this right now */ - if (task != current) return -EINVAL; - if (PMU_OWNER() && PMU_OWNER() != current && PFM_CAN_DO_LAZY()) pfm_lazy_save_regs(PMU_OWNER()); + first_time = ctx->ctx_fl_using_dbreg == 0; - SET_PMU_OWNER(current); + /* + * check for debug registers in system wide mode + * + */ + LOCK_PFS(); + if (ctx->ctx_fl_system && first_time) { + if (pfm_sessions.pfs_ptrace_use_dbregs) + ret = -EBUSY; + else + pfm_sessions.pfs_sys_use_dbregs++; + } + UNLOCK_PFS(); - /* will start monitoring right after rfi */ - ia64_psr(regs)->up = 1; - ia64_psr(regs)->pp = 1; + if (ret != 0) return ret; - if (ctx->ctx_fl_system) { - pfm_process_tasklist(1); - pfs_info.pfs_pp = 1; + if (ctx->ctx_fl_system) { + /* we mark ourselves as owner of the debug registers */ + ctx->ctx_fl_using_dbreg = 1; + } else { + if (ctx->ctx_fl_using_dbreg == 0) { + ret= -EBUSY; + if ((thread->flags & IA64_THREAD_DBG_VALID) != 0) { + DBprintk(("debug registers already in use for [%d]\n", task->pid)); + goto abort_mission; } + /* we mark ourselves as owner of the debug registers */ + ctx->ctx_fl_using_dbreg = 1; + + /* + * Given debug registers cannot be used for both debugging + * and performance monitoring at the same time, we reuse + * the storage area to save and restore the registers on ctxsw. + */ + memset(task->thread.dbr, 0, sizeof(task->thread.dbr)); + memset(task->thread.ibr, 0, sizeof(task->thread.ibr)); /* - * mark the state as valid. 
- * this will trigger save/restore at context switch + * clear hardware registers to make sure we don't leak + * information and pick up stale state */ - if (ctx->ctx_fl_system==0) th->flags |= IA64_THREAD_PM_VALID; + for (i=0; i < pmu_conf.num_ibrs; i++) { + ia64_set_ibr(i, 0UL); + } + for (i=0; i < pmu_conf.num_dbrs; i++) { + ia64_set_dbr(i, 0UL); + } + } + } - ia64_set_pmc(0, 0); - ia64_srlz_d(); + ret = -EFAULT; - break; + /* + * Now install the values into the registers + */ + for (i = 0; i < count; i++, req++) { - case PFM_ENABLE: - /* we don't quite support this right now */ - if (task != current) return -EINVAL; + + if (copy_from_user(&tmp, req, sizeof(tmp))) goto abort_mission; + + rnum = tmp.dbreg_num; + dbreg.val = tmp.dbreg_value; + + ret = -EINVAL; - if (PMU_OWNER() && PMU_OWNER() != current && PFM_CAN_DO_LAZY()) pfm_lazy_save_regs(PMU_OWNER()); + if ((mode == 0 && !IBR_IS_IMPL(rnum)) || ((mode == 1) && !DBR_IS_IMPL(rnum))) { + DBprintk(("invalid register %u val=0x%lx mode=%d i=%d count=%d\n", + rnum, dbreg.val, mode, i, count)); - /* reset all registers to stable quiet state */ - ia64_reset_pmu(); + goto abort_mission; + } - /* make sure nothing starts */ - ia64_psr(regs)->up = 0; - ia64_psr(regs)->pp = 0; + /* + * make sure we do not install enabled breakpoint + */ + if (rnum & 0x1) { + if (mode == 0) + dbreg.ibr.ibr_x = 0; + else + dbreg.dbr.dbr_r = dbreg.dbr.dbr_w = 0; + } - /* do it on the live register as well */ - __asm__ __volatile__ ("rsm psr.pp|psr.pp;;"::: "memory"); + /* + * clear return flags and copy back to user + * + * XXX: fix once EAGAIN is implemented + */ + ret = -EFAULT; - SET_PMU_OWNER(current); + PFM_REG_RETFLAG_SET(tmp.dbreg_flags, 0); - /* - * mark the state as valid. 
- * this will trigger save/restore at context switch - */ - if (ctx->ctx_fl_system==0) th->flags |= IA64_THREAD_PM_VALID; + if (copy_to_user(req, &tmp, sizeof(tmp))) goto abort_mission; - /* simply unfreeze */ - ia64_set_pmc(0, 0); - ia64_srlz_d(); - break; + /* + * Debug registers, just like PMC, can only be modified + * by a kernel call. Moreover, perfmon() access to those + * registers are centralized in this routine. The hardware + * does not modify the value of these registers, therefore, + * if we save them as they are written, we can avoid having + * to save them on context switch out. This is made possible + * by the fact that when perfmon uses debug registers, ptrace() + * won't be able to modify them concurrently. + */ + if (mode == 0) { + CTX_USED_IBR(ctx, rnum); - case PFM_DISABLE: - /* we don't quite support this right now */ - if (task != current) return -EINVAL; + ia64_set_ibr(rnum, dbreg.val); - /* simply freeze */ - ia64_set_pmc(0, 1); - ia64_srlz_d(); - /* - * XXX: cannot really toggle IA64_THREAD_PM_VALID - * but context is still considered valid, so any - * read request would return something valid. Same - * thing when this task terminates (pfm_flush_regs()). 
- */ - break; + thread->ibr[rnum] = dbreg.val; - case PFM_READ_PMDS: - if (!access_ok(VERIFY_READ, req, sizeof(struct perfmon_req_t)*count)) return -EFAULT; - if (!access_ok(VERIFY_WRITE, req, sizeof(struct perfmon_req_t)*count)) return -EFAULT; + DBprintk(("write ibr%u=0x%lx used_ibrs=0x%lx\n", rnum, dbreg.val, ctx->ctx_used_ibrs[0])); + } else { + CTX_USED_DBR(ctx, rnum); - return pfm_read_pmds(task, req, count); + ia64_set_dbr(rnum, dbreg.val); - case PFM_STOP: - /* we don't quite support this right now */ - if (task != current) return -EINVAL; + thread->dbr[rnum] = dbreg.val; - /* simply stop monitors, not PMU */ - ia64_psr(regs)->up = 0; - ia64_psr(regs)->pp = 0; + DBprintk(("write dbr%u=0x%lx used_dbrs=0x%lx\n", rnum, dbreg.val, ctx->ctx_used_dbrs[0])); + } + } - if (ctx->ctx_fl_system) { - pfm_process_tasklist(0); - pfs_info.pfs_pp = 0; - } + return 0; - break; +abort_mission: + /* + * in case it was our first attempt, we undo the global modifications + */ + if (first_time) { + LOCK_PFS(); + if (ctx->ctx_fl_system) { + pfm_sessions.pfs_sys_use_dbregs--; + } + UNLOCK_PFS(); + ctx->ctx_fl_using_dbreg = 0; + } + /* + * install error return flag + */ + if (ret != -EFAULT) { + /* + * XXX: for now we can only come here on EINVAL + */ + PFM_REG_RETFLAG_SET(tmp.dbreg_flags, PFM_REG_RETFL_EINVAL); + copy_to_user(req, &tmp, sizeof(tmp)); + } + return ret; +} - case PFM_RESTART: /* temporary, will most likely end up as a PFM_ENABLE */ +static int +pfm_write_ibrs(struct task_struct *task, pfm_context_t *ctx, void *arg, int count, + struct pt_regs *regs) +{ + /* we don't quite support this right now */ + if (task != current) return -EINVAL; - if ((th->flags & IA64_THREAD_PM_VALID) == 0 && ctx->ctx_fl_system==0) { - printk(" PFM_RESTART not monitoring\n"); - return -EINVAL; - } - if (CTX_OVFL_NOBLOCK(ctx) == 0 && ctx->ctx_fl_frozen==0) { - printk("task %d without pmu_frozen set\n", task->pid); - return -EINVAL; - } + if (!CTX_IS_ENABLED(ctx)) return -EINVAL; - return 
pfm_do_restart(task); /* we only look at first entry */ + return pfm_write_ibr_dbr(0, task, arg, count, regs); +} - case PFM_DESTROY_CONTEXT: - /* we don't quite support this right now */ - if (task != current) return -EINVAL; +static int +pfm_write_dbrs(struct task_struct *task, pfm_context_t *ctx, void *arg, int count, + struct pt_regs *regs) +{ + /* we don't quite support this right now */ + if (task != current) return -EINVAL; - /* first stop monitors */ - ia64_psr(regs)->up = 0; - ia64_psr(regs)->pp = 0; + if (!CTX_IS_ENABLED(ctx)) return -EINVAL; - /* then freeze PMU */ - ia64_set_pmc(0, 1); - ia64_srlz_d(); + return pfm_write_ibr_dbr(1, task, arg, count, regs); +} - /* don't save/restore on context switch */ - if (ctx->ctx_fl_system ==0) task->thread.flags &= ~IA64_THREAD_PM_VALID; +#endif /* PFM_PMU_USES_DBR */ - SET_PMU_OWNER(NULL); +static int +pfm_get_features(struct task_struct *task, pfm_context_t *ctx, void *arg, int count, struct pt_regs *regs) +{ + pfarg_features_t tmp; - /* now free context and related state */ - pfm_context_exit(task); - break; + memset(&tmp, 0, sizeof(tmp)); - case PFM_DEBUG_ON: - printk("perfmon debugging on\n"); - pfm_debug = 1; - break; + tmp.ft_version = PFM_VERSION; + tmp.ft_smpl_version = PFM_SMPL_VERSION; - case PFM_DEBUG_OFF: - printk("perfmon debugging off\n"); - pfm_debug = 0; - break; + if (copy_to_user(arg, &tmp, sizeof(tmp))) return -EFAULT; - default: - DBprintk((" UNknown command 0x%x\n", cmd)); + return 0; +} + +static int +pfm_start(struct task_struct *task, pfm_context_t *ctx, void *arg, int count, + struct pt_regs *regs) +{ + /* we don't quite support this right now */ + if (task != current) return -EINVAL; + + /* + * Cannot do anything before PMU is enabled + */ + if (!CTX_IS_ENABLED(ctx)) return -EINVAL; + + DBprintk(("[%d] fl_system=%d owner=%p current=%p\n", + current->pid, + ctx->ctx_fl_system, PMU_OWNER(), + current)); + + if (PMU_OWNER() != task) { + printk("perfmon: pfm_start task [%d] not pmu owner\n", 
task->pid); + return -EINVAL; + } + + if (ctx->ctx_fl_system) { + + /* enable dcr pp */ + ia64_set_dcr(ia64_get_dcr()|IA64_DCR_PP); + + local_cpu_data->pfm_dcr_pp = 1; + ia64_psr(regs)->pp = 1; + __asm__ __volatile__ ("ssm psr.pp;;"::: "memory"); + + } else { + if ((task->thread.flags & IA64_THREAD_PM_VALID) == 0) { + printk("perfmon: pfm_start task flag not set for [%d]\n", task->pid); return -EINVAL; + } + ia64_psr(regs)->up = 1; + __asm__ __volatile__ ("sum psr.up;;"::: "memory"); + } + ia64_srlz_d(); + + return 0; +} + +static int +pfm_enable(struct task_struct *task, pfm_context_t *ctx, void *arg, int count, + struct pt_regs *regs) +{ + /* we don't quite support this right now */ + if (task != current) return -EINVAL; + + if (ctx->ctx_fl_system == 0 && PMU_OWNER() && PMU_OWNER() != current) + pfm_lazy_save_regs(PMU_OWNER()); + + /* reset all registers to stable quiet state */ + ia64_reset_pmu(task); + + /* make sure nothing starts */ + if (ctx->ctx_fl_system) { + ia64_psr(regs)->pp = 0; + ia64_psr(regs)->up = 0; /* just to make sure! */ + + __asm__ __volatile__ ("rsm psr.pp;;"::: "memory"); + +#ifdef CONFIG_SMP + local_cpu_data->pfm_syst_wide = 1; + local_cpu_data->pfm_dcr_pp = 0; +#endif + } else { + /* + * needed in case the task was a passive task during + * a system wide session and now wants to have its own + * session + */ + ia64_psr(regs)->pp = 0; /* just to make sure! 
*/ + ia64_psr(regs)->up = 0; + + __asm__ __volatile__ ("rum psr.up;;"::: "memory"); + /* + * allow user control (user monitors only) + if (task == ctx->ctx_owner) { + */ + { + DBprintk(("clearing psr.sp for [%d]\n", current->pid)); + ia64_psr(regs)->sp = 0; + } + task->thread.flags |= IA64_THREAD_PM_VALID; + } + + SET_PMU_OWNER(task); + + + ctx->ctx_flags.state = PFM_CTX_ENABLED; + atomic_set(&ctx->ctx_last_cpu, smp_processor_id()); + + /* simply unfreeze */ + ia64_set_pmc(0, 0); + ia64_srlz_d(); + + return 0; +} + +static int +pfm_disable(struct task_struct *task, pfm_context_t *ctx, void *arg, int count, + struct pt_regs *regs) +{ + /* we don't quite support this right now */ + if (task != current) return -EINVAL; + + if (!CTX_IS_ENABLED(ctx)) return -EINVAL; + + /* + * stop monitoring, freeze PMU, and save state in context + */ + pfm_flush_regs(task); + + /* + * just to make sure nothing starts again when back in user mode. + * pfm_flush_regs() freezes the PMU anyway. + */ + if (ctx->ctx_fl_system) { + ia64_psr(regs)->pp = 0; + } else { + ia64_psr(regs)->up = 0; + } + + /* + * goes back to default behavior + * no need to change live psr.sp because useless at the kernel level + */ + ia64_psr(regs)->sp = 1; + + DBprintk(("enabling psr.sp for [%d]\n", current->pid)); + + ctx->ctx_flags.state = PFM_CTX_DISABLED; + + return 0; +} + +static int +pfm_stop(struct task_struct *task, pfm_context_t *ctx, void *arg, int count, + struct pt_regs *regs) +{ + /* we don't quite support this right now */ + if (task != current) return -EINVAL; + + /* + * Cannot do anything before PMU is enabled + */ + if (!CTX_IS_ENABLED(ctx)) return -EINVAL; + + DBprintk(("[%d] fl_system=%d owner=%p current=%p\n", + current->pid, + ctx->ctx_fl_system, PMU_OWNER(), + current)); + /* simply stop monitoring but not the PMU */ + if (ctx->ctx_fl_system) { + + __asm__ __volatile__ ("rsm psr.pp;;"::: "memory"); + + /* disable dcr pp */ + ia64_set_dcr(ia64_get_dcr() & ~IA64_DCR_PP); + + 
local_cpu_data->pfm_dcr_pp = 0; + + ia64_psr(regs)->pp = 0; + + __asm__ __volatile__ ("rsm psr.pp;;"::: "memory"); + + } else { + ia64_psr(regs)->up = 0; + __asm__ __volatile__ ("rum psr.up;;"::: "memory"); } return 0; } /* - * XXX: do something better here + * functions MUST be listed in the increasing order of their index (see permfon.h) */ +static pfm_cmd_desc_t pfm_cmd_tab[]={ +/* 0 */{ NULL, 0, 0, 0}, /* not used */ +/* 1 */{ pfm_write_pmcs, PFM_CMD_PID|PFM_CMD_CTX|PFM_CMD_ARG_READ|PFM_CMD_ARG_WRITE, PFM_CMD_ARG_MANY, sizeof(pfarg_reg_t)}, +/* 2 */{ pfm_write_pmds, PFM_CMD_PID|PFM_CMD_CTX|PFM_CMD_ARG_READ, PFM_CMD_ARG_MANY, sizeof(pfarg_reg_t)}, +/* 3 */{ pfm_read_pmds, PFM_CMD_PID|PFM_CMD_CTX|PFM_CMD_ARG_READ|PFM_CMD_ARG_WRITE, PFM_CMD_ARG_MANY, sizeof(pfarg_reg_t)}, +/* 4 */{ pfm_stop, PFM_CMD_PID|PFM_CMD_CTX, 0, 0}, +/* 5 */{ pfm_start, PFM_CMD_PID|PFM_CMD_CTX, 0, 0}, +/* 6 */{ pfm_enable, PFM_CMD_PID|PFM_CMD_CTX, 0, 0}, +/* 7 */{ pfm_disable, PFM_CMD_PID|PFM_CMD_CTX, 0, 0}, +/* 8 */{ pfm_create_context, PFM_CMD_ARG_READ, 1, sizeof(pfarg_context_t)}, +/* 9 */{ pfm_destroy_context, PFM_CMD_PID|PFM_CMD_CTX, 0, 0}, +/* 10 */{ pfm_restart, PFM_CMD_PID|PFM_CMD_CTX|PFM_CMD_NOCHK, 0, 0}, +/* 11 */{ pfm_protect_context, PFM_CMD_PID|PFM_CMD_CTX, 0, 0}, +/* 12 */{ pfm_get_features, PFM_CMD_ARG_WRITE, 0, 0}, +/* 13 */{ pfm_debug, 0, 1, sizeof(unsigned int)}, +/* 14 */{ pfm_unprotect_context, PFM_CMD_PID|PFM_CMD_CTX, 0, 0}, +/* 15 */{ NULL, 0, 0, 0}, /* not used */ +/* 16 */{ NULL, 0, 0, 0}, /* not used */ +/* 17 */{ NULL, 0, 0, 0}, /* not used */ +/* 18 */{ NULL, 0, 0, 0}, /* not used */ +/* 19 */{ NULL, 0, 0, 0}, /* not used */ +/* 20 */{ NULL, 0, 0, 0}, /* not used */ +/* 21 */{ NULL, 0, 0, 0}, /* not used */ +/* 22 */{ NULL, 0, 0, 0}, /* not used */ +/* 23 */{ NULL, 0, 0, 0}, /* not used */ +/* 24 */{ NULL, 0, 0, 0}, /* not used */ +/* 25 */{ NULL, 0, 0, 0}, /* not used */ +/* 26 */{ NULL, 0, 0, 0}, /* not used */ +/* 27 */{ NULL, 0, 0, 0}, /* not used */ +/* 28 
*/{ NULL, 0, 0, 0}, /* not used */ +/* 29 */{ NULL, 0, 0, 0}, /* not used */ +/* 30 */{ NULL, 0, 0, 0}, /* not used */ +/* 31 */{ NULL, 0, 0, 0}, /* not used */ +#ifdef PFM_PMU_USES_DBR +/* 32 */{ pfm_write_ibrs, PFM_CMD_PID|PFM_CMD_CTX|PFM_CMD_ARG_READ|PFM_CMD_ARG_WRITE, PFM_CMD_ARG_MANY, sizeof(pfarg_dbreg_t)}, +/* 33 */{ pfm_write_dbrs, PFM_CMD_PID|PFM_CMD_CTX|PFM_CMD_ARG_READ|PFM_CMD_ARG_WRITE, PFM_CMD_ARG_MANY, sizeof(pfarg_dbreg_t)} +#endif +}; +#define PFM_CMD_COUNT (sizeof(pfm_cmd_tab)/sizeof(pfm_cmd_desc_t)) + static int -perfmon_bad_permissions(struct task_struct *task) +check_task_state(struct task_struct *task) { - /* stolen from bad_signal() */ - return (current->session != task->session) - && (current->euid ^ task->suid) && (current->euid ^ task->uid) - && (current->uid ^ task->suid) && (current->uid ^ task->uid); + int ret = 0; +#ifdef CONFIG_SMP + /* We must wait until the state has been completely + * saved. There can be situations where the reader arrives before + * after the task is marked as STOPPED but before pfm_save_regs() + * is completed. 
+ */ + for (;;) { + + task_lock(task); + if (1 /*XXX !task_has_cpu(task)*/) break; + task_unlock(task); + + do { + if (task->state != TASK_ZOMBIE && task->state != TASK_STOPPED) return -EBUSY; + barrier(); + cpu_relax(); + } while (0 /*task_has_cpu(task)*/); + } + task_unlock(task); +#else + if (task->state != TASK_ZOMBIE && task->state != TASK_STOPPED) { + DBprintk(("warning [%d] not in stable state %ld\n", task->pid, task->state)); + ret = -EBUSY; + } +#endif + return ret; } asmlinkage int -sys_perfmonctl (int pid, int cmd, int flags, perfmon_req_t *req, int count, long arg6, long arg7, long arg8, long stack) +sys_perfmonctl (pid_t pid, int cmd, void *arg, int count, long arg5, long arg6, long arg7, + long arg8, long stack) { - struct pt_regs *regs = (struct pt_regs *) &stack; - struct task_struct *child = current; - int ret = -ESRCH; + struct pt_regs *regs = (struct pt_regs *)&stack; + struct task_struct *task = current; + pfm_context_t *ctx = task->thread.pfm_context; + size_t sz; + int ret = -ESRCH, narg; - /* sanity check: - * - * ensures that we don't do bad things in case the OS - * does not have enough storage to save/restore PMC/PMD + /* + * reject any call if perfmon was disabled at initialization time */ - if (PERFMON_IS_DISABLED()) return -ENOSYS; + if (PFM_IS_DISABLED()) return -ENOSYS; - /* XXX: pid interface is going away in favor of pfm context */ - if (pid != current->pid) { - read_lock(&tasklist_lock); + DBprintk(("cmd=%d idx=%d valid=%d narg=0x%x\n", cmd, PFM_CMD_IDX(cmd), + PFM_CMD_IS_VALID(cmd), PFM_CMD_NARG(cmd))); - child = find_task_by_pid(pid); + if (PFM_CMD_IS_VALID(cmd) == 0) return -EINVAL; - if (!child) goto abort_call; + /* ignore arguments when command has none */ + narg = PFM_CMD_NARG(cmd); + if ((narg == PFM_CMD_ARG_MANY && count == 0) || (narg > 0 && narg != count)) return -EINVAL; - ret = -EPERM; + sz = PFM_CMD_ARG_SIZE(cmd); - if (perfmon_bad_permissions(child)) goto abort_call; + if (PFM_CMD_READ_ARG(cmd) &&
!access_ok(VERIFY_READ, arg, sz*count)) return -EFAULT; - /* - * XXX: need to do more checking here + if (PFM_CMD_WRITE_ARG(cmd) && !access_ok(VERIFY_WRITE, arg, sz*count)) return -EFAULT; + + if (PFM_CMD_USE_PID(cmd)) { + /* + * XXX: may need to fine tune this one */ - if (child->state != TASK_ZOMBIE && child->state != TASK_STOPPED) { - DBprintk((" warning process %d not in stable state %ld\n", pid, child->state)); + if (pid < 2) return -EPERM; + + if (pid != current->pid) { + + read_lock(&tasklist_lock); + + task = find_task_by_pid(pid); + + if (!task) goto abort_call; + + ret = -EPERM; + + if (pfm_bad_permissions(task)) goto abort_call; + + if (PFM_CMD_CHK(cmd)) { + ret = check_task_state(task); + if (ret != 0) goto abort_call; + } + ctx = task->thread.pfm_context; } + } + + if (PFM_CMD_USE_CTX(cmd)) { + ret = -EINVAL; + if (ctx == NULL) { + DBprintk(("no context for task %d\n", task->pid)); + goto abort_call; + } + ret = -EPERM; + /* + * we only grant access to the context if: + * - the caller is the creator of the context (ctx_owner) + * OR - the context is attached to the caller AND The context IS NOT + * in protected mode + */ + if (ctx->ctx_owner != current && (ctx->ctx_fl_protected || task != current)) { + DBprintk(("context protected, no access for [%d]\n", task->pid)); + goto abort_call; + } } - ret = do_perfmonctl(child, cmd, flags, req, count, regs); + + ret = (*pfm_cmd_tab[PFM_CMD_IDX(cmd)].cmd_func)(task, ctx, arg, count, regs); abort_call: - if (child != current) read_unlock(&tasklist_lock); + if (task != current) read_unlock(&tasklist_lock); return ret; } -#if __GNUC__ >= 3 -void asmlinkage -pfm_block_on_overflow(void) -#else -void asmlinkage -pfm_block_on_overflow(u64 arg0, u64 arg1, u64 arg2, u64 arg3, u64 arg4, u64 arg5, u64 arg6, u64 arg7) -#endif +void +pfm_ovfl_block_reset (void) { struct thread_struct *th = &current->thread; pfm_context_t *ctx = current->thread.pfm_context; int ret; /* - * NO matter what notify_pid is, - * we clear overflow, won't
notify again + * clear the flag, to make sure we won't get here + * again */ - th->pfm_must_block = 0; + th->pfm_ovfl_block_reset = 0; /* * do some sanity checks first */ if (!ctx) { - printk("perfmon: process %d has no PFM context\n", current->pid); + printk("perfmon: [%d] has no PFM context\n", current->pid); return; } - if (ctx->ctx_notify_task == 0) { - printk("perfmon: process %d has no task to notify\n", current->pid); - return; - } - - DBprintk((" current=%d task=%d\n", current->pid, ctx->ctx_notify_task->pid)); - /* should not happen */ - if (CTX_OVFL_NOBLOCK(ctx)) { - printk("perfmon: process %d non-blocking ctx should not be here\n", current->pid); - return; - } + if (CTX_OVFL_NOBLOCK(ctx)) goto non_blocking; - DBprintk((" CPU%d %d before sleep\n", smp_processor_id(), current->pid)); + DBprintk(("[%d] before sleeping\n", current->pid)); /* * may go through without blocking on SMP systems @@ -1323,12 +2248,14 @@ pfm_block_on_overflow(u64 arg0, u64 arg1, u64 arg2, u64 arg3, u64 arg4, u64 arg5 */ ret = down_interruptible(&ctx->ctx_restart_sem); - DBprintk((" CPU%d %d after sleep ret=%d\n", smp_processor_id(), current->pid, ret)); + DBprintk(("[%d] after sleeping ret=%d\n", current->pid, ret)); /* * in case of interruption of down() we don't restart anything */ if (ret >= 0) { + +non_blocking: /* we reactivate on context switch */ ctx->ctx_fl_frozen = 0; /* @@ -1336,19 +2263,19 @@ pfm_block_on_overflow(u64 arg0, u64 arg1, u64 arg2, u64 arg3, u64 arg4, u64 arg5 * use the local reference */ - pfm_reset_regs(ctx); + pfm_reset_regs(ctx, ctx->ctx_ovfl_regs, PFM_RELOAD_LONG_RESET); + + ctx->ctx_ovfl_regs[0] = 0UL; /* * Unlock sampling buffer and reset index atomically * XXX: not really needed when blocking */ if (CTX_HAS_SMPL(ctx)) { - ctx->ctx_smpl_buf->psb_hdr->hdr_count = 0; - ctx->ctx_smpl_buf->psb_index = 0; + ctx->ctx_psb->psb_hdr->hdr_count = 0; + ctx->ctx_psb->psb_index = 0; } - DBprintk((" CPU%d %d unfreeze PMU\n", smp_processor_id(), current->pid)); - 
ia64_set_pmc(0, 0); ia64_srlz_d(); @@ -1357,23 +2284,111 @@ pfm_block_on_overflow(u64 arg0, u64 arg1, u64 arg2, u64 arg3, u64 arg4, u64 arg5 } /* + * This function will record an entry in the sampling if it is not full already. + * Return: + * 0 : buffer is not full (did not BECOME full: still space or was already full) + * 1 : buffer is full (recorded the last entry) + */ +static int +pfm_record_sample(struct task_struct *task, pfm_context_t *ctx, unsigned long ovfl_mask, struct pt_regs *regs) +{ + pfm_smpl_buffer_desc_t *psb = ctx->ctx_psb; + unsigned long *e, m, idx; + perfmon_smpl_entry_t *h; + int j; + + +pfm_recorded_samples_count++; + idx = ia64_fetch_and_add(1, &psb->psb_index); + DBprintk(("recording index=%ld entries=%ld\n", idx-1, psb->psb_entries)); + + /* + * XXX: there is a small chance that we could run out on index before resetting + * but index is unsigned long, so it will take some time..... + * We use > instead of == because fetch_and_add() is off by one (see below) + * + * This case can happen in non-blocking mode or with multiple processes. + * For non-blocking, we need to reload and continue. + */ + if (idx > psb->psb_entries) return 0; + + /* first entry is really entry 0, not 1 caused by fetch_and_add */ + idx--; + + h = (perfmon_smpl_entry_t *)(((char *)psb->psb_addr) + idx*(psb->psb_entry_size)); + + /* + * initialize entry header + */ + h->pid = task->pid; + h->cpu = smp_processor_id(); + h->rate = 0; /* XXX: add the sampling rate used here */ + h->ip = regs ? 
regs->cr_iip : 0x0; /* where did the fault happened */ + h->regs = ovfl_mask; /* which registers overflowed */ + + /* guaranteed to monotonically increase on each cpu */ + h->stamp = pfm_get_stamp(); + h->period = 0UL; /* not yet used */ + + /* position for first pmd */ + e = (unsigned long *)(h+1); + + /* + * selectively store PMDs in increasing index number + */ + m = ctx->ctx_smpl_regs[0]; + for (j=0; m; m >>=1, j++) { + + if ((m & 0x1) == 0) continue; + + if (PMD_IS_COUNTING(j)) { + *e = pfm_read_soft_counter(ctx, j); + /* check if this pmd overflowed as well */ + *e += ovfl_mask & (1UL<<j) ? 1 + pmu_conf.perf_ovfl_val : 0; + } else { + *e = ia64_get_pmd(j); /* slow */ + } + DBprintk(("e=%p pmd%d =0x%lx\n", (void *)e, j, *e)); + e++; + } + /* + * make the new entry visible to user, needs to be atomic + */ + ia64_fetch_and_add(1, &psb->psb_hdr->hdr_count); + + DBprintk(("index=%ld entries=%ld hdr_count=%ld\n", + idx, psb->psb_entries, psb->psb_hdr->hdr_count)); + /* + * sampling buffer full ? + */ + if (idx == (psb->psb_entries-1)) { + DBprintk(("sampling buffer full\n")); + /* + * XXX: must reset buffer in blocking mode and lost notified + */ + return 1; + } + return 0; +} + +/* * main overflow processing routine. * it can be called from the interrupt path or explicitely during the context switch code * Return: * new value of pmc[0]. 
if 0x0 then unfreeze, else keep frozen */ -unsigned long -update_counters (struct task_struct *task, u64 pmc0, struct pt_regs *regs) +static unsigned long +pfm_overflow_handler(struct task_struct *task, u64 pmc0, struct pt_regs *regs) { - unsigned long mask, i, cnum; - struct thread_struct *th; + unsigned long mask; + struct thread_struct *t; pfm_context_t *ctx; - unsigned long bv = 0; + unsigned long old_val; + unsigned long ovfl_notify = 0UL, ovfl_pmds = 0UL; + int i; int my_cpu = smp_processor_id(); - int ret = 1, buffer_is_full = 0; - int ovfl_has_long_recovery, can_notify, need_reset_pmd16=0; + int ret = 1; struct siginfo si; - /* * It is never safe to access the task for which the overflow interrupt is destinated * using the current variable as the interrupt may occur in the middle of a context switch @@ -1388,233 +2403,151 @@ update_counters (struct task_struct *task, u64 pmc0, struct pt_regs *regs) */ if (task == NULL) { - DBprintk((" owners[%d]=NULL\n", my_cpu)); + DBprintk(("owners[%d]=NULL\n", my_cpu)); return 0x1; } - th = &task->thread; - ctx = th->pfm_context; + t = &task->thread; + ctx = task->thread.pfm_context; + + if (!ctx) { + printk("perfmon: Spurious overflow interrupt: process %d has no PFM context\n", + task->pid); + return 0; + } /* * XXX: debug test * Don't think this could happen given upfront tests */ - if ((th->flags & IA64_THREAD_PM_VALID) == 0 && ctx->ctx_fl_system == 0) { - printk("perfmon: Spurious overflow interrupt: process %d not using perfmon\n", task->pid); + if ((t->flags & IA64_THREAD_PM_VALID) == 0 && ctx->ctx_fl_system == 0) { + printk("perfmon: Spurious overflow interrupt: process %d not using perfmon\n", + task->pid); return 0x1; } - if (!ctx) { - printk("perfmon: Spurious overflow interrupt: process %d has no PFM context\n", task->pid); - return 0; - } - /* * sanity test. 
Should never happen */ - if ((pmc0 & 0x1 )== 0) { - printk("perfmon: pid %d pmc0=0x%lx assumption error for freeze bit\n", task->pid, pmc0); + if ((pmc0 & 0x1) == 0) { + printk("perfmon: pid %d pmc0=0x%lx assumption error for freeze bit\n", + task->pid, pmc0); return 0x0; } mask = pmc0 >> PMU_FIRST_COUNTER; - DBprintk(("pmc0=0x%lx pid=%d owner=%d iip=0x%lx, ctx is in %s mode used_pmds=0x%lx used_pmcs=0x%lx\n", - pmc0, task->pid, PMU_OWNER()->pid, regs->cr_iip, - CTX_OVFL_NOBLOCK(ctx) ? "NO-BLOCK" : "BLOCK", - ctx->ctx_used_pmds[0], - ctx->ctx_used_pmcs[0])); + DBprintk(("pmc0=0x%lx pid=%d iip=0x%lx, %s" + " mode used_pmds=0x%lx save_pmcs=0x%lx reload_pmcs=0x%lx\n", + pmc0, task->pid, (regs ? regs->cr_iip : 0), + CTX_OVFL_NOBLOCK(ctx) ? "nonblocking" : "blocking", + ctx->ctx_used_pmds[0], + ctx->ctx_saved_pmcs[0], + ctx->ctx_reload_pmcs[0])); /* - * XXX: need to record sample only when an EAR/BTB has overflowed + * First we update the virtual counters */ - if (CTX_HAS_SMPL(ctx)) { - pfm_smpl_buffer_desc_t *psb = ctx->ctx_smpl_buf; - unsigned long *e, m, idx=0; - perfmon_smpl_entry_t *h; - int j; + for (i = PMU_FIRST_COUNTER; mask ; i++, mask >>= 1) { - idx = ia64_fetch_and_add(1, &psb->psb_index); - DBprintk((" recording index=%ld entries=%ld\n", idx, psb->psb_entries)); + /* skip pmd which did not overflow */ + if ((mask & 0x1) == 0) continue; + + DBprintk(("PMD[%d] overflowed hw_pmd=0x%lx soft_pmd=0x%lx\n", + i, ia64_get_pmd(i), ctx->ctx_soft_pmds[i].val)); /* - * XXX: there is a small chance that we could run out on index before resetting - * but index is unsigned long, so it will take some time..... - * We use > instead of == because fetch_and_add() is off by one (see below) - * - * This case can happen in non-blocking mode or with multiple processes. - * For non-blocking, we need to reload and continue. + * Because we sometimes (EARS/BTB) reset to a specific value, we cannot simply use + * val to count the number of times we overflowed. 
Otherwise we would loose the + * current value in the PMD (which can be >0). So to make sure we don't loose + * the residual counts we set val to contain full 64bits value of the counter. */ - if (idx > psb->psb_entries) { - buffer_is_full = 1; - goto reload_pmds; - } - - /* first entry is really entry 0, not 1 caused by fetch_and_add */ - idx--; - - h = (perfmon_smpl_entry_t *)(((char *)psb->psb_addr) + idx*(psb->psb_entry_size)); + old_val = ctx->ctx_soft_pmds[i].val; + ctx->ctx_soft_pmds[i].val = 1 + pmu_conf.perf_ovfl_val + pfm_read_soft_counter(ctx, i); - h->pid = task->pid; - h->cpu = my_cpu; - h->rate = 0; - h->ip = regs ? regs->cr_iip : 0x0; /* where did the fault happened */ - h->regs = mask; /* which registers overflowed */ - /* guaranteed to monotonically increase on each cpu */ - h->stamp = perfmon_get_stamp(); + DBprintk(("soft_pmd[%d].val=0x%lx old_val=0x%lx pmd=0x%lx\n", + i, ctx->ctx_soft_pmds[i].val, old_val, + ia64_get_pmd(i) & pmu_conf.perf_ovfl_val)); - e = (unsigned long *)(h+1); - - /* - * selectively store PMDs in increasing index number - */ - for (j=0, m = ctx->ctx_smpl_regs; m; m >>=1, j++) { - if (m & 0x1) { - if (PMD_IS_COUNTER(j)) - *e = ctx->ctx_pmds[j-PMU_FIRST_COUNTER].val - + (ia64_get_pmd(j) & pmu_conf.perf_ovfl_val); - else { - *e = ia64_get_pmd(j); /* slow */ - } - DBprintk((" e=%p pmd%d =0x%lx\n", (void *)e, j, *e)); - e++; - } - } /* - * make the new entry visible to user, needs to be atomic + * now that we have extracted the hardware counter, we can clear it to ensure + * that a subsequent PFM_READ_PMDS will not include it again. */ - ia64_fetch_and_add(1, &psb->psb_hdr->hdr_count); + ia64_set_pmd(i, 0UL); - DBprintk((" index=%ld entries=%ld hdr_count=%ld\n", idx, psb->psb_entries, psb->psb_hdr->hdr_count)); - /* - * sampling buffer full ? 
+ /* + * check for overflow condition */ - if (idx == (psb->psb_entries-1)) { - /* - * will cause notification, cannot be 0 - */ - bv = mask << PMU_FIRST_COUNTER; + if (old_val > ctx->ctx_soft_pmds[i].val) { - buffer_is_full = 1; + ovfl_pmds |= 1UL << i; - DBprintk((" sampling buffer full must notify bv=0x%lx\n", bv)); + DBprintk(("soft_pmd[%d] overflowed flags=0x%x, ovfl=0x%lx\n", i, ctx->ctx_soft_pmds[i].flags, ovfl_pmds)); - /* - * we do not reload here, when context is blocking - */ - if (!CTX_OVFL_NOBLOCK(ctx)) goto no_reload; - - /* - * here, we have a full buffer but we are in non-blocking mode - * so we need to reload overflowed PMDs with sampling reset values - * and restart right away. - */ + if (PMC_OVFL_NOTIFY(ctx, i)) { + ovfl_notify |= 1UL << i; + } } - /* FALL THROUGH */ } -reload_pmds: /* - * in the case of a non-blocking context, we reload - * with the ovfl_rval when no user notification is taking place (short recovery) - * otherwise when the buffer is full which requires user interaction) then we use - * smpl_rval which is the long_recovery path (disturbance introduce by user execution). + * check for sampling buffer * - * XXX: implies that when buffer is full then there is always notification. + * if present, record sample. We propagate notification ONLY when buffer + * becomes full. */ - ovfl_has_long_recovery = CTX_OVFL_NOBLOCK(ctx) && buffer_is_full; - - /* - * XXX: CTX_HAS_SMPL() should really be something like CTX_HAS_SMPL() and is activated,i.e., - * one of the PMC is configured for EAR/BTB. - * - * When sampling, we can only notify when the sampling buffer is full. 
- */ - can_notify = CTX_HAS_SMPL(ctx) == 0 && ctx->ctx_notify_task; - - DBprintk((" ovfl_has_long_recovery=%d can_notify=%d\n", ovfl_has_long_recovery, can_notify)); - - for (i = 0, cnum = PMU_FIRST_COUNTER; mask ; cnum++, i++, mask >>= 1) { - - if ((mask & 0x1) == 0) continue; - - DBprintk((" PMD[%ld] overflowed pmd=0x%lx pmod.val=0x%lx\n", cnum, ia64_get_pmd(cnum), ctx->ctx_pmds[i].val)); - - /* - * Because we sometimes (EARS/BTB) reset to a specific value, we cannot simply use - * val to count the number of times we overflowed. Otherwise we would loose the current value - * in the PMD (which can be >0). So to make sure we don't loose - * the residual counts we set val to contain full 64bits value of the counter. - * - * XXX: is this needed for EARS/BTB ? - */ - ctx->ctx_pmds[i].val += 1 + pmu_conf.perf_ovfl_val - + (ia64_get_pmd(cnum) & pmu_conf.perf_ovfl_val); /* slow */ - - DBprintk((" pmod[%ld].val=0x%lx pmd=0x%lx\n", i, ctx->ctx_pmds[i].val, ia64_get_pmd(cnum)&pmu_conf.perf_ovfl_val)); - - if (can_notify && PMD_OVFL_NOTIFY(ctx, i)) { - DBprintk((" CPU%d should notify task %p with signal %d\n", my_cpu, ctx->ctx_notify_task, ctx->ctx_notify_sig)); - bv |= 1 << i; - } else { - DBprintk((" CPU%d PMD[%ld] overflow, no notification\n", my_cpu, cnum)); + if(CTX_HAS_SMPL(ctx)) { + ret = pfm_record_sample(task, ctx, ovfl_pmds, regs); + if (ret == 1) { /* - * In case no notification is requested, we reload the reset value right away - * otherwise we wait until the notify_pid process has been called and has - * has finished processing data. Check out pfm_overflow_notify() + * Sampling buffer became full + * If no notification was requested, then we reset buffer index + * and reset registers (done below) and resume.
+ * If notification requested, then defer reset until pfm_restart() */ - - /* writes to upper part are ignored, so this is safe */ - if (ovfl_has_long_recovery) { - DBprintk((" CPU%d PMD[%ld] reload with smpl_val=%lx\n", my_cpu, cnum,ctx->ctx_pmds[i].smpl_rval)); - ia64_set_pmd(cnum, ctx->ctx_pmds[i].smpl_rval); - } else { - DBprintk((" CPU%d PMD[%ld] reload with ovfl_val=%lx\n", my_cpu, cnum,ctx->ctx_pmds[i].smpl_rval)); - ia64_set_pmd(cnum, ctx->ctx_pmds[i].ovfl_rval); + if (ovfl_notify == 0UL) { + ctx->ctx_psb->psb_hdr->hdr_count = 0UL; + ctx->ctx_psb->psb_index = 0UL; } + } else { + /* + * sample recorded in buffer, no need to notify user + */ + ovfl_notify = 0UL; } - if (cnum == ctx->ctx_btb_counter) need_reset_pmd16=1; } - /* - * In case of BTB overflow we need to reset the BTB index. - */ - if (need_reset_pmd16) { - DBprintk(("reset PMD16\n")); - ia64_set_pmd(16, 0); - } - -no_reload: /* - * some counters overflowed, but they did not require - * user notification, so after having reloaded them above - * we simply restart + * No overflow requiring a user level notification */ - if (!bv) return 0x0; + if (ovfl_notify == 0UL) { + pfm_reset_regs(ctx, &ovfl_pmds, PFM_RELOAD_SHORT_RESET); + return 0x0; + } - ctx->ctx_ovfl_regs = bv; /* keep track of what to reset when unblocking */ - /* - * Now we know that: - * - we have some counters which overflowed (contains in bv) - * - someone has asked to be notified on overflow. + /* + * keep track of what to reset when unblocking */ + ctx->ctx_ovfl_regs[0] = ovfl_pmds; - /* - * If the notification task is still present, then notify_task is non - * null. It is clean by that task if it ever exits before we do. + * we have come to this point because there was an overflow and that notification + * was requested. The notify_task may have disappeared, in which case notify_task + * is NULL. 
*/ - if (ctx->ctx_notify_task) { si.si_errno = 0; si.si_addr = NULL; si.si_pid = task->pid; /* who is sending */ - si.si_signo = ctx->ctx_notify_sig; /* is SIGPROF */ - si.si_code = PROF_OVFL; /* goes to user */ - si.si_pfm_ovfl = bv; - - + si.si_signo = SIGPROF; + si.si_code = PROF_OVFL; /* indicates a perfmon SIGPROF signal */ + /* + * Shift the bitvector such that the user sees bit 4 for PMD4 and so on. + * We only use smpl_ovfl[0] for now. It should be fine for quite a while + * until we have more than 61 PMD available. + */ + si.si_pfm_ovfl[0] = ovfl_notify; /* * when the target of the signal is not ourself, we have to be more @@ -1626,15 +2559,29 @@ no_reload: if (ctx->ctx_notify_task != current) { /* * grab the notification lock for this task + * This guarantees that the sequence: test + send_signal + * is atomic with regards to the ctx_notify_task field. + * + * We need a spinlock and not just an atomic variable for this. + * */ - spin_lock(&ctx->ctx_notify_lock); + spin_lock(&ctx->ctx_lock); /* * now notify_task cannot be modified until we're done * if NULL, they it got modified while we were in the handler */ if (ctx->ctx_notify_task == NULL) { - spin_unlock(&ctx->ctx_notify_lock); + + spin_unlock(&ctx->ctx_lock); + + /* + * If we've lost the notified task, then we will run + * to completion wbut keep the PMU frozen. Results + * will be incorrect anyway. We do not kill task + * to leave it possible to attach perfmon context + * to already running task. + */ goto lost_notify; } /* @@ -1648,20 +2595,23 @@ no_reload: * necessarily go to the signal handler (if any) when it goes back to * user mode. 
*/ - DBprintk((" %d sending %d notification to %d\n", task->pid, si.si_signo, ctx->ctx_notify_task->pid)); + DBprintk(("[%d] sending notification to [%d]\n", + task->pid, ctx->ctx_notify_task->pid)); /* * this call is safe in an interrupt handler, so does read_lock() on tasklist_lock */ - ret = send_sig_info(ctx->ctx_notify_sig, &si, ctx->ctx_notify_task); - if (ret != 0) printk(" send_sig_info(process %d, SIGPROF)=%d\n", ctx->ctx_notify_task->pid, ret); + ret = send_sig_info(SIGPROF, &si, ctx->ctx_notify_task); + if (ret != 0) + printk("send_sig_info(process %d, SIGPROF)=%d\n", + ctx->ctx_notify_task->pid, ret); /* * now undo the protections in order */ if (ctx->ctx_notify_task != current) { read_unlock(&tasklist_lock); - spin_unlock(&ctx->ctx_notify_lock); + spin_unlock(&ctx->ctx_lock); } /* @@ -1678,35 +2628,41 @@ no_reload: * before, changing it to NULL will still maintain this invariant. * Of course, when it is equal to current it cannot change at this point. */ - if (!CTX_OVFL_NOBLOCK(ctx) && ctx->ctx_notify_task != current) { - th->pfm_must_block = 1; /* will cause blocking */ + DBprintk(("block=%d notify [%d] current [%d]\n", + ctx->ctx_fl_block, + ctx->ctx_notify_task ? ctx->ctx_notify_task->pid: -1, + current->pid )); + + if (!CTX_OVFL_NOBLOCK(ctx) && ctx->ctx_notify_task != task) { + t->pfm_ovfl_block_reset = 1; /* will cause blocking */ } } else { -lost_notify: - DBprintk((" notification task has disappeared !\n")); +lost_notify: /* XXX: more to do here, to convert to non-blocking (reset values) */ + + DBprintk(("notification task has disappeared !\n")); /* - * for a non-blocking context, we make sure we do not fall into the pfm_overflow_notify() - * trap. Also in the case of a blocking context with lost notify process, then we do not - * want to block either (even though it is interruptible). In this case, the PMU will be kept - * frozen and the process will run to completion without monitoring enabled. 
+ * for a non-blocking context, we make sure we do not fall into the + * pfm_overflow_notify() trap. Also in the case of a blocking context with lost + * notify process, then we do not want to block either (even though it is + * interruptible). In this case, the PMU will be kept frozen and the process will + * run to completion without monitoring enabled. * * Of course, we cannot loose notify process when self-monitoring. */ - th->pfm_must_block = 0; + t->pfm_ovfl_block_reset = 0; } /* - * if we block, we keep the PMU frozen. If non-blocking we restart. - * in the case of non-blocking were the notify process is lost, we also - * restart. + * If notification was successful, then we rely on the pfm_restart() + * call to unfreeze and reset (in both blocking or non-blocking mode). + * + * If notification failed, then we will keep the PMU frozen and run + * the task to completion */ - if (!CTX_OVFL_NOBLOCK(ctx)) - ctx->ctx_fl_frozen = 1; - else - ctx->ctx_fl_frozen = 0; + ctx->ctx_fl_frozen = 1; - DBprintk((" reload pmc0=0x%x must_block=%ld\n", - ctx->ctx_fl_frozen ? 0x1 : 0x0, th->pfm_must_block)); + DBprintk(("reload pmc0=0x%x must_block=%ld\n", + ctx->ctx_fl_frozen ? 0x1 : 0x0, t->pfm_ovfl_block_reset)); return ctx->ctx_fl_frozen ? 
0x1 : 0x0; } @@ -1715,29 +2671,40 @@ static void perfmon_interrupt (int irq, void *arg, struct pt_regs *regs) { u64 pmc0; - struct task_struct *ta; + struct task_struct *task; - pmc0 = ia64_get_pmc(0); /* slow */ + pfm_ovfl_intr_count++; + + /* + * srlz.d done before arriving here + * + * This is slow + */ + pmc0 = ia64_get_pmc(0); /* * if we have some pending bits set * assumes : if any PM[0].bit[63-1] is set, then PMC[0].fr = 1 */ - if ((pmc0 & ~0x1) && (ta=PMU_OWNER())) { - - /* assumes, PMC[0].fr = 1 at this point */ - pmc0 = update_counters(ta, pmc0, regs); + if ((pmc0 & ~0x1UL)!=0UL && (task=PMU_OWNER())!= NULL) { - /* - * if pmu_frozen = 0 - * pmc0 = 0 and we resume monitoring right away - * else - * pmc0 = 0x1 frozen but all pending bits are cleared + /* + * assumes, PMC[0].fr = 1 at this point + * + * XXX: change prototype to pass &pmc0 */ - ia64_set_pmc(0, pmc0); - ia64_srlz_d(); + pmc0 = pfm_overflow_handler(task, pmc0, regs); + + /* we never explicitly freeze PMU here */ + if (pmc0 == 0) { + ia64_set_pmc(0, 0); + ia64_srlz_d(); + } } else { - printk("perfmon: Spurious PMU overflow interrupt: pmc0=0x%lx owner=%p\n", pmc0, (void *)PMU_OWNER()); + pfm_spurious_ovfl_intr_count++; + + DBprintk(("perfmon: Spurious PMU overflow interrupt on CPU%d: pmc0=0x%lx owner=%p\n", + smp_processor_id(), pmc0, (void *)PMU_OWNER())); } } @@ -1745,14 +2712,37 @@ perfmon_interrupt (int irq, void *arg, struct pt_regs *regs) static int perfmon_proc_info(char *page) { +#ifdef CONFIG_SMP +#define cpu_is_online(i) (cpu_online_map & (1UL << i)) +#else +#define cpu_is_online(i) 1 +#endif char *p = page; u64 pmc0 = ia64_get_pmc(0); int i; - p += sprintf(p, "CPU%d.pmc[0]=%lx\nPerfmon debug: %s\n", smp_processor_id(), pmc0, pfm_debug ? "On" : "Off"); - p += sprintf(p, "proc_sessions=%lu sys_sessions=%lu\n", - pfs_info.pfs_proc_sessions, - pfs_info.pfs_sys_session); + p += sprintf(p, "perfmon enabled: %s\n", pmu_conf.pfm_is_disabled ? 
"No": "Yes"); + + p += sprintf(p, "monitors_pmcs0]=0x%lx\n", pmu_conf.monitor_pmcs[0]); + p += sprintf(p, "counter_pmcds[0]=0x%lx\n", pmu_conf.counter_pmds[0]); + p += sprintf(p, "overflow interrupts=%lu\n", pfm_ovfl_intr_count); + p += sprintf(p, "spurious overflow interrupts=%lu\n", pfm_spurious_ovfl_intr_count); + p += sprintf(p, "recorded samples=%lu\n", pfm_recorded_samples_count); + + p += sprintf(p, "CPU%d.pmc[0]=%lx\nPerfmon debug: %s\n", + smp_processor_id(), pmc0, pfm_debug ? "On" : "Off"); + + p += sprintf(p, "CPU%d cpu_data.pfm_syst_wide=%d cpu_data.dcr_pp=%d\n", + smp_processor_id(), local_cpu_data->pfm_syst_wide, local_cpu_data->pfm_dcr_pp); + + LOCK_PFS(); + p += sprintf(p, "proc_sessions=%lu\nsys_sessions=%lu\nsys_use_dbregs=%lu\nptrace_use_dbregs=%lu\n", + pfm_sessions.pfs_task_sessions, + pfm_sessions.pfs_sys_sessions, + pfm_sessions.pfs_sys_use_dbregs, + pfm_sessions.pfs_ptrace_use_dbregs); + + UNLOCK_PFS(); for(i=0; i < NR_CPUS; i++) { if (cpu_is_online(i)) { @@ -1761,10 +2751,11 @@ perfmon_proc_info(char *page) pmu_owners[i].owner ? 
pmu_owners[i].owner->pid: -1); } } + return p - page; } -/* for debug only */ +/* /proc interface, for debug only */ static int perfmon_read_entry(char *page, char **start, off_t off, int count, int *eof, void *data) { @@ -1781,85 +2772,27 @@ perfmon_read_entry(char *page, char **start, off_t off, int count, int *eof, voi return len; } -static struct irqaction perfmon_irqaction = { - handler: perfmon_interrupt, - flags: SA_INTERRUPT, - name: "perfmon" -}; - -void __init -perfmon_init (void) +void +pfm_syst_wide_update_task(struct task_struct *task, int mode) { - pal_perf_mon_info_u_t pm_info; - s64 status; - - register_percpu_irq(IA64_PERFMON_VECTOR, &perfmon_irqaction); - - ia64_set_pmv(IA64_PERFMON_VECTOR); - ia64_srlz_d(); - - pmu_conf.pfm_is_disabled = 1; - - printk("perfmon: version %s (sampling format v%d)\n", PFM_VERSION, PFM_SMPL_HDR_VERSION); - printk("perfmon: Interrupt vectored to %u\n", IA64_PERFMON_VECTOR); - - if ((status=ia64_pal_perf_mon_info(pmu_conf.impl_regs, &pm_info)) != 0) { - printk("perfmon: PAL call failed (%ld)\n", status); - return; - } - pmu_conf.perf_ovfl_val = (1L << pm_info.pal_perf_mon_info_s.width) - 1; - pmu_conf.max_counters = pm_info.pal_perf_mon_info_s.generic; - pmu_conf.num_pmcs = find_num_pm_regs(pmu_conf.impl_regs); - pmu_conf.num_pmds = find_num_pm_regs(&pmu_conf.impl_regs[4]); + struct pt_regs *regs = (struct pt_regs *)((unsigned long) task + IA64_STK_OFFSET); - printk("perfmon: %d bits counters (max value 0x%lx)\n", pm_info.pal_perf_mon_info_s.width, pmu_conf.perf_ovfl_val); - printk("perfmon: %ld PMC/PMD pairs, %ld PMCs, %ld PMDs\n", pmu_conf.max_counters, pmu_conf.num_pmcs, pmu_conf.num_pmds); - - /* sanity check */ - if (pmu_conf.num_pmds >= IA64_NUM_PMD_REGS || pmu_conf.num_pmcs >= IA64_NUM_PMC_REGS) { - printk(KERN_ERR "perfmon: ERROR not enough PMC/PMD storage in kernel, perfmon is DISABLED\n"); - return; /* no need to continue anyway */ - } - /* we are all set */ - pmu_conf.pfm_is_disabled = 0; + regs--; /* - * 
Insert the tasklet in the list. - * It is still disabled at this point, so it won't run - printk(__FUNCTION__" tasklet is %p state=%d, count=%d\n", &perfmon_tasklet, perfmon_tasklet.state, perfmon_tasklet.count); - */ - - /* - * for now here for debug purposes + * propagate the value of the dcr_pp bit to the psr */ - perfmon_dir = create_proc_read_entry ("perfmon", 0, 0, perfmon_read_entry, NULL); -} - -void -perfmon_init_percpu (void) -{ - ia64_set_pmv(IA64_PERFMON_VECTOR); - ia64_srlz_d(); + ia64_psr(regs)->pp = mode ? local_cpu_data->pfm_dcr_pp : 0; } void -pfm_save_regs (struct task_struct *ta) +pfm_save_regs (struct task_struct *task) { - struct task_struct *owner; pfm_context_t *ctx; - struct thread_struct *t; - u64 pmc0, psr; - unsigned long mask; - int i; + u64 psr; - t = &ta->thread; - ctx = ta->thread.pfm_context; + ctx = task->thread.pfm_context; - /* - * We must make sure that we don't loose any potential overflow - * interrupt while saving PMU context. In this code, external - * interrupts are always enabled. - */ /* * save current PSR: needed because we modify it @@ -1868,66 +2801,58 @@ pfm_save_regs (struct task_struct *ta) /* * stop monitoring: - * This is the only way to stop monitoring without destroying overflow - * information in PMC[0]. - * This is the last instruction which can cause overflow when monitoring - * in kernel. - * By now, we could still have an overflow interrupt in-flight. + * This is the last instruction which can generate an overflow + * + * We do not need to set psr.sp because, it is irrelevant in kernel. 
+ * It will be restored from ipsr when going back to user level */ - __asm__ __volatile__ ("rsm psr.up|psr.pp;;"::: "memory"); + __asm__ __volatile__ ("rum psr.up;;"::: "memory"); - /* - * Mark the PMU as not owned - * This will cause the interrupt handler to do nothing in case an overflow - * interrupt was in-flight - * This also guarantees that pmc0 will contain the final state - * It virtually gives us full control over overflow processing from that point - * on. - * It must be an atomic operation. - */ - owner = PMU_OWNER(); - SET_PMU_OWNER(NULL); + ctx->ctx_saved_psr = psr; - /* - * read current overflow status: - * - * we are guaranteed to read the final stable state - */ - ia64_srlz_d(); - pmc0 = ia64_get_pmc(0); /* slow */ + //ctx->ctx_last_cpu = smp_processor_id(); - /* - * freeze PMU: - * - * This destroys the overflow information. This is required to make sure - * next process does not start with monitoring on if not requested - */ - ia64_set_pmc(0, 1); +} - /* - * Check for overflow bits and proceed manually if needed +static void +pfm_lazy_save_regs (struct task_struct *task) +{ + pfm_context_t *ctx; + struct thread_struct *t; + unsigned long mask; + int i; + + DBprintk(("on [%d] by [%d]\n", task->pid, current->pid)); + + t = &task->thread; + ctx = task->thread.pfm_context; + +#ifdef CONFIG_SMP + /* + * announce we are saving this PMU state + * This will cause other CPU, to wait until we're done + * before using the context.h * - * It is safe to call the interrupt handler now because it does - * not try to block the task right away. Instead it will set a - * flag and let the task proceed. The blocking will only occur - * next time the task exits from the kernel. 
+ * must be an atomic operation */ - if (pmc0 & ~0x1) { - update_counters(owner, pmc0, NULL); - /* we will save the updated version of pmc0 */ - } - /* - * restore PSR for context switch to save - */ - __asm__ __volatile__ ("mov psr.l=%0;; srlz.i;;"::"r"(psr): "memory"); + atomic_set(&ctx->ctx_saving_in_progress, 1); + + /* + * if owner is NULL, it means that the other CPU won the race + * and the IPI has caused the context to be saved in pfm_handle_fectch_regs() + * instead of here. We have nothing to do + * + * note that this is safe, because the other CPU NEVER modifies saving_in_progress. + */ + if (PMU_OWNER() == NULL) goto do_nothing; +#endif /* - * we do not save registers if we can do lazy + * do not own the PMU */ - if (PFM_CAN_DO_LAZY()) { - SET_PMU_OWNER(owner); - return; - } + SET_PMU_OWNER(NULL); + + ia64_srlz_d(); /* * XXX needs further optimization. @@ -1937,30 +2862,73 @@ pfm_save_regs (struct task_struct *ta) for (i=0; mask; i++, mask>>=1) { if (mask & 0x1) t->pmd[i] =ia64_get_pmd(i); } - - /* skip PMC[0], we handle it separately */ - mask = ctx->ctx_used_pmcs[0]>>1; - for (i=1; mask; i++, mask>>=1) { + /* + * XXX: simplify to pmc0 only + */ + mask = ctx->ctx_saved_pmcs[0]; + for (i=0; mask; i++, mask>>=1) { if (mask & 0x1) t->pmc[i] = ia64_get_pmc(i); } + + /* not owned by this CPU */ + atomic_set(&ctx->ctx_last_cpu, -1); + +do_nothing: /* - * Throughout this code we could have gotten an overflow interrupt. It is transformed - * into a spurious interrupt as soon as we give up pmu ownership. 
+ * declare we are done saving this context + * + * must be an atomic operation */ + atomic_set(&ctx->ctx_saving_in_progress,0); + } -static void -pfm_lazy_save_regs (struct task_struct *ta) +#ifdef CONFIG_SMP +/* + * Handles request coming from other CPUs + */ +static void +pfm_handle_fetch_regs(void *info) { - pfm_context_t *ctx; + pfm_smp_ipi_arg_t *arg = info; struct thread_struct *t; + pfm_context_t *ctx; unsigned long mask; int i; - DBprintk((" on [%d] by [%d]\n", ta->pid, current->pid)); + ctx = arg->task->thread.pfm_context; + t = &arg->task->thread; + + DBprintk(("task=%d owner=%d saving=%d\n", + arg->task->pid, + PMU_OWNER() ? PMU_OWNER()->pid: -1, + atomic_read(&ctx->ctx_saving_in_progress))); + + /* must wait if saving was interrupted */ + if (atomic_read(&ctx->ctx_saving_in_progress)) { + arg->retval = 1; + return; + } + + /* can proceed, done with context */ + if (PMU_OWNER() != arg->task) { + arg->retval = 0; + return; + } + + DBprintk(("saving state for [%d] save_pmcs=0x%lx all_pmcs=0x%lx used_pmds=0x%lx\n", + arg->task->pid, + ctx->ctx_saved_pmcs[0], + ctx->ctx_reload_pmcs[0], + ctx->ctx_used_pmds[0])); + + /* + * XXX: will be replaced with pure assembly call + */ + SET_PMU_OWNER(NULL); + + ia64_srlz_d(); - t = &ta->thread; - ctx = ta->thread.pfm_context; /* * XXX needs further optimization. * Also must take holes into account @@ -1970,84 +2938,338 @@ pfm_lazy_save_regs (struct task_struct *ta) if (mask & 0x1) t->pmd[i] =ia64_get_pmd(i); } - /* skip PMC[0], we handle it separately */ - mask = ctx->ctx_used_pmcs[0]>>1; - for (i=1; mask; i++, mask>>=1) { + mask = ctx->ctx_saved_pmcs[0]; + for (i=0; mask; i++, mask>>=1) { if (mask & 0x1) t->pmc[i] = ia64_get_pmc(i); } - SET_PMU_OWNER(NULL); + /* not owned by this CPU */ + atomic_set(&ctx->ctx_last_cpu, -1); + + /* can proceed */ + arg->retval = 0; } +/* + * Function call to fetch PMU state from another CPU identified by 'cpu'. 
+ * If the context is being saved on the remote CPU, then we busy wait until + * the saving is done and then we return. In this case, non IPI is sent. + * Otherwise, we send an IPI to the remote CPU, potentially interrupting + * pfm_lazy_save_regs() over there. + * + * If the retval==1, then it means that we interrupted remote save and that we must + * wait until the saving is over before proceeding. + * Otherwise, we did the saving on the remote CPU, and it was done by the time we got there. + * in either case, we can proceed. + */ +static void +pfm_fetch_regs(int cpu, struct task_struct *task, pfm_context_t *ctx) +{ + pfm_smp_ipi_arg_t arg; + int ret; + + arg.task = task; + arg.retval = -1; + + if (atomic_read(&ctx->ctx_saving_in_progress)) { + DBprintk(("no IPI, must wait for [%d] to be saved on [%d]\n", task->pid, cpu)); + + /* busy wait */ + while (atomic_read(&ctx->ctx_saving_in_progress)); + return; + } + DBprintk(("calling CPU %d from CPU %d\n", cpu, smp_processor_id())); + + if (cpu == -1) { + printk("refusing to use -1 for [%d]\n", task->pid); + return; + } + + /* will send IPI to other CPU and wait for completion of remote call */ + if ((ret=smp_call_function_single(cpu, pfm_handle_fetch_regs, &arg, 0, 1))) { + printk("perfmon: remote CPU call from %d to %d error %d\n", smp_processor_id(), cpu, ret); + return; + } + /* + * we must wait until saving is over on the other CPU + * This is the case, where we interrupted the saving which started just at the time we sent the + * IPI. 
+ */ + if (arg.retval == 1) { + DBprintk(("must wait for [%d] to be saved on [%d]\n", task->pid, cpu)); + while (atomic_read(&ctx->ctx_saving_in_progress)); + DBprintk(("done saving for [%d] on [%d]\n", task->pid, cpu)); + } +} +#endif /* CONFIG_SMP */ + void -pfm_load_regs (struct task_struct *ta) +pfm_load_regs (struct task_struct *task) { - struct thread_struct *t = &ta->thread; - pfm_context_t *ctx = ta->thread.pfm_context; + struct thread_struct *t; + pfm_context_t *ctx; struct task_struct *owner; unsigned long mask; - int i; + u64 psr; + int i, cpu; owner = PMU_OWNER(); - if (owner == ta) goto skip_restore; + ctx = task->thread.pfm_context; + + /* + * if we were the last user, then nothing to do except restore psr + */ + if (owner == task) { + if (atomic_read(&ctx->ctx_last_cpu) != smp_processor_id()) + DBprintk(("invalid last_cpu=%d for [%d]\n", + atomic_read(&ctx->ctx_last_cpu), task->pid)); + + psr = ctx->ctx_saved_psr; + __asm__ __volatile__ ("mov psr.l=%0;; srlz.i;;"::"r"(psr): "memory"); + + return; + } + DBprintk(("load_regs: must reload for [%d] owner=%d\n", + task->pid, owner ? owner->pid : -1 )); + /* + * someone else is still using the PMU, first push it out and + * then we'll be able to install our stuff ! + */ if (owner) pfm_lazy_save_regs(owner); - SET_PMU_OWNER(ta); +#ifdef CONFIG_SMP + /* + * check if context on another CPU (-1 means saved) + * We MUST use the variable, as last_cpu may change behind our + * back. If it changes to -1 (not on a CPU anymore), then in cpu + * we have the last CPU the context was on. We may be sending the + * IPI for nothing, but we have no way of verifying this. 
+ */ + cpu = atomic_read(&ctx->ctx_last_cpu); + if (cpu != -1) { + pfm_fetch_regs(cpu, task, ctx); + } +#endif + t = &task->thread; + /* + * XXX: will be replaced by assembly routine + * We clear all unused PMDs to avoid leaking information + */ mask = ctx->ctx_used_pmds[0]; for (i=0; mask; i++, mask>>=1) { - if (mask & 0x1) ia64_set_pmd(i, t->pmd[i]); + if (mask & 0x1) + ia64_set_pmd(i, t->pmd[i]); + else + ia64_set_pmd(i, 0UL); } + /* XXX: will need to clear all unused pmd, for security */ - /* skip PMC[0] to avoid side effects */ - mask = ctx->ctx_used_pmcs[0]>>1; + /* + * skip pmc[0] to avoid side-effects, + * all PMCs are systematically reloaded, unused get default value + * to avoid picking up stale configuration + */ + mask = ctx->ctx_reload_pmcs[0]>>1; for (i=1; mask; i++, mask>>=1) { if (mask & 0x1) ia64_set_pmc(i, t->pmc[i]); } -skip_restore: + + /* + * restore debug registers when used for range restrictions. + * We must restore the unused registers to avoid picking up + * stale information. 
+ */ + mask = ctx->ctx_used_ibrs[0]; + for (i=0; mask; i++, mask>>=1) { + if (mask & 0x1) + ia64_set_ibr(i, t->ibr[i]); + else + ia64_set_ibr(i, 0UL); + } + + mask = ctx->ctx_used_dbrs[0]; + for (i=0; mask; i++, mask>>=1) { + if (mask & 0x1) + ia64_set_dbr(i, t->dbr[i]); + else + ia64_set_dbr(i, 0UL); + } + + if (t->pmc[0] & ~0x1) { + ia64_srlz_d(); + pfm_overflow_handler(task, t->pmc[0], NULL); + } + /* - * unfreeze only when possible + * fl_frozen==1 when we are in blocking mode waiting for restart */ if (ctx->ctx_fl_frozen == 0) { ia64_set_pmc(0, 0); ia64_srlz_d(); - /* place where we potentially (kernel level) start monitoring again */ } + atomic_set(&ctx->ctx_last_cpu, smp_processor_id()); + + SET_PMU_OWNER(task); + + /* + * restore the psr we changed in pfm_save_regs() + */ + psr = ctx->ctx_saved_psr; + __asm__ __volatile__ ("mov psr.l=%0;; srlz.i;;"::"r"(psr): "memory"); + } +static void +pfm_model_specific_reset_pmu(struct task_struct *task) +{ + int i; + +#ifdef CONFIG_ITANIUM + /* opcode matcher set to all 1s */ + ia64_set_pmc(8,~0UL); + ia64_set_pmc(9,~0UL); + + /* I-EAR config cleared, plm=0 */ + ia64_set_pmc(10,0UL); + + /* D-EAR config cleared, PMC[11].pt must be 1 */ + ia64_set_pmc(11,1UL << 28); + + /* BTB config. plm=0 */ + ia64_set_pmc(12,0UL); + + /* Instruction address range, PMC[13].ta must be 1 */ + ia64_set_pmc(13,1UL); + + /* + * Clear all PMDs + * + * XXX: may be good enough to rely on the impl_regs to generalize + * this. 
+ */ + for(i = 0; i< 18 ; i++) { + ia64_set_pmd(i,0UL); + } +#endif +} + +/* + * XXX: this routine is not very portable for PMCs + * XXX: make this routine able to work with non current context + */ +static void +ia64_reset_pmu(struct task_struct *task) +{ + pfm_context_t *ctx = task->thread.pfm_context; + struct thread_struct *t = &task->thread; + unsigned long mask; + int i; + + if (task != current) { + printk("perfmon: invalid task in ia64_reset_pmu()\n"); + return; + } + + /* PMU is frozen, no pending overflow bits */ + ia64_set_pmc(0,1); + + /* + * Let's first do the architected initializations + */ + + /* clear counters */ + ia64_set_pmd(4,0UL); + ia64_set_pmd(5,0UL); + ia64_set_pmd(6,0UL); + ia64_set_pmd(7,0UL); + + /* clear overflow status bits */ + ia64_set_pmc(1,0UL); + ia64_set_pmc(2,0UL); + ia64_set_pmc(3,0UL); + + /* clear counting monitor configuration */ + ia64_set_pmc(4,0UL); + ia64_set_pmc(5,0UL); + ia64_set_pmc(6,0UL); + ia64_set_pmc(7,0UL); + + /* + * Now let's do the CPU model specific initializations + */ + pfm_model_specific_reset_pmu(task); + + /* + * On context switched restore, we must restore ALL pmc even + * when they are not actively used by the task. In UP, the incoming process + * may otherwise pick up left over PMC state from the previous process. + * As opposed to PMD, stale PMC can cause harm to the incoming + * process because they may change what is being measured. + * Therefore, we must systematically reinstall the entire + * PMC state. In SMP, the same thing is possible on the + * same CPU but also on between 2 CPUs. + * + * There is unfortunately no easy way to avoid this problem + * on either UP or SMP. This definitively slows down the + * pfm_load_regs(). + */ + + /* + * We must include all the PMC in this mask to make sure we don't + * see any side effect of the stale state, such as opcode matching + * or range restrictions, for instance. 
+ */ + ctx->ctx_reload_pmcs[0] = pmu_conf.impl_regs[0]; + + /* + * make sure we pick up whatever values were installed + * for the CPU model specific reset. We also include + * the architected PMC (pmc4-pmc7) + * + * This step is required in order to restore the correct values in PMC when + * the task is switched out and back in just after the PFM_ENABLE. + */ + mask = pmu_conf.impl_regs[0]; + for (i=0; mask; i++, mask>>=1) { + if (mask & 0x1) t->pmc[i] = ia64_get_pmc(i); + } + + /* + * useful in case of re-enable after disable + */ + ctx->ctx_used_pmds[0] = 0UL; + ctx->ctx_used_ibrs[0] = 0UL; + ctx->ctx_used_dbrs[0] = 0UL; + + ia64_srlz_d(); +} /* * This function is called when a thread exits (from exit_thread()). * This is a simplified pfm_save_regs() that simply flushes the current * register state into the save area taking into account any pending - * overflow. This time no notification is sent because the taks is dying + * overflow. This time no notification is sent because the task is dying * anyway. The inline processing of overflows avoids loosing some counts. * The PMU is frozen on exit from this call and is to never be reenabled * again for this task. + * */ void -pfm_flush_regs (struct task_struct *ta) +pfm_flush_regs (struct task_struct *task) { pfm_context_t *ctx; - u64 pmc0, psr, mask; - int i,j; + u64 pmc0; + unsigned long mask, mask2, val; + int i; - if (ta == NULL) { - panic(__FUNCTION__" task is NULL\n"); - } - ctx = ta->thread.pfm_context; - if (ctx == NULL) { - panic(__FUNCTION__" no PFM ctx is NULL\n"); - } - /* - * We must make sure that we don't loose any potential overflow - * interrupt while saving PMU context. In this code, external - * interrupts are always enabled. 
- */ + ctx = task->thread.pfm_context; - /* - * save current PSR: needed because we modify it + if (ctx == NULL) return; + + /* + * that's it if context already disabled */ - __asm__ __volatile__ ("mov %0=psr;;": "=r"(psr) :: "memory"); + if (ctx->ctx_flags.state == PFM_CTX_DISABLED) return; /* * stop monitoring: @@ -2057,7 +3279,23 @@ pfm_flush_regs (struct task_struct *ta) * in kernel. * By now, we could still have an overflow interrupt in-flight. */ - __asm__ __volatile__ ("rsm psr.up;;"::: "memory"); + if (ctx->ctx_fl_system) { + /* disable dcr pp */ + ia64_set_dcr(ia64_get_dcr() & ~IA64_DCR_PP); + + local_cpu_data->pfm_syst_wide = 0; + local_cpu_data->pfm_dcr_pp = 0; + + + __asm__ __volatile__ ("rsm psr.pp;;"::: "memory"); + + } else { + + __asm__ __volatile__ ("rum psr.up;;"::: "memory"); + + /* no more save/restore on ctxsw */ + current->thread.flags &= ~IA64_THREAD_PM_VALID; + } /* * Mark the PMU as not owned @@ -2088,85 +3326,68 @@ pfm_flush_regs (struct task_struct *ta) ia64_srlz_d(); /* - * restore PSR for context switch to save + * We don't need to restore psr, because we are on our way out anyway */ - __asm__ __volatile__ ("mov psr.l=%0;;srlz.i;"::"r"(psr): "memory"); /* * This loop flushes the PMD into the PFM context. - * IT also processes overflow inline. + * It also processes overflow inline. * * IMPORTANT: No notification is sent at this point as the process is dying. * The implicit notification will come from a SIGCHILD or a return from a * waitpid(). 
* - * XXX: must take holes into account */ - mask = pmc0 >> PMU_FIRST_COUNTER; - for (i=0,j=PMU_FIRST_COUNTER; i< pmu_conf.max_counters; i++,j++) { - - /* collect latest results */ - ctx->ctx_pmds[i].val += ia64_get_pmd(j) & pmu_conf.perf_ovfl_val; - - /* - * now everything is in ctx_pmds[] and we need - * to clear the saved context from save_regs() such that - * pfm_read_pmds() gets the correct value - */ - ta->thread.pmd[j] = 0; - /* take care of overflow inline */ - if (mask & 0x1) { - ctx->ctx_pmds[i].val += 1 + pmu_conf.perf_ovfl_val; - DBprintk((" PMD[%d] overflowed pmd=0x%lx pmds.val=0x%lx\n", - j, ia64_get_pmd(j), ctx->ctx_pmds[i].val)); - } - mask >>=1; - } -} + if (atomic_read(&ctx->ctx_last_cpu) != smp_processor_id()) + printk("perfmon: [%d] last_cpu=%d\n", task->pid, atomic_read(&ctx->ctx_last_cpu)); -/* - * XXX: this routine is not very portable for PMCs - * XXX: make this routine able to work with non current context - */ -static void -ia64_reset_pmu(void) -{ - int i; + mask = pmc0 >> PMU_FIRST_COUNTER; + mask2 = ctx->ctx_used_pmds[0] >> PMU_FIRST_COUNTER; - /* PMU is frozen, no pending overflow bits */ - ia64_set_pmc(0,1); + for (i = PMU_FIRST_COUNTER; mask2; i++, mask>>=1, mask2>>=1) { - /* extra overflow bits + counter configs cleared */ - for(i=1; i< PMU_FIRST_COUNTER + pmu_conf.max_counters ; i++) { - ia64_set_pmc(i,0); - } + /* skip non used pmds */ + if ((mask2 & 0x1) == 0) continue; - /* opcode matcher set to all 1s */ - ia64_set_pmc(8,~0); - ia64_set_pmc(9,~0); + val = ia64_get_pmd(i); - /* I-EAR config cleared, plm=0 */ - ia64_set_pmc(10,0); + if (PMD_IS_COUNTING(i)) { - /* D-EAR config cleared, PMC[11].pt must be 1 */ - ia64_set_pmc(11,1 << 28); + DBprintk(("[%d] pmd[%d] soft_pmd=0x%lx hw_pmd=0x%lx\n", task->pid, i, ctx->ctx_soft_pmds[i].val, val & pmu_conf.perf_ovfl_val)); - /* BTB config. 
plm=0 */ - ia64_set_pmc(12,0); + /* collect latest results */ + ctx->ctx_soft_pmds[i].val += val & pmu_conf.perf_ovfl_val; - /* Instruction address range, PMC[13].ta must be 1 */ - ia64_set_pmc(13,1); + /* + * now everything is in ctx_soft_pmds[] and we need + * to clear the saved context from save_regs() such that + * pfm_read_pmds() gets the correct value + */ + task->thread.pmd[i] = 0; - /* clears all PMD registers */ - for(i=0;i< pmu_conf.num_pmds; i++) { - if (PMD_IS_IMPL(i)) ia64_set_pmd(i,0); + /* take care of overflow inline */ + if (mask & 0x1) { + ctx->ctx_soft_pmds[i].val += 1 + pmu_conf.perf_ovfl_val; + DBprintk(("[%d] pmd[%d] overflowed soft_pmd=0x%lx\n", + task->pid, i, ctx->ctx_soft_pmds[i].val)); + } + } else { + DBprintk(("[%d] pmd[%d] hw_pmd=0x%lx\n", task->pid, i, val)); + /* not a counter, just save value as is */ + task->thread.pmd[i] = val; + } } - ia64_srlz_d(); + /* + * indicates that context has been saved + */ + atomic_set(&ctx->ctx_last_cpu, -1); + } + /* - * task is the newly created task + * task is the newly created task, pt_regs for new child */ int pfm_inherit(struct task_struct *task, struct pt_regs *regs) @@ -2174,25 +3395,29 @@ pfm_inherit(struct task_struct *task, struct pt_regs *regs) pfm_context_t *ctx = current->thread.pfm_context; pfm_context_t *nctx; struct thread_struct *th = &task->thread; - int i, cnum; + unsigned long m; + int i; /* - * bypass completely for system wide + * make sure child cannot mess up the monitoring session */ - if (pfs_info.pfs_sys_session) { - DBprintk((" enabling psr.pp for %d\n", task->pid)); - ia64_psr(regs)->pp = pfs_info.pfs_pp; - return 0; - } + ia64_psr(regs)->sp = 1; + DBprintk(("enabling psr.sp for [%d]\n", task->pid)); + + /* + * remove any sampling buffer mapping from child user + * address space. Must be done for all cases of inheritance. 
+ */ + if (ctx->ctx_smpl_vaddr) pfm_remove_smpl_mapping(task); /* * takes care of easiest case first */ if (CTX_INHERIT_MODE(ctx) == PFM_FL_INHERIT_NONE) { - DBprintk((" removing PFM context for %d\n", task->pid)); + DBprintk(("removing PFM context for [%d]\n", task->pid)); task->thread.pfm_context = NULL; - task->thread.pfm_must_block = 0; - atomic_set(&task->thread.pfm_notifiers_check, 0); + task->thread.pfm_ovfl_block_reset = 0; + /* copy_thread() clears IA64_THREAD_PM_VALID */ return 0; } @@ -2202,45 +3427,81 @@ pfm_inherit(struct task_struct *task, struct pt_regs *regs) /* copy content */ *nctx = *ctx; + if (CTX_INHERIT_MODE(ctx) == PFM_FL_INHERIT_ONCE) { nctx->ctx_fl_inherit = PFM_FL_INHERIT_NONE; - atomic_set(&task->thread.pfm_notifiers_check, 0); - DBprintk((" downgrading to INHERIT_NONE for %d\n", task->pid)); - pfs_info.pfs_proc_sessions++; + atomic_set(&nctx->ctx_last_cpu, -1); + + /* + * task is not yet visible in the tasklist, so we do + * not need to lock the newly created context. + * However, we must grab the tasklist_lock to ensure + * that the ctx_owner or ctx_notify_task do not disappear + * while we increment their check counters. 
+ */ + read_lock(&tasklist_lock); + + if (nctx->ctx_notify_task) + atomic_inc(&nctx->ctx_notify_task->thread.pfm_notifiers_check); + + if (nctx->ctx_owner) + atomic_inc(&nctx->ctx_owner->thread.pfm_owners_check); + + read_unlock(&tasklist_lock); + + DBprintk(("downgrading to INHERIT_NONE for [%d]\n", task->pid)); + + LOCK_PFS(); + pfm_sessions.pfs_task_sessions++; + UNLOCK_PFS(); } /* initialize counters in new context */ - for(i=0, cnum= PMU_FIRST_COUNTER; i < pmu_conf.max_counters; cnum++, i++) { - nctx->ctx_pmds[i].val = nctx->ctx_pmds[i].ival & ~pmu_conf.perf_ovfl_val; - th->pmd[cnum] = nctx->ctx_pmds[i].ival & pmu_conf.perf_ovfl_val; + m = pmu_conf.counter_pmds[0] >> PMU_FIRST_COUNTER; + for(i = PMU_FIRST_COUNTER ; m ; m>>=1, i++) { + if (m & 0x1) { + nctx->ctx_soft_pmds[i].val = nctx->ctx_soft_pmds[i].ival & ~pmu_conf.perf_ovfl_val; + th->pmd[i] = nctx->ctx_soft_pmds[i].ival & pmu_conf.perf_ovfl_val; + } } /* clear BTB index register */ th->pmd[16] = 0; /* if sampling then increment number of users of buffer */ - if (nctx->ctx_smpl_buf) { - atomic_inc(&nctx->ctx_smpl_buf->psb_refcnt); + if (nctx->ctx_psb) { + + /* + * XXX: not very pretty!
+ */ + LOCK_PSB(nctx->ctx_psb); + nctx->ctx_psb->psb_refcnt++; + UNLOCK_PSB(nctx->ctx_psb); + /* + * remove any pointer to sampling buffer mapping + */ + nctx->ctx_smpl_vaddr = 0; } nctx->ctx_fl_frozen = 0; - nctx->ctx_ovfl_regs = 0; + nctx->ctx_ovfl_regs[0] = 0UL; + sema_init(&nctx->ctx_restart_sem, 0); /* reset this semaphore to locked */ /* clear pending notification */ - th->pfm_must_block = 0; + th->pfm_ovfl_block_reset = 0; /* link with new task */ - th->pfm_context = nctx; + th->pfm_context = nctx; - DBprintk((" nctx=%p for process %d\n", (void *)nctx, task->pid)); + DBprintk(("nctx=%p for process [%d]\n", (void *)nctx, task->pid)); /* * the copy_thread routine automatically clears * IA64_THREAD_PM_VALID, so we need to reenable it, if it was used by the caller */ if (current->thread.flags & IA64_THREAD_PM_VALID) { - DBprintk((" setting PM_VALID for %d\n", task->pid)); + DBprintk(("setting PM_VALID for [%d]\n", task->pid)); th->flags |= IA64_THREAD_PM_VALID; } @@ -2248,100 +3509,248 @@ pfm_inherit(struct task_struct *task, struct pt_regs *regs) } /* - * called from release_thread(), at this point this task is not in the - * tasklist anymore + * + * We cannot touch any of the PMU registers at this point as we may + * not be running on the same CPU the task was last run on. Therefore + * it is assumed that the PMU has been stopped appropriately in + * pfm_flush_regs() called from exit_thread(). + * + * The function is called in the context of the parent via a release_thread() + * and wait4(). The task is not in the tasklist anymore. 
*/ void pfm_context_exit(struct task_struct *task) { pfm_context_t *ctx = task->thread.pfm_context; - if (!ctx) { - DBprintk((" invalid context for %d\n", task->pid)); - return; - } + /* + * check sampling buffer + */ + if (ctx->ctx_psb) { + pfm_smpl_buffer_desc_t *psb = ctx->ctx_psb; - /* check is we have a sampling buffer attached */ - if (ctx->ctx_smpl_buf) { - pfm_smpl_buffer_desc_t *psb = ctx->ctx_smpl_buf; + LOCK_PSB(psb); - /* if only user left, then remove */ - DBprintk((" [%d] [%d] psb->refcnt=%d\n", current->pid, task->pid, psb->psb_refcnt.counter)); + DBprintk(("sampling buffer from [%d] @%p size %ld vma_flag=0x%x\n", + task->pid, + psb->psb_hdr, psb->psb_size, psb->psb_flags)); - if (atomic_dec_and_test(&psb->psb_refcnt) ) { - rvfree(psb->psb_hdr, psb->psb_size); - vfree(psb); - DBprintk((" [%d] cleaning [%d] sampling buffer\n", current->pid, task->pid )); - } - } - DBprintk((" [%d] cleaning [%d] pfm_context @%p\n", current->pid, task->pid, (void *)ctx)); - - /* - * To avoid getting the notified task scan the entire process list - * when it exits because it would have pfm_notifiers_check set, we - * decrease it by 1 to inform the task, that one less task is going - * to send it notification. each new notifer increases this field by - * 1 in pfm_context_create(). Of course, there is race condition between - * decreasing the value and the notified task exiting. The danger comes - * from the fact that we have a direct pointer to its task structure - * thereby bypassing the tasklist. We must make sure that if we have - * notify_task!= NULL, the target task is still somewhat present. It may - * already be detached from the tasklist but that's okay. Note that it is - * okay if we 'miss the deadline' and the task scans the list for nothing, - * it will affect performance but not correctness. The correctness is ensured - * by using the notify_lock whic prevents the notify_task from changing on us. 
- * Once holdhing this lock, if we see notify_task!= NULL, then it will stay like + /* + * in the case where we are the last user, we may be able to free + * the buffer + */ + psb->psb_refcnt--; + + if (psb->psb_refcnt == 0) { + + /* + * The flag is cleared in pfm_vm_close(). which gets + * called from do_exit() via exit_mm(). + * By the time we come here, the task has no more mm context. + * + * We can only free the psb and buffer here after the vm area + * describing the buffer has been removed. This normally happens + * as part of do_exit() but the entire mm context is ONLY removed + * once its reference counts goes to zero. This is typically + * the case except for multi-threaded (several tasks) processes. + * + * See pfm_vm_close() and pfm_cleanup_smpl_buf() for more details. + */ + if ((psb->psb_flags & PFM_PSB_VMA) == 0) { + + DBprintk(("cleaning sampling buffer from [%d] @%p size %ld\n", + task->pid, + psb->psb_hdr, psb->psb_size)); + + /* + * free the buffer and psb + */ + pfm_rvfree(psb->psb_hdr, psb->psb_size); + kfree(psb); + psb = NULL; + } + } + /* psb may have been deleted */ + if (psb) UNLOCK_PSB(psb); + } + + DBprintk(("cleaning [%d] pfm_context @%p notify_task=%p check=%d mm=%p\n", + task->pid, ctx, + ctx->ctx_notify_task, + atomic_read(&task->thread.pfm_notifiers_check), task->mm)); + + /* + * To avoid getting the notified task or owner task scan the entire process + * list when they exit, we decrement notifiers_check and owners_check respectively. + * + * Of course, there is race condition between decreasing the value and the + * task exiting. The danger comes from the fact that, in both cases, we have a + * direct pointer to a task structure thereby bypassing the tasklist. + * We must make sure that, if we have task!= NULL, the target task is still + * present and is identical to the initial task specified + * during pfm_create_context(). It may already be detached from the tasklist but + * that's okay. 
Note that it is okay if we miss the deadline and the task scans + * the list for nothing, it will affect performance but not correctness. + * The correctness is ensured by using the ctx_lock which prevents the + * notify_task from changing the fields in our context. + * Once holding this lock, if we see task!= NULL, then it will stay like * that until we release the lock. If it is NULL already then we came too late. */ - spin_lock(&ctx->ctx_notify_lock); + LOCK_CTX(ctx); - if (ctx->ctx_notify_task) { - DBprintk((" [%d] [%d] atomic_sub on [%d] notifiers=%u\n", current->pid, task->pid, - ctx->ctx_notify_task->pid, - atomic_read(&ctx->ctx_notify_task->thread.pfm_notifiers_check))); + if (ctx->ctx_notify_task != NULL) { + DBprintk(("[%d], [%d] atomic_sub on [%d] notifiers=%u\n", current->pid, + task->pid, + ctx->ctx_notify_task->pid, + atomic_read(&ctx->ctx_notify_task->thread.pfm_notifiers_check))); + + atomic_dec(&ctx->ctx_notify_task->thread.pfm_notifiers_check); + } + + if (ctx->ctx_owner != NULL) { + DBprintk(("[%d], [%d] atomic_sub on [%d] owners=%u\n", + current->pid, + task->pid, + ctx->ctx_owner->pid, + atomic_read(&ctx->ctx_owner->thread.pfm_owners_check))); - atomic_sub(1, &ctx->ctx_notify_task->thread.pfm_notifiers_check); + atomic_dec(&ctx->ctx_owner->thread.pfm_owners_check); } - spin_unlock(&ctx->ctx_notify_lock); + UNLOCK_CTX(ctx); + + LOCK_PFS(); if (ctx->ctx_fl_system) { - /* - * if included interrupts (true by default), then reset - * to get default value - */ - if (ctx->ctx_fl_exclintr == 0) { - /* - * reload kernel default DCR value - */ - ia64_set_dcr(pfs_info.pfs_dfl_dcr); - DBprintk((" restored dcr to 0x%lx\n", pfs_info.pfs_dfl_dcr)); + + pfm_sessions.pfs_sys_session[ctx->ctx_cpu] = NULL; + pfm_sessions.pfs_sys_sessions--; + DBprintk(("freeing syswide session on CPU%ld\n", ctx->ctx_cpu)); + /* update perfmon debug register counter */ + if (ctx->ctx_fl_using_dbreg) { + if (pfm_sessions.pfs_sys_use_dbregs == 0) { + printk("perfmon: invalid
release for [%d] sys_use_dbregs=0\n", task->pid); + } else + pfm_sessions.pfs_sys_use_dbregs--; } - /* - * free system wide session slot - */ - pfs_info.pfs_sys_session = 0; + + /* + * remove any CPU pinning + */ + set_cpus_allowed(task, ctx->ctx_saved_cpus_allowed); } else { - pfs_info.pfs_proc_sessions--; + pfm_sessions.pfs_task_sessions--; } + UNLOCK_PFS(); pfm_context_free(ctx); /* * clean pfm state in thread structure, */ - task->thread.pfm_context = NULL; - task->thread.pfm_must_block = 0; + task->thread.pfm_context = NULL; + task->thread.pfm_ovfl_block_reset = 0; + /* pfm_notifiers is cleaned in pfm_cleanup_notifiers() */ +} +/* + * function invoked from release_thread when pfm_smpl_buf_list is not NULL + */ +int +pfm_cleanup_smpl_buf(struct task_struct *task) +{ + pfm_smpl_buffer_desc_t *tmp, *psb = task->thread.pfm_smpl_buf_list; + + if (psb == NULL) { + printk("perfmon: psb is null in [%d]\n", current->pid); + return -1; + } + /* + * Walk through the list and free the sampling buffer and psb + */ + while (psb) { + DBprintk(("[%d] freeing smpl @%p size %ld\n", current->pid, psb->psb_hdr, psb->psb_size)); + + pfm_rvfree(psb->psb_hdr, psb->psb_size); + tmp = psb->psb_next; + kfree(psb); + psb = tmp; + } + + /* just in case */ + task->thread.pfm_smpl_buf_list = NULL; + + return 0; } +/* + * function invoked from release_thread to make sure that the ctx_owner field does not + * point to an unexisting task. + */ +void +pfm_cleanup_owners(struct task_struct *task) +{ + struct task_struct *p; + pfm_context_t *ctx; + + DBprintk(("called by [%d] for [%d]\n", current->pid, task->pid)); + + read_lock(&tasklist_lock); + + for_each_task(p) { + /* + * It is safe to do the 2-step test here, because thread.ctx + * is cleaned up only in release_thread() and at that point + * the task has been detached from the tasklist which is an + * operation which uses the write_lock() on the tasklist_lock + * so it cannot run concurrently to this loop. 
So we have the + * guarantee that if we find p and it has a perfmon ctx then + * it is going to stay like this for the entire execution of this + * loop. + */ + ctx = p->thread.pfm_context; + + //DBprintk(("[%d] scanning task [%d] ctx=%p\n", task->pid, p->pid, ctx)); + + if (ctx && ctx->ctx_owner == task) { + DBprintk(("trying for owner [%d] in [%d]\n", task->pid, p->pid)); + /* + * the spinlock is required to take care of a race condition + * with the send_sig_info() call. We must make sure that + * either the send_sig_info() completes using a valid task, + * or the notify_task is cleared before the send_sig_info() + * can pick up a stale value. Note that by the time this + * function is executed the 'task' is already detached from the + * tasklist. The problem is that the notifiers have a direct + * pointer to it. It is okay to send a signal to a task in this + * stage, it simply will have no effect. But it is better than sending + * to a completely destroyed task or worse to a new task using the same + * task_struct address. 
+ */ + LOCK_CTX(ctx); + + ctx->ctx_owner = NULL; + + UNLOCK_CTX(ctx); + + DBprintk(("done for notifier [%d] in [%d]\n", task->pid, p->pid)); + } + } + read_unlock(&tasklist_lock); +} + + +/* + * function called from release_thread to make sure that the ctx_notify_task is not pointing + * to an unexisting task + */ void pfm_cleanup_notifiers(struct task_struct *task) { struct task_struct *p; pfm_context_t *ctx; - DBprintk((" [%d] called\n", task->pid)); + DBprintk(("called by [%d] for [%d]\n", current->pid, task->pid)); read_lock(&tasklist_lock); @@ -2358,10 +3767,10 @@ pfm_cleanup_notifiers(struct task_struct *task) */ ctx = p->thread.pfm_context; - DBprintk((" [%d] scanning task [%d] ctx=%p\n", task->pid, p->pid, ctx)); + //DBprintk(("[%d] scanning task [%d] ctx=%p\n", task->pid, p->pid, ctx)); if (ctx && ctx->ctx_notify_task == task) { - DBprintk((" trying for notifier %d in %d\n", task->pid, p->pid)); + DBprintk(("trying for notifier [%d] in [%d]\n", task->pid, p->pid)); /* * the spinlock is required to take care of a race condition * with the send_sig_info() call. We must make sure that @@ -2375,23 +3784,123 @@ pfm_cleanup_notifiers(struct task_struct *task) * to a completely destroyed task or worse to a new task using the same * task_struct address. 
*/ - spin_lock(&ctx->ctx_notify_lock); + LOCK_CTX(ctx); ctx->ctx_notify_task = NULL; - spin_unlock(&ctx->ctx_notify_lock); + UNLOCK_CTX(ctx); - DBprintk((" done for notifier %d in %d\n", task->pid, p->pid)); + DBprintk(("done for notifier [%d] in [%d]\n", task->pid, p->pid)); } } read_unlock(&tasklist_lock); +} + +static struct irqaction perfmon_irqaction = { + handler: perfmon_interrupt, + flags: SA_INTERRUPT, + name: "perfmon" +}; + + +/* + * perfmon initialization routine, called from the initcall() table + */ +int __init +perfmon_init (void) +{ + pal_perf_mon_info_u_t pm_info; + s64 status; + + register_percpu_irq(IA64_PERFMON_VECTOR, &perfmon_irqaction); + + ia64_set_pmv(IA64_PERFMON_VECTOR); + ia64_srlz_d(); + + pmu_conf.pfm_is_disabled = 1; + + printk("perfmon: version %u.%u (sampling format v%u.%u) IRQ %u\n", + PFM_VERSION_MAJ, + PFM_VERSION_MIN, + PFM_SMPL_VERSION_MAJ, + PFM_SMPL_VERSION_MIN, + IA64_PERFMON_VECTOR); + + if ((status=ia64_pal_perf_mon_info(pmu_conf.impl_regs, &pm_info)) != 0) { + printk("perfmon: PAL call failed (%ld), perfmon disabled\n", status); + return -1; + } + + pmu_conf.perf_ovfl_val = (1UL << pm_info.pal_perf_mon_info_s.width) - 1; + pmu_conf.max_counters = pm_info.pal_perf_mon_info_s.generic; + pmu_conf.num_pmcs = find_num_pm_regs(pmu_conf.impl_regs); + pmu_conf.num_pmds = find_num_pm_regs(&pmu_conf.impl_regs[4]); + + printk("perfmon: %u bits counters (max value 0x%016lx)\n", + pm_info.pal_perf_mon_info_s.width, pmu_conf.perf_ovfl_val); + + printk("perfmon: %lu PMC/PMD pairs, %lu PMCs, %lu PMDs\n", + pmu_conf.max_counters, pmu_conf.num_pmcs, pmu_conf.num_pmds); + + /* sanity check */ + if (pmu_conf.num_pmds >= IA64_NUM_PMD_REGS || pmu_conf.num_pmcs >= IA64_NUM_PMC_REGS) { + printk(KERN_ERR "perfmon: not enough pmc/pmd, perfmon is DISABLED\n"); + return -1; /* no need to continue anyway */ + } + + if (ia64_pal_debug_info(&pmu_conf.num_ibrs, &pmu_conf.num_dbrs)) { + printk(KERN_WARNING "perfmon: unable to get number of debug 
registers\n"); + pmu_conf.num_ibrs = pmu_conf.num_dbrs = 0; + } + /* PAL reports the number of pairs */ + pmu_conf.num_ibrs <<=1; + pmu_conf.num_dbrs <<=1; + + /* + * list the pmc registers used to control monitors + * XXX: unfortunately this information is not provided by PAL + * + * We start with the architected minimum and then refine for each CPU model + */ + pmu_conf.monitor_pmcs[0] = PMM(4)|PMM(5)|PMM(6)|PMM(7); + + /* + * architected counters + */ + pmu_conf.counter_pmds[0] |= PMM(4)|PMM(5)|PMM(6)|PMM(7); +#ifdef CONFIG_ITANIUM + pmu_conf.monitor_pmcs[0] |= PMM(10)|PMM(11)|PMM(12); + /* Itanium does not add more counters */ +#endif + /* we are all set */ + pmu_conf.pfm_is_disabled = 0; + + /* + * for now here for debug purposes + */ + perfmon_dir = create_proc_read_entry ("perfmon", 0, 0, perfmon_read_entry, NULL); + + spin_lock_init(&pfm_sessions.pfs_lock); + + return 0; +} + +__initcall(perfmon_init); + +void +perfmon_init_percpu (void) +{ + ia64_set_pmv(IA64_PERFMON_VECTOR); + ia64_srlz_d(); } + #else /* !CONFIG_PERFMON */ asmlinkage int -sys_perfmonctl (int pid, int cmd, int flags, perfmon_req_t *req, int count, long arg6, long arg7, long arg8, long stack) +sys_perfmonctl (int pid, int cmd, void *req, int count, long arg5, long arg6, + long arg7, long arg8, long stack) { return -ENOSYS; } diff --git a/arch/ia64/kernel/process.c b/arch/ia64/kernel/process.c index bfdb34b..e25cbc1 100644 --- a/arch/ia64/kernel/process.c +++ b/arch/ia64/kernel/process.c @@ -1,8 +1,8 @@ /* * Architecture-specific setup. 
* - * Copyright (C) 1998-2001 Hewlett-Packard Co - * Copyright (C) 1998-2001 David Mosberger-Tang <davidm@hpl.hp.com> + * Copyright (C) 1998-2002 Hewlett-Packard Co + * David Mosberger-Tang <davidm@hpl.hp.com> */ #define __KERNEL_SYSCALLS__ /* see <asm/unistd.h> */ #include <linux/config.h> @@ -12,14 +12,17 @@ #include <linux/errno.h> #include <linux/kernel.h> #include <linux/mm.h> +#include <linux/personality.h> #include <linux/sched.h> #include <linux/slab.h> #include <linux/smp_lock.h> #include <linux/stddef.h> +#include <linux/thread_info.h> #include <linux/unistd.h> #include <asm/delay.h> #include <asm/efi.h> +#include <asm/elf.h> #include <asm/perfmon.h> #include <asm/pgtable.h> #include <asm/processor.h> @@ -28,6 +31,16 @@ #include <asm/unwind.h> #include <asm/user.h> +#ifdef CONFIG_IA64_SGI_SN +#include <asm/sn/idle.h> +#endif + +#ifdef CONFIG_PERFMON +# include <asm/perfmon.h> +#endif + +#include "sigframe.h" + static void do_show_stack (struct unw_frame_info *info, void *arg) { @@ -46,6 +59,15 @@ do_show_stack (struct unw_frame_info *info, void *arg) } void +show_trace_task (struct task_struct *task) +{ + struct unw_frame_info info; + + unw_init_from_blocked_task(&info, task); + do_show_stack(&info, 0); +} + +void show_stack (struct task_struct *task) { if (!task) @@ -90,8 +112,8 @@ show_regs (struct pt_regs *regs) printk("r26 : %016lx r27 : %016lx r28 : %016lx\n", regs->r26, regs->r27, regs->r28); printk("r29 : %016lx r30 : %016lx r31 : %016lx\n", regs->r29, regs->r30, regs->r31); - /* print the stacked registers if cr.ifs is valid: */ - if (regs->cr_ifs & 0x8000000000000000) { + if (user_mode(regs)) { + /* print the stacked registers */ unsigned long val, sof, *bsp, ndirty; int i, is_nat = 0; @@ -103,32 +125,61 @@ show_regs (struct pt_regs *regs) printk("r%-3u:%c%016lx%s", 32 + i, is_nat ? '*' : ' ', val, ((i == sof - 1) || (i % 3) == 2) ? 
"\n" : " "); } - } - if (!user_mode(regs)) + } else show_stack(0); } +void +do_notify_resume_user (sigset_t *oldset, struct sigscratch *scr, long in_syscall) +{ +#ifdef CONFIG_PERFMON + if (current->thread.pfm_ovfl_block_reset) + pfm_ovfl_block_reset(); +#endif + + /* deal with pending signal delivery */ + if (test_thread_flag(TIF_SIGPENDING)) + ia64_do_signal(oldset, scr, in_syscall); +} + +/* + * We use this if we don't have any better idle routine.. + */ +static void +default_idle (void) +{ + /* may want to do PAL_LIGHT_HALT here... */ +} + void __attribute__((noreturn)) cpu_idle (void *unused) { /* endless idle loop with no priority at all */ - init_idle(); - current->nice = 20; - while (1) { #ifdef CONFIG_SMP if (!need_resched()) min_xtp(); #endif - while (!need_resched()) - continue; + + while (!need_resched()) { +#ifdef CONFIG_IA64_SGI_SN + snidle(); +#endif + if (pm_idle) + (*pm_idle)(); + else + default_idle(); + } + +#ifdef CONFIG_IA64_SGI_SN + snidleoff(); +#endif + #ifdef CONFIG_SMP normal_xtp(); #endif schedule(); check_pgt_cache(); - if (pm_idle) - (*pm_idle)(); } } @@ -137,10 +188,14 @@ ia64_save_extra (struct task_struct *task) { if ((task->thread.flags & IA64_THREAD_DBG_VALID) != 0) ia64_save_debug_regs(&task->thread.dbr[0]); + #ifdef CONFIG_PERFMON if ((task->thread.flags & IA64_THREAD_PM_VALID) != 0) pfm_save_regs(task); + + if (local_cpu_data->pfm_syst_wide) pfm_syst_wide_update_task(task, 0); #endif + if (IS_IA32_PROCESS(ia64_task_regs(task))) ia32_save_state(task); } @@ -150,10 +205,14 @@ ia64_load_extra (struct task_struct *task) { if ((task->thread.flags & IA64_THREAD_DBG_VALID) != 0) ia64_load_debug_regs(&task->thread.dbr[0]); + #ifdef CONFIG_PERFMON if ((task->thread.flags & IA64_THREAD_PM_VALID) != 0) pfm_load_regs(task); + + if (local_cpu_data->pfm_syst_wide) pfm_syst_wide_update_task(task, 1); #endif + if (IS_IA32_PROCESS(ia64_task_regs(task))) ia32_load_state(task); } @@ -233,7 +292,7 @@ copy_thread (int nr, unsigned long clone_flags, 
if (user_mode(child_ptregs)) { if (user_stack_base) { - child_ptregs->r12 = user_stack_base + user_stack_size; + child_ptregs->r12 = user_stack_base + user_stack_size - 16; child_ptregs->ar_bspstore = user_stack_base; child_ptregs->ar_rnat = 0; child_ptregs->loadrs = 0; @@ -286,9 +345,15 @@ copy_thread (int nr, unsigned long clone_flags, if (IS_IA32_PROCESS(ia64_task_regs(current))) ia32_save_state(p); #endif + #ifdef CONFIG_PERFMON - if (p->thread.pfm_context) - retval = pfm_inherit(p, child_ptregs); + /* + * reset notifiers and owner check (may not have a perfmon context) + */ + atomic_set(&p->thread.pfm_notifiers_check, 0); + atomic_set(&p->thread.pfm_owners_check, 0); + + if (current->thread.pfm_context) retval = pfm_inherit(p, child_ptregs); #endif return retval; } @@ -412,6 +477,16 @@ out: return error; } +void +ia64_set_personality (struct elf64_hdr *elf_ex, int ibcs2_interpreter) +{ + set_personality(PER_LINUX); + if (elf_ex->e_flags & EF_IA_64_LINUX_EXECUTABLE_STACK) + current->thread.flags |= IA64_THREAD_XSTACK; + else + current->thread.flags &= ~IA64_THREAD_XSTACK; +} + pid_t kernel_thread (int (*fn)(void *), void *arg, unsigned long flags) { @@ -443,15 +518,15 @@ flush_thread (void) #ifdef CONFIG_PERFMON /* - * By the time we get here, the task is detached from the tasklist. This is important - * because it means that no other tasks can ever find it as a notifiied task, therfore - * there is no race condition between this code and let's say a pfm_context_create(). - * Conversely, the pfm_cleanup_notifiers() cannot try to access a task's pfm context if - * this other task is in the middle of its own pfm_context_exit() because it would alreayd - * be out of the task list. Note that this case is very unlikely between a direct child - * and its parents (if it is the notified process) because of the way the exit is notified - * via SIGCHLD. + * by the time we get here, the task is detached from the tasklist. 
This is important + * because it means that no other tasks can ever find it as a notified task, therefore there + * is no race condition between this code and let's say a pfm_context_create(). + * Conversely, the pfm_cleanup_notifiers() cannot try to access a task's pfm context if this + * other task is in the middle of its own pfm_context_exit() because it would already be out of + * the task list. Note that this case is very unlikely between a direct child and its parents + * (if it is the notified process) because of the way the exit is notified via SIGCHLD. */ + void release_thread (struct task_struct *task) { @@ -460,6 +535,12 @@ release_thread (struct task_struct *task) if (atomic_read(&task->thread.pfm_notifiers_check) > 0) pfm_cleanup_notifiers(task); + + if (atomic_read(&task->thread.pfm_owners_check) > 0) + pfm_cleanup_owners(task); + + if (task->thread.pfm_smpl_buf_list) + pfm_cleanup_smpl_buf(task); } #endif @@ -475,21 +556,13 @@ exit_thread (void) ia64_set_fpu_owner(0); #endif #ifdef CONFIG_PERFMON - /* stop monitoring */ - if ((current->thread.flags & IA64_THREAD_PM_VALID) != 0) { - /* - * we cannot rely on switch_to() to save the PMU - * context for the last time. There is a possible race - * condition in SMP mode between the child and the - * parent. by explicitly saving the PMU context here - * we garantee no race.
this call we also stop - * monitoring - */ + /* if needed, stop monitoring and flush state to perfmon context */ + if (current->thread.pfm_context) pfm_flush_regs(current); - /* - * make sure that switch_to() will not save context again - */ - current->thread.flags &= ~IA64_THREAD_PM_VALID; + + /* free debug register resources */ + if ((current->thread.flags & IA64_THREAD_DBG_VALID) != 0) { + pfm_release_debug_registers(current); } #endif } @@ -571,3 +644,29 @@ machine_power_off (void) pm_power_off(); machine_halt(); } + +void __init +init_task_struct_cache (void) +{ +} + +struct task_struct * +dup_task_struct(struct task_struct *orig) +{ + struct task_struct *tsk; + + tsk = __get_free_pages(GFP_KERNEL, KERNEL_STACK_SIZE_ORDER); + if (!tsk) + return NULL; + + memcpy(tsk, orig, sizeof(struct task_struct) + sizeof(struct thread_info)); + tsk->thread_info = (struct thread_info *) ((char *) tsk + IA64_TASK_SIZE); + atomic_set(&tsk->usage, 1); + return tsk; +} + +void +__put_task_struct (struct task_struct *tsk) +{ + free_pages((unsigned long) tsk, KERNEL_STACK_SIZE_ORDER); +} diff --git a/arch/ia64/kernel/ptrace.c b/arch/ia64/kernel/ptrace.c index dc932f6..08dd15c 100644 --- a/arch/ia64/kernel/ptrace.c +++ b/arch/ia64/kernel/ptrace.c @@ -23,6 +23,9 @@ #include <asm/system.h> #include <asm/uaccess.h> #include <asm/unwind.h> +#ifdef CONFIG_PERFMON +#include <asm/perfmon.h> +#endif /* * Bits in the PSR that we allow ptrace() to change: @@ -755,11 +758,6 @@ access_uarea (struct task_struct *child, unsigned long addr, unsigned long *data } else { /* access debug registers */ - if (!(child->thread.flags & IA64_THREAD_DBG_VALID)) { - child->thread.flags |= IA64_THREAD_DBG_VALID; - memset(child->thread.dbr, 0, sizeof(child->thread.dbr)); - memset(child->thread.ibr, 0, sizeof(child->thread.ibr)); - } if (addr >= PT_IBR) { regnum = (addr - PT_IBR) >> 3; ptr = &child->thread.ibr[0]; @@ -772,6 +770,31 @@ access_uarea (struct task_struct *child, unsigned long addr, unsigned long 
*data dprintk("ptrace: rejecting access to register address 0x%lx\n", addr); return -1; } +#ifdef CONFIG_PERFMON + /* + * Check if debug registers are used + * by perfmon. This test must be done once we know that we can + * do the operation, i.e. the arguments are all valid, but before + * we start modifying the state. + * + * Perfmon needs to keep a count of how many processes are + * trying to modify the debug registers for system wide monitoring + * sessions. + * + * We also include read access here, because they may cause + * the PMU-installed debug register state (dbr[], ibr[]) to + * be reset. The two arrays are also used by perfmon, but + * we do not use IA64_THREAD_DBG_VALID. The registers are restored + * by the PMU context switch code. + */ + if (pfm_use_debug_registers(child)) return -1; +#endif + + if (!(child->thread.flags & IA64_THREAD_DBG_VALID)) { + child->thread.flags |= IA64_THREAD_DBG_VALID; + memset(child->thread.dbr, 0, sizeof(child->thread.dbr)); + memset(child->thread.ibr, 0, sizeof(child->thread.ibr)); + } ptr += regnum; @@ -789,6 +812,260 @@ access_uarea (struct task_struct *child, unsigned long addr, unsigned long *data return 0; } +static long +ptrace_getregs (struct task_struct *child, struct pt_all_user_regs *ppr) +{ + struct switch_stack *sw; + struct pt_regs *pt; + long ret, retval; + struct unw_frame_info info; + char nat = 0; + int i; + + retval = verify_area(VERIFY_WRITE, ppr, sizeof(struct pt_all_user_regs)); + if (retval != 0) { + return -EIO; + } + + pt = ia64_task_regs(child); + sw = (struct switch_stack *) (child->thread.ksp + 16); + unw_init_from_blocked_task(&info, child); + if (unw_unwind_to_user(&info) < 0) { + return -EIO; + } + + if (((unsigned long) ppr & 0x7) != 0) { + dprintk("ptrace:unaligned register address %p\n", ppr); + return -EIO; + } + + retval = 0; + + /* control regs */ + + retval |= __put_user(pt->cr_iip, &ppr->cr_iip); + retval |= access_uarea(child, PT_CR_IPSR, &ppr->cr_ipsr, 0); + + /* app regs */ + + 
retval |= __put_user(pt->ar_pfs, &ppr->ar[PT_AUR_PFS]); + retval |= __put_user(pt->ar_rsc, &ppr->ar[PT_AUR_RSC]); + retval |= __put_user(pt->ar_bspstore, &ppr->ar[PT_AUR_BSPSTORE]); + retval |= __put_user(pt->ar_unat, &ppr->ar[PT_AUR_UNAT]); + retval |= __put_user(pt->ar_ccv, &ppr->ar[PT_AUR_CCV]); + retval |= __put_user(pt->ar_fpsr, &ppr->ar[PT_AUR_FPSR]); + + retval |= access_uarea(child, PT_AR_EC, &ppr->ar[PT_AUR_EC], 0); + retval |= access_uarea(child, PT_AR_LC, &ppr->ar[PT_AUR_LC], 0); + retval |= access_uarea(child, PT_AR_RNAT, &ppr->ar[PT_AUR_RNAT], 0); + retval |= access_uarea(child, PT_AR_BSP, &ppr->ar[PT_AUR_BSP], 0); + retval |= access_uarea(child, PT_CFM, &ppr->cfm, 0); + + /* gr1-gr3 */ + + retval |= __copy_to_user(&ppr->gr[1], &pt->r1, sizeof(long) * 3); + + /* gr4-gr7 */ + + for (i = 4; i < 8; i++) { + retval |= unw_access_gr(&info, i, &ppr->gr[i], &nat, 0); + } + + /* gr8-gr11 */ + + retval |= __copy_to_user(&ppr->gr[8], &pt->r8, sizeof(long) * 4); + + /* gr12-gr15 */ + + retval |= __copy_to_user(&ppr->gr[12], &pt->r12, sizeof(long) * 4); + + /* gr16-gr31 */ + + retval |= __copy_to_user(&ppr->gr[16], &pt->r16, sizeof(long) * 16); + + /* b0 */ + + retval |= __put_user(pt->b0, &ppr->br[0]); + + /* b1-b5 */ + + for (i = 1; i < 6; i++) { + retval |= unw_access_br(&info, i, &ppr->br[i], 0); + } + + /* b6-b7 */ + + retval |= __put_user(pt->b6, &ppr->br[6]); + retval |= __put_user(pt->b7, &ppr->br[7]); + + /* fr2-fr5 */ + + for (i = 2; i < 6; i++) { + retval |= access_fr(&info, i, 0, (unsigned long *) &ppr->fr[i], 0); + retval |= access_fr(&info, i, 1, (unsigned long *) &ppr->fr[i] + 1, 0); + } + + /* fr6-fr9 */ + + retval |= __copy_to_user(&ppr->fr[6], &pt->f6, sizeof(struct ia64_fpreg) * 4); + + /* fp scratch regs(10-15) */ + + retval |= __copy_to_user(&ppr->fr[10], &sw->f10, sizeof(struct ia64_fpreg) * 6); + + /* fr16-fr31 */ + + for (i = 16; i < 32; i++) { + retval |= access_fr(&info, i, 0, (unsigned long *) &ppr->fr[i], 0); + retval |= 
access_fr(&info, i, 1, (unsigned long *) &ppr->fr[i] + 1, 0); + } + + /* fph */ + + ia64_flush_fph(child); + retval |= __copy_to_user(&ppr->fr[32], &child->thread.fph, sizeof(ppr->fr[32]) * 96); + + /* preds */ + + retval |= __put_user(pt->pr, &ppr->pr); + + /* nat bits */ + + retval |= access_uarea(child, PT_NAT_BITS, &ppr->nat, 0); + + ret = retval ? -EIO : 0; + return ret; +} + +static long +ptrace_setregs (struct task_struct *child, struct pt_all_user_regs *ppr) +{ + struct switch_stack *sw; + struct pt_regs *pt; + long ret, retval; + struct unw_frame_info info; + char nat = 0; + int i; + + retval = verify_area(VERIFY_READ, ppr, sizeof(struct pt_all_user_regs)); + if (retval != 0) { + return -EIO; + } + + pt = ia64_task_regs(child); + sw = (struct switch_stack *) (child->thread.ksp + 16); + unw_init_from_blocked_task(&info, child); + if (unw_unwind_to_user(&info) < 0) { + return -EIO; + } + + if (((unsigned long) ppr & 0x7) != 0) { + dprintk("ptrace:unaligned register address %p\n", ppr); + return -EIO; + } + + retval = 0; + + /* control regs */ + + retval |= __get_user(pt->cr_iip, &ppr->cr_iip); + retval |= access_uarea(child, PT_CR_IPSR, &ppr->cr_ipsr, 1); + + /* app regs */ + + retval |= __get_user(pt->ar_pfs, &ppr->ar[PT_AUR_PFS]); + retval |= __get_user(pt->ar_rsc, &ppr->ar[PT_AUR_RSC]); + retval |= __get_user(pt->ar_bspstore, &ppr->ar[PT_AUR_BSPSTORE]); + retval |= __get_user(pt->ar_unat, &ppr->ar[PT_AUR_UNAT]); + retval |= __get_user(pt->ar_ccv, &ppr->ar[PT_AUR_CCV]); + retval |= __get_user(pt->ar_fpsr, &ppr->ar[PT_AUR_FPSR]); + + retval |= access_uarea(child, PT_AR_EC, &ppr->ar[PT_AUR_EC], 1); + retval |= access_uarea(child, PT_AR_LC, &ppr->ar[PT_AUR_LC], 1); + retval |= access_uarea(child, PT_AR_RNAT, &ppr->ar[PT_AUR_RNAT], 1); + retval |= access_uarea(child, PT_AR_BSP, &ppr->ar[PT_AUR_BSP], 1); + retval |= access_uarea(child, PT_CFM, &ppr->cfm, 1); + + /* gr1-gr3 */ + + retval |= __copy_from_user(&pt->r1, &ppr->gr[1], sizeof(long) * 3); + + /* gr4-gr7 
*/ + + for (i = 4; i < 8; i++) { + long ret = unw_get_gr(&info, i, &ppr->gr[i], &nat); + if (ret < 0) { + return ret; + } + retval |= unw_access_gr(&info, i, &ppr->gr[i], &nat, 1); + } + + /* gr8-gr11 */ + + retval |= __copy_from_user(&pt->r8, &ppr->gr[8], sizeof(long) * 4); + + /* gr12-gr15 */ + + retval |= __copy_from_user(&pt->r12, &ppr->gr[12], sizeof(long) * 4); + + /* gr16-gr31 */ + + retval |= __copy_from_user(&pt->r16, &ppr->gr[16], sizeof(long) * 16); + + /* b0 */ + + retval |= __get_user(pt->b0, &ppr->br[0]); + + /* b1-b5 */ + + for (i = 1; i < 6; i++) { + retval |= unw_access_br(&info, i, &ppr->br[i], 1); + } + + /* b6-b7 */ + + retval |= __get_user(pt->b6, &ppr->br[6]); + retval |= __get_user(pt->b7, &ppr->br[7]); + + /* fr2-fr5 */ + + for (i = 2; i < 6; i++) { + retval |= access_fr(&info, i, 0, (unsigned long *) &ppr->fr[i], 1); + retval |= access_fr(&info, i, 1, (unsigned long *) &ppr->fr[i] + 1, 1); + } + + /* fr6-fr9 */ + + retval |= __copy_from_user(&pt->f6, &ppr->fr[6], sizeof(ppr->fr[6]) * 4); + + /* fp scratch regs(10-15) */ + + retval |= __copy_from_user(&sw->f10, &ppr->fr[10], sizeof(ppr->fr[10]) * 6); + + /* fr16-fr31 */ + + for (i = 16; i < 32; i++) { + retval |= access_fr(&info, i, 0, (unsigned long *) &ppr->fr[i], 1); + retval |= access_fr(&info, i, 1, (unsigned long *) &ppr->fr[i] + 1, 1); + } + + /* fph */ + + ia64_sync_fph(child); + retval |= __copy_from_user(&child->thread.fph, &ppr->fr[32], sizeof(ppr->fr[32]) * 96); + + /* preds */ + + retval |= __get_user(pt->pr, &ppr->pr); + + /* nat bits */ + + retval |= access_uarea(child, PT_NAT_BITS, &ppr->nat, 1); + + ret = retval ? -EIO : 0; + return ret; +} + /* * Called by kernel/ptrace.c when detaching.. 
* @@ -916,9 +1193,9 @@ sys_ptrace (long request, pid_t pid, unsigned long addr, unsigned long data, if (data > _NSIG) goto out_tsk; if (request == PTRACE_SYSCALL) - child->ptrace |= PT_TRACESYS; + set_tsk_thread_flag(child, TIF_SYSCALL_TRACE); else - child->ptrace &= ~PT_TRACESYS; + clear_tsk_thread_flag(child, TIF_SYSCALL_TRACE); child->exit_code = data; /* make sure the single step/taken-branch trap bits are not set: */ @@ -959,7 +1236,7 @@ sys_ptrace (long request, pid_t pid, unsigned long addr, unsigned long data, if (data > _NSIG) goto out_tsk; - child->ptrace &= ~PT_TRACESYS; + clear_tsk_thread_flag(child, TIF_SYSCALL_TRACE); if (request == PTRACE_SINGLESTEP) { ia64_psr(pt)->ss = 1; } else { @@ -979,12 +1256,28 @@ sys_ptrace (long request, pid_t pid, unsigned long addr, unsigned long data, ret = ptrace_detach(child, data); goto out_tsk; + case PTRACE_GETREGS: + ret = ptrace_getregs(child, (struct pt_all_user_regs*) data); + goto out_tsk; + + case PTRACE_SETREGS: + ret = ptrace_setregs(child, (struct pt_all_user_regs*) data); + goto out_tsk; + + case PTRACE_SETOPTIONS: + if (data & PTRACE_O_TRACESYSGOOD) + child->ptrace |= PT_TRACESYSGOOD; + else + child->ptrace &= ~PT_TRACESYSGOOD; + ret = 0; + break; + default: ret = -EIO; goto out_tsk; } out_tsk: - free_task_struct(child); + put_task_struct(child); out: unlock_kernel(); return ret; @@ -993,9 +1286,16 @@ sys_ptrace (long request, pid_t pid, unsigned long addr, unsigned long data, void syscall_trace (void) { - if ((current->ptrace & (PT_PTRACED|PT_TRACESYS)) != (PT_PTRACED|PT_TRACESYS)) + if (!test_thread_flag(TIF_SYSCALL_TRACE)) + return; + if (!(current->ptrace & PT_PTRACED)) return; - current->exit_code = SIGTRAP; + /* + * The 0x80 provides a way for the tracing parent to distinguish between a syscall + * stop and SIGTRAP delivery. + */ + current->exit_code = SIGTRAP | ((current->ptrace & PT_TRACESYSGOOD) + ? 
0x80 : 0); set_current_state(TASK_STOPPED); notify_parent(current, SIGCHLD); schedule(); diff --git a/arch/ia64/kernel/sal.c b/arch/ia64/kernel/sal.c index 61f86eb..bd0fd49 100644 --- a/arch/ia64/kernel/sal.c +++ b/arch/ia64/kernel/sal.c @@ -18,7 +18,8 @@ #include <asm/sal.h> #include <asm/pal.h> -spinlock_t sal_lock = SPIN_LOCK_UNLOCKED; +spinlock_t sal_lock __cacheline_aligned = SPIN_LOCK_UNLOCKED; +unsigned long sal_platform_features; static struct { void *addr; /* function entry point */ @@ -76,7 +77,7 @@ ia64_sal_strerror (long status) return str; } -static void __init +static void __init ia64_sal_handler_init (void *entry_point, void *gpval) { /* fill in the SAL procedure descriptor and point ia64_sal to it: */ @@ -102,7 +103,7 @@ ia64_sal_init (struct ia64_sal_systab *systab) if (strncmp(systab->signature, "SST_", 4) != 0) printk("bad signature in system table!"); - /* + /* * revisions are coded in BCD, so %x does the job for us */ printk("SAL v%x.%02x: oem=%.32s, product=%.32s\n", @@ -152,12 +153,12 @@ ia64_sal_init (struct ia64_sal_systab *systab) case SAL_DESC_PLATFORM_FEATURE: { struct ia64_sal_desc_platform_feature *pf = (void *) p; + sal_platform_features = pf->feature_mask; printk("SAL: Platform features "); - if (pf->feature_mask & (1 << 0)) + if (pf->feature_mask & IA64_SAL_PLATFORM_FEATURE_BUS_LOCK) printk("BusLock "); - - if (pf->feature_mask & (1 << 1)) { + if (pf->feature_mask & IA64_SAL_PLATFORM_FEATURE_IRQ_REDIR_HINT) { printk("IRQ_Redirection "); #ifdef CONFIG_SMP if (no_int_routing) @@ -166,15 +167,17 @@ ia64_sal_init (struct ia64_sal_systab *systab) smp_int_redirect |= SMP_IRQ_REDIRECTION; #endif } - if (pf->feature_mask & (1 << 2)) { + if (pf->feature_mask & IA64_SAL_PLATFORM_FEATURE_IPI_REDIR_HINT) { printk("IPI_Redirection "); #ifdef CONFIG_SMP - if (no_int_routing) + if (no_int_routing) smp_int_redirect &= ~SMP_IPI_REDIRECTION; else smp_int_redirect |= SMP_IPI_REDIRECTION; #endif } + if (pf->feature_mask & 
IA64_SAL_PLATFORM_FEATURE_ITC_DRIFT) + printk("ITC_Drift "); printk("\n"); break; } diff --git a/arch/ia64/kernel/salinfo.c b/arch/ia64/kernel/salinfo.c new file mode 100644 index 0000000..e556c35 --- a/dev/null +++ b/arch/ia64/kernel/salinfo.c @@ -0,0 +1,105 @@ +/* + * salinfo.c + * + * Creates entries in /proc/sal for various system features. + * + * Copyright (c) 2001 Silicon Graphics, Inc. All rights reserved. + * + * 10/30/2001 jbarnes@sgi.com copied much of Stephane's palinfo + * code to create this file + */ + +#include <linux/types.h> +#include <linux/proc_fs.h> +#include <linux/module.h> + +#include <asm/sal.h> + +MODULE_AUTHOR("Jesse Barnes <jbarnes@sgi.com>"); +MODULE_DESCRIPTION("/proc interface to IA-64 SAL features"); +MODULE_LICENSE("GPL"); + +static int salinfo_read(char *page, char **start, off_t off, int count, int *eof, void *data); + +typedef struct { + const char *name; /* name of the proc entry */ + unsigned long feature; /* feature bit */ + struct proc_dir_entry *entry; /* registered entry (removal) */ +} salinfo_entry_t; + +/* + * List {name,feature} pairs for every entry in /proc/sal/<feature> + * that this module exports + */ +static salinfo_entry_t salinfo_entries[]={ + { "bus_lock", IA64_SAL_PLATFORM_FEATURE_BUS_LOCK, }, + { "irq_redirection", IA64_SAL_PLATFORM_FEATURE_IRQ_REDIR_HINT, }, + { "ipi_redirection", IA64_SAL_PLATFORM_FEATURE_IPI_REDIR_HINT, }, + { "itc_drift", IA64_SAL_PLATFORM_FEATURE_ITC_DRIFT, }, +}; + +#define NR_SALINFO_ENTRIES (sizeof(salinfo_entries)/sizeof(salinfo_entry_t)) + +/* + * One for each feature and one more for the directory entry... 
+ */ +static struct proc_dir_entry *salinfo_proc_entries[NR_SALINFO_ENTRIES + 1]; + +static int __init +salinfo_init(void) +{ + struct proc_dir_entry *salinfo_dir; /* /proc/sal dir entry */ + struct proc_dir_entry **sdir = salinfo_proc_entries; /* keeps track of every entry */ + int i; + + salinfo_dir = proc_mkdir("sal", NULL); + + for (i=0; i < NR_SALINFO_ENTRIES; i++) { + /* pass the feature bit in question as misc data */ + *sdir++ = create_proc_read_entry (salinfo_entries[i].name, 0, salinfo_dir, + salinfo_read, (void *)salinfo_entries[i].feature); + } + *sdir++ = salinfo_dir; + + return 0; +} + +static void __exit +salinfo_exit(void) +{ + int i = 0; + + for (i = 0; i < NR_SALINFO_ENTRIES ; i++) { + if (salinfo_proc_entries[i]) + remove_proc_entry (salinfo_proc_entries[i]->name, NULL); + } +} + +/* + * 'data' contains an integer that corresponds to the feature we're + * testing + */ +static int +salinfo_read(char *page, char **start, off_t off, int count, int *eof, void *data) +{ + int len = 0; + + MOD_INC_USE_COUNT; + + len = sprintf(page, (sal_platform_features & (unsigned long)data) ? 
"1\n" : "0\n"); + + if (len <= off+count) *eof = 1; + + *start = page + off; + len -= off; + + if (len>count) len = count; + if (len<0) len = 0; + + MOD_DEC_USE_COUNT; + + return len; +} + +module_init(salinfo_init); +module_exit(salinfo_exit); diff --git a/arch/ia64/kernel/setup.c b/arch/ia64/kernel/setup.c index 7eda024..658b7b1 100644 --- a/arch/ia64/kernel/setup.c +++ b/arch/ia64/kernel/setup.c @@ -3,7 +3,7 @@ * * Copyright (C) 1998-2001 Hewlett-Packard Co * David Mosberger-Tang <davidm@hpl.hp.com> - * Copyright (C) 1998, 1999, 2001 Stephane Eranian <eranian@hpl.hp.com> + * Stephane Eranian <eranian@hpl.hp.com> * Copyright (C) 2000, Rohit Seth <rohit.seth@intel.com> * Copyright (C) 1999 VA Linux Systems * Copyright (C) 1999 Walt Drummond <drummond@valinux.com> @@ -20,6 +20,7 @@ #include <linux/init.h> #include <linux/bootmem.h> +#include <linux/console.h> #include <linux/delay.h> #include <linux/kernel.h> #include <linux/reboot.h> @@ -27,7 +28,7 @@ #include <linux/seq_file.h> #include <linux/string.h> #include <linux/threads.h> -#include <linux/console.h> +#include <linux/tty.h> #include <asm/acpi-ext.h> #include <asm/ia32.h> @@ -147,6 +148,10 @@ free_available_memory (unsigned long start, unsigned long end, void *arg) } +/* + * Find a place to put the bootmap and return its starting address in bootmap_start. + * This address must be page-aligned. 
+ */ static int find_bootmap_location (unsigned long start, unsigned long end, void *arg) { @@ -165,7 +170,7 @@ find_bootmap_location (unsigned long start, unsigned long end, void *arg) for (i = 0; i < num_rsvd_regions; i++) { range_start = MAX(start, free_start); - range_end = MIN(end, rsvd_region[i].start); + range_end = MIN(end, rsvd_region[i].start & PAGE_MASK); if (range_end <= range_start) continue; /* skip over empty range */ @@ -177,7 +182,7 @@ find_bootmap_location (unsigned long start, unsigned long end, void *arg) /* nothing more available in this segment */ if (range_end == end) return 0; - free_start = rsvd_region[i].end; + free_start = PAGE_ALIGN(rsvd_region[i].end); } return 0; } @@ -306,6 +311,10 @@ setup_arch (char **cmdline_p) /* process SAL system table: */ ia64_sal_init(efi.sal_systab); +#ifdef CONFIG_IA64_GENERIC + machvec_init(acpi_get_sysname()); +#endif + /* * Set `iobase' to the appropriate address in region 6 * (uncached access range) @@ -332,10 +341,6 @@ setup_arch (char **cmdline_p) cpu_init(); /* initialize the bootstrap CPU */ -#ifdef CONFIG_IA64_GENERIC - machvec_init(acpi_get_sysname()); -#endif - if (efi.acpi20) { /* Parse the ACPI 2.0 tables */ acpi20_parse(efi.acpi20); @@ -371,17 +376,14 @@ show_cpuinfo (struct seq_file *m, void *v) { #ifdef CONFIG_SMP # define lpj c->loops_per_jiffy +# define cpunum c->cpu #else # define lpj loops_per_jiffy +# define cpunum 0 #endif char family[32], features[128], *cp; struct cpuinfo_ia64 *c = v; - unsigned long mask, cpu = c - cpu_data(0); - -#ifdef CONFIG_SMP - if (!(cpu_online_map & (1 << cpu))) - return 0; -#endif + unsigned long mask; mask = c->features; @@ -403,7 +405,7 @@ show_cpuinfo (struct seq_file *m, void *v) sprintf(cp, " 0x%lx", mask); seq_printf(m, - "processor : %lu\n" + "processor : %d\n" "vendor : %s\n" "arch : IA-64\n" "family : %s\n" @@ -416,7 +418,7 @@ show_cpuinfo (struct seq_file *m, void *v) "cpu MHz : %lu.%06lu\n" "itc MHz : %lu.%06lu\n" "BogoMIPS : %lu.%02lu\n\n", - cpu, 
c->vendor, family, c->model, c->revision, c->archrev, + cpunum, c->vendor, family, c->model, c->revision, c->archrev, features, c->ppn, c->number, c->proc_freq / 1000000, c->proc_freq % 1000000, c->itc_freq / 1000000, c->itc_freq % 1000000, @@ -427,6 +429,10 @@ show_cpuinfo (struct seq_file *m, void *v) static void * c_start (struct seq_file *m, loff_t *pos) { +#ifdef CONFIG_SMP + while (*pos < NR_CPUS && !(cpu_online_map & (1 << *pos))) + ++*pos; +#endif return *pos < NR_CPUS ? cpu_data(*pos) : NULL; } @@ -483,6 +489,9 @@ identify_cpu (struct cpuinfo_ia64 *c) cpuid.bits[i] = ia64_get_cpuid(i); memcpy(c->vendor, cpuid.field.vendor, 16); +#ifdef CONFIG_SMP + c->cpu = smp_processor_id(); +#endif c->ppn = cpuid.field.ppn; c->number = cpuid.field.number; c->revision = cpuid.field.revision; @@ -534,7 +543,7 @@ cpu_init (void) = alloc_bootmem_pages_node(NODE_DATA(numa_node_id()), sizeof(struct cpuinfo_ia64)); for (cpu = 1; cpu < NR_CPUS; ++cpu) - memcpy(my_cpu_data->cpu_data[cpu]->cpu_data_ptrs, + memcpy(my_cpu_data->cpu_data[cpu]->cpu_data, my_cpu_data->cpu_data, sizeof(my_cpu_data->cpu_data)); } else { order = get_order(sizeof(struct cpuinfo_ia64)); @@ -577,6 +586,8 @@ cpu_init (void) atomic_inc(&init_mm.mm_count); current->active_mm = &init_mm; + if (current->mm) + BUG(); ia64_mmu_init(my_cpu_data); @@ -616,4 +627,6 @@ cpu_init (void) num_phys_stacked = 96; } local_cpu_data->phys_stacked_size_p8 = num_phys_stacked*8 + 8; + + platform_cpu_init(); } diff --git a/arch/ia64/kernel/sigframe.h b/arch/ia64/kernel/sigframe.h index 797c673..414c121 100644 --- a/arch/ia64/kernel/sigframe.h +++ b/arch/ia64/kernel/sigframe.h @@ -21,3 +21,5 @@ struct sigframe { struct siginfo info; struct sigcontext sc; }; + +extern long ia64_do_signal (sigset_t *, struct sigscratch *, long); diff --git a/arch/ia64/kernel/signal.c b/arch/ia64/kernel/signal.c index 9b2a7b9..5bf67ba 100644 --- a/arch/ia64/kernel/signal.c +++ b/arch/ia64/kernel/signal.c @@ -1,7 +1,7 @@ /* * Architecture-specific 
signal handling support. * - * Copyright (C) 1999-2001 Hewlett-Packard Co + * Copyright (C) 1999-2002 Hewlett-Packard Co * David Mosberger-Tang <davidm@hpl.hp.com> * * Derived from i386 and Alpha versions. @@ -17,6 +17,8 @@ #include <linux/smp.h> #include <linux/smp_lock.h> #include <linux/stddef.h> +#include <linux/tty.h> +#include <linux/binfmts.h> #include <linux/unistd.h> #include <linux/wait.h> @@ -39,8 +41,6 @@ # define GET_SIGSET(k,u) __get_user((k)->sig[0], &(u)->sig[0]) #endif -extern long ia64_do_signal (sigset_t *, struct sigscratch *, long); /* forward decl */ - long ia64_rt_sigsuspend (sigset_t *uset, size_t sigsetsize, struct sigscratch *scr) { @@ -160,6 +160,7 @@ copy_siginfo_to_user (siginfo_t *to, siginfo_t *from) err |= __put_user((short)from->si_code, &to->si_code); switch (from->si_code >> 16) { case __SI_FAULT >> 16: + err |= __put_user(from->si_flags, &to->si_flags); err |= __put_user(from->si_isr, &to->si_isr); case __SI_POLL >> 16: err |= __put_user(from->si_addr, &to->si_addr); @@ -172,7 +173,12 @@ copy_siginfo_to_user (siginfo_t *to, siginfo_t *from) case __SI_PROF >> 16: err |= __put_user(from->si_uid, &to->si_uid); err |= __put_user(from->si_pid, &to->si_pid); - err |= __put_user(from->si_pfm_ovfl, &to->si_pfm_ovfl); + if (from->si_code == PROF_OVFL) { + err |= __put_user(from->si_pfm_ovfl[0], &to->si_pfm_ovfl[0]); + err |= __put_user(from->si_pfm_ovfl[1], &to->si_pfm_ovfl[1]); + err |= __put_user(from->si_pfm_ovfl[2], &to->si_pfm_ovfl[2]); + err |= __put_user(from->si_pfm_ovfl[3], &to->si_pfm_ovfl[3]); + } break; default: err |= __put_user(from->si_uid, &to->si_uid); @@ -239,7 +245,7 @@ ia64_rt_sigreturn (struct sigscratch *scr) * could be corrupted. */ retval = (long) &ia64_leave_kernel; - if (current->ptrace & PT_TRACESYS) + if (test_thread_flag(TIF_SYSCALL_TRACE)) /* * strace expects to be notified after sigreturn returns even though the * context to which we return may not be in the middle of a syscall. 
diff --git a/arch/ia64/kernel/smp.c b/arch/ia64/kernel/smp.c index 9f7966c..9c03776 100644 --- a/arch/ia64/kernel/smp.c +++ b/arch/ia64/kernel/smp.c @@ -29,6 +29,7 @@ #include <linux/smp.h> #include <linux/kernel_stat.h> #include <linux/mm.h> +#include <linux/cache.h> #include <linux/delay.h> #include <linux/cache.h> @@ -38,7 +39,6 @@ #include <asm/delay.h> #include <asm/efi.h> #include <asm/machvec.h> - #include <asm/io.h> #include <asm/irq.h> #include <asm/page.h> @@ -51,14 +51,19 @@ #include <asm/unistd.h> #include <asm/mca.h> -/* The 'big kernel lock' */ -spinlock_t kernel_flag __cacheline_aligned_in_smp = SPIN_LOCK_UNLOCKED; +/* + * The Big Kernel Lock. It's not supposed to be used for performance critical stuff + * anymore. But we still need to align it because certain workloads are still affected by + * it. For example, llseek() and various other filesystem related routines still use the + * BKL. + */ +spinlock_t kernel_flag __cacheline_aligned = SPIN_LOCK_UNLOCKED; /* * Structure and data for smp_call_function(). This is designed to minimise static memory * requirements. It also looks cleaner. */ -static spinlock_t call_lock = SPIN_LOCK_UNLOCKED; +static spinlock_t call_lock __cacheline_aligned = SPIN_LOCK_UNLOCKED; struct call_data_struct { void (*func) (void *info); @@ -70,8 +75,12 @@ struct call_data_struct { static volatile struct call_data_struct *call_data; +static spinlock_t migration_lock = SPIN_LOCK_UNLOCKED; +static task_t *migrating_task; + #define IPI_CALL_FUNC 0 #define IPI_CPU_STOP 1 +#define IPI_MIGRATE_TASK 2 static void stop_this_cpu (void) @@ -98,51 +107,60 @@ handle_IPI (int irq, void *dev_id, struct pt_regs *regs) mb(); /* Order interrupt and bit testing. */ while ((ops = xchg(pending_ipis, 0)) != 0) { - mb(); /* Order bit clearing and data access. 
*/ - do { - unsigned long which; - - which = ffz(~ops); - ops &= ~(1 << which); - - switch (which) { - case IPI_CALL_FUNC: - { - struct call_data_struct *data; - void (*func)(void *info); - void *info; - int wait; - - /* release the 'pointer lock' */ - data = (struct call_data_struct *) call_data; - func = data->func; - info = data->info; - wait = data->wait; - - mb(); - atomic_inc(&data->started); - - /* At this point the structure may be gone unless wait is true. */ - (*func)(info); - - /* Notify the sending CPU that the task is done. */ - mb(); - if (wait) - atomic_inc(&data->finished); + mb(); /* Order bit clearing and data access. */ + do { + unsigned long which; + + which = ffz(~ops); + ops &= ~(1 << which); + + switch (which) { + case IPI_CALL_FUNC: + { + struct call_data_struct *data; + void (*func)(void *info); + void *info; + int wait; + + /* release the 'pointer lock' */ + data = (struct call_data_struct *) call_data; + func = data->func; + info = data->info; + wait = data->wait; + + mb(); + atomic_inc(&data->started); + /* + * At this point the structure may be gone unless + * wait is true. + */ + (*func)(info); + + /* Notify the sending CPU that the task is done. */ + mb(); + if (wait) + atomic_inc(&data->finished); + } + break; + + case IPI_MIGRATE_TASK: + { + task_t *p = migrating_task; + spin_unlock(&migration_lock); + sched_task_migrated(p); + } + break; + + case IPI_CPU_STOP: + stop_this_cpu(); + break; + + default: + printk(KERN_CRIT "Unknown IPI on CPU %d: %lu\n", this_cpu, which); + break; } - break; - - case IPI_CPU_STOP: - stop_this_cpu(); - break; - - default: - printk(KERN_CRIT "Unknown IPI on CPU %d: %lu\n", this_cpu, which); - break; - } /* Switch */ - } while (ops); - - mb(); /* Order data access and bit testing. */ + } while (ops); + mb(); /* Order data access and bit testing. 
*/ } } @@ -185,10 +203,25 @@ smp_send_reschedule (int cpu) platform_send_ipi(cpu, IA64_IPI_RESCHEDULE, IA64_IPI_DM_INT, 0); } +/* + * This function sends a reschedule IPI to all (other) CPUs. This should only be used if + * some 'global' task became runnable, such as a RT task, that must be handled now. The + * first CPU that manages to grab the task will run it. + */ +void +smp_send_reschedule_all (void) +{ + int i; + + for (i = 0; i < smp_num_cpus; i++) + if (i != smp_processor_id()) + smp_send_reschedule(i); +} + void smp_flush_tlb_all (void) { - smp_call_function ((void (*)(void *))__flush_tlb_all,0,1,1); + smp_call_function((void (*)(void *))__flush_tlb_all, 0, 1, 1); __flush_tlb_all(); } @@ -317,6 +350,15 @@ smp_send_stop (void) smp_num_cpus = 1; } +void +smp_migrate_task (int cpu, task_t *p) +{ + /* The target CPU will unlock the migration spinlock: */ + spin_lock(&migration_lock); + migrating_task = p; + send_IPI_single(cpu, IPI_MIGRATE_TASK); +} + int __init setup_profiling_timer (unsigned int multiplier) { diff --git a/arch/ia64/kernel/smpboot.c b/arch/ia64/kernel/smpboot.c index bc3dbf7..16eaaa5 100644 --- a/arch/ia64/kernel/smpboot.c +++ b/arch/ia64/kernel/smpboot.c @@ -70,6 +70,7 @@ extern void __init calibrate_delay(void); extern void start_ap(void); int cpucount; +task_t *task_for_booting_cpu; /* Setup configured maximum number of CPUs to activate */ static int max_cpus = -1; @@ -378,7 +379,7 @@ start_secondary (void *unused) smp_callin(); Dprintk("CPU %d is set to go.\n", smp_processor_id()); while (!atomic_read(&smp_commenced)) - ; + cpu_relax(); Dprintk("CPU %d is starting idle.\n", smp_processor_id()); return cpu_idle(); @@ -416,13 +417,13 @@ do_boot_cpu (int sapicid) if (!idle) panic("No idle process for CPU %d", cpu); - idle->processor = cpu; + init_idle(idle, cpu); + ia64_cpu_to_sapicid[cpu] = sapicid; - idle->cpus_runnable = 1 << cpu; /* we schedule the first task manually */ - del_from_runqueue(idle); unhash_process(idle); - init_tasks[cpu] 
= idle; + + task_for_booting_cpu = idle; Dprintk("Sending wakeup vector %u to AP 0x%x/0x%x.\n", ap_wakeup_vector, cpu, sapicid); @@ -451,6 +452,17 @@ do_boot_cpu (int sapicid) } } +unsigned long cache_decay_ticks; /* # of ticks an idle task is considered cache-hot */ + +static void +smp_tune_scheduling (void) +{ + cache_decay_ticks = 10; /* XXX base this on PAL info and cache-bandwidth estimate */ + + printk("task migration cache decay timeout: %ld msecs.\n", + (cache_decay_ticks + 1) * 1000 / HZ); +} + /* * Cycle through the APs sending Wakeup IPIs to boot each. */ @@ -470,8 +482,8 @@ smp_boot_cpus (void) smp_setup_percpu_timer(); /* - * We have the boot CPU online for sure. - */ + * We have the boot CPU online for sure. + */ set_bit(0, &cpu_online_map); set_bit(0, &cpu_callin_map); @@ -480,9 +492,9 @@ smp_boot_cpus (void) printk("Boot processor id 0x%x/0x%x\n", 0, boot_cpu_id); - global_irq_holder = 0; - current->processor = 0; - init_idle(); + global_irq_holder = NO_PROC_ID; + current_thread_info()->cpu = 0; + smp_tune_scheduling(); /* * If SMP should be disabled, then really disable it! @@ -493,7 +505,7 @@ smp_boot_cpus (void) smp_num_cpus = 1; goto smp_done; } - if (max_cpus != -1) + if (max_cpus != -1) printk (KERN_INFO "Limiting CPUs to %d\n", max_cpus); if (smp_boot_data.cpu_count > 1) { diff --git a/arch/ia64/kernel/sys_ia64.c b/arch/ia64/kernel/sys_ia64.c index 8b4512d..8079857 100644 --- a/arch/ia64/kernel/sys_ia64.c +++ b/arch/ia64/kernel/sys_ia64.c @@ -2,8 +2,8 @@ * This file contains various system calls that have different calling * conventions on different platforms. 
* - * Copyright (C) 1999-2000 Hewlett-Packard Co - * Copyright (C) 1999-2000 David Mosberger-Tang <davidm@hpl.hp.com> + * Copyright (C) 1999-2000, 2002 Hewlett-Packard Co + * David Mosberger-Tang <davidm@hpl.hp.com> */ #include <linux/config.h> #include <linux/errno.h> @@ -201,15 +201,13 @@ do_mmap2 (unsigned long addr, unsigned long len, int prot, int flags, int fd, un if (len == 0) goto out; - /* don't permit mappings into unmapped space or the virtual page table of a region: */ + /* + * Don't permit mappings into unmapped space, the virtual page table of a region, + * or across a region boundary. Note: RGN_MAP_LIMIT is equal to 2^n-PAGE_SIZE + * (for some integer n <= 61) and len > 0. + */ roff = rgn_offset(addr); - if ((len | roff | (roff + len)) >= RGN_MAP_LIMIT) { - addr = -EINVAL; - goto out; - } - - /* don't permit mappings that would cross a region boundary: */ - if (rgn_index(addr) != rgn_index(addr + len)) { + if ((len > RGN_MAP_LIMIT) || (roff > (RGN_MAP_LIMIT - len))) { addr = -EINVAL; goto out; } @@ -276,74 +274,6 @@ ia64_create_module (const char *name_user, size_t size, long arg2, long arg3, return addr; } -#if 1 -/* - * This is here for a while to keep compatibillity with the old stat() - * call - it will be removed later once everybody migrates to the new - * kernel stat structure that matches the glibc one - Jes - */ - -static int -cp_ia64_old_stat (struct kstat *stat, struct ia64_oldstat *statbuf) -{ - struct ia64_oldstat tmp; - unsigned int blocks, indirect; - - memset(&tmp, 0, sizeof(tmp)); - tmp.st_dev = stat->dev; - tmp.st_ino = stat->ino; - tmp.st_mode = stat->mode; - tmp.st_nlink = stat->nlink; - SET_STAT_UID(tmp, stat->uid); - SET_STAT_GID(tmp, stat->gid); - tmp.st_rdev = stat->rdev; - tmp.st_size = stat->size; - tmp.st_atime = stat->atime; - tmp.st_mtime = stat->mtime; - tmp.st_ctime = stat->ctime; - tmp.st_blocks = stat->i_blocks; - tmp.st_blksize = stat->i_blksize; - return copy_to_user(statbuf,&tmp,sizeof(tmp)) ? 
-EFAULT : 0; -} - -asmlinkage long -ia64_oldstat (char *filename, struct ia64_oldstat *statbuf) -{ - struct kstat stat; - int error = vfs_stat(filename, &stat); - - if (!error) - error = cp_ia64_old_stat(&stat, statbuf); - - return error; -} - -asmlinkage long -ia64_oldlstat (char *filename, struct ia64_oldstat *statbuf) -{ - struct kstat stat; - int error = vfs_lstat(filename, &stat); - - if (!error) - error = cp_ia64_old_stat(&stat, statbuf); - - return error; -} - -asmlinkage long -ia64_oldfstat (unsigned int fd, struct ia64_oldstat *statbuf) -{ - struct kstat stat; - int error = vfs_fstat(fd, &stat); - - if (!error) - error = cp_ia64_old_stat(&stat, statbuf); - - return error; -} - -#endif - #ifndef CONFIG_PCI asmlinkage long diff --git a/arch/ia64/kernel/traps.c b/arch/ia64/kernel/traps.c index 8b949be..ccef9a5 100644 --- a/arch/ia64/kernel/traps.c +++ b/arch/ia64/kernel/traps.c @@ -1,7 +1,7 @@ /* * Architecture-specific trap handling. * - * Copyright (C) 1998-2001 Hewlett-Packard Co + * Copyright (C) 1998-2002 Hewlett-Packard Co * David Mosberger-Tang <davidm@hpl.hp.com> * * 05/12/00 grao <goutham.rao@intel.com> : added isr in siginfo for SIGFPE @@ -32,6 +32,7 @@ register double f30 asm ("f30"); register double f31 asm ("f31"); #include <linux/kernel.h> #include <linux/init.h> #include <linux/sched.h> +#include <linux/tty.h> #include <linux/vt_kern.h> /* For unblank_screen() */ #include <asm/hardirq.h> @@ -133,6 +134,8 @@ ia64_bad_break (unsigned long break_num, struct pt_regs *regs) /* SIGILL, SIGFPE, SIGSEGV, and SIGBUS want these field initialized: */ siginfo.si_addr = (void *) (regs->cr_iip + ia64_psr(regs)->ri); siginfo.si_imm = break_num; + siginfo.si_flags = 0; /* clear __ISR_VALID */ + siginfo.si_isr = 0; switch (break_num) { case 0: /* unknown error */ @@ -352,6 +355,8 @@ handle_fpu_swa (int fp_fault, struct pt_regs *regs, unsigned long isr) siginfo.si_code = FPE_FLTDIV; } siginfo.si_isr = isr; + siginfo.si_flags = __ISR_VALID; + siginfo.si_imm = 0; 
force_sig_info(SIGFPE, &siginfo, current); } } else { @@ -372,6 +377,8 @@ handle_fpu_swa (int fp_fault, struct pt_regs *regs, unsigned long isr) siginfo.si_code = FPE_FLTRES; } siginfo.si_isr = isr; + siginfo.si_flags = __ISR_VALID; + siginfo.si_imm = 0; force_sig_info(SIGFPE, &siginfo, current); } } @@ -490,6 +497,8 @@ ia64_fault (unsigned long vector, unsigned long isr, unsigned long ifa, siginfo.si_errno = 0; siginfo.si_addr = (void *) (regs->cr_iip + ia64_psr(regs)->ri); siginfo.si_imm = vector; + siginfo.si_flags = __ISR_VALID; + siginfo.si_isr = isr; force_sig_info(SIGILL, &siginfo, current); return; } @@ -517,6 +526,10 @@ ia64_fault (unsigned long vector, unsigned long isr, unsigned long ifa, } siginfo.si_signo = SIGTRAP; siginfo.si_errno = 0; + siginfo.si_flags = 0; + siginfo.si_isr = 0; + siginfo.si_addr = 0; + siginfo.si_imm = 0; force_sig_info(SIGTRAP, &siginfo, current); return; @@ -528,6 +541,9 @@ ia64_fault (unsigned long vector, unsigned long isr, unsigned long ifa, siginfo.si_errno = 0; siginfo.si_code = FPE_FLTINV; siginfo.si_addr = (void *) (regs->cr_iip + ia64_psr(regs)->ri); + siginfo.si_flags = __ISR_VALID; + siginfo.si_isr = isr; + siginfo.si_imm = 0; force_sig_info(SIGFPE, &siginfo, current); } return; @@ -537,6 +553,9 @@ ia64_fault (unsigned long vector, unsigned long isr, unsigned long ifa, siginfo.si_signo = SIGILL; siginfo.si_code = ILL_BADIADDR; siginfo.si_errno = 0; + siginfo.si_flags = 0; + siginfo.si_isr = 0; + siginfo.si_imm = 0; siginfo.si_addr = (void *) (regs->cr_iip + ia64_psr(regs)->ri); force_sig_info(SIGILL, &siginfo, current); return; diff --git a/arch/ia64/kernel/unaligned.c b/arch/ia64/kernel/unaligned.c index 90596f9..3f2671a 100644 --- a/arch/ia64/kernel/unaligned.c +++ b/arch/ia64/kernel/unaligned.c @@ -1,9 +1,9 @@ /* * Architecture-specific unaligned trap handling. 
* - * Copyright (C) 1999-2001 Hewlett-Packard Co - * Copyright (C) 1999-2000 Stephane Eranian <eranian@hpl.hp.com> - * Copyright (C) 2001 David Mosberger-Tang <davidm@hpl.hp.com> + * Copyright (C) 1999-2002 Hewlett-Packard Co + * Stephane Eranian <eranian@hpl.hp.com> + * David Mosberger-Tang <davidm@hpl.hp.com> * * 2001/10/11 Fix unaligned access to rotating registers in s/w pipelined loops. * 2001/08/13 Correct size of extended floats (float_fsz) from 16 to 10 bytes. @@ -12,6 +12,7 @@ #include <linux/kernel.h> #include <linux/sched.h> #include <linux/smp_lock.h> +#include <linux/tty.h> #include <asm/uaccess.h> #include <asm/rse.h> @@ -23,7 +24,7 @@ extern void die_if_kernel(char *str, struct pt_regs *regs, long err) __attribute #undef DEBUG_UNALIGNED_TRAP #ifdef DEBUG_UNALIGNED_TRAP -# define DPRINT(a...) do { printk("%s.%u: ", __FUNCTION__, __LINE__); printk (a); } while (0) +# define DPRINT(a...) do { printk("%s %u: ", __FUNCTION__, __LINE__); printk (a); } while (0) # define DDUMP(str,vp,len) dump(str, vp, len) static void @@ -650,7 +651,7 @@ emulate_load_updates (update_t type, load_store_t ld, struct pt_regs *regs, unsi * just in case. 
*/ if (ld.x6_op == 1 || ld.x6_op == 3) { - printk(KERN_ERR __FUNCTION__": register update on speculative load, error\n"); + printk("%s %s: register update on speculative load, error\n", KERN_ERR, __FUNCTION__); die_if_kernel("unaligned reference on specualtive load with register update\n", regs, 30); } @@ -1080,8 +1081,8 @@ emulate_load_floatpair (unsigned long ifa, load_store_t ld, struct pt_regs *regs * For this reason we keep this sanity check */ if (ld.x6_op == 1 || ld.x6_op == 3) - printk(KERN_ERR __FUNCTION__": register update on speculative load pair, " - "error\n"); + printk("%s %s: register update on speculative load pair, " + "error\n",KERN_ERR, __FUNCTION__); setreg(ld.r3, ifa, 0, regs); } @@ -1488,6 +1489,9 @@ ia64_handle_unaligned (unsigned long ifa, struct pt_regs *regs) si.si_errno = 0; si.si_code = BUS_ADRALN; si.si_addr = (void *) ifa; + si.si_flags = 0; + si.si_isr = 0; + si.si_imm = 0; force_sig_info(SIGBUS, &si, current); goto done; } diff --git a/arch/ia64/kernel/unwind.c b/arch/ia64/kernel/unwind.c index ea6284e..9c4eee5 100644 --- a/arch/ia64/kernel/unwind.c +++ b/arch/ia64/kernel/unwind.c @@ -1,6 +1,6 @@ /* - * Copyright (C) 1999-2001 Hewlett-Packard Co - * Copyright (C) 1999-2001 David Mosberger-Tang <davidm@hpl.hp.com> + * Copyright (C) 1999-2002 Hewlett-Packard Co + * David Mosberger-Tang <davidm@hpl.hp.com> */ /* * This file implements call frame unwind support for the Linux @@ -72,6 +72,8 @@ #define alloc_reg_state() kmalloc(sizeof(struct unw_state_record), GFP_ATOMIC) #define free_reg_state(usr) kfree(usr) +#define alloc_labeled_state() kmalloc(sizeof(struct unw_labeled_state), GFP_ATOMIC) +#define free_labeled_state(usr) kfree(usr) typedef unsigned long unw_word; typedef unsigned char unw_hash_index_t; @@ -521,7 +523,7 @@ unw_access_pr (struct unw_frame_info *info, unsigned long *val, int write) } -/* Unwind decoder routines */ +/* Routines to manipulate the state stack. 
*/ static inline void push (struct unw_state_record *sr) @@ -534,24 +536,60 @@ push (struct unw_state_record *sr) return; } memcpy(rs, &sr->curr, sizeof(*rs)); - rs->next = sr->stack; - sr->stack = rs; + sr->curr.next = rs; } static void pop (struct unw_state_record *sr) { - struct unw_reg_state *rs; + struct unw_reg_state *rs = sr->curr.next; - if (!sr->stack) { - printk ("unwind: stack underflow!\n"); + if (!rs) { + printk("unwind: stack underflow!\n"); return; } - rs = sr->stack; - sr->stack = rs->next; + memcpy(&sr->curr, rs, sizeof(*rs)); free_reg_state(rs); } +/* Make a copy of the state stack. Non-recursive to avoid stack overflows. */ +static struct unw_reg_state * +dup_state_stack (struct unw_reg_state *rs) +{ + struct unw_reg_state *copy, *prev = NULL, *first = NULL; + + while (rs) { + copy = alloc_reg_state(); + if (!copy) { + printk ("unwind.dup_state_stack: out of memory\n"); + return NULL; + } + memcpy(copy, rs, sizeof(*copy)); + if (first) + prev->next = copy; + else + first = copy; + rs = rs->next; + prev = copy; + } + return first; +} + +/* Free all stacked register states (but not RS itself). 
*/
+static void
+free_state_stack (struct unw_reg_state *rs)
+{
+	struct unw_reg_state *p, *next;
+
+	for (p = rs->next; p != NULL; p = next) {
+		next = p->next;
+		free_reg_state(p);
+	}
+	rs->next = NULL;
+}
+
+/* Unwind decoder routines */
+
 static enum unw_register_index __attribute__((const))
 decode_abreg (unsigned char abreg, int memory)
 {
@@ -689,7 +727,7 @@ desc_prologue (int body, unw_word rlen, unsigned char mask, unsigned char grsave
 	sr->first_region = 0;
 	/* check if we're done: */
-	if (body && sr->when_target < sr->region_start + sr->region_len) {
+	if (sr->when_target < sr->region_start + sr->region_len) {
 		sr->done = 1;
 		return;
 	}
@@ -902,31 +940,36 @@ desc_epilogue (unw_word t, unw_word ecount, struct unw_state_record *sr)
 static inline void
 desc_copy_state (unw_word label, struct unw_state_record *sr)
 {
-	struct unw_reg_state *rs;
+	struct unw_labeled_state *ls;
-	for (rs = sr->reg_state_list; rs; rs = rs->next) {
-		if (rs->label == label) {
-			memcpy (&sr->curr, rs, sizeof(sr->curr));
+	for (ls = sr->labeled_states; ls; ls = ls->next) {
+		if (ls->label == label) {
+			free_state_stack(&sr->curr);
+			memcpy(&sr->curr, &ls->saved_state, sizeof(sr->curr));
+			sr->curr.next = dup_state_stack(ls->saved_state.next);
 			return;
 		}
 	}
-	printk("unwind: failed to find state labelled 0x%lx\n", label);
+	printk("unwind: failed to find state labeled 0x%lx\n", label);
 }
 static inline void
 desc_label_state (unw_word label, struct unw_state_record *sr)
 {
-	struct unw_reg_state *rs;
+	struct unw_labeled_state *ls;
-	rs = alloc_reg_state();
-	if (!rs) {
-		printk("unwind: cannot stack!\n");
+	ls = alloc_labeled_state();
+	if (!ls) {
+		printk("unwind.desc_label_state(): out of memory\n");
 		return;
 	}
-	memcpy(rs, &sr->curr, sizeof(*rs));
-	rs->label = label;
-	rs->next = sr->reg_state_list;
-	sr->reg_state_list = rs;
+	ls->label = label;
+	memcpy(&ls->saved_state, &sr->curr, sizeof(ls->saved_state));
+	ls->saved_state.next = dup_state_stack(sr->curr.next);
+
+	/* insert into list
of labeled states: */
+	ls->next = sr->labeled_states;
+	sr->labeled_states = ls;
 }
 /*
@@ -1378,6 +1421,8 @@ lookup (struct unw_table *table, unsigned long rel_ip)
 		else
 			break;
 	}
+	if (rel_ip < e->start_offset || rel_ip >= e->end_offset)
+		return NULL;
 	return e;
 }
@@ -1388,9 +1433,9 @@ lookup (struct unw_table *table, unsigned long rel_ip)
 static inline struct unw_script *
 build_script (struct unw_frame_info *info)
 {
-	struct unw_reg_state *rs, *next;
 	const struct unw_table_entry *e = 0;
 	struct unw_script *script = 0;
+	struct unw_labeled_state *ls, *next;
 	unsigned long ip = info->ip;
 	struct unw_state_record sr;
 	struct unw_table *table;
@@ -1535,15 +1580,15 @@ build_script (struct unw_frame_info *info)
 	for (i = UNW_REG_BSP; i < UNW_NUM_REGS; ++i)
 		compile_reg(&sr, i, script);
-	/* free labelled register states & stack: */
+	/* free labeled register states & stack: */
 	STAT(parse_start = ia64_get_itc());
-	for (rs = sr.reg_state_list; rs; rs = next) {
-		next = rs->next;
-		free_reg_state(rs);
+	for (ls = sr.labeled_states; ls; ls = next) {
+		next = ls->next;
+		free_state_stack(&ls->saved_state);
+		free_labeled_state(ls);
 	}
-	while (sr.stack)
-		pop(&sr);
+	free_state_stack(&sr.curr);
 	STAT(unw.stat.script.parse_time += ia64_get_itc() - parse_start);
 	script_finalize(script, &sr);
diff --git a/arch/ia64/kernel/unwind_i.h b/arch/ia64/kernel/unwind_i.h
index 8aaff0a..c4763f3 100644
--- a/arch/ia64/kernel/unwind_i.h
+++ b/arch/ia64/kernel/unwind_i.h
@@ -1,6 +1,6 @@
 /*
- * Copyright (C) 2000 Hewlett-Packard Co
- * Copyright (C) 2000 David Mosberger-Tang <davidm@hpl.hp.com>
+ * Copyright (C) 2000, 2002 Hewlett-Packard Co
+ * David Mosberger-Tang <davidm@hpl.hp.com>
  *
  * Kernel unwind support.
*/
@@ -85,6 +85,17 @@ struct unw_reg_info {
 	int when; /* when the register gets saved */
 };
+struct unw_reg_state {
+	struct unw_reg_state *next; /* next (outer) element on state stack */
+	struct unw_reg_info reg[UNW_NUM_REGS]; /* register save locations */
+};
+
+struct unw_labeled_state {
+	struct unw_labeled_state *next; /* next labeled state (or NULL) */
+	unsigned long label; /* label for this state */
+	struct unw_reg_state saved_state;
+};
+
 struct unw_state_record {
 	unsigned int first_region : 1; /* is this the first region? */
 	unsigned int done : 1; /* are we done scanning descriptors? */
@@ -105,11 +116,8 @@ struct unw_state_record {
 	u8 gr_save_loc; /* next general register to use for saving a register */
 	u8 return_link_reg; /* branch register in which the return link is passed */
-	struct unw_reg_state {
-		struct unw_reg_state *next;
-		unsigned long label; /* label of this state record */
-		struct unw_reg_info reg[UNW_NUM_REGS];
-	} curr, *stack, *reg_state_list;
+	struct unw_labeled_state *labeled_states; /* list of all labeled states */
+	struct unw_reg_state curr; /* current state */
 };
 enum unw_nat_type {
@@ -139,7 +147,7 @@ struct unw_insn {
 };
 /*
- * Preserved general static registers (r2-r5) give rise to two script
+ * Preserved general static registers (r4-r7) give rise to two script
  * instructions; everything else yields at most one instruction; at
  * the end of the script, the psp gets popped, accounting for one more
  * instruction.
diff --git a/arch/ia64/lib/clear_page.S b/arch/ia64/lib/clear_page.S
index ed68678..d3a6359 100644
--- a/arch/ia64/lib/clear_page.S
+++ b/arch/ia64/lib/clear_page.S
@@ -1,51 +1,77 @@
 /*
- *
- * Optimized function to clear a page of memory.
- *
- * Inputs:
- * in0: address of page
- *
- * Output:
- * none
- *
- * Copyright (C) 1999-2001 Hewlett-Packard Co
- * Copyright (C) 1999 Stephane Eranian <eranian@hpl.hp.com>
- * Copyright (C) 1999-2001 David Mosberger-Tang <davidm@hpl.hp.com>
+ * Copyright (C) 1999-2002 Hewlett-Packard Co
+ * Stephane Eranian <eranian@hpl.hp.com>
+ * David Mosberger-Tang <davidm@hpl.hp.com>
+ * Copyright (C) 2002 Ken Chen <kenneth.w.chen@int |