|
From: Steve W. <sc...@ne...> - 2003-03-21 12:40:02
|
Hi,
I've been playing around with SuperH's sh5 toolchain on NetBSD/sh5. (yeah,
this is a Linux list. Bite me... ;-).
At this point, I have it running natively under NetBSD/sh5 and have been
trying to get shared libraries working (don't ask; they're needed for
something else). In the course of this, I found a PIC bug. This may or
may not be a problem for a cross-sh5 compiler from the same sources; I
went straight in at the deep end and built it natively (bootstrapped using
an earlier native sh5 toolchain, before self-hosting it). Someone else can
verify if the cross compiler is similarly afflicted.
Anyhow, the following test case tickles the bug:
--8<--8<--
/*
* Compile with "cc -O2 -fPIC -c" (Note the -fPIC option)
*
* This will segfault in cc1:gen_movsi_const() for -O1 and -O2.
* Compiling with -O0 is fine.
*
* I suspect the problem is due to some interaction between the offset
* of 'wibble.bar' from 'wibble@GOTOFF' and a tail-call to
* 'some_function()' when optimisation is enabled.
*
* Interestingly, removing the 'static' storage class from 'wibble' also
* fixes the problem. So the problem does not affect 'wibble@GOT'. It is
* definitely only triggered by 'wibble@GOTOFF'.
*
* As I have no native debugger (yet) on NetBSD/sh5, the only way to
* debug this problem is to catch the segfault in the kernel debugger
* and look at the registers. It's caused by a NULL de-reference in
* gen_movsi_const(). I can provide a few more details if required.
*/
static struct {
    int foo;
    int bar;
} wibble;
extern void some_function(void *);
void
wobble(void)
{
    some_function(&wibble.bar);
#if 0
    /* Enabling this statement fixes the problem */
    wibble.foo = 0;
#endif
}
--8<--8<--
Of course, if no one else can repeat this, then I'll have to go looking at
a problem somewhere in NetBSD. :-/
Any ideas?
Cheers, Steve
|
|
From: Sean M. <Sea...@su...> - 2003-03-10 16:34:55
|
Hi,
Several people have been enquiring about a suitable GCC toolchain for
building SH-5 Linux kernels and applications. I have decided to make
available on our external FTP site the toolchain that I am using, as a
freely downloadable set of sources. This toolchain has some significant
improvements, and several bug-fixes, over previous versions. Specifically,
all problems with sharable libraries are now resolved, that is, there are
no known DLL issues with these files.
The tarballs are available in:
ftp://ftp.uk.superh.com/pub/SuperH-GNU/Barcelona-20030310
% ls -l
-r--r--r-- 14343634 Mar 7 14:51 sh5linux-source-SH5LINUX-BINUTILS-SS20030206-B1.tar.gz
-r--r--r-- 17342899 Mar 10 10:14 sh5linux-source-SH5LINUX-GLIBC-2.2.5-B4.tar.gz
-r--r--r-- 15873709 Mar 7 15:18 sh5linux-source-SH5LINUX-SH5GCC-CVS20020529.1800-B6.tar.gz
% md5sum *
8b734f7f0aa7c7e2893ed84e9869cd54 sh5linux-source-SH5LINUX-BINUTILS-SS20030206-B1.tar.gz
830df21d2c7a2b8bbc8b2067bb2c9dfa sh5linux-source-SH5LINUX-GLIBC-2.2.5-B4.tar.gz
145b55c6496c3b1715597a3b65422516 sh5linux-source-SH5LINUX-SH5GCC-CVS20020529.1800-B6.tar.gz
Gcc is based on a CVS snapshot of May last year, patched to the 3.2.1
level. Binutils is based on SS20030206, and GLIBC is based on the 2.2.5
release. The name Barcelona is the internal codename SuperH uses for the
SH-5 Linux toolchain.
The sources for the above toolchain have been built on a RedHat 7.2 x86
based system, and this cross-toolchain has been used to build the native
toolchain, which has natively built components such as "make" for SH-5
Linux. Right now, we are trying to get the native toolchain to bootstrap
itself, and building all the myriad of components that GCC requires - the
toolchain has proven to be very stable thus far!
If anyone is interested in obtaining the binary toolchain that I am using,
then please let me know and I will make that available as well.
If anyone finds any problems with the toolchain, then please let me know.
Regards,
Sean
--
------------------------------------------------------------------------
| Sean McGoogan,          | E-mail: Sea...@Su...                       |
| SuperH (UK) Ltd.,       |                                            |
| 2410 Aztec West,        | Direct: +44 (0) 1454 465670                |
| Almondsbury,            | Main:   +44 (0) 1454 465600                |
| Bristol, BS32 4QX, U.K. | Fax:    +44 (0) 1454 465601                |
------------------------------------------------------------------------
|
|
From: Paul M. <pau...@ti...> - 2002-11-08 21:26:16
|
Was just browsing the code when I noticed that map_cayman_irq() only deals
with INT A/B/C. The attached patch handles INTD as well (and therefore
also puts it in line with actual PCISIG compliance).
Regards,
--
Paul Mundt
pau...@ti...
TimeSys Corporation
|
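Neither map_cayman_irq() nor the attached patch is reproduced in the archive, so the following is only a hypothetical sketch of what "handling INTD as well" usually means. PCI numbers the interrupt pins 1 to 4 (INTA to INTD), with 0 meaning "no interrupt". The function name, the IRQ base value and the use of the conventional slot swizzle are all assumptions made for illustration, not the Cayman code.

```c
#include <assert.h>

/* Hypothetical base IRQ number; the real Cayman values are not in the thread. */
#define EXAMPLE_PCI_IRQ_BASE 64

/*
 * Map a PCI interrupt pin (1 = INTA ... 4 = INTD) to an IRQ number.
 * Returns -1 for pin 0 (no interrupt) or any out-of-range value.
 */
int example_map_pci_irq(unsigned int slot, unsigned int pin)
{
    unsigned int line;

    if (pin < 1 || pin > 4)
        return -1;

    /* Conventional bridge swizzle: rotate the pin by the slot number. */
    line = ((pin - 1) + slot) % 4;
    assert(line < 4);

    return EXAMPLE_PCI_IRQ_BASE + (int)line;
}
```

A mapping that stops at INTC would reject pin 4 here; accepting all four pins is the behaviour the patch description asks for.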
|
From: Paul M. <pau...@ti...> - 2002-11-08 20:37:41
|
Hi All,
It looks like the kernel sources support the simulator as a valid target.
In attempting to get it to work however, I'm unable to get the kernel to
even pretend to compile with the sh64-elf toolchain.
What's everyone using for their development setup? And more importantly,
does anyone have any prebuilt kernel images that'll load up under the
simulator?
Regards,
--
Paul Mundt
pau...@ti...
TimeSys Corporation
|
|
From: Stuart M. <stu...@st...> - 2002-10-30 11:10:45
|
On Wed, 30 Oct 2002 10:42:28 +0100
lar...@no... wrote:
> Stuart Menefy <stu...@st...> writes:
> > > Are both 32-bit and 64-bit virtual address spaces available to
> > > user mode programs?
> > the current kernel implementation can be more efficient by only
> > managing 32 bit addressing.
>
> But 64-bit arithmetic is supported (i.e. Linux saves all 64 bits of
> the registers), right?
Yes. All general purpose registers are 64 bit, and native 64 bit
operations are all supported. However C language pointers are 32 bit, and
registers which can only ever hold addresses (for example the program
counter and registers related to the TLB) only have 32 bits implemented.
Stuart
|
|
From: Lars B. <lar...@no...> - 2002-10-30 09:42:49
|
Stuart Menefy <stu...@st...> writes:
> > Are both 32-bit and 64-bit virtual address spaces available to
> > user mode programs?
> the current kernel implementation can be more efficient by only
> managing 32 bit addressing.
But 64-bit arithmetic is supported (i.e. Linux saves all 64 bits of
the registers), right?
> Hope this helps
Yes, thanks a lot!
|
|
From: Stuart M. <stu...@st...> - 2002-10-29 13:24:46
|
Hi Lars
On Tue, 29 Oct 2002 08:53:31 +0100
lar...@no... wrote:
> Hello,
>
> I didn't find any information on your
> http://www.superh-software.com/linux/
> site about the status of the Linux/SHmedia port.
All the kernel work is in the BK repository linux-shmedia
(linux-shmedia.bkbits.net).
> Does it boot to a
> shell prompt?
Yes
> Is it running multiuser stable?
Pretty much
> Are both 32-bit and
> 64-bit virtual address spaces available to user mode programs?
The short answer is no, only 32 bit address spaces are supported.
We need to distinguish between the SHmedia architecture and the current
SH5 implementation of that architecture. The SHmedia architecture is fully
64 bit, but the SH5 implementation doesn't implement full 64 bits in all
cases. In particular addresses are only 32 bit. So the current kernel
implementation can be more efficient by only managing 32 bit addressing.
> Thank you.
Hope this helps
Stuart
|
|
From: Lars B. <lar...@no...> - 2002-10-29 07:53:36
|
Hello,
I didn't find any information on your
http://www.superh-software.com/linux/
site about the status of the Linux/SHmedia port. Does it boot to a
shell prompt? Is it running multiuser stable? Are both 32-bit and
64-bit virtual address spaces available to user mode programs?
Thank you.
|
|
From: Gaster, B. <ben...@su...> - 2002-09-05 10:52:42
|
Hello!
I am planning to update the external bk repository for SHmedia Linux later
today to include work that we have been doing over the last month. This
work is best split into two categories:
1. Optimizations
2. Bug fixes
The work on optimizations includes enabling PCI DMA transfers, which itself
required fixing a large number of bugs in the DMA code that was
disabled---I suppose this was why! Other optimizations that have been done
include:
Optimized TLB miss handler
Optimized flush_tlb_range
Optimized PCI use of flush_cache_all
Unnecessary 1ms delay when accessing device locations!
We are currently using versions of the SHmedia development tools, driven by
shared libraries, that in combination with the optimizations exposed a
number of bugs in SHmedia assembler inserts, in particular annotations of
insert arguments.
We have begun work on optimizing string.h for SHmedia but there is still
much work to be done here, in particular the use of cache line allocation
and prefetch.
Please read the change logs for a detailed description of the changes
included in today's push into the SHmedia Linux bk repository.
The great thing about these optimizations and bug fixes is that we now have
a SHmedia Linux kernel that is beginning to show responsiveness similar to
that of x86 Linux... WOW! There is still some way to go and many more
optimizations that can be done; reducing the TLB handler from 3 levels to 2
levels, thus avoiding an extra level of indirection, is just one
straightforward example. We now have a kernel with video, sound, PCI USB,
and webcam all running for hours with "real time" performance! This is
really excellent!
Cheers,
Ben.
Benedict R. Gaster
SuperH
2430 Aztec West, Almondsbury, Bristol BS32 4AQ, UK
|
|
From: Stuart M. <stu...@st...> - 2002-07-29 17:52:03
|
Ben
On Thu, 25 Jul 2002 15:35:17 +0100
ben...@su... wrote:
> Hello!
>
> While debugging a development version of the SHmedia Linux kernel, in
> particular I now have a new selective purge/flush cache range
> implementation, I find that BusyBox often reports the following message
> but everything else seems to work fine:
>
> "sh: tcsetpgrp: Operation not permitted"
>
> Can anyone tell me what might be producing this message and/or what it
> means?
There were quite a few process group related problems with early 0.5x
versions of busybox. Ideally try and upgrade to a more recent version.
The other solution to some problems like this is to add to the command
line:
CONSOLE=/dev/ttySC0
or whatever is appropriate. The problem is that with the console on a
non-standard serial port (something other than /dev/ttySn), busybox can't
work out which device to use, and falls back to /dev/console, which in
turn has some odd restrictions on setting distinguished process groups.
Neither of which should have anything to do with cache changes! If you
are sure that the problem only occurred after your cache changes, then I
would suspect you have a bug in your changes. What though is impossible to
say. Something somewhere seeing stale data perhaps...
Stuart
|
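For readers unfamiliar with the message: "Operation not permitted" is the standard text for EPERM, which POSIX specifies for tcsetpgrp() when the requested process group is not in the caller's session. A job-control shell makes this call when it takes over the terminal. The sketch below is a minimal illustration of that call on a POSIX system; the helper name example_claim_terminal is made up here and this is not BusyBox code.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/*
 * Try to make the caller's process group the foreground group of the
 * terminal open on fd, as a job-control shell does at start-up.
 * Returns 0 on success, or the errno value left by tcsetpgrp() on failure.
 */
int example_claim_terminal(int fd)
{
    if (tcsetpgrp(fd, getpgrp()) == -1)
        return errno;

    /* On success the terminal's foreground group is now our own. */
    assert(tcgetpgrp(fd) == getpgrp());
    return 0;
}
```

Calling this on a descriptor that is not a terminal yields ENOTTY rather than EPERM, which is one way to tell a wrong-device problem from a session or process-group problem.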
|
From: Benedict R. G. <ben...@su...> - 2002-07-25 14:36:22
|
Hello!
While debugging a development version of the SHmedia Linux kernel, in
particular I now have a new selective purge/flush cache range
implementation, I find that BusyBox often reports the following message
but everything else seems to work fine:
"sh: tcsetpgrp: Operation not permitted"
Can anyone tell me what might be producing this message and/or what it
means?
Thanks,
Ben.
|
|
From: Massimo A. <Mas...@st...> - 2002-07-25 14:03:20
|
Hi Ben,
I have already ported LMBench to SH5, and I'm trying to tune it for a
correct analysis; I'm running it on a Linux50 kernel 2.4.19 without cache.
I'll keep you informed on what's going on.
Ciao,
Massimo
----- Original Message -----
From: ben...@su...
To: lin...@li...
Sent: Wednesday, July 24, 2002 4:24 PM
Subject: [linuxsh-shmedia-dev] SHmedia Linux operand caches continued
<...>
But how are we to measure performance of the kernel?
My current plan is to port LMBench (http://www.bitmover.com/lmbench/) on
to SHmedia Linux and then use this as a basis for considering different
caching optimisations. The problem I have at the moment is that LMBench
expects a native toolchain and although I have it cross-compiled to run
it still requires native tools. Of course, I'm sure with some work that it
would be possible to modify LMBench to get around this but as we are
currently working on bringing up a native toolchain, very close I
believe (Andy), it seems that my time is better spent doing other things
until it arrives.
As before please try out the new caching implementation and let me know
if there are any problems.
cheers,
ben.
|
|
From: Gaster, B. <ben...@su...> - 2002-07-24 14:25:23
|
Hello!
I have just updated SHmedia Linux on bitkeeper to include a bug fix for
the initial operand cache implementation, checked in last week, and also
the first phase of optimising the cache flushing/purging routines. There
is still work to be done but it does seem to be getting there.
The bug fix ensures that interrupts are disabled while a particular line
is being flushed (this has been optimised a little, so in particular cases
interrupts are in fact turned off for the flushing of two lines).
Previously, an interrupt could be serviced after a line had been flushed
but before it was invalidated; if the line was filled and made dirty in
the meantime, then when control returned to the flushing algorithm the
line was invalidated and the data was lost. Current tests, including
running ANT for hours at a time and some hand coded examples that use
mmap/unmap and spawn a large number of piped processes, run without
problems.
The optimisations are concerned with the transferring of data between
kernel and user processes and inter-process, and are called by the kernel
to allow a port to avoid synonyms introducing coherency problems.
There are now two remaining areas of work relating to the caching
implementation:
1. Optimise the flushing/purging of memory ranges.
2. Optimise the implementation to selectively flush/purge only the lines
   that correspond to the effective address being purged, thus avoiding
   unnecessary flushes of the remaining 3 ways.
I expect to have the 1st task completed by the end of this week while the
2nd job requires a mechanism for measuring kernel performance.
But how are we to measure performance of the kernel?
My current plan is to port LMBench (http://www.bitmover.com/lmbench/) on
to SHmedia Linux and then use this as a basis for considering different
caching optimisations. The problem I have at the moment is that LMBench
expects a native toolchain and although I have it cross-compiled to run it
still requires native tools.
Of course, I'm sure with some work that it would be possible to modify
LMBench to get around this but as we are currently working on bringing up
a native toolchain, very close I believe (Andy), it seems that my time is
better spent doing other things until it arrives.
As before please try out the new caching implementation and let me know if
there are any problems.
cheers,
ben.
|
|
From: Gaster, B. <ben...@su...> - 2002-07-19 15:01:54
|
Hello!
This week I decided to start looking into the implementation of caching on
SHmedia Linux and in particular look at the problem of enabling the
operand cache in write back mode. Since then I have started a complete
rewrite of the caching implementation to firstly enable the operand cache,
in write back mode, and then to optimise range and page flush/purging for
both the I-cache and D-cache.
To try to avoid problems with stability the implementation is planned in
two stages. The first, now complete, is to enable the D-cache, in write
back mode, and to correctly implement flushing/purging of the entire
cache. When ranges are flushed/purged the resulting operation is done on
the whole cache, which is semantically correct but does not provide best
performance. Stage two of the work is to optimise each of the flush/purge
functions for both the I-cache and D-cache for the SH-5 platform. I have
pushed the changes for stage 1 back into Bitkeeper and plan to start work
on optimising cache flushing/purging either this weekend or early next
week.
There are two ways to flush the operand cache on SH-5:
1. OCBP. Find any cache set/way that matches, construct a virtual address
   in the line, and issue an OCBP instruction for that address. The main
   problem with this approach is that the address might not be in the TLB,
   thus causing a page miss.
2. ALLOCO. Find any cache set where at least one way matches the flush
   range, and issue 4 alloco instructions on different addresses that hit
   that set. The main disadvantage of this approach is the eviction of
   blocks outside the flush range that happen to be resident in the same
   cache set, i.e., costs of pointless writebacks and later refills.
   A further disadvantage of this approach is that it is not possible to
   optimise for the case when the cache line is dirty, and so requires
   write back, but should be retained in the cache without requiring
   refill from memory; this is because alloco writes zeros to the
   particular way.
The current implementation uses approach 2 as we want to avoid the case
when a page miss is raised, to bypass the issue of making sure the cache
is coherent. This requires that a 32k region of memory, defined below,
must be allocated in non-paged kernel space that is not used for anything
else. This region must be at least 32 byte aligned to allow index
calculations to be performed using modulo 256 integer arithmetic.
The configuration options for the SHmedia kernel still allow the I-cache
and D-cache to be disabled and this work does not break that option.
Please try out the kernel with both the I-cache and D-caches enabled and
let me know how things go. Do not yet expect massive speed ups; as
described above there are still many optimisations that must be
implemented before the full benefits of the SH-5 caches can be utilized!
Ben
|
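The index arithmetic behind approach 2 can be sketched as follows. The geometry is inferred from the numbers in the post (32-byte lines, "modulo 256" sets, 4 ways, hence 8 KB per way and a 32 KB dummy region); the macro and function names are illustrative assumptions, not the kernel's own, and the sketch assumes the set is selected by the low address bits in the way those numbers suggest.

```c
#include <assert.h>
#include <stdint.h>

#define LINE_SIZE   32u                      /* bytes per cache line */
#define NUM_SETS    256u                     /* "modulo 256" index arithmetic */
#define NUM_WAYS    4u                       /* four alloco instructions per set */
#define WAY_SPAN    (LINE_SIZE * NUM_SETS)   /* 8 KB covered by one way */
#define REGION_SIZE (WAY_SPAN * NUM_WAYS)    /* 32 KB dummy region */

/* Cache set selected by an address. */
uint32_t cache_set_index(uint32_t addr)
{
    return (addr / LINE_SIZE) % NUM_SETS;
}

/*
 * Address inside the dummy region [base, base + 32 KB) that maps to `set`.
 * Each value of `way` (0..3) gives a different address hitting the same set,
 * so four allocations on them displace every way of that set.
 */
uint32_t dummy_addr_for_set(uint32_t base, uint32_t set, uint32_t way)
{
    uint32_t lines_ahead;

    assert(base % LINE_SIZE == 0 && set < NUM_SETS && way < NUM_WAYS);

    /* Number of lines past `base` needed to reach the requested set. */
    lines_ahead = (set + NUM_SETS - cache_set_index(base)) % NUM_SETS;

    return base + way * WAY_SPAN + lines_ahead * LINE_SIZE;
}
```

Because the four addresses differ by exactly 8 KB they share the same index bits and differ only above them, which is why a 32-byte-aligned base is enough for the modulo-256 calculation to work.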
|
From: Stuart M. <stu...@st...> - 2002-06-25 16:06:04
|
Folks
Just to let you know, I've just committed and pushed the changes I've had
sitting around for a while. These are:
- The changes for accessing FP registers via ptrace (these are the same
  as the ones I sent Stephen a while ago).
- The memory map reworking - emulated P1 to P4 are now dead, all devices
  are mapped dynamically
- compressed kernel support
- a stab at getting the memory model right (basically some synco
  instructions in ctrl_out[bwl] and out[bwl]).
All these changes have been committed first into the linux repository, and
then brought across to the linux-2.4 repository.
Any problems, let me know (and fast, before I forget it all!).
Stuart
--
Stuart Menefy stu...@st...
STMicroelectronics Ltd ST Intranet: mo.bri.st.com
Bristol, UK Rest of the World: www.linuxsh.st.com
|
|
From: Gaster, B. <ben...@su...> - 2002-06-25 05:54:43
|
Hello!
Most modern graphics cards provide support for what are often called video
overlay surfaces, which provide functionality for colour conversion, in
particular YUV to RGB, and scaling from one resolution to another. The
Kyro chipset from Imagination Technologies is no exception, providing a
wide selection of features for displaying video.
Recently we have added support for single overlay surfaces and views
(scaling) to the simple Kyro framebuffer supported by the SHmedia Linux
kernel 2.4.19x (http://linux-shmedia.bkbits.net:8080/linux-2.4). As the
implementation requires use of some additional IOCTLS, a small user
library has also been implemented and can be downloaded along with
documentation from the SuperH SHmedia Linux web site
(http://www.superh-software.com/linux/).
Ben.
|
|
From: Gaster, B. <ben...@su...> - 2002-06-19 15:11:55
|
Hi,
I've tracked down the problem to a bug in the Kyro driver itself, not
the overlay device driver as I had thought; with this fix the mapping of
the overlay surface into user space works just fine.
Now that I have a good understanding of the problem, it seems overkill to
provide an overlay driver layered on top of the Kyro framebuffer driver.
Instead, the simplest and most compact approach is to use the fb_mmap
function of the framebuffer driver to map the video memory + offset into
user space.
To provide support for overlay surfaces using the current Kyro
framebuffer driver, I propose defining the following set of IOCTLs:
Create overlay surfaces and viewports:
KYRO_IOCTL_OVERLAY_CREATE
KYRO_IOCTL_OVERLAY_VIEWPORT_SET
Set video mode:
KYRO_IOCTL_SET_VIDEO_MODE
Return overlay information to user space:
KYRO_IOCTL_UVSTRIDE
KYRO_IOCTL_OVERLAY_OFFSET
KYRO_IOCTL_STRIDE
With these implemented it is straightforward to implement a user library
for the creation of overlay surfaces. For example, I have implemented a
library with the following interface:
typedef enum { DISPLAY_DEPTH_16=16, DISPLAY_DEPTH_32=32 } DISPLAY_DEPTH;
int setVideoMode(unsigned long xWidth,
                 unsigned long yHeight, unsigned long scan,
                 DISPLAY_DEPTH depth,
                 int bLinear);
int overlayCreate(unsigned long xWidth,
                  unsigned long yWidth);
int overlayViewportCreate(unsigned long xOrgin,
                          unsigned long yOrgin,
                          unsigned long xSize,
                          unsigned long ySize);
unsigned long getOverlayStride(void);
unsigned long getOverlayUVStride(void);
unsigned char * getOverlayFB(void);
I propose applying the patches for the Kyro framebuffer back into
bitkeeper and then putting the GPL source for the library, along with
some documentation, on to the SuperH SH-5 Linux web site
(http://www.superh-software.com/linux/index.htm) for public access.
Does this sound ok with other people? If so I'll try and do this work
tomorrow.
Ben.
-----Original Message-----
From: Gaster, Benedict
Sent: 19 June 2002 11:25
To: Stuart Menefy
Cc: lin...@li...
Subject: RE: [linuxsh-shmedia-dev] Mapping a PCI address into user space
OK, so now I have a modified overlay_mmap function that is called correctly
and returns a valid address to the user program; however, data written
to this address does not appear in the overlay surface, i.e., on the
screen!
The modified overlay_mmap is as follows, based on fb_mmap (defined in
./drivers/video/fbmem.c), and I was hoping that someone might be able to
spot what is going wrong.
Thanks,
Ben.
static int overlay_mmap(struct file * file, struct vm_area_struct * vma)
{
    unsigned long off, start;
    u32 len;
    off = vma->vm_pgoff << PAGE_SHIFT;
    start = (unsigned long) kyro_dev_physical_overlay_ptr();
    len = PAGE_ALIGN((start & ~PAGE_MASK) +
                     ((ol_create.ulWidth * ol_create.ulHeight) +
                      ((ol_create.ulWidth * ol_create.ulHeight) / 2)));
    printk("\nstart = 0x%x\n", start);
    printk("PAGE_MASK = 0x%x\n", PAGE_MASK);
    start &= PAGE_MASK;
    if ((vma->vm_end - vma->vm_start + off) > len)
    {
        return -EINVAL;
    }
    off += start;
    vma->vm_pgoff = off >> PAGE_SHIFT;
    /* This is an IO map - tell maydump to skip this VMA */
    vma->vm_flags |= VM_IO;
    pgprot_val(vma->vm_page_prot) &= ~_PAGE_CACHABLE;
    if (io_remap_page_range(vma->vm_start, off,
                            vma->vm_end - vma->vm_start,
                            vma->vm_page_prot))
    {
        return -EAGAIN;
    }
    return 0;
}
-----Original Message-----
From: Gaster, Benedict
Sent: 19 June 2002 09:16
To: Stuart Menefy
Cc: lin...@li...
Subject: RE: [linuxsh-shmedia-dev] Mapping a PCI address into user space
Hi Again,
I have worked out that I was not selecting a valid set of flags, which
was causing the call to mmap to fail with -EINVAL, i.e., invalid
argument. The user call now looks like:
pOvl = mmap(0, (OW*OH) + ((OH*OW)/2), PROT_READ | PROT_WRITE,
            MAP_SHARED|MAP_ANONYMOUS, fd, 0);
On completion of the call the pOvl is set to the address:
0x29557000
but the problem is now that the mmap function for the overlay device is
never being called and so although pOvl seems to be a valid address it
is not the one I was expecting? I would have thought that as the file
descriptor for the overlay device is being passed into mmap then
overlay_mmap would be called by mmap and thus allow the correct address
to be mapped for the overlay surface.
Any ideas?
Ben.
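One general point about mmap() is relevant to the symptom described here: when MAP_ANONYMOUS is set, the file descriptor argument is ignored, so the kernel hands back ordinary anonymous memory and never calls the device's own mmap handler. The sketch below shows a file-backed shared mapping with the usual error conventions (open() fails with -1, mmap() fails with MAP_FAILED). It is a minimal user-space illustration assuming a POSIX system; the helper name example_map_shared is made up, and it is not presented as the fix that was eventually applied in this thread.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map `len` bytes of the object at `path` shared and writable.
 * Returns the mapping, or NULL on any failure.
 */
void *example_map_shared(const char *path, size_t len)
{
    int fd;
    void *p;

    assert(path != NULL && len > 0);

    fd = open(path, O_RDWR);
    if (fd == -1)               /* open() reports failure as -1, with errno set */
        return NULL;

    /* MAP_SHARED without MAP_ANONYMOUS, so the mapping is backed by fd. */
    p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                  /* the mapping stays valid after close() */

    return p == MAP_FAILED ? NULL : p;
}
```

With a character device as `path`, a call of this shape is what reaches the driver's mmap entry in its file_operations table; passing 0 as the flags argument fails with EINVAL because neither MAP_SHARED nor MAP_PRIVATE is selected.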
-----Original Message-----
From: Gaster, Benedict
Sent: 18 June 2002 16:44
To: Stuart Menefy
Cc: lin...@li...
Subject: RE: [linuxsh-shmedia-dev] Mapping a PCI address into user space
Hi Stuart,
So I've had a go at implementing a simple overlay character device on
top of the FB driver which is very straightforward but does support an
implementation of mmap. However, there is a problem when I make the mmap
system call to the device.
A small section of the driver code is as follows:
...
static int overlay_mmap(struct file * file, struct vm_area_struct * vma)
{
    unsigned long off, start;
    u32 len;
    off = vma->vm_pgoff << PAGE_SHIFT;
    start = (unsigned long) kyro_dev_physical_overlay_ptr();
    len = PAGE_ALIGN(start & ~PAGE_MASK) +
          ((ol_create.ulWidth * ol_create.ulHeight) +
           ((ol_create.ulWidth * ol_create.ulHeight) / 2));
    start &= PAGE_MASK;
    if ((vma->vm_end - vma->vm_start + off) > len)
    {
        return -EINVAL;
    }
    off += start;
    vma->vm_pgoff = off >> PAGE_SHIFT;
    /*
     * Don't alter the page protection flags; we want to keep the area
     * cached for better performance. This does mean that we may miss
     * some updates to the screen occasionally, but process switches
     * should cause the caches and buffers to be flushed often enough.
     */
    if (io_remap_page_range(vma->vm_start, off,
                            vma->vm_end - vma->vm_start,
                            vma->vm_page_prot))
    {
        return -EAGAIN;
    }
    return 0;
}
struct file_operations overlay_fops =
{
    owner: THIS_MODULE,
    open: overlay_open,
    read: overlay_read,
    write: overlay_write,
    ioctl: overlay_ioctl,
    release: overlay_release,
    mmap: overlay_mmap,
};
and then the user code looks like:
if ((fd = open("/dev/overlay", O_RDWR)) == ENOENT)
{
    err_message("failed to open /dev/fb0\n");
}
if (ioctl(fd, OVERLAY_IOCTL_STRIDE, &ovlStride) == -EINVAL)
{
    err_message("failed to get stride\n");
}
pOvl = mmap(0, (OW*OH) + ((OH*OW)/2), PROT_WRITE, 0, fd, 0);
where the device is called "overlay", the ioctl call is just requesting
the stride value, and the final line requests that the current overlay
surface be memory mapped. If we run the user code, strace outputs the
following:
/home/benedict # /myprogs/bin/strace ./yuv
execve("./yuv", ["./yuv"], [/* 5 vars */]) = 0
fcntl(0, F_GETFD) = 0
fcntl(1, F_GETFD) = 0
fcntl(2, F_GETFD) = 0
uname({sys="Linux", node="sunday-pci", ...}) = 0
semop(4784052, 0x2d, 4194303) = 0
SYS_199(0, 0x3fffff, 0x20413, 0x2d, 0x7bfffd99) = 0
semget(IPC_PRIVATE, 4194303, IPC_EXCL|0x20013|022) = 0
getgid() = 0
brk(0) = 0x490450
brk(0x490470) = 0x490470
brk(0x491000) = 0x491000
brk(0x492000) = 0x492000
getpid() = 13
open("/dev/overlay", O_RDWR) = 3
ioctl(3, 0x6f00, 0x7bfffde4) = 0
old_mmap(NULL, 152064, PROT_WRITE, MAP_FILE, 3, 0) = -1 EINVAL (Invalid argument)
fstat64(1, {st_mode=S_IFCHR|0664, st_rdev=makedev(5, 1), ...}) = 0
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x29556000
ioctl(1, TCGETS, {B9600 opost isig icanon echo ...}) = 0
write(1, "pOvl = 0xffffffff\n", 18pOvl = 0xffffffff
) = 18
open("spider0.Y", O_RDONLY) = 4
write(1, "Filling Y-data\n", 15Filling Y-data
) = 15
fstat64(4, {st_mode=S_IFREG|0444, st_size=101376, ...}) = 0
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x29557000
read(4, "\20\20\20\20\20\20\20\20\20\20\20\20\20\20\20\20\20\20"..., 4096) = 4096
--- SIGSEGV (Segmentation fault) ---
+++ killed by SIGSEGV +++
It seems to imply that the call to mmap is failing due to an invalid
argument; can you spot what might be wrong?
Ben.
-----Original Message-----
From: Stuart Menefy [mailto:stu...@st...]
Sent: 18 June 2002 14:05
To: Gaster, Benedict
Cc: lin...@li...
Subject: Re: [linuxsh-shmedia-dev] Mapping a PCI address into user space
Hi Ben
On Tue, 18 Jun 2002 11:05:54 +0100
ben...@su... wrote:
> Hello!
>
> I have modified the bitkeeper Kyro framebuffer driver to provide
> additional ioctl calls for the creation of overlay surfaces (to handle
> YUV to RGB colour conversion). Currently two calls are required for
> creation of an overlay surface and a creation of viewport (a scaling
> window mapping the overlay surface on to the RGB output) which returns a
> pointer to the viewport surface. Currently the pointer returned can only
> be written to in Kernel space, which I assume is because the pointer into
> the PCI address space is mapped only in kernel space and not in user
> space.
You don't say how you're doing it, but this sounds likely.
> I presume that it is possible to map the pages to be written by a
> particular user process but I'm not sure how to do this and whether it is
> the best approach.
It's pretty simple. Just have a look at how the existing frame buffer code
does it (fb_mmap).
> I suppose it would be possible to extend the framebuffer driver to
> provide an additional driver interface for overlay surfaces and thus use
> mmap to create a memory mapped region but this seems like overkill for
> the functionality that I'm after.
It depends what functionality you're after!
Assuming you simply want to display data from a user application
which generates YUV data, then mapping the overlay into user space is
almost certainly the easiest way to do it.
Whether you do this as a new device or an extension of the existing one
is pretty much up to you. Mapping the fb device is probably the easiest
(simply use the offset parameter to mmap to decide what to map), but is
somewhat non-standard.
A better way might be to implement a subset of Video for Linux (V4L),
which already has an API which supports overlays, allowing you to
position them, change the format, etc.
Stuart
--
Stuart Menefy stu...@st...
STMicroelectronics Ltd ST Intranet: mo.bri.st.com
Bristol, UK Rest of the World: www.linuxsh.st.com
------------------------------------------------------------------------
----
Bringing you mounds of caffeinated joy
>>> http://thinkgeek.com/sf <<<
_______________________________________________
Linuxsh-shmedia-dev mailing list
Lin...@li...
https://lists.sourceforge.net/lists/listinfo/linuxsh-shmedia-dev
|
|
From: Gaster, B. <ben...@su...> - 2002-06-19 10:26:04
|
OK, so now I have a modified overlay_mmap function that is called correctly
and returns a valid address to the user program; however, data written
to this address does not appear in the overlay surface, i.e., on the
screen!
The modified overlay_mmap is as follows, based on fb_mmap (defined in
./drivers/video/fbmem.c), and I was hoping that someone might be able to
spot what is going wrong.
Thanks,
Ben.
static int overlay_mmap(struct file * file, struct vm_area_struct * vma)
{
unsigned long off, start;
u32 len;
off =3D vma->vm_pgoff << PAGE_SHIFT;
start =3D (unsigned long) kyro_dev_physical_overlay_ptr();
len =3D PAGE_ALIGN((start & ~PAGE_MASK) +=20
((ol_create.ulWidth * ol_create.ulHeight) +
((ol_create.ulWidth * ol_create.ulHeight) / 2)));
printk("\nstart =3D 0x%x\n", start);
printk("PAGE_MASK =3D 0x%x\n", PAGE_MASK);
start &=3D PAGE_MASK;
if ((vma->vm_end - vma->vm_start + off) > len)
{
return -EINVAL;
}
=20
off +=3D start;
vma->vm_pgoff =3D off >> PAGE_SHIFT;
/* This is an IO map - tell maydump to skip this VMA */
vma->vm_flags |=3D VM_IO;
pgprot_val(vma->vm_page_prot) &=3D ~_PAGE_CACHABLE;
if (io_remap_page_range(vma->vm_start, off,
vma->vm_end - vma->vm_start,
vma->vm_page_prot))
{
return -EAGAIN;
} =20
=20
return 0;
}
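The length arithmetic in overlay_mmap can be checked in user space. The sketch below is not driver code: the 4096-byte page size and the 352x288 surface are assumptions (the latter chosen because it reproduces the 152064-byte mmap request and the 101376-byte Y file seen in the strace output elsewhere in this thread), and all function names are hypothetical. It contrasts aligning the whole span, as above, with aligning only the in-page offset, as an earlier version of the function in this thread did.

```c
#include <assert.h>

/* Assumed page size; a real driver uses the kernel's PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096UL
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))
#define SKETCH_PAGE_ALIGN(x) (((x) + SKETCH_PAGE_SIZE - 1) & SKETCH_PAGE_MASK)

/* Bytes in the overlay surface: width * height plus half as much again. */
unsigned long yuv_surface_bytes(unsigned long width, unsigned long height)
{
    assert(width != 0 && height != 0);
    return width * height + (width * height) / 2;
}

/* Length with the whole span rounded up to a page multiple. */
unsigned long map_len_whole(unsigned long start, unsigned long bytes)
{
    return SKETCH_PAGE_ALIGN((start & ~SKETCH_PAGE_MASK) + bytes);
}

/* Length with only the in-page offset rounded up; the result is
 * generally not a page multiple. */
unsigned long map_len_offset_only(unsigned long start, unsigned long bytes)
{
    return SKETCH_PAGE_ALIGN(start & ~SKETCH_PAGE_MASK) + bytes;
}

/* The driver's bounds check. The kernel rounds the requested length up
 * to whole pages before the driver sees the VMA, so the same rounding
 * is applied to the request here. Returns 1 if the request fits. */
int request_fits(unsigned long requested, unsigned long off, unsigned long len)
{
    return (SKETCH_PAGE_ALIGN(requested) + off) <= len;
}
```

With a page-aligned start, a 152064-byte request becomes a 155648-byte VMA (38 pages). That fits the whole-span length, but would be rejected against the unrounded 152064.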
|
|
From: Gaster, B. <ben...@su...> - 2002-06-19 08:17:39
|
Hi Again,
I have worked out that I was not selecting a valid set of flags, which
was causing the call to mmap to fail with -EINVAL, i.e., invalid
argument. The user call now looks like:
    pOvl = mmap(0, (OW*OH) + ((OH*OW)/2), PROT_READ | PROT_WRITE,
                MAP_SHARED|MAP_ANONYMOUS, fd, 0);
On completion of the call, pOvl is set to the address:
    0x29557000
but the problem now is that the mmap function for the overlay device is
never called, so although pOvl seems to be a valid address, it is not
the one I was expecting. I would have thought that, as the file
descriptor for the overlay device is being passed into mmap,
overlay_mmap would be called by mmap and thus allow the correct address
to be mapped for the overlay surface.
Any ideas?
Ben.
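One likely explanation, based on documented mmap behaviour rather than on anything stated in this thread: with MAP_ANONYMOUS the mapping is not backed by a file and the fd argument is ignored on Linux, so a driver's mmap handler is never reached. The sketch below shows the difference using an ordinary temporary file in place of /dev/overlay; the helper names are hypothetical.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Create an unlinked temporary file of len bytes, each set to fill.
 * Returns the fd, or -1 on failure. */
int make_filled_tempfile(size_t len, unsigned char fill)
{
    char path[] = "/tmp/mmap_sketch_XXXXXX";
    unsigned char *buf;
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    unlink(path);               /* file disappears once fd is closed */
    buf = malloc(len);
    if (buf == NULL) {
        close(fd);
        return -1;
    }
    memset(buf, fill, len);
    if (write(fd, buf, len) != (ssize_t)len) {
        free(buf);
        close(fd);
        return -1;
    }
    free(buf);
    return fd;
}

/* Map len bytes with MAP_SHARED | extra_flags and return the first byte
 * visible through the mapping, or -1 if mmap fails. */
int first_byte_via_mmap(int fd, size_t len, int extra_flags)
{
    unsigned char *p;
    int value;

    assert(len > 0);
    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_SHARED | extra_flags, fd, 0);
    if (p == MAP_FAILED)
        return -1;
    value = p[0];
    munmap(p, len);
    return value;
}
```

Mapping the file with plain MAP_SHARED shows the file's contents. Adding MAP_ANONYMOUS yields zero-filled memory unrelated to the fd, which would match a valid-looking address that is not the overlay.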
|
|
From: Stuart M. <stu...@st...> - 2002-06-18 19:06:34
|
Folks
As you may have spotted from a previous posting, I've been playing with
compressed kernels, with the aim of improving the download speed. After
a number of unexpected problems, it is now working fine, so I thought I
might post some figures.
First some background. The Linux compressed kernels work by taking the
normal ELF kernel image, removing some sections (primarily the empty zero
page, and those to do with debugging), and then compressing the resulting
file using gzip. This is then built as binary data into a new program,
which when booted decompresses the kernel image and jumps into it.
So after benchmarking this, we get the following results (figures are
approx. - stopwatch and visual inspection - so don't extrapolate too
far!):
gdb from BSF 20020329
---------------------
Compressed kernel (size 632K, 616K data + 16K loader)
    download     90 seconds = 7K/sec
    decompress   15 seconds
    total       105 seconds = 16K/sec
Uncompressed kernel (size 1702K)
    download    103 seconds = 16K/sec
gdb from Madrid 0.6
-------------------
Compressed kernel (size 664K, 648K data + 16K loader)
    download     17 seconds = 39K/sec
    decompress    2 seconds
    total        19 seconds = 94K/sec
Uncompressed kernel (size 1789K)
    download     44 seconds = 40K/sec
Note that this is not a fair comparison, because as well as changing the
version of gdb being used, I also changed the decompression code to
enable the cache, hence the big fall in decompression time.
This testing did turn up one anomaly. The download speed is very
dependent on the alignment of the data being downloaded. If it is 8 byte
aligned, speeds are in the order of 40K/sec, while 4 byte aligned data
falls to under 9K/sec. This is almost certainly a result of how the debug
hardware is being driven, so it needs looking at.
I'll check this in later this week.
Stuart
-- 
Stuart Menefy
stu...@st...
STMicroelectronics Ltd ST Intranet: mo.bri.st.com
Bristol, UK Rest of the World: www.linuxsh.st.com
|
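The K/sec figures above can be reproduced with a little arithmetic. The post does not say how the "total" rate was derived; the sketch below assumes it is the uncompressed kernel size divided by download plus decompression time, which is consistent with the quoted numbers. The function names are hypothetical.

```c
#include <assert.h>

/* Raw link rate in K/sec for the bytes actually transferred. */
double link_rate(double transferred_kb, double download_s)
{
    assert(download_s > 0.0);
    return transferred_kb / download_s;
}

/* Effective rate in K/sec: how quickly the uncompressed image becomes
 * available, counting both download and decompression time. */
double effective_rate(double uncompressed_kb, double download_s,
                      double decompress_s)
{
    assert(download_s + decompress_s > 0.0);
    return uncompressed_kb / (download_s + decompress_s);
}
```

For the second set of figures, link_rate(664, 17) is about 39 and effective_rate(1789, 17, 2) is about 94, against roughly 40 for the uncompressed download.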
|
From: Gaster, B. <ben...@su...> - 2002-06-18 15:45:31
|
Hi Stuart,
So I've had a go at implementing a simple overlay character device on
top of the FB driver which is very straightforward but does support an
implementation of mmap. However, there is a problem when I make the mmap
system call to the device.
A small section of the driver code is as follows:
...
static int overlay_mmap(struct file * file, struct vm_area_struct * vma)
{
    unsigned long off, start;
    u32 len;
    off = vma->vm_pgoff << PAGE_SHIFT;
    start = (unsigned long) kyro_dev_physical_overlay_ptr();
    len = PAGE_ALIGN(start & ~PAGE_MASK) +
          ((ol_create.ulWidth * ol_create.ulHeight) +
           ((ol_create.ulWidth * ol_create.ulHeight) / 2));
    start &= PAGE_MASK;
    if ((vma->vm_end - vma->vm_start + off) > len)
    {
        return -EINVAL;
    }
    off += start;
    vma->vm_pgoff = off >> PAGE_SHIFT;
    /*
     * Don't alter the page protection flags; we want to keep the area
     * cached for better performance. This does mean that we may miss
     * some updates to the screen occasionally, but process switches
     * should cause the caches and buffers to be flushed often enough.
     */
    if (io_remap_page_range(vma->vm_start, off,
                            vma->vm_end - vma->vm_start,
                            vma->vm_page_prot))
    {
        return -EAGAIN;
    }
    return 0;
}
struct file_operations overlay_fops =
{
    owner:   THIS_MODULE,
    open:    overlay_open,
    read:    overlay_read,
    write:   overlay_write,
    ioctl:   overlay_ioctl,
    release: overlay_release,
    mmap:    overlay_mmap,
};
and then the user code looks like:
    if ((fd = open("/dev/overlay", O_RDWR)) == ENOENT)
    {
        err_message("failed to open /dev/fb0\n");
    }
    if (ioctl(fd, OVERLAY_IOCTL_STRIDE, &ovlStride) == -EINVAL)
    {
        err_message("failed to get stride\n");
    }
    pOvl = mmap(0, (OW*OH) + ((OH*OW)/2), PROT_WRITE, 0, fd, 0);
where the device is called "overlay", the ioctl call just requests the
stride value, and the final line requests that the current overlay
surface be memory mapped. If we run the user code, strace outputs the
following:
/home/benedict # /myprogs/bin/strace ./yuv
execve("./yuv", ["./yuv"], [/* 5 vars */]) = 0
fcntl(0, F_GETFD) = 0
fcntl(1, F_GETFD) = 0
fcntl(2, F_GETFD) = 0
uname({sys="Linux", node="sunday-pci", ...}) = 0
semop(4784052, 0x2d, 4194303) = 0
SYS_199(0, 0x3fffff, 0x20413, 0x2d, 0x7bfffd99) = 0
semget(IPC_PRIVATE, 4194303, IPC_EXCL|0x20013|022) = 0
getgid() = 0
brk(0) = 0x490450
brk(0x490470) = 0x490470
brk(0x491000) = 0x491000
brk(0x492000) = 0x492000
getpid() = 13
open("/dev/overlay", O_RDWR) = 3
ioctl(3, 0x6f00, 0x7bfffde4) = 0
old_mmap(NULL, 152064, PROT_WRITE, MAP_FILE, 3, 0) = -1 EINVAL (Invalid argument)
fstat64(1, {st_mode=S_IFCHR|0664, st_rdev=makedev(5, 1), ...}) = 0
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x29556000
ioctl(1, TCGETS, {B9600 opost isig icanon echo ...}) = 0
write(1, "pOvl = 0xffffffff\n", 18pOvl = 0xffffffff
) = 18
open("spider0.Y", O_RDONLY) = 4
write(1, "Filling Y-data\n", 15Filling Y-data
) = 15
fstat64(4, {st_mode=S_IFREG|0444, st_size=101376, ...}) = 0
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x29557000
read(4, "\20\20\20\20\20\20\20\20\20\20\20\20\20\20\20\20\20\20"..., 4096) = 4096
--- SIGSEGV (Segmentation fault) ---
+++ killed by SIGSEGV +++
It seems to imply that the call to mmap is failing due to an invalid
argument; can you spot what might be wrong?
Ben.
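A side note on the user code above, drawn from the standard C library conventions rather than from the thread: open() and ioctl() report failure by returning -1 and setting errno, and mmap() returns MAP_FAILED, so comparing the return values with ENOENT or -EINVAL never detects an error. That would be consistent with the program printing pOvl = 0xffffffff and carrying on until the SIGSEGV. Separately, Linux requires exactly one of MAP_SHARED or MAP_PRIVATE in the flags, so a flags value of 0 is rejected with EINVAL. A minimal sketch of the checks, with hypothetical helper names:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/* Open a device node read/write. Returns the fd, or -1 with errno set. */
int open_device(const char *path)
{
    int fd;

    assert(path != NULL);
    fd = open(path, O_RDWR);
    if (fd < 0) {
        int saved = errno;      /* fprintf may overwrite errno */
        fprintf(stderr, "failed to open %s: %s\n", path, strerror(saved));
        errno = saved;
    }
    return fd;
}

/* Map len bytes of fd, shared and writable. Returns NULL on failure so
 * the caller cannot mistake MAP_FAILED for a usable pointer. */
void *map_for_write(int fd, size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    if (p == MAP_FAILED) {
        fprintf(stderr, "mmap failed: %s\n", strerror(errno));
        return NULL;
    }
    return p;
}
```

An ioctl() call would be checked the same way: test for a return value of -1, then inspect errno.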
|
|
From: Stuart M. <stu...@st...> - 2002-06-18 13:15:15
|
Hi Ben
On Tue, 18 Jun 2002 11:05:54 +0100
ben...@su... wrote:
> Hello!
>
> I have modified the bitkeeper Kyro framebuffer driver to provide
> additional ioctl calls for the creation of overlay surfaces (to handle
> YUV to RGB colour conversion). Currently two calls are required, for
> creation of an overlay surface and creation of a viewport (a scaling
> window mapping the overlay surface on to the RGB output) which returns a
> pointer to the viewport surface. Currently the pointer returned can only
> be written to in kernel space, which I assume is because the pointer into
> the PCI address space is mapped only in kernel space and not in user
> space.
You don't say how you're doing it, but this sounds likely.
> I presume that it is possible to map the pages to be written by a
> particular user process but I am not sure how to do this and whether it
> is the best approach.
It's pretty simple. Just have a look at how the existing frame buffer
code does it (fb_mmap).
> I suppose it would be possible to extend the framebuffer driver to
> provide an additional driver interface for overlay surfaces and thus use
> mmap to create a memory mapped region but this seems like overkill for
> the functionality that I'm after.
It depends what functionality you're after!
Assuming you simply want to display data from a user application
which generates YUV data, then mapping the overlay into user space is
almost certainly the easiest way to do it.
Whether you do this as a new device or an extension of the existing one
is pretty much up to you. Mapping the fb device is probably the easiest
(simply use the offset parameter to mmap to decide what to map), but is
somewhat non-standard.
A better way might be to implement a subset of Video for Linux (V4L),
which already has an API which supports overlays, allowing you to
position them, change the format, etc.
Stuart
-- 
Stuart Menefy
stu...@st...
STMicroelectronics Ltd ST Intranet: mo.bri.st.com
Bristol, UK Rest of the World: www.linuxsh.st.com
|
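The suggestion to use the mmap offset to decide what to map can be sketched from the user side as follows. The layout is purely an assumption for illustration (region 0 as the main framebuffer, region 1 as the overlay, each region_len bytes long), and map_region is a hypothetical helper, not an existing API. A real driver would define its own offsets.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Map one fixed-size region of fd, selected by index through the mmap
 * offset argument. region_len must be a multiple of the page size,
 * because mmap offsets have to be page aligned. Returns NULL on failure. */
void *map_region(int fd, unsigned int region, size_t region_len)
{
    long page = sysconf(_SC_PAGESIZE);
    off_t offset = (off_t)region * (off_t)region_len;
    void *p;

    assert(page > 0 && region_len % (size_t)page == 0);
    p = mmap(NULL, region_len, PROT_READ | PROT_WRITE, MAP_SHARED,
             fd, offset);
    if (p == MAP_FAILED) {
        perror("mmap");
        return NULL;
    }
    return p;
}
```

On the driver side, the chosen offset arrives as vma->vm_pgoff (in pages), which is the value the overlay_mmap code in this thread reads into off.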
|
From: Gaster, B. <ben...@su...> - 2002-06-18 10:07:06
|
Hello!
I have modified the bitkeeper Kyro framebuffer driver to provide
additional ioctl calls for the creation of overlay surfaces (to handle
YUV to RGB colour conversion). Currently two calls are required, for
creation of an overlay surface and creation of a viewport (a scaling
window mapping the overlay surface on to the RGB output) which returns a
pointer to the viewport surface. Currently the pointer returned can only
be written to in kernel space, which I assume is because the pointer into
the PCI address space is mapped only in kernel space and not in user
space. I presume that it is possible to map the pages to be written by a
particular user process but I am not sure how to do this and whether it
is the best approach.
I suppose it would be possible to extend the framebuffer driver to
provide an additional driver interface for overlay surfaces and thus use
mmap to create a memory mapped region but this seems like overkill for
the functionality that I'm after.
What do other people think?
ben.
|
|
From: Stuart M. <stu...@st...> - 2002-06-10 16:32:46
|
Hi Ben
Highmem is used on systems which have more physical memory than can be
mapped into kernel space, usually more than 2G or 3G. In this case the
bottom part of physical memory is permanently mapped into kernel space
(lowmem) and the rest is mapped as required (highmem) - usually when
copying data to or from kernel space. To make this work requires code to
be written in the machine specific parts of the kernel, something nobody
has bothered to do for the shmedia kernel.
For every page in the system, there is a struct page. This is around 48
bytes, so this amounts to quite a high percentage of your memory simply
being used to keep track of memory, and it is worthwhile trying to make
this struct as small as possible. For lowmem it is a simple calculation
to convert the address of a struct page into the virtual address for
that page.
It looks like sometime between 2.4.17 and 2.4.19, a change was made to
only use the 'virtual' member of struct page if required - usually only
if highmem is in use.
The best way to fix this would be to change your code to use the
page_address() macro to get at this data. If the virtual field is in
use, it will use it, otherwise it will do the calculation.
Hope this helps
Stuart
On Mon, 10 Jun 2002 15:34:15 +0100
ben...@su... wrote:
> Hello!
>
> It seems that for some ports of the Linux kernel RAM is mapped
> completely into the kernel's address space while on machines with
> "highmem" some memory is mapped into the kernel virtual memory
> dynamically. The use of "highmem" seems to be controlled through the
> configuration define:
>
> CONFIG_HIGHMEM
>
> and defining this has the important consequence of adding the following
> field to the definition of "struct page" in mm.h:
>
> void *virtual; /* Kernel virtual address (NULL if not kmapped, ie.
> highmem) */
>
> If we look in ./arch/shmedia/config.in we see that CONFIG_HIGHMEM is
> not defined and thus the virtual field is never defined. This seems to
> imply that no memory can be mapped into the kernel dynamically! Can
> anyone explain this in more detail, as the problem is I'm porting some
> code to SHMedia Linux that makes reference to this field.
>
> thanks,
>
> ben.
>
-- 
Stuart Menefy
stu...@st...
STMicroelectronics Ltd ST Intranet: mo.bri.st.com
Bristol, UK Rest of the World: www.linuxsh.st.com
|
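The "simple calculation" for lowmem described above can be modelled in user space. This is only an illustrative model, not the kernel's definition of page_address(): the struct, the constants (a 4K page and a lowmem mapping starting at 0x80000000) and the function name are all hypothetical stand-ins.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SHIFT  12
#define MODEL_PAGE_OFFSET 0x80000000UL  /* assumed base of the lowmem mapping */

struct model_page {
    unsigned long flags;
#ifdef MODEL_HIGHMEM
    void *virtual;  /* only present when highmem support is configured */
#endif
};

/* Model of page_address(): use the stored field when it exists,
 * otherwise derive the address from the page's index in the page array. */
unsigned long model_page_address(const struct model_page *map,
                                 const struct model_page *page)
{
#ifdef MODEL_HIGHMEM
    (void)map;
    return (unsigned long)page->virtual;
#else
    ptrdiff_t index = page - map;

    assert(index >= 0);
    return MODEL_PAGE_OFFSET + ((unsigned long)index << MODEL_PAGE_SHIFT);
#endif
}
```

Code that goes through an accessor like this compiles either way, whereas code that reads the virtual member directly breaks as soon as the field is configured out, which is the porting problem described in the quoted message.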
|
From: Gaster, B. <ben...@su...> - 2002-06-10 14:35:28
|
Hello!
It seems that for some ports of the Linux kernel RAM is mapped
completely into the kernel's address space while on machines with
"highmem" some memory is mapped into the kernel virtual memory
dynamically. The use of "highmem" seems to be controlled through the
configuration define:
CONFIG_HIGHMEM
and defining this has the important consequence of adding the following
field to the definition of "struct page" in mm.h:
void *virtual; /* Kernel virtual address (NULL if not kmapped, ie. highmem) */
If we look in ./arch/shmedia/config.in we see that CONFIG_HIGHMEM is
not defined and thus the virtual field is never defined. This seems to
imply that no memory can be mapped into the kernel dynamically! Can
anyone explain this in more detail, as the problem is I'm porting some
code to SHMedia Linux that makes reference to this field.
thanks,
ben.
|