Documentation of Parallel and Serial port configuration?

Fri Oct 7 11:40:08 CDT 2005

Hi Kuba,

On Thursday 06 Oct 2005 23:23, Kuba Ober wrote:
> > we can probably do better than inb() / outb().
>
> You can't do any better than that [It] is the only one that makes sense
> (when you run things on ia32).

... and when you're not on an ia32 platform with a superIO chip?

> > Advantages of using ppdev over simple inb() / outb() are:
> >
> >   should support [*] cross-architecture (arm, alpha, powerpc, ...)
> That'd be good for winelib only or wine-with-emulator (bochs? qemu?).

Yup, both.  A ported application (via winelib or qemu) should work under any 
Linux architecture.  Unfortunately, it would be a Linux-specific solution; 
the *BSDs have their own interface.

> >   should support [*] some esoteric devices (USB-parallel converters, ...)
> At a huge performance penalty ;)

But it would work, that's my point.  The performance of parallel-over-USB is 
a separate issue.

Legacy devices (such as parallel ports) are gradually being phased out.  So 
writing code that requires a SuperIO chip is not the best approach.
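
For reference, a minimal sketch of what going through ppdev might look like 
from user-land.  The ioctls (PPCLAIM, PPWDATA, PPRELEASE) come from 
<linux/ppdev.h>; the function name pp_write_byte and the device path in the 
usage note are assumptions made for illustration, not anything Wine 
currently does.

```c
#include <fcntl.h>
#include <linux/ppdev.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical helper: write one byte to the data lines of a parallel
 * port through the ppdev character device instead of outb().
 * Returns 0 on success, -1 if any step fails. */
int pp_write_byte(const char *device, unsigned char value)
{
    int fd = open(device, O_RDWR);
    if (fd < 0)
        return -1;                      /* no such port, or no permission */

    if (ioctl(fd, PPCLAIM) < 0) {       /* claim the port for this process */
        close(fd);
        return -1;
    }

    int rc = ioctl(fd, PPWDATA, &value);    /* drive the data lines */

    ioctl(fd, PPRELEASE);               /* hand the port back */
    close(fd);
    return rc < 0 ? -1 : 0;
}
```

A caller would use something like pp_write_byte("/dev/parport0", 0x55), 
assuming that device node exists and is accessible.  A real implementation 
would presumably keep the descriptor open and the port claimed between 
writes rather than paying for open() and close() each time.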

> > The overhead in doing a syscall isn't significant as any outb() operation
> > takes ~1us anyway
>
> AFAIK, the overhead stems from the fact that instead of a machine
> instruction you have to:
> - process an exception in the kernel, which then signals SIGSEGV to the
> process
> - invoke the signal handler
> - determine what's up and disassemble the instruction at CS:EIP
> - invoke a function/syscall based on the disassembled instruction
>
> If this isn't dog slow, I don't know what is. I wasn't entirely clear, the
> syscall is the least of our worries in fact :)

I think you may be confusing this with some other activity (maybe an invalid 
memory access?).  A syscall is pretty simple.  The application does some 
bookkeeping and issues int(errupt) 0x80, triggering the switch from user-land 
to kernel-land.  The kernel then picks up the request and carries on.  It's 
described here[1], although the details may have changed slightly with more 
recent kernels.  There's no signalling (in the Unix user-land sense) going 
on.

[1] http://www.tldp.org/LDP/khg/HyperNews/get/syscall/syscall86.html
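
To illustrate how little is involved on the user-land side, here is a small 
sketch that enters the kernel through glibc's generic syscall() wrapper.  
The wrapper hides the actual entry mechanism (int 0x80 or a newer 
instruction, depending on architecture and kernel); the helper name 
raw_getpid is made up for the example.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: ask the kernel for our process ID by system call
 * number.  The request goes straight into the kernel and the result comes
 * straight back; no signal handler is involved on this path. */
long raw_getpid(void)
{
    return syscall(SYS_getpid);
}
```

The value returned should match what the ordinary getpid() library call 
reports for the same process.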

Overhead is "currently" (measured for 2.4.0) at slightly under 0.4us (see 
[2]).  For 2.6-series kernels it may have gone down slightly further, but 
0.4us would seem a reasonable upper bound.  Assuming the kernel driver is 
reasonably written, I'd make a complete guess that the total overhead is 
between 0.4 and 0.6us (although I should benchmark the number :^).

[2] http://cs.nmu.edu/~benchmark/index.php?page=null_call
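
A rough way to check the null-call figure on a given machine is sketched 
below.  It times a trivial system call (getppid) in a loop, so it measures 
only the user/kernel round trip plus loop overhead, not anything a 
parallel-port driver would add.  The function name null_syscall_ns and the 
choice of getppid are assumptions for the example.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: average wall-clock cost, in nanoseconds, of one
 * trivial system call, including the cost of the loop itself. */
double null_syscall_ns(long iterations)
{
    struct timespec start, end;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (long i = 0; i < iterations; i++)
        syscall(SYS_getppid);           /* does almost no work in the kernel */
    clock_gettime(CLOCK_MONOTONIC, &end);

    double elapsed_ns = (end.tv_sec - start.tv_sec) * 1e9
                      + (end.tv_nsec - start.tv_nsec);
    return elapsed_ns / iterations;
}
```

Calling null_syscall_ns(1000000) and printing the result gives a number to 
compare against the 0.4us figure; results will vary with CPU, kernel 
version and system load.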

> > I suspect most programs designed to work under Win98 just hit the
> > hardware, so obtaining permissions (doing ioperm() as root, for example)
> > should work. If we have some mechanism for catching the program doing
> > either inb() or outb(), then we could provide a better implement via the
> > ppdev interface.
>
> At the cost of slowing things down. For devices that bit bang data (like
> programmers), this makes things unacceptably slow.

I can't say I share that experience (about being unacceptably slow, that is).  
A 40-60% increase in overhead for a single instruction would definitely be 
noticeable, but only if this is the bottleneck in the program.  Other 
activity takes longer (cf. context-switching in [2], for example).  Even 
just calling a function takes on the order of 100ns (on my ~700MHz laptop).  
The time between successive changes of parallel port state might be (much) 
larger than the 400-600ns overhead in using kernel routines, so the overhead 
becomes less significant.  Of course, this would be application specific.
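
The arithmetic behind that argument can be sketched as follows.  The model 
is deliberately crude: each cycle consists of one port access plus whatever 
time the application spends before the next access, and the kernel route 
adds a fixed cost on top.  The function name and parameters are 
illustrative only.

```c
/* Hypothetical helper: relative increase in time per cycle when every
 * port access gains `overhead_us` of extra cost.
 *   access_us   - time for the bare port access (~1us for outb())
 *   gap_us      - time the program spends between successive accesses
 *   overhead_us - added cost of going through the kernel */
double relative_slowdown(double access_us, double gap_us, double overhead_us)
{
    double cycle_before = access_us + gap_us;
    return overhead_us / cycle_before;
}
```

With no gap at all, relative_slowdown(1.0, 0.0, 0.4) gives 0.4, the 40% 
figure.  With 9us of other work between accesses, 
relative_slowdown(1.0, 9.0, 0.5) drops to 0.05, i.e. 5%.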

The worst case would be something driving the parallel port as a square-wave 
generator: you'd get the full 40-60% drop in performance (assuming all the 
above numbers).  Perhaps slightly more realistically, the PLIP interface is 
reckoned[3] to have a 1.2Mbit/s bandwidth, corresponding to a ~3.33us 
turn-around time.  Adding a 0.4-0.6us overhead would reduce the bandwidth to 
between 1.1Mbit/s and 1.0Mbit/s (roughly an 11-15% performance drop).  Would 
this matter?  No, because if it did you'd go out and buy 100baseT cards and 
achieve far greater performance (or Myrinet, or ...).

[3] http://yara.ecn.purdue.edu/~pplinux/ppcluster.html
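
The PLIP estimate can be reproduced with a few lines of arithmetic.  The 
sketch assumes 4 bits move per turn-around, which is what makes 3.33us 
correspond to 1.2Mbit/s; that assumption, and the function names, are mine 
rather than anything stated in [3].

```c
/* Hypothetical helper: PLIP bandwidth in Mbit/s for a given turn-around
 * time, with an optional per-transfer overhead added (both in us). */
double plip_bandwidth_mbit(double turnaround_us, double overhead_us)
{
    const double bits_per_transfer = 4.0;   /* assumed: one nibble per turn-around */
    return bits_per_transfer / (turnaround_us + overhead_us);
}

/* Hypothetical helper: fractional loss of bandwidth caused by the overhead. */
double plip_bandwidth_drop(double turnaround_us, double overhead_us)
{
    return overhead_us / (turnaround_us + overhead_us);
}
```

With a 3.33us turn-around, overheads of 0.4us and 0.6us give roughly 1.07 
and 1.02Mbit/s, a drop of about 11% and 15% respectively.  Rounding the 
bandwidths to one decimal place before taking the ratio makes the range 
look wider than it is.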

For the particular use-case you have in mind, my understanding is that 
programmers often require some additional delay mechanism to allow the EPROM 
to keep up (certainly for writes, probably for reads too).  This would reduce 
the impact of the performance hit, perhaps to an acceptable (or even 
imperceptible) level.

Does all this matter?  Probably not.  I would bet you this smartee here that 
if a program is worrying about the ns response of some function, then that 
function is good enough, and that some better "higher level" algorithmic 
optimisation would have a much larger benefit (e.g. ethernet vs PLIP).

Cheers,

Paul.

(apologies for the overly long email!)