[okl4-developer] pagefault
Frank Kaiser
frank.kaiser at opensynergy.com
Thu Jul 24 01:22:43 EST 2008
Hi, Geoffrey
We have identified a reentrancy problem in our interrupt handler code,
and it has something to do with the CONTINUATION scheme. The easiest
solution would be simply not to use it, but I think some of the kernel
functions called by the interrupt handler cannot cope with receiving
NULL as the value of the parameter 'cont'.
I have to find a way to concurrently maintain the interrupt handler's
return address for each nested interrupt (up to 8 are theoretically
possible).
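
One way to keep a separate return address per nesting level is a small
stack of continuations. The following is only a sketch under stated
assumptions: the type and function names (continuation_t, irq_cont_push,
irq_cont_pop, cont_timer, cont_sched) are hypothetical and not part of
the OKL4 API, and on real hardware push and pop would have to run with
core interrupts disabled.

```c
#include <assert.h>
#include <stddef.h>

/* All names here are hypothetical; none of this is OKL4 API. */
#define MAX_IRQ_NESTING 8   /* "up to 8" nested interrupts, per the text */

typedef void (*continuation_t)(void);

typedef struct {
    continuation_t saved[MAX_IRQ_NESTING]; /* one slot per nesting level */
    size_t depth;                          /* number of active handlers  */
} irq_cont_stack_t;

/* Placeholder continuations, standing in for real return points. */
void cont_timer(void) {}
void cont_sched(void) {}

/* Handler entry: remember where this nesting level must return to.
 * Returns 0 on success, -1 on a NULL continuation or too deep nesting. */
int irq_cont_push(irq_cont_stack_t *s, continuation_t cont)
{
    if (cont == NULL || s->depth >= MAX_IRQ_NESTING)
        return -1;
    s->saved[s->depth++] = cont;
    return 0;
}

/* Handler exit: fetch the continuation of the innermost active level.
 * Returns NULL if no handler is active. */
continuation_t irq_cont_pop(irq_cont_stack_t *s)
{
    if (s->depth == 0)
        return NULL;
    return s->saved[--s->depth];
}
```

With this layout a higher-priority interrupt that arrives during the
timer interrupt pushes its own continuation and pops it again before the
timer handler resumes, so neither level overwrites the other's return
address.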
Regards
Frank
> -----Original Message-----
> From: developer-bounces at okl4.org [mailto:developer-bounces at okl4.org] On Behalf Of Frank Kaiser
> Sent: Wednesday, July 23, 2008 11:00 AM
> To: Geoffrey Lee
> Cc: developer at okl4.org
> Subject: Re: [okl4-developer] pagefault
>
> Hello, Geoffrey
>
> I don't think that the CONTINUATION is wrong (although I must admit that
> the concept appears dubious to me when I look at what the macro
> ACTIVATE_CONTINUATION() is doing with the stack pointer). As I explained
> earlier, our hardware's interrupt controller requires a special
> acknowledgement at the end of the interrupt processing (otherwise it
> would never again accept an interrupt with the same priority). It cannot
> be done before, since this would open a hole for priority inversion.
> I also do not believe that not unmasking the timer interrupt in the
> handler is the cause of the problem. This interrupt occurs at a 1 second
> interval, which is plenty of time for the timer server to do the SYSCALL
> 'interrupt acknowledge'. Furthermore, if that did not happen, the
> interrupt handler would store the next interrupt as pending instead of
> processing it.
> What, however, can happen is that the kernel's scheduler interrupt
> occurs while the timer interrupt is handled. The scheduler is driven by
> a separate interval timer with a tick of 5 ms, and its interrupt is
> given a higher priority than the timer interrupt.
> Whether this really happens depends on the interrupt state at core
> level. If I am not mistaken, then by default the core goes into IRQ
> disable state when an IRQ is accepted. Without any other measure, this
> state would persist until the return from interrupt, when the CPU state
> is switched back to the previous one and the CPSR register is reloaded.
> In this case the scheduler interrupt would be delayed until the end of
> the return from the timer interrupt. If the core interrupt is enabled
> earlier, then there could be a reentrancy effect on the interrupt
> handler.
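
The core-level behaviour described above can be modelled in a few lines
of C. This is a simplified model only, not kernel code: it tracks just
the CPSR mode field and the I bit (bit 7) plus the banked SPSR of IRQ
mode, and ignores the T and F bits and the banked link register. The
structure and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define CPSR_I_BIT 0x80u  /* IRQ disable bit (bit 7 of the CPSR) */
#define CPSR_MODE  0x1Fu  /* mode field, bits 4..0 */
#define MODE_USR   0x10u
#define MODE_IRQ   0x12u

/* Hypothetical model of the state that matters for IRQ nesting. */
typedef struct {
    uint32_t cpsr;
    uint32_t spsr_irq;    /* banked SPSR of IRQ mode */
} core_model_t;

/* An IRQ is only accepted while the I bit is clear. */
int irq_would_be_taken(const core_model_t *c)
{
    return (c->cpsr & CPSR_I_BIT) == 0;
}

/* Exception entry: save CPSR, switch to IRQ mode, set the I bit. */
void irq_entry(core_model_t *c)
{
    c->spsr_irq = c->cpsr;
    c->cpsr = (c->cpsr & ~CPSR_MODE) | MODE_IRQ | CPSR_I_BIT;
}

/* What a handler does if it clears the I bit before returning. */
void irq_reenable_in_handler(core_model_t *c)
{
    c->cpsr &= ~CPSR_I_BIT;
}

/* Return from interrupt: CPSR is reloaded from the saved copy. */
void irq_return(core_model_t *c)
{
    c->cpsr = c->spsr_irq;
}
```

In this model, calling irq_entry(), then irq_reenable_in_handler(), then
irq_entry() again leaves spsr_irq holding the IRQ-mode CPSR of the first
handler; the state of the originally interrupted code is lost unless the
handler saved it beforehand. That is the kind of reentrancy effect meant
above.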
> Another reentrancy problem can be possible between the interrupt handler
> and the SYSCALL config_interrupt(). The point here is that the SYSCALL
> deals with the same data as the interrupt handler, and if critical
> operations are not atomic, then this could result in inconsistent data.
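
A minimal sketch of how such a shared update could be made atomic with
respect to the interrupt handler, assuming a single core. Everything
here is hypothetical: irq_state_t, config_irq_disable() and the two
helpers are illustration names, and the irq_enabled variable merely
stands in for the core's IRQ enable state so that the sketch runs on a
host; on the target the helpers would manipulate the CPSR I bit.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the core IRQ enable state (hypothetical). */
static int irq_enabled = 1;

/* Disable interrupts and report whether they were enabled before. */
int irq_save_disable(void)
{
    int was = irq_enabled;
    irq_enabled = 0;
    return was;
}

/* Restore the interrupt state saved by irq_save_disable(). */
void irq_restore(int was)
{
    irq_enabled = was;
}

/* Data touched by both the interrupt handler and the config path. */
typedef struct {
    uint32_t enabled_mask;  /* one bit per interrupt line */
    uint32_t pending_mask;  /* interrupts stored as pending */
} irq_state_t;

/* Config path: both masks change inside one critical section, so the
 * handler never observes a half-updated state. */
void config_irq_disable(irq_state_t *st, unsigned irq)
{
    assert(irq < 32);
    int was = irq_save_disable();
    st->enabled_mask &= ~(UINT32_C(1) << irq);
    st->pending_mask &= ~(UINT32_C(1) << irq);
    irq_restore(was);
}
```

Saving and restoring the previous state, instead of unconditionally
enabling interrupts at the end, keeps the function safe to call from a
context that already runs with interrupts disabled.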
> For our further problem analysis it could be very helpful to understand
> what happens with the core interrupt state. Especially the assembler
> code in 'traps.spp' is difficult to follow. Can you shed light on
> this?
>
> Regards
> Frank
> > -----Original Message-----
> > From: Pierre-Antoine Bernard
> > Sent: Wednesday, July 23, 2008 9:53 AM
> > To: Frank Kaiser
> > Subject: FW: [okl4-developer] pagefault
> >
> >
> >
> >
> > -----Original Message-----
> > From: Geoffrey Lee [mailto:glee at ok-labs.com]
> > Sent: Wed 7/23/2008 6:29 AM
> > To: Pierre-Antoine Bernard
> > Subject: Re: [okl4-developer] pagefault
> >
> > On Tue, Jul 22, 2008 at 07:22:30PM +0200, Pierre-Antoine Bernard wrote:
> > > Hi Geoffrey,
> > >
> > > Many thanks for your help.
> > > I can give you some kernel interrupt handling code for our ATMEL
> > > AT91SAM9263 platform (see attachment).
> >
> >
> > Hi Pierre-Antoine
> >
> > It may be possible that you have gotten the continuation wrong and
> > are returning via the wrong continuation. Normally we mask the
> > interrupt when it is raised, and it is unmasked after the user-level
> > interrupt handler handles it and calls InterruptControl().
> >
> > Are you working on the same SoC as Frank? I seem to remember
> > you have some special interrupt handling code that does priorities
> > and required some special treatment. It is possible that this is
> > the culprit.
> >
> > >
> > > Regards,
> > > Pierre-Antoine
> >
> > -gl
> >
>
>
> _______________________________________________
> Developer mailing list
> Developer at okl4.org
> https://lists.okl4.org/mailman/listinfo/developer