[okl4-developer] spatial and temporal partitioning control?

Stefan M. Petters smp at isep.ipp.pt
Sun Sep 6 07:17:38 EST 2009


Dear Jean-Christophe,


Jean-Christophe Dubois wrote:
> On Sunday 30 August 2009 18:49:11 Jean-Christophe Dubois wrote:
>   
>> Now, I am a bit uneasy on how this would work on a non SOC device using for
>> example a PCI bus or an USB bus. Would we get the ability to assign each
>> PCI card/USB device to a single Cell. How should the bus
>> enumeration/configuration happen? Considering the PCI enumeration could
>> change the various devices physical addresses how would it work? How about
>> PCI hotplug? I am not sure ... does anybody has a view on this? Or is this
>> a field to explore?
>>     
I have to pass on that one.

>> Still this let the temporal partitioning issue. OKL4 is mostly based on
>> priority. If a high priority cell goes mad, we are stuck because it will
>> eat up all of our CPU without any possibility for other cells to try to fix
>> the problem.
Yes, that's the very definition of priority. To work around this, you
could have a high-prio thread that periodically checks whether it has
received an alive IPC from a lower-prio task and, if not, fixes the
problem. This is rather crude, but if you don't trust your high-prio
cells ...


>>  And even if we have an even higher priority cell controlling
>> the all thing, how could it find that one specific cell has gone mad and
>> all others cells are starving on CPU resource.
>>     
Well, you could hack up the apps to send async IPCs to the watchdog(s),
but as mentioned this seems rather crude.
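
To make the watchdog idea concrete, here is a minimal sketch written as a
plain Python simulation. It does not use the OKL4 API: the class, the method
names and the tick-based clock are all hypothetical, and heartbeat() merely
stands in for whatever async "alive" IPC the cells would send.

```python
# Hypothetical simulation; none of these names are OKL4 calls.
class Watchdog:
    """High-prio monitor that expects a periodic heartbeat from each cell."""

    def __init__(self, timeout_ticks):
        self.timeout_ticks = timeout_ticks
        self.last_seen = {}  # cell name -> tick of its last heartbeat

    def register(self, cell, now):
        # Start monitoring a cell; registration counts as a first heartbeat.
        self.last_seen[cell] = now

    def heartbeat(self, cell, now):
        # Stands in for an async "alive" IPC sent by the cell.
        self.last_seen[cell] = now

    def starved_cells(self, now):
        # Cells whose last heartbeat is older than the timeout (sorted).
        return sorted(cell for cell, seen in self.last_seen.items()
                      if now - seen > self.timeout_ticks)
```

Note that a missing heartbeat only shows that a cell is not getting CPU time.
It does not by itself say which higher-prio cell is hogging the CPU, so the
"fix it" step still needs a policy of its own.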

>> Last if we are considering cells with threads of equal priority, OKL4 shall
>> run these various threads in round-robin mode. But all threads will
>> necessarily get the same scheduling time. I can't find an easy way to
>> instruct OKL4 that for 2 threads of the same priority, one should get twice
>> as much CPU time than the other one. Is there a way to allocate CPU
>> resource  to the various cells/threads in a fine grained way?
>>     
AFAIK not in the current version. There have been efforts in that
direction:
1. work done under Scott Brandt at UCSC;
2. work done at NICTA (supervised by me) on a 2.1 version of the kernel.
The code of the latter has not been released. I recently started redoing
it on OKL4 3.0, but that has not yet reached a state where it could be
released. However, the ideas and logic behind the 2.1 version, plus one
or two extensions, are published:
http://www.cister.isep.ipp.pt/docs/towards+real+multi-criticality+scheduling/474/
Note, this is done from an RT perspective, but the discussion is mostly
equally valid when talking about best-effort tasks only.
The main concern there is the performance loss. If one may ignore the
bursty release of stuff and reduce it to a simple runnability
requirement (i.e. deadline equal to period), then the RT proof aspect
is not that hard, but that still leaves the EDF scheduling core. If one
is *only* concerned with shares, this can be arranged easily:
virtual-machine-type hierarchical scheduling comes to mind. And when
certain assumptions can be made, the EDF queue doesn't look as bad
anymore.
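
As an illustration of the shares-only case, the following is a small Python
simulation of budget/period reservations scheduled by EDF with deadline equal
to period. It is only a sketch under simplifying assumptions (discrete ticks,
every reservation always has work to do, no communication between them). It
is neither the NICTA implementation nor the algorithm from the paper linked
above, and the function and parameter names are made up.

```python
def simulate_edf(servers, ticks):
    """Simulate EDF over budget/period reservations.

    servers: dict mapping name -> (budget, period), both in ticks.
    Returns a dict mapping name -> number of ticks that reservation ran.
    """
    remaining = {}  # budget left in the current period
    deadline = {}   # end of the current period (deadline == period)
    ran = {name: 0 for name in servers}
    for now in range(ticks):
        for name, (budget, period) in servers.items():
            if now % period == 0:  # a new period starts: replenish
                remaining[name] = budget
                deadline[name] = now + period
        ready = [name for name in servers if remaining[name] > 0]
        if not ready:
            continue  # all budgets exhausted: idle tick
        # Earliest deadline first; ties are broken by name.
        pick = min(ready, key=lambda name: (deadline[name], name))
        remaining[pick] -= 1
        ran[pick] += 1
    return ran
```

With reservations such as {"a": (2, 3), "b": (1, 3)}, "a" gets twice the CPU
time of "b" regardless of priority, and a reservation that misbehaves can
never consume more than its budget per period.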

Note, somehow you need to find shares which add up to not more than 100%
;-) which requires some sort of analysis, in particular for a real
system which communicates (which it almost invariably will). There are
a few more things to it than what is published, but I think that would
be too noisy for the mailing list. We can take that offline if you are
still curious.
Since it is a rather obvious problem in virtual machines, I would have
expected more work on this. Maybe it's been done but not published anywhere.
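
The requirement that the shares add up to no more than 100% can be written
as a simple admission check. The sketch below is Python with hypothetical
names and assumes independent reservations given as (budget, period) pairs on
a single CPU. It deliberately ignores the blocking caused by communication,
which is exactly the part that needs the additional analysis mentioned above.

```python
from fractions import Fraction


def total_share(reservations):
    """Sum of budget/period over all reservations, as an exact fraction."""
    return sum((Fraction(budget, period)
                for budget, period in reservations.values()),
               Fraction(0))


def admissible(reservations):
    """True if the shares add up to no more than 100% of one CPU."""
    return total_share(reservations) <= 1
```

Exact fractions are used so that a set of shares summing to exactly 100% is
not rejected because of floating-point rounding.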
 
>> BTW, now that there is no more "privileged" cell (as opposed to OKL4 2.4)
>> how could one cell theoretically restart another cell that is crashed/has
>> gone mad? We can certainly reset the all system but is there a finer
>> grained way to do it.
>>     
Here I have to pass again.

Regards,
    Stefan.
-- 
Stefan M. Petters
CISTER Research Unit
 
ISEP - IPP | Rua Dr. António Bernardino de Almeida 431
4200-072 Porto | Portugal
T +351 22 83 40 529 | Homepage
<http://www.cister.isep.ipp.pt/people/stefan+m%2E+petters>


