[dev] Multithreading call tomorrow (Fri) 11:00 East / 17:00 EU
David Lamparter
equinox at opensourcerouting.org
Fri May 19 12:07:37 EDT 2017
Here's a paste of what we wrote down during the meeting. The document
is still accessible under this URL:
https://docs.google.com/document/d/1736qrFFDHCxokZyb8i2mD_BLMzer0EbDSoOM9zjesYQ/edit?usp=sharing
(if the formatting of this mail is broken, please use the link)
FRR Multithreading meeting -- minutes & design foo
Context
(David) - “Why MT” is hopefully not a big discussion item
Generally:
- Keepalive problems, i.e. CLI blocking protocol engine
- More generally - making use of modern multicore CPUs
- Possibility to use modern IPC (e.g. shared memory, semaphores …)
Minutes
- Priorities:
1. Eliminate CLI blocking
2. Introduce framework for general MT usage
- Prior Implementations
1. Implemented by Chris Hall for EuroIX
Motivated by keepalive generation blocked by update processing
Ultimately not merged due to magnitude of change
2. By David Lamparter as prototype
- Evaluations of other event loops - not viable
Changes involved too large / invasive
Design
- General idea: keep existing userspace “threading” model (thread.c) working as-is as much as possible
- Add support for ‘persistent events’ i.e. events / tasks that do not need to be manually rescheduled after they pop
- OpenMP?
Platform support is sketchy; our implementation should be flexible enough to allow the same thing, or to leave OpenMP open as a choice
- Models
One pthread per connection?
Probably way too much implicit ‘locking’ already going on to make this change
1000 bgp peers → 1000 pthreads, not great
General idea (current working direction):
MT-safe thread_master
Avoid excessive locking / code restructuring by allowing pthreads to safely schedule tasks onto each other to do work on data structures they ‘own’
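
To make the "schedule tasks onto each other" idea concrete, here is a minimal sketch, assuming a mutex-protected task queue per pthread. All names (struct task_master as laid out here, master_schedule, master_run, demo_cross_schedule) are hypothetical and do not reflect the actual thread.c API. Only the queue is locked; the counter is touched solely by the pthread that owns it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* All names are hypothetical; this is not the thread.c API. */
struct task {
	void (*func)(void *arg);
	void *arg;
	struct task *next;
};

struct task_master {
	pthread_mutex_t mtx;
	pthread_cond_t cond;
	struct task *head, *tail;
	int stop;
};

void master_init(struct task_master *m)
{
	pthread_mutex_init(&m->mtx, NULL);
	pthread_cond_init(&m->cond, NULL);
	m->head = m->tail = NULL;
	m->stop = 0;
}

/* Safe to call from any pthread: only the queue is locked, not the data. */
int master_schedule(struct task_master *m, void (*func)(void *), void *arg)
{
	struct task *t = malloc(sizeof(*t));

	if (!t)
		return -1;
	t->func = func;
	t->arg = arg;
	t->next = NULL;
	pthread_mutex_lock(&m->mtx);
	if (m->tail)
		m->tail->next = t;
	else
		m->head = t;
	m->tail = t;
	pthread_cond_signal(&m->cond);
	pthread_mutex_unlock(&m->mtx);
	return 0;
}

void master_stop(struct task_master *m)
{
	pthread_mutex_lock(&m->mtx);
	m->stop = 1;
	pthread_cond_signal(&m->cond);
	pthread_mutex_unlock(&m->mtx);
}

/* Loop of the owning pthread: drains the queue, exits once stopped and empty. */
void *master_run(void *arg)
{
	struct task_master *m = arg;

	for (;;) {
		pthread_mutex_lock(&m->mtx);
		while (!m->head && !m->stop)
			pthread_cond_wait(&m->cond, &m->mtx);
		struct task *t = m->head;
		if (t) {
			m->head = t->next;
			if (!m->head)
				m->tail = NULL;
		}
		pthread_mutex_unlock(&m->mtx);
		if (!t)
			break;
		t->func(t->arg); /* runs on the owner, so no lock on its data */
		free(t);
	}
	return NULL;
}

/* Touches data owned by the master_run pthread. */
static void bump(void *arg) { (*(long *)arg)++; }

/* Another pthread (here: the caller) hands n tasks to the owner. */
long demo_cross_schedule(int n)
{
	struct task_master m;
	pthread_t owner;
	long counter = 0;
	int rc;

	master_init(&m);
	rc = pthread_create(&owner, NULL, master_run, &m);
	assert(rc == 0);
	(void)rc;
	for (int i = 0; i < n; i++)
		master_schedule(&m, bump, &counter);
	master_stop(&m);
	pthread_join(owner, NULL);
	pthread_mutex_destroy(&m.mtx);
	pthread_cond_destroy(&m.cond);
	return counter;
}
```

Calling demo_cross_schedule(1000) from a main() hands 1000 increments to the owner pthread and returns the counter after the join. The point of the design is that the counter itself needs no lock, because only the owning pthread ever runs bump().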
Current primary problem:
struct thread is not owned by anyone in particular
Cancellation of a task scheduled / running on another pthread ⇒ possible solution with pthread_setcancelstate()
Need implementations of blocking & nonblocking thread_cancel
Nonblocking implementations require daemon code to synchronize access to shared data itself
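
A rough sketch of what the two cancel flavours could look like. Note this does not use pthread_setcancelstate() as floated above; it swaps in a per-task state flag guarded by a mutex and condition variable, purely as an illustration. The names (struct ctask, ctask_cancel, ctask_cancel_async) are hypothetical and not part of thread.c.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical types and names, for illustration only. */
enum task_state { TASK_PENDING, TASK_RUNNING, TASK_DONE, TASK_CANCELLED };

struct ctask {
	pthread_mutex_t mtx;
	pthread_cond_t cond;
	enum task_state state;
	void (*func)(void *);
	void *arg;
};

void ctask_init(struct ctask *t, void (*func)(void *), void *arg)
{
	pthread_mutex_init(&t->mtx, NULL);
	pthread_cond_init(&t->cond, NULL);
	t->state = TASK_PENDING;
	t->func = func;
	t->arg = arg;
}

/* Called by the pthread that pops the task; skips cancelled tasks. */
void ctask_execute(struct ctask *t)
{
	pthread_mutex_lock(&t->mtx);
	if (t->state != TASK_PENDING) {
		pthread_mutex_unlock(&t->mtx);
		return;
	}
	t->state = TASK_RUNNING;
	pthread_cond_broadcast(&t->cond);
	pthread_mutex_unlock(&t->mtx);

	t->func(t->arg);

	pthread_mutex_lock(&t->mtx);
	t->state = TASK_DONE;
	pthread_cond_broadcast(&t->cond);
	pthread_mutex_unlock(&t->mtx);
}

/* Nonblocking: true if cancelled before it started. On false the task may
 * still be running, so the caller must synchronize on shared data itself. */
bool ctask_cancel_async(struct ctask *t)
{
	bool cancelled = false;

	pthread_mutex_lock(&t->mtx);
	if (t->state == TASK_PENDING) {
		t->state = TASK_CANCELLED;
		cancelled = true;
	}
	pthread_mutex_unlock(&t->mtx);
	return cancelled;
}

/* Blocking: on return the task is guaranteed not to be running. */
bool ctask_cancel(struct ctask *t)
{
	bool cancelled = false;

	pthread_mutex_lock(&t->mtx);
	if (t->state == TASK_PENDING) {
		t->state = TASK_CANCELLED;
		cancelled = true;
	}
	while (t->state == TASK_RUNNING)
		pthread_cond_wait(&t->cond, &t->mtx);
	pthread_mutex_unlock(&t->mtx);
	return cancelled;
}

static void slow_work(void *arg)
{
	struct timespec ts = { 0, 20 * 1000 * 1000 }; /* 20 ms */

	nanosleep(&ts, NULL);
	*(int *)arg = 1;
}

static void *runner(void *arg) { ctask_execute(arg); return NULL; }

/* Cancel a task that another pthread has already started; returns the
 * value of 'done' observed right after the blocking cancel returns. */
int demo_blocking_cancel(void)
{
	struct ctask t;
	pthread_t th;
	int done = 0, seen;

	ctask_init(&t, slow_work, &done);
	pthread_create(&th, NULL, runner, &t);
	pthread_mutex_lock(&t.mtx);
	while (t.state == TASK_PENDING) /* wait until it has started */
		pthread_cond_wait(&t.cond, &t.mtx);
	pthread_mutex_unlock(&t.mtx);
	ctask_cancel(&t);
	seen = done;
	pthread_join(th, NULL);
	return seen;
}
```

In demo_blocking_cancel() the blocking variant only returns once the task function has finished, so the caller can safely look at the data afterwards. The nonblocking variant returns immediately and just reports whether the task was caught before it started.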
Need _persistent_ tasks / events
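
For illustration, the difference between today's one-shot tasks and persistent ones could look roughly like this. It is a single-threaded sketch with made-up names (struct event, event_add, event_fire_all), not the thread.c interface: a one-shot event is disarmed when it pops and would need to be re-added by its handler, a persistent one stays armed.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_EVENTS 8

/* Hypothetical event record, for illustration only. */
struct event {
	void (*func)(void *);
	void *arg;
	bool persistent; /* stays armed after it pops */
	bool armed;
};

struct event_list {
	struct event ev[MAX_EVENTS];
	size_t count;
};

struct event *event_add(struct event_list *l, void (*func)(void *),
			void *arg, bool persistent)
{
	struct event *e;

	if (l->count == MAX_EVENTS)
		return NULL;
	e = &l->ev[l->count++];
	e->func = func;
	e->arg = arg;
	e->persistent = persistent;
	e->armed = true;
	return e;
}

/* Simulates the event source firing once for every registered event.
 * Returns how many handlers actually ran. */
size_t event_fire_all(struct event_list *l)
{
	size_t ran = 0;

	for (size_t i = 0; i < l->count; i++) {
		struct event *e = &l->ev[i];

		if (!e->armed)
			continue;
		if (!e->persistent)
			e->armed = false; /* one-shot: needs manual rescheduling */
		e->func(e->arg);
		ran++;
	}
	return ran;
}

static void count_call(void *arg) { (*(int *)arg)++; }

/* Fires the event source 'rounds' times, returns how often the handler ran. */
int demo_fire(bool persistent, int rounds)
{
	struct event_list l = { .count = 0 };
	int calls = 0;

	event_add(&l, count_call, &calls, persistent);
	for (int i = 0; i < rounds; i++)
		event_fire_all(&l);
	return calls;
}
```

With three rounds, demo_fire(false, 3) runs the handler once and demo_fire(true, 3) runs it three times, since the persistent event never has to be rescheduled by its handler.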
- Action items
Testing for/adding MT-safety of
thread_master
Memory allocation
zlog
deduplication of attributes in bgpd
Future work
- Thread pooling for one thread_master?
- Multiple threads handling events from one thread_master
- Socket / shm IPC