[dev] Multithreading call tomorrow (Fri) 11:00 East / 17:00 EU

David Lamparter equinox at opensourcerouting.org
Fri May 19 12:07:37 EDT 2017


Here's a paste of what we wrote down during the meeting.  The document
is still accessible under this URL:
https://docs.google.com/document/d/1736qrFFDHCxokZyb8i2mD_BLMzer0EbDSoOM9zjesYQ/edit?usp=sharing
(if the formatting of this mail is broken, please use the link)

FRR Multithreading meeting -- minutes & design foo

Context
(David) - “Why MT” is hopefully not a big discussion item
Generally:
-	Keepalive problems, i.e. CLI blocking protocol engine
-	More generally - making use of modern multicore CPUs
	Possibility to use modern IPC (e.g. shared memory, semaphore …)

Minutes
- Priorities:
	1. Eliminate CLI blocking
	2. Introduce framework for general MT usage
- Prior Implementations 
	1. Implemented by Chris Hall for EuroIX
		Motivated by keepalive generation blocked by update processing
		Ultimately not merged due to magnitude of change
	2. By David Lamparter as prototype
- Other event loops were evaluated - not viable
	The changes required would be too large / invasive

Design
- General idea: keep existing userspace “threading” model (thread.c) working as-is as much as possible
- Add support for ‘persistent events’ i.e. events / tasks that do not need to be manually rescheduled after they pop
- OpenMP?
	Platform support is sketchy; our implementation should be flexible enough to leave OpenMP open as an option
- Models
	One pthread per connection?
		Probably way too much implicit ‘locking’ already going on to make this change
		1000 bgp peers → 1000 pthreads, not great
	General idea (current working direction):
		MT-safe thread_master
		Avoid excessive locking / code restructuring by allowing pthreads to safely schedule tasks onto each other to do work on data structures they ‘own’
	Current primary problem:
		struct thread is not owned by anyone in particular
		Cancellation of a task scheduled / running on another pthread ⇒ possible solution with pthread_setcancelstate()
		Need implementations of blocking & nonblocking thread_cancel
		A nonblocking implementation requires daemon code to synchronize access to shared data itself
		Need _persistent_ tasks / events
- Action items
	Testing for/adding MT-safety of
		thread_master
		Memory allocation
		zlog
		deduplication of attributes in bgpd
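
To make the "current working direction" above a bit more concrete, here is a minimal sketch of pthreads scheduling tasks onto the pthread that owns a piece of data. It is not FRR code: struct owner_loop, loop_schedule(), loop_run() and run_owner_demo() are hypothetical names invented for this illustration, and the sketch leaves out cancellation and persistent tasks entirely. Only the task queue is locked; the owned data (here a counter) is touched only by its owning pthread.

```c
/* Sketch only: all names are hypothetical and not taken from FRR's thread.c. */
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

struct task {
    void (*func)(void *arg);
    void *arg;
    struct task *next;
};

struct owner_loop {
    pthread_mutex_t mtx;        /* protects head, tail and stopping only */
    pthread_cond_t cond;
    struct task *head, *tail;   /* FIFO of pending tasks */
    int stopping;
    int counter;                /* owned by the loop's pthread; never locked */
};

/* Callable from any pthread: queue func(arg) to run on the owning pthread. */
static int loop_schedule(struct owner_loop *l, void (*func)(void *), void *arg)
{
    struct task *t = malloc(sizeof(*t));

    if (!t)
        return -1;
    *t = (struct task){ .func = func, .arg = arg };
    pthread_mutex_lock(&l->mtx);
    if (l->tail)
        l->tail->next = t;
    else
        l->head = t;
    l->tail = t;
    pthread_cond_signal(&l->cond);
    pthread_mutex_unlock(&l->mtx);
    return 0;
}

/* Body of the owning pthread: pop tasks and run them outside the lock. */
static void *loop_run(void *arg)
{
    struct owner_loop *l = arg;

    for (;;) {
        pthread_mutex_lock(&l->mtx);
        while (!l->head && !l->stopping)
            pthread_cond_wait(&l->cond, &l->mtx);
        struct task *t = l->head;
        if (t && !(l->head = t->next))
            l->tail = NULL;     /* queue became empty */
        pthread_mutex_unlock(&l->mtx);
        if (!t)
            return NULL;        /* stopping and nothing left to run */
        t->func(t->arg);
        free(t);
    }
}

static void bump(void *arg)
{
    ++*(int *)arg;              /* runs only on the owning pthread */
}

static void *producer(void *arg)
{
    struct owner_loop *l = arg;

    for (int i = 0; i < 25; i++)
        loop_schedule(l, bump, &l->counter);
    return NULL;
}

/* Four producers each schedule 25 increments onto the owner. */
int run_owner_demo(void)
{
    struct owner_loop l = { .stopping = 0 };
    pthread_t owner, prod[4];
    int rc;

    pthread_mutex_init(&l.mtx, NULL);
    pthread_cond_init(&l.cond, NULL);
    rc = pthread_create(&owner, NULL, loop_run, &l);
    assert(rc == 0);
    for (int i = 0; i < 4; i++) {
        rc = pthread_create(&prod[i], NULL, producer, &l);
        assert(rc == 0);
    }
    for (int i = 0; i < 4; i++)
        pthread_join(prod[i], NULL);
    pthread_mutex_lock(&l.mtx);
    l.stopping = 1;             /* owner exits once the queue is drained */
    pthread_cond_signal(&l.cond);
    pthread_mutex_unlock(&l.mtx);
    pthread_join(owner, NULL);
    (void)rc;
    return l.counter;
}
```

Provided every allocation succeeds, run_owner_demo() returns 100: the producers never touch the counter directly, so it needs no lock of its own. A real thread_master would additionally have to deal with timers, file descriptors and the cancellation questions listed above.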

Future work
- Thread pooling for one thread_master?
- Multiple threads handling events from one thread_master
- Socket / shm IPC


