Introduction To The Linux Kernel: Challenges And Case Studies

Transcription

Introduction to the Linux kernel:challenges and case studiesJuan Carlos Sáez AlcaideDepartment of Computer Architecture and AutomationArTeCS GroupComplutense University of MadridIV Semana de la Informática 2018Feb 8, 2018

About MeJuan Carlos Sáez Alcaide (jcsaezal@ucm.es) Interim Associate Professor, UCMDepartment of Computer Architecture and AutomationTeaching: Operating Systems, Linux and Android Internals, Member of the ArTeCS Research Group High Performance ComputingComputer ArchitectureInteraction between system software and architecture UCM Campus Representative of the USENIX Int’l Association Login (USENIX Magazine)IV Semana de la Informática 2018-2

Outline1 Introduction2 Main Features3 Kernel Control Paths and Concurrency4 Common Kernel abstractions5 A case study: PMCTrack toolIV Semana de la Informática 2018-3

Outline1 Introduction2 Main Features3 Kernel Control Paths and Concurrency4 Common Kernel abstractions5 A case study: PMCTrack toolIV Semana de la Informática 2018-4

Unix (I)Unics – Unix (1969) Created by Ken Thompson and rewritten in “C” by Dennis Ritchie (1973)V6 (1975): Public source code(AT&T license) BSD distributions (Billy Joy)John Lion’s book on UNIX V6Keys to success1234Inexpensive licenseSource code availableCode was simple and easy to modifyRan on modest HWIV Semana de la Informática 2018-5

Unix (II)Unix (Cont.) V7 (1979): code can be no longer used for academic purposesXenix (1980) MicrosoftSCOUnix System III (1982)Unix System V (1983) HP-UX, IBM’s AIX, Sun’s SolarisIV Semana de la Informática 2018-6

Unix (III)Proyecto GNU (1983) - Richard Stallman SO GNU: Emacs, GNU compiler collection(GCC), GNU Hurd (kernel)Richard StallmanMinix v1 (1987) - Andrew Tanenbaum Minimal Unix-like OS (Unix clone)Teaching purposes. Modular structureCompatible with Unix V7 (user level) evolvedtowards the POSIX standard 1987 (Minix1 - i8088), 1997 (Minix2 - i386)Andrew TanembaumIV Semana de la Informática 2018-7

Conflicts in the Unix worldUnix Wars (1987-1996)Unix International (USL - AT&T) vs. Open Software Foundation Unix System V Release 4 (SVR4)War of specificationsWar ends in 1996 Open Group (Unix Trademark)Lawsuit brought by USL-AT&T against BSDI (1991-1994)Issue: attempt to make BSD free of any UNIX codeAgreement is reached in 1994: Novell acquires USLIV Semana de la Informática 2018-8

And Linux comes out (1991)A few months after Minix 1 was released the following message was posted at comp.os.minix:From: torva.@klaava.Helsinki.FI (Linus Benedict Torvalds)Date: 25 Aug 91 20:57:08 GMT Local: Sun, Aug 25 1991 9:57 pmSubject: What would you like to see most in minix?Hello everybody out there using minix I'm doing a (free) operatingsystem (just a hobby, won't be big and professional like gnu) for386(486) AT clones. This has been brewing since april, and is startingto get ready. I'd like any feedback on things people like/dislikein minix, as my OS resembles it somewhat (same physical layout ofthe file system (due to practical reasons) among other things).I've currently ported bash(1.08) and gcc(1.40), and things seem towork. This implies that I'll get something practical within a fewmonths, and I'd like to know what features most people would want. Anysuggestions are welcome, but I won't promise I'll implement them :)Linus (torva.@kruuna.helsinki.fi)PS. Yes it's free of any minix code and it has a multi-threaded fs. Itis NOT portable (uses 386 task switching, etc), and it probably neversupport anything other than AT-harddisks, as that's all I have :(.IV Semana de la Informática 2018-9

And Linux comes out (1991)Keys to success Unix WarsBSD LawsuitCompetition with Windows NT (common enemy)GPL LicenseInternet?IV Semana de la Informática 2018-10

And Linux comes out (1991)But Linux is just a kernel GNU/Linux , IBM/RedHat/HP/ .Linux.Distros (kernel selection of pre-compiled tools) MCC Interim Linux 1992,Slackware (Patrick Volkerding) , Debian (Ian Murdock) 1993S.U.S.E, Red Hat (Marc Erwing, Bob Young) 1994Desktop Environments Xfree86 – Thomas Roel 1991KDE – Matthias Ettrich 1996Gnome – Miguel de Icaza 1997 IV Semana de la Informática 2018-11

Linux Is Not UniXOpen Group Current owner of the UNIX trademarkSUS (Single Unix Specification) certification Unificación SUS / IEEE Posix en 2001 (SUS Version 3)A Unix system must comply with SUS Mac OS X Leopard (based on BSD)Z/OS IBMGNU/Linux is just a UNIX-like OSIV Semana de la Informática 2018-12

Linux kernel: evolution since 1991 y Statistics for the Linux Kernel - Lines of Code, Bad words,.Lines of code per Kernel versionLinesClickofandcodeperkernelversiondrag in theplot areato zoom in25M15M10M5M01.1. 021. .23.1. 273.62. 502. .20.2. 401.2. 3712. .751.12. 132.2. 183.2. 294.2. 165.2. 165.2. 546.3. 160.3. 140.3. 520.3. 902.3. 152.3. 534.3. 284.63. 673. .8103. .912.3. 1113. 3.518.14. 74.4. 48.94.4. 1114.5Lines of code20MVersionLines of CodeSource: https://www.linuxcounter.netHighcharts.comIV Semana de la Informática 2018-13

Android: a Linux-based OSAPPLICATIONSHomePhoneContacts BrowserAPPLICATION eManagerLocationManagerLIBRARIESSurface ManagerNotificationManagerANDROID RUNTIMEMediaFrameworkSQLiteCore LibrariesDalvik VirtualMachine / ARTOpenGL ESFreeTypeWebKitSGLSSLBionic (libc)LINUX KERNELDisplayDriverCamera DriverFlash MemoryDriverBinder (IPC)DriverKeypad ntIV Semana de la Informática 2018-14

Linux kernel in AndroidAndroid relies on extensions made on top of the vanilla kernel: BinderWakelockskloggerAshmemLow Memory KillerParanoid NetworkAlarm Timers Today, the “Androidized” kernel constitutes a significant forkof Linux Linaro maintains an “androidized” kernel close to Linux mainlineIV Semana de la Informática 2018-15

Outline1 Introduction2 Main Features3 Kernel Control Paths and Concurrency4 Common Kernel abstractions5 A case study: PMCTrack toolIV Semana de la Informática 2018-16

user modeMonolithic designFeaturesApplicationApplicationlibc (API)kernel modeSystem call versMemoryManagementProcessManagementAll kernel components sharethe same address spaceEverything runs in kernel (privileged) modeGood performanceNo isolation between componentsKernel ComponentsArchitecture specific code (arch)HardwareIV Semana de la Informática 2018-17

User mode vs. kernel modeState register765X 4N 3Z 2V 1C 00D0D1D2D3D4D5D6D7State registerT 1514S 13121112 1011 910 8765X 4N 3Z 2V 1C 0A0A1A2A3A4A5A6A70SystembyteA0A1A2A3A4A5A6A7Kernel modeUserflagsD0D1D2D3D4D5D6D7UserflagsUser mode0216 1TASK SIZEVirtual addressspaceInstruction setI/O addressspaceTASK SIZE232 1Virtual addressspaceInstruction setIV Semana de la Informática 2018-18

Interactive map of the Linux kernelhttp://www.makelinux.net/kernel map/IV Semana de la Informática 2018-19

Main challengesA beast of a different natureDevelopment process is labor-intensive Testing new features or bugfixes requires1 (Re)Build the kernel (it may take a while!)2 Install the new kernel3 Reboot the machineNo memory protection (no SIGSEGV)User-space like debugging tools are not availableDocumentation becomes outdated quickly Developers must really understand the kernel codeNo standard C library Limited size of the kernel stackIV Semana de la Informática 2018-20

Obtaining kernel sourcesVanilla: (The Linux Kernel Archives) www.kernel.org (tarballor git repository) le/linux-stable.githttps://git.kernel.orgGNU/Linux distributions typically use Linux with patchesExample Debian: Package installation with apt-get o apt source Source package of linux-image *linux-source .deb package (install tarball at /usr/src/)More information at Debian Linux Kernel HandbookIV Semana de la Informática 2018-21

Working with the Linux kernelObtain the sources (extract tarball if necessary)Apply patches (if necessary)Configure the kernelNoReboot intostable kernelDoes it boot?Does it work?YesWe aredone :-)Make changes to the source codeBuild the kernel (make,make-kpkg)NoCompilationerrors?YesInstall kernel and reboot machineIV Semana de la Informática 2018-22

Linux kernel modulesWhat is a kernel module?A “code fragment” that can be loaded/unloaded into/fromthe OS kernel’s address space on demand Written in “C”Bundled as a .ko file (object file) Compilation driven by a Makefileinsmod, rmmodIts functions are executed in kernel modeLimitationNot everything can be implemented as a kernel module System callsScheduling algorithms IV Semana de la Informática 2018-23

KGDBRequires 2 machines connected via serial port Development host: To build the kernel and invoke gdbTarget system: Runs the kernel with KGDB supportSource: https://blog.trendmicro.comIV Semana de la Informática 2018-24

KDBKDB: in-kernel debug shell (serial port/text console) No need to use an external debugging hostKDB is not a source-level debugger Programmer needs to deal with assembly codeYou can use it in conjunction with gdb and external symbol fileterminal echo g /proc/sysrq-triggerSysRq : DEBUGEntering kdb (current 0xdfdff040, pid 71) due to Keyboard Entrykdb bp sys sync 4Instruction(i) BP #0 at 0xc00c9f00 (sys sync 0x4)is enabled addr at 00000000c00c9f00, hardtype 0 installed 0kdb go syncEntering kdb (current 0xdfdaa360, pid 72) due to Breakpoint @ 0xc00c9f00kdb btStack traceback for pid 720xdfdaa3607271 10R 0xdfdaa560 *sync[ c0028cb4 ] (unwind backtrace 0x0/0xe4) from [ c0026d50 ] (show stack 0x10/0x14)[ c0026d50 ] (show stack 0x10/0x14) from [ c0079e78 ] (kdb show stack 0x58/0x80)[ c0079e78 ] (kdb show stack 0x58/0x80) from [ c0079f1c ] (kdb bt1.clone.0 0x7c/0xcc)[ c0079f1c ] (kdb bt1.clone.0 0x7c/0xcc) from [ c007a240 ] (kdb bt 0x2d4/0x338).IV Semana de la Informática 2018-25

SystemTapSystemTap: Tool for dynamic instrumentation of the kernel Scripting languageexample.stpprobe kernel.function("do fork"){printf("do fork() was invoked by PID %d\n",pid());}terminal sudo stap example.stpdo fork() was invoked by PID 4811do fork() was invoked by PID 4820do fork() was invoked by PID 4811.IV Semana de la Informática 2018-26

Linux kernel debuggingMany options available: KGDBKDBGDB /proc/kcoreFtraceSystemTapICE / JTAG (USB or ethernet)printk() dmesgkdump/kexec The challenge is knowing what to use when.Many tools have a steep learning curveIV Semana de la Informática 2018-27

Outline1 Introduction2 Main Features3 Kernel Control Paths and Concurrency4 Common Kernel abstractions5 A case study: PMCTrack toolIV Semana de la Informática 2018-28

Kernel Control PathsLinux kernel is like a server that answers requests Kernel functions are executed in response to events1 System calls2 An exception occurs (e.g., illegal instruction)3 An interrupt is raised by an I/O deviceKernel control path: sequence of instructions executed inkernel mode Lighter than a process (less context)It does not always work on behalf of a processIV Semana de la Informática 2018-29

Execution ContextsAt a certain time instant t each CPU can be running: A process’s code at user space (Process context in user mode)Kernel code from a system call or exception handler (Processcontext in kernel mode)A kernel thread’s code (Process context in kernel mode)Interrupt handling code (Interrupt context)useruseruserkerneluserkernelintrkernelUser ModeKernel ModeintrIV Semana de la Informática 2018-30

Causes of ConcurrencyCauses of interleaved/parallel execution of kernel code paths1Interrupts 23Bottom-half processingKernel preemption 4Because the kernel is preemptive, one kernel code path canpreempt anotherSleeping and synchronization with user-space 5An interrupt can occur asynchronously at almost any time,inter- rupting the currently executing codeA task in the kernel can sleep and thus invoke the scheduler,resulting in the running of a new process.Symmetrical multiprocessing Two or more processors can execute kernel code at exactly thesame timeIV Semana de la Informática 2018-31

Outline1 Introduction2 Main Features3 Kernel Control Paths and Concurrency4 Common Kernel abstractions5 A case study: PMCTrack toolIV Semana de la Informática 2018-32

Interactive map of the Linux kernelhttp://www.makelinux.net/kernel map/IV Semana de la Informática 2018-33

Common Kernel abstractionsCommon abstractionsSystem callsPseudo file systems: /proc, /sysKernel data structuresDynamic memory allocation kmalloc(), vmalloc(), kfree(), vfree()Bottom-half methods (deffering work)Kernel timersKernel threadsKernel synchronization methods IV Semana de la Informática 2018-34

Pseudo file systemsInteraction between user programs and the kernel through files We may read/modify kernel “variables” via shell commands echo 1 /proc/sys/net/ipv4/ip forwardcat /proc/cpuinfo/procProgrammer must provide an implementation of the various file operations (syscalls) supported: read(), write(), /sysFolders in each sysfs directory correspond to objects (kobjects)Files in sysfs represent attributes of an object write file change attribute valueread file retrieve attribute valueIV Semana de la Informática 2018-35

Generic data structuresThe Linux kernel implements various generic data structures Doubly linked listsQueuesMapsBinary trees Avoid dynamic memory allocation when possiblenextprevlist headlist headlist headnextprevnextprevnextprevIV Semana de la Informática 2018-36

Bottom-half methodsMotivationSometimes, a certain kind of work cannot be done immediately Too much computation associated with interrupt processing TCP/IP processing upon recieving network packetInability to invoke blocking functions from interrupt contextand other code regionsSolution: defer work for laterSofirqs raise sofirq

pmctrack -T 1 -c instr,cycles,llc_misses -V energy_core ./mcf06 \ pmc-metric -v -m 'IPC pmc0/pmc1' -m 'LLCMPKI (1000*pmc3/pmc0)' -m 'EPI (1000*virt0)/pmc0' [Event-to-counter mappings] pmc0 instr pmc1 cycles pmc3 llc_misses virt0 energy_core. nsample pid event IPC LLCMPKI EPI 1 7843 tick 0.561232 15.257263 4.006190