Process Synchronization and Interprocess Communication Practice
Application <----> OS <----> Hardware
- Application <-> OS: system calls
- OS <-> Hardware: CPU state, interrupt / exception mechanism
Exceptional Control Flow (ECF)
Interrupt Descriptor Table (IDT)
- Wiki - Interrupt descriptor table
- Interrupt Descriptor Table
- 中斷描述表 (Interrupt Descriptor Table, in Chinese)
- 中斷描述符表 (Interrupt Descriptor Table, in Chinese)
- YouTube - Accessing the Interrupt Descriptor Table
- CPU
- Register
- Interrupt
- User-visible registers: used by compilers of high-level languages to reduce the number of memory accesses
  - data register (general purpose register)
  - address register
    - index register
    - segment pointer
    - stack pointer
  - condition code register
    - overflow
    - sign
- Control and status registers (accessible only in privileged mode)
Used to distinguish user mode from kernel mode
- mode flag
- status register (flag register, condition code register (CCR)): supervisor flag
- privilege level
x86 - Protection ring
- Ring 0: kernel mode
- Ring 3: user mode
Hardware support:
- Execute a different set of instructions at each privilege level.
- Separate the OS from user programs.
Use the PSW (Program Status Word)
The EFLAGS register on x86
Privileged instructions: can be executed only by the OS in kernel mode, never by user programs.
- Kernel Mode: running system programs
- User Mode: running user programs
The trap instruction (supervisor call instruction) is a non-privileged instruction
Example on x86: 4 privilege levels
- R0: kernel state
- R1
- R2
- R3: user state
Most operating systems on x86 use only R0 and R3
- User Mode -> Kernel Mode
  - Only way: the interrupt / exception / trap mechanism
- Kernel Mode -> User Mode
  - Set the PSW to user mode
e.g. int, trap, syscall, sysenter/sysexit are supervisor call (trap) instructions; the name differs between systems
The OS is interrupt-driven (event-driven)
Origin of interrupts and exceptions:
- Interrupt: supports parallel operation of the CPU and devices
- Exception: a problem that appears while the CPU is executing an instruction
- The CPU "reacts" to an "event"
  - The CPU stops the running process
  - It preserves the scene (PC, PSW)
  - It executes the handler for the "event"
  - After the handler finishes, it goes back to the break point
    - If the event was a system call, advance the PC
    - If it was another exception (e.g. a fault), don't advance the PC, so the instruction is re-executed
  - It continues executing
- (External) Interrupt
  - I/O interrupt
  - Timer interrupt
  - Hardware failure
- Exception (Internal Interrupt)
  - System call
  - Page fault (the referenced page is not present)
  - Protection exception
  - Breakpoint instruction
  - Other programming exceptions
    - e.g. overflow
| | Unexpected | Deliberate |
|---|---|---|
| Exceptions (sync) | fault | syscall trap |
| Interrupts (async) | interrupt | software interrupt |
- Interrupts: asynchronous interrupts generated by hardware.
- Exceptions: synchronous interrupts generated by the processor.
Class | Reason | Async/Sync | Return behavior |
---|---|---|---|
Interrupt | Signal from an I/O device or peripheral | Async | Always returns to the next instruction |
Trap | Intentional exception | Sync | Always returns to the next instruction |
Fault | Potentially recoverable error | Sync | Might return to the current instruction |
Abort | Unrecoverable error | Sync | Never returns |
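
The "Return behavior" column can be sketched as a small function. This is only an illustration: `resume_pc`, the `ecf_class` enum, and the use of 0 as a "does not return" marker are hypothetical names and conventions, not part of any real CPU or kernel interface.

```c
#include <stdint.h>

/* The four classes of exceptional control flow from the table. */
enum ecf_class { ECF_INTERRUPT, ECF_TRAP, ECF_FAULT, ECF_ABORT };

/*
 * Hypothetical helper: given the address `pc` of the instruction that was
 * current when the event was recognized and its length `len` in bytes,
 * return the address at which execution resumes.
 * Returns 0 (an illustrative sentinel) when control never comes back.
 */
uint32_t resume_pc(enum ecf_class cls, uint32_t pc, uint32_t len)
{
    switch (cls) {
    case ECF_INTERRUPT:   /* taken between instructions */
    case ECF_TRAP:        /* e.g. a system call */
        return pc + len;  /* continue with the next instruction */
    case ECF_FAULT:       /* e.g. a page fault that was repaired */
        return pc;        /* re-execute the faulting instruction */
    case ECF_ABORT:
    default:
        return 0;         /* the program is terminated */
    }
}
```

For a fault the handler may also decide to terminate the program instead; the sketch shows only the case where the fault was repaired.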
Discovering an interrupt -> accepting the interrupt
In the last step of each instruction cycle, the CPU scans the interrupt register to check whether an interrupt signal is pending.
If there is one, the interrupt hardware writes the "interrupt code" into the corresponding field of the PSW, and the CPU uses that code to look up the interrupt vector and call the interrupt handler.
The location of the IDT (address and size) is kept in the IDTR register of the CPU, which is loaded and stored with the LIDT and SIDT instructions
This is similar to the GDT
IDTR: Interrupt Descriptor Table Register
The processor has a special register (IDTR) that stores both the linear base address of the IDT and its limit (the length in bytes minus one)
On the x86 architecture, the Interrupt Vector Table (IVT) is a table that specifies the addresses of all 256 interrupt handlers used in real mode.
An interrupt vector is a unit in memory that stores the entry address of an interrupt handler and the PSW for it.
Linux interrupt vector
- 128 (0x80): for system call (programmable exception)
The x86 architecture is an interrupt-driven system. External events trigger an interrupt: the normal control flow is interrupted and an Interrupt Service Routine (ISR) is called
Procedure
- Preserve the relevant registers
  - PC
  - PSW
- Analyze the cause of the interrupt / exception
- Execute the corresponding function
- Restore the state and return to the original program
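
The procedure can be sketched as a user-space simulation. Everything here is hypothetical: `cpu_state`, `PSW_KERNEL`, `install_handler`, `deliver`, and `timer_handler` are illustrative names, and real hardware saves state on a stack rather than in a local copy.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_VECTORS 256
#define PSW_KERNEL  0x1u   /* hypothetical mode bit, not a real PSW layout */

struct cpu_state {
    uint32_t pc;
    uint32_t psw;
};

typedef void (*handler_fn)(struct cpu_state *cpu);

handler_fn vector_table[NUM_VECTORS];  /* one handler slot per vector */
unsigned handled_count;                /* how many times timer_handler ran */

/* Example handler: only counts its invocations. */
void timer_handler(struct cpu_state *cpu)
{
    (void)cpu;
    handled_count++;
}

void install_handler(unsigned vector, handler_fn fn)
{
    if (vector < NUM_VECTORS)
        vector_table[vector] = fn;
}

/* Returns 0 when a handler ran, -1 when the vector has no handler. */
int deliver(struct cpu_state *cpu, unsigned vector)
{
    struct cpu_state saved;

    if (vector >= NUM_VECTORS || vector_table[vector] == NULL)
        return -1;
    saved = *cpu;               /* preserve PC and PSW */
    cpu->psw |= PSW_KERNEL;     /* the handler runs in kernel mode */
    vector_table[vector](cpu);  /* execute the corresponding function */
    *cpu = saved;               /* restore the state and return */
    return 0;
}
```

The vector number plays the role of "analyze the cause": it selects which function runs.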
Example of I/O Interrupt:
- The I/O operation ends normally
  - Wake up the process that is waiting for the result
- The I/O operation fails
  - Retry the failed operation
  - After the maximum number of retries, treat it as a hardware failure
Implementation of Timer Interrupt:
- System necessary
- Software clock
- CPU scheduling
- Round Robin
- Timing task
- Real-time execution
Hardware Fault Interrupt
Program Interrupt
IA32 = Intel's 32-bit computer architecture = x86 (the name comes from the Intel processor model number "Intel 8086")
- Interrupt
- Exception 異常
- System Call
IA32 system structure
- Programmable Interrupt Controller (PIC) / Advanced Programmable Interrupt Controller (APIC)
  - Translates a hardware interrupt signal into an interrupt vector and triggers the CPU interrupt
- Interrupt Vector Table (Real Mode)
  - Stores the address of each interrupt handler
  - handler entry address = segment base address + offset, where segment base address = segment × 16
- Interrupt Descriptor Table (Protected Mode)
  - Uses a data structure, the gate descriptor, to describe each interrupt vector
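
As a worked example of the real-mode address calculation, the sketch below reads one IVT entry from a byte array and computes the handler's linear address. The function names are hypothetical. The layout assumed is the standard real-mode one: entry i is 4 bytes at address 4 × i, a 16-bit offset followed by a 16-bit segment, both little-endian.

```c
#include <stdint.h>

/* Real mode: linear address = segment * 16 + offset. */
uint32_t real_mode_address(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}

/*
 * `ivt` points to a copy of the 1 KiB table (256 entries of 4 bytes).
 * Bytes 0-1 of an entry hold the offset, bytes 2-3 the segment.
 */
uint32_t ivt_handler_address(const uint8_t *ivt, unsigned vector)
{
    const uint8_t *entry = ivt + 4u * vector;
    uint16_t offset  = (uint16_t)(entry[0] | (entry[1] << 8));
    uint16_t segment = (uint16_t)(entry[2] | (entry[3] << 8));

    return real_mode_address(segment, offset);
}
```

For instance, segment 0xF000 with offset 0xF065 gives 0xF0000 + 0xF065 = 0xFF065.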
Gates in the Interrupt Descriptor Table
- Task Gate
- Interrupt Gate
- Trap Gate
- Call Gate (a fourth gate type; it is placed in the GDT or LDT, not in the IDT)
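
To make the gate descriptor concrete, here is a minimal sketch that decodes a 32-bit protected-mode IDT entry held in a 64-bit integer. The struct and function names are hypothetical. The bit positions follow the IA-32 gate format as I understand it: offset in bits 0-15 and 48-63, segment selector in bits 16-31, type in bits 40-43, DPL in bits 45-46, present flag in bit 47.

```c
#include <stdint.h>

/* Type codes of the gates that can appear in a 32-bit IDT. */
enum gate_type { GATE_TASK = 0x5, GATE_INT32 = 0xE, GATE_TRAP32 = 0xF };

struct gate {
    uint32_t offset;    /* handler offset inside the code segment */
    uint16_t selector;  /* segment selector of the handler's segment */
    uint8_t  type;      /* one of enum gate_type */
    uint8_t  dpl;       /* descriptor privilege level, 0-3 */
    uint8_t  present;   /* 1 if the entry is valid */
};

struct gate decode_gate(uint64_t raw)
{
    struct gate g;

    g.offset   = (uint32_t)(raw & 0xFFFFu)
               | ((uint32_t)((raw >> 48) & 0xFFFFu) << 16);
    g.selector = (uint16_t)((raw >> 16) & 0xFFFFu);
    g.type     = (uint8_t)((raw >> 40) & 0xFu);
    g.dpl      = (uint8_t)((raw >> 45) & 0x3u);
    g.present  = (uint8_t)((raw >> 47) & 0x1u);
    return g;
}
```

An interrupt gate and a trap gate differ only in the type field; on entry through an interrupt gate the CPU also clears IF, while a trap gate leaves it unchanged.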
Procedure of Interrupt in x86
- Get the interrupt vector (i)
- Use the IDTR to find the IDT, then read the interrupt descriptor (the ith entry in the table)
- Get the address of the GDT from the GDTR (the GDT contains entries telling the CPU about memory segments)
- Use the segment selector in the gate descriptor to fetch the corresponding segment descriptor from the GDT
- From that segment descriptor get the base address of the interrupt handler's segment
  - handler entry address = segment base address + offset
- Check the privilege to make sure access to the segment is allowed
  - For software interrupts, make sure CPL (in the CS register) ≤ gate descriptor DPL
    - This prevents user applications from invoking special trap gates or interrupt gates
  - Make sure CPL (in the CS register) ≥ DPL of the handler's segment descriptor
    - The handler must not be less privileged than the interrupted code
- CPL: Current privilege level
- RPL: Requested privilege level (privilege level associated with a segment selector)
- DPL: Descriptor privilege level (privilege level of a segment)
  - It defines the minimum privilege level required to access the segment.
Privilege levels range from 0-3; lower numbers are more privileged.
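
A minimal sketch of the privilege check follows. It assumes the commonly documented rule: a software interrupt needs CPL ≤ gate DPL, and the handler's code segment must not be less privileged than the interrupted code (handler DPL ≤ CPL). Both function names are hypothetical.

```c
#include <stdbool.h>

/* The CPL is held in the two least significant bits of the CS selector. */
unsigned cpl_from_cs(unsigned cs)
{
    return cs & 0x3u;
}

/*
 * `software` is true for an INT n instruction and false for a hardware
 * interrupt or a processor-detected exception.
 * Lower numbers mean more privilege.
 */
bool interrupt_allowed(unsigned cpl, unsigned gate_dpl,
                       unsigned handler_dpl, bool software)
{
    if (software && cpl > gate_dpl)
        return false;   /* user code may not invoke this gate directly */
    if (handler_dpl > cpl)
        return false;   /* handler would be less privileged than caller */
    return true;
}
```

With a gate DPL of 3 and a ring 0 handler, user code at CPL 3 passes both checks, which is how a system call gate is made reachable from user mode. A gate with DPL 0 is rejected for `INT n` from CPL 3 but still accepted when the hardware raises it.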
Register <--> Cache <--> Memory <--> Disk
Speed Fast ------------------------------> Slow
Capacity Small ------------------------------> Big
Cost High -------------------------------> Low
aka. principle of locality
- Byte, Bit
- Page frame (physical page)
- Block / Page size
  - 512B, 1KB, 4KB, ..., 256KB, 1MB, 4MB, 16MB
Cache (SRAM)
┌-----┐ ┌-------┐ ┌--------┐
| CPU | <--- Byte or Word transfer ---> | Cache | <--- Block transfer ---> | Memory |
└-----┘ └-------┘ └--------┘
- Program Control
- Interrupt Trigger
- Direct Memory Access (DMA)
Blocking I/O and Non-blocking I/O
Polling, aka. Programmed I/O
The CPU has to keep checking the I/O status, so it wastes a lot of time polling.
Interrupt-driven I/O solves the "polling" problem by freeing the CPU from checking the status.
It lets I/O run in parallel with the other instructions.
The I/O unit sends an interrupt when the device is ready to interact.
Interrupt-based I/O is still not efficient enough for bulk transfers.
DMA uses a separate unit, the DMA controller, to move the data.
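
The cost of polling can be illustrated with a toy simulation. The `device` struct and both functions are hypothetical; a real driver would read a hardware status register instead of a counter.

```c
/* Simulated device: becomes ready after a number of status checks. */
struct device {
    int ticks_until_ready;
};

/* Each status check lets simulated time advance by one tick. */
int device_ready(struct device *d)
{
    if (d->ticks_until_ready > 0)
        d->ticks_until_ready--;
    return d->ticks_until_ready == 0;
}

/* Busy-wait until the device is ready; return the number of wasted checks. */
unsigned poll_until_ready(struct device *d)
{
    unsigned wasted = 0;

    while (!device_ready(d))
        wasted++;   /* the CPU did nothing useful in this iteration */
    return wasted;
}
```

With interrupt-driven I/O the CPU would run other work during those iterations and be notified once, when the device is ready.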
TCON (Timer Control register)
A timer is necessary in the following scenarios
- Detecting infinite loops
- Round Robin scheduling in interactive systems
- Delay and timeout control in real-time systems
- Running some external event for a fixed duration
- ...
- absolute clock: records the current time (keeps advancing even when the machine is shut down)
- relative clock: implemented with a clock register
  - the register is decremented once per time unit; when the value becomes negative, an action is triggered
- hardware clock
- software clock
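
The relative clock can be sketched as a countdown register. The `rel_clock` struct and `clock_tick` are hypothetical names; reloading the counter after it fires is an assumption made here so that the clock is periodic.

```c
/* Countdown clock: fires when the counter drops below zero. */
struct rel_clock {
    int remaining;   /* current value of the clock register */
    int reload;      /* value loaded again after the clock fires */
    unsigned fired;  /* how many times the action was triggered */
};

/* Called once per time unit, e.g. from the timer interrupt handler. */
void clock_tick(struct rel_clock *c)
{
    c->remaining--;
    if (c->remaining < 0) {
        c->fired++;                /* "do something", e.g. reschedule */
        c->remaining = c->reload;  /* start the next period */
    }
}
```

Because the action fires when the value becomes negative rather than zero, a reload value of n gives a period of n + 1 ticks.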
A system call is a way for programs to interact with the operating system.
System calls provide the services of the operating system to user programs through an Application Program Interface (API), usually a library
A system call traps the CPU from user mode into kernel mode
Kernel Function
Example: printf
printf() --> write() (syscall)
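
The layer underneath `printf()` can be called directly. The sketch below assumes a POSIX system; `say` is a hypothetical helper, while `write()` and `STDOUT_FILENO` come from `<unistd.h>`.

```c
#include <string.h>
#include <unistd.h>

/*
 * Write a string to standard output with the write() system call wrapper,
 * bypassing the buffering that printf() does in user space.
 * Returns the number of bytes written, or -1 on error.
 */
ssize_t say(const char *msg)
{
    return write(STDOUT_FILENO, msg, strlen(msg));
}
```

`printf()` formats into a user-space buffer first and eventually hands that buffer to `write()`, so one `printf()` call does not necessarily mean one system call.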
- Interrupt/Exception mechanism
  - implements the services
- Trap instruction / privileged instructions
  - switch between user state and kernel state
- System call number and parameters
  - each syscall is numbered
- System call table
  - stores a function pointer to each syscall's service handler
In Linux, each system call is assigned a unique syscall number that is used to reference a specific system call.
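
A system call table is essentially an array of function pointers indexed by the syscall number. The following toy dispatcher is a user-space sketch, not kernel code: the `toy_` names are hypothetical, the handlers are placeholders, and `TOY_ENOSYS` only mirrors the kernel convention of returning a negative error number for an unknown call. The indices 1 and 4 are the 32-bit x86 Linux numbers for `exit` and `write`.

```c
#include <stddef.h>

#define TOY_ENOSYS (-38L)   /* illustrative "function not implemented" */

typedef long (*syscall_fn)(long a, long b, long c);

/* Placeholder handler: a real exit never returns. */
long toy_exit(long code, long b, long c)
{
    (void)b; (void)c;
    return code;
}

/* Placeholder handler: pretends every byte was written. */
long toy_write(long fd, long buf, long len)
{
    (void)fd; (void)buf;
    return len;
}

/* Slots that are not listed stay NULL. */
const syscall_fn toy_table[] = {
    [1] = toy_exit,
    [4] = toy_write,
};

long toy_dispatch(long nr, long a, long b, long c)
{
    size_t count = sizeof toy_table / sizeof toy_table[0];

    if (nr < 0 || (size_t)nr >= count || toy_table[nr] == NULL)
        return TOY_ENOSYS;          /* unknown system call number */
    return toy_table[nr](a, b, c);  /* jump through the table */
}
```

The bounds check matters: the number comes from user space, so the dispatcher must never index outside the table.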
Passing parameters from the user program to the kernel
- Trap instruction with arguments
  - limited number of parameters
- General purpose registers
  - can be accessed by both user and system
  - limited number of registers
  - usually sufficient on 64-bit systems
  - e.g. Nachos (MIPS) (r2 register)
- A dedicated stack or block area in memory
Returning from main also ends in a syscall: exit (No. 1)
Ctrl + C
Soft interrupt
send a signal --> .... -> ....
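
What a program sees when Ctrl + C is pressed can be reproduced without a terminal by raising `SIGINT` on itself. This sketch uses only the standard C `signal()` and `raise()` functions; `on_sigint` and `deliver_sigint_to_self` are hypothetical names.

```c
#include <signal.h>

/* Flag set by the handler; sig_atomic_t is safe to write from a handler. */
volatile sig_atomic_t got_sigint;

void on_sigint(int signo)
{
    (void)signo;
    got_sigint = 1;
}

/*
 * Install the handler, send SIGINT to the current process, and report
 * whether the handler ran. Returns 1 on success and -1 on failure.
 */
int deliver_sigint_to_self(void)
{
    got_sigint = 0;
    if (signal(SIGINT, on_sigint) == SIG_ERR)
        return -1;
    if (raise(SIGINT) != 0)
        return -1;
    return got_sigint;
}
```

Without a handler the default action for `SIGINT` terminates the process, which is why Ctrl + C normally stops a running program.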
When the CPU executes the special trap instruction
- Interrupt/Exception mechanism: the hardware protects the state
- Look up the IVT
- Pass control to the syscall entry function
- Invoke the entry function: preserve the state
- Save the parameters on the kernel stack
- Pass control to the syscall handler
- e.g. sysenter
- Execute the system call handler
- Restore the state and return to the user program
System call number
#define __NR_restart_syscall 0
#define __NR_exit 1
#define __NR_fork 2
#define __NR_read 3
#define __NR_write 4
#define __NR_open 5
#define __NR_close 6
#define __NR_waitpid 7
#define __NR_creat 8
#define __NR_link 9
#define __NR_unlink 10
#define __NR_execve 11
#define __NR_chdir 12
#define __NR_time 13
...
In 32-bit x86 Linux, all system calls use the same single entry point: int 0x80
#define __NR_init_module 128
- Change privilege => change stack
  - From the user stack to the kernel stack
  - The CPU loads a new stack pointer (SS:ESP), which points to the kernel stack, from the TSS (Task State Segment)
- Push EFLAGS onto the stack, clear TF (Trap Flag), leave IF unchanged
- Find the gate descriptor in the IDT by vector 0x80, and load its segment selector into the CS (Code Segment) register
- Add the "base address in the segment descriptor" and the "offset in the trap gate descriptor" to locate the entry address of the system call handler
- Privilege check
  - code can only access data of the same or lower privilege
- System call number and arguments/parameters
  - EAX: holds the system call number, and the return value after the system call is handled
    - e.g. Nachos (MIPS) r2 register
  - EBX, ECX, EDX, ESI, EDI: arguments
- (Hardware) Push the PC, etc. onto the stack
- (Hardware) Load the new PC from the interrupt vector
- (Assembly) Save the values of the registers
- (Assembly) Set up a new stack
- (C language) Execute the interrupt handler
- (CPU Scheduler) Decide which process runs next
- (C language) Return to the assembly code
- (Assembly) Start running the new current process
- User- and Kernel Mode, System Calls, I/O, Exceptions
- CPL vs. DPL vs. RPL
- What is x86, IA32, IA64?
- Stackoverflow - kernel stack and user space stack
- Chapter 8 Exceptional Control Flow
- Operating System | Introduction of System Call
- Slides - The Linux OS stack: Introduction to shell, system calls & kernel
- Shichao's Notes Chapter 5. System Calls
- Shichao's Notes Chapter 7. Interrupts and Interrupt Handlers
Operating System Concepts 9ed.
- Ch13 I/O Systems
- Ch13.2 I/O Hardware
- Ch13.2.1 Polling
- Ch13.2.2 Interrupts
- Ch13.2.3 Direct Memory Access
- Ch13.3 Application I/O Interface
- Ch13.3.4 Blocking and Non-blocking I/O