<!DOCTYPE HTML>
<html lang="" >
<head>
<meta charset="UTF-8">
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
<title>Lab4: Preemptive Multitasking · GitBook</title>
<meta http-equiv="X-UA-Compatible" content="IE=edge" />
<meta name="description" content="">
<meta name="generator" content="GitBook 3.2.3">
<link rel="stylesheet" href="gitbook/style.css">
<link rel="stylesheet" href="gitbook/gitbook-plugin-intopic-toc/style.css">
<link rel="stylesheet" href="gitbook/gitbook-plugin-search-pro/search.css">
<link rel="stylesheet" href="gitbook/gitbook-plugin-splitter/splitter.css">
<link rel="stylesheet" href="gitbook/gitbook-plugin-highlight/website.css">
<link rel="stylesheet" href="gitbook/gitbook-plugin-fontsettings/website.css">
<link rel="stylesheet" href="gitbook/gitbook-plugin-theme-comscore/test.css">
<meta name="HandheldFriendly" content="true"/>
<meta name="viewport" content="width=device-width, initial-scale=1, user-scalable=no">
<meta name="apple-mobile-web-app-capable" content="yes">
<meta name="apple-mobile-web-app-status-bar-style" content="black">
<link rel="apple-touch-icon-precomposed" sizes="152x152" href="gitbook/images/apple-touch-icon-precomposed-152.png">
<link rel="shortcut icon" href="gitbook/images/favicon.ico" type="image/x-icon">
<link rel="next" href="lab5-File system, Spawn and Shell.html" />
<link rel="prev" href="Lab3-User Environments.html" />
</head>
<body>
<div class="book">
<div class="book-summary">
<div id="book-search-input" role="search">
<input type="text" placeholder="Type to search" />
</div>
<nav role="navigation">
<ul class="summary">
<li class="chapter " data-level="1.1" data-path="./">
<a href="./">
Preface
</a>
</li>
<li class="chapter " data-level="1.2" data-path="introduction.html">
<a href="introduction.html">
introduction
</a>
</li>
<li class="chapter " data-level="1.3" data-path="preparation.html">
<a href="preparation.html">
Preparation
</a>
</li>
<li class="chapter " data-level="1.4" >
<span>
JOS
</span>
<ul class="articles">
<li class="chapter " data-level="1.4.1" data-path="lab1-Booting a PC.html">
<a href="lab1-Booting a PC.html">
Lab1: Booting a PC
</a>
</li>
<li class="chapter " data-level="1.4.2" data-path="lab2-Memory Management.html">
<a href="lab2-Memory Management.html">
Lab2: Memory Management
</a>
</li>
<li class="chapter " data-level="1.4.3" data-path="Lab3-User Environments.html">
<a href="Lab3-User Environments.html">
Lab3: User Environments
</a>
</li>
<li class="chapter active" data-level="1.4.4" data-path="lab4-Preemptive Multitasking.html">
<a href="lab4-Preemptive Multitasking.html">
Lab4: Preemptive Multitasking
</a>
</li>
<li class="chapter " data-level="1.4.5" data-path="lab5-File system, Spawn and Shell.html">
<a href="lab5-File system, Spawn and Shell.html">
Lab5: File system, Spawn and Shell
</a>
</li>
<li class="chapter " data-level="1.4.6" data-path="lab6-Network Driver.html">
<a href="lab6-Network Driver.html">
Lab6: Network Driver
</a>
</li>
</ul>
</li>
<li class="chapter " data-level="1.5" >
<span>
xv6
</span>
<ul class="articles">
<li class="chapter " data-level="1.5.1" data-path="hw-boot xv6.html">
<a href="hw-boot xv6.html">
boot xv6
</a>
</li>
<li class="chapter " data-level="1.5.2" data-path="hw-syscall.html">
<a href="hw-syscall.html">
syscall
</a>
</li>
<li class="chapter " data-level="1.5.3" data-path="hw-alarmtest.html">
<a href="hw-alarmtest.html">
alarmtest
</a>
</li>
</ul>
</li>
<li class="divider"></li>
<li>
<a href="https://www.gitbook.com" target="blank" class="gitbook-link">
Published with GitBook
</a>
</li>
</ul>
</nav>
</div>
<div class="book-body">
<div class="body-inner">
<div class="book-header" role="navigation">
<!-- Title -->
<h1>
<i class="fa fa-circle-o-notch fa-spin"></i>
<a href="." >Lab4: Preemptive Multitasking</a>
</h1>
</div>
<div class="page-wrapper" tabindex="-1" role="main">
<div class="page-inner">
<div id="book-search-results">
<div class="search-noresults">
<section class="normal markdown-section">
<h2 id="lab-4-preemptive-multitasking">Lab 4: Preemptive Multitasking</h2>
<p>In this lab we implement preemptive multitasking on a multiprocessor system.</p>
<p>A mind map of the whole lab:</p>
<p><img src="lab4-Preemptive Multitasking/arch.png" alt=""></p>
<h3 id="part-a-multiprocessor-support-and-cooperative-multitasking">Part A: Multiprocessor Support and Cooperative Multitasking</h3>
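<p>In Part A we extend JOS so that it runs on a multiprocessor system.</p>
<p>We then implement a few system calls that allow user-mode code to create new environments.</p>
<p>Next we implement <code>cooperative round-robin</code> scheduling: when an environment voluntarily gives up the CPU, another environment takes over. (From here on, "process" and "environment" are used interchangeably.)</p>
<p>Finally, in Part C, we implement preemptive scheduling, which lets the kernel take the CPU back after an environment has run for a certain amount of time, whether or not it has finished.</p>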
<h4 id="multiprocessor-support">Multiprocessor Support</h4>
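<p>During boot, the CPUs fall into two classes:</p>
<ul>
<li>the bootstrap processor (BSP): responsible for booting the kernel</li>
<li>the application processors (APs): started by the BSP</li>
</ul>
<p>Which core is the BSP is determined by the hardware and the BIOS.</p>
<p>In an SMP (symmetric multiprocessing) system, every CPU has a local APIC (LAPIC, advanced programmable interrupt controller) unit. The LAPIC units are responsible for delivering interrupts, and each unit has a unique identifier. We use the following LAPIC features:</p>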
<ul>
<li>Reading the LAPIC ID to tell which CPU our code is currently running on.</li>
</ul>
<pre><code class="lang-c"><span class="hljs-keyword">uint8_t</span> cpu_id; <span class="hljs-comment">// Local APIC ID; index into cpus[] below</span>
</code></pre>
<ul>
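<li>Sending the <code>STARTUP</code> interprocessor interrupt (IPI) from the BSP to the APs to bring up the other CPUs (see <code>lapic_startap()</code>).</li>
<li>In Part C, using the LAPIC's built-in timer to generate clock interrupts, which is what makes preemptive multitasking possible.</li>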
</ul>
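<p>A processor accesses its LAPIC through memory-mapped I/O (MMIO): a range of physical addresses is wired to device registers, so ordinary loads and stores to those addresses operate the hardware rather than RAM. We have already met one such region: the VGA display buffer, which starts at physical address 0xA0000 within the first 1MB of memory.</p>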
<h4 id="exercise-1">exercise 1</h4>
<blockquote>
<p>Implement <code>mmio_map_region</code> in <code>kern/pmap.c</code>.</p>
</blockquote>
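<p>The CPU configuration information we read at boot includes the physical address of the LAPIC unit (<code>lapicaddr</code>). Our job is to map virtual addresses starting at <code>MMIOBASE</code> to that physical range.</p>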
<pre><code class="lang-c"><span class="hljs-function"><span class="hljs-keyword">void</span> *
<span class="hljs-title">mmio_map_region</span><span class="hljs-params">(physaddr_t pa, size_t size)</span>
</span>{
<span class="hljs-comment">// Where to start the next region. Initially, this is the</span>
<span class="hljs-comment">// beginning of the MMIO region. Because this is static, its</span>
<span class="hljs-comment">// value will be preserved between calls to mmio_map_region</span>
<span class="hljs-comment">// (just like nextfree in boot_alloc).</span>
<span class="hljs-comment">// Making this static is the key: no global variable is needed.</span>
<span class="hljs-keyword">static</span> <span class="hljs-keyword">uintptr_t</span> base = MMIOBASE;
<span class="hljs-comment">// write-through: update both the cache and the backing store</span>
<span class="hljs-comment">// write-back: update only the cache; write to the backing store when necessary</span>
<span class="hljs-comment">// Your code here:</span>
<span class="hljs-keyword">if</span>(base + ROUNDUP(size, PGSIZE) > MMIOLIM)
panic(<span class="hljs-string">"mmio_map_region: mmio overflow."</span>);
boot_map_region(kern_pgdir, base, ROUNDUP(size, PGSIZE), pa, PTE_W|PTE_PCD|PTE_PWT);
<span class="hljs-keyword">uintptr_t</span> temp = base;
base += ROUNDUP(size, PGSIZE);
<span class="hljs-keyword">return</span> (<span class="hljs-keyword">void</span> *)temp;
}
</code></pre>
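<p>The MMIO region is a special range of addresses: device registers can be accessed there just like memory, but unlike ordinary memory the accesses must not be cached. Normal memory contents are cached by the CPU, whereas a device register can change on its own, so caching has to be turned off for these pages. Fortunately the page-table permission bits support this: set <code>PTE_PCD</code> (cache disable) and <code>PTE_PWT</code> (write-through) on the mapping.</p>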
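<p>The bookkeeping half of <code>mmio_map_region()</code> (round the size up to whole pages, check the limit, bump a static cursor) can be tried outside the kernel. The following is a minimal sketch, not the JOS code: <code>mmio_reserve()</code> and <code>round_up_page()</code> are hypothetical names, and the constants are assumed to match JOS's <code>inc/memlayout.h</code>.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values mirroring JOS's inc/memlayout.h; treat them as illustrative. */
#define PGSIZE   4096u
#define MMIOLIM  0xEFC00000u   /* assumed upper bound of the MMIO window */
#define MMIOBASE 0xEF800000u   /* assumed lower bound of the MMIO window */

/* Round n up to the next multiple of PGSIZE. */
static uint32_t round_up_page(uint32_t n)
{
    return (n + PGSIZE - 1) / PGSIZE * PGSIZE;
}

/*
 * Hypothetical stand-in for the bookkeeping in mmio_map_region():
 * hand out page-aligned virtual ranges from [MMIOBASE, MMIOLIM).
 * Returns the start of the reserved range, or 0 if it does not fit.
 */
static uint32_t mmio_reserve(size_t size)
{
    static uint32_t base = MMIOBASE;   /* preserved between calls */

    if (size > MMIOLIM - MMIOBASE)     /* can never fit; also avoids overflow below */
        return 0;

    uint32_t len = round_up_page((uint32_t)size);
    if (len > MMIOLIM - base)          /* overflow-safe form of base + len > MMIOLIM */
        return 0;

    uint32_t start = base;
    base += len;                       /* the next region starts after this one */
    return start;
}
```

<p>For example, calling <code>mmio_reserve(4096)</code> and then <code>mmio_reserve(1)</code> returns two consecutive pages, because the one-byte request is rounded up to a full page.</p>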
<h4 id="application-processor-bootstrap">Application Processor Bootstrap</h4>
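<p>Before booting the APs, we need to read some information from the BIOS region of memory, such as the number of CPUs, their APIC IDs, and the MMIO physical address of the LAPIC.</p>
<p>The function that drives AP startup is <code>boot_aps()</code>. APs start in real mode, much as the boot loader did, so <code>boot_aps()</code> copies the AP entry code to a location addressable in real mode: <code>0x7000</code> (<code>MPENTRY_PADDR</code>). In fact any unused location below 640KB would work.</p>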
<pre><code class="lang-c"><span class="hljs-comment">// Write entry code to unused memory at MPENTRY_PADDR</span>
code = KADDR(MPENTRY_PADDR);
memmove(code, mpentry_start, mpentry_end - mpentry_start);
</code></pre>
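<p>With the entry code in place, <code>boot_aps()</code> activates the APs one at a time: it sends each AP a <code>STARTUP</code> IPI together with the initial CS:IP address at which that AP should start executing.</p>
<p>The AP then runs initialization code much like the boot loader's and calls <code>mp_main()</code>, which sets up per-CPU state such as the GDT and TSS. The BSP waits for the AP to signal <code>CPU_STARTED</code> before it goes on to wake the next AP.</p>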
<h4 id="exercise-2">exercise 2</h4>
<blockquote>
<p>Read the relevant code, then modify <code>page_init()</code> in <code>kern/pmap.c</code> so that the page at <code>MPENTRY_PADDR</code> is kept off <code>page_free_list</code>.</p>
</blockquote>
<pre><code class="lang-c"><span class="hljs-keyword">for</span> (i = <span class="hljs-number">1</span>; i < npages_basemem; i++) {
<span class="hljs-keyword">if</span>(i == MPENTRY_PADDR/PGSIZE){
pages[i].pp_ref = <span class="hljs-number">1</span>;
<span class="hljs-keyword">continue</span>;
}
pages[i].pp_ref = <span class="hljs-number">0</span>;
pages[i].pp_link = page_free_list;
page_free_list = &pages[i];
}
</code></pre>
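<p>Q1: Compare <code>kern/mpentry.S</code> with <code>boot/boot.S</code>, bearing in mind that <code>kern/mpentry.S</code> is compiled and linked to run above KERNBASE like the rest of the kernel. What is the purpose of the <code>MPBOOTPHYS</code> macro in <code>kern/mpentry.S</code>? Why is it necessary there but not in the boot loader? What goes wrong if it is omitted?</p>
<p>A: <code>boot.S</code> is linked at the same low address it is loaded at, so its symbols can be used directly in real mode. <code>mpentry.S</code> is linked at high addresses above KERNBASE, but <code>boot_aps()</code> copies it to <code>MPENTRY_PADDR</code>, and the AP starts executing that copy in real mode with paging off. Its symbols therefore have to be converted to the physical addresses where the copy actually resides, which is what <code>MPBOOTPHYS</code> does. Without it, the AP would reference link addresses above KERNBASE that it cannot reach before paging is enabled.</p>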
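<p>The address translation behind <code>MPBOOTPHYS</code> is a single subtraction and addition. The sketch below reproduces that arithmetic in plain C so it can be checked by hand; <code>mpbootphys()</code> is a hypothetical helper, and the link addresses used in the example are made up for illustration.</p>

```c
#include <assert.h>
#include <stdint.h>

#define MPENTRY_PADDR 0x7000u   /* where boot_aps() copies the AP entry code */

/*
 * Same arithmetic as the MPBOOTPHYS macro: take a symbol's link address,
 * subtract the link address of mpentry_start to get its offset within the
 * entry code, then add the physical address the code was copied to.
 */
static uint32_t mpbootphys(uint32_t sym_link_addr, uint32_t mpentry_start_link_addr)
{
    assert(sym_link_addr >= mpentry_start_link_addr);
    return sym_link_addr - mpentry_start_link_addr + MPENTRY_PADDR;
}
```

<p>With a made-up link address of <code>0xF0105000</code> for <code>mpentry_start</code>, a symbol linked at <code>0xF0105074</code> ends up at physical address <code>0x7074</code>, which is reachable in real mode.</p>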
<h4 id="per-cpu-state-and-initialization">Per-CPU State and Initialization</h4>
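<p>When writing a multiprocessor OS, it is important to distinguish per-CPU state, which is private to each processor, from state that the whole system shares.</p>
<p>The per-CPU state we need to know about:</p>
<ul>
<li>per-CPU kernel stack</li>
</ul>
<p>Each CPU's kernel stack is stored in <code>percpu_kstacks[NCPU][KSTKSIZE]</code>. The stacks are mapped one after another, going downward in virtual address space from <code>KSTACKTOP</code>.</p>
<p>Consecutive CPU stacks are separated by an unmapped guard region.</p>
<ul>
<li>per-CPU TSS and TSS descriptor</li>
</ul>
<p>As noted earlier, the TSS tells the processor which kernel stack to switch to when it traps from user mode into the kernel. On a multiprocessor each CPU needs its own TSS to record where its own kernel stack lives; CPU i's TSS is stored in <code>cpus[i].cpu_ts</code>. The corresponding TSS descriptor lives in the GDT at <code>gdt[(GD_TSS0 >> 3) + i]</code>.</p>
<ul>
<li>per-CPU current environment</li>
</ul>
<p>Accessed as <code>cpus[cpunum()].cpu_env</code>.</p>
<ul>
<li>per-CPU system registers</li>
</ul>
<p>All registers, including system registers, are private to a CPU: each CPU has its own independent set that the others cannot access. The functions that initialize them, such as <code>lcr3()</code>, <code>ltr()</code>, <code>lgdt()</code>, and <code>lidt()</code>, must therefore be executed once on every CPU.</p>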
<h4 id="exercise-3">exercise 3</h4>
<blockquote>
<p>Modify <code>mem_init_mp()</code> in <code>kern/pmap.c</code> to map each CPU's kernel stack at its virtual address.</p>
</blockquote>
<pre><code class="lang-c"><span class="hljs-keyword">static</span> <span class="hljs-keyword">void</span>
mem_init_mp(<span class="hljs-keyword">void</span>)
{
<span class="hljs-comment">// Map per-CPU stacks starting at KSTACKTOP, for up to 'NCPU' CPUs.</span>
<span class="hljs-comment">//</span>
<span class="hljs-comment">// For CPU i, use the physical memory that 'percpu_kstacks[i]' refers</span>
<span class="hljs-comment">// to as its kernel stack. CPU i's kernel stack grows down from virtual</span>
<span class="hljs-comment">// address kstacktop_i = KSTACKTOP - i * (KSTKSIZE + KSTKGAP), and is</span>
<span class="hljs-comment">// divided into two pieces, just like the single stack you set up in</span>
<span class="hljs-comment">// mem_init:</span>
<span class="hljs-comment">// * [kstacktop_i - KSTKSIZE, kstacktop_i)</span>
<span class="hljs-comment">// -- backed by physical memory</span>
<span class="hljs-comment">// * [kstacktop_i - (KSTKSIZE + KSTKGAP), kstacktop_i - KSTKSIZE)</span>
<span class="hljs-comment">// -- not backed; so if the kernel overflows its stack,</span>
<span class="hljs-comment">// it will fault rather than overwrite another CPU's stack.</span>
<span class="hljs-comment">// Known as a "guard page".</span>
<span class="hljs-comment">// Permissions: kernel RW, user NONE</span>
<span class="hljs-comment">//</span>
<span class="hljs-comment">// LAB 4: Your code here:</span>
<span class="hljs-keyword">uint32_t</span> i=<span class="hljs-number">0</span>;
<span class="hljs-keyword">uintptr_t</span> start = KSTACKTOP-KSTKSIZE;
<span class="hljs-keyword">for</span>(; i<NCPU; i++){
boot_map_region(kern_pgdir, start, KSTKSIZE, PADDR(percpu_kstacks[i]), PTE_W);
start -= (KSTKSIZE+KSTKGAP);
}
}
</code></pre>
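<p><code>KSTKGAP</code> separates the CPUs' stacks: because the gap is not backed by physical memory, a stack overflow on one CPU faults instead of silently overwriting another CPU's stack.</p>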
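<p>To make the layout concrete, the sketch below computes the stack range and guard gap of each CPU from the formula <code>kstacktop_i = KSTACKTOP - i * (KSTKSIZE + KSTKGAP)</code> quoted in the code comment. The helper names are hypothetical, and the constant values are assumed to match JOS's <code>inc/memlayout.h</code>.</p>

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values mirroring JOS's inc/memlayout.h; treat them as illustrative. */
#define PGSIZE    4096u
#define KSTACKTOP 0xF0000000u
#define KSTKSIZE  (8 * PGSIZE)
#define KSTKGAP   (8 * PGSIZE)

/* Top (exclusive) of CPU i's kernel stack. */
static uint32_t kstack_top(uint32_t i)
{
    return KSTACKTOP - i * (KSTKSIZE + KSTKGAP);
}

/* Lowest mapped address of CPU i's kernel stack. */
static uint32_t kstack_bottom(uint32_t i)
{
    return kstack_top(i) - KSTKSIZE;
}

/* Returns 1 if va lies in the unmapped guard gap just below CPU i's stack. */
static int in_guard_gap(uint32_t i, uint32_t va)
{
    return va >= kstack_bottom(i) - KSTKGAP && va < kstack_bottom(i);
}
```

<p>Under these assumed values CPU 0's stack occupies <code>[0xEFFF8000, 0xF0000000)</code> and CPU 1's stack tops out at <code>0xEFFF0000</code>, so an overflow of CPU 0's stack lands in the gap rather than in CPU 1's stack.</p>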
<h4 id="exercise-4">exercise 4</h4>
<blockquote>
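<p>In lab 3 we used a single global <code>ts</code> for the one processor. Now each CPU keeps its own TSS, so <code>trap_init_percpu()</code> has to be updated accordingly.</p>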
</blockquote>
<pre><code class="lang-c"><span class="hljs-comment">// Initialize and load the per-CPU TSS and IDT</span>
<span class="hljs-function"><span class="hljs-keyword">void</span>
<span class="hljs-title">trap_init_percpu</span><span class="hljs-params">(<span class="hljs-keyword">void</span>)</span>
</span>{
<span class="hljs-comment">// Setup a TSS so that we get the right stack</span>
<span class="hljs-comment">// when we trap to the kernel.</span>
<span class="hljs-comment">//ts.ts_esp0 = KSTACKTOP;</span>
<span class="hljs-comment">//ts.ts_ss0 = GD_KD;</span>
<span class="hljs-comment">//ts.ts_iomb = sizeof(struct Taskstate);</span>
<span class="hljs-keyword">struct</span> Taskstate *thists = &thiscpu->cpu_ts;
thists->ts_esp0 = KSTACKTOP - thiscpu->cpu_id * (KSTKSIZE + KSTKGAP);
thists->ts_ss0 = GD_KD;
thists->ts_iomb = <span class="hljs-keyword">sizeof</span>(<span class="hljs-keyword">struct</span> Taskstate);
<span class="hljs-comment">// Initialize the TSS slot of the gdt.</span>
gdt[(GD_TSS0 >> <span class="hljs-number">3</span>) + thiscpu->cpu_id] = SEG16(STS_T32A, (<span class="hljs-keyword">uint32_t</span>) (thists),
<span class="hljs-keyword">sizeof</span>(<span class="hljs-keyword">struct</span> Taskstate) - <span class="hljs-number">1</span>, <span class="hljs-number">0</span>);
gdt[(GD_TSS0 >> <span class="hljs-number">3</span>) + thiscpu->cpu_id].sd_s = <span class="hljs-number">0</span>;
<span class="hljs-comment">// Load the TSS selector (like other segment selectors, the</span>
<span class="hljs-comment">// bottom three bits are special; we leave them 0)</span>
ltr(GD_TSS0+ (thiscpu->cpu_id << <span class="hljs-number">3</span>));
<span class="hljs-comment">// Load the IDT</span>
lidt(&idt_pd);
}
</code></pre>
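<p>The two index expressions in <code>trap_init_percpu()</code>, <code>(GD_TSS0 >> 3) + cpu_id</code> for the GDT slot and <code>GD_TSS0 + (cpu_id << 3)</code> for the selector, describe the same descriptor. A small sketch makes the relationship explicit; the helper names are hypothetical and the value of <code>GD_TSS0</code> is assumed to be the one from JOS's <code>inc/memlayout.h</code>.</p>

```c
#include <assert.h>
#include <stdint.h>

#define GD_TSS0 0x28u   /* assumed: selector of CPU 0's TSS descriptor */

/* Index of CPU cpu_id's TSS descriptor within gdt[]. */
static uint32_t tss_gdt_index(uint32_t cpu_id)
{
    return (GD_TSS0 >> 3) + cpu_id;
}

/*
 * Selector to load with ltr() for CPU cpu_id. A selector is the GDT index
 * shifted left by 3; the bottom three bits (RPL and table indicator) stay 0.
 */
static uint32_t tss_selector(uint32_t cpu_id)
{
    uint32_t sel = GD_TSS0 + (cpu_id << 3);
    assert((sel & 7u) == 0);
    return sel;
}
```

<p>For any CPU, <code>tss_selector(i) >> 3</code> equals <code>tss_gdt_index(i)</code>, which is why the code can fill in one GDT slot and then load the matching selector.</p>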
<h4 id="locking">Locking</h4>
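<p>Before letting the APs go any further, we have to deal with the race conditions that arise when several CPUs run kernel code at the same time.</p>
<p>For now we use one big kernel lock: a CPU takes the lock whenever it enters the kernel and releases it when it returns to user mode, so only one CPU runs kernel code at a time.</p>
<p>With this design, user-mode environments can run concurrently on all CPUs, but at most one of them can be in the kernel; any other CPU that tries to enter the kernel has to wait.</p>
<p>The kernel lock is used at the following four places:</p>
<ul>
<li><code>i386_init()</code>: acquire the lock before the BSP wakes up the APs.</li>
<li><code>mp_main()</code>: acquire the lock after initializing the AP, then call <code>sched_yield()</code> to start running environments on it.</li>
<li><code>trap()</code>: acquire the lock when trapping from user mode into the kernel.</li>
<li><code>env_run()</code>: release the lock right before <code>env_pop_tf()</code>.</li>
</ul>
<p>Note that the lock is released at only one of these places: just before the switch from kernel mode to user mode.</p>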
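<p>The acquire and release discipline of a big lock can be illustrated with a tiny test-and-set spinlock. This is a sketch built on C11 <code>atomic_flag</code>, not JOS's <code>kern/spinlock.c</code>; the names <code>big_lock_try()</code>, <code>big_lock_acquire()</code>, and <code>big_lock_release()</code> are hypothetical stand-ins for what <code>lock_kernel()</code> and <code>unlock_kernel()</code> do conceptually.</p>

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel lock; not the real struct spinlock. */
static atomic_flag big_lock = ATOMIC_FLAG_INIT;

/* Make one attempt to take the lock; returns true on success. */
static bool big_lock_try(void)
{
    /* test_and_set returns the previous value: false means the lock was free. */
    return !atomic_flag_test_and_set_explicit(&big_lock, memory_order_acquire);
}

/* Spin until the lock is ours. */
static void big_lock_acquire(void)
{
    while (!big_lock_try())
        ;   /* busy-wait: another CPU is in the kernel */
}

/* Release the lock so that a waiting CPU can enter the kernel. */
static void big_lock_release(void)
{
    atomic_flag_clear_explicit(&big_lock, memory_order_release);
}
```

<p>While one holder has the lock, every further <code>big_lock_try()</code> fails; this is the state a second CPU sees when it traps into the kernel and has to spin.</p>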
<h4 id="exercise-5">exercise 5</h4>
<blockquote>
<p>Call <code>lock_kernel()</code> and <code>unlock_kernel()</code> at the appropriate places.</p>
</blockquote>
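<p>At this point there is no way yet to test that the locking is correct.</p>
<p>Places where the lock is acquired:</p>
<ol>
<li>When one CPU is about to start the other CPUs:</li>
</ol>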
<pre><code class="lang-c"><span class="hljs-comment">// Lab 4 multiprocessor initialization functions</span>
mp_init();
lapic_init();
<span class="hljs-comment">// Lab 4 multitasking initialization functions</span>
pic_init();
<span class="hljs-comment">// Acquire the big kernel lock before waking up APs</span>
<span class="hljs-comment">// Your code here:</span>
lock_kernel();
<span class="hljs-comment">// Starting non-boot CPUs; the entry point is in mpentry.S</span>
boot_aps();
</code></pre>
<ol start="2">
<li>When trapping from user mode into the kernel:</li>
</ol>
<pre><code class="lang-c"><span class="hljs-keyword">if</span> ((tf->tf_cs & <span class="hljs-number">3</span>) == <span class="hljs-number">3</span>) {
<span class="hljs-comment">// Trapped from user mode.</span>
<span class="hljs-comment">// Acquire the big kernel lock before doing any</span>
<span class="hljs-comment">// serious kernel work.</span>
<span class="hljs-comment">// LAB 4: Your code here.</span>
lock_kernel();
assert(curenv);
}
</code></pre>
<p>The one place where the lock is released:</p>
<ol>
<li>When returning from the kernel to user mode:</li>
</ol>
<pre><code class="lang-c"><span class="hljs-function"><span class="hljs-keyword">void</span>
<span class="hljs-title">env_run</span><span class="hljs-params">(<span class="hljs-keyword">struct</span> Env *e)</span>
</span>{
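<span class="hljs-comment">// Context switch: demote the previously running environment (if any),</span>
<span class="hljs-comment">// then make e the current environment on this CPU.</span>
<span class="hljs-keyword">if</span> (curenv && curenv->env_status == ENV_RUNNING)
curenv->env_status = ENV_RUNNABLE;
curenv = e;
curenv->env_status = ENV_RUNNING;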
++curenv->env_runs;
lcr3(PADDR(curenv->env_pgdir));
unlock_kernel();
env_pop_tf(&curenv->env_tf);
}
</code></pre>
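<p>Q2: With the big kernel lock, only one CPU can run kernel code at a time. Why do we still need a separate kernel stack for each CPU? What goes wrong if they share one?</p>
<p>A: In lab 3 every trap from user mode starts at the very top of the kernel stack (KSTACKTOP), so each entry sees what looks like a fresh, unused stack, and sharing one seems feasible. The catch is that the stack is used before the lock is held: on a trap, the trap frame is pushed onto the kernel stack first, and only later does <code>trap()</code> call <code>lock_kernel()</code>. If two CPUs trap at about the same time on a shared stack, both write their frames to the same addresses and corrupt each other's saved state. A CPU waiting for the lock also still keeps its trap frame and system call arguments on the stack, so each CPU needs a stack of its own.</p>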
<h4 id="round-robin-scheduling">Round-Robin Scheduling</h4>
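<p>Round-robin scheduling runs the environments in a fixed circular order, giving each an equal turn.</p>
<p>It works as follows:</p>
<ul>
<li><code>sched_yield()</code> picks the environment to run next: it walks envs[] in circular fashion and runs the first environment whose status is <code>ENV_RUNNABLE</code> on the current CPU.</li>
<li><code>sched_yield()</code> must never run the same environment on two CPUs at once.</li>
<li>A new system call, <code>sys_yield()</code>, is already provided; a user environment calls it to invoke <code>sched_yield()</code> and voluntarily hand the CPU to another environment.</li>
</ul>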
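<p>The selection rule described in the list above can be modelled without any kernel state. The sketch below is a simplified illustration, not the JOS scheduler: <code>rr_pick()</code> is a hypothetical helper that only returns an index, the table has 8 slots instead of <code>NENV</code>, and the status enum is a local copy of the names used in this lab.</p>

```c
#include <assert.h>

#define NENV 8   /* small table for illustration only */

enum env_status { ENV_FREE, ENV_RUNNABLE, ENV_RUNNING, ENV_NOT_RUNNABLE };

/*
 * Hypothetical helper: return the index of the environment to run next,
 * or -1 if this CPU should halt. cur is the index of the environment this
 * CPU was last running, or -1 if there was none.
 */
static int rr_pick(const enum env_status st[NENV], int cur)
{
    int start = (cur < 0) ? 0 : (cur + 1) % NENV;

    /* Visit every slot once, beginning just after cur and wrapping around. */
    for (int k = 0; k < NENV; k++) {
        int i = (start + k) % NENV;
        if (st[i] == ENV_RUNNABLE)
            return i;
    }

    /* Nothing else is runnable: keep the current one if it is still running here. */
    if (cur >= 0 && st[cur] == ENV_RUNNING)
        return cur;

    return -1;
}
```

<p>Environments that are <code>ENV_RUNNING</code> on another CPU are skipped automatically, because only <code>ENV_RUNNABLE</code> slots are ever selected in the loop.</p>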
<h4 id="exercise-6">exercise 6</h4>
<blockquote>
<p>Complete sched_yield() to implement round-robin scheduling.</p>
</blockquote>
<p>First, the scheduler itself:</p>
<pre><code class="lang-c"><span class="hljs-comment">// Choose a user environment to run and run it.</span>
<span class="hljs-function"><span class="hljs-keyword">void</span>
<span class="hljs-title">sched_yield</span><span class="hljs-params">(<span class="hljs-keyword">void</span>)</span>
</span>{
<span class="hljs-keyword">struct</span> Env *idle;
<span class="hljs-comment">// Implement simple round-robin scheduling.</span>
<span class="hljs-comment">//</span>
<span class="hljs-comment">// Search through 'envs' for an ENV_RUNNABLE environment in</span>
<span class="hljs-comment">// circular fashion starting just after the env this CPU was</span>
<span class="hljs-comment">// last running. Switch to the first such environment found.</span>
<span class="hljs-comment">//</span>
<span class="hljs-comment">// If no envs are runnable, but the environment previously</span>
<span class="hljs-comment">// running on this CPU is still ENV_RUNNING, it's okay to</span>
<span class="hljs-comment">// choose that environment.</span>
<span class="hljs-comment">//</span>
<span class="hljs-comment">// Never choose an environment that's currently running on</span>
<span class="hljs-comment">// another CPU (env_status == ENV_RUNNING). If there are</span>
<span class="hljs-comment">// no runnable environments, simply drop through to the code</span>
<span class="hljs-comment">// below to halt the cpu.</span>
<span class="hljs-comment">// LAB 4: Your code here.</span>
<span class="hljs-keyword">int</span> counter;
<span class="hljs-keyword">if</span> (curenv) {
<span class="hljs-comment">// Wrap the start index so it never reaches envs[NENV].</span>
<span class="hljs-keyword">for</span> (counter = (ENVX(curenv->env_id) + <span class="hljs-number">1</span>) % NENV;
counter != ENVX(curenv->env_id);
counter = (counter + <span class="hljs-number">1</span>) % NENV){
<span class="hljs-comment">//cprintf("%d\n", counter);</span>
<span class="hljs-keyword">if</span> (envs[counter].env_status == ENV_RUNNABLE){
env_run(envs + counter);
}
}
<span class="hljs-comment">// Fall back to the current environment only if it is still running here.</span>
<span class="hljs-keyword">if</span>(curenv->env_status == ENV_RUNNING)
env_run(curenv);
<span class="hljs-comment">//cprintf("%d\n", counter);</span>
} <span class="hljs-keyword">else</span> {
<span class="hljs-keyword">for</span> (counter = <span class="hljs-number">0</span>; counter < NENV; ++counter)
<span class="hljs-keyword">if</span> (envs[counter].env_status == ENV_RUNNABLE)
env_run(envs + counter);
}
<span class="hljs-comment">// sched_halt never returns</span>
sched_halt();
}
</code></pre>
<p>Then add a case to syscall() that dispatches the sys_yield system call:</p>
<pre><code class="lang-c"><span class="hljs-keyword">case</span> SYS_yield:
sys_yield();
<span class="hljs-keyword">return</span> <span class="hljs-number">0</span>;
</code></pre>
<p>Next, add the following code to i386_init():</p>
<pre><code class="lang-c">ENV_CREATE(user_yield, ENV_TYPE_USER);
ENV_CREATE(user_yield, ENV_TYPE_USER);
ENV_CREATE(user_yield, ENV_TYPE_USER);
</code></pre>
<p>Running <code>make qemu CPUS=2</code> should then produce output like the following:</p>
<pre><code>Hello, I am environment 00001000.
Hello, I am environment 00001001.
Hello, I am environment 00001002.
Back in environment 00001000, iteration 0.
Back in environment 00001001, iteration 0.
Back in environment 00001002, iteration 0.
Back in environment 00001000, iteration 1.
Back in environment 00001001, iteration 1.
Back in environment 00001002, iteration 1.
...
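</code></pre><p>Q3: <code>env_run()</code> calls <code>lcr3()</code> to load a new page directory, which changes the addressing context used by the MMU immediately. Why does the virtual address of the environment about to be run still map to the same physical address before and after the switch?</p>
<p>A: Because every environment's page directory is initialized as a copy of the kernel's page directory, so kernel addresses are mapped identically in all of them:</p>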
<pre><code class="lang-c">e->env_pgdir = (<span class="hljs-keyword">pde_t</span> *)page2kva(p);
<span class="hljs-built_in">memcpy</span>(e->env_pgdir, kern_pgdir, PGSIZE);
</code></pre>
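<p>Only the entry for the virtual address UVPT differs; see <code>env_setup_vm()</code>, reached via <code>env_create()</code> → <code>env_alloc()</code>.</p>
<p>The way environments are initialized in lab 3 already answers this question.</p>
<p>Q4: Whenever the kernel switches from one environment to another, it must save the old environment's registers so that the environment can be resumed correctly later. Where does this happen?</p>
<p>A: They are saved in env->env_tf, a Trapframe structure. The saving happens when _alltraps builds the Trapframe, as discussed in detail in lab 3, and <code>trap()</code> then copies that frame into the environment's <code>env_tf</code>. The restore happens in <code>env_pop_tf</code> in kern/env.c.</p>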
<h4 id="system-calls-for-environment-creation">System Calls for Environment Creation</h4>
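<p>The kernel can now switch between environments, but the set of environments is still fixed when the system is initialized. The system calls below let user environments create new ones.</p>
<p>Unix creates processes with fork(), which copies the entire address space of the calling (parent) process to create a child process. The only difference between parent and child is their process IDs. In the parent, fork returns the child's ID; in the child, fork returns 0 (<code>environment->env_tf.tf_regs.reg_eax = 0;</code>).</p>
<p>In JOS we implement the following system calls instead:</p>
<ul>
<li><code>sys_exofork</code>: </li>
</ul>
<p>This system call creates a new environment with an almost blank address space.</p>
<p>The calling parent gets the child's envid as the return value, while the child gets 0.</p>
<ul>
<li><code>sys_env_set_status</code>: once the new environment's pages have been set up, changes its status from <code>ENV_NOT_RUNNABLE</code> to <code>ENV_RUNNABLE</code>.</li>
</ul>
<ul>
<li><p><code>sys_page_alloc</code>: allocates a page of physical memory and maps it at the given virtual address.</p>
</li>
<li><p><code>sys_page_map</code>: </p>
</li>
</ul>
<p>This copies a page mapping, not the page contents, from one environment's address space to another, so that both page directories map the same physical page. It is effectively a form of memory sharing.</p>
<ul>
<li><code>sys_page_unmap</code>: </li>
</ul>
<p>The reverse of the above: it removes the mapping at a given virtual address.</p>
<p>All of these system calls take environment IDs as arguments. JOS converts an ID to its environment with <code>envid2env()</code>.</p>
<p>Note in particular that an envid of 0 passed to <code>envid2env()</code> yields a pointer to the current environment.</p>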
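<p>The lookup rules can be summarised in a small model. The sketch below is a simplified illustration and not the code from <code>kern/env.c</code>: <code>lookup_env()</code>, the <code>struct env</code> fields, and the value of <code>E_BAD_ENV</code> are assumptions made for the example, and the table size of 1024 is assumed as well.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NENV        1024                     /* assumed table size */
#define ENVX(envid) ((envid) & (NENV - 1))   /* low bits of an envid index the table */
#define E_BAD_ENV   2                        /* assumed error number */

/* Hypothetical, stripped-down environment record. */
struct env { int32_t id; int32_t parent_id; int in_use; };

static struct env envs[NENV];
static struct env *curenv;

/* Simplified model of envid2env(): returns 0 on success, -E_BAD_ENV otherwise. */
static int lookup_env(int32_t envid, struct env **out, int checkperm)
{
    if (envid == 0) {                        /* 0 always means the caller itself */
        *out = curenv;
        return 0;
    }

    struct env *e = &envs[ENVX(envid)];
    if (!e->in_use || e->id != envid) {      /* unused slot or stale id */
        *out = NULL;
        return -E_BAD_ENV;
    }

    /* With checkperm set, only the caller or its direct children are allowed. */
    if (checkperm && e != curenv && e->parent_id != curenv->id) {
        *out = NULL;
        return -E_BAD_ENV;
    }

    *out = e;
    return 0;
}
```

<p>With <code>checkperm</code> set to 1, an environment can manipulate itself and its children but not an unrelated environment, which is the property the system calls in this part rely on.</p>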
<p>JOS already provides a crude <code>fork()</code>-like implementation, <code>dumbfork()</code>; it is crude because it has no copy-on-write mechanism. We need to implement the system calls above to make dumbfork() work.</p>
<h4 id="exercise-7">exercise 7</h4>
<blockquote>
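<p>Implement the system calls described above.</p>
<p>When calling envid2env(), pass checkperm=1 so that the relationship between the caller and the target environment is always checked.</p>
<p>Validate the arguments of every system call; if an argument is invalid or out of range, return <code>-E_INVAL</code>.</p>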
</blockquote>
<p>First, <code>sys_exofork()</code>:</p>
<pre><code class="lang-c"><span class="hljs-comment">// Allocate a new environment.</span>
<span class="hljs-comment">// Returns envid of new environment, or < 0 on error. Errors are:</span>
<span class="hljs-comment">// -E_NO_FREE_ENV if no free environment is available.</span>
<span class="hljs-comment">// -E_NO_MEM on memory exhaustion.</span>
<span class="hljs-function"><span class="hljs-keyword">static</span> envid_t
<span class="hljs-title">sys_exofork</span><span class="hljs-params">(<span class="hljs-keyword">void</span>)</span>
</span>{
<span class="hljs-comment">// Create the new environment with env_alloc(), from kern/env.c.</span>
<span class="hljs-comment">// It should be left as env_alloc created it, except that</span>
<span class="hljs-comment">// status is set to ENV_NOT_RUNNABLE, and the register set is copied</span>
<span class="hljs-comment">// from the current environment -- but tweaked so sys_exofork</span>
<span class="hljs-comment">// will appear to return 0.</span>
<span class="hljs-comment">// LAB 4: Your code here.</span>
<span class="hljs-keyword">struct</span> Env *environment;
<span class="hljs-keyword">int</span> res;
<span class="hljs-keyword">if</span>((res = env_alloc(&environment, curenv->env_id)) < <span class="hljs-number">0</span>)
<span class="hljs-keyword">return</span> res;
environment->env_status = ENV_NOT_RUNNABLE;
environment->env_tf = curenv->env_tf;
environment->env_tf.tf_regs.reg_eax = <span class="hljs-number">0</span>;<span class="hljs-comment">// the child sees sys_exofork() return 0</span>
<span class="hljs-keyword">return</span> environment->env_id;
<span class="hljs-comment">//panic("sys_exofork not implemented");</span>
}
</code></pre>
<p>Next, <code>sys_page_alloc()</code>:</p>
<pre><code class="lang-c"><span class="hljs-function"><span class="hljs-keyword">static</span> <span class="hljs-keyword">int</span>
<span class="hljs-title">sys_page_alloc</span><span class="hljs-params">(envid_t envid, <span class="hljs-keyword">void</span> *va, <span class="hljs-keyword">int</span> perm)</span>
</span>{
<span class="hljs-comment">// Hint: This function is a wrapper around page_alloc() and</span>
<span class="hljs-comment">// page_insert() from kern/pmap.c.</span>
<span class="hljs-comment">// Most of the new code you write should be to check the</span>
<span class="hljs-comment">// parameters for correctness.</span>
<span class="hljs-comment">// If page_insert() fails, remember to free the page you</span>
<span class="hljs-comment">// allocated!</span>
<span class="hljs-comment">// LAB 4: Your code here.</span>
<span class="hljs-comment">// PTE_U and PTE_P are required; only PTE_SYSCALL bits are allowed.</span>
<span class="hljs-keyword">if</span>(!(perm & PTE_U) || !(perm & PTE_P) ||
(perm & (~PTE_SYSCALL)) ||
va >= (<span class="hljs-keyword">void</span> *)UTOP ||
va != ROUNDDOWN(va, PGSIZE))
<span class="hljs-keyword">return</span> -E_INVAL;
<span class="hljs-comment">// Look up the environment first so a bad envid cannot leak a page.</span>
<span class="hljs-keyword">struct</span> Env *environment;
<span class="hljs-keyword">if</span>(envid2env(envid, &environment, <span class="hljs-number">1</span>) < <span class="hljs-number">0</span>)
<span class="hljs-keyword">return</span> -E_BAD_ENV;
<span class="hljs-keyword">struct</span> PageInfo *pginfo = page_alloc(ALLOC_ZERO);
<span class="hljs-keyword">if</span>(!pginfo)
<span class="hljs-keyword">return</span> -E_NO_MEM;
<span class="hljs-keyword">if</span>(page_insert(environment->env_pgdir, pginfo, va, perm) < <span class="hljs-number">0</span>){
page_free(pginfo);
<span class="hljs-keyword">return</span> -E_NO_MEM;
}
<span class="hljs-keyword">return</span> <span class="hljs-number">0</span>;
<span class="hljs-comment">//panic("sys_page_alloc not implemented");</span>
}
</code></pre>
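<p>The argument checks in <code>sys_page_alloc()</code> are easy to get subtly wrong, so it helps to exercise them in isolation. The following sketch assumes the flag values from JOS's <code>inc/mmu.h</code> and the value of <code>UTOP</code> from <code>inc/memlayout.h</code>; <code>perm_ok()</code> and <code>user_va_ok()</code> are hypothetical helper names.</p>

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed x86 page-table flag values, mirroring JOS's inc/mmu.h. */
#define PTE_P       0x001u   /* present */
#define PTE_W       0x002u   /* writable */
#define PTE_U       0x004u   /* user-accessible */
#define PTE_AVAIL   0xE00u   /* bits available for software use */
#define PTE_SYSCALL (PTE_AVAIL | PTE_P | PTE_W | PTE_U)

#define PGSIZE 4096u
#define UTOP   0xEEC00000u   /* assumed top of the user-accessible address space */

/* PTE_U and PTE_P must be set; no bit outside PTE_SYSCALL may be set. */
static bool perm_ok(uint32_t perm)
{
    if (!(perm & PTE_U) || !(perm & PTE_P))
        return false;
    return (perm & ~PTE_SYSCALL) == 0;
}

/* The address must lie below UTOP and be page-aligned. */
static bool user_va_ok(uint32_t va)
{
    return va < UTOP && va % PGSIZE == 0;
}
```

<p>Note that <code>UTOP</code> itself is rejected (the comparison is strict), and that a cache-control bit such as <code>0x010</code> is refused because it lies outside <code>PTE_SYSCALL</code>.</p>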
<p>Implementation of <code>sys_page_map</code>:</p>
<pre><code class="lang-c"><span class="hljs-function"><span class="hljs-keyword">static</span> <span class="hljs-keyword">int</span>
<span class="hljs-title">sys_page_map</span><span class="hljs-params">(envid_t srcenvid, <span class="hljs-keyword">void</span> *srcva,
envid_t dstenvid, <span class="hljs-keyword">void</span> *dstva, <span class="hljs-keyword">int</span> perm)</span>
</span>{
<span class="hljs-comment">// Hint: This function is a wrapper around page_lookup() and</span>
<span class="hljs-comment">// page_insert() from kern/pmap.c.</span>
<span class="hljs-comment">// Again, most of the new code you write should be to check the</span>
<span class="hljs-comment">// parameters for correctness.</span>
<span class="hljs-comment">// Use the third argument to page_lookup() to</span>
<span class="hljs-comment">// check the current permissions on the page.</span>
<span class="hljs-comment">// LAB 4: Your code here.</span>
<span class="hljs-keyword">if</span>((<span class="hljs-keyword">uint32_t</span>)srcva >= UTOP || PGOFF(srcva) ||
(<span class="hljs-keyword">uint32_t</span>)dstva >= UTOP || PGOFF(dstva) ||
!(perm & PTE_U) || !(perm & PTE_P) ||
(perm & (~PTE_SYSCALL)) )
<span class="hljs-keyword">return</span> -E_INVAL;
<span class="hljs-keyword">struct</span> Env *src_environment, *dst_environment;
<span class="hljs-keyword">if</span>(envid2env(srcenvid, &src_environment, <span class="hljs-number">1</span>) < <span class="hljs-number">0</span> ||
envid2env(dstenvid, &dst_environment, <span class="hljs-number">1</span>) < <span class="hljs-number">0</span>)
<span class="hljs-keyword">return</span> -E_BAD_ENV;
<span class="hljs-keyword">pte_t</span> *pte;
<span class="hljs-keyword">struct</span> PageInfo *page = page_lookup(src_environment->env_pgdir, srcva, &pte);
<span class="hljs-keyword">if</span>(!page || (!(*pte & PTE_W) && (perm & PTE_W)))
<span class="hljs-keyword">return</span> -E_INVAL;
<span class="hljs-keyword">if</span>(page_insert(dst_environment->env_pgdir, page, dstva, perm) < <span class="hljs-number">0</span>)
<span class="hljs-keyword">return</span> -E_NO_MEM;
<span class="hljs-keyword">return</span> <span class="hljs-number">0</span>;
<span class="hljs-comment">//panic("sys_page_map not implemented");</span>
}
</code></pre>
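<p>Most of <code>sys_page_map</code> is argument checking, so it can help to look at the checks in isolation. The following is a minimal sketch that runs outside the kernel. The helper names (<code>user_va_ok</code>, <code>perm_ok</code>, <code>write_escalation</code>) are hypothetical, and the constant values are assumed to match JOS's <code>inc/memlayout.h</code> and <code>inc/mmu.h</code>. The permission check follows the requirement in the JOS source comments that <code>PTE_U | PTE_P</code> must be set.</p>

```c
#include <assert.h>
#include <stdint.h>

#define PGSIZE      4096u
#define UTOP        0xeec00000u   /* assumed value, from JOS inc/memlayout.h */
#define PTE_P       0x001u        /* present */
#define PTE_W       0x002u        /* writable */
#define PTE_U       0x004u        /* user accessible */
#define PTE_AVAIL   0xE00u        /* bits available for software use */
#define PTE_SYSCALL (PTE_AVAIL | PTE_P | PTE_W | PTE_U)

static_assert((PTE_SYSCALL & ~0xFFFu) == 0,
              "permission bits live in the low 12 bits of a PTE");

/* Nonzero if va is a page-aligned user address below UTOP. */
int user_va_ok(uint32_t va)
{
    return va < UTOP && (va % PGSIZE) == 0;
}

/* Nonzero if perm sets PTE_U and PTE_P and nothing outside PTE_SYSCALL. */
int perm_ok(uint32_t perm)
{
    return (perm & (PTE_U | PTE_P)) == (PTE_U | PTE_P)
        && (perm & ~PTE_SYSCALL) == 0;
}

/* Nonzero if a writable mapping is requested for a page that the
 * source maps read-only, which sys_page_map must reject. */
int write_escalation(uint32_t src_pte, uint32_t perm)
{
    return (perm & PTE_W) && !(src_pte & PTE_W);
}
```

<p>In the real system call, any failed check of this kind maps to <code>-E_INVAL</code>.</p>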
<p><code>sys_page_unmap</code>:</p>
<pre><code class="lang-c"><span class="hljs-function"><span class="hljs-keyword">static</span> <span class="hljs-keyword">int</span>
<span class="hljs-title">sys_page_unmap</span><span class="hljs-params">(envid_t envid, <span class="hljs-keyword">void</span> *va)</span>
</span>{
<span class="hljs-comment">// Hint: This function is a wrapper around page_remove().</span>
<span class="hljs-comment">// LAB 4: Your code here.</span>
<span class="hljs-keyword">if</span>((<span class="hljs-keyword">uint32_t</span>)va >= UTOP || PGOFF(va))
<span class="hljs-keyword">return</span> -E_INVAL;
<span class="hljs-keyword">struct</span> Env *environment;
<span class="hljs-keyword">if</span>(envid2env(envid, &environment, <span class="hljs-number">1</span>) < <span class="hljs-number">0</span>)
<span class="hljs-keyword">return</span> -E_BAD_ENV;
page_remove(environment->env_pgdir, va);
<span class="hljs-keyword">return</span> <span class="hljs-number">0</span>;
<span class="hljs-comment">//panic("sys_page_unmap not implemented");</span>
}
</code></pre>
<p>Finally, add dispatch cases for these system calls:</p>
<pre><code class="lang-c"> <span class="hljs-keyword">case</span> SYS_exofork:
<span class="hljs-keyword">return</span> sys_exofork();
<span class="hljs-keyword">case</span> SYS_env_set_status:
<span class="hljs-keyword">return</span> sys_env_set_status(a1, a2);
<span class="hljs-keyword">case</span> SYS_page_alloc:
<span class="hljs-keyword">return</span> sys_page_alloc(a1, (<span class="hljs-keyword">void</span> *) a2, a3);
<span class="hljs-keyword">case</span> SYS_page_map:
<span class="hljs-keyword">return</span> sys_page_map(a1, (<span class="hljs-keyword">void</span> *) a2, a3, (<span class="hljs-keyword">void</span> *) a4, a5);
<span class="hljs-keyword">case</span> SYS_page_unmap:
<span class="hljs-keyword">return</span> sys_page_unmap(a1, (<span class="hljs-keyword">void</span> *) a2);
</code></pre>
<h3 id="part-b-copy-on-write-fork">Part B: Copy-on-Write Fork</h3>
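<p>Unix provides fork() as the primitive for creating processes from user mode.</p>
<p>xv6 implements fork() by copying all of the parent's pages into newly allocated pages for the child. This copying is the most expensive part of the whole fork() operation.</p>
<p>However, a call to fork() is often followed almost immediately by exec() in the child, which replaces the child's memory with a new program. The child uses very little of the copied memory before that happens, so copying every page is largely wasted work.</p>
<p>To address this, later versions of Unix used the virtual memory hardware to let the parent and child share the same memory until one of them modifies a page, at which point the sharing of that page ends. This technique is called copy-on-write.</p>
<p>To implement it, fork() has the kernel copy only the parent's address space mappings (the page directory and page tables) into the child, not the contents of the mapped pages, and marks the shared pages <code>read-only</code>. When either process tries to write to one of these pages, a <code>page fault</code> occurs, and a new, private, writable copy of the page is allocated for the faulting process.</p>
<p>With this scheme a fork() followed by exec() becomes much cheaper: the child typically needs to copy only one page (4096 bytes), the current page of its stack, before it calls exec().</p>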
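<p>The sharing and the copy on first write can be modelled in ordinary user-space C. This is only a sketch of the idea, not JOS code: <code>struct Page</code>, <code>struct Mapping</code>, <code>cow_share</code> and <code>cow_write</code> are hypothetical names, and a plain function call stands in for the page fault.</p>

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define PGSIZE 4096

/* Hypothetical model of one physical page with a reference count. */
struct Page {
    int refcount;
    unsigned char data[PGSIZE];
};

/* Hypothetical model of one virtual mapping in an address space. */
struct Mapping {
    struct Page *page;
    int writable;   /* hardware write permission */
    int cow;        /* software copy-on-write marker */
};

/* Share the parent's page with the child instead of copying it.
 * A writable mapping is downgraded to read-only + COW on both sides. */
void cow_share(struct Mapping *parent, struct Mapping *child)
{
    assert(parent != NULL && parent->page != NULL);
    if (parent->writable || parent->cow) {
        parent->writable = 0;
        parent->cow = 1;
    }
    *child = *parent;
    parent->page->refcount++;
}

/* Model a one-byte write. A COW mapping first gets a private copy,
 * which is the job of the page fault handler in the real system.
 * Returns 0 on success, -1 on failure. */
int cow_write(struct Mapping *m, size_t off, unsigned char value)
{
    if (off >= PGSIZE)
        return -1;
    if (!m->writable) {
        if (!m->cow)
            return -1;                 /* genuinely read-only */
        struct Page *copy = malloc(sizeof(*copy));
        if (copy == NULL)
            return -1;
        memcpy(copy->data, m->page->data, PGSIZE);
        copy->refcount = 1;
        if (--m->page->refcount == 0)  /* last sharer gone */
            free(m->page);
        m->page = copy;
        m->writable = 1;
        m->cow = 0;
    }
    m->page->data[off] = value;
    return 0;
}
```

<p>After <code>cow_share</code>, both mappings point at the same page. The first <code>cow_write</code> through either of them moves only that mapping to a fresh copy, so the other side keeps seeing the old contents.</p>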
<h4 id="user-level-page-fault-handling">User-level page fault handling</h4>
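<p>To implement the copy-on-write scheme described above, we first need to be able to detect the page faults raised by writes to read-only pages.</p>
<p>Page faults are very common events. For example, a kernel typically maps only a single stack page when it starts a program. When the program needs more stack, touching the unmapped address causes a page fault and the kernel maps another page. The same happens in the BSS region, where pages are allocated on first access and filled with zeros. A page fault also occurs when the instruction to be executed is not in memory, so that it can be read in from disk. Page faults are therefore both common and a useful hook for optimizing system performance.</p>
<p>JOS does not hard-code the page fault handling policy in the kernel. Instead, each user environment can <strong>freely install its own page fault</strong> handler. The handling mechanism has to be designed carefully so that it can deal flexibly with the different kinds of page fault listed above.</p>
<p>Later on, the same mechanism will be used to load content from the on-disk file system.</p>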
<h4 id="setting-the-page-fault-handler">Setting the Page Fault Handler</h4>
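<p>For a user environment to handle its own page faults, it must register a page fault handler entry point with the JOS kernel (in effect, assigning a function pointer). The environment registers this entry point through the <code>sys_env_set_pgfault_upcall</code> system call, and a new field, <code>env_pgfault_upcall</code>, has been added to the Env structure to record it.</p>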
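<p>Since registration amounts to storing a function pointer, the mechanism can be sketched in a few lines of stand-alone C. Everything here is a hypothetical model: <code>struct EnvModel</code> stands in for the one relevant field of <code>struct Env</code>, and <code>count_fault</code> is a sample handler that exists only for illustration.</p>

```c
#include <assert.h>
#include <stddef.h>

typedef void (*pgfault_upcall_t)(void);

/* Hypothetical stand-in for the one field of struct Env that matters here. */
struct EnvModel {
    pgfault_upcall_t env_pgfault_upcall;
};

int fault_count = 0;

/* Sample upcall used for illustration: it only counts invocations. */
void count_fault(void)
{
    fault_count++;
}

/* Record the entry point, as the system call does after its permission check. */
int set_pgfault_upcall(struct EnvModel *e, pgfault_upcall_t func)
{
    assert(e != NULL);
    e->env_pgfault_upcall = func;
    return 0;
}

/* On a fault, run the registered upcall. Returns -1 when none is
 * registered (the real kernel destroys the environment in that case). */
int deliver_fault(struct EnvModel *e)
{
    assert(e != NULL);
    if (e->env_pgfault_upcall == NULL)
        return -1;
    e->env_pgfault_upcall();
    return 0;
}
```

<p>The real kernel does not call the pointer directly. It rewrites the saved EIP so that the environment resumes at the upcall in user mode, as shown in exercise 9.</p>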
<h4 id="exercise-8">exercise 8</h4>
<blockquote>
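<p>Implement <code>sys_env_set_pgfault_upcall</code>. Be sure to enable permission checking when looking up the target environment ID, so that only the environment itself or its parent can change the upcall, because this is a dangerous system call.</p>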
</blockquote>
<p>The implementation goes in <code>kern/syscall.c</code>:</p>
<pre><code class="lang-c"><span class="hljs-comment">// Set the page fault upcall for 'envid' by modifying the corresponding struct</span>
<span class="hljs-comment">// Env's 'env_pgfault_upcall' field. When 'envid' causes a page fault, the</span>
<span class="hljs-comment">// kernel will push a fault record onto the exception stack, then branch to</span>
<span class="hljs-comment">// 'func'.</span>
<span class="hljs-comment">//</span>
<span class="hljs-comment">// Returns 0 on success, < 0 on error. Errors are:</span>
<span class="hljs-comment">// -E_BAD_ENV if environment envid doesn't currently exist,</span>
<span class="hljs-comment">// or the caller doesn't have permission to change envid.</span>
<span class="hljs-function"><span class="hljs-keyword">static</span> <span class="hljs-keyword">int</span>
<span class="hljs-title">sys_env_set_pgfault_upcall</span><span class="hljs-params">(envid_t envid, <span class="hljs-keyword">void</span> *func)</span>
</span>{
<span class="hljs-comment">// LAB 4: Your code here.</span>
<span class="hljs-keyword">struct</span> Env *environment;
<span class="hljs-keyword">if</span>(envid2env(envid, &environment, <span class="hljs-number">1</span>) < <span class="hljs-number">0</span>)
<span class="hljs-keyword">return</span> -E_BAD_ENV;
environment->env_pgfault_upcall = func;
<span class="hljs-keyword">return</span> <span class="hljs-number">0</span>;
<span class="hljs-comment">// panic("sys_env_set_pgfault_upcall not implemented");</span>
}
</code></pre>
<p>Then dispatch it from <code>syscall()</code> in <code>kern/syscall.c</code>:</p>
<pre><code class="lang-c"><span class="hljs-keyword">case</span> SYS_env_set_pgfault_upcall:
<span class="hljs-keyword">return</span> sys_env_set_pgfault_upcall(a1, (<span class="hljs-keyword">void</span> *) a2);
</code></pre>
<h4 id="normal-and-exception-stacks-in-user-environments">Normal and Exception Stacks in User Environments</h4>
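<p>During normal execution, a user environment runs on the normal user stack, which occupies the addresses from USTACKTOP-PGSIZE through USTACKTOP-1.</p>
<p>When a page fault occurs in user mode, the kernel restarts the user environment on a different, designated stack, the <code>user exception stack</code>, and runs the page fault handler there. The process is essentially the same as the stack switch that happens when user mode traps into the kernel.</p>
<p>The user exception stack is also one page in size, and its top is at <code>UXSTACKTOP</code>, so its valid bytes run from UXSTACKTOP-PGSIZE through UXSTACKTOP-1. While running on this stack, the handler can make ordinary system calls to map new pages, adjust existing mappings, or fix whatever caused the page fault. The user-level page fault handler then returns, through an assembly-language stub, to the faulting code on the original stack.</p>
<p>Each environment that wants to support user-level page fault handling must allocate the memory for its own exception stack, which it can do with <code>sys_page_alloc()</code>.</p>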
<h4 id="invoking-the-user-page-fault-handler">Invoking the User Page Fault Handler</h4>
<p>The kernel lays out the page fault frame on the exception stack as follows:</p>
<pre><code> <-- UXSTACKTOP
trap-time esp
trap-time eflags
trap-time eip
trap-time eax start of struct PushRegs
trap-time ecx
trap-time edx
trap-time ebx
trap-time esp
trap-time ebp
trap-time esi
trap-time edi end of struct PushRegs
tf_err (error code)
fault_va <-- %esp when handler is run
</code></pre><p>This matches the layout of the structure declared in <code>inc/trap.h</code>:</p>
<pre><code class="lang-c"><span class="hljs-keyword">struct</span> UTrapframe {
<span class="hljs-comment">/* information about the fault */</span>
<span class="hljs-keyword">uint32_t</span> utf_fault_va; <span class="hljs-comment">/* va for T_PGFLT, 0 otherwise */</span>
<span class="hljs-keyword">uint32_t</span> utf_err;
<span class="hljs-comment">/* trap-time return state */</span>
<span class="hljs-keyword">struct</span> PushRegs utf_regs;
<span class="hljs-keyword">uintptr_t</span> utf_eip;
<span class="hljs-keyword">uint32_t</span> utf_eflags;
<span class="hljs-comment">/* the trap-time stack to return to */</span>
<span class="hljs-keyword">uintptr_t</span> utf_esp;
} __attribute__((packed));
</code></pre>
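<p>The offsets of the fields within this frame matter for the assembly stub later on, so it is worth checking them once. The sketch below re-declares the two structures with fixed 32-bit fields so that it compiles on any host. That substitution for <code>uintptr_t</code>, and the accessor function names, are assumptions made for this example only.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Register block in the order pusha leaves it in memory (as in JOS inc/trap.h). */
struct PushRegs {
    uint32_t reg_edi, reg_esi, reg_ebp, reg_oesp;
    uint32_t reg_ebx, reg_edx, reg_ecx, reg_eax;
} __attribute__((packed));

struct UTrapframe {
    uint32_t utf_fault_va;
    uint32_t utf_err;
    struct PushRegs utf_regs;
    uint32_t utf_eip;      /* uintptr_t in JOS; fixed at 32 bits here */
    uint32_t utf_eflags;
    uint32_t utf_esp;      /* uintptr_t in JOS; fixed at 32 bits here */
} __attribute__((packed));

static_assert(sizeof(struct PushRegs) == 32, "eight 32-bit registers");

/* Byte offsets measured from the lowest address of the frame,
 * which is where %esp points when the handler starts. */
size_t utf_offset_eip(void) { return offsetof(struct UTrapframe, utf_eip); }
size_t utf_offset_esp(void) { return offsetof(struct UTrapframe, utf_esp); }
size_t utf_size(void)       { return sizeof(struct UTrapframe); }
```

<p>With these definitions the saved EIP sits at offset 0x28 and the saved ESP at offset 0x30 from the start of the frame. Once the stub has skipped the first two words with <code>addl $0x8, %esp</code>, those become <code>0x20(%esp)</code> and <code>0x28(%esp)</code>.</p>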
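<p>If another page fault occurs while the handler is still running on the exception stack, the handling nests. In that case the new frame is pushed below the current <code>tf->tf_esp</code> rather than starting again at <code>UXSTACKTOP</code>.</p>
<p>Make sure the exception stack never grows beyond its one page. In the JOS memory layout the page directly below it is left unmapped, with the normal user stack below that, so a frame that overflows must be caught rather than written. Otherwise, even if the fault itself were handled correctly, the program could not continue correctly after the return. The <code>user_mem_assert</code> check in the handler below does this and destroys the environment on overflow.</p>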
<h4 id="exercise-9">exercise 9</h4>
<blockquote>
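<p>Implement <code>page_fault_handler</code> in <code>kern/trap.c</code> so that it dispatches page faults to the user-mode handler.</p>
<p>Be sure to take the appropriate precautions when writing to the exception stack.</p>
<p>What happens if the user environment runs out of space on the exception stack? (Answered above.)</p>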
</blockquote>
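<p>Following the comments in that function, page faults fall into two cases:</p>
<ul>
<li>On a first (non-nested) page fault, execution moves from the user stack to the kernel stack and then to the exception stack. On return, it goes from the exception stack directly back to the user stack.</li>
<li>On a nested page fault, the environment is already running on the exception stack, so another UTrapframe is pushed below the current one with a 4-byte gap in between. Execution moves from the exception stack to the kernel stack and back to the exception stack. The return path is different: it goes directly from one exception-stack frame to the previous one, without passing through the kernel.</li>
</ul>
<p>We must also always check that the pushed UTrapframe stays within the exception stack. The final implementation is:</p>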
<pre><code class="lang-c"><span class="hljs-comment">// LAB 4: Your code here.</span>
<span class="hljs-keyword">if</span>(curenv->env_pgfault_upcall){
<span class="hljs-keyword">struct</span> UTrapframe *utf;
<span class="hljs-keyword">uintptr_t</span> addr; <span class="hljs-comment">// addr of utf</span>
<span class="hljs-keyword">if</span>(UXSTACKTOP - PGSIZE <= tf->tf_esp && tf->tf_esp < UXSTACKTOP)
addr = tf->tf_esp - <span class="hljs-keyword">sizeof</span>(<span class="hljs-keyword">struct</span> UTrapframe) - <span class="hljs-number">4</span>;
<span class="hljs-keyword">else</span>
addr = UXSTACKTOP - <span class="hljs-keyword">sizeof</span>(<span class="hljs-keyword">struct</span> UTrapframe);
user_mem_assert(curenv, (<span class="hljs-keyword">void</span> *)addr, <span class="hljs-keyword">sizeof</span>(<span class="hljs-keyword">struct</span> UTrapframe), PTE_W);
utf = (<span class="hljs-keyword">struct</span> UTrapframe *)addr;
utf->utf_fault_va = fault_va;
utf->utf_err = tf->tf_err;
utf->utf_regs = tf->tf_regs;
utf->utf_eip = tf->tf_eip;<span class="hljs-comment">// trap-time eip, so the handler can return to the faulting code</span>
utf->utf_eflags = tf->tf_eflags;
utf->utf_esp = tf->tf_esp;
tf->tf_eip = (<span class="hljs-keyword">uint32_t</span>)curenv->env_pgfault_upcall;
tf->tf_esp = addr;
env_run(curenv);
}
</code></pre>
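<p>The address arithmetic in this handler is easy to get wrong by a few bytes, so here is a stand-alone sketch of just that part. The function names are hypothetical, <code>UXSTACKTOP</code> is assumed to have the value given in JOS's <code>inc/memlayout.h</code>, and <code>UTF_SIZE</code> is the 52-byte size of <code>struct UTrapframe</code> on 32-bit x86.</p>

```c
#include <assert.h>
#include <stdint.h>

#define PGSIZE      4096u
#define UXSTACKTOP  0xeec00000u   /* assumed value, from JOS inc/memlayout.h */
#define UTF_SIZE    52u           /* sizeof(struct UTrapframe) on 32-bit x86 */

static_assert(UTF_SIZE % 4 == 0, "frames must keep the stack word aligned");

/* Nonzero if esp lies inside the one-page user exception stack. */
int on_exception_stack(uint32_t esp)
{
    return esp >= UXSTACKTOP - PGSIZE && esp < UXSTACKTOP;
}

/* Address at which the next UTrapframe would be placed. */
uint32_t next_utf_addr(uint32_t trap_esp)
{
    if (on_exception_stack(trap_esp))
        return trap_esp - UTF_SIZE - 4;   /* nested: leave a 4-byte scratch word */
    return UXSTACKTOP - UTF_SIZE;         /* first fault: start at the top */
}

/* Nonzero if the whole frame fits inside the exception stack page. */
int utf_fits(uint32_t addr)
{
    return addr >= UXSTACKTOP - PGSIZE && addr + UTF_SIZE <= UXSTACKTOP;
}
```

<p>Each nested fault therefore consumes 56 bytes of the exception stack. In the kernel, the job of <code>utf_fits</code> is done by <code>user_mem_assert</code>, which also verifies that the page is actually mapped writable.</p>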
<h4 id="user-mode-page-fault-entrypoint">User-mode Page Fault Entrypoint</h4>
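<p>Next, we need to implement the assembly routine that calls the page fault handler and, once it finishes, resumes execution at the point where the fault occurred. This assembly routine is the entry point that gets registered with the kernel through sys_env_set_pgfault_upcall().</p>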
<h4 id="exercise-10">exercise 10</h4>
<blockquote>
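<p>Implement the _pgfault_upcall routine in <code>lib/pfentry.S</code>.</p>
<p>The interesting part is returning to the point in the user code that caused the page fault, without going back through the kernel. (Note that the trap path and the return path are not the same.)</p>
<p>The hard part is switching stacks and reloading EIP simultaneously. ("Simultaneously" here is not about memory: during the restore, once a register has been restored it can no longer be used as scratch space.)</p>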
</blockquote>
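<p>The overall flow of invoking the page fault handler is fairly simple:</p>
<ol>
<li>The kernel saves the user-mode register state at the time of the fault in a UTrapframe on the exception stack, just below <code>UXSTACKTOP</code> in the non-nested case.</li>
<li>The kernel updates the environment's Trapframe so that EIP points to <code>_pgfault_upcall</code> and ESP points to the new UTrapframe on the exception stack.</li>
<li><code>_pgfault_upcall</code> calls the user's own <code>page_fault_handler</code>, passing it a pointer to the UTrapframe as its argument (<code>pushl %esp</code>).</li>
<li>When the handler returns, we jump straight from the exception stack back to the faulting user code, without going through the kernel.</li>
</ol>
<p>This jump is the hard part of the implementation. The approach is described below.</p>
<p>Suppose a single frame has just been pushed on the exception stack. The stack then looks like this, where the 4-byte slot stands for the scratch word just below the trap-time stack pointer. In the non-nested case that word is on the normal user stack; only in the nested case does it sit directly above the frame on the exception stack:</p>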
<pre><code> <-- UXSTACKTOP
4 Bytes
trap-time esp
trap-time eflags
trap-time eip
trap-time eax start of struct PushRegs
trap-time ecx
trap-time edx
trap-time ebx
trap-time esp
trap-time ebp
trap-time esi
trap-time edi end of struct PushRegs
tf_err (error code)
fault_va <-- %esp when handler is run
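</code></pre><p>The two words at the top of the stack (fault_va and the error code) do not affect any register and need not be restored, so first skip them with <code>addl $0x8, %esp</code>. The stack becomes:</p>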
<pre><code> <-- UXSTACKTOP
4 Bytes
trap-time esp
trap-time eflags
trap-time eip
trap-time eax start of struct PushRegs
trap-time ecx
trap-time edx
trap-time ebx
trap-time esp
trap-time ebp
trap-time esi
trap-time edi <-- current %esp
</code></pre><p>Then execute the following instructions:</p>
<pre><code>subl $0x4, 0x28(%esp)
movl 0x28(%esp), %edx
</code></pre><p>The saved trap-time esp is now 4 lower, and %edx holds that new value. The stack looks like this:</p>
<pre><code> <-- UXSTACKTOP
4 Bytes
trap-time esp-4
trap-time eflags
trap-time eip
trap-time eax start of struct PushRegs
trap-time ecx
trap-time edx
trap-time ebx
trap-time esp
trap-time ebp
trap-time esi
trap-time edi <-- current %esp
</code></pre><p>Now run these instructions:</p>
<pre><code>movl 0x20(%esp), %eax
movl %eax, (%edx)
</code></pre><p>This copies the trap-time eip into the 4-byte slot that %edx points to. The stack becomes:</p>
<pre><code> <-- UXSTACKTOP
trap-time eip 4 Bytes // this slot has changed
trap-time esp-4
trap-time eflags
trap-time eip
trap-time eax start of struct PushRegs
trap-time ecx
trap-time edx
trap-time ebx
trap-time esp