CSE 221: Homework 3
Fall 2024
Due Wednesday, December 4th at 11:59pm
Answer the following questions. For questions asking for short
answers, there may not necessarily be a "right" answer, although some
answers may be more compelling and/or much easier to justify. I
am interested in your explanation (the "why") as much as the answer
itself. Also, do not use shorthand: write your answers using complete
sentences.
When grading homeworks, we will grade one question in detail and assign
full credit for technical answers to the others. Because question 3 includes
papers that will be discussed in class shortly before this homework is due,
we will not grade that question in detail.
Submit your homework by uploading it to Gradescope.
- (32 pts) A reliability-induced synchronous write is a synchronous write
that is issued by the file system to ensure that the file system's
state (as represented by the system's metadata) is not left
inconsistent if the system crashes at an inconvenient time.
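For reference, "synchronous" here means that the write must reach the disk before the file system proceeds, rather than being left in the buffer cache and written back later. The C sketch below shows the user-level analogue of such a write; the function name, path, and use of O_SYNC are illustrative assumptions, not part of the question.

```c
#include <fcntl.h>
#include <unistd.h>

/* User-level analogue of a synchronous write: opening with O_SYNC forces
 * each write() to block until the data (and the metadata needed to
 * retrieve it) is on stable storage, instead of returning as soon as the
 * data sits in the kernel's buffer cache. Path and buffer are placeholders. */
int write_synchronously(const char *path, const void *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_SYNC, 0644);
    if (fd < 0)
        return -1;
    ssize_t n = write(fd, buf, len);   /* does not return until durable */
    close(fd);
    return (n == (ssize_t)len) ? 0 : -1;
}
```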
- Let f be a new file created in a directory d. The
file system will issue at least three disk writes to complete
this operation. Ignoring any data blocks allocated for the directory
or new file, what are these three disk writes for?
- In Unix FFS, at least two of these writes will be issued
synchronously. Which are they, and what order should they be
performed in? Briefly explain why.
- Consider the Soft Updates solution to this problem. Does it issue
any reliability-induced synchronous writes? If so, how does it
differ from FFS? If not, why can it avoid doing so? Explain.
- Consider the same operation in LFS. Does LFS generate any
reliability-induced synchronous writes? Explain.
- Consider the same operation with SplitFS. Are
reliability-induced synchronous writes an issue with SplitFS?
Explain. (Hint: See Section 2.1 in the Soft Updates paper.)
- (36 pts) Consider a variant of RCU in which threads write (update)
the data structure by following these steps:
1. Allocate a separate, private instance of the data structure.
2. Copy the old, shared version of the data structure into the new,
private instance (copying just requires reads from the shared
version).
3. Update (write to) the new, private instance as needed.
4. Atomically change the pointer referencing the data structure to
point to the new instance instead of the old. Any reader threads
already referencing the old data structure continue to read from the
old version unharmed. Any new reader threads will use the new
instance of the data structure and see the updates from the writer.
This handles the case of a single writer. If there can be
multiple writers, then the last step needs to be modified: when
the writing thread goes to update the shared pointer, it checks
whether the data structure has been modified since it made its copy
(e.g., because a second writer changed the value of the shared
pointer first); if it has, the writer aborts and goes back to
step 2.
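To make these semantics concrete, here is a minimal sketch of a reader and of the writer's update loop using C11 atomics. The structure contents, the modify callback, and the omission of error handling and of reclamation of the old version are all assumptions made for illustration; freeing the old copy safely requires waiting for readers to drain (the usual RCU grace-period problem) and is not shown.

```c
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

/* Contents of the shared structure are an assumption for illustration. */
struct data { int value; };

/* Shared pointer to the current version of the data structure;
 * assumed to be initialized to a valid instance before use. */
_Atomic(struct data *) shared;

/* Reader: a single load of the shared pointer; no locks are taken. */
int read_value(void)
{
    struct data *d = atomic_load(&shared);
    return d->value;
}

/* Writer: steps 1-4 above, with the multi-writer check folded in. */
void update(void (*modify)(struct data *))
{
    struct data *new = malloc(sizeof(*new));      /* step 1: private copy */
    struct data *old;
    do {
        old = atomic_load(&shared);               /* step 2: read ...     */
        memcpy(new, old, sizeof(*new));           /*         ... and copy */
        modify(new);                              /* step 3: update copy  */
        /* Step 4: publish the new version only if no other writer has
         * changed the shared pointer since we copied from it; otherwise
         * go back to step 2 and copy again. */
    } while (!atomic_compare_exchange_strong(&shared, &old, new));
    /* The old version is deliberately not freed here: reclaiming it
     * safely requires waiting for readers that may still hold a
     * reference, which this sketch leaves out. */
}
```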
Assuming these semantics, answer the following questions with a
brief explanation.
(a) What kinds of thread workloads (mix of reader and writer
threads) work particularly well with this kind of synchronization?
(b) Does this kind of synchronization work equally well for all
kinds of data structures, or does it work better for some than
others?
(c) Describe a scenario in which a monitor would be a better choice
than the synchronization approach described above (give a different
example than those in (a) and (b) above).
(d) Can readers starve other readers?
(e) Can writers starve readers?
(f) Can readers starve writers?
(g) Can writers starve writers?
(h) Can deadlock occur?
(i) Is it safe for a thread to be arbitrarily suspended or stopped
while it is accessing a shared data structure?
- (32 pts) We have read several papers that discuss CPU scheduling
and networking, including Scheduler Activations, IX, and Snap. Answer the
questions below in the context of these three systems:
- Scalability. For each of these three systems, do you think the system can scale to support many cores? For example, could it support 100 or more cores on the same machine? If yes, explain why you think it's scalable. If not, explain what component of the system would likely limit scalability.
- Work conservation. For each of these three systems, is the system work conserving? If yes, why? If not, describe a situation in which it is not work conserving.
- Receive livelock. For IX and Snap, can each system suffer from receive livelock?
Explain briefly.