Code H’yung: Ropes, Sprint 3
Welcome to the Sprint 3 Review of the Ropes h’yung mini-project. As the only person who really matters on this project, I’m happy to report that we have code to show you for our review today!
First, we had to shift some of the goals for this sprint around:
- apply the Null Object Pattern and refactor where possible
- refactor append to be purely functional
partial implementation of string deletion
- analysis of string deletion
- partial implementation of RopeNodeIterator
Let’s start with the first goal: null objects. First I wrote a NullRopeNode:
NullRopeNode class instanceVariableNames: 'theInstance' instance "answer with a new instance of NullRopeNode" (theInstance = nil) ifTrue: [ theInstance := NullRopeNode new. ]. ^ theInstance. asString "answers with a string of this RopeNode" ^ ''. hasNoChildren "answers true if the node has no children" ^ true. key "answers the search key value of this node" ^ 0. leftChild "answers with the left RopeNode" ^ NullRopeNode instance. rightChild "answers with the right RopeNode" ^ NullRopeNode instance. substringFrom:aStartNumber to:anEndNumber "answers a substring from aStartNumber to anEndNumber, 1-based" ^ ''. totalPayloadLength "answers the total payload length in this RopeNode's subtree" ^ 0. isNull "answers true if this RopeNode is null" ^ true.
There is little to note here except for the new isNull method. I borrowed the idea from a post I found by Fredrik Normen. Before going into why I wrote it, here are the implementations of this method in both InternalRopeNode and LeafRopeNode:
isNull "answers true if this RopeNode is null" ^ false.
Well la-dee-da. Now, I needed this because the first place I tried to apply this was in Rope‘s append method:
... (oldRoot leftChild = nil) ifTrue: [ ...
... (oldRoot leftChild isNull) ifTrue: [ ...
I did have some qualms about this. After all, applying the Null Object pattern is supposed to avoid this sort of null-checking. But, before I went on in a misguided attempt to force-fit some of the append logic into NullRopeNode, I realized something. The real benefits of the Null Object pattern come when you check an object for null before you send some messages to that object.
In this case, we are checking the object for null and taking action that has nothing to do with the object we checked. So I think there is no way to avoid the null check, especially when it is part of the algorithm you're implementing.
Sidebar: I ran into an interesting "gotcha" when introducing this null object. I first implemented NullRopeNode and then I applied it to InternalRopeNode and LeafRopeNode. I then re-ran the unit test for append and it passed.
I expected it to fail because of this line:
... (oldRoot leftChild = nil) ifTrue: [ ...
But there was no harm here. This condition was simply never true now that we had introduced NullRopeNode, and so the tree that resulted was correct, but unbalanced. And the unit test didn't catch this.
Other changes in InternalLeafNode:
hasNoChildren "answers true if the node has no children" ^ (leftRopeNodeChild isNull) and:[rightRopeNodeChild isNull]. initialize hasKeyBeenSet := false. hasRightChildBeenSet := false. hasLeftChildBeenSet := false. keyNumber := 0. leftRopeNodeChild := NullRopeNode instance. rightRopeNodeChild := NullRopeNode instance. asString "answers with a string of this RopeNode" ^ (leftRopeNodeChild asString) , (rightRopeNodeChild asString).
Look at how nicely asString is now compared to before:
asString "answers with a string of this RopeNode" (leftRopeNodeChild = nil and:[rightRopeNodeChild = nil]) ifTrue: [ ^ ''. ]. (leftRopeNodeChild = nil) ifTrue: [ ^ rightRopeNodeChild asString. ]. (rightRopeNodeChild = nil) ifTrue: [ ^ leftRopeNodeChild asString. ]. ^ (leftRopeNodeChild asString) , (rightRopeNodeChild asString).
The next goal was to make the append method purely functional. The change was relatively simple (watch the commented-out areas for the “before”):
append:aString "appends a String to this Rope" | rootRightChildHasNoChildren newRoot oldRoot newRootRightChild ropeToReturn | oldRoot := rootRopeNode. (oldRoot leftChild isNull) ifTrue: [ newRoot := InternalRopeNode instanceWithKey:(aString size) andLeftChild:(LeafRopeNode instanceWithPayload:aString andStartingIndex:0) andRightChild:(oldRoot rightChild). "rootRopeNode := newRoot." "^ nil." ropeToReturn := Rope instance. ropeToReturn setRootOnce:newRoot. ^ ropeToReturn. ]. (oldRoot rightChild isNull) ifTrue: [ newRoot := InternalRopeNode instanceWithKey:(oldRoot key) andLeftChild:(oldRoot leftChild) andRightChild:(LeafRopeNode instanceWithPayload:aString andStartingIndex:(oldRoot key)). "rootRopeNode := newRoot." "^ nil." ropeToReturn := Rope instance. ropeToReturn setRootOnce:newRoot. ^ ropeToReturn. ]. rootRightChildHasNoChildren := oldRoot rightChild hasNoChildren. (rootRightChildHasNoChildren) ifTrue: [ newRootRightChild := InternalRopeNode instanceWithKey:((oldRoot key) + (oldRoot rightChild key)) andLeftChild:(oldRoot rightChild) andRightChild:(LeafRopeNode instanceWithPayload:aString andStartingIndex:((oldRoot key) + (oldRoot rightChild key))). newRoot := InternalRopeNode instanceWithKey:(oldRoot key) andLeftChild:(oldRoot leftChild) andRightChild:(newRootRightChild). ]. (rootRightChildHasNoChildren) ifFalse: [ newRoot := InternalRopeNode instanceWithKey:(oldRoot totalPayloadLength) andLeftChild:(oldRoot) andRightChild:(LeafRopeNode instanceWithPayload:aString andStartingIndex:(oldRoot totalPayloadLength)). ]. "rootRopeNode := newRoot." ropeToReturn := Rope instance. ropeToReturn setRootOnce:newRoot. ^ ropeToReturn.
So re-run your tests and all should be fine, right? Wrong. The tests were written like this:
testAppend | someRope | someRope := Rope instance. self assert:(someRope asString = ''). someRope append:'Denny'. self assert:(someRope asString = 'Denny').
That last assertion fails now because the line before it that sends append to someRope returns a new Rope, which we ignore.
Well, nothing left to do except fix the tests:
testAppend | someRope | someRope := Rope instance. self assert:(someRope asString = ''). someRope := someRope append:'Denny'. self assert:(someRope asString = 'Denny').
Cool. This actually required me to change a lot of tests, but it was fine.
Now, on to the behemoth of string deletion. This was not covered in the original article that inspired this h’yung, so I had to think about this myself. Uh oh.
I need an example. Let’s say I have this rope:
Let’s consider the case where the string we want to delete happens to be exactly an entire LeafRopeNode‘s payload. So if we remove JKL we get this:
If we remove FGHI, we get this:
In both cases, we replaced the target node’s parent with its sibling. Let’s see if this pattern continues. Remove DE:
It sure seems like that if we know which LeafRopeNode to remove, we can employ an algorithm like this:
if(targetNode parent parent leftChild = targetNode parent) then targetNode parent parent leftChild := targetNode sibling. else if(targetNode parent parent leftChild = targetNode parent) then targetNode parent parent rightChild := targetNode sibling.
It’s very similar to ordinary binary search tree deletion. So this implies the need for a parent and sibling methods.
The other case where our string to delete is spread across part of one leaf and part of another is easier. Let’s delete EFGH from the original rope. Then we get this:
So this implies the need for a renumber method that’ll re-key the tree so it can be correct.
This analysis is all nice and good, but we have to have a way or recognizing these situations. I was initially stumped, so I turned my attention to the first part of string deletion: does the target string even exist in the rope?
This only further stumped me because I was looking for a way that didn’t involve simply sending asString and searching through it. I toyed with having a Rope method called contains:aSubstring send contains:aSubstring startingAt:aNumber to the RopeNodes, but how do you know what aNumber should be? Where do you start the search in each sub-tree? There is no way of knowing.
Some research turned up an article by Amin Ahmad on IBM developerWorks on ropes, where it is suggested to use the Iterator pattern to only traverse as much of the rope as you need until you find a match. This way, you can keep the rope traversal down to at most one traversal, and you only need to look at the parts of the rope that you need to look at.
Fine, that sounds like it’ll work. But how does the existence of the string to delete help us at all? We still have the problem of distinguishing between entire node deletion and partial node deletion.
The IBM article helped me again. Notice that their rope library has a Rope.delete(start, end) method, which is a lot easier to implement since you navigate the tree based on position anyway, just like substringFrom:To: from the previous sprint.
And from there, the algorithm is easy: recursively call delete(x, y) to traverse the tree until you get to the leaves. If the range that x and y spans also spans the entire leaf’s string payload, invoke the logic for deleting an entire node described earlier. Otherwise, invoke the logic for deleting only part of the node described earlier. After the traversal is done, renumber the rope.
Now all that remains is finding x and y, and the iterator will do that for us. Rope will have an endPointsOf:aString method that will use the iterator to return a pair of end-points. Those end-points can be used to send a deleteFrom:aStartIndex to:anEndIndex message.
Alrighty, now that I have an attack plan, I think the goals for the next sprint are:
- implement parent and sibling
- implement renumber
- I actually already started, so continue implementation of RopeNodeIterator
And the Sprint 4 review will be in two weeks, not one. I have decided to alternate between opinion/snippet posts and coding posts for the Wednesday offering.
So, to answer your question: no, I’m not a creature of routine.
|Announcer: You’re reading the EIP web-ring.|