Hi everyone,
I'm dealing with a problem I can't figure out how to solve, and I'd love to hear some suggestions.
[NOTE: I realise I'm asking several questions; however, answers need to take into account all of the issues, so I cannot split this into several questions]
Here's the deal: I'm implementing a system that underlies user applications and protects shared objects from concurrent accesses. The application programmer (whose application will run on top of my system) defines such shared objects like this:
    public class MyAtomicObject {
        // These are just examples of fields you may want to have in your class.
        public virtual int x { get; set; }
        public virtual List<int> list { get; set; }
        public virtual MyClassA objA { get; set; }
        public virtual MyClassB objB { get; set; }
    }
As you can see, they declare the fields of their class as virtual auto-implemented properties (auto-implemented means they don't need to write the bodies of get and set). This is so that I can extend their class and override each get and set myself in order to handle possible concurrent accesses, etc. This is all well and good, but now it starts to get ugly: the application threads run transactions, like this:
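To make the interception idea concrete, here is a minimal sketch of a subclass that overrides one virtual property. The names TrackedAtomicObject and AccessLog are hypothetical, and the hooks only record the access; a real system would consult or update the transaction's write log at those points.

```csharp
using System.Collections.Generic;

// Stand-in for the application programmer's class (reduced to one property).
public class MyAtomicObject
{
    public virtual int x { get; set; }
}

// Hypothetical subclass supplied by the underlying system.
public class TrackedAtomicObject : MyAtomicObject
{
    // Illustrative only: records which accessors were called, in order.
    public List<string> AccessLog { get; } = new List<string>();

    public override int x
    {
        get
        {
            AccessLog.Add("read x");   // hook: a real system would check the write log here
            return base.x;             // falls through to the auto-implemented storage
        }
        set
        {
            AccessLog.Add("write x");  // hook: a real system would buffer the write here
            base.x = value;
        }
    }
}
```

Code that holds the object as a plain MyAtomicObject still goes through the overridden accessors, because the properties are virtual.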
1. The thread signals it's starting a transaction. This means we now need to monitor its accesses to the fields of the atomic objects.
2. The thread runs its code, possibly accessing fields for reading or writing. If there are accesses for writing, we'll hide them from the other transactions (other threads), and only make them visible in step 3. This is because the transaction may fail and have to roll back (undo) its updates, and in that case we don't want other threads to see its "dirty" data.
3. The thread signals it wants to commit the transaction. If the commit is successful, the updates it made will now become visible to everyone else. Otherwise, the transaction will abort, the updates will remain invisible, and no one will ever know the transaction was there.
So basically a transaction is a series of accesses that appear to have happened atomically, that is, all in the same instant, which is the moment of the successful commit. (This is as opposed to its updates becoming visible one by one as it makes them.)
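The three steps can be sketched roughly as follows. This is a simplified illustration, not the actual system: SharedStore and Transaction are hypothetical names, fields are reduced to named int values, and there is no conflict detection, which a real commit would need before publishing.

```csharp
using System.Collections.Generic;

// Hypothetical shared state: field name -> committed value.
public class SharedStore
{
    private readonly Dictionary<string, int> committed = new Dictionary<string, int>();
    private readonly object gate = new object();

    public int ReadCommitted(string field)
    {
        lock (gate)
        {
            int value;
            return committed.TryGetValue(field, out value) ? value : 0;
        }
    }

    // Publishes all buffered writes under one lock, so readers that go
    // through this store see either none of them or all of them.
    public void Publish(Dictionary<string, int> writes)
    {
        lock (gate)
        {
            foreach (var entry in writes)
                committed[entry.Key] = entry.Value;
        }
    }
}

public class Transaction
{
    private readonly SharedStore store;
    private readonly Dictionary<string, int> writeLog = new Dictionary<string, int>();

    // Step 1: the thread starts a transaction.
    public Transaction(SharedStore store) { this.store = store; }

    // Step 2: reads see this transaction's own buffered writes first.
    public int Read(string field)
    {
        int value;
        return writeLog.TryGetValue(field, out value) ? value : store.ReadCommitted(field);
    }

    // Step 2: writes stay in the write log, invisible to other transactions.
    public void Write(string field, int value) { writeLog[field] = value; }

    // Step 3, success: the buffered writes become visible together.
    public void Commit() { store.Publish(writeLog); writeLog.Clear(); }

    // Step 3, failure: the buffered writes are discarded.
    public void Abort() { writeLog.Clear(); }
}
```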
In order to hide the write accesses in step 2, I clone the accessed field (let's say it's the field list) and put it in the transaction's write log. After that, any time the transaction accesses list, it will actually be accessing the clone in its write log, and not the global copy everyone else sees. Like this, any changes it makes will be done to the (invisible) clone, not to the global copy.
If in step 3 the commit is successful, the transaction should replace the global copy with the updated list it has in its write log, and then the changes become visible for everyone else at once. It would be something like this:
myAtomicObject.list = updatedCloneOfListInTheWriteLog;
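A minimal sketch of that clone-then-swap scheme for the list field, under a few assumptions: ListHolder and ListTransaction are hypothetical names, the clone is made on the first access of any kind (a getter that returns the list cannot tell a read from a later mutation), and the copy is shallow.

```csharp
using System.Collections.Generic;

// Stand-in for the atomic object, reduced to the list field.
public class ListHolder
{
    public List<int> list = new List<int>();   // the global copy
}

public class ListTransaction
{
    private readonly ListHolder target;
    private List<int> clone;   // write-log entry for the field; null until first access

    public ListTransaction(ListHolder target) { this.target = target; }

    // Every access to the field inside the transaction is routed here.
    public List<int> AccessList()
    {
        if (clone == null)
            clone = new List<int>(target.list);   // shallow copy of the global list
        return clone;
    }

    // Commit by replacing the reference held in the field.
    public void Commit()
    {
        if (clone != null)
            target.list = clone;
        clone = null;
    }
}
```

Note that Commit only changes what the field points to. The list object that was the global copy before the commit is left untouched.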
Problem #1: possible references to the list. Let's say someone puts a reference to the global list in a dictionary. When I do...
myAtomicObject.list = updatedCloneOfListInTheWriteLog;
...I'm just replacing the reference in the field list, but not the real object (I'm not overwriting the data), so in the dictionary we'll still have a reference to the old version of the list. A possible solution would be to overwrite the data (in the case of a list, empty the global list and add all the elements of the clone). More generically, I would need to copy the fields of one list to the other. I can do this with reflection, but that's not very pretty. Is there any other way to do it?
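For the list case specifically, the "overwrite the data" option can be sketched without reflection. ListCommit and OverwriteInPlace are hypothetical names. The sketch assumes the caller holds whatever lock protects the global list, since Clear followed by AddRange is not atomic on its own.

```csharp
using System.Collections.Generic;

public static class ListCommit
{
    // Copies the clone's contents into the existing global list object,
    // so every outside reference to that object observes the new data.
    public static void OverwriteInPlace<T>(List<T> global, List<T> clone)
    {
        global.Clear();
        global.AddRange(clone);
    }
}
```

After this call, a reference that was stored in a dictionary before the commit still points to the same object and sees the committed elements. It does not generalise to arbitrary classes, which is where copying field by field would come in.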
Problem #2: even if problem #1 is solved, I still have a similar problem with the clone: the application programmer doesn't know I'm giving them a clone and not the global copy. What if they put the clone in a dictionary? Then at commit there will be some references to the global copy and some to the clone, when in truth they should all point to the same object. I thought about providing a wrapper object that contains both the cloned list and a pointer to the global copy, but the programmer doesn't know about this wrapper, so they're not going to use the pointer at all.
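The wrapper would be something like this (pseudocode only: C# does not allow a class to derive from its own type parameter, so this doesn't compile as written):
    public class Wrapper<T> : T {
        // This would be the pointer to the global copy. The local data is contained in whatever fields the wrapper inherits from T.
        private T thisPtr;
    }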
I do need this wrapper for comparisons: if I have a dictionary that has an entry with the global copy as key, if I look it up with the clone, like this:
dictionary[updatedCloneOfListInTheWriteLog]
I need it to return the entry, that is, to think that updatedCloneOfListInTheWriteLog and the global copy are the same thing. For this, I can just override Equals, GetHashCode, operator== and operator!=, no problem. However I still don't know how to solve the case in which the programmer unknowingly inserts a reference to the clone in a dictionary.
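A minimal sketch of the equality part, assuming the wrapped type is not sealed. TrackedList and CloneForWriteLog are hypothetical names. It covers Equals and GetHashCode only, which is what Dictionary uses by default; operator== is resolved statically and would not be picked up through a List<T> reference.

```csharp
using System.Collections.Generic;
using System.Runtime.CompilerServices;

// Hypothetical list subclass: a clone remembers which global list it came from.
public class TrackedList<T> : List<T>
{
    private readonly TrackedList<T> origin;   // null when this is the global copy

    public TrackedList() { }

    private TrackedList(TrackedList<T> source) : base(source)
    {
        origin = source.Identity;   // a clone of a clone still points at the global copy
    }

    public TrackedList<T> CloneForWriteLog() { return new TrackedList<T>(this); }

    // The object whose identity this list answers to.
    private TrackedList<T> Identity { get { return origin ?? this; } }

    public override bool Equals(object obj)
    {
        var other = obj as TrackedList<T>;
        return other != null && ReferenceEquals(Identity, other.Identity);
    }

    public override int GetHashCode()
    {
        // Identity-based hash of the global copy, independent of the contents.
        return RuntimeHelpers.GetHashCode(Identity);
    }
}
```

With this, looking up a dictionary keyed by the global copy using the clone finds the entry. It does nothing for the other direction, where the clone itself gets stored somewhere and outlives the transaction.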
Problem #3: the wrapper must extend the class of the object it wraps (if it's wrapping MyClassA, it must extend MyClassA) so that it's accepted wherever an object of that class (MyClassA) would be accepted. However, that class (MyClassA) may be sealed (C#'s equivalent of final), in which case it can't be extended at all.
This is pretty horrible :-/. Any suggestions? I don't need to use a wrapper, anything you can think of is fine. What I cannot change is the write log (I need to have a write log) and the fact that the programmer doesn't know about the clone.
I hope I've made some sense. Feel free to ask for more info if something needs some clearing up. Thanks so much!