Question

I was worrying about race conditions in an application I'm designing when this question came up.

Let's say I have a large array or collection of some sort that is managed by one component of my program, let's call that component Monitor. Its job is to regularly check if the collection is "dirty", i. e. has changed recently, and if so, write a snapshot to disk (this is to checkpoint the application in case a crash occurs) and mark it as clean again.

Other components of the same program, running in a different thread, call the Monitor's methods to add data to or modify data in the array/collection. Those methods mark the collection as dirty.

Now, the altering methods run in the other components' threads, right? And if I'm unlucky, they might be called while the snapshot is being written to disk: they change data that has already been written and set the dirty flag, and the Monitor's thread then unsets it just after that, without having saved the changes (it was already past the element when it changed). So I have a dirty collection that's marked as clean.

For a while, I thought that I could maybe solve the problem by making a temporary copy of the collection, marking it clean and then serializing the copy. But would the copying be atomic, i.e. can I be sure that the collection will not change while I'm copying it?

Meanwhile, I think I have found better solutions, like

  • setting a lock flag before beginning the write to disk and making the data altering methods wait until the flag is unset
  • having the data altering methods write to a "change queue" instead of directly to the collection and having the thread that does the disk writing process that queue

and I think the lock flag might be the best way to go. But I'm still curious: Is copying a variable atomic?


Follow-up: Maybe this should go in a question of its own, but actually it's very much the same. According to the answers below, my "lock flag" approach might also not work, right? Because the data altering method might check the lock flag while it is being set to the "locked" value and decide it's not locked. So I need a special construction like a mutex if I really want to do this right, correct?


Kudos to erickson for his very helpful answer on my follow-up. I really should have made this two questions so I could have accepted two answers. Please, vote him up, too.

Solution

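No. For instance, reads and writes of non-volatile long (and double) variables in Java are not guaranteed to be atomic; on 32-bit machines they may be performed as two separate 32-bit operations.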

In addition, there's a "thread caching" issue: unless your variable is volatile or is accessed inside a synchronized block, another thread may not see a change to the variable's value. This is true for all types of variables, not just long.
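To make the visibility point concrete, here is a minimal sketch, assuming a simple stop flag shared between two threads. The class and method names are made up for illustration. With `volatile`, the worker is guaranteed to see the write eventually; without it, the Java memory model gives no such guarantee and the loop may spin forever on some JVMs.

```java
// Hypothetical demo class; the names are illustrative only.
class VolatileFlagDemo {
    // volatile: a write by one thread is visible to subsequent reads by other threads.
    private static volatile boolean stopRequested = false;

    /** Starts a worker that spins until the flag is set; returns true if it stopped within 5 s. */
    static boolean workerStopsAfterFlagIsSet() {
        stopRequested = false;
        Thread worker = new Thread(() -> {
            while (!stopRequested) {
                Thread.yield(); // keep spinning until the flag change becomes visible
            }
        });
        worker.start();
        stopRequested = true; // written by the calling thread, read by the worker
        try {
            worker.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker stopped: " + workerStopsAfterFlagIsSet());
    }
}
```

Note that `volatile` only addresses visibility (and atomicity of a single read or write); it does not make a read-modify-write sequence atomic.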

Read here: http://gee.cs.oswego.edu/dl/cpj/jmm.html, especially "atomicity" and "visibility" paragraphs.

OTHER TIPS

No, it's not atomic. See this question for why and what to do about it.

Take a look at java.util.concurrent.atomic - there may be some good things in there you can use.
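As a rough illustration of what that package offers, here is a sketch that uses `AtomicLong` as a shared counter. The class name and the `countWithThreads` helper are hypothetical; only `AtomicLong` and its methods come from the JDK.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical demo class; only AtomicLong itself is a JDK type.
class AtomicCounterDemo {
    /** Increments one shared counter from several threads and returns the final value. */
    static long countWithThreads(int threads, int incrementsPerThread) {
        AtomicLong counter = new AtomicLong();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    counter.incrementAndGet(); // atomic read-modify-write, no lost updates
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(countWithThreads(4, 10_000));
    }
}
```

A plain `long` field incremented with `++` in the same loop could lose updates, because `++` is a separate read, add, and write.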

You need to be concerned about the visibility of changes to other threads when working on the JVM. In general, you should make your assignments within a synchronized block, or the variable should be volatile, or you should use a variable wrapper from the java.util.concurrent.atomic package.

However, in your case, it sounds like you have just one thread that ever clears the "dirty" flag—the thread that persists the data. If that's the case, clear the flag before writing the data. If other threads set it while you are writing the data, it will stay set until the next scheduled write. I'd use an AtomicBoolean, to give your persistence thread atomicity between checking the flag and clearing it, like this:

import java.util.concurrent.atomic.AtomicBoolean;

private final AtomicBoolean dirty = new AtomicBoolean();

/**
 * Any method that modifies the data structure should set the dirty flag.
 */
public void modify() {
  /* Modify the data first. */
  ...
  /* Set the flag afterward. */
  dirty.set(true);
}

private class Persister extends Thread {
  public void run() {
    while (!Thread.interrupted()) {
      if (dirty.getAndSet(false)) {
        /* The dirty flag was set; this thread cleared it
         * and should now persist the data. */
        ...
      }
    }
  }
}

Setting a 32-bit value (at least in .NET) is atomic, but that does you no good here. You have to read the flag to know whether it's locked, so you might read it, and then another thread reads it before you can set it, so two threads end up inside the "protected" code. This is exactly what actual synchronization objects (like the .NET Monitor class) are for. You could also use Interlocked to check and increment the lock variable atomically.
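In Java terms (the language of the other answers here), a rough analogue of the check-and-set idea is `AtomicBoolean.compareAndSet`. The sketch below is only an illustration with made-up names (`FlagLock`, `countWinners`): several threads race for the flag, and because the check and the set happen as one atomic step, exactly one of them can win.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical names throughout; a sketch, not a replacement for a real lock.
class FlagLock {
    private final AtomicBoolean locked = new AtomicBoolean(false);

    /** Atomically flips false -> true; returns true only for the single caller that won. */
    boolean tryLock() {
        return locked.compareAndSet(false, true);
    }

    void unlock() {
        locked.set(false);
    }

    /** Lets several threads race for one FlagLock and counts how many acquired it. */
    static int countWinners(int threads) {
        FlagLock lock = new FlagLock();
        AtomicInteger winners = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                if (lock.tryLock()) {
                    winners.incrementAndGet(); // never unlocked, so at most one winner
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return winners.get();
    }
}
```

For the scenario in the question, a `synchronized` block or a `java.util.concurrent.locks.ReentrantLock` is probably the simpler choice, since those also make the losing threads wait instead of just reporting failure.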

See also: Is accessing a variable in C# an atomic operation?

It depends very much on the hardware and the JVM you are running on.

On some hardware and some JVMs some copies will be atomic, but it is safer to assume this is not the case. Even a simple integer-to-integer assignment can translate to four machine instructions on x86 hardware.

String and array copies can involve a sequence of thousands of instructions, and it is possible for two threads to update the data simultaneously.
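Since copying a collection is not atomic on its own, one common approach is to do the copy while holding the same lock that every modification takes. The following is a minimal sketch under that assumption; `CheckpointedList` and its methods are hypothetical names, not something from the question.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class: every access to items and dirty goes through the same lock.
class CheckpointedList {
    private final Object lock = new Object();
    private final List<String> items = new ArrayList<>();
    private boolean dirty = false; // guarded by lock

    /** Modifies the data and marks it dirty in one critical section. */
    void add(String item) {
        synchronized (lock) {
            items.add(item);
            dirty = true;
        }
    }

    /**
     * Returns a consistent copy and clears the dirty flag, or null if nothing
     * changed since the last snapshot. Writers are blocked only for the copy.
     */
    List<String> snapshotIfDirty() {
        synchronized (lock) {
            if (!dirty) {
                return null;
            }
            dirty = false;
            return new ArrayList<>(items);
        }
    }
}
```

The persisting thread can then write the returned copy to disk outside the lock. A change made during the write sets the flag again and is picked up by the next snapshot.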

As I understand it, not always.

An Int32 will be; an Int64 won't be on a 32-bit system, as it needs 2 x 32 bits and therefore does not fit in one 32-bit cell.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow