Monday, October 1, 2012

JavaOne 2012: How Do Non-Blocking Data Structures Work?

I was a little surprised when I looked at my schedule for today and noted that all of the sessions I have currently planned to see today are in the Hilton. This became a little less surprising when I realized that about half of the JavaOne presentations are in the Hilton and that they seem to be roughly located by track.

Tobias Lindaaker's (Neo Technology) presentation "How Do Atomic Data Structures Work?" was held in the Hilton's Golden Gate 3/4/5 conference room area. Lindaaker changed his presentation's title since he originally submitted the abstract. The abstract's title (and that listed in the conference materials) was "How Do Atomic Data Structures Work?," but he has renamed it to "How Do Non-Blocking Data Structures Work?"

Lindaaker explained that "atomic" comes from Greek and meaning "undividable." He explained that a "lock-free data structure" is "a data structure that does not block any threads when performing an operation on the data structure (read or write)." He stated that one wants to avoid "spin-waiting" whenever possible.

Lindaaker talked about synchronized regions. He said such regions "create a serialized path through the code" and "guarantee safe publication." He defined "safe publication" as meaning "everything written before exiting synchronized [block]" and "guaranteed to be visible on entry of synchronized [block]." One of his bullets stated, "volatile fields give you safe publication without serialization." Lindaaker focused more on the volatile keyword modifier in his "volatile fields" slide.

The slide "What is a memory barrier?" provided a simple visual representation of the memory barrier concept.

For his slide "Atomic updates," Lindaaker stated that the easiest way to access an atomic reference is via use of java.util.concurrent.atomic.AtomicReference<V>. Lindakker provided a physical demonstration using coasters to illustrate the difference between compareAndSet (sets a value if the conditional matches favorably) and getAndSet. (sets new value returns old value).

Lindaaker prefers java.util.concurrent.atomic.AtomicReferenceFieldUpdater<T,V> because of its "lower memory overhead" ("fewer object headers") and "better memory locality" ("no reference indirection").

Lindaaker explained that array-based queues do block (sometimes a benefit when amount of work needs to be limited due to finite hardware resources), linked queues do not block. Lindaaker used a supermarket queue as an example of the differences. In the link-based queue, you always stand behind the same customer in front of you in the queue. In the array-based queue, you always remain in the same position. Bounded queues "frequently perform better," but will block when full.

One of the main themes of this presentation was the idea of learning new ideas and then individually researching them further. Lindaaker recommended that audience members look at the JDK's code to see some impressive and less impressive code examples.

Lindaaker referenced LMAX (London Multi Asset Exchange) Disruptor as an example of a "ring buffer" ("array with a read mark and a write mark"). He stated that "readers contend on the read mark, writers on write mark" and highlighted the consequence of this, "With single reader / single writer, there is no contention." The Disruptor page describes Disruptor as a "High Performance Inter-Thread Messaging Library."

Lindaaker stated that java.util.concurrent.ConcurrentHashMap is a good general choice, but is not very exciting for discussion in his presentation. He stated that it "scales reasonably well on current commodity hardware" (fewer than 100 CPUs) with proper tuning.

Neo Technology provides a database implementation (Neo4j) that is not relational (graph database). Lindaaker described the Neo Technology's graph-based database offering as, "Stores data as nodes and relationships between nodes."

No comments: